Vibe coding, agentic hype, and why I’m still doing most of the thinking

Caveat emptor.

Author

Norman Simon Rodriguez

Published

3 December 2025

I’ve spent a fair amount of time experimenting with so-called vibe coding tools. The pitch is always the same: let the AI handle the engineering drudgework, reason about your codebase, plan changes, and keep the whole loop running while you focus on intent. After several rounds with Gemini CLI, Jules, and Spec Kit, the reality turned out to be very different. These tools can be helpful, but not in the autonomous way their marketing suggests.

Jules: impressive at first, chaotic in practice

Jules was my first serious test. It presents itself as an agent that can perform large refactors and deep architectural work. In use, it felt opaque. You hand it a task, it disappears to “think”, and it returns with confident output that looks structured and authoritative. The problem is what happens after a few iterations. Small inconsistencies creep in. Imports drift. Prior decisions get silently overwritten. Refactors only partially apply. Before long, the repo becomes a tangled mess that takes more time to untangle than it would have taken to refactor by hand.

Autonomy is the selling point, but it’s also the weakness. The cost of cleaning up behind the tool outweighs the convenience it promises.

Spec Kit: great for clarity, rigid for real development

Spec Kit lives at the opposite end of the spectrum. It forces you to articulate a clear specification before any code is generated. That part works well: writing the spec sharpens your thinking, and the initial output is usually coherent.

Where it struggles is change. The tool assumes your specification will evolve linearly. In real projects it never does. When you need to revise multiple interconnected behaviours, the system becomes brittle. Adjusting one part of the spec can trigger disproportionate rewrites elsewhere, and keeping the spec aligned with reality becomes its own form of overhead. It’s strong for a clean starting point, weak for the iterative, nonlinear nature of real software.

Gemini CLI: the workable middle ground

Gemini CLI sits between the two extremes. It gives you more freedom than Spec Kit but doesn’t attempt the sweeping autonomy of Jules. For narrow tasks (small scripts, single functions, localised fixes) it performs well. It forces you to articulate exactly what you want, which brings welcome mental clarity, and it saves typing on simple jobs.

As soon as tasks broaden, the familiar issues return. Drift. Overreach. Unexpected rewrites. A few cycles later you’re debugging things that shouldn’t have broken. I still use Gemini CLI, but only for well-bounded tasks where the blast radius is small.
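To make "small blast radius" concrete, one option is to check which files an assisted change actually touched before accepting it. The following is a minimal sketch, not a feature of any of the tools discussed here: the function name `out_of_scope` and the example paths are hypothetical, and the list of changed files is assumed to come from something like `git diff --name-only`.

```python
from pathlib import PurePosixPath


def out_of_scope(changed_files, allowed_dirs):
    """Return the changed files that fall outside every allowed directory.

    Hypothetical helper: `changed_files` is assumed to be a list of
    repo-relative paths, e.g. the output of `git diff --name-only`.
    """
    allowed = [PurePosixPath(d) for d in allowed_dirs]
    offenders = []
    for name in changed_files:
        path = PurePosixPath(name)
        # In scope if the file is an allowed path itself or sits inside one.
        if not any(path == d or d in path.parents for d in allowed):
            offenders.append(name)
    return offenders


# Hypothetical example: the task was meant to touch only scripts/.
changed = ["scripts/export.py", "scripts/utils/dates.py", "core/models.py"]
print(out_of_scope(changed, ["scripts"]))
```

Anything this returns is a file the task was never supposed to touch, which is a cheap signal that the change should be reviewed by hand or discarded.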

The shared limitation

All three tools run into the same structural problem: they don’t maintain a stable internal model of your project. They don’t remember architectural constraints reliably, and they don’t protect design intent. They react to the latest prompt, not to the system as a whole. Because of that, they can’t take real responsibility for design or review. They amplify effort only when you remain the one thinking, deciding, and verifying.

My takeaway

Vibe coding can be a helpful accelerant for specific, tightly scoped tasks. It gives clarity, speeds up boilerplate, and helps prototype ideas. But it is not a substitute for engineering judgement, and it isn’t ready to run autonomously through a codebase.

Use it with precision. Keep it constrained. And treat it with the caution it deserves.