I argued in my earlier article, “From Code to Insights: Software Engineering Best Practices for Data Analysts”, that engineering skills and best practices can be extremely helpful for analysts and other data professionals.
This is even more true now in the AI era, when we have many more opportunities to build our own analytical tools: from fancy data viewers that display charts or showcase different scenarios, to simulators that can predict outcomes based on input parameters. Personally, I use web applications all the time in my day-to-day work.
There has been a lot of hype around vibe coding, but it seems that experienced engineers are already moving past it and leaning more toward spec-driven development. Even Andrej Karpathy, who coined the term “vibe coding” in February 2025, admitted just a year later that this era is ending and that we are entering the age of agentic engineering: orchestrating agents against detailed specs with human oversight.
Today (1 year later), programming via LLM agents is increasingly becoming a default workflow for professionals, except with more oversight and scrutiny. The goal is to claim the leverage from the use of agents but without any compromise on the quality of the software. Many people have tried to come up with a better name for this to differentiate it from vibe coding; personally, my current favourite is “agentic engineering”:
– “agentic” because the new default is that you are not writing the code directly 99% of the time; you are orchestrating agents who do, and acting as oversight.
– “engineering” to emphasise that there is an art & science and expertise to it. It is something you can learn and become better at, with its own depth of a different kind.
In this article, I’d like to put spec-driven development into practice on a greenfield project, following the best practices from JetBrains’ course on DeepLearning.AI, “Spec-Driven Development with Coding Agents”.
The project is a bit more personal, but still data-related. As I’m preparing for my half marathon in September, I’m trying to balance running and strength training. There are so many tools out there, each focused on a different part of the journey, that finding one solution that actually works for me has been surprisingly difficult. So, I decided to feed two birds with one scone: build my own web app while hopefully learning something new along the way.
Ready for action? Me too. But before we jump into implementation, let me first spend a few minutes on the theory behind spec-driven development.
Vibe coding vs spec-driven development
Many of us have already experienced vibe coding: you write a short prompt (for example, “Please add a DAU chart to my web application”), wait for the agent to generate the change, run it locally, and check whether the result matches your expectations.
Usually, it doesn’t. So you go back to the same chat, ask the agent to adjust the chart, and keep iterating until the result is good enough.
This approach works reasonably well for simple projects, but it doesn’t scale well, especially when multiple developers are working on the same codebase.
The main drawbacks are the lack of best practices and shared conventions. For example, without a structured approach, teams can easily end up with five different ways to run ML model training within the same dbt pipeline.
Another common issue is that we usually don’t persist the results or reasoning from our conversations with AI agents. As a result, it becomes easy to lose track of why certain decisions were made. For example, an agent might forget why you cleaned up data in a particular way, and the next update may silently introduce a different result.
Context decay is also an especially common problem. AI agents are stateless, and when working on larger projects, we often have to start new chats because of context window limitations, effectively starting our communication from scratch.
Spec-driven development (SDD) is much closer to traditional engineering practices. Instead of jumping straight into implementation, we start by doing the hard thinking ourselves: making architectural decisions, defining requirements, and documenting them in a structured markdown specification stored in the repository and updated alongside the project. This creates an important shift: we decouple the specification (what we are building and why) from the implementation (the actual code).
SDD addresses many of the core issues of vibe coding by preserving context across sessions (and even across different AI agents) while aligning both humans and agents around the project’s main non-negotiables.
SDD workflow
A typical spec-driven development workflow usually consists of the following phases.
The first step is defining the architecture: an agreement on the key decisions for the project. It usually includes several core documents:
- Mission explains the why: why are we building this project, and what are its key goals and features?
- Tech Stack documents technical decisions, as well as deployment and update processes.
- Roadmap outlines project phases, planned features, and is continuously updated as the project evolves.
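In practice, the architecture from this step is just a handful of markdown files living in the repository. As a minimal sketch (the file contents here are placeholders; in practice the agent drafts them for you), the scaffold could be created like this:

```shell
# Scaffold the three architecture documents as empty stubs
# (placeholder contents; the agent normally generates the real text)
mkdir -p specs
for doc in mission tech-stack roadmap; do
  printf '# %s\n' "$doc" > "specs/$doc.md"
done
ls specs   # mission.md  roadmap.md  tech-stack.md
```

Keeping these files plain markdown in the repo is what lets both humans and agents treat them as the shared source of truth.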
Specs can be created for both new and existing projects, which makes this approach quite flexible.
Once the project-level documentation is in place, we can move on to the feature development phase, which typically includes:
- Understanding what we want to build and writing a detailed specification.
- Implementing the changes.
- Validating that the implementation works as expected.
After successfully implementing your first feature, you might immediately feel the urge to move on to the next one. But this is actually the right moment to pause and rethink.
This is where replanning comes in. It is a dedicated phase for revisiting the architecture and reviewing earlier feature decisions and plans to make sure they still align with the project goals.
Now that we’ve covered the theory, let’s put it into practice.
Building
Enough theory, it’s time to build. To better understand how spec-driven development works in practice, I decided to apply it to a real greenfield project.
I started by creating a new repository for this project (and, of course, spending half an hour choosing the name and logo): repository. I also documented my initial product vision in the README.md file.
One of the nice things about the SDD approach is that it’s largely agnostic to the choice of LLM, agent, or IDE, so you can work with whatever setup you prefer. For this project, I’ll be using Visual Studio Code with the Claude Code plugin, since it allows me to use Claude as an agent while also reviewing all code changes directly in the editor.
Creating an architecture
As we discussed, the first step is to write the architecture. Of course, we don’t have to do it manually; we can use LLMs to put it together based on the initial product vision, as well as additional context gathered through follow-up questions.
We're building Trainlytics, a personal fitness tracking web app built
for people who want more control, flexibility, and insights than standard
fitness apps provide. Find the full requirements in README.md.
Let's create an "architecture" in a specs directory that consists of
the following parts:
- mission.md - what and why we're building; the main mission of the product
- tech-stack.md - core technical decisions
- roadmap.md - project phases broken down in implementation order
IMPORTANT: You must use your AskUserQuestion tool to get my feedback.
The agent then asks a series of clarifying questions that help define the project architecture and create an initial implementation plan.

In the end, the agent created the three files we asked for.

At this point, you might feel the urge to immediately ask the agent to start building the project, but that would be too soon.
Before moving forward, we first need to validate and refine the architecture. It’s worth spending time now aligning on the plan, because this specification will later translate into thousands of lines of code. It’s much better to resolve ambiguities and errors early.
I usually do this by reading the documents myself and iterating with the agent, asking clarifying questions and refining the plan step by step. A good practice is to make all changes through the agent rather than patching documents yourself, to maintain consistency across the project. For example, I told the agent that we need authentication in the app, since my use case is to log workouts from both desktop and mobile devices. This led to updates in both the tech stack document and the roadmap.

Once you’re happy with the overview, you can also ask a second agent, with fresh context, to critique the plan. There is plenty of evidence that reflection improves output quality.
When all checks are complete, it’s time to commit the architecture to the repository.
First feature phase
Now it’s time to move on to the first feature phase.
According to our roadmap, we’ll start with the MVP: Core Workout Logging. At the end of this phase, a user should be able to log in on both desktop and mobile, record a run and a gym session, and view both in their history with full details.
As discussed, each feature phase follows a simple cycle: plan → implement → validate. So let’s start by defining the specification and building the plan.
Find the next phase in specs/roadmap.md and create a new branch,
ask me about any steps in the specs that aren't fully clear.
Then create a new directory in the format YYYY-MM-DD-feature-name under specs/
for this feature, with the following files:
- plan.md - a structured list of numbered task groups
- requirements.md - scope, key decisions, and context
- validation.md - how we define success and confirm the implementation can
be merged
Use specs/mission.md and specs/tech-stack.md as guidance.
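For reference, the dated directory naming the prompt asks for is easy to produce by hand as well; a quick sketch (the feature name below is made up for illustration):

```shell
# Create the dated per-feature spec directory by hand
# (feature name is hypothetical; the agent normally does this)
feature="core-workout-logging"
dir="specs/$(date +%F)-$feature"   # e.g. specs/2026-05-04-core-workout-logging
mkdir -p "$dir"
touch "$dir/plan.md" "$dir/requirements.md" "$dir/validation.md"
ls "$dir"
```

The date prefix keeps feature specs sorted chronologically, which makes it easy to trace how the project evolved.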
Tip: it’s worth starting a new session with clean context in your LLM agent.
The agent put together the specs quite quickly.

At this point, it’s again time to review the specs and make sure everything is aligned with the original vision. As you can see, with agentic engineering, the role of the developer shifts toward steering, reviewing, and making architectural decisions, rather than directly writing specs or code.
Once you’re happy with the plan, it’s time to move on to implementation. I prefer to implement each group of tasks separately rather than one-shotting the entire feature phase, but this depends on the size of the feature. For this project, I used the following prompt.
Take the next task group from 2026-05-04-phase-1-mvp/plan.md and implement it.
Use requirements.md and validation.md for guidance.
Once done, update the status in both the plan and validation documents.
When the code is ready, it’s time for review. This is one of the most important steps, so it’s worth investing some time here.
In data-related applications, I usually focus my review on the core business logic and check that the numbers match my expectations.
I have to confess that I have close to zero knowledge of frontend technologies, so I rarely review frontend code in detail. Instead, I simply look at the interface locally and check whether everything works as expected. In this case, I decided to run the app and see how it works.
After a few iterations with the agent, we managed to run the app locally, and it worked. We can already add different exercises and workout types, and log both cardio and strength sessions.

After the manual review, it’s also helpful to use reflection and ask a fresh agent to verify whether the implementation aligns with the plan, as well as to go through the points outlined in validation.md.
In theory, spec-driven development suggests that the feature phase ends with validation. In practice, it rarely works that cleanly. You’ll likely find that some parts of the implementation don’t work as expected. At that point, you have two options:
- Add a couple more iterations to your plan.md and continue refining the feature (this works well for smaller changes), or
- If the issues are more substantial, treat them as part of the next feature phase and address them during replanning.
One important thing to watch out for: it can be tempting to simply explain the issue to the LLM agent and ask for fixes, instead of updating the specs and reworking the implementation. Try to resist that shortcut. Keeping the specification as the source of truth is what makes the approach robust.
Once all checks are complete, we can create and merge the pull request.
At this point, we already have a working application, and the results are genuinely satisfying. Even more surprisingly, the whole process took just a bit more than two hours end-to-end (including drafting this article while the agent was working).
Replanning
With such good progress, you might feel the urge to continue building. I understand that, but in the current AI era, the main value of a human lies in thinking and architecture. So this is actually the right moment to step back and reflect: do we still want to continue in the same direction, and what should we change in our product and process?
When I started using the application myself, I realised it wasn’t yet ready to fully support my use case. That means we need to reprioritise so I can start using it in my day-to-day life as soon as possible. So I did that with the following prompt.
Let's revise our plan in roadmap.md.
I'd prioritise the next phases as follows:
1. Strength session templates
I can live without planning, but I need templates, because I often struggle
to remember all the exercises in a session.
The idea is:
- If a template already exists in the log, show all stats (exercises, sets,
reps, weight, etc.). Allow editing these values and committing changes
- If anything is changed, ask whether the user wants to update the template
2. UI improvements
The current design is not yet sleek enough, so I would prioritise a round of UI
improvements:
- Add the logo and product motto to the website
- Add a settings tab to manage workout types and exercises
- Create a single screen to log both cardio and strength sessions
- Improve the history screen with richer workout details
- Allow adding titles to activities (strength/cardio sessions) and segments
- Support specifying time, not only date
- Add more colour to the interface (I like shades of blue)
- For cardio exercises, adjust units to: minutes, kilometers, and min/km pace
3. Basic analytics
Add simple analytics to the history screen showing weekly stats at
the top of the page (e.g. total minutes and the split between cardio
and strength).
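As a side note, the min/km pace unit requested above is simply duration divided by distance. A throwaway sketch of the conversion (the `pace` helper is my own illustration, not part of the app):

```shell
# Convert total minutes and kilometers into a "m:ss min/km" pace string
pace() {
  awk -v min="$1" -v km="$2" 'BEGIN {
    p = min / km                     # decimal minutes per km
    m = int(p)
    s = int((p - m) * 60 + 0.5)     # remainder as rounded seconds
    printf "%d:%02d min/km\n", m, s
  }'
}
pace 31 5   # a 31-minute 5 km run -> 6:12 min/km
```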
Replanning is also a good moment to revisit our process itself. For example, I noticed that we haven’t updated roadmap.md consistently, and the specs are starting to drift. It would also be helpful to introduce a changelog, so we have a clear history of how the product has evolved over time.
Let’s ask the agent to do it for us.
Please review plan.md, update roadmap.md to reflect completed work,
and create a CHANGELOG.md file with a concise summary of the changes.
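For illustration, here is roughly the shape of CHANGELOG.md I had in mind; the entries below are examples I made up, not the agent’s actual output:

```shell
# Write a minimal CHANGELOG.md skeleton (example entries, not real output)
cat > CHANGELOG.md <<'EOF'
# Changelog

## 2026-05-04 - Phase 1: Core Workout Logging (MVP)
- Authentication for desktop and mobile
- Logging for cardio and strength sessions
- History view with full workout details
EOF
head -n 1 CHANGELOG.md   # -> # Changelog
```

One dated section per merged phase keeps the history readable for both humans and agents.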
Now that we’re aligned on direction and have the right setup in place, let’s keep building.
The next phase
Now we can follow the same process and iterate through phases. Since this is a repeatable cycle, it’s a good moment to discuss possible automations.
So far, we’ve been writing all prompts manually, but these workflows can also be automated as “skills” in Claude Code or other LLM coding agents.
Also, there are already implementations of spec-driven development that can be used out of the box. One of the most popular is Spec Kit by GitHub.
You can install it like this.
uv tool install specify-cli --from git+https://github.com/github/spec-kit.git
specify version # to check that it works
Next, you need to initialise the skills in Claude. This sets up the .specify/ folder and installs slash commands into .claude/commands/
specify init . --ai claude
# there are ~30 agent integrations, so specify the one you're using
You’ll know it worked when you see the speckit commands in Claude Code.

Once installed, you can follow a similar workflow: start by defining the architecture, then iterate through feature loops.
One difference is that in Spec Kit, the architecture is more focused on high-level concerns like code quality, testing standards, UX consistency, and performance requirements.
To be honest, I slightly prefer the approach proposed by JetBrains, because it keeps more context in the architecture itself. But as always, there is no silver bullet, and Spec Kit may work better depending on your use case. It’s also convenient to have the SDD workflow already implemented for you.
Using Spec Kit, I ran through the two phases described above, and it worked well. After the first feature phase, development naturally becomes a continuous improvement cycle rather than a linear process. And with that, I think it’s time to wrap up this story.
Summary
In total, it took me around 4.5 hours to build a usable end-to-end product for tracking and analysing my data. There is still plenty of room for improvement, and I’ll continue iterating on it. I can already see several potential UI enhancements, and I’d also like to eventually integrate AI to make the app more intelligent.
Frankly speaking, it has been an interesting experience working through such a structured development flow. In my day-to-day work, I often rely on one-off LLM chats to make changes, without maintaining a full trace of decisions and specs in the repository.
However, there is no one-size-fits-all approach here.
- If you just want to make a small improvement or run some ad-hoc analysis in yet another Jupyter notebook, writing full specs upfront is probably overkill.
- But if you’re working on a larger project (especially with other people), spec-driven development would definitely be my default approach.
It’s also interesting to observe how the role of an engineer is shifting: from writing code directly to focusing more on architectural decisions, review, and system design.
And while it may sound a bit extreme today, I do think we’re gradually moving toward a world where English becomes the primary “programming language” interface. We’re already seeing early attempts in this direction, such as CodeSpeak, which explores more natural-language-driven programming paradigms. I’ll try CodeSpeak in my next article, so stay tuned.
Reference
This article is inspired by the “Spec-Driven Development with Coding Agents” short course from DeepLearning.AI.