
TL;DR

A new Google whitepaper on the software development lifecycle argues that the AI model accounts for only about 10% of an agent’s behavior. The rest comes from the harness and context engineering around the model, which the paper treats as the main levers for effective AI implementation and cost management.

A new Google whitepaper titled The New SDLC With Vibe Coding emphasizes a counterintuitive insight: the AI model accounts for only about 10% of a system’s behavior. The real focus, it states, should be on the harness and context engineering surrounding the model. This shift in perspective could reshape how organizations approach AI development and deployment.

The paper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, reports that over 85% of professional developers now use AI coding agents, with 51% doing so daily. Despite the prominence of models like GPT and Claude, the authors argue that most failures and inefficiencies stem from how the AI is configured and integrated, not from the model itself.

The key insight is that the ‘harness’ (prompts, tools, rules, and observability) accounts for roughly 90% of an agent’s behavior. The paper cites benchmark evidence that changing the harness alone can significantly improve performance with the same underlying model. In one example, an agent moved from outside the top 30 to the top 5 on Terminal Bench 2.0 through harness adjustments only.
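To make the "Agent = Model + Harness" idea concrete, here is a minimal Python sketch. It is not the paper's implementation: the `Harness` class, the `CALL <tool> <arg>` reply convention, and `stub_model` are hypothetical names invented for illustration. The same stub model runs under two harnesses, and only the configuration differs.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical types: a "model" is any callable from prompt text to reply text.
Model = Callable[[str], str]
Tool = Callable[[str], str]


@dataclass
class Harness:
    """Everything around the model: rules, tools, and a trace for observability."""
    system_rules: list[str]
    tools: dict[str, Tool] = field(default_factory=dict)
    trace: list[str] = field(default_factory=list)

    def build_prompt(self, task: str) -> str:
        rules = "\n".join(f"- {r}" for r in self.system_rules)
        tool_names = ", ".join(sorted(self.tools)) or "none"
        return f"Rules:\n{rules}\nTools: {tool_names}\nTask: {task}"

    def run(self, model: Model, task: str) -> str:
        prompt = self.build_prompt(task)
        self.trace.append(f"prompt_chars={len(prompt)}")
        reply = model(prompt)
        # Assumed convention: a reply of "CALL <tool> <arg>" requests a tool.
        # This sketch assumes the argument is always present.
        if reply.startswith("CALL "):
            _, name, arg = reply.split(" ", 2)
            tool = self.tools.get(name)
            if tool is None:
                # A configuration failure, not a model failure.
                self.trace.append(f"missing_tool={name}")
                return f"error: tool '{name}' is not configured"
            self.trace.append(f"tool={name}")
            return tool(arg)
        return reply


def stub_model(prompt: str) -> str:
    """Stand-in for a real model: always asks to run the test tool."""
    return "CALL run_tests src/"


bare = Harness(system_rules=["Be concise."])
equipped = Harness(
    system_rules=["Be concise.", "Run tests before answering."],
    tools={"run_tests": lambda path: f"tests passed in {path}"},
)

bare_result = bare.run(stub_model, "Fix the failing build")
equipped_result = equipped.run(stub_model, "Fix the failing build")
```

With an identical model, the bare harness ends in a configuration error while the equipped one completes the task. That mirrors the paper's point that many agent failures come down to a missing tool or a vague rule rather than a weak model.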

The paper advocates a disciplined approach called agentic engineering, which combines structured context, verification, and oversight, in contrast to more casual ‘vibe coding’ workflows. Central to it is context engineering: dynamically loading the information, instructions, and tools relevant to each task, which the authors say improves code quality and reduces costs.
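As a rough illustration of loading context dynamically, the sketch below picks only the snippets relevant to a task and stops when a token budget is used up. Everything here is an assumption made for the example: the `Snippet` type, the tag-overlap scoring, and the four-characters-per-token estimate are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    name: str
    text: str
    tags: frozenset[str]


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token. Real tokenizers differ.
    return max(1, len(text) // 4)


def select_context(task_tags: set[str], library: list[Snippet], budget: int) -> list[Snippet]:
    """Pick the snippets most relevant to the task until the token budget is spent."""
    scored = [(len(task_tags & s.tags), s) for s in library]
    relevant = [(score, s) for score, s in scored if score > 0]
    # Highest tag overlap first; ties broken by name for a stable order.
    relevant.sort(key=lambda pair: (-pair[0], pair[1].name))
    chosen: list[Snippet] = []
    used = 0
    for _, snippet in relevant:
        cost = estimate_tokens(snippet.text)
        if used + cost <= budget:
            chosen.append(snippet)
            used += cost
    return chosen


library = [
    Snippet("api_style", "Use snake_case for endpoints. " * 4, frozenset({"api", "style"})),
    Snippet("db_schema", "orders(id, user_id, total). " * 4, frozenset({"database", "api"})),
    Snippet("brand_voice", "Write marketing copy warmly. " * 4, frozenset({"marketing"})),
]

picked = select_context({"api", "database"}, library, budget=1000)
names = [s.name for s in picked]
```

Irrelevant material such as `brand_voice` never reaches the prompt, and a tighter budget drops the lower-ranked snippets first. A production system would likely use retrieval or embeddings instead of tag overlap, but the principle of a small, relevant context is the same.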

Finally, the authors argue that vibe coding looks cheap at first but incurs higher long-term costs through token burn in fix-it loops, security remediation, and the maintenance burden of tangled AI-generated code, eventually costing 3–10× more per feature. Conversely, investing upfront in structured schemas and context management can lower total cost of ownership over time.
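The cost argument can be read as a simple break-even calculation between a workflow with low upfront cost and high running cost, and one with the opposite profile. The sketch below uses a linear cost model and made-up figures in arbitrary units. The `Workflow` class and all numbers are assumptions for illustration, not data from the paper.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Workflow:
    name: str
    upfront_cost: float      # CapEx: specs, evals, context setup
    cost_per_feature: float  # OpEx: tokens, rework, maintenance

    def total_cost(self, features: int) -> float:
        return self.upfront_cost + self.cost_per_feature * features


def break_even_features(low_capex: Workflow, high_capex: Workflow) -> Optional[int]:
    """Smallest feature count at which high_capex costs no more than low_capex."""
    extra_upfront = high_capex.upfront_cost - low_capex.upfront_cost
    saving_per_feature = low_capex.cost_per_feature - high_capex.cost_per_feature
    if extra_upfront <= 0:
        return 0  # already no more expensive before any feature ships
    if saving_per_feature <= 0:
        return None  # the upfront investment is never recovered
    return math.ceil(extra_upfront / saving_per_feature)


# Hypothetical figures; the 3x per-feature gap sits at the low end of the
# 3-10x range quoted from the paper.
vibe = Workflow("vibe coding", upfront_cost=0, cost_per_feature=300)
agentic = Workflow("agentic engineering", upfront_cost=5000, cost_per_feature=100)

crossover = break_even_features(vibe, agentic)
```

With these invented numbers the two workflows cost the same at 25 features, and every feature after that is cheaper under the structured approach. A larger per-feature gap moves the crossover earlier, and a larger upfront investment moves it later.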

At a glance
When: published March 2026
The development: A new Google whitepaper introduces a paradigm shift in software engineering, emphasizing that the core value lies outside the AI model itself.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary — the differentiator is how outputs get verified
Vibe Coding: Casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: Detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: Formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding — however clever the prompt.
The idea worth building your strategy around
Agent = Model + Harness
MODEL ~10%
HARNESS ~90%: prompts · tools · context · hooks · sandboxes · observability
~90% IS YOUR SURFACE AREA, NOT THE PROVIDER’S
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.
“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding
Low CapEx · High OpEx
Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering
High CapEx · Low OpEx
Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing — cheap models for the easy work.
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR); verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Impact of Harness and Context Engineering on AI Effectiveness

This perspective shifts the focus from constantly chasing newer, larger models to optimizing how AI systems are configured and managed. For organizations, this means that competitive advantage lies in their ability to build, own, and improve the harness and context around AI models. It also suggests that long-term cost savings and reliability depend on disciplined engineering practices rather than model size or raw performance.

For developers and leaders, understanding that most AI failures are configuration issues underscores the importance of investing in skills like context engineering, verification, and observability. This could lead to more robust, secure, and cost-effective AI systems.
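The verification side can be sketched as a gate that requires both kinds of checks: exact tests for deterministic code, and scored evals for model output that varies from run to run. The example below is a simplified illustration. The functions `run_tests`, `run_evals`, and `ci_gate`, the keyword-based rubric, and the 0.8 threshold are hypothetical choices, not something the paper prescribes.

```python
from typing import Callable


def slugify(title: str) -> str:
    """Deterministic code under test."""
    return "-".join(title.lower().split())


def run_tests() -> bool:
    """Tests: exact assertions on deterministic behavior."""
    return slugify("Hello  World") == "hello-world" and slugify("A") == "a"


def run_evals(
    generate: Callable[[str], str],
    cases: list[tuple[str, list[str]]],
    threshold: float,
) -> bool:
    """Evals: score generated output against a rubric and require a pass rate."""
    passed = 0
    for prompt, required_terms in cases:
        output = generate(prompt).lower()
        if all(term in output for term in required_terms):
            passed += 1
    return passed / len(cases) >= threshold


def ci_gate(tests_ok: bool, evals_ok: bool) -> str:
    """Allow a merge only when both tests and evals pass."""
    return "merge" if tests_ok and evals_ok else "block"


def stub_generate(prompt: str) -> str:
    """Stand-in for a model call; it only handles the refund prompt well."""
    if "refund" in prompt.lower():
        return "Refunds are accepted within 30 days."
    return "Orders arrive in a week."


eval_cases = [
    ("Summarize the refund policy", ["refund", "30 days"]),
    ("Summarize the shipping policy", ["shipping"]),
]

tests_ok = run_tests()
evals_ok = run_evals(stub_generate, eval_cases, threshold=0.8)
decision = ci_gate(tests_ok, evals_ok)
```

Here the deterministic tests pass, but the stub generator satisfies only one of two eval cases, so the gate blocks the change. Real evals would typically use richer scoring than keyword matching, such as graded rubrics or a judge model.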

Evolution of AI in Software Development

The whitepaper builds on the rapid adoption of AI coding agents, with over 85% of developers using them by early 2026. Previously, the focus was on model improvements and larger datasets. However, recent experiments and benchmarks indicate that performance gains are often achieved by better harness design rather than newer models.

This aligns with broader trends in software engineering emphasizing modularity, verification, and cost management. The paper’s insights challenge the common narrative that model size and raw AI power are the primary drivers of success, instead highlighting the importance of structured workflows and context management.

“The biggest shift in software engineering isn’t a new language or framework — it’s moving from writing code to expressing intent and trusting machines to do the rest.”

— Addy Osmani, co-author

Unresolved Questions About Implementation and Costs

While the paper provides compelling evidence that harness and context are critical, it does not specify exact best practices or standardized frameworks for implementing these strategies across different industries. The long-term impact on costs and security remains to be fully validated through real-world application and broader industry adoption.

Additionally, the precise extent to which model improvements will influence performance compared to harness optimization is still being studied, and the optimal balance between the two is not yet established.

Next Steps for Organizations Adopting the New SDLC Approach

Organizations should focus on developing expertise in context engineering, verification, and harness design. Pilot projects and benchmarking will help establish best practices and quantify cost savings. Industry leaders are expected to start integrating these principles into their AI workflows, emphasizing configuration and oversight.

Further research and case studies will clarify how to best structure these systems for scalability, security, and efficiency, guiding future standards and tools development.

Key Questions

Why is the model only 10% of the system’s behavior?

According to the whitepaper, most of an agent’s behavior depends on how it is configured. The prompts, tools, rules, and observability around the model, collectively called the harness, account for roughly 90% of that behavior, leaving about 10% to the model itself.

What is ‘agentic engineering’?

Agentic engineering is a disciplined approach that involves structuring context, verification, and oversight around AI models, moving beyond casual prompt-based workflows to create more reliable and cost-effective systems.

How does this shift impact AI development costs?

Initial investments in harness and context design may be higher, but they typically lead to lower long-term costs by reducing token waste, improving security, and decreasing maintenance needs, according to the whitepaper.
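One cost lever the article's summary of the paper mentions is intelligent model routing, meaning cheap models for the easy work. A minimal sketch of that idea follows. The tier names, prices, and 1 to 5 difficulty scale are invented for the example and do not correspond to any real product or price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical tier label, not a real product
    cost_per_1k_tokens: float  # hypothetical price in arbitrary units
    max_difficulty: int        # hardest task this tier is trusted with (1-5)


TIERS = [
    ModelTier("small", 0.10, 2),
    ModelTier("medium", 0.50, 4),
    ModelTier("large", 2.00, 5),
]


def route(difficulty: int, tiers: list[ModelTier] = TIERS) -> ModelTier:
    """Return the cheapest tier whose difficulty ceiling covers the task."""
    eligible = [t for t in tiers if t.max_difficulty >= difficulty]
    if not eligible:
        raise ValueError(f"no tier handles difficulty {difficulty}")
    return min(eligible, key=lambda t: t.cost_per_1k_tokens)


def estimated_cost(tasks: list[tuple[int, int]]) -> float:
    """Total cost for (difficulty, tokens) pairs when each task is routed."""
    return sum(route(d).cost_per_1k_tokens * tokens / 1000 for d, tokens in tasks)


# Hypothetical workload: (difficulty, tokens) per task.
tasks = [(1, 2000), (2, 4000), (4, 3000), (5, 1000)]

routed_cost = estimated_cost(tasks)
all_large_cost = sum(TIERS[-1].cost_per_1k_tokens * tokens / 1000 for _, tokens in tasks)
```

In this toy workload, sending every task to the largest tier costs several times more than routing by difficulty. How difficulty is estimated in practice is the hard part and is left open here.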

Does this mean larger models are less important?

Not necessarily. While model improvements can help, the whitepaper emphasizes that harness and context engineering are more influential in determining system performance and reliability.

What should organizations do next?

They should prioritize developing skills in context engineering, establish benchmarks, and experiment with harness design to optimize AI system performance and costs.

Source: ThorstenMeyerAI.com
