TL;DR
A recent Google whitepaper argues that in AI development the model itself accounts for only about 10% of system behavior. The remaining 90% comes from the harness, configuration, and context engineering, and that is where the focus should be. This shift has significant implications for AI strategy and costs.
A new whitepaper by Google researchers Addy Osmani, Shubham Saboo, and Sokratis Kartakis asserts that the model constitutes only about 10% of an AI system’s behavior. The paper emphasizes that harness and context engineering are far more critical in shaping AI performance, which the authors present as a paradigm shift in AI development strategy.
The whitepaper, titled The New SDLC With Vibe Coding, argues that the traditional focus on acquiring larger or more advanced models is misguided. Instead, the authors highlight that the majority of system behavior depends on how the AI is configured, including prompts, tools, rules, and context management.
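To make "prompts, tools, rules, and context management" concrete, here is a minimal Python sketch of a harness that is kept separate from the model it wraps. It is an illustration only, not code from the whitepaper: the `Harness` class, the `ModelFn` type, the character-based context budget, and the `echo_model` stand-in are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a prompt string to a completion.
ModelFn = Callable[[str], str]


@dataclass
class Harness:
    """Everything around the model: prompt, rules, tools, and context policy."""
    system_prompt: str
    rules: List[str] = field(default_factory=list)
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    max_context_chars: int = 2000  # crude stand-in for a token budget

    def build_prompt(self, task: str, context: List[str]) -> str:
        # Keep the most recent context snippets that fit in the budget.
        kept: List[str] = []
        used = 0
        for snippet in reversed(context):
            if used + len(snippet) > self.max_context_chars:
                break
            kept.append(snippet)
            used += len(snippet)
        kept.reverse()  # restore chronological order

        parts = [self.system_prompt]
        parts += [f"RULE: {rule}" for rule in self.rules]
        parts += [f"TOOL: {name}" for name in sorted(self.tools)]
        parts += [f"CONTEXT: {snippet}" for snippet in kept]
        parts.append(f"TASK: {task}")
        return "\n".join(parts)

    def run(self, model: ModelFn, task: str, context: List[str]) -> str:
        # The model is just a parameter; swapping it does not touch the harness.
        return model(self.build_prompt(task, context))


def echo_model(prompt: str) -> str:
    """Stand-in model that reports how many prompt lines it received."""
    return f"received {len(prompt.splitlines())} lines"


harness = Harness(
    system_prompt="You are a careful coding assistant.",
    rules=["Run the tests before answering."],
    tools={"run_tests": lambda arg: "ok"},
)
result = harness.run(
    echo_model, "Fix the failing test", ["file: app.py", "error: KeyError"]
)
print(result)
```

The design point is that the model is passed in as an argument, so everything the paper attributes to the harness lives in code you control and can version.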
Evidence from experiments shows that changing only the harness or context can significantly improve AI performance, even with the same underlying model. For example, a coding agent moved from outside the Top 30 to the Top 5 on a benchmark by tweaking only its harness, not the model itself.
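The method behind that kind of result is simple to state: hold the model fixed and vary only the harness. The Python sketch below shows the shape of such a comparison. The `toy_model`, both harness functions, and the test cases are invented for illustration, so the scores say nothing about any real model or benchmark.

```python
from typing import Callable, List, Tuple

ModelFn = Callable[[str], str]
HarnessFn = Callable[[str], str]  # turns a raw question into a full prompt


def toy_model(prompt: str) -> str:
    """Fixed stand-in model (assumed behaviour, purely for illustration):
    it answers in meters only when the prompt tells it to."""
    if "answer in meters" in prompt:
        return "1000"
    return "1"


def bare_harness(question: str) -> str:
    # No rules or context: the question is passed through unchanged.
    return question


def guided_harness(question: str) -> str:
    # Same question, plus one explicit rule supplied by the harness.
    return f"Rule: answer in meters.\n{question}"


def score(harness: HarnessFn, model: ModelFn, cases: List[Tuple[str, str]]) -> float:
    """Fraction of cases where the model's answer matches the expected one."""
    hits = sum(model(harness(question)) == expected for question, expected in cases)
    return hits / len(cases)


CASES = [("How long is 1 km?", "1000"), ("Convert 1 km.", "1000")]

# The model argument is identical in both runs; only the harness differs.
bare = score(bare_harness, toy_model, CASES)
guided = score(guided_harness, toy_model, CASES)
print(f"bare={bare:.2f} guided={guided:.2f}")
```

In a real evaluation the stand-in would be replaced by calls to one pinned model version, and the cases by a held-out task set.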
The paper introduces the concept of agentic engineering, where AI is integrated with structured verification, testing, and oversight, moving away from casual vibe coding toward disciplined, reliable systems.
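One way to picture the difference between vibe coding and agentic engineering is a generate-and-verify loop: output is accepted only after it passes automated checks, and failures are fed back or escalated. The Python sketch below is an assumption about what such a loop could look like, not the paper's implementation. `generate_with_verification`, `stub_generate`, and `add_test` are hypothetical names.

```python
from typing import Callable, List, Optional, Tuple

Generator = Callable[[str, List[str]], str]  # (task, feedback so far) -> candidate
Check = Callable[[str], Optional[str]]       # candidate -> error message, or None


def generate_with_verification(
    task: str,
    generate: Generator,
    checks: List[Check],
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Return (accepted candidate, attempts used), or (None, max_attempts)
    when every attempt failed and a human should take over."""
    feedback: List[str] = []
    for attempt in range(1, max_attempts + 1):
        candidate = generate(task, feedback)
        errors = []
        for check in checks:
            message = check(candidate)
            if message is not None:
                errors.append(message)
        if not errors:
            return candidate, attempt
        feedback.extend(errors)  # the next attempt sees what went wrong
    return None, max_attempts


def stub_generate(task: str, feedback: List[str]) -> str:
    """Stand-in for a model call: wrong at first, fixed once it gets feedback."""
    if feedback:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b):\n    return a - b\n"


def add_test(candidate: str) -> Optional[str]:
    # exec() on generated code is unsafe outside a sandbox; used here only
    # because the stub's output is known in advance.
    namespace: dict = {}
    exec(candidate, namespace)
    if namespace["add"](2, 3) == 5:
        return None
    return "add(2, 3) should be 5"


code, attempts = generate_with_verification("write add", stub_generate, [add_test])
print(attempts)
```

The oversight part is the `None` return: when the loop gives up, the harness hands the task to a person instead of shipping unverified output.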
Furthermore, the whitepaper discusses the economic implications, noting that costs are driven more by configuration and token economy than by the model size. High upfront investments in design and context management can lead to lower ongoing costs and higher reliability.
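The trade-off can be worked through with simple arithmetic. In the Python sketch below every number is hypothetical (request volume, tokens per request, price, and upfront cost are not from the whitepaper). It only shows how an upfront investment in context engineering would pay back if it reduced tokens per request.

```python
from typing import Optional


def monthly_token_cost(
    requests: int, tokens_per_request: int, price_per_million_tokens: float
) -> float:
    """Monthly spend on tokens for a given request volume."""
    return requests * tokens_per_request * price_per_million_tokens / 1_000_000


def break_even_months(
    upfront_cost: float, monthly_cost_before: float, monthly_cost_after: float
) -> Optional[float]:
    """Months until the upfront investment is repaid by lower running costs.
    Returns None if the change does not reduce the monthly cost."""
    saving = monthly_cost_before - monthly_cost_after
    if saving <= 0:
        return None
    return upfront_cost / saving


# Hypothetical inputs: 100,000 requests per month at 5.00 per million tokens.
before = monthly_token_cost(100_000, 8_000, 5.0)  # untrimmed context
after = monthly_token_cost(100_000, 3_000, 5.0)   # engineered context
months = break_even_months(10_000, before, after)

print(before, after, months)  # 4000.0 1500.0 4.0
```

With these made-up inputs the investment pays back in four months. With a smaller reduction in tokens, or lower volume, it may never pay back, which is why the calculation is worth doing per system.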
The model is only 10%
A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.
It is the clearest map yet of how serious AI development works, and it is mostly tool-agnostic. But it is also a Google funnel: the concepts are neutral, while the on-ramps point to Gemini, Jules, and the ADK. If the harness is 90% and it is yours, your moat and your costs both live there. So own your scaffolding, route across models, and remember that AI amplifies whatever engineering culture it lands in.
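"Route across models" can be as small as a lookup table that the harness owns. The Python sketch below is a hypothetical illustration: `ModelRouter`, the task kinds, and the two lambda models are assumptions for this example and do not correspond to any vendor SDK.

```python
from typing import Callable, Dict

# Hypothetical type: any function that maps a prompt string to a completion.
ModelFn = Callable[[str], str]


class ModelRouter:
    """Maps a task kind to a named model, falling back to a default."""

    def __init__(
        self, models: Dict[str, ModelFn], routes: Dict[str, str], default: str
    ) -> None:
        referenced = list(routes.values()) + [default]
        unknown = sorted({name for name in referenced if name not in models})
        if unknown:
            raise ValueError(f"routes reference unknown models: {unknown}")
        self.models = models
        self.routes = routes
        self.default = default

    def pick(self, task_kind: str) -> str:
        # Unlisted task kinds go to the default model.
        return self.routes.get(task_kind, self.default)

    def complete(self, task_kind: str, prompt: str) -> str:
        return self.models[self.pick(task_kind)](prompt)


# Stand-in models; in practice each would wrap a different provider's client.
router = ModelRouter(
    models={
        "small": lambda prompt: f"small:{prompt}",
        "large": lambda prompt: f"large:{prompt}",
    },
    routes={"autocomplete": "small", "refactor": "large"},
    default="small",
)

print(router.complete("refactor", "rename variable"))
```

Because callers only name a task kind, changing which provider serves "refactor" is a one-line edit to the routing table and not a rewrite of the application.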
Impact of Harness and Context Engineering on AI Strategy
This development shifts the focus from model size and raw AI power to system configuration, harness design, and context management. For organizations, it means that building effective AI solutions depends more on how they structure and control AI interactions than on acquiring the latest, largest models. It also suggests that costs and reliability can be optimized through disciplined engineering practices, reducing long-term expenses and vulnerabilities.
As an affiliate, we earn on qualifying purchases.
Background of the Shift in AI Development Paradigms
Until early 2026, the industry largely equated AI performance with model size and complexity. Recent experiments and research, including this Google whitepaper, challenge this assumption, showing that configuration and engineering play a more decisive role. The paper builds on prior work emphasizing the spectrum of AI workflows, from vibe coding to agentic engineering, and highlights that cost and effectiveness are increasingly tied to system design.
Previous benchmarks and case studies demonstrate that even with the same models, performance and reliability can be vastly improved through better harness and context strategies, marking a significant evolution in AI development philosophy.
“The model constitutes only about 10% of what determines AI behavior; the harness and context engineering account for the rest.”
— Addy Osmani
Unanswered Questions About Practical Implementation
While the whitepaper provides strong evidence that harness and context are dominant factors, it remains unclear how these principles will be adopted across diverse industries and whether smaller organizations can implement such disciplined engineering at scale. The precise methods for optimizing harness and context in complex, real-world applications are still being developed.
Future Steps for AI Development and Industry Adoption
Expect ongoing research and case studies to refine best practices for harness and context engineering. Organizations are likely to invest more in system design, testing, and verification frameworks. Industry standards may emerge to guide disciplined AI development, emphasizing configuration over model size, with further benchmarks and tools to measure effectiveness.
Key Questions
Why is model size less important than previously thought?
The whitepaper shows that system configuration, harness, and context management have a far greater impact on AI behavior than the underlying model size. Experiments demonstrate that performance improvements often come from better setup rather than larger models.
How does this shift affect AI development costs?
Initial investments in design, testing, and context engineering may be higher, but long-term costs can decrease because ongoing token usage and maintenance are reduced. Proper configuration can lead to more reliable and cost-efficient AI systems.
What does this mean for organizations using AI today?
Organizations should focus on building robust harnesses and managing context rather than solely acquiring larger or newer models. This approach enables better control, reliability, and cost savings in AI deployment.
Will this change how AI benchmarks are measured?
Probably. Future benchmarks are likely to emphasize system configuration and engineering practices, not just raw model performance, reflecting the shift in what determines AI effectiveness.
Source: ThorstenMeyerAI.com