📊 Full opportunity report: Engineering Is Automated. Research Is the Residual. on ThorstenMeyerAI.com — validation score, market gap, and execution plan.
TL;DR
AI systems are now capable of automating most core engineering tasks in AI development, with research remaining the last frontier. Experts see this shift as potentially accelerating AI progress and transforming research practices.
Recent results across multiple AI benchmarks indicate that core engineering tasks in AI research are now largely automated, with systems nearing saturation in reproducing research, optimizing kernels, and competing in Kaggle-like challenges. If that holds, the bottleneck in AI progress may soon move from engineering to research itself, a development with significant implications for the field.
Thorsten Meyer, referencing Jack Clark’s analysis, notes that six key benchmarks measuring AI’s ability to perform tasks critical to AI R&D have shown rapid progress, approaching or reaching saturation. For instance, CORE-Bench, which assesses research reproduction, improved from 21.5% in September 2024 to 95.5% in December 2025, with some authors declaring it ‘solved.’ Similarly, MLE-Bench, which evaluates AI performance in Kaggle competitions, advanced from 16.9% to 64.4% in roughly 16 months and is now nearing mid-tier human performance.
Clark’s analysis suggests that the primary engineering challenges, such as reproducing research and optimizing hardware kernels, are now effectively automated. That leaves the residual challenge in research activities that may involve more creative or novel scientific inquiry. Multiple ongoing developments, including automated code generation and kernel design, indicate that AI is producing production-grade results in these engineering domains.
Engineering is automated.
Research is the residual.
Six skill benchmarks. Edison’s framing. The question Clark leaves open is whether research is just engineering at scale.
Jack Clark’s Import AI #455 catalogs six benchmarks measuring AI capability on AI R&D tasks and concludes “AI can today automate vast swatches, perhaps the entirety, of AI engineering.” The residual question is research. The structural read on the residual: it may not be a permanent moat.
Six skills. One trajectory.
Clark catalogs six benchmarks measuring AI capability on AI R&D-relevant tasks. Each individual benchmark could be noise. Six benchmarks moving together is a curve. The pattern is the same cascade observed across the broader Clark series, visible here in the specific R&D-skill domain.

Three data points. Mixed signal.
Clark provides three data points on the creative-spark question. Yes-evidence: Erdős-1051, centaur math discovery, sporadic Move-37-style moments. No-evidence: low yield, framing dependence, absence of acceleration. The mixed signal is the honest read.
The data supports two readings. Pessimistic: rare moments suggest creative insight is qualitatively distinct from engineering work. Optimistic: rare moments are an artifact of low-volume exploration; more shots on goal yield more discoveries. Both readings are consistent with Clark’s “vast swatches, perhaps the entirety” claim. They differ on the residual.
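The "more shots on goal" reading can be made concrete with a minimal sketch. It assumes each attempt is independent and has the same small chance of producing a discovery, which is an assumption of the optimistic reading and not something Clark's data establishes. The per-attempt rate below is a hypothetical placeholder, not a measured value.

```python
def p_at_least_one(p: float, n: int) -> float:
    """Probability of at least one discovery in n independent attempts."""
    return 1.0 - (1.0 - p) ** n


def expected_discoveries(p: float, n: int) -> float:
    """Expected number of discoveries across n independent attempts."""
    return p * n


# Hypothetical per-attempt discovery rate, chosen only for illustration.
HYPOTHETICAL_RATE = 0.001

for attempts in (100, 1_000, 10_000):
    chance = p_at_least_one(HYPOTHETICAL_RATE, attempts)
    expected = expected_discoveries(HYPOTHETICAL_RATE, attempts)
    print(f"{attempts:>6} attempts: expected {expected:.1f}, P(at least one) = {chance:.3f}")

# Pessimistic reading: if the per-attempt rate for a class of insight is
# effectively zero, scaling the number of attempts changes nothing.
print(p_at_least_one(0.0, 10_000))
```

Under the optimistic assumption, volume alone turns rare moments into routine ones. Under the pessimistic one, the rate itself is the constraint, and no amount of volume helps.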
Five dimensions Clark gestures at but leaves underdeveloped.
Clark’s section is rigorous on the empirical evidence. Five strategic dimensions matter for the institutional response, which the Clark series synthesis argues is structurally inadequate.
Two readings. Different equilibria.
The structural question Clark leaves open: is research a permanent moat that bounds automated AI R&D, or is it engineering at scale that dissolves with more shots on goal? Both readings are consistent with the current data. They differ by orders of magnitude in consequences.
[Figure: comparison of the two readings. Recoverable labels: Productivity multiplier years; Recursive loop operational.]
Five audiences. Asymmetric cost of being wrong.
The institutional response should not bet on inspiration being a permanent moat. If the distinction holds, capacity built is still useful. If it closes, capacity is necessary. Asymmetric cost-of-being-wrong points toward building now.
The five audiences: industry, academia, policymakers, investors, and everyone else.
Engineering is automated. The residual is the question. The institutional response should not bet on inspiration being a permanent moat.
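The asymmetric cost-of-being-wrong argument can be sketched as a small decision matrix. The costs below are hypothetical, unitless values chosen only to show the shape of the asymmetry: building capacity costs a little in either world, while waiting costs nothing if the moat holds and a great deal if it closes. The action and scenario names are illustrative labels, not terms from Clark's analysis.

```python
# Hypothetical, unitless costs; only their relative sizes matter here.
COSTS = {
    ("build", "holds"): 1.0,   # capacity built but not strictly needed, still useful
    ("build", "closes"): 1.0,  # capacity built and needed
    ("wait", "holds"): 0.0,    # nothing spent, nothing missed
    ("wait", "closes"): 10.0,  # caught without the necessary capacity
}


def expected_cost(action: str, p_closes: float) -> float:
    """Expected cost of an action given the probability that the moat closes."""
    return (1.0 - p_closes) * COSTS[(action, "holds")] + p_closes * COSTS[(action, "closes")]


def worst_case(action: str) -> float:
    """Largest cost the action can incur across both scenarios."""
    return max(COSTS[(action, scenario)] for scenario in ("holds", "closes"))


def break_even_probability() -> float:
    """Probability of the moat closing at which building and waiting cost the same."""
    build_holds, build_closes = COSTS[("build", "holds")], COSTS[("build", "closes")]
    wait_holds, wait_closes = COSTS[("wait", "holds")], COSTS[("wait", "closes")]
    # Both expected costs are linear in p_closes, so solve for the crossing point.
    return (build_holds - wait_holds) / ((wait_closes - wait_holds) - (build_closes - build_holds))


print("worst case, build:", worst_case("build"))
print("worst case, wait: ", worst_case("wait"))
print("break-even P(moat closes):", break_even_probability())
```

With these illustrative numbers, building is the cheaper choice whenever the chance of the moat closing exceeds one in ten, and it always has the smaller worst case. Different cost assumptions move the threshold, but not the direction of the asymmetry.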
Implications of Automated Engineering for AI Development
This development could dramatically accelerate AI research and deployment by reducing the time and cost associated with engineering tasks. It also raises questions about the future role of human researchers, potentially shifting their focus toward innovative scientific exploration rather than routine engineering. The transition may influence institutional strategies, funding, and the pace of AI breakthroughs, making understanding and adapting to this shift critical for stakeholders across the field.
Progression of AI Capabilities in Engineering Tasks
Over the past two years, multiple benchmarks have tracked AI’s ability to perform core engineering tasks. CORE-Bench, which assesses research reproduction, improved from 21.5% in September 2024 to 95.5% by December 2025. MLE-Bench, which measures Kaggle competition performance, rose from 16.9% to 64.4% in roughly 16 months. Concurrently, advances in kernel design, such as automated CUDA conversion and optimized GPU kernels, indicate that AI is producing production-grade engineering solutions. These trajectories suggest that the engineering side of AI R&D is nearing full automation, leaving research as the remaining challenge.
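As a rough check on the pace these figures imply, the arithmetic can be sketched in a few lines of Python. The scores and the CORE-Bench dates are the ones quoted above. The helper names are illustrative, and the 16-month window for MLE-Bench is taken from the "roughly 16 months" cited earlier, so its rate is approximate.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def points_per_month(start_score: float, end_score: float, months: float) -> float:
    """Average gain in percentage points per month."""
    return (end_score - start_score) / months


# CORE-Bench: 21.5% in September 2024 to 95.5% in December 2025.
core_months = months_between(date(2024, 9, 1), date(2025, 12, 1))
core_rate = points_per_month(21.5, 95.5, core_months)

# MLE-Bench: 16.9% to 64.4% over an approximate 16-month window.
MLE_MONTHS_APPROX = 16
mle_rate = points_per_month(16.9, 64.4, MLE_MONTHS_APPROX)

print(f"CORE-Bench: {core_months} months, {core_rate:.1f} points per month")
print(f"MLE-Bench:  {MLE_MONTHS_APPROX} months, {mle_rate:.1f} points per month")
```

On these numbers, CORE-Bench gained roughly 4.9 percentage points per month and MLE-Bench roughly 3.0. A linear average is only a summary: scores are capped at 100%, so gains necessarily flatten as a benchmark nears saturation.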
“The pattern across these benchmarks indicates that core engineering tasks are approaching saturation, effectively automating much of the work traditionally done by human researchers.”
— Thorsten Meyer
Uncertainties About AI’s Research Capabilities
While engineering tasks are nearing full automation, it remains unclear how much of AI research, such as scientific discovery, hypothesis generation, and novel theory development, can be automated. Clark leaves open whether research is itself a form of large-scale engineering; if it is, automation could extend to this domain as well. The precise limits of AI’s research capabilities are still under investigation, and the timeline for full automation remains uncertain.
Next Steps in AI Automation and Research Development
Researchers and institutions are likely to focus on advancing AI’s research capabilities, exploring how automation can support scientific discovery beyond engineering tasks. Monitoring ongoing benchmark developments and new AI applications in research settings will be critical. Additionally, discussions around ethical implications, oversight, and the future role of human researchers will intensify as automation continues to expand into more complex scientific activities.
Key Questions
What are the main engineering tasks now automated by AI?
Core research reproduction, hardware kernel design, and performance optimization are among the tasks AI systems are now capable of automating at near-human or better levels.
How close are we to automating all of AI research?
While engineering tasks are approaching full automation, the automation of creative, hypothesis-driven research remains uncertain and is still under active investigation.
What are the implications for human researchers?
If engineering is fully automated, researchers may shift focus toward high-level scientific inquiry and innovation, potentially accelerating AI development but also raising questions about the future of scientific labor.
Could this shift impact AI development timelines?
Yes. Automating engineering tasks could significantly shorten development cycles, enabling faster iteration and deployment of AI systems. The overall timeline still depends on progress in automating research activities.
Source: ThorstenMeyerAI.com