Why Agentic AI Needs a Context-Based Approach
Why Agentic AI Needs a Context-Based Approach
Homepage   /    business   /    Why Agentic AI Needs a Context-Based Approach

Why Agentic AI Needs a Context-Based Approach

Chris du Toit,Frederic Lardinois 🕒︎ 2025-11-07

Copyright thenewstack

Why Agentic AI Needs a Context-Based Approach

The conversation around AI in engineering is shifting rapidly, moving past mere autocompletion toward full-scale reasoning and automation in engineering work. The future of development is increasingly defined by AI agents. This transition is already yielding concrete tools, such as AI agents dedicated to code review and comprehensive integrations. However, this exciting agentic shift is haunted by a productivity paradox. We see impressive benchmark scores for AI models alongside anecdotal reports of substantial helpfulness across various tasks. Yet, when these systems are subjected to rigorous scrutiny in real-world scenarios, the results are mixed or even negative. The evidence regarding AI capabilities is partially contradictory. While many people report finding AI very helpful for substantial software tasks, randomized controlled trials suggest a significant slowdown in specific, high-stakes development settings. This contradiction forces a critical realization: To overcome the current limitations and prevent the accumulation of costly technical debt, the key to building more effective and reliable AI agents lies in moving beyond vague prompting — a practice we call “vibe coding”— and instead embracing context engineering. We must enforce pervasive, rigorous context awareness to guide the agents’ creativity. The Current Challenges of Agentic AI Adoption The Real-World Productivity Slowdown The most striking evidence challenging the prevailing optimism comes from empirical research. A randomized controlled trial (RCT) designed to measure the impact of early 2025 AI tools on experienced open source developer productivity found a surprising core result: When developers were allowed to use frontier AI tools (such as Cursor Pro with advanced models like Claude 3.5/3.7 Sonnet), they took 19% longer to complete issues than when working without AI assistance. This represented a significant slowdown. The conditions of this trial were designed specifically to capture real-world usefulness, involving experienced developers working on real issues (bug fixes, features, refactors) in large, familiar repositories they had contributed to for years. Crucially, the definition of success required the human user to be satisfied that the code would pass review, including requirements related to documentation, style and testing coverage. This finding clashes sharply with developer perceptions. The developers expected the AI to speed them up by 24% and, even after experiencing the slowdown, they still believed AI had sped them up by 20%. This gap between perception and reality is striking and suggests that self-reports of speedup can be inaccurate and overoptimistic. Furthermore, the slowdown persisted in settings with very high quality standards or many implicit requirements, such as documentation or linting, that take humans substantial time to learn and enforce. The Pitfall of Vibe Coding and Technical Debt The root cause of this real-world slowdown is often a fundamental disconnect between the flexibility of large language models (LLMs) and the rigor required by production systems. This disconnect manifests as vibe coding. Vibe coding is defined as generating code through AI based on loose, informal prompts, such as typing, “make a login form that looks cool,” rather than providing precise, structured specifications. The AI readily fills in the blanks based on training patterns, leading to rapid initial development. This practice represents the logical extension of dynamic typing principles into the AI age. Both dynamic typing (think classic JavaScript or Python) and vibe coding prioritize immediate development speed and flexibility over upfront rigor, and both paradigms accumulate technical debt that must eventually be paid with interest. The hidden costs of vibe coding are substantial and critical: Security risks: AI may replicate insecure patterns from its training data, omitting crucial validation mechanisms like cross-site request forgery protection, error handling or cross-site scripting protection. Loss of context: AI cannot explain its reasoning, meaning the essential “why” behind the implementation is lost. Developers may skip human documentation, leading to difficulty for new team members. Maintenance nightmares: Generated code often contains unnecessary complexity or redundant logic, and refactoring becomes difficult because changes may break hidden dependencies. Deferred validation: Errors are pushed to runtime, potentially surfacing in production, rather than being caught immediately by a compiler or validator. If organizations scale AI without addressing this issue, they risk vibe coding their way into a tsunami of technical debt. Limitations of Uninformed Agents The impressive success of fully autonomous agents on benchmarks like SWE-Bench Verified or RE-Bench, where they may use complicated scaffolds or sample millions of tokens, often obscures their limitations in practical use. These benchmarks typically sacrifice realism for scale, relying on self-contained tasks and algorithmic scoring that do not capture the complexity of real-world requirements, prior context or human satisfaction (including style and architecture). It is important to understand that while these models demonstrate high capability under maximal elicitation (millions of sampling tokens), this capability does not easily translate to impact in the wild or complex, real-world usefulness. Furthermore, researchers argue that if AI systems were able to substantially speed up developers in these realistic RCT settings, it could signal a rapid acceleration of AI R&D progress, which might in turn lead to significant destabilizing risks, including proliferation risks and breakdowns in safeguards and oversight. Context Engineering: The Solution for Smarter Agents The key to unlocking AI’s full potential without drowning in technical debt is to integrate the agent deeply into the development ecosystem, a process we define as context engineering. Personalization Through Context Awareness AI platforms must be more than generic chat prompts. Context-aware AI platforms (such as Tabnine) provide highly relevant code and recommendations by understanding the developer’s specific applications, requirements and workflow. This personalization allows the AI to function like a truly onboarded member of the development team. Personalization can occur across multiple levels, ensuring that the AI has access to proprietary noncode knowledge essential for making architecturally sound decisions. Maximizing Agent Effectiveness With Data Connection To ensure agents generate effective and safe code, they require both immediate and enhanced context: Immediate context (IDE): Agents must use all available data directly from the developer’s IDE. This includes crucial information like variable types, comments, open files, imported packages and libraries, enabling the AI to provide relevant code and recommendations “out of the box.” Enhanced context (organization): Agent performance is significantly enhanced by connecting the system to additional information sources essential for the project, such as the organization’s codebase, requirements, documentation and tools like ticketing systems (Atlassian Jira) and documentation repositories (such as Confluence). Shifting From Vibe Coding to Context Engineering The strategic use of context allows developers to replace vague requests with rigorous specifications, fundamentally changing the interaction model with the agent: Structured specifications: Developers must move beyond simple requests, like “Add user management,” and instead use context-engineered prompts that function as structured specifications. These specifications explicitly define critical requirements, such as using TypeScript interfaces, defining PostgreSQL schema, specifying input validation, mandating rate limiting, requiring audit logging and setting targets for unit testing coverage (for example, >80% coverage). Implementing guardrails: Organizations must formalize their unique best practices, policies and engineering standards and convert them into explicit rules that control how the AI agent behaves. These rules must be enforced both within the IDE while the developer is working and again during the pull request or code review stage. Hybrid workflow: The most sustainable workflow demands a disciplined hybrid approach. The human role involves upfront rigor: defining types and interfaces, and crucially, writing test cases before generation. This is followed by the context-engineered prompt, which guides the AI generation. The AI-generated code is then subjected to necessary verification steps, including static type checking, running tests, security scans and, finally, systematic human code review and refactoring. The Importance of Enterprise Context The true frontier of agentic AI is not just context — it’s enterprise context. While most AI assistants operate within the narrow boundaries of a single file or repository, enterprise-grade engineering requires agents that understand the entire system of work. This includes not just the codebase, but the implicit architecture, compliance policies, deployment pipelines and the organizational intent behind every change request. Context as Architecture, Not Memory In an enterprise, context is not a temporary state or a chat history — it is the living architecture of the organization’s intellectual property. Every service, interface and schema represents a contract between systems and teams. When agents operate without access to this architecture, they make decisions in a vacuum. The result is syntactically correct but semantically wrong code: logic that “works” locally but violates global consistency. Context engineering, therefore, is about encoding this architectural awareness into the agent’s reasoning layer, ensuring every generation aligns with the organization’s design intent. Connecting the Organizational Graph Enterprise systems are webs of relationships — between services, tickets, documentation and people. Effective agents must navigate this graph. By connecting to tools like Jira, Confluence and internal Git repositories, an agent gains the same situational awareness a senior engineer has: the “why” behind a feature, the trade-offs that were already debated and the dependencies that constrain implementation. This deep integration allows the agent to reason in the same semantic space as the organization itself, not just within the syntactic space of code. Governance and Traceability by Design Enterprise context also introduces the foundation for governance at scale. When every AI action is tied to explicit standards, design documents and issue references, compliance becomes a built-in property of the workflow, not an afterthought. This ensures that code generation is not just fast, but accountable. Auditability, provenance tracking and explainability emerge naturally when agents operate within structured, context-rich environments. Each generated function can be traced back to the originating business requirement, the associated Jira ticket and the policies that guided its design. Context as the New Source of Leverage In the same way the compiler once became the engineer’s amplifier, enterprise context is now the agent’s amplifier. Context transforms generic intelligence into organizational intelligence. It lets the same model behave differently for a fintech firm enforcing PCI-DSS compliance, a telecom enforcing 3GPP standards or a healthcare company working under HIPAA constraints — not by retraining the model, but by surrounding it with the organization’s contextual fabric. This is the core of sustainable acceleration: AI that codes not just fast, but correctly, securely and in compliance with the company’s DNA. Conclusion: From Vibe Coding to Enterprise Context The tension between the convenience of rapid iteration and the necessity of architectural rigor is not new in software development. The evolution from dynamic typing to gradual typing (like TypeScript) offers crucial lessons: While convenient shortcuts like vibe coding offer immediate speed, they exact a heavy price in complexity and maintenance down the line. Sustainable software practices require discipline, sophisticated tools and accumulated experience. For software teams today, the path forward is clear: The most successful teams will leverage agents for rapid iteration while rigorously maintaining quality standards. By guiding AI with wisdom earned from decades of software experience — ensuring clear types, strong tests and structured specifications — we can transition from hoping for a speedup to architecting reliable software. Vibe coding offered a seductive shortcut — fast, expressive and seemingly creative. But it traded away the very structures that make enterprise software reliable. Just as untyped languages once forced the industry to confront hidden costs in debugging and maintenance, context-free AI generation is now exposing its own debt: code that compiles, but doesn’t belong. The next frontier is not just context engineering, but enterprise context engineering, a discipline that binds AI creativity to organizational truth. In this model, context is not an ephemeral memory window or a clever prompt trick; it’s an enterprise fabric. It includes type definitions, codebases, test frameworks, design patterns, compliance policies, business logic and the institutional history encoded across Jira tickets, Confluence pages and internal APIs. The goal is not faster code, but truer code — software that embodies the enterprise’s intent, operates within its guardrails and delivers value reliably over time. Context engineering is how we move from vibe-driven generation to governed creation. Enterprise context is how we ensure the agents we build today become not just accelerators of work, but stewards of the systems that define tomorrow.

Guess You Like