# **Vertical Slice Architecture as a Foundation for Agent-Based Software Engineering: A Comparative Analysis and Governance Framework**

## **1. Executive Summary**

This report concludes that Vertical Slice Architecture (VSA) within a Modular Monolith (MMA) is a **strong, conditional fit** for AI agent-based software development. It is arguably the most promising architectural foundation for current-generation AI coding agents because it directly mitigates their most significant operational bottleneck: the finite "context window" and the resulting challenge of "context engineering". VSA's high-cohesion, feature-based slices provide a "pre-packaged," minimal, and highly relevant context for an agent to perform a specific task. This transforms the engineering challenge from an unbounded repository-search problem into a well-defined, localized one.

However, this fit is conditional on two critical factors:

1. The adoption of a **multi-agent system (MAS)**, where specialized "Developer Agents" are sandboxed within individual slices, while "Architect Agents" manage shared components.
2. The implementation of **strict, programmatic governance** at the module "seams", likely using Policy-as-Code (PaC) frameworks like Open Policy Agent (OPA). This is essential to prevent agents from creating high-risk coupling, particularly in any SharedKernel.

VSA is not a panacea; it creates new challenges for tasks requiring global, cross-repository reasoning, such as large-scale refactoring. Nonetheless, it is the superior choice for the vast majority of day-to-day development work: iterative, feature-based, and autonomously testable changes.

## **2. The Agent's Dilemma: Architecture as the Key to Context**

To evaluate an architecture's fitness for AI agents, it is first necessary to understand how those agents operate and, more importantly, how they fail.
The primary constraints on current AI agents are not their ability to write code but their ability to understand *what* code to write and *where* to write it.

### **2.1 The Number One Job of Agent Engineering: Context**

The central thesis of modern agent engineering is that "context engineering is effectively the #1 job". An agent's success is determined less by the raw intelligence of its underlying Large Language Model (LLM) and more by the *information* provided in its "working memory" or context window. This context is a finite resource, analogous to a computer's RAM. It includes not just the initial prompt but the entire conversation history, retrieved documents (via Retrieval-Augmented Generation, or RAG), and the output of any tool calls.

The problem is that this window is tiny relative to the problem space. A large context window might be 1 million tokens, but a typical enterprise monorepo can span *millions* of tokens across thousands of files. This massive gap between the agent's working memory and the repository's size means that *all* large-scale software engineering tasks—be it feature development, bug fixing, or refactoring—are fundamentally *information retrieval* problems. An agent cannot read the whole repository; it must be *given* the right files. Poor context engineering leads to "context rot," where irrelevant or poorly presented information causes even advanced models' performance to crater.

Therefore, the ideal software architecture for an AI agent is one that makes the context-engineering job as easy as possible. The architecture itself must function as a high-efficiency retrieval and scoping mechanism, presenting the agent with a minimal, complete, and relevant set of files for any given task.

### **2.2 How AI Software Agents Actually Operate (and Fail)**

Current autonomous software engineering (SWE) agents function as a Partially Observable Markov Decision Process (POMDP). They exist in a slow, expensive loop:

1. **Plan:** Analyze the task and create a "modification plan" or "refactoring plan".
2. **Act:** Execute an action, such as `edit --file path/to/file`, read documentation, or run a test.
3. **Observe:** Receive an observation (e.g., a test failure, a compilation error, or new file content).
4. **Update:** Update the plan based on the new observation, and repeat.

This loop is fraught with failure points that are directly linked to context and architecture.

* **The Localization Bottleneck:** A primary failure mode is *inefficient localization*. Academic studies show agents spend "excessive exploration time" just trying to *find* the candidate files to edit. If an agent cannot reliably find the correct files, it cannot even begin the task.
* **The Planning Fallacy:** The plan is critical. Research on multi-agent systems shows that "argumentative rigor" and debate are necessary to produce "architecturally sound solutions". An agent that just starts editing without a plan will fail. This planning phase is, again, a context-retrieval problem—the agent is trying to build a mental model of the code it is about to change.
* **Iteration and Failure Loops:** Agents *can* perform multi-file edits and rely on test-based feedback loops. A common pattern is for an agent to edit a file, create a new test (e.g., `reproduce_issue.py`), and run it. However, they are brittle. One case study details an agent failing on a `KeyError`. It then "switches to an alternative strategy", which also fails, leading to a "loop of failures" and eventual timeout. The agent's inability to self-correct from a simple error is a critical weakness.
* **Refactoring and Architectural Blindness:** Agents demonstrate a significant lack of "architectural awareness". Research on agentic refactoring found that agents often *fail* to reduce the overall count of design smells, even when they successfully complete a task.
They make "flawed assumptions" about code structure—for instance, assuming all `<dl>` tags in an HTML document serve the same purpose—and fail to see the larger architectural picture.

These failures all trace back to context starvation. The agent is "blind." It fails at localization because it has no map of the repository. It lacks architectural awareness because it cannot *see* the architecture. It needs a plan because it is building a mental model from scratch *every single time*. The ideal architecture would provide the map and the plan as part of the code's structure itself.

### **2.3 The "Team" Model: Multi-Agent Systems (MAS)**

The industry is rapidly moving away from single-agent "copilots" and toward multi-agent frameworks like AutoGen and MetaGPT. These systems use "conversation programming" to allow multiple agents to collaborate, each with a specific role. In software engineering, these roles are becoming explicit:

* **Developer Agent:** Writes feature code.
* **Architect Agent:** Makes high-level design decisions.
* **Project Manager Agent:** Decomposes tasks.
* **Tester Agent:** Writes and runs validation.
* **Security Specialist Agent:** Audits for vulnerabilities.

This multi-agent debate is proven to find "architecturally sound solutions". For example, a "Performance Agent" might implement caching, but a "Security Agent" can intervene to enforce input validation and rate-limiting on that new cache. If the *agents* are being organized as a team, the *software architecture* must be organized to support this team structure. A monolithic agent trying to understand a monolithic layered codebase is a recipe for failure. The architecture must be *partitionable*, providing clear "boxes" that can be assigned to specific agent roles.

## **3. Vertical Slice Architecture: A Natively Agent-Friendly Scaffolding**

Vertical Slice Architecture (VSA) emerges as a prime candidate precisely because it is, by design, a partitionable architecture that directly addresses the agent's context and localization problems.

### **3.1 Core Principles of VSA**

VSA structures an application by *features* (vertical slices) rather than *technical layers* (horizontal). Traditional layered architectures (e.g., N-Tier) are coupled horizontally. A single feature change, like adding a new field to a form, requires touching *all* layers: the controller, the service, the domain model, the data access object, and the validator.

VSA, by contrast, "couples along the axis of change". It co-locates all the code for a single feature *together* in one place. The prime directive of VSA is to **"minimize coupling *between* slices, and maximize coupling *in* a slice"**. This is often implemented using "feature folders" and patterns like Command Query Responsibility Segregation (CQRS) and Request-EndPoint-Response (REPR), where a single slice maps directly to a single API request.

### **3.2 How VSA Properties *Help* AI Agents (The "Perfect Context Package")**

When viewed through the lens of an AI agent's limitations, VSA's properties offer a near-perfect set of solutions.

* **Solving Context Scoping:**
  * **Problem:** Agents have a finite context window.
  * **VSA Solution:** A single vertical slice *is* the minimal, complete context. An agent tasked with "POST /products" can be pointed at the `Features/Products/CreateProduct` folder. This folder contains the request DTO, the response DTO, the command handler, the validator, and the data access logic. This is a *perfect context package*, likely small enough to fit entirely within a single context window, providing the agent with everything it needs and *nothing* it doesn't.
* **Solving Localization:**
  * **Problem:** Agents waste time and tokens on file localization.
  * **VSA Solution:** VSA is *natively* localized. The agent doesn't *need* to find the files; the architecture *tells* it where they are. The task "fix bug in product creation" maps 1:1 to the `Features/Products/CreateProduct` slice. This drastically reduces the search space from O(n) to O(1).
* **Solving Safe Iteration:**
  * **Problem:** Agents need a fast, reliable feedback loop (e.g., running tests).
  * **VSA Solution:** VSA promotes slice-level tests. An agent can be sandboxed within its slice, modify the handler, run the tests *in that same slice*, observe the failure or success, and iterate. This is "CI-backed validation" at a micro-level, providing a tight, autonomous, and low-risk feedback loop.
* **Solving the "Plan":**
  * **Problem:** Agents must generate a plan before acting, a complex and error-prone step.
  * **VSA Solution:** The VSA architecture *is* the plan. A new feature *always* follows the same pattern: create a new folder, add a `Request.cs`, `Handler.cs`, and `Endpoint.cs`. The agent's plan becomes trivial—a fill-in-the-blanks exercise rather than a complex, exploratory discovery process. The VSA motto "new features only add code" is an agent's dream, as adding new, isolated code is far safer than modifying complex, shared code.

### **3.3 How VSA Properties *Hinder* AI Agents (The Global Reasoning Problem)**

VSA is not without drawbacks, and these are exacerbated in an agent-based model. The most common misconception about VSA is that it means "share nothing". This is false. Any non-trivial application will have cross-cutting concerns (like logging, auth, and validation) and shared domain models. The standard solution is to create a Shared or Kernel module that contains this common logic. And here lies the new bottleneck: VSA intentionally *hides* global context to optimize for feature work.
This optimization is excellent for a "Developer Agent" working in its slice, but it *blinds* the "Architect Agent". A task that requires *global reasoning*—such as "migrate all data access from EF Core to Dapper", or a massive ETL refactor like the one at Nubank—becomes exceptionally difficult. An agent can no longer look at a single Infrastructure layer to see all data access; it must now inspect *every single slice* to find its individual data access implementation. VSA's low inter-slice coupling makes systemic dependencies opaque. VSA therefore optimizes for "Developer Agents" at the direct *expense* of "Architect Agents."

## **4. Comparative Analysis: Architectures from an AI Agent's Perspective**

VSA's fit becomes clearer when contrasted with traditional architectures, analyzed from the agent's point of view.

### **4.1 Classic Layered Architecture (N-Tier)**

A layered architecture organizes code by technical responsibility (e.g., Presentation, BusinessLogic, DataAccess).

**Agent-Friendliness: *Hostile*.** This architecture is the antithesis of what an agent needs. A simple feature change (e.g., "add a 'SKU' field to 'Product'") forces the agent to perform multi-file edits across *horizontally coupled* layers.

* **Context-Scoping:** To edit the `ProductService`, the agent must load `ProductService.cs`, a file that may contain *dozens* of other, *irrelevant* methods for other features. This is a massive source of "context poisoning" and "context rot".
* **Safety & Drift:** The risk of side effects is *maximal*. An agent, trying to be efficient, will follow the path of least resistance: it will add business logic directly to the `ProductController` to avoid the seven-file hop, leading to rapid and irreversible architectural drift.

### **4.2 Hexagonal / Clean Architecture (CA)**

Clean Architecture (an "onion" architecture) focuses on isolating the Domain/Application "core" from "infrastructure" (UI, DB) via "Ports and Adapters".
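The "Ports and Adapters" idea can be sketched in a few lines. The following is an illustrative Python analogue (the `ProductRepository` port and the in-memory adapter are hypothetical names, not from any specific codebase); the point is that the core logic depends only on the port, so "implement another adapter" is a cleanly scoped task:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass
class Product:
    sku: str
    name: str


class ProductRepository(Protocol):
    """The 'port': an explicit contract the domain core depends on."""

    def save(self, product: Product) -> None: ...
    def get(self, sku: str) -> Product | None: ...


class InMemoryProductRepository:
    """An 'adapter' satisfying the port. A real adapter (e.g. for
    PostgreSQL) would implement the same two methods, which is why the
    task 'implement this adapter' is so well-scoped for an agent."""

    def __init__(self) -> None:
        self._items: dict[str, Product] = {}

    def save(self, product: Product) -> None:
        self._items[product.sku] = product

    def get(self, sku: str) -> Product | None:
        return self._items.get(sku)


# Core logic is written against the port, never against an adapter.
def register_product(repo: ProductRepository, sku: str, name: str) -> Product:
    product = Product(sku=sku, name=name)
    repo.save(product)
    return product


repo = InMemoryProductRepository()
p = register_product(repo, "SKU-1", "Widget")
```

The contract is explicit and swappable, which is exactly the property the next paragraph credits to CA.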
**Agent-Friendliness: *Conditional Fit / Deceptively Difficult*.** At first glance, CA seems agent-friendly. The ports (interfaces) provide *explicit, clear contracts*. A task can be clearly defined: "Implement the `IProductRepository` adapter for PostgreSQL." This aligns well with plugin-style development and agent-based tool use.

The *implementation*, however, is an agent's nightmare. CA *mandates* file scattering and high abstraction. A simple feature often requires opening six or seven files: a Command, a Handler, a Validator, a DTO, a Repository interface, a Domain entity, and so on. This creates "excessive abstraction" and "too much context-switching". For an agent, this is a path-tracing disaster: it must jump between six or seven files just to trace the logic, blowing its context window and maximizing the chance of localization failure.

### **4.3 Microservices Architecture (MSA)**

MSA consists of small, independently deployable services.

**Agent-Friendliness: *Theoretically Ideal, Practically Impossible (Today)*.** A single microservice is the *ultimate* isolated context. An agent working on the "Product" service has a perfectly sized, high-cohesion codebase. It is the VSA-slice concept taken to its logical, deployed conclusion.

The problem is that almost *no* real-world feature lives in a single microservice. The agent's task ("add a 'Product' to a 'Cart'") now requires reasoning about network boundaries, asynchronous communication, data consistency, and API versioning. Current agents are "handymen," not "city planners". An AI agent that can *autonomously* and *safely* orchestrate a multi-service, asynchronous, event-driven workflow simply does not exist in a reliable form today.

This analysis leads to the **Modular Monolith (MMA)** as the pragmatic sweet spot. It provides the *module-level isolation* of microservices without the *network-level complexity*. Agents can communicate across modules via simple, in-process calls, which is a tractable problem.

### **4.4 Table: Architecture vs. Agent-Friendliness**

The following table synthesizes this analysis, evaluating each architecture against criteria critical to agent-based development.

| Criteria | Classic Layered Architecture | Hexagonal / Clean Architecture | Microservices Architecture | Vertical Slice Arch. (in Modular Monolith) |
| :---- | :---- | :---- | :---- | :---- |
| **Context-Scoping Efficiency** | **Very Low.** Agent must load large, low-cohesion layer files (e.g., `ProductService.cs`) full of irrelevant code. | **Low-to-Medium.** Agent must load 5–7 high-abstraction files, scattering context. Good contract-scoping, bad logic-tracing. | **Very High (Per-Service).** The service is a perfect, isolated context. | **Very High (Per-Slice).** The slice folder is a perfect, minimal, and complete context for one feature. |
| **Locality of Change** | **Very Low.** A single feature change is scattered across many files in many directories. | **Low.** A single feature is scattered (Command, Handler, Interface, Adapter). | **Very High (In-Service). Very Low (Cross-Service).** A feature touching 2+ services is a distributed-systems problem. | **Very High.** All code for one feature is co-located in one folder. "New features only add code". |
| **Ease of Contract Definition** | **Low.** Contracts are implicit (method calls) and buried in large service classes. | **Very High.** The *entire point* is explicit contracts (Ports/Interfaces). Very "pluggable". | **Very High.** Contracts are explicit, versioned, and network-enforced (e.g., OpenAPI, gRPC). | **High.** Contracts are explicit at the slice entry point (e.g., REPR DTOs, OpenAPI specs). |
| **Risk of Unintended Side Effects** | **Very High.** Changing a shared service method can break *dozens* of unrelated features that also call it. | **Medium.** The Domain/Application core is protected, but bugs in Adapters can have side effects. High abstraction can *hide* side effects. | **Low (Inter-service). High (Intra-service).** Service boundaries are strong; side effects are contained *within* the service. | **Low (Inter-slice).** By design, slices are isolated. "New features only add code". Risk is concentrated in the SharedKernel. |
| **Policy Enforcement Granularity** | **Low.** It's all-or-nothing; you can't easily set a policy for one part of the `ProductService`. | **Medium.** Policies can be applied at the "Port" (interface) boundary. | **Very High.** Policies are applied at the *network level* (e.g., API Gateway, Service Mesh). | **High.** Policies can be applied *per slice* (at its endpoint) or *per module* (at its public API). |
| **Global Refactoring Difficulty** | **Low.** It's all one "blob." An "Architect Agent" can see everything; a global find-and-replace is easy (though unsafe). | **Medium.** High abstraction makes tracing dependencies hard, but they are explicit at the core. | **Very High.** A global refactor is a *multi-year, distributed-systems project*. Agents cannot do this today. | **High.** VSA intentionally *hides* global dependencies. A global refactor requires inspecting *every single slice*, which hinders an "Architect Agent". |

This comparison shows that VSA (in an MMA) wins decisively on all criteria related to *iterative feature development*. Its only significant drawback is the difficulty of *global refactoring*—the exact trade-off human teams already make.

## **5. Governance and Safety: VSA inside a Modular Monolith**

An architecture is not just a diagram; it is a system of control. For AI agents, which are prone to "autonomy drift", this control system is paramount. The VSA+MMA combination provides a powerful and fine-grained surface for programmatic governance.

### **5.1 The MMA as an Enforceable "Sandbox"**

A Modular Monolith (MMA) organizes the application into "independent modules with well-defined boundaries" that are "independent and interchangeable". This is distinct from VSA.
VSA is a pattern *within* a module; the MMA is the *enclosure* for many such modules. The key to agent safety is that these module boundaries can be *programmatically enforced*, not just suggested.

1. **Data-Level Enforcement:** Assign each module its *own database schema* and a *dedicated database role*. An agent working in the Orders module, using the `orders_role`, *literally cannot* select, join, or modify tables in the Shipping schema. This eliminates an entire class of agent-initiated data-corruption bugs.
2. **Code-Level Enforcement:** Using static analysis or compiler rules, enforce that modules *only* communicate via "well-defined contracts/interfaces". An agent in `ModuleA` is programmatically *blocked* from calling an internal implementation-detail class in `ModuleB`.

This creates a high-trust sandbox for an agent to operate in.

### **5.2 VSA as the "Policy Enforcement Surface"**

While the MMA provides coarse-grained control (module-to-module), VSA provides the fine-grained seams *within* a module that are perfect for "triggered governance". Each vertical slice has a clear, machine-readable contract: its Request DTO and its automatically generated OpenAPI definition. This structured data is the perfect input for a Policy-as-Code (PaC) engine like Open Policy Agent (OPA). OPA is a "general-purpose policy engine" that "decouples policy decision-making from policy enforcement".

A practical agent-governance workflow would be:

1. An AI agent attempts to commit a change to the `Features/Auth/Login` slice.
2. A CI/CD pipeline or an IDE plugin intercepts this action.
3. It sends an input query to OPA: `{ "agent": "SWE-Agent-04", "slice": "Features/Auth/Login", "action": "modify" }`.
4. OPA evaluates this input against its policy (written in the Rego language): `deny["Agent cannot modify 'Auth' slice"] { input.slice == "Features/Auth/Login"; input.agent_trust_level != "admin" }`.
5. OPA returns `{ "allow": false }`, and the agent's action is *blocked* *before* it can be committed.

This is the "architectural design challenge" solution—building a secure-by-design "operating system" for the agent, rather than just trying to filter its output.

### **5.3 A Multi-Agent Governance Model (Synthesis)**

By combining the MMA, VSA, and OPA, we can create a sophisticated, role-based governance system for our AI "team":

* **"Developer Agents":** Are given programmatic write access *only* to feature slices (e.g., `Modules/Orders/Features/*`). They are granted read-only access to the SharedKernel.
* **"Architect Agents":** Are the *only* agents given write access to the SharedKernel or allowed to modify module-level boundaries.
* **"Security Agents":** Are given *no* write access, but are triggered by the CI/CD pipeline on *every* proposed change to run analysis.
* **"Human-in-the-Loop":** OPA policies can enforce that *any* change to a critical slice (like Auth or Billing) is not allowed until a human provides approval, moving the agent from a fully autonomous mode to a supervised one.

This creates a *verifiable* and *auditable* system for managing "autonomy drift". We can *prove* that a "Developer Agent" *cannot* change a security rule—not because we *told* it not to in a prompt (which is brittle), but because the *architecture* (enforced by OPA) makes it *impossible*.

## **6. Design Recommendations: Building the "Agent-Ready" VSA Codebase**

To make a VSA codebase "agent-ready," one must be intentionally prescriptive. The goal is to create a scaffold that maximizes predictability and minimizes ambiguity for the agent.

### **6.1 Source Layout: The Hybrid VSA-MMA**

A hybrid Modular Monolith / Vertical Slice structure is recommended.

1. **Top-Level (MMA):** The `src/` directory should be organized by *module*, not by layer.

   ```
   /src
     /OrdersModule
     /ShippingModule
     /SharedKernel
   ```

2. **Module-Level (VSA):** *Inside* each module, a strict VSA "feature folder" structure should be enforced.

   ```
   /src/OrdersModule
     /Features
       /CreateOrder
         CreateOrderEndpoint.cs
         CreateOrderHandler.cs
         CreateOrderRequest.cs
         CreateOrderResponse.cs
         CreateOrderValidator.cs
         CreateOrder.Tests.cs
         README.md   <-- CRITICAL
       /GetOrderDetails
         ...
     /Domain
     /Infrastructure
   ```

### **6.2 Contract Conventions: Machine-Readable Seams**

Agents require explicit, machine-readable contracts.

* **Slice-Level:** Mandate the REPR (Request-EndPoint-Response) pattern. Each slice should map 1:1 to a use case.
* **Module-Level:** Each *module* (e.g., `OrdersModule`) must expose a public API (e.g., `IOrdersModuleApi`) to the SharedKernel or other modules. This should be a small, explicit set of interfaces. Agents are forbidden from coupling to anything *but* this public API.
* **API-Level:** Mandate automatic OpenAPI/Swagger generation for *all* HTTP-facing slices. This becomes the primary contract for external agents *and* for internal OPA governance checks.

### **6.3 Documentation Structure: The "In-Slice Prompt"**

This is the most critical *new* recommendation for an agent-ready codebase: **treat documentation as a core part of the agent's context engineering.**

Mandate that **every Feature folder must contain a README.md file.** This README.md is not for humans; it is a *prompt* for the agent's RAG system. It should contain:

1. **Intent:** "This slice is responsible for creating a new customer order."
2. **Contracts:** "It handles the POST /api/v1/orders endpoint."
3. **Key Dependencies:** "It *must* publish an OrderCreatedEvent via the IEventBus. It *reads* from the ShippingModule's IShippingApi to calculate rates."
4. **Governance Rules:** "WARNING: Do NOT call other slices directly. Do NOT make direct database calls to other modules. Do NOT modify the SharedKernel."

When an agent is tasked, its *first* step is to *ingest* this local README.md.
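That first step is mechanically trivial, which is the point. A minimal sketch of the context-assembly routine (illustrative Python; the slice layout and file names follow the hypothetical `CreateOrder` example, not any real tooling):

```python
from pathlib import Path
import tempfile


def load_slice_context(slice_dir: Path) -> list[tuple[str, str]]:
    """Assemble the 'perfect context package' for one slice.

    The slice-local README.md is ingested first, because it acts as the
    ground-truth prompt; the remaining slice files follow in a stable order.
    """
    files = sorted(p for p in slice_dir.iterdir() if p.is_file())
    ordered = [slice_dir / "README.md"] + [p for p in files if p.name != "README.md"]
    return [(p.name, p.read_text()) for p in ordered if p.exists()]


# Illustrative setup: a minimal fake slice on disk.
root = Path(tempfile.mkdtemp()) / "Features" / "CreateOrder"
root.mkdir(parents=True)
(root / "README.md").write_text("Intent: create a new customer order.")
(root / "CreateOrderHandler.cs").write_text("// handler code")

context = load_slice_context(root)  # README first, then the slice files
```

No search, no ranking, no retrieval model: the folder boundary *is* the retrieval policy.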
This *immediately* solves the localization and planning problems by providing ground-truth context, pre-scoping the agent's entire operation.

### **6.4 Testing Strategy: The Autonomous Feedback Loop**

Each slice *must* be "a self-contained unit of functionality" and "testable... in isolation". Mandate that each feature folder contains its *own* integration tests, as shown in the layout in 6.1. This enables the autonomous agent workflow:

1. Agent receives the task: "Fix bug in CreateOrder."
2. Agent reads `CreateOrder/README.md` (the prompt).
3. Agent runs `CreateOrder.Tests.cs` and sees the failure.
4. Agent edits `CreateOrderHandler.cs`.
5. Agent re-runs `CreateOrder.Tests.cs`.
6. This loop repeats until the tests pass.

This creates an *autonomous, local feedback loop* that does not require a slow, full-repository `dotnet build`. It is the practical, slice-level implementation of the "CI-backed validation" identified in research.

## **7. Limitations and Open Questions**

The conclusions of this report are, by necessity, speculative. They are based on a logical synthesis of the *known properties* of AI agents and the *known properties* of software architecture.

### **7.1 The Critical Lack of Empirical Evidence**

There is a severe lack of public, empirical research comparing agent performance (e.g., SWE-bench scores) on *identical tasks* implemented in VSA vs. Clean vs. Layered architectures. Academic papers *discuss* the importance of architecture and even provide preliminary data on agent collaboration, but the crucial comparative benchmarks do not yet exist. This report's conclusions, while logically sound, await empirical validation.

### **7.2 The "Global Refactoring" Problem (The Architect's Dilemma)**

VSA optimizes for feature work by *hiding* global dependencies. This makes it *fundamentally hostile* to "Architect" agents tasked with large-scale refactoring.
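Concretely, recovering a global view means paying the "inspect every single slice" cost. A naive sketch of such an inventory tool (illustrative Python; the marker strings and the `*Module/Features/*` layout are assumptions following the layout in 6.1, and a real tool would parse syntax trees rather than match using-directives):

```python
from pathlib import Path
import tempfile

# Hypothetical markers for data-access technologies.
MARKERS = {
    "EF Core": "using Microsoft.EntityFrameworkCore;",
    "Dapper": "using Dapper;",
}


def inventory_data_access(src_root: Path) -> dict[str, list[str]]:
    """Map each technology to the slices that use it by scanning every
    Features/<Slice> folder — there is no single Infrastructure layer
    to consult, so the whole tree must be walked."""
    found: dict[str, list[str]] = {name: [] for name in MARKERS}
    for slice_dir in sorted(src_root.glob("*Module/Features/*")):
        text = "".join(p.read_text() for p in slice_dir.glob("*.cs"))
        for name, marker in MARKERS.items():
            if marker in text:
                found[name].append(slice_dir.name)
    return found


# Illustrative setup: two slices using different data-access stacks.
root = Path(tempfile.mkdtemp())
a = root / "OrdersModule" / "Features" / "CreateOrder"
b = root / "OrdersModule" / "Features" / "GetOrderDetails"
for d in (a, b):
    d.mkdir(parents=True)
(a / "CreateOrderHandler.cs").write_text("using Microsoft.EntityFrameworkCore;\n// ...")
(b / "GetOrderDetailsHandler.cs").write_text("using Dapper;\n// ...")

report = inventory_data_access(root)
```

This per-slice scan is cheap here, but across thousands of slices it is exactly the global-reasoning burden the Architect Agent inherits.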
The Nubank/Devin case study, which involved refactoring a massive monolith, suggests a path forward. That refactor was a *structural migration* task, which agents are trained for, not a creative "feature" task. This implies that we need *different agents with different tools* for different tasks. The "Developer Agent" lives *inside* the VSA slice, using the local README.md and slice tests. The "Architect Agent" must live *outside* the architecture, ingesting the *entire* codebase into a different RAG model—likely a *graph-based* one that can trace dependencies, or a "Meta-RAG" approach that uses code summaries to build a global map. VSA is therefore the day-to-day architecture for 90% of agent work, but a separate, specialized "refactoring agent" with a *different, global-view* toolset is required for the other 10%.

### **7.3 When VSA is Actively Harmful for Agents**

VSA is not a universal solution. It is the wrong choice in at least two scenarios:

1. **Highly Cross-Cutting Domains:** In systems like a complex financial-rules engine or a social graph, *every* feature may be deeply interdependent. In this case, VSA's "isolation" is a lie. It will lead to a *massive* SharedKernel that simply becomes a new, unmanageable monolith, defeating the entire purpose.
2. **Plugin-Based Systems:** For a system like an IDE, a game engine, or a data-pipeline orchestrator, a "plugin-based" or "ports-and-adapters" model is superior. Hexagonal Architecture would be a *better* fit here. The agent's task is explicitly to "implement this adapter," and the high abstraction and explicit interfaces of CA become a *feature*, not a bug.

## **8. References**

#### **Works cited**

1. Deep Dive into Context Engineering for Agents - Galileo AI, https://galileo.ai/blog/context-engineering-for-agents
2. Effective context engineering for AI agents - Anthropic, https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
3. The Context Window Problem: Scaling Agents Beyond Token Limits - Factory.ai, https://factory.ai/news/context-window-problem
4. Vertical Slice Architecture: The Best Ways to Structure Your Project : r/dotnet - Reddit, https://www.reddit.com/r/dotnet/comments/1eo7uhk/vertical_slice_architecture_the_best_ways_to/
5. Vertical Slice Architecture: Structuring Vertical Slices - Milan Jovanović, https://www.milanjovanovic.tech/blog/vertical-slice-architecture-structuring-vertical-slices
6. AutoAgents: A Framework for Automatic Agent Generation - arXiv, https://arxiv.org/html/2309.17288v3
7. A Multi-Agent LLM Environment for Software Design and Refactoring: A Conceptual Framework - ResearchGate, https://www.researchgate.net/publication/391205436_A_Multi-Agent_LLM_Environment_for_Software_Design_and_Refactoring_A_Conceptual_Framework
8. Modular Monolith: Architecture Enforcement - Kamil Grzybek, https://www.kamilgrzybek.com/blog/posts/modular-monolith-architecture-enforcement
9. Architecture 101: Modular Monolith — A Primer - Anji, Medium, https://anjireddy-kata.medium.com/architecture-101-modular-monolith-a-primer-36864f045697
10. Introduction - Open Policy Agent, https://openpolicyagent.org/docs
11. Agent Governance at Scale: Policy-as-Code Approaches in Action - NexaStack, https://www.nexastack.ai/blog/agent-governance-at-scale
12. What are your experiences with Clean Architecture vs Vertical Slice Architecture - Reddit, https://www.reddit.com/r/dotnet/comments/1iysrq4/what_are_your_experience_with_clean_architecture/
13. Devin | The AI Software Engineer, https://devin.ai/
14. Agentic Refactoring: An Empirical Study of AI Coding Agents - arXiv, https://arxiv.org/html/2511.04824v1
15. Context Engineering in LLM-Based Agents - Jin Tan Ruan, Medium, https://jtanruan.medium.com/context-engineering-in-llm-based-agents-d670d6b439bc
16. Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning - arXiv, https://www.arxiv.org/pdf/2508.03501
17. SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution - arXiv, https://arxiv.org/pdf/2507.23348
18. RefAgent: A Multi-agent LLM-based Framework for Automatic Software Refactoring - arXiv, https://arxiv.org/html/2511.03153v1
19. Meta-RAG on Large Codebases Using Code Summarization - arXiv, https://arxiv.org/html/2508.02611v1
20. SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering - arXiv, https://arxiv.org/pdf/2405.15793
21. Copilot Workspace - GitHub Next, https://githubnext.com/projects/copilot-workspace
22. Building Effective AI Agents - Anthropic, https://www.anthropic.com/research/building-effective-agents
23. Exploring Autonomous Agents: A Closer Look at Why They Fail When Completing Tasks - arXiv, https://arxiv.org/html/2508.13143v1
24. LLM Agents for Code Migration: A Real-World Case Study - Aviator, https://www.aviator.co/blog/llm-agents-for-code-migration-a-real-world-case-study/
25. AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation - arXiv, https://arxiv.org/pdf/2308.08155
26. LLM-Based Multi-Agent Systems for Software Engineering: Vision and the Road Ahead - arXiv, https://arxiv.org/html/2404.04834v1
27. Single-agent and multi-agent architectures - Dynamics 365, Microsoft Learn, https://learn.microsoft.com/en-us/dynamics365/guidance/resources/contact-center-multi-agent-architecture-design
28. Why AI Agents are Good Software - Devansh, Medium, https://machine-learning-made-simple.medium.com/why-ai-agents-are-good-software-0fc97b7a4d25
29. Vertical Slice Architecture - Jimmy Bogard, https://www.jimmybogard.com/vertical-slice-architecture/
30. Vertical Slice Architecture - Milan Jovanović, https://www.milanjovanovic.tech/blog/vertical-slice-architecture
31. Vertical slice architecture pros and cons : r/ExperiencedDevs - Reddit, https://www.reddit.com/r/ExperiencedDevs/comments/1m1v5pv/vertical_slice_architecture_pros_and_cons/
32. Vertical Slice Architecture and Comparison with Clean Architecture - Mehmet Ozkaya, Medium, https://mehmetozkaya.medium.com/vertical-slice-architecture-and-comparison-with-clean-architecture-76f813e3dab6
33. Vertical Slices in practice - Event-Driven.io, https://event-driven.io/en/vertical_slices_in_practice/
34. An ASP.NET Core template based on .NET 9, Vertical Slice Architecture, CQRS, Minimal APIs, OpenTelemetry, API Versioning and OpenAPI - GitHub, https://github.com/mehdihadeli/vertical-slice-api-template
35. Instructing Devin Effectively - Devin Docs, https://docs.devin.ai/essential-guidelines/instructing-devin-effectively
36. Exploring Software Architecture: Vertical Slice - Andy MacConnell, Medium, https://medium.com/@andrew.macconnell/exploring-software-architecture-vertical-slice-789fa0a09be6
37. Vertical Slice Architecture Myths You Need To Know! - CodeOpinion, https://codeopinion.com/vertical-slice-architecture-myths-you-need-to-know/
38. Vertical Slice Architecture and shared functionality : r/dotnet - Reddit, https://www.reddit.com/r/dotnet/comments/16vbqa3/vertical_slice_architecture_and_shared/
39. Vertical Slice Architecture: The Best Ways to Structure Your Project - Anton DevTips, https://antondevtips.com/blog/vertical-slice-architecture-the-best-ways-to-structure-your-project
40. Embracing Vertical Slices Beyond N-Tier Architectures - Leapcell, https://leapcell.io/blog/embracing-vertical-slices-beyond-n-tier-architectures
41. Clean Architecture with Modular Monolith and Vertical Slice - Eda Belge, Medium, https://medium.com/@eda.belge/clean-architecture-with-modular-monolith-and-vertical-slice-896b7ee22e3e
42. Clean Architecture Disadvantages - James Hickey, https://www.jamesmichaelhickey.com/clean-architecture/
43. What Cloud Architects Must Know in the Age of Autonomous AI Agents - Architecture & Governance, https://www.architectureandgovernance.com/uncategorized/what-cloud-architects-must-know-in-the-age-of-autonomous-ai-agents/
44. Backend Coding Rules for AI Coding Agents: DDD and Hexagonal Architecture - Medium, https://medium.com/@bardia.khosravi/backend-coding-rules-for-ai-coding-agents-ddd-and-hexagonal-architecture-ecafe91c753f
45. Vertical Slices & Plugin Architecture - Principal Software Engineering Manager's Approach - YouTube, https://www.youtube.com/watch?v=5OKLiQM2y30
46. Applying Hexagonal Architecture in AI Agent Development - Marta Fernández García, Medium, https://medium.com/@martia_es/applying-hexagonal-architecture-in-ai-agent-development-44199f6136d3
47. Benefits and Drawbacks of Adopting Clean Architecture - DEV Community, https://dev.to/yukionishi1129/benefits-and-drawbacks-of-adopting-clean-architecture-2pd1
48. The Evolution and Future of Microservices Architecture with AI-Driven Enhancements - Digital Commons@Lindenwood University, https://digitalcommons.lindenwood.edu/cgi/viewcontent.cgi?article=1725&context=faculty-research-papers
49. What is the difference between Vertical Slice Architecture and Feature-Based Architecture - Software Engineering Stack Exchange, https://softwareengineering.stackexchange.com/questions/459214/what-is-the-difference-between-vertical-slice-architecture-and-feature-based-arc
50. Microservices vs AI Agent - DEV Community, https://dev.to/aditya_fe/microservices-vs-ai-agent-4644
51. Behold the Modular Monolith: The Architecture Balancing Simplicity and Scalability - DEV Community, https://dev.to/naveens16/behold-the-modular-monolith-the-architecture-balancing-simplicity-and-scalability-2d4
52. AI Agents: Evolution, Architecture, and Real-World Applications - arXiv, https://arxiv.org/html/2503.12687v1
53. Designing Multi-Agent Intelligence - Microsoft for Developers, https://developer.microsoft.com/blog/designing-multi-agent-intelligence
Seizing the agentic AI advantage \- McKinsey, https://www.mckinsey.com/capabilities/quantumblack/our-insights/seizing-the-agentic-ai-advantage 55\. How to Keep Your Data Boundaries Intact in a Modular Monolith \- Milan Jovanović, https://www.milanjovanovic.tech/blog/how-to-keep-your-data-boundaries-intact-in-a-modular-monolith 56\. I don't understand the point of modular monolithic : r/softwarearchitecture \- Reddit, https://www.reddit.com/r/softwarearchitecture/comments/1g4gb9a/i\_dont\_understand\_the\_point\_of\_modular\_monolithic/ 57\. Vertical Slice Architecture \- Wolverine, https://wolverinefx.net/tutorials/vertical-slice-architecture 58\. From Layers to Slices: Revolutionizing .NET Application Architecture with Vertical Slice Design-Part I | by Bhargava Koya \- Fullstack .NET Developer | Medium, https://medium.com/@bhargavkoya56/from-layers-to-slices-revolutionizing-net-736aab80291b 59\. Enforcing policy-as-code: Open Policy Agent (OPA) | by Raunak Balchandani | Medium, https://raunakbalchandani.medium.com/enforcing-policy-as-code-open-policy-agent-opa-508883d6c0e8 60\. Policy as Code: Enforcing Rules in IaaS with Open Policy Agent \- hoop.dev, https://hoop.dev/blog/policy-as-code-enforcing-rules-in-iaas-with-open-policy-agent/ 61\. Principled AI Governance with Policy-as-Code: Leveraging OPA for Trustworthy AI, https://principledevolution.ai/blog/governance-policy-as-code-opa-trust-ai/ 62\. IDE Native, Foundation Model Based Agents for Software Refactoring \- Fraol Batole, https://fraolbatole.github.io/assets/pdf/IDEWorkshopPositionPaper.pdf 63\. Countermind: A Multi-Layered Security Architecture for Large Language Models \- arXiv, https://arxiv.org/html/2510.11837v1 64\. From Assistant to Agent: Navigating the Governance Challenges of Increasingly Autonomous AI \- Credo AI, https://www.credo.ai/recourseslongform/from-assistant-to-agent-navigating-the-governance-challenges-of-increasingly-autonomous-ai 65\. 
Why AI Agents Still Need You: Findings from Developer-Agent Collaborations in the Wild, https://arxiv.org/html/2506.12347v3 66\. Development of Evaluation Techniques for Multi-Agent Systems \- IEEE SA, https://standards.ieee.org/industry-connections/activities/development-of-evaluation-techniques-for-multi-agent-systems/ 67\. Context Engineering for Multi-Agent LLM Code Assistants Using Elicit, NotebookLM, ChatGPT, and Claude Code \- arXiv, https://arxiv.org/html/2508.08322v1 68\. A Survey on Code Generation with LLM-based Agents \- arXiv, https://arxiv.org/html/2508.00083v1 69\. SWE-Effi: Re-Evaluating Software AI Agent System Effectiveness Under Resource Constraints \- arXiv, https://arxiv.org/html/2509.09853v2 70\. Experimenting with Multi-Agent Software Development: Towards a Unified Platform \- arXiv, https://arxiv.org/abs/2406.05381 71\. Generative AI for Software Architecture. Applications, Challenges, and Future Directions, https://arxiv.org/html/2503.13310v2 72\. Collaborative LLM Agents for C4 Software Architecture Design Automation \- arXiv, https://arxiv.org/html/2510.22787v1 73\. Devin AI — The Overhyped “Engineer” That's Just Another Fancy Code Refactor Bot, https://medium.com/@vignarajj/devin-ai-the-overhyped-engineer-thats-just-another-fancy-code-refactor-bot-7bed3eb4e464 74\. Deep Dive on Devin: The AI Software Engineer | Scalable Path ®, https://www.scalablepath.com/machine-learning/devin-ai 75\. Retrieval-Augmented Code Generation: A Survey with Focus on Repository-Level Approaches \- arXiv, https://arxiv.org/html/2510.04905v1 76\. \[2508.02611\] Meta-RAG on Large Codebases Using Code Summarization \- arXiv, https://arxiv.org/abs/2508.02611 77\. Vertical Slice Architecture in .NET Core: A Deep Dive into Theory, Pros, and Cons. \- Medium, https://medium.com/@chauhanshubham19765/vertical-slice-architecture-in-net-core-a-deep-dive-into-theory-pros-and-cons-975bf8cc5cb5