AI has made building software cheap. A solo founder with Claude Code or Cursor can ship an MVP in a weekend that would've taken a small team a month two years ago. I've watched this happen across the NZ startup scene. Ideas that used to die in the "can we afford to build it" phase now get built over a long weekend.

This is mostly great. Velocity is what startups need. The cost of testing an idea is now close to zero, so the business rightly prioritises speed.

The catch shows up when the idea works.

AI builds for right now. It optimises for the current prompt, the current file, the current feature. It doesn't think about what happens when your billing service needs to handle 10x the volume, or when your email notifications need to move from inline calls to a queue. It doesn't plan for the evolutionary pressure your system will face once it has users.

That's the gap I've been thinking about, and it's what led me to build domain-agents.

Give the tools their credit

I want to be fair to the current generation of AI coding assistants. They're not stupid about finding code.

Claude Code runs an agentic search loop (grep, glob, file reads) iterating through your codebase to find what's relevant. Boris Cherny (who created Claude Code) has said they tried RAG with a local vector database early on and dropped it because agentic search outperformed it. Cursor takes a different approach: it chunks your codebase, generates embeddings, and stores them for semantic search so you can find code by concept rather than keyword. Copilot combines semantic indexing with LSP-powered reference tracing from VS Code.

The search works. If you ask Claude Code to find your billing service, it'll find it. Ask Cursor for authentication logic and the embeddings will surface it even if the code never uses the word "authentication."

None of them understand the architecture those files live in.

All the information needed to understand domain relationships sits in the code: import graphs, interface signatures, dependency patterns. These tools don't extract or structure it that way. They find files one at a time. They don't map out that your billing service depends on the email service, that BillingService is consumed by two other domains, or that changing its interface is a cross-domain event. The information is in the codebase. Nobody's pulling it together.

And every session starts from zero. The AI learned your architecture yesterday and forgot it today.

Evolutionary architecture for the AI era

My thesis: cheap AI-built MVPs plus expensive scaling problems point toward evolutionary architecture with domain-based boundaries.

The idea isn't new. The reason it matters now is.

In an evolutionary architecture, you focus on clean interfaces between business domains. Your email service exposes a contract like sendEmail(to, subject, body), and the rest of the system calls that interface. Behind the interface, the implementation evolves through stages as your scaling needs change:

graph LR
    A["Inline\n(direct call)"] --> B["Async\n(fire & forget)"]
    B --> C["Queued\n(BullMQ/SQS)"]
    C --> D["Separate Service"]
    D --> E["Distributed"]

Day one, sendEmail is a function that calls Resend directly. Inline, synchronous, dead simple. When traffic picks up, you drop the await and let it run in the background. Later, you introduce BullMQ or SQS. Eventually it becomes its own service. The interface stays put. Only the implementation behind it changes.
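A minimal TypeScript sketch of what "the interface stays put" looks like in practice. The names here (EmailSender, InlineSender, QueuedSender) are illustrative, not from any real codebase, and the queue is an in-memory stand-in for BullMQ or SQS:

```typescript
interface EmailSender {
  sendEmail(to: string, subject: string, body: string): Promise<void>;
}

// Stage 1: inline, synchronous call to the provider.
class InlineSender implements EmailSender {
  async sendEmail(to: string, subject: string, body: string): Promise<void> {
    // in a real app this would await the provider (e.g. Resend) directly
    console.log(`sent to ${to}: ${subject}`);
  }
}

// Stage 3: identical contract, but the implementation enqueues instead of sending.
class QueuedSender implements EmailSender {
  private queue: Array<{ to: string; subject: string; body: string }> = [];
  async sendEmail(to: string, subject: string, body: string): Promise<void> {
    this.queue.push({ to, subject, body }); // a worker drains this later
  }
  get pending(): number {
    return this.queue.length;
  }
}
```

Callers depend only on EmailSender, so moving between stages never touches a call site.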

This is the kind of evolution AI coding assistants are terrible at planning for. They'll inline that email call because it works right now. They have no concept of where this domain sits on its scaling trajectory.

Where domain-agents fits in

domain-agents is a CLI tool that runs static analysis on TypeScript codebases, discovers business domains, and generates AI agent context files for Claude Code and Cursor.

domain-agents discover .    # Analyse codebase → proposal.json
domain-agents init .        # Generate agents/*.md + AGENTS.md
domain-agents hooks claude  # Wire into Claude Code (rules + MCP server)
domain-agents hooks cursor  # Wire into Cursor (.mdc rules)

After setup, opening src/billing/invoice.ts in Claude Code loads the billing domain agent into context. The AI now knows: billing depends on email (coupling score 0.23), exposes BillingService consumed by 2 other domains, sits at the "inline" scaling stage with a path toward async queuing, and has 3 tracked tech debt items.

It plans work accordingly. The context was loaded before the first prompt, no search required.

Five signals, not one

The discovery engine runs 5 analysis passes because no single signal identifies business domains on its own.

Directory structure works for greenfield projects (src/auth/, src/billing/) but fails for legacy MVC apps. Import graphs capture coupling but not business intent. Package dependencies hint at external integrations but miss internal domains.

graph TD
    S["Structure Analysis"] --> O["Signal Orchestrator"]
    I["Import Graph\n(TS Compiler API)"] --> O
    N["Naming Patterns"] --> O
    D["Dependency Mapping\n(npm → domain hints)"] --> O
    IF["Interface Detection"] --> O
    O --> M["Merge Pipeline"]
    M --> R["Domain Proposal"]

Structure detects whether the codebase is feature-organised, layer-organised, mixed, or flat. Import graph uses the TypeScript Compiler API to parse each .ts file, resolve imports, and build a directed edge graph. Type-only imports get weighted at 0.3 because they're a weaker coupling signal than value imports. Naming patterns extract domain prefixes: auth.controller.ts → "auth". Dependency mapping maps npm packages to domain hints (stripe → billing, @sendgrid/mail → email). Interface detection identifies files imported across domain boundaries and calculates coupling scores between domain pairs.
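The naming-pattern pass is the easiest to picture. A hedged sketch, not the tool's actual code: strip a known layer suffix and the extension, and what's left is the domain hint. The suffix list is an assumption:

```typescript
// Layer suffixes to recognise (assumed; the real tool's list may differ).
const LAYER_SUFFIXES = ["controller", "service", "model", "routes", "repository"];

function extractDomainPrefix(filename: string): string | null {
  const base = filename.replace(/\.tsx?$/, ""); // drop .ts / .tsx
  const parts = base.split(".");
  if (parts.length >= 2 && LAYER_SUFFIXES.includes(parts[parts.length - 1])) {
    return parts.slice(0, -1).join("."); // "auth.controller" → "auth"
  }
  return null; // no recognisable layer pattern
}
```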

Each pass produces weighted signals. The orchestrator combines them with confidence scoring: average signal strength plus a bonus for signal count, capped at 0.99. Layer-organised codebases get a 0.85 multiplier because they're harder to discover.
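In code, that scoring rule looks something like the sketch below. The shape (average plus bonus, 0.99 cap, 0.85 layer multiplier) comes from the description above; the bonus size of 0.05 per extra signal is an assumption:

```typescript
function confidence(signalStrengths: number[], layerOrganised: boolean): number {
  if (signalStrengths.length === 0) return 0;
  const avg =
    signalStrengths.reduce((a, b) => a + b, 0) / signalStrengths.length;
  const bonus = 0.05 * (signalStrengths.length - 1); // assumed per-signal bonus
  let score = Math.min(avg + bonus, 0.99); // hard cap
  if (layerOrganised) score *= 0.85; // layer-organised codebases are harder
  return score;
}
```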

Most real codebases aren't clean

Feature-organised codebases are easy. The directory structure is the domain. But most real codebases look like this:

src/
  controllers/
    auth.controller.ts
    billing.controller.ts
  services/
    auth.service.ts
    billing.service.ts
  models/
    invoice.model.ts
    user.model.ts

Here auth.controller.ts and auth.service.ts both belong to the "auth" domain despite living in different directories. domain-agents uses naming pattern extraction cross-referenced with import graph cohesion to cluster these. The auth.* files form a tight import cluster, which confirms the naming signal.
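A sketch of that cross-referencing, under assumed data shapes: group files by naming prefix, then measure how much of each cluster's import traffic stays inside the cluster. The real tool derives the import graph from the TypeScript Compiler API; here it's a hand-built map:

```typescript
type ImportGraph = Map<string, string[]>; // file → files it imports

function clusterByPrefix(files: string[]): Map<string, string[]> {
  const clusters = new Map<string, string[]>();
  for (const f of files) {
    const name = f.split("/").pop() ?? f;
    const prefix = name.split(".")[0]; // "auth.controller.ts" → "auth"
    clusters.set(prefix, [...(clusters.get(prefix) ?? []), f]);
  }
  return clusters;
}

// Cohesion: fraction of a cluster's imports that stay inside the cluster.
function cohesion(cluster: string[], graph: ImportGraph): number {
  const inside = new Set(cluster);
  let total = 0;
  let internal = 0;
  for (const f of cluster) {
    for (const dep of graph.get(f) ?? []) {
      total++;
      if (inside.has(dep)) internal++;
    }
  }
  return total === 0 ? 0 : internal / total;
}
```

High cohesion confirms the naming signal; low cohesion suggests the prefix is a coincidence.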

Merging is the hard bit

Raw signals produce too many small, overlapping clusters. The orchestrator runs a multi-phase normalisation pipeline.

Plurals merge: journals + journal → whichever has more files. Compound names consolidate: bank-balance + bank-statement + bank-transaction → bank-accounts (the largest cluster). Small clusters merge into their strongest import target, but only if they have a dominant dependency: more than 40% of imports from one target, and that target is at least 2x larger. This prevents cascading, where A merges into B, B gets bigger and attracts C, C pulls in D.
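The dominant-dependency rule can be sketched as a single decision function. The 40% and 2x thresholds come from the description above; the Cluster shape is an assumption:

```typescript
interface Cluster {
  name: string;
  fileCount: number;
  importsTo: Record<string, number>; // target cluster → import count
}

function mergeTarget(small: Cluster, clusters: Map<string, Cluster>): string | null {
  const total = Object.values(small.importsTo).reduce((a, b) => a + b, 0);
  if (total === 0) return null;
  for (const [target, count] of Object.entries(small.importsTo)) {
    const t = clusters.get(target);
    if (!t) continue;
    const dominant = count / total > 0.4; // >40% of imports reference one target
    const bigger = t.fileCount >= small.fileCount * 2; // target at least 2x larger
    if (dominant && bigger) return target;
  }
  return null; // no dominant dependency → cluster stays separate
}
```

Because both conditions must hold, a freshly merged cluster doesn't automatically become a dominant target for its neighbours, which is what stops the cascade.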

Files that import from 3+ domains get moved to "unassigned." These are coupling hotspots: middleware, orchestrators, shared handlers. Assigning them to one domain would mislead the AI, so the tool surfaces them for a human decision. That's the right call for architectural boundaries.
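The hotspot check itself is small. A sketch, assuming a map from each file to its assigned domain:

```typescript
// A file whose imports span 3+ distinct domains is a coupling hotspot
// (middleware, orchestrators, shared handlers) and goes to "unassigned".
function isHotspot(imports: string[], fileDomains: Map<string, string>): boolean {
  const domains = new Set<string>();
  for (const dep of imports) {
    const d = fileDomains.get(dep);
    if (d) domains.add(d);
  }
  return domains.size >= 3;
}
```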

The E2E test suite validates the complete pipeline against 3 fixture codebases (feature-organised, layer-organised, mixed). Current benchmark: 100% activation accuracy across all 3 patterns and all 3 activation levels (domain assignment, glob matching, MCP lookup).

Auto-activation, not search

The integration into Claude Code and Cursor uses glob-based rule activation, the native mechanism both tools already support.

Each domain gets a rule file with glob patterns in the frontmatter:

---
description: billing domain
globs:
  - src/billing/**
  - **/billing.*
  - **/billing-*
---

When Claude Code opens a file matching those globs, the domain context loads. No MCP call, no background process, zero runtime overhead.

An MCP server complements the rules with 4 on-demand tools: domain_lookup(file), domain_context(name), domain_files(name), and list_domains(). A SessionStart hook prints a domain summary at the start of every Claude Code session, so the AI has system-level awareness from the first prompt.

Agents as a team model

This is the bit I'm most keen on long-term.

At Vend and Xero, teams owned domains. The billing team owned billing, the integrations team owned integrations. Ownership meant knowing the interfaces, the coupling points, the tech debt, and where things were headed. That knowledge lived in people's heads and got passed on through code reviews, architecture chats, and tribal memory.

Domain-specific AI agents formalise that same ownership model. An email agent loads the email domain's interface contract, its coupling to other domains, its current scaling stage, and its tracked tech debt. A billing agent carries the same for billing. They work within their boundaries and flag when a change crosses a domain line.

You don't need this from day one. Early on, one agent covers multiple areas. As the product grows, agents split along the same lines engineering teams split: by business domain. The operator (that's you) resolves conflicts where agents disagree, the same way an engineering manager resolves cross-team dependencies.

The analogy is rough, but it captures how AI-assisted development scales past a single person staring at a single context window.