
# Claude Code Setup
Claude Code is powerful, but it has one fundamental problem: it tries to do everything itself. You don't need a soloist. You need a conductor and an orchestra.
This is the orchestra. The claude-code-setup repository contains the agents, skills, patterns, and delegation rules I use every day to build Mnemonic and other projects. These aren't experimental prototypes. These are the working agent definitions that will eventually be imported into Mnemonic itself.
## The Delegation Philosophy
I learned this the hard way. Early on, I'd ask Claude to "write BATS tests for this shell script." Sometimes it would write decent tests. Other times it would hallucinate BATS syntax or miss edge cases. The quality was inconsistent because Claude was improvising every time.
The breakthrough came when I stopped asking Claude to do specialized work and started giving it explicit routing instructions. I created a delegation table in my global CLAUDE.md: "If user asks for X, immediately delegate to agent Y. No exploration, no planning, just delegate."
Here's a sample of the delegation table I use:
| Task | Delegate To |
|---|---|
| BATS tests | bats-test-engineer |
| Shell scripts | /shell-script skill |
| Go code or services | go-software-engineer |
| E2E tests | go-e2e-test-engineer |
| API specifications | api-architect |
| Documentation | technical-writer |
| System architecture | solutions-architect |
| Go architecture | go-software-architect |
| DevOps, Docker, CI | devops-engineer |
| Data schema or models | data-architect |
| SQL, migrations, Cypher | data-engineer |
| Code review or compliance | /code-review skill |
Main Claude's job is coordination. Specialists do the implementation. This separation is enforced in the system prompt: Main Claude should never write code, design systems, or explore codebases. It delegates immediately.
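To make the routing idea concrete, here is a minimal sketch of the delegation table as a plain lookup. This is illustrative only, not code from the repository: `route_task`, the keyword matching, and the `"main-claude"` fallback are all hypothetical, but the task-to-agent mapping mirrors the table above.

```python
# Hypothetical sketch: the delegation table as a deterministic lookup.
# Agent names mirror the table above; route_task and its naive keyword
# matching are stand-ins for illustration, not the actual mechanism.

DELEGATION_TABLE = {
    "bats tests": "bats-test-engineer",
    "shell script": "/shell-script skill",
    "go code": "go-software-engineer",
    "e2e tests": "go-e2e-test-engineer",
    "api specification": "api-architect",
    "documentation": "technical-writer",
}

def route_task(request: str) -> str:
    """Return the specialist for a request by matching keywords."""
    lowered = request.lower()
    for keyword, agent in DELEGATION_TABLE.items():
        if keyword in lowered:
            return agent
    return "main-claude"  # no rule matched: Main Claude coordinates

print(route_task("Write BATS tests for the installer"))  # bats-test-engineer
```

The point of the sketch is the contrast with prompting: a table lookup either matches a rule or it doesn't, which is exactly the determinism the closing section argues for.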
## Agent Hierarchy
The agent ecosystem has two distinct layers. Architects provide recommendations and design guidance. Engineers execute the work.
```mermaid
graph TB
    User[User Request] --> Main[Main Claude]
    Main --> Arch[Architectural Layer]
    Arch --> SolArch[solutions-architect]
    Arch --> GoArch[go-software-architect]
    Arch --> APIArch[api-architect]
    Arch --> DataArch[data-architect]
    Main --> Impl[Implementation Layer]
    Impl --> GoEng[go-software-engineer]
    Impl --> PyEng[python-software-engineer]
    Impl --> NetEng[dotnet-software-engineer]
    Impl --> ReactEng[react-software-engineer]
    Impl --> E2E[go-e2e-test-engineer]
    Impl --> BATS[bats-test-engineer]
    Impl --> DevOps[devops-engineer]
    Impl --> TechWriter[technical-writer]
    Impl --> Review[/code-review skill/]
    E2E -.->|validation<br/>feedback| GoEng
    E2E -.->|failure<br/>routing| Main
    style Main fill:#e0f2fe,stroke:#0284c7,stroke-width:2px
    style Arch fill:#fef3c7,stroke:#f59e0b,stroke-width:2px
    style Impl fill:#dcfce7,stroke:#22c55e,stroke-width:2px
    style E2E fill:#fecaca,stroke:#dc2626,stroke-width:2px
    style Review fill:#e9d5ff,stroke:#a855f7,stroke-width:2px
```
Architects are consultants. They return recommendations to Main Claude, which then coordinates the implementation. They don't execute work themselves.
The E2E test engineer has a special role: it validates implementations and provides feedback. When tests fail, it routes failures back to the appropriate engineer through Main Claude. This creates a self-correcting loop that catches regressions and implementation issues early.
## Knowledge Management
The biggest improvement came from taking patterns out of agent definitions and storing them in a Cognee knowledge graph. Agents query the graph on-demand for what they need instead of loading everything upfront.
This reduced agent definition sizes by about 80%. A Go agent that used to have 15KB of embedded examples now has a couple hundred bytes of instructions: "Query Cognee for Go patterns when you need them."
**Query on Demand**
Agents only load patterns they actually need for the current task. Working on CLI argument parsing? Query for CLI patterns. Building an API? Query for API patterns. This keeps context focused and avoids the "lost in the middle" problem.
The pattern library is organized by domain:
- API patterns: REST conventions, error handling, versioning
- CLI patterns: Command structure, flag parsing, output formatting
- Data patterns: Repository patterns, transaction handling, schema design
- DevOps patterns: Dockerfile structure, Docker Compose setup, CI/CD workflows
- E2E patterns: Test organization, fixture management, assertion patterns
- Engineering guidelines: Code review checklists, testing requirements, documentation standards
- Go patterns: Error handling, context usage, testing conventions
- Shell patterns: Script structure, error handling, POSIX compliance
- BATS patterns: Test file organization, setup/teardown, assertions
Each pattern is stored with metadata that allows semantic search. Agents can query for "error handling in REST APIs" and get relevant examples without loading the entire pattern library.
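The query-on-demand idea can be sketched with a stand-in for the knowledge graph. Everything below is hypothetical: the real setup queries Cognee semantically, while this in-memory store with naive tag matching only illustrates the shape of "ask for what you need, load nothing else". The pattern texts and tags are made up for the example.

```python
# Hypothetical stand-in for the Cognee knowledge graph: a tag-indexed
# in-memory pattern store. The real system does semantic search; this
# naive tag overlap only illustrates query-on-demand retrieval.

PATTERN_LIBRARY = [
    {"domain": "go", "tags": {"error", "handling"}, "text": "wrap errors with %w"},
    {"domain": "api", "tags": {"error", "handling", "rest"}, "text": "return problem-details error bodies"},
    {"domain": "cli", "tags": {"flag", "parsing"}, "text": "use subcommands with flag sets"},
]

def query_patterns(query, domain=None):
    """Return pattern texts whose tags overlap the query terms."""
    terms = set(query.lower().split())
    hits = []
    for pattern in PATTERN_LIBRARY:
        if domain and pattern["domain"] != domain:
            continue  # agent scoped the query to one domain
        if terms & pattern["tags"]:
            hits.append(pattern["text"])
    return hits

print(query_patterns("error handling", domain="api"))
```

An agent working on CLI flags would call `query_patterns("flag parsing")` and pull in only the CLI entry, which is the context-focusing behavior the section describes.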
## Skills
Two skills encapsulate multi-agent workflows that used to require manual coordination.
The /shell-script skill creates production-grade shell scripts with full test coverage. It delegates to the shell script engineer to create the script, then to the BATS test engineer to generate tests, then runs the tests. If tests fail, it loops: the shell engineer fixes the script, the test engineer updates tests, rerun. This continues until tests pass or the user intervenes. What used to take multiple prompts and manual back-and-forth is now a single command.
The /code-review skill runs a parallel three-agent code review. One agent checks pattern compliance against the engineering handbook. Another reviews language-specific conventions. A third does architectural analysis. All three run in parallel, then a synthesis step combines their findings into a single report. This surfaces issues that would be caught in PR review before any code is committed.
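The fan-out/fan-in shape of that review can be sketched with a thread pool. Again hypothetical: the three reviewer functions below are stubs for the pattern-compliance, language-convention, and architecture agents, and their canned findings are invented for the example.

```python
# Sketch of the /code-review skill's parallel fan-out and synthesis.
# The three reviewers are hypothetical stubs returning canned findings.
from concurrent.futures import ThreadPoolExecutor

def review_patterns(code):
    return ["uses approved error-handling pattern"]

def review_conventions(code):
    return ["exported function missing doc comment"]

def review_architecture(code):
    return ["handler reaches past the repository layer"]

def code_review(code):
    reviewers = {
        "patterns": review_patterns,
        "conventions": review_conventions,
        "architecture": review_architecture,
    }
    with ThreadPoolExecutor(max_workers=3) as pool:  # all three run in parallel
        futures = {name: pool.submit(fn, code) for name, fn in reviewers.items()}
        findings = {name: f.result() for name, f in futures.items()}
    # synthesis step: flatten per-reviewer findings into one report
    report = [f"[{name}] {item}" for name, items in findings.items() for item in items]
    return {"findings": findings, "report": report}

print(code_review("package main ...")["report"])
```

Because the reviewers are independent, running them concurrently costs no extra wall-clock time over the slowest one, and the synthesis step is the only serial point.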
## The Bridge to Mnemonic
This setup proved all the core concepts. Shared patterns work. Specialist agents work. A shared knowledge graph works. Delegation tables work.
But there's one thing I couldn't solve: non-deterministic routing. Even with explicit delegation rules in CLAUDE.md, Claude would sometimes forget. It would try to write tests itself instead of delegating to the test agent. It would explore code when it should just delegate immediately. The rules helped, but they weren't reliable enough.
That's the gap Mnemonic solves. Routing decisions in Mnemonic are made by code, not by an LLM. Deterministic. Every time.
**Agent Definitions as Source of Truth**
The agent definitions in the claude-code-setup repository are the ones I'm using right now. When Mnemonic reaches production, these will be the agent definitions imported into it. This repo is the working reference implementation.
Everything I'm building in Mnemonic started here. The delegation philosophy, the knowledge graph integration, the multi-agent workflows: all proven in daily use with Claude Code before being formalized into Mnemonic's architecture. This is the foundation.