Multi-Model AI Agent Team Design: Composed Architecture and 5-Team Hierarchy

Overview

For building a blog system, I designed a multi-model agent team consisting of 14 AI specialists, 5 teams, and 4 LLM models. The core innovations are two:

  1. Composed Agent: Separating role definitions from execution profiles for maximum reusability
  2. Hierarchical Bridge Leadership: Dual membership of tech leads between upper and lower teams to resolve communication bottlenecks

This post covers the final structure, model distribution strategy, and the composed architecture design process.


Background: Why Multi-Model?

Using a single LLM for all tasks creates two problems:

  • Cost: Running 14 specialists on a Claude Opus-level model makes costs uncontrollable
  • Fit: Design needs fast reasoning, security analysis needs deep logic, implementation needs stable coding

So we distributed models based on task characteristics.


Final Team Structure

5 teams, 14 specialists, 4 models.

Team Details

TeamTypeDecision-MakingLeaderMembers
Steering TeamupperconsensusOrchestrator8 (including bridges)
Backend Teamlowerleader_decidesBackend Tech Lead2
Frontend Teamlowerleader_decidesFrontend Tech Lead3
Desktop Teamlowerleader_decidesDesktop Tech Lead3
Infra Teamlowerleader_decidesServer Admin3

The Infra Team Separation Decision

In the initial design, DB Architect and Server Admin were part of the backend team. But we separated them based on workspace.

When workspaces differ, separation is more natural than keeping them together.


Model Distribution Strategy

ModelSpecialistsPurposeWhy Chosen
relay:developer-zai10Implementation, ops, leadsCost-effective, stable coding
gemini:gemini-2.5-flash2Design, infra networkFast response, easy external API calls
codex:gpt-4o1Security reviewHigh reasoning, OWASP knowledge
zai:glm-41Context compressionFree tier, text summarization specialized

By assigning 10 implementation specialists to GLM, we achieved 60-70% total cost reduction.


Composed Agent Architecture

The core innovation is separating role definitions (Expert) from execution profiles (Definition).

With the traditional approach, role and execution logic are coupled — any change requires a full rewrite, and reuse is impossible.

Composed Approach

Module Structure

agent-library/
├── definitions/        ← 14 agent definitions
├── modules/
│   ├── base/           ← 6 base modules
│   ├── capabilities/   ← 15 capability modules
│   ├── platforms/      ← 5 platform modules
│   └── policies/       ← 1 policy
└── runs/               ← execution history

Advantages

  1. Reusability: rest-api capability module shared by backend developer and backend tech lead
  2. Platform swap: Change platform: fastapi to platform: django for instant switching
  3. Capability extension: Add a new capability module and connect it to the Definition
  4. Policy unification: All agents follow the same blog-default policy

Expert-Definition Mapping

ExpertDefinitionBaseCapabilitiesPlatform
Backend Developerbackend-developerbackend-corerest-api, crud, auth-jwtfastapi
Backend Tech Leadbackend-tech-leadbackend-corerest-api, crud, code-reviewfastapi
Frontend Developerfrontend-developerfrontend-coremarkdown-renderer, list-filter-sortnextjs
Server Adminserver-administratorserver-coredocker-management, nginx-config, postgres-adminubuntu
Infra Networkinfra-network-admininfra-coredns-management, ssl-certificates, rate-limitingcloudflare
Security Reviewersecurity-auditorspecialist-coresecurity-auditfastapi
Context Compressorcontext-compressorspecialist-corecontext-compressionmarkdown

TLS Certificate Strategy: Cloudflare Origin CA

For production TLS certificates, we chose Cloudflare Origin CA over Let’s Encrypt.

ItemLet’s EncryptCloudflare Origin CA
Validity90 days (renewal required)15 years (no renewal)
IssuanceACME automation requiredManual from Dashboard
Complexitycertbot setupCopy cert files only

Production architecture:

Oracle Cloud ARM (4 OCPU, 24GB)
├── PostgreSQL (installed directly on host)
├── Docker Compose
│   ├── blog-api (FastAPI)
│   ├── blog-frontend (Next.js standalone)
│   ├── MinIO (S3-compatible storage)
│   └── Nginx (Cloudflare Origin CA)
└── Cloudflare Proxy (Full Strict SSL)

Relay Plugin: Agent Invocation Mechanism

The team structure runs in Claude Code through the Relay plugin.

backed_by Namespaces

NamespaceMCP ServerPurpose
relay:developer-zaiinternal agentImplementation, ops (low cost)
relay:steering-orchestratorinternal agentCoordination, final decisions
gemini:gemini-2.5-flashgemini_mcpDesign, external APIs
codex:gpt-4ocodex_mcpSecurity analysis
zai:glm-4zai_mcpContext compression

Design Decision Log

DecisionAlternativeRationale
Separate infra teamInclude in backendDifferent workspaces (SSH vs IDE)
Cloudflare Origin CALet’s Encrypt15-year validity, no renewal
PostgreSQL on hostDocker containerMemory efficiency on single server
Composed agentSingle-definition agentModule reusability, easy platform swap
Assign many GLMAssign many Claude60-70% cost reduction

Retrospective: What I Learned

1. “Executable Structure” over “Perfect Structure”

Trying to design everything perfectly can prevent you from ever starting. It’s better to compromise and improve as you execute.

2. Workspace Defines Team Boundaries

People who write code and people who manage servers have different physical work environments — that’s a natural team boundary.

3. The Value of Composed Architecture

In an environment with 14 specialists, 5 teams, and 4 models interacting, module separation is essential. It minimizes the scope of changes and maximizes reusability.

4. Cost is Determined at Design Time

Asking “does this task really need a high-cost model?” every time naturally optimizes costs.


Next Steps

  • Start Phase 1 implementation: DB, Auth, Post/Category CRUD, Docker
  • Share team operation experience: Problems encountered during actual execution
  • Performance monitoring: Response time per model, cost vs quality analysis

This post summarizes my experience building an AI agent team using Claude Code + Relay plugin. It reflects learnings from applying this to a real project — different approaches may be more suitable depending on your situation.

Built with Hugo
Theme Stack designed by Jimmy