<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Nats on Yarang's Tech Lair</title><link>https://blog.fcoinfup.com/tags/nats/</link><description>Recent content in Nats on Yarang's Tech Lair</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Fri, 08 May 2026 21:57:11 +0900</lastBuildDate><atom:link href="https://blog.fcoinfup.com/tags/nats/index.xml" rel="self" type="application/rss+xml"/><item><title>Building a Multi-LLM Distributed Orchestrator with NATS JetStream</title><link>https://blog.fcoinfup.com/post/building-a-multi-llm-distributed-orchestrator-with-nats-jetstream/</link><pubDate>Fri, 08 May 2026 21:57:11 +0900</pubDate><guid>https://blog.fcoinfup.com/post/building-a-multi-llm-distributed-orchestrator-with-nats-jetstream/</guid><description>&lt;p&gt;Part 1 discussed the model-specific limitations discovered while running four AIs—Claude, ZAI, Codex, and Gemini—concurrently on the same tasks. This part is about &amp;ldquo;how we made it possible&amp;rdquo;—the system design and implementation story.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="system-overview"&gt;System Overview
&lt;/h2&gt;&lt;p&gt;AgentForge consists of three components.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[Task Publisher]
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; │ NATS JetStream publish
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ▼
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[NATS Broker] ─── af.worker.{id}.inbox
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; │ JetStream consume (independent streams per worker)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ▼
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[Worker Pollers] × N (poller.py × 18 instances)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; │ LLM CLI Execution (claude / codex / gemini)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ▼
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[Result Return] af.task.{task_id}.completed
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When a publisher posts a task to NATS, each worker, which is independently subscribed, receives the message on its inbox and executes the LLM CLI. The result is then published back to a completion topic.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="why-nats-jetstream"&gt;Why NATS JetStream?
&lt;/h2&gt;&lt;p&gt;We considered several message broker options: Redis Streams, Kafka, RabbitMQ, and NATS JetStream.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Reasons for choosing NATS JetStream:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Single Binary&lt;/strong&gt; — Operates with a single &lt;code&gt;nats-server&lt;/code&gt; without requiring separate runtimes. It has no dependencies like Kafka&amp;rsquo;s ZooKeeper or RabbitMQ&amp;rsquo;s Erlang/OTP.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Built-in Persistence&lt;/strong&gt; — JetStream is a streaming layer on top of NATS, storing messages to the filesystem. This ensures that unprocessed tasks are not lost even if a worker restarts.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;NKey-based Authentication&lt;/strong&gt; — We can issue independent Ed25519 key pairs for each worker. If one worker is compromised, the credentials of other workers remain valid.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Lightweight&lt;/strong&gt; — Memory usage is around 30MB on a single server. Even with 18 workers connected, the broker load is minimal.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;hr&gt;
&lt;h2 id="the-core-backend-adapter-in-pollerpy"&gt;The Core: Backend Adapter in &lt;code&gt;poller.py&lt;/code&gt;
&lt;/h2&gt;&lt;p&gt;The heart of the worker is &lt;code&gt;poller.py&lt;/code&gt;. This single file handles NATS subscriptions, LLM CLI execution, and result returns.&lt;/p&gt;
&lt;p&gt;Since LLMs have different execution methods, we separated them into a backend adapter dictionary.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;_BACKENDS: dict[str, dict] &lt;span style="color:#f92672"&gt;=&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;claude&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;bin&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;CLAUDE_BIN&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;/usr/local/bin/claude&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;tools&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;ALLOWED_TOOLS&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;Read,Edit,Write,Glob,Grep&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;model&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;CLAUDE_MODEL&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;codex&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;bin&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;CODEX_BIN&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;/usr/bin/codex&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;model&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;CODEX_MODEL&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;sandbox&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;CODEX_SANDBOX&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;read-only&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;gemini_cli&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;bin&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;GEMINI_BIN&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;/usr/bin/gemini&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;model&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;GEMINI_MODEL&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;MODEL_BACKEND&lt;/code&gt; environment variable determines which LLM to use. This allows the same &lt;code&gt;poller.py&lt;/code&gt; code to run different LLMs across 18 workers.&lt;/p&gt;
&lt;h3 id="claude-backend"&gt;Claude Backend
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;async&lt;/span&gt; &lt;span style="color:#66d9ef"&gt;def&lt;/span&gt; &lt;span style="color:#a6e22e"&gt;run_claude&lt;/span&gt;(instructions: str, task_id: str) &lt;span style="color:#f92672"&gt;-&amp;gt;&lt;/span&gt; tuple[int, str]:
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; cfg &lt;span style="color:#f92672"&gt;=&lt;/span&gt; _BACKENDS[&lt;span style="color:#e6db74"&gt;&amp;#34;claude&amp;#34;&lt;/span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; cmd &lt;span style="color:#f92672"&gt;=&lt;/span&gt; [cfg[&lt;span style="color:#e6db74"&gt;&amp;#34;bin&amp;#34;&lt;/span&gt;], &lt;span style="color:#e6db74"&gt;&amp;#34;--print&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;--allowedTools&amp;#34;&lt;/span&gt;, cfg[&lt;span style="color:#e6db74"&gt;&amp;#34;tools&amp;#34;&lt;/span&gt;]]
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#66d9ef"&gt;if&lt;/span&gt; cfg&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;model&amp;#34;&lt;/span&gt;):
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; cmd &lt;span style="color:#f92672"&gt;+=&lt;/span&gt; [&lt;span style="color:#e6db74"&gt;&amp;#34;--model&amp;#34;&lt;/span&gt;, cfg[&lt;span style="color:#e6db74"&gt;&amp;#34;model&amp;#34;&lt;/span&gt;]]
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; proc &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#66d9ef"&gt;await&lt;/span&gt; asyncio&lt;span style="color:#f92672"&gt;.&lt;/span&gt;create_subprocess_exec(&lt;span style="color:#f92672"&gt;*&lt;/span&gt;cmd, stdin&lt;span style="color:#f92672"&gt;=&lt;/span&gt;PIPE, stdout&lt;span style="color:#f92672"&gt;=&lt;/span&gt;PIPE, stderr&lt;span style="color:#f92672"&gt;=&lt;/span&gt;PIPE)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;--print&lt;/code&gt; flag is key. It runs Claude Code in non-interactive mode instead of conversational mode, ensuring the results are returned via stdout.&lt;/p&gt;
&lt;h3 id="zai-backend"&gt;ZAI Backend
&lt;/h3&gt;&lt;p&gt;ZAI offers an Anthropic API-compatible endpoint, so it doesn&amp;rsquo;t require a separate backend. Routing is handled by two environment variables.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-ini" data-lang="ini"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# /etc/agentforge/cc-zai-high-dev-01.env&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;ANTHROPIC_BASE_URL&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;lt;ZAI endpoint&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;ANTHROPIC_AUTH_TOKEN&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;lt;ZAI API key&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;By injecting this file using systemd&amp;rsquo;s &lt;code&gt;EnvironmentFile=&lt;/code&gt; directive, the &lt;code&gt;claude&lt;/code&gt; binary sends requests to the ZAI endpoint. This allows us to connect to a different LLM provider simply by changing environment variables, without altering the code.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="declarative-management-fleetyaml--serversyaml"&gt;Declarative Management: &lt;code&gt;fleet.yaml&lt;/code&gt; × &lt;code&gt;servers.yaml&lt;/code&gt;
&lt;/h2&gt;&lt;p&gt;Manually managing 18 workers is impractical. We declaratively defined the entire infrastructure using two YAML files.&lt;/p&gt;
&lt;h3 id="serversyaml--server-inventory"&gt;&lt;code&gt;servers.yaml&lt;/code&gt; — Server Inventory
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;servers&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; - &lt;span style="color:#f92672"&gt;name&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-node-1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;role&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-host&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;services&lt;/span&gt;: [&lt;span style="color:#ae81ff"&gt;agentforge-worker, tunnel-arm1]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; - &lt;span style="color:#f92672"&gt;name&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;broker-host&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;role&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;broker-host&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;services&lt;/span&gt;: [&lt;span style="color:#ae81ff"&gt;nats-jetstream, postgres]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; - &lt;span style="color:#f92672"&gt;name&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-node-2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;role&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-host&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;services&lt;/span&gt;: [&lt;span style="color:#ae81ff"&gt;agentforge-worker, tunnel-arm1]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="fleetyaml--worker-placement"&gt;&lt;code&gt;fleet.yaml&lt;/code&gt; — Worker Placement
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;workers&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; - &lt;span style="color:#f92672"&gt;worker_id&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;cc-go-dev-01&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;llm&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;claude-code&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;model&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;claude-sonnet-4-6&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;lang&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;go&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;role&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;developer&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;host&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-node-1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;enabled&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;create_pr&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; - &lt;span style="color:#f92672"&gt;worker_id&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;codex-py-dev-01&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;llm&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;codex&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;model&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;gpt-5.5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;lang&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;python&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;role&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;developer&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;host&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-node-1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;enabled&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;create_pr&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;false&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Changing just the &lt;code&gt;host&lt;/code&gt; field moves a worker to a different server. Setting &lt;code&gt;enabled: false&lt;/code&gt; stops the deployment script from starting that worker.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="worker-templating-system-provision_workerpy"&gt;Worker Templating System: &lt;code&gt;provision_worker.py&lt;/code&gt;
&lt;/h2&gt;&lt;p&gt;Manually writing systemd unit files for each new worker is prone to errors. We automated this using Jinja2 templates and a provisioning script.&lt;/p&gt;
&lt;h3 id="template-structure"&gt;Template Structure
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;templates/
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; systemd/
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; claude.service.j2 # For claude-code and ZAI alike
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; codex.service.j2 # OpenAI Codex
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; gemini.service.j2 # Google Gemini CLI
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The core part of &lt;code&gt;claude.service.j2&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-gdscript3" data-lang="gdscript3"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;MODEL_BACKEND&lt;span style="color:#f92672"&gt;=&lt;/span&gt;claude
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;CLAUDE_BIN&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ claude_bin }}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&lt;span style="color:#f92672"&gt;%&lt;/span&gt; &lt;span style="color:#66d9ef"&gt;if&lt;/span&gt; claude_model &lt;span style="color:#f92672"&gt;%&lt;/span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;CLAUDE_MODEL&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ claude_model }}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&lt;span style="color:#f92672"&gt;%&lt;/span&gt; endif &lt;span style="color:#f92672"&gt;%&lt;/span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&lt;span style="color:#f92672"&gt;%&lt;/span&gt; &lt;span style="color:#66d9ef"&gt;if&lt;/span&gt; env_file &lt;span style="color:#f92672"&gt;%&lt;/span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;EnvironmentFile&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ env_file }}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&lt;span style="color:#f92672"&gt;%&lt;/span&gt; endif &lt;span style="color:#f92672"&gt;%&lt;/span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;WORK_BASE&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ work_base }}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;WORK_DIR&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ work_base }}&lt;span style="color:#f92672"&gt;/&lt;/span&gt;repo
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#34;{{ &amp;#39;ALLOWED_TOOLS=&amp;#39; + allowed_tools }}&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;CREATE_PR&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ &lt;span style="color:#e6db74"&gt;&amp;#39;true&amp;#39;&lt;/span&gt; &lt;span style="color:#66d9ef"&gt;if&lt;/span&gt; create_pr &lt;span style="color:#66d9ef"&gt;else&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#39;false&amp;#39;&lt;/span&gt; }}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&lt;span style="color:#f92672"&gt;%&lt;/span&gt; &lt;span style="color:#66d9ef"&gt;if&lt;/span&gt; create_pr &lt;span style="color:#f92672"&gt;and&lt;/span&gt; github_remote &lt;span style="color:#f92672"&gt;%&lt;/span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;GITHUB_REMOTE&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ github_remote }}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&lt;span style="color:#f92672"&gt;%&lt;/span&gt; endif &lt;span style="color:#f92672"&gt;%&lt;/span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For ZAI workers, the &lt;code&gt;env_file&lt;/code&gt; block is activated, adding the &lt;code&gt;EnvironmentFile&lt;/code&gt;. For PR creation workers, &lt;code&gt;github_remote&lt;/code&gt; is injected. Other settings use defaults.&lt;/p&gt;
&lt;h3 id="provision_workerpy-usage"&gt;&lt;code&gt;provision_worker.py&lt;/code&gt; Usage
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# Preview (no actual deployment)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;python3 scripts/provision_worker.py --worker new-worker-id --dry-run
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# Actual deployment (including NATS creds issuance)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;python3 scripts/provision_worker.py --worker new-worker-id --issue-creds
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# Bulk deployment for the entire fleet.yaml&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;python3 scripts/provision_worker.py --all
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Internal operations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Reads worker entries from &lt;code&gt;fleet.yaml&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Reads target hosts from &lt;code&gt;servers.yaml&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Renders Jinja2 templates.&lt;/li&gt;
&lt;li&gt;Deploys &lt;code&gt;/etc/systemd/system/{worker_id}-poller.service&lt;/code&gt; via SSH.&lt;/li&gt;
&lt;li&gt;Creates the working directory.&lt;/li&gt;
&lt;li&gt;Executes &lt;code&gt;systemctl daemon-reload &amp;amp;&amp;amp; enable --now&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;(Optional) Issues NATS NKey with &lt;code&gt;nsc add user&lt;/code&gt; → deploys creds → regenerates &lt;code&gt;auth.conf&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;hr&gt;
&lt;h2 id="distributed-hosting-adding-workers-to-a-second-server"&gt;Distributed Hosting: Adding Workers to a Second Server
&lt;/h2&gt;&lt;p&gt;Running all workers on a single server creates a single point of failure. We added Claude workers to a second host.&lt;/p&gt;
&lt;p&gt;The method for workers on the second host to connect to the NATS broker is via an autossh tunnel.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-ini" data-lang="ini"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;[Unit]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Description&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;NATS Broker Tunnel&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;After&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;network-online.target&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;[Service]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;ExecStart&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;/usr/bin/autossh -N \
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#e6db74"&gt; -L 4222:127.0.0.1:4222 \
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#e6db74"&gt; -i /home/ubuntu/.ssh/id_ed25519 \
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#e6db74"&gt; broker-host&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Restart&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;always&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;RestartSec&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;10&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;With this configuration active, workers always connect to &lt;code&gt;nats://127.0.0.1:4222&lt;/code&gt;. They don&amp;rsquo;t need to know the broker host&amp;rsquo;s address. As long as the tunnel is alive, it works the same way from any host.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="nats-credential-operations-experience"&gt;NATS Credential Operations Experience
&lt;/h2&gt;&lt;p&gt;NATS NKey management was the most complex part of the implementation.&lt;/p&gt;
&lt;p&gt;NATS JetStream&amp;rsquo;s authentication structure is hierarchical.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;Operator (Root Signing Authority)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; └── Account: SYS (System Account)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; └── Account: Services (Worker Account)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ├── User: cc-dev-01
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ├── User: cc-go-dev-01
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ├── User: codex-py-dev-01
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; └── ...
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Each worker has an independent User NKey and can publish/subscribe within the permissions scope (&lt;code&gt;af.&amp;gt;&lt;/code&gt;, &lt;code&gt;_INBOX.&amp;gt;&lt;/code&gt;, &lt;code&gt;$JS.&amp;gt;&lt;/code&gt;) of the Services account.&lt;/p&gt;
&lt;p&gt;Adding a new worker requires the Operator&amp;rsquo;s signing key. We initially made the mistake of not backing up this key, leading to its loss. Consequently, we had to regenerate the entire Operator and replace all worker credentials en masse. The service downtime was approximately 60 seconds.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# Regeneration procedure&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;nsc add operator AgentForge
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;nsc add account SYS
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;nsc add account Services
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;for&lt;/span&gt; worker in cc-dev-01 cc-go-dev-01 ...; &lt;span style="color:#66d9ef"&gt;do&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; nsc add user --account Services --name $worker &lt;span style="color:#ae81ff"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; --allow-pub &lt;span style="color:#e6db74"&gt;&amp;#34;af.&amp;gt;,_INBOX.&amp;gt;,&lt;/span&gt;$JS&lt;span style="color:#e6db74"&gt;.&amp;gt;&amp;#34;&lt;/span&gt; &lt;span style="color:#ae81ff"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; --allow-sub &lt;span style="color:#e6db74"&gt;&amp;#34;af.&amp;gt;,_INBOX.&amp;gt;,&lt;/span&gt;$JS&lt;span style="color:#e6db74"&gt;.&amp;gt;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;done&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;nsc generate config --mem-resolver --sys-account SYS &amp;gt; auth.new.conf
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;hr&gt;
&lt;h2 id="adding-a-new-worker-the-full-procedure"&gt;Adding a New Worker: The Full Procedure
&lt;/h2&gt;&lt;p&gt;Since the completion of this system, adding a new worker is straightforward.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 1&lt;/strong&gt;: Add an entry to &lt;code&gt;fleet.yaml&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;- &lt;span style="color:#f92672"&gt;worker_id&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;my-new-worker&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;llm&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;claude-code&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;model&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;claude-haiku-4-5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;lang&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;multi&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;role&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;developer&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;host&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-node-1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;enabled&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;create_pr&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;false&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Step 2&lt;/strong&gt;: Preview&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;python3 scripts/provision_worker.py --worker my-new-worker --dry-run
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Step 3&lt;/strong&gt;: Actual Deployment&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;python3 scripts/provision_worker.py --worker my-new-worker --issue-creds
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That&amp;rsquo;s it. Template rendering, SSH deployment, NATS credential issuance, and service registration are all handled by a single command.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="next-steps"&gt;Next Steps
&lt;/h2&gt;&lt;p&gt;The current system is structured such that workers process tasks independently. Future plans include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Routing Policies&lt;/strong&gt;: Automatically selecting the appropriate worker based on task characteristics (e.g., Go code → &lt;code&gt;claude-go-dev&lt;/code&gt;, cost-first → ZAI lightweight tier).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Results Comparison Dashboard&lt;/strong&gt;: A UI to display fan-out results side-by-side.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cost Tracking&lt;/strong&gt;: Aggregating API call costs per worker.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The code is publicly available on GitHub.&lt;/p&gt;</description></item><item><title>I Sent the Same Coding Task to 4 AIs Simultaneously</title><link>https://blog.fcoinfup.com/post/i-sent-the-same-coding-task-to-4-ais-simultaneously/</link><pubDate>Fri, 08 May 2026 21:55:39 +0900</pubDate><guid>https://blog.fcoinfup.com/post/i-sent-the-same-coding-task-to-4-ais-simultaneously/</guid><description>&lt;p&gt;What happens when the same bug-fixing task is sent to Claude, ZAI (GLM), OpenAI Codex, and Google Gemini simultaneously?&lt;/p&gt;
&lt;p&gt;This question sparked the AgentForge project. We built a system that connects multiple LLM CLIs with the NATS JetStream message queue to process the same tasks in parallel, and in the process, we made some unexpected discoveries. This article focuses on the comparative experimental findings during the setup phase.&lt;/p&gt;
&lt;p&gt;The system&amp;rsquo;s design and implementation will be covered in Part 2.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="list-of-ais-tested"&gt;List of AIs Tested
&lt;/h2&gt;&lt;p&gt;The final configuration of 18 operational workers is as follows:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Family&lt;/th&gt;
 &lt;th&gt;Model&lt;/th&gt;
 &lt;th&gt;Notes&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Claude Code&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;claude-sonnet-4-6&lt;/td&gt;
 &lt;td&gt;Main development worker&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Claude Code&lt;/td&gt;
 &lt;td&gt;claude-sonnet-4-5&lt;/td&gt;
 &lt;td&gt;Previous generation comparison&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Claude Code&lt;/td&gt;
 &lt;td&gt;claude-haiku-4-5&lt;/td&gt;
 &lt;td&gt;Lightweight &amp;amp; High-speed&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Claude Code&lt;/td&gt;
 &lt;td&gt;claude-opus-4-6&lt;/td&gt;
 &lt;td&gt;Top-tier&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Claude Code&lt;/td&gt;
 &lt;td&gt;claude-opus-4-5&lt;/td&gt;
 &lt;td&gt;Previous generation comparison&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;ZAI (GLM)&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;glm-5.1&lt;/td&gt;
 &lt;td&gt;High-tier&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;ZAI (GLM)&lt;/td&gt;
 &lt;td&gt;glm-4.7&lt;/td&gt;
 &lt;td&gt;Mid-tier&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;ZAI (GLM)&lt;/td&gt;
 &lt;td&gt;glm-4.5-air&lt;/td&gt;
 &lt;td&gt;Lightweight tier&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;OpenAI Codex&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;gpt-5.5&lt;/td&gt;
 &lt;td&gt;&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Codex&lt;/td&gt;
 &lt;td&gt;gpt-5.4&lt;/td&gt;
 &lt;td&gt;1M context&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Codex&lt;/td&gt;
 &lt;td&gt;gpt-5.4-mini&lt;/td&gt;
 &lt;td&gt;400K context&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Codex&lt;/td&gt;
 &lt;td&gt;gpt-5.3-codex&lt;/td&gt;
 &lt;td&gt;272K context&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Google Gemini&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;gemini-2.5-flash&lt;/td&gt;
 &lt;td&gt;&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Gemini&lt;/td&gt;
 &lt;td&gt;gemini-2.5-pro&lt;/td&gt;
 &lt;td&gt;High-tier&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Gemini&lt;/td&gt;
 &lt;td&gt;gemini-2.5-flash-lite&lt;/td&gt;
 &lt;td&gt;Lightweight&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The list was much shorter when we first started. It grew as we experimented with which models were available.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="discovery-1-claude-3x-series-is-already-inaccessible"&gt;Discovery 1: Claude 3.x Series is Already Inaccessible
&lt;/h2&gt;&lt;p&gt;Those who have used Claude Code for a long time might recall Claude 3.7 Sonnet, 3.5 Sonnet, and 3.5 Haiku. We attempted to add these models as workers.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;claude --model claude-3-7-sonnet-20250219 --print &lt;span style="color:#e6db74"&gt;&amp;#34;hello&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# → &amp;#34;may not exist or no access&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;All three models returned the same error. The Claude 3 series reached its EOL in early 2026, and access via the Claude Code CLI has been blocked. Currently, only the 4.x series is available with a Claude Code subscription.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;: Claude workers were configured using only the 4.5/4.6 series.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="discovery-2-limited-model-selection-for-chatgpt-account-codex"&gt;Discovery 2: Limited Model Selection for ChatGPT Account Codex
&lt;/h2&gt;&lt;p&gt;The OpenAI Codex CLI authenticates with a ChatGPT Plus/Pro account or a separate API key. If authenticated via a ChatGPT account, the accessible models are limited.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;codex --model gpt-5.5-pro &lt;span style="color:#e6db74"&gt;&amp;#34;fix the bug&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# → &amp;#34;Model gpt-5.5-pro is not supported with ChatGPT account&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;codex --model gpt-5.5 &lt;span style="color:#e6db74"&gt;&amp;#34;fix the bug&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# → Works normally&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Models available with a ChatGPT account:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Model&lt;/th&gt;
 &lt;th&gt;Context&lt;/th&gt;
 &lt;th&gt;Inference Level&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;gpt-5.5&lt;/td&gt;
 &lt;td&gt;1M / 1M&lt;/td&gt;
 &lt;td&gt;High&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;gpt-5.4&lt;/td&gt;
 &lt;td&gt;1M / 1M&lt;/td&gt;
 &lt;td&gt;Medium&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;gpt-5.4-mini&lt;/td&gt;
 &lt;td&gt;400K / 400K&lt;/td&gt;
 &lt;td&gt;Medium&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;gpt-5.3-codex&lt;/td&gt;
 &lt;td&gt;272K / 400K&lt;/td&gt;
 &lt;td&gt;Medium&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;All other models, including &lt;code&gt;gpt-5.5-pro&lt;/code&gt;, returned a &amp;ldquo;not supported with ChatGPT account&amp;rdquo; error. More models are available with an API key, but that&amp;rsquo;s a different approach.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="discovery-3-gemini-cli-only-supports-25-series"&gt;Discovery 3: Gemini CLI Only Supports 2.5 Series
&lt;/h2&gt;&lt;p&gt;We tested various models with the Gemini CLI (&lt;code&gt;gemini&lt;/code&gt; binary).&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;gemini -p &lt;span style="color:#e6db74"&gt;&amp;#34;hello&amp;#34;&lt;/span&gt; -m gemini-2.0-flash
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# → ModelNotFoundError: models/gemini-2.0-flash is not found&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;gemini -p &lt;span style="color:#e6db74"&gt;&amp;#34;hello&amp;#34;&lt;/span&gt; -m gemini-1.5-pro
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# → ModelNotFoundError&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;gemini -p &lt;span style="color:#e6db74"&gt;&amp;#34;hello&amp;#34;&lt;/span&gt; -m gemini-2.5-flash
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# → Works normally&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Gemini models accessible with the current account:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;gemini-2.5-flash&lt;/code&gt; — Default recommended model&lt;/li&gt;
&lt;li&gt;&lt;code&gt;gemini-2.5-pro&lt;/code&gt; — High-tier&lt;/li&gt;
&lt;li&gt;&lt;code&gt;gemini-2.5-flash-lite&lt;/code&gt; — Lightweight&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Versions of Gemini 2.0 and below return &lt;code&gt;ModelNotFoundError&lt;/code&gt;. While this might vary based on account plan or API key type, based on the Gemini CLI, only the 2.5 series worked reliably.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="discovery-4-zai-can-be-bypassed-with-claude-sdk"&gt;Discovery 4: ZAI Can Be Bypassed with Claude SDK
&lt;/h2&gt;&lt;p&gt;ZAI is a service that provides an endpoint compatible with the Anthropic API. This allows us to use GLM models with the Claude Code CLI by changing just two environment variables.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;ANTHROPIC_BASE_URL&lt;span style="color:#f92672"&gt;=&lt;/span&gt;https://&amp;lt;ZAI endpoint&amp;gt; &lt;span style="color:#ae81ff"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;ANTHROPIC_AUTH_TOKEN&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&amp;lt;ZAI_KEY&amp;gt; &lt;span style="color:#ae81ff"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;claude --model glm-5.1 --print &lt;span style="color:#e6db74"&gt;&amp;#34;fix the bug&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Since Claude Code internally uses the Anthropic Python SDK, simply overriding &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; allows calling ZAI&amp;rsquo;s GLM models with the same format. It was interesting that we could reuse the existing &lt;code&gt;claude&lt;/code&gt; backend without any separate adapter code.&lt;/p&gt;
&lt;p&gt;The three GLM models used were:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;glm-5.1&lt;/code&gt; — High-tier&lt;/li&gt;
&lt;li&gt;&lt;code&gt;glm-4.7&lt;/code&gt; — Cost-performance balance&lt;/li&gt;
&lt;li&gt;&lt;code&gt;glm-4.5-air&lt;/code&gt; — Lightweight &amp;amp; High-speed&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id="4-way-fan-out-comparison-test"&gt;4-Way Fan-out Comparison Test
&lt;/h2&gt;&lt;p&gt;We simultaneously issued the same Go bug-fixing task to 4 representative workers out of the 18 (Claude Sonnet, GLM-5.1, Codex gpt-5.5, Gemini 2.5 Flash).&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;Task: &amp;#34;fix the off-by-one error in the binary search function&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Response times (wall clock):&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Worker&lt;/th&gt;
 &lt;th&gt;Model&lt;/th&gt;
 &lt;th&gt;Response Time&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;cc-go-dev-01&lt;/td&gt;
 &lt;td&gt;claude-sonnet-4-6&lt;/td&gt;
 &lt;td&gt;~8 seconds&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;cc-zai-high-dev-01&lt;/td&gt;
 &lt;td&gt;glm-5.1&lt;/td&gt;
 &lt;td&gt;~12 seconds&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;codex-py-dev-01&lt;/td&gt;
 &lt;td&gt;gpt-5.5&lt;/td&gt;
 &lt;td&gt;~15 seconds&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;gemini-py-dev-01&lt;/td&gt;
 &lt;td&gt;gemini-2.5-flash&lt;/td&gt;
 &lt;td&gt;~10 seconds&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;More interesting than the response times were the differences in their approaches. Claude tended to refactor the entire function, while Gemini preferred minimal modifications. Codex often included test code along with the fix.&lt;/p&gt;
&lt;p&gt;Of course, this is a single task result and has no statistical significance. It was a verification at the &amp;ldquo;does it actually work&amp;rdquo; level, not a benchmark.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="distributed-workers-adding-a-second-host"&gt;Distributed Workers: Adding a Second Host
&lt;/h2&gt;&lt;p&gt;If all workers are on the same server, the comparative experiment loses some of its meaning. Therefore, we added Claude workers to a second host.&lt;/p&gt;
&lt;p&gt;The method for workers to access the NATS broker (on the first host) from the second host is via an &lt;code&gt;autossh&lt;/code&gt; tunnel.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-ini" data-lang="ini"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;[Service]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;ExecStart&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;autossh -N -L 4222:127.0.0.1:4222 broker-host&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;By forwarding the local port 4222 to the broker, workers can connect to &lt;code&gt;nats://127.0.0.1:4222&lt;/code&gt; from any host without code changes.&lt;/p&gt;
&lt;p&gt;Advantage of this method: Workers don&amp;rsquo;t need to know where the broker is. They can always connect to &lt;code&gt;localhost:4222&lt;/code&gt;.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="most-panicked-moment-during-operation"&gt;Most Panicked Moment During Operation
&lt;/h2&gt;&lt;p&gt;The most distressing situation was losing the NATS operator signing key. NATS JetStream uses NKey-based authentication, and the operator/account&amp;rsquo;s signing key (nsc seed) is required to issue credentials for new workers.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;nsc add user --account Services --name new-worker
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# → &amp;#34;signing key not found&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;There was no backup. Ultimately, we had to perform a large-scale cutover, regenerating the entire NATS operator and replacing all worker credentials with a new permission tree. Service downtime was approximately 60 seconds.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson&lt;/strong&gt;: Always create an offline backup of the NATS operator seed immediately after generation. If it&amp;rsquo;s lost, regeneration is the only option.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="summary"&gt;Summary
&lt;/h2&gt;&lt;p&gt;Practical conclusions from this experiment:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Claude 3.x is EOL&lt;/strong&gt; - Inaccessible via Claude Code CLI as of 2026. Use only 4.x.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Codex ChatGPT Account Limited to 4 Models&lt;/strong&gt; - gpt-5.5, 5.4, 5.4-mini, 5.3-codex. Pro models require a separate API key.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Gemini Only 2.5 Series&lt;/strong&gt; - Previous versions inaccessible via CLI.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ZAI Integrable via Claude SDK Environment Variable Override&lt;/strong&gt; - No separate adapter needed.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;NATS NKey Must Be Backed Up&lt;/strong&gt; - Losing the signing key means reissuing everything.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The next installment will cover how these workers are connected, discussing system design and implementation.&lt;/p&gt;</description></item></channel></rss>