<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Jetstream on Yarang's Tech Lair</title><link>https://blog.fcoinfup.com/tags/jetstream/</link><description>Recent content in Jetstream on Yarang's Tech Lair</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Fri, 08 May 2026 21:57:11 +0900</lastBuildDate><atom:link href="https://blog.fcoinfup.com/tags/jetstream/index.xml" rel="self" type="application/rss+xml"/><item><title>Building a Multi-LLM Distributed Orchestrator with NATS JetStream</title><link>https://blog.fcoinfup.com/post/building-a-multi-llm-distributed-orchestrator-with-nats-jetstream/</link><pubDate>Fri, 08 May 2026 21:57:11 +0900</pubDate><guid>https://blog.fcoinfup.com/post/building-a-multi-llm-distributed-orchestrator-with-nats-jetstream/</guid><description>&lt;p&gt;Part 1 discussed the model-specific limitations discovered while running four AIs—Claude, ZAI, Codex, and Gemini—concurrently on the same tasks. This part is about &amp;ldquo;how we made it possible&amp;rdquo;—the system design and implementation story.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="system-overview"&gt;System Overview
&lt;/h2&gt;&lt;p&gt;AgentForge consists of three components.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[Task Publisher]
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; │ NATS JetStream publish
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ▼
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[NATS Broker] ─── af.worker.{id}.inbox
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; │ JetStream consume (independent streams per worker)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ▼
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[Worker Pollers] × N (poller.py × 18 instances)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; │ LLM CLI Execution (claude / codex / gemini)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ▼
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[Result Return] af.task.{task_id}.completed
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When a publisher posts a task to NATS, each worker, which is independently subscribed, receives the message on its inbox and executes the LLM CLI. The result is then published back to a completion topic.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="why-nats-jetstream"&gt;Why NATS JetStream?
&lt;/h2&gt;&lt;p&gt;We considered several message broker options: Redis Streams, Kafka, RabbitMQ, and NATS JetStream.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Reasons for choosing NATS JetStream:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Single Binary&lt;/strong&gt; — Operates with a single &lt;code&gt;nats-server&lt;/code&gt; without requiring separate runtimes. It has no dependencies like Kafka&amp;rsquo;s ZooKeeper or RabbitMQ&amp;rsquo;s Erlang/OTP.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Built-in Persistence&lt;/strong&gt; — JetStream is a streaming layer on top of NATS, storing messages to the filesystem. This ensures that unprocessed tasks are not lost even if a worker restarts.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;NKey-based Authentication&lt;/strong&gt; — We can issue independent Ed25519 key pairs for each worker. If one worker is compromised, the credentials of other workers remain valid.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Lightweight&lt;/strong&gt; — Memory usage is around 30MB on a single server. Even with 18 workers connected, the broker load is minimal.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;hr&gt;
&lt;h2 id="the-core-backend-adapter-in-pollerpy"&gt;The Core: Backend Adapter in &lt;code&gt;poller.py&lt;/code&gt;
&lt;/h2&gt;&lt;p&gt;The heart of the worker is &lt;code&gt;poller.py&lt;/code&gt;. This single file handles NATS subscriptions, LLM CLI execution, and result returns.&lt;/p&gt;
&lt;p&gt;Since LLMs have different execution methods, we separated them into a backend adapter dictionary.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;_BACKENDS: dict[str, dict] &lt;span style="color:#f92672"&gt;=&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;claude&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;bin&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;CLAUDE_BIN&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;/usr/local/bin/claude&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;tools&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;ALLOWED_TOOLS&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;Read,Edit,Write,Glob,Grep&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;model&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;CLAUDE_MODEL&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;codex&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;bin&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;CODEX_BIN&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;/usr/bin/codex&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;model&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;CODEX_MODEL&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;sandbox&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;CODEX_SANDBOX&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;read-only&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;gemini_cli&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;bin&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;GEMINI_BIN&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;/usr/bin/gemini&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;model&amp;#34;&lt;/span&gt;: os&lt;span style="color:#f92672"&gt;.&lt;/span&gt;environ&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;GEMINI_MODEL&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;MODEL_BACKEND&lt;/code&gt; environment variable determines which LLM to use. This allows the same &lt;code&gt;poller.py&lt;/code&gt; code to run different LLMs across 18 workers.&lt;/p&gt;
&lt;h3 id="claude-backend"&gt;Claude Backend
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;async&lt;/span&gt; &lt;span style="color:#66d9ef"&gt;def&lt;/span&gt; &lt;span style="color:#a6e22e"&gt;run_claude&lt;/span&gt;(instructions: str, task_id: str) &lt;span style="color:#f92672"&gt;-&amp;gt;&lt;/span&gt; tuple[int, str]:
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; cfg &lt;span style="color:#f92672"&gt;=&lt;/span&gt; _BACKENDS[&lt;span style="color:#e6db74"&gt;&amp;#34;claude&amp;#34;&lt;/span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; cmd &lt;span style="color:#f92672"&gt;=&lt;/span&gt; [cfg[&lt;span style="color:#e6db74"&gt;&amp;#34;bin&amp;#34;&lt;/span&gt;], &lt;span style="color:#e6db74"&gt;&amp;#34;--print&amp;#34;&lt;/span&gt;, &lt;span style="color:#e6db74"&gt;&amp;#34;--allowedTools&amp;#34;&lt;/span&gt;, cfg[&lt;span style="color:#e6db74"&gt;&amp;#34;tools&amp;#34;&lt;/span&gt;]]
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#66d9ef"&gt;if&lt;/span&gt; cfg&lt;span style="color:#f92672"&gt;.&lt;/span&gt;get(&lt;span style="color:#e6db74"&gt;&amp;#34;model&amp;#34;&lt;/span&gt;):
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; cmd &lt;span style="color:#f92672"&gt;+=&lt;/span&gt; [&lt;span style="color:#e6db74"&gt;&amp;#34;--model&amp;#34;&lt;/span&gt;, cfg[&lt;span style="color:#e6db74"&gt;&amp;#34;model&amp;#34;&lt;/span&gt;]]
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; proc &lt;span style="color:#f92672"&gt;=&lt;/span&gt; &lt;span style="color:#66d9ef"&gt;await&lt;/span&gt; asyncio&lt;span style="color:#f92672"&gt;.&lt;/span&gt;create_subprocess_exec(&lt;span style="color:#f92672"&gt;*&lt;/span&gt;cmd, stdin&lt;span style="color:#f92672"&gt;=&lt;/span&gt;PIPE, stdout&lt;span style="color:#f92672"&gt;=&lt;/span&gt;PIPE, stderr&lt;span style="color:#f92672"&gt;=&lt;/span&gt;PIPE)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;--print&lt;/code&gt; flag is key. It runs Claude Code in non-interactive mode instead of conversational mode, ensuring the results are returned via stdout.&lt;/p&gt;
&lt;h3 id="zai-backend"&gt;ZAI Backend
&lt;/h3&gt;&lt;p&gt;ZAI offers an Anthropic API-compatible endpoint, so it doesn&amp;rsquo;t require a separate backend. Routing is handled by two environment variables.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-ini" data-lang="ini"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# /etc/agentforge/cc-zai-high-dev-01.env&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;ANTHROPIC_BASE_URL&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;lt;ZAI endpoint&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;ANTHROPIC_AUTH_TOKEN&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;lt;ZAI API key&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;By injecting this file using systemd&amp;rsquo;s &lt;code&gt;EnvironmentFile=&lt;/code&gt; directive, the &lt;code&gt;claude&lt;/code&gt; binary sends requests to the ZAI endpoint. This allows us to connect to a different LLM provider simply by changing environment variables, without altering the code.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="declarative-management-fleetyaml--serversyaml"&gt;Declarative Management: &lt;code&gt;fleet.yaml&lt;/code&gt; × &lt;code&gt;servers.yaml&lt;/code&gt;
&lt;/h2&gt;&lt;p&gt;Manually managing 18 workers is impractical. We declaratively defined the entire infrastructure using two YAML files.&lt;/p&gt;
&lt;h3 id="serversyaml--server-inventory"&gt;&lt;code&gt;servers.yaml&lt;/code&gt; — Server Inventory
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;servers&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; - &lt;span style="color:#f92672"&gt;name&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-node-1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;role&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-host&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;services&lt;/span&gt;: [&lt;span style="color:#ae81ff"&gt;agentforge-worker, tunnel-arm1]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; - &lt;span style="color:#f92672"&gt;name&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;broker-host&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;role&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;broker-host&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;services&lt;/span&gt;: [&lt;span style="color:#ae81ff"&gt;nats-jetstream, postgres]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; - &lt;span style="color:#f92672"&gt;name&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-node-2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;role&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-host&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;services&lt;/span&gt;: [&lt;span style="color:#ae81ff"&gt;agentforge-worker, tunnel-arm1]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="fleetyaml--worker-placement"&gt;&lt;code&gt;fleet.yaml&lt;/code&gt; — Worker Placement
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;workers&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; - &lt;span style="color:#f92672"&gt;worker_id&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;cc-go-dev-01&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;llm&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;claude-code&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;model&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;claude-sonnet-4-6&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;lang&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;go&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;role&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;developer&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;host&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-node-1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;enabled&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;create_pr&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; - &lt;span style="color:#f92672"&gt;worker_id&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;codex-py-dev-01&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;llm&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;codex&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;model&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;gpt-5.5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;lang&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;python&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;role&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;developer&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;host&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-node-1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;enabled&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;create_pr&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;false&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Changing just the &lt;code&gt;host&lt;/code&gt; field moves a worker to a different server. Setting &lt;code&gt;enabled: false&lt;/code&gt; stops the deployment script from starting that worker.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="worker-templating-system-provision_workerpy"&gt;Worker Templating System: &lt;code&gt;provision_worker.py&lt;/code&gt;
&lt;/h2&gt;&lt;p&gt;Manually writing systemd unit files for each new worker is prone to errors. We automated this using Jinja2 templates and a provisioning script.&lt;/p&gt;
&lt;h3 id="template-structure"&gt;Template Structure
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;templates/
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; systemd/
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; claude.service.j2 # For claude-code and ZAI alike
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; codex.service.j2 # OpenAI Codex
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; gemini.service.j2 # Google Gemini CLI
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The core part of &lt;code&gt;claude.service.j2&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-gdscript3" data-lang="gdscript3"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;MODEL_BACKEND&lt;span style="color:#f92672"&gt;=&lt;/span&gt;claude
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;CLAUDE_BIN&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ claude_bin }}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&lt;span style="color:#f92672"&gt;%&lt;/span&gt; &lt;span style="color:#66d9ef"&gt;if&lt;/span&gt; claude_model &lt;span style="color:#f92672"&gt;%&lt;/span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;CLAUDE_MODEL&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ claude_model }}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&lt;span style="color:#f92672"&gt;%&lt;/span&gt; endif &lt;span style="color:#f92672"&gt;%&lt;/span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&lt;span style="color:#f92672"&gt;%&lt;/span&gt; &lt;span style="color:#66d9ef"&gt;if&lt;/span&gt; env_file &lt;span style="color:#f92672"&gt;%&lt;/span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;EnvironmentFile&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ env_file }}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&lt;span style="color:#f92672"&gt;%&lt;/span&gt; endif &lt;span style="color:#f92672"&gt;%&lt;/span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;WORK_BASE&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ work_base }}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;WORK_DIR&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ work_base }}&lt;span style="color:#f92672"&gt;/&lt;/span&gt;repo
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#34;{{ &amp;#39;ALLOWED_TOOLS=&amp;#39; + allowed_tools }}&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;CREATE_PR&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ &lt;span style="color:#e6db74"&gt;&amp;#39;true&amp;#39;&lt;/span&gt; &lt;span style="color:#66d9ef"&gt;if&lt;/span&gt; create_pr &lt;span style="color:#66d9ef"&gt;else&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#39;false&amp;#39;&lt;/span&gt; }}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&lt;span style="color:#f92672"&gt;%&lt;/span&gt; &lt;span style="color:#66d9ef"&gt;if&lt;/span&gt; create_pr &lt;span style="color:#f92672"&gt;and&lt;/span&gt; github_remote &lt;span style="color:#f92672"&gt;%&lt;/span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Environment&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;GITHUB_REMOTE&lt;span style="color:#f92672"&gt;=&lt;/span&gt;{{ github_remote }}
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;{&lt;span style="color:#f92672"&gt;%&lt;/span&gt; endif &lt;span style="color:#f92672"&gt;%&lt;/span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For ZAI workers, the &lt;code&gt;env_file&lt;/code&gt; block is activated, adding the &lt;code&gt;EnvironmentFile&lt;/code&gt;. For PR creation workers, &lt;code&gt;github_remote&lt;/code&gt; is injected. Other settings use defaults.&lt;/p&gt;
&lt;h3 id="provision_workerpy-usage"&gt;&lt;code&gt;provision_worker.py&lt;/code&gt; Usage
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# Preview (no actual deployment)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;python3 scripts/provision_worker.py --worker new-worker-id --dry-run
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# Actual deployment (including NATS creds issuance)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;python3 scripts/provision_worker.py --worker new-worker-id --issue-creds
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# Bulk deployment for the entire fleet.yaml&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;python3 scripts/provision_worker.py --all
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Internal operations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Reads worker entries from &lt;code&gt;fleet.yaml&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Reads target hosts from &lt;code&gt;servers.yaml&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Renders Jinja2 templates.&lt;/li&gt;
&lt;li&gt;Deploys &lt;code&gt;/etc/systemd/system/{worker_id}-poller.service&lt;/code&gt; via SSH.&lt;/li&gt;
&lt;li&gt;Creates the working directory.&lt;/li&gt;
&lt;li&gt;Executes &lt;code&gt;systemctl daemon-reload &amp;amp;&amp;amp; enable --now&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;(Optional) Issues NATS NKey with &lt;code&gt;nsc add user&lt;/code&gt; → deploys creds → regenerates &lt;code&gt;auth.conf&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;hr&gt;
&lt;h2 id="distributed-hosting-adding-workers-to-a-second-server"&gt;Distributed Hosting: Adding Workers to a Second Server
&lt;/h2&gt;&lt;p&gt;Running all workers on a single server creates a single point of failure. We added Claude workers to a second host.&lt;/p&gt;
&lt;p&gt;The method for workers on the second host to connect to the NATS broker is via an autossh tunnel.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-ini" data-lang="ini"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;[Unit]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Description&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;NATS Broker Tunnel&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;After&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;network-online.target&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;[Service]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;ExecStart&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;/usr/bin/autossh -N \
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#e6db74"&gt; -L 4222:127.0.0.1:4222 \
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#e6db74"&gt; -i /home/ubuntu/.ssh/id_ed25519 \
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#e6db74"&gt; broker-host&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;Restart&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;always&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;RestartSec&lt;/span&gt;&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;10&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;With this configuration active, workers always connect to &lt;code&gt;nats://127.0.0.1:4222&lt;/code&gt;. They don&amp;rsquo;t need to know the broker host&amp;rsquo;s address. As long as the tunnel is alive, it works the same way from any host.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="nats-credential-operations-experience"&gt;NATS Credential Operations Experience
&lt;/h2&gt;&lt;p&gt;NATS NKey management was the most complex part of the implementation.&lt;/p&gt;
&lt;p&gt;NATS JetStream&amp;rsquo;s authentication structure is hierarchical.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;Operator (Root Signing Authority)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; └── Account: SYS (System Account)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; └── Account: Services (Worker Account)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ├── User: cc-dev-01
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ├── User: cc-go-dev-01
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; ├── User: codex-py-dev-01
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; └── ...
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Each worker has an independent User NKey and can publish/subscribe within the permissions scope (&lt;code&gt;af.&amp;gt;&lt;/code&gt;, &lt;code&gt;_INBOX.&amp;gt;&lt;/code&gt;, &lt;code&gt;$JS.&amp;gt;&lt;/code&gt;) of the Services account.&lt;/p&gt;
&lt;p&gt;Adding a new worker requires the Operator&amp;rsquo;s signing key. We initially made the mistake of not backing up this key, leading to its loss. Consequently, we had to regenerate the entire Operator and replace all worker credentials en masse. The service downtime was approximately 60 seconds.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# Regeneration procedure&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;nsc add operator AgentForge
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;nsc add account SYS
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;nsc add account Services
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;for&lt;/span&gt; worker in cc-dev-01 cc-go-dev-01 ...; &lt;span style="color:#66d9ef"&gt;do&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; nsc add user --account Services --name $worker &lt;span style="color:#ae81ff"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; --allow-pub &lt;span style="color:#e6db74"&gt;&amp;#34;af.&amp;gt;,_INBOX.&amp;gt;,&lt;/span&gt;$JS&lt;span style="color:#e6db74"&gt;.&amp;gt;&amp;#34;&lt;/span&gt; &lt;span style="color:#ae81ff"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; --allow-sub &lt;span style="color:#e6db74"&gt;&amp;#34;af.&amp;gt;,_INBOX.&amp;gt;,&lt;/span&gt;$JS&lt;span style="color:#e6db74"&gt;.&amp;gt;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;done&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;nsc generate config --mem-resolver --sys-account SYS &amp;gt; auth.new.conf
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;hr&gt;
&lt;h2 id="adding-a-new-worker-the-full-procedure"&gt;Adding a New Worker: The Full Procedure
&lt;/h2&gt;&lt;p&gt;Since the completion of this system, adding a new worker is straightforward.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 1&lt;/strong&gt;: Add an entry to &lt;code&gt;fleet.yaml&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;- &lt;span style="color:#f92672"&gt;worker_id&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;my-new-worker&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;llm&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;claude-code&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;model&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;claude-haiku-4-5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;lang&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;multi&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;role&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;developer&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;host&lt;/span&gt;: &lt;span style="color:#ae81ff"&gt;worker-node-1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;enabled&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#f92672"&gt;create_pr&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;false&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Step 2&lt;/strong&gt;: Preview&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;python3 scripts/provision_worker.py --worker my-new-worker --dry-run
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Step 3&lt;/strong&gt;: Actual Deployment&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;python3 scripts/provision_worker.py --worker my-new-worker --issue-creds
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That&amp;rsquo;s it. Template rendering, SSH deployment, NATS credential issuance, and service registration are all handled by a single command.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="next-steps"&gt;Next Steps
&lt;/h2&gt;&lt;p&gt;The current system is structured such that workers process tasks independently. Future plans include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Routing Policies&lt;/strong&gt;: Automatically selecting the appropriate worker based on task characteristics (e.g., Go code → &lt;code&gt;claude-go-dev&lt;/code&gt;, cost-first → ZAI lightweight tier).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Results Comparison Dashboard&lt;/strong&gt;: A UI to display fan-out results side-by-side.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cost Tracking&lt;/strong&gt;: Aggregating API call costs per worker.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The code is publicly available on GitHub.&lt;/p&gt;</description></item></channel></rss>