OpenClaw supports running multiple agents on different machines, all coordinated through a single gateway. Each agent can have its own model, its own cron jobs, and its own workspace. The gateway handles routing, session management, and inter-agent communication.

The typical setup: a lightweight gateway machine (like a MacBook Air) runs the main agent with API-based models, while beefier machines with GPUs run local models for batch processing jobs. This keeps the gateway responsive while offloading heavy inference to dedicated hardware.
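To make the split concrete, here is a sketch of such a fleet layout. Every name and field below is illustrative, not OpenClaw's actual configuration schema:

```python
# Hypothetical fleet layout -- hostnames, model strings, and field
# names are made up for illustration, not OpenClaw's real config.
FLEET = {
    "gateway": {"host": "macbook-air.local", "model": "api"},
    "workers": [
        {"host": "desktop-4090.local", "model": "ollama:qwen2.5:14b", "gpu": True},
        {"host": "homelab.local", "model": "ollama:llama3.1:8b", "gpu": True},
    ],
}

def gpu_workers(fleet: dict) -> list[str]:
    """Hosts that can absorb GPU-heavy batch inference."""
    return [w["host"] for w in fleet["workers"] if w.get("gpu")]
```

The gateway entry stays API-backed and light; everything flagged `gpu` is a candidate for batch work.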

For solo developers and power users, this means you can run 20+ cron jobs across your hardware fleet without any single machine getting bogged down. Your Discord responses stay fast while background agents churn through tasks on separate GPUs.

/* TL;DR */

Fleet mode turns your hardware collection into a distributed AI workforce. Gateway stays light, agents run where the compute is.

The architecture is deliberately simple. The gateway process runs on one machine and handles all external communication: Discord, WhatsApp, Telegram, whatever channels you've configured. Agents on other machines connect back to the gateway, while running their inference locally through Ollama or another model provider.
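A worker agent's local inference step can be sketched against Ollama's HTTP API. The `/api/generate` endpoint and its request shape are real Ollama; the agent-to-gateway wiring around it is hypothetical and not shown:

```python
import json
import urllib.request

# Ollama's default local endpoint for non-streaming generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Minimal request body for Ollama's /api/generate."""
    return {"model": model, "prompt": prompt, "stream": False}

def run_inference(model: str, prompt: str) -> str:
    """Blocking call to the local Ollama server; the worker would
    ship the result back to the gateway (hand-off not shown)."""
    body = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The gateway never touches model weights; it only sees the text that comes back.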

Cron jobs are the killer feature for fleet management. You can schedule tasks to run on specific agents, ensuring that GPU-intensive local model inference happens on your desktop while your laptop stays cool and responsive.
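One way to picture agent-targeted scheduling is as a routing table from jobs to agents. The job format below is invented for illustration; OpenClaw's actual cron configuration may look different:

```python
# Hypothetical job specs: each cron job names the agent it should run
# on, so GPU-heavy work lands on the desktop and the gateway stays idle.
JOBS = [
    {"name": "nightly-code-review", "agent": "desktop", "schedule": "0 2 * * *"},
    {"name": "batch-data-analysis", "agent": "desktop", "schedule": "0 4 * * *"},
    {"name": "inbox-triage", "agent": "gateway", "schedule": "*/15 * * * *"},
]

def jobs_for(agent: str, jobs: list[dict]) -> list[str]:
    """Names of the jobs routed to a given agent."""
    return [j["name"] for j in jobs if j["agent"] == agent]
```

Here only the lightweight triage job ever runs on the gateway machine; the two heavy jobs are pinned to the GPU desktop.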

The key insight: you don't need expensive cloud GPUs. A consumer GPU with 12-16GB of VRAM running a 14B-parameter model handles most agent tasks: editorial review, content generation, data analysis, and code review.
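The back-of-envelope math supports this: a 14B-parameter model quantized to 4 bits needs roughly 14e9 × 0.5 bytes ≈ 7 GB for the weights alone, leaving headroom for the KV cache and runtime overhead on a 12-16GB card. A quick sanity check:

```python
def weight_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate VRAM needed for model weights alone, in GB
    (ignores KV cache and runtime overhead)."""
    return params_billion * 1e9 * (bits_per_param / 8) / 1e9

# 14B model at 4-bit quantization: about 7 GB of weights.
print(weight_gb(14, 4))  # 7.0
```

At 8-bit the same model needs about 14 GB, which is why 4-bit quantization is the practical default on consumer cards.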