OpenFang on AWS: Deploying an Open-Source Agent OS with Amazon Bedrock
1. Introduction
This post walks through deploying OpenFang, an open-source Agent Operating System written in Rust, on AWS using Amazon Bedrock as the LLM backend. OpenFang compiles to a single ~32 MB binary, ships with seven pre-built autonomous agent packages called “Hands,” and includes 16 independent security layers, making it a compelling candidate for teams evaluating autonomous agent infrastructure.
By the end of this series, you will understand:
- What an Agent OS is and how it differs from agent frameworks like LangChain or CrewAI
- How OpenFang compares to OpenClaw, another open-source agent platform
- How to deploy OpenFang on a private EC2 instance with zero inbound ports, IAM-based authentication, and Amazon Bedrock integration via a LiteLLM sidecar proxy
- What a Well-Architected Framework (WAF) security review reveals about deploying autonomous agents, and how to remediate the findings
This is a development and test deployment. Production hardening guidance, including the full WAF review results and remediation steps, is covered in later sections. All infrastructure is deployed with AWS CDK and can be torn down with a single command.
2. What Is an Agent OS?
The concept
An Agent OS is a runtime that manages autonomous agents the way a traditional operating system manages processes. It handles scheduling, resource isolation, inter-process communication, and lifecycle management. Agents are not functions you call; they are long-running entities that the OS starts, monitors, suspends, and resumes.
This is a meaningful distinction from what most practitioners encounter when working with agent frameworks.
Agent frameworks vs. an Agent OS
Frameworks like LangChain, CrewAI, AutoGen, and LangGraph are libraries. You import them into your application code, define agent logic in Python, and execute that logic within your application’s process. When your application stops, the agents stop. There is no daemon, no scheduler, no resource metering independent of your code.
An Agent OS operates differently. It runs as a standalone daemon, comparable to systemd managing services on Linux. Agents are declared through configuration manifests, not imperative code. The OS starts them on schedules, enforces resource budgets, isolates their execution environments, records an immutable audit trail, and manages their entire lifecycle independent of any user session.
Consider a concrete example: OpenFang’s Researcher Hand. Once activated, it decomposes research questions into sub-queries, searches the web across multiple sources, cross-references findings using credibility evaluation criteria (CRAAP: Currency, Relevance, Authority, Accuracy, Purpose), builds a knowledge graph from the results, and delivers a cited report. This happens autonomously on a schedule. No user prompt triggers each step.
OpenFang specifics
OpenFang is built in Rust: 14 crates totaling 137,000+ lines of code, with 1,767+ passing tests, compiling to a single ~32 MB binary. The architecture includes:
- Kernel: Orchestration, RBAC, metering, scheduling, and budget tracking
- Runtime: Agent execution loop, 53 built-in tools, WASM sandbox, MCP and A2A protocol support
- Hands: Seven bundled autonomous agent packages (Researcher, Lead, Collector, Predictor, Twitter, Clip, Browser), each with a TOML manifest, multi-phase system prompt, domain expertise reference, and guardrails
What qualifies it as an “OS” rather than a “framework” comes down to kernel-level capabilities that frameworks do not provide:
| Capability | Agent OS (OpenFang) | Agent Frameworks (LangChain, CrewAI, AutoGen, LangGraph) |
|---|---|---|
| Execution model | Daemon process; agents run independently of user sessions | Library; agents execute within your application process |
| Scheduling | Built-in cron-style scheduler; agents wake on schedules | No native scheduling; requires external orchestration (Airflow, cron) |
| Resource isolation | WASM dual-metered sandbox with fuel metering and epoch interruption | No sandboxing (CrewAI, LangGraph) or Docker-level isolation (AutoGen) |
| Audit trail | Merkle hash-chain: cryptographically linked, tamper-evident | Application-level logging; no integrity guarantees |
| Daemon mode | Native systemd support; runs as a background service | Not applicable; terminates when the calling process exits |
| Capability control | Kernel-enforced RBAC; agents declare required tools, kernel gates access | Trust-based; agents access whatever the code imports |
| Security layers | 16 independent layers (SSRF protection, taint tracking, prompt injection scanning, rate limiting, etc.) | 0–2 layers depending on framework |
| Bundled agents | 7 production-ready Hands with multi-phase operational playbooks | None; you build agent logic from scratch |
| Channel adapters | 40 native adapters (Telegram, Slack, Discord, WhatsApp, Teams, etc.) | 0 native adapters; requires custom integration |
The distinction is not about capability; you can build sophisticated agents with any of these frameworks. The distinction is about operational maturity. An Agent OS treats agents as first-class managed workloads, not as code paths in your application.
3. OpenFang vs. OpenClaw: Choosing the Right Tool
Both OpenFang and OpenClaw are open-source, MIT-licensed, and built for AI agents. Both support multiple LLM providers. But they solve fundamentally different problems, and understanding the distinction matters when selecting the right tool for a given workload.
OpenFang: Autonomous workers
OpenFang is a Rust-based Agent OS designed for autonomous task execution. Its seven bundled Hands run on schedules, produce deliverables (research reports, qualified lead lists, OSINT intelligence), and build knowledge graphs, all without human prompting. It runs as a daemon, ships as a single ~32 MB binary with a cold start under 200 ms, and includes 16 security layers with a WASM-sandboxed execution environment. It supports 40 channel adapters for delivering results and 27 LLM providers for model routing.
The interaction model is declarative: you activate a Hand, configure its schedule and parameters, and it operates autonomously. The dashboard and API provide monitoring and control, but the agents do not depend on conversational input to function.
OpenClaw: Conversational assistant
OpenClaw is a Node.js/TypeScript platform designed as a personal AI assistant. Its primary interfaces are messaging platforms: WhatsApp, Telegram, and Discord. It provides rich conversational memory, tool orchestration, sub-agent delegation, and cron-based scheduling. The footprint is larger (~500 MB install, ~6 second cold start), which reflects its richer runtime ecosystem for conversational AI.
The interaction model is conversational: you chat with OpenClaw through a messaging platform, and it uses tools and memory to assist you. It excels at context-rich, multi-turn interactions where the human is an active participant.
Comparison
| Dimension | OpenFang | OpenClaw |
|---|---|---|
| Language | Rust | Node.js / TypeScript |
| Install size | ~32 MB | ~500 MB |
| Cold start | <200 ms | ~6 s |
| Primary mode | Autonomous daemon | Conversational assistant |
| Bundled agents | 7 Hands (Researcher, Lead, Collector, Predictor, Twitter, Clip, Browser) | None (you configure assistants) |
| Security layers | 16 (WASM sandbox, taint tracking, Merkle audit, SSRF protection, etc.) | 3 (basic access controls) |
| Channel adapters | 40 | 13 |
| LLM providers | 27 | 10 |
| Best for | Scheduled autonomous tasks: research, lead gen, OSINT, monitoring | Personal AI assistant: chat-based interaction on messaging platforms |
When to use which
Choose OpenFang when you need agents that run autonomously on schedules: daily lead generation, continuous OSINT monitoring, automated research pipelines. When you want a lightweight daemon on a small EC2 instance. When your security requirements demand WASM sandboxing, taint tracking, and a tamper-evident audit trail.
Choose OpenClaw when you want a personal AI assistant accessible through WhatsApp or Telegram. When you need rich conversational memory and multi-turn context. When your primary interaction model is chat-based and the human is in the loop for most decisions.
Use both together for the most capable architecture: OpenFang runs the autonomous backend work (research, monitoring, lead generation) while OpenClaw delivers the results conversationally to your messaging platform and handles human-in-the-loop interactions. OpenFang produces; OpenClaw communicates.
The key insight is that these are complementary tools, not competitors. One automates; the other converses.
4. Why Deploy on AWS?
OpenFang runs well on a laptop for testing. But autonomous agents, by definition, need to run when you are not at your desk.
If the Researcher Hand is scheduled for 6 AM daily, or the Collector is monitoring a target continuously, your laptop needs to be open, awake, and connected. For anything beyond casual experimentation, always-on infrastructure becomes a requirement.
AWS provides four specific advantages for this workload:
Amazon Bedrock: Managed LLM Access
Bedrock provides access to Claude, Nova, Llama, and other foundation models through a single API with IAM-based authentication. No GPU provisioning, no model hosting, no inference infrastructure to manage. You pay per token. For an always-on agent that may invoke the LLM dozens of times per day, this is significantly simpler and more cost-effective than self-hosting.
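To make the "single API" point concrete, here is a minimal sketch of a direct Bedrock call with the AWS SDK for JavaScript v3. The model ID is the cross-region inference profile used later in this post; it is an assumption that this model is enabled in your account and Region.

```typescript
import { BedrockRuntimeClient, ConverseCommand } from "@aws-sdk/client-bedrock-runtime";

// Credentials come from the default provider chain -- on EC2, the instance profile via IMDS.
const client = new BedrockRuntimeClient({ region: "us-west-2" });

const response = await client.send(
  new ConverseCommand({
    // Cross-region inference profile ID from this post (assumed enabled in your account)
    modelId: "us.anthropic.claude-sonnet-4-6",
    messages: [{ role: "user", content: [{ text: "Summarize the latest findings." }] }],
  })
);

console.log(response.output?.message?.content?.[0]?.text);
```

No API key appears anywhere; IAM and the SDK's credential chain do all the authentication work.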
IAM Instance Profiles: Zero Static Credentials
The EC2 instance obtains temporary, automatically rotating AWS credentials via the Instance Metadata Service (IMDS). There is no .env file with an ANTHROPIC_API_KEY or AWS_SECRET_ACCESS_KEY. The credentials rotate automatically, cannot be exfiltrated from a config file, and are scoped to precisely the IAM permissions the agent needs.
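A hedged CDK sketch of what that scoping can look like. The `instanceRole` variable and the ARN patterns are illustrative assumptions; narrow the resources to the exact models and inference profiles your agent uses.

```typescript
import * as iam from "aws-cdk-lib/aws-iam";

// Invoke-only Bedrock access for the instance role (ARN patterns are illustrative --
// tighten them to your specific model and inference-profile ARNs).
const bedrockInvokePolicy = new iam.PolicyStatement({
  actions: ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
  resources: [
    "arn:aws:bedrock:*::foundation-model/anthropic.*",
    "arn:aws:bedrock:*:*:inference-profile/us.anthropic.*",
  ],
});

// `instanceRole` is the role attached to the EC2 instance profile (assumed defined elsewhere).
instanceRole.addToPolicy(bedrockInvokePolicy);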
SSM Session Manager: Zero Inbound Ports
With SSM, the instance has no SSH port, no bastion host, no VPN endpoint. Access is authenticated via IAM, encrypted in transit, and logged to CloudTrail. Port forwarding to the OpenFang dashboard works through the same channel. The attack surface is effectively zero from a network ingress perspective.
VPC Isolation
The instance runs in a private subnet with no public IP. A NAT Gateway provides outbound internet access for web research and package downloads, while a VPC Endpoint (PrivateLink) routes Bedrock API traffic privately within the AWS network. Nothing on the internet can initiate a connection to the instance. Combined with SSM and a restricted security group, this creates a defense-in-depth posture.
| Dimension | Local (Laptop) | AWS (EC2 + Bedrock) |
|---|---|---|
| Availability | Only when laptop is open | 24/7 |
| LLM access | API keys in env files | IAM instance profile; no static keys |
| Credential security | Plaintext on disk | Rotating via IMDS, scoped by IAM policy |
| Network access | Home network, shared WiFi | Private subnet, no inbound ports |
| Cost | $0 infra (your electricity) | ~$70–102/month + Bedrock tokens |
| Monitoring | None (unless you build it) | CloudWatch, VPC Flow Logs, CloudTrail |
The cost of this deployment (roughly $70/month when reusing an existing VPC, or $102/month with a new VPC) is modest for an always-on autonomous agent with enterprise-grade security.
5. The Bedrock Integration Challenge: The LiteLLM Proxy Pattern
The Problem
OpenFang’s model catalog lists eight Bedrock models and defines a BEDROCK_BASE_URL constant. On the surface, it appears Bedrock-ready. It is not.
Examine the source code (crates/openfang-runtime/src/drivers/mod.rs) and the gap becomes clear: the `provider_defaults("bedrock")` function returns `None`; there is no match arm for the Bedrock provider. The `create_driver()` function has no special case for Bedrock either, unlike Anthropic, Gemini, or Codex, which have dedicated driver logic. The fallback path creates an OpenAI-compatible driver and sets `base_url` to the configured endpoint.
This fallback would work, except for authentication. The OpenAI-compatible driver sends `Authorization: Bearer <token>` headers. Amazon Bedrock requires AWS Signature Version 4 (SigV4) signing. These are fundamentally incompatible authentication schemes.
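To illustrate why no header rewrite can bridge the gap, here is a conceptual sketch of what SigV4 signing involves (roughly what LiteLLM performs on OpenFang's behalf), using the Smithy signing libraries. The endpoint, path, and model ID mirror this post's setup; this is illustrative, not a replacement for the proxy.

```typescript
import { SignatureV4 } from "@smithy/signature-v4";
import { HttpRequest } from "@smithy/protocol-http";
import { Sha256 } from "@aws-crypto/sha256-js";
import { defaultProvider } from "@aws-sdk/credential-provider-node";

// SigV4 derives a signature from the credentials, region, service, timestamp,
// and a hash of the entire request body -- nothing like a static Bearer token.
const signer = new SignatureV4({
  service: "bedrock",              // signing name for the Bedrock runtime
  region: "us-west-2",
  credentials: defaultProvider(),  // on EC2, resolves the instance profile via IMDS
  sha256: Sha256,
});

const signed = await signer.sign(
  new HttpRequest({
    method: "POST",
    protocol: "https:",
    hostname: "bedrock-runtime.us-west-2.amazonaws.com",
    path: "/model/us.anthropic.claude-sonnet-4-6/converse",
    headers: { host: "bedrock-runtime.us-west-2.amazonaws.com", "content-type": "application/json" },
    body: JSON.stringify({ messages: [{ role: "user", content: [{ text: "ping" }] }] }),
  })
);

// signed.headers.authorization now looks like:
// AWS4-HMAC-SHA256 Credential=.../us-west-2/bedrock/aws4_request, SignedHeaders=..., Signature=...
```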
The Solution: LiteLLM Sidecar Proxy
LiteLLM is an open-source proxy that translates between OpenAI-compatible API calls and 100+ LLM providers, including Amazon Bedrock. It runs as a lightweight sidecar container alongside OpenFang.
The integration works as follows:
- OpenFang sends a standard `POST /v1/chat/completions` request to `http://litellm:4000` with a Bearer token
- LiteLLM receives the request and identifies the target model from its configuration
- LiteLLM reads EC2 instance profile credentials from IMDS (via boto3's standard credential chain)
- LiteLLM signs the request with SigV4 and sends it to `bedrock-runtime.{region}.amazonaws.com`, which resolves via private DNS to the VPC Endpoint (PrivateLink), so the request never leaves the AWS network
- The Bedrock response flows back through LiteLLM to OpenFang in the OpenAI response format
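Seen from the client side, the first step is an ordinary OpenAI-style HTTP call. A minimal sketch, assuming you are on the EC2 host (where LiteLLM publishes 127.0.0.1:4000) and have the generated master key in LITELLM_KEY:

```typescript
// Minimal OpenAI-style chat completion request against the LiteLLM sidecar.
const res = await fetch("http://localhost:4000/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.LITELLM_KEY}`, // LiteLLM master key, not an AWS credential
  },
  body: JSON.stringify({
    model: "anthropic.claude-sonnet-4-6", // bare name, as OpenFang sends it (see translation below)
    messages: [{ role: "user", content: "Hello from the sidecar test" }],
  }),
});

const data = await res.json();
console.log(data.choices[0].message.content);
```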
Model Name Translation
This seemingly simple proxy introduces a three-layer name translation problem:
- OpenFang catalog name: `bedrock/anthropic.claude-sonnet-4-6`
- Bedrock foundation model ID: `anthropic.claude-sonnet-4-6-v1`
- Cross-region inference profile ID: `us.anthropic.claude-sonnet-4-6`
OpenFang strips the bedrock/ prefix before sending to LiteLLM. LiteLLM needs to map the bare name to the correct Bedrock model identifier including the us. prefix for cross-region inference.
The LiteLLM configuration handles this with a model list that registers both the prefixed and bare names:
```yaml
model_list:
  - model_name: "bedrock/anthropic.claude-sonnet-4-6"
    litellm_params:
      model: "bedrock/us.anthropic.claude-sonnet-4-6"
      aws_region_name: "us-west-2"
  - model_name: "anthropic.claude-sonnet-4-6"
    litellm_params:
      model: "bedrock/us.anthropic.claude-sonnet-4-6"
      aws_region_name: "us-west-2"

general_settings:
  master_key: "${LITELLM_KEY}"
```
The second entry (without the bedrock/ prefix) catches requests from OpenFang after it strips the prefix. Both entries route to the same underlying Bedrock model.
Why This Pattern Matters Beyond OpenFang
The LiteLLM sidecar pattern is not OpenFang-specific. Any open-source agent framework that speaks the OpenAI API (LangChain, AutoGen, CrewAI, LangGraph, or any custom agent code using the OpenAI Python SDK) can use the identical setup to access Amazon Bedrock. The framework talks to localhost:4000 as if it were the OpenAI API. LiteLLM handles authentication, model routing, and response translation transparently.
This makes the LiteLLM sidecar a reusable architecture pattern for running open-source AI tools on AWS with Bedrock.
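For example, pointing the official openai npm client at the sidecar takes one constructor option; the Python SDK accepts the same base_url override. A minimal sketch:

```typescript
import OpenAI from "openai";

// Point any OpenAI-compatible client at the LiteLLM sidecar instead of api.openai.com.
const client = new OpenAI({
  baseURL: "http://localhost:4000/v1",   // LiteLLM sidecar endpoint
  apiKey: process.env.LITELLM_KEY ?? "", // LiteLLM master key stands in for an OpenAI key
});

const completion = await client.chat.completions.create({
  model: "bedrock/anthropic.claude-sonnet-4-6", // routed to Bedrock by the proxy's model_list
  messages: [{ role: "user", content: "Which model am I talking to?" }],
});

console.log(completion.choices[0].message.content);
```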
6. Architecture Deep Dive
VPC Design
The CDK stack provisions a VPC with a 10.0.0.0/16 CIDR block containing a public subnet (hosting the NAT Gateway) and a private subnet (hosting the EC2 instance). A VPC Interface Endpoint (PrivateLink) for Bedrock Runtime ensures that all LLM API traffic stays within the AWS network, never traversing the NAT Gateway or public internet. For development and testing, this is a single-AZ deployment. The stack also supports reusing an existing VPC by passing -c vpcId=vpc-xxx at deploy time, which skips VPC creation entirely.
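A condensed sketch of those networking pieces in CDK. Construct names are illustrative, and the snippet assumes it runs inside the stack class; the actual stack also adds the VPC-reuse branch described above.

```typescript
import * as ec2 from "aws-cdk-lib/aws-ec2";

// Single-AZ dev/test VPC: a public subnet for the NAT Gateway,
// a private subnet (with egress) for the EC2 instance.
const vpc = new ec2.Vpc(this, "OpenFangVpc", {
  ipAddresses: ec2.IpAddresses.cidr("10.0.0.0/16"),
  maxAzs: 1,
  natGateways: 1,
  subnetConfiguration: [
    { name: "public", subnetType: ec2.SubnetType.PUBLIC },
    { name: "private", subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
  ],
});

// PrivateLink endpoint so Bedrock traffic never traverses the NAT Gateway or public internet.
vpc.addInterfaceEndpoint("BedrockRuntimeEndpoint", {
  service: ec2.InterfaceVpcEndpointAwsService.BEDROCK_RUNTIME,
  privateDnsEnabled: true, // bedrock-runtime.{region}.amazonaws.com resolves to the endpoint
});
```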
EC2 Instance
| Specification | Value |
|---|---|
| Instance type | t3.xlarge (4 vCPU, 16 GB RAM) |
| AMI | Amazon Linux 2023 (latest) |
| EBS | 30 GB gp3, encrypted at rest |
| Public IP | None |
| Key pair | None (SSM access only) |
| IMDSv2 | Required, hop limit = 2 |
| Placement | Private subnet |
The t3.xlarge instance is sized for the Docker build phase: compiling OpenFang's 137K lines of Rust code requires more than 4 GB of RAM. After the initial build, the runtime needs are modest (~200–500 MB under load). For production, you could build the image separately and push it to ECR, then run on a t3.medium.
Docker Compose: Two Containers
The deployment uses Docker Compose to run two containers:

OpenFang: Runs with `network_mode: host` because it binds to 127.0.0.1:50051 and ignores the `listen_addr` configuration override. Docker port mapping cannot reach a loopback-bound process inside a container, so host networking is required. The OpenFang API and dashboard are accessible on localhost:50051.

LiteLLM: Runs in standard Docker bridge mode, publishing port 4000 to 127.0.0.1:4000 on the host. It mounts the `litellm_config.yaml` file and uses the EC2 instance profile for Bedrock authentication. Since OpenFang shares the host network namespace, it reaches the proxy through the host loopback at 127.0.0.1:4000.
Neither port is exposed to the VPC. They are bound to localhost only. Access from an operator’s workstation uses SSM port forwarding:
```bash
aws ssm start-session \
  --target <instance-id> \
  --document-name AWS-StartPortForwardingSession \
  --parameters '{"portNumber":["50051"],"localPortNumber":["4200"]}' \
  --region us-west-2
```
IMDSv2 Hop Limit
A critical detail: Docker containers add a network hop when accessing the EC2 Instance Metadata Service. The default IMDSv2 hop limit of 1 blocks container access to IMDS. This means LiteLLM cannot obtain instance profile credentials, resulting in NoCredentialsError from boto3.
The fix is a single CDK configuration:
```typescript
instance.instance.addPropertyOverride(
  "MetadataOptions.HttpPutResponseHopLimit", 2
);
```
UserData Bootstrap Sequence
The EC2 UserData script runs the full provisioning sequence at launch:
- Install system packages (Docker, git, build tools)
- Install Docker Compose
- Clone OpenFang source code
- Generate secrets dynamically: `openssl rand -hex 32` for both the LiteLLM master key and the OpenFang API key (no hardcoded values)
- Write LiteLLM configuration (`litellm_config.yaml`)
- Write OpenFang configuration (`config.toml`)
- Write Docker Compose manifest
- Run `docker compose up -d --build`
- Wait for services, then activate the Researcher Hand via the API
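In CDK, this sequence is assembled as UserData attached to the instance. A compressed, illustrative sketch; the package names, repository URL, and exact commands are assumptions standing in for the real script, which also writes the full configuration files:

```typescript
import * as ec2 from "aws-cdk-lib/aws-ec2";

// Condensed bootstrap -- the actual UserData also writes litellm_config.yaml,
// config.toml, and the Compose manifest before bringing the stack up.
const userData = ec2.UserData.forLinux();
userData.addCommands(
  "dnf install -y docker git",                                     // assumed package set for AL2023
  "systemctl enable --now docker",
  "git clone https://github.com/openfang/openfang /opt/openfang",  // repository URL assumed
  "export LITELLM_KEY=$(openssl rand -hex 32)",                    // secrets generated at launch
  "export OPENFANG_API_KEY=$(openssl rand -hex 32)",
  "cd /opt/openfang && docker compose up -d --build"
);

// Attached at instance construction time:
// new ec2.Instance(this, "OpenFang", { ..., userData });
```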
Access Model
The operator’s access path: Workstation → AWS CLI → SSM Session Manager (HTTPS, IAM-authenticated, CloudTrail-logged) → Port forward → localhost:4200 → OpenFang dashboard. At no point does a port open to the internet. Bedrock API calls follow a separate path through the VPC Endpoint (PrivateLink), never touching the NAT Gateway or public internet.
Continue reading: Part 2: Security Review, Deployment and Lessons Learned