Four servers (two media boxes, a Docker host, and a Kali machine) and two NAS boxes. Real infrastructure, running real services. The question was how much of it an agent could manage without being asked.

The hive structure

Hive | Domain | What it owns
Infrastructure | Servers, network, backups | 6 devices (media1, media2, docker-host, kali, NAS1, NAS2), backup scripts, repo sync
Security | Monitoring, anomaly detection | Security Minion v2.0, an autonomous Tier 3 worker agent
MCP | Tool integration | browser-use, aws-docs, aws-terraform, git, github, filesystem, memory
API | LLM access, cost control | OpenRouter integration, 4-tier model routing, cost tracking
Automation | Browser automation, data pipelines | File-based desktop integration, extraction staging, processing pipeline
Product | R&D, self-analysis | Pattern extraction, reference model, documentation

Each hive owns its own budget, configuration, data, and reporting. Cross-hive communication goes through the shared workflow system — issues, tasks, decisions — not direct calls.
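A minimal sketch of that rule, assuming an in-memory queue. The names `WorkflowItem`, `Workflow`, `post`, and `inbox` are hypothetical and not taken from the BuildABeast code; the point is only that a hive posts an issue, task, or decision and never calls another hive directly.

```python
from dataclasses import dataclass, field
from typing import Literal

# Hypothetical types for illustration; the real workflow system is not shown in the article.
ItemKind = Literal["issue", "task", "decision"]


@dataclass
class WorkflowItem:
    kind: ItemKind
    from_hive: str
    to_hive: str
    summary: str


@dataclass
class Workflow:
    """Shared queue: the only channel hives use to reach each other."""
    items: list[WorkflowItem] = field(default_factory=list)

    def post(self, item: WorkflowItem) -> None:
        self.items.append(item)

    def inbox(self, hive: str) -> list[WorkflowItem]:
        # Each hive reads only the items addressed to it.
        return [item for item in self.items if item.to_hive == hive]


workflow = Workflow()
workflow.post(WorkflowItem("issue", "security", "infrastructure",
                           "Unexpected port 2222 open on media1"))
workflow.post(WorkflowItem("task", "api", "product",
                           "Summarise monthly routing costs"))
```

Because the sender only needs the name of the receiving hive, adding a seventh hive would not require changes to the existing six.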

The Security Minion: an autonomous Tier 3 worker

The Security Minion runs a cycle: discover, learn, monitor, suppress, report. Discovery is automatic: Docker containers are found from compose files, servers by IP, NAS boxes by SMB probe. The first run builds a profile of expected ports and expected services. On every run after that, any deviation from the profile is flagged as an anomaly.
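The learn and monitor steps can be sketched as a set comparison. This is an illustration under assumptions, not the minion's actual code: scan results are assumed to arrive as a mapping of host name to open ports, and `build_profile` and `find_anomalies` are hypothetical names.

```python
def build_profile(scan: dict[str, set[int]]) -> dict[str, set[int]]:
    """First run: record the ports seen on each host as the expected baseline."""
    return {host: set(ports) for host, ports in scan.items()}


def find_anomalies(profile: dict[str, set[int]],
                   scan: dict[str, set[int]]) -> list[str]:
    """Later runs: report every deviation from the stored profile."""
    anomalies = []
    for host, ports in scan.items():
        expected = profile.get(host)
        if expected is None:
            anomalies.append(f"{host}: new host")
            continue
        for port in sorted(ports - expected):
            anomalies.append(f"{host}: unexpected port {port}")
        for port in sorted(expected - ports):
            anomalies.append(f"{host}: missing port {port}")
    # Hosts in the profile that did not answer this time are also deviations.
    for host in sorted(profile.keys() - scan.keys()):
        anomalies.append(f"{host}: host not seen")
    return anomalies


first_scan = {"media1": {22, 8080}, "NAS1": {445}}
profile = build_profile(first_scan)

later_scan = {"media1": {22, 8080, 2222}, "NAS1": set()}
anomalies = find_anomalies(profile, later_scan)
```

With this input, `anomalies` holds one unexpected port on media1 and one missing port on NAS1.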

Suppress is the step most monitoring systems skip. Without it, the noise builds up and you stop reading the alerts. The minion tracks whether each alert has been raised, resolved, or suppressed, so a known condition such as "port 8080 always open on the media server" stops firing.
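One way to picture the suppress step is a small tracker that keeps a state per alert. The class name `AlertTracker`, the alert key strings, and the three state labels are assumptions made for this sketch.

```python
class AlertTracker:
    """Tracks each alert key as 'raised', 'resolved', or 'suppressed'."""

    def __init__(self) -> None:
        self.state: dict[str, str] = {}

    def suppress(self, key: str) -> None:
        # A human (or policy) has accepted this condition as normal.
        self.state[key] = "suppressed"

    def process(self, observed: list[str]) -> list[str]:
        """Return only the alerts worth reporting on this run."""
        to_report = []
        for key in observed:
            if self.state.get(key) == "suppressed":
                continue  # known and accepted: stay silent
            if self.state.get(key) != "raised":
                to_report.append(key)  # new, or seen again after being resolved
            self.state[key] = "raised"
        # Anything raised earlier but no longer observed counts as resolved.
        for key, status in self.state.items():
            if status == "raised" and key not in observed:
                self.state[key] = "resolved"
        return to_report


tracker = AlertTracker()
first = tracker.process(["media1: unexpected port 8080"])
tracker.suppress("media1: unexpected port 8080")
second = tracker.process(["media1: unexpected port 8080",
                          "kali: unexpected port 4444"])
```

After the suppression, the second run reports only the kali alert; the media1 condition is still observed but no longer fires.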

The four-tier LLM routing model

Tier | Model | Cost per 1M tokens | Used for
Simple | Llama 3.1 8B | $0.06 | Status checks, log parsing, data extraction
Analysis | Small model | $0.25 | Trend analysis, basic recommendations
Complex | Mid-tier model | $3.00 | Policy decisions, multi-factor analysis
Critical | Large model | $15.00 | High-risk changes, compliance decisions

250x: the cost gap between the smallest and largest model tier ($0.06 versus $15.00 per 1M tokens), measured on real queries, not benchmarks. Most queries are simple (is this service running, what does this log line mean), so most traffic belongs on the cheapest tier. The Engine's issue routing system uses the same pattern with three tiers instead of four.
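The routing table and the 250x figure can be checked with a few lines. The prices and the "used for" categories come from the table above; the task-type keys, `route`, and `cost` are hypothetical names for this sketch, and how the real router classifies a query is not described in the article.

```python
from decimal import Decimal

# Cost per 1M tokens, from the tier table.
TIER_COST = {
    "simple": Decimal("0.06"),
    "analysis": Decimal("0.25"),
    "complex": Decimal("3.00"),
    "critical": Decimal("15.00"),
}

# Hypothetical task-type labels, derived from the "Used for" column.
TASK_TIER = {
    "status_check": "simple",
    "log_parsing": "simple",
    "data_extraction": "simple",
    "trend_analysis": "analysis",
    "basic_recommendation": "analysis",
    "policy_decision": "complex",
    "multi_factor_analysis": "complex",
    "high_risk_change": "critical",
    "compliance_decision": "critical",
}


def route(task_type: str) -> str:
    """Pick the tier for a task type; unknown types are rejected, not guessed."""
    if task_type not in TASK_TIER:
        raise ValueError(f"unknown task type: {task_type}")
    return TASK_TIER[task_type]


def cost(task_type: str, tokens: int) -> Decimal:
    """Cost in dollars of sending `tokens` tokens to the routed tier."""
    return TIER_COST[route(task_type)] * tokens / 1_000_000


gap = TIER_COST["critical"] / TIER_COST["simple"]  # 15.00 / 0.06 = 250
```

Two million tokens of status checks cost $0.12 on the simple tier; the same volume on the critical tier would cost $30.00.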

The policy lifecycle

The agent architecture specified a five-step policy lifecycle:

  1. Draft — create in YAML, define rules and thresholds, document rationale, estimate impact
  2. Test — run in dry-run mode, collect metrics, identify false positives, assess performance
  3. Validate — review test results, LLM analysis of outcomes, risk assessment, stakeholder approval
  4. Deploy — enable in production, monitor closely, collect feedback
  5. Iterate — analyse effectiveness, adjust thresholds, refine rules

A policy that's never been run in dry-run mode is a hypothesis. Gatekeep's YAML structure and review/deploy CLI are the production form of this lifecycle.
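The lifecycle reads naturally as a small state machine with gates. The sketch below is not Gatekeep's implementation; `Stage`, `Policy`, and the two gate flags are hypothetical, and the choice to block on a missing dry run or missing approval is an assumption drawn from the steps listed above.

```python
from enum import Enum


class Stage(Enum):
    DRAFT = 1
    TEST = 2
    VALIDATE = 3
    DEPLOY = 4
    ITERATE = 5


class Policy:
    def __init__(self, name: str) -> None:
        self.name = name
        self.stage = Stage.DRAFT
        self.dry_run_done = False
        self.approved = False

    def record_dry_run(self) -> None:
        self.dry_run_done = True

    def approve(self) -> None:
        self.approved = True

    def advance(self) -> Stage:
        """Move to the next stage, enforcing the gates between stages."""
        if self.stage is Stage.ITERATE:
            raise ValueError("already at the final stage")
        next_stage = Stage(self.stage.value + 1)
        if next_stage is Stage.VALIDATE and not self.dry_run_done:
            raise ValueError("no dry-run results yet: still a hypothesis")
        if next_stage is Stage.DEPLOY and not self.approved:
            raise ValueError("validation not approved")
        self.stage = next_stage
        return next_stage


policy = Policy("port-baseline")
policy.advance()          # DRAFT -> TEST
policy.record_dry_run()
policy.advance()          # TEST -> VALIDATE
policy.approve()
policy.advance()          # VALIDATE -> DEPLOY
```

Calling `advance()` from TEST without `record_dry_run()` raises, which is the code form of "a policy that has never been dry-run is a hypothesis".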

Risk-aware decision making

Before any change: a risk assessment. Four dimensions (impact scope, data sensitivity, reversibility, financial impact) map to four risk levels, each with a corresponding model tier and approval requirement:

  • Low risk: smallest model tier, no approvals needed, optional rollback plan
  • Medium risk: small model, manager approval, standard testing, rollback recommended
  • High risk: mid-tier model, manager + compliance approval, thorough testing, rollback required
  • Critical risk: large model, CEO + legal + compliance approval, extensive testing, rollback required
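The mapping from risk level to model tier, approvals, and rollback can be written as a lookup table. How the four dimensions combine into one level is not stated in the article; this sketch assumes each dimension is scored 0 to 3 and the highest score sets the level. `RiskScores` and `REQUIREMENTS` are hypothetical names.

```python
from dataclasses import dataclass

LEVELS = ["low", "medium", "high", "critical"]

# Tier, approvals, and rollback per level, taken from the list above.
REQUIREMENTS = {
    "low": {"tier": "simple", "approvals": [], "rollback": "optional"},
    "medium": {"tier": "analysis", "approvals": ["manager"],
               "rollback": "recommended"},
    "high": {"tier": "complex", "approvals": ["manager", "compliance"],
             "rollback": "required"},
    "critical": {"tier": "critical", "approvals": ["CEO", "legal", "compliance"],
                 "rollback": "required"},
}


@dataclass
class RiskScores:
    """Each dimension scored 0 (negligible) to 3 (severe). Scale is an assumption."""
    impact_scope: int
    data_sensitivity: int
    reversibility: int
    financial_impact: int

    def level(self) -> str:
        scores = [self.impact_scope, self.data_sensitivity,
                  self.reversibility, self.financial_impact]
        if any(not 0 <= s <= 3 for s in scores):
            raise ValueError("scores must be between 0 and 3")
        # Assumption: the worst single dimension decides the overall level.
        return LEVELS[max(scores)]


change = RiskScores(impact_scope=1, data_sensitivity=0,
                    reversibility=2, financial_impact=0)
requirements = REQUIREMENTS[change.level()]
```

Here a change that is hard to reverse lands at high risk even though its other scores are low, so it needs manager and compliance approval and a rollback plan.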

What carried forward

BuildABeast was created on 18 January 2026. Brood-Hive: 5 February. Engine: 13 February. The hive structure, tiered routing, monitoring patterns, and policy lifecycle all moved from home lab to cloud infrastructure in three weeks. The "autonomous software factory" framing came directly from the BuildABeast docs. Paperclip formalised it six weeks later.

The hive pattern in practice: Each hive owns its domain completely — its own configuration, data, logs, reports, budget. Cross-hive communication happens through the shared workflow system. This is why the pattern scales: adding a new hive doesn't change how existing hives work.
