Most AI agent projects start by asking "what can we make this agent do?" OMNI started from a different question: "how does the agent know what it can't do — and what should it do about that?"
The answer shaped the whole architecture. Instead of a single capable agent that pretends it can handle anything, OMNI is a mesh of registered capabilities — each one discrete, governed, and independently deployable — with an orchestration layer that discovers what's available and composes what's needed. When a request falls outside what's registered, OMNI says so clearly. And it records that gap as a demand signal for the next thing to build.
The capability mesh pattern
A capability is a single thing an agent can do: run a security scan, review a Terraform plan, look up a compliance rule, analyse a document, generate code for a task. Each capability is a Lambda function registered with a gateway, with a name, a description of what it does, and a contract for inputs and outputs.
The orchestration layer — the agent itself — doesn't implement any of these. It discovers them from the registry at startup, and at request time it decides which ones to invoke and in what order. Adding a new capability means registering a new Lambda. The agent picks it up automatically on next startup; no changes to the agent code.
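In miniature, the registration contract and the discovery loop look something like this. The names and schemas below are my illustration, not the real contract: in OMNI the registry sits behind AgentCore Gateway, and capabilities are Lambdas rather than in-process objects.

```python
from dataclasses import dataclass, field

# Hypothetical shapes -- in OMNI the registry lives behind AgentCore Gateway,
# but the contract each capability registers looks roughly like this.
@dataclass
class Capability:
    name: str            # e.g. "security.scan_repo"
    description: str     # what the orchestrator matches requests against
    input_schema: dict   # JSON-schema-style contract for inputs
    output_schema: dict  # ...and for outputs

@dataclass
class Registry:
    _capabilities: dict = field(default_factory=dict)

    def register(self, cap: Capability) -> None:
        self._capabilities[cap.name] = cap

    def discover(self) -> list:
        # The orchestrator calls this once at startup; a newly registered
        # Lambda appears here without any change to the agent code.
        return list(self._capabilities.values())

registry = Registry()
registry.register(Capability(
    name="security.scan_repo",
    description="Run a read-only security scan against a repository",
    input_schema={"repo_url": "string"},
    output_schema={"findings": "array"},
))
print([c.name for c in registry.discover()])
```

The point of the shape is that `register` is the only coupling point: the orchestrator never imports a capability, it only reads the contract back out of `discover`.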
This matters for team scale. Different people can own different capabilities. The security capability can be improved by the security-focused engineer without touching the orchestrator. The knowledge retrieval capability can be swapped out for a better implementation without any other change propagating through the system. Loose coupling isn't just a software principle here — it's a team operating model.
Self-awareness as a first-class feature
Every interaction OMNI handles gets classified: what domain was it in, which capabilities were invoked, did the response fully satisfy the request, and if not, why not? This classification isn't logged and forgotten — it's stored and surfaced.
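A sketch of what one stored classification record might look like. The field names are illustrative, not the production DynamoDB schema; the important field is the last one, which is where an unmet request becomes a demand signal.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Illustrative record shape for the classification store
# (DynamoDB in OMNI) -- not the actual production schema.
@dataclass
class InteractionRecord:
    domain: str                 # e.g. "security", "governance"
    capabilities_invoked: list  # which registered capabilities ran
    satisfied: bool             # did the response fully answer the request?
    gap_reason: str             # if not satisfied, why -- the demand signal
    timestamp: str

def classify(domain, invoked, satisfied, gap_reason=""):
    return InteractionRecord(
        domain=domain,
        capabilities_invoked=invoked,
        satisfied=satisfied,
        gap_reason=gap_reason,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )

record = classify("security", ["security.scan_repo"], satisfied=False,
                  gap_reason="no capability for container image scanning")
print(asdict(record)["gap_reason"])
```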
The "honest admission" design rule is the most important one: when a request can't be handled by any registered capability, OMNI says so explicitly rather than generating a plausible-sounding non-answer. This turns out to be genuinely hard to get right. Models want to be helpful. The constraint against confabulation has to be enforced at the prompt level, not hoped for.
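The fallback branch itself can be sketched in code. The matching below is a naive keyword overlap standing in for OMNI's model-driven routing, so treat it as a cartoon; what matters is the shape of the miss path: refuse explicitly, and record the gap.

```python
# Sketch of the "honest admission" guard. Keyword overlap is a stand-in
# for real routing; the miss branch is the point.
gaps = []

def record_gap(request: str) -> None:
    # Demand signal: the next capability to build comes from this list.
    gaps.append(request)

def route(request: str, capabilities: dict) -> str:
    words = set(request.lower().split())
    matches = [name for name, desc in capabilities.items()
               if words & set(desc.lower().split())]
    if not matches:
        record_gap(request)
        return ("I don't have a registered capability for this. "
                "The gap has been recorded.")
    return f"invoking: {matches[0]}"

caps = {"security.scan_repo": "scan repositories for security issues"}
print(route("please scan this repository", caps))  # invoking: security.scan_repo
print(route("book me a flight", caps))             # explicit refusal, gap recorded
```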
Governance at the mesh level
Individual capabilities can have different governance requirements. A read-only security scan can run without approval. A capability that modifies infrastructure requires a governance gate — the request is classified, held, and a human approves before execution.
This tiered governance is built into the capability registration contract, not into the orchestrator. When a new capability registers itself as "tier 3 — requires approval", the orchestrator handles that automatically. Capability authors declare what governance tier their capability belongs to; they don't need to implement the approval flow themselves.
Data classification works the same way. A request that touches personal data is automatically flagged and handled according to the data classification registered for the relevant capability. The governance rules don't live in the agent; they live in the mesh contract.
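Reduced to its skeleton, the tiered gate looks something like the following. The tier names and the exact semantics of tiers 1 and 2 are my assumptions, extrapolated from the "tier 3 requires approval" example above.

```python
from enum import IntEnum

# Tier names and semantics below tiers 3 are assumptions, not OMNI's contract.
class GovernanceTier(IntEnum):
    READ_ONLY = 1  # runs without approval
    REVIEWED = 2   # logged for after-the-fact review
    APPROVAL = 3   # held until a human approves

def invoke(capability_name: str, tier: GovernanceTier, approved: bool = False) -> dict:
    # The orchestrator enforces the gate; capability authors only declare the tier.
    if tier >= GovernanceTier.APPROVAL and not approved:
        return {"status": "held", "reason": "awaiting human approval"}
    return {"status": "executed", "capability": capability_name}

print(invoke("security.scan_repo", GovernanceTier.READ_ONLY))
print(invoke("infra.apply_terraform", GovernanceTier.APPROVAL))
```

Because the check lives in `invoke`, a capability author who registers at tier 3 gets the hold-and-approve flow for free, which is the whole argument for putting governance in the mesh contract rather than in each capability.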
The architecture in practice
OMNI runs on AWS: a Lambda handler, a Strands SDK agent that talks to AgentCore Gateway for capability discovery and MCP-based tool invocation, DynamoDB for sessions and interaction history, S3 for document uploads, and a static frontend. The infrastructure is not unusual — what's different is the agent design.
At launch, OMNI has access to 54 registered capabilities across security, governance, AutoDev, knowledge, and DevOps. Some of those are mature and well-tested; some are early. The maturity of a capability is tracked in the registry, and the agent surfaces that to users — "this capability is in beta; results should be verified" is a better answer than a confident wrong one.
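Surfacing maturity is a small amount of code. The labels and warning wording below are illustrative, not OMNI's actual registry fields; the mechanism is just an annotation step between capability output and the user.

```python
# Illustrative maturity labels -- not OMNI's actual registry fields.
MATURITY_WARNINGS = {
    "beta": "this capability is in beta; results should be verified",
    "experimental": "this capability is experimental; treat output as a draft",
}

def annotate(answer: str, maturity: str) -> str:
    # Mature capabilities pass through untouched; anything else gets a caveat.
    warning = MATURITY_WARNINGS.get(maturity)
    return f"{answer}\n\n[Note: {warning}]" if warning else answer

print(annotate("No critical findings.", "beta"))
print(annotate("No critical findings.", "mature"))
```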
What I've learned from building it
The hardest part hasn't been the capability implementations — those are mostly straightforward Lambda functions. It has been the classification layer: getting consistent, meaningful classifications out of interactions that vary enormously in phrasing, intent, and domain.
The second hardest thing is managing the gap between what OMNI thinks it can do (based on capability descriptions) and what it actually can do (based on the quality of those capabilities). A capability that exists but produces poor output is worse than no capability — because the agent invokes it confidently, and the poor output looks like a confident answer. Quality gates on capability registration are something I'm still working on.
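For what it's worth, one possible shape for such a gate, sketched as a golden-test pass rate checked at registration time. This is entirely an assumption about a future design, not something OMNI does today.

```python
# Hypothetical registration-time quality gate: a capability must pass
# its own golden test cases before the registry accepts it.
def quality_gate(capability, golden_cases, min_pass_rate=0.9):
    passed = sum(1 for inputs, expected in golden_cases
                 if capability(inputs) == expected)
    rate = passed / len(golden_cases)
    return rate >= min_pass_rate, rate

# Toy capability standing in for a Lambda: uppercase a string.
ok, rate = quality_gate(str.upper, [("abc", "ABC"), ("x", "X"), ("hi", "HI")])
print(ok, rate)  # True 1.0
```

The interesting design question is who writes the golden cases: the capability author, or the consumer whose agent will invoke the capability on their behalf.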
That said, the demand signal loop is the part I'd build first if starting again. Knowing what you can't do, and letting that drive what you build next, is a more grounded way to decide a roadmap than most teams have.