A stateless agent is a powerful calculator. It takes an input, produces an output, and forgets everything. For one-shot tasks, that is sufficient. For anything that spans multiple sessions, multiple agents, or an evolving codebase, statelessness is a liability. The agent that built the authentication layer three sprints ago has no memory of the decisions made there. The agent picking up the work today starts cold.
Memory Management (pattern 8 in the Gulli taxonomy) and Inter-Agent Communication (pattern 15) address this problem from two directions. Memory Management asks: what should persist between invocations, and for how long? Inter-Agent Communication asks: how do agents share context without a human translating between them? This article covers both, grounded in how ticketyboo.dev has implemented them.
Memory is not one thing
The most common mistake in agent memory design is treating all memory as equivalent. In practice, there are at least four distinct memory types with fundamentally different requirements:
- Session state: what the current agent is doing right now. Lives for the duration of a session, measured in hours. Should be writable quickly and readable immediately. DynamoDB session records or an in-memory store works fine.
- Working notes: observations, gotchas, patterns, and API behaviours discovered during work. Lives until explicitly expired or superseded. Needs to be searchable by keyword. Tags and full-text search matter.
- Decisions: architecture and design decisions with rationale, alternatives considered, and impact assessment. Should never expire. Must be retrievable by project and tag. The rationale is as important as the decision.
- File history: which files changed, in which session, with a one-line description. Lives indefinitely. Enables any agent starting work on a file to see its recent history without reading every commit.
The roo-context MCP server implements all four as separate tools with separate storage concerns. The MCP protocol means the agent calls start_session, save_note, save_decision, and log_file_change as structured tool calls, not as prose written to a text file. The structure is enforced at the tool interface level.
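The shape of those four calls can be sketched with a small in-memory stub. The tool names come from the article; the RooContextStub class and its storage layout are illustrative, not the real SQLite-backed server:

```python
# Minimal in-memory sketch of the four roo-context tools. Tool names match
# the article; the stub class and its record shapes are illustrative only.
import time

class RooContextStub:
    def __init__(self):
        self.sessions, self.notes, self.decisions, self.file_log = [], [], [], []

    def start_session(self, project: str) -> int:
        """Open a session and return its id."""
        self.sessions.append({"project": project, "started": time.time()})
        return len(self.sessions)

    def save_note(self, session_id: int, note: str, tags: list[str]) -> int:
        """Persist a working note, attributed to a session, with search tags."""
        self.notes.append({"session": session_id, "note": note, "tags": tags})
        return len(self.notes)

    def save_decision(self, session_id: int, title: str, rationale: str) -> int:
        """Persist a decision; the rationale travels with it."""
        self.decisions.append(
            {"session": session_id, "title": title, "rationale": rationale}
        )
        return len(self.decisions)

    def log_file_change(self, session_id: int, path: str, summary: str) -> None:
        """One-line history entry for a changed file."""
        self.file_log.append({"session": session_id, "path": path, "summary": summary})

mcp = RooContextStub()
sid = mcp.start_session("ticketyboo")
mcp.save_note(sid, "DynamoDB TTL handles telemetry expiry", tags=["dynamodb", "ttl"])
mcp.save_decision(sid, "30-day TTL for telemetry", "Dashboard only needs recent runs")
mcp.log_file_change(sid, "agent_base.py", "Added ttl attribute to run records")
```

The point of the stub is the interface, not the storage: each memory type gets its own tool, so structure is enforced at the call site rather than recovered later from prose.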
The DynamoDB TTL pattern for agent telemetry
The ops agents write their run results to a DynamoDB table called team-activity. Each record has a ttl attribute set to 30 days from the write time. DynamoDB deletes expired records automatically. No cleanup job. No maintenance. No bugs.
# agent_base.py (simplified)
import time

import boto3

# The shared DynamoDB table the ops agents write to (name from above).
table = boto3.resource("dynamodb").Table("team-activity")

TTL_DAYS = 30

def write_run(agent_id: str, status: str, summary: str) -> None:
    """Write agent run result with automatic 30-day expiry."""
    now = int(time.time())
    table.put_item(Item={
        "pk": f"RUN#{agent_id}",
        "sk": f"TS#{now}",
        "status": status,
        "summary": summary,
        "agent_id": agent_id,
        "timestamp": now,
        "ttl": now + (TTL_DAYS * 86400),  # DynamoDB removes this at expiry
    })
The proxy reads recent runs for the team dashboard. Because records expire at 30 days, a query for the last N runs always returns recent data. There is no need to filter by date in the query logic. The table stays small. The cost stays within Free Tier. The dashboard always shows current state.
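The effect on the read path can be simulated in plain Python. DynamoDB does the deletion server-side; the live_records filter below just mimics that behaviour for illustration, it is not a DynamoDB API call:

```python
# Simulates DynamoDB TTL semantics: records whose ttl is in the past are
# treated as already deleted, so a "last N runs" query only ever sees
# recent data. Plain-Python illustration only.
NOW = 1_700_000_000  # a fixed "current time" for the example
DAY = 86_400

records = [
    {"summary": "fresh run", "ttl": NOW + 29 * DAY},  # still live
    {"summary": "stale run", "ttl": NOW - 1 * DAY},   # past expiry
]

def live_records(items, now=NOW):
    """What a reader sees after DynamoDB has removed expired items."""
    return [r for r in items if r["ttl"] > now]

visible = live_records(records)
# Only the fresh run survives; the query logic needs no date filter.
```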
This is memory management through deliberate ephemerality. Not everything should persist forever. Ops agent telemetry from 60 days ago is not useful. Architecture decisions from 60 days ago are essential. Treating them the same wastes storage and degrades query performance on the things that matter.
What the roo-context MCP actually does
The roo-context MCP server is a SQLite-backed service that exposes session and memory management tools to AI agents. It runs locally. No cloud dependency, no latency, no cost. An agent at the start of a session calls get_project_context and receives: the last three session summaries, the ten most recent notes, and the five most recent architecture decisions. This is the institutional memory of the platform.
The difference this makes is concrete. Before the roo-context MCP existed, each agent session started from scratch: re-reading the same context files, re-establishing the same patterns, occasionally making the same mistakes that had been caught and documented in a previous session. After the MCP existed, a new session can pick up from "session 95 completed the sprint plan for agentic patterns, note #181 has the full 21-pattern taxonomy, decision #29 records the rationale for the book-as-framework approach."
That is not a marginal improvement. It is the difference between an agent with institutional memory and an agent with amnesia.
Inter-agent communication: the A2A pattern in practice
The A2A (agent-to-agent) communication pattern describes how agents share context and coordinate actions. The textbook version involves structured message passing between agent processes. The practical version on this platform is simpler and more robust: agents communicate through shared persistent memory, not through direct message passing.
Two AI coding agents work on this codebase. One handles planning and specification work. One handles implementation. They share the roo-context MCP database. When the planning agent completes a session, it writes a session summary with the sprint plan, saves key decisions, and ends the session. When the coding agent begins an implementation session, it reads the recent sessions, finds the sprint plan, and has full context without the planning agent needing to be running or a human needing to relay anything.
Why shared memory beats message passing for async agents
Direct message passing between agents requires both agents to be running simultaneously, or a message queue to buffer the communication. For development workflow agents (the planning agent, coding agent, Gatekeep) that run in response to human requests rather than on schedules, neither is practical.
Shared persistent memory sidesteps this entirely. An agent writes its outputs to the shared store. A later agent reads from the shared store. The writers and readers never need to be contemporaneous. The "conversation" between agents is a series of reads and writes to a common database, not a real-time exchange.
This is not a limitation. It is an architectural advantage. The shared store is inspectable: a human can read what the agents have written and understand the state of the system. The shared store is auditable: every write is timestamped and attributed to a session. The shared store is recoverable: if an agent session fails mid-way, the partial writes are preserved and a subsequent session can pick up from the last checkpoint.
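The non-contemporaneous handoff is easy to demonstrate. In this sketch a planning "agent" writes to a common SQLite store and exits, and an implementation "agent" later opens its own connection and reads. The table name and columns are illustrative, not the roo-context schema:

```python
# Sketch of shared-memory A2A communication: writer and reader never run
# at the same time; the database file is the only channel between them.
import os
import sqlite3
import tempfile
import time

db_path = os.path.join(tempfile.mkdtemp(), "shared_context.db")

def planning_agent():
    """Writes a timestamped, attributed handoff record, then 'goes away'."""
    with sqlite3.connect(db_path) as db:
        db.execute(
            "CREATE TABLE IF NOT EXISTS handoff "
            "(session TEXT, ts INTEGER, summary TEXT)"
        )
        db.execute(
            "INSERT INTO handoff VALUES (?, ?, ?)",
            ("session-95", int(time.time()), "Sprint plan complete; see note #181"),
        )

def coding_agent() -> list[tuple]:
    """Runs later, independently, and reads the most recent handoff first."""
    with sqlite3.connect(db_path) as db:
        return db.execute(
            "SELECT session, summary FROM handoff ORDER BY ts DESC"
        ).fetchall()

planning_agent()       # writer finishes before the reader exists
rows = coding_agent()  # reader picks up full context from the store
```

Because every row carries a session id and timestamp, the same store that does the communication also provides the audit trail.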
The handoff pattern: what a real session transition looks like
A concrete example from this sprint. The planning agent analysed the Gulli book (424pp, 21 patterns) and mapped all 21 patterns to the ticketyboo architecture. That planning session (session 95) ended with:
- A session summary describing what was done, what was produced, what comes next
- Note #180: the sprint overview with a pointer to the plan file
- Note #181: the full 21-pattern taxonomy in a compact reference format
- Decision #29: the rationale for using the book as a content framework
The coding agent began the implementation session (session 96) by calling get_project_context. That single call returned everything above, plus the previous five sessions, ten recent notes, and five recent decisions. Full context in one tool call. No briefing. No re-reading files. No starting cold.
# Session start pattern (every session)
context = mcp.get_project_context(
    project="ticketyboo",
    sessions=5,   # last 5 session summaries
    notes=10,     # most recent 10 notes
    decisions=5,  # most recent 5 decisions
)
# context.recent_sessions[0].summary = "Analysed Gulli book..."
# context.active_notes[0].note = "GULLI BOOK 21 PATTERN TAXONOMY..."
# context.recent_decisions[0].title = "Agentic Design Patterns as framework"
The Gatekeep persona handoff
Gatekeep implements a different kind of A2A communication: within a single session, different personas pick up findings from each other. The Sentinel flags a security concern (IAM policy too permissive). The Auditor checks whether the same change also has cost implications (new service, potential Free Tier breach). The Architect assesses whether the design decision is sound regardless of the security and cost questions.
These are not sequential operations where one persona waits for another. They are parallel reads of the same change, each through a different lens. The governance record captures all three verdicts. A human reviewing an escalation can see what each persona found and why. The decision trail is complete.
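The fan-out can be sketched as independent functions over the same change, with all verdicts collected into one record. The persona names come from the article; the verdict shape and the toy checks inside each function are illustrative:

```python
# Sketch of the Gatekeep persona fan-out: each persona reviews the same
# change through its own lens; no persona waits on another. The checks
# here are deliberately toy-sized stand-ins for the real analyses.
change = {
    "diff": "iam policy: Action '*' on Resource '*'",
    "new_services": ["bedrock"],
}

def sentinel(c):   # security lens
    risky = "'*'" in c["diff"]
    return {"persona": "Sentinel", "pass": not risky,
            "finding": "IAM policy too permissive" if risky else "ok"}

def auditor(c):    # cost lens
    new = bool(c["new_services"])
    return {"persona": "Auditor", "pass": not new,
            "finding": "New service, check Free Tier impact" if new else "ok"}

def architect(c):  # design lens
    return {"persona": "Architect", "pass": True, "finding": "design sound"}

# Parallel in spirit: independent reads of the same change, no ordering.
governance_record = [p(change) for p in (sentinel, auditor, architect)]
escalate = not all(v["pass"] for v in governance_record)
```

A human reviewing an escalation sees all three verdicts side by side, which is exactly the "complete decision trail" property described above.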
What the ops agents' communication model looks like
The four ops agents (CTO, SRE, Security, Cost) do not communicate with each other directly. They each write independently to the same DynamoDB table. The proxy reads from the table and presents a unified view to the team dashboard. This is a hub-and-spoke model: the table is the hub, each agent is a spoke, and the proxy is the aggregator.
This model works because the agents are genuinely independent. The SRE agent's findings do not depend on the Cost agent's findings. Each agent can run, fail, and succeed without affecting the others. The dashboard degrades gracefully: if the Security agent fails to run, the dashboard shows the last successful security check with a timestamp, and the other three panels continue working.
The coordination the ops agents do need (avoiding duplicate writes, maintaining a consistent schema, TTL management) is handled by the shared agent_base.py module. The schema is the contract. As long as all agents write to the same schema, the proxy can read from any of them without knowing which agent produced a given record.
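One way to keep that contract honest is a shared validator every agent runs before writing. The field names follow the write_run example earlier; the validator itself is an illustrative sketch, not code from agent_base.py:

```python
# Sketch of schema-as-contract: every spoke validates its record against
# the same required-field set before writing, so the hub (the proxy) can
# read any record without caring which agent produced it.
REQUIRED_FIELDS = {"pk", "sk", "status", "summary", "agent_id", "timestamp", "ttl"}

def validate_record(item: dict) -> None:
    """Raise if a record is missing fields the proxy depends on."""
    missing = REQUIRED_FIELDS - item.keys()
    if missing:
        raise ValueError(f"record breaks schema contract, missing: {sorted(missing)}")

record = {
    "pk": "RUN#sre", "sk": "TS#1700000000", "status": "ok",
    "summary": "all healthy", "agent_id": "sre",
    "timestamp": 1700000000, "ttl": 1702592000,
}
validate_record(record)  # passes silently; a malformed record would raise
```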
Three things that make agent memory worth having
Not all memory systems are useful. The ones that degrade into noise are as common as the ones that provide genuine value. Three properties distinguish useful agent memory:
Structured retrieval. Notes that can only be retrieved by scrolling through a list accumulate into a wall of text. Notes that are tagged, categorised, and full-text searchable accumulate into a knowledge base. The roo-context MCP search_notes tool accepts a query string and returns matching notes ranked by relevance. An agent looking for "DynamoDB TTL pattern" finds this article's notes without reading everything.
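A toy version of that ranking makes the idea concrete. The real roo-context server uses SQLite full-text search; the scoring function below, and the two notes other than #181, are illustrative only:

```python
# Toy keyword-ranking sketch of what search_notes enables: notes scored
# by how many query terms appear in their text or tags, best match first.
notes = [
    {"id": 181, "text": "GULLI BOOK 21 PATTERN TAXONOMY", "tags": ["patterns"]},
    {"id": 200, "text": "DynamoDB TTL pattern for agent telemetry", "tags": ["dynamodb"]},
    {"id": 201, "text": "Proxy caching gotcha on the dashboard", "tags": ["ops"]},
]

def search_notes(query: str, notes: list[dict]) -> list[dict]:
    terms = query.lower().split()
    def score(n):
        haystack = (n["text"] + " " + " ".join(n["tags"])).lower()
        return sum(t in haystack for t in terms)
    ranked = sorted(notes, key=score, reverse=True)
    return [n for n in ranked if score(n) > 0]

hits = search_notes("DynamoDB TTL pattern", notes)
# The telemetry note matches all three terms and ranks first; the
# unrelated note matches nothing and is excluded entirely.
```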
Deliberate expiry. Session state that never expires is noise. Agent telemetry from three months ago is not useful for a dashboard showing current system state. Notes that are still relevant after six months are worth keeping. Notes about a workaround for a bug that was fixed last sprint should be deleted. The roo-context MCP supports optional expiry on notes via expires_days. DynamoDB TTL handles telemetry expiry automatically. Both are deliberate choices about what to forget.
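The note-expiry side can be sketched in a few lines. The parameter name expires_days comes from the article; the make_note and is_live helpers are illustrative, not the server's implementation:

```python
# Sketch of optional note expiry: a note written with expires_days stops
# appearing in reads once its deadline passes; a note without one never does.
import time

DAY = 86_400

def make_note(text: str, expires_days=None) -> dict:
    now = int(time.time())
    return {"text": text, "created": now,
            "expires_at": now + expires_days * DAY if expires_days else None}

def is_live(note: dict, now: int) -> bool:
    """A note is readable if it has no expiry or the expiry is in the future."""
    return note["expires_at"] is None or note["expires_at"] > now

workaround = make_note("workaround for proxy bug", expires_days=14)
decision_note = make_note("book-as-framework rationale")  # never expires

later = int(time.time()) + 30 * DAY  # a month from now
# By then the workaround has expired; the permanent note is still live.
```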
Attribution. A memory without provenance is difficult to trust. Every note and decision in the roo-context MCP is associated with a session ID and a timestamp. An agent reading a note from 60 days ago can see which session created it and, from the session record, what was being worked on at the time. Context about context.
If an agent can call get_project_context at the start of a session and give a coherent answer about where the project stands, it has memory worth having.