A stateless agent is a powerful calculator. It takes an input, produces an output, and forgets everything. For one-shot tasks, that is sufficient. For anything that spans multiple sessions, multiple agents, or an evolving codebase, statelessness is a liability. The agent that built the authentication layer three sprints ago has no memory of the decisions made there. The agent picking up the work today starts cold.
Memory Management (pattern 8 in the Gulli taxonomy) and Inter-Agent Communication (pattern 15) address this problem from two directions. Memory Management asks: what should persist between invocations, and for how long? Inter-Agent Communication asks: how do agents share context without a human translating between them? This article covers both, grounded in how ticketyboo.dev has implemented them.
Memory is not one thing
The most common mistake in agent memory design is treating all memory as equivalent. In practice, there are at least four distinct memory types with fundamentally different requirements:
- Session state: what the current agent is doing right now. Lives for the duration of a session, measured in hours. Should be writable quickly and readable immediately. DynamoDB session records or an in-memory store works fine.
- Working notes: observations, gotchas, patterns, and API behaviours discovered during work. Lives until explicitly expired or superseded. Needs to be searchable by keyword. Tags and full-text search matter.
- Decisions: architecture and design decisions with rationale, alternatives considered, and impact assessment. Should never expire. Must be retrievable by project and tag. The rationale is as important as the decision.
- File history: which files changed, in which session, with a one-line description. Lives indefinitely. Enables any agent starting work on a file to see its recent history without reading every commit.
The roo-context MCP server implements all four as separate tools with separate storage concerns. The MCP protocol means the agent calls start_session, save_note, save_decision, and log_file_change as structured tool calls, not as prose written to a text file. The structure is enforced at the tool interface level.
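The shape of those four calls can be sketched with a small in-memory stub. The tool names come from the article; the RooContextStub class and its storage layout are illustrative, not the real SQLite-backed server:

```python
# Minimal in-memory sketch of the four roo-context tools. Tool names match
# the article; the stub class and its record shapes are illustrative only.
import time

class RooContextStub:
    def __init__(self):
        self.sessions, self.notes, self.decisions, self.file_log = [], [], [], []

    def start_session(self, project: str) -> int:
        """Open a session and return its id."""
        self.sessions.append({"project": project, "started": time.time()})
        return len(self.sessions)

    def save_note(self, session_id: int, note: str, tags: list[str]) -> int:
        """Persist a working note, attributed to a session, with search tags."""
        self.notes.append({"session": session_id, "note": note, "tags": tags})
        return len(self.notes)

    def save_decision(self, session_id: int, title: str, rationale: str) -> int:
        """Persist a decision; the rationale travels with it."""
        self.decisions.append(
            {"session": session_id, "title": title, "rationale": rationale}
        )
        return len(self.decisions)

    def log_file_change(self, session_id: int, path: str, summary: str) -> None:
        """One-line history entry for a changed file."""
        self.file_log.append({"session": session_id, "path": path, "summary": summary})

mcp = RooContextStub()
sid = mcp.start_session("ticketyboo")
mcp.save_note(sid, "DynamoDB TTL handles telemetry expiry", tags=["dynamodb", "ttl"])
mcp.save_decision(sid, "30-day TTL for telemetry", "Dashboard only needs recent runs")
mcp.log_file_change(sid, "agent_base.py", "Added ttl attribute to run records")
```

The point of the stub is the interface, not the storage: each memory type gets its own tool, so structure is enforced at the call site rather than recovered later from prose.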
The DynamoDB TTL pattern for agent telemetry
The ops agents write their run results to a DynamoDB table called team-activity. Each record has a ttl attribute set to 30 days from the write time. DynamoDB deletes expired records automatically. No cleanup job. No maintenance. No bugs.
# agent_base.py (simplified)
import time

import boto3

# The shared DynamoDB table the ops agents write to (name from above).
table = boto3.resource("dynamodb").Table("team-activity")

TTL_DAYS = 30

def write_run(agent_id: str, status: str, summary: str) -> None:
    """Write agent run result with automatic 30-day expiry."""
    now = int(time.time())
    table.put_item(Item={
        "pk": f"RUN#{agent_id}",
        "sk": f"TS#{now}",
        "status": status,
        "summary": summary,
        "agent_id": agent_id,
        "timestamp": now,
        "ttl": now + (TTL_DAYS * 86400),  # DynamoDB removes this at expiry
    })
The proxy reads recent runs for the team dashboard. Because records expire at 30 days, a query for the last N runs always returns recent data. There is no need to filter by date in the query logic. The table stays small. The cost stays within Free Tier. The dashboard always shows current state.
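The effect on the read path can be simulated in plain Python. DynamoDB does the deletion server-side; the live_records filter below just mimics that behaviour for illustration, it is not a DynamoDB API call:

```python
# Simulates DynamoDB TTL semantics: records whose ttl is in the past are
# treated as already deleted, so a "last N runs" query only ever sees
# recent data. Plain-Python illustration only.
NOW = 1_700_000_000  # a fixed "current time" for the example
DAY = 86_400

records = [
    {"summary": "fresh run", "ttl": NOW + 29 * DAY},  # still live
    {"summary": "stale run", "ttl": NOW - 1 * DAY},   # past expiry
]

def live_records(items, now=NOW):
    """What a reader sees after DynamoDB has removed expired items."""
    return [r for r in items if r["ttl"] > now]

visible = live_records(records)
# Only the fresh run survives; the query logic needs no date filter.
```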
This is memory management through deliberate ephemerality. Not everything should persist forever. Ops agent telemetry from 60 days ago is not useful. Architecture decisions from 60 days ago are essential. Treating them the same wastes storage and degrades query performance on the things that matter.
What the roo-context MCP actually does
The roo-context MCP server is a SQLite-backed service that exposes session and memory management tools to AI agents. It runs locally. No cloud dependency, no latency, no cost. An agent at the start of a session calls get_project_context and receives: the last three session summaries, the ten most recent notes, and the five most recent architecture decisions. This is the institutional memory of the platform.
The difference this makes is concrete. Before the roo-context MCP existed, each agent session started from scratch: re-reading the same context files, re-establishing the same patterns, occasionally making the same mistakes that had been caught and documented in a previous session. After the MCP existed, a new session can pick up from "session 95 completed the sprint plan for agentic patterns, note #181 has the full 21-pattern taxonomy, decision #29 records the rationale for the book-as-framework approach."
That is not a marginal improvement. It is the difference between an agent with institutional memory and an agent with amnesia.
Inter-agent communication: the A2A pattern in practice
The A2A (agent-to-agent) communication pattern describes how agents share context and coordinate actions. The textbook version involves structured message passing between agent processes. The practical version on this platform is simpler and more robust: agents communicate through shared persistent memory, not through direct message passing.
Two AI coding agents work on this codebase. One handles planning and specification work. One handles implementation. They share the roo-context MCP database. When the planning agent completes a session, it writes a session summary with the sprint plan, saves key decisions, and ends the session. When the coding agent begins an implementation session, it reads the recent sessions, finds the sprint plan, and has full context without the planning agent needing to be running or a human needing to relay anything.
Why shared memory beats message passing for async agents
Direct message passing between agents requires both agents to be running simultaneously, or a message queue to buffer the communication. For development workflow agents (the planning agent, coding agent, Gatekeep) that run in response to human requests rather than on schedules, neither is practical.
Shared persistent memory sidesteps this entirely. An agent writes its outputs to the shared store. A later agent reads from the shared store. The writers and readers never need to be contemporaneous. The "conversation" between agents is a series of reads and writes to a common database, not a real-time exchange.
This is not a limitation. It is an architectural advantage. The shared store is inspectable: a human can read what the agents have written and understand the state of the system. The shared store is auditable: every write is timestamped and attributed to a session. The shared store is recoverable: if an agent session fails mid-way, the partial writes are preserved and a subsequent session can pick up from the last checkpoint.
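The non-contemporaneous handoff is easy to demonstrate. In this sketch a planning "agent" writes to a common SQLite store and exits, and an implementation "agent" later opens its own connection and reads. The table name and columns are illustrative, not the roo-context schema:

```python
# Sketch of shared-memory A2A communication: writer and reader never run
# at the same time; the database file is the only channel between them.
import os
import sqlite3
import tempfile
import time

db_path = os.path.join(tempfile.mkdtemp(), "shared_context.db")

def planning_agent():
    """Writes a timestamped, attributed handoff record, then 'goes away'."""
    with sqlite3.connect(db_path) as db:
        db.execute(
            "CREATE TABLE IF NOT EXISTS handoff "
            "(session TEXT, ts INTEGER, summary TEXT)"
        )
        db.execute(
            "INSERT INTO handoff VALUES (?, ?, ?)",
            ("session-95", int(time.time()), "Sprint plan complete; see note #181"),
        )

def coding_agent() -> list[tuple]:
    """Runs later, independently, and reads the most recent handoff first."""
    with sqlite3.connect(db_path) as db:
        return db.execute(
            "SELECT session, summary FROM handoff ORDER BY ts DESC"
        ).fetchall()

planning_agent()       # writer finishes before the reader exists
rows = coding_agent()  # reader picks up full context from the store
```

Because every row carries a session id and timestamp, the same store that does the communication also provides the audit trail.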
The handoff pattern: what a real session transition looks like
A concrete example from this sprint. The planning agent analysed the Gulli book (424pp, 21 patterns) and mapped all 21 patterns to the ticketyboo architecture. That planning session (session 95) ended with:
- A session summary describing what was done, what was produced, what comes next
- Note #180: the sprint overview with a pointer to the plan file
- Note #181: the full 21-pattern taxonomy in a compact reference format
- Decision #29: the rationale for using the book as a content framework
The coding agent began the implementation session (session 96) by calling get_project_context. That single call returned everything above, plus the previous five sessions, ten recent notes, and five recent decisions. Full context in one tool call. No briefing. No re-reading files. No starting cold.
# Session start pattern (every session)
context = mcp.get_project_context(
    project="ticketyboo",
    sessions=5,   # last 5 session summaries
    notes=10,     # most recent 10 notes
    decisions=5,  # most recent 5 decisions
)
# context.recent_sessions[0].summary = "Analysed Gulli book..."
# context.active_notes[0].note = "GULLI BOOK 21 PATTERN TAXONOMY..."
# context.recent_decisions[0].title = "Agentic Design Patterns as framework"
The Gatekeep persona handoff
Gatekeep implements a different kind of A2A communication: within a single session, different personas pick up findings from each other. The Sentinel flags a security concern (IAM policy too permissive). The Auditor checks whether the same change also has cost implications (new service, potential Free Tier breach). The Architect assesses whether the design decision is sound regardless of the security and cost questions.
These are not sequential operations where one persona waits for another. They are parallel reads of the same change, each through a different lens. The governance record captures all three verdicts. A human reviewing an escalation can see what each persona found and why. The decision trail is complete.
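The fan-out can be sketched as independent functions over the same change, with all verdicts collected into one record. The persona names come from the article; the verdict shape and the toy checks inside each function are illustrative:

```python
# Sketch of the Gatekeep persona fan-out: each persona reviews the same
# change through its own lens; no persona waits on another. The checks
# here are deliberately toy-sized stand-ins for the real analyses.
change = {
    "diff": "iam policy: Action '*' on Resource '*'",
    "new_services": ["bedrock"],
}

def sentinel(c):   # security lens
    risky = "'*'" in c["diff"]
    return {"persona": "Sentinel", "pass": not risky,
            "finding": "IAM policy too permissive" if risky else "ok"}

def auditor(c):    # cost lens
    new = bool(c["new_services"])
    return {"persona": "Auditor", "pass": not new,
            "finding": "New service, check Free Tier impact" if new else "ok"}

def architect(c):  # design lens
    return {"persona": "Architect", "pass": True, "finding": "design sound"}

# Parallel in spirit: independent reads of the same change, no ordering.
governance_record = [p(change) for p in (sentinel, auditor, architect)]
escalate = not all(v["pass"] for v in governance_record)
```

A human reviewing an escalation sees all three verdicts side by side, which is exactly the "complete decision trail" property described above.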
What the ops agents' communication model looks like
The four ops agents (CTO, SRE, Security, Cost) do not communicate with each other directly. They each write independently to the same DynamoDB table. The proxy reads from the table and presents a unified view to the team dashboard. This is a hub-and-spoke model: the table is the hub, each agent is a spoke, and the proxy is the aggregator.
This model works because the agents are genuinely independent. The SRE agent's findings do not depend on the Cost agent's findings. Each agent can run, fail, and succeed without affecting the others. The dashboard degrades gracefully: if the Security agent fails to run, the dashboard shows the last successful security check with a timestamp, and the other three panels continue working.
The coordination the ops agents do need (avoiding duplicate writes, maintaining a consistent schema, TTL management) is handled by the shared agent_base.py module. The schema is the contract. As long as all agents write to the same schema, the proxy can read from any of them without knowing which agent produced a given record.
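One way to keep that contract honest is a shared validator every agent runs before writing. The field names follow the write_run example earlier; the validator itself is an illustrative sketch, not code from agent_base.py:

```python
# Sketch of schema-as-contract: every spoke validates its record against
# the same required-field set before writing, so the hub (the proxy) can
# read any record without caring which agent produced it.
REQUIRED_FIELDS = {"pk", "sk", "status", "summary", "agent_id", "timestamp", "ttl"}

def validate_record(item: dict) -> None:
    """Raise if a record is missing fields the proxy depends on."""
    missing = REQUIRED_FIELDS - item.keys()
    if missing:
        raise ValueError(f"record breaks schema contract, missing: {sorted(missing)}")

record = {
    "pk": "RUN#sre", "sk": "TS#1700000000", "status": "ok",
    "summary": "all healthy", "agent_id": "sre",
    "timestamp": 1700000000, "ttl": 1702592000,
}
validate_record(record)  # passes silently; a malformed record would raise
```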
Three things that make agent memory worth having
Not all memory systems are useful. The ones that degrade into noise are as common as the ones that provide genuine value. Three properties distinguish useful agent memory:
Structured retrieval. Notes that can only be retrieved by scrolling through a list accumulate into a wall of text. Notes that are tagged, categorised, and full-text searchable accumulate into a knowledge base. The roo-context MCP search_notes tool accepts a query string and returns matching notes ranked by relevance. An agent looking for "DynamoDB TTL pattern" finds this article's notes without reading everything.
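A toy version of that ranking makes the idea concrete. The real roo-context server uses SQLite full-text search; the scoring function below, and the two notes other than #181, are illustrative only:

```python
# Toy keyword-ranking sketch of what search_notes enables: notes scored
# by how many query terms appear in their text or tags, best match first.
notes = [
    {"id": 181, "text": "GULLI BOOK 21 PATTERN TAXONOMY", "tags": ["patterns"]},
    {"id": 200, "text": "DynamoDB TTL pattern for agent telemetry", "tags": ["dynamodb"]},
    {"id": 201, "text": "Proxy caching gotcha on the dashboard", "tags": ["ops"]},
]

def search_notes(query: str, notes: list[dict]) -> list[dict]:
    terms = query.lower().split()
    def score(n):
        haystack = (n["text"] + " " + " ".join(n["tags"])).lower()
        return sum(t in haystack for t in terms)
    ranked = sorted(notes, key=score, reverse=True)
    return [n for n in ranked if score(n) > 0]

hits = search_notes("DynamoDB TTL pattern", notes)
# The telemetry note matches all three terms and ranks first; the
# unrelated note matches nothing and is excluded entirely.
```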
Deliberate expiry. Session state that never expires is noise. Agent telemetry from three months ago is not useful for a dashboard showing current system state. Notes that are still relevant after six months are worth keeping. Notes about a workaround for a bug that was fixed last sprint should be deleted. The roo-context MCP supports optional expiry on notes via expires_days. DynamoDB TTL handles telemetry expiry automatically. Both are deliberate choices about what to forget.
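The note-expiry side can be sketched in a few lines. The parameter name expires_days comes from the article; the make_note and is_live helpers are illustrative, not the server's implementation:

```python
# Sketch of optional note expiry: a note written with expires_days stops
# appearing in reads once its deadline passes; a note without one never does.
import time

DAY = 86_400

def make_note(text: str, expires_days=None) -> dict:
    now = int(time.time())
    return {"text": text, "created": now,
            "expires_at": now + expires_days * DAY if expires_days else None}

def is_live(note: dict, now: int) -> bool:
    """A note is readable if it has no expiry or the expiry is in the future."""
    return note["expires_at"] is None or note["expires_at"] > now

workaround = make_note("workaround for proxy bug", expires_days=14)
decision_note = make_note("book-as-framework rationale")  # never expires

later = int(time.time()) + 30 * DAY  # a month from now
# By then the workaround has expired; the permanent note is still live.
```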
Attribution. A memory without provenance is difficult to trust. Every note and decision in the roo-context MCP is associated with a session ID and a timestamp. An agent reading a note from 60 days ago can see which session created it and, from the session record, what was being worked on at the time. Context about context.
If an agent can call get_project_context at the start of a session and give a coherent answer about where the project stands, it has memory worth having.