LoopForge: auditable state machines for the dev lifecycle

Summary

Who it's for: engineering teams running automated or agentic development pipelines who need to know exactly what happened, in what order, and why, whether for compliance, debugging, or simply trusting the system.

3 key takeaways

When automation is doing the work, the audit trail becomes more important, not less — because humans are no longer in every step.
A typed state machine prevents illegal transitions at the library level: CI can't be skipped, PRs can't merge without passing, issues can't close before merging.
Transition hooks let you attach governance, notifications, and metrics to state changes without those concerns polluting your business logic.


There's a problem that appears quietly as soon as you start automating development workflows. When a human opens a PR, you can look at git and GitHub and reconstruct most of what happened. When an agent opens a PR — having picked up an issue, generated code, passed a governance check, run verification, and submitted — you have the same artefacts but a much harder time reconstructing the chain of events. What triggered this? What checks ran? What approved it? Did anything get skipped?

Most automated pipelines answer these questions with logging and hope. LoopForge answers them with a typed state machine and a queryable audit trail. It's available on PyPI — pip install loopforge — and this is the story of why it exists and what it actually solves.

The problem with ad-hoc pipeline tracking

When you first automate a dev workflow, the pipeline state usually lives in a mix of places: GitHub issue labels, PR statuses, a field in a DynamoDB record, a message in a Slack channel. This works until something goes wrong, and then you spend an hour correlating timestamps across four systems trying to figure out whether CI actually passed before the merge happened, or whether something triggered the merge without CI completing.

The more automation you add — particularly AI automation that can move through the pipeline faster than a human would notice — the more this matters. A bad transition in a human-driven pipeline is visible because someone catches it. A bad transition in an automated pipeline might not surface until audit time, or until a production incident, or never.

What LoopForge does

LoopForge models the development lifecycle as an explicit state machine: issue_created → task_queued → pr_created → ci_pending → ci_passed → merged → closed, with branching for CI failure, review gates, and retry paths.

Every transition is:

Validated — the state machine won't allow an illegal transition. You can't merge before CI passes. You can't close before merging. If something tries to skip a step, LoopForge raises an error — not silently, not eventually, immediately at the call site.
Timestamped — ISO 8601 UTC, recorded with every state change.
Attributed — a trigger field records what caused the transition: "ci.passed", "reviewer.approved", "auto.merged". You know not just what happened but what caused it.
Enriched — arbitrary metadata (commit SHA, CI job URL, reviewer ID) attached to each transition for downstream querying.

from loopforge import LoopService, LoopState
from loopforge.repository import MemoryRepository

service = LoopService(repository=MemoryRepository())

# Create a record when an issue is picked up
record = service.create(
    ref="https://github.com/org/repo/issues/42",
    repo="org/repo",
)

# Walk through the lifecycle — each transition is validated
service.transition(record.record_id, LoopState.TASK_QUEUED,   "worker.picked_up")
service.transition(record.record_id, LoopState.PR_CREATED,    "pr.opened")
service.transition(record.record_id, LoopState.CI_PENDING,    "ci.started")
service.transition(record.record_id, LoopState.CI_PASSED,     "ci.completed")
service.transition(record.record_id, LoopState.MERGED,        "auto.merged")
service.transition(record.record_id, LoopState.CLOSED,        "issue.closed")

# Every transition is recorded and queryable
for t in record.transitions:
    print(f"{t.timestamp} | {t.from_state} → {t.to_state} | {t.trigger}")
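
To make the validation idea concrete, here is a minimal, self-contained sketch of a typed state machine with an explicit table of allowed transitions. It is not LoopForge's implementation: the class names, the `ci_failed` state, and the retry edge are assumptions chosen for illustration, and only the happy-path state names come from the lifecycle described above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class State(Enum):
    ISSUE_CREATED = "issue_created"
    TASK_QUEUED = "task_queued"
    PR_CREATED = "pr_created"
    CI_PENDING = "ci_pending"
    CI_PASSED = "ci_passed"
    CI_FAILED = "ci_failed"  # hypothetical name for the CI-failure branch
    MERGED = "merged"
    CLOSED = "closed"


# Allowed moves. Anything not listed here is treated as illegal.
ALLOWED = {
    State.ISSUE_CREATED: {State.TASK_QUEUED},
    State.TASK_QUEUED: {State.PR_CREATED},
    State.PR_CREATED: {State.CI_PENDING},
    State.CI_PENDING: {State.CI_PASSED, State.CI_FAILED},
    State.CI_FAILED: {State.CI_PENDING},  # assumed retry path
    State.CI_PASSED: {State.MERGED},
    State.MERGED: {State.CLOSED},
    State.CLOSED: set(),
}


class IllegalTransition(Exception):
    """Raised at the call site when a move is not in ALLOWED."""


@dataclass
class Transition:
    from_state: State
    to_state: State
    trigger: str
    timestamp: str  # ISO 8601, UTC
    metadata: dict


@dataclass
class Record:
    state: State = State.ISSUE_CREATED
    transitions: list = field(default_factory=list)

    def transition(self, to_state, trigger, **metadata):
        # Validate first, so an illegal move never touches the history.
        if to_state not in ALLOWED[self.state]:
            raise IllegalTransition(
                f"{self.state.value} -> {to_state.value} is not allowed"
            )
        self.transitions.append(Transition(
            from_state=self.state,
            to_state=to_state,
            trigger=trigger,
            timestamp=datetime.now(timezone.utc).isoformat(),
            metadata=metadata,
        ))
        self.state = to_state


record = Record()
record.transition(State.TASK_QUEUED, "worker.picked_up")
record.transition(State.PR_CREATED, "pr.opened", commit_sha="abc123")
try:
    record.transition(State.MERGED, "auto.merged")  # tries to skip CI
except IllegalTransition as exc:
    skipped_error = str(exc)
```

Keeping the rules in one lookup table means the legal paths can be reviewed in a single place, and a rejected move leaves both the state and the history untouched.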

Transition hooks

The hook system is where LoopForge earns its keep in a real pipeline. Every successful transition can fire a list of hook functions — for audit logging, Slack notifications, webhook calls, governance checks, metrics emission. Hooks run synchronously, in order, after the transition is recorded. If a hook raises an exception, the transition is still recorded — a failed notification doesn't roll back your state machine.

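# Import paths for TransitionError and DynamoDBRepository are assumed here
from loopforge import LoopService, LoopState, TransitionError
from loopforge.repository import DynamoDBRepository

audit_log = []  # stand-in sink; swap in your real audit store

def governance_hook(record, previous_state, new_state, trigger):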
    """Block merge without governance sign-off on high-risk changes."""
    if new_state == LoopState.MERGED:
        if record.metadata.get("risk_tier") == "high":
            if not record.metadata.get("governance_approved"):
                raise TransitionError("High-risk merge requires governance approval")

def audit_hook(record, previous_state, new_state, trigger):
    """Ship every transition to the audit log."""
    audit_log.append({
        "record_id": record.record_id,
        "ref": record.ref,
        "transition": f"{previous_state.value} → {new_state.value}",
        "trigger": trigger,
        "timestamp": record.updated_at.isoformat(),
    })

service = LoopService(
    repository=DynamoDBRepository(table_name="loopforge-records"),
    hooks=[governance_hook, audit_hook],
)
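
The dispatch behaviour described above (hooks run synchronously, in order, after the transition is recorded, and a failing hook does not undo the state change) can be sketched roughly as follows. This is an illustration under stated assumptions, not LoopForge's internals: `run_hooks`, `apply_transition`, and the dict-based record are hypothetical names. Note that under these semantics a hook that needs to veto a transition would have to run before the state is written, so it is worth checking which behaviour your installed version actually implements.

```python
import logging

logger = logging.getLogger("loop_hooks")


def run_hooks(hooks, record, previous_state, new_state, trigger):
    """Call each hook in order; collect failures instead of propagating them."""
    failures = []
    for hook in hooks:
        try:
            hook(record, previous_state, new_state, trigger)
        except Exception as exc:
            # A broken notifier must not stop later hooks or undo the state.
            logger.warning("hook %s failed: %s", hook.__name__, exc)
            failures.append((hook.__name__, exc))
    return failures


def apply_transition(record, new_state, trigger, hooks):
    """Record the transition first, then fire the hooks."""
    previous_state = record["state"]
    record["state"] = new_state
    record["history"].append((previous_state, new_state, trigger))
    return run_hooks(hooks, record, previous_state, new_state, trigger)


calls = []


def notify_hook(record, previous_state, new_state, trigger):
    raise RuntimeError("Slack is down")  # simulated outage


def metrics_hook(record, previous_state, new_state, trigger):
    calls.append(f"{previous_state}->{new_state}")


record = {"state": "ci_passed", "history": []}
failures = apply_transition(
    record, "merged", "auto.merged", [notify_hook, metrics_hook]
)
```

Here the simulated notification failure is captured in `failures`, the metrics hook still runs, and the recorded transition stays in place.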

Storage backends

LoopForge is storage-agnostic. The default MemoryRepository is useful for testing; DynamoDBRepository (available via loopforge[dynamodb]) is what I use in production. You can also implement the Repository protocol yourself for Postgres, SQLite, Redis, or anything else — the interface is simple: create, get, save, list.
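
As a rough idea of what a custom backend could look like, here is a SQLite-backed sketch built on the standard library. Only the four method names (create, get, save, list) come from the description above; the signatures, the dict-shaped records, and the table layout are assumptions, so check them against the actual Repository protocol before relying on this.

```python
import json
import sqlite3
import uuid
from typing import List, Optional, Protocol


class Repository(Protocol):
    """Assumed shape of the storage interface (signatures are guesses)."""

    def create(self, record: dict) -> dict: ...
    def get(self, record_id: str) -> Optional[dict]: ...
    def save(self, record: dict) -> None: ...
    def list(self) -> List[dict]: ...


class SQLiteRepository:
    """Stores each record as a JSON blob keyed by record_id."""

    def __init__(self, path: str = ":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS records "
            "(record_id TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )

    def create(self, record: dict) -> dict:
        stored = {**record, "record_id": str(uuid.uuid4())}
        self.save(stored)
        return stored

    def get(self, record_id: str) -> Optional[dict]:
        row = self._db.execute(
            "SELECT body FROM records WHERE record_id = ?", (record_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def save(self, record: dict) -> None:
        # The connection context manager commits on success, rolls back on error.
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO records (record_id, body) VALUES (?, ?)",
                (record["record_id"], json.dumps(record)),
            )

    def list(self) -> List[dict]:
        rows = self._db.execute(
            "SELECT body FROM records ORDER BY record_id"
        ).fetchall()
        return [json.loads(body) for (body,) in rows]


repo: Repository = SQLiteRepository()
created = repo.create(
    {"ref": "https://github.com/org/repo/issues/42", "state": "issue_created"}
)
created["state"] = "task_queued"
repo.save(created)
```

Storing the record as a JSON blob keeps the sketch short; a production backend would more likely give transitions their own table so they can be queried directly.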

Pairing with Gatekeep

LoopForge and Gatekeep solve different problems that show up in the same place. LoopForge governs order: which steps happen, in what sequence, with what proof. Gatekeep governs quality: whether each step meets the standards it should.

Wire a Gatekeep persona into a LoopForge transition hook and you get both: the transition only succeeds if governance approves it, and the approval is recorded in the audit trail automatically.

from gatekeep.personas import consult_sync
from loopforge import LoopService, LoopState, TransitionError  # TransitionError export is assumed

def security_gate_hook(record, previous_state, new_state, trigger):
    if new_state == LoopState.PR_CREATED:
        result = consult_sync("sentinel", record.metadata.get("diff", ""))
        if result.verdict == "block":
            raise TransitionError(f"Security gate blocked PR: {result.reason}")
        record.metadata["security_review"] = result.summary

Why this matters for AI-driven pipelines specifically

Human developers are slow enough that informal audit trails usually work — there's time to notice if something looks wrong. AI agents aren't slow. A Brood worker can move a task from issue to merged PR in minutes, through a sequence of automated steps that no human watched.

The audit trail isn't about distrust — it's about debuggability and compliance. When a security team asks "how did this change get merged without a review?", "the agent did it" is not an answer. "Here is the complete transition history, here is the governance check that ran, here is its result, here is the trigger for every state change" — that's an answer.
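
With a recorded history, that question becomes a short query. The sketch below assumes transitions are available as plain dicts with `to_state` and `trigger` keys, and that a review is marked by the "reviewer.approved" trigger mentioned earlier; the `review_approved` state name is hypothetical.

```python
def merged_without_review(transitions, review_trigger="reviewer.approved"):
    """Return True if the record reached 'merged' with no earlier review trigger."""
    reviewed = False
    for t in transitions:
        if t["trigger"] == review_trigger:
            reviewed = True
        if t["to_state"] == "merged":
            return not reviewed
    return False  # the record never merged


unreviewed_history = [
    {"from_state": "ci_pending", "to_state": "ci_passed", "trigger": "ci.completed"},
    {"from_state": "ci_passed", "to_state": "merged", "trigger": "auto.merged"},
]

reviewed_history = unreviewed_history[:1] + [
    # 'review_approved' is a hypothetical state name for the review gate
    {"from_state": "ci_passed", "to_state": "review_approved", "trigger": "reviewer.approved"},
    {"from_state": "review_approved", "to_state": "merged", "trigger": "auto.merged"},
]

flagged = merged_without_review(unreviewed_history)
cleared = merged_without_review(reviewed_history)
```

Run across every record, a check like this turns "did anything get skipped?" into a report rather than an investigation.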

Reconstructing that answer after the fact is much harder than building LoopForge into the pipeline from the start. I learned this the expensive way.
