Pattern 6 (Planning) and pattern 3 (Parallelization) are often discussed separately, but they compose naturally. Planning produces a task graph. Parallelization identifies which nodes in that graph have no dependencies and can run concurrently. Together they are the engine behind any multi-step agentic workflow that needs to be both structured and fast.

On ticketyboo, both patterns appear most clearly in the fixer-bot and the scanner. The fixer-bot plans before it acts: it generates a spec that describes what changes it will make before making any of them. The scanner parallelizes by design: each analysis layer runs independently on the same repository snapshot.

Pattern 6: Planning

Planning in the Gulli taxonomy means decomposing a high-level goal into a sequence of concrete steps before executing any of them. The agent does not start with the first subtask and figure out the rest as it goes. It produces a plan first, then executes it.

This distinction has practical consequences. An agent that plans first can see the full task graph before executing any of it: it knows which tasks are independent, which must wait on others, and where the work can fork into parallel branches.

[Diagram: in the plan phase, a high-level goal is decomposed into a task graph of Task 1, Task 2 (needs Task 1), Task 3, and Task 4. In the execute phase, Task 1 and Task 2 run sequentially while Task 3 and Task 4 run as a parallel fork, and the branches merge into one output.]
Planning produces a task graph. Execution respects dependencies (sequential) but runs independent tasks concurrently (parallel fork). The plan phase is complete before any execution begins.

Spec generation as planning

The most explicit implementation of pattern 6 in this codebase is the planning agent spec workflow. Before writing any code, the planning agent generates a structured specification: a list of requirements, a data model, and an implementation plan. The spec is a planning artefact. It commits the agent to a structure before any code is written.

# Spec structure (simplified from .kiro/specs/b2b-partner-portal/)
requirements:
  - REQ-001: Partner can authenticate via Cognito Essentials
  - REQ-002: Partner can view their assigned grants
  - REQ-003: Portal must operate within AWS Free Tier

data_model:
  - table: partners (pk=partner_id, sk=profile)
  - table: grants (pk=partner_id, sk=grant_id#status)

implementation_tasks:
  - TASK-001: Terraform Cognito user pool (no deps)
  - TASK-002: DynamoDB tables (no deps)
  - TASK-003: Lambda auth handler (needs TASK-001)
  - TASK-004: Lambda grants API (needs TASK-002)
  - TASK-005: Static portal HTML (needs TASK-003, TASK-004)

Tasks 001 and 002 have no dependencies. They can run in parallel. Tasks 003 and 004 depend on different prerequisites but not on each other. Task 005 is the only task that must wait for everything else. The spec makes this structure explicit before a single line of Terraform is written.
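The stdlib can derive the parallel structure directly from the spec's dependency list. The sketch below is not from the ticketyboo codebase; it just encodes the five tasks above and lets graphlib compute each "wave" of tasks whose prerequisites are complete.

```python
# Derive parallel execution waves from the spec's task graph.
from graphlib import TopologicalSorter

# task -> set of prerequisite tasks, taken from implementation_tasks
deps = {
    "TASK-001": set(),
    "TASK-002": set(),
    "TASK-003": {"TASK-001"},
    "TASK-004": {"TASK-002"},
    "TASK-005": {"TASK-003", "TASK-004"},
}

sorter = TopologicalSorter(deps)
sorter.prepare()
waves = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # every task runnable right now
    waves.append(ready)
    sorter.done(*ready)

# waves == [['TASK-001', 'TASK-002'], ['TASK-003', 'TASK-004'], ['TASK-005']]
```

Each inner list is a batch that can run concurrently; the outer order is the sequential constraint.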

Pattern 3: Parallelization

Parallelization means running independent tasks concurrently rather than sequentially. The constraint is correctness: two tasks can only run in parallel if neither depends on the output of the other. The spec-generated task graph identifies these relationships.

The scanner is the clearest parallelization example in this codebase. When a repository is scanned, six analysis layers run on the same codebase snapshot: dependency analysis, IaC analysis, license checking, code quality, SAST, and secret detection. None of these layers depends on the output of another. They all read from the same source and write to separate result buckets.

[Diagram: a repo snapshot in S3 fans out to six concurrently running layers (dependency, IaC, license, quality, SAST, secrets); an aggregation Lambda combines their results into a health score plus findings.]
The scanner's six analysis layers all read from the same repository snapshot and write to separate result buckets. None depends on another. The aggregator runs after all six complete. Sequential execution would take 6x longer for no correctness benefit.

The Lambda parallelization model

In the scanner architecture, parallelization is implemented through Lambda invocations. The main handler fans out to six layer functions, waits for all to complete, then aggregates the results. Each layer function is independently sized and billed only for the compute it uses. The simplified code below models that fan-out with an in-process thread pool.

# scanner/api/scanner.py -- simplified parallelization
import concurrent.futures
import logging

# dependency, iac, license, quality, sast and secret are the scanner's
# own analysis modules; RepoSnapshot, ScanResult and LayerResult are
# its result types.
logger = logging.getLogger(__name__)

LAYERS = [
    dependency.analyse,
    iac.analyse,
    license.analyse,
    quality.analyse,
    sast.analyse,
    secret.analyse,
]

def run_scan(repo_snapshot: RepoSnapshot) -> ScanResult:
    """Run all analysis layers concurrently, aggregate results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=6) as executor:
        futures = {
            executor.submit(layer, repo_snapshot): layer.__module__
            for layer in LAYERS
        }
        results = {}
        for future in concurrent.futures.as_completed(futures):
            layer_name = futures[future]
            try:
                results[layer_name] = future.result()
            except Exception:
                logger.exception("layer %s failed", layer_name)
                results[layer_name] = LayerResult.empty(layer_name)

    return aggregate(results)

The key design decision: each layer returns a LayerResult regardless of success or failure. A layer failure does not abort the scan; it contributes an empty result. The aggregator handles missing results gracefully. This is fault isolation built into the parallelization model.
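A minimal sketch of that fault-isolation contract, under stated assumptions: the real LayerResult and aggregate live in the scanner codebase, and the fields used here (findings, ok) are illustrative, not taken from its source.

```python
# Illustrative LayerResult: a failed layer contributes an empty result
# instead of aborting the scan.
from dataclasses import dataclass, field

@dataclass
class LayerResult:
    layer: str
    findings: list = field(default_factory=list)
    ok: bool = True

    @classmethod
    def empty(cls, layer: str) -> "LayerResult":
        # No findings, but the layer still appears in the aggregate,
        # so the scan as a whole completes.
        return cls(layer=layer, findings=[], ok=False)

def aggregate(results: dict[str, LayerResult]) -> dict:
    """Merge all layer findings; record which layers failed."""
    all_findings = [f for r in results.values() for f in r.findings]
    failed = [r.layer for r in results.values() if not r.ok]
    return {"findings": all_findings, "failed_layers": failed}

results = {
    "sast": LayerResult("sast", findings=["hardcoded secret"]),
    "iac": LayerResult.empty("iac"),  # this layer raised; scan continues
}
summary = aggregate(results)
# summary == {'findings': ['hardcoded secret'], 'failed_layers': ['iac']}
```

The aggregator sees a complete set of layer names either way, which keeps its logic free of special cases for partial scans.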

Planning and parallelization together: the fixer-bot

The fixer-bot combines both patterns. When it receives a batch of issues from the scanner, it does not start fixing the first one immediately. It plans first: it groups issues by file, identifies which files can be fixed independently, and determines which fixes might conflict with each other.

Issues in the same file cannot be parallelized: fixing two issues in the same file simultaneously risks producing a conflict. Issues in different files can be parallelized: each file fix is independent.

# fixer-bot -- simplified planning and execution
import concurrent.futures

# Issue, FixPlan, FixResult and _fix_batch are the fixer-bot's own
# types and helpers.

def plan_fixes(issues: list[Issue]) -> FixPlan:
    """Group issues by file, identify parallelizable batches."""
    by_file: dict[str, list[Issue]] = {}
    for issue in issues:
        by_file.setdefault(issue.file_path, []).append(issue)

    # Within each file, fixes are sequential (ordered by severity)
    # Across files, fix batches are parallel
    sequential_groups = {
        file: sorted(file_issues, key=lambda i: i.severity, reverse=True)
        for file, file_issues in by_file.items()
    }

    return FixPlan(
        parallel_batches=list(sequential_groups.values()),
        total_issues=len(issues),
        affected_files=len(by_file),
    )

def execute_plan(plan: FixPlan) -> list[FixResult]:
    """Execute fix batches in parallel, sequential within each batch."""
    results = []
    with concurrent.futures.ThreadPoolExecutor() as executor:
        batch_futures = [
            executor.submit(_fix_batch, batch)
            for batch in plan.parallel_batches
        ]
        for future in concurrent.futures.as_completed(batch_futures):
            results.extend(future.result())
    return results

When to plan, when to just execute

Planning adds latency. If a task is simple and linear, generating a plan before executing is overhead with no benefit. The threshold for planning is roughly: does the task have more than one step that could be done differently, and does the choice matter? If yes, plan. If no, execute directly.

For the fixer-bot processing a single low-severity issue in one file, there is no meaningful plan to generate. For the fixer-bot processing a sprint backlog of 30 issues across 15 files, planning is essential. The batch size and complexity determine whether the overhead is worth it.
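That threshold can be stated as a one-line heuristic. The function below is a hypothetical sketch, not fixer-bot source; the name and cutoffs are illustrative.

```python
# Hypothetical planning threshold: plan only when the batch has more
# than one step whose ordering or grouping could be done differently.
def should_plan(issue_count: int, file_count: int) -> bool:
    """Plan when there are multiple issues or multiple affected files."""
    return issue_count > 1 or file_count > 1

assert not should_plan(issue_count=1, file_count=1)  # fix directly
assert should_plan(issue_count=30, file_count=15)    # plan first
```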

The dependency graph is the plan. Any task list with dependency relationships is implicitly a plan. Making it explicit, in code or in a spec document, forces you to think about which tasks are truly independent and which are only assumed to be. That thinking often reveals opportunities for parallelization that would otherwise be missed.

The ops agents as a parallel pipeline

The ops agents are a parallel execution model at the infrastructure level. Four EventBridge schedules trigger four independent Lambda functions: SRE metrics, cost analysis, security audits, and GitHub activity. None of these agents depends on another. They all write to the same DynamoDB table with different partition keys.

This is parallelization by architecture. No code coordinates the four agents; the EventBridge schedule is the fan-out mechanism. The team dashboard reads from DynamoDB and presents all four streams together. The aggregation is read-time, not write-time.
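Read-time aggregation can be sketched in a few lines. This is an assumption-laden model, not the dashboard's code: a dict of lists stands in for the DynamoDB table, with one partition key per agent ("sre", "cost", "security", "github"); the real dashboard would issue one Query per key.

```python
# Model of read-time aggregation: four agents write independently under
# their own partition key; the dashboard merges the streams on read.
from typing import Any

table: dict[str, list[dict[str, Any]]] = {
    "sre": [{"metric": "p99_latency_ms", "value": 412}],
    "cost": [{"metric": "daily_spend_usd", "value": 3.10}],
    "security": [{"metric": "open_findings", "value": 2}],
    "github": [{"metric": "open_prs", "value": 5}],
}

def dashboard_view(agents: list[str]) -> dict[str, list[dict[str, Any]]]:
    """Merge the independent streams at read time; no write-time coordination."""
    return {agent: table.get(agent, []) for agent in agents}

view = dashboard_view(["sre", "cost", "security", "github"])
```

Because the merge happens on read, adding a fifth agent is just another schedule and another partition key; nothing in the existing agents changes.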

Pattern taxonomy from Antonio Gulli, Agentic Design Patterns: A Hands-On Guide to Building Intelligent Systems (Springer, 2025). All examples are original implementations from the ticketyboo.dev platform.
