CLI

The agentdbg CLI lists runs, starts the local viewer, and exports runs to JSON. Storage is under ~/.agentdbg/ by default (overridable with AGENTDBG_DATA_DIR). For all configuration options and precedence, see the configuration reference.

`agentdbg list`

Lists recent runs (by started_at descending).

Usage:

agentdbg list [--limit N] [--json]

Options:

Option	Default	Description
`--limit`, `-n`	20	Maximum number of runs to list
`--json`	-	Output machine-readable JSON

Examples:

agentdbg list
agentdbg list --limit 5
agentdbg list --json

Exit codes: 0 success; 10 internal error.

Text columns: run_id (short), run_name, started_at, duration_ms, llm_calls, tool_calls, status.

`agentdbg view`

Starts the local viewer server and optionally opens the browser. Default bind: 127.0.0.1:8712.

Usage:

agentdbg view [RUN_ID] [--host HOST] [--port PORT] [--no-browser] [--json]

Arguments / options:

Argument/Option	Default	Description
`RUN_ID`	(latest)	Run to view; can be a short prefix (e.g. first 8 chars of UUID)
`--host`, `-H`	127.0.0.1	Bind host
`--port`, `-p`	8712	Bind port
`--no-browser`	-	Do not open the browser; only start the server
`--json`	-	Print run_id, url, status as JSON, then start server

Examples:

agentdbg view
agentdbg view a1b2c3d4
agentdbg view --port 9000 --no-browser
agentdbg view --json

Exit codes: 0 success; 2 run not found (or no runs); 10 internal error.

With --json, output shape: {"spec_version":"0.1","run_id":"...","url":"http://127.0.0.1:8712/?run_id=...","status":"serving"}.

`agentdbg export`

Exports one run to a single JSON file (run metadata + events array).

Usage:

agentdbg export RUN_ID --out FILE

Arguments / options:

Argument/Option	Description
`RUN_ID`	Run to export; can be a short prefix (e.g. first 8 chars of UUID)
`--out`, `-o`	Output file path (JSON)

Examples:

agentdbg export a1b2c3d4-1234-5678-90ab-cdef12345678 --out run.json
agentdbg export a1b2c3d4 -o ./exports/run.json

Exit codes: 0 success; 2 run not found; 10 internal error.

Output file contains: spec_version, run (run metadata), events (array of event objects).

`agentdbg baseline`

Captures a baseline snapshot from a completed run. The snapshot records structural metrics (event counts, tool path, token usage, duration, etc.) that agentdbg assert can later compare against. See Regression testing for the full workflow.

Usage:

agentdbg baseline RUN_ID [--out PATH]

Arguments / options:

Argument/Option	Default	Description
`RUN_ID`	(required)	Run ID or prefix to snapshot
`--out`, `-o`	`.agentdbg/baselines/<run_name>.json`	Output path for the baseline JSON file

Examples:

agentdbg baseline a1b2c3d4
agentdbg baseline a1b2c3d4 --out baselines/support_agent_v1.json

Exit codes: 0 success; 2 run not found; 10 internal error.

The output file is a JSON object containing schema_version, source_run_id, summary metrics, tool_path, tool_call_counts, llm_models_used, event_type_sequence, and final_status. Check it into version control to share the baseline with your team.

`agentdbg assert`

Asserts that a completed run meets behavioral policy checks. Returns exit code 0 when all checks pass and 1 when any check fails, making it suitable for CI gates.

Usage:

agentdbg assert RUN_ID [options]

Arguments / options:

Argument/Option	Default	Description
`RUN_ID`	(required)	Run ID or prefix to check
`--baseline`, `-b`	-	Baseline JSON file to compare against
`--policy`	`.agentdbg/policy.yaml` (auto-detected)	Policy YAML file with assertion thresholds
`--max-steps`	-	Max total events allowed
`--step-tolerance`	`0.5`	Fractional tolerance for step count
`--max-tool-calls`	-	Max tool calls allowed
`--tool-call-tolerance`	`0.5`	Fractional tolerance for tool calls
`--no-new-tools`	`false`	Fail if run uses tools not in baseline
`--no-loops`	`false`	Fail if any LOOP_WARNING present
`--no-guardrails`	`false`	Fail if any guardrail was triggered
`--max-cost-tokens`	-	Max total tokens allowed
`--cost-tolerance`	`0.5`	Fractional tolerance for token cost
`--max-duration-ms`	-	Max run duration in ms
`--duration-tolerance`	`0.5`	Fractional tolerance for duration
`--expect-status`	-	Expected run status (`ok` or `error`)
`--format`, `-f`	`text`	Output format: `text`, `json`, or `markdown`

Precedence: CLI flags override the policy file, which overrides defaults. See the Policy YAML reference for the full override rules and threshold semantics.

Examples:

# Assert against a baseline with default tolerances
agentdbg assert a1b2c3d4 --baseline .agentdbg/baselines/my_agent.json

# Assert with standalone thresholds (no baseline)
agentdbg assert a1b2c3d4 --max-steps 80 --max-tool-calls 30 --no-loops

# Assert using a policy file
agentdbg assert a1b2c3d4 --baseline baseline.json --policy ci-policy.yaml

# Markdown output for GitHub step summaries
agentdbg assert a1b2c3d4 --baseline baseline.json --format markdown

Exit codes: 0 all checks passed; 1 one or more checks failed; 2 run or baseline not found; 10 internal error.

`agentdbg diff`

Compares two runs, or a run against a baseline, showing structural differences in summary metrics, tool path, and event type distribution. Useful for understanding what changed when agentdbg assert reports a failure. See Regression testing for the workflow.

Usage:

agentdbg diff RUN_A [RUN_B] [--baseline FILE] [--format FORMAT]

Exactly one of RUN_B or --baseline must be provided.

Arguments / options:

Argument/Option	Description
`RUN_A`	First run ID or prefix
`RUN_B`	Second run ID or prefix (mutually exclusive with `--baseline`)
`--baseline`, `-b`	Baseline JSON file to compare against (mutually exclusive with `RUN_B`)
`--format`, `-f`	Output format: `text` (default)

Examples:

# Compare two runs
agentdbg diff a1b2c3d4 e5f6a7b8

# Compare a run against a baseline
agentdbg diff a1b2c3d4 --baseline .agentdbg/baselines/my_agent.json

Exit codes: 0 success; 2 run or baseline not found; 10 internal error.

Text output sections:

Summary — metric-by-metric comparison with percentage change (e.g. tool_calls: 10 -> 14 (+40%))
Tool path changes — new (+) and removed (-) tools
Event type distribution — per-event-type counts with percentage change