Maida

Maida is the pre-merge behavioral regression gate for AI agents. It records agent runs, compares current behavior against a known-good baseline, and fails CI when structural behavior regresses: more steps, unexpected tool calls, loops, latency spikes, or cost blowups.

What it is: A local-first, CI-first developer tool for recording runs, capturing baselines, and blocking bad PRs before merge.

What it is not: It is not a hosted telemetry product, a generic output eval platform, or a framework lock-in layer. The local viewer helps inspect evidence, but the core product is behavioral regression gating.
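To make "structural behavior regresses" concrete, here is a minimal Python sketch of the kind of comparison a behavioral gate performs. It is an illustration only, not Maida's implementation or schema: the `RunSummary` class, the `find_regressions` function, the 20% tolerance, and the consecutive-call loop heuristic are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RunSummary:
    """Hypothetical structural summary of one agent run (not Maida's schema)."""
    tool_calls: list[str] = field(default_factory=list)  # tool names in call order
    steps: int = 0
    latency_s: float = 0.0
    cost_usd: float = 0.0


def find_regressions(baseline, current, tolerance=0.2, max_repeat=3):
    """Return human-readable findings; an empty list means the gate passes."""
    findings = []

    # More steps than the baseline plus a relative tolerance.
    if current.steps > baseline.steps * (1 + tolerance):
        findings.append(f"steps: {baseline.steps} -> {current.steps}")

    # Tools that never appeared in the known-good run.
    new_tools = sorted(set(current.tool_calls) - set(baseline.tool_calls))
    if new_tools:
        findings.append(f"unexpected tools: {', '.join(new_tools)}")

    # Crude loop signal: the same tool called several times back to back.
    longest, streak, previous = 0, 0, None
    for name in current.tool_calls:
        streak = streak + 1 if name == previous else 1
        longest = max(longest, streak)
        previous = name
    if longest >= max_repeat:
        findings.append(f"possible loop: {longest} consecutive identical tool calls")

    # Latency spikes and cost blowups, using the same relative tolerance.
    for label in ("latency_s", "cost_usd"):
        base, cur = getattr(baseline, label), getattr(current, label)
        if cur > base * (1 + tolerance):
            findings.append(f"{label}: {base} -> {cur}")

    return findings


# Example data, invented for illustration.
baseline = RunSummary(["search", "summarize"], steps=4, latency_s=2.0, cost_usd=0.01)
current = RunSummary(["search", "search", "search", "send_email"],
                     steps=9, latency_s=2.1, cost_usd=0.01)

findings = find_regressions(baseline, current)
for finding in findings:
    print(finding)
```

In a CI job, a non-empty findings list would translate into a non-zero exit code so the PR is blocked. The real thresholds and checks are defined by the policy file and the CLI, which are documented separately.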

In 60 seconds

1. Install:

pip install maida-ai

2. Run the bundled demo agent (simulated; no repo clone, no API keys):

maida demo

3. Open the timeline viewer or capture a baseline:

maida view
maida baseline --out baselines/my_agent.json

The maida view command opens a browser tab showing the run timeline: tool calls, LLM calls, timing, warnings, and errors. Data is stored locally under ~/.maida/runs/<trace_id>/ as OTel-compatible spans plus metadata.
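Because runs live on disk, they can be inspected with ordinary file tools. The sketch below lists recorded runs using only the storage location described above. The `list_runs` helper is hypothetical, the path is assumed to be the default data directory (it may be configurable), and the assumption that `meta.json` sits directly inside each run directory is a guess rather than something this README confirms.

```python
from pathlib import Path


def list_runs(runs_dir=None):
    """Return (trace_id, has_meta) pairs for each run directory, newest first."""
    # Assumption: the default data directory is ~/.maida/runs.
    root = Path(runs_dir) if runs_dir is not None else Path.home() / ".maida" / "runs"
    if not root.is_dir():
        return []

    run_dirs = [p for p in root.iterdir() if p.is_dir()]
    run_dirs.sort(key=lambda p: p.stat().st_mtime, reverse=True)

    # Assumption: meta.json sits directly inside the run directory.
    return [(p.name, (p / "meta.json").is_file()) for p in run_dirs]


for trace_id, has_meta in list_runs():
    print(trace_id, "meta.json" if has_meta else "(no meta.json found)")
```

For anything beyond a quick look, the list and export commands of the CLI are the supported route; treat the on-disk layout as described in the Trace format reference rather than in this sketch.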

To watch the gate catch a regression end to end on canned data, run the regression demo. It baselines a good run, runs a "refactored" agent that loops and calls a new tool, and shows the failing report with a PR-comment preview:

maida demo --regression

When you're ready to wire up your own project, maida init scaffolds a starter .maida/policy.yaml (add --github for a ready-to-edit CI workflow).

Demos and examples

Example	Path	How to run
Minimal agent (pure Python)	`examples/minimal/`	`python examples/minimal/simple_agent.py`
LangChain minimal	offline script	`python langchain-minimal.py`
OpenAI Agents minimal	offline script	`python openai-agents-minimal.py`
CrewAI minimal	offline script	`CREWAI_DISABLE_TELEMETRY=true python crewai-minimal.py`
LangChain customer support (advanced)	`examples/langchain/`	Set API keys, then follow `_customer_support/README.md`
Demos (short scripts)	`examples/demo/`	`python examples/demo/pure_python.py` or `python examples/demo/langchain.py`

After any run, open the timeline with maida view.

Documentation

Page	Description
Getting started	Installation (uv/pip), quickstart, data dir, redaction
Guardrails	Stop runaway runs with loop, count, and duration limits
Regression testing	Baseline, assert, and diff workflow for catching agent regressions
CLI	`demo`, `init`, `list`, `view`, `export`, `baseline`, `accept`, `assert`, `diff` with options and exit codes
Viewer	Timeline UI usage, URL params, live refresh, and development
SDK	`@trace`, `traced_run`, `has_active_run`, `record_llm_call`, `record_tool_call`, `record_state`
Integrations	LangChain, OpenAI Agents, and CrewAI adapters, including failure behavior and limitations
Architecture	OTel span schema, storage layout, viewer API, loop detection

Reference

Page	Description
Trace format	OTel span envelope, derived event types, payload schemas, meta.json (public contract)
Configuration	Env vars, YAML precedence, redaction, truncation, loop detection, guardrails
Policy YAML	Assertion policy file format, fields, threshold semantics, CLI mapping