summary Enable measurement, optimization, and self-improvement of the agent pipeline. Capture execution metrics, success rates, and per-task usage; support A/B testing of prompts, skills, and models; and add automated review of the queue, priorities, pipeline health, specs, implementation, and testing. Also support goal tracking and auto-scheduling of improvements.

process_summary Execution time: Per-task duration in agent_runner logs and task log footer (done: `duration_seconds` in task log); Task success rate: Aggregate completed vs failed by task_type, model, executor; store in metrics (JSON/SQLite); Usage per task: Token/cost proxy or API call count per task (when providers expose it); Pipeline metrics endpoint: `GET /api/agent/metrics` returning execution time P50/P95, success rate, usage summary; Prompt variants: Support `context.prompt_variant` or prompt ID in task; log which variant was used
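The aggregation described above (execution time P50/P95, success rate by task type, and a count of prompt variants used) could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in agent_execution_metrics.py: the record shape (`task_type`, `status`, `duration_seconds`, `context.prompt_variant`), the helper names `percentile` and `summarize_metrics`, the response keys, and the choice of nearest-rank percentiles are all hypothetical.

```python
from collections import Counter, defaultdict


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100; None if values is empty."""
    if not values:
        return None
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct/100 * n
    return ordered[max(rank, 1) - 1]


def summarize_metrics(records):
    """Aggregate task records into a metrics summary.

    Each record is assumed (hypothetically) to be a dict with the keys
    task_type, status ("completed" or "failed"), duration_seconds, and an
    optional context dict carrying prompt_variant.
    """
    durations = [
        r["duration_seconds"]
        for r in records
        if r.get("duration_seconds") is not None
    ]
    outcomes = defaultdict(Counter)  # task_type -> Counter of statuses
    variants = Counter()             # prompt_variant -> number of tasks

    for r in records:
        # Only finished tasks count toward the success rate.
        if r.get("status") in ("completed", "failed"):
            outcomes[r.get("task_type", "unknown")][r["status"]] += 1
        variant = (r.get("context") or {}).get("prompt_variant")
        if variant is not None:
            variants[variant] += 1

    success_rate = {
        task_type: counts["completed"] / (counts["completed"] + counts["failed"])
        for task_type, counts in outcomes.items()
    }
    return {
        "execution_time": {
            "p50_seconds": percentile(durations, 50),
            "p95_seconds": percentile(durations, 95),
        },
        "success_rate": success_rate,
        "prompt_variants": dict(variants),
    }
```

A handler for `GET /api/agent/metrics` could return the result of `summarize_metrics` over records loaded from the JSON or SQLite store. The same grouping could be keyed by model or executor instead of task type, and a usage summary could be added once providers expose token or cost data.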

pseudocode_summary -

implementation_summary api/app/services/agent_execution_metrics.py (resolve_cost_controls(), attribution_values_from_output()); api/app/services/agent_execution_hooks.py (register_lifecycle_hook(), dispatch_lifecycle_event()); api/app/routers/agent_status_routes.py (pipeline status endpoints)

Spec pipeline-observability-and-auto-review

Spec Source

Spec Value/Cost/Gap Sensing

Linked Ideas

Linked Contributors

Linked Process

Linked Implementation

Spec Summary + Process + Pseudocode + Implementation Notes

API Links