Build Loop
Checked build loop for multi-file changes: assess, plan, execute, review, iterate, learn, and optimize with evidence.
Install
claude plugin marketplace add tyroneross/RossLabs-AI-Toolkit
claude plugin install build-loop@rosslabs-ai-toolkit
The Problem
Multi-step builds drift. The model writes code, declares success, and leaves silent failures: placeholder data that looks real, test stubs that never ran, UI pages that do not render, or scope creep past the original goal.
Build Loop adds a repeatable operating procedure for that class of work. The plan names the design decisions up front. The implementer reports the decisions it made. Review compares the claim to the diff, the rendered behavior, and the original criteria.
What I Built
Build Loop turns significant code changes into a checked workflow: assess, plan, execute, review, iterate, and learn. It gives the agent a place to define success criteria, split work, validate claims, and repair failures before calling the job done.
Impact
Build Loop is host-neutral. Claude Code gets slash commands, agents, hooks, and plugin bridges. Codex gets skills and AGENTS.md, plus explicit subagent prompts only when the user authorizes parallel delegation. Other coding tools can still follow the same phase structure through a plain AGENTS.md contract.
It also ships three modes in one plugin:
- Build for features, refactors, migrations, schema changes, and other multi-file work.
- Optimize for measurable improvements using Design of Experiments instead of one-variable-at-a-time guessing.
- Research for repo-grounded analysis before deciding whether to change code.
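To illustrate why Optimize mode favors Design of Experiments over one-variable-at-a-time guessing, here is a minimal sketch of a full factorial design. It is not Build Loop's implementation: the function names, the `cache` and `batch` factors, and the latency numbers are hypothetical, and full factorial is only one of several possible DoE designs.

```python
from itertools import product


def full_factorial(factors: dict[str, list]) -> list[dict]:
    """Enumerate every combination of factor levels (hypothetical helper)."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


def main_effect(runs: list[dict], responses: list[float],
                factor: str, low, high) -> float:
    """Mean response at the high level minus mean response at the low level."""
    hi = [r for run, r in zip(runs, responses) if run[factor] == high]
    lo = [r for run, r in zip(runs, responses) if run[factor] == low]
    return sum(hi) / len(hi) - sum(lo) / len(lo)


# Two hypothetical factors, two levels each: four runs cover every combination.
runs = full_factorial({"cache": [False, True], "batch": [16, 64]})

# Made-up latency measurements (ms), one per run, in the same order as `runs`.
latency_ms = [100.0, 80.0, 70.0, 40.0]

cache_effect = main_effect(runs, latency_ms, "cache", False, True)
batch_effect = main_effect(runs, latency_ms, "batch", 16, 64)
```

Because every combination is measured, each factor's effect is averaged over all levels of the other factor, which is what a one-variable-at-a-time sweep cannot provide.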
The bridges matter. Build Loop can consult API Registry before integration work, architecture mapping before risky edits, debugger memory before repeating an old failure, and visual validation when UI behavior matters. When a bridge is missing, the loop keeps going with local fallback checks.
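The fallback behavior can be pictured with a small sketch. The `consult` function, its parameters, and the lambda stand-ins below are assumptions made for illustration, not Build Loop's actual bridge API.

```python
from typing import Callable, Optional


def consult(bridge: Optional[Callable[[str], str]],
            fallback: Callable[[str], str],
            query: str) -> tuple[str, str]:
    """Use the bridge when it is installed; otherwise run the local check.

    Returns a (source, answer) pair so the report can show which path ran.
    """
    if bridge is None:
        return "fallback", fallback(query)
    return "bridge", bridge(query)


# Hypothetical stand-ins for a real bridge and a local fallback check.
def registry_lookup(name: str) -> str:
    return f"registry entry for {name}"


def local_grep(name: str) -> str:
    return f"local search for {name}"


with_bridge = consult(registry_lookup, local_grep, "payments-api")
without_bridge = consult(None, local_grep, "payments-api")
```

Returning the source alongside the answer keeps the evidence trail honest: a reviewer can see whether a claim came from the bridge or from the weaker local check.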
How It Works
| # | Phase | Purpose |
|---|---|---|
| 1 | Assess | Read repo state, intent, tools, architecture, and prior failures |
| 2 | Plan | Define pass/fail criteria, file ownership, dependency order, and validation commands |
| 3 | Execute | Build locally or dispatch authorized parallel workers with bounded ownership |
| 4 | Review | Critique, validate, fact-check, simplify, and report against the criteria |
| 5 | Iterate | Repair review failures and repeat review until the criteria pass or the limit is reached |
| 6 | Learn | Optional pattern mining across prior runs to draft reusable skills or process improvements |
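The Execute, Review, and Iterate phases can be sketched as a bounded loop. This is an illustrative outline only: `run_loop`, `ReviewResult`, and the default limit of three iterations are assumptions, not names or values taken from the plugin.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReviewResult:
    failures: list[str]  # criteria that did not pass; empty means success


def run_loop(execute: Callable[[list[str]], None],
             review: Callable[[], ReviewResult],
             max_iterations: int = 3) -> tuple[bool, int]:
    """Build once, then review and repair until criteria pass or the limit hits.

    Returns (passed, number_of_reviews_run).
    """
    execute([])  # initial build: nothing to repair yet
    for attempt in range(1, max_iterations + 1):
        result = review()
        if not result.failures:
            return True, attempt
        if attempt < max_iterations:
            execute(result.failures)  # repair only what review flagged
    return False, max_iterations
```

The loop never reports success without a passing review, and it stops at the limit instead of repairing indefinitely.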
Hard-Won Defaults
Each round of dispatch-pattern testing has surfaced a default that’s now baked into the loop:
- Plan verification before execution — the plan must name the goal, exact files, criteria, commands, and design decisions before build work starts.
- Critic plus live validation for UI — passing tests do not mean the page renders. UI work needs a critic pass and browser or IBR-style evidence.
- Explicit ownership for workers — parallel workers get bounded file ownership, and the orchestrator integrates the final diff.
- Discoverability as a criterion — a shipped feature needs a path into the UI, not just a component on disk.
- Paid API guardrails — cost-bearing APIs require rate limits, failure handling, and explicit validation before release.
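The first default, plan verification before execution, amounts to a completeness check on the plan. A minimal sketch follows, assuming the plan is a plain dictionary; the field names and the `missing_plan_fields` helper are hypothetical and chosen to mirror the list above.

```python
# Hypothetical field names mirroring: goal, exact files, criteria,
# commands, and design decisions.
REQUIRED_FIELDS = ("goal", "files", "criteria", "commands", "design_decisions")


def missing_plan_fields(plan: dict) -> list[str]:
    """Return required fields that are absent or empty in the plan."""
    return [name for name in REQUIRED_FIELDS if not plan.get(name)]


draft_plan = {
    "goal": "Add CSV export to the reports page",
    "files": ["reports/export.py", "reports/views.py"],
    "criteria": ["export button renders", "downloaded file has a header row"],
    "design_decisions": [],  # empty counts as missing
}

gaps = missing_plan_fields(draft_plan)
ready_to_execute = not gaps
```

Treating an empty list the same as a missing key means a plan cannot pass by including the heading without the content.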