Build Loop

Install

claude plugin marketplace add tyroneross/RossLabs-AI-Toolkit
claude plugin install build-loop@rosslabs-ai-toolkit

The Problem

Multi-step builds drift. The model writes code, declares success, and leaves silent failures: placeholder data that looks real, test stubs that never ran, UI pages that do not render, or scope creep past the original goal.

Build Loop adds a repeatable operating system for that class of work. The plan names the design decisions up front. The implementer reports the decisions it made. Review compares the claim to the diff, the rendered behavior, and the original criteria.

What I Built

Build Loop turns significant code changes into a checked workflow: assess, plan, execute, review, iterate, and learn. It gives the agent a place to define success criteria, split work, validate claims, and repair failures before calling the job done.

Impact

Build Loop is host-neutral. Claude Code gets slash commands, agents, hooks, and plugin bridges. Codex gets skills, AGENTS.md, and explicit subagent prompts only when the user authorizes parallel delegation. Other coding tools can still follow the same phase structure through a plain AGENTS.md contract.

It also ships three modes in one plugin:

Build for features, refactors, migrations, schema changes, and other multi-file work.
Optimize for measurable improvements using Design of Experiments instead of one-variable-at-a-time guessing.
Research for repo-grounded analysis before deciding whether to change code.

The bridges matter. Build Loop can consult API Registry before integration work, architecture mapping before risky edits, debugger memory before repeating an old failure, and visual validation when UI behavior matters. When a bridge is missing, the loop keeps going with local fallback checks.

How It Works

#	Phase	Purpose
1	Assess	Read repo state, intent, tools, architecture, and prior failures
2	Plan	Define pass/fail criteria, file ownership, dependency order, and validation commands
3	Execute	Build locally or dispatch authorized parallel workers with bounded ownership
4	Review	Critic, validate, fact-check, simplify, and report against the criteria
5	Iterate	Repair review failures and repeat review until the criteria pass or the limit is reached
6	Learn	Optional pattern mining across prior runs to draft reusable skills or process improvements

Hard-Won Defaults

Each round of dispatch-pattern testing has surfaced a default that’s now baked into the loop:

Plan verification before execution — the plan must name the goal, exact files, criteria, commands, and design decisions before build work starts.
Critic plus live validation for UI — passing tests do not mean the page renders. UI work needs a critic pass and browser or IBR-style evidence.
Explicit ownership for workers — parallel workers get bounded file ownership, and the orchestrator integrates the final diff.
Discoverability as a criterion — a shipped feature needs a path into the UI, not just a component on disk.
Paid API guardrails — cost-bearing APIs require rate limits, failure handling, and explicit validation before release.