Status: Active (started Sep 2025)

Interface Built Right

End-to-end UI design tool for AI coding agents. Archetype-routed design system, /ibr:build orchestrator, custom CDP engine, verification loop.

Stack: TypeScript, Playwright, Pixelmatch, Zod

Install

claude plugin marketplace add tyroneross/RossLabs-AI-Toolkit
claude plugin install ibr@rosslabs-ai-toolkit

The Problem

AI coding agents can render a UI without knowing whether it is the right UI. Tests pass, the page returns 200, the screenshot looks plausible — and the navbar is shifted 40px, the empty state has no CTA, the iOS app violates HIG, the form has no focus indicators. A regression checker tells you something changed. It does not tell you what to build, which patterns fit the platform, or whether the result is discoverable. IBR started as the regression checker. v0.10.0-alpha is the rest.

What I Built

IBR has been repositioned from a “visual testing platform” to an end-to-end design tool. The validation engine that used to be the product is now the verification layer underneath a build orchestrator, a platform-aware design system, and an iOS archetype router.

Area | Before (visual testing) | After (end-to-end design)
Primary identity | Capture baseline, compare, verdict | Guided build with verification along the way
iOS guidance | Generic HIG notes | 6-archetype router into 7 domain reference files
Apple platform | Out of scope | apple-platform skill folded in (architecture, SwiftData, concurrency, TestFlight)
Browser engine | Playwright (~180MB, pre-LLM) | Custom CDP engine, ~95% migrated
Workflow | Stateless single-action CLI | /ibr:build orchestrator: preamble → brainstorm → plan → implement → validate → iterate

/ibr:build Orchestrator

/ibr:build is the entry point. It runs a fixed sequence: a UI preamble that captures platform, scope, template, references, and density; a brainstorming pass via the superpowers skill; a written plan; implementation; and a validate-iterate loop that uses the same comparison engine that started this project. The orchestrator is the one place that knows about the design system, the active reference templates, and the verification gates — so an agent invoking it cannot skip the design step on the way to a passing test.
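The fixed sequence can be pictured as a small gated pipeline. The sketch below is illustrative only, not IBR's actual orchestrator: the `Phase` type, `PhaseHandler`, and `runBuild` are hypothetical names, and the loop back from iterate to implement is omitted to keep it short. It shows the one property the paragraph relies on: a later phase cannot run until every earlier gate has passed.

```typescript
// Phase names follow the sequence described above; all types and functions here are hypothetical.
type Phase = "preamble" | "brainstorm" | "plan" | "implement" | "validate" | "iterate";

const PHASES: Phase[] = ["preamble", "brainstorm", "plan", "implement", "validate", "iterate"];

// Each handler performs one phase and returns true when its gate passes.
type PhaseHandler = () => boolean;

interface BuildResult {
  completed: Phase[];
  stoppedAt: Phase | null; // null means every phase passed its gate
}

// Runs the phases strictly in order and stops at the first failed gate,
// so "implement" or "validate" can never run before "plan" has passed.
function runBuild(handlers: Record<Phase, PhaseHandler>): BuildResult {
  const completed: Phase[] = [];
  for (const phase of PHASES) {
    if (!handlers[phase]()) {
      return { completed, stoppedAt: phase };
    }
    completed.push(phase);
  }
  return { completed, stoppedAt: null };
}
```

Because the caller supplies handlers but not the order, skipping the design steps would require changing the orchestrator itself rather than the invocation.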

iOS Design Router

Most “iOS design help” reduces to one of about six app shapes. The ios-design-router skill classifies an app into one of those archetypes — Utility, Content/Feed, Productivity, Consumer/Habit, Editorial, Tool/Pro — and routes to the matching reference files. A meditation app and a developer tool both ask “what navigation pattern?” and get different correct answers without the agent guessing.

The router sits next to two more iOS-specific skills:

  • ios-design — HIG rules: what to build (navigation, color, type, motion, SF Symbols, materials, Liquid Glass).
  • apple-platform — how to build it: architecture, SwiftData, Watch connectivity, concurrency, CI/CD, TestFlight. Folded in from the standalone apple-dev skill, which is now deprecated with a pointer here.

Reference Library

The references/ios-design/ directory holds 7 domain files lifted from the Calm Precision iOS design system: navigation, lists and cards, buttons, color and typography, motion and states, task economy, and the archetype catalog the router reads from. They are loaded on demand by archetype, not all at once, so the agent gets the relevant patterns instead of a wall of HIG.
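A rough sketch of archetype-based loading, under stated assumptions: the file names and the routing table below are invented for illustration (only the six archetypes, the seven topics, and the `references/ios-design/` directory come from the description above), and `referencesFor` is a hypothetical helper, not part of the plugin.

```typescript
// The six archetypes named by the router.
type Archetype =
  | "utility" | "content-feed" | "productivity"
  | "consumer-habit" | "editorial" | "tool-pro";

// Hypothetical file names for the domain files; the real names may differ.
type ReferenceFile =
  | "navigation.md" | "lists-and-cards.md" | "buttons.md"
  | "color-and-typography.md" | "motion-and-states.md" | "task-economy.md";

// Hypothetical routing table: which domain files each archetype pulls in.
// The archetype catalog itself is what the router reads to classify, so it is not listed here.
const ROUTES: Record<Archetype, ReferenceFile[]> = {
  "utility": ["navigation.md", "buttons.md", "task-economy.md"],
  "content-feed": ["navigation.md", "lists-and-cards.md", "motion-and-states.md"],
  "productivity": ["navigation.md", "lists-and-cards.md", "task-economy.md"],
  "consumer-habit": ["navigation.md", "color-and-typography.md", "motion-and-states.md"],
  "editorial": ["lists-and-cards.md", "color-and-typography.md"],
  "tool-pro": ["navigation.md", "buttons.md", "task-economy.md"],
};

// Returns only the paths relevant to one archetype instead of the whole library.
function referencesFor(archetype: Archetype, baseDir: string = "references/ios-design"): string[] {
  return ROUTES[archetype].map((file) => `${baseDir}/${file}`);
}
```

Under these assumptions, `referencesFor("editorial")` yields two paths rather than the full set, which is the point of loading by archetype.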

CDP Engine

Playwright was the original automation layer. It is pre-LLM, ~180MB, and resolves elements by selector when an agent would rather query the accessibility tree. IBR is replacing it with a custom Chrome DevTools Protocol engine forked from Spectra’s transport layer. The migration is roughly 95% complete — about two stubborn imports remain, and removing them is the last phase before Playwright drops out of the dependency tree. The engine adds four LLM-native moves the old layer could not make cleanly: accessibility-tree-first element resolution, DOM chunking with relevance ranking, adaptive modality switching, and event-driven waits instead of arbitrary timeouts.
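To make accessibility-tree-first resolution concrete, here is a minimal sketch. It assumes the engine has already fetched a flat node list (the Chrome DevTools Protocol exposes one through `Accessibility.getFullAXTree`) and flattened each node to plain strings. The `AXNode` shape and `resolveElement` are simplified, hypothetical stand-ins, not the engine's real types.

```typescript
// Simplified accessibility node. Real CDP nodes wrap role and name in value objects
// and carry more fields; this flattened shape is an assumption for the example.
interface AXNode {
  nodeId: string;
  role: string;
  name: string;
  backendDOMNodeId?: number; // link back to the DOM node, when one exists
}

// Resolves an element by role and accessible name instead of by CSS selector.
// Returns null when there is no match or the match is ambiguous, so the caller
// can fall back to another strategy rather than act on the wrong element.
function resolveElement(nodes: AXNode[], role: string, name: string): AXNode | null {
  const wanted = name.trim().toLowerCase();
  const matches = nodes.filter(
    (node) => node.role === role && node.name.trim().toLowerCase() === wanted
  );
  return matches.length === 1 ? matches[0] : null;
}
```

Asking for a "button" named "Sign in" matches how an agent describes the page, and it survives markup changes that would break a class-based selector.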

Verification, Still

Comparison did not go away — it became a step inside the loop instead of the whole product. Capture a baseline, run the change, diff with Pixelmatch at a 0.1 threshold, classify the result as MATCH, EXPECTED_CHANGE, UNEXPECTED_CHANGE, or LAYOUT_BROKEN, and feed the verdict back into the iterate phase. Landmark detection still extracts the accessibility tree on capture so the verdict carries page intent (auth form, list view, dashboard) rather than just a pixel count. The web dashboard still shows split views, diff overlays, session history, and a feedback field that talks to the agent.
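The classification step might look roughly like the sketch below. Note that Pixelmatch's `threshold` option (0.1 here) controls per-pixel color sensitivity, and the library returns a count of mismatched pixels; turning that count into a verdict is a separate decision. The rules, the `tolerance` default, and the `DiffResult` fields are all assumptions made for illustration, since the actual criteria are not described here.

```typescript
type Verdict = "MATCH" | "EXPECTED_CHANGE" | "UNEXPECTED_CHANGE" | "LAYOUT_BROKEN";

// Hypothetical inputs to the classifier.
interface DiffResult {
  mismatchedPixels: number;   // e.g. the count a Pixelmatch comparison returns
  totalPixels: number;        // width * height of the compared images
  changeWasIntended: boolean; // assumption: the agent declared a UI change for this view
  landmarksMissing: number;   // assumption: landmarks in the baseline that are absent now
}

// Maps a diff to one of the four verdicts. The ordering of the checks and the
// tolerance (fraction of pixels allowed to differ) are illustrative choices.
function classify(diff: DiffResult, tolerance: number = 0.001): Verdict {
  if (diff.landmarksMissing > 0) {
    return "LAYOUT_BROKEN";
  }
  const ratio = diff.totalPixels === 0 ? 0 : diff.mismatchedPixels / diff.totalPixels;
  if (ratio <= tolerance) {
    return "MATCH";
  }
  return diff.changeWasIntended ? "EXPECTED_CHANGE" : "UNEXPECTED_CHANGE";
}
```

Checking landmarks before pixels reflects the idea that a missing landmark describes the failure better than a pixel count, whatever the size of the diff.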

Memory

A three-tier preference store keeps the agent’s design context small and current: summary.json as a hot cache under 2KB, a preferences/ directory with full details, and an archive/ directory for evicted entries. The cap is 50 active preferences; when a 51st arrives, the least recently used one moves to archive/. Prompt hooks inject the relevant baselines and preferences into the agent’s context before a UI change instead of asking it to fetch them. Zod schemas validate every preference file and auto-migrate old shapes when the format changes.
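The eviction rule can be sketched in memory as follows. This is an assumption-laden illustration, not the plugin's storage code: `PreferenceStore`, the `Preference` fields, and the logical clock are hypothetical, and file I/O and Zod validation are left out. Only the cap of 50 and the least-recently-used move to archive come from the description above.

```typescript
// Hypothetical preference shape; the real files are validated by Zod schemas.
interface Preference {
  id: string;
  rule: string;
  lastUsed: number; // logical clock tick of the most recent read or write
}

class PreferenceStore {
  private active: Record<string, Preference> = {};
  private archive: Preference[] = [];
  private clock = 0;
  private cap: number;

  constructor(cap: number = 50) {
    this.cap = cap;
  }

  // Inserts or updates a preference, then archives the least recently used
  // entry if the active set has grown past the cap.
  put(id: string, rule: string): void {
    this.active[id] = { id, rule, lastUsed: ++this.clock };
    const ids = Object.keys(this.active);
    if (ids.length > this.cap) {
      let oldest = ids[0];
      for (const candidate of ids) {
        if (this.active[candidate].lastUsed < this.active[oldest].lastUsed) {
          oldest = candidate;
        }
      }
      this.archive.push(this.active[oldest]);
      delete this.active[oldest];
    }
  }

  // Reading a preference counts as a use and refreshes its recency.
  get(id: string): Preference | undefined {
    const pref = this.active[id];
    if (pref) {
      pref.lastUsed = ++this.clock;
    }
    return pref;
  }

  activeIds(): string[] {
    return Object.keys(this.active);
  }

  archivedIds(): string[] {
    return this.archive.map((pref) => pref.id);
  }
}
```

A linear scan for the oldest entry is cheap at 50 items; a larger store would probably want an ordered structure instead.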

Why the Repositioning

IBR already had build orchestration, design system enforcement, and platform skills. Calling it a testing tool described one slice and left the other three under-marketed and under-used. The new framing — design-first, verified end to end — matches what the plugin actually does on a real build, and gives the iOS work, the apple-platform integration, and the CDP engine a coherent place to live.