Feb 16 – 22, 2026

AI wrote 90% of this app.

7 days. 135 commits. One human steering.

cre8 is a real-time collaborative whiteboard built in a single week — Claude Code generated the code while a human drove architecture, design, and debugging decisions.

135 Commits
7 Days
~90% AI-Generated Code
$11.01 API Cost
202 LLM Traces
9,396 AI Operations

Development Process

Claude Code wrote ~90% of the codebase — but the direction was entirely human-driven. The most valuable human contributions were architectural: choosing zustand over Redux, separating RTDB (high-frequency cursors) from Firestore (persistent objects), and designing the "AI describes, layout engine positions" pattern for architecture diagrams.
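The cursor/object split can be sketched in a few lines. This is an illustrative sketch, not cre8's actual code: `EphemeralChannel` stands in for RTDB, `PersistentStore` for Firestore, and all names are hypothetical.

```typescript
// Illustrative sketch of the cursor/object split. EphemeralChannel stands in
// for RTDB (cheap, high-frequency, lossy is fine); PersistentStore stands in
// for Firestore (durable board objects). All names are hypothetical.
type Cursor = { userId: string; x: number; y: number };
type BoardObject = { id: string; kind: "rect" | "ellipse" | "text"; x: number; y: number };

interface EphemeralChannel { send(c: Cursor): void }
interface PersistentStore { save(o: BoardObject): Promise<void> }

// Throttle cursor writes so 60 Hz mousemove events don't become 60 writes/s.
function throttledCursorSender(channel: EphemeralChannel, intervalMs = 50) {
  let last = 0;
  return (c: Cursor) => {
    const now = Date.now();
    if (now - last >= intervalMs) {
      last = now;
      channel.send(c);
    }
  };
}

// Object mutations are rare and must survive reloads, so they go to the
// persistent store and are awaited.
async function commitObject(store: PersistentStore, obj: BoardObject): Promise<void> {
  await store.save(obj);
}
```

The payoff is cost and latency: high-frequency cursor traffic never touches the durable, billed store.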

47% of sessions were bug fixes, not new features. The real value of AI coding isn't "build faster" — it's "fix faster." Claude excelled at reading 10+ files, understanding state interactions, and finding root causes across multi-file codebases.

606 Claude Code Messages
76 Sessions
+31,251 Lines Written
378 Files Touched
70.3s Median Response Loop
72% Goals Achieved

Toolchain

  • Claude Code: Primary dev agent — ~90% of code
  • OpenAI Codex: Architecture planning sessions
  • Langfuse: AI observability — every LLM call traced
  • Repomix: Codebase compression for AI analysis

Build Timeline

Day 1 — Genesis

8 commits · $0 API
  • Initial commit — Next.js 16 + React 19 + Konva canvas
  • Firebase Auth (Google + email), dark/light mode
  • Infinite canvas with pan/zoom, shape primitives

Cost Analysis

3.3x — Prompt Optimization

AI command cost dropped from $0.014 to $0.004/call by trimming the system prompt from ~8K to ~1.9K input tokens. Same model (Haiku 4.5), just less wasted context.

8.2x — Model Migration

Switching AI commands from Sonnet to Haiku cut cost per call from $0.051 to $0.006. Quality was sufficient for tool-use tasks — simpler output, but good enough.

AI Commands — Haiku 4.5

Natural language board manipulation

133 traces · $1.08 total · avg latency 4.1s · $0.004 per command

Arch Diagrams — Sonnet 4.6

GitHub repo → architecture visualization

69 traces · $9.93 total · avg latency 25.3s · $0.09 per analysis

Still in active development — feature launched today. Uses Repomix + Sonnet 4.6 to analyze any GitHub repo and generate a visual architecture diagram on the canvas.

AI Commands — Haiku 4.5 — Production Projection

$0.004/cmd · projected monthly cost by user scale
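The projection itself is one multiplication. The $0.004/command figure is measured from the traces; the only guessed input is how many commands an average user issues per month, which is a hypothetical assumption here:

```typescript
// Back-of-envelope monthly cost projection for AI commands. COST_PER_COMMAND_USD
// is the measured post-optimization figure; COMMANDS_PER_USER_PER_MONTH is a
// hypothetical assumption, not a measured number.
const COST_PER_COMMAND_USD = 0.004;
const COMMANDS_PER_USER_PER_MONTH = 100; // assumed usage, adjust to taste

function projectedMonthlyCostUsd(activeUsers: number): number {
  return activeUsers * COMMANDS_PER_USER_PER_MONTH * COST_PER_COMMAND_USD;
}
// Under these assumptions, 1,000 active users cost about $400/month.
```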

Total Development API Spend: $11.01
Haiku 4.5 — $1.08 · Sonnet 4.5 — $3.60 · Sonnet 4.6 — $6.33

Effective Prompts

1. Feature Implementation with Quality Constraints

"Review PRD.md. We're missing a core feature: Connector.tsx — lines and arrows between objects. This needs to work correctly on the first attempt without introducing bugs. The UX should feel smooth and intuitive like Figma — users click and drag an arrow to connect two elements. Establish a plan and implement it."

Result: working connectors with endpoint snapping, shipped in a single session with zero regressions.

2. Readiness Assessment Before Major Feature

"Review the current implementation and analyze progress. Determine a readiness score (1-10) for implementing the final major feature — AI collaboration. Write a phased plan to implement the AI features in a way that builds on existing architecture."

Result: scored 8/10 readiness and delivered a phased plan: tool schemas → simulate function → API route → chat UI.

3. Separating AI Judgment from Layout Logic

"Build a feature that takes a GitHub repo URL and generates an architecture diagram on the canvas. Use Repomix to compress the repo, Claude Sonnet to analyze the structure, and a deterministic layout engine to position everything. The AI should only describe the architecture — the layout engine handles all positioning."

Result: clean separation of concerns — Claude describes, the layout engine positions. The same repo always produces the same diagram.
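The pattern is worth making concrete: the model returns only a declarative description (nodes with a layer, plus edges), and a pure function assigns coordinates. The types and layout rules below are an illustrative sketch, not cre8's actual engine:

```typescript
// The model emits only this description; it never sees or produces coordinates.
type ArchNode = { id: string; label: string; layer: number }; // layer 0 = entry point
type ArchEdge = { from: string; to: string };
type Diagram = { nodes: ArchNode[]; edges: ArchEdge[] };

type Positioned = ArchNode & { x: number; y: number };

// Deterministic layered layout: sort nodes for a stable order, then place them
// on a grid. The same description always yields the same coordinates.
function layout(d: Diagram, colWidth = 240, rowHeight = 140): Positioned[] {
  const byLayer = new Map<number, ArchNode[]>();
  for (const node of [...d.nodes].sort((a, b) => a.id.localeCompare(b.id))) {
    let row = byLayer.get(node.layer);
    if (!row) {
      row = [];
      byLayer.set(node.layer, row);
    }
    row.push(node);
  }
  const placed: Positioned[] = [];
  for (const [layer, row] of [...byLayer.entries()].sort((a, b) => a[0] - b[0])) {
    row.forEach((node, i) => placed.push({ ...node, x: i * colWidth, y: layer * rowHeight }));
  }
  return placed;
}
```

Because `layout` is pure and its sort keys are stable, any nondeterminism can only come from the model's description, never from rendering.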

Lessons Learned

Seven days of AI-first development surfaced a clear pattern: the mistakes and wins are two sides of the same coin. Every failure pointed directly to a working principle.

What went wrong

Tried to one-shot a massive performance refactor

Day 4 burned the most tokens but was the least productive. Vague instructions like "make it faster" produced sprawling, broken changes.

What I learned

Small prompts beat big prompts

Breaking the same refactor into focused, sequential steps with verification between each produced working code. Feed Chrome DevTools traces, not symptom descriptions.

What went wrong

Stopped referencing the PRD after day 3

Development became reactive — fixing whatever felt urgent instead of building against the spec. Features drifted from the original plan.

What I learned

The skill is steering, not prompting

53 wrong-approach redirects across 76 sessions. AI agents over-engineer constantly. The human value is catching bad paths early — 72% of sessions still hit their goal through active course correction.

What went wrong

Zero E2E tests meant silent regressions

There were 119 unit tests but no Playwright tests, so fixing one bug could silently break another. Even 5 integration tests would have saved hours.

What I learned

Observability enables real decisions

Wiring Langfuse on day 3 gave visibility into token counts, latency, and cost per trace. That data drove the Sonnet → Haiku migration with confidence instead of guesswork.
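The decision data is simple to derive once every call is traced: a per-model rollup of cost and latency. A sketch of that rollup, where `TraceRecord` is an illustrative shape rather than the Langfuse SDK:

```typescript
// Per-model rollup over traced LLM calls. TraceRecord is a made-up shape for
// illustration, not the Langfuse SDK. The point: with cost and latency logged
// per trace, a model migration becomes an arithmetic comparison, not a guess.
type TraceRecord = { model: string; latencyMs: number; costUsd: number };

type ModelSummary = { model: string; calls: number; avgLatencyMs: number; avgCostUsd: number };

function summarize(traces: TraceRecord[]): ModelSummary[] {
  const acc = new Map<string, { calls: number; latency: number; cost: number }>();
  for (const t of traces) {
    const s = acc.get(t.model) ?? { calls: 0, latency: 0, cost: 0 };
    s.calls += 1;
    s.latency += t.latencyMs;
    s.cost += t.costUsd;
    acc.set(t.model, s);
  }
  return [...acc.entries()].map(([model, s]) => ({
    model,
    calls: s.calls,
    avgLatencyMs: s.latency / s.calls,
    avgCostUsd: s.cost / s.calls,
  }));
}
```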

The architectural takeaway: separate AI judgment from deterministic logic. Architecture diagrams work because Claude only describes the structure while a layout engine handles positioning — nondeterministic reasoning paired with deterministic rendering produces consistent output every time.

Try it yourself

cre8 is live and open source

Real-time collaborative whiteboard with an AI agent that manipulates the canvas through natural language. Paste a GitHub URL and get an architecture diagram in 30 seconds.