Implement the following using test-driven development: $ARGUMENTS
Tests give Claude a self-verification loop. Instead of producing code that looks right, write tests first, then implement until they pass. This is the highest-leverage pattern for agentic coding.
| Claude Code Phase | TDD Phase | What Happens |
|---|---|---|
| Explore | Phase 1: Analysis | Read files, trace data flow, check standards |
| Plan | Phase 2: Planning | Chunk decomposition, dependency graph, approval |
| Implement | Phases 3-5: TDD Cycle | Per-chunk: failing test -> code -> passing test |
| Commit | (user-initiated) | Commit when user requests |
| Scope | Approach |
|---|---|
| Trivial (typo, rename, version bump) | Don’t use this skill – just do it directly |
| Small (single-file logic, simple bug fix) | Small feature shortcut (1-chunk tracker) |
| Medium+ (multi-file, unfamiliar code) | Full process |
| Large (cross-cutting, multi-session) | Full process + session resets between chunks |
Small feature shortcut: For single-file or few-file changes with no new domain logic (pure UI, config wiring), collapse to: Analysis -> Pre-Test (verify existing coverage) -> Implement -> Post-Test (regression) -> Quality Verification. Use a 1-chunk tracker (see tracker-schema.md §Single-Chunk Features). Skip chunking, plan mode, and Plan Review Gate (2.5). Quality Verification (Phase 6) still applies — review-impl + /simplify run for all features.
Load these on demand, not all upfront:
| File | When to Load |
|---|---|
| chunk-template.md | Phase 2, when decomposing into chunks (skip for small features) |
| tracker-schema.md | Phase 2.3, when creating the tracker (all features) |
| quality-checklist.md | Phase 2.5 (plan review) and Phase 6 (final verification) |
Phase 1: Analysis
Phase 2: Planning
└─ 2.5: Plan Review Gate (review-plan agent — blocks Phase 3)
Phase 3: Pre-Test (per chunk)
Phase 4: Implementation (per chunk)
Phase 5: Post-Test (per chunk)
Phase 6: Quality Verification
└─ Gate: review-impl agent + /simplify (parallel — blocks completion)
Each phase must complete before the next begins. For multi-chunk features, Phases 3-5 repeat per chunk. Gates are artifact-triggered: the tracker file triggers review-plan (2.5), all chunks complete triggers review-impl + /simplify (6).
docs/specs/ for an existing spec file matching the feature.
If found, use it as primary input — user stories, acceptance criteria,
and edge cases become the basis for chunk decomposition in Phase 2.
(Specs are created by the spec skill, an optional companion.)Context tip: For broad exploration across unfamiliar code, use subagents to investigate. They run in a separate context and report back summaries, keeping your main context clean for implementation.
Verify alignment with the project’s established architecture.
Refer to PROJECT.md for project-specific patterns.
Common patterns to verify:
PROJECT.mdBreak the feature into implementation chunks. See chunk-template.md.
Rules:
resume field for session resumptionOrganize chunks into layers:
Layer 1 (no deps): Chunks that can start immediately
Layer 2 (deps on L1): Chunks needing Layer 1 complete
Layer N (final): Regression + quality verification
Create a tracker file following the schema in tracker-schema.md. The tracker is always created, even for single-chunk features (see §Single-Chunk Features in the schema).
The tracker is the single source of truth for progress.
Always update status to in_progress BEFORE starting a chunk.
Why always? The tracker is a file-based artifact that survives context resets. Relying on in-context memory loses state when sessions end or context is compacted. The tracker ensures any session — current or future — can pick up exactly where work stopped.
Enter Plan Mode (Shift+Tab twice from Normal Mode) and present:
Press Ctrl+G to open the plan in your text editor for direct
editing before proceeding.
Get user approval, then switch back to Normal Mode (Shift+Tab)
before proceeding to implementation.
GATE — The tracker artifact from 2.3 triggers this review. Do not proceed to Phase 3 until review completes.
The tracker file is the review input (analogous to the spec
skill’s Phase 4 validation, but delegated to an independent
agent for deeper code-aware analysis). Spawn the review-plan
agent (.claude/agents/review-plan.md) as a subagent so the
reviewer operates in a fresh context without author bias:
Use the review-plan agent to review [path to tracker]
The agent evaluates 8 criteria against the plan: completeness, correctness, functional gaps, standards, regression risk, robustness, architectural gaps, and TDD quality.
FAIL: Update the plan and re-run review-plan. WARN: Proceed with a note. PASS: Proceed to Phase 3. Agent failure (timeout, error, inconclusive): Fall back to self-check against quality-checklist.md, document the failure in the tracker’s notes, and proceed.
Record the verdict in the tracker’s top-level plan_review field
(e.g., "plan_review": "PASS"). This makes the gate survive
session resets — resumption checks this field before Phase 3.
| Chunk Type | Test Strategy |
|---|---|
| Data class with logic | Computed properties, boundary values |
| Sealed type / union | Property delegation for each variant |
| Repository / service impl | Conversion helpers, filtering, errors |
| Observer / manager | Lifecycle, debounce, state changes |
| Configuration / preferences | Defaults, type conversions, round-trips |
| Composite / aggregating layer | Merge logic, fallbacks, empty states |
| Interface / trait only | No test (tested via downstream fake) |
| UI component | No test (build + regression) |
| DI wiring / config | No test (verified by compilation) |
| Type migration / rename | No test (verified by compilation) |
If the test strategy says “No test,” verify existing coverage is green and skip to Phase 4. Otherwise:
Run the project’s test command (see PROJECT.md) targeting
the specific test class. Confirm tests fail for the right
reason (compile error or assertion failure, not infrastructure).
Set chunk status to in_progress in the JSON tracker.
When adding a dependency to a class or changing a function signature:
When a fix eliminates a code path, remove the dead code in the same chunk. Don’t leave it for a future cleanup pass.
Examples:
Why same chunk? Dead code left behind confuses future readers and creates false grep matches. The person implementing the fix has the best context for what’s now unreachable.
Run the project’s test command targeting the specific test class. All tests must pass. If a test fails, fix the code (not the test) unless the test is wrong about expected behavior.
For intermediate chunks, skip the full suite - chunk tests are sufficient. Run the full suite after the last chunk before Phase 6 (or if a chunk touches widely-shared code).
Check for regressions. Note pre-existing flaky tests but don’t block.
Run the project’s build command. Compilation must succeed.
Set chunk status to complete in the JSON tracker.
Prefer context resets over compaction. A clean session with the tracker as handoff preserves more fidelity than compacted context, which loses information unpredictably.
Between chunks (preferred): Start a new session. The JSON
tracker + resume fields give the new session everything it needs.
Use /rename to name the session for reference.
Mid-chunk (fallback only): If you must compact within a chunk,
run /compact Focus on the current chunk, tracker path, and test
results. This is a fallback — finish the chunk and reset.
After all chunks complete, run the 8-point checklist. See quality-checklist.md for detailed criteria.
GATE — All chunks complete triggers parallel review.
Spawn both in a single message so they run concurrently:
review-impl agent — verifies plan conformance, acceptance
criteria, test quality, regression, robustness, and dead code/simplify — reviews changed files for code reuse, quality,
and efficiencyIn parallel:
- Use the review-impl agent to review implementation against [path to tracker]
- Run /simplify on changed files
While the agents run, self-check against the quick reference above. Merge findings from all three sources (review-impl, /simplify, self-check). Address any FAIL findings before proceeding.
Agent failure (timeout, error): If review-impl fails, fall back to self-check against the full quality-checklist.md. If /simplify is unavailable, review changed files manually for reuse and dead code. Document any agent failures in the tracker.
Always create or update a reference document that survives context compaction. This is required even for small fixes. The document serves as the single source of truth for what was done, why, and what remains.
Contents:
If the feature already has an analysis doc, design doc, or issue tracker, update it instead of creating a new one:
Why mandatory? Context compaction loses implementation details. The doc ensures a future session can understand the full scope of what was reviewed, what was fixed, and what was intentionally left.
in_progress BEFORE starting workthrow to continue (resilience fix),
narrow catch scope. Fatal errors (OOM, stack overflow)
indicate the runtime is broken - continuing would cause
cascading failures. Also verify cancellation exceptions
can’t reach the catch siteDo not commit proactively. Wait for the user to request it.
Refer to PROJECT.md for project-specific commit conventions
(author, message format, trailers).
When resuming work on an in-progress feature:
plan_review field — if missing or "FAIL", run Phase 2.5
before proceedingpending chunk where all depends_on are completeresume field for instructionsTip: Use --resume to continue a named session, or read the
JSON tracker if starting a fresh session on an existing feature.