ai-agent-dev-workflow

Spec Validation Checklist

Run this checklist during Phase 4 after the spec is drafted and clarifications are resolved. Each point must be verified with specific evidence from the spec file.

Pass criteria: all 8 points pass, or remaining failures are documented in the spec’s Notes section with rationale.

Max 2 fix iterations. If issues persist after 2 rounds, mark the spec as Ready-with-warnings and proceed.


1. Testability

Every acceptance criterion can be converted to a test.

How to check:

Common misses:

Self-evaluation warning: It’s tempting to say “all criteria are testable” without actually imagining the test. For each criterion, mentally construct the specific assertion — don’t just pattern-match the Given/When/Then structure.

Adversarial check technique: For each criterion, ask: “Could an implementation pass this criterion while being completely wrong?” If yes, the criterion is too vague. Example: “Then the system responds appropriately” passes for ANY response. “Then the system returns a 400 status with field-level error messages” can only pass when correct.
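The adversarial check above can be made concrete with a sketch. The `validate_submission` function and its error shape are illustrative assumptions standing in for the real endpoint — the point is that the sharper criterion maps directly onto assertions, while "responds appropriately" cannot:

```python
# Hypothetical sketch: a toy validator standing in for the real endpoint.
# The payload fields and error format are assumptions, not from the spec.
def validate_submission(payload: dict) -> tuple[int, dict]:
    """Return (status, body); 400 with per-field errors on bad input."""
    errors = {field: "must not be empty"
              for field in ("email", "name") if not payload.get(field)}
    if errors:
        return 400, {"errors": errors}
    return 201, {}

# "Then the system returns a 400 status with field-level error messages"
status, body = validate_submission({"email": "", "name": "Ada"})
assert status == 400
assert "email" in body["errors"]  # the error names the offending field
```

A criterion like "responds appropriately" offers nothing to put in those two assertions — which is exactly the signal that it is too vague.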


2. Completeness

Every user story has acceptance criteria. Every P1 story has both happy-path and error-path criteria.

How to check:

Variant completeness: If a story’s behavior branches by N variants (scenarios, user roles, platforms, modes), each variant needs at least one criterion. Count the variants mentioned in the story, count the criteria covering each — flag uncovered variants. Example: a story says “generates output for 4 scenarios” but only 2 of 4 have criteria — the other 2 are untested.
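The variant count above can be mechanized with a minimal sketch. The data shapes (plain lists of variant names and criterion strings, matched by substring) are illustrative assumptions, not a prescribed spec format:

```python
# Hypothetical sketch of the variant-coverage count: flag variants that
# no acceptance criterion mentions. Substring matching is a simplification.
def uncovered_variants(variants: list[str], criteria: list[str]) -> list[str]:
    """Return variants not mentioned by any acceptance criterion."""
    return [v for v in variants
            if not any(v.lower() in c.lower() for c in criteria)]

variants = ["scenario A", "scenario B", "scenario C", "scenario D"]
criteria = [
    "Given scenario A, the generator emits output X",
    "Given scenario B, the generator emits output Y",
]
# Matches the example above: 2 of 4 scenarios have criteria.
assert uncovered_variants(variants, criteria) == ["scenario C", "scenario D"]
```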

Common misses:


3. Clarity

No vague adjectives without measurable targets. No jargon without definition. No pronouns with ambiguous referents.
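A first-pass screen for vague adjectives can be sketched as a word-list scan. The word list here is an illustrative assumption — extend it with the project's own recurring offenders; it flags candidates for review, it does not replace judgment:

```python
import re

# Illustrative vague-word list (an assumption); tune per project.
VAGUE = re.compile(
    r"\b(fast|slow|quick(ly)?|easy|easily|robust|scalable|"
    r"appropriate(ly)?|reasonable|intuitive|seamless)\b",
    re.IGNORECASE)

def flag_vague(line: str) -> list[str]:
    """Return vague adjectives/adverbs found on a spec line."""
    return [m.group(0) for m in VAGUE.finditer(line)]

assert flag_vague("The page loads fast and feels intuitive") == ["fast", "intuitive"]
assert flag_vague("The page reaches first paint within 1.5 s") == []
```

The second assertion shows the fix: replace the adjective with a measurable target.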

How to check:

Common misses:


4. Scope

Boundaries are explicit. There is an “Out of Scope” section. P1/P2/P3 boundaries are defensible.

How to check:

Common misses:


5. Independence

User stories can be implemented and tested independently. P1 stories deliver standalone value.

How to check:

Common misses:


6. Priority

Stories are prioritized P1/P2/P3. The P1 set forms a coherent MVP that delivers user value on its own.

How to check:

Priority dependency check: If a P1 story depends on work that’s currently P2 or P3 (or doesn’t exist yet), flag it:
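The priority dependency check can be sketched over a simple story map. The `{id: {priority, depends_on}}` shape is an illustrative assumption about how stories and their dependencies are recorded:

```python
# Hypothetical sketch: flag P1 stories whose dependencies sit at P2/P3
# or do not exist yet.
def priority_violations(stories: dict[str, dict]) -> list[str]:
    """Return 'story -> dep (priority)' strings for each violation."""
    flags = []
    for sid, story in stories.items():
        if story["priority"] != "P1":
            continue
        for dep in story.get("depends_on", []):
            dep_priority = stories.get(dep, {}).get("priority")
            if dep_priority != "P1":
                flags.append(f"{sid} -> {dep} ({dep_priority or 'missing'})")
    return flags

stories = {
    "US-1": {"priority": "P1", "depends_on": ["US-3"]},
    "US-2": {"priority": "P1", "depends_on": []},
    "US-3": {"priority": "P2", "depends_on": []},
}
assert priority_violations(stories) == ["US-1 -> US-3 (P2)"]
```

Each flag is a prompt to either promote the dependency to P1 or demote the dependent story.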

Common misses:


7. Edge Cases

Key failure modes and boundary conditions are identified. At minimum: empty state, error state, boundary values.
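The three minimum cases translate directly into tests. The `summarize` function below is a toy stand-in (an assumption) showing one check per case:

```python
# Hypothetical sketch: the minimum edge cases — empty state, error state,
# boundary value — exercised against a toy function.
def summarize(items: list[int]) -> str:
    if not items:
        return "no items"                  # empty state
    if any(i < 0 for i in items):
        raise ValueError("negative item")  # error state
    return f"{len(items)} items, max {max(items)}"

assert summarize([]) == "no items"         # empty state
assert summarize([7]) == "1 items, max 7"  # boundary: single item
try:                                       # error state
    summarize([-1])
    raised = False
except ValueError:
    raised = True
assert raised
```

If any of these three cannot be written for a story, the spec is missing an edge-case decision.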

How to check:

Common misses:


8. Resolution

No unresolved [NEEDS CLARIFICATION] markers remain.
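This point is mechanically checkable. A minimal sketch, assuming the marker convention named above and that the spec is available as plain text:

```python
# Minimal sketch: scan spec text for unresolved clarification markers.
def unresolved_markers(spec_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs still carrying the marker."""
    return [(n, line.strip())
            for n, line in enumerate(spec_text.splitlines(), start=1)
            if "[NEEDS CLARIFICATION" in line]

spec = ("Title\n"
        "- Export limit: [NEEDS CLARIFICATION: which formats?]\n"
        "- Login via SSO\n")
assert [n for n, _ in unresolved_markers(spec)] == [2]
```

An empty result is necessary but not sufficient — a marker may have been deleted without the underlying question being answered, so spot-check the resolutions too.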

How to check:

Common misses:


Project-Specific Validation

After the 8 generic points above, check the spec against project-specific criteria:

  1. Load PROJECT.md “Domain-Specific Concerns” section
  2. Verify each concern is addressed (in Edge Cases, Acceptance Criteria, or explicitly in Out of Scope)
  3. Load PROJECT.md “Quality Standards” section
  4. Verify each standard is met
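Steps 1 and 3 above can be sketched as a section loader. This assumes PROJECT.md uses markdown headings whose text contains the section title — an assumption about its layout, not a documented format:

```python
# Hypothetical sketch: pull the body of a named section out of PROJECT.md.
def load_section(markdown: str, title: str) -> str:
    """Return the body under the heading whose text contains `title`."""
    body, capturing = [], False
    for line in markdown.splitlines():
        if line.lstrip().startswith("#"):
            capturing = title.lower() in line.lower()
            continue
        if capturing:
            body.append(line)
    return "\n".join(body).strip()

doc = "# Quality Standards\nAll P1 paths tested.\n# Notes\nmisc\n"
assert load_section(doc, "Quality Standards") == "All P1 paths tested."
```

Verification of each concern (steps 2 and 4) remains a judgment call against the spec's Edge Cases, Acceptance Criteria, and Out of Scope sections.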

Document project-specific failures the same way as generic failures — fix or add to Notes with rationale.


Handling Failures

If all 8 points pass: Mark spec status as Ready. Proceed to Phase 5 (Handoff).

If points fail (iteration 1): Fix the spec directly. Re-run the checklist. Most failures are fixable without user input (adding missing error criteria, removing vague adjectives, adding Out of Scope items).

If points still fail (iteration 2): Fix what you can. Document remaining issues in the spec’s Notes section with rationale for why they couldn’t be resolved. Mark spec status as Ready-with-warnings. Proceed to Phase 5 with a warning in the handoff summary.

Do not iterate more than twice; returns diminish quickly after that. The remaining issues are likely ambiguities that need human input during TDD implementation, not spec-level resolution.