ai-agent-dev-workflow

8-Point Quality Verification Checklist

Run this checklist after all implementation chunks are complete. Verify each point with specific evidence (test results, grep output, build logs) rather than a bare pass/fail.


1. Completeness

Verify every acceptance criterion is met.

Self-evaluation warning: This is where generators most often over-report. Cross-check each criterion against the tracker’s acceptance_criteria array — don’t rely on memory of what was planned. If using a subagent evaluator, have it verify independently.

How to check:

Template:

- [ ] Criterion 1: [describe] - verified by [test or file:line]
- [ ] Criterion 2: [describe] - verified by [test or file:line]
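
One way to keep the template honest is to refuse to sign off while any box is still unchecked. The sketch below assumes the filled-in template is saved to a file in the `- [ ]` / `- [x]` form shown above; the function names `count_unchecked` and `require_all_checked` are hypothetical helpers, not part of any existing tooling.

```shell
#!/bin/sh
# Count checklist items that are still unchecked ("- [ ]") in a checklist file.
count_unchecked() {
    # grep -c prints the number of matching lines; it exits 1 when that
    # number is 0, so "|| true" keeps a clean file from looking like an error.
    grep -c '^[[:space:]]*- \[ \]' "$1" || true
}

# Return non-zero if the file is unreadable or any criterion is unchecked.
require_all_checked() {
    if [ ! -r "$1" ]; then
        echo "cannot read checklist file: $1" >&2
        return 2
    fi
    remaining=$(count_unchecked "$1")
    if [ "$remaining" -gt 0 ]; then
        echo "$remaining criteria still unverified in $1" >&2
        return 1
    fi
    echo "all criteria verified in $1"
}
```

Calling `require_all_checked completeness.md` (file name assumed) at the end of the review gives a non-zero exit status while any criterion is still open.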

Common misses:


2. Correctness

Verify data mapping, conversions, and logic are correct.

How to check:

Common misses:


3. Gaps (Functional)

Verify no broken references, missing wiring, or orphaned code.

Self-evaluation warning: Generators tend to miss gaps in files they didn’t directly modify. Always grep — don’t rely on recall of which files reference the changed code.

How to check:

# Check for references to old/removed types
grep -r "OldTypeName" src/

# Check for TODO/FIXME left behind
# (replace *.ext with the project's source file extension)
grep -rE "TODO|FIXME" src/ --include="*.ext"

# Run linter to catch unused imports/variables
# (use project's lint command from PROJECT.md)
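
The grep checks above produce output but no verdict. A minimal sketch of wrapping them so that a leftover reference becomes a failing exit status is shown here; `assert_no_matches` is a hypothetical helper name, and the pattern and directory are whatever the feature under review calls for.

```shell
#!/bin/sh
# Report leftover references and return non-zero if any are found.
# Usage: assert_no_matches PATTERN DIR
assert_no_matches() {
    pattern=$1
    dir=$2
    # -r recurses; -n prints line numbers so hits can be cited as file:line.
    # grep exits 0 on a match, 1 on no match, and 2 or more on an error.
    matches=$(grep -rn -- "$pattern" "$dir") && status=0 || status=$?
    case $status in
        0)
            echo "FAIL: '$pattern' still referenced under $dir:" >&2
            echo "$matches" >&2
            return 1
            ;;
        1)
            echo "OK: no matches for '$pattern' under $dir"
            ;;
        *)
            # Do not report success when grep itself could not run.
            echo "ERROR: grep failed for $dir (status $status)" >&2
            return 2
            ;;
    esac
}
```

For example, `assert_no_matches "OldTypeName" src/` prints the offending `file:line` hits and fails, which doubles as the evidence this checklist asks for.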

Common misses:


4. Standards

Verify implementation follows project standards and platform conventions.

How to check:

Project-specific standards: See PROJECT.md §Standards.


5. Regression

Verify existing functionality isn’t broken.

How to check:

Run the project’s test, build, and lint commands (see PROJECT.md).

Evidence to record:

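Distinguishing pre-existing vs new failures: Check whether the failing test class was modified in this feature. If it was not, the failure is likely pre-existing. To confirm, run the failing test in isolation, ideally on the base branch as well: a test that also fails there predates this change.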
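
As a complement to the modified-class heuristic, the two sets of failures can be compared mechanically. This sketch assumes the failing test names from a baseline run and from the feature run have each been saved to a text file, one name per line; that file format and the helper name `new_failures` are assumptions for illustration, since the test command itself is project-specific.

```shell
#!/bin/sh
# Print test names that fail in the feature run but not in the baseline run.
# Usage: new_failures BASELINE_FAILURES_FILE FEATURE_FAILURES_FILE
new_failures() {
    base_sorted=$(mktemp)
    feat_sorted=$(mktemp)
    # comm requires sorted input; -u also drops duplicate names.
    sort -u "$1" > "$base_sorted"
    sort -u "$2" > "$feat_sorted"
    # -1 hides lines unique to the baseline, -3 hides lines common to both,
    # leaving only the failures unique to the feature run.
    comm -13 "$base_sorted" "$feat_sorted"
    rm -f "$base_sorted" "$feat_sorted"
}
```

Empty output means every current failure was already present in the baseline; any name printed is a candidate regression to investigate.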


6. Robustness

Verify error handling, empty states, and cleanup.

How to check:

Common scenarios:


7. Gaps (Architectural)

Verify abstraction boundaries are respected.

How to check:

# Example: UI should NOT import data layer directly
grep -r "import.*data\." src/ui/ --include="*.ext"

# Example: Controllers should NOT import DB layer
grep -r "import.*db\." src/controllers/ --include="*.ext"
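
When there are several boundary rules, it may be easier to keep them in one table and check them in a loop. The sketch below reuses the two example rules above; the `DIR|PATTERN` rule format and the function name `check_boundaries` are assumptions for illustration, and the real rules would come from PROJECT.md.

```shell
#!/bin/sh
# Check every "DIR|FORBIDDEN_PATTERN" rule under the given project root.
# Prints each violating line and returns non-zero if any rule is broken.
check_boundaries() {
    root=$1
    violations=0
    while IFS='|' read -r dir pattern; do
        [ -n "$dir" ] || continue
        # A match means code in this layer imports something it should not.
        if grep -rn -- "$pattern" "$root/$dir" 2>/dev/null; then
            violations=$((violations + 1))
        fi
    done <<'EOF'
src/ui|import.*data\.
src/controllers|import.*db\.
EOF
    [ "$violations" -eq 0 ]
}
```

Running `check_boundaries .` from the repository root prints any offending imports with `file:line` prefixes and exits non-zero if a rule is violated.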

Architecture rules to verify (see PROJECT.md):


8. Blindspots

Verify edge cases that automated tests may miss.

Self-evaluation warning: This is the hardest point to self-assess honestly. Generators naturally focus on what they built, not what they missed. For medium+ features, delegate this check to a subagent evaluator via /simplify.

Always check:

Platform-specific blindspots (see PROJECT.md §Blindspots):

Common examples across platforms:

How to document:

For each blindspot, note:

  1. What could go wrong
  2. Likelihood (low/medium/high)
  3. Mitigation (if any)
  4. Whether manual testing is needed