Cold Email Copywriter // Version Lab

Iterative copy improvement with traceability. Each version tracks inputs, outputs, QA, and feedback.
Versions: v1 (base), v1.1, v1.2, v2, v2.1, v3 (new)

Date: 2026-06-02
Skill: v1 (base import)
Target: B2B SaaS founder, Series B, 30-80 HC
Trigger: Just raised, still doing sales

Input Traceability

  • SKILL
    cold-email-copywriter/SKILL.md
    .claude/skills/cold-email-copywriter/SKILL.md
  • REF
    Framework definitions (3 variants + worked examples)
    .claude/skills/cold-email-copywriter/references/frameworks.md
  • REF
    Rules, constraints, tone philosophy, quality gates
    .claude/skills/cold-email-copywriter/references/rules-and-constraints.md
  • ICP
    LeadGrow self-campaign ICP profile (Series B SaaS founders)
    clients/gtm-client-leadgrow/icp-profile.yaml
  • RULE
    No em-dashes in copy (workspace rule from MEMORY.md)
    MEMORY.md content rule

Email Sequence

QA Checklist

  • No em dashes: PASS
  • 4 lines max per email: PASS (4/4/3)
  • 60-100 words per email: PASS (87/82/78)
  • No links in body: PASS
  • No signature: PASS
  • No "I wanted to reach out" etc.: PASS
  • Signal not named directly (E1): PASS
  • Subject line 4 words max (E1): PASS (3/3/3)
  • Emails 2+3 threaded (no subject): PASS
  • Bar test (say this out loud?): REVIEW
  • Permissionless value usable today? WEAK
  • Cialdini subtle? PASS

Self-Critique

  • 1. E3 permissionless value is weak. "2.3x more pipeline" is a stat, not something they can use in a meeting today. Should be named accounts with signals or a specific diagnostic.
  • 2. E1 might be too long. 4 lines but dense. Could trim the middle paragraph.
  • 3. E2 "2,000+ qualified meetings" is a flex, not proof. Would land better as a specific founder story or timeframe.
  • 4. All 3 emails lean on the same "founder doing sales" pain. Could diversify E2 or E3 to hit a different angle.
  • 5. "Worth a quick look" CTA in E1 might be too close to asking for a meeting. "Comparing notes" is softer.
Date: 2026-06-03
Skill: v1.1 (QA loop + spacing + variables)
Target: B2B SaaS founder, Series B, 30-80 HC
Trigger: Just raised, still doing sales
Runner: Isolated subagent

Input Traceability

  • SKILL
    cold-email-copywriter/SKILL.md (v1.1 with QA validation section)
    .claude/skills/cold-email-copywriter/SKILL.md
  • REF
    Framework definitions (3 variants + worked examples, {FIRST_NAME} format)
    .claude/skills/cold-email-copywriter/references/frameworks.md
  • REF
    Rules v1.1: tone, constraints + spacing rules + variable format rules
    .claude/skills/cold-email-copywriter/references/rules-and-constraints.md
  • ICP
    LeadGrow self-campaign ICP profile (Series B SaaS founders)
    clients/gtm-client-leadgrow/icp-profile.yaml
  • RULE
    No em-dashes in copy (workspace rule from MEMORY.md)
    MEMORY.md content rule
  • NEW
    QA validator script (post-write validation loop)
    .claude/skills/cold-email-copywriter/scripts/qa_validate.py

Email Sequence

QA Results (automated)

qa_validate.py output: 31 checks, 0 failures, 0 warnings. PASS

  • No em dashes: PASS
  • 4 lines max per email: PASS (4/4/3)
  • 60-100 words per email: PASS (79/86/81)
  • No links in body: PASS
  • No signature: PASS
  • No banned phrases: PASS
  • Subject line 4 words max (E1): PASS
  • Subject spintax 3 variants (E1): PASS
  • Emails 2+3 threaded (no subject): PASS
  • Variable format ({FIRST_NAME} only): PASS
  • Line spacing between idea blocks: PASS
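
A few of the mechanical checks above could be sketched roughly as follows. This is not the actual qa_validate.py, whose contents are not shown here. The function name, the reading of a "line" as a non-blank line, and the URL pattern are all assumptions made for illustration.

```python
import re

EM_DASH = "\u2014"
# Assumption: a "link" is anything starting with http(s):// or www.
URL_PATTERN = re.compile(r"https?://|www\.", re.IGNORECASE)


def check_email_body(
    body: str,
    min_words: int = 60,
    max_words: int = 100,
    max_lines: int = 4,
) -> dict[str, bool]:
    """Return a pass/fail flag per mechanical check for one email body."""
    # Assumption: blank separator lines between idea blocks are not counted.
    lines = [ln for ln in body.splitlines() if ln.strip()]
    words = body.split()
    return {
        "no_em_dashes": EM_DASH not in body,
        "max_lines": len(lines) <= max_lines,
        "word_count": min_words <= len(words) <= max_words,
        "no_links": URL_PATTERN.search(body) is None,
    }
```

Returning one flag per check, rather than a single boolean, keeps the output close to the per-row PASS/FAIL listing above.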

Skill Changes (v1 → v1.1)

  • 1. QA Validation Loop — New Python script (scripts/qa_validate.py) runs automated checks against copy before output. 11 check categories. Sub-agent spawns it during skill invocation.
  • 2. Spacing Rules — Every idea block gets blank line separator. Visual rhythm = breathing, not wall of text.
  • 3. Variable Format — Strict {FIRST_NAME} format only. [First Name] and {{name}} banned. Allowed: {FIRST_NAME}, {LAST_NAME}, {COMPANY}, {TITLE}.
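
The variable format rule in item 3 lends itself to a small regex check. The sketch below is a guess at how such a check might look, not the validator's real code. The function name is hypothetical, and it assumes subject spintax has been expanded or stripped before the check runs.

```python
import re

# Allowed variables, as listed in the v1.1 rules.
ALLOWED_VARIABLES = {"FIRST_NAME", "LAST_NAME", "COMPANY", "TITLE"}

# Matches {{...}}, {...}, or [...] tokens. The double-brace alternative
# comes first so that {{name}} is captured as one token.
TOKEN_PATTERN = re.compile(r"\{\{[^{}]*\}\}|\{[^{}]*\}|\[[^\[\]]*\]")


def find_variable_violations(text: str) -> list[str]:
    """Return every placeholder token that is not an allowed {VARIABLE}."""
    violations = []
    for match in TOKEN_PATTERN.finditer(text):
        token = match.group(0)
        is_single_brace = token.startswith("{") and not token.startswith("{{")
        if is_single_brace and token[1:-1] in ALLOWED_VARIABLES:
            continue
        violations.append(token)
    return violations
```

For example, `find_variable_violations("Hi [First Name]")` returns `["[First Name]"]`, the kind of token the v1 subject lines contained.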

v1 Retroactive Validation

Ran v1.1 validator against v1 copy. 30 checks, 1 failure.

  • Variable format (E1 subject lines): FAIL (used [First Name], not {FIRST_NAME})
  • All other checks (29): PASS

Feedback

Date: 2026-06-04
Skill: v1.2 (skill-creator architecture)
Commit: cf0c587
Eval: iteration-1 — 100% vs 15% baseline

Skill Changes (v1.1 → v1.2)

  • 1. SKILL.md restructure — Applied skill-creator progressive disclosure. Aggressive description with natural-language trigger phrases. Out of Scope section added (LinkedIn, warm nurture, deliverability, proposals, strategy). Lean body — detail pushed to reference files.
  • 2. references/taste.md (new) — Synthesized preference rules for recurring judgment calls: permissionless value bar, authority specificity, signal subtlety, subject line register, pain angle rotation.
  • 3. Sequence diversity constraint — Name E1's pain axis, pick different for E2. Name both, pick third for E3. Axis table: pipeline credibility / team performance / forecast accuracy / speed to first win / motion maturity.
  • 4. Subject line voice test — Colleague-forward test added. Observation/fragment, not CTA or question. If a colleague wouldn't forward it with that subject, rewrite it.
  • 5. Permissionless value bar sharpened — Stat ≠ value. Must pass "can they share this in a meeting tomorrow without any additional work?" Vague benchmark warning added with failing/passing examples.
  • 6. Authority specificity — One specific anonymous story beats "a few revenue leaders." Anonymous is fine. Vague is not.
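
The sequence diversity constraint in item 3 can be expressed as a simple uniqueness check over the axis table. This is a minimal sketch under the assumption that each email is tagged with exactly one axis. The function name and the input shape are hypothetical.

```python
# Pain axes from the v1.2 axis table.
PAIN_AXES = [
    "pipeline credibility",
    "team performance",
    "forecast accuracy",
    "speed to first win",
    "motion maturity",
]


def check_sequence_diversity(axes_by_email: dict[str, str]) -> list[str]:
    """Return a list of problems: unknown axes or an axis reused in the sequence."""
    problems = []
    seen: dict[str, str] = {}  # axis -> first email that used it
    for email, axis in axes_by_email.items():
        if axis not in PAIN_AXES:
            problems.append(f"{email}: unknown axis '{axis}'")
        elif axis in seen:
            problems.append(f"{email}: reuses '{axis}' from {seen[axis]}")
        else:
            seen[axis] = email
    return problems
```

An empty list means E1, E2, and E3 each hit a different axis. Any reuse is reported against the later email, which matches the "name E1's axis, pick different for E2" ordering.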

Eval Iteration-1 Results

3 evals: CRO 3-email sequence, VP Sales signal observation, permissionless value benchmarks.

  • with_skill pass rate: 100% (25/25)
  • without_skill pass rate: 15% (4/25)
  • Delta: +0.85
  • Eval 1, CRO 3-email sequence: 11/11 PASS
  • Eval 2, VP Sales signal observation: 7/7 PASS
  • Eval 3, Permissionless benchmarks: 7/7 PASS

Copy Outputs — Eval Iteration-1 (with skill)

3 evals × with_skill. Eval inputs shown in label.
Eval 1 — CRO 3-Email Sequence
Newly hired CRO, Series B SaaS, announced on LinkedIn 2 weeks ago. Signal Observation → Poke the Bear → Permissionless Value.
Eval 2 — VP Sales Signal Observation
VP Sales, mid-market SaaS. 4 SDR job postings in 45 days, no AE hires. Single email, Signal Observation framework.
Eval 3 — Permissionless Value (Benchmarks Variant)
Newly promoted internal CRO (from VP Sales). No named accounts available — benchmarks variant.

Top Baseline Failure Modes

  • Signal named directly: FAIL (baseline)
  • Wrong variable format ({{name}}): FAIL (baseline)
  • No spintax on subject: FAIL (baseline)
  • Word count blown (150+ words): FAIL (baseline)
  • Em-dashes in subject lines: FAIL (baseline)
  • Meeting ask in permissionless value: FAIL (baseline)
  • Generic non-quotable benchmarks: FAIL (baseline)

Feedback

Version: v2.0 — KG Scout
Date: 2026-06-07
Change: Phase 1 KG Scout added — loads real EDP + persona + PQS from clients/gtm-client-leadgrow/
Eval result: 31/31 assertions pass (100%)

What Changed (v2.0)

  • +KG Scout phase searches clients/gtm-client-{client}/ dynamically — no hardcoded paths
  • +Auto-selects EDP most relevant to prompt context (not just top-scored)
  • +Benchmarks now sourced from real EDP intel (5.3x/1.6x from Gradient Works 2025 via edp-framework.md)
  • +QA loop includes qualitative rubric (qa-agent.md) — EDP grounding, PQS grounding, signal subtlety, persona fit
  • +trace.json now includes edp_selected field with selection rationale
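
To make the dynamic path lookup and the new trace field concrete, here is a rough sketch. Only the `clients/gtm-client-{client}/` pattern, the research file names, and the existence of an `edp_selected` field with a rationale come from these notes. The function names, the nested shape of `edp_selected`, and the `sources` key are assumptions.

```python
import json
from pathlib import PurePosixPath

# Research files named in the v2.0 input traceability list.
RESEARCH_FILES = ("edp-framework.md", "buyer-persona.md", "edp-discovery.md")


def resolve_research_paths(client: str, root: str = "clients") -> dict[str, PurePosixPath]:
    """Build research file paths from the client name instead of hardcoding them."""
    client_dir = PurePosixPath(root) / f"gtm-client-{client}"
    return {name: client_dir / "research" / name for name in RESEARCH_FILES}


def build_trace(edp_id: str, rationale: str, sources: list[PurePosixPath]) -> str:
    """Serialize a trace record. The layout here is hypothetical."""
    trace = {
        "edp_selected": {"id": edp_id, "rationale": rationale},
        "sources": [str(path) for path in sources],
    }
    return json.dumps(trace, indent=2)
```

Calling `resolve_research_paths("leadgrow")` yields paths such as `clients/gtm-client-leadgrow/research/edp-framework.md`, matching the entries listed under Input Traceability.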

Input Traceability

  • SKILL
    cold-email-copywriter/SKILL.md (v2.0 — 4-phase architecture)
    .claude/skills/cold-email-copywriter/SKILL.md
  • REF
    KG Scout instructions — dynamic client folder search
    .claude/skills/cold-email-copywriter/references/kg-scout.md
  • REF
    QA Agent — qualitative rubric (EDP grounding, persona fit, bar test)
    .claude/skills/cold-email-copywriter/references/qa-agent.md
  • ICP
    EDP framework — 7 EDPs ranked by score (leadgrow CRO campaign)
    clients/gtm-client-leadgrow/research/edp-framework.md
  • ICP
    Buyer personas — Vanessa (S1 CRO), Jordan (PLG CRO), Marcus (Founder)
    clients/gtm-client-leadgrow/research/buyer-persona.md
  • ICP
    EDP discovery — industry data, CRO tenure stats, PQS signals
    clients/gtm-client-leadgrow/research/edp-discovery.md

Eval 1 — CRO Sequence (3 emails)

Newly hired CRO, Series B SaaS, announced 2 weeks ago. EDP selected: EDP-1 (Pipeline Coverage Vacuum) + EDP-2 (CRO Tenure Clock). Score: 13/13.

Eval 2 — VP Sales Signal Observation

VP Sales, mid-market SaaS. 4 SDR postings in 45 days, no AE hires. EDP selected: EDP-3 (Infrastructure Build Gap). Score: 9/9.

Eval 3 — Permissionless Value (Benchmarks)

Internally promoted CRO (from VP Sales). No named accounts. EDP selected: EDP-1 (Pipeline Coverage Vacuum). Score: 9/9.

KG Scout EDP Selection Rationale

  • Eval 1, EDP-1 + EDP-2 (newly hired external CRO, S5 dual-signal): CORRECT
  • Eval 2, EDP-3 (VP Sales + SDR scaling before playbook proven): CORRECT
  • Eval 3, EDP-1 over EDP-3 (internal promo already built infrastructure): CORRECT
  • All benchmarks traceable to edp-framework.md (Gradient Works 2025): VERIFIED

Feedback

Version: v2.1 — edp-discovery.md required
Date: 2026-06-08
Change: edp-discovery.md now mandatory in KG Scout (was skipped in eval-1 iter-2)
Eval result: 31/31 assertions pass (100%)

What Changed (v2.1)

  • +edp-discovery.md marked REQUIRED in kg-scout.md with explicit callout: "skipping loses PQS/urgency window data that separates grounded copy from generic copy"
  • +Direct path reads in KG Scout instead of smart-searcher (faster, deterministic)
  • +New PQS data surfaced: urgency window (Days 14-45), 55-65% outbound benchmark, 14-month SDR catch-up math, 70% involuntary CRO departure rate, 19-25 month average tenure
  • +Eval-3 EDP selection changed: EDP-2 (CRO Tenure Clock) chosen over EDP-1 for internally promoted CRO — promoted-in persona skips the external hire "honeymoon" buffer
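
One way to enforce a required file with direct path reads is to fail fast before any copy is written. The sketch below is an illustration only: kg-scout.md is an instruction file, not code, and the function name is hypothetical. Treating all three research files as required is also an assumption, since the notes only mark edp-discovery.md as REQUIRED.

```python
from pathlib import Path

# Assumption: all three research files are treated as required here.
REQUIRED_FILES = ("edp-framework.md", "buyer-persona.md", "edp-discovery.md")


def load_research(research_dir: Path) -> dict[str, str]:
    """Read each required file by direct path; raise if any is missing."""
    missing = [name for name in REQUIRED_FILES if not (research_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(
            "KG Scout cannot proceed, missing required files: " + ", ".join(missing)
        )
    return {
        name: (research_dir / name).read_text(encoding="utf-8")
        for name in REQUIRED_FILES
    }
```

Raising on a missing file, rather than silently continuing, prevents a repeat of the skipped read noted in the Change line for this version.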

Eval 1 — CRO Sequence (3 emails)

Newly hired CRO, Series B SaaS, announced 2 weeks ago. EDP selected: EDP-1 (Pipeline Coverage Vacuum) + EDP-3 (Infrastructure Build Gap). Score: 13/13.

Eval 2 — VP Sales Signal Observation

VP Sales, mid-market SaaS. 4 SDR postings in 45 days, no AE hires. EDP selected: EDP-3 (Infrastructure Build Gap). Score: 9/9.

Eval 3 — Permissionless Value (Internally Promoted CRO)

Internally promoted CRO (from VP Sales). No named accounts. EDP selected: EDP-2 (CRO Tenure Clock). Score: 9/9.

v2.1 vs v2.0 — What edp-discovery.md Added

Each item below is new vs v2:

  • Eval-1 E2: "8-12 weeks to build in-house" (urgency window)
  • Eval-1 E3: 55-65% outbound benchmark (Series B)
  • Eval-1 E3: 1.5-2x inherited coverage stat
  • Eval-2: "14 months to catch up" SDR sequencing math
  • Eval-3: 70% involuntary CRO departure rate
  • Eval-3: 19-25 month average tenure
  • Eval-3 EDP switched to EDP-2 (promoted-in persona awareness)

Feedback

Version: v3 — Simplified eval architecture
Date: 2026-06-15
Change: 2-layer QA (mechanical + recipient-lens); offer carry-through is the binary eval gate; Mitchell handles copy quality
Result: 3/3 evals pass. L1 and L2 pass in every eval, and The Clean Slate appears in E1 for all three signals

What Changed (v3)

  • +Eval architecture simplified — eval-spec.md gutted to 2 criteria only: (1) primary offer from product-cards.md in E1 CTA, (2) Layer 1 script passes. Prior 3-layer quality scoring was circular (model grading model output) and irrelevant to copy quality.
  • +2-layer QA rubric — Layer 1: qa_validate.py mechanical checks. Layer 2: 5 recipient-lens dimensions (pass/fail from recipient's point of view). Quality judgment stays with Mitchell.
  • +Offer loaded at runtime — product-cards.md primary ⭐ offer (The Clean Slate) loaded by writer, never embedded in agent prompt.
  • +Archaeology hardcoding removed — verbatim phrase removed from all reference files. Output now varies across runs (confirmed — all 3 evals open differently).
  • +Framework 4: The Naked Offer — direct offer lead when the offer is strong enough to stand alone.
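
The two-criteria gate described in the first item reduces to a short boolean check. This is a sketch under stated assumptions: the function name is hypothetical, the offer match is a case-insensitive substring test, and the Layer 1 result is represented as a failure count from the script.

```python
def eval_gate(e1_cta: str, primary_offer: str, layer1_failures: int) -> bool:
    """Binary eval gate: offer carried into the E1 CTA AND Layer 1 script passes.

    primary_offer is expected to be loaded at runtime (for example from
    product-cards.md) rather than embedded in the prompt.
    """
    # Assumption: a case-insensitive substring match counts as carry-through.
    offer_in_cta = primary_offer.lower() in e1_cta.lower()
    layer1_passes = layer1_failures == 0
    return offer_in_cta and layer1_passes
```

With a placeholder CTA such as "Want me to send over The Clean Slate?" and zero Layer 1 failures, the gate returns True. Recipient-lens judgments stay outside this function, consistent with keeping quality calls with a human reviewer.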

Recipient-Lens QA — All 3 Evals (Layer 2)

5 pass/fail dimensions graded from the recipient's point of view, not rule compliance. All must pass.

Dimension               Eval 1: New CRO     Eval 2: VP Sales    Eval 3: Internal Promo
Written for me          PASS                PASS                PASS
Feels valuable          PASS                PASS                PASS
Social proof (traced)   PASS                PASS                PASS
Worldview aligned       PASS                PASS                PASS
Easy to say yes         PASS                PASS                PASS

  • The Clean Slate in E1 CTA (all 3 evals): PASS ✓
  • Layer 1 script (all 3 evals): PASS ✓

Eval 1 — New CRO Hire Signal (3 emails)

Vanessa (S1) — external CRO hire, Series B/C B2B SaaS, days 14-45. L1: 31/31. L2: 5/5 pass.

Eval 2 — VP Sales SDR Scaling Signal (3 emails)

VP Sales scaling SDR headcount before outbound motion proven. L1: 31/31. L2: 5/5 pass.

Eval 3 — Internal Promotion Signal (3 emails)

VP Sales promoted to CRO at same company. L1: 32/32. L2: 5/5 pass. The Clean Slate named explicitly in E1.

Feedback