How Outpace AI Cold Email Works: Architecture, Personalization & Deliverability Engineering

"AI writes cold emails" is a description you'll find on a dozen products. What you won't find is how they actually work — because the gap between "LLM call that returns email text" and "system that generates personalized emails at scale without destroying your sender reputation" is enormous. This post covers the latter.

We're going to walk through the full Outpace pipeline: prospect research, AI email generation, follow-up sequencing, domain reputation management, and the deliverability engineering that ties it together. This is the real architecture — no marketing gloss.

The System Architecture

At a high level, Outpace is a pipeline. Data flows in one direction: ICP definition → prospect enrichment → email generation → send queue → delivery → tracking.

ICP Input → Prospect Research → Enrichment Pipeline
                                        ↓
Send Queue + Warmup ← AI Generation ← Claude API
        ↓
Delivery + Tracking → Follow-up Scheduler → Reply Detection

Each stage is independent. The research pipeline doesn't know about the email generator. The send queue doesn't know about domain warming thresholds. This separation is intentional — it lets us iterate each component without cascading failures.

Stage 1: Prospect Research Pipeline

The prospect research problem is more constrained than it sounds. You need signals that are (1) publicly available, (2) recent enough to be relevant, (3) specific enough to generate a non-generic email, and (4) retrievable at scale without hitting rate limits or getting flagged.

The signals we extract for each prospect:

Role and seniority signals — Current title, tenure, reporting structure. Seniority matters because the right tone and CTA for a VP differ from those for a manager.
Company growth signals — Hiring velocity, recent funding, geographic expansion, new product launches. A company that just raised a Series A and is hiring SDRs is categorically different from one that's been flat for two years.
Pain signal indicators — Job postings for roles that signal problems you solve. A company posting five "cold email specialist" jobs is likely struggling to make outbound work at scale.
Recent news hooks — Press mentions, product announcements, leadership changes. These provide natural conversation entry points that don't feel manufactured.

The research output is a structured JSON object per prospect — not free text. This matters for the generation stage: structured input produces consistently structured emails. Free-text research produces inconsistent generation quality.
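To make "structured JSON object per prospect" concrete, here is a minimal sketch of what such a record could look like. The class and field names (`ProspectResearch`, `growth_signals`, and so on) are assumptions chosen to mirror the four signal categories above, not the actual Outpace schema, and the sample values are made up.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Signal:
    """One observed fact about a prospect (hypothetical shape)."""
    kind: str          # e.g. "hiring", "funding", "news" (illustrative labels)
    summary: str       # short factual statement, not free-form analysis
    observed_on: str   # ISO date string, so recency can be checked later


@dataclass
class ProspectResearch:
    """Illustrative per-prospect research record, one list per signal category."""
    name: str
    title: str
    seniority: str
    company: str
    growth_signals: list[Signal] = field(default_factory=list)
    pain_signals: list[Signal] = field(default_factory=list)
    news_hooks: list[Signal] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into the nested Signal dataclasses.
        return json.dumps(asdict(self), indent=2)


# Placeholder prospect, not real data.
record = ProspectResearch(
    name="Jane Doe",
    title="VP Sales",
    seniority="vp",
    company="ExampleCo",
    pain_signals=[Signal("hiring", "4 open SDR roles", "2024-01-15")],
)
print(record.to_json())
```

Because every category is always present as a list, even when empty, the generation stage never has to guess whether a field exists.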

Stage 2: AI Email Generation

Model Selection: Why Claude

We tested GPT-4o, GPT-4o-mini, Claude Sonnet, and Claude Haiku across 500 generated emails judged on three criteria: personalization depth, natural tone, and avoidance of obvious AI tells (stilted phrasing, template-feel, inappropriate formality).

Claude Sonnet won on tone quality by a meaningful margin. The specific failure mode of GPT-4o was over-formality — emails that technically incorporated the prospect signals but read like they were written by someone trying very hard to sound natural. Claude's output had fewer of these tells.

We use Claude Haiku for follow-ups (lower cost, acceptable quality for shorter messages) and Claude Sonnet for initial outreach (where quality matters most).

The Prompt Architecture

The system prompt defines the output constraints. We don't ask the model to "write a good cold email." We ask it to produce a specific structure with specific constraints:

System prompt structure (simplified)

You are an AI sales assistant. Write a cold outreach email for the following prospect.

CONSTRAINTS:
- Subject line: under 50 characters, no clickbait, curiosity-driven
- Body: 3 short paragraphs maximum (under 120 words total)
- Paragraph 1: personalized hook using one specific signal from prospect data
- Paragraph 2: one-sentence value proposition (not features, outcomes)
- Paragraph 3: single soft CTA — ask for a reply, not a meeting

DO NOT:
- Use filler phrases ("I hope this finds you well", "I wanted to reach out")
- Mention AI, automation, or software explicitly in paragraph 1
- Use exclamation points
- Use passive voice

OUTPUT FORMAT: Valid JSON with fields: subject, body
No markdown, no explanation, just the JSON.

The constraint list is the most important part. Without explicit prohibitions, models default to the patterns in their training data — which for cold email means every spam cliché in existence. The constraints prune the output space.
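Constraints in a prompt are requests, not guarantees, so it is worth checking the output before it reaches the send queue. The sketch below is one way to do that, under the assumption that paragraphs are separated by blank lines. The function name `validate_email` is hypothetical, and passive voice is deliberately not checked because it cannot be detected reliably with string matching.

```python
import json

# The two filler phrases named in the prompt's DO NOT list.
FILLER_PHRASES = ("i hope this finds you well", "i wanted to reach out")


def validate_email(raw: str) -> list[str]:
    """Return a list of constraint violations; an empty list means it passes."""
    try:
        email = json.loads(raw)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(email, dict) or not email.keys() >= {"subject", "body"}:
        return ["missing 'subject' or 'body' field"]

    subject, body = str(email["subject"]), str(email["body"])
    problems = []
    if len(subject) >= 50:
        problems.append("subject is 50 characters or longer")
    # Assumption: paragraphs are separated by blank lines.
    paragraphs = [p for p in body.split("\n\n") if p.strip()]
    if len(paragraphs) > 3:
        problems.append("more than 3 paragraphs")
    if len(body.split()) >= 120:
        problems.append("body is 120 words or longer")
    if "!" in subject or "!" in body:
        problems.append("contains an exclamation point")
    lowered = body.lower()
    for phrase in FILLER_PHRASES:
        if phrase in lowered:
            problems.append(f"filler phrase: {phrase!r}")
    return problems
```

A failed check can trigger a regeneration rather than a send, which keeps a single bad completion from going out.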

Personalization That Doesn't Feel Manufactured

The hardest problem in AI email generation isn't generating personalization — it's generating personalization that feels genuine. The tell of a bad AI cold email isn't that it lacks prospect-specific content. It's that the prospect-specific content is irrelevant or forced.

Bad AI Personalization

"I saw you recently expanded to 3 new markets — that's exciting! We help companies like yours..."

Good AI Personalization

"You're hiring 4 SDRs right now. Most companies at that scale hit a deliverability wall around month 3 — we built Outpace to remove that constraint."

The difference: the good example uses the signal to surface a problem the prospect actually has. The bad example mentions the signal and then pivots to a generic pitch. The former requires the model to reason about why the signal matters. This is where prompt engineering earns its keep.

We explicitly inject the signal and its implication into the prompt: "This prospect is hiring 4 SDRs (signal). Companies hiring aggressively for outbound often hit deliverability problems at scale (implication). Use this to open the email." Giving the model the reasoning it needs to make the connection produces dramatically better output than asking it to figure it out.
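The injection step can be sketched as a small template that takes the signal and its implication as separate fields. `SignalContext` and `build_user_prompt` are hypothetical names; the wording of the rendered text follows the example quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalContext:
    signal: str        # the observed fact about the prospect
    implication: str   # why that fact matters for the pitch


def build_user_prompt(ctx: SignalContext) -> str:
    """Render the signal and its implication into the prompt text."""
    return (
        f"This prospect is {ctx.signal} (signal). "
        f"{ctx.implication} (implication). "
        "Use this to open the email."
    )


ctx = SignalContext(
    signal="hiring 4 SDRs",
    implication=(
        "Companies hiring aggressively for outbound often hit "
        "deliverability problems at scale"
    ),
)
print(build_user_prompt(ctx))
```

Keeping the implication as its own field means it can be written or reviewed separately from the raw signal, instead of leaving the model to infer it.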

Stage 3: Follow-Up Sequencing

The conventional wisdom on cold email follow-ups: send 3, space them out, vary the angle. The nuance we've added: follow-ups are generated fresh, not scheduled templates. Each follow-up is aware of the original email and designed to provide an additional value signal — not just nudge the prospect about the previous message.

Follow-up scheduling logic

Follow-up 1: Day 3 (soft reminder, new angle)
Follow-up 2: Day 7 (concrete value add — case study or insight)
Follow-up 3: Day 14 (breakup email — "last note")

Sequence stops automatically on:
- Reply detected (any sentiment)
- Bounce (hard bounce removes from all future sequences)
- Spam complaint (removes domain from all sending)
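
The schedule and stop conditions above can be sketched as a single function. This assumes "Day 3" means three days after the initial send; `Event`, `STOP_EVENTS`, and `next_follow_up` are hypothetical names, not part of any real API.

```python
from datetime import date, timedelta
from enum import Enum

# Day offsets from the initial send, taken from the schedule above.
FOLLOW_UP_OFFSETS = {1: 3, 2: 7, 3: 14}


class Event(Enum):
    REPLY = "reply"
    HARD_BOUNCE = "hard_bounce"
    SPAM_COMPLAINT = "spam_complaint"
    OPEN = "open"


# Any of these events ends the sequence for the prospect.
STOP_EVENTS = {Event.REPLY, Event.HARD_BOUNCE, Event.SPAM_COMPLAINT}


def next_follow_up(initial_send: date, sent_count: int, events: set[Event]):
    """Return (follow-up number, due date), or None if the sequence is over.

    sent_count is how many follow-ups have already gone out (0 to 3).
    """
    if events & STOP_EVENTS:
        return None
    number = sent_count + 1
    if number not in FOLLOW_UP_OFFSETS:
        return None  # all three follow-ups have been sent
    return number, initial_send + timedelta(days=FOLLOW_UP_OFFSETS[number])
```

Checking the stop events before computing the next date means a reply that arrives between sends always wins over the schedule.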

The breakup email ("last note" format) consistently outperforms continued value pitches at the 14-day mark. Psychologically, framing it as the final outreach creates urgency without being aggressive. We've seen reply rates on breakup emails equal to or higher than follow-up 1.

Stage 4: Deliverability Engineering

This is where most AI cold email tools fail silently. You can generate perfect emails and get zero deliverability if your sending infrastructure is wrong. The deliverability problem has three layers:

Domain Warmup

A fresh sending domain has no reputation. ISPs (Gmail, Outlook, Yahoo) evaluate sender reputation by looking at historical sending patterns, engagement rates, and spam complaint rates. A domain that starts sending 500 cold emails on day one looks exactly like a spam operation — because it is the same behavioral pattern.

The warmup process builds domain reputation gradually. The general principle:

Domain warmup ramp (simplified)

Week 1:  10-20 emails/day (to known contacts, high engagement expected)
Week 2:  20-50 emails/day
Week 3:  50-100 emails/day
Week 4+: Scale to target volume

Monitor:
- Inbox placement rate (aim for 95%+)
- Spam complaint rate (alert if >0.1%)
- Bounce rate (alert if >2%)
- Open rate (proxy for inbox placement)

The warmup emails in weeks 1-2 should have high expected engagement — send to colleagues, existing customers, anyone likely to open and reply. This establishes a baseline reputation before cold outreach begins.
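
As a rough sketch, the ramp and the monitoring thresholds reduce to two small functions. The daily caps use the upper end of each weekly range above. The function names and the `inboxed` count (which in practice would come from seed-list or placement testing) are assumptions.

```python
def daily_send_cap(day: int, target_volume: int) -> int:
    """Upper bound on sends for a given warmup day (day 1 is the first day)."""
    if day < 1:
        raise ValueError("day must be >= 1")
    week = (day - 1) // 7 + 1
    # Upper ends of the weekly ranges in the ramp above.
    caps = {1: 20, 2: 50, 3: 100}
    # From week 4 on, the ramp only says "scale to target volume".
    return min(caps.get(week, target_volume), target_volume)


def warmup_alerts(sent: int, complaints: int, bounces: int, inboxed: int) -> list[str]:
    """Compare one day's counts against the thresholds listed above."""
    if sent == 0:
        return []
    alerts = []
    if complaints / sent > 0.001:
        alerts.append("spam complaint rate above 0.1%")
    if bounces / sent > 0.02:
        alerts.append("bounce rate above 2%")
    if inboxed / sent < 0.95:
        alerts.append("inbox placement below 95%")
    return alerts
```

For example, `daily_send_cap(8, 500)` falls in week 2 and returns 50, while any day from 22 onward returns the target volume.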

Technical Authentication: SPF, DKIM, DMARC

These three DNS records are non-negotiable. Emails without valid SPF/DKIM/DMARC fail authentication checks at major ISPs and are either rejected outright or routed to spam with near-certainty.

SPF (Sender Policy Framework) — Authorizes which servers can send email on behalf of your domain. Prevents spoofing.
DKIM (DomainKeys Identified Mail) — Cryptographic signature proving the signed headers and body weren't altered in transit. Gmail's sender guidelines require SPF or DKIM for every sender, and SPF, DKIM, and DMARC together for bulk senders.
DMARC (Domain-based Message Authentication, Reporting & Conformance) — Policy record that tells ISPs what to do with emails that fail SPF/DKIM. Start with p=none (monitoring only), move to p=quarantine after confirming clean authentication.
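
To illustrate the p=none to p=quarantine progression, here is a minimal sketch that parses a DMARC TXT record (published at `_dmarc.<your-domain>`) and checks its policy tag. `parse_dmarc` is a hypothetical helper, and `example.com` and the report address are placeholders.

```python
def parse_dmarc(record: str) -> dict[str, str]:
    """Split a DMARC TXT record into its tag=value pairs and sanity-check it."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed tag: {part!r}")
        tags[key.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("missing v=DMARC1")
    if tags.get("p") not in {"none", "quarantine", "reject"}:
        raise ValueError("p must be none, quarantine, or reject")
    return tags


# Placeholder records: monitoring first, then enforcement.
monitoring = "v=DMARC1; p=none; rua=mailto:dmarc-reports@example.com"
enforcing = "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com"

print(parse_dmarc(monitoring)["p"])
print(parse_dmarc(enforcing)["p"])
```

The `rua` tag is what makes the monitoring phase useful: it tells receivers where to send aggregate reports, which is how you confirm authentication is clean before tightening the policy.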

Critical: Use a subdomain for cold outreach, never your primary domain. If outpace-40.polsia.app is your main domain, send cold email from mail.outpace-40.polsia.app or a separate sending domain entirely. A damaged sending domain shouldn't take your entire company email with it.

Content Signals: What Spam Filters Actually Look For

Modern spam filters (SpamAssassin, Google's ML models, Microsoft SmartScreen) no longer rely on keyword blocklists alone. They also weigh behavioral and structural signals:

Image-to-text ratio — Heavy HTML templates with lots of images look like marketing email, not personal outreach. Plain text or minimal HTML performs better for cold outreach.
Link density — More than one link per email can raise spam scores. For cold outreach, zero links in the body (CTA to reply, not click) is optimal.
Subject line patterns — ALL CAPS, excessive punctuation, and common spam phrases ("Free," "Act now," "Limited time") trigger filters. Our constraint-based prompts prevent most of these.
Engagement history — Low open rates from previous sends train the ISP model to deprioritize your future emails. This is why quality targeting matters more than volume.
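
Several of these signals can be checked mechanically before an email is queued. The sketch below is a heuristic pre-send lint, not a model of how any real filter scores mail. `lint_for_spam_signals` is a hypothetical name, and image-to-text ratio and engagement history are out of scope because they cannot be derived from the subject and body text alone.

```python
import re

# The example phrases named above; a real list would be longer.
SPAM_PHRASES = ("free", "act now", "limited time")
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def lint_for_spam_signals(subject: str, body: str) -> list[str]:
    """Flag structural signals from the list above. Heuristic only."""
    warnings = []
    letters = [c for c in subject if c.isalpha()]
    if letters and all(c.isupper() for c in letters):
        warnings.append("subject is all caps")
    if re.search(r"[!?]{2,}", subject):
        warnings.append("subject has repeated punctuation")
    lowered = subject.lower()
    for phrase in SPAM_PHRASES:
        # Word boundaries keep "free" from matching "freelance".
        if re.search(rf"\b{re.escape(phrase)}\b", lowered):
            warnings.append(f"subject contains spam phrase: {phrase!r}")
    links = URL_PATTERN.findall(body)
    if len(links) > 1:
        warnings.append(f"body has {len(links)} links")
    return warnings
```

A lint like this is a backstop for the prompt constraints: it catches the cases where the model ignores an instruction.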

Tracking and Reply Detection

Open tracking is implemented via a 1x1 pixel with a unique URL per email. Click tracking wraps links. Both are standard and don't affect deliverability when implemented correctly (HTTPS, unique per-email URLs, served from a reliable CDN).
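
A minimal sketch of the per-email pixel, assuming a hypothetical HTTPS endpoint (`track.example.com`) and hypothetical query parameter names. The only point it illustrates is that each outgoing email gets its own random token, so an open can be attributed to exactly one message.

```python
import html
import uuid
from urllib.parse import urlencode

# Hypothetical tracking endpoint; substitute your own HTTPS host.
TRACKING_BASE = "https://track.example.com/o.gif"


def tracking_pixel_html(email_id: str) -> tuple[str, str]:
    """Return (token, <img> tag) for one outgoing email."""
    token = uuid.uuid4().hex  # random, so every email gets a distinct URL
    url = f"{TRACKING_BASE}?{urlencode({'e': email_id, 't': token})}"
    # Escape the URL so the "&" is valid inside an HTML attribute.
    tag = f'<img src="{html.escape(url, quote=True)}" width="1" height="1" alt="" />'
    return token, tag
```

The token would be stored alongside the send record so that a request for the pixel can be matched back to the email it came from.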

Reply detection is more interesting. We poll for replies and parse them to classify intent:

Positive — Stop the sequence, surface in dashboard for follow-up
Neutral / Referral — "Talk to X instead" — log and mark for manual follow-up
Negative / Unsubscribe — Suppress forever, add to domain-level suppression list
Out of office — Pause sequence, resume after expected return date

The out-of-office handling is one of those unglamorous details that matters a lot. Sending a follow-up to someone who's OOO for two weeks annoys them, wastes a follow-up slot, and produces engagement signals that look like disinterest (no opens) when the reality is timing.
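
The four intent classes map to sequencer actions roughly as follows. This sketch covers only the action step and assumes the classification has already happened. The names are hypothetical, and the 7-day fallback pause for an out-of-office reply with no parseable return date is an assumed default, not something stated above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class ReplyIntent(Enum):
    POSITIVE = "positive"
    REFERRAL = "neutral_or_referral"
    NEGATIVE = "negative_or_unsubscribe"
    OUT_OF_OFFICE = "out_of_office"


@dataclass
class SequenceAction:
    stop_sequence: bool
    suppress_forever: bool = False
    needs_manual_follow_up: bool = False
    resume_on: Optional[date] = None


def action_for(intent: ReplyIntent, return_date: Optional[date] = None,
               today: Optional[date] = None) -> SequenceAction:
    """Map a classified reply to what the sequencer should do next."""
    if intent in (ReplyIntent.POSITIVE, ReplyIntent.REFERRAL):
        return SequenceAction(stop_sequence=True, needs_manual_follow_up=True)
    if intent is ReplyIntent.NEGATIVE:
        return SequenceAction(stop_sequence=True, suppress_forever=True)
    # Out of office: pause rather than stop, and resume the day after the
    # stated return date. Assumed fallback: pause 7 days if no date was parsed.
    today = today or date.today()
    if return_date is not None:
        resume = return_date + timedelta(days=1)
    else:
        resume = today + timedelta(days=7)
    return SequenceAction(stop_sequence=False, resume_on=resume)
```

Out of office is the only class that pauses instead of stopping, which is why it carries a resume date and the others do not.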

What We Got Wrong and Fixed

In the spirit of showing the real work:

Early mistake: generating all emails upfront. Our first architecture generated the full 4-email sequence (initial + 3 follow-ups) before any sending began. The problem: context changes. A company that raised a round between the initial email and follow-up 2 deserves a different follow-up angle. Now we generate each follow-up fresh, with a prompt that includes "what happened since the last send" signals.

Early mistake: treating every bounce as equivalent. Hard bounces (invalid address) and soft bounces (mailbox full, server busy) are completely different. We initially suppressed contacts on any bounce, which was overly aggressive. Soft bounces should retry; only hard bounces warrant permanent suppression.
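
The hard/soft distinction can be sketched from the SMTP reply code: 4xx codes are transient failures and 5xx codes are permanent ones. One wrinkle is that some servers report a full mailbox (enhanced status X.2.2) with a 5xx code, so this sketch treats that status as soft regardless. The function names are hypothetical.

```python
from typing import Optional


def classify_bounce(smtp_code: int, enhanced_status: Optional[str] = None) -> str:
    """Return "soft" (retry later) or "hard" (suppress permanently)."""
    if not 400 <= smtp_code <= 599:
        raise ValueError("not an SMTP failure code")
    # Enhanced status X.2.2 means "mailbox full". It sometimes arrives with a
    # 5xx code, but the text above treats a full mailbox as retryable.
    if enhanced_status and enhanced_status.split(".")[1:] == ["2", "2"]:
        return "soft"
    return "soft" if smtp_code < 500 else "hard"


def handle_bounce(smtp_code: int, enhanced_status: Optional[str] = None) -> dict:
    """Decide whether to retry the send or suppress the contact."""
    kind = classify_bounce(smtp_code, enhanced_status)
    return {"kind": kind, "retry": kind == "soft", "suppress": kind == "hard"}
```

For example, `classify_bounce(550, "5.1.1")` (bad destination mailbox) is hard, while `classify_bounce(552, "5.2.2")` is soft despite the 5xx code.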

Early mistake: ignoring the subject line as a deliverability signal. We focused prompt engineering effort on the email body and treated the subject line as a secondary concern. It isn't. The subject line is evaluated by the ISP for spam signals before the body is processed. Subject line quality affects whether your email even gets evaluated on its merits.

Where We're Going

The current architecture handles the core pipeline well. The next layer is what we're calling "intent-aware sequencing" — using engagement signals (opens, click timing, reply patterns) to dynamically adjust follow-up timing and angle, rather than running every prospect through the same schedule.

A prospect who opens the initial email three times but doesn't reply is different from one who never opens. The former is interested but hesitant; the latter may not have received it at all. The same follow-up schedule is the wrong answer for both.
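
Since this layer is not built yet, the following is only a sketch of the segmentation idea, with hypothetical names and an assumed threshold of three opens taken from the example above.

```python
from enum import Enum
from typing import Optional


class Engagement(Enum):
    INTERESTED_BUT_HESITANT = "interested_but_hesitant"
    POSSIBLY_UNDELIVERED = "possibly_undelivered"
    UNDETERMINED = "undetermined"


def segment(open_count: int, replied: bool,
            repeat_open_threshold: int = 3) -> Optional[Engagement]:
    """Bucket a prospect by engagement with the initial email."""
    if replied:
        return None  # a reply ends the sequence, so there is nothing to adjust
    if open_count >= repeat_open_threshold:
        return Engagement.INTERESTED_BUT_HESITANT
    if open_count == 0:
        return Engagement.POSSIBLY_UNDELIVERED
    return Engagement.UNDETERMINED
```

Each bucket could then select a different follow-up timing and angle instead of the fixed Day 3, 7, 14 schedule.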

If you're building something in this space or have questions about our architecture, the best way to reach us is to reply to any Outpace email, or to start a trial and message us from inside the dashboard.
