Workflows · 5 min read

Convert HTML Emails to Markdown for AI Summarization

Pasting a forwarded HTML email into ChatGPT yields a wall of inline styles and tracking pixels. Here's how to convert it to clean Markdown the model can actually summarize — in seconds.

Long-form newsletters — Stratechery, Platformer, vendor product updates — are mostly HTML wrapped around a small amount of actual content. When you paste the email into ChatGPT or Claude to ask for a summary, you're paying tokens for inline CSS, table layouts, tracking pixels, and the email client's own boilerplate.

A typical 2,000-word newsletter expands to 30,000+ tokens of HTML. Convert it to Markdown first and you're back to roughly 3,000 — a 90% reduction with zero loss of meaning.
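
The figures above vary from newsletter to newsletter, so it can be worth checking them on your own mail. The sketch below is one rough way to do that. It assumes the common rule of thumb of about four characters per token for English text rather than a real tokenizer, and the function names are illustrative only.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. About 4 characters per token is a rule of thumb
    for English text, not the output of any specific model's tokenizer."""
    return round(len(text) / chars_per_token)


def reduction_percent(before_tokens: int, after_tokens: int) -> float:
    """Percentage of tokens removed when going from `before` to `after`."""
    if before_tokens <= 0:
        raise ValueError("before_tokens must be positive")
    return 100.0 * (before_tokens - after_tokens) / before_tokens


# The article's example: 30,000 tokens of HTML down to 3,000 of Markdown.
article_example = reduction_percent(30_000, 3_000)
```

To compare a real email, call estimate_tokens on the raw HTML and on the converted Markdown, then pass both counts to reduction_percent.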

The full workflow

  1. In Gmail, open the email → ⋮ menu → 'Show original' (Outlook: 'View source'; Apple Mail: View → Message → Raw Source).
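  2. Copy the HTML body: the part between <body> and </body>. If the source is littered with =3D and lines ending in =, the body is quoted-printable encoded and may need decoding before you convert it.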
  3. Open Markdown Converter and paste.
  4. Copy the cleaned Markdown straight into your AI chat.
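
If you would rather script steps 1 and 2, the raw message shown by 'Show original' can be parsed with Python's standard email package, which also undoes quoted-printable or base64 transfer encoding. This is a minimal sketch, assuming you have saved or copied the complete raw message; extract_html_body and the sample message are hypothetical names and data for illustration.

```python
from email import message_from_string, policy


def extract_html_body(raw_message: str) -> str:
    """Return the decoded text/html part of a raw email, or '' if there is none."""
    msg = message_from_string(raw_message, policy=policy.default)
    part = msg.get_body(preferencelist=("html",))
    return part.get_content() if part is not None else ""


# Made-up sample: a quoted-printable HTML message with a soft line break.
RAW_SAMPLE = (
    "From: news@example.com\n"
    "Subject: Weekly update\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/html; charset="utf-8"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    '<html><body><a href=3D"https://example.com/post">Read the =\n'
    "post</a></body></html>\n"
)

html_body = extract_html_body(RAW_SAMPLE)
```

The returned string is ordinary HTML with the =3D escapes and soft line breaks already resolved, ready to paste into a converter.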

Why Markdown beats raw HTML for AI

  • A fraction of the tokens for the same content (roughly a tenth in the example above)
  • Headings, lists, and links survive intact
  • Models were trained on enormous Markdown corpora — they read it natively
  • Strips tracking pixels, inline styles, and email-client boilerplate automatically
  • Clean, readable diffs if the source changes and you re-summarize

What gets stripped (and why that's fine)

The converter removes inline CSS, conditional Outlook comments, MSO blocks, table-based layout scaffolding, and image tracking pixels. None of it carries meaning a language model needs. What survives: every heading, paragraph, link (with anchor text), bullet list, and image alt-text — exactly what you want for a summary.
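
To make the keep-versus-strip split concrete, here is a toy converter built on Python's standard html.parser. It is a sketch of the behaviour described above, not the implementation of Markdown Converter. The class name, the tag lists, and the rule that a 0- or 1-pixel image is a tracking pixel are all assumptions made for illustration, and ordered lists are simplified to plain bullets.

```python
import re
from html.parser import HTMLParser

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")
SKIPPED = ("head", "style", "script")  # contents dropped entirely
BLOCKS = ("p", "div", "ul", "ol", "table", "tr", "br")  # become paragraph breaks


class EmailToMarkdown(HTMLParser):
    """Toy pass: keeps headings, paragraphs, links, list items, and alt text.
    HTML comments (including Outlook conditional comments) are ignored by default."""

    def __init__(self):
        super().__init__()
        self.out = []        # collected Markdown fragments
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.href = None     # target of the currently open link, if any

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        attr = dict(attrs)  # inline style attributes are simply never emitted
        if tag in HEADINGS:
            self.out.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "li":
            self.out.append("\n- ")
        elif tag == "a" and attr.get("href"):
            self.href = attr["href"]
            self.out.append("[")
        elif tag == "img":
            # Assumption: 0- or 1-pixel images are tracking pixels.
            tiny = attr.get("width") in ("0", "1") or attr.get("height") in ("0", "1")
            if attr.get("alt") and not tiny:
                self.out.append(f"![{attr['alt']}]({attr.get('src', '')})")
        elif tag in BLOCKS:
            self.out.append("\n\n")

    def handle_endtag(self, tag):
        if tag in SKIPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth:
            return
        elif tag == "a" and self.href:
            self.out.append(f"]({self.href})")
            self.href = None
        elif tag in HEADINGS or tag in BLOCKS:
            self.out.append("\n\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(re.sub(r"\s+", " ", data))

    def markdown(self):
        text = "".join(self.out)
        text = re.sub(r"[ \t]*\n[ \t]*", "\n", text)  # trim spaces around newlines
        return re.sub(r"\n{3,}", "\n\n", text).strip()


def html_to_markdown(html: str) -> str:
    parser = EmailToMarkdown()
    parser.feed(html)
    parser.close()
    return parser.markdown()


SAMPLE = """<html><head><style>p{color:red}</style></head><body>
<!--[if mso]><table><tr><td>Outlook only</td></tr></table><![endif]-->
<table><tr><td style="font-family:Arial">
<h1>Weekly Update</h1>
<p>Read the <a href="https://example.com/post">full post</a> today.</p>
<ul><li>First item</li><li>Second item</li></ul>
<img src="https://example.com/open.gif" width="1" height="1" alt="">
<img src="https://example.com/chart.png" alt="Revenue chart">
</td></tr></table></body></html>"""

sample_markdown = html_to_markdown(SAMPLE)
```

On the sample, the stylesheet, the Outlook-only comment, the table scaffolding, and the 1×1 image disappear, while the heading, the link with its anchor text, the bullets, and the chart's alt text remain. A production converter handles many more cases (nested lists, emphasis, tables with real data) than this sketch attempts.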

If the email is still too long

A single newsletter's cleaned Markdown will usually fit comfortably in a modern model's context window. But if you're forwarding a whole digest of newsletters, run the combined output through Context Compressor for a final 20–30% reduction. That lets you fit days of newsletters into a single prompt and ask "what's actually changed this week?"
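
When combining several newsletters, it helps to know which ones actually fit before you paste. The following is a rough sketch of that packing step, not part of any tool mentioned here. build_digest, the token budget, and the default four-characters-per-token estimate are all assumptions; swap in a real tokenizer if you have one.

```python
def build_digest(newsletters, token_budget, count_tokens=lambda s: len(s) // 4):
    """Concatenate (title, markdown) pairs, skipping any that would exceed the budget.

    Returns (digest_text, titles_left_out). The separator lines between
    sections are not counted, so leave a little headroom in the budget.
    """
    parts, left_out, used = [], [], 0
    for title, markdown in newsletters:
        section = f"## {title}\n\n{markdown.strip()}\n"
        cost = count_tokens(section)
        if used + cost > token_budget:
            left_out.append(title)  # too big for the remaining budget
            continue
        parts.append(section)
        used += cost
    return "\n---\n\n".join(parts), left_out


# Made-up inputs: two short newsletters and one long one.
digest, skipped = build_digest(
    [("Monday", "x" * 400), ("Tuesday", "y" * 4000), ("Wednesday", "z" * 400)],
    token_budget=300,
)
```

Anything returned in the second value is a candidate for compression or for a separate prompt.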

Sample prompt for newsletter summaries

You are summarizing a newsletter. Read the Markdown below and produce:

1. A 3-sentence TL;DR.
2. The 3 most actionable insights (with the link to each).
3. Anything that contradicts conventional wisdom in this space.

Newsletter:
[paste markdown here]

Because the Markdown preserves each link next to its anchor text, the model can cite them. Ask for the same thing on the raw HTML and the answer is more likely to garble or hallucinate URLs.
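
Since the links are now plain text, you can also check that every URL in a summary really came from the newsletter. Below is a minimal sketch of such a check. The function names and sample strings are hypothetical, and the simple pattern stops at brackets and parentheses, so URLs that contain parentheses will be cut short.

```python
import re

# Matches http(s) URLs up to whitespace, quotes, or brackets (a simplification).
URL_RE = re.compile(r"https?://[^\s<>()\[\]\"']+")


def extract_urls(text: str) -> set:
    """Collect URLs from text, trimming trailing sentence punctuation."""
    return {m.group(0).rstrip(".,;:!?") for m in URL_RE.finditer(text)}


def unsupported_urls(summary: str, source_markdown: str) -> set:
    """URLs cited in the summary that do not appear in the source Markdown."""
    return extract_urls(summary) - extract_urls(source_markdown)


# Made-up example: the summary cites one real link and one invented link.
source = (
    "Read the [full post](https://example.com/post) and "
    "[pricing](https://example.com/pricing)."
)
summary = (
    "1. Pricing changed (https://example.com/pricing). "
    "2. See https://example.com/launch."
)
suspect = unsupported_urls(summary, source)
```

An empty result means every cited link exists in the source. Anything else is worth a second look before you share the summary.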

Building a personal newsletter knowledge base

Convert each newsletter to Markdown and save it. Over a few months you'll have a searchable archive that an AI can query — "what did Stratechery say about Apple's services strategy in Q1?" — without you re-reading anything. The work is in the conversion; the archive compounds.
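
One simple way to keep such an archive is a folder of Markdown files named by source and date, searched with a plain text match. This is only a sketch of that idea; the directory layout, file naming, and function names are assumptions, and a larger archive would likely want a proper index.

```python
import re
from pathlib import Path


def slugify(text: str) -> str:
    """Lowercase and replace runs of non-alphanumerics with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def save_newsletter(archive_dir, source, date_iso, title, markdown):
    """Write one newsletter to <archive>/<source>/<date>-<title>.md and return the path."""
    path = Path(archive_dir) / slugify(source) / f"{date_iso}-{slugify(title)}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(f"# {title}\n\n{markdown}\n", encoding="utf-8")
    return path


def search_archive(archive_dir, term):
    """Return archived files whose text contains `term` (case-insensitive)."""
    needle = term.lower()
    return sorted(
        p for p in Path(archive_dir).rglob("*.md")
        if needle in p.read_text(encoding="utf-8").lower()
    )
```

The files returned by search_archive can then be pasted, or packed into a single prompt, for the model to answer from.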

Frequently asked

Can I just paste the rendered email?

You can, but copying from your email client usually pastes only the rendered text. That is fine for short notes, but it tends to lose link targets and structure such as headings and lists. HTML → Markdown preserves both.

Where do I find the raw HTML of an email?

Gmail: open the message → three-dot menu → 'Show original'. Outlook: 'View source'. Apple Mail: View → Message → Raw Source.

Will tracking pixels affect the AI summary?

No, but they bloat the token count. The Markdown converter strips them automatically.
