HTML vs Markdown Blog Analysis | Generated by AI

Home 2026.06

Now I have a complete picture. Let me give you the analysis.


Here’s my analysis of HTML vs Markdown for your blog, looking at your actual codebase:

The Anthropic Point

Anthropic observed that when Claude generates Artifacts, HTML produces significantly better visual output than Markdown. The reason is simple: Markdown has roughly 15 structural primitives (headings, lists, tables, blockquotes, code blocks, and so on), while HTML is far more expressive, with custom layouts, callout boxes, color, typography, embedded media, and interactive elements. When an AI agent is designing for human eyes, HTML is the richer canvas.

Your Current Pipeline (What I Found)

Your note creation flow is:

  1. Clipboard → LLM → Markdown file, via ww/note/create_note_from_clipboard.py
  2. Content goes through: clean_grok_tags() → clean_content() → generate_title() → format_front_matter() → write_note()
  3. Output: .md files in notes/ with YAML frontmatter
  4. Jekyll processes them via kramdown (GFM) through _layouts/post.html
  5. The layout just does `{{ content }}`, which dumps the rendered Markdown into the page

Key files:

The Tension You’re Feeling

Your notes are Q&A format with tables, code blocks, lists, blockquotes. Markdown handles this adequately for content. But if you want callout boxes, styled tables, expandable sections, custom typography — Markdown can’t do it without raw HTML embedded in the Markdown.

The real problem: Jekyll’s pipeline is Markdown-first. The _config.yml has markdown: kramdown with input: GFM. The layout does `{{ content }}`. Frontmatter is YAML. The notes collection expects .md files.

Three Approaches (Ranked by Pragmatism)

Approach 1: HTML-in-Markdown (Minimal Change)

Jekyll already supports raw HTML inside .md files. kramdown passes it through. You can keep .md files, YAML frontmatter, and the entire pipeline — but have the LLM generate HTML content blocks instead of Markdown syntax.

Current note:

**Question:** How do I manage information sharing?

**Answer:**

## The Core Mental Model

| Tier | Close Friend | Spouse |
|------|-------------|--------|
| Salary | No | Yes |

HTML-enhanced note:

---
title: Managing Relationship Information Sharing
layout: post
...

<div class="qa-card">
<div class="question">How do I manage information sharing?</div>
<div class="answer">

<h2>The Core Mental Model</h2>

<table class="styled-table">
<thead><tr><th>Tier</th><th>Close Friend</th><th>Spouse</th></tr></thead>
<tbody><tr><td>Salary</td><td>❌ No</td><td>✅ Yes</td></tr></tbody>
</table>

<div class="callout info">Information flows inward freely, outward carefully.</div>

</div>
</div>

Pros: Zero pipeline changes. Keep frontmatter, .md extension, Jekyll collection, everything. Just add CSS classes to _sass/.

Cons: You’re writing HTML inside .md files — feels wrong semantically.

Approach 2: Full HTML Files (Big Change)

Switch to .html files. Jekyll processes .html files too — it still runs Liquid templating and frontmatter on them. But:

You’d need to change:

This is 15-20 files to change, plus all the content post-processing logic.

Approach 3: Keep the Pipeline, Change the Prompt (Recommended)

The Anthropic insight isn’t “stop using Markdown files.” It’s “when AI generates visual content, HTML gives better results.” In your context:

  1. Keep .md files with YAML frontmatter — don’t fight Jekyll
  2. Change the LLM prompt in create_note_from_clipboard.py to generate HTML content blocks instead of Markdown syntax
  3. Add CSS to _sass/ for the HTML classes you want
  4. Update _layouts/post.html to handle both markdown and HTML content gracefully

The pipeline change is minimal:

# In create_note_from_clipboard.py, change the prompt:
full_title_prompt = lambda c: (
    f"Give a short English title (at most 6 words, no quotes, no explanation) for:\n{c}\n\nTitle:"
)
# Add a content formatting prompt:
format_prompt = lambda c: (
    "Format the following as clean semantic HTML with these CSS classes: "
    "`.qa-card`, `.question`, `.answer`, `.callout`, `.styled-table`. "
    "Do NOT include <html>, <head>, <body> tags. "
    f"Only return the content HTML:\n\n{c}"
)
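Since the prompt only asks the model to stay within a fragment and a fixed set of classes, it may be worth checking the output before it is written. The following is a minimal sketch using only the standard library; `check_fragment`, `FORBIDDEN_TAGS`, and `KNOWN_CLASSES` are hypothetical names for illustration and are not part of the existing codebase.

```python
from html.parser import HTMLParser

# Tags the formatting prompt tells the model not to emit.
FORBIDDEN_TAGS = {"html", "head", "body"}
# Classes named in the prompt (assumption: extend this as the stylesheet grows).
KNOWN_CLASSES = {"qa-card", "question", "answer", "callout", "styled-table"}


class FragmentChecker(HTMLParser):
    """Collects forbidden tags and unknown CSS classes from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.forbidden = []
        self.unknown_classes = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in FORBIDDEN_TAGS:
            self.forbidden.append(tag)
        for name, value in attrs:
            if name == "class" and value:
                self.unknown_classes.update(set(value.split()) - KNOWN_CLASSES)


def check_fragment(html):
    """Return (forbidden_tags, unknown_classes) found in the fragment."""
    checker = FragmentChecker()
    checker.feed(html)
    checker.close()
    return checker.forbidden, sorted(checker.unknown_classes)
```

A non-empty first element suggests the model returned a full document rather than a fragment; a non-empty second element lists classes that have no matching CSS yet, such as a callout variant.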

Then in write_note(), after the LLM generates HTML content, write it directly:

def write_note(file_path, front_matter, content):
    # content is already HTML from the LLM
    with open(file_path, "w", encoding="utf-8") as file:
        file.write(front_matter + "\n\n" + content + "\n")
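To see what ends up on disk, here is a small round trip through `write_note()` as defined above. The front matter string, the HTML content, and the file name are made up for illustration; the real values would come from `format_front_matter()` and the LLM.

```python
import tempfile
from pathlib import Path


def write_note(file_path, front_matter, content):
    # Same shape as the function above: front matter, blank line, then content.
    with open(file_path, "w", encoding="utf-8") as file:
        file.write(front_matter + "\n\n" + content + "\n")


# Hypothetical inputs, for illustration only.
front_matter = "---\ntitle: Managing Relationship Information Sharing\nlayout: post\n---"
content = (
    '<div class="qa-card">\n'
    '<div class="question">How do I manage information sharing?</div>\n'
    "</div>"
)

with tempfile.TemporaryDirectory() as tmp:
    note_path = Path(tmp) / "example-note.md"
    write_note(note_path, front_matter, content)
    text = note_path.read_text(encoding="utf-8")

# Jekyll only reads front matter if the file begins with the '---' line.
assert text.startswith("---\n")
```

The blank line between the front matter and the first HTML block keeps the two visually separate in the file; nothing before the opening `---` should be written, or Jekyll will not treat the block as front matter.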

My Honest Take

The Anthropic observation is about AI-generated visual artifacts — dashboards, interactive diagrams, styled documents. Your notes are textual Q&A content. For your use case:

My recommendation: Approach 3. Keep the pipeline, change the prompt to emit HTML content blocks, add CSS. This gives you the visual richness without rewriting the entire toolchain. Keeping the .md extension costs little: kramdown passes block-level HTML through to the output, so the markup renders much as it would from an .html file.

The one thing you’d lose: fix_liquid_raw_tags() and fix_mathjax_in_file() assume markdown fences. If the LLM generates HTML, you’d need to adjust those post-processing steps. But that’s 2-3 functions, not 20 files.
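One adjustment of this kind can be sketched as follows. Jekyll runs Liquid over any file with front matter, so `{{ ... }}` or `{% ... %}` inside an HTML code sample would be interpreted unless it is wrapped in `{% raw %}`. The helper below is a hypothetical stand-in, not the existing `fix_liquid_raw_tags()`, and it assumes code samples arrive as `<pre>` blocks.

```python
import re

# Matches a <pre>...</pre> block, including attributes on the opening tag.
PRE_BLOCK = re.compile(r"<pre\b[^>]*>.*?</pre>", re.DOTALL | re.IGNORECASE)


def protect_liquid_in_pre(html):
    """Wrap <pre> blocks that contain Liquid delimiters in raw/endraw tags."""

    def wrap(match):
        block = match.group(0)
        # Only touch blocks that Liquid would otherwise try to interpret.
        if "{{" in block or "{%" in block:
            return "{% raw %}" + block + "{% endraw %}"
        return block

    return PRE_BLOCK.sub(wrap, html)
```

This sketch is deliberately simple: it should run once per note, since a second pass would wrap already-protected blocks again, and it does not handle a code sample that itself contains `{% endraw %}`.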

