Training GPT-2 on SEC Filings | Generated by AI

2026.06

Here’s the full picture:

== DATASET == SEC-EDGAR: all 10-K annual reports from the SEC’s EDGAR filing system.

== MODEL == GPT-2 124M (standard nanoGPT architecture)
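To make the "124M" figure concrete, here is a rough sketch that counts parameters for the standard GPT-2 small shape. The hyperparameters (12 layers, 12 heads, 768-dim embeddings, 1024-token context, 50257-token vocabulary, biases on, tied input/output embeddings) are the usual GPT-2 small values and are assumed here; the note above doesn't spell them out. `GPTConfig` and `count_params` are illustrative names for this sketch, not nanoGPT's own code.

```python
from dataclasses import dataclass


@dataclass
class GPTConfig:
    # Assumed GPT-2 small hyperparameters (not stated in the note above).
    n_layer: int = 12
    n_head: int = 12
    n_embd: int = 768
    block_size: int = 1024
    vocab_size: int = 50257  # GPT-2 BPE vocab; nanoGPT pads this to 50304 by default


def count_params(cfg: GPTConfig) -> int:
    """Count trainable parameters, assuming biases and a tied output head."""
    d = cfg.n_embd
    wte = cfg.vocab_size * d               # token embeddings (shared with lm_head)
    wpe = cfg.block_size * d               # learned position embeddings
    ln = 2 * d                             # LayerNorm weight + bias
    attn = (d * 3 * d + 3 * d) + (d * d + d)      # qkv projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)   # up-projection + down-projection
    block = ln + attn + ln + mlp           # one transformer block
    return wte + wpe + cfg.n_layer * block + ln   # trailing ln is the final LayerNorm


total = count_params(GPTConfig())
print(f"{total:,} parameters ({total / 1e6:.1f}M)")  # 124,439,808 parameters (124.4M)
```

Roughly 39M of those parameters sit in the token embedding table, so only about 85M are in the transformer blocks themselves. With the padded 50304-token vocabulary the total rises slightly, to 124,475,904.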

== TRAINING ==

== TIME ==

== WHAT TO EXPECT ==

== USEFULNESS == This is a domain-specific LM for financial/SEC filings. Good for:

Limitations: at 124M parameters the model is small, so don’t expect GPT-4-level coherence. It’ll produce plausible SEC-sounding text but will struggle with complex reasoning or long-range consistency.

