SEC-EDGAR-GPT: A GPT-2 (124M) Language Model Trained from Scratch on SEC EDGAR Filings
Disclaimer: All training data is publicly available on Hugging Face. All experiments and training were conducted on my personal devices or cloud platforms using my personal accounts — no bank resources were used.
I trained a 124M-parameter GPT-2 from scratch on 1.55B tokens of SEC-EDGAR financial filings using a single RTX 4070. Training took ~8 hours and converged to a validation loss of 2.28.
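As a rough sanity check on these numbers, here is a minimal sketch that recomputes the parameter count and converts the validation loss to perplexity. It assumes the standard GPT-2 small layout (12 layers, 768-dim embeddings, 50,257-token vocabulary, 1,024-token context, biases enabled, tied input/output embeddings) and that the reported loss is cross-entropy in nats per token. Neither assumption is stated above, and the function names are illustrative only.

```python
import math


def gpt2_param_count(n_layer=12, n_embd=768, vocab_size=50257, block_size=1024):
    """Count parameters of a GPT-2 style decoder with tied input/output embeddings.

    Defaults are the standard GPT-2 small values (an assumption, not taken
    from the training config of this project).
    """
    wte = vocab_size * n_embd              # token embeddings (shared with output head)
    wpe = block_size * n_embd              # learned position embeddings
    ln = 2 * n_embd                        # LayerNorm weight + bias
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    block = ln + attn + ln + mlp           # one transformer block
    return wte + wpe + n_layer * block + ln  # plus the final LayerNorm


def perplexity(loss_nats):
    """Convert a per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(loss_nats)


print(f"{gpt2_param_count():,}")   # 124,439,808
print(f"{perplexity(2.28):.2f}")   # 9.78
```

Under these assumptions, a validation loss of 2.28 corresponds to a perplexity of about 9.8, meaning the model is on average about as uncertain as a uniform choice among roughly ten tokens at each position.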
SEC-EDGAR is the U.S. Securities and Exchange Commission’s public database of corporate filings — 10-K annual reports, 10-Q quarterly reports, and other disclosures from publicly traded companies.
The model generates convincing SEC boilerplate — risk factors, MD&A sections, business descriptions — but struggles with numerical consistency and long-range coherence, as expected at this scale.
Code: github.com/lzwjava/sec-edgar-gpt | Paper: sec-edgar-gpt.pdf
The entire 124M project (training, deployment, paper, and website) was completed in 3 days using Hermes Agent. With AI agents, LLM research and practice have become genuinely accessible.
Thanks to Andrej Karpathy’s nanoGPT for the training framework, the kapilrao/SEC-EDGAR dataset on Hugging Face, and Ming Jian Wei, Du Chun, and Parjanya Mudunuri for helpful discussions.
About me: I’m an AI full stack engineer at GFT, working with a global bank through a contract arrangement. Over the past year I’ve consumed ~3B LLM tokens and trained ~15 small models (up to 760M parameters) across RTX 4070, H200, B200, and AMD MI300X. I’ve made 5,000+ contributions on the bank’s internal GitHub and 11,000+ on public GitHub, with the help of AI tools and verified by human review. I’ve spent 2.5 years as a contract engineer at the bank, including WPB and GFT. I’ve given an AI talk to 80 bank peers to share hands-on experience. For about a month, I have also been working with Principal Engineer Parjanya Mudunuri and my lead Ming Jian Wei on a project involving large Excel file handling and a join/union tool to compare gaps and differences across multiple banking systems.
As the bank’s CEO put it, AI is becoming one of the defining technologies of our time. I hope this work can help the bank adopt AI tech a bit more.
Project site (deployed to Cloudflare Workers, please use a personal device to access):

sec-edgar-gpt.lzwjava.workers.dev

Chat: try the model (deployed to RunPod, please use a personal device to access; $0.24/hr, will be shut down in a few days):

gq8kq409jjxh7r-8888.proxy.runpod.net

Hugging Face model:

huggingface.co/lzwjava/sec-edgar-gpt-124m-hf
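
For anyone who prefers to run the model locally instead of through the chat link, a minimal sketch follows. It assumes the repository above is stored in the standard Hugging Face transformers format with a bundled tokenizer (the "-hf" suffix suggests this, but it is not confirmed here), and that the transformers and torch packages are installed. The helper name generate and the sampling settings are illustrative choices, not the project's own.

```python
MODEL_ID = "lzwjava/sec-edgar-gpt-124m-hf"  # repo name taken from the link above


def generate(prompt: str, max_new_tokens: int = 80) -> str:
    """Sample a continuation of `prompt` from the published checkpoint.

    Assumes the repo loads through the standard text-generation pipeline;
    the first call downloads the weights from the Hugging Face Hub.
    """
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,   # sample instead of greedy decoding
        top_k=50,         # illustrative setting, not tuned for this model
    )
    return outputs[0]["generated_text"]
```

Calling generate("Item 1A. Risk Factors") should return the prompt followed by sampled filing-style text. Because sampling is enabled, the output differs from run to run.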