The Frustration That Started It All
It was 2 AM. I’d been reading LLM training tutorials for six hours.
One tutorial told me to install 47 dependencies manually. Another assumed I had $10,000 worth of GPUs lying around. A third just… stopped halfway through with “figure out the rest yourself.”
I’m a third-year CS student at Mumbai University. I don’t have a research lab. I don’t have unlimited cloud credits. I just wanted to understand how these things work by building one myself.
That’s when the idea hit me:
What if training an LLM was as easy as npx create-next-app?
One command. Everything ready. Just start training.
That’s how create-llm was born.
I love how Vercel made starting a web app stupid simple:
npx create-next-app my-app
npm run dev
# You have a website
Why couldn’t LLM training be the same?
npx create-llm my-llm
python train.py
# You have a language model
No scattered tutorials. No dependency hell. No “works on my machine.”
Just: scaffold → train → deploy.
Simple.
I started building with pure enthusiasm and zero idea what I was getting into.
Initial plan:
CLI tool in TypeScript (scaffold projects)
Python training code (PyTorch)
Templates (tiny, small, base)
One command to rule them all
Reality check: Nothing worked. Everything broke. I loved it.
Getting the CLI to generate a project structure was surprisingly fun. Running npx create-llm test and seeing files appear? Magic.
That first dopamine hit kept me going through what came next.
This is where reality hit hard.
My first training run showed perplexity of 1.0 — the model memorized everything instead of learning.
After hours of debugging, I found it:
My config had vocab_size: 32000 hardcoded, but my tokenizer only created 423 tokens.
The math:
Model allocated: 32,000 × 768 = 24,576,000 parameters
Actually used: 423 × 768 = 324,864 parameters
Wasted: 24,251,136 parameters (about 98.7% of the embedding layer!)
With 23M parameters but only 423 tokens being trained, it memorized instantly.
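The arithmetic above is easy to reproduce. Here is a minimal sketch that recomputes it; the helper name `embedding_params` is made up for illustration, and the only inputs are the three numbers quoted above (32,000, 423, and 768).

```python
def embedding_params(vocab_size: int, d_model: int) -> int:
    """Number of weights in a token-embedding table of shape (vocab_size, d_model)."""
    return vocab_size * d_model


configured_vocab = 32_000  # value hardcoded in the config
actual_vocab = 423         # tokens the tokenizer actually produced
d_model = 768              # embedding width used in the math above

allocated = embedding_params(configured_vocab, d_model)
used = embedding_params(actual_vocab, d_model)
wasted = allocated - used

print(f"allocated: {allocated:,}")
print(f"used:      {used:,}")
print(f"wasted:    {wasted:,} ({wasted / allocated:.1%})")
```

Rows of the embedding table that never appear in the data receive no useful gradient, so they are dead weight in memory and in the parameter count.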
The fix: Auto-detect vocab size from tokenizer.
# Before
vocab_size = config['model']['vocab_size']  # 32000
# After
if config['model']['vocab_size'] == 'auto':
    vocab_size = tokenizer.get_vocab_size()  # 423
    print(f"Auto-detected vocab_size: {vocab_size}")
else:
    vocab_size = config['model']['vocab_size']  # an explicit value is still honored
Lesson learned: Don’t hardcode what should be dynamic.
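The same idea can be pushed one step further by checking an explicit value against the tokenizer instead of trusting it. This is only a sketch under assumptions: `resolve_vocab_size` and `StubTokenizer` are hypothetical names, and the only thing borrowed from the snippet above is the `'auto'` setting and a tokenizer exposing `get_vocab_size()`. It is not necessarily how create-llm does it.

```python
class StubTokenizer:
    """Stand-in for a trained tokenizer; only get_vocab_size() is needed here."""

    def __init__(self, vocab_size: int) -> None:
        self._vocab_size = vocab_size

    def get_vocab_size(self) -> int:
        return self._vocab_size


def resolve_vocab_size(config: dict, tokenizer) -> int:
    """Pick the vocab size for the model (hypothetical helper, not create-llm's API)."""
    actual = tokenizer.get_vocab_size()
    configured = config["model"].get("vocab_size", "auto")
    if configured == "auto":
        return actual
    configured = int(configured)
    if configured < actual:
        # Token ids would index past the end of the embedding table.
        raise ValueError(
            f"vocab_size={configured} is smaller than the tokenizer's {actual} tokens"
        )
    if configured > actual:
        print(f"Warning: {configured - actual} embedding rows will never be trained")
    return configured


tokenizer = StubTokenizer(423)
print(resolve_vocab_size({"model": {"vocab_size": "auto"}}, tokenizer))
print(resolve_vocab_size({"model": {"vocab_size": 32000}}, tokenizer))
```

Raising on a too-small value and only warning on a too-large one is a design choice: the first crashes training, the second just wastes parameters.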
Even with correct vocab, my tiny model (23M params) was overfitting on small datasets.
The rule I learned:
1M parameters needs ~10K examples minimum
10M parameters needs ~100K examples minimum
Your model should match your data
I restructured templates:
nano: 200K-700K params (1–2 min training, learning tool)
tiny: 2–5M params (5–10 min, actually usable)
small: 50–100M params (1–3 hours, production)
base: 500M-1B params (days, research)
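To make the rule of thumb and the template tiers concrete, here is a rough sketch that turns them into a lookup. The parameter ranges are copied from the list above and the 100-parameters-per-example ratio comes from "1M parameters needs ~10K examples". The function names and the "largest tier whose lower bound fits" selection logic are assumptions for illustration, not how create-llm actually picks a template.

```python
PARAMS_PER_EXAMPLE = 100  # from "1M parameters needs ~10K examples"

# (name, smallest size, largest size) in parameters, as listed above
TEMPLATES = [
    ("nano", 200_000, 700_000),
    ("tiny", 2_000_000, 5_000_000),
    ("small", 50_000_000, 100_000_000),
    ("base", 500_000_000, 1_000_000_000),
]


def min_examples(num_params: int) -> int:
    """Rough minimum number of training examples for a model of this size."""
    return num_params // PARAMS_PER_EXAMPLE


def suggest_template(num_examples: int) -> str:
    """Largest tier whose smallest model the data can support; nano as fallback."""
    budget = num_examples * PARAMS_PER_EXAMPLE
    choice = TEMPLATES[0][0]
    for name, low, _high in TEMPLATES:
        if low <= budget:
            choice = name
    return choice


print(min_examples(1_000_000))    # 10000
print(suggest_template(5_000))    # nano
print(suggest_template(50_000))   # tiny
```

Treat the output as a starting point: the ratio is a heuristic, and what counts as one "example" depends on how the data is chunked.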
It worked on my Mac. It broke on Windows. Classic.
UTF-8 encoding, path separators, torch.load warnings — I fixed them all one by one.
Windows users deserve love too.
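For the encoding and path-separator half of that list, the usual pattern looks something like the sketch below. It is a generic illustration, not the exact patch: `read_corpus` and `write_corpus` are made-up helper names, and the file layout only mirrors the `data/raw/sample.txt` path from the quick start.

```python
import tempfile
from pathlib import Path


def write_corpus(path: Path, text: str) -> None:
    """Write text as UTF-8 regardless of the platform's default encoding."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")


def read_corpus(path: Path) -> str:
    """Read text as UTF-8; without encoding=, Windows may fall back to a legacy code page."""
    return path.read_text(encoding="utf-8")


text = "Once upon a time: café, नमस्ते"
with tempfile.TemporaryDirectory() as tmp:
    # Joining with / on Path objects avoids hardcoding "/" or "\\" in strings.
    sample = Path(tmp) / "data" / "raw" / "sample.txt"
    write_corpus(sample, text)
    roundtrip = read_corpus(sample)

print(roundtrip == text)
```

The `torch.load` warning is a separate matter. Recent PyTorch versions accept a `weights_only` argument on `torch.load`, which is one common way to address it, though whether that matches the fix used here is an assumption.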
After fixing vocab size, I trained the nano template. It generated:
You: "Once upon a "
Model: "time time time time time..."
Mode collapse! My first instinct: “It’s broken, hide it.”
Then I realized: This is EDUCATIONAL.
Beginners SHOULD see mode collapse. They should understand:
Why model size matters
Why data quality matters
What overfitting looks like
How to fix it
I rewrote the nano template docs:
“nano is intentionally small. It will show mode collapse with limited data. That’s the point — you learn by seeing what goes wrong, then fixing it.”
Honesty > perfection.
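If you want to spot this kind of output programmatically, a crude check is to measure how much of a generation is one repeated word. The sketch below is a toy heuristic, not something shipped in create-llm; the function names and the 0.5 threshold are arbitrary assumptions.

```python
from collections import Counter


def repetition_ratio(text: str) -> float:
    """Fraction of the words that are the single most common word."""
    words = text.lower().split()
    if not words:
        return 0.0
    most_common_count = Counter(words).most_common(1)[0][1]
    return most_common_count / len(words)


def looks_collapsed(text: str, threshold: float = 0.5) -> bool:
    """Flag output where one word makes up at least `threshold` of the text."""
    return repetition_ratio(text) >= threshold


print(looks_collapsed("time time time time time"))                 # True
print(looks_collapsed("there was a small dragon who loved tea"))   # False
```

A single-word ratio misses longer loops such as a repeated phrase, so a more careful version would count repeated n-grams instead.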
I added overfitting detection:
if perplexity < 1.1:
    print("⚠️ WARNING: Perplexity < 1.1 indicates severe overfitting!")
    print(" Suggestions:")
    print(" - Add more training data")
    print(" - Increase dropout")
    print(" - Reduce model size")
Users started thanking me for these warnings. They learned faster because the tool taught them.
My tool isn’t just a CLI — it’s a teacher.
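For context on where that number comes from: perplexity is the exponential of the mean cross-entropy loss, so a perplexity of 1.0 means a loss of 0. The sketch below shows that relationship plus a second check that is not in the original warning and is purely an assumption here: comparing training and validation perplexity. The function names and the `gap_ratio` default are illustrative only.

```python
import math
from typing import List


def perplexity(mean_cross_entropy: float) -> float:
    """Perplexity is exp of the mean per-token cross-entropy (in nats)."""
    return math.exp(mean_cross_entropy)


def overfitting_warnings(
    train_loss: float,
    val_loss: float,
    ppl_floor: float = 1.1,   # threshold from the warning above
    gap_ratio: float = 1.5,   # assumed value, tune for your data
) -> List[str]:
    """Collect human-readable warnings from training and validation loss."""
    warnings = []
    train_ppl = perplexity(train_loss)
    val_ppl = perplexity(val_loss)
    if train_ppl < ppl_floor:
        warnings.append(f"Train perplexity {train_ppl:.2f} is below {ppl_floor}")
    if val_ppl > gap_ratio * train_ppl:
        warnings.append(
            f"Validation perplexity {val_ppl:.2f} is far above train {train_ppl:.2f}"
        )
    return warnings


for message in overfitting_warnings(train_loss=0.05, val_loss=2.0):
    print("WARNING:", message)
```

The gap check catches memorization earlier than the floor does, because validation loss usually starts climbing well before training perplexity gets anywhere near 1.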
The nano template trains in 60 seconds. People love this.
Why? Instant gratification.
“I ran one command and trained my own LLM in a minute” is way more powerful than “I spent 3 days setting up CUDA.”
Fast iteration = more learning = better outcomes.
I started posting updates on Twitter.
Day 1: “Building create-llm: npx create-next-app but for LLMs”
Response: 3 likes
Day 15: “Just fixed the vocab size mismatch bug [screenshot]”
Response: 50 likes, 5 people wanting to beta test
Day 30: “create-llm now has auto-detection and overfitting warnings”
Response: 200+ likes, people asking when it launches
Building in public was scary but worth it. Real-time feedback shaped the product.
Final stretch. Everything worked but nothing felt “done.”
I added:
Live training dashboard (Flask + SocketIO)
Model comparison tool
Deployment to HuggingFace
Comprehensive docs
29 out of 30 tasks on my checklist completed
But I kept finding “one more thing” to fix.
Classic trap: Perfectionism masquerading as thoroughness.
I almost didn’t launch because “it’s not ready yet.”
Then I realized: It trains models. It generates text. It has docs. It has validation.
It’s ready. I’m just scared.
I had 95% of features done for 2 weeks. I kept adding “just one more thing.”
Finally shipped. Users loved it. The “missing” 5% didn’t matter.
Lesson: Shipping beats perfecting.
The nano template shows mode collapse. I could’ve hidden it.
Instead, I documented it: “This is a learning tool. It will overfit. That’s educational.”
Users appreciated the honesty more than fake polish.
I built create-llm because I wished it existed when I started learning.
That clarity — “I’m building for past-me” — made every decision easier.
Adding overfitting warnings, vocab size checks, and model size recommendations made the tool better than competitors.
Don’t just build tools. Build teachers.
Every bug taught me something:
Vocab mismatch → parameter efficiency
Overfitting → model sizing
Mode collapse → training dynamics
I learned more from bugs than tutorials.
The best part wasn’t writing code. It was people saying:
“I finally understand how LLMs work!”
“I trained my first model today!”
“This tool saved me hours!”
Building alone is coding. Building with community is impact.
create-llm v1.0 is just the beginning.
SynthexAI integration (synthetic data generation)
Better benchmarking (tokens/sec, FTL, RAM usage)
More model architectures (BERT, T5)
Template marketplace
Cloud training platform (train on our GPUs)
Model hosting (get API endpoints)
Collaborative features (share configs, compare results)
Full “Vercel for LLMs” platform
One-click deploy
Model marketplace
Pay-as-you-go pricing
The dream: Make custom LLMs as accessible as creating websites.
After 12 weeks:
50+ active users
Featured in AI newsletters
10+ production deployments
But the real win? The messages:
“I got my first ML job because of this project.”
“I finally understand transformers now.”
“Teaching my students with create-llm.”
That’s the impact I wanted.
Want to train your first LLM?
npx create-llm my-first-llm --template nano
cd my-first-llm
python tokenizer/train.py --data data/raw/sample.txt
python data/prepare.py
python training/train.py
python chat.py --checkpoint checkpoints/checkpoint-best.pt
60 seconds later, you’re chatting with your own model.
Not perfect. But yours.
Three months ago, I was frustrated by complex tutorials.
Today, I’ve built a tool that helps people learn how LLMs work.
The journey taught me:
Building is learning
Shipping beats perfecting
Community is everything
Your frustration is someone else’s too
If you’re stuck on a problem, build the solution. Someone else needs it too.
And maybe, just maybe, you’ll change how people learn.
Documentation: docs
Twitter: @theaniketgiri
To everyone who:
Starred the repo
Filed issues
Contributed code
Shared feedback
Believed in the vision
This is for you. And for everyone who’s ever felt frustrated trying to learn ML.
Let’s make AI accessible together.
Built with ❤️ by Aniket Giri
CS (AIML) Student | Building in public
Found this helpful? Star the repo, share the post, or just say hi on Twitter. I read everything.
Want to contribute? We’re always looking for help with docs, features, and examples.
Have questions? Drop them in the comments. I respond to every single one.
Tags: #machinelearning #ai #llm #opensource #buildinpublic #indiehacker #startup #developer #python #typescript