TL;DR: I recreated Git’s core features from scratch using Python. What started as a side project turned into a deep dive into Git internals, SHA-1 hashing, content-addressable storage, and why HEAD is
I’m a full-stack developer with a curiosity problem. I don’t like just using tools — I want to know how they work.
One such tool? Git.
We use it every day — git commit, git push, git panic.
But let’s be honest: most of us treat Git like magic.
So I did something wild:
I built my own Git clone in Python.
I call it MiniGit — and yes, it actually works.
This post is about why I did it, how it works, and what I learned.
Because understanding Git is like understanding the Matrix. Once you see how it works, you realize:
Commits are just text files
History is just linked SHA hashes
HEAD is just a pointer
And the “black box” isn’t scary at all
It also helped me:
Understand version control at a systems level
Gain confidence to debug Git issues in real projects
Create something cool for my portfolio
MiniGit supports the bare-bones features of Git:
| Command | What it does |
| --- | --- |
| init | Creates a .minigit/ folder to store everything |
| hash-object <file> | Saves a file as a blob with SHA-1 |
| write-tree | Snapshots the directory into a tree |
| commit -m "msg" | Creates a commit object with parent & message |
| log | Prints commit history like git log |
It has no branches, remotes, or merging. Just pure Git internals: clean, understandable, and hands-on.
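To make the init row concrete, here is a minimal sketch of what such a command could do. The function name `init`, the `objects/` and `refs/heads/` subfolders, and the initial HEAD content are assumptions borrowed from Git’s own layout, not taken from MiniGit’s source.

```python
from pathlib import Path


def init(workdir: Path) -> Path:
    """Create an empty repository folder inside workdir and return its path."""
    repo = workdir / ".minigit"
    # Assumed layout, mirroring Git: one folder for objects, one for branch refs.
    (repo / "objects").mkdir(parents=True, exist_ok=True)
    (repo / "refs" / "heads").mkdir(parents=True, exist_ok=True)
    head = repo / "HEAD"
    if not head.exists():
        # HEAD starts out naming a branch that has no commits yet.
        head.write_text("ref: refs/heads/master\n")
    return repo
```

Running it twice is harmless: existing folders are reused and an existing HEAD is left alone.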
Think of hash-object as storing your file in a secret vault.
MiniGit wraps your file content like this:
blob <size>\0<content>
Then it SHA-1 hashes the result and stores it in .minigit/objects/<hash>.
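Here is a small sketch of that step. The function name `hash_object` and its `objects_dir` parameter are hypothetical; the header format and the flat `objects/<hash>` layout follow the description above. Real Git additionally zlib-compresses each object and files it under a subfolder named after the first two hex characters of the hash.

```python
import hashlib
from pathlib import Path


def hash_object(data: bytes, objects_dir: Path) -> str:
    """Store data as a blob and return its SHA-1 hex digest."""
    # The header is "blob <size>" plus a NUL byte, followed by the raw content.
    wrapped = b"blob " + str(len(data)).encode() + b"\x00" + data
    sha = hashlib.sha1(wrapped).hexdigest()
    objects_dir.mkdir(parents=True, exist_ok=True)
    path = objects_dir / sha
    # Content-addressable: identical content always lands at the same path,
    # so an existing object never needs to be rewritten.
    if not path.exists():
        path.write_bytes(wrapped)
    return sha
```

Because the header matches Git’s, the hash for a given file should equal what `git hash-object <file>` prints, which makes for an easy sanity check.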
When you run write-tree, it scans your files, hashes each one, and builds a “tree object” like:
100644 blob a1b2c3 file.txt
Imagine taking a group photo of your current directory. That’s a tree.
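A rough sketch of how a tree could be built from those lines. The helper names `store` and `write_tree` are hypothetical, and two simplifications are assumed: only files directly inside the folder are included, and entries are plain text lines in the format shown above (real Git uses a binary entry format and nests subdirectories as trees).

```python
import hashlib
from pathlib import Path


def store(kind: str, body: bytes, objects_dir: Path) -> str:
    """Write kind, size, a NUL byte and body under their SHA-1; return the hash."""
    data = kind.encode() + b" " + str(len(body)).encode() + b"\x00" + body
    sha = hashlib.sha1(data).hexdigest()
    objects_dir.mkdir(parents=True, exist_ok=True)
    (objects_dir / sha).write_bytes(data)
    return sha


def write_tree(workdir: Path, objects_dir: Path) -> str:
    """Snapshot the regular files directly inside workdir as one tree object."""
    lines = []
    # Sorting keeps the tree hash stable regardless of directory listing order.
    for path in sorted(workdir.iterdir()):
        if not path.is_file():
            continue  # assumption: flat snapshot, so .minigit/ and subfolders are skipped
        blob_sha = store("blob", path.read_bytes(), objects_dir)
        lines.append(f"100644 blob {blob_sha} {path.name}")
    return store("tree", "\n".join(lines).encode(), objects_dir)
```

An unchanged directory always produces the same tree hash, and editing any file changes it. That is the property the commit step relies on.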
A commit points to:
a tree (snapshot of files)
a parent (previous commit)
a message
and your name + timestamp
It’s like putting a photo album in a safe, with a sticky note saying:
“Here’s what everything looked like on this date.”
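As a sketch, a commit built from those four pieces could look like this. The names `build_commit` and `commit_id` are hypothetical, and the exact field layout (an `author` line carrying name and Unix timestamp, then a blank line, then the message) is an assumption loosely modeled on Git’s commit format rather than MiniGit’s actual output.

```python
import hashlib
import time
from typing import Optional


def build_commit(tree: str, parent: Optional[str], author: str,
                 message: str, timestamp: Optional[int] = None) -> bytes:
    """Return the raw bytes of a commit object: header, fields, message."""
    when = int(time.time()) if timestamp is None else timestamp
    lines = [f"tree {tree}"]
    if parent is not None:
        # The very first commit has no parent line at all.
        lines.append(f"parent {parent}")
    lines.append(f"author {author} {when}")
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode()
    return b"commit " + str(len(body)).encode() + b"\x00" + body


def commit_id(raw: bytes) -> str:
    """A commit's identity is the SHA-1 of its full contents."""
    return hashlib.sha1(raw).hexdigest()
```

Since the parent’s hash is part of the hashed content, changing any earlier commit would change the hash of every commit after it.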
Git’s HEAD is just a file:
ref: refs/heads/master
That ref names another file, and that file holds the latest commit’s hash.
Mind. Blown.
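Following that pointer takes only a few lines. This sketch assumes a repository folder laid out like Git’s (a `HEAD` file plus ref files under `refs/heads/`); the function names `resolve_head` and `update_head` are hypothetical.

```python
from pathlib import Path
from typing import Optional


def resolve_head(repo: Path) -> Optional[str]:
    """Return the commit hash HEAD points at, or None if there are no commits."""
    head = (repo / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        ref_file = repo / head[len("ref: "):]
        if not ref_file.exists():
            return None  # the branch exists in name only: nothing committed yet
        return ref_file.read_text().strip()
    return head  # detached HEAD: the file holds a hash directly


def update_head(repo: Path, commit_sha: str) -> None:
    """Move the current branch (or a detached HEAD) to a new commit."""
    head = (repo / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        target = repo / head[len("ref: "):]
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(commit_sha + "\n")
    else:
        (repo / "HEAD").write_text(commit_sha + "\n")
```

Note that committing never rewrites HEAD itself while it names a branch. Only the branch file moves.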
python minigit.py init
echo "Hello world" > a.txt
python minigit.py commit -m "Initial commit"
echo "More content" >> a.txt
python minigit.py commit -m "Updated a.txt"
python minigit.py log
You’ll see the commits print out, parent chains included.
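Under the hood, a log command only has to follow parent pointers until they run out. A minimal sketch, assuming commits are stored uncompressed as a header, a NUL byte, `key value` lines, a blank line, and the message; `parse_commit` and `walk_history` are hypothetical names.

```python
from pathlib import Path
from typing import Iterator, Optional, Tuple


def parse_commit(raw: bytes) -> Tuple[Optional[str], str]:
    """Return (parent hash or None, message) from a stored commit object."""
    _, body = raw.split(b"\x00", 1)  # drop the "commit <size>" header
    headers, _, message = body.decode().partition("\n\n")
    parent = None
    for line in headers.splitlines():
        if line.startswith("parent "):
            parent = line.split(" ", 1)[1]
    return parent, message.strip()


def walk_history(objects_dir: Path, start: Optional[str]) -> Iterator[Tuple[str, str]]:
    """Yield (hash, message) pairs from the newest commit back to the first."""
    sha = start
    while sha:
        parent, message = parse_commit((objects_dir / sha).read_bytes())
        yield sha, message
        sha = parent  # None on the first commit, which ends the loop
```

With no merges there is at most one parent per commit, so a plain loop is enough. Supporting merges would mean tracking several parents and visiting each commit once.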
Git is just a content-addressable file system
SHA-1 is everything — identity is defined by content
Commits are just linked text files: each one names its parent, so history is a linked list (and, once merges exist in full Git, a DAG)
Version control is just smart snapshots
HEAD is not scary — it’s literally just a pointer
I also learned how to:
Work with raw bytes and file systems
Build immutable data structures
Respect Git instead of fear it
Pro Git Book — the best deep dive
git cat-file -p <sha> — to inspect Git internals
ChatGPT — for sanity checks and metaphors (ofc) 😅
If you want to truly understand Git — or just want a mind-expanding project — build your own version.
You’ll:
Think more clearly about systems design
Stop being afraid of Git
And earn massive bragging rights
You can check out the project and code here:
🔗 GitHub Repo →
If you’re curious, want to collaborate, or have questions — reach out!
I love talking devtools, side projects, and why Git is secretly beautiful.
🧡 Thanks for reading!
Built with Python, curiosity, and too much tea.
By Niyati Nehal — full-stack dev, builder of things, destroyer of bugs.