
Not a wrapper. Not a library call. A real, working storage engine — written in C++, exposed over gRPC, running inside Docker. This is the story of how I built it and why.
I needed to store structured data in a very specific shape:
A document (HTML, text, anything)
A chain of inputs attached to that document
An expected output for each input
Simple enough, right? But every database I looked at felt wrong.
A relational database wanted me to define a schema upfront and write JOIN queries just to read a document with its inputs. A key-value store was too flat — I'd have to model the relationships myself. A document database handled the top level fine but got awkward when I needed an ordered, linked chain of sub-documents.
So I built exactly what I needed. Nothing more.
Stratum is a log-structured, hierarchical storage engine with O(1) lookup, built in C++ and exposed over gRPC so any language can talk to it.
Here's the data model:
Document ──→ Input node 1 ──→ Input node 2 ──→ Input node 3
│ │ │
▼ ▼ ▼
Output node 1 Output node 2 Output node 3
Every document lives at a known byte offset on disk. An in-memory hash map holds document_id → byte_offset. Every read is one hash map lookup (O(1)) plus one disk seek. No indexes to maintain, no query planner, no surprises.
Every write — insert, update, delete — is an append to the end of the active segment file. Nothing is ever mutated in place.
When you update a document, the old version stays on disk. The new version is appended. The in-memory index is updated to point at the new offset. The old version becomes garbage.
This means writes are always O(1) and you never get partial writes corrupting your data.
When the active segment grows beyond a configured threshold, it gets sealed — renamed to seg_NNN.seg — and a fresh active.seg is opened. You end up with a stack of segments on disk, which is where the name comes from. Like geological strata.
A background C++ thread wakes periodically and checks total segment size. When it crosses a threshold, it:
Scans all sealed segments
For every record ID, keeps only the version with the highest timestamp
Discards tombstoned (deleted) records entirely
Writes a single merged.seg
Atomically replaces the old segments
Storage only grows proportionally to live data — not to total write history. This is the same strategy used by Bitcask and LSM-tree databases like RocksDB, just much simpler.
Reads use a std::shared_mutex — any number of readers can run concurrently. Writes take an exclusive lock only to update the in-memory index. The disk append itself is serialized by a per-segment mutex.
Your Python/Go/Node app
│
│ gRPC (auto-generated client from a single .proto file)
▼
Stratum server (C++, always running)
│
├── In-memory hash index ← O(1) lookups
├── Segment manager ← append-only log files
└── Compactor thread ← background GC
│
▼
Disk (log-structured .seg files)
The gRPC part is what makes this useful beyond a single project. You write the .proto file once, run protoc, and get a client in Python, Go, Java, Node, Rust — any language gRPC supports. The C++ engine doesn't care who's calling it.
One thing I underestimated early on: input and output nodes can hold anything. An integer. A string. A list of integers. A list of strings. A map.
I ended up using std::variant in C++ to represent this:
using ValueVariant = std::variant<
int64_t,
double,
std::string,
std::vector<int64_t>,
std::vector<std::string>,
std::unordered_map<std::string, std::string>
>;
On the wire (gRPC), this becomes a oneof in the proto definition:
message Value {
oneof kind {
int64 int_val = 1;
double double_val = 2;
string str_val = 3;
Int64List vec_int = 4;
StringList vec_str = 5;
StringMap map_val = 6;
}
}
The Python client wraps this so you never think about it — you just pass a Python int, list, or dict and the client figures out the right proto type.
Once the Docker container is running, this is the entire client-side API:
from lse_client import LseClient
lse = LseClient("localhost:50051")
# Create a document
doc_id = lse.create_problem(
"<h1>My document</h1>",
category="tutorial",
version="1.0"
)
# Add input nodes — any data type
node1 = lse.add_test_case(doc_id, [2, 7, 11, 15])
node2 = lse.add_test_case(doc_id, "some string")
node3 = lse.add_test_case(doc_id, {"key": "value"})
# Attach output nodes
lse.set_expected_output(doc_id, node1, [0, 1])
# Read everything back
doc = lse.get_problem_html(doc_id)
nodes = lse.get_all_test_cases(doc_id)
out = lse.get_expected_output(doc_id, node1)
No HTTP, no JSON parsing, no URL construction. Just function calls. The generated gRPC stub handles serialization, connection management, and error handling.
The engine ships as a Docker container. Starting it is one command:
docker-compose up --build
That's it. The server starts on port 50051. Data is written to a named Docker volume so it survives container restarts and rebuilds. To stop: docker-compose down. Data stays. To wipe everything: docker-compose down -v.
1. Append-only is underrated. The moment I stopped trying to update records in place, the entire concurrency problem got simpler. Readers and writers can't conflict on the same byte offset because writers never touch old offsets.
2. Separate the index from the data. The in-memory hash index is tiny — just id → offset. It loads from disk on startup in milliseconds. The actual data — potentially gigabytes of blobs — never touches RAM until you ask for it. This is the core insight behind Bitcask.
3. gRPC over a custom protocol every time. I briefly considered rolling a custom TCP protocol. The moment I saw what a .proto file + protoc gives you — a typed, versioned, language-agnostic API with generated clients in every language — there was no reason to do anything else.
4. The compactor is the hardest part. Not because the algorithm is complex, but because it has to be correct under concurrent access. Getting the rename-then-reload sequence right — so readers never see a half-compacted state — took more iteration than anything else in the codebase.
5. CRC on every record. I added CRC-32 checksums on every record header from day one. It caught two bugs during development that would have been nearly impossible to find otherwise. Storage engines fail silently at the byte level. Checksums surface those failures immediately.
Stratum is open source. You can use it for anything that fits the document → inputs → outputs model. I'm actively working on:
A write-ahead log (WAL) for crash recovery
Snapshot / backup support
Benchmarks against SQLite for this specific access pattern
A Rust client
If you build something with it, I'd genuinely love to know.
Docker image link: https://hub.docker.com/r/electrogeek/stratum-server
The .proto file (the full API contract): proto/lse.proto
Quick start: docker-compose up --build then pip install grpcio grpcio-tools
If you have questions about the implementation — compaction strategy, the variant serialization, the gRPC wiring — drop them in the replies. Happy to go deep on any of it.
0
1
0