Shikhil Saxena

Nov 09, 2025 • 1 min read

What I learned from the book Designing Data-Intensive Applications

A summary of Milan's review of Designing Data-Intensive Applications (DDIA), highlighting how the book reshaped his understanding of distributed systems, databases, and architectural trade-offs.

🧠 Why DDIA Matters

  • Milan read DDIA twice, first in 2018 and again in 2023, and found it transformative.

  • The book bridges theory and practice, helping engineers reason about reliability, scalability, and maintainability.

🔍 Key Learnings

1. Foundational Concepts

  • Reliability: Systems must work even when things go wrong.

  • Scalability: Keep performance acceptable as load grows.

  • Maintainability: Design for evolution and operational ease.

2. Data Models

  • Relational: Well suited to joins, many-to-many relationships, and transactional consistency.

  • Document: Great for flexible schemas and nested data.

  • Graph: Ideal for highly connected data (e.g., social networks).
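To make the three models concrete, here is a minimal sketch in plain Python. The users, positions, and follow edges are hypothetical sample data, not taken from the book; the point is only how the shape of the data changes the query.

```python
# Relational style: flat tables linked by a key, combined with a join.
users = [{"user_id": 1, "name": "Ada"}]
positions = [
    {"user_id": 1, "title": "Engineer", "org": "Acme"},
    {"user_id": 1, "title": "Architect", "org": "Globex"},
]

def positions_for(user_id):
    """Join-like lookup: scan the positions table for matching rows."""
    return [p["title"] for p in positions if p["user_id"] == user_id]

# Document style: the same data nested inside one self-contained record,
# so a single read returns the whole profile without a join.
user_doc = {
    "user_id": 1,
    "name": "Ada",
    "positions": [
        {"title": "Engineer", "org": "Acme"},
        {"title": "Architect", "org": "Globex"},
    ],
}

# Graph style: relationships are the main thing being queried.
follows = {"ada": {"bob"}, "bob": {"cy"}, "cy": set()}

def reachable(start):
    """Everyone reachable from `start` by following edges (iterative DFS)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in follows.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen
```

The relational and document versions answer the same question in different ways, while the traversal in `reachable` is the kind of variable-depth query that tends to be awkward to express as joins.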

3. Storage Engines

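  • B-trees: Typically faster reads and slower writes; they update fixed-size pages in place (used in PostgreSQL, MySQL).

  • LSM-trees: Typically faster writes and slower reads; they append to sorted files that are merged and compacted in the background (used in Cassandra, RocksDB).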

  • Trade-offs between read/write performance and complexity.
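The write-versus-read trade-off of an LSM-tree can be sketched with a toy in-memory version. This is an illustration under simplifying assumptions, not how Cassandra or RocksDB are implemented: the class name `TinyLSM` and the `memtable_limit` parameter are made up, segments live in memory rather than on disk, and there is no compaction or write-ahead log.

```python
import bisect

class TinyLSM:
    """Toy LSM-style store: writes hit a memtable, reads may scan segments."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}           # recent writes, unsorted
        self.segments = []           # flushed sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # A write is a cheap in-memory update; no existing segment is modified.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # Flush: freeze the memtable as an immutable sorted segment.
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # A read checks the memtable, then segments from newest to oldest,
        # so it can touch several structures before finding the key.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            i = bisect.bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None
```

Because newer segments are searched first, an overwritten key resolves to its latest value even though the old value still sits in an older segment. Real engines reclaim that space through compaction.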

4. Replication & Consistency

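  • Single-leader: All writes go through one node; simple to reason about, though asynchronous followers can serve stale reads.

  • Multi-leader: Accepts writes on several nodes, so concurrent writes must be detected and resolved.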

  • Leaderless: High availability, eventual consistency.

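  • The book clearly explains quorums, read/write trade-offs, and consistency models, including the difference between linearizability (a recency guarantee on a single object) and serializability (an isolation guarantee for transactions).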
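The quorum idea reduces to a small piece of arithmetic: with n replicas, a write acknowledged by w nodes and a read that queries r nodes are guaranteed to share at least one node when w + r > n. The sketch below checks that condition and, as a sanity check, brute-forces every possible pair of write and read sets. The function names are illustrative only.

```python
from itertools import combinations

def quorum_overlaps(n, w, r):
    """True when every write set of size w intersects every read set of size r."""
    return w + r > n

def disjoint_sets_exist(n, w, r):
    """Brute force: can some write set and some read set share no replica?"""
    nodes = range(n)
    return any(
        set(write_set).isdisjoint(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )

# Common configuration: n=3, w=2, r=2 always overlaps.
# With w=1, r=1 a read can miss the only replica that has the new value.
```

Overlap is a necessary ingredient rather than a full guarantee: situations such as sloppy quorums or concurrent writes can still produce stale reads even when w + r > n.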

5. Schema Evolution

  • Backward compatibility (new code can read old data) and forward compatibility (old code can read new data) both matter during rolling upgrades.

  • Use of Avro, Protocol Buffers, and schema registries.
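A rough sketch of the two compatibility directions, using JSON and plain dictionaries rather than the actual Avro or Protocol Buffers APIs. The schemas `READER_V1` and `READER_V2` and the `decode` helper are hypothetical; they only mimic the general rule that a reader fills in defaults for missing fields and ignores fields it does not know.

```python
import json

# Hypothetical reader schemas: field name -> default value.
READER_V1 = {"name": ""}
READER_V2 = {"name": "", "email": None}   # v2 adds an optional field

def decode(payload, reader_schema):
    """Decode a record as seen by code built against `reader_schema`."""
    record = json.loads(payload)
    # Missing fields fall back to the default; unknown fields are dropped.
    return {
        field: record.get(field, default)
        for field, default in reader_schema.items()
    }

old_payload = json.dumps({"name": "Ada"})
new_payload = json.dumps({"name": "Ada", "email": "ada@example.com"})

# Backward compatibility: new code reading data written by old code.
backward = decode(old_payload, READER_V2)

# Forward compatibility: old code reading data written by new code.
forward = decode(new_payload, READER_V1)
```

Adding a field with a default keeps both directions working. Adding a required field without a default would break the backward direction, because old data has nothing to supply for it.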

6. Distributed Systems Challenges

  • Partial failures, unreliable networks, clock drift.

  • Leader election, fencing tokens, Byzantine faults.

  • Safety vs liveness in algorithm design.
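Fencing tokens are the easiest of these ideas to show in code. In this sketch, a lock service hands out a monotonically increasing token with each lock grant, and the storage layer rejects any write carrying a token lower than one it has already seen. The class names are hypothetical, and lease expiry is assumed to happen off-screen rather than being modeled.

```python
class LockService:
    """Hands out a strictly increasing fencing token with each lock grant."""

    def __init__(self):
        self._last_token = 0

    def acquire(self):
        self._last_token += 1
        return self._last_token


class FencedStorage:
    """Rejects writes whose token is older than the highest token seen."""

    def __init__(self):
        self.highest_token = 0
        self.data = {}

    def write(self, token, key, value):
        if token < self.highest_token:
            return False             # stale lock holder, write refused
        self.highest_token = token
        self.data[key] = value
        return True


locks = LockService()
storage = FencedStorage()

token_a = locks.acquire()            # client A gets the lock, then stalls
token_b = locks.acquire()            # A's lease expires; client B gets the lock
b_ok = storage.write(token_b, "file", "from B")
a_ok = storage.write(token_a, "file", "from A")   # A wakes up too late
```

Client A still believes it holds the lock, but the storage layer does not need to trust it: the stale token alone is enough to refuse the write.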

7. Streams & Event-Driven Architecture

  • Batch vs stream processing.

  • Change Data Capture (CDC), Event Sourcing.

  • Kafka and real-time pipelines as core patterns.
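The thread connecting CDC and event sourcing is that state can be derived by replaying an ordered log of changes. A minimal sketch, assuming a made-up change format with `op`, `key`, and `value` fields (real CDC tools define their own formats):

```python
def apply_change(view, change):
    """Apply one change event to a derived key-value view, in place."""
    if change["op"] in ("insert", "update"):
        view[change["key"]] = change["value"]
    elif change["op"] == "delete":
        view.pop(change["key"], None)
    return view

def rebuild(log):
    """Rebuild the derived view from scratch by replaying the whole log."""
    view = {}
    for change in log:
        apply_change(view, change)
    return view

# Hypothetical change log, in commit order.
change_log = [
    {"op": "insert", "key": "u1", "value": {"name": "Ada"}},
    {"op": "insert", "key": "u2", "value": {"name": "Bob"}},
    {"op": "update", "key": "u1", "value": {"name": "Ada L."}},
    {"op": "delete", "key": "u2"},
]

cache = rebuild(change_log)
```

A stream processor applies `apply_change` one event at a time as changes arrive, while a batch job calls `rebuild` over the full history. Both reach the same state as long as the log order is preserved.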

⚠️ Critiques

  • Dated examples: Published in 2017, so it gives little or no coverage to more recent developments in tools like Flink, Kubernetes, and NewSQL databases.

  • Theory-heavy: Less hands-on guidance.

  • Breadth over depth: Some chapters feel overloaded (especially Chapter 9).

  • Operational gaps: Limited coverage of monitoring, backups, and migrations.

✅ Who Should Read It

  • Mid-career engineers, architects, and tech leads.

  • Anyone preparing for system design interviews or building scalable systems.

❌ Who Might Struggle

  • Beginners without distributed systems background.

  • Readers looking for practical tutorials or vendor-specific guidance.

📌 Bonus: Cheat Sheet Highlights

  • Design for failure.

  • Measure tail latency (p95/p99), not averages.

  • Match data models to access patterns.

  • Understand storage engine trade-offs.

  • Use transactions wisely.

  • Embrace event-driven architecture.

  • Prioritize maintainability and observability.

  • Always weigh trade-offs.
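The advice on tail latency is worth a quick worked example. The sketch below uses the nearest-rank method for percentiles and a made-up set of 100 response times in which 98 requests take 10 ms and 2 take 1000 ms.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical latencies in milliseconds.
latencies_ms = [10] * 98 + [1000] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)   # 29.8
p50_ms = percentile(latencies_ms, 50)             # 10
p95_ms = percentile(latencies_ms, 95)             # 10
p99_ms = percentile(latencies_ms, 99)             # 1000
```

The mean of 29.8 ms describes no request that actually happened, and even p95 looks healthy. Only p99 reveals that some users wait a full second.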
