Designing Data-Intensive Applications (DDIA), highlighting how it reshaped his understanding of distributed systems, databases, and architectural trade-offs.
Milan read DDIA twice — first in 2018, then again in 2023 — and found it transformative.
The book bridges theory and practice, helping engineers reason about reliability, scalability, and maintainability.
1. Foundational Concepts
Reliability: Systems must work even when things go wrong.
Scalability: Efficiently handle increased load.
Maintainability: Design for evolution and operational ease.
2. Data Models
Relational: Best for joins and consistency.
Document: Great for flexible schemas and nested data.
Graph: Ideal for highly connected data (e.g., social networks).
3. Storage Engines
B-trees: Typically faster reads, slower writes (used in PostgreSQL and MySQL's InnoDB).
LSM-trees: Typically faster writes, slower reads (used in Cassandra and RocksDB).
Trade-offs between read/write performance and complexity.
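To make the LSM-tree side of this trade-off concrete, here is a minimal Python sketch. It is a toy model, not how Cassandra or RocksDB are implemented: the class name TinyLSM, the memtable_limit parameter, and the in-memory "segments" are all assumptions made for illustration. It shows why writes are cheap (they only touch the memtable) and why reads can be slower (they may have to check several segments).

```python
import bisect


class TinyLSM:
    """Toy LSM-tree: an in-memory memtable plus immutable sorted segments."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}        # most recent writes
        self.segments = []        # flushed sorted segments, oldest first
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def put(self, key, value):
        # A write only touches the memtable, which keeps it cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Persist the memtable as a sorted, immutable segment.
        self.segments.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # A read checks the memtable first, then segments from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            keys = [k for k, _ in segment]
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return segment[i][1]
        return None

    def compact(self):
        # Merge all segments, letting newer values overwrite older ones.
        merged = {}
        for segment in self.segments:  # oldest first
            merged.update(segment)
        self.segments = [sorted(merged.items())] if merged else []
```

Compaction is what keeps read cost bounded over time: after compact() there is a single segment to search, at the price of extra background write work.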
4. Replication & Consistency
Single-leader: Simple; all writes go through one node, so there are no write conflicts.
Multi-leader: Complex, conflict-prone.
Leaderless: High availability, eventual consistency.
Concepts such as quorums, read/write trade-offs, and the distinction between linearizability and serializability are explained clearly.
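The quorum idea can be sketched in a few lines of Python. With n replicas, a write acknowledged by w of them and a read that queries r of them are guaranteed to overlap in at least one replica when w + r > n. The function names and the (version, value) response shape below are assumptions for illustration, not any database's API.

```python
def quorum_overlaps(n, w, r):
    """True if every set of r replicas intersects every set of w replicas."""
    return w + r > n


def quorum_read(responses):
    """Return the value with the highest version among replica responses.

    Each response is a hypothetical (version, value) pair.
    """
    return max(responses, key=lambda resp: resp[0])[1]


# With n=3, w=2, r=2 a read always reaches at least one up-to-date replica.
print(quorum_overlaps(3, 2, 2))
# One replica is stale (version 1), one is current (version 2).
print(quorum_read([(1, "old"), (2, "new")]))
```

Lowering w or r below the overlap threshold trades this guarantee for lower latency and higher availability.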
5. Schema Evolution
Importance of backward/forward compatibility.
Use of Avro, Protocol Buffers, and schema registries.
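As a rough illustration of backward and forward compatibility, the Python sketch below uses plain JSON and dictionaries rather than Avro or Protocol Buffers. The schemas, field names, and the decode helper are hypothetical. The idea it demonstrates is the same, though: a reader fills in defaults for fields missing from old data and ignores fields it does not know about.

```python
import json

# Hypothetical reader schemas: field name -> default value.
READER_SCHEMA_V1 = {"id": None, "name": ""}
READER_SCHEMA_V2 = {"id": None, "name": "", "email": "unknown"}


def decode(raw, reader_schema):
    """Decode a JSON record according to the reader's schema.

    Missing fields take the schema default (backward compatibility);
    unknown fields are dropped (forward compatibility).
    """
    record = json.loads(raw)
    return {
        field: record.get(field, default)
        for field, default in reader_schema.items()
    }


old_record = json.dumps({"id": 1, "name": "Ada"})
new_record = json.dumps({"id": 2, "name": "Bob", "email": "b@example.com"})

# New code reading old data: the missing field gets its default.
print(decode(old_record, READER_SCHEMA_V2))
# Old code reading new data: the unknown field is ignored.
print(decode(new_record, READER_SCHEMA_V1))
```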
6. Distributed Systems Challenges
Partial failures, unreliable networks, clock drift.
Leader election, fencing tokens, Byzantine faults.
Safety vs liveness in algorithm design.
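Fencing tokens are easier to grasp with a small example. In the Python sketch below, a storage service remembers the highest token it has seen and rejects writes that carry a lower one, so a client that still believes it holds an expired lease cannot overwrite newer data. The FencedStore class and its method names are assumptions for illustration; the lock service that issues the increasing tokens is not modelled.

```python
class FencedStore:
    """Toy storage service that rejects writes carrying a stale fencing token."""

    def __init__(self):
        self.highest_token = 0
        self.data = {}

    def write(self, token, key, value):
        # A token lower than one already seen means the writer's lease expired
        # and a newer leaseholder has since written.
        if token < self.highest_token:
            raise PermissionError(
                f"stale fencing token {token} < {self.highest_token}"
            )
        self.highest_token = token
        self.data[key] = value


store = FencedStore()
store.write(33, "file", "from client 1")
store.write(34, "file", "from client 2")  # client 2 now holds the lease
try:
    store.write(33, "file", "late write from client 1")
except PermissionError as err:
    print("rejected:", err)
```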
7. Streams & Event-Driven Architecture
Batch vs stream processing.
Change Data Capture (CDC), Event Sourcing.
Kafka and real-time pipelines as core patterns.
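The core idea behind event sourcing, deriving current state by replaying an append-only log of events, can be sketched in Python as follows. The event shape (kind, item, quantity) and the shopping-cart domain are assumptions chosen for illustration, and no Kafka client is involved.

```python
from functools import reduce


def apply_event(state, event):
    """Return a new cart state after applying a single event."""
    kind, item, qty = event
    new_state = dict(state)  # never mutate earlier state
    if kind == "added":
        new_state[item] = new_state.get(item, 0) + qty
    elif kind == "removed":
        remaining = new_state.get(item, 0) - qty
        if remaining > 0:
            new_state[item] = remaining
        else:
            new_state.pop(item, None)
    return new_state


def replay(events):
    """Rebuild the current state by folding over the whole event log."""
    return reduce(apply_event, events, {})


log = [
    ("added", "book", 2),
    ("added", "pen", 1),
    ("removed", "book", 1),
    ("removed", "pen", 1),
]
print(replay(log))
```

Because the log is the source of truth, the same events can be replayed later to build a different derived view, which is also what makes change data capture pipelines useful.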
Limitations:
Outdated examples: Published in 2017, the book has limited or no coverage of newer tools such as Flink, Kubernetes, or NewSQL.
Theory-heavy: Less hands-on guidance.
Breadth over depth: Some chapters feel overloaded (especially Chapter 9).
Operational gaps: Limited coverage of monitoring, backups, and migrations.
Recommended for: mid-career engineers, architects, and tech leads, as well as anyone preparing for system design interviews or building scalable systems.
Less suited to: beginners without a distributed systems background, and readers looking for practical tutorials or vendor-specific guidance.
Key takeaways:
Design for failure.
Measure tail latency (p95/p99), not averages.
Match data models to access patterns.
Understand storage engine trade-offs.
Use transactions wisely.
Embrace event-driven architecture.
Prioritize maintainability and observability.
Always weigh trade-offs.
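The takeaway about measuring tail latency rather than averages can be illustrated with a short Python sketch. It uses the nearest-rank percentile definition, which is one of several in common use, and the latency samples are made up for the example.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Made-up data: 90 fast requests (10 ms) and 10 slow ones (1000 ms).
latencies_ms = [10] * 90 + [1000] * 10

mean = sum(latencies_ms) / len(latencies_ms)
print("mean:", mean)
print("p50:", percentile(latencies_ms, 50))
print("p95:", percentile(latencies_ms, 95))
print("p99:", percentile(latencies_ms, 99))
```

Here the mean (109 ms) describes no real request: most take 10 ms, while one in ten takes a full second, which only the high percentiles reveal.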