Lambda architecture
Lambda architecture is a data-processing pattern that runs two parallel pipelines: a batch layer that computes complete views over all data with high latency, and a speed layer that computes incremental views over recent data with low latency. Queries merge results from both layers, so an answer is complete (from the batch view) and fresh (from the speed view).
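The query-time merge described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the event shape, the `cutoff` timestamp, and the function names `batch_view`, `speed_view`, and `query` are all assumptions made for the example, and both layers run in one process here rather than as separate systems.

```python
from collections import Counter

def batch_view(events, cutoff):
    """Batch layer (hypothetical): recompute page counts over all events up to the cutoff."""
    return Counter(e["page"] for e in events if e["ts"] <= cutoff)

def speed_view(events, cutoff):
    """Speed layer (hypothetical): count only events the last batch run has not covered."""
    return Counter(e["page"] for e in events if e["ts"] > cutoff)

def query(page, batch, speed):
    """Query side: merge the complete-but-stale view with the fresh-but-partial view."""
    return batch[page] + speed[page]

# Illustrative data: the last batch run covered events with ts <= 2.
events = [
    {"ts": 1, "page": "home"},
    {"ts": 2, "page": "docs"},
    {"ts": 3, "page": "home"},
    {"ts": 4, "page": "home"},
]
cutoff = 2

batch = batch_view(events, cutoff)
speed = speed_view(events, cutoff)
total_home = query("home", batch, speed)  # 1 from batch + 2 from speed
```

Note that the counting logic appears twice, once per layer. In a real deployment those two copies would typically live in different systems, which is the duplication the pattern is criticised for.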
Lambda was popularised by Nathan Marz, the creator of Apache Storm, around 2012-2014. The trade-off: the same business logic must be implemented twice (once for batch, once for streaming), which roughly doubles maintenance cost and risks the two implementations drifting out of agreement. Kappa architecture (Jay Kreps, 2014) proposed eliminating the batch layer and handling everything through streaming with replay capability. This is simpler, but historically it required more sophisticated stream-processing tools. Modern data architectures often use unified frameworks and engines (Apache Flink, Apache Beam, Materialize) that narrow the gap between batch and streaming. Lambda remains relevant for legacy systems and for specific cases where the analytical depth of batch processing complements the recency of streaming.
Related terms
- Kappa architecture
Kappa architecture handles all data processing through a single streaming pipeline that can be replayed from the beginning of the log when results need to be recomputed.
- Event-driven architecture
Event-driven architecture is a design pattern in which services communicate by emitting and consuming events rather than by making direct synchronous calls.
- Event sourcing
Event sourcing is a persistence pattern that stores every state change as an immutable event in an append-only log.
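The append-only log that underlies both event sourcing and Kappa-style replay can be sketched in a few lines. This is an in-memory illustration under stated assumptions: `Event`, `EventLog`, and `balances` are hypothetical names invented for the example, and a production system would use a durable log rather than a Python list.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: an event cannot be modified once recorded
class Event:
    account: str
    kind: str    # "deposited" or "withdrawn" (illustrative event types)
    amount: int

class EventLog:
    """Hypothetical append-only log: events are added at the end, never updated or deleted."""
    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def replay(self, from_offset=0):
        """Yield events in order, starting from the given offset (0 = beginning of the log)."""
        return iter(self._events[from_offset:])

def balances(events):
    """Derive current state by folding over the events; state is never stored directly."""
    state = {}
    for e in events:
        delta = e.amount if e.kind == "deposited" else -e.amount
        state[e.account] = state.get(e.account, 0) + delta
    return state

log = EventLog()
log.append(Event("alice", "deposited", 100))
log.append(Event("alice", "withdrawn", 30))
log.append(Event("bob", "deposited", 50))

# Replaying from offset 0 rebuilds the full view from scratch.
current = balances(log.replay())
```

Because the state is derived rather than stored, changing the logic in `balances` and replaying the log from the start is enough to recompute every result, which is the reprocessing step Kappa architecture relies on.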