Watermark
A timestamp that tracks how far a stream processing system has progressed through event time. It tells the system when it is safe to close a window and emit results, even when some events arrive late or out of order.
What is Watermark?
Watermark is an advanced concept in the Stream & Batch Processing area of system design. Engineers reach for it when they need to trade result latency against completeness: waiting longer for late events makes results more complete but delays them. Production systems at companies like Netflix, Amazon, and Google make this kind of decision every day.
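To make the definition concrete, here is a minimal sketch of one common heuristic: the watermark trails the highest event time seen so far by a fixed bound, and a window is emitted once the watermark reaches its end. The names `process`, `WINDOW_SIZE`, and `MAX_OUT_OF_ORDER` are hypothetical, and dropping late events is an assumption made for brevity. Real engines offer other watermark strategies and other ways to handle late data.

```python
from collections import defaultdict

WINDOW_SIZE = 10        # tumbling window length in event-time seconds (assumed)
MAX_OUT_OF_ORDER = 3    # how far the watermark trails the newest event (assumed)


def process(events):
    """Count events per tumbling window.

    `events` is an iterable of integer event-time timestamps in arrival
    order. A window is emitted once the watermark reaches its end.
    """
    open_windows = defaultdict(int)   # window start -> count so far
    emitted = []                      # (window_start, count) in emit order
    dropped = []                      # events whose window was already closed
    max_seen = float("-inf")
    watermark = float("-inf")

    for ts in events:
        window_start = (ts // WINDOW_SIZE) * WINDOW_SIZE
        if window_start + WINDOW_SIZE <= watermark:
            # The window for this event has already been emitted.
            dropped.append(ts)
            continue
        open_windows[window_start] += 1

        # Watermark = highest event time seen minus the allowed disorder.
        max_seen = max(max_seen, ts)
        watermark = max_seen - MAX_OUT_OF_ORDER

        # Emit every window whose end is at or before the watermark.
        for start in sorted(open_windows):
            if start + WINDOW_SIZE <= watermark:
                emitted.append((start, open_windows.pop(start)))

    return emitted, dropped, watermark


emitted, dropped, watermark = process([1, 4, 9, 12, 8, 13, 5, 21, 25])
print(emitted, dropped, watermark)
```

In this run the event with timestamp 8 arrives out of order but is still counted, because the watermark has not yet reached the end of window 0. The event with timestamp 5 arrives after that window has closed, so it is dropped. A larger `MAX_OUT_OF_ORDER` would catch more stragglers at the cost of later results.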
See also
Related glossary terms you might want to look up next.
Stream Processing
Processing data continuously as it arrives, rather than in batches. Powers real-time analytics, fraud detection, and live dashboards.
Apache Flink
A distributed stream processing framework that handles both real-time streams and batch data with exactly-once guarantees. Used by Alibaba, Netflix, and Uber at massive scale.
Exactly-Once Processing
A processing guarantee where the effect of each message is applied exactly one time, even in the face of failures. Typically achieved by combining idempotent consumers with transactional producers.