Datadog Creates Scalable Data Ingestion Architecture

Datadog built a dedicated, event-driven data ingestion architecture with exactly-once semantics for Husky, its third-generation event store. Husky separates data ingestion, data compaction, and data reading into distinct workloads so that each can be scaled independently. All three workloads rely on a shared metadata store built on FoundationDB and a blob storage service backed by AWS S3. To guarantee exactly-once ingestion, Datadog uses an internal routing mechanism that deterministically splits the incoming stream of events into multiple shards for each tenant, while limiting the number of tenants included in a shard to lower storage costs. The design supports conflict detection and resolution, as well as efficient event deduplication through persistent datastore tables and an LRU cache.
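The two ideas in that summary, deterministic per-tenant sharding and deduplication through a cache in front of a persistent table, can be illustrated with a minimal sketch. This is not Husky's actual implementation: the `route` function, the `Deduplicator` class, the hashing scheme, and the in-memory set standing in for the persistent datastore table are all assumptions made for illustration.

```python
import hashlib
from collections import OrderedDict


def stable_hash(value: str) -> int:
    """Hash that is identical across processes (built-in hash() is salted per run)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def route(tenant_id: str, event_id: str, total_shards: int, shards_per_tenant: int) -> int:
    """Deterministically map an event to one of a bounded set of shards for its tenant.

    Hypothetical scheme: the tenant picks a base shard, the event picks an offset,
    so a tenant's events land on at most `shards_per_tenant` shards.
    """
    base = stable_hash(tenant_id) % total_shards
    offset = stable_hash(event_id) % shards_per_tenant
    return (base + offset) % total_shards


class Deduplicator:
    """LRU cache in front of a durable record of seen event IDs."""

    def __init__(self, cache_size: int):
        self.cache_size = cache_size
        self.cache = OrderedDict()  # recently seen IDs, oldest first
        self.persisted = set()      # stand-in for a persistent datastore table

    def first_seen(self, event_id: str) -> bool:
        """Return True only the first time an event ID is observed."""
        if event_id in self.cache:
            # Fast path: a recent duplicate, no datastore lookup needed.
            self.cache.move_to_end(event_id)
            return False
        # Slow path: consult the durable record, then remember the ID.
        is_new = event_id not in self.persisted
        if is_new:
            self.persisted.add(event_id)
        self.cache[event_id] = True
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used ID
        return is_new
```

Because routing depends only on the tenant and event identifiers, a retried event reaches the same shard as the original, which is what lets a per-shard deduplicator recognize it. The cache only saves lookups; correctness in this sketch comes from the durable record, so an ID evicted from the cache is still detected as a duplicate.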