Design an Ad Click Aggregation System — Hard System Design Problem
Design a real-time ad click aggregation system that ingests billions of click events daily, aggregates them for billing and reporting, detects click fraud, and provides low-latency analytics queries. The system must guarantee exactly-once counting for accurate advertiser billing.
Functional requirements:
Ingest ad click events from web and mobile clients at massive scale
Aggregate clicks by ad, campaign, advertiser, and time window in real time
Detect and filter fraudulent clicks using anomaly detection rules
Provide real-time dashboards for advertisers with click-through rates and spend
Generate billing reports with exactly-once click counting guarantees
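The aggregation requirement can be illustrated with a minimal sketch. It assumes one-minute tumbling windows and a hypothetical event shape (ad_id, campaign_id, advertiser_id, ts); neither the window size nor the field names come from the problem statement.

```python
from collections import Counter
from typing import Iterable, NamedTuple

WINDOW_SECONDS = 60  # assumed tumbling-window size; not specified in the problem


class Click(NamedTuple):
    # Hypothetical event fields, used only for illustration.
    ad_id: str
    campaign_id: str
    advertiser_id: str
    ts: int  # event time, Unix seconds


def window_start(ts: int, size: int = WINDOW_SECONDS) -> int:
    """Return the start of the tumbling window that contains ts."""
    return ts - (ts % size)


def aggregate(clicks: Iterable[Click]) -> Counter:
    """Count clicks per (dimension, id, window_start) key."""
    counts: Counter = Counter()
    for c in clicks:
        w = window_start(c.ts)
        counts[("ad", c.ad_id, w)] += 1
        counts[("campaign", c.campaign_id, w)] += 1
        counts[("advertiser", c.advertiser_id, w)] += 1
    return counts
```

Keying every counter by window start keeps each increment independent of the others, which is what lets a real stream processor spread the same work across partitions.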
Components:
Click Ingestion API: Lightweight endpoint collecting raw click events at scale
Event Stream: Kafka-based pipeline buffering and distributing click events
Stream Aggregator: Real-time stream processing for per-window click aggregation
Fraud Detector: Applies rules and ML models to flag suspicious click patterns
Aggregation Store: Pre-computed aggregates for fast dashboard queries
Billing Batch: Periodic jobs that reconcile click counts for advertiser billing
Monitoring / Logs: Observability across all services
Rate Limiter: Throttles abusive or runaway clients
Auth Service: Token validation and session authentication
Dead Letter Queue (DLQ): Failed message retry and inspection
CDN (CloudFront): Edge caching of static assets
Cache: Holds frequently accessed data to reduce database load
Load Balancer: Distributes traffic across service instances
HA/DR Strategy: Multi-AZ deployment with automated database failover, cross-AZ synchronous replication, regular backup snapshots, and defined RTO/RPO targets
Data Partitioning: Time-range partitioning with automatic rollover for efficient window queries and TTL-based expiration
Service Mesh (Sidecar Proxy): mTLS between all services, zero-trust enforcement, and fine-grained traffic control
Distributed Tracing (APM): End-to-end request correlation across all services for latency percentile tracking, dependency mapping, and SLO alerting
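As one example of the kind of rule the Fraud Detector might apply, the sketch below flags a source (for instance an IP address or device ID) that clicks more than a threshold number of times within a sliding window. The class name, thresholds, and the choice of source key are illustrative assumptions, and the code assumes timestamps arrive in non-decreasing order per source.

```python
from collections import defaultdict, deque


class RateRule:
    """Flag a source that exceeds max_clicks within window_seconds."""

    def __init__(self, max_clicks: int = 5, window_seconds: float = 10):
        # Thresholds are illustrative assumptions, not tuned values.
        self.max_clicks = max_clicks
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # source key -> recent timestamps

    def is_suspicious(self, source: str, ts: float) -> bool:
        """Record a click and report whether the source is over the limit."""
        q = self._recent[source]
        q.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while q and q[0] <= ts - self.window_seconds:
            q.popleft()
        return len(q) > self.max_clicks
```

A rule like this only tags clicks; whether a tagged click is excluded from billing is a separate policy decision.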
API endpoints:
POST /api/v1/clicks - record an ad click event (lightweight, fire-and-forget)
GET /api/v1/analytics/campaigns/{id}?window=1h - get aggregated click metrics
GET /api/v1/billing/reports?advertiser={id}&period=daily - get a billing report
GET /api/v1/fraud/flagged?campaign={id} - get flagged fraudulent clicks
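The request body for POST /api/v1/clicks is not specified here. As a hedged sketch, the validation step could look like the following, assuming a JSON payload with hypothetical fields click_id, ad_id, and timestamp, plus an optional context object.

```python
import json

# Assumed schema; the field names are hypothetical, not part of the problem.
REQUIRED_FIELDS = {"click_id": str, "ad_id": str, "timestamp": int}


def parse_click(body: str) -> dict:
    """Parse and validate a click payload; raise ValueError if it is malformed."""
    try:
        event = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if not isinstance(event, dict):
        raise ValueError("payload must be a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        value = event.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"missing or invalid field: {name}")
    event.setdefault("context", {})
    return event
```

Keeping validation this cheap fits the fire-and-forget intent: the endpoint rejects obviously malformed events and leaves heavier checks to downstream stages.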
Non-functional requirements:
Handle 1,000,000+ click events per second at peak
Real-time aggregation lag under 10 seconds from click to dashboard update
Exactly-once semantics for billing-critical click counts
Query latency under 500 ms for aggregated metrics over any time range
99.99% availability for the click ingestion pipeline
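A quick back-of-envelope calculation shows what these targets imply. The peak rate and availability figure come from the requirements; the daily volume (5 billion) and the average event size (200 bytes) are assumptions chosen only to make the arithmetic concrete.

```python
SECONDS_PER_DAY = 86_400
PEAK_EVENTS_PER_SEC = 1_000_000   # from the requirements
EVENTS_PER_DAY = 5_000_000_000    # assumption: "billions daily" taken as 5 billion
EVENT_BYTES = 200                 # assumption: average serialized event size
AVAILABILITY = 0.9999             # from the requirements

avg_events_per_sec = EVENTS_PER_DAY / SECONDS_PER_DAY             # about 57,870
peak_to_avg_ratio = PEAK_EVENTS_PER_SEC / avg_events_per_sec      # about 17x
peak_ingest_mb_per_sec = PEAK_EVENTS_PER_SEC * EVENT_BYTES / 1e6  # 200 MB/s
raw_tb_per_day = EVENTS_PER_DAY * EVENT_BYTES / 1e12              # 1 TB/day
downtime_min_per_year = (1 - AVAILABILITY) * 365 * 24 * 60        # about 52.6 minutes

print(f"average rate:   {avg_events_per_sec:,.0f} events/s")
print(f"peak/average:   {peak_to_avg_ratio:.1f}x")
print(f"peak ingest:    {peak_ingest_mb_per_sec:.0f} MB/s")
print(f"raw volume:     {raw_tb_per_day:.1f} TB/day")
print(f"allowed outage: {downtime_min_per_year:.1f} min/year")
```

Under these assumptions the pipeline has to absorb bursts roughly 17 times the average rate, which is the main argument for buffering events in a stream before aggregation.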
Data flow:
User clicks an ad; client sends click event with ad ID, timestamp, and context
Click ingestion API validates and publishes event to Kafka stream
Stream aggregator consumes events, groups by ad/campaign/time window
Fraud detection runs inline, tagging suspicious clicks for review
Valid aggregated counts are written to database and cache for querying
Dashboard service reads pre-aggregated data for real-time visualization
Batch processor runs hourly/daily reconciliation for billing accuracy
Billing reports are materialized in data warehouse for advertiser access
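The flow above does not say how exactly-once counting is achieved. One common approach on top of at-least-once delivery is idempotent processing: deduplicate on a unique click ID before incrementing, then let the batch job recount from the raw log and report mismatches. The sketch below assumes each event carries a hypothetical click_id and uses in-memory structures in place of the real aggregation store; it illustrates the idea rather than a prescribed design.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

Event = Tuple[str, str]  # (click_id, ad_id); hypothetical minimal event shape


class DedupingCounter:
    """Counts each click_id at most once, so redelivered events are harmless."""

    def __init__(self) -> None:
        # A real system would need a bounded or TTL-backed store here.
        self.seen: Set[str] = set()
        self.counts: Counter = Counter()

    def process(self, event: Event) -> bool:
        """Apply one event; return False if it was a duplicate."""
        click_id, ad_id = event
        if click_id in self.seen:
            return False
        self.seen.add(click_id)
        self.counts[ad_id] += 1
        return True


def reconcile(streaming: Counter, raw_log: Iterable[Event]) -> Dict[str, Tuple[int, int]]:
    """Recount distinct clicks from the raw log; return per-ad mismatches
    as {ad_id: (streaming_count, batch_count)}."""
    first_ad: Dict[str, str] = {}
    for click_id, ad_id in raw_log:
        first_ad.setdefault(click_id, ad_id)  # keep the first occurrence only
    batch = Counter(first_ad.values())
    return {
        ad: (streaming.get(ad, 0), batch.get(ad, 0))
        for ad in set(streaming) | set(batch)
        if streaming.get(ad, 0) != batch.get(ad, 0)
    }
```

An empty result from reconcile means the streaming counts match the batch recount; any entry it returns is a candidate correction before the billing report is materialized.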