Real-Time Analytics Platform
Stream-processing pipeline handling 5 TB/day with sub-second query latency
Kafka, Spark, ClickHouse, Kubernetes, Python
Designed and implemented a real-time analytics platform using Apache Kafka, Spark Streaming, and ClickHouse. The platform ingests events from 50+ microservices, processes them through a multi-stage pipeline (validation, enrichment, aggregation), and serves analytics dashboards with sub-second query latency.
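The three stages can be illustrated with a minimal, dependency-free sketch. This is not the platform's actual code: the event fields (`service`, `event_type`, `ts`), the `SERVICE_TEAMS` lookup table, and the 60-second tumbling window are all assumptions made for illustration, and plain Python generators stand in for the Kafka and Spark Streaming machinery.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical enrichment source: maps a service name to its owning team.
SERVICE_TEAMS = {"checkout": "payments", "search": "discovery"}

# Hypothetical set of fields every event must carry to pass validation.
REQUIRED_FIELDS = ("service", "event_type", "ts")


def validate(events: Iterable[dict]) -> Iterator[dict]:
    """Stage 1: drop events that are missing any required field."""
    for event in events:
        if all(field in event for field in REQUIRED_FIELDS):
            yield event


def enrich(events: Iterable[dict]) -> Iterator[dict]:
    """Stage 2: attach the owning team, defaulting to 'unknown'."""
    for event in events:
        team = SERVICE_TEAMS.get(event["service"], "unknown")
        yield {**event, "team": team}


def aggregate(
    events: Iterable[dict], window_seconds: int = 60
) -> Dict[Tuple[int, str, str], int]:
    """Stage 3: count events per (window start, team, event type)."""
    counts: Counter = Counter()
    for event in events:
        # Tumbling window: round the timestamp down to the window boundary.
        window_start = int(event["ts"]) // window_seconds * window_seconds
        counts[(window_start, event["team"], event["event_type"])] += 1
    return dict(counts)


def run_pipeline(events: Iterable[dict]) -> Dict[Tuple[int, str, str], int]:
    """Chain the stages lazily: validation -> enrichment -> aggregation."""
    return aggregate(enrich(validate(events)))


sample_events = [
    {"service": "checkout", "event_type": "purchase", "ts": 1000},
    {"service": "checkout", "event_type": "purchase", "ts": 1010},
    {"service": "search", "event_type": "query", "ts": 1075},
    {"service": "search", "ts": 1080},  # missing event_type, dropped
    {"service": "billing", "event_type": "invoice", "ts": 1090},
]

result = run_pipeline(sample_events)
print(result)
```

Because each stage is a generator that consumes the previous one, events flow through one at a time and no stage materializes the full stream. That mirrors, in miniature, why a staged streaming design keeps latency low.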
Key achievements: 5 TB/day throughput, 99.95% uptime, 200 ms p95 query latency, and automated schema evolution with Protobuf and Schema Registry.
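To put the headline figures in more familiar units, the short calculation below converts them into an average ingest rate and a yearly downtime budget. It assumes decimal units (1 TB = 1,000,000 MB) and a 365-day year. The results are averages only; real traffic is typically bursty, so peak rates would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def average_throughput_mb_per_s(tb_per_day: float) -> float:
    """Average ingest rate in MB/s, using decimal units (1 TB = 1e6 MB)."""
    return tb_per_day * 1_000_000 / SECONDS_PER_DAY


def downtime_budget_hours_per_year(uptime_percent: float) -> float:
    """Hours of downtime per year permitted by an uptime percentage."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


throughput = average_throughput_mb_per_s(5)
downtime = downtime_budget_hours_per_year(99.95)

print(f"Average throughput: {throughput:.1f} MB/s")  # about 57.9 MB/s
print(f"Downtime budget: {downtime:.2f} hours/year")  # about 4.38 hours/year
```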