2025-02-14 · Eun-ji Han

Partitioning Kafka topics for warehouse handoff

[Figure: Kafka topic partition diagram drawn during mentor office hours]

When teams first wire Kafka to a warehouse, they often copy the partition key straight from the OLTP primary key without checking for skew. In our Stream Processing cohort, mentors ask learners to chart event volume per candidate key before committing to one.
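
A minimal sketch of that volume check, assuming events are available as plain dictionaries. The function name, the sample rows, and the region labels are hypothetical and are not taken from the cohort material.

```python
from collections import Counter


def key_volume_report(events, key_field):
    """Count events per key value and report how concentrated the traffic is."""
    counts = Counter(event[key_field] for event in events)
    total = sum(counts.values())
    top_key, top_count = counts.most_common(1)[0]
    return {
        "distinct_keys": len(counts),
        "top_key": top_key,
        "top_share": top_count / total,  # fraction of all events on the busiest key
    }


# Hypothetical sample rows; a real check would read a representative window of events.
sample_events = [
    {"order_id": 101, "shipment_region": "region-a"},
    {"order_id": 102, "shipment_region": "region-a"},
    {"order_id": 103, "shipment_region": "region-a"},
    {"order_id": 104, "shipment_region": "region-b"},
]

region_report = key_volume_report(sample_events, "shipment_region")
order_report = key_volume_report(sample_events, "order_id")
```

A high top_share on a candidate key is an early warning that one partition may end up carrying most of the load if that key is used for partitioning.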

We use a Daejeon logistics dataset where traffic is heavily skewed across shipment_region values. Learners try three key strategies (order_id, region, and composite) and compare consumer lag in Grafana. The exercise surfaces hot partitions quickly.
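
The comparison can be approximated offline before any consumer lag shows up in a dashboard. The sketch below rests on several assumptions: the events are synthetic, the partition count is arbitrary, the composite key is taken to be region plus order_id, and CRC32 stands in for the hash (Kafka's default partitioner uses murmur2 on the key bytes, so real partition assignments will differ).

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 6  # hypothetical partition count


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash for illustration only; Kafka's default partitioner uses murmur2.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_load(events, key_fn, num_partitions=NUM_PARTITIONS):
    """Count how many events each partition would receive under a key strategy."""
    load = Counter({p: 0 for p in range(num_partitions)})
    for event in events:
        load[partition_for(key_fn(event), num_partitions)] += 1
    return load


def skew_ratio(load):
    """Busiest partition divided by the mean load; 1.0 means perfectly even."""
    mean = sum(load.values()) / len(load)
    return max(load.values()) / mean


# Synthetic events: 80% of traffic falls in one region (an assumed distribution).
events = [
    {"order_id": i, "shipment_region": "region-0" if i % 5 else f"region-{1 + i % 3}"}
    for i in range(1000)
]

strategies = {
    "order_id": lambda e: str(e["order_id"]),
    "region": lambda e: e["shipment_region"],
    "composite": lambda e: f'{e["shipment_region"]}:{e["order_id"]}',
}

skew = {name: skew_ratio(partition_load(events, fn)) for name, fn in strategies.items()}
```

With this synthetic data, the region strategy sends every event for the dominant region to a single partition, which is the hot-partition pattern the exercise is meant to expose. The skew ratio is only a proxy; consumer lag also depends on processing cost per event.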

The handoff lab adds a compacted topic for dimension snapshots. Students document retention and cleanup policies, then justify choices in a five-minute review. Most teams settle on order_id for facts and region for aggregated rollups.
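
As an illustration of the kind of choices students document, here is a sketch of two topic configurations and a toy model of what compaction eventually leaves behind. The topic names and values are hypothetical; the config keys (cleanup.policy, retention.ms, min.cleanable.dirty.ratio, delete.retention.ms) are standard Kafka topic-level settings. The compact function is a simplification: real compaction runs in the background on closed log segments, and tombstones remain readable for delete.retention.ms before they are removed.

```python
# Hypothetical topic names and values, shown only to illustrate the policy split.
topic_configs = {
    "shipment-facts": {
        "cleanup.policy": "delete",  # facts age out by time
        "retention.ms": str(7 * 24 * 60 * 60 * 1000),  # 7 days, an arbitrary example
    },
    "region-dimension-snapshots": {
        "cleanup.policy": "compact",  # keep the latest record per key
        "min.cleanable.dirty.ratio": "0.5",
        "delete.retention.ms": str(24 * 60 * 60 * 1000),  # how long tombstones stay readable
    },
}


def compact(log):
    """Toy model of a fully compacted log: latest value per key, tombstoned keys dropped."""
    latest = {}
    for key, value in log:
        latest[key] = value  # later records overwrite earlier ones for the same key
    return {k: v for k, v in latest.items() if v is not None}


# Hypothetical dimension records; a None value acts as a tombstone for that key.
snapshot_log = [
    ("region-a", {"hub": "north"}),
    ("region-b", {"hub": "east"}),
    ("region-a", {"hub": "south"}),
    ("region-b", None),
]

compacted = compact(snapshot_log)
```

The toy model makes the key choice concrete: a compacted topic is only useful when the record key identifies the dimension row, because compaction keeps one record per key.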

None of this replaces your production SLOs, but it builds vocabulary for architecture interviews and on-call conversations.

Tags: Kafka, Pipelines, Mentor notes
