Data Engineering Pipelines Roadmap

A public view of how our cohorts connect—what you practice, in what order, and what artifacts you should have at each stage. Updated quarterly; not a job guarantee.

Skills map

Foundations → streaming → warehouse → orchestration → scale (Spark and cloud) → security overlay.

[L0 Foundations]──► Python ingest, SQL, Git
        │
[L1 Streaming]────► Kafka topics, consumers, lag ops
        │
[L2 Warehouse]────► dbt models, tests, docs site
        │
[L3 Orchestration]► Airflow DAGs, SLAs, runbooks
        │
[L4 Scale]────────► Spark batch, cloud ingest (AWS)
        │
[L5 Security]─────► Threat models, encryption, audit logs
        
[Figure: layered architecture diagram for the medallion data lake pattern]

Capstone project path

  1. Weeks 1–2: Ingest raw events to object storage with idempotent scripts.
  2. Weeks 3–4: Stream to a curated topic; compact dimensions.
  3. Weeks 5–6: Model marts in dbt with tests and docs.
  4. Weeks 7–8: Orchestrate with Airflow; define SLAs and alerts.
  5. Weeks 9–10: Defense: architecture walkthrough + runbook handoff.
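
The first capstone step asks for idempotent ingest scripts. As a rough illustration of what "idempotent" can mean here, the sketch below derives each object key from a hash of the event content, so re-running the script on the same events writes nothing new. It is a minimal sketch under stated assumptions, not the course's reference solution: a local directory stands in for the object store, and the names `object_key`, `ingest`, and the `raw/events/` prefix are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def object_key(event: dict) -> str:
    """Derive a deterministic key from the event content (hypothetical scheme)."""
    # Sorting keys makes the serialization independent of dict ordering.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    return f"raw/events/{digest}.json"


def ingest(events: list[dict], bucket_root: Path) -> int:
    """Write each event at most once; return the number of newly written objects.

    Assumption: bucket_root is a local directory standing in for a bucket.
    """
    written = 0
    for event in events:
        target = bucket_root / object_key(event)
        if target.exists():
            continue  # already ingested on an earlier run
        target.parent.mkdir(parents=True, exist_ok=True)
        tmp = target.with_suffix(".tmp")
        tmp.write_text(json.dumps(event, sort_keys=True), encoding="utf-8")
        tmp.replace(target)  # rename into place so readers never see a partial file
        written += 1
    return written


if __name__ == "__main__":
    sample = [{"id": 1, "type": "click"}, {"id": 2, "type": "view"}]
    with tempfile.TemporaryDirectory() as tmp_dir:
        root = Path(tmp_dir)
        print("first run wrote:", ingest(sample, root))
        print("second run wrote:", ingest(sample, root))
```

Content-derived keys are only one option; a natural event ID or a (source, offset) pair can serve the same purpose when events carry one. Against a real object store the existence check and write would go through that store's client library instead of the filesystem.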

Certificate of completion

Issued after the capstone defense once attendance requirements are met. It describes the modules completed; it is not a vendor exam.

Portfolio review letter

Optional mentor summary of artifacts suitable for employers; factual scope only.

Lab continuity access

30-day post-cohort access to Docker stacks for practice—not production licenses.