DAG (Directed Acyclic Graph)
A DAG is a directed acyclic graph: a model of the task dependencies in a data pipeline, usually drawn as a diagram, that shows which tasks must complete before others can begin and contains no circular dependencies.
Definition
A Directed Acyclic Graph (DAG) models a pipeline as a graph. Each task is a node; an edge from Task A to Task B means 'A must finish before B can start.' 'Acyclic' means no cycles: Task A cannot depend on Task B if Task B also depends, directly or indirectly, on Task A. DAGs are the structure orchestration systems use to express pipeline logic. They are simple and expressive: they encode which tasks can run in parallel (tasks that do not depend on each other can run simultaneously), rule out circular dependencies, and make pipeline logic easier to debug. Most orchestration tools visualize DAGs to help teams understand and troubleshoot their pipelines.
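The definition above can be made concrete with a small sketch. It assumes a pipeline is stored as a plain dictionary that maps each task to the set of tasks it depends on; the task names, the `PIPELINE` variable, and the `find_cycle` helper are all hypothetical and not taken from any particular orchestrator. The function uses a depth-first search to report a circular dependency if one exists.

```python
from typing import Dict, List, Optional, Set

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE: Dict[str, Set[str]] = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"transform"},
    "load": {"validate"},
}


def find_cycle(deps: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of task names, or None if acyclic."""
    UNVISITED, IN_PROGRESS, FINISHED = 0, 1, 2
    state: Dict[str, int] = {task: UNVISITED for task in deps}
    path: List[str] = []  # tasks on the current depth-first search path

    def visit(task: str) -> Optional[List[str]]:
        state[task] = IN_PROGRESS
        path.append(task)
        for upstream in deps.get(task, ()):
            upstream_state = state.get(upstream, UNVISITED)
            if upstream_state == IN_PROGRESS:
                # Reached a task that is still on the current path: a cycle.
                return path[path.index(upstream):] + [upstream]
            if upstream_state == UNVISITED:
                cycle = visit(upstream)
                if cycle is not None:
                    return cycle
        path.pop()
        state[task] = FINISHED
        return None

    for task in list(deps):
        if state[task] == UNVISITED:
            cycle = visit(task)
            if cycle is not None:
                return cycle
    return None


print(find_cycle(PIPELINE))                    # acyclic pipeline
print(find_cycle({"a": {"b"}, "b": {"a"}}))    # a and b depend on each other
```

A graph for which `find_cycle` returns `None` is a valid DAG; any other result names the tasks that would have to wait on each other forever.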
How It Works
1. Define tasks: the units of work, such as extract, transform, load, and validate steps.
2. Declare dependencies: for example, LoadRaw → TransformDaily → LoadWarehouse.
3. Visualize: the orchestrator draws the DAG.
4. Parallelize: tasks that do not depend on each other can run at the same time.
5. Execute: the orchestrator walks the DAG in topological order, so each task starts only after all of its upstream tasks have finished.
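The steps above can be sketched with Python's standard-library `graphlib` module (Python 3.9+). This is a minimal illustration, not how any specific orchestrator is implemented: the task names, the `DEPS` dictionary, and the `execution_waves` helper are hypothetical. It groups tasks into "waves", where every task in a wave has all of its dependencies satisfied and could run in parallel with the others in that wave.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
DEPS: Dict[str, Set[str]] = {
    "extract_orders": set(),
    "extract_customers": set(),
    "transform": {"extract_orders", "extract_customers"},
    "validate": {"transform"},
    "load": {"validate"},
}


def execution_waves(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; tasks in the same wave can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves: List[List[str]] = []
    while sorter.is_active():
        # get_ready() returns every task whose dependencies are all done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        # A real scheduler would run the tasks here before marking them done.
        sorter.done(*ready)
    return waves


for number, wave in enumerate(execution_waves(DEPS), start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Here the two extract tasks land in the first wave because neither depends on the other, while the remaining tasks run one after another. Waiting for a whole wave to finish is a simplification; a scheduler can instead start each task as soon as its own upstream tasks complete.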
When to Use It
Think in DAGs when designing any multi-step pipeline. Modeling the pipeline as a DAG exposes impossible (circular) dependencies early and makes opportunities for parallel execution obvious. If you're using an orchestration tool such as Airflow or Dagster, or a transformation tool such as dbt, you're already using DAGs: they are the underlying abstraction.
Last updated: Jun 17, 2026