dbt Model

A dbt model is a SQL query or Python script that transforms raw data into a derived table, with built-in testing, documentation, and dependency management.

Definition

dbt (data build tool) is a transformation tool that's become the standard for analytical SQL. A dbt model is a .sql or .py file that defines how to transform data. Models are modular (each solves one problem), testable (you write data quality tests), and documented. dbt manages lineage, runs models in dependency order, and tracks which models are up-to-date. Instead of writing ad-hoc SQL scripts scattered across your warehouse, dbt models live in version control, are peer-reviewed, tested, and executed predictably. The dbt ecosystem has expanded to include orchestration (dbt Cloud), metrics definitions, and semantic layers, making it central to modern data teams.

How It Works

1. Write: Create a .sql file with SELECT ... FROM ... (or Python transformation). 2. Define: In dbt's config, specify the model's target table type (table, view, incremental). 3. Test: Write assertions (not null, unique values, custom tests). 4. Document: Add descriptions and column-level docs. 5. Run: dbt run builds all models; dbt test validates data quality; dbt docs generates a website.

When to Use It

Use dbt if you're doing analytical transformations in SQL or Python. dbt is an essential tool for modern data teams—it turns transformation code into production-quality, tested pipelines. Start with dbt Core (open source) on a single data warehouse; graduate to dbt Cloud for orchestration and multi-warehouse support.

Relevant Tools

Last updated: Jun 17, 2026