How a dbt project is built.
The value pages make the business case. This is the short, practical view: the file types a dbt project is made of, how the folders are laid out, and the handful of commands you actually run. When you want the full detail, the links at the bottom take you there.
A project is mostly SQL and YAML.
There are only three file types to learn. SQL models hold the logic, YAML holds the config and the checks, and Python is there for the rare case that needs it.
SQL models
.sql
A model is a single SELECT. dbt wraps it and materialises it as a table or view, so you write the logic and dbt handles the boilerplate of creating, replacing and ordering objects.
YAML config
.yml
YAML files sit next to your models and declare sources, tests, descriptions and config. Tests and docs live with the code, so they stay in step with what actually runs.
Python models
.py (optional)
Where SQL is awkward, a model can be Python instead, returning a dataframe. It is an alternative for the few cases that need it, not a different way of working.
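As a rough illustration of the Python case: a Python model file defines a function named model that takes dbt and session and returns a dataframe. The sketch below assumes an adapter that supports Python models and a PySpark-style dataframe, as on Databricks. The upstream model stg_orders and the amount column are hypothetical.

```python
def model(dbt, session):
    # Materialise the result as a table on the warehouse.
    dbt.config(materialized="table")

    # dbt.ref() hands back the upstream model as a dataframe.
    # "stg_orders" is a hypothetical model name used for illustration.
    orders = dbt.ref("stg_orders")

    # Keep only rows with a positive amount ("amount" is an assumed column).
    return orders.filter(orders["amount"] > 0)
```

The shape is the same as a SQL model: reference upstream models, transform, return one result. dbt still decides where and when it is built.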
The code and its documentation live together, which makes it much harder for the docs to drift from what actually runs.
A few verbs cover the day to day.
Most work is one of these, run locally or in CI. Add a selector to scope any of them to part of the project.
dbt run
Builds your models in dependency order, as tables or views on the warehouse.
dbt test
Runs the tests declared in YAML against the data and reports any failures.
dbt build
Runs and tests together, model by model, so bad data is caught before it flows downstream.
dbt docs generate
Generates the searchable catalog and the lineage graph from your project.
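To make "dependency order" and "model by model" concrete, here is a toy sketch in plain Python. It is not how dbt is implemented, and the model names are made up. It only shows the idea: run and test each model after its parents, and skip anything downstream of a failure.

```python
from graphlib import TopologicalSorter

# Hypothetical project: each model maps to the models it selects from.
DEPENDS_ON = {
    "stg_orders": set(),
    "stg_customers": set(),
    "orders_enriched": {"stg_orders", "stg_customers"},
    "daily_revenue": {"orders_enriched"},
}


def build(depends_on, failing_tests=frozenset()):
    """Toy build: visit models parents-first and record a status for each.

    A model is skipped when any parent failed or was itself skipped,
    so bad data never reaches the models downstream of it.
    """
    status = {}
    for model in TopologicalSorter(depends_on).static_order():
        if any(status[parent] != "pass" for parent in depends_on[model]):
            status[model] = "skip"
        elif model in failing_tests:
            status[model] = "fail"
        else:
            status[model] = "pass"
    return status
```

Calling build(DEPENDS_ON, failing_tests={"orders_enriched"}) marks daily_revenue as skipped, while both staging models still pass.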
Each command takes a selector, so you can run only what you need:
dbt build --select tag:daily builds and tests just the models tagged daily.
Add a trailing plus, dbt build --select tag:daily+, to include their downstream dependents as well. That is how a large project stays fast to iterate on.
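A toy sketch of what a tag selector resolves to, in plain Python with made-up model names. It is not dbt's implementation. In dbt's selector syntax a trailing plus, as in tag:daily+, adds downstream dependents; the include_downstream flag below stands in for that.

```python
# Hypothetical project: model name -> (tags, upstream models).
MODELS = {
    "stg_orders": ({"daily"}, set()),
    "stg_customers": (set(), set()),
    "orders_enriched": (set(), {"stg_orders", "stg_customers"}),
    "daily_revenue": (set(), {"orders_enriched"}),
    "customer_ltv": (set(), {"stg_customers"}),
}


def select_tag(models, tag, include_downstream=False):
    """Return the models carrying `tag`, optionally with everything downstream."""
    selected = {name for name, (tags, _) in models.items() if tag in tags}
    if not include_downstream:
        return selected

    # Keep adding models that read from an already selected model
    # until a full pass adds nothing new.
    changed = True
    while changed:
        changed = False
        for name, (_, parents) in models.items():
            if name not in selected and parents & selected:
                selected.add(name)
                changed = True
    return selected
```

Here select_tag(MODELS, "daily") picks only stg_orders, while include_downstream=True also pulls in orders_enriched and daily_revenue. customer_ltv is left alone either way, which is the point: the unrelated part of the project is never touched.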
dbt Core or the paid platform?
dbt Core is open-source and free to run, and it is what we use for about 90 percent of our implementations. You own and run it: your CI, your scheduler, your warehouse. The paid dbt platform (formerly dbt Cloud) adds a hosted UI, managed scheduling and orchestration, and a governed catalog and semantic layer on top of the same project.
dbt Core (our default)
Open-source and free to run. You own the project end to end: run it in your own CI, schedule it with the orchestrator you already have, and target the warehouse or lakehouse you already own. It is what we reach for in most engagements.
The paid dbt platform
Formerly dbt Cloud. On top of the same project it adds a hosted UI, managed scheduling and orchestration, and a governed catalog and semantic layer, so you do not build that scaffolding yourself.
When we recommend the paid platform
When non-technical users need a UI to explore and run models, when you want managed scheduling without building your own CI and orchestration, or when you need the hosted catalog and semantic layer across many teams. Otherwise, dbt Core.
From here, the real documentation.
This page is a launchpad. For how we structure, style and test dbt projects in practice, see the Plainsight Playbook below. For the full reference, go straight to dbt's official docs.
dbt official documentation
Want this set up properly?
We structure, style and test dbt projects on Fabric and Databricks every day. Tell us where your data sits today.