Faster, cheaper runs
Incremental models process only what changed, so runs are faster, compute costs less, and the warehouse is under less strain.
Spend compute on what moved, not on all of history. Faster runs, lower cost, fresher data.
The problem
Many pipelines do the same expensive thing every single run: they drop the table and rebuild it from all of history. On day one, with a thousand rows, nobody notices. Two years later, reprocessing hundreds of millions of rows to capture one day of change is slow, costly, and hard on a shared warehouse that other teams are also trying to use. The bill climbs, the run window stretches past the time anyone is awake to watch it, and the data is stale by the time it lands.
What dbt does about it
dbt makes incremental loading a first-class pattern: a model processes only the new or changed data and merges it into what is already there. You declare how new rows should be combined, and dbt generates the right SQL to do it. There are four strategies, each for a different shape of change. Append adds fresh rows and fits events and logs. Merge updates matching records and inserts the rest, the natural fit for late-arriving dimension updates. Delete and insert clears a defined window then reloads it, which suits corrections and reprocessing a time range. Insert overwrite swaps entire partitions at once, built for large partitioned tables. You can read the full mechanics in dbt’s incremental models docs.
What it looks like
A daily orders table holds two years of data. The full-reload version rescans everything each night and runs for forty minutes. Switched to an incremental merge, the same model touches only yesterday’s orders plus any late corrections, and finishes in under two. The compute it no longer burns is compute you no longer pay for, and the warehouse is freed up for the analysts who need it.
How we think about it
Incremental loading should be the default for tables big enough to feel a full reload, not a special-case optimisation bolted on later. We pick the strategy from the shape of the change, not habit, and we keep a clean full-refresh path so logic changes are safe to roll out. Faster, cheaper runs are not just a cost line: they buy fresher data and headroom to grow. The same run results that prove a build was cheap also prove it was healthy, which is where reliability you can prove picks up the story.
Faster, cheaper runs, in short.
Is incremental loading always the right choice?
It is the right default for large, append-heavy or steadily growing tables, where reprocessing all of history every run is pure waste. Small lookup and dimension tables that rebuild in seconds are fine as full refreshes. The rule of thumb: go incremental when the cost of a full reload starts to hurt, not before.
What happens on the very first run, when there is no history?
On the first run, and any time you force a full refresh, dbt builds the table from scratch over all the data. After that, each run processes only the new or changed rows. You can rebuild the whole table on demand whenever the logic changes, so you are never locked into a partial state.
How do the four strategies differ?
Append adds new rows and suits events and logs. Merge updates matching rows and inserts the rest, which fits late-arriving dimension changes. Delete and insert clears a window then reloads it, handy for corrections and reprocessing. Insert overwrite swaps whole partitions, ideal for large partitioned tables.
Does incremental loading work on both Fabric and Databricks?
Yes. The same dbt model targets either platform; dbt generates the right SQL for each, using MERGE and partition-aware writes where the platform supports them. You write the intent once and dbt adapts it to Microsoft Fabric or Databricks.
Keep exploring
Change without fear
Modular, version-controlled models with tests mean a change is reviewed, tested and reversible, not a leap of faith.
One agreed way of working
Bronze, silver, gold layers and reusable models give every team one contract, so questions become queries, not rebuilds.
Change shared models without breaking everyone
When a shared model changes, it can silently break every team downstream. Model versions and contracts let a central team ship a new version while consumers migrate on their own schedule.
Want this for your data?
If this is how you want your team to work, we should talk.