Autonomous in the loop. Accountable at the gates. An agentic offering by Plainsight

Migrate Azure Synapse to Microsoft Fabric.

All four Synapse workloads (pipelines, notebooks, dedicated and serverless SQL) rebuilt on Microsoft Fabric and iterated until the tests pass.

Azure Synapse logo Microsoft Fabric logo

Why moving Azure Synapse to Microsoft Fabric is four migrations, not one

An Azure Synapse workspace looks like a single product, but it’s a collaboration boundary around four separate engines: Synapse pipelines, Spark notebooks, dedicated SQL pools and serverless SQL. Each has its own runtime, its own dialect and its own migration considerations, and the most expensive mistake is treating them as one. The logic that runs your business rarely lives inside any single workload. It lives in the seams between them. A Spark notebook lands Delta; serverless SQL exposes views over that Delta; a dedicated SQL pool holds the modeled warehouse; and a pipeline orchestrates all three. The end-to-end behavior only exists across those handoffs. Microsoft positions Fabric as the go-forward platform for exactly this estate, and provides first-party assistants for the warehouse, Spark and pipeline workloads, but a faithful rebuild still has to follow data across the Spark-to-serverless-to-dedicated boundaries, not document each pool in isolation. That’s the job of the Documenter before anything is built. Autonomous in the loop, accountable at the gates.

What breaks in a naive lift-and-shift

The seductive idea is to copy each workload straight across. It fails on contact with the targets, and the failures are specific. Dedicated SQL pool physical design (HASH, ROUND_ROBIN and REPLICATE distributions across 60 distributions, single-column partitioning tuned to the MPP architecture) has no equivalent on the Fabric Warehouse, which stores everything as Delta tables in OneLake. None of that distribution tuning maps directly; it must be re-derived, not pasted. Dedicated-pool stored procedures are already a reduced T-SQL dialect (an 8-level nesting cap, no @@NESTLEVEL, no INSERT...EXECUTE), and the Warehouse imposes its own surface-area limits: no triggers, no materialized views, no synonyms, no recursive queries, and primary/unique/foreign keys only as NOT ENFORCED metadata. Serverless SQL adds its own trap: it reads Delta at reader version 1 only, so a view that worked over older files may point at a Delta feature the engine cannot read once data is upgraded. Spark notebooks mostly carry over, but %run (shared variable context) and mssparkutils.notebook.run (isolated) behave differently and have to be told apart before logic is reassembled. A lift-and-shift that ignores these breaks loudly at best and silently at worst.

How the fleet maps each workload onto Fabric

The dedicated SQL pool becomes the Fabric Warehouse: T-SQL tables, views, stored procedures and functions, with full multi-table ACID transactions and MERGE generally available, which is why hand-rolled SCD Type 2 loops often collapse into a single MERGE statement, or, on the dbt flavor, into a dbt snapshot that maintains dbt_valid_from, dbt_valid_to and dbt_scd_id for you. Serverless SQL maps to the SQL analytics endpoint that every Lakehouse provisions automatically (a read-only T-SQL surface over /Tables Delta data on the same engine as the Warehouse) with OPENROWSET(BULK ...) in the Warehouse covering the query-in-place cases. Spark notebooks become Fabric notebooks running PySpark on the default Live Pool against Delta tables. Synapse pipelines become Fabric Data pipelines: the dependsOn activity graph, the If Condition / Switch / For Each / Until control flow and the success and failure paths all carry across, with linked services and datasets becoming Connections and a tumbling-window trigger becoming an interval-based schedule.

The build-test-run loop, and the gates

Both delivery flavors run through the same loop. On the dbt flavor the Builder produces models, sources, snapshots and tests against the Warehouse via the dbt-fabric adapter, with merge as the default incremental strategy and the DAG built from ref() and source(). On your framework, the Builder produces native Fabric artifacts (pipelines, notebooks, Warehouse procedures) in your own orchestration. The Test agent runs the rebuilt logic and the fleet iterates until green: a distribution clause rejected, a nesting limit hit, a serverless view pointing at a Delta v2 feature, each caught and corrected until row counts, current-version counts and checksums match the Synapse baseline. Three human gates bound the autonomy (assessment, design and promotion) so nothing reaches production without a person signing off. If your destination is the other platform instead, see Migrate Azure Synapse to Databricks; the workload mapping changes, the discipline does not.

How the fleet runs it

Azure Synapse to Microsoft Fabric, stage by stage.

For Azure Synapse → Microsoft Fabric, the same five stages run with the agents that handle this pair named and linked below. Inside the loop they iterate on their own; your experts hold the three gates.

dbt or your framework

Built your way, on Fabric.

Take Azure Synapse to Microsoft Fabric as an open-source dbt project (free to run), or built into the framework your team has trusted for years: same migration, your standards either way.

What gets generated

dbt
  • A dbt project targeting the Fabric Warehouse via the dbt-fabric adapter: models, sources, snapshots, tests, macros and generated docs.
  • Dedicated SQL pool logic ported into dbt models materialized as table (default), view or incremental, with the MPP physical design re-derived.
  • Snapshots for SCD Type 2 history (dbt_valid_from / dbt_valid_to / dbt_scd_id) and merge incremental models keyed on a unique_key.
  • Serverless OPENROWSET views re-expressed as dbt sources over OneLake data, with generic tests (unique, not_null, relationships) gating the build.

Pattern mapping

Each Azure Synapse pattern, mapped to a deliberate Microsoft Fabric target.

Source pattern
Microsoft Fabric · dbt
Microsoft Fabric · your framework
Synapse dedicated SQL pool table (hash / round-robin / replicated, 60 distributions)
dbt model on the Fabric Warehouse; MPP distribution removed and physical design re-derived
Fabric Warehouse table loaded via COPY INTO / CTAS; distribution keys not carried over
Dedicated pool stored procedure (8-level nesting cap, no INSERT...EXECUTE)
Logic decomposed into ordered models; side effects via dbt hooks / operations
CREATE PROCEDURE in the Warehouse, called from a pipeline Stored procedure activity
Hand-rolled SCD Type 2 history table
dbt snapshot (timestamp or check strategy) writing dbt_valid_from / dbt_valid_to
T-SQL MERGE in a Warehouse procedure, or Spark MERGE INTO in a Fabric notebook
Serverless SQL OPENROWSET over Delta (reader v1 only) / external tables
dbt source over OneLake data, or OPENROWSET(BULK ...) staged then modeled
SQL analytics endpoint views over a Lakehouse, or OPENROWSET(BULK ...) in the Warehouse
Serverless partitioned view / CETAS logical data warehouse
dbt models over Lakehouse Delta tables exposed through the SQL analytics endpoint
T-SQL views on the read-only SQL analytics endpoint over /Tables Delta data
Synapse Spark notebook (%run, mssparkutils, writes Delta)
Set-based logic re-expressed as dbt models; non-SQL steps stay in a notebook
Fabric notebook (PySpark) on the Live Pool; %run chains re-expressed for Fabric
Synapse pipeline + dependsOn activity graph
dbt DAG from ref() / source(); tests gate ordering; orchestration outside dbt
Fabric Data pipeline with dependsOn; success / failure paths preserved
Synapse linked services & datasets; tumbling-window trigger
dbt profiles/targets for the Warehouse connection; scheduling via Airflow or the dbt job
Fabric Connections; interval-based schedule replaces the tumbling window

Before & after

A real Azure Synapse snippet and what the builders generate on Microsoft Fabric.

synapse_dedicated_sql_pool/usp_load_dim_customer.sql legacy
-- Azure Synapse dedicated SQL pool (MPP, 60 distributions)
-- Hand-rolled SCD Type 2 load with a hash-distributed dimension.
CREATE TABLE dbo.DimCustomer
WITH (DISTRIBUTION = HASH(CustomerKey), CLUSTERED COLUMNSTORE INDEX)
AS SELECT * FROM dbo.DimCustomer_Existing;

CREATE PROCEDURE dbo.usp_load_dim_customer AS
BEGIN
    -- close changed current rows
    UPDATE d
    SET    d.ValidTo = GETDATE(), d.IsCurrent = 0
    FROM   dbo.DimCustomer d
    INNER JOIN stg.Customer s ON s.CustomerId = d.CustomerId
    WHERE  d.IsCurrent = 1
      AND (d.Email <> s.Email OR d.Segment <> s.Segment);

    -- insert new versions (no MERGE used; staged round-robin table)
    INSERT INTO dbo.DimCustomer (CustomerId, Email, Segment, ValidFrom, IsCurrent)
    SELECT s.CustomerId, s.Email, s.Segment, GETDATE(), 1
    FROM   stg.Customer s;  -- stg.Customer is DISTRIBUTION = ROUND_ROBIN
END;
dbt
snapshots/dim_customer.yml dbt
# dbt-fabric → Fabric Data Warehouse. SCD Type 2 becomes a dbt snapshot:
# the MPP distribution design is dropped; history is tracked declaratively.
snapshots:
  - name: dim_customer
    relation: source('staging', 'customer')
    config:
      unique_key: customer_id
      strategy: timestamp
      updated_at: updated_at
      # dbt maintains dbt_valid_from / dbt_valid_to / dbt_scd_id
# models/marts/dim_customer.sql, current view over the snapshot
# select * from {{ ref('dim_customer') }} where dbt_valid_to is null
The loop that closes itself

Iterated until green, and logged.

A failure on this pair doesn't go on a list. It goes back to the builders, and the loop re-runs until the suite passes.

synapse-to-fabric · iteration log build-test-operate
iter 1 Iter 1, FAIL: `DISTRIBUTION = HASH(...)` rejected; Fabric Warehouse has no MPP distribution clause. Stripped, re-derived layout.
iter 2 Iter 2, FAIL: dedicated-pool proc exceeded the 8-level nesting it relied on; logic flattened and SCD2 collapsed into one `MERGE`. Row counts still off by 14.
iter 3 Iter 3, FAIL: serverless `OPENROWSET` view pointed at a Delta v2 feature; re-pointed source through the SQL analytics endpoint over the Lakehouse.
iter 4 Iter 4, GREEN: DimCustomer parity. Row counts, current-version count and checksum match the Synapse baseline; delta 0.

Green on iteration 4, DimCustomer reaches row-count and checksum parity with the Synapse baseline.

Questions for this migration

Azure Synapse → Microsoft Fabric, answered.

Do my Synapse dedicated SQL pool distribution keys carry over to the Fabric Warehouse?

No, and they should not. Dedicated SQL pool is an MPP engine that spreads every table across 60 distributions, and you tune it with `HASH`, `ROUND_ROBIN` or `REPLICATE` distribution plus single-column partitioning. The Fabric Warehouse stores data as Delta tables in OneLake and has no distribution clause to set. The Architect re-derives the physical design rather than copying it: distribution choices, replicated-table decisions and the 60-distribution assumptions are removed, and the Builder lands the table with `COPY INTO`, `CTAS` or `INSERT`. Statistics behavior also differs, so any custom "create stats on every column" procedure from the dedicated pool is dropped, not ported.

What happens to serverless SQL `OPENROWSET` queries and external tables on Microsoft Fabric?

Serverless SQL pool queries files in place with `OPENROWSET`, external tables and views to build a logical data warehouse, and its Delta support is limited to reader version 1, so column renames, deletion vectors and v2 checkpoints are not readable there. On Fabric the go-forward pattern is the SQL analytics endpoint that every Lakehouse provisions automatically: a read-only T-SQL surface over the Lakehouse Delta tables in `/Tables`, on the same engine as the Warehouse, with no Delta v1 ceiling. Where the original logic queried raw files, the Builder either re-points it as a dbt source or uses `OPENROWSET(BULK ...)` in the Warehouse. The endpoint is read-only (no DML, limited DDL) so writes move into the Warehouse.

How do Synapse Spark notebooks move to Fabric notebooks, is `%run` still valid?

Both run Apache Spark over Delta, so most PySpark transformation logic carries over, but the chaining semantics differ enough to read carefully. In Synapse, `%run` copies the referenced notebook's cells into the caller and shares variable context, while `mssparkutils.notebook.run` calls it in isolation. The Documenter distinguishes the two because they propagate variables and side effects differently. On Fabric the notebook runs on a default Live Pool with no cluster sizing, and references to `mssparkutils` and Synapse-specific session config are re-expressed. PySpark notebooks are preferred over Python notebooks because they have the most complete Delta Lake support.

Can I lift-and-shift my Synapse pipelines into Microsoft Fabric unchanged?

Largely, but not literally. Synapse pipelines share the same JSON model as Azure Data Factory (activities, `dependsOn`, datasets, linked services, triggers) and Data Factory in Fabric is the next-generation successor, so the `dependsOn` graph, the control-flow activities and the success/failure branches map across cleanly. The renames matter: datasets and linked services become Connections, `Execute pipeline` becomes `Invoke pipeline`, and a tumbling-window trigger becomes an interval-based schedule. The Builder reproduces the activity graph natively rather than importing files blindly, and the Test agent confirms the rebuilt pipeline produces the same outputs before promotion.

Is Microsoft Fabric just Azure Synapse renamed?

No, that is an understandable oversimplification, not the reality. Synapse was a PaaS workspace bundling SQL pools, Spark and pipelines; Fabric is a SaaS platform that re-implements those capabilities as workloads on OneLake with one capacity-based commercial model. There is genuine lineage (each of the four Synapse workloads has a clear Fabric successor) but the architecture, packaging and behavior differ materially (capacity-based vs fixed pool sizing, starter pools, a different Spark runtime lineup). Microsoft positions Fabric as the go-forward platform. Treat it as a next-generation successor, not the same product with a new logo.

Let's talk

Ready to migrate Azure Synapse to Microsoft Fabric?

Tell us about your Azure Synapse estate and we'll run the assessment, the Surveyor scores it before you commit to Microsoft Fabric.

Plan my migration

A short form, no spam. We usually reply within one business day.

Plan my migration