Autonomous in the loop. Accountable at the gates. An agentic offering by Plainsight

Azure Synapse migration, run by an agent fleet.

The Documenter reads Azure Synapse Analytics (the logic, not just the labels) and writes it to the knowledge base. Then the fleet builds, tests and operates the result on Microsoft Fabric or Databricks, iterating until the tests pass, with your experts approving every gate.

What the Documenter reads in Azure Synapse Analytics

A Synapse workspace looks like one product. It’s really a collaboration boundary wrapped around four distinct engines, and treating it as a single thing is the first mistake a migration makes. The Documenter (the agent that builds the knowledge base before any rebuild begins) reads each workload on its own terms, then follows the data where it crosses between them. The point of the assessment gate is to read the logic, not the labels: the distribution key behind a fast join, the %run that quietly shares a notebook’s variables, the view that turns raw lake files into a warehouse. Microsoft positions Fabric as the path forward for all four, so every observation is recorded against its documented destination on the target. Autonomous in the loop, accountable at the gates.

Synapse pipelines

Synapse pipelines run the same engine and JSON model as Azure Data Factory, so the Documenter parses them as Azure Resource Manager resources: a pipeline file with activities[], parameters, and concurrency; execution and control activities distinguished by whether they carry an activity-level linked service; and the dependsOn graph with its Succeeded, Failed, Skipped, and Completed conditions reconstructed into both the happy path and the error branches. What it records as Synapse-specific is environmental. Synapse pipelines support only Azure and self-hosted integration runtimes (there is no Azure-SSIS IR here), and storage-event trigger metadata is referenced differently: through @trigger().outputs.body.fileName and @trigger().outputs.body.folderPath rather than the ADF @triggerBody() form. The Documenter notes those differences so the activity graph rebuilds faithfully on Data Factory in Fabric.
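As a minimal sketch of the graph reconstruction described above, the Python below reads a pipeline definition and separates its dependsOn edges into the happy path and everything else. The pipeline, its activity names, and the helper functions are hypothetical; only the dependsOn / dependencyConditions shape comes from the pipeline JSON model, and the linked-service check is the simple heuristic named above, not a complete classifier.

```python
import json

# Hypothetical pipeline definition, trimmed to the fields this sketch reads.
PIPELINE_JSON = """
{
  "name": "pl_load_sales",
  "properties": {
    "activities": [
      {"name": "SetRunDate", "type": "SetVariable"},
      {"name": "LoadSales", "type": "SqlServerStoredProcedure",
       "linkedServiceName": {"referenceName": "ls_warehouse",
                             "type": "LinkedServiceReference"},
       "dependsOn": [{"activity": "SetRunDate",
                      "dependencyConditions": ["Succeeded"]}]},
      {"name": "NotifyFailure", "type": "WebActivity",
       "dependsOn": [{"activity": "LoadSales",
                      "dependencyConditions": ["Failed"]}]},
      {"name": "Cleanup", "type": "Wait",
       "dependsOn": [{"activity": "LoadSales",
                      "dependencyConditions": ["Completed"]}]}
    ]
  }
}
"""


def dependency_edges(pipeline):
    """Return (upstream, downstream, condition) for every dependsOn entry."""
    edges = []
    for activity in pipeline["properties"]["activities"]:
        for dep in activity.get("dependsOn", []):
            for condition in dep["dependencyConditions"]:
                edges.append((dep["activity"], activity["name"], condition))
    return edges


def split_paths(edges):
    """Separate Succeeded edges from Failed / Skipped / Completed edges."""
    happy = [(up, down) for up, down, cond in edges if cond == "Succeeded"]
    other = [(up, down, cond) for up, down, cond in edges if cond != "Succeeded"]
    return happy, other


def execution_activities(pipeline):
    """Heuristic from the text: an activity-level linked service marks execution."""
    return [a["name"] for a in pipeline["properties"]["activities"]
            if "linkedServiceName" in a]


pipeline = json.loads(PIPELINE_JSON)
happy_path, other_branches = split_paths(dependency_edges(pipeline))
```

Here happy_path holds the single Succeeded edge, while other_branches carries the Failed and Completed edges that a rebuild would otherwise be likely to drop.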

Spark notebooks

Synapse notebooks run on serverless Apache Spark pools across PySpark, Scala, Spark SQL, and .NET for Spark. The Documenter reads the chaining semantics first, because they decide how state moves. The %run magic copies another notebook’s cells into the caller and shares the variable context (include semantics), and a mssparkutils.notebook.exit inside the referenced notebook stops the caller. By contrast, mssparkutils.notebook.run(...) invokes the other notebook as an isolated function with no shared variables. Confusing the two silently changes which variables and side effects propagate, so the Documenter distinguishes them explicitly. It also captures session configuration set through %%configure (honored under %run, ignored under notebook.run), the mssparkutils / notebookutils calls into file systems and secrets, and the Delta Lake reads and writes that make Spark the producer of the interchange format the SQL side consumes. Any .NET for Spark code is flagged as a rebuild driver, since it is unsupported on newer runtimes and Microsoft recommends Python or Scala.
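To make the distinction concrete, here is a rough sketch of how notebook cells could be scanned for the two chaining styles. The regular expressions, the function name, and the sample cells are all assumptions made for illustration; a real scanner would also need to handle comments, string concatenation, and paths built at runtime.

```python
import re

# "%run <path>" at the start of a line: include semantics, shared variables.
RUN_MAGIC = re.compile(r"^[ \t]*%run[ \t]+(\S+)", re.MULTILINE)

# mssparkutils / notebookutils .notebook.run("<path>", ...): isolated call.
RUN_CALL = re.compile(
    r"\b(?:mssparkutils|notebookutils)\.notebook\.run\(\s*['\"]([^'\"]+)['\"]"
)


def classify_references(cell_source):
    """Return (kind, target) pairs for every notebook reference in one cell."""
    refs = [("include", m.group(1)) for m in RUN_MAGIC.finditer(cell_source)]
    refs += [("isolated", m.group(1)) for m in RUN_CALL.finditer(cell_source)]
    return refs


# Hypothetical cell sources; they are only scanned as text, never executed.
cells = [
    "%run /shared/setup_context",
    'status = mssparkutils.notebook.run("/jobs/load_orders", 600)',
    "df = spark.read.format('delta').load(path)",
]
references = [ref for cell in cells for ref in classify_references(cell)]
```

Keeping the kind alongside the target records whether variables flow back into the caller, which the target path alone does not reveal.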

Dedicated SQL pools

The dedicated SQL pool is an MPP T-SQL warehouse spread across 60 distributions, and almost none of its physical design survives a literal port, so the Documenter reads the design intent, not just the DDL. It captures the distribution strategy per table: hash-distributed on a chosen column for large facts and dimensions, where co-located joins on the same key avoid data movement; round-robin for staging, where joins reshuffle; replicated for small dimensions cached on every node. It records single-column partitioning and the partition switching and merging that drive data lifecycle, and the statistics the optimizer depends on, including the absence of an sp_create_stats equivalent that pushes teams toward a custom stored procedure. It also flags the dialect subset that signals what has already been adapted away from full SQL Server: the eight-level nesting cap, no @@NESTLEVEL, no INSERT...EXECUTE, constrained scalar UDFs, and blob-type limits that force dynamic SQL to be chunked through EXEC(). Distribution keys, replication choices, and one-column partitioning are tuned to the 60-distribution architecture and must be re-derived for Fabric Warehouse or Databricks, never copied.
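The per-table distribution strategy can be read straight from the DDL. The sketch below is one hedged way to do that: the table definitions, function names, and regular expression are hypothetical, and the join check is a deliberate simplification that ignores data-type compatibility on the join columns.

```python
import re

DISTRIBUTION = re.compile(
    r"DISTRIBUTION\s*=\s*(?:HASH\s*\(\s*\[?(?P<col>\w+)\]?\s*\)"
    r"|(?P<kind>ROUND_ROBIN|REPLICATE))",
    re.IGNORECASE,
)


def distribution_of(ddl):
    """Return (strategy, hash_column) for one CREATE TABLE statement."""
    match = DISTRIBUTION.search(ddl)
    if match is None:
        return ("ROUND_ROBIN", None)  # the default when no option is given
    if match.group("col"):
        return ("HASH", match.group("col"))
    return (match.group("kind").upper(), None)


def join_avoids_movement(left, left_col, right, right_col):
    """Simplified check: replicated on either side, or hash on both join keys."""
    if "REPLICATE" in (left[0], right[0]):
        return True
    return left == ("HASH", left_col) and right == ("HASH", right_col)


# Hypothetical table definitions.
fact = distribution_of(
    "CREATE TABLE dbo.FactSales (SaleKey BIGINT, CustomerKey INT, RegionKey INT) "
    "WITH (DISTRIBUTION = HASH(CustomerKey), CLUSTERED COLUMNSTORE INDEX);"
)
customer = distribution_of(
    "CREATE TABLE dbo.DimCustomer (CustomerKey INT, Name NVARCHAR(100)) "
    "WITH (DISTRIBUTION = HASH(CustomerKey), CLUSTERED COLUMNSTORE INDEX);"
)
staging = distribution_of(
    "CREATE TABLE stg.Sales (SaleKey BIGINT, CustomerKey INT) "
    "WITH (DISTRIBUTION = ROUND_ROBIN, HEAP);"
)
region = distribution_of(
    "CREATE TABLE dbo.DimRegion (RegionKey INT, Name NVARCHAR(50)) "
    "WITH (DISTRIBUTION = REPLICATE);"
)
```

Recording the strategy next to the joins that rely on it captures the design intent, which is what has to be re-derived on the target rather than copied.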

Serverless SQL

Serverless SQL is an on-demand T-SQL engine that queries lake files in place, with no provisioned infrastructure and no stored table data, so the Documenter reads how the relational surface is constructed over raw files. It captures OPENROWSET queries with their wildcards and schema inference, and for Delta the BULK pointer at a root folder containing _delta_log, noting the constraint that serverless reads Delta reader version 1 only: column renames, deletion vectors, and v2 checkpoints are out of reach, which is itself a reason Microsoft points newer Delta work to the Fabric SQL analytics endpoint. It records external tables with their external data sources and file formats, CETAS exports, and the views that assemble a logical data warehouse over the lake, including the partitioned views used because external tables do not partition. It also documents what cannot exist here by design: no regular tables, no DML, no triggers or materialized views, no scalar UDFs, and no time travel. Lake tables created by Spark pools become queryable from serverless after a schema-sync delay, which is the bridge the Documenter follows from the notebook workload to this one.
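The Delta reader-version constraint can be checked from the transaction log itself. The following sketch assumes the commit files under _delta_log have already been fetched as text, oldest first; the function names and sample commits are hypothetical, checkpoints are ignored, and the version-1 limit is taken from the description above rather than verified independently.

```python
import json

SERVERLESS_MAX_READER_VERSION = 1  # constraint as stated in the text above


def reader_version(commit_text):
    """Return minReaderVersion from the last protocol action in one commit file."""
    version = None
    for line in commit_text.splitlines():
        if not line.strip():
            continue
        action = json.loads(line)  # each line of a commit file is one JSON action
        if "protocol" in action:
            version = action["protocol"]["minReaderVersion"]
    return version


def readable_from_serverless(commit_texts):
    """True if the most recent protocol action stays within the reader limit."""
    versions = [v for v in map(reader_version, commit_texts) if v is not None]
    return bool(versions) and versions[-1] <= SERVERLESS_MAX_READER_VERSION


# Hypothetical commits: table creation, then a later protocol upgrade.
commit_0 = (
    '{"protocol": {"minReaderVersion": 1, "minWriterVersion": 2}}\n'
    '{"metaData": {"id": "orders"}}'
)
commit_1 = '{"protocol": {"minReaderVersion": 2, "minWriterVersion": 5}}'

before_upgrade = readable_from_serverless([commit_0])
after_upgrade = readable_from_serverless([commit_0, commit_1])
```

A table that passes before a protocol upgrade and fails after it is exactly the kind of handoff risk worth recording against the serverless views that read it.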

Your estate, in minutes

The Surveyor scores the risk before you commit.

Before a single asset moves, the Surveyor inventories your Azure Synapse estate, scores every asset for complexity, and flags the drivers that make a migration risky. You get a prioritized backlog and a clear-eyed view of where the effort really sits, typically in the kinds of work shown below.

  • Every asset inventoried and complexity-scored
  • Risk drivers flagged early, not near the deadline
  • A prioritized, data-driven migration backlog

Complexity scorecard

synapse_estate.summary (legend: Low, Medium, High)

  • Dedicated pool coupling: 0.26
  • Hybrid Spark / SQL orchestration: 0.21
  • Cross-workload dependencies: 0.16
  • Workspace-specific linked services: 0.13

min to inventory · 0.89 avg confidence

Two destinations

Take Azure Synapse to Fabric or Databricks.

The knowledge base is target-agnostic: document once, then choose the platform that fits your estate and strategy.

Microsoft Fabric

What the fleet builds when you take Azure Synapse to Microsoft Fabric: in dbt or your own framework, iterated until green.

Migrate Synapse to Fabric →

Databricks

What the fleet builds when you take Azure Synapse to Databricks: in dbt or your own framework, iterated until green.

Migrate Synapse to Databricks →

Good questions

Azure Synapse migration, answered.

Is Azure Synapse Analytics being retired? When is the end-of-support date?

Microsoft has not announced a service-wide retirement or end-of-support date for Azure Synapse Analytics. It follows the Modern Lifecycle Policy and is listed as in support today. What is published is direction, not a shutdown: Microsoft positions Microsoft Fabric as the go-forward analytics platform and the recommended target for new work and migrations, and it ships first-party migration assistants for warehouses, Spark, and pipelines. We describe that direction honestly and never invent a date.

If there's no retirement date, why move off Synapse now?

Because new capability lands in Microsoft Fabric while Synapse settles into maintenance. The signal is in the specifics: Synapse Link for SQL and for Azure Cosmos DB are steered to Fabric Mirroring, .NET for Apache Spark is no longer supported on newer Synapse Spark runtimes, and individual Spark runtime versions carry dated end-of-support milestones that force an ongoing upgrade treadmill. The defensible argument is momentum and direction (Fabric is where investment is concentrated), not a service-wide doomsday date.

Isn't Microsoft Fabric just Synapse with a new name?

Not exactly. Fabric is a broader SaaS analytics platform built on OneLake that unifies data integration, engineering, warehousing, data science, real-time intelligence, and Power BI. Synapse's core capabilities each have a clear Fabric successor, and there is real lineage, but the architecture, storage model, and capacity-based commercial model differ materially from Synapse's per-resource provisioned and consumption models. Think next-generation successor, not the same product with a new logo. The Documenter reads the actual Synapse logic so the rebuild lands on the right Fabric workload rather than assuming a one-to-one swap.

How does the fleet handle logic that is split across the four Synapse workloads?

It follows the data across the seams. In a typical workspace a Spark notebook lands Delta, serverless SQL exposes views over it, a dedicated SQL pool holds the modeled warehouse, and pipelines orchestrate all three, so the end-to-end logic only exists across the boundaries. The Documenter traces datasets through the Spark to serverless to dedicated handoffs rather than documenting each pool in isolation, and records the type, Delta-version, and schema-sync mismatches at those handoffs as explicit risk items for the assessment gate.
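One way to picture this tracing is as lineage edges labeled by workload, where every edge that crosses a workload boundary becomes a handoff to review. The asset names, the workload labels, and the function below are hypothetical and only illustrate the idea.

```python
from collections import namedtuple

Edge = namedtuple("Edge", "source target")

# Hypothetical assets and the workload each one lives in.
WORKLOAD = {
    "nb_orders": "spark",
    "lake.orders_delta": "spark",
    "vw_orders": "serverless",
    "dw.FactOrders": "dedicated",
}

# Hypothetical lineage: notebook -> Delta table -> serverless view -> warehouse.
EDGES = [
    Edge("nb_orders", "lake.orders_delta"),
    Edge("lake.orders_delta", "vw_orders"),
    Edge("vw_orders", "dw.FactOrders"),
]


def handoffs(edges, workload):
    """Return the edges whose two ends sit in different workloads."""
    return [
        (e.source, e.target, workload[e.source], workload[e.target])
        for e in edges
        if workload[e.source] != workload[e.target]
    ]


seams = handoffs(EDGES, WORKLOAD)
```

In this example the notebook-to-Delta edge stays inside Spark, so only the Spark to serverless and serverless to dedicated edges are returned as seams.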

Do the four Synapse workloads each map to a different target?

Yes, and Microsoft documents a distinct Fabric destination for each. Synapse pipelines map to Data Factory in Microsoft Fabric; Spark notebooks to Fabric Data Engineering; dedicated SQL pools to Fabric Data Warehouse; serverless SQL to the Fabric SQL analytics endpoint over a Lakehouse or to the Warehouse for relational work. On the Databricks path the same logic re-lands as Delta tables, Lakeflow Declarative Pipelines, and Unity Catalog objects. The Documenter records which workload each piece of logic lives in so the rebuild (in dbt or in your framework) targets the right destination.
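The Fabric mapping above is simple enough to express as a lookup. In this sketch the workload keys, asset names, and grouping function are hypothetical; only the destination names come from the answer above.

```python
# Destination names as listed in the answer above; the keys are illustrative.
FABRIC_TARGET = {
    "pipelines": "Data Factory in Microsoft Fabric",
    "spark_notebooks": "Fabric Data Engineering",
    "dedicated_sql_pool": "Fabric Data Warehouse",
    "serverless_sql": "Fabric SQL analytics endpoint or Warehouse",
}


def fabric_targets(assets):
    """Group (asset_name, workload) pairs by their Fabric destination."""
    grouped = {}
    for name, workload in assets:
        grouped.setdefault(FABRIC_TARGET[workload], []).append(name)
    return grouped


# Hypothetical inventory entries.
plan = fabric_targets([
    ("pl_load_sales", "pipelines"),
    ("nb_orders", "spark_notebooks"),
    ("dw.FactOrders", "dedicated_sql_pool"),
    ("dw.DimCustomer", "dedicated_sql_pool"),
])
```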

Let's talk

Ready to migrate your Azure Synapse estate?

Tell us about your Azure Synapse Analytics landscape and we'll run the assessment, score the risk, and show you the path to Fabric or Databricks.

Plan my migration

A short form, no spam. We usually reply within one business day.