Wednesday, August 06, 2025

Everything You Need to Know About On-Premise SQL Server Mirroring in Microsoft Fabric

Microsoft Fabric is rapidly changing the data landscape, bringing together analytics, data science, and business intelligence into a single, unified platform. One of its most exciting features for organisations with existing on-premises SQL Server deployments is Mirroring. This capability offers a near real-time, low-friction way to bring your transactional data into the Fabric ecosystem for immediate analytics, reporting, and AI workloads.  

At the time of writing this post, this feature is still in Preview. This means this feature cannot be used in Production scenarios.

But what exactly is it, how does it work, and what should you be aware of before diving in? 

Let's break it down.

What is SQL Server Mirroring in Microsoft Fabric?

At its core, SQL Server Mirroring in Fabric provides a near real-time, low-latency replication mechanism for your on-premises SQL Server databases directly into Microsoft OneLake, Fabric's unified data lake. It creates a "Mirrored Database" item in your Fabric workspace, which then exposes a SQL Analytics Endpoint and a Default Semantic Model. 

This means your live transactional data is instantly available for the below use cases.

  • Querying with T-SQL using the SQL Analytics Endpoint.
  • Building Power BI reports with Direct Lake mode for unparalleled performance.
  • Leveraging Spark notebooks for advanced analytics and transformations.
  • Creating operational dashboards that reflect the latest business transactions.

The magic behind this is that Fabric automatically uses Change Data Capture (CDC) for SQL Server 2016-2022 or a native change feed for SQL Server 2025 to intelligently track and synchronise changes, avoiding the need for complex ETL pipelines for initial data ingestion.

The Power of Mirroring: Key Benefits

  • You can say goodbye to nightly batch jobs. Mirroring ensures your analytics are always based on the freshest possible data.
  • No more manual pipeline creation for initial loads and incremental updates. Fabric handles the replication automatically.
  • Your SQL Server data lands directly in OneLake, making it accessible across all Fabric personas (Data Engineering, Data Science, Data Warehousing, BI).
  • The actual data replication process itself (the initial snapshot and ongoing CDC) does not consume your Fabric Capacity Units (CU). You only pay for the compute when you query or process the data in Fabric.
  • For supported tables, Power BI datasets automatically leverage Direct Lake, offering the performance of Import Mode with the freshness of DirectQuery.
  • The replication is designed to be low impact on your production SQL Server, using CDC mechanisms that are highly optimised.

What You Need to Know: Requirements & Limitations

While incredibly powerful, mirroring isn't a silver bullet, understanding its current requirements and limitations is crucial for a successful implementation.

On-Premises SQL Server Requirements -

  • SQL Server 2016 through 2022 (using CDC) and SQL Server 2025 (using native change feed).
  • Mirroring is supported only on the primary replica of an Always On Availability Group.
  • SQL Server on Windows is fully supported. SQL Server on Linux (2017 CU18+ to 2022) is also supported.
  • A Microsoft Data Gateway (Standard or VNET) is required for secure communication between Fabric and your on-premises SQL Server.

Database & Table Limitations:

  • Single Mirror Target: A database can only be mirrored to one Fabric workspace at a time.
  • No Redundant CDC (SQL Server 2025): If you're on SQL Server 2025, do not manually enable CDC on the source database; Fabric handles this for you.
  • No Delayed Transaction Durability: Databases with this setting cannot be mirrored.
  • 500 Table Limit: Each mirrored database can replicate a maximum of 500 tables. Fabric will pick the first 500 alphabetically if you choose "Mirror all data."
  • Unsupported Data Types: This is a big one! Tables containing columns with types like image, text, ntext, xml, timestamp/rowversion, sql_variant, geometry, geography, hierarchyid, json, vector, or User-Defined Types (UDT) cannot be fully mirrored. The presence of these types will cause the entire column, or sometimes the whole table, to be excluded.
  • Precision Loss: datetime2(7) and datetimeoffset(7) columns will lose their 7th digit of precision (becomes 6 digits) when mirrored to OneLake's Delta Lake format.
  • No Row/Column Filtering: Mirroring is an "all or nothing" proposition for selected tables. You cannot filter specific rows or exclude individual columns during replication. 
  • DDL Changes: Significant schema changes (like altering a Primary Key) will often trigger a full re-sync of the entire table, leading to higher latency for that specific table until the re-sync completes.

Measuring Capacity Usage: Understanding the Costs

  • Replication is FREE (Compute-wise) -- The initial snapshot and continuous Change Data Capture (CDC) processing itself does not consume your Fabric Capacity Units (CU). This is a significant advantage over other replication methods.
  • You Pay for Data Usage -- Your Fabric Capacity Units are consumed when you -

    • Query the mirrored data via the SQL Analytics Endpoint.
    • Interact with Power BI reports built on the mirrored data (especially in Direct Lake mode).
    • Process the data using Spark notebooks or other Fabric experiences.
  • Storage Costs -- You get a generous 1 TB of free mirroring storage for every 1 Capacity Unit (CU) you own. For example, an F64 capacity gets 64 TB of free storage for mirrored data. You only start paying for OneLake storage if you exceed this allowance or if your Fabric Capacity is paused (storage charges still apply).

How to Monitor ? 

Use the Microsoft Fabric Capacity Metrics App to track your CU consumption and storage usage for "Mirrored Database" items. This app will clearly show you what's driving your costs. You can also leverage Workspace Monitoring (via KQL Databases) to track MirroredDatabaseTableExecution logs for metrics like ProcessedBytes, ProcessedRows, and Latency to understand the operational health and volume of your replication.

Prerequisites to Getting Started with Mirroring

  • Verify your SQL Server version, Always On Availability Group configuration, and install a Data Gateway if needed.
  • Fabric will automatically enable CDC on your source database (or use the native change feed for SQL Server 2025). Ensure your SQL Server instance has the necessary permissions.
  • In your Fabric workspace, select "New > More Options > Mirrored Database (SQL Server)."
  • Provide your SQL Server connection details and gateway information.
  • Choose which tables you want to mirror. Fabric will automatically identify and flag any tables with unsupported data types.
  • Keep an eye on the replication status in Fabric and use the Capacity Metrics App to track usage.

Conclusion

On-premises SQL Server Mirroring in Microsoft Fabric is a game-changer for organisations looking to bridge the gap between their transactional systems and modern analytics. By providing near real-time, low-cost data ingestion directly into OneLake, it unlocks a world of possibilities for faster insights and more agile data operations. While you need to be mindful of its current limitations, the benefits of quick, seamless integration make it an invaluable tool in your Fabric journey.


No comments:

Medallion Architecture in Fabric

Medallion Architecture is a design pattern that has become the industry standard for Lakehouse environments. It is not a Fabric feature. It...