Even with a near-perfect migration plan, the “Last Mile” of an Azure SQL Managed Instance (SQL MI) deployment often reveals technical nuances that differ from on-premises SQL Server. In 2026, resolving these gaps is less about “fixing” and more about architectural alignment. This troubleshooting guide addresses the three most common hurdles that enterprise teams face post-migration, ensuring your cloud data estate remains performant and integrated.
Architect’s Insight: Solving Azure SQL MI Log Throughput Stalls
Common Azure SQL MI bottlenecks include the 105 MB/s transaction log rate cap and legacy SQL Agent proxy dependencies. Solve these by using Next-gen General Purpose storage (leveraging Azure Elastic SAN) and offloading OS tasks to Azure Automation via Managed Identities. For latency-sensitive apps, ensure your connection policy is set to Redirect rather than Proxy to bypass unnecessary gateway hops.
The 2026 Performance Breach: Why Your Migration is “Stalling”
In the 2026 cloud landscape, the cost of “waiting” for a migration to self-correct is higher than the cost of a Tier upgrade. If your instance is pinned at 100% log write governance, you are not just losing speed—you are losing the ability to compete in real-time. Upgrading to Next-gen General Purpose is no longer an option; it is a requirement for high-velocity data estates.
SQL Server Agent & SSIS: 2026 Automation Workarounds
For decades, SQL Server Agent and SQL Server Integration Services (SSIS) have been the workhorses of enterprise automation. While SQL MI provides a native Agent, certain legacy dependencies require modern workarounds. To maintain maximum development velocity during migration refactoring, ensure your IDE is fully optimized by resolving common SSMS IntelliSense not working issues that often surface in hybrid cloud environments.
Proxy Accounts & OS Commands
SQL MI Agent or SQL Server Agent in Azure SQL Managed Instance (MI) does not support local OS, CmdExec, or PowerShell proxies because the underlying operating system is abstracted.
- The Troubleshooting Step: Scan your legacy job steps for non-TSQL types.
- The Fix: Move to Azure Automation. By using the system-stored procedure
sp_invoke_external_rest_endpoint, you can trigger PowerShell and Python scripts directly from your Agent jobs. This allows for seamless execution while maintaining the security of Microsoft Entra Managed Identities.
SSIS Catalog (SSISDB) Integration
You can host your SSISDB directly on SQL MI, but for execution, the 2026 standard is to leverage the Azure-SSIS Integration Runtime (IR) within Azure Data Factory (ADF).
- Why it Matters: Traditional on-premises servers often suffer from resource contention when SSIS packages run on the same box as the DB. The Azure-SSIS IR allows for horizontal scaling and VNet injection, ensuring your packages have the “line of sight” to your data without stealing vCores from your primary instance.
Maintenance Plan Modernization
Legacy SSMS Maintenance Plans are frequently incompatible with the PaaS architecture.
- The Solution: Transition to SQL Agent Jobs running T-SQL-based scripts. Furthermore, for data integrity, many architects are enabling Azure SQL Database Ledger to provide a cryptographically verifiable audit trail of all changes.
Linked Servers: Connecting the Hybrid Cloud Estate
In a hybrid cloud architecture, the ability to query across environments is non-negotiable. While SQL MI supports Linked Servers, networking in 2026 requires stricter Zero-Trust alignment.
DNS & Identity Resolution
A common “last mile” blocker is the inability to resolve on-premises FQDNs, resulting in the dreaded “Server not found” error.
- The Solution: Implement an Azure Private DNS Resolver. This service acts as a bridge, allowing your SQL MI to “talk” to your local data center using human-readable names rather than brittle, hardcoded IP addresses.
- Connectivity Tip: Use the PowerShell command
Test-NetConnection -ComputerName [YourServer] -Port 1433from a VM within the same VNet to verify the network path before configuring the Linked Server.
Distributed Transactions (MSDTC)
While SQL MI supports distributed transactions via Server Trust Groups, the 2026 pivot for many organizations is moving toward Saga patterns or asynchronous messaging using Azure Service Bus. This eliminates the performance drag caused by synchronous distributed locks across high-latency hybrid links.
Log Throughput: Overcoming the 105 MB/s Governance Limit
A frequent performance bottleneck in PaaS is Transaction Log Rate Governance. Standard General Purpose instances often hit a hard log write limit of 105 MB/s.
Choosing the Right Tier: Performance vs. Governance Limits
Selecting the correct service tier is the most direct way to bypass governance-related “Migration Gaps.” Use the following comparison to align your workload with the appropriate Azure SQL MI resource limit:
| Feature | General Purpose (Legacy) | Next-gen General Purpose | Business Critical |
|---|---|---|---|
| Best Use Case | Budget-friendly dev/test | High-IOPS Enterprise Apps | Ultra-low latency (<2ms) |
| Log Throughput | Capped at 105 MB/s | Up to 155 MB/s+ | Up to 96 MB/s per database |
| IOPS Limits | Linked to vCore count | Independent (Up to 80k) | Linked to vCore count |
| Storage Tech | Remote Azure Premium Storage | Azure Elastic SAN | Local Direct-Attached SSD |
| Cost Profile | Baseline | Optimized Performance/Cost | Premium Performance |
Breaking the Throughput Ceiling
To identify if you are being throttled, check your resource stats:
SELECT TOP 10 end_time, avg_log_write_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
If avg_log_write_percent is consistently at 100%, you have reached the limit. In 2026, the architectural solution is the Next-gen General Purpose tier. This hardware evolution leverages Azure Elastic SAN to decouple IOPS from vCore count, providing the throughput needed for high-velocity data loads without the high cost of the Business Critical tier.
Below is the Master Automation Script, divided into the three critical areas of your 2026 deployment: Log Throughput Monitoring, Cloud-Native Agent Automation, and Resumable Maintenance.
1. The “Governance Watcher”: Monitoring Log Throughput
This script identifies if your instance is being throttled by the 105 MB/s log write cap. In 2026, consistent 100% values here are your signal to upgrade to Next-gen General Purpose.
/*
SQL MI GOVERNANCE WATCHER (2026)
Check for Transaction Log Rate Throttling
*/
SELECT
end_time,
avg_cpu_percent,
avg_log_write_percent, -- If this is ~100%, you are hitting the log cap
avg_data_io_percent,
(SELECT Max(v) FROM (VALUES (avg_cpu_percent), (avg_log_write_percent), (avg_data_io_percent)) AS value(v)) AS [Max_Resource_Pressure]
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
2. The “Bridge”: Triggering Azure Automation via SQL Agent
Since SQL MI doesn’t support local OS commands (CmdExec), use this T-SQL script within a SQL Agent Job step to trigger an Azure Automation Webhook. This is the 2026 standard for executing PowerShell or Python scripts.
/* SQL MI TO AZURE AUTOMATION BRIDGE
Requires: Database Scoped Credential for Managed Identity
*/
DECLARE @url NVARCHAR(4000) = N'https://<your-automation-webhook-url>';
DECLARE @payload NVARCHAR(MAX) = N'{"Action": "RunMaintenance", "Instance": "' + @@SERVERNAME + '"}';
DECLARE @response NVARCHAR(MAX);
EXEC sp_invoke_external_rest_endpoint
@method = 'POST',
@url = @url,
@payload = @payload,
@headers = N'{"Content-Type":"application/json"}',
@timeout = 30,
@response = @response OUTPUT;
-- Verify the call was accepted (Status 202)
SELECT @response as Webhook_Response;
3. The “Resumable Rebuild”: Staying Under Log Limits
To avoid filling up your transaction log or hitting the throughput governor during index maintenance, use this Resumable Rebuild script. It allows you to pause maintenance if log pressure becomes too high.
/* 2026 RESUMABLE INDEX MAINTENANCE
Prevents log throughput bottlenecks by allowing manual pause/resume.
*/
-- Start a resumable rebuild
ALTER INDEX [YourIndexName] ON [YourTableName]
REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 60 MINUTES);
-- Monitor progress of resumable operations
SELECT
name,
percent_complete,
state_desc,
last_pause_time
FROM sys.index_resumable_operations;
/* TO MANUALLY PAUSE IF LOG GOVERNANCE HITS 100%:
ALTER INDEX [YourIndexName] ON [YourTableName] PAUSE;
*/
Implementation Guide (Summary)
- Step 1: Run the Governance Watcher to confirm if your performance issues are caused by the 105 MB/s log limit.
- Step 2: If you have legacy
CmdExecjobs, wrap the logic in an Azure Automation Runbook and trigger it using the Bridge Script. - Step 3: Refactor all large index maintenance to use Resumable Rebuilds to keep log growth in check.
Accelerated Database Recovery (ADR): The 2026 Default
In Azure SQL Managed Instance, the most critical takeaway for 2026 is that Accelerated Database Recovery (ADR) is always enabled by default and cannot be disabled. Unlike on-premises SQL Server, where manual toggling is common, Azure SQL MI treats ADR as a core architectural pillar to ensure high availability and near-instantaneous rollbacks.
How to Verify ADR Status
While you do not need to enable it, you should verify that it is active and monitoring your version store correctly. Use the following T-SQL script to check its status:
/* VERIFY ADR STATUS ON SQL MI */
SELECT name, is_accelerated_database_recovery_on
FROM sys.databases
WHERE name = 'YourDatabaseName';
Implementing these ADR optimizations is a key pillar of Azure SQL Managed Instance Governance. For organizations looking to maximize their Cloud ROI, these technical refinements often precede a shift to Enterprise-grade Monitoring Solutions or Premium Azure Support Tiers. By stabilizing the Persistent Version Store, you reduce the need for emergency SQL Consulting Services, allowing your team to focus on high-value Database Architecture and Cloud Financial Management (FinOps).
Troubleshooting PVS Growth and Log Truncation
If you are investigating ADR due to performance lag, the issue is likely Persisted Version Store (PVS) growth rather than the feature being disabled. If the ADR cleaner is “stuck” or the transaction log is not truncating despite ADR being active, use the following diagnostic script to check the health of the cleaner:
/* DIAGNOSE PVS SIZE AND CLEANER HEALTH */
SELECT
DB_NAME(database_id) AS DatabaseName,
persistent_version_store_size_kb / 1024 AS PVS_Size_MB,
pvs_off_row_page_skipped_oldest_aborted_xdesid AS AbortedTransactionBacklog,
online_index_atp_cleanup_status_desc AS CleanupStatus
FROM sys.dm_tran_persistent_version_store_stats;
Solving PVS Bloat: 2026 ‘Last Mile’ Tips for Accelerated Database Recovery (ADR)
Accelerated Database Recovery (ADR) is a foundational feature in Azure SQL Managed Instance that ensures near-instantaneous rollbacks and rapid recovery. However, unlike SQL Server on-premises, where you must manually enable it, ADR is “Always-On” in Managed Instance. This means ALTER DATABASE commands are redundant, but managing the underlying Persistent Version Store (PVS) is critical to preventing storage stalls.
1. The “Multi-Threaded Cleaner” Configuration
In 2026, the architectural standard for high-concurrency SQL MI instances is to ensure the ADR cleaner isn’t bottlenecked by a single thread. If you notice the PVS size growing despite low active transaction counts, use the following T-SQL to verify and optimize your cleaner thread count:
/* 2026 ARCHITECT'S OPTIMIZATION: ADR CLEANER THREADS
Note: Increasing thread count beyond 1 is recommended for high-volume
Enterprise SQL workloads where PVS bloat is detected.
PREREQUISITE: Ensure your SQL MI vCore count is >= 4 before increasing.
Setting this to 2 or 4 on a 2-vCore instance can lead to CPU starvation.
*/
-- Step 1: Check current ADR cleaner thread count and PVS health
SELECT
conf.name AS [Configuration],
conf.value_in_use AS [Current_Threads],
pvs.pvs_size_kb / 1024.0 AS [PVS_Size_MB],
pvs.oldest_active_xid AS [Oldest_Transaction_ID]
FROM sys.configurations conf
CROSS JOIN sys.dm_tran_persistent_version_store_stats(DB_ID()) pvs
WHERE conf.name = 'ADR Cleaner Thread Count';
-- Step 2: Optimize for 2026 multi-threaded recovery (High-vCore instances only)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Note: In 2026, set to 2 or 4. Do not exceed the number of logical CPUs.
EXEC sp_configure 'ADR Cleaner Thread Count', 2;
RECONFIGURE;
GO
Monitoring these Persistent Version Store metrics in real-time allows Database Administrators to distinguish between a temporary transaction surge and a systemic storage bottleneck. By integrating these health checks into your standard Enterprise Monitoring Dashboards, you ensure that ADR remains an asset rather than a storage liability.
2. Monitoring PVS (Persistent Version Store) Health
By default, ADR moves the version store from tempdb to the user database. While this improves tempdb stability, it means a single uncommitted transaction from a SQL Agent Job or a Linked Server session can cause the PVS to bloat, potentially filling your primary data drive. Always monitor sys.dm_tran_persistent_version_store_stats to track the “oldest_active_transaction_id.”
3. The “Next-gen GP” Connection
For workloads with high versioning (massive DELETE/UPDATE operations), Next-gen General Purpose is the preferred tier. Its Azure Elastic SAN backend handles the asynchronous I/O required for PVS cleanup significantly better than the legacy “Classic” GP tier, which often suffers from log-write stalls during heavy cleanup cycles.
Azure SQL Managed Instance ADR Troubleshooting and PVS Optimization
| Symptom | Primary Cause | 2026 Recommended Fix |
|---|---|---|
| Rapid Storage Growth | Long-running active transaction preventing PVS truncation. | Check sys.dm_tran_database_transactions and kill blockers. |
| Slow PVS Cleanup | Single-threaded cleaner bottleneck on high-volume instances. | Increase ADR Cleaner Thread Count configuration. |
| High Log Write Latency | IOPS limit hit on transaction log during heavy versioning. | Toggle the IOPS Slider in the Azure Portal (Next-gen GP). |
Architect’s Insight: Mastering Azure SQL MI Version Store Governance
Because Accelerated Database Recovery (ADR) is managed by the Azure platform, your troubleshooting focus should shift from “activation” to version store governance. If the Persisted Version Store (PVS) size exceeds your instance’s allocated storage, it can severely impact overall IOPS performance. In 2026, the best practice remains refactoring large-scale data modifications into batches to allow the ADR cleaner to cycle frequently and reclaim space.
Batching & Minimal Logging
To stay under the governor limits, refactor large INSERT...SELECT statements into smaller, manageable batches. Utilizing Resumable Index Rebuilds (RESUMABLE = ON) is also critical in 2026 to ensure that maintenance windows don’t exceed log throughput quotas. To optimize your 2026 data estate, you must move away from “all-at-once” operations which trigger the 105 MB/s log governor.
Below is the T-SQL for implementing Smart Batching and Resumable Maintenance.
1. Smart Batching (INSERT…SELECT Refactoring)
This script uses a WHILE loop to move data in chunks. This ensures the transaction log has time to truncate and prevents your Log Write Percent from hitting 100%.
/* 2026 SMART BATCHING TEMPLATE
Prevents Log Throughput Governance Stalls
*/
DECLARE @BatchSize INT = 5000; -- Adjust based on row width
DECLARE @RowsAffected INT = 1;
WHILE (@RowsAffected > 0)
BEGIN
INSERT INTO [TargetTable] WITH (TABLOCK) -- TABLOCK encourages minimal logging
SELECT TOP (@BatchSize) *
FROM [SourceTable] AS src
WHERE NOT EXISTS (
SELECT 1 FROM [TargetTable] AS tgt
WHERE tgt.ID = src.ID
);
SET @RowsAffected = @@ROWCOUNT;
-- Optional: Artificial delay to let the Log Cleaner catch up
-- WAITFOR DELAY '00:00:01';
END
2. Resumable Index Rebuilds
In 2026, running a standard REBUILD on a multi-terabyte table is an architectural risk. Use the RESUMABLE syntax to stay within your maintenance window quotas.
/* RESUMABLE MAINTENANCE (2026 STANDARD)
Keeps Log Throughput within Governance Limits
*/
-- Start the rebuild with a 60-minute hard cap
ALTER INDEX [IX_LargeTable_Column] ON [dbo].[LargeTable]
REBUILD WITH (
ONLINE = ON,
RESUMABLE = ON,
MAX_DURATION = 60 MINUTES
);
-- If the log limit is hit, the operation pauses. Check status here:
SELECT
name,
percent_complete,
state_desc,
last_pause_time
FROM sys.index_resumable_operations;
-- To manually resume after log pressure drops:
-- ALTER INDEX [IX_LargeTable_Column] ON [dbo].[LargeTable] RESUME;
Strategic TCO and Azure SQL Managed Instance Cost Optimization
Successfully resolving migration gaps is only the first step; the next is Azure SQL Managed Instance Cost Optimization. For large-scale environments, balancing performance with budget requires precise Next-gen GP Performance Benchmarking to ensure you aren’t over-provisioning vCores just to hit IOPS targets. Many organizations find that while they can solve immediate issues internally, complex architectures often benefit from specialized SQL MI Migration Consulting Services to ensure Day-1 production stability. To maintain this health post-migration, integrating Enterprise SQL Monitoring Tools is essential for tracking long-term trends in log throughput and version store growth that native tools might miss.
1. Log Write Throughput & the “Log Rate Governor”
Many DBAs encounter unexpected slowness during bulk loads. This is often not a disk issue, but a Log Rate Governance limit (RBIO_RG_STORAGE).
- The 2026 Reality: SQL MI limits log throughput based on vCore count (e.g., 4.5 MiB/s per vCore).
- The Fix: Transition to the Next-gen General Purpose tier. Unlike the legacy GP tier, the Next-gen architecture allows for higher throughput and decoupled IOPS, significantly reducing
WRITELOGwaits during migration cutovers.
2. Re-Architecting Linked Servers (Private Link & Entra ID)
Legacy linked servers using NTLM often fail in the cloud.
- The Fix: Modernize connections using Microsoft Entra ID (Azure AD) Managed Identities. In 2026, the gold standard is to utilize Private Link to bridge the connection between SQL MI and other Azure resources, ensuring all traffic stays within the Microsoft backbone with sub-millisecond latency.
3. SSIS Modernization via Azure Data Factory (ADF)
You cannot run SSIS packages “inside” the MI engine as you did on-premises.
- The Fix: Move the execution “engine” to the Azure-SSIS Integration Runtime (IR) in ADF.
- 2026 Best Practice: Store your SSISDB catalog on the Managed Instance, but trigger and monitor executions via the ADF portal for better scalability and FinOps tracking.
4. SQL Server Agent Object Dependencies
Agent jobs that rely on local file system paths (e.g., C:\Backups\) will fail.
- The Fix: Refactor jobs to use Azure Blob Storage (WASB/HTTPS). Use the Azure SQL Migration extension for VS Code to identify and script out these dependencies automatically before the final cutover.
Conclusion: Troubleshooting Azure SQL MI
Successful Azure SQL MI troubleshooting in 2026 requires a fundamental transition from legacy “on-box” thinking to a sophisticated model of cloud-native orchestration. By proactively solving Azure SQL MI log throughput stalls through an upgrade to the Next-gen General Purpose tier, you leverage Azure Elastic SAN to decouple IOPS from vCore limits.
Furthermore, mastering Azure SQL MI version store governance ensures that Accelerated Database Recovery (ADR) remains an asset rather than a bottleneck. When combined with Redirect connection policies to minimize latency and Azure Automation to replace restricted Agent proxies, you effectively eliminate the technical debt that slows down modern data estates. Resolving these Azure SQL MI migration gaps is the final step in building a resilient, high-velocity hybrid cloud environment that is fully optimized for 2026 and beyond.
Frequently Asked Questions (FAQs) Troubleshooting Azure SQL MI
How do I fix the 105 MB/s Azure SQL MI log throughput limit in 2026?
The 105 MB/s log write cap is a common governance bottleneck in the General Purpose tier. In 2026, the primary workaround is upgrading to the Next-gen General Purpose tier, which leverages Azure Elastic SAN to decouple IOPS from vCores. For high-velocity data loads, verify your throughput using sys.dm_db_resource_stats and consider enabling Accelerated Database Recovery (ADR) to optimize log space usage during bulk operations.
Why are my SQL Agent jobs failing after migrating to Azure SQL Managed Instance?
Most legacy SQL Agent failures in 2026 stem from Proxy account dependencies and unsupported CmdExec or PowerShell steps. Since SQL MI is a PaaS service without OS access, you must offload these tasks to Azure Automation. By using sp_invoke_external_rest_endpoint, you can trigger external scripts via high-performance webhooks while securing the connection with Microsoft Entra Managed Identities.
What is the best way to reduce Linked Server latency in a hybrid Azure environment?
To resolve “chattiness” and 15ms–25ms latency spikes in hybrid Linked Servers, toggle your Connection Policy from Proxy to Redirect. While Proxy is the legacy default for compatibility, Redirect mode allows the client to bypass the gateway and communicate directly with the backend node. Additionally, ensure you have an Azure Private DNS Resolver configured to resolve on-premises FQDNs without hardcoded IP addresses.
Can I run SSIS packages natively on Azure SQL MI in 2026?
While you can host the SSISDB on SQL MI, executing heavy packages locally can trigger resource governance. The 2026 architectural standard is to use the Azure-SSIS Integration Runtime (IR) within Azure Data Factory. This allows you to scale compute horizontally and maintain VNet-local connectivity to your Managed Instance without impacting the performance of your primary database engine.
Is Next-gen General Purpose better than Business Critical for IOPS-intensive workloads?
In 2026, Next-gen General Purpose is often the more cost-effective choice for IOPS-intensive workloads because it utilizes Azure Elastic SAN to provide up to 80,000 IOPS independently of vCore count. While the Business Critical tier still offers the lowest latency (1–2ms), Next-gen GP removes the traditional 105 MB/s log rate “Performance Breach” that previously forced many enterprises into more expensive tiers.
Architect’s Insight: The ROI of ADR Stability
From a Cloud Financial Management (FinOps) perspective, PVS bloat isn’t just a performance issue—it’s a hidden cost driver. Every GB of storage consumed by uncleaned versions on a Premium tier adds up. By implementing Multi-Threaded ADR Cleanup and moving to Next-gen General Purpose storage, organizations can achieve a 20-30% better performance-to-cost ratio, making it a critical step for Enterprise Cost Optimization in 2026.
📋 Download: Troubleshooting Azure SQL MI Checklist (2026)
Ensure your migration hits the finish line with our comprehensive “Last Mile” checklist. This guide covers everything from DNS resolution for Linked Servers to configuring the Next-gen IOPS slider.
- ✅ Automation: Audit SQL Agent for legacy proxies and move to Azure Automation.
- ✅ Networking: Deploy Private DNS Resolver for hybrid Linked Server connectivity.
- ✅ Throughput: Validate
sys.dm_db_resource_statsand enable Elastic SAN scaling. - ✅ Latency: Toggle connection policy to Redirect mode in the Azure Portal.

Add comment