SQL active geo-replication

SQL active geo-replication keeps a readable copy of an Azure SQL Database in another region. The primary database handles normal writes, and Azure continuously replicates committed changes to the secondary. If the primary region has a serious outage, an operator or application can fail over to the secondary so the database becomes writable. It is a disaster recovery feature for individual databases, not a magic guarantee that every app connection, login, firewall rule, or dependent service will move perfectly without planning.

Aliases: SQL active geo-replication, sql active geo replication, sql-active-geo-replication
Difficulty: intermediate
CLI mappings: 5
Last verified: 2026-05-24

Microsoft Learn

Microsoft Learn describes active geo-replication as a business continuity feature for Azure SQL Database that creates readable geo-secondary databases. If a regional disaster or large outage occurs, you can initiate geo-failover to a secondary database in another Azure region.

Microsoft Learn: Active Geo-Replication - Azure SQL Database (2026-05-24)

Technical context

In Azure architecture, SQL active geo-replication sits in the Azure SQL Database business continuity layer. It is configured per database between logical servers that may be in different Azure regions. The control plane tracks replication links, primary and secondary roles, partner region, replication state, and failover actions. The data plane continues serving database reads and writes while log records replicate asynchronously. Architects must also plan connection strings, DNS or failover routing, logins, Microsoft Entra access, firewall rules, private endpoints, monitoring, backups, and application retry behavior.

Why it matters

SQL active geo-replication matters because many systems cannot wait for a full restore when a region-scale incident affects a business-critical database. A readable secondary can support disaster recovery, reporting offload, and migration rehearsal, but it also introduces decisions about recovery point, recovery time, security symmetry, and application failover. Misconfigured replicas create a false sense of safety: the secondary may exist, yet users cannot sign in, private endpoints are missing, or the app still points at the failed primary. Treating this term seriously turns continuity from a screenshot into an exercised database recovery design, and it clarifies which recovery decisions remain manual so that later recovery assumptions are not brittle.

Where you see it

Signals, screens, and Azure surfaces where this term usually becomes operational.

Signal 01

The Azure SQL database Replicas or Geo-replication view shows primary and secondary roles, partner region, replication state, failover options, and readable secondary configuration, which operators rely on during recovery drills.

Signal 02

Azure CLI replication-link output exposes partner server, partner database, role, replication state, location, link ID, and the actions available for planned or forced failover, which serves as runbook evidence during drills.

Signal 03

Monitoring workbooks, activity logs, and incident runbooks reveal failover attempts, replication lag symptoms, connection failures, firewall mismatches, and post-failover validation results, which feed business continuity reviews and executive recovery reports after drills.

When this becomes relevant

Specific situations where this term helps solve real Azure design, operations, migration, security, reliability, cost, or governance problems.

Protect one critical Azure SQL Database with a readable regional secondary when restore-only recovery is too slow.
Rehearse database failover during business continuity exercises without waiting for a real regional outage.
Offload approved read-only reporting to a secondary while keeping write traffic on the primary database.
Migrate an application between regions by creating a secondary, validating access, then failing over during a planned window.
Compare active geo-replication with failover groups when an application needs coordinated failover for several databases.

Real-world case studies

Different enterprise-style examples that show the term being used to hit measurable objectives.

Case study 01

Ticketing platform survives a regional database interruption

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

A live-events ticketing platform depended on one Azure SQL Database for seat inventory. A previous regional incident forced a six-hour restore exercise that nearly canceled ticket sales.

Business/Technical Objectives

Reduce database recovery time from hours to under fifteen minutes.
Keep a readable secondary available for inventory verification reports.
Ensure buyers could authenticate after a regional failover.
Produce business continuity evidence for venue contracts.

Solution Using SQL active geo-replication

The database team configured SQL active geo-replication to a secondary server in another Azure region. They aligned firewall rules, Microsoft Entra administration, contained users, auditing, and private endpoints on both logical servers. The application already supported retry logic, so the release team added a controlled connection-switch step to the runbook. Azure CLI commands listed replication links, checked role and state, and captured evidence before and after quarterly drills. Reporting jobs were allowed to read seat inventory from the secondary, but write traffic stayed on the primary until failover was approved.

Results & Business Impact

Recovery drill time fell from 6 hours to 11 minutes.
Post-failover login validation succeeded for all tested buyer and operator roles.
Inventory reporting offload reduced primary read CPU by 18 percent during sales peaks.
Venue continuity audits accepted the CLI evidence package without follow-up findings.

Key Takeaway for Glossary Readers

Active geo-replication works when the database copy, identity path, network path, and application switch are tested as one recovery design.

Case study 02

Logistics firm rehearses region migration before peak season

Scenario

A logistics company wanted to move its dispatch application to a lower-latency region before holiday volume. The SQL database was the riskiest part of the migration window.

Business/Technical Objectives

Validate the new region with current data before the cutover weekend.
Limit write downtime to less than twenty minutes.
Keep dispatch supervisors able to review read-only data during rehearsal.
Avoid broad firewall access during temporary migration testing.

Solution Using SQL active geo-replication

Engineers created an active geo-replication secondary in the target region and connected a staged application slot to the readable copy for validation. Security teams mirrored private endpoints, firewall rules, and Entra access on the target logical server. During rehearsal, CLI scripts confirmed replication state, partner database, and role before supervisors reviewed dispatch data. On cutover weekend, the team paused writes, ran final checks, initiated failover, updated application configuration, and validated route assignment workflows. The old primary was kept as a secondary until the rollback window closed.

Results & Business Impact

Production write downtime was 13 minutes, below the 20-minute target.
Dispatch screen latency improved 34 percent for the main operations region.
No temporary public firewall opening was needed during validation.
Rollback readiness was preserved for 72 hours after cutover.

Key Takeaway for Glossary Readers

Active geo-replication can support planned regional migration when rehearsal, access parity, and rollback are built into the runbook.

Case study 03

Digital publisher protects subscriber access during traffic spikes

Scenario

A digital publisher's subscription database handled login status for breaking-news surges. Executives wanted a recovery path that did not depend only on backup restore during major events.

Business/Technical Objectives

Maintain a warm regional copy of the subscriber database.
Confirm login and entitlement users existed on both servers.
Use the secondary for approved read-only analytics outside emergencies.
Keep secondary cost visible to the media operations budget.

Solution Using SQL active geo-replication

The platform team configured active geo-replication for the subscriber database and created a secondary with a service objective sized for emergency write traffic, not just idle replication. They synchronized contained database users, verified Entra authentication, and routed audit logs from both servers to the same workspace. Analytics queries were moved to the readable secondary with resource limits so they did not mask failover readiness. Monthly CLI checks exported replication state, database edition, service objective, and partner server configuration for the operations review.

Results & Business Impact

Failover rehearsal completed in 8 minutes with subscriber login tests passing.
Primary database read load fell 23 percent after analytics moved to the secondary.
Budget reports separated replica cost from content-platform compute for the first time.
Two missing user mappings were found in rehearsal instead of during an outage.

Key Takeaway for Glossary Readers

A readable geo-secondary is useful insurance only when it is sized, secured, monitored, and exercised like the future primary.

Why use Azure CLI for this?

With a decade of Azure SQL operations behind me, I use Azure CLI for active geo-replication because recovery evidence must be fast, exact, and repeatable. CLI can list replication links, show partner servers, confirm roles, inspect replication state, and initiate planned or emergency failover without hunting through portal blades. It is also useful for comparing firewall rules, identities, private endpoints, and tags around both servers. During a real outage, teams need scripted checks and a known command path; they do not need five engineers debating which portal screen is current. The same commands support rehearsals, evidence capture, and rollback checks. Automated checks reduce mistakes when urgent recovery decisions are stressful.

CLI use cases

List active geo-replication links for a database and export partner region, role, replication state, and link ID.
Create or delete a replication link during controlled setup and teardown of a disaster recovery environment.
Initiate planned failover from a documented runbook after approval during a recovery drill or outage.
Compare server firewall rules, private endpoints, and Entra settings around primary and secondary logical servers.
Capture pre-failover and post-failover evidence for auditors, incident reviews, and business continuity reports.
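
The evidence-capture use case above can be sketched as a small shell function. This is a minimal sketch, not a definitive runbook step: the function name `capture_evidence`, the phase labels, and the file naming scheme are hypothetical, and only the `az sql db replica list-links` command comes from the mapped commands in this entry.

```shell
# Hypothetical helper: save the replication-link listing to a timestamped JSON file.
# $1 database, $2 server, $3 resource group, $4 phase label (for example pre or post),
# $5 output directory (defaults to the current directory).
capture_evidence() {
  evidence_file="${5:-.}/$1-$4-$(date -u +%Y%m%dT%H%M%SZ).json"
  # JSON output keeps partner IDs, role values, and replication states exact.
  az sql db replica list-links --name "$1" --server "$2" --resource-group "$3" \
    --output json > "$evidence_file" || return 1
  # Print the path so a runbook can attach the file to the drill record.
  echo "$evidence_file"
}
```

Calling it once before and once after a drill, with different phase labels, would leave two files that can be compared or handed to reviewers.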

Before you run CLI

Confirm both subscriptions, resource groups, logical server names, database names, and regions before touching replication links.
Use least privilege for inspection; creating links, deleting links, or failing over requires elevated Azure SQL permissions.
Treat failover commands as production-impacting because they can change the writable database role and application behavior.
Verify logins, users, firewall rules, private endpoints, and application connection switching before starting a planned drill.
Use JSON output for runbook checks so partner IDs, role values, and replication states are captured exactly.
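
Because failover commands are production-impacting, one possible safeguard is a wrapper that only prints the command unless an operator opts in explicitly. The sketch below assumes a hypothetical function name `planned_failover` and a hypothetical environment variable `CONFIRM_FAILOVER`; neither is part of Azure CLI.

```shell
# Hypothetical guard around the set-primary command from the mapped commands.
# $1 database, $2 secondary server, $3 secondary resource group.
planned_failover() {
  if [ "${CONFIRM_FAILOVER:-no}" != "yes" ]; then
    # Default path: show what would run, change nothing.
    echo "DRY RUN: az sql db replica set-primary --name $1 --server $2 --resource-group $3"
    return 0
  fi
  # Opt-in path: promote the secondary to primary.
  az sql db replica set-primary --name "$1" --server "$2" --resource-group "$3"
}
```

The default is deliberately the safe path, so a mistyped or half-finished runbook step prints a command instead of changing the writable role. A forced failover would additionally need the `--allow-data-loss` flag, which this sketch leaves out on purpose.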

What output tells you

Role tells you whether the local database is currently primary or secondary in the replication relationship.
Partner server and database fields identify the recovery target that must also have security and networking configured.
Replication state indicates whether the link is healthy enough to trust for planned recovery or reporting scenarios.
Location and resource IDs prove the regional boundary and help teams avoid failing over to the wrong environment.
Operation results show whether failover, link creation, or link deletion completed or needs additional validation.
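
The role and replication state fields described above can be turned into a simple readiness check. This is a sketch under stated assumptions: the function names are hypothetical, the link JSON is assumed to expose `role` and `replicationState`, only the first link is inspected, and `CATCH_UP` is treated as the healthy steady state. Confirm the field names and values against your CLI version before relying on it.

```shell
# Hypothetical helper: fetch role and replication state of the first link.
# $1 database, $2 server, $3 resource group.
link_status() {
  az sql db replica list-links --name "$1" --server "$2" --resource-group "$3" \
    --query "[0].[role, replicationState]" --output tsv
}

# Hypothetical rule: the local database is primary and the link is synchronizing.
# $1 role, $2 replication state.
link_ready() {
  if [ "$1" = "Primary" ] && [ "$2" = "CATCH_UP" ]; then
    echo "ready"
  else
    echo "not ready: role=$1 state=$2"
    return 1
  fi
}

check_link() {
  status="$(link_status "$1" "$2" "$3")" || return 2
  # The two values arrive whitespace-separated; word splitting is intended here.
  set -- $status
  link_ready "${1:-unknown}" "${2:-unknown}"
}
```

A non-zero exit code lets a drill script stop before any failover step when the link is seeding, suspended, or missing.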

Mapped Azure CLI commands

Azure SQL geo-replication link operations

direct-database-continuity

az sql db replica list-links --name <database> --server <primary-server> --resource-group <resource-group>

az sql db replica | discover | Databases

az sql db replica create --name <database> --server <primary-server> --resource-group <resource-group> --partner-server <secondary-server> --partner-resource-group <secondary-resource-group>

az sql db replica | provision | Databases

az sql db replica set-primary --name <database> --server <secondary-server> --resource-group <secondary-resource-group>

az sql db replica | operate | Databases

az sql server firewall-rule list --server <server> --resource-group <resource-group>

az sql server firewall-rule | discover | Databases

az sql db show --name <database> --server <server> --resource-group <resource-group>

az sql db | discover | Databases

Architecture context

Architecturally, active geo-replication is only one layer of the recovery story. I design it with paired application deployment, regional networking, identity, monitoring, and a tested connection-switch process. The secondary server should have the access paths users will need after failover, including logins, Microsoft Entra configuration, firewall rules, private endpoints, and diagnostic settings. Because replication is asynchronous, the business must accept a possible data-loss window. For groups of databases that need a stable listener and coordinated failover, I would evaluate failover groups instead of relying on unrelated per-database replicas. Dependent caches, jobs, and reporting tools must be included in the design, along with connection testing. Those decisions determine whether failover helps or merely moves the outage elsewhere.

Security

Security impact is direct because replication creates a second database copy in another region and often another logical server. The secondary must be protected with the same identity model, auditing, encryption expectations, firewall rules, private endpoint strategy, and privileged access controls as the primary. A disaster recovery drill can fail if logins are missing, server-level permissions differ, or network rules allow broader access than intended. Operators should verify Microsoft Entra authentication, database users, TDE status, auditing destinations, Defender settings, and break-glass procedures on both sides before depending on the replica. Security parity should be tested before every production drill, and access reviews must include both sides of the replication link.

Cost

Cost impact is direct because the geo-secondary is a separate database with its own compute and storage charges. Readable secondaries can offset reporting infrastructure, but they still require service-tier planning, monitoring, security operations, and sometimes private endpoint or data transfer costs. Oversizing the secondary wastes money, while undersizing it can make failover painful when the secondary becomes primary. FinOps teams should review replica count, service objectives, retention needs, monitoring volume, and whether the business value justifies active replication versus point-in-time restore, geo-restore, or failover groups. Replica spend should be tied to a named continuity requirement with a named owner and tracked approvals. Cost reviews should confirm the secondary is neither forgotten nor underpowered for recovery needs.

Reliability

Reliability impact is direct because active geo-replication is a continuity mechanism for individual databases. It can reduce recovery time compared with restore-only approaches, but it does not eliminate the need for testing. Replication lag, failover readiness, app retry behavior, DNS or configuration switching, and dependent services all affect real recovery. Operators should monitor replication state, test planned failover, document rollback, and confirm the secondary can become primary under pressure. If multiple databases must fail over together, uncoordinated active geo-replication links can create application consistency problems unless the architecture accounts for them. Monitoring should prove the secondary is usable, not merely present. Drills expose gaps before executives demand recovery during a crisis.

Performance

Performance impact appears in two places: normal replication behavior and post-failover workload capacity. Active geo-replication is asynchronous, so heavy write bursts can increase lag and widen the possible recovery point. Read-only workloads on the secondary can be useful, but they consume the secondary database's resources and may affect reporting responsiveness. After failover, the former secondary must handle production writes, app connections, and user concurrency. Operators should test failover performance using realistic traffic, watch replication lag, and size the secondary for the recovery role, not only for idle insurance. Read workloads should be tested so they do not hide failover weakness. Post-failover testing should measure user transactions, not only database availability under load.
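
One way to catch an undersized secondary early is to compare the service objective on both sides. The sketch below is an assumption-laden illustration: the function names are hypothetical, the secondary is assumed to use the same database name as the primary, and the `currentServiceObjectiveName` field is assumed to be present in `az sql db show` output. A matching objective shows parity only; it does not prove the size is adequate for production load.

```shell
# Hypothetical helper: read the current service objective of one database.
# $1 database, $2 server, $3 resource group.
service_objective() {
  az sql db show --name "$1" --server "$2" --resource-group "$3" \
    --query currentServiceObjectiveName --output tsv
}

# $1 database, $2 primary server, $3 primary resource group,
# $4 secondary server, $5 secondary resource group.
check_sizing_parity() {
  primary_slo="$(service_objective "$1" "$2" "$3")" || return 2
  secondary_slo="$(service_objective "$1" "$4" "$5")" || return 2
  if [ "$primary_slo" = "$secondary_slo" ]; then
    echo "parity: $primary_slo"
  else
    echo "mismatch: primary=$primary_slo secondary=$secondary_slo"
    return 1
  fi
}
```

A mismatch is not always wrong, since some teams size the secondary lower on purpose, but it should be a recorded decision, not a surprise found after failover.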

Operations

Operations teams manage active geo-replication by creating links, checking replication state, monitoring lag and availability, testing failover, documenting partner servers, and keeping access configuration aligned. Day-two work includes validating that both regions have the right firewall rules, private endpoints, logins, diagnostic settings, tags, and alert routing. During incidents, operators need to decide whether to wait, fail over, or preserve evidence before changing roles. Good runbooks include CLI commands, expected states, business approval steps, post-failover validation, and a plan for re-establishing protection after the old primary recovers. Operators also need current contacts for application owners and database approvers, reviewed regularly. Ownership, timestamps, and communication checkpoints keep disaster recovery actions auditable under pressure.

Common mistakes

Creating the secondary database but forgetting matching logins, users, firewall rules, or private endpoints on the partner server.
Assuming active geo-replication protects every database in an application with coordinated consistency by default.
Never testing failover, then discovering connection strings and app retries still depend on the failed primary region.
Sizing the secondary too small because it usually looks idle, causing poor performance after it becomes primary.
Confusing active geo-replication with backup retention, point-in-time restore, geo-restore, or failover groups.

Operator quick checks

Show replication links and verify partner region, role, replication state, and database names before any failover test.
Compare access paths on both logical servers, including firewall rules, private endpoints, Entra admin, and logins.
Confirm alert routing and audit destinations work for the secondary as well as the primary.
Review whether the application has one database or several that must fail over consistently.
Run a planned failover drill and record connection, login, performance, and rollback results.
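
The access-path comparison in the quick checks above can be partly scripted for firewall rules. This is a minimal sketch: the function names are hypothetical, and it assumes firewall rule output exposes `name`, `startIpAddress`, and `endIpAddress`. It covers firewall rules only, so private endpoints, Entra admin, and logins still need their own checks.

```shell
# Hypothetical helper: list firewall rules for one logical server in a stable order.
# $1 server, $2 resource group.
list_firewall_rules() {
  az sql server firewall-rule list --server "$1" --resource-group "$2" \
    --query "[].[name, startIpAddress, endIpAddress]" --output tsv | sort
}

# $1 primary server, $2 primary resource group,
# $3 secondary server, $4 secondary resource group.
compare_firewall_rules() {
  primary_rules="$(list_firewall_rules "$1" "$2")" || return 2
  secondary_rules="$(list_firewall_rules "$3" "$4")" || return 2
  if [ "$primary_rules" = "$secondary_rules" ]; then
    echo "firewall rules match"
  else
    echo "firewall rules differ"
    return 1
  fi
}
```

Sorting before comparing keeps the result independent of the order in which rules are returned.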

Questions to ask

What data-loss window is acceptable if the asynchronous replica must be promoted during an outage?
Which application settings, DNS records, or connection strings change when the secondary becomes primary?
Who can approve failover, and what evidence proves the primary region is impaired enough to switch?
Are security, auditing, and private networking equivalent on the partner server before failover?
How will protection be re-established after failover and after the original region recovers?
