Cosmos DB continuous backup

Cosmos DB continuous backup is the backup mode that keeps a recent history of changes so that a supported account, database, or container can be restored to a selected point in time. Teams rely on it to recover from accidental writes, deletes, dropped containers, or account-level mistakes without rebuilding data from application logs. It determines the restore window, the restorable resources, the latest restorable timestamp, and the region, source account, and target account used during recovery. Before production changes, teams should know the owner, the affected data, the limits, and the verification path, so that developers, operators, security reviewers, and finance teams describe recovery in the same terms.

Aliases: No aliases mapped yet
Difficulty: fundamentals
CLI mappings: 4
Last verified: 2026-05-12

Microsoft Learn

Azure Cosmos DB continuous backup is the backup mode that supports point-in-time restore for supported Cosmos DB resources within a configured retention tier.

Microsoft Learn: Continuous backup with point-in-time restore - Azure Cosmos DB (2026-05-12)

Technical context

Technically, Cosmos DB continuous backup combines the account backup policy, the continuous tier, restorable resource metadata, latest backup timestamps, and a restore workflow that creates or updates a recovery target. It is configured through account backup settings, create or restore commands, portal restore screens, and infrastructure templates that select continuous mode. Verify it with account backup policy output, restorable database and container lists, latest backup time checks, restore-test results, and incident records. Key choices include the 7-day or 30-day tier, source region, restore timestamp, excluded resources, target account name, identity, and approval path. For each change, record the scope, region, identity, capacity, backup state, owner, and rollback trigger.
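The policy check described above can be sketched as a small shell helper. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes the account output exposes the policy type at backupPolicy.type and the tier at backupPolicy.continuousModeProperties.tier, with values such as Continuous, Periodic, Continuous7Days, and Continuous30Days. The function names describe_backup_policy and show_backup_policy are hypothetical.

```shell
# Sketch: summarize a Cosmos DB account's backup policy for a readiness review.
# Assumption: the two values come from `az cosmosdb show` output at
# backupPolicy.type and backupPolicy.continuousModeProperties.tier.

# Pure helper: turn the two field values into a one-line summary.
describe_backup_policy() {
  local policy_type="$1"
  local tier="$2"
  case "$policy_type" in
    Continuous)
      case "$tier" in
        Continuous7Days)  echo "continuous backup, 7-day restore window" ;;
        Continuous30Days) echo "continuous backup, 30-day restore window" ;;
        *)                echo "continuous backup, tier not reported" ;;
      esac ;;
    Periodic) echo "periodic backup, point-in-time restore not available" ;;
    *)        echo "unknown backup policy: ${policy_type:-empty}" ;;
  esac
}

# Hypothetical wrapper: read both fields with read-only queries, then summarize.
show_backup_policy() {
  local account="$1"
  local resource_group="$2"
  local policy_type
  local tier
  policy_type=$(az cosmosdb show --name "$account" --resource-group "$resource_group" \
    --query "backupPolicy.type" --output tsv) || return 1
  tier=$(az cosmosdb show --name "$account" --resource-group "$resource_group" \
    --query "backupPolicy.continuousModeProperties.tier" --output tsv) || return 1
  describe_backup_policy "$policy_type" "$tier"
}
```

Calling show_backup_policy with an account name and resource group prints one summary line that can be pasted into a review record. Keeping the interpretation in a separate pure function means the wording can be checked without Azure access.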

Why it matters

Cosmos DB continuous backup matters because recovery only works when the team knows the backup mode, restore window, timestamp, and target account before a data incident starts. It turns an abstract database concept into something teams can operate, secure, recover, and explain. When it is misunderstood, teams face missed restore windows, wrong recovery targets, unrecoverable corruption, delayed incident response, and outages that exceed recovery objectives. For glossary readers, the term shows where backup sits in the Cosmos DB model, which settings are safe to inspect, which changes require review, and which metrics, logs, or ownership records responders should check first. That keeps design reviews evidence-based.

Where you see it

Signals, screens, and Azure surfaces where this term usually becomes operational.

Signal 01

In the Azure portal, Cosmos DB continuous backup appears near the backup policy, restore, and restorable resources settings; operators confirm scope, environment, readiness, and whether the account is in production use.

Signal 02

In CLI, SDK, or IaC output, Cosmos DB continuous backup appears through backupPolicy, continuousTier, and restorable resources; those fields create repeatable review evidence for audits, incidents, handoffs, and pull requests.

Signal 03

In monitoring and support work, Cosmos DB continuous backup appears beside backup status, restore events, and activity logs; those signals connect symptoms to security, reliability, cost, and performance.

When this becomes relevant

Specific situations where this term helps solve real Azure design, operations, migration, security, reliability, cost, or governance problems.

  • When teams need to recover from accidental writes, deletes, dropped containers, or account-level mistakes without rebuilding data from application logs.
  • When the team must confirm the backup mode, restore window, timestamp, and target account before a data incident starts.
  • When architecture reviews, incidents, and support handoffs need production evidence for Cosmos DB continuous backup.
  • When Cosmos DB continuous backup decisions must be connected to security, reliability, cost, operations, and performance outcomes.

Real-world case studies

Different enterprise-style examples that show the term being used to hit measurable objectives.

Case study 01

Flash sale price recovery

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

PrairieMart, a regional retailer, accidentally published incorrect promotional prices to thousands of product documents during a weekend sale.

Business/Technical Objectives

  • Restore affected catalog data to the last known-good timestamp
  • Keep storefront downtime under 45 minutes
  • Preserve order records created after the mistake
  • Document recovery evidence for finance and audit

Solution Using Cosmos DB continuous backup

The platform team used Cosmos DB continuous backup to choose a restore timestamp just before the bad catalog write, after confirming that it fell inside the restorable window. Operators listed restorable resources for the product database, restored the affected containers into a temporary recovery account, and compared product documents against the live account. A controlled repair job copied only approved catalog fields back to production, leaving valid orders and inventory holds untouched. Azure Monitor, activity logs, and change records captured the restore timestamp, source account, target account, and validation queries. The runbook required finance approval before customer-facing prices were corrected. The team also added owner approval, validation evidence, and post-release monitoring for the flash sale price recovery workflow. Support notes captured rollback triggers, dashboard links, and escalation contacts so responders could act without tribal knowledge.

Results & Business Impact

  • Catalog repair completed in 32 minutes
  • No valid weekend orders were rolled back
  • Customer price complaints dropped 91 percent after correction
  • Audit received a complete timestamp and validation record

Key Takeaway for Glossary Readers

Continuous backup is most valuable when restore scope and validation steps are planned before accidental writes happen.

Case study 02

Patient scheduling rollback

Scenario

HarborLine Clinics found that an integration bug had overwritten appointment reminder preferences for several thousand patients.

Business/Technical Objectives

  • Recover preferences without deleting new appointments
  • Meet a two-hour patient communication recovery target
  • Prove restored data was limited to reminder settings
  • Avoid exposing restored patient records to extra support users

Solution Using Cosmos DB continuous backup

Engineers used continuous backup to restore the scheduling database to a separate account at the timestamp before the integration deployment. The recovery team compared preference fields by patient id and generated a narrow patch list instead of replacing complete documents. Private endpoints and temporary RBAC limited access to the restored account, while Key Vault controlled connection strings for the repair tool. Operators recorded restorable-resource output, latest backup time, validation query counts, and approval from the privacy office. After the patch, the temporary account was locked down and scheduled for deletion. The team also added owner approval, validation evidence, and post-release monitoring for the patient scheduling rollback workflow. Support notes captured rollback triggers, dashboard links, and escalation contacts so responders could act without tribal knowledge.

Results & Business Impact

  • Preferences were corrected in 74 minutes
  • New appointments and messages remained intact
  • Privacy review confirmed no broad support access was granted
  • The repair process became the template for future bad-write incidents

Key Takeaway for Glossary Readers

Continuous backup supports precise recovery when teams restore to a safe target and repair only the affected data.

Case study 03

Permit database delete response

Scenario

CivicWorks, a municipal permitting agency, lost a container after an automation script ran against the wrong environment.

Business/Technical Objectives

  • Restore deleted permit records for the affected date range
  • Keep public permit lookup available during recovery
  • Capture evidence for the change-control board
  • Reduce future delete-script blast radius

Solution Using Cosmos DB continuous backup

The operations group used continuous backup to find restorable databases and containers for the production account. They restored the deleted permit container into a new account, validated record counts by permit type, and redirected only internal review tools to the restored copy while public lookup stayed on cached summaries. A follow-up script rehydrated missing records into the original account after legal approval. The team added CLI prechecks for account, resource group, and environment tags before destructive jobs. Activity logs, restore output, and validation queries were attached to the post-incident review. The team also added owner approval, validation evidence, and post-release monitoring for the permit database delete response workflow. Support notes captured rollback triggers, dashboard links, and escalation contacts so responders could act without tribal knowledge.
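The CLI precheck mentioned in this solution could look something like the sketch below. It is illustrative only: the case study does not say how CivicWorks implemented its prechecks, so the tag key environment, the function names require_environment and precheck_resource_group, and the message wording are all assumptions.

```shell
# Sketch: block a destructive job unless the target carries the expected
# environment tag. Assumption: the tag key is "environment"; the case study
# does not name it.

# Pure check: compare the expected environment with the tag value found.
require_environment() {
  local expected="$1"
  local actual="$2"
  if [ -z "$actual" ]; then
    echo "BLOCKED: no environment tag found" >&2
    return 1
  fi
  if [ "$actual" != "$expected" ]; then
    echo "BLOCKED: expected '$expected' but target is tagged '$actual'" >&2
    return 1
  fi
  echo "OK: target is tagged '$actual'"
}

# Hypothetical wrapper: read the tag from the resource group, then apply the check.
precheck_resource_group() {
  local resource_group="$1"
  local expected="$2"
  local actual
  actual=$(az group show --name "$resource_group" \
    --query "tags.environment" --output tsv) || return 1
  require_environment "$expected" "$actual"
}
```

A destructive script would call precheck_resource_group first and stop on a non-zero exit status, so a wrong-environment run fails before any delete command is issued.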

Results & Business Impact

  • Deleted permit records were restored within the four-hour objective
  • Public lookup stayed available with no reported outage
  • Change-board evidence was accepted without rework
  • Precheck automation blocked two later wrong-environment script attempts

Key Takeaway for Glossary Readers

Continuous backup gives public-sector teams a practical recovery path when delete mistakes are paired with strong validation and change controls.

Why use Azure CLI for this?

Use the Azure CLI to confirm backup mode, latest restorable time, and restore scope so that a recovery decision does not depend on screenshots or memory.

CLI use cases

  • Confirm continuous backup mode and tier during readiness reviews.
  • Find the latest restorable timestamp during accidental-write incidents.
  • Document restore parameters before creating or validating a recovery account.

Before you run CLI

  • Confirm source account, region, API type, database, container, and timestamp in UTC.
  • Use read-only restorable-resource commands before any restore command.
  • Get business approval before restoring into or switching traffic to a target account.
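The UTC timestamp check in the list above can be enforced with a small guard before any command is built. This is a minimal sketch: the fixed shape YYYY-MM-DDTHH:MM:SSZ is an assumed house format for incident notes, not a documented CLI requirement, and the function name is_utc_timestamp is hypothetical. The guard checks shape only and does not validate calendar dates.

```shell
# Sketch: accept only timestamps shaped like 2026-05-10T14:30:00Z.
# Assumption: incident notes use this fixed UTC format. This is a shape check;
# it does not reject impossible dates such as month 19.
is_utc_timestamp() {
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-1][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9]Z)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

A runbook step might call is_utc_timestamp on the value copied from the incident ticket and refuse to continue on failure, which catches local-time values and missing time zones early.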

What output tells you

  • Account output shows whether the backup policy is continuous and which tier applies.
  • Restorable-resource output shows which databases and containers existed at the timestamp.
  • Restore output confirms the target account, region, status, and recovery operation tracking details.
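Once the output has been read, a responder still has to decide whether the incident timestamp is restorable. The sketch below classifies a candidate timestamp against a window. It assumes all three values have already been normalized to YYYY-MM-DDTHH:MM:SSZ; CLI output may use a different format and would need converting first. The earliest value is supplied by the caller, for example the start of the retention tier window, and is not looked up here. The names ts_key and restore_window_status are hypothetical.

```shell
# Sketch: classify a candidate restore timestamp against the restorable window.
# Assumption: all inputs are already normalized to YYYY-MM-DDTHH:MM:SSZ (UTC).

# Keep only the digits so fixed-width timestamps compare as plain integers.
ts_key() {
  printf '%s' "$1" | tr -cd '0-9'
}

# Usage: restore_window_status CANDIDATE EARLIEST LATEST
restore_window_status() {
  local candidate
  local earliest
  local latest
  local value
  candidate=$(ts_key "$1")
  earliest=$(ts_key "$2")
  latest=$(ts_key "$3")
  # Each normalized timestamp must reduce to exactly 14 digits.
  for value in "$candidate" "$earliest" "$latest"; do
    if [ "${#value}" -ne 14 ]; then
      echo "invalid-timestamp"
      return 1
    fi
  done
  if [ "$candidate" -lt "$earliest" ]; then
    echo "before-window"
  elif [ "$candidate" -gt "$latest" ]; then
    echo "after-latest-restorable"
  else
    echo "restorable"
  fi
}
```

The three-way result is deliberate: a timestamp that is too old calls for a different recovery plan, while one that is newer than the latest restorable time may only need a short wait.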

Mapped Azure CLI commands

Cosmos DB continuous backup CLI checks

direct
az cosmosdb show --name <account> --resource-group <resource-group>
az cosmosdb | discover | Databases
az cosmosdb sql retrieve-latest-backup-time --account-name <account> --resource-group <resource-group> --database-name <database> --container-name <container> --location <region>
az cosmosdb sql | discover | Databases
az cosmosdb sql restorable-resource list --instance-id <instance-id> --location <region> --restore-location <restore-region> --restore-timestamp <timestamp>
az cosmosdb sql restorable-resource | discover | Databases
az cosmosdb restore --account-name <source-account> --target-database-account-name <target-account> --resource-group <resource-group> --location <region> --restore-timestamp <timestamp>
az cosmosdb | protect | Databases
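Because the restore command is the only mutating entry in this list, one cautious pattern is to print it for review before anyone runs it. The sketch below builds the command text from the same parameters shown above and executes nothing. The function name print_restore_command and the sample values are hypothetical.

```shell
# Sketch: print (do not run) the restore command so it can be reviewed and
# attached to the change record before execution.
print_restore_command() {
  if [ "$#" -ne 5 ]; then
    echo "usage: print_restore_command SOURCE TARGET RESOURCE_GROUP REGION TIMESTAMP" >&2
    return 2
  fi
  local source_account="$1"
  local target_account="$2"
  local resource_group="$3"
  local region="$4"
  local timestamp="$5"
  # The region is quoted in the output because display names can contain spaces.
  printf 'az cosmosdb restore --account-name %s --target-database-account-name %s --resource-group %s --location "%s" --restore-timestamp %s\n' \
    "$source_account" "$target_account" "$resource_group" "$region" "$timestamp"
}
```

For example, print_restore_command src-acct recovery-acct rg-data "West US" 2026-05-10T14:30:00Z emits a single line that an approver can read and an operator can paste, which keeps the approved parameters and the executed parameters identical.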

Architecture context

Cosmos DB continuous backup belongs in the recovery architecture for accounts where accidental deletes, bad writes, migrations, or failed releases are realistic risks. Treat it as part of the production runbook, not just a backup setting. The design needs to document the retention tier, supported APIs, restorable databases and containers, restore target behavior, region availability, and who can initiate a restore. Application teams should know how to capture the incident timestamp, verify the latest restorable time, restore into a clean account when required, and reconnect consumers safely. Platform teams should pair it with RBAC, diagnostic logs, private networking, and periodic restore drills. Continuous backup gives precision, but it only helps when restore evidence and cutover decisions have already been practiced.

Security

Security for Cosmos DB continuous backup starts with knowing who can inspect backup state, initiate restore operations, read restored data, and approve access to a recovered account. Review RBAC, data-plane permissions, keys, managed identities, firewall rules, private endpoints, encryption, diagnostics, and backup access. Avoid broad admin access just because a team needs to troubleshoot one resource or feature. Sensitive data can appear in query output, logs, support tickets, exports, or downstream processors. Operators should prefer read-only discovery, store secrets in approved locations, and document every emergency change. The safest design proves who can read data, who can change configuration, and how denied access is logged and reviewed.

Cost

Cost for Cosmos DB continuous backup comes from continuous backup tier selection, storage history, restored account capacity, extra regions, test restores, monitoring, and the temporary cost of recovery environments. Some spending is direct, while other costs appear as retries, duplicate processing, larger logs, extra environments, migration effort, or staff time during investigations. Review budgets, tags, expected usage, retention, alert thresholds, and change windows before scaling or enabling new behavior. Compare the cost of prevention, monitoring, and testing with the cost of an outage or data repair. The safest cost review ties spending to owner, workload value, measured demand, and rollback plan. Include both steady-state and incident-driven costs in the review.

Reliability

Reliability for Cosmos DB continuous backup depends on backup mode selection, latest restorable timestamp, regional backup availability, restore drills, application cutover planning, and clear recovery ownership. Define the expected failure mode before production use, including what happens during regional incidents, throttling, expired credentials, schema drift, blocked network paths, or restore activity. Monitor health, latency, request units, errors, retry rate, backlog, and stale-data indicators rather than trusting a single success message. Test rollback, restore, failover, replay, or reprocessing steps where they apply. A reliable runbook names the owner, required evidence, escalation path, and point where rollback is safer than live repair. Retest after meaningful platform, schema, identity, or region changes.

Performance

Performance for Cosmos DB continuous backup is measured through restore duration, latest restorable timestamp lag, recovered account validation time, application cutover time, and post-restore read and write latency. Tune only after confirming the real bottleneck, because identity, networking, client retries, partition choice, query shape, consistency, or quota can mimic platform slowness. Use baseline metrics before and after every significant change. Test peak load, failure recovery, and representative data rather than happy-path samples. A good performance plan states the target, measurement window, acceptable tradeoff, and rollback trigger so speed improvements do not damage reliability, security, or cost control. Keep the accepted baseline with the change record.

Operations

Operationally, Cosmos DB continuous backup needs documented restore drills, timestamp capture procedures, restorable-resource checks, owner approvals, and a cleanup plan for restored accounts. Keep portal location, CLI discovery commands, dashboards, alerts, IaC source, change history, and support ownership close to the runbook. Capture before-and-after evidence with tenant, subscription, resource group, region, owner, timestamp, and environment. Separate read-only inspection from mutating or destructive actions so responders do not improvise under pressure. Good operations make the term searchable, auditable, and explainable across engineering, support, security, and finance handoffs. Store evidence where incident responders can find it without developer access or tribal knowledge during high-pressure incidents.
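The before-and-after evidence described above can be captured as one structured line per step. This is a minimal sketch: the key=value layout, the field order, and the function name evidence_line are assumptions, and the values are supplied by the caller rather than looked up. Values are assumed to contain no spaces, so short region names such as westus2 are used.

```shell
# Sketch: emit one key=value evidence line for an incident or change record.
# Args: PHASE TENANT SUBSCRIPTION RESOURCE_GROUP REGION OWNER ENVIRONMENT TIMESTAMP
# Assumption: values contain no spaces (for example, region "westus2").
evidence_line() {
  if [ "$#" -ne 8 ]; then
    echo "usage: evidence_line PHASE TENANT SUBSCRIPTION RESOURCE_GROUP REGION OWNER ENVIRONMENT TIMESTAMP" >&2
    return 2
  fi
  printf 'phase=%s tenant=%s subscription=%s resource_group=%s region=%s owner=%s environment=%s timestamp=%s\n' \
    "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8"
}
```

Writing one line with phase=before and another with phase=after around a change gives responders a searchable pair of records that does not depend on screenshots.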

Common mistakes

  • Starting restore work without an exact UTC timestamp from logs or incident notes.
  • Assuming every API, region, or resource supports the same restore workflow.
  • Forgetting that restored data may require separate security review before application cutover.