Time to live

Time to live, or TTL, is an automatic expiration setting for Azure Cosmos DB items. Instead of writing custom cleanup jobs for old records, you tell Cosmos DB how long items should remain after they are last modified. The setting is expressed in seconds. A container can have a default TTL, and individual items can override it with their own ttl value, provided TTL is enabled on the container. Learners should think of TTL as a lifecycle tool for operational data, not as a backup or audit feature. Expired items disappear from reads and are removed in the background.

Aliases: Time to live, time to live, Azure Time to live, Microsoft Learn Time to live, TTL, Cosmos DB TTL, defaultTtl, item expiration, container TTL
Difficulty: intermediate
CLI mappings: 4
Last verified: 2026-05-27

Microsoft Learn

Time to live in Azure Cosmos DB automatically expires items after a configured number of seconds. TTL can be set at the container level and overridden per item, helping teams remove stale operational data, limit storage growth, and keep queries focused on current records.

Microsoft Learn: Expire data with time to live in Azure Cosmos DB (2026-05-27)

Technical context

In Azure architecture, TTL lives in the Cosmos DB data model and container configuration. It is applied by the database engine after a container-level setting enables expiration, with optional per-item values for exceptions. TTL interacts with partitioning, indexing, request units, backup expectations, change feed designs, and application data-retention rules. The control plane configures the container property, while the data plane stores item-level TTL values. Operators must understand that TTL deletion is automatic and asynchronous, and it should be designed together with compliance, analytics export, and recovery requirements.
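The interaction between the container-level setting and per-item values can be sketched as a small resolution function. This is an illustrative model of the documented behavior, not SDK code; the function names effective_ttl and expires_at are hypothetical, and None stands in for "property not set".

```python
from typing import Optional


def effective_ttl(default_ttl: Optional[int], item_ttl: Optional[int]) -> Optional[int]:
    """Return the TTL in seconds that applies to an item, or None if it never expires.

    default_ttl: container defaultTtl (None = TTL disabled, -1 = enabled with
                 no default expiry, positive = seconds).
    item_ttl:    the item's own ttl property (None = not set, -1 = never
                 expire, positive = seconds).
    """
    if default_ttl is None:
        # TTL is off for the container, so any item-level ttl is ignored.
        return None
    chosen = item_ttl if item_ttl is not None else default_ttl
    return None if chosen == -1 else chosen


def expires_at(last_modified_ts: int, default_ttl: Optional[int],
               item_ttl: Optional[int]) -> Optional[int]:
    """Expiry as epoch seconds, counted from the item's last-modified time."""
    ttl = effective_ttl(default_ttl, item_ttl)
    return None if ttl is None else last_modified_ts + ttl
```

The key design point is that the container setting acts as a gate: an item-level value only matters once the container has TTL enabled.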

Why it matters

Time to live matters because stale data quietly becomes expensive, risky, and slow to query. Session records, device heartbeats, shopping carts, temporary recommendations, and security signals often have value for hours or days, not years. If teams keep everything forever, storage grows, indexes become heavier, queries scan more irrelevant data, and privacy reviews become harder. TTL gives developers a precise way to encode data freshness directly into Cosmos DB instead of relying on fragile scheduled jobs. It also forces an important governance conversation: which records are operationally temporary, which must be retained, and which must be exported before automatic deletion.

Where you see it

Signals, screens, and Azure surfaces where this term usually becomes operational.

Signal 01

During data reviews, defaultTtl in the Cosmos DB container settings shows whether item expiration is disabled (property not set), enabled with no default expiry (-1), or set to a specific number of seconds.

Signal 02

During retention testing, a ttl property in item JSON can override the container default for records that should expire earlier, later, or not at all.

Signal 03

During cleanup validation and operational review, resource.defaultTtl, partitionKey, and throughput details in Azure CLI output help operators review lifecycle behavior before changing production data.
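As a rough illustration of the item-level signal, the sketch below builds item documents with and without a ttl property. The helper with_ttl and the sample documents are hypothetical; the only rule it encodes is that an item ttl is either a positive number of seconds or -1 for "never expire".

```python
import json


def with_ttl(item: dict, ttl: int) -> dict:
    """Return a copy of item carrying an item-level ttl override."""
    if isinstance(ttl, bool) or not isinstance(ttl, int) or (ttl != -1 and ttl <= 0):
        raise ValueError("ttl must be a positive integer or -1")
    return {**item, "ttl": ttl}


# Hypothetical documents for a container whose defaultTtl is already set.
heartbeat = {"id": "hb-001", "deviceId": "sensor-17", "status": "ok"}   # follows the default
short_lived = with_ttl({"id": "hb-002", "deviceId": "sensor-17"}, 300)  # expires sooner
pinned = with_ttl({"id": "cfg-001", "deviceId": "sensor-17"}, -1)       # never expires

print(json.dumps(short_lived))
```

When sampling real items, the absence of a ttl property is as informative as its presence: it means the item follows the container default.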

When this becomes relevant

Specific situations where this term helps solve real Azure design, operations, migration, security, reliability, cost, or governance problems.

Expire shopping carts, sessions, device heartbeats, or temporary recommendations without running custom cleanup jobs.
Keep high-volume operational containers small enough for predictable query cost and manageable storage growth.
Apply item-level exceptions when most records should expire but a small subset needs longer retention.
Separate short-lived operational data from compliance records that must be exported or retained elsewhere.
Reduce privacy exposure by automatically removing data that no longer has a valid business purpose.

Real-world case studies

Different enterprise-style examples that show the term being used to hit measurable objectives.

Case study 01

Food delivery app controls abandoned cart data

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

A food delivery platform kept abandoned carts and menu personalization records indefinitely in Cosmos DB. Query costs climbed, and privacy reviewers challenged the lack of deletion rules.

Business/Technical Objectives

Expire abandoned carts after 72 hours unless an order is placed.
Reduce storage and query RU consumption for active cart screens.
Keep completed order records outside the expiring container.
Prove that TTL settings are consistent across regions and environments.

Solution Using Time to live

The application team separated temporary cart state from completed orders. The cart container used time to live with a default value matching the 72-hour business rule, while completed orders were written to a separate record-of-truth store. Item-level TTL overrides kept a few fraud-review carts longer when required. Azure CLI showed defaultTtl for every cart container in development, staging, and production, and the output was attached to the privacy review. Operators watched storage size, RU charges, and application errors for two weeks after rollout. The support tool was updated to explain when an abandoned cart was expected to disappear instead of treating it as a data-loss incident.
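Because TTL is expressed in seconds, the 72-hour business rule has to be converted before it is applied. A minimal sketch follows, assuming a hypothetical 14-day window for fraud-review carts (the case study does not state the actual override value) and hypothetical helper names.

```python
from datetime import timedelta


def ttl_seconds(**duration) -> int:
    """Convert a retention window (timedelta keyword arguments) to whole seconds."""
    seconds = int(timedelta(**duration).total_seconds())
    if seconds <= 0:
        raise ValueError("retention window must be positive")
    return seconds


CART_DEFAULT_TTL = ttl_seconds(hours=72)  # container default: 259200 seconds
FRAUD_REVIEW_TTL = ttl_seconds(days=14)   # assumed override window, for illustration only


def cart_document(cart_id: str, under_fraud_review: bool = False) -> dict:
    """Build a cart item; only fraud-review carts carry an item-level ttl."""
    doc = {"id": cart_id, "type": "cart"}
    if under_fraud_review:
        doc["ttl"] = FRAUD_REVIEW_TTL
    return doc
```

Keeping the conversion in one named constant makes it easier to show reviewers that the configured seconds value matches the stated business rule.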

Results & Business Impact

Cart-container storage dropped 58% within 30 days.
RU charge for active-cart queries fell 34% during dinner peaks.
Privacy review findings for abandoned cart retention closed without custom cleanup jobs.
Support tickets about missing old carts fell after the tool showed expected expiration behavior.

Key Takeaway for Glossary Readers

Time to live works best when temporary state is separated from durable business records before expiration is enabled.

Case study 02

Online learning platform removes stale classroom presence

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

An online education provider stored classroom presence, cursor location, and temporary chat signals in Cosmos DB. Old signals polluted live dashboards and confused instructors after sessions ended.

Business/Technical Objectives

Expire classroom presence data within two hours of the last update.
Keep attendance and graded chat transcripts in durable stores.
Reduce instructor dashboard query latency during peak school hours.
Avoid nightly cleanup jobs that failed during exam weeks.

Solution Using Time to live

Engineers used time to live on the live-presence container and left attendance records in a separate database. The default TTL was set to two hours, while active clients refreshed documents as students interacted. The application treated missing presence records as offline status, not as an error. Azure CLI was used in the release pipeline to verify defaultTtl before deployment and to prevent accidental changes in production. Observability dashboards compared item count, RU consumption, and dashboard latency before and after TTL. The team also documented that TTL was not an audit trail and that course analytics had to read from the durable attendance pipeline instead.
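The refresh pattern works because the TTL clock restarts whenever an item is modified. The sketch below models that behavior with an in-memory stand-in; PresenceStore is a hypothetical class, not the Cosmos DB SDK, and the exact expiry boundary is a simplification.

```python
PRESENCE_TTL = 2 * 60 * 60  # two hours, in seconds


class PresenceStore:
    """In-memory stand-in for the live-presence container."""

    def __init__(self, default_ttl: int):
        self.default_ttl = default_ttl
        self._last_modified = {}  # student_id -> epoch seconds of last write

    def upsert(self, student_id: str, now: int) -> None:
        # Any write resets the last-modified time, which restarts the TTL clock.
        self._last_modified[student_id] = now

    def read(self, student_id: str, now: int):
        ts = self._last_modified.get(student_id)
        if ts is None or now >= ts + self.default_ttl:
            return None  # expired items are no longer returned by reads
        return {"id": student_id, "_ts": ts}


def status(store: PresenceStore, student_id: str, now: int) -> str:
    # A missing presence record means "offline", not an error.
    return "online" if store.read(student_id, now) else "offline"
```

Treating absence as a normal state is what lets the application tolerate expiration without error handling or restore procedures.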

Results & Business Impact

Live dashboard p95 query latency improved from 1.9 seconds to 620 ms.
Nightly cleanup job failures dropped to zero because the job was retired.
Presence-container storage stayed below 18 GB during exam week instead of growing past 70 GB.
Instructor complaints about stale student status fell 81% after the change.

Key Takeaway for Glossary Readers

Time to live keeps fast operational views clean when the application can safely recreate or ignore expired state.

Case study 03

Security platform limits short-lived threat signals

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

A managed security service stored millions of short-lived threat-enrichment results in Cosmos DB. Analysts needed fresh context, but stale enrichments slowed investigations and increased exposure.

Business/Technical Objectives

Expire enrichment results after seven days unless attached to an open investigation.
Keep investigation evidence in a separate retained case store.
Reduce RU waste from queries scanning outdated signals.
Show auditors how temporary intelligence differs from retained evidence.

Solution Using Time to live

The security engineering group applied time to live to the enrichment-cache container. New records received the seven-day default, and the case-management service copied any enrichment attached to an active investigation into a retained evidence store. For rare exceptions, item-level ttl values extended cache life while a case remained open. CLI checks were added to the deployment pipeline to verify container defaultTtl and prevent accidental disabling. Analysts were trained that a missing cache record meant the enrichment should be refreshed, not restored. Azure Monitor tracked RU usage, query latency, and storage as old signals aged out.
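A pipeline check of this kind could look like the sketch below, which parses the JSON printed by az cosmosdb sql container show and fails when defaultTtl is missing or unexpected. The function name and the trimmed sample payload are hypothetical; a real pipeline would feed in the actual CLI output.

```python
import json


def check_default_ttl(show_output: str, expected_ttl: int) -> int:
    """Raise ValueError if defaultTtl is missing or differs from expected_ttl."""
    container = json.loads(show_output)
    actual = (container.get("resource") or {}).get("defaultTtl")
    if actual is None:
        raise ValueError("TTL is disabled on this container")
    if actual != expected_ttl:
        raise ValueError(f"defaultTtl is {actual}, expected {expected_ttl}")
    return actual


# Trimmed, hypothetical sample of the CLI output, for illustration only.
SAMPLE = '{"name": "enrichment-cache", "resource": {"id": "enrichment-cache", "defaultTtl": 604800}}'

SEVEN_DAYS = 7 * 24 * 60 * 60
check_default_ttl(SAMPLE, expected_ttl=SEVEN_DAYS)
```

An uncaught exception makes the pipeline step exit with a non-zero status, which is usually enough to block the deployment.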

Results & Business Impact

Cache storage decreased from 14 TB to 5.1 TB in six weeks.
Average enrichment-query RU charge fell 42%.
Investigation evidence completeness remained above 99.9% because retained cases used a separate store.
Audit review time for data-retention questions dropped from five days to one day.

Key Takeaway for Glossary Readers

Time to live reduces risk when short-lived intelligence is clearly separated from evidence that must survive.

Why use Azure CLI for this?

Azure CLI is useful for TTL because the setting is easy to miss in the portal and dangerous to change casually. Engineers need repeatable commands that show the container defaultTtl, database, partition key, throughput, and account context before any update. CLI output can be reviewed in pull requests, run from release pipelines, and archived with change tickets. It also helps compare TTL settings across many containers so one forgotten development default does not reach production. The CLI does not cover item-level overrides: SDKs usually set item ttl values, while the CLI validates the container baseline.

CLI use cases

Run az cosmosdb sql container show to inspect defaultTtl before changing a container or debugging missing records.
Run az cosmosdb sql container update with a reviewed TTL value to enable, change, or disable container-level expiration.
List containers in a database and export TTL settings for retention review across an application estate.
Check container throughput beside TTL settings to understand whether background deletion could contribute to RU pressure.
Capture container configuration before and after a TTL change for incident review, audit evidence, or rollback planning.
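The before-and-after capture in the last use case can be reduced to a small change record. This sketch assumes both inputs are JSON text saved from az cosmosdb sql container show; the function name and record fields are hypothetical.

```python
import json
from datetime import datetime, timezone


def ttl_change_record(before_json: str, after_json: str) -> dict:
    """Summarize a TTL change from two saved container configurations."""
    before = json.loads(before_json)
    after = json.loads(after_json)
    previous = (before.get("resource") or {}).get("defaultTtl")
    current = (after.get("resource") or {}).get("defaultTtl")
    return {
        "container": before.get("name"),
        "recordedAt": datetime.now(timezone.utc).isoformat(),
        "previousDefaultTtl": previous,  # None means TTL was disabled
        "newDefaultTtl": current,
        "changed": previous != current,
    }
```

Attaching a record like this to the change ticket gives rollback discussions a concrete previous value to start from.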

Before you run CLI

Confirm the Cosmos DB account, database, container, subscription, and resource group because TTL changes can delete production data automatically.
Review whether the container uses manual or autoscale throughput and whether expiration waves could compete with normal workload.
Validate legal retention, analytics export, backup expectations, and restore procedures before lowering an existing TTL value.
Check application code for item-level ttl overrides so the container default is not misunderstood during review.
Use JSON output and save the previous defaultTtl value before any update so rollback discussions start from evidence.

What output tells you

resource.defaultTtl shows whether TTL is off, enabled with no default, or set to a specific expiration interval in seconds.
partitionKey and indexingPolicy help reviewers understand whether lifecycle changes affect the same container used by critical query paths.
throughput output shows whether the container has enough RU headroom for normal traffic and background expiration work.
provisioningState, where the output includes it, confirms whether the container update completed before application teams start testing expiration behavior.
account, database, and container IDs prove the command targeted the intended environment, which is essential for destructive lifecycle changes.
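To make the three possible defaultTtl states easier to read in review notes, a value can be translated into a short description. This is a hypothetical helper; None represents a defaultTtl that is absent from the output.

```python
def describe_default_ttl(value) -> str:
    """Describe a container defaultTtl value in plain language."""
    if value is None:
        return "TTL off: items never expire and item-level ttl is ignored"
    if value == -1:
        return "TTL on, no default: only items with their own ttl expire"
    days, rem = divmod(value, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return (f"TTL on: items expire {value} s after last write "
            f"({days}d {hours}h {minutes}m {seconds}s)")
```

Spelling out the interval in days and hours helps non-engineering reviewers confirm that a large seconds value matches the intended retention window.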

Mapped Azure CLI commands

Cosmos DB TTL container commands

direct

az cosmosdb sql container show --account-name <account> --resource-group <resource-group> --database-name <database> --name <container> --query "{defaultTtl:resource.defaultTtl,partitionKey:resource.partitionKey,provisioningState:provisioningState}" --output json

az cosmosdb sql container (discover, Databases)

az cosmosdb sql container update --account-name <account> --resource-group <resource-group> --database-name <database> --name <container> --ttl <seconds>

az cosmosdb sql container (configure, Databases)

az cosmosdb sql container list --account-name <account> --resource-group <resource-group> --database-name <database> --query "[].{name:name,defaultTtl:resource.defaultTtl}" --output table

az cosmosdb sql container (discover, Databases)

az cosmosdb sql container throughput show --account-name <account> --resource-group <resource-group> --database-name <database> --name <container> --output json

az cosmosdb sql container throughput (discover, Databases)
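For an inventory across many containers, the JSON form of the list command can be post-processed in a script. The sketch below assumes the unfiltered --output json shape, in which each entry has a name and a resource object; the sample payload and function name are hypothetical.

```python
import json


def containers_without_ttl(list_output: str) -> list:
    """Return names of containers whose defaultTtl is absent (TTL disabled)."""
    rows = json.loads(list_output)
    return [
        row["name"]
        for row in rows
        if (row.get("resource") or {}).get("defaultTtl") is None
    ]


# Trimmed, hypothetical sample of `az cosmosdb sql container list --output json`.
SAMPLE_LIST = """
[
  {"name": "carts",    "resource": {"id": "carts", "defaultTtl": 259200}},
  {"name": "orders",   "resource": {"id": "orders"}},
  {"name": "presence", "resource": {"id": "presence", "defaultTtl": -1}}
]
"""

print(containers_without_ttl(SAMPLE_LIST))
```

A container reported here is not necessarily misconfigured: record-of-truth containers are expected to have TTL disabled, so the list is a starting point for review rather than a failure report.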

Architecture context

TTL is an architecture decision about data lifecycle, not just a database checkbox. It should be decided with product owners, privacy teams, analytics users, and incident responders because automatic deletion changes what future queries and investigations can see. Separate short-lived operational containers from record-of-truth containers whenever possible. For example, a device heartbeat container can use a short TTL, while billing events are retained elsewhere under a formal retention policy. TTL also influences throughput planning because background deletes use capacity after items expire. Good designs document default TTL, allowed item overrides, export paths, backup assumptions, and rollback limits.

Security

Security benefits come from reducing unnecessary data exposure, but TTL must not be mistaken for access control. Expiring old items can limit the amount of personal, behavioral, or security data available if an account is misused. However, anyone with read access can still read unexpired items, and anyone with write access may change item-level values if the application allows it. Protect Cosmos DB keys, prefer managed identity and RBAC where supported, restrict network access, and monitor container changes. Compliance teams should verify that TTL aligns with legal retention, because deleting required records too early can be as damaging as keeping sensitive data too long.

Cost

TTL can reduce cost by limiting storage, index size, backup footprint, and query work against stale data. The savings are strongest for high-volume temporary data such as telemetry summaries, carts, sessions, or short-lived recommendations. There is still an indirect cost: background deletion consumes request units left over after user requests, and too-short retention can create reprocessing, support, or compliance costs if teams need deleted data later. FinOps reviews should compare storage growth before and after TTL, watch RU consumption during expiration waves, and confirm that data is not being copied elsewhere forever. The cheapest TTL setting is not always the right business-retention setting.
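A back-of-the-envelope estimate can help frame the storage side of that comparison. The sketch below assumes a constant write rate, items written once and never refreshed, no item-level overrides, and no index overhead, so it is a rough planning figure rather than a billing calculation. All names and sample numbers are hypothetical.

```python
def steady_state_items(writes_per_second: float, ttl_seconds: int) -> float:
    """Approximate item count once arrivals and expirations balance out."""
    return writes_per_second * ttl_seconds


def steady_state_gib(writes_per_second: float, ttl_seconds: int,
                     avg_item_kib: float) -> float:
    """Approximate data size in GiB, excluding index overhead."""
    items = steady_state_items(writes_per_second, ttl_seconds)
    return items * avg_item_kib / (1024 * 1024)


# Example: 50 writes per second, a 72-hour TTL, and 2 KiB items.
items = steady_state_items(50, 72 * 3600)
size_gib = steady_state_gib(50, 72 * 3600, 2)
print(f"{items:,.0f} items, about {size_gib:.2f} GiB")
```

Under these assumptions, halving the TTL halves the steady-state size, which makes the trade-off between retention and storage easy to discuss with business owners.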

Reliability

Reliability impact is subtle because TTL deletion is asynchronous and depends on available capacity. Expired items stop appearing in reads, but physical removal may lag under heavy load or low request-unit availability. Applications should not rely on TTL as a millisecond timer or as a workflow trigger. If expired data drives downstream cleanup, use explicit events or scheduled processing instead. Reliable designs test how the application behaves when items vanish between reads, when item-level overrides are absent, and when background deletion competes with normal workload. Before changing TTL, teams should understand restore needs, backup mode, and whether deleted data can be reconstructed.

Performance

TTL improves performance indirectly by keeping containers smaller and indexes focused on current items. Queries that filter active records, dashboards that scan recent events, and point reads against hot data can benefit when old noise leaves the container. However, TTL is not a substitute for good partition keys, indexing policy, query design, or provisioned throughput. Background deletion can also create RU pressure if many items expire at the same time. Operators should watch query latency, RU charge, throttling, storage size, and expiration patterns after a TTL change. Staggered item timestamps and realistic retention windows help avoid large synchronized cleanup waves.
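One possible way to stagger expiration after a bulk import is to give each item a slightly different item-level ttl. This is an assumption about application design rather than a built-in Cosmos DB feature, and the function name and spread value are hypothetical.

```python
import random


def jittered_ttl(base_seconds: int, spread_seconds: int, rng=random) -> int:
    """Return base_seconds plus a random offset in [0, spread_seconds].

    The offset only extends retention, so no item expires earlier than the
    base window agreed with the business.
    """
    if base_seconds <= 0 or spread_seconds < 0:
        raise ValueError("base must be positive and spread must not be negative")
    return base_seconds + rng.randint(0, spread_seconds)


# Spread a bulk import's expirations over an extra hour past the 72-hour window.
batch = [{"id": f"item-{i}", "ttl": jittered_ttl(259200, 3600)} for i in range(5)]
```

Adding jitter rather than subtracting it keeps the minimum retention intact, at the cost of some items living slightly longer than the nominal window.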

Operations

Operators manage TTL by inventorying Cosmos DB containers, reading defaultTtl values, checking item schemas for overrides, and confirming the owning team understands the retention behavior. During troubleshooting, they ask whether missing records were deleted by TTL, manually deleted, overwritten, or never written. During change review, they compare proposed TTL values with business retention, backup, analytics export, and alerting requirements. Operational dashboards should include storage growth, request-unit consumption, item counts where available, and application errors after expiration changes. Runbooks should explain how to disable or change TTL safely and what evidence proves that expiration is working as expected. Keep the runbook linked to owners, alerts, dashboards, and validation commands.

Common mistakes

Setting TTL on a record-of-truth container and accidentally deleting data needed for billing, audit, or customer support.
Expecting TTL to fire exactly at the expiration second or to trigger business workflow events by itself.
Forgetting item-level ttl overrides, then assuming every item follows the same container default.
Lowering TTL without confirming backups, analytics exports, and downstream reports no longer need older data.
Ignoring RU pressure from large synchronized expiration waves after bulk imports or batch updates.

Operator quick checks

Show the container and record the current defaultTtl before proposing any expiration change.
Sample item documents and confirm whether ttl overrides exist, especially for exceptions and non-expiring records.
Compare retention requirements from product, privacy, analytics, support, and legal teams before choosing seconds.
Review storage size, RU usage, throttling, and query latency before and after enabling TTL.
Test the application path that reads expired records so missing-data behavior is intentional, not surprising.

Questions to ask

What data is safe to expire automatically, and what data must remain a record of truth elsewhere?
Who can approve lowering TTL, and how will users be warned if older records disappear?
What happens when an item expires while a workflow, report, or support case still references it?
How will teams prove whether missing data was expired by TTL or lost through another failure mode?
What metrics show storage, RU, latency, and deletion behavior after the TTL change?
