Cosmos DB change feed is a built-in way to read changes that happen in a container. Instead of constantly scanning for new or updated items, an application can process the feed and react when writes occur. Teams use it to update search indexes, build materialized views, move data, trigger workflows, or keep another store in sync. It is not the same as a general queue; it is tied to container writes, partitioning, leases, continuation state, and the mode used to read changes.
Azure Cosmos DB change feed, change feed processor, Cosmos DB feed
Difficulty: intermediate
CLI mappings: 4
Last verified: 2026-06-02T07:49:29Z
Microsoft Learn
Cosmos DB change feed is a persistent record of item changes in a container, ordered within each partition key value, that consumers read for asynchronous processing. Microsoft Learn documents latest-version mode and all-versions-and-deletes mode, which let processors, Azure Functions, and pipelines react to writes without polling the entire container.
Change feed sits in the Cosmos DB data plane as a stream of changes from a monitored container. Consumers can use SDK pull models, the change feed processor, Azure Functions triggers, or custom workers. Processing is coordinated through leases, partition ranges, continuation tokens, and checkpoints. Latest-version mode returns the most recent version of each changed item, so it does not report deletes and may skip intermediate updates, while all-versions-and-deletes mode can capture intermediate versions and deletes where supported. Architecture decisions include lease container design, processor scale, idempotent writes, replay strategy, RU budget, region behavior, and downstream consistency.
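The relationship between continuation state and checkpoints can be sketched with a small in-memory model. This is not the Cosmos DB SDK: FakeFeedRange, LeaseStore, and drain are hypothetical names, and an integer position stands in for an opaque continuation token. It only illustrates the idea that a reader resumes from the last saved position and saves progress after the handler succeeds.

```python
from dataclasses import dataclass, field


@dataclass
class FakeFeedRange:
    """Hypothetical stand-in for one partition range of a monitored container."""
    changes: list  # item changes in the order they were written

    def read(self, continuation: int, max_items: int):
        """Return (batch, next_continuation) starting at the given position."""
        batch = self.changes[continuation:continuation + max_items]
        return batch, continuation + len(batch)


@dataclass
class LeaseStore:
    """Hypothetical stand-in for a lease container: one checkpoint per range."""
    checkpoints: dict = field(default_factory=dict)

    def load(self, range_id: str) -> int:
        return self.checkpoints.get(range_id, 0)

    def save(self, range_id: str, continuation: int) -> None:
        self.checkpoints[range_id] = continuation


def drain(range_id, feed_range, leases, handler, max_items=2):
    """Read from the last checkpoint and save progress after each handled batch."""
    continuation = leases.load(range_id)
    while True:
        batch, next_continuation = feed_range.read(continuation, max_items)
        if not batch:
            return
        handler(batch)                             # side effects happen first
        leases.save(range_id, next_continuation)   # then progress is recorded
        continuation = next_continuation


processed = []
leases = LeaseStore()
feed = FakeFeedRange(changes=[{"id": "a", "v": 1}, {"id": "b", "v": 1}, {"id": "a", "v": 2}])

drain("range-0", feed, leases, processed.extend)
feed.changes.append({"id": "c", "v": 1})           # a later write to the container
drain("range-0", feed, leases, processed.extend)   # resumes from the checkpoint
```

The second call picks up only the new change because the saved position survived between runs, which is the role the lease container plays for a real processor.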
Why it matters
Change feed matters because it turns database writes into a reliable integration signal without forcing applications to poll or duplicate write logic everywhere. It is common in event-driven Cosmos DB designs: update a cache, project a read model, sync a search index, stream changes to analytics, or migrate containers with minimal downtime. The design also carries risk. Consumers can lag, downstream systems can fail, and reprocessing can duplicate work unless handlers are idempotent. Teams must understand what the selected feed mode includes, where checkpoints live, and how far processors can fall behind. Done well, change feed separates write paths from slower integration work.
Where you see it
Signals, screens, and Azure surfaces where this term usually becomes operational.
Signal 01
In Azure Functions configuration, a Cosmos DB trigger references the monitored container, lease container, connection setting, and database name, along with settings that control how event batches are processed.
Signal 02
In SDK processors, change feed appears through builders, leases, continuation tokens, handlers, start options, feed modes, checkpoint state, processor instance names, and diagnostics during troubleshooting.
Signal 03
In monitoring dashboards, change feed shows up as processor lag, RU throttling, lease writes, failed batches, downstream ingestion errors, duplicate handling, and freshness metrics.
When this becomes relevant
Specific situations where this term helps solve real Azure design, operations, migration, security, reliability, cost, or governance problems.
Keep an Azure AI Search index synchronized with Cosmos DB writes without scanning the full container repeatedly.
Build materialized read models in another container when operational queries need a different partition key.
Run a low-downtime migration by bulk copying history, then using change feed to catch up new writes.
Trigger workflows from item changes while keeping the original application write path fast and simple.
Capture deletes or intermediate versions where supported when audit or replication requirements exceed latest-version updates.
Real-world case studies
Different enterprise-style examples that show the term being used to hit measurable objectives.
Case study 01
Marketplace search refreshed without polling scans
Scenario, objectives, solution, measured impact, and takeaway.
📌Scenario
An online equipment marketplace stored listings in Cosmos DB but refreshed its search index every hour through expensive full-container scans. Sellers complained that updated prices appeared late.
🎯Business/Technical Objectives
Reduce listing search freshness to under two minutes.
Stop hourly scans that consumed unnecessary RU/s.
Keep indexing failures visible to support.
Avoid slowing the seller listing update path.
✅Solution Using Cosmos DB change feed
Engineers replaced polling with a change feed processor that watched the listings container and updated Azure AI Search only for changed items. They created a dedicated lease container, made index writes idempotent by listing ID and version, and added dead-letter logging for records that failed enrichment. Azure CLI captured container partition key paths and throughput before deployment. The seller API continued writing to Cosmos DB without waiting for search updates. Dashboards measured write-to-index freshness, processor lag, RU throttling, and failed search documents. During rollout, the old hourly scan stayed available as a fallback for one week.
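A rough sketch of what "idempotent by listing ID and version" with dead-letter logging could look like follows. The index is a plain dictionary rather than Azure AI Search, and apply_to_index, enrich, and the field names are assumptions made for illustration, not the marketplace's actual code.

```python
def apply_to_index(index, dead_letters, change, enrich):
    """Upsert a listing only when the incoming version is newer than the indexed one."""
    listing_id, version = change["id"], change["version"]
    current = index.get(listing_id)
    if current is not None and current["version"] >= version:
        return "skipped"                      # replayed or stale change
    try:
        document = enrich(change)
    except Exception as exc:
        # Record identifiers and the error only, not the full document body.
        dead_letters.append({"id": listing_id, "version": version, "error": str(exc)})
        return "dead-lettered"
    index[listing_id] = document
    return "indexed"


def enrich(change):
    """Hypothetical enrichment step that rejects listings without a price."""
    if change.get("price") is None:
        raise ValueError("missing price")
    return {"id": change["id"], "version": change["version"], "price": change["price"]}


index, dead_letters = {}, []
batch = [
    {"id": "L1", "version": 1, "price": 100},
    {"id": "L1", "version": 2, "price": 90},
    {"id": "L1", "version": 2, "price": 90},    # redelivered after a retry
    {"id": "L2", "version": 1, "price": None},  # fails enrichment
]
outcomes = [apply_to_index(index, dead_letters, change, enrich) for change in batch]
```

The redelivered change is skipped rather than written twice, and the failed record is parked where a dashboard can surface it instead of blocking the rest of the batch.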
📈Results & Business Impact
Median search freshness improved from 61 minutes to 46 seconds.
RU consumption from indexing scans dropped 68%.
Support could identify failed listing updates from a dashboard instead of manual database checks.
Seller update latency stayed within the existing API target because indexing became asynchronous.
💡Key Takeaway for Glossary Readers
Change feed is valuable when downstream systems need fresh data but the write path should stay focused and fast.
Case study 02
Transit fare migration cut over with confidence
📌Scenario
A metropolitan transit agency needed to move fare-card account data to a new Cosmos DB container with a better partition key. A full outage was unacceptable during commuter hours.
🎯Business/Technical Objectives
Migrate active accounts without a weekend shutdown.
Keep new fare taps synchronized during bulk copy.
Prove the target container caught up before cutover.
Maintain rollback access to the source container.
✅Solution Using Cosmos DB change feed
The migration team bulk-copied historical account records into a target container partitioned by cardId. While the copy ran, a change feed processor watched the source container and applied new account updates to the target. Each handler wrote with an idempotency key based on account ID and source timestamp. Azure CLI showed source and target container definitions, partition key paths, and throughput before every rehearsal. The team measured change feed lag against live tap volume and kept both containers readable during a staged cutover. A final reconciliation compared counts, sample balances, and last-update timestamps before routing production reads to the new container.
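The final reconciliation step might be sketched as a comparison of two snapshots. The record shape (balance, updated) and the reconcile helper are assumptions for illustration; a real check would read from both containers, probably on a sample rather than the full set.

```python
def reconcile(source, target):
    """Compare two {account_id: record} snapshots and report the differences."""
    shared = set(source) & set(target)
    missing = sorted(set(source) - set(target))
    extra = sorted(set(target) - set(source))
    mismatched = sorted(
        account_id
        for account_id in shared
        if (source[account_id]["balance"], source[account_id]["updated"])
        != (target[account_id]["balance"], target[account_id]["updated"])
    )
    total = len(source)
    return {
        "source_count": total,
        "target_count": len(target),
        "missing": missing,
        "extra": extra,
        "mismatched": mismatched,
        "mismatch_rate": (len(missing) + len(mismatched)) / total if total else 0.0,
    }


source = {
    "acct-1": {"balance": 1250, "updated": 1700000300},
    "acct-2": {"balance": 400, "updated": 1700000100},
    "acct-3": {"balance": 0, "updated": 1700000200},
}
target = {
    "acct-1": {"balance": 1250, "updated": 1700000300},
    "acct-2": {"balance": 350, "updated": 1700000050},  # target has not caught up
}

report = reconcile(source, target)
ready_for_cutover = report["mismatch_rate"] == 0.0 and not report["extra"]
```

Here the target is missing one account and is behind on another, so the gate stays closed. The acceptable mismatch threshold is a team decision; zero is used only to keep the sketch simple.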
📈Results & Business Impact
No rider-facing outage occurred during cutover.
Change feed lag stayed below 18 seconds during peak tap replay.
Reconciliation found fewer than 0.02% mismatches, all tied to test cards.
The new partition key reduced fare lookup P95 latency by 57%.
💡Key Takeaway for Glossary Readers
For rekey migrations, change feed can bridge history and live writes so cutover is based on evidence, not hope.
Case study 03
Industrial sensor alerts projected into a control dashboard
📌Scenario
An industrial robotics company stored sensor events in Cosmos DB. Operators needed a live dashboard of abnormal vibration patterns, but synchronous dashboard writes slowed ingestion.
🎯Business/Technical Objectives
Keep sensor ingestion below 80 ms per event.
Update operator dashboard within 30 seconds.
Avoid duplicate alerts when processors retry.
Show lag and handler failures during plant incidents.
✅Solution Using Cosmos DB change feed
Architects kept the ingestion API focused on writing raw sensor events to Cosmos DB. A change feed processor consumed new events, calculated vibration summaries, and wrote alert projections to a separate dashboard container partitioned by facility and robot line. The handler used event ID and sequence number to make repeated processing safe. Azure CLI was used in runbooks to show container throughput, partition key paths, and account region settings. Operators monitored processor lag, failed calculations, destination RU throttling, and dashboard freshness. When the dashboard container was throttled during a plant incident, ingestion continued because the change feed processor retried independently.
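The independent retry behavior can be illustrated with a small backoff helper. Throttled is a hypothetical exception standing in for a 429 from the dashboard container, and the sleep function is injected so the sketch runs without waiting. The attempt count and delays are illustrative, not recommended values.

```python
class Throttled(Exception):
    """Hypothetical stand-in for a 429 response from the destination."""


def write_with_retry(write, item, sleep, max_attempts=5, base_delay=0.5):
    """Retry a throttled destination write with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return write(item)
        except Throttled:
            if attempt == max_attempts - 1:
                raise                          # give up and let the batch fail
            sleep(base_delay * 2 ** attempt)   # 0.5s, 1s, 2s, ...


calls = {"count": 0}
delays = []


def flaky_write(item):
    """Fails twice with throttling, then succeeds."""
    calls["count"] += 1
    if calls["count"] <= 2:
        raise Throttled()
    return f"wrote {item['event_id']}"


result = write_with_retry(flaky_write, {"event_id": "e1"}, sleep=delays.append)
```

Because the retry lives in the processor, the ingestion API never sees the throttling. The write itself still needs to be keyed by event ID and sequence number so a retried write does not create a second alert.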
📈Results & Business Impact
Sensor ingestion stayed at 54 ms P95 during the busiest production shift.
Dashboard freshness averaged 12 seconds and stayed below the 30-second objective.
Duplicate alert tickets dropped to zero after idempotency keys were added.
Incident triage identified destination throttling in 11 minutes instead of blaming the ingestion API.
💡Key Takeaway for Glossary Readers
Change feed lets teams decouple fast writes from slower projections, as long as retries and lag are treated as first-class operations.
Why use Azure CLI for this?
With ten years of Azure engineering experience, I use Azure CLI around change feed to validate the resources that make processors reliable. There is no single CLI command that reads every change feed event for you, but CLI can show the monitored container, lease container, partition key path, throughput, account region settings, and diagnostics configuration. That evidence matters when a processor lags or an Azure Functions trigger stops: the problem may be RU throttling, wrong container name, missing lease database, disabled function settings, or a region failover. CLI also helps automation compare production and staging container settings before deploying processors that depend on stable leases and throughput.
CLI use cases
Show monitored and lease container definitions before deploying a processor or Azure Functions trigger.
Inspect throughput settings when processors lag, receive 429s, or fall behind downstream targets.
List account regions and consistency settings to understand behavior during failover or multi-region processing.
Export diagnostics and container metadata for incident reviews involving stale search indexes or projections.
Before you run CLI
Confirm subscription, resource group, Cosmos DB account, database, monitored container, and lease container before collecting evidence.
Use read-only commands first; throughput updates, container changes, and account changes can create cost or availability risk.
Know whether the workload uses latest-version mode or all-versions-and-deletes mode because troubleshooting evidence differs.
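Why the feed mode changes the evidence can be shown with a simplified model. The two functions below are hypothetical and describe a consumer that reads only after all five writes have landed; they approximate the documented behavior of each mode and ignore retention limits and partitioning.

```python
def latest_version_view(history):
    """Approximate latest-version mode: newest version per surviving item, no deletes."""
    latest = {}
    for operation in history:
        latest.pop(operation["id"], None)      # drop any older version of this item
        if operation["op"] != "delete":
            latest[operation["id"]] = operation
    return list(latest.values())


def all_versions_view(history):
    """Approximate all-versions-and-deletes mode: every operation, in order."""
    return list(history)


history = [
    {"op": "create", "id": "a", "version": 1},
    {"op": "create", "id": "b", "version": 1},
    {"op": "replace", "id": "a", "version": 2},
    {"op": "delete", "id": "b", "version": 2},
    {"op": "replace", "id": "a", "version": 3},
]

latest = latest_version_view(history)
everything = all_versions_view(history)
```

In the first view, item "b" and the intermediate versions of "a" leave no trace, so an audit or replication question about them cannot be answered from a latest-version consumer's output.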
What output tells you
Container output shows partition key, indexing, TTL, and naming details that processors and triggers depend on.
Throughput output shows whether RU limits could explain lag, 429 responses, or slow lease updates.
Account region and consistency output helps interpret failover, read location, and downstream freshness behavior.
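One way to turn container output into evidence is to compare two environments. The sketch below parses JSON shaped like the output of az cosmosdb sql container show. The sample payloads are invented and trimmed, and the field names (resource.partitionKey.paths, resource.defaultTtl) are assumptions that should be checked against real output before relying on them.

```python
import json


def summarize(container_json: str) -> dict:
    """Extract the fields a processor or trigger depends on."""
    resource = json.loads(container_json)["resource"]
    return {
        "id": resource["id"],
        "partition_key_paths": resource["partitionKey"]["paths"],
        "default_ttl": resource.get("defaultTtl"),
    }


def drift(first: dict, second: dict, ignore=("id",)) -> dict:
    """Return {field: (first_value, second_value)} for fields that differ."""
    return {
        key: (first[key], second[key])
        for key in first
        if key not in ignore and first[key] != second[key]
    }


# Illustrative payloads, not captured from a real account.
production_output = """
{"name": "listings",
 "resource": {"id": "listings",
              "partitionKey": {"kind": "Hash", "paths": ["/sellerId"]},
              "defaultTtl": null}}
"""
staging_output = """
{"name": "listings",
 "resource": {"id": "listings",
              "partitionKey": {"kind": "Hash", "paths": ["/listingId"]},
              "defaultTtl": null}}
"""

differences = drift(summarize(production_output), summarize(staging_output))
```

A non-empty result before a deployment is a prompt to stop and ask why staging and production disagree on something the processor depends on.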
Mapped Azure CLI commands
Cosmos DB change feed supporting checks (adjacent)
az cosmosdb sql container show --account-name <account> --resource-group <resource-group> --database-name <database> --name <container>
az cosmosdb sql container · discover · Databases
az cosmosdb sql container throughput show --account-name <account> --resource-group <resource-group> --database-name <database> --name <container>
az cosmosdb sql container throughput · discover · Databases
az cosmosdb show --name <account> --resource-group <resource-group>
az cosmosdb · discover · Databases
az functionapp config appsettings list --name <function-app> --resource-group <resource-group>
az functionapp config appsettings · discover · Web
Architecture context
Architecturally, change feed is an integration backbone for Cosmos DB, but it should not be treated as a magic event bus. The monitored container remains the source of writes. Processors read changes, coordinate lease ownership, and write side effects to search, caches, warehouses, queues, or other containers. I design handlers to be idempotent, observable, and restartable because replay and retries are normal. Lease containers deserve their own capacity and monitoring because they control progress. For migrations, change feed can keep a target container current while bulk copy handles history. For product features, it can maintain read models without slowing the original write transaction.
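Lease ownership during scale-out can be pictured with a deliberately simplified model. A real change feed processor has instances acquire and renew lease documents in the lease container rather than being assigned work by a central function, so balance below is a hypothetical illustration of the outcome (a near-equal share per instance), not of the mechanism.

```python
def balance(lease_ids, instances):
    """Assign leases round-robin so each instance owns a near-equal share."""
    owners = {name: [] for name in instances}
    for position, lease_id in enumerate(sorted(lease_ids)):
        owners[instances[position % len(instances)]].append(lease_id)
    return owners


leases = [f"range-{number}" for number in range(5)]

single = balance(leases, ["worker-a"])                 # one instance owns everything
scaled = balance(leases, ["worker-a", "worker-b"])     # a second instance joins
```

The model also shows a ceiling: with five leases, a sixth instance would own nothing, which is why adding workers beyond the number of leases does not add throughput.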
Security
Security impact is direct because change feed consumers can copy sensitive data from the source container into other systems. Access to the monitored container and lease container should use managed identity or protected keys with least privilege. Downstream targets need the same data classification as the source unless fields are filtered or transformed deliberately. Do not send full change documents to logs for debugging. Review whether deletes, intermediate versions, or latest versions are available and appropriate for the data type. Network restrictions, private endpoints, encryption, and role assignments should cover both source and destination. Change feed is convenient, but it can become an uncontrolled replication path if ownership is weak.
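Deliberate filtering and safe logging can be sketched in a few lines. ALLOWED_FIELDS, project, and log_line are hypothetical names, and the allow-list is an example rather than a recommendation for any particular data set.

```python
ALLOWED_FIELDS = ("id", "status", "updatedAt")   # hypothetical allow-list


def project(document, allowed=ALLOWED_FIELDS):
    """Copy only allow-listed fields for the downstream target."""
    return {key: document[key] for key in allowed if key in document}


def log_line(document):
    """Describe a change for logs without including the document body."""
    return f"change id={document['id']} fields={len(document)}"


change = {
    "id": "acct-9",
    "status": "active",
    "updatedAt": 1700000000,
    "email": "rider@example.com",
    "balance": 1250,
}

downstream = project(change)
message = log_line(change)
```

An allow-list fails closed: a sensitive field added to the source later is not replicated until someone decides it should be.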
Cost
Cost impact comes from RU consumption, downstream writes, processor hosting, logging, and duplicate processing. Reading change feed consumes request units, and handlers often write to search indexes, caches, analytics stores, queues, or other containers that have their own meters. Lagging processors may require higher throughput or better scaling. Reprocessing after a bug can double-write downstream data unless idempotency is planned. Detailed logs help support but can become expensive when every changed document is logged. Cost reviews should include monitored-container RU, lease-container RU, function or worker compute, destination ingestion charges, and the engineering cost of rebuilding side effects after handler defects.
Reliability
Reliability depends on checkpointing, leases, idempotent processing, and lag monitoring. A processor can crash after writing to a downstream system but before recording progress, so handlers must tolerate duplicates. If RU/s is too low on the monitored container or lease container, processors can lag behind writes. If lease containers are shared carelessly, one workload can affect another. Reliable systems track continuation progress, remaining lag, failed batches, poison records, dependency failures, and deployment versions. During failover or scale-out, processors should rebalance without losing work. For migrations, teams must prove the target caught up before cutover and keep rollback reads available.
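The crash window described above can be simulated directly. The run function is a hypothetical model in which the side effect happens before the checkpoint is written, and a crash between the two causes the same change to be delivered again on restart.

```python
def run(changes, checkpoint, sink, crash_after=None):
    """Process from the checkpoint; optionally stop right after one side effect."""
    start = checkpoint["position"]
    for offset, change in enumerate(changes[start:], start=start):
        sink(change)                           # downstream write
        if crash_after == offset:
            return False                       # crashed before recording progress
        checkpoint["position"] = offset + 1    # progress recorded
    return True


changes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]

# Non-idempotent sink: every delivery appends a new record.
appended, append_checkpoint = [], {"position": 0}
run(changes, append_checkpoint, appended.append, crash_after=1)
run(changes, append_checkpoint, appended.append)    # restart after the crash

# Idempotent sink: writes are keyed by item id, so a replay overwrites.
keyed, keyed_checkpoint = {}, {"position": 0}


def upsert(change):
    keyed[change["id"]] = change


run(changes, keyed_checkpoint, upsert, crash_after=1)
run(changes, keyed_checkpoint, upsert)
```

Both runs see change "b" twice. Only the keyed sink ends up with the correct three records, which is the practical meaning of tolerating duplicates.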
Performance
Performance impact appears in both the source database and downstream latency. Change feed avoids polling scans, but processors still consume RU/s and can be throttled by hot partitions, large documents, or slow destinations. End-to-end freshness depends on write rate, feed read speed, handler concurrency, lease distribution, and target service capacity. A processor that falls behind may make search results, dashboards, or caches stale even while writes succeed. Performance testing should measure time from source write to destination update, not only database latency. Operators should tune batch size, processor instances, retry behavior, and downstream throughput while watching for duplicate side effects.
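Measuring time from source write to destination update might look like the sketch below. It assumes each sample pairs a source write timestamp with the time the destination reflected it, both in seconds; how those timestamps are collected is left open, and the sample values are invented. The 95th percentile uses the nearest-rank method.

```python
import statistics


def freshness_report(samples):
    """Summarize write-to-destination lag from (source_ts, destination_ts) pairs."""
    lags = sorted(destination - source for source, destination in samples)
    rank = (95 * len(lags) + 99) // 100        # ceil(0.95 * n) in integer arithmetic
    return {
        "median": statistics.median(lags),
        "p95": lags[rank - 1],
        "max": lags[-1],
    }


samples = [(0, 4), (10, 22), (20, 29), (30, 76), (40, 51)]   # illustrative values
report = freshness_report(samples)
```

The median here looks healthy while the tail does not, which is why a freshness objective is usually stated against a high percentile rather than an average.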
Operations
Operators manage change feed by watching processor health, lease ownership, lag, RU throttling, downstream failures, and deployment configuration. Common tasks include confirming container names, checking partition key paths, inspecting throughput, reviewing Azure Functions settings, validating lease container access, and replaying from a known continuation or start point when supported. Runbooks should explain whether the workload uses latest-version mode or all-versions-and-deletes mode, where checkpoints live, and which downstream systems receive copied data. Incident triage should separate source writes, feed reading, lease updates, handler failures, and destination throttling. Without that separation, teams often blame Cosmos DB when the actual problem is a slow target service.
Common mistakes
Assuming change feed is a general message queue and ignoring idempotency, leases, continuation state, and replay behavior.
Using the same underprovisioned lease container for multiple processors until checkpoint writes become a bottleneck.
Logging full changed documents during debugging and accidentally copying sensitive source data into log analytics.
Cutting over a migration before proving the target container has caught up with new source writes.