Integrated cache

Integrated cache is an in-memory cache built into the Azure Cosmos DB dedicated gateway. It lets read-heavy Cosmos DB API for NoSQL applications reduce latency and request-unit (RU) consumption by reusing cached point reads and query results. Teams see it in Azure Cosmos DB accounts and in dedicated gateway configuration. It is not Azure Cache for Redis, a client-side memory cache, the analytical store, a materialized view, or a replacement for correct partitioning; confusing them can cause unexpected RU spending and stale reads. Use the term when reviewing access, monitoring, cost, recovery, or performance. It keeps architects, operators, security reviewers, and support teams focused on the same setting, resource, or behavior.

Aliases: Cosmos DB integrated cache, Azure Cosmos DB integrated cache, dedicated gateway cache, item cache, query cache
Difficulty: Intermediate
CLI mappings: 5
Last verified: 2026-05-15

Microsoft Learn

Microsoft Learn documents this term in the article "Azure Cosmos DB integrated cache - Overview". Operators use that article to confirm scope, configuration, dependencies, and production impact.

Microsoft Learn: Azure Cosmos DB integrated cache - Overview (2026-05-15)

Technical context

Integrated cache lives in the dedicated gateway of an Azure Cosmos DB account and is reached through the dedicated gateway endpoint in the application connection string. It consists of an item cache for point reads and a query cache for query results. Key fields include dedicated gateway size, dedicated gateway instance count, connection endpoint, and max integrated cache staleness. Operators verify it with Cosmos DB account properties, dedicated gateway settings, request charge, and cache hit behavior. In production reviews, connect the term to resource scope, identity, network path, diagnostics, cost ownership, and rollback. Confirm subscription, resource group, service tier, dependent workload, and current Azure evidence before changing it.
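
As a minimal sketch of the connection endpoint check, the Python below tests whether an application connection string points at the dedicated gateway rather than the standard gateway. It assumes the public-cloud dedicated gateway host pattern `<account>.sqlx.cosmos.azure.com`; the account name and key in the usage lines are made up, and the helper names are illustrative rather than part of any SDK.

```python
from urllib.parse import urlparse

# Assumption: dedicated gateway endpoints in the public cloud use the "sqlx"
# host label, for example https://<account>.sqlx.cosmos.azure.com/
DEDICATED_GATEWAY_SUFFIX = ".sqlx.cosmos.azure.com"


def parse_connection_string(connection_string: str) -> dict:
    """Split 'Key=Value;Key=Value' pairs into a dict (values may contain '=')."""
    parts = {}
    for segment in connection_string.split(";"):
        if segment.strip():
            key, _, value = segment.partition("=")
            parts[key.strip()] = value.strip()
    return parts


def uses_dedicated_gateway(connection_string: str) -> bool:
    """Return True when AccountEndpoint points at the dedicated gateway host."""
    endpoint = parse_connection_string(connection_string).get("AccountEndpoint", "")
    host = urlparse(endpoint).hostname or ""
    return host.lower().endswith(DEDICATED_GATEWAY_SUFFIX)


# Hypothetical connection strings for illustration only.
cached = "AccountEndpoint=https://contoso.sqlx.cosmos.azure.com/;AccountKey=abc==;"
standard = "AccountEndpoint=https://contoso.documents.azure.com:443/;AccountKey=abc==;"
print(uses_dedicated_gateway(cached), uses_dedicated_gateway(standard))
```

Reads only pass through the integrated cache when the client connects in gateway mode through the dedicated gateway endpoint, so a check of this kind fits naturally into release validation.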

Why it matters

Integrated cache matters because it turns an architecture choice into day-to-day workload behavior. If the team misunderstands it, the failure usually appears as unexpected RU spending, stale reads, or cache miss storms before anyone notices the documentation gap. The term also affects security, reliability, operations, cost, and performance because one setting can influence access, recovery, automation, user experience, and budget. Naming it precisely helps engineers compare portal settings, CLI output, infrastructure-as-code, monitoring data, and incident notes without guessing. It also gives reviewers a practical checklist: where is it configured, who owns it, what depends on it, what evidence proves it works, and how rollback happens.

Where you see it

Signals, screens, and Azure surfaces where this term usually becomes operational.

Signal 01

In the Azure portal, Integrated cache appears under the Azure Cosmos DB account in the dedicated gateway configuration, where owners review configuration, health, access, and dependent workload impact before making production changes.

Signal 02

In CLI or REST output, Integrated cache shows up through Cosmos DB account properties, dedicated gateway settings, and related fields that confirm live Azure state during audits, releases, and incidents.

Signal 03

In incident reviews, Integrated cache is discussed when teams see unexpected RU spending or stale reads, and engineers compare logs, metrics, ownership, dependencies, recent changes, support impact, and deployment evidence together.

When this becomes relevant

Specific situations where this term helps solve real Azure design, operations, migration, security, reliability, cost, or governance problems.

  • Design and review Integrated cache as part of a production Azure workload.
  • Troubleshoot incidents where Integrated cache affects user-visible behavior or operator evidence.
  • Document ownership, rollback, monitoring, and cost impact for Integrated cache during governance reviews.

Real-world case studies

Different enterprise-style examples that show the term being used to hit measurable objectives.

Case study 01

Integrated cache in action for read-heavy claims portal

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

Fabrikam Claims, an insurance organization, needed to cut peak read latency and RU burn on a claims status portal that repeatedly queried the same policy and payment records. The team had to improve the design without disrupting existing users or weakening governance.

Business/Technical Objectives
  • Use Integrated cache to solve the immediate workload problem
  • Keep security and compliance evidence available for review
  • Reduce manual support effort during operations
  • Measure results with production telemetry and owner signoff
Solution Using Integrated cache

Architects treated Integrated cache as a production control point rather than a background detail. They reviewed the current Azure resources, confirmed owners, and documented how the term connected to identity, networking, monitoring, cost, and rollback. Engineers implemented a dedicated gateway, integrated cache staleness settings, SDK endpoint changes, partition-aware point reads, Azure Monitor latency charts, and rollback to the standard gateway endpoint, then validated the change with read-only CLI checks and portal evidence. The rollout used a pilot scope first, with diagnostic logging enabled before wider release. Support teams received a runbook explaining expected output, common failure modes, and the safest rollback path. Security reviewers checked access boundaries and data-handling assumptions before the change moved to production.

Results & Business Impact
  • lowered read RU consumption by 58 percent during weekday peaks
  • reduced p95 claims lookup latency from 180 ms to 55 ms
  • kept staleness within the approved five-minute business window
  • avoided a provisioned-throughput increase for two regional containers
Key Takeaway for Glossary Readers

Integrated cache is valuable when teams connect the Azure setting to measurable security, reliability, operational, cost, and performance outcomes.
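
To compare the latency result with the percentage figures in the same list, the before-and-after values can be turned into a relative reduction. This is a small arithmetic sketch using only the 180 ms and 55 ms figures reported above; the function name is illustrative.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# p95 claims lookup latency from the case study: 180 ms down to 55 ms.
print(round(percent_reduction(180, 55), 1))  # 69.4
```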

Case study 02

Integrated cache in action for product catalog acceleration

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

Contoso Retail Group, a retail organization, needed to serve holiday product catalog reads without overprovisioning Cosmos DB throughput for pages that rarely changed during a promotion. The team had to improve the design without disrupting existing users or weakening governance.

Business/Technical Objectives
  • Use Integrated cache to solve the immediate workload problem
  • Keep security and compliance evidence available for review
  • Reduce manual support effort during operations
  • Measure results with production telemetry and owner signoff
Solution Using Integrated cache

Architects treated Integrated cache as a production control point rather than a background detail. They reviewed the current Azure resources, confirmed owners, and documented how the term connected to identity, networking, monitoring, cost, and rollback. Engineers implemented dedicated gateway scaling, query cache validation, cache-warming jobs, product update alerts, and dashboards comparing cacheable queries with direct reads, then validated the change with read-only CLI checks and portal evidence. The rollout used a pilot scope first, with diagnostic logging enabled before wider release. Support teams received a runbook explaining expected output, common failure modes, and the safest rollback path. Security reviewers checked access boundaries and data-handling assumptions before the change moved to production.

Results & Business Impact
  • absorbed a 3.4x traffic spike without throttling
  • cut catalog query latency by 47 percent
  • reduced projected seasonal RU spend by 31 percent
  • kept price updates visible within the approved staleness target
Key Takeaway for Glossary Readers

Integrated cache is valuable when teams connect the Azure setting to measurable security, reliability, operational, cost, and performance outcomes.

Case study 03

Integrated cache in action for clinical reference lookup

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

Northlake Health, a healthcare organization, needed to speed up clinician lookup screens that repeatedly read reference-code documents while preserving audit evidence for patient-facing systems. The team had to improve the design without disrupting existing users or weakening governance.

Business/Technical Objectives
  • Use Integrated cache to solve the immediate workload problem
  • Keep security and compliance evidence available for review
  • Reduce manual support effort during operations
  • Measure results with production telemetry and owner signoff
Solution Using Integrated cache

Architects treated Integrated cache as a production control point rather than a background detail. They reviewed the current Azure resources, confirmed owners, and documented how the term connected to identity, networking, monitoring, cost, and rollback. Engineers implemented private endpoint access, integrated cache through the dedicated gateway, request charge monitoring, session token review, and application logs tied to change tickets, then validated the change with read-only CLI checks and portal evidence. The rollout used a pilot scope first, with diagnostic logging enabled before wider release. Support teams received a runbook explaining expected output, common failure modes, and the safest rollback path. Security reviewers checked access boundaries and data-handling assumptions before the change moved to production.

Results & Business Impact
  • improved p95 lookup latency by 61 percent
  • kept audit reviewers supplied with request diagnostics
  • reduced incident tickets about slow reference screens by 42 percent
  • maintained approved network isolation for the database account
Key Takeaway for Glossary Readers

Integrated cache is valuable when teams connect the Azure setting to measurable security, reliability, operational, cost, and performance outcomes.

Why use Azure CLI for this?

CLI checks are useful for Integrated cache because they capture live Azure state, reduce guesswork, and separate safe inspection from approved changes.

CLI use cases

  • Confirm the live Azure resource or configuration related to Integrated cache before approving a production change.
  • Capture read-only evidence for Integrated cache during incident response, audit review, or release validation.
  • Compare CLI output with infrastructure-as-code, portal settings, and runbook expectations for Integrated cache.
  • Validate the resources that depend on Integrated cache before changing production scope.

Before you run CLI

  • Confirm tenant, subscription, resource group, service name, and environment before trusting command output.
  • Run list or show commands first, then save evidence before any create, update, delete, restore, or deploy action.
  • Check whether the command output exposes secrets such as account keys or connection strings, customer data, or private endpoint details.
  • Have an approved rollback path and owner contact ready before changing production configuration.
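
The "list or show first" rule above can be enforced mechanically in a runbook wrapper. The sketch below is one possible guard, not an Azure CLI feature: it assumes that a command whose last positional token is `show` or `list` only reads state and treats everything else as mutating. The resource names in the usage lines are placeholders.

```python
import shlex

# Assumption: these verbs only read state; anything else is treated as mutating.
READ_ONLY_VERBS = {"show", "list"}


def is_read_only(command: str) -> bool:
    """Return True when the last token before the first option is show/list."""
    positional = []
    for token in shlex.split(command):
        if token.startswith("-"):
            break  # options start here; the verb is the last positional token
        positional.append(token)
    return (
        len(positional) >= 2
        and positional[0] == "az"
        and positional[-1] in READ_ONLY_VERBS
    )


# Placeholder account and resource group names.
print(is_read_only("az cosmosdb show --name acct --resource-group rg"))
print(is_read_only("az cosmosdb update --name acct --resource-group rg"))
```

A wrapper built on this check could run read-only commands directly and require an explicit approval flag for anything else.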

What output tells you

  • Whether the expected Azure resource exists and whether Integrated cache is configured at the intended scope.
  • Which names, IDs, locations, states, tiers, policies, identities, and dependent resources are active right now.
  • Whether live Azure state differs from the design document, deployment template, release ticket, or support runbook.
  • Which metric, log query, portal page, or application test should be checked before closing the issue.

Mapped Azure CLI commands

Integrated cache operational checks

direct
az cosmosdb show --name <account-name> --resource-group <resource-group>
az cosmosdb | discover | Databases
az cosmosdb service update --account-name <account-name> --resource-group <resource-group> --name SqlDedicatedGateway --kind SqlDedicatedGateway --count <count> --size <size>
az cosmosdb service | configure | Databases
az cosmosdb sql database list --account-name <account-name> --resource-group <resource-group>
az cosmosdb sql database | discover | Databases
az cosmosdb sql container throughput show --account-name <account-name> --resource-group <resource-group> --database-name <database-name> --name <container-name>
az cosmosdb sql container throughput | discover | Databases
az monitor metrics list --resource <cosmos-account-resource-id> --metric TotalRequestUnits,ServerSideLatency
az monitor metrics | discover | Databases
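
Saved metrics output is easier to compare across releases when it is reduced to a single number. The sketch below sums one aggregation field from JSON shaped like `az monitor metrics list` output. The sample document is hypothetical and trimmed, and the assumed shape (`value` → `timeseries` → `data` points with a `total` field that may be null) should be checked against real output before relying on it.

```python
import json

# Hypothetical, trimmed sample shaped like `az monitor metrics list` JSON output.
SAMPLE_OUTPUT = """
{"value": [
  {"name": {"value": "TotalRequestUnits"},
   "timeseries": [{"data": [
     {"timeStamp": "2026-05-15T10:00:00Z", "total": 1200.0},
     {"timeStamp": "2026-05-15T10:01:00Z", "total": 800.0},
     {"timeStamp": "2026-05-15T10:02:00Z", "total": null}
   ]}]}
]}
"""


def sum_metric(output: str, metric_name: str, field: str = "total") -> float:
    """Sum one aggregation field for a named metric, skipping empty points."""
    document = json.loads(output)
    total = 0.0
    for metric in document.get("value", []):
        if metric.get("name", {}).get("value") != metric_name:
            continue
        for series in metric.get("timeseries", []):
            for point in series.get("data", []):
                value = point.get(field)
                if value is not None:
                    total += value
    return total


print(sum_metric(SAMPLE_OUTPUT, "TotalRequestUnits"))  # 2000.0
```

Running the same reduction on output captured before and after enabling the cache gives a like-for-like RU comparison for the change ticket.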

Architecture context

Security

Security for Integrated cache starts with private endpoints, managed identity or key handling, gateway endpoint exposure, role assignments, application connection strings. Review who can read, create, update, delete, restore, deploy, or invoke the related resource, and verify that privileged changes create audit evidence. Prefer Microsoft Entra ID, managed identities, private endpoints, key rotation, customer-managed keys, and policy controls where the service supports them. Keep secrets, credentials, personal data, and regulated content out of scripts and examples unless the data-handling design explicitly allows it. During approval, check tenant boundaries, network exposure, diagnostic logs, and break-glass procedures so a configuration mistake does not become an incident.

Cost

Cost for Integrated cache is driven by dedicated gateway hourly charges, saved request units, query volume, point-read volume, and cache hit rate. The common mistake is treating the cache as free because it looks like a setting: the dedicated gateway is billed per instance-hour whether or not reads hit the cache. Check whether charges come from dedicated gateway instances, request units, storage, backups, data transfer, diagnostics, or engineer time spent recovering from bad configuration. Use tags, budgets, Azure Cost Management, and owner reviews to connect usage to a workload. When reducing cost, confirm the change will not remove recovery evidence, security controls, or needed performance headroom.
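
A rough break-even check weighs the gateway's fixed hourly charge against the throughput it lets the team avoid provisioning. The sketch below is illustrative only: every price and volume in the usage lines is a made-up placeholder, 730 hours is used as an approximate month, and the saving only materializes if provisioned throughput is actually lowered or consumption is billed per request.

```python
HOURS_PER_MONTH = 730.0  # approximate month used for estimates


def gateway_monthly_cost(hourly_rate: float, instance_count: int,
                         hours: float = HOURS_PER_MONTH) -> float:
    """Fixed cost of the dedicated gateway instances for the period."""
    return hourly_rate * instance_count * hours


def net_monthly_saving(gateway_cost: float, avoided_ru_per_second: float,
                       price_per_100_ru_s_hour: float,
                       hours: float = HOURS_PER_MONTH) -> float:
    """Avoided throughput cost minus gateway cost; negative means a net loss."""
    avoided = (avoided_ru_per_second / 100.0) * price_per_100_ru_s_hour * hours
    return avoided - gateway_cost


# Placeholder inputs: 2 instances at 0.50 per hour, 20,000 RU/s avoided,
# 0.008 per 100 RU/s per hour. Replace with current prices for the region.
cost = gateway_monthly_cost(0.50, 2)
print(round(cost, 2), round(net_monthly_saving(cost, 20_000, 0.008), 2))
```

A negative result suggests the cache hit rate or the cacheable share of traffic is too low to justify the gateway at that size.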

Reliability

Reliability for Integrated cache depends on dedicated gateway capacity, regional availability, cache warmup, failover behavior, and staleness tolerance. A resource can exist and still fail the business workflow when permissions, network paths, limits, or downstream services are wrong. Define the health signal before production use, then test the expected failure mode with a controlled change. Monitor platform metrics, application traces, deployment history, and user symptoms in the same time window during incidents. Recovery plans should include owner contact, safe rollback, validation queries, and customer-impact checks, not just proof that the Azure resource exists. Confirm that failover and cold-cache behavior are tested before the workload depends on them.
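
Cache warmup and failover both imply a period in which every read misses the cache and lands on the backend. One way to reason about that risk is to check whether provisioned throughput could absorb the full read load with a cold cache. The sketch below is a simplified model with placeholder numbers; it ignores partition skew and assumes a uniform RU charge per read.

```python
def cold_cache_ru_demand(reads_per_second: float, ru_per_read: float,
                         other_ru_per_second: float = 0.0) -> float:
    """RU/s needed if every read misses the cache, plus non-read traffic."""
    return reads_per_second * ru_per_read + other_ru_per_second


def survives_cold_cache(provisioned_ru_per_second: float, reads_per_second: float,
                        ru_per_read: float, other_ru_per_second: float = 0.0) -> bool:
    """True when provisioned throughput covers the cold-cache demand."""
    demand = cold_cache_ru_demand(reads_per_second, ru_per_read, other_ru_per_second)
    return demand <= provisioned_ru_per_second


# Placeholder workload: 4,000 reads/s at 1 RU each plus 2,000 RU/s of writes.
print(survives_cold_cache(10_000, 4_000, 1.0, 2_000))
print(survives_cold_cache(5_000, 4_000, 1.0, 2_000))
```

If the check fails, a cache restart is likely to show up as throttling, which is the cache miss storm the team should rehearse before production use.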

Performance

Performance for Integrated cache depends on cache hit ratio, point-read latency, query latency, gateway saturation, and item size. Measure the real workload instead of assuming the default configuration is enough. Look at latency, throughput, concurrency, item and result-set size, query complexity, and gateway CPU and memory use. Compare production metrics with load tests and with the limits of the selected gateway size. Tuning should be incremental and reversible, because a change that improves one path can hurt another. Always verify user-facing behavior after configuration, deployment, or data-layout changes. Capture before-and-after metrics so tuning is based on evidence rather than assumptions.
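
The effect of hit ratio on RU and latency can be estimated with a simple weighted average. The sketch below assumes that a cache hit is charged 0 RU and that hits and misses each have a single representative latency; all numbers in the usage lines are placeholders, not measurements.

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Share of reads served from the cache; 0.0 when there is no traffic."""
    total = hits + misses
    return hits / total if total else 0.0


def effective_ru_per_read(hit_ratio: float, backend_ru_per_read: float) -> float:
    """Average RU per read, assuming cache hits are charged 0 RU."""
    return (1.0 - hit_ratio) * backend_ru_per_read


def expected_latency_ms(hit_ratio: float, hit_ms: float, miss_ms: float) -> float:
    """Hit-ratio-weighted average latency across hits and misses."""
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


# Placeholder workload: 750 hits, 250 misses, 4 RU per backend read.
ratio = cache_hit_ratio(750, 250)
print(ratio, effective_ru_per_read(ratio, 4.0), expected_latency_ms(ratio, 2.0, 10.0))
```

Averages hide tail behavior, so the same comparison should also be made on p95 or p99 latency from production telemetry.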

Operations

Operations for Integrated cache require gateway inventory, cache staleness reviews, latency and RU monitoring, query pattern analysis, and endpoint checks. Treat the term as something support teams must inspect quickly, not only as a design-time concept. Keep a runbook with portal locations, CLI commands, expected output, known dependencies, approval rules, and rollback steps. Review it during releases, migrations, incidents, access changes, and cost investigations. Good operations practice also means tagging owners, enabling diagnostics, storing evidence from read-only checks, and documenting exceptions. When the configuration changes, update handoff notes so future operators know what normal looks like. Keep the same evidence available to the next on-call engineer.
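
A cache staleness review can be reduced to comparing each workload's configured maximum staleness with the window the business has approved. The sketch below shows one way to flag violations; the workload names and values are hypothetical, and how the configured values are collected (application settings, code review, or request options) is left to the team.

```python
from datetime import timedelta


def staleness_violations(configured: dict[str, timedelta],
                         approved: timedelta) -> list[str]:
    """Return workload names whose max staleness exceeds the approved window."""
    return sorted(name for name, value in configured.items() if value > approved)


# Hypothetical inventory gathered during a review.
inventory = {
    "claims-status": timedelta(minutes=5),
    "catalog": timedelta(minutes=30),
    "reference-codes": timedelta(minutes=2),
}
print(staleness_violations(inventory, approved=timedelta(minutes=5)))  # ['catalog']
```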

Common mistakes

  • Treating Integrated cache as a harmless label instead of checking the live resource, scope, owner, and dependencies.
  • Running a mutating command in the wrong subscription, resource group, account, or dedicated gateway service.
  • Assuming a successful deployment proves the feature works without checking logs, metrics, access, and rollback evidence.
  • Ignoring cost, retention, quotas, network exposure, or data classification until an incident forces emergency cleanup.