Load balancer health probe

A load balancer health probe is Azure’s way of asking each backend instance whether it should receive traffic. The probe checks a configured port or endpoint, then the load balancer sends new flows only to backends that respond as healthy. It is not the same as an application monitoring alert. It is a traffic-control signal, so a bad probe can remove good servers or keep bad servers in rotation. The key is knowing which Azure component owns the behavior before changing production configuration.

Aliases: No aliases mapped yet
Difficulty: intermediate
CLI mappings: 4
Last verified: 2026-05-16

Microsoft Learn

Microsoft Learn describes an Azure Load Balancer health probe as the check that determines whether backend pool instances are healthy enough to receive new traffic. Probe configuration includes protocol, port, interval, and threshold choices that influence traffic distribution. Operators should review the probe together with the load-balancing rules that reference it and the backend pool those rules target.

Microsoft Learn: Azure Load Balancer components (2026-05-16)

Technical context

Technically, the health probe belongs to an Azure Load Balancer and is referenced by load-balancing rules. It can use TCP, HTTP, or HTTPS depending on the design, with settings such as port, request path, interval, and failure threshold. The probe originates from Azure infrastructure and evaluates backend instances in the pool. Operators review probe configuration together with NSGs, guest firewalls, application listeners, backend pool membership, and diagnostic metrics such as Health Probe Status. That context helps operators separate resource configuration, runtime behavior, and dependency troubleshooting during reviews.
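
The interaction between individual probe results and the threshold can be sketched as a small state tracker. This is a simplified model for reasoning about thresholds, not Azure's actual implementation; the class name `ProbeState` and the threshold value of 2 are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProbeState:
    """Simplified model of one backend's probe state (illustrative only)."""
    threshold: int = 2   # consecutive disagreeing results needed to flip state (assumed)
    healthy: bool = True
    _streak: int = 0     # consecutive results that disagree with the current state

    def record(self, success: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if success == self.healthy:
            self._streak = 0            # result agrees with current state
        else:
            self._streak += 1
            if self._streak >= self.threshold:
                self.healthy = success  # enough consecutive disagreement: flip
                self._streak = 0
        return self.healthy


state = ProbeState(threshold=2)
results = [True, False, True, False, False, True, True]
history = [state.record(r) for r in results]
print(history)  # [True, True, True, True, False, False, True]
```

In this model a single failed probe does not remove the backend; only two consecutive failures do, and two consecutive successes bring it back.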

Why it matters

The health probe matters because it decides which backend instances receive production traffic. If the probe checks the wrong port, uses a path the app does not serve, or is blocked by a guest firewall, healthy instances can be drained from rotation. If the probe is too shallow, broken applications may continue receiving traffic. This makes probe design a reliability and operations decision, not just a networking detail. It should check readiness, not merely whether the VM is powered on, and it should match how the workload actually fails. Clear ownership also makes incident triage faster because teams know which setting changed and why.

Where you see it

Signals, screens, and Azure surfaces where this term usually becomes operational.

Signal 01

In load balancer settings, each health probe is listed with its protocol, port, path, interval, and threshold values, along with the load-balancing rules that reference it.

Signal 02

In Azure metrics, Health Probe Status shows whether backend instances are considered healthy enough to receive new client flows.

Signal 03

During incidents, probes come into focus when traffic stops reaching an instance because NSGs, guest firewalls, listeners, or readiness endpoints changed.

When this becomes relevant

Specific situations where this term helps solve real Azure design, operations, migration, security, reliability, cost, or governance problems.

  • Publishing a stable frontend address while backend VMs scale, patch, or change.
  • Separating internal and public traffic paths for the same application family.
  • Troubleshooting why healthy-looking infrastructure is not receiving client connections.
  • Documenting routing dependencies for security review, cutover, and incident response.

Real-world case studies

Different enterprise-style examples that show the term being used to hit measurable objectives.

Case study 01

Retail readiness probe redesign

Scenario

Fabrikam Market, an online retailer, saw checkout failures because backend VMs passed a TCP probe even when the checkout application could not reach its payment dependency.

Business/Technical Objectives
  • Stop routing checkout traffic to application-broken instances
  • Detect failed readiness within two probe intervals
  • Reduce false healthy signals during dependency outages
  • Improve release safety before peak shopping events
Solution Using Load balancer health probe

The team replaced the shallow TCP probe with an HTTP health probe that called a lightweight readiness endpoint on each checkout VM. The endpoint verified that the web process was running and that required local dependencies were initialized, without exposing payment details. Load-balancing rules were updated to use the new probe, and NSGs were reviewed so that only the required probe port was allowed. Application Insights and Load Balancer metrics were placed on the same dashboard so operators could compare failed readiness with user-impacting checkout errors during releases. The implementation team documented owners, expected signals, rollback steps, and post-change evidence so operations could support the design after handoff.

Results & Business Impact
  • Checkout errors during backend failures dropped by 63%
  • Two bad instances were removed from rotation automatically during testing
  • Release validation time fell from 90 minutes to 35 minutes
  • No sensitive diagnostic information was exposed through the probe endpoint
Key Takeaway for Glossary Readers

A health probe should measure readiness to serve traffic, not simply whether a port is open.
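
A readiness endpoint of the kind described in this case study might look like the following minimal sketch, using only the Python standard library. The `/healthz` path, port 8080, and the `READY` flag are assumptions for illustration; a real service would set the flag from its own startup and dependency checks. An Azure Load Balancer HTTP probe treats a 200 response as healthy and any other status as unhealthy.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical readiness flag, set by the application once it can serve traffic.
READY = threading.Event()


class ReadinessHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthz":      # assumed probe path
            self.send_error(404)
            return
        status = 200 if READY.is_set() else 503
        body = b"ok" if status == 200 else b"unavailable"  # minimal signal, no diagnostics
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep probe traffic out of stderr in this sketch


def make_server(host: str = "0.0.0.0", port: int = 8080) -> HTTPServer:
    """Build the readiness server; call serve_forever() on the result to run it."""
    return HTTPServer((host, port), ReadinessHandler)

# To run: READY.set(); make_server().serve_forever()
```

The response body carries nothing beyond a one-word status, which keeps the endpoint from leaking dependency or payment details.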

Case study 02

Banking API probe tuning

Scenario

Bluewater Credit Union operated a member API behind Azure Load Balancer and needed to reduce traffic flapping during short CPU spikes on patched backend VMs.

Business/Technical Objectives
  • Reduce unnecessary backend removal during brief spikes
  • Keep unhealthy APIs out of rotation within five minutes
  • Separate probe failures from application dependency alerts
  • Document probe thresholds for audit review
Solution Using Load balancer health probe

Architects reviewed existing HTTP probe interval and threshold settings, then tuned them to tolerate short CPU spikes while still detecting true failures. They changed the probe path to a fast status endpoint and removed database calls that made the probe too expensive. The rule and probe definitions were exported with Azure CLI for audit evidence, and Application Insights alerts were adjusted to show whether failures came from probe status, dependency latency, or API code. The resulting probe was lightweight, secure, and aligned with real production readiness.

Results & Business Impact
  • False unhealthy events dropped by 71% after the tuning change
  • Patch windows completed without unnecessary traffic churn
  • API availability stayed above the internal 99.9% target
  • Auditors received probe configuration evidence on the same day
Key Takeaway for Glossary Readers

Health probe tuning improves reliability only when it reflects how the workload actually starts, fails, and recovers.
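
The trade-off in this case study can be checked with simple arithmetic: a backend is removed after roughly interval times threshold seconds of continuous failure. The sketch below is a rough estimate only; it ignores probe timeouts and any platform propagation delay, and the candidate settings are hypothetical values rather than the ones the team chose.

```python
def approx_detection_seconds(interval_s: int, threshold: int) -> int:
    """Rough time for a continuously failing backend to be marked unhealthy."""
    return interval_s * threshold


OBJECTIVE_S = 5 * 60  # the five-minute objective from the case study

# Hypothetical (interval, threshold) candidates to compare.
candidates = [(5, 2), (15, 4), (60, 6)]

for interval_s, threshold in candidates:
    seconds = approx_detection_seconds(interval_s, threshold)
    verdict = "within" if seconds <= OBJECTIVE_S else "exceeds"
    print(f"interval={interval_s}s threshold={threshold}: ~{seconds}s ({verdict} objective)")

within_objective = [c for c in candidates if approx_detection_seconds(*c) <= OBJECTIVE_S]
```

A larger threshold tolerates more transient failures before removal, at the cost of a longer detection time, so both numbers should be reviewed together.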

Case study 03

Logistics route-service failover

Scenario

Apex Freight Systems needed its route-optimization service to drain traffic from VMs when a background map-cache refresh made the service temporarily unable to answer requests.

Business/Technical Objectives
  • Prevent clients from reaching route engines during cache rebuilds
  • Keep existing healthy VMs receiving new traffic
  • Show operators why each backend was removed
  • Avoid manual backend pool changes during refresh windows
Solution Using Load balancer health probe

The platform team added a readiness endpoint that returned unhealthy while the route engine rebuilt its local map cache. Azure Load Balancer health probes used that endpoint, and the load-balancing rule stopped new flows to rebuilding instances while leaving healthy backends active. The team created runbook steps for viewing Health Probe Status, VM logs, and backend pool membership together. This avoided manual removal from the pool and ensured the probe reflected service readiness rather than only process uptime.

Results & Business Impact
  • Manual backend pool edits were eliminated from cache refresh windows
  • Customer route latency spikes dropped by 38%
  • Operators could identify rebuilding nodes within two minutes
  • The route service completed three refresh cycles without client-visible errors
Key Takeaway for Glossary Readers

A health probe can turn application readiness into safe traffic distribution when the probe endpoint is deliberately designed.
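
One way to make a service report unhealthy for the duration of a maintenance task, as in the cache-rebuild scenario above, is a small readiness gate. This is a sketch under assumptions: the `ReadinessGate` class and its method names are hypothetical, and the status code it returns would be served by whatever HTTP endpoint the probe calls.

```python
import threading
from contextlib import contextmanager


class ReadinessGate:
    """Tracks tasks that should take the instance out of rotation while they run."""

    def __init__(self):
        self._lock = threading.Lock()
        self._blockers = 0  # number of in-progress tasks that block readiness

    @contextmanager
    def not_ready_during(self):
        """Mark the instance not ready while the wrapped block executes."""
        with self._lock:
            self._blockers += 1
        try:
            yield
        finally:
            with self._lock:
                self._blockers -= 1  # restored even if the task raises

    def status_code(self) -> int:
        """HTTP status the readiness endpoint should return right now."""
        with self._lock:
            return 200 if self._blockers == 0 else 503


gate = ReadinessGate()
observed = [gate.status_code()]
with gate.not_ready_during():
    observed.append(gate.status_code())  # what a probe would see mid-rebuild
observed.append(gate.status_code())
print(observed)  # [200, 503, 200]
```

Because readiness is restored in a `finally` block, a failed rebuild does not leave the instance permanently out of rotation; whether that is the right behavior depends on whether the service can serve traffic with a stale cache.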

Why use Azure CLI for this?

Azure CLI is useful for a load balancer health probe because probe mistakes often hide behind healthy-looking resources. Commands show probe protocol, port, thresholds, linked rules, backend pools, and related network settings in one reviewable output.

CLI use cases

  • List frontend IP configurations, rules, probes, and backend pools for a load balancer during incident triage.
  • Export load balancer configuration before a change so reviewers can compare intended and actual routing behavior.
  • Check public IP, private IP, and backend pool references when DNS resolves but traffic still fails.
  • Automate repeatable reviews of exposed ports, probe settings, and backend membership across environments.

Before you run CLI

  • Confirm the subscription, resource group, load balancer name, and whether the design is public or internal.
  • Verify you have Reader access for inspection, or Network Contributor (or an equivalent role) for changes, on the load balancer, IP, NSG, and NIC resources.
  • Know whether you are inspecting only, or making changes that can immediately affect production traffic.
  • Choose table or JSON output before sharing evidence with responders, reviewers, or automation.

What output tells you

  • Frontend IP output shows which public or private address receives client traffic and which resource owns it.
  • Rule output shows frontend port, backend port, protocol, backend pool, probe, idle timeout, and persistence behavior.
  • Probe and backend pool output helps explain whether Azure considers instances healthy enough for new flows.
  • Linked resource IDs reveal whether the configuration points to the expected environment and workload.
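
The rule-to-probe relationship in that output can be checked programmatically. The sketch below works on a hand-written sample shaped like JSON output from `az network lb show`; the resource names, values, and exact field names are assumptions and should be confirmed against real output before relying on them.

```python
# Hand-written sample; names and values are hypothetical.
SAMPLE_LB = {
    "name": "lb-checkout",
    "probes": [
        {"name": "web-ready", "protocol": "Http", "port": 8080,
         "requestPath": "/healthz", "intervalInSeconds": 5},
    ],
    "loadBalancingRules": [
        {"name": "https-rule", "frontendPort": 443, "backendPort": 8443,
         "probe": {"id": "/subscriptions/<sub>/resourceGroups/<rg>/providers/"
                         "Microsoft.Network/loadBalancers/lb-checkout/probes/web-ready"}},
        {"name": "legacy-rule", "frontendPort": 8081, "backendPort": 8081,
         "probe": None},
    ],
}


def probe_for_each_rule(lb: dict) -> dict:
    """Map each load-balancing rule name to its probe definition, or None."""
    probes_by_name = {p["name"]: p for p in lb.get("probes", [])}
    report = {}
    for rule in lb.get("loadBalancingRules", []):
        ref = rule.get("probe") or {}
        probe_name = ref.get("id", "").rsplit("/", 1)[-1]  # last ID segment
        report[rule["name"]] = probes_by_name.get(probe_name)
    return report


report = probe_for_each_rule(SAMPLE_LB)
for rule_name, probe in report.items():
    if probe is None:
        print(f"{rule_name}: no matching probe")
    else:
        print(f"{rule_name}: {probe['protocol']} port {probe['port']} "
              f"path {probe.get('requestPath')}")
```

A rule that maps to `None` either has no probe or references one that is missing from the load balancer, and both cases are worth flagging in a review.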

Mapped Azure CLI commands

Load balancer operations

Direct
az network lb list --resource-group <resource-group>
Command group: az network lb | Action: discover | Category: Networking
az network lb show --name <lb-name> --resource-group <resource-group>
Command group: az network lb | Action: discover | Category: Networking
az network lb create --name <lb-name> --resource-group <resource-group>
Command group: az network lb | Action: provision | Category: Networking
az network lb probe list --lb-name <lb-name> --resource-group <resource-group>
Command group: az network lb probe | Action: discover | Category: Networking

Security

Security for a health probe focuses on allowing the right probe traffic without opening unnecessary access. Backend NSGs and host firewalls must allow Azure probe traffic on the probe port, but that exception should not become a broad inbound rule from untrusted networks. HTTP probe paths should avoid exposing sensitive diagnostics, secrets, stack traces, or administrative endpoints. Operators should document why the probe port is open and monitor unexpected changes. A secure probe endpoint returns only the minimum health signal needed for routing. It should not become a backdoor status page with privileged information. Reviewers should record the approved boundary and verify alerts after any configuration change.

Cost

Cost impact is mostly indirect. Health probes themselves are not usually the main bill driver, but poor probe design can cause overprovisioning, emergency scale-out, failed deployments, and wasted incident time. If probes mark instances unhealthy too easily, teams may add capacity instead of fixing readiness logic. If probes are too shallow, outages can trigger support escalations and business loss. FinOps reviews should connect probe quality to backend utilization and release stability. A small investment in proper health checks often prevents expensive overbuilding and avoidable troubleshooting across compute, monitoring, and networking teams. Tagging and ownership evidence make it easier to challenge waste without breaking useful safeguards.

Reliability

Reliability depends heavily on accurate probe behavior. A probe should fail when the instance cannot safely serve traffic and recover when the instance is ready again. Overly aggressive intervals can flap during transient CPU or startup spikes, while slow probes can keep traffic on broken instances too long. Multi-instance services need enough healthy backends to absorb maintenance, patching, and rolling deployments. Operators should test probe failure deliberately, especially after code changes that affect health endpoints. Reliable designs also avoid sharing one weak probe across services with different readiness requirements. Testing this path before release prevents avoidable surprises during scale, failover, or recovery.

Performance

Performance is affected because health probes control which instances receive new flows. A backend that passes a shallow TCP probe might still be slow at the application layer, causing user-facing latency while appearing healthy to the load balancer. Conversely, a probe path that is too expensive can add unnecessary work to every instance. Operators should design lightweight readiness endpoints, watch Health Probe Status alongside request latency, and confirm probes are not blocked during startup. Probe intervals and thresholds should support fast failure detection without creating instability or excessive false negatives. Measurements should be taken from the application path, not only from control-plane configuration.
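
One way to keep a readiness endpoint lightweight is to cache the result of a more expensive check for a short period, so each probe request does not repeat the work. The sketch below is illustrative: `CachedCheck` and the 5-second TTL are assumptions, and the injected clock exists only to make the example deterministic.

```python
import time
from typing import Callable, Optional


class CachedCheck:
    """Wraps a readiness check and reuses its result until the TTL expires."""

    def __init__(self, check: Callable[[], bool], ttl_s: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self._check = check
        self._ttl_s = ttl_s
        self._clock = clock
        self._value: Optional[bool] = None
        self._expires_at = 0.0

    def __call__(self) -> bool:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._check()          # run the real check
            self._expires_at = now + self._ttl_s
        return self._value


calls = []

def expensive_check() -> bool:
    calls.append(1)   # stand-in for real work such as a dependency ping
    return True

fake_now = [0.0]
ready = CachedCheck(expensive_check, ttl_s=5.0, clock=lambda: fake_now[0])
ready()
ready()               # served from cache
fake_now[0] = 6.0
ready()               # TTL expired, check runs again
print(len(calls))     # 2
```

Caching delays failure detection by up to the TTL, so the TTL should be weighed against the probe interval and threshold rather than chosen in isolation.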

Operations

Operations teams use health probes to separate load balancer issues from application issues. During an outage, they check whether the backend is in the pool, whether the probe succeeds, whether NSGs allow the probe, and whether the app listens on the configured port. CLI output, metrics, and VM logs together show whether Azure stopped sending new flows for a valid reason. Probe changes deserve change control because a one-line port or path change can immediately shift traffic. Good runbooks capture the expected probe response, owner, validation command, and rollback steps.
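
The "does the app listen on the configured port" step can be scripted with a plain TCP connection attempt. This is a minimal sketch; the function name and the 2-second timeout are assumptions. It is most useful when run on the backend VM itself or from a peer in the same virtual network, because a successful connection from elsewhere does not prove that Azure's probe source can reach the port.

```python
import socket


def port_accepts_tcp(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

# Example (hypothetical values): port_accepts_tcp("127.0.0.1", 8080)
```

A `False` result here points at the listener or guest firewall, while a `True` result alongside a failing probe points toward NSG rules or the probe definition.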

Common mistakes

  • Checking only the frontend IP and missing that the rule references the wrong backend pool or probe.
  • Opening a public frontend port while assuming backend VMs without public IPs are not exposed.
  • Changing a probe, rule, or frontend during a release without documenting DNS, NSG, and rollback impact.
  • Treating Load Balancer as an application-layer router instead of a Layer 4 traffic distribution service.