Load balancer health probe

A load balancer health probe is Azure’s way of asking each backend instance whether it should receive traffic. The probe checks a configured port or endpoint, then the load balancer sends new flows only to backends that respond as healthy. It is not the same as an application monitoring alert. It is a traffic-control signal, so a bad probe can remove good servers or keep bad servers in rotation. The key is knowing which Azure component owns the behavior before changing production configuration.

Difficulty: intermediate
CLI mappings: 4
Last verified: 2026-05-16

Microsoft Learn

Microsoft Learn describes an Azure Load Balancer health probe as the check that determines whether backend pool instances are healthy enough to receive new traffic. Probe configuration includes protocol, port, interval, and threshold choices that influence traffic distribution. Operators should review the probe together with the load-balancing rules and backend pool that reference it.

Microsoft Learn: Azure Load Balancer components (2026-05-16)

Technical context

Technically, the health probe belongs to an Azure Load Balancer and is referenced by load-balancing rules. It can use TCP, HTTP, or HTTPS depending on the design, with settings such as port, request path, interval, and failure threshold. The probe originates from Azure infrastructure (source IP address 168.63.129.16, covered by the AzureLoadBalancer service tag) and evaluates each backend instance in the pool. Operators review probe configuration together with NSGs, guest firewalls, application listeners, backend pool membership, and diagnostic metrics such as Health Probe Status. Reviewing these together helps separate resource configuration problems from runtime behavior and dependency failures.
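As a rough illustration of how the failure threshold turns individual probe results into a routing decision, the following Python sketch models one backend instance. It is a simplified model, not Azure's implementation: the class name ProbeState, the threshold parameter, and the assumption that the same consecutive count applies to both failure and recovery are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProbeState:
    """Simplified model of one backend instance as seen by a health probe."""

    threshold: int = 2     # consecutive disagreeing results needed to flip state (assumed)
    healthy: bool = True   # current state used for routing new flows
    _streak: int = 0       # consecutive results that disagree with the current state

    def record(self, probe_succeeded: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if probe_succeeded == self.healthy:
            # The result agrees with the current state, so any streak is reset.
            self._streak = 0
        else:
            self._streak += 1
            if self._streak >= self.threshold:
                self.healthy = probe_succeeded
                self._streak = 0
        return self.healthy


state = ProbeState(threshold=2)
results = [True, False, True, False, False, True, True]
history = [state.record(ok) for ok in results]
print(history)
```

In this model a single failed probe between successes does not remove the instance; only two consecutive failures do, and two consecutive successes bring it back.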

Why it matters

The health probe matters because it decides which backend instances receive production traffic. If the probe checks the wrong port, uses a path the app does not serve, or is blocked by a guest firewall, healthy instances can be drained from rotation. If the probe is too shallow, broken applications may continue receiving traffic. This makes probe design a reliability and operations decision, not just a networking detail. It should check readiness, not merely whether the VM is powered on, and it should match how the workload actually fails. Clear ownership also makes incident triage faster because teams know which setting changed and why.

Where you see it

Signals, screens, and Azure surfaces where this term usually becomes operational.

Signal 01

In load balancer settings, health probes appear with protocol, port, path, interval, and threshold values, and load-balancing rules reference them by name.

Signal 02

In Azure metrics, Health Probe Status shows whether backend instances are considered healthy enough to receive new client flows.

Signal 03

During incidents, probes become the focus when traffic stops reaching an instance because NSGs, guest firewalls, listeners, or readiness endpoints changed.

When this becomes relevant

Specific situations where this term helps solve real Azure design, operations, migration, security, reliability, cost, or governance problems.

Publishing a stable frontend address while backend VMs scale, patch, or change.
Separating internal and public traffic paths for the same application family.
Troubleshooting why healthy-looking infrastructure is not receiving client connections.
Documenting routing dependencies for security review, cutover, and incident response.

Real-world case studies

Different enterprise-style examples that show the term being used to hit measurable objectives.

Case study 01

Retail readiness probe redesign

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

Fabrikam Market, an online retailer, saw checkout failures because backend VMs passed a TCP probe even when the checkout application could not reach its payment dependency.

Business/Technical Objectives

Stop routing checkout traffic to application-broken instances
Detect failed readiness within two probe intervals
Reduce false healthy signals during dependency outages
Improve release safety before peak shopping events

Solution Using Load balancer health probe

The team replaced the shallow TCP probe with an HTTP health probe that called a lightweight readiness endpoint on each checkout VM. The endpoint verified that the web process was running and that required local dependencies were initialized, without exposing payment details. Load-balancing rules were updated to use the new probe, and NSGs were reviewed to allow only the required probe port. Application Insights and Load Balancer metrics were placed on the same dashboard so operators could compare failed readiness with user-impacting checkout errors during releases. The implementation team documented owners, expected signals, rollback steps, and post-change evidence so operations could support the design after handoff.
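A readiness endpoint of the kind described here could look roughly like the sketch below, written with the Python standard library. The /ready path, the port, and the check callable are hypothetical; the case study does not say how Fabrikam Market implemented its endpoint. The sketch relies on one known behavior: an Azure Load Balancer HTTP probe treats only an HTTP 200 response as healthy.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Callable, Tuple

READY_PATH = "/ready"  # hypothetical probe path, not taken from the case study


def readiness_response(path: str, is_ready: bool) -> Tuple[int, bytes]:
    """Return the status code and a minimal body for a probe request."""
    if path != READY_PATH:
        return 404, b"not found"
    if is_ready:
        return 200, b"ready"
    # Any non-200 status is treated as a failed HTTP probe.
    return 503, b"not ready"


def make_handler(check: Callable[[], bool]):
    """Build a request handler that reports the result of `check`."""

    class ReadinessHandler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            status, body = readiness_response(self.path, check())
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format: str, *args) -> None:
            # Keep frequent probe requests out of the default stderr log.
            pass

    return ReadinessHandler


def serve(check: Callable[[], bool], port: int = 8080) -> None:
    """Serve the readiness endpoint until the process stops."""
    ThreadingHTTPServer(("0.0.0.0", port), make_handler(check)).serve_forever()
```

The response body is deliberately a fixed string, so the endpoint returns a health signal without leaking dependency names, stack traces, or payment details. The check callable would wrap whatever local initialization checks the application actually needs.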

Results & Business Impact

Checkout errors during backend failures dropped by 63%
Two bad instances were removed from rotation automatically during testing
Release validation time fell from 90 minutes to 35 minutes
No sensitive diagnostic information was exposed through the probe endpoint

Key Takeaway for Glossary Readers

A health probe should measure readiness to serve traffic, not simply whether a port is open.

Case study 02

Banking API probe tuning

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

Bluewater Credit Union operated a member API behind Azure Load Balancer and needed to reduce traffic flapping during short CPU spikes on patched backend VMs.

Business/Technical Objectives

Reduce unnecessary backend removal during brief spikes
Keep unhealthy APIs out of rotation within five minutes
Separate probe failures from application dependency alerts
Document probe thresholds for audit review

Solution Using Load balancer health probe

Architects reviewed existing HTTP probe interval and threshold settings, then tuned them to tolerate short startup spikes while still detecting true failures. They changed the probe path to a fast status endpoint and removed database calls that made the probe too expensive. The rule and probe definitions were exported with Azure CLI for audit evidence, and Application Insights alerts were adjusted to show whether failures came from probe status, dependency latency, or API code. The resulting probe was lightweight, secure, and aligned with real production readiness. The implementation team documented owners, expected signals, rollback steps, and post-change evidence so operations could support the design after handoff.
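The trade-off between tolerating brief spikes and detecting real failures can be estimated with simple arithmetic before any setting is changed. The sketch below is an approximation under stated assumptions: it ignores probe timeouts and platform scheduling, and the candidate values are hypothetical rather than Bluewater Credit Union's actual settings. The 300-second budget reflects the objective of removing unhealthy APIs within five minutes.

```python
from typing import NamedTuple


class ProbeTiming(NamedTuple):
    """Probe interval in seconds and consecutive-failure threshold."""

    interval_s: int
    threshold: int

    @property
    def best_case_s(self) -> int:
        # The instance fails just before a probe fires, so the first
        # failed probe happens almost immediately.
        return self.interval_s * (self.threshold - 1)

    @property
    def worst_case_s(self) -> int:
        # The instance fails just after a successful probe, so a full
        # interval passes before the first failed probe.
        return self.interval_s * self.threshold


def within_budget(timing: ProbeTiming, budget_s: int) -> bool:
    """True if the approximate worst-case detection time fits the budget."""
    return timing.worst_case_s <= budget_s


# Hypothetical candidate settings for review.
candidates = [ProbeTiming(5, 2), ProbeTiming(15, 4), ProbeTiming(60, 6)]
for timing in candidates:
    print(timing, timing.best_case_s, timing.worst_case_s, within_budget(timing, 300))
```

A larger threshold tolerates more consecutive failed probes during a spike, but it stretches detection time by the same factor, so both numbers belong in the audit record.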

Results & Business Impact

False unhealthy events dropped by 71% after the tuning change
Patch windows completed without unnecessary traffic churn
API availability stayed above the internal 99.9% target
Auditors received probe configuration evidence on the same day

Key Takeaway for Glossary Readers

Health probe tuning improves reliability only when it reflects how the workload actually starts, fails, and recovers.

Case study 03

Logistics route-service failover

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

Apex Freight Systems needed its route-optimization service to drain traffic from VMs when a background map-cache refresh made the service temporarily unable to answer requests.

Business/Technical Objectives

Prevent clients from reaching route engines during cache rebuilds
Keep existing healthy VMs receiving new traffic
Show operators why each backend was removed
Avoid manual backend pool changes during refresh windows

Solution Using Load balancer health probe

The platform team added a readiness endpoint that returned unhealthy while the route engine rebuilt its local map cache. Azure Load Balancer health probes used that endpoint, and the load-balancing rule stopped new flows to rebuilding instances while leaving healthy backends active. The team created runbook steps for viewing Health Probe Status, VM logs, and backend pool membership together. This avoided manual removal from the pool and ensured the probe reflected service readiness rather than only process uptime. The implementation team documented owners, expected signals, rollback steps, and post-change evidence so operations could support the design after handoff.
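One way an application might expose "not ready during a rebuild" to its readiness endpoint is a small gate object that the rebuild code wraps around its work. The sketch below is illustrative only: the ReadinessGate class and its method names are hypothetical and are not taken from Apex Freight Systems' implementation.

```python
import threading
from contextlib import contextmanager
from typing import Iterator


class ReadinessGate:
    """Tracks whether any in-progress task should make the service not ready."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._blockers = 0  # number of tasks currently blocking readiness

    @contextmanager
    def not_ready(self) -> Iterator[None]:
        """Mark the service not ready for the duration of the with-block."""
        with self._lock:
            self._blockers += 1
        try:
            yield
        finally:
            # Runs even if the rebuild raises, so the gate cannot stay closed.
            with self._lock:
                self._blockers -= 1

    def status_code(self) -> int:
        """HTTP status a readiness endpoint would return right now."""
        with self._lock:
            return 200 if self._blockers == 0 else 503


gate = ReadinessGate()
observed = [gate.status_code()]
with gate.not_ready():
    # A hypothetical map-cache rebuild would run here.
    observed.append(gate.status_code())
observed.append(gate.status_code())
print(observed)
```

Because the gate reopens in a finally block, a failed rebuild does not leave the instance permanently out of rotation. Whether that is desirable depends on whether the service can answer requests with a stale cache.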

Results & Business Impact

Manual backend pool edits were eliminated from cache refresh windows
Customer route latency spikes dropped by 38%
Operators could identify rebuilding nodes within two minutes
The route service completed three refresh cycles without client-visible errors

Key Takeaway for Glossary Readers

A health probe can turn application readiness into safe traffic distribution when the probe endpoint is deliberately designed.

Why use Azure CLI for this?

Azure CLI is useful for a load balancer health probe because probe mistakes often hide behind healthy-looking resources. Commands show probe protocol, port, thresholds, linked rules, backend pools, and related network settings in one reviewable output.

CLI use cases

List frontend IP configurations, rules, probes, and backend pools for a load balancer during incident triage.
Export load balancer configuration before a change so reviewers can compare intended and actual routing behavior.
Check public IP, private IP, and backend pool references when DNS resolves but traffic still fails.
Automate repeatable reviews of exposed ports, probe settings, and backend membership across environments.

Before you run CLI

Confirm the subscription, resource group, load balancer name, and whether the design is public or internal.
Verify you have Reader access for inspection, or Network Contributor (or equivalent) for changes, on the load balancer, IP, NSG, and NIC resources.
Know whether you are inspecting only, or making changes that can immediately affect production traffic.
Choose table or JSON output before sharing evidence with responders, reviewers, or automation.

What output tells you

Frontend IP output shows which public or private address receives client traffic and which resource owns it.
Rule output shows frontend port, backend port, protocol, backend pool, probe, idle timeout, and persistence behavior.
Probe and backend pool output helps explain whether Azure considers instances healthy enough for new flows.
Linked resource IDs reveal whether the configuration points to the expected environment and workload.

Mapped Azure CLI commands

Load balancer operations

Direct

az network lb list --resource-group <resource-group>

Command group: az network lb | Action: discover | Category: Networking

az network lb show --name <lb-name> --resource-group <resource-group>

Command group: az network lb | Action: discover | Category: Networking

az network lb create --name <lb-name> --resource-group <resource-group>

Command group: az network lb | Action: provision | Category: Networking

az network lb probe list --lb-name <lb-name> --resource-group <resource-group>

Command group: az network lb probe | Action: discover | Category: Networking
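JSON output from the show command can be post-processed to see which probe each rule depends on. The Python sketch below works on a trimmed, made-up sample shaped like the probes and loadBalancingRules sections of that output. The resource names and subscription ID are placeholders, and real output contains many more fields, so treat the field selection as an assumption to check against your own export.

```python
import json
from typing import List, Optional, Tuple

# Trimmed, illustrative sample; names and IDs are placeholders.
SAMPLE_LB_JSON = """
{
  "name": "lb-example",
  "probes": [
    {"name": "http-ready", "protocol": "Http", "port": 8080,
     "requestPath": "/ready", "intervalInSeconds": 5, "numberOfProbes": 2},
    {"name": "tcp-legacy", "protocol": "Tcp", "port": 443,
     "requestPath": null, "intervalInSeconds": 15, "numberOfProbes": 2}
  ],
  "loadBalancingRules": [
    {"name": "web-443", "frontendPort": 443, "backendPort": 8443,
     "probe": {"id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-example/providers/Microsoft.Network/loadBalancers/lb-example/probes/http-ready"}}
  ]
}
"""


def probe_name_of(rule: dict) -> Optional[str]:
    """Return the probe name from a rule's probe resource ID, if any."""
    probe = rule.get("probe")
    if not probe or not probe.get("id"):
        return None
    return probe["id"].rsplit("/", 1)[-1]


def rules_with_probes(lb: dict) -> List[Tuple[str, Optional[str], Optional[str], Optional[int]]]:
    """List (rule name, probe name, probe protocol, probe port) per rule."""
    probes = {p["name"]: p for p in lb.get("probes", [])}
    rows = []
    for rule in lb.get("loadBalancingRules", []):
        name = probe_name_of(rule)
        probe = probes.get(name, {})
        rows.append((rule["name"], name, probe.get("protocol"), probe.get("port")))
    return rows


def unreferenced_probes(lb: dict) -> List[str]:
    """Names of probes that no load-balancing rule references."""
    used = {probe_name_of(r) for r in lb.get("loadBalancingRules", [])}
    return [p["name"] for p in lb.get("probes", []) if p["name"] not in used]


lb = json.loads(SAMPLE_LB_JSON)
print(rules_with_probes(lb))
print(unreferenced_probes(lb))
```

To run this against a real load balancer, save the show command's output with --output json to a file and load that file in place of the sample string.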

Architecture context

Security

Security for a health probe focuses on allowing the right probe traffic without opening unnecessary access. Backend NSGs and host firewalls must allow probe traffic on the probe port, but that exception should be scoped to the AzureLoadBalancer service tag rather than becoming a broad inbound rule from untrusted networks. HTTP probe paths should avoid exposing sensitive diagnostics, secrets, stack traces, or administrative endpoints. Operators should document why the probe port is open and monitor unexpected changes. A secure probe endpoint returns only the minimum health signal needed for routing. It should not become a backdoor status page with privileged information. Reviewers should record the approved boundary and verify alerts after any configuration change.
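A quick way to reason about whether an NSG change would block probe traffic is to evaluate the inbound rules in priority order. The Python sketch below is a deliberately simplified evaluator. The rule dictionaries use a hypothetical flattened shape (source, protocol, port) rather than the real NSG schema, and only sources of AzureLoadBalancer or * are treated as matching the probe. The two default rules shown, AllowAzureLoadBalancerInBound at priority 65001 and DenyAllInBound at 65500, exist in every NSG.

```python
from typing import Dict, List

Rule = Dict[str, object]

DEFAULT_RULES: List[Rule] = [
    {"name": "AllowAzureLoadBalancerInBound", "priority": 65001, "direction": "Inbound",
     "access": "Allow", "source": "AzureLoadBalancer", "protocol": "*", "port": "*"},
    {"name": "DenyAllInBound", "priority": 65500, "direction": "Inbound",
     "access": "Deny", "source": "*", "protocol": "*", "port": "*"},
]


def port_matches(port_range: str, port: int) -> bool:
    """Match a single port against '*', 'N', or 'LOW-HIGH'."""
    if port_range == "*":
        return True
    if "-" in port_range:
        low, high = port_range.split("-", 1)
        return int(low) <= port <= int(high)
    return int(port_range) == port


def probe_allowed(rules: List[Rule], probe_port: int) -> bool:
    """Return True if the first matching inbound rule allows the probe."""
    inbound = [r for r in rules if r["direction"] == "Inbound"]
    for rule in sorted(inbound, key=lambda r: r["priority"]):
        source_ok = rule["source"] in ("AzureLoadBalancer", "*")
        protocol_ok = rule["protocol"] in ("Tcp", "*")
        if source_ok and protocol_ok and port_matches(str(rule["port"]), probe_port):
            # The lowest priority number that matches decides the outcome.
            return rule["access"] == "Allow"
    return False


# Hypothetical custom rules for illustration.
deny_all = {"name": "deny-all-inbound", "priority": 4000, "direction": "Inbound",
            "access": "Deny", "source": "*", "protocol": "*", "port": "*"}
allow_probe = {"name": "allow-lb-probe", "priority": 300, "direction": "Inbound",
               "access": "Allow", "source": "AzureLoadBalancer", "protocol": "Tcp", "port": "8080"}

print(probe_allowed(DEFAULT_RULES, 8080))
print(probe_allowed(DEFAULT_RULES + [deny_all], 8080))
print(probe_allowed(DEFAULT_RULES + [deny_all, allow_probe], 8080))
```

The middle case shows a common failure: a custom deny-all rule placed above the default rules silently blocks the probe unless a narrower allow rule for the AzureLoadBalancer tag is added at a higher priority.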

Cost

Cost impact is mostly indirect. Health probes themselves are not usually the main bill driver, but poor probe design can cause overprovisioning, emergency scale-out, failed deployments, and wasted incident time. If probes mark instances unhealthy too easily, teams may add capacity instead of fixing readiness logic. If probes are too shallow, outages can trigger support escalations and business loss. FinOps reviews should connect probe quality to backend utilization and release stability. A small investment in proper health checks often prevents expensive overbuilding and avoidable troubleshooting across compute, monitoring, and networking teams. Tagging and ownership evidence make it easier to challenge waste without breaking useful safeguards.

Reliability

Reliability depends heavily on accurate probe behavior. A probe should fail when the instance cannot safely serve traffic and recover when the instance is ready again. Overly short intervals or low thresholds can cause instances to flap during transient CPU or startup spikes, while long intervals or high thresholds can keep traffic on broken instances too long. Multi-instance services need enough healthy backends to absorb maintenance, patching, and rolling deployments. Operators should test probe failure deliberately, especially after code changes that affect health endpoints. Reliable designs also avoid sharing one weak probe across services with different readiness requirements. Testing this path before release prevents avoidable surprises during scale, failover, or recovery.

Performance

Performance is affected because health probes control which instances receive new flows. A backend that passes a shallow TCP probe might still be slow at the application layer, causing user-facing latency while appearing healthy to the load balancer. Conversely, a probe path that is too expensive can add unnecessary work to every instance. Operators should design lightweight readiness endpoints, watch Health Probe Status alongside request latency, and confirm probes are not blocked during startup. Probe intervals and thresholds should support fast failure detection without creating instability or excessive false negatives. Measurements should be taken from the application path, not only from control-plane configuration.

Operations

Operations teams use health probes to separate load balancer issues from application issues. During an outage, they check whether the backend is in the pool, whether the probe succeeds, whether NSGs allow the probe, and whether the app listens on the configured port. CLI output, metrics, and VM logs together show whether Azure stopped sending new flows for a valid reason. Probe changes deserve change control because a one-line port or path change can immediately shift traffic. Good runbooks capture the expected probe response, owner, route, validation command, and rollback steps.

Common mistakes

Checking only the frontend IP and missing that the rule references the wrong backend pool or probe.
Opening a public frontend port while assuming backend VMs without public IPs are not exposed.
Changing a probe, rule, or frontend during a release without documenting DNS, NSG, and rollback impact.
Treating Load Balancer as an application-layer router instead of a Layer 4 traffic distribution service.

Operator quick checks

List the load balancer and confirm the frontend IP configuration matches DNS and the expected exposure model.
Show all load-balancing rules and verify protocol, frontend port, backend port, pool, and probe.
Check backend pool membership and confirm the intended NICs, VMs, or IP addresses are present.
Review Load Balancer metrics and NSG rules before blaming application code.

Questions to ask

Which clients should reach this load balancer, and is the frontend public or private by design?
Which rule, backend pool, and probe does this traffic path depend on?
What evidence proves backend instances are healthy enough to receive new flows?
What breaks if this frontend, rule, or probe changes during a deployment?
