Compute Virtual Machines field-manual-complete field-manual operator-field-manual

VM boot diagnostics storage

VM boot diagnostics storage is where Azure keeps the evidence needed when a virtual machine will not start cleanly. Instead of guessing why a VM is stuck, you can review console output and, for supported scenarios, a screenshot from the hypervisor. The storage may be managed by Azure or provided by you. This does not fix the VM by itself, but it gives operators a safer way to understand boot loops, bad images, filesystem errors, network mistakes, and extension problems.

Back to glossary browser Open Microsoft Learn source

Aliases: boot diagnostics storage, Azure boot diagnostics, serial log storage, VM boot log storage
Difficulty: intermediate
CLI mappings: 5
Last verified: 2026-05-28

Microsoft Learn

VM boot diagnostics storage is the storage location used by Azure boot diagnostics to capture serial console logs and screenshots during virtual machine startup. It helps operators troubleshoot startup failures, kernel messages, inaccessible consoles, and early boot problems for Linux and Windows virtual machines.

Microsoft Learn: Azure boot diagnostics2026-05-28

Technical context

Boot diagnostics belongs to the Azure Compute troubleshooting path. When enabled on a VM or scale-set instance, Azure captures serial log information and screenshots during boot and shutdown phases. The data is stored either in a managed storage account or a custom storage account, depending on configuration. Operators access it through the portal, CLI, or support workflows. It connects to VM images, cloud-init, extensions, serial console, repair workflows, storage access, and incident evidence. The VM resource remains the control-plane object, while logs expose guest startup behavior.

Why it matters

Boot diagnostics storage matters because many VM failures happen before normal agents, SSH, RDP, or application monitoring are available. Without serial logs or screenshots, teams waste time resizing, redeploying, or blaming networking when the actual issue is a kernel panic, invalid fstab entry, expired password prompt, failed cloud-init script, or broken image. Good boot diagnostics shortens recovery and improves evidence quality. It also protects support teams from unsafe troubleshooting habits, such as opening public management ports too early. The storage choice matters because diagnostics data can contain hostnames, paths, usernames, startup errors, and occasionally sensitive command output. Evidence should survive repair actions.

Where you see it

Signals, screens, and Azure surfaces where this term usually becomes operational.

Signal 01

In the VM portal Help section, Boot diagnostics displays the captured screenshot, serial log, enabled state, and storage configuration used for startup troubleshooting during incidents.

Signal 02

In ARM or Bicep, diagnosticsProfile.bootDiagnostics shows whether boot diagnostics is enabled and whether a custom storageUri is configured for the VM resource during review.

Signal 03

In Azure CLI output, boot-diagnostics commands return boot logs, screenshot or serial log URIs, and diagnostic profile settings needed during incidents and support cases quickly.

When this becomes relevant

Specific situations where this term helps solve real Azure design, operations, migration, security, reliability, cost, or governance problems.

Diagnose a Linux VM stuck in emergency mode after an fstab, kernel, or disk UUID change prevents normal SSH access.
Review Windows startup screenshots when RDP fails and the guest is waiting at recovery, update, or credential prompts.
Validate a custom image pipeline by collecting serial output from first boot across many newly deployed VMs.
Provide evidence to support teams without immediately opening public management ports or rebuilding the affected VM.
Decide whether to use managed diagnostics storage or custom storage for retention, access, and compliance requirements.

Real-world case studies

Different enterprise-style examples that show the term being used to hit measurable objectives.

Case study 01

Media encoding fleet catches bad image before rollout

Media encoding fleet catches bad image before rollout: Boot diagnostics storage gives image pipelines objective startup evidence before a flawed image becomes a fleet outage.

Scenario

A streaming platform refreshed the base image for hundreds of Linux encoding VMs. The first canary wave failed to accept SSH, and the release team needed proof before stopping the rollout.

Business/Technical Objectives

Identify the startup failure without rebuilding all canary VMs.
Preserve serial evidence for the image engineering team.
Keep production encoders on the old image until the cause was known.
Reduce rollback decision time during the release window.

Solution Using VM boot diagnostics storage

Boot diagnostics was enabled on the canary VMs before deployment. When the first machines failed to become reachable, operators used az vm boot-diagnostics get-boot-log to collect serial output and found a cloud-init package command hanging on a retired repository mirror. The screenshot showed the OS reached a login prompt, proving networking and credentials were separate issues. The pipeline halted the image promotion, and the image team corrected repository configuration. Diagnostic logs were stored with the release record, while production VM Scale Sets stayed on the previous image version.

Results & Business Impact

Root cause was identified in 18 minutes instead of the usual two-hour rebuild cycle.
The bad image was blocked before reaching 740 production encoders.
Rollback approval required one incident note rather than multiple portal screenshots.
The next image release added automated boot-log scanning for repository failures.

Key Takeaway for Glossary Readers

Boot diagnostics storage gives image pipelines objective startup evidence before a flawed image becomes a fleet outage.

Case study 02

Research cluster recovers from disk mount mistake

Research cluster recovers from disk mount mistake: Boot diagnostics storage is often the fastest way to separate guest configuration failures from Azure infrastructure problems.

Scenario

A genomics lab attached new data disks to analysis VMs and updated fstab through automation. Several machines rebooted into emergency mode just before a grant-reporting deadline.

Business/Technical Objectives

Determine whether the failure was storage, OS configuration, or Azure platform related.
Recover affected VMs without losing attached research data.
Prevent the same mount mistake from reaching remaining nodes.
Document the failure for the lab change board.

Solution Using VM boot diagnostics storage

The on-call engineer collected boot diagnostics serial logs for each affected VM and confirmed that Linux was failing on an invalid mount entry, not on Azure disk attachment. Screenshots showed emergency mode prompts, while serial output named the missing UUID. The team detached the OS disk to a repair VM, corrected fstab, reattached the disk, and restarted the original machine. Automation was updated to validate disk UUIDs before writing mount entries. Custom diagnostic storage retained the boot logs for the change board under restricted access.

Results & Business Impact

Five affected VMs were restored without rebuilding research environments.
Mean recovery time was 42 minutes, down from a previous three-hour manual investigation.
Remaining nodes were patched before their scheduled reboot.
The change board accepted the evidence because logs tied the outage to a specific automation commit.

Key Takeaway for Glossary Readers

Boot diagnostics storage is often the fastest way to separate guest configuration failures from Azure infrastructure problems.

Case study 03

Online school fixes Windows update boot loop

Online school fixes Windows update boot loop: Boot diagnostics storage supports safer incident response because teams can inspect the console before taking risky access shortcuts.

Scenario

An online education provider hosted proctoring relay software on Windows VMs. After a patch weekend, one relay VM failed to return, threatening exam sessions in a smaller region.

Business/Technical Objectives

Understand the boot state before opening emergency RDP exposure.
Restore relay capacity before the morning exam window.
Capture evidence for the patch vendor and internal audit.
Decide whether to pause remaining regional patch jobs.

Solution Using VM boot diagnostics storage

The operations team opened the VM Boot diagnostics blade and retrieved the latest screenshot and serial log. The screenshot showed Windows recovery after a driver update, while the serial output confirmed repeated restarts. Because boot diagnostics was already enabled with managed storage, the team did not need to change storage firewall rules during the incident. Operators restored the VM from the latest recovery point, paused the remaining patch job, and attached diagnostic evidence to the change record. A runbook update required screenshot review before any future emergency port opening.

Results & Business Impact

Relay capacity was restored 74 minutes before the first exam session.
No public RDP rule was opened during the incident.
Patch rollout was paused after one affected VM instead of impacting the full relay pool.
Audit review time dropped by half because evidence was already attached to the change record.

Key Takeaway for Glossary Readers

Boot diagnostics storage supports safer incident response because teams can inspect the console before taking risky access shortcuts.

Why use Azure CLI for this?

I use Azure CLI for boot diagnostics storage because boot failures are time-sensitive and portal clicking is slow during an outage. CLI lets me enable diagnostics, fetch the boot log, get log and screenshot URIs, and capture evidence before someone rebuilds the VM. It is also useful for checking many VMs after an image rollout or migration. After years of VM operations, I want the same command sequence in every incident: prove the VM resource, collect serial output, review storage access, and decide whether to use serial console, run command, disk repair, or rollback. That repeatability saves hours. Evidence collection becomes calmer and repeatable.

CLI use cases

Enable managed boot diagnostics on a VM before a risky image, kernel, or startup-script change.
Retrieve serial boot logs when SSH, RDP, or the Azure VM agent is unavailable.
Get diagnostic log and screenshot URIs for controlled incident evidence collection.
Disable diagnostics only after confirming no troubleshooting or serial console dependency remains.
Audit VMs after migration to find machines missing boot diagnostics before the next reboot window.

Before you run CLI

Confirm tenant, subscription, resource group, VM name, operating system, and whether diagnostics should use managed or custom storage.
Check permissions to read VM instance information and diagnostic artifacts, especially when SAS URIs or storage accounts are involved.
Review whether logs may contain secrets, usernames, internal addresses, or command output before sharing evidence.
Understand that enabling diagnostics is low risk, but disabling it can remove useful troubleshooting visibility for future failures.
For custom storage, verify region, firewall rules, secure transfer, lifecycle policies, and access logging before relying on it.

What output tells you

Boot log output exposes guest startup messages, kernel or service errors, cloud-init progress, extension failures, and timestamps around the failure.
Boot-log URI output provides temporary access paths that must be protected because they can expose serial logs or screenshots.
VM show output reveals whether boot diagnostics is enabled and whether a managed or custom storage URI is configured.
Portal screenshots show whether the VM reached a login screen, recovery prompt, bootloader, or error state visible from the hypervisor.
Activity logs show who enabled, disabled, or changed diagnostics settings during deployment, repair, or incident handling.

Mapped Azure CLI commands

VM boot diagnostics storage operations

direct

az vm boot-diagnostics enable --resource-group <resource-group> --name <vm-name>

az vm boot-diagnosticsconfigureCompute

az vm boot-diagnostics enable --resource-group <resource-group> --name <vm-name> --storage <storage-account-name>

az vm boot-diagnosticsconfigureCompute

az vm boot-diagnostics get-boot-log --resource-group <resource-group> --name <vm-name>

az vm boot-diagnosticsdiscoverCompute

az vm boot-diagnostics get-boot-log-uris --resource-group <resource-group> --name <vm-name>

az vm boot-diagnosticssecureCompute

az vm boot-diagnostics disable --resource-group <resource-group> --name <vm-name>

az vm boot-diagnosticsconfigureCompute

Architecture context

Architecturally, boot diagnostics storage is an observability dependency for the compute layer. I enable it by default on important VM fleets because it helps when the guest OS is not reachable through normal monitoring. The design decision is whether to use Azure-managed diagnostics storage for simplicity and faster provisioning, or a custom storage account when retention, access, network rules, or central evidence handling require it. For regulated environments, I document who can read boot logs, how long diagnostics artifacts remain, and whether SAS URIs are acceptable. Boot diagnostics should be paired with serial console access policy, image validation, cloud-init logging, and a VM repair runbook.

Security

Security impact is direct because boot logs and screenshots may reveal usernames, hostnames, internal IPs, mount paths, failed commands, package repositories, encryption prompts, or application startup errors. If a custom storage account is used, protect it with least-privilege access, private networking where appropriate, secure transfer, retention rules, and activity logging. SAS URIs for screenshots or logs should be treated as sensitive evidence, not pasted into broad chat channels. Avoid writing secrets in startup scripts, cloud-init output, or extension logs because boot diagnostics can preserve them. Control who can enable, disable, or read diagnostics data during incidents. Shared access should be time bounded.

Cost

Cost is usually small but not zero. Azure-managed boot diagnostics storage avoids creating a customer storage account for the diagnostic data, while custom storage can create charges for blobs, transactions, replication, logging, and retention. The larger cost story is operational: without diagnostics, engineers spend more time investigating failed boots, rebuilding VMs, or escalating support cases. If custom storage keeps screenshots and serial logs indefinitely, evidence storage can accumulate across fleets. FinOps reviews should decide whether custom storage is required, set lifecycle rules where appropriate, and avoid disabling useful diagnostics solely to save a tiny storage line item. Cleanup should remove stale diagnostic containers.

Reliability

Reliability improves because operators can diagnose failures even when the VM never reaches a healthy state. Boot diagnostics does not make the VM more available by itself, but it reduces mean time to understand failures from bad images, kernel updates, cloud-init mistakes, filesystem errors, disk attachment problems, or bootloader issues. Reliable operations enable diagnostics before incidents, test access to logs, and keep repair procedures ready. If diagnostics uses custom storage, that storage account becomes part of the troubleshooting path and must be reachable. Disabling diagnostics to save clutter can make the next boot outage longer and more speculative. Storage access should be tested after hardening.

Performance

Boot diagnostics can improve diagnostic performance, not VM runtime performance. Managed boot diagnostics storage can simplify and speed VM creation compared with waiting on a user storage account in some deployment patterns. During incidents, the practical performance benefit is faster access to serial logs and screenshots. The bottlenecks are usually log availability delay, storage access permissions, portal rendering, or the time required to interpret guest startup messages. Operators should measure time from boot failure to evidence collection and make CLI retrieval part of the runbook. Avoid very chatty startup scripts that flood useful serial output with noise. Production evidence must be captured early.

Operations

Operators use boot diagnostics storage when a VM is stuck starting, inaccessible over SSH or RDP, failing cloud-init, looping after patching, or showing guest-agent problems. They fetch serial logs, review screenshots, compare timestamps with deployment events, and decide whether to use serial console, run command, rescue VM, restore from backup, or roll back an image. Day-to-day tasks include enabling diagnostics on critical VMs, confirming storage configuration, restricting evidence access, and cleaning old artifacts if custom storage is used. Runbooks should say how to redact logs before attaching them to tickets. Evidence handling should be routine during incidents. Keep retrieval steps documented.

Common mistakes

Leaving boot diagnostics disabled on critical VMs and discovering there is no serial evidence during a boot outage.
Posting SAS URIs or serial logs containing sensitive startup information into broad support channels.
Using custom storage with firewall rules that block operators from retrieving diagnostics during an incident.
Assuming boot diagnostics fixes the VM instead of treating it as evidence for the repair decision.
Writing secrets in cloud-init or startup scripts that later appear in serial output or diagnostic logs.

Operator quick checks

Is boot diagnostics enabled before the next planned reboot, patch, or image rollout?
Can the on-call team retrieve the boot log without owner-level permissions or unsafe sharing?
Does the serial output show a guest issue, provisioning issue, disk problem, or network lockout?
If custom storage is used, do lifecycle and access policies match incident evidence requirements?
Are logs redacted before being attached to tickets, vendor cases, or post-incident reports?

Questions to ask

Who is allowed to read boot logs and screenshots during an outage?
Does the environment require custom diagnostic storage, or is managed storage simpler and safer?
What repair path follows each boot evidence pattern: serial console, disk repair, restore, or rebuild?
How will the team prevent secrets from appearing in startup output?
What monitoring detects VMs that lack boot diagnostics before a production reboot?

Related terms

No related terms mapped yet.

Graph connections

Graph edges are queued for this term.

Learn next

Use related terms, graph links, command groups, and comparison cards to keep moving through Azure without losing context.

Open relationship graph