
Layout model

A layout model helps Azure understand the shape of a document before anyone builds a custom form model. It reads pages and identifies text, tables, checkboxes, paragraphs, and structure so applications can work with documents instead of raw images or blobs. It is useful for invoices, contracts, reports, forms, and knowledge-base content where the location and structure of text matter. The model does not decide business meaning by itself; it prepares structured evidence for later processing.

Aliases: No aliases mapped yet
Difficulty: fundamentals
CLI mappings: 12
Last verified: 2026-05-15

Microsoft Learn

A layout model in Azure AI Document Intelligence analyzes a document's structure and returns extracted text, tables, selection marks, paragraphs, roles, and layout information. Microsoft Learn describes layout capabilities for prebuilt extraction, Markdown or text output, and search enrichment scenarios that need structured document content.

Microsoft Learn: Document Intelligence layout model and Document Layout skill (2026-05-15)

Technical context

Technically, layout model capabilities sit in Azure AI Document Intelligence and are often used before custom extraction, search indexing, RAG pipelines, or human review workflows. The service analyzes document pages and returns structured elements through APIs and SDKs. In Azure AI Search, the Document Layout skill can transform document structure into text or Markdown for indexing. Architecture decisions include resource region, file type support, page limits, private networking, managed identity, storage input, output format, confidence review, downstream search, and retention of extracted content.
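As a rough illustration of what "structured elements" means for tables, the sketch below rebuilds a row-by-column grid from a flat list of table cells. The field names (`rowCount`, `columnCount`, `cells`, `rowIndex`, `columnIndex`, `rowSpan`, `columnSpan`, `content`) are assumptions based on the JSON shape of a layout result, and `sample_table` is hypothetical data, not real service output.

```python
from typing import Any


def table_to_grid(table: dict[str, Any]) -> list[list[str]]:
    """Rebuild a row-major grid from a layout table's flat cell list."""
    rows = table["rowCount"]
    cols = table["columnCount"]
    grid = [["" for _ in range(cols)] for _ in range(rows)]
    for cell in table["cells"]:
        # Spans are assumed to default to 1 when a cell is not merged.
        row_span = cell.get("rowSpan", 1)
        col_span = cell.get("columnSpan", 1)
        row_end = min(cell["rowIndex"] + row_span, rows)
        col_end = min(cell["columnIndex"] + col_span, cols)
        # Copy merged-cell content into every position the cell covers.
        for r in range(cell["rowIndex"], row_end):
            for c in range(cell["columnIndex"], col_end):
                grid[r][c] = cell["content"]
    return grid


# Hypothetical sample shaped like one table in a layout result.
sample_table = {
    "rowCount": 2,
    "columnCount": 2,
    "cells": [
        {"rowIndex": 0, "columnIndex": 0, "content": "Item"},
        {"rowIndex": 0, "columnIndex": 1, "content": "Amount"},
        {"rowIndex": 1, "columnIndex": 0, "content": "Freight"},
        {"rowIndex": 1, "columnIndex": 1, "content": "120.00"},
    ],
}

grid = table_to_grid(sample_table)
```

Repeating merged-cell content across the covered positions is one design choice among several; a pipeline that feeds search indexing might prefer to leave the extra positions empty.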

Why it matters

The layout model matters because document automation usually fails when systems treat every page as plain text. Tables, section headings, paragraphs, selection marks, and reading order affect how accurately downstream systems classify, search, summarize, or extract information. A strong layout step can improve search relevance, reduce custom model training needs, and make human review faster. Poor layout handling can scramble tables, lose context, or feed confusing chunks into AI workflows. For glossary readers, the term is important because it names the bridge between raw document storage and structured document intelligence. That context helps teams explain who owns the layout step, what risk it controls, and how it should behave.

Where you see it

Signals, screens, and Azure surfaces where this term usually becomes operational.

Signal 01

In Document Intelligence results, layout model output appears as pages, lines, words, paragraphs, tables, selection marks, roles, spans, and confidence information. Operators validate this signal during incident response, audits, and change reviews.

Signal 02

In Azure AI Search enrichment pipelines, layout capabilities appear when documents are converted into structured text or Markdown before indexing.
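To make the idea of structure-preserving Markdown concrete, here is a minimal sketch that maps paragraph roles to Markdown headings. It only illustrates the mapping and is not how the service or the Document Layout skill produces Markdown. The role names (`title`, `sectionHeading`, `pageHeader`, `pageFooter`, `pageNumber`) and the `role`/`content` fields are assumptions about the result shape, and the sample paragraphs are hypothetical.

```python
# Roles treated as page furniture and dropped before indexing (assumed names).
SKIPPED_ROLES = {"pageHeader", "pageFooter", "pageNumber"}


def paragraphs_to_markdown(paragraphs: list[dict]) -> str:
    """Turn role-tagged paragraphs into a simple Markdown string."""
    blocks = []
    for paragraph in paragraphs:
        role = paragraph.get("role")
        text = paragraph["content"].strip()
        if role in SKIPPED_ROLES or not text:
            continue
        if role == "title":
            blocks.append(f"# {text}")
        elif role == "sectionHeading":
            blocks.append(f"## {text}")
        else:
            # Paragraphs without a recognized role are kept as body text.
            blocks.append(text)
    return "\n\n".join(blocks)


# Hypothetical paragraphs shaped like entries in a layout result.
sample_paragraphs = [
    {"role": "pageHeader", "content": "Confidential"},
    {"role": "title", "content": "Master Agreement"},
    {"role": "sectionHeading", "content": "Payment terms"},
    {"content": "Invoices are due in 30 days."},
]

markdown = paragraphs_to_markdown(sample_paragraphs)
```

Keeping headings attached to the body text that follows them is what lets a later chunking step carry section context into the index.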

Signal 03

In document operations queues, layout model issues appear as failed analysis jobs, low-confidence extraction, scrambled tables, reviewer corrections, or long batch-processing times.

When this becomes relevant

Specific situations where this term helps solve real Azure design, operations, migration, security, reliability, cost, or governance problems.

  • Extract document structure before search indexing.
  • Prepare documents for RAG or knowledge workflows.
  • Analyze tables, paragraphs, selection marks, and page structure.
  • Support custom document-processing applications with reusable layout output.

Real-world case studies

Different enterprise-style examples that show the term being used to hit measurable objectives.

Case study 01

Improving contract intake search

Scenario

Summit Legal Services had thousands of contracts in storage, but keyword search missed obligations hidden in tables and section headings.

Business/Technical Objectives
  • Extract structured contract text for search indexing.
  • Improve retrieval of table-based obligations by 30%.
  • Protect confidential contract content.
  • Reduce manual intake review time.
Solution Using Layout model

The team used Azure AI Document Intelligence layout capabilities to analyze contract PDFs before indexing them in Azure AI Search. Layout output preserved paragraphs, headings, tables, and selection marks, which improved searchable structure compared with raw OCR text. Storage and AI resources used private endpoints, managed identities, and restricted reviewer access. Azure CLI checks exported resource configuration, role assignments, diagnostic settings, and storage network rules before production. Attorneys reviewed low-confidence or unusual documents in an exception queue. Dashboards tracked analysis duration, failed files, search quality feedback, and manual review volume. The team also documented owner contacts, rollback steps, monitoring signals, and support handoffs so the change remained operable after the first release. Those notes helped engineers distinguish expected behavior from production defects, train new responders, and explain decisions during monthly governance reviews.

Results & Business Impact
  • Table-based obligation retrieval improved by 42%.
  • Manual intake review time dropped by 36%.
  • Confidential contract processing stayed inside the private network boundary.
  • Failed document analysis fell below 2% after scan-quality rules.
Key Takeaway for Glossary Readers

Layout model capabilities make document search more useful by preserving structure that plain text extraction loses.

Case study 02

Automating invoice packet review

Scenario

FreightWay Logistics received multi-page invoice packets where charges, signatures, and delivery exceptions appeared in inconsistent locations.

Business/Technical Objectives
  • Prepare invoice packets for downstream validation.
  • Reduce manual sorting by 50%.
  • Identify tables and checkboxes reliably enough for review queues.
  • Keep processing throughput predictable at month end.
Solution Using Layout model

The architecture team added a layout analysis step before custom invoice validation. Document Intelligence extracted page structure, tables, paragraphs, and selection marks, then the workflow routed packets to validation logic or human review. Large month-end batches used queues so users did not wait synchronously. Operators used Azure CLI to confirm AI resource configuration, storage account access, private endpoints, and diagnostic settings. Dashboards measured document count, processing duration, low-confidence output, and reviewer backlog. The team avoided reprocessing duplicate packets by storing layout result hashes.

Results & Business Impact
  • Manual sorting effort decreased by 57%.
  • Month-end processing completed 11 hours faster.
  • Reviewer backlog stayed below the agreed 200-packet threshold.
  • Duplicate analysis charges dropped by 21% after hash checks.
Key Takeaway for Glossary Readers

Layout analysis is most valuable when it feeds a workflow that can validate, route, and review imperfect documents.

Case study 03

Building a research document knowledge base

Scenario

HelioPharm Research needed searchable study reports, but raw text extraction lost tables, captions, and section context.

Business/Technical Objectives
  • Convert study reports into structured searchable content.
  • Improve answer grounding for research assistants.
  • Avoid broad exposure of sensitive study documents.
  • Track processing quality across document batches.
Solution Using Layout model

The data platform routed approved research PDFs through Document Intelligence layout analysis and converted structured output into Markdown chunks for Azure AI Search. The RAG application used retrieved chunks with section and table context. Private endpoints, managed identities, and restricted indexes limited access to approved research groups. Azure CLI evidence documented AI resources, storage controls, search service configuration, and monitoring setup. Operators tracked failed documents, processing time, search feedback, and citations that referenced table-derived content. Low-confidence batches moved to a curator queue before being made searchable.

Results & Business Impact
  • Research answer acceptance improved by 33%.
  • Study-report indexing time dropped from days to hours.
  • Sensitive documents remained restricted to approved research identities.
  • Curator review found and corrected 14 scan-quality issues.
Key Takeaway for Glossary Readers

Layout model output can make AI search and RAG systems more trustworthy by carrying document structure forward.

Why use Azure CLI for this?

Azure CLI helps manage the Azure resource, identity, networking, diagnostics, and storage around layout model workflows. The actual layout analysis usually happens through REST APIs, SDKs, or search skills, but CLI evidence is useful for deployment, troubleshooting, and audit support.

CLI use cases

  • Inspect Document Intelligence or Azure AI services resources used for layout analysis.
  • Confirm endpoint, region, tags, diagnostic settings, private endpoints, and role assignments.
  • Check storage accounts and containers that feed document-analysis workflows.
  • Export configuration evidence before onboarding new document-processing applications.

Before you run CLI

  • Know whether you are checking the AI resource, source storage, search enrichment, or application pipeline.
  • Avoid exposing document samples, keys, or extracted text in terminal history or shared tickets.
  • Confirm the expected region, network boundary, identity model, and owning application team.
  • Capture current configuration before changing endpoints, diagnostics, or access permissions.

What output tells you

  • Resource output confirms the endpoint, region, SKU, and identity boundary for layout processing.
  • Network output explains whether private endpoints, DNS, or firewall rules may block document analysis.
  • Storage output identifies where documents enter the workflow and whether access is appropriately scoped.
  • Diagnostic output shows whether failures and usage can be traced during document processing incidents.
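One way to turn that output into reviewable evidence is to parse the JSON from `az cognitiveservices account show` and compare it with what the team expects. The sketch below assumes the output has `location`, `sku.name`, `identity.type`, `properties.endpoint`, and `properties.publicNetworkAccess` fields; confirm these against your own CLI output. The sample JSON, the account name, and the expected region are hypothetical.

```python
import json


def summarize_account(account: dict) -> dict:
    """Pick out the fields an operator usually checks for layout processing."""
    props = account.get("properties") or {}
    identity = account.get("identity") or {}
    sku = account.get("sku") or {}
    return {
        "endpoint": props.get("endpoint"),
        "region": account.get("location"),
        "sku": sku.get("name"),
        "public_network_access": props.get("publicNetworkAccess"),
        "identity_type": identity.get("type"),
    }


def find_gaps(summary: dict, expected_region: str) -> list[str]:
    """Return human-readable findings; an empty list means no gaps found."""
    gaps = []
    if summary["region"] != expected_region:
        gaps.append(f"region is {summary['region']}, expected {expected_region}")
    if summary["public_network_access"] != "Disabled":
        gaps.append("public network access is not disabled")
    if summary["identity_type"] in (None, "None"):
        gaps.append("no managed identity assigned")
    return gaps


# Hypothetical output saved from: az cognitiveservices account show ... -o json
sample_output = """
{
  "name": "example-docintel",
  "kind": "FormRecognizer",
  "location": "westeurope",
  "sku": {"name": "S0"},
  "identity": {"type": "SystemAssigned"},
  "properties": {
    "endpoint": "https://example-docintel.cognitiveservices.azure.com/",
    "publicNetworkAccess": "Enabled"
  }
}
"""

summary = summarize_account(json.loads(sample_output))
gaps = find_gaps(summary, expected_region="westeurope")
```

Whether public network access must be disabled depends on the workload; the check here simply reflects the private-network scenarios described in the case studies.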

Mapped Azure CLI commands

Document Intelligence operations

adjacent
az cognitiveservices account list --resource-group <resource-group>  # discover
az cognitiveservices account show --name <account-name> --resource-group <resource-group>  # discover
az cognitiveservices account create --name <account-name> --resource-group <resource-group> --kind <kind> --sku S0 --location <region>  # provision
az cognitiveservices account keys list --name <account-name> --resource-group <resource-group>  # discover
az cognitiveservices account delete --name <account-name> --resource-group <resource-group>  # remove

Cognitive operations

direct
az cognitiveservices account show --name <account> --resource-group <resource-group>  # discover
az cognitiveservices account create --name <account> --resource-group <resource-group> --kind <kind> --sku S0 --location <region>  # provision
az cognitiveservices account list-kinds  # discover
az cognitiveservices account list-skus --kind <kind> --location <region>  # discover
az cognitiveservices account keys list --name <account> --resource-group <resource-group>  # discover
az cognitiveservices account deployment list --name <account> --resource-group <resource-group>  # discover
az cognitiveservices account deployment create --name <account> --resource-group <resource-group> --deployment-name <deployment> --model-name <model> --model-version <version> --model-format OpenAI --sku-capacity 1 --sku-name Standard  # provision

Security

Security starts with the documents themselves. Layout analysis often processes contracts, health records, invoices, legal files, identity documents, or internal reports. Teams should use managed identities or protected keys, private endpoints where required, and carefully scoped storage access. Extracted layout output can be as sensitive as the source document because it may expose names, account numbers, table values, or signatures in easier-to-search form. Operators should review diagnostic logging, data retention, result storage, key rotation, and who can call the Document Intelligence resource. Human review queues should protect extracted content as well as the original files. That discipline keeps the handling of submitted documents, extracted text, access controls, and retention defensible during reviews and reduces hidden exposure.

Cost

Cost depends on document volume, page count, analysis frequency, duplicate processing, and downstream storage or search indexing. Layout analysis can save labor, but repeated analysis of the same file or unnecessary processing of low-value documents wastes money. Teams should avoid sending every document version through analysis unless there is a business reason. Cost control includes batching, deduplication, file filtering, lifecycle rules for results, and reporting by application or department. When layout output feeds Azure AI Search or RAG workflows, the total cost also includes indexing, storage, enrichment, and model usage later in the pipeline. Clear visibility helps FinOps teams connect document volume, page counts, retries, and enrichment pipeline steps to owners and outcomes.
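Deduplication can be sketched as a cache keyed by a hash of the file contents, so that an unchanged file is analyzed only once. This is one possible approach, not a prescribed one: it hashes the input bytes rather than the layout result, keeps the cache in memory, and uses a hypothetical `fake_analyze` stub in place of a real analysis call.

```python
import hashlib
from typing import Callable


def content_digest(data: bytes) -> str:
    """SHA-256 digest of the raw file bytes, used as the cache key."""
    return hashlib.sha256(data).hexdigest()


class AnalysisCache:
    """In-memory cache; a real pipeline would likely persist this mapping."""

    def __init__(self) -> None:
        self._results: dict[str, dict] = {}

    def get_or_analyze(self, data: bytes, analyze: Callable[[bytes], dict]) -> dict:
        key = content_digest(data)
        if key not in self._results:
            # Only pay for analysis when this exact content is new.
            self._results[key] = analyze(data)
        return self._results[key]


calls = 0


def fake_analyze(data: bytes) -> dict:
    """Hypothetical stand-in for a billed layout analysis call."""
    global calls
    calls += 1
    return {"size": len(data)}


cache = AnalysisCache()
first = cache.get_or_analyze(b"invoice packet A", fake_analyze)
repeat = cache.get_or_analyze(b"invoice packet A", fake_analyze)
other = cache.get_or_analyze(b"invoice packet B", fake_analyze)
```

Hashing bytes catches exact re-uploads only; a rescanned copy of the same paper document produces different bytes and would still be analyzed again.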

Reliability

Reliability depends on document quality, file format, page limits, service availability, and downstream handling of confidence or extraction errors. Poor scans, rotated pages, unusual fonts, merged cells, or handwritten marks can reduce layout accuracy. Applications should not assume every table or checkbox is perfect. Reliable workflows include validation, retry logic, exception queues, human review for high-risk documents, and clear behavior when analysis fails. Operators should monitor failed analysis operations, processing time, confidence patterns, and output quality. They should also test representative documents, not only clean sample PDFs from a demo. That review path keeps inconsistent extraction quality and ambiguous documents from becoming a wider production incident.
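A simple form of the validation and exception-queue idea is to route a document to human review when any word or selection mark falls below a confidence threshold. The sketch assumes pages carry `words` and `selectionMarks` lists with a numeric `confidence` field; the 0.8 threshold and both sample results are hypothetical and would need tuning against representative documents.

```python
def needs_review(result: dict, threshold: float = 0.8) -> bool:
    """Return True when any word or selection mark is below the threshold."""
    for page in result.get("pages", []):
        elements = page.get("words", []) + page.get("selectionMarks", [])
        for element in elements:
            # A missing confidence value is treated as untrusted.
            if element.get("confidence", 0.0) < threshold:
                return True
    return False


def route(result: dict, threshold: float = 0.8) -> str:
    """Name the queue a result should go to (queue names are illustrative)."""
    return "human-review" if needs_review(result, threshold) else "auto-validate"


# Hypothetical results shaped like the pages section of a layout result.
clean_result = {
    "pages": [
        {
            "words": [{"content": "Total", "confidence": 0.99}],
            "selectionMarks": [{"state": "selected", "confidence": 0.95}],
        }
    ]
}
noisy_result = {
    "pages": [
        {
            "words": [{"content": "T0tal", "confidence": 0.55}],
            "selectionMarks": [],
        }
    ]
}

clean_queue = route(clean_result)
noisy_queue = route(noisy_result)
```

A single low-confidence word sends the whole document to review here, which is deliberately conservative; high-volume pipelines might instead use an average or a per-field rule.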

Performance

Performance depends on document size, page count, file format, regional distance, service load, network path, and downstream processing. Layout analysis may be suitable for near-real-time workflows, but large document batches should usually run asynchronously with queues and monitoring. Operators should track analysis duration, queue wait, retry rate, search enrichment time, and reviewer backlog. Performance tuning may involve splitting large jobs, filtering unnecessary files, colocating storage and AI resources, and avoiding synchronous user waits for long documents. The goal is predictable document throughput, not just a fast single-file demo. Measured evidence helps engineers tune request size, batching, and downstream processing latency for the page complexity they actually see, instead of guessing under pressure.
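The asynchronous pattern can be sketched with bounded concurrency, so a large batch never sends more than a fixed number of requests at once. This version uses an `asyncio.Semaphore` rather than a durable queue, and `analyze_stub` is a hypothetical placeholder for a real analysis call; the file names and the concurrency limit are illustrative.

```python
import asyncio


async def analyze_stub(name: str) -> str:
    """Hypothetical stand-in for submitting a document and awaiting its result."""
    await asyncio.sleep(0.01)
    return f"layout:{name}"


async def process_batch(names: list[str], max_concurrency: int = 3) -> dict:
    """Analyze a batch while capping the number of in-flight requests."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(name: str):
        async with semaphore:
            try:
                return name, await analyze_stub(name)
            except Exception as exc:
                # Keep the failure with the file name so it can be retried later.
                return name, exc

    pairs = await asyncio.gather(*(worker(name) for name in names))
    return dict(pairs)


batch = ["a.pdf", "b.pdf", "c.pdf", "d.pdf"]
results = asyncio.run(process_batch(batch, max_concurrency=2))
```

Recording failures per file instead of aborting the batch keeps one bad scan from blocking the rest of a month-end run.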

Operations

Operations teams manage layout model usage through resource configuration, API monitoring, storage integration, pipeline health, and exception review. They inspect Document Intelligence resources, private endpoints, keys or identity, diagnostic settings, request counts, failed operations, and downstream consumers such as Azure AI Search or workflow apps. Azure CLI is useful for confirming the Azure resource and access boundary, while application telemetry shows analysis results and failures. Runbooks should cover quota pressure, bad document batches, private endpoint failures, key rotation, search index enrichment errors, and how reviewers correct layout-related issues. The operating model gives support teams repeatable evidence for model version checks, request monitoring, and review queue handling.

Common mistakes

  • Expecting layout analysis to replace business-specific extraction rules or human review.
  • Sending sensitive documents without checking result storage and diagnostic logging paths.
  • Testing only clean sample files and ignoring scanned, rotated, or complex real documents.
  • Reprocessing unchanged documents and paying repeatedly for the same layout output.