
Retrieval tool

A retrieval tool gives an AI agent a controlled way to look up knowledge outside the model before answering. In Microsoft Foundry agents, the file search tool can ingest documents, create a vector store, run hybrid retrieval, and return relevant passages for grounding. The agent still generates the response, but the tool supplies the evidence. This is useful when answers must come from manuals, policies, support articles, contracts, or customer-provided files instead of model memory alone.

Aliases: file search tool, agent retrieval tool, Foundry retrieval tool, vector store tool, knowledge retrieval tool
Difficulty: intermediate
CLI mappings: 5
Last verified: 2026-05-22

Microsoft Learn


Microsoft Learn: File search tool for Microsoft Foundry agents (2026-05-22)

Technical context

In Azure architecture, a retrieval tool sits between the agent runtime and the knowledge layer. It may use uploaded files, Azure Blob Storage, Azure AI Search, vector stores, embeddings, keyword search, semantic matching, and reranking depending on setup type. Basic agent setup can use Microsoft-managed storage and search resources, while standard setup uses connected customer resources. Operators must account for identity, storage location, ingestion status, vector-store attachment, search capacity, private networking, diagnostics, and how tool outputs flow into model prompts.

Why it matters

Retrieval tools matter because agent behavior becomes safer and more useful when the agent can consult approved knowledge at the right time. Without a retrieval tool, teams often stuff too much context into prompts, accept stale model knowledge, or rely on broad internet search for private questions. With the tool, an agent can answer from uploaded specifications, internal procedures, or connected knowledge stores and provide grounded responses. The risk is configuration drift: a missing vector store, failed ingestion, wrong storage setup, or weak access design can make the agent hallucinate, ignore documents, or expose content outside intended boundaries during real support conversations.

Where you see it

Signals, screens, and Azure surfaces where this term usually becomes operational.

Signal 01

In Microsoft Foundry agent configuration, you see file search or retrieval tools attached to an agent during setup, often with vector stores or connected search resources.

Signal 02

In SDK or REST payloads, you notice tool definitions, file IDs, vector-store IDs, citations, and run steps during troubleshooting sessions; the run steps show whether retrieval was invoked.

Signal 03

In Azure resource inventory, a deployment review of standard setup exposes the supporting Blob Storage, Azure AI Search, managed identity, private endpoint, and diagnostic settings behind the tool.

When this becomes relevant

Specific situations where this term helps solve real Azure design, operations, migration, security, reliability, cost, or governance problems.

  • Let a support agent answer from uploaded product manuals and release notes instead of relying on stale model knowledge.
  • Ground a proposal or contract assistant in customer-specific files without building a custom search pipeline from scratch.
  • Use standard setup when enterprise data-control requirements require Azure Blob Storage and Azure AI Search under customer ownership.
  • Verify tool-call behavior after hallucination complaints by checking whether the agent actually retrieved relevant passages.
  • Attach temporary project documents to a conversation so an agent can answer questions without permanently changing the global knowledge base.

Real-world case studies

Different enterprise-style examples that show the term being used to hit measurable objectives.

Case study 01

Architecture firm grounds project agents in bid documents

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

An architecture firm used agents to draft proposal responses, but teams kept pasting large bid packets into prompts. Important addenda were missed, and prompt files leaked between project channels.

Business/Technical Objectives
  • Let each project agent answer from its own uploaded bid documents.
  • Verify addenda were ingested before proposal writers asked questions.
  • Keep project knowledge separate across confidential pursuits.
  • Reduce manual prompt stuffing and repeated document searches.
Solution Using Retrieval tool

The digital delivery team configured a Retrieval tool through Microsoft Foundry file search. Each pursuit created a dedicated vector store for the request for proposal, addenda, room schedules, and client design standards. Writers asked questions only after ingestion completed and a validation question proved the latest addendum was retrievable. For high-security pursuits, the team used standard setup with connected Azure Blob Storage and Azure AI Search under the firm’s subscription. Azure CLI verified resource ownership, role assignments, private endpoints, diagnostic settings, and cost tags behind each project environment.

Results & Business Impact
  • Proposal writers reduced document lookup time by 46 percent.
  • Late addendum misses dropped from five in one quarter to one in the next.
  • No cross-project file citations appeared in access-control testing.
  • Prompt token use fell by 29 percent because documents were retrieved instead of pasted wholesale.
Key Takeaway for Glossary Readers

A retrieval tool gives project agents bounded knowledge without turning every prompt into a file dump.

Case study 02

Biotech lab assistant validates protocol answers

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

A biotech research group wanted an agent to answer lab protocol questions. Scientists distrusted early answers because the agent could not prove whether it used the latest standard operating procedure.

Business/Technical Objectives
  • Ground protocol answers in approved SOPs and experiment notes.
  • Show citations for safety-critical preparation and disposal steps.
  • Detect ingestion failures before wet-lab teams used the assistant.
  • Keep obsolete protocol files from influencing new answers.
Solution Using Retrieval tool

The team attached a Retrieval tool to the lab assistant and loaded approved SOPs into a vector store. File owners used versioned names and removed superseded protocols before each release. Test questions checked reagent preparation, storage temperature, disposal handling, and exception paths that could only be answered from the current files. Operators inspected tool-call traces and citations when scientists flagged suspicious answers. Azure CLI supported surrounding checks for the AI resource, diagnostic settings, storage setup, and identity assignments, while Foundry workflow logs confirmed which files were active.
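The versioned-name practice in this case study can be sketched as a small pre-release check. This is a minimal illustration, not the lab's actual tooling: the `<name>_v<number>.<ext>` naming pattern, the function name, and the sample file names are all assumptions made for the example.

```python
import re

# Assumed (hypothetical) naming convention: <stem>_v<integer>.<extension>
VERSIONED = re.compile(r"^(?P<stem>.+)_v(?P<version>\d+)\.(?P<ext>\w+)$")


def split_current_and_superseded(file_names):
    """Return (current, superseded) lists for files that follow the convention.

    Names that do not match the pattern are ignored here and would need
    manual review before a release.
    """
    latest = {}
    for name in file_names:
        match = VERSIONED.match(name)
        if not match:
            continue
        key = (match["stem"], match["ext"])
        version = int(match["version"])
        if key not in latest or version > latest[key][0]:
            latest[key] = (version, name)
    current = sorted(name for _, name in latest.values())
    superseded = sorted(
        name for name in file_names
        if VERSIONED.match(name) and name not in current
    )
    return current, superseded


# Made-up file names for illustration only.
files = ["reagent_prep_v1.pdf", "reagent_prep_v2.pdf", "disposal_v3.pdf"]
current, superseded = split_current_and_superseded(files)
```

The `superseded` list is the set an operator would remove from the vector store before the release; the sketch only classifies names and does not call any Foundry API.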

Results & Business Impact
  • Protocol lookup time fell from 11 minutes to 2 minutes for common questions.
  • Weekly validation caught four obsolete SOPs before production use.
  • Citation review reduced scientist-reported “unsupported answer” tickets by 64 percent.
  • The lab avoided one near-miss caused by an outdated reagent-disposal instruction.
Key Takeaway for Glossary Readers

Retrieval tools are most valuable when the agent must prove which approved document informed the answer.

Case study 03

Aerospace maintenance agent uses customer-owned search

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

An aerospace maintenance provider needed an agent for technicians working under strict data-control rules. Microsoft-managed storage was not acceptable for aircraft records and customer service histories.

Business/Technical Objectives
  • Use customer-owned storage and search for retrieval grounding.
  • Restrict technicians to aircraft and contract documents they are allowed to access.
  • Monitor retrieval latency at remote maintenance locations.
  • Keep a rollback path for bad document uploads.
Solution Using Retrieval tool

The platform team deployed the Retrieval tool with standard setup so uploaded documents lived in connected Azure Blob Storage and vector stores used connected Azure AI Search. Metadata captured aircraft tail number, contract, document type, and effective date. The agent application filtered retrieval based on the technician’s authorization context before the model saw any passages. Private endpoints kept search and storage off the public network. Azure CLI was used to audit private endpoints, role assignments, search SKU, storage account configuration, and diagnostic settings after every deployment.
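The authorization filter described above can be sketched as a plain metadata check that runs before any passage is handed to the model. This is a minimal sketch under stated assumptions: the `Passage` type, the field names, and the sample values are hypothetical, and a real deployment would more likely push the same condition into the search query as a filter instead of filtering results in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """Hypothetical retrieved passage with the metadata named in the case study."""
    text: str
    tail_number: str
    contract: str


def filter_passages(passages, allowed_contracts, allowed_tail_numbers):
    """Keep only passages that match the caller's authorization context."""
    return [
        p for p in passages
        if p.contract in allowed_contracts
        and p.tail_number in allowed_tail_numbers
    ]


# Made-up sample data for illustration only.
passages = [
    Passage("Torque procedure excerpt", tail_number="N100AA", contract="C-1"),
    Passage("Service history excerpt", tail_number="N200BB", contract="C-2"),
]
visible = filter_passages(passages, {"C-1"}, {"N100AA"})
```

An empty authorization context yields an empty result, so a technician with no matching contract sees no passages at all.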

Results & Business Impact
  • Technician first-answer time improved from 14 minutes to 3.5 minutes.
  • Access testing blocked 100 percent of cross-contract retrieval attempts.
  • Retrieval p95 latency stayed under 1.8 seconds after adding a search replica.
  • Bad upload rollback dropped from a half-day rebuild to a 35-minute vector-store replacement.
Key Takeaway for Glossary Readers

Standard setup makes retrieval tools practical for agents that must meet enterprise data-control requirements.

Why use Azure CLI for this?

Azure CLI is adjacent rather than primary for retrieval tools, because the agent tool is usually configured through Foundry, SDKs, or REST APIs. I still use CLI heavily around it. CLI verifies the Azure AI Search service, Blob Storage account, AI resource, managed identity, private endpoints, role assignments, diagnostic settings, and cost tags that support the tool. In real incidents, the question is rarely just whether the tool exists; it is whether files ingested, the right identity can query, the network path works, and telemetry proves what happened. CLI gives that repeatable operational evidence during support reviews and monthly audits.

CLI use cases

  • Inventory the Azure AI Search and Blob Storage resources that back retrieval in standard setup.
  • Check role assignments for the agent or application identity that must read search indexes and storage.
  • List private endpoints and DNS-linked resources when retrieval works locally but fails in production.
  • Export diagnostic settings for search, storage, and AI resources during a hallucination or outage review.
  • Compare cost tags and SKUs across projects that use retrieval tools heavily.

Before you run CLI

  • Confirm tenant, subscription, resource groups, project resources, search service, storage account, and identity ownership.
  • Distinguish read-only discovery from role changes, network changes, storage changes, or scale changes that may affect production agents.
  • Check whether the agent uses basic Microsoft-managed resources or standard connected Azure resources before interpreting CLI output.
  • Use JSON output for evidence, and avoid exposing file names or sensitive metadata in public incident channels.

What output tells you

  • Search and storage output confirms whether standard setup has customer-owned resources available for ingestion and retrieval.
  • Role assignments reveal whether the agent or application identity can access the connected knowledge resources without shared keys.
  • Private endpoint and network output explains retrieval failures caused by DNS, public access blocks, or unreachable services.
  • Diagnostic settings show whether the supporting resources emit enough telemetry to investigate tool-call failures and latency.
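As one way to read role-assignment output, the sketch below parses JSON in the shape returned by `az role assignment list --output json` and reports which expected roles are absent. The `roleDefinitionName` field is part of that output; the required role set shown here (two Azure built-in roles) is only an assumption and should be replaced with whatever your setup actually requires. The sample JSON is made up.

```python
import json

# Assumed requirement for illustration; adjust to your own access design.
REQUIRED_ROLES = {"Search Index Data Reader", "Storage Blob Data Reader"}


def missing_roles(assignments_json, required=REQUIRED_ROLES):
    """Return the required role names that do not appear in the CLI output."""
    assignments = json.loads(assignments_json)
    granted = {a.get("roleDefinitionName") for a in assignments}
    return sorted(set(required) - granted)


# Made-up sample shaped like `az role assignment list` output.
sample = json.dumps([
    {"roleDefinitionName": "Search Index Data Reader", "principalId": "<principal-id>"},
])
gaps = missing_roles(sample)
```

Because `az role assignment list --scope` only reports assignments at the given scope unless inherited assignments are also requested, an empty result here is a prompt to check parent scopes, not proof that access is missing.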

Mapped Azure CLI commands

Retrieval tool Azure CLI commands

operational
az search service show --name <search-service> --resource-group <resource-group>
az search service | discover | AI and Machine Learning
az storage account show --name <storage-account> --resource-group <resource-group>
az storage account | discover | Storage
az cognitiveservices account show --name <ai-resource> --resource-group <resource-group>
az cognitiveservices account | discover | AI and Machine Learning
az role assignment list --scope <resource-id> --assignee <principal-id>
az role assignment | discover | AI and Machine Learning
az network private-endpoint list --resource-group <resource-group>
az network private-endpoint | discover | Analytics
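For repeatable evidence collection, the read-only commands mapped above can be wrapped in a small script that always requests JSON output. This is a sketch, not a supported tool: the helper names are hypothetical, and `run_az` assumes the `az` executable is on the PATH and already signed in (on Windows the executable is typically `az.cmd`, which may need a different invocation).

```python
import json
import subprocess


def az_command(*args):
    """Build an argument list for a read-only az call with JSON output."""
    return ["az", *args, "--output", "json"]


def run_az(*args):
    """Run the command and parse its JSON output (requires a signed-in az)."""
    result = subprocess.run(
        az_command(*args), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


def discovery_commands(resource_group, search_service, storage_account,
                       ai_resource, scope, principal_id):
    """Return the five discovery commands listed in this section."""
    return [
        az_command("search", "service", "show",
                   "--name", search_service, "--resource-group", resource_group),
        az_command("storage", "account", "show",
                   "--name", storage_account, "--resource-group", resource_group),
        az_command("cognitiveservices", "account", "show",
                   "--name", ai_resource, "--resource-group", resource_group),
        az_command("role", "assignment", "list",
                   "--scope", scope, "--assignee", principal_id),
        az_command("network", "private-endpoint", "list",
                   "--resource-group", resource_group),
    ]


# Placeholder values; nothing is executed until run_az is called.
commands = discovery_commands(
    "<resource-group>", "<search-service>", "<storage-account>",
    "<ai-resource>", "<resource-id>", "<principal-id>",
)
```

Building the argument lists separately from running them makes it easy to review exactly what will execute before touching a production subscription.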

Architecture context

Architecturally, a retrieval tool is an agent capability wired to a bounded knowledge source. It should be treated as a dependency with its own lifecycle: document ingestion, chunking, embedding, vector-store attachment, retrieval policy, access boundary, and monitoring. For simple demos, Microsoft-managed search and storage may be enough. For enterprise agents, standard setup with connected Azure Blob Storage and Azure AI Search gives better control over data residency, governance, and capacity. Before production rollout, architects should define ownership, which files are allowed, who can upload them, how updates are approved, how citations are exposed, and what happens when retrieval fails.

Security

Security is direct because the retrieval tool decides which external knowledge reaches the model. Uploaded documents may contain customer data, intellectual property, legal terms, or regulated instructions. Use Microsoft Entra identities and least-privilege roles for connected resources, prefer standard setup when data-control requirements demand customer-owned storage and search, and use private endpoints where appropriate. Review who can upload files, create vector stores, attach tools to agents, and view responses. Treat retrieved passages as untrusted input that can contain prompt injection. Log enough for audit, but avoid storing sensitive chunks unnecessarily. Validate attachment scope, tenant boundary, storage retention, index ownership, and incident-notification paths.

Cost

Cost impact comes from the retrieval infrastructure and the extra model context it creates. File search can add charges beyond model token usage, and standard setup also depends on connected Azure AI Search and Blob Storage resources. Indexing, embeddings, vector storage, search replicas, semantic features, diagnostic retention, and longer prompts all affect spend. The dangerous pattern is attaching large document sets with weak filtering, then sending too many chunks to every answer. FinOps reviews should track file volume, vector-store size, search tier, query rate, input tokens, ingestion cadence, and whether retrieval improves deflection or accuracy enough to justify the added platform cost.
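The token-growth pattern can be made concrete with simple arithmetic. The sketch below only multiplies query count, chunks per answer, and average chunk size; every number in it is an illustrative placeholder, not a measured or published figure, and it ignores search, storage, and embedding charges entirely.

```python
def added_input_tokens(queries, chunks_per_query, avg_chunk_tokens):
    """Estimate extra input tokens contributed by retrieved chunks."""
    return queries * chunks_per_query * avg_chunk_tokens


# Illustrative placeholders only: same workload, different chunk caps.
baseline = added_input_tokens(queries=1000, chunks_per_query=20, avg_chunk_tokens=400)
tuned = added_input_tokens(queries=1000, chunks_per_query=5, avg_chunk_tokens=400)
reduction = 1 - tuned / baseline
```

With these placeholder inputs, capping chunks at 5 instead of 20 cuts retrieval-driven input tokens by three quarters, which is why filtering and top-result limits are usually worth tuning before changing the search tier.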

Reliability

Reliability impact is direct because an agent can be online while its retrieval tool is useless. Files may still be ingesting, vector stores may be missing, search capacity may throttle, storage permissions may break, or a private endpoint DNS issue may block retrieval. Reliable designs verify ingestion completion, attach the correct vector store to the agent or conversation, test questions only answerable from uploaded content, and monitor retrieval failures separately from model failures. Use fallback behavior that says knowledge is unavailable rather than fabricating. For critical agents, version files, keep prior vector stores, and rehearse rollback after bad content uploads.
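The ingestion check and the "knowledge is unavailable" fallback can be sketched as a gate in front of retrieval. This is a minimal sketch with assumptions: the status string `"completed"`, the `retrieve` callable, and the fallback wording are hypothetical stand-ins for whatever the agent SDK and application actually provide.

```python
FALLBACK = (
    "The knowledge source is currently unavailable, "
    "so this question cannot be answered from the approved documents."
)


def ingestion_ready(file_statuses):
    """True only if at least one file exists and every file finished processing."""
    return bool(file_statuses) and all(
        status == "completed" for status in file_statuses.values()
    )


def ground_or_fallback(question, file_statuses, retrieve):
    """Return (passages, fallback_message); exactly one of the two is non-empty."""
    if not ingestion_ready(file_statuses):
        return [], FALLBACK
    passages = retrieve(question)
    if not passages:
        return [], FALLBACK
    return passages, ""


# Made-up statuses: one file is still being processed.
statuses = {"manual.pdf": "completed", "addendum.pdf": "in_progress"}
passages, message = ground_or_fallback(
    "What changed in the addendum?", statuses, lambda q: ["unused"]
)
```

Keeping the gate separate from the model call also gives monitoring a clear signal: a fallback caused by retrieval can be counted apart from a model failure.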

Performance

Performance depends on ingestion readiness, vector-store size, retrieval quality, search service capacity, reranking work, and prompt size. A retrieval tool can make answers more accurate but also slower because the agent must search, rank, and include grounding passages before generating. Poor chunking can return irrelevant snippets; excessive chunks can increase latency and token use. Standard setup performance also depends on Azure AI Search replicas, partitions, and private network routing. Measure time to ingest, retrieval latency, tool-call frequency, citation quality, model response time, and user-perceived delay. Tune top results and metadata filters under expected user concurrency before scaling blindly.
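When measuring retrieval latency, a tail percentile such as p95 usually says more than an average. The sketch below uses the nearest-rank method on a list of samples; the function name and the sample latencies are made up for illustration, and monitoring tools may use a different interpolation rule and report slightly different values.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Made-up retrieval latencies in seconds.
latencies = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 2.5]
p95 = percentile(latencies, 95)
p50 = percentile(latencies, 50)
```

In this made-up sample a single slow call dominates the p95 while the median stays low, which is the kind of gap that points to throttling or network routing instead of general slowness.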

Operations

Operators manage retrieval tools by tracking files, vector stores, ingestion jobs, tool attachments, search resources, storage resources, identities, diagnostics, and user reports about missing citations. They confirm that uploaded files completed processing, ask test questions grounded in those files, and inspect whether the agent actually called the tool. Azure CLI supports the surrounding checks: resource inventory, role assignments, private endpoints, diagnostic settings, and cost tags. Runbooks should explain how to replace a bad file, rebuild a vector store, disconnect a tool, investigate hallucination complaints, and prove which knowledge source was active during an answer. Repeat these checks after every content update and release.

Common mistakes

  • Uploading files and assuming ingestion completed before testing questions that depend on those documents.
  • Attaching the wrong vector store to an agent or conversation and then blaming the model for missing facts.
  • Using a retrieval tool without reviewing who can upload sensitive documents or see generated citations.
  • Ignoring additional retrieval charges and token growth after attaching large document collections.