
Stream Analytics function

A Stream Analytics function is a reusable piece of custom logic inside a streaming query. Instead of forcing every rule into standard SQL operators, you can call a function for specialized calculations, transformations, scoring, or encoding. The function runs as part of event processing, so it must be safe, deterministic, and efficient. It is useful when business logic is too specific for built-in Stream Analytics expressions but still needs to run on each event or windowed result.

Aliases: Stream Analytics function, ASA function, Stream Analytics user-defined function, JavaScript UDF for Stream Analytics
Difficulty: intermediate
CLI mappings: 5
Last verified: 2026-05-26T16:27:57Z

Microsoft Learn

A Stream Analytics function is custom logic that can be invoked from a Stream Analytics query. Functions extend the SQL-like language for operations such as specialized calculations, string handling, enrichment, aggregation, or machine-learning scoring when built-in query expressions are not enough.

Microsoft Learn: User-defined functions in Azure Stream Analytics (2026-05-26T16:27:57Z)

Technical context

Technically, functions belong to the Stream Analytics job topology beside inputs, outputs, and transformations. A query can call JavaScript functions, JavaScript aggregates, and supported external or machine-learning function integrations, depending on the scenario and tooling. Function definitions are deployed with the job and inspected through the control plane, but their failures surface in the streaming data path. They interact with serialization, query shape, event rate, exception handling, diagnostics, and output correctness. Operators should review them as code, not as decorative query helpers.
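As a minimal sketch of what a scalar JavaScript function might look like, the example below maps a coded device status to a label. The function name, the codes, and the labels are hypothetical and chosen only for illustration. In a query, a JavaScript function is referenced through the udf. prefix followed by its alias.

```javascript
// Hypothetical scalar UDF: translate a coded device status into a label.
// The codes and labels are illustrative assumptions, not a real device contract.
function statusLabel(code) {
    var labels = { "0": "ok", "1": "warning", "2": "fault" };
    // Null and undefined map to "unknown" so the function never throws on them.
    if (code === null || code === undefined) {
        return "unknown";
    }
    var key = String(code);
    // hasOwnProperty avoids matching inherited keys such as "toString".
    return Object.prototype.hasOwnProperty.call(labels, key) ? labels[key] : "unknown";
}

// A query could call it by alias, for example (input and field names assumed):
//   SELECT deviceId, udf.statusLabel(statusCode) AS status FROM input
```

Returning a fixed fallback value for unknown codes keeps the decision about how to route those events in the query rather than hiding it inside the function.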

Why it matters

Stream Analytics functions matter because they concentrate business logic in the live event path. A small function can classify transactions, normalize device payloads, calculate risk, translate coded values, or apply scoring logic that decides what downstream systems see. That power also creates risk: slow code adds latency, nondeterministic logic makes replay unreliable, and exceptions can stop processing. Teams should use functions when they solve a specific gap, not when ordinary SQL would be clearer. The value is strongest when the function is tested with representative events, versioned with the job definition, and monitored as part of production processing.

Where you see it

Signals, screens, and Azure surfaces where this term usually becomes operational.

Signal 01

In the Stream Analytics job topology, the Functions blade lists the JavaScript, aggregate, or configured scoring functions available to the transformation query, which makes it a natural checkpoint during job review and approval.

Signal 02

In the query text, function calls appear beside SELECT, WHERE, JOIN, or aggregation logic, so custom behavior becomes part of live processing and must be covered by release testing before deployment.

Signal 03

In CLI function show or test output, operators see definition properties, validation results, and errors that explain why a function change failed or is unsafe, before the deployment is approved.

When this becomes relevant

Specific situations where this term helps solve real Azure design, operations, migration, security, reliability, cost, or governance problems.

  • Normalize device payload fields that arrive in vendor-specific formats before the stream writes to shared operational tables.
  • Apply a compact risk or classification rule per event when built-in Stream Analytics expressions are too limited.
  • Perform custom windowed aggregation logic that needs a JavaScript aggregate rather than standard SQL aggregation.
  • Invoke approved machine-learning scoring for real-time prediction where the output must join the streaming result immediately.
  • Centralize reusable transformation logic so several query branches call the same tested function instead of duplicating brittle expressions.
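For the custom windowed aggregation scenario above, the following is a rough sketch of an accumulate-only JavaScript aggregate that returns the spread (maximum minus minimum) of the readings in a window. It follows the init / accumulate / computeResult shape described for JavaScript aggregates; treat that contract, the alias readingSpread, and the local harness runAggregate as assumptions to verify against current documentation before use.

```javascript
// Sketch of an accumulate-only aggregate: spread (max - min) of numeric readings.
function main() {
    this.init = function () {
        this.min = null;
        this.max = null;
    };
    this.accumulate = function (value, timestamp) {
        // Skip nulls and non-numeric values instead of throwing.
        if (typeof value !== "number" || isNaN(value)) { return; }
        if (this.min === null || value < this.min) { this.min = value; }
        if (this.max === null || value > this.max) { this.max = value; }
    };
    this.computeResult = function () {
        // No usable readings in the window yields null.
        return this.min === null ? null : this.max - this.min;
    };
}

// Hypothetical local harness: imitates a window feeding values to the aggregate.
function runAggregate(values) {
    var agg = new main();
    agg.init();
    for (var i = 0; i < values.length; i++) {
        agg.accumulate(values[i], new Date(0));
    }
    return agg.computeResult();
}

// In a query, an aggregate is referenced with the uda. prefix, for example
// (alias and field names assumed): uda.readingSpread(temperature)
```

The harness exists only so the aggregate can be exercised locally with dirty data before it is attached to a job.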

Real-world case studies

Different enterprise-style examples that show the term being used to hit measurable objectives.

Case study 01

Delivery platform standardizes route-delay scoring

Scenario, objectives, solution, measured impact, and takeaway.

Scenario

A food delivery platform streamed courier GPS events, restaurant readiness signals, and weather updates. The standard query became cluttered with repeated route-delay calculations that differed across city teams.

Business/Technical Objectives
  • Use one reviewed delay-scoring function across all city streams.
  • Reduce false delay alerts caused by malformed location events.
  • Keep per-event processing latency below three seconds.
  • Make function changes visible in release reviews.
Solution Using Stream Analytics function

Engineers created a Stream Analytics JavaScript function that accepted route distance, courier speed, restaurant readiness, and weather severity fields, then returned a normalized delay score. The transformation query called the function after filtering impossible coordinates and null courier identifiers. Test payloads included dense downtown routes, rural deliveries, missing GPS fields, and severe weather. Azure CLI listed and showed the function definition during deployment, while the job output wrote high-risk delay events to Service Bus and summary metrics to Azure SQL Database. SU utilization and watermark delay were monitored after rollout to prove the function did not become the bottleneck.
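A delay-scoring function of the kind described here could be sketched as follows. The case study does not give the formula, so the parameter names, the 50 percent weather penalty, and the 60-minute cap are purely illustrative assumptions, not the platform's actual rule.

```javascript
// Hypothetical delay score in [0, 1]. All weights and thresholds are assumptions.
function delayScore(distanceKm, speedKmh, minutesUntilReady, weatherSeverity) {
    function isNum(v) { return typeof v === "number" && isFinite(v); }
    if (!isNum(distanceKm) || !isNum(speedKmh) ||
        !isNum(minutesUntilReady) || !isNum(weatherSeverity)) {
        return null; // the query decides how to route events that cannot be scored
    }
    if (distanceKm < 0 || speedKmh <= 0 || minutesUntilReady < 0) {
        return null; // physically impossible inputs are not scored
    }
    var travelMinutes = (distanceKm / speedKmh) * 60;
    // Clamp severity to [0, 1] so out-of-range values cannot inflate the score.
    var severity = Math.min(Math.max(weatherSeverity, 0), 1);
    // Assumed: severe weather stretches travel time by up to 50 percent.
    var expectedMinutes = travelMinutes * (1 + 0.5 * severity) + minutesUntilReady;
    // Assumed: 60 minutes or more maps to the maximum score of 1.
    return Math.min(expectedMinutes / 60, 1);
}
```

Because the function depends only on its arguments, replaying the same events produces the same scores, which is what makes a reviewed function usable as release evidence.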

Results & Business Impact
  • False delay alerts dropped by 29 percent in the first two weeks.
  • Median stream-processing latency stayed at 1.8 seconds after rollout.
  • City teams removed 11 duplicated query fragments from local jobs.
  • Release review time for scoring changes fell from two days to four hours.
Key Takeaway for Glossary Readers

A Stream Analytics function is valuable when custom streaming logic needs reuse, tests, and clear operational ownership.

Case study 02

Insurer scores claim events before adjuster assignment

Scenario

An insurance carrier streamed first-notice-of-loss events from mobile apps and call centers. Fraud indicators were inconsistent because each channel applied different rules before assigning claims to adjusters.

Business/Technical Objectives
  • Apply a consistent risk score before claims entered the assignment queue.
  • Avoid storing sensitive rule logic in several channel applications.
  • Keep failed scoring from stopping all normal claim intake.
  • Provide auditors with versioned scoring evidence.
Solution Using Stream Analytics function

The analytics team added a Stream Analytics function that evaluated claim amount, incident time, policy age, location distance, and prior claim signals. The function returned a risk band used by the transformation query to route ordinary claims to SQL Database and suspicious claims to a Service Bus review queue. Engineers tested null policy fields, unusual time zones, and malformed location coordinates before deployment. The function definition and Stream Analytics job configuration were exported during each release. Diagnostic logs captured exceptions, and a fallback query branch marked incomplete records for manual review instead of silently dropping them.
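The routing behavior described above, including the fallback for incomplete records, might look something like the sketch below. It uses a subset of the listed signals, and every threshold, point value, and band name is a hypothetical placeholder rather than the insurer's actual rule set.

```javascript
// Hypothetical risk banding. Thresholds and point values are assumptions.
function riskBand(claimAmount, policyAgeDays, distanceKm, priorClaims) {
    var inputs = [claimAmount, policyAgeDays, distanceKm, priorClaims];
    for (var i = 0; i < inputs.length; i++) {
        // Missing or malformed fields produce a distinct band, so the query can
        // send the record to manual review instead of dropping it.
        if (typeof inputs[i] !== "number" || !isFinite(inputs[i])) {
            return "incomplete";
        }
    }
    var points = 0;
    if (claimAmount > 10000) { points += 2; }
    if (policyAgeDays < 30) { points += 2; }
    if (distanceKm > 500) { points += 1; }
    if (priorClaims >= 3) { points += 1; }
    if (points >= 4) { return "high"; }
    if (points >= 2) { return "medium"; }
    return "low";
}
```

Returning an explicit "incomplete" band, rather than throwing or returning null, gives the fallback query branch a value it can filter on directly.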

Results & Business Impact
  • High-risk claim routing improved from 22-minute batch delay to under 90 seconds.
  • Channel-specific scoring discrepancies fell by 63 percent after consolidation.
  • Malformed-event exceptions dropped to fewer than five per million claims.
  • Audit evidence collection for scoring changes decreased from five days to one day.
Key Takeaway for Glossary Readers

Putting reviewed scoring logic in a Stream Analytics function can make live decisions consistent without spreading rules across every source system.

Case study 03

Media network cleans live caption metadata

Scenario

A media network processed live caption, language, and timecode metadata for multiple streaming channels. Small formatting differences caused downstream archival jobs to miss clips or create duplicate records.

Business/Technical Objectives
  • Normalize caption metadata before it reached archive and search systems.
  • Reduce duplicate clip records caused by inconsistent timecode formats.
  • Keep live-stream processing within a five-second freshness target.
  • Give broadcast engineers a safe rollback path for function changes.
Solution Using Stream Analytics function

Developers built a lightweight Stream Analytics function that normalized language codes, trimmed caption identifiers, and converted several timecode patterns into a standard field. The transformation query called the function only after validating channel ID and event type, avoiding unnecessary execution for unrelated telemetry. Outputs went to Azure Data Explorer for live search and Data Lake Storage for archive processing. CLI-based deployment captured the function definition and previous job state before each channel rollout. The team added sample events from sports, news, and multilingual broadcasts so regression tests covered the formats most likely to break archival search.
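To make the timecode normalization concrete, here is a small sketch that converts a few timecode patterns into whole milliseconds. The accepted patterns (HH:MM:SS, HH:MM:SS.fraction, and HH:MM:SS:frames), the millisecond target, and the function name are assumptions, since the case study does not specify the network's formats.

```javascript
// Hypothetical timecode normalizer: returns milliseconds, or null if unparseable.
function normalizeTimecode(raw, framesPerSecond) {
    if (typeof raw !== "string") { return null; }
    var m = /^(\d{1,2}):(\d{2}):(\d{2})(?:([:.])(\d{1,3}))?$/.exec(raw.trim());
    if (!m) { return null; }
    var hours = parseInt(m[1], 10);
    var minutes = parseInt(m[2], 10);
    var seconds = parseInt(m[3], 10);
    if (minutes > 59 || seconds > 59) { return null; }
    var ms = ((hours * 60 + minutes) * 60 + seconds) * 1000;
    if (m[4] === ".") {
        // Fractional seconds: right-pad to three digits, so "5" means 500 ms.
        ms += parseInt((m[5] + "00").slice(0, 3), 10);
    } else if (m[4] === ":") {
        // Frame count: needs a valid frame rate, otherwise the value is ambiguous.
        if (typeof framesPerSecond !== "number" || !(framesPerSecond > 0)) { return null; }
        var frames = parseInt(m[5], 10);
        if (frames >= framesPerSecond) { return null; }
        ms += Math.round(frames * 1000 / framesPerSecond);
    }
    return ms;
}
```

Rejecting ambiguous input with null, instead of guessing a frame rate, keeps duplicate detection downstream from matching on a wrongly converted value.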

Results & Business Impact
  • Duplicate clip records fell by 48 percent across the first 30 channels.
  • Live metadata freshness remained below four seconds at peak broadcast volume.
  • Archive search misses related to caption formatting dropped by 57 percent.
  • Rollback from a bad timecode rule was completed in 12 minutes during rehearsal.
Key Takeaway for Glossary Readers

Stream Analytics functions are strongest when they solve a narrow transformation problem that would otherwise create downstream data quality failures.

Why use Azure CLI for this?

I use Azure CLI for Stream Analytics functions because function drift is hard to spot by clicking through jobs. CLI lets me list functions, show definitions, test a function, and export the job topology before changing code. After years of incident reviews, I treat streaming functions like production code because one bad exception can stop a job or corrupt output. CLI also helps compare development and production definitions, which matters when a small JavaScript change is deployed outside source control. For approvals, command output gives reviewers function names, properties, job scope, and test results without relying on screenshots from the portal.

CLI use cases

  • List functions under a job before reviewing or updating transformation query logic.
  • Show a function definition and compare it against the version stored in source control.
  • Test a function with representative payloads before starting or updating the production job.
  • Update a function through deployment automation after approvals and rollback evidence are ready.
  • Delete obsolete functions that are no longer referenced by the transformation query.

Before you run CLI

  • Confirm tenant, subscription, resource group, Stream Analytics job name, and function name before testing or changing function definitions.
  • Check whether the command is read-only; create, update, delete, and test operations can affect release evidence or live job behavior.
  • Review the transformation query to confirm whether the function is actively called and what input schema it expects.
  • Prepare representative test payloads, previous function definition, rollback steps, and output format before making a production change.

What output tells you

  • Function list output shows which custom functions exist under the job and helps find unused or unexpected definitions.
  • Function show output exposes the function properties, binding details, and definition material needed for review and drift comparison.
  • Function test output indicates whether the definition is valid for the supplied configuration or payload before the job relies on it.
  • Job show with expanded functions connects function definitions back to the transformation query, inputs, outputs, and runtime settings.
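The drift comparison mentioned above can be approximated with a small local script. This sketch makes no assumption about the exact JSON shape that the CLI returns: it compares any two parsed JSON values after sorting object keys, so property order alone does not register as drift. The function names canonicalize and hasDrift are hypothetical helpers.

```javascript
// Serialize a parsed JSON value with object keys sorted, so that key order
// differences between two exports do not show up as drift.
function canonicalize(value) {
    if (Array.isArray(value)) {
        return "[" + value.map(canonicalize).join(",") + "]";
    }
    if (value !== null && typeof value === "object") {
        var keys = Object.keys(value).sort();
        return "{" + keys.map(function (k) {
            return JSON.stringify(k) + ":" + canonicalize(value[k]);
        }).join(",") + "}";
    }
    return JSON.stringify(value);
}

// True when the deployed definition differs from the source-controlled one.
function hasDrift(deployed, expected) {
    return canonicalize(deployed) !== canonicalize(expected);
}
```

In practice the two arguments would be the parsed JSON from function show and the definition file kept in source control. Array order is treated as significant, which is a deliberate choice here because input parameter order changes a function's meaning.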

Mapped Azure CLI commands

Stream Analytics function inspection and testing

tests
az stream-analytics function list --job-name <job-name> --resource-group <resource-group> --output table
az stream-analytics function | discover | Analytics
az stream-analytics function show --job-name <job-name> --resource-group <resource-group> --function-name <function-name>
az stream-analytics function | discover | Analytics
az stream-analytics function test --job-name <job-name> --resource-group <resource-group> --function-name <function-name>
az stream-analytics function | discover | Analytics
az stream-analytics function update --job-name <job-name> --resource-group <resource-group> --function-name <function-name> --properties @function.json
az stream-analytics function | configure | Analytics
az stream-analytics job show --job-name <job-name> --resource-group <resource-group> --expand "functions,transformation"
az stream-analytics job | discover | Analytics

Architecture context

A Stream Analytics function belongs where streaming query logic needs a focused extension point. It should not become a hidden application inside the job. Architects should decide which logic remains in SQL, which belongs in a function, and which should move to a separate service or downstream processor. The decision depends on latency, determinism, exception risk, maintainability, and cost. Functions that run for every event must be light and predictable; functions used for scoring or enrichment need explicit test data and failure handling. In a production design, the function definition, input schema assumptions, output type, version history, and rollback path should sit beside the transformation query in source control.

Security

Security impact is partly indirect because ordinary Stream Analytics functions usually transform data rather than grant access. Risk still appears through code behavior, sensitive values, and external scoring integrations. Function code should not embed secrets, leak payload details in logs, or expose confidential fields through derived outputs. If a function uses Azure Machine Learning or another configured service integration, identity, network access, and endpoint permissions need review. RBAC should limit who can update functions because a malicious or careless change can reroute, downgrade, or misclassify live data. Security reviewers should also check for deterministic behavior where output records support audits, investigations, or policy decisions.

Cost

A function has no separate billing line inside Stream Analytics, but it can affect cost by increasing CPU work, streaming-unit needs, output volume, and downstream service calls. A lightweight string normalization function may be cheap; a complex scoring or aggregation pattern used on every event may require more streaming units or a dedicated cluster. External machine-learning integrations can add endpoint cost and latency. Bad logic can also create too many output rows, increasing Storage, SQL, Cosmos DB, or logging charges. FinOps reviews should look at per-event function calls, event rate, SU utilization, output volume, and whether the same work could be done more cheaply downstream.

Reliability

Reliability impact is direct because function errors can interrupt event processing or produce bad records at streaming speed. User-defined logic must handle nulls, malformed payloads, unexpected types, and boundary values. Nondeterministic functions make replay analysis unreliable because old input may produce different output later. Heavy functions can increase latency or raise resource pressure, especially when called per event. Reliable teams test functions with dirty data, use try-catch patterns where appropriate, avoid random or time-dependent behavior unless intentional, and monitor job errors after deployment. During an incident, a rollback plan should restore the previous function definition, not only the previous query text.
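The try-catch and null-handling advice can be illustrated with a short sketch. The function below reads a numeric field from a JSON string and returns null for anything it cannot interpret. The function name and the reading field are hypothetical.

```javascript
// Hypothetical defensive UDF: extract a numeric "reading" from a JSON string.
function safeReading(payloadText) {
    try {
        var payload = JSON.parse(payloadText);
        // JSON.parse can return null or a primitive, so guard before property access.
        var value = payload && payload.reading;
        return (typeof value === "number" && isFinite(value)) ? value : null;
    } catch (e) {
        // Malformed JSON: return null rather than letting the exception escape.
        return null;
    }
}
```

The function is deterministic: it uses no clock and no random values, so a replayed event yields the same result as the original run.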

Performance

Performance impact is direct because functions execute in the query path. Per-event functions multiply cost by event rate, so small inefficiencies become large under high throughput. Expensive parsing, complex JavaScript, external scoring, or unnecessary aggregate logic can increase watermark delay and SU utilization. The function should be benchmarked with realistic payloads, not only tested for correctness. If latency rises after a release, compare input rate, SU utilization, function exceptions, and output delays. Good performance design keeps functions focused, avoids nondeterministic or blocking behavior, handles nulls quickly, and prefers built-in query operators when they are clearer and faster.
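A rough local benchmark can give an early signal before a function reaches a job. The harness below is a sketch, and both benchmark and trimId are hypothetical names. Timings taken on a workstation will not match the Stream Analytics runtime, so the numbers are useful mainly for comparing two versions of the same function against the same payloads.

```javascript
// Hypothetical function under test: trim and lowercase an identifier.
function trimId(id) {
    return typeof id === "string" ? id.trim().toLowerCase() : null;
}

// Call fn once per sample, repeated `iterations` times, and report rough timing.
// Each sample is an array of arguments for one call.
function benchmark(fn, samples, iterations) {
    var calls = 0;
    var start = Date.now();
    for (var i = 0; i < iterations; i++) {
        for (var j = 0; j < samples.length; j++) {
            fn.apply(null, samples[j]);
            calls++;
        }
    }
    var elapsedMs = Date.now() - start;
    return {
        calls: calls,
        elapsedMs: elapsedMs,
        microsPerCall: calls > 0 ? (elapsedMs * 1000) / calls : 0
    };
}

// Example: benchmark(trimId, [["  ABC-1 "], [null]], 100000)
```

Including null and malformed samples in the payload set matters, because the error path is often slower than the happy path.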

Operations

Operators inspect functions when a job starts failing, output values look wrong, or a release changes business logic. Practical work includes listing functions under a job, showing definitions, running function tests, comparing definitions across environments, and confirming that the transformation query calls the expected function name. Function changes should be paired with sample input events and expected output values. Runbooks should include how to disable a risky query path, how to restore the previous definition, and which logs expose function exceptions. Good operations treat functions as deployable artifacts with owners, tests, rollback notes, and monitoring responsibilities that continue after deployment.
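Pairing sample inputs with expected outputs can be as simple as a table-driven check that runs locally before a release. The sketch below assumes scalar results that can be compared with strict equality. The names runCases and trimCaptionId, and the sample cases, are hypothetical.

```javascript
// Hypothetical function under test.
function trimCaptionId(id) {
    return typeof id === "string" ? id.trim() : null;
}

// Run each case and collect failures. A thrown exception is recorded as a
// failure instead of aborting the run, so one bad case cannot hide the others.
function runCases(fn, cases) {
    var failures = [];
    for (var i = 0; i < cases.length; i++) {
        var actual;
        try {
            actual = fn.apply(null, cases[i].args);
        } catch (e) {
            actual = "threw: " + e.message;
        }
        if (actual !== cases[i].expected) {
            failures.push({ name: cases[i].name, expected: cases[i].expected, actual: actual });
        }
    }
    return failures;
}

var sampleCases = [
    { name: "padded id", args: ["  cap-42 "], expected: "cap-42" },
    { name: "null id", args: [null], expected: null },
    { name: "numeric id", args: [42], expected: null }
];
// An empty array from runCases(trimCaptionId, sampleCases) means every case passed.
```

Keeping the case table next to the function definition in source control lets reviewers see the behavior change and the evidence for it in the same diff.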

Common mistakes

  • Putting too much business logic into a function when ordinary query expressions or downstream processing would be simpler.
  • Deploying a function that works for happy-path events but throws exceptions on null, malformed, or late-arriving payloads.
  • Using nondeterministic behavior that makes replayed events produce different results during investigation.
  • Changing a function in one environment and forgetting that production still runs an older definition.
  • Ignoring performance impact when a function is called for every event in a high-volume stream.