variance
This page explains how to use the variance aggregation function in APL.
The variance aggregation function in APL calculates the variance of a numeric expression across a set of records. Variance is a statistical measurement that represents the spread of data points in a dataset. It’s useful for understanding how much variation exists in your data. In scenarios such as performance analysis, network traffic monitoring, or anomaly detection, variance helps identify outliers and patterns by showing how data points deviate from the mean.
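For reference, the standard textbook definition of variance is sketched below. It is not taken from this page, and the page does not say whether APL uses the population form (dividing by n) or the sample form (dividing by n - 1), so treat the choice of denominator as an open assumption.

```latex
% Mean of n values x_1, ..., x_n
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i

% Population variance (divide by n)
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2

% Sample variance (divide by n - 1)
s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```

For example, the three values 100, 200, and 300 have a mean of 200 and squared deviations of 10000, 0, and 10000. The population variance is 20000 / 3, about 6666.67, and the sample variance is 20000 / 2 = 10000.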
For users of other query languages
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
Usage
Syntax
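A minimal sketch of the call shape, assuming variance is used inside the summarize operator like other APL aggregation functions. Expression stands for the parameter described under Parameters.

```kusto
// Assumed call shape: variance as an aggregation inside summarize
summarize variance(Expression)
```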
Parameters
Expression: A numeric expression or field for which you want to compute the variance. The expression should evaluate to a numeric data type.
Returns
The function returns the variance (a numeric value) of the specified expression across the records.
Use case examples
You can use the variance function to measure the variability of request durations, which helps in identifying performance bottlenecks or anomalies in web services.
Query
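A minimal sketch of such a query. The dataset name sample-http-logs is a placeholder, and the field name req_duration_ms is inferred from the output column name, so adjust both to match your data.

```kusto
// Assumption: sample-http-logs is a placeholder dataset name;
// req_duration_ms is inferred from the output column on this page.
['sample-http-logs']
| summarize variance(req_duration_ms)
```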
Output
| variance_req_duration_ms |
| --- |
| 1024.5 |
This query calculates the variance of request durations from a dataset of HTTP logs. A high variance indicates greater variability in request durations, potentially signaling performance issues.
List of related aggregations
- stdev: Computes the standard deviation, which is the square root of the variance. Use stdev when you need the spread of data in the same units as the original dataset.
- avg: Computes the average of a numeric field. Combine avg with variance to analyze both the central tendency and the spread of data.
- count: Counts the number of records. Use count alongside variance to get a sense of data size relative to variance.
- percentile: Returns a value below which a given percentage of observations fall. Use percentile for a more detailed distribution analysis.
- max: Returns the maximum value. Use max when you are looking for extreme values in addition to variance to detect anomalies.
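Several of these aggregations can be combined with variance in one summarize step. The following is a sketch only. It assumes a placeholder dataset named sample-http-logs with a numeric req_duration_ms field, and that stdev, avg, and count are called as shown.

```kusto
// Assumption: placeholder dataset and field names; adjust to your data.
['sample-http-logs']
| summarize variance(req_duration_ms), stdev(req_duration_ms), avg(req_duration_ms), count()
```

Because stdev is the square root of the variance, squaring the stdev column should give approximately the value in the variance column, which makes a quick sanity check on the result.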