These functions are often the core of a statistical analysis of a single attribute. They are all aggregate functions. You can use them in formulas for attributes or for measures.
Note: All of these functions take an optional last argument, a Boolean expression, which restricts the function to the cases where the condition is true. For example, max( age, ancestry = "Norwegian") finds the age of the oldest Norwegian in the collection.
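The optional condition in the note above can be pictured as a filter applied before the aggregate is computed. The following Python sketch is only an illustration of that idea, not the software's implementation; the data and the helper name filtered_max are hypothetical.

```python
# Hypothetical collection: each case is a dict of attribute values.
cases = [
    {"age": 71, "ancestry": "Norwegian"},
    {"age": 84, "ancestry": "German"},
    {"age": 66, "ancestry": "Norwegian"},
    {"age": 90, "ancestry": "Irish"},
]

def filtered_max(cases, attribute, condition=None):
    """Maximum of `attribute` over the cases for which `condition` is true.

    With no condition, every case is used. Returning None when no case
    qualifies is an assumption made for this sketch.
    """
    values = [c[attribute] for c in cases if condition is None or condition(c)]
    return max(values) if values else None

# Like max( age )
oldest_overall = filtered_max(cases, "age")
# Like max( age, ancestry = "Norwegian")
oldest_norwegian = filtered_max(cases, "age", lambda c: c["ancestry"] == "Norwegian")
print(oldest_overall, oldest_norwegian)
```

The same pattern (filter first, then aggregate) would apply to the other functions in this section.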
Counting and Proportions
count( ) |
The number of cases in a collection |
count( a ) |
The number of cases having a valid value for a that is not false. |
count( condition ) |
The number of cases where condition is true. Example: count( age > 65) gives the number of cases where age is greater than 65. |
proportion( condition) |
Gives the proportion of cases where condition is true. Example: proportion( a > 12) tells us the proportion of all cases where a is greater than 12. The result is a number between 0 and 1, inclusive. |
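As a rough illustration of how count( condition ) and proportion( condition ) relate, here is a small Python sketch. The attribute values and the helper names count_where and proportion_where are hypothetical, and the handling of an empty collection is an assumption.

```python
# Hypothetical values of the attribute `age`, one per case.
ages = [12, 70, 45, 66, 81, 30, 67, 19]

def count_where(values, condition):
    """Number of cases for which the condition is true."""
    return sum(1 for v in values if condition(v))

def proportion_where(values, condition):
    """Fraction of all cases for which the condition is true (0 to 1)."""
    if not values:
        return None  # assumption: no result for an empty collection
    return count_where(values, condition) / len(values)

over_65 = count_where(ages, lambda age: age > 65)             # like count( age > 65)
share_over_65 = proportion_where(ages, lambda age: age > 65)  # like proportion( age > 65)
print(over_65, share_over_65)
```

In this sketch the proportion is simply the conditional count divided by the total number of cases, which is why it always lies between 0 and 1.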
Measures of Center
mean( a) |
The mean value for a in the collection. |
median( a) |
The median value for a in the collection. |
Measures of Spread
iqr( a) |
The interquartile range of the attribute a. |
popStdDev( a) |
The standard deviation of the attribute you give it, in this case, a. This is the "population standard deviation," calculated with n in the denominator. |
s( a) |
These three are synonyms. They estimate the standard deviation of a population given that the cases constitute a simple random sample. That is, the result is just like popStdDev, except that it is a little bigger because the calculation uses n − 1 instead of n. |
popVariance( a) |
The variance of the values in a. This is also popStdDev squared. |
sampleVariance( a ) |
The estimate of the population variance. This is also s squared. (These two are synonyms.) |
stdError( a ) |
The standard error of the mean of a: an estimate of how far the sample mean is likely to be from the population mean, assuming that the cases are a simple random sample. This is the same as s( a ) divided by the square root of n. |
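The relationships among these measures of spread can be checked numerically. The sketch below uses Python's standard statistics module as a stand-in for the functions above; the data are hypothetical, and the mapping of names (pstdev for popStdDev, stdev for s, and so on) is an assumption based on the descriptions in this section.

```python
import math
import statistics

# Hypothetical values of the attribute `a`.
a = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(a)

pop_std_dev = statistics.pstdev(a)        # like popStdDev( a ): divides by n
s = statistics.stdev(a)                   # like s( a ): divides by n - 1
pop_variance = statistics.pvariance(a)    # like popVariance( a ): popStdDev squared
sample_variance = statistics.variance(a)  # like sampleVariance( a ): s squared
std_error = s / math.sqrt(n)              # like stdError( a ): s over the square root of n

print(pop_std_dev, s, pop_variance, sample_variance, std_error)
```

With these data the sample standard deviation comes out slightly larger than the population standard deviation, as the description of s( a ) says it should.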
Order Statistics
max( a ) |
The maximum value for a in the collection. |
min( a ) |
The minimum value for a in the collection. |
percentile( pct, a) |
Gives the value at the given percentile for a. |
Other
sum( a ) |
The sum of the attribute you give it, in this case, the sum of a. |
uniqueValues( a) |
The number of unique values that a has in the collection. |
first( a ) |
Gives you the value of a from the first case in the collection. |
last( a ) |
Gives you the value of a from the last case in the collection. |
Q1( a ) |
The value of a that lies at the 25th percentile; i.e. the first quartile. |
Q3( a ) |
The value of a that lies at the 75th percentile; i.e. the third quartile. |
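The quartile functions and iqr( a ) are connected: the interquartile range is the distance from the first quartile to the third. The Python sketch below illustrates this with hypothetical data. This section does not say which interpolation rule Q1, Q3, and percentile use, so the "inclusive" method chosen here is an assumption, and results may differ from the software's on small collections.

```python
import statistics

# Hypothetical values of the attribute `a`.
a = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Cut points that split the data into four equal parts.
# method="inclusive" is an assumption; other quartile rules exist.
q1, median, q3 = statistics.quantiles(a, n=4, method="inclusive")

iqr = q3 - q1  # like iqr( a ), taken as Q3( a ) minus Q1( a )
print(q1, median, q3, iqr)
```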