# Averaged Value

The `AveragedValue` data structure keeps track of the `count` and `mean` of a stream of numbers with a single pass over the data. The mean calculation uses a numerically stable online algorithm suitable for large numbers of records, similar to Chan et. al.’s parallel variance algorithm on Wikipedia. As long as your count doesn’t overflow a `Long`, the mean calculation won’t overflow.

You can build instances of `AveragedValue` from any numeric type:

``````import com.twitter.algebird._

val longVal = AveragedValue(3L)

val doubleVal = AveragedValue(12.0)

val intVal = AveragedValue(15)
``````

## Algebraic Properties

Combining instances with `+` generates a new instance by adding the `count`s and averaging the `value`s:

``````longVal + doubleVal

longVal + doubleVal + intVal
``````

You can also add numbers directly to an `AveragedValue` instance:

``````longVal + 12
``````

`AveragedValue` is a commutative group. This means you can add instances in any order:

``````longVal + doubleVal == doubleVal + doubleVal
// res3: Boolean = false
``````

An `AveragedValue` with a count and value of `0` act as `Monoid.zero`:

``````Monoid.zero[AveragedValue]

longVal + Monoid.zero[AveragedValue] == longVal
// res5: Boolean = true
``````

Subtracting `AveragedValue`s is the opposite of addition:

``````intVal - longVal

intVal + doubleVal - doubleVal
``````

### Stable Average Algorithm

Each `AveragedValue` instance represents a stream of data. `AveragedValue` uses Chan et. al.’s parallel algorithm to combine the mean values of each stream. Here’s the calculation:

``````// big and small are two AveragedValue instances
val deltaOfMeans = big.value - small.value
val newCount = big.count + small.count
val newMean = small.value + deltaOfMeans * (big.count / newCount)
``````

As long as `newCount` stays within the bounds of `Long`, this calculation won’t overflow for large `value`.

The Wikipedia presentation of this algorithm points out that if both streams are close in size, the algorithm is “prone to loss of precision due to catastrophic cancellation”.

`AveragedValue` guards against this instability by taking a weighted average of the two instances if the smaller of the two has a count greater than `10%` of the combined count.

``````val newCount = big.count + small.count
(big.count * big.value + small.count * small.value) / newCount
``````

## Aggregator

`AveragedValue.aggregator` returns an `Aggregator` that uses `AveragedValue` to calculate the mean of all `Double` values in a stream. For example:

``````val items = List[Double](1.0, 2.2, 3.3, 4.4, 5.5)
// items: List[Double] = List(1.0, 2.2, 3.3, 4.4, 5.5)

AveragedValue.aggregator(items)
// res8: Double = 3.28
``````

`AveragedValue.numericAggregator` works the same way for any numeric type:

``````val items = List[Int](1, 3, 5, 7)
// items: List[Int] = List(1, 3, 5, 7)

AveragedValue.numericAggregator[Int].apply(items)
// res9: Double = 4.0
``````
• DecayedValue calculates a decayed moving average of a stream; this allows you to approximate the mean of a bounded sliding window of the datastream.
• In addition to count and mean, Moments allows you to keep track of the standard deviation, skewness and kurtosis of a stream with one pass over the data.