# Q Tree

A `QTree[A: Monoid]` is like an approximate `Map[Double, A: Monoid]` that can store only a fixed number of keys. Instead of storing key-value pairs, it stores range-value pairs, where each range is a small `Double` interval, and the ranges don’t overlap. When there are too many keys, it combines nearby range-value pairs, combining the ranges, and combining the values using the `Monoid[A]` operation. Each range-value pair also keeps track of its count, i.e. the number of key-value pairs the range is composed of.

For example, suppose we create a `QTree[String]` from the following values: (remember that `String` is a `Monoid` under concatenation)

``````- 0.1, "he"
- 0.2, "ll"
- 0.21, "o"
- 0.26, "wo"
- 0.3, "r"
- 0.4, "l"
- 0.49, "d"
``````

This QTree might be represented as

``````- \[0,0.125), "he", count=1
- \[0.125,0.25), "llo", count=2
- \[0.25, 0.375), "wor", count=2
- \[0.375, 0.5), "ld", count=2
``````

or as

``````- \[0, 0.25), "hello", count=3
- \[0.25, 0.5), "world", count=4
``````

or as

``````- \[0,0.5), "helloworld", count=7
``````

How is this useful?

The most common usage of `QTree` is as `QTree[Unit]`. (Remember, `Unit` is a trivial monoid, where `() + () = ()` and `()` is the identity.)

This is basically a histogram. Why? When two ranges get combined, the counts get added together, so at any point, the count associated with a range `R` is exactly the number of original `Double` values in that range.

• The `quantileBounds` function gives lower and upper bounds for a given quantile. For instance, `qtree.quantileBounds(0.5)` gives bounds for the median double value.
• The `rangeCountBounds` function gives lower and upper bounds for the count within a certain range. For instance, `qtree.rangeCountBounds(0.5, 2.5)` gives lower and upper bounds for the number of elements in the range (0.5, 2.5).

What happens if we use a nontrivial monoid `A`, instead of `Unit`? Then we get some more useful functions:

• `rangeSumBounds` gives lower and upper bounds for the monoid sum of the elements in a certain range.

For example, if we call `qtree.rangeSumBounds(0.1,0.2)` on the first `QTree` described above, it would return `("","hello")`. Why is it a range, and not an exact value? This is because we only store the ranges and we don’t know exactly where in the range each item is. For example, `"he"` might exist between 0 and 0.09999 and `"llo"` might exist between 0.20001 and 0.25, in which case the true sum over (0.1,0.2) would be `""`. On the other hand, `"he"` might exist between 0.1 and 0.125, and `"llo"` might exist between 0.125 and 0.2, in which case the true sum would be `"hello"`. The answer could also be anywhere in between these two results. If we look at the original data points, we see that the correct answer is actually `"he"`, since `"he"` is the only value in the range [0.1,0.2).

• `totalSum` returns the total sum over the entire tree. In the example above, `qtree.totalSum` would return `"helloworld"`.

## REPL Tour

``````import com.twitter.algebird._

val data = List(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8)
// data: List[Int] = List(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8)

val seqQTree = data.map { QTree(_) }
// seqQTree: List[com.twitter.algebird.QTree[Long]] = List((1,0,1,1,None,None), (1,0,1,1,None,None), (2,0,1,2,None,None), (2,0,1,2,None,None), (3,0,1,3,None,None), (3,0,1,3,None,None), (4,0,1,4,None,None), (4,0,1,4,None,None), (5,0,1,5,None,None), (5,0,1,5,None,None), (6,0,1,6,None,None), (6,0,1,6,None,None), (7,0,1,7,None,None), (7,0,1,7,None,None), (8,0,1,8,None,None), (8,0,1,8,None,None))

val qtSemigroup = new QTreeSemigroup[Long](6)

val sum = qtSemigroup.sumOption(seqQTree).get
// sum: com.twitter.algebird.QTree[Long] = (0,4,16,0,Some((0,3,14,0,Some((0,2,6,0,Some((0,1,2,0,None,Some((1,0,2,2,None,None)))),Some((1,1,4,0,Some((2,0,2,4,None,None)),Some((3,0,2,6,None,None)))))),Some((1,2,8,0,Some((2,1,4,0,Some((4,0,2,8,None,None)),Some((5,0,2,10,None,None)))),Some((3,1,4,0,Some((6,0,2,12,None,None)),Some((7,0,2,14,None,None)))))))),Some((1,3,2,0,Some((2,2,2,0,Some((4,1,2,0,Some((8,0,2,16,None,None)),None)),None)),None)))

val sum2 = seqQTree.reduce{qtSemigroup.plus(_,_)}
// sum2: com.twitter.algebird.QTree[Long] = (0,4,16,0,Some((0,3,14,0,Some((0,2,6,0,Some((0,1,2,0,None,Some((1,0,2,2,None,None)))),Some((1,1,4,0,Some((2,0,2,4,None,None)),Some((3,0,2,6,None,None)))))),Some((1,2,8,0,Some((2,1,4,0,Some((4,0,2,8,None,None)),Some((5,0,2,10,None,None)))),Some((3,1,4,0,Some((6,0,2,12,None,None)),Some((7,0,2,14,None,None)))))))),Some((1,3,2,0,Some((2,2,2,0,Some((4,1,2,0,Some((8,0,2,16,None,None)),None)),None)),None)))

sum.count
// warning: there was one inliner warning; re-run with -Yinline-warnings for details
// res0: Long = 16

sum.upperBound
// res1: Double = 16.0

sum.lowerBound
// res2: Double = 0.0

sum.quantileBounds(.5)
// res3: (Double, Double) = (5.0,6.0)

sum.quantileBounds(.5)
// res4: (Double, Double) = (5.0,6.0)

sum.quantileBounds(.25)
// res5: (Double, Double) = (3.0,4.0)

sum.quantileBounds(.75)
// res6: (Double, Double) = (7.0,8.0)
``````

### Documentation Help

We’d love your help fleshing out this documentation! You can edit this page in your browser by clicking this link. These links might be helpful: