the number to convert to l-canonical form
the "l" in l-canonical form
vector of numbers that sum to s. Each entry is a power of 2, and the number of entries of each power of 2 matches the l-canonical representation of s. Note that:
bucketsFromLong(s, l) == fromLong(s, k).toBuckets
bucketsFromLong is more efficient.
the number to convert to l-canonical form
the "l" in l-canonical form
vector of the coefficients of 2^i in the l-canonical representation of s. For example:
scala> Canonical.fromLong(15, 2) res0: Vector[Int] = Vector(3, 2, 2)
15 = (3 * 20) + (2 * 21) + (2 * 2^2) the "l" in l-canonical means that
l
or l + 1
returnVector.last
<= l + 1
## L-Canonical Representation Procedure:
- Find the largest j s.t. 2j <= (s + l) / (1 + l)
- let s' = 2j(1 + l) - l
- let diff = (s - s') is the position of s within that group.
- let b = the little-endian binary rep of diff % (2^j - 1)
- let ret = return vector of length j:(0 until j).map { i => ret(i) = b(i) + l } ret(j) = math.floor(diff / 2^j)
The paper that introduces the exponential histogram proves that, given a positive number
l
, every integer s can be uniquely represented as the sum of(l or (l + 1)) * 2i + (# from 1 to (l + 1)) 2j
for i = (0 to j - 1), given some j.
The paper calls this the "l-canonical" representation of s.
It turns out that if you follow the exponential histogram bucket-merging algorithm, you end up with the invariant that the number of buckets with size 2^i exactly matches that power of 2's coefficient in s's l-canonical representation.
Put another way - only sequences of buckets with sizes matching the l-canonical representation of some number s are valid exponential histograms.
(We use this idea in
ExpHist.rebucket
to take a sequence of buckets of any size and rebucket them into a sequence where the above invariant holds.)This is huge. This means that you can implement
addAll(newBuckets)
by- calculating newS = s + delta contributed by newBuckets - generating the l-canonical sequence of bucket sizes for newS - rebucketing newBuckets ++ oldBuckets into those bucket sizes
The resulting sequence of buckets is a valid exponential histogram.