User Guide¶
Choosing the stats implementation to use for Finagle¶
Finagle uses util-stats heavily throughout and it is important for a well-run service to have this wired up properly. Finagle uses LoadService in order to discover implementations on the classpath. The standard implementation comes via finagle-stats.
If multiple implementations are found, metrics are reported to all of the implementations via a BroadcastStatsReceiver. This feature is useful when transitioning from one library implementation to another. Once you are done migrating, in the interests of efficiency and memory pressure, it is wise to validate that only a single implementation is loaded via your dependencies. This also implies that developers of libraries should not depend on a specific implementation and only depend on util-stats.
If no implementations are found, metrics are reported to a NullStatsReceiver which is essentially
/dev/null
.
Measuring the latency of operations¶
A common need for clients is to capture the latency distribution, often of asynchronous calls such as those returning Futures. This can be done easily via the Stat.time and Stat.timeFuture methods.
Prefer storing counters and stats in member variables¶
While it may be tempting to create your metrics inline, whenever possible you should initialize
these up front, typically in a class constructor. This is because while capturing usage via
Stat.add
and Counter.incr
is optimized for performance, the construction of a Stat
and
Counter
is not.
class CountingExample(statsReceiver: StatsReceiver) {
// create the Counter once and store it in a member
private val numSheep = statsReceiver.counter("sheep")
// incr() called on the member variable many times
// in the application's lifecycle.
def bedtime(n: Int): Unit = {
numSheep.incr(n)
// nifty business logic here
}
}
Note that sometimes this is not possible due to business logic, for example, when a counter is named based on current context. However, care should be taken to ensure that the cardinality of metrics created is bounded or you will run the risk of a memory leak (for example by including application ids in your metric names).
Store gauges in member variables¶
While your application code is not likely to refer to the value returned from
StatsReceiver.addGauge
, you should store it in a member variable. This is due to the common
implementation of Gauge
being CumulativeGauge
, which uses java.lang.ref.WeakReference.
If a CumulativeGauge
is not held by a strong reference, the gauge is eligible for garbage
collection, which would cause your metric to disappear unexpectedly at runtime.
// bad, don't do this:
class BadExample(queue: BlockingQueue[Work], statsReceiver: StatsReceiver) {
statsReceiver.addGauge("num_waiting") { queue.size }
// code that may do incredible things but you'd never know
// the size of the queue.
}
// do this instead:
class GoodExample(queue: BlockingQueue[Work], statsReceiver: StatsReceiver) {
private val queueDepth =
statsReceiver.addGauge("num_waiting") { queue.size }
// code that does incredible things AND you know the
// size of the queue.
}
Prefer addGauge over provideGauge¶
StatsReceiver
offers two similar methods, addGauge
and provideGauge
, and whenever
possible addGauge
should be preferred. provideGauge
is basically a call to addGauge
along with code that holds a strong reference to the gauge in a global linked list. Recall from the
previous note that Gauges
need to held in strong references and as
such you should only rely on provideGauge
when you do not have a place to keep a strong
reference.
Prefer fully-qualified names over scoping¶
There is a convenient API method (scope
) allowing to spawn a “scoped” version of a given
StatsReceiver
. Although it enables effortless code reuse, scoping comes at the cost of extra
allocations. Inlining a fully-qualified metric name directly into the constructing method yields
the same outcome yet avoids the overhead needed to accommodate interim structures.
Put this way, if possible, prefer this
statsReceiver.counter("foo", "bar", "baz")
over this
statsReceiver.scope("foo").scope("bar").counter("baz")
It’s important to note that while this optimization is appealing like an easy win, it shouldn’t
be universally applied. There are many legitimate use-cases when scoping a StatsReceiver
is very reasonable thing to do, and otherwise, would require an alternative channel for the scope
to be propagated between components, presumably introducing the overhead somewhere else.
Testing code that use StatsReceivers¶
If your tests do not need to verify the value of stats, you should use a NullStatsReceiver
which provides a no-op implementation. If your tests need to verify the value of stats, you should
use an InMemoryStatsReceiver which provides ReadableCounters
and ReadableStats
that
enable simpler testing.
Usage from Java¶
There are Java-friendly mechanisms in the StatsReceivers
object (note the trailing s) for
creating counters, gauges and stats. In addition JStats
is available for measuring latency.
Thread-safety¶
It is expected that implementations of StatsReceivers
and their associated counters/gauges/stats
themselves are thread-safe and safe to use across threads.
The caveat is that because Gauges
run a function when they are read, the code you provide as the
function must also be thread-safe.
Leveraging Verbosity Levels¶
Introducing a new application metric (i.e., a histogram or a counter) is always a trade-off between its operational value and its cost within observability. Verbosity levels for StatsReceivers are aiming to reduce the observability cost of Finagle-based services by allowing for a granular control over which metrics are being exported in the application’s steady state.
Similar to log levels, verbosity levels provide means to mark a given metric as “debug” and potentially (assuming supported in implementation) prevent it from being exported under standard operations.
Limiting the number of exported metrics via verbosity levels can reduce applications’ operational cost. However taking this to extremes may drastically affect operability of your service. We recommend using your judgment to make sure denylisting a given metric will not reduce a process’ visibility.
Custom Histogram Percentiles¶
If you need a custom histogram percentile, you can use the MetricBuilder interface, and populate it via MetricBuilder#withPercentiles.
import com.twitter.finagle.stats.{DefaultStatsReceiver, StatsReceiver, MetricBuilder}
val sr: StatsReceiver = DefaultStatsReceiver
val mb: MetricBuilder = sr.metricBuilder().withPercentiles(Seq(0.99, 0.999, 0.9999, 0.90, 0.6))
val stat = mb.histogram("my", "cool", "histo")
Access needed to a StatsReceiver in an inconvenient place¶
Ideally classes would be passed a properly scoped StatsReceiver
in their constructor but this
isn’t always simple or feasible. This may be due to various reasons such as legacy code, code in a
static initializer or a Scala object. In these cases, if you are depending on finagle-core, you
should consider using one of DefaultStatsReceiver
, ClientStatsReceiver
or
ServerStatsReceiver
. These are initialized via Finagle’s LoadService
mechanism.
Useful StatsReceivers¶
There a few StatsReceiver
s which work across implementations that
developers may find useful.
InMemoryStatsReceiver useful for unit testing.
NullStatsReceiver for when you do not care about all metrics.
DenylistStatsReceiver programmatically decide which metrics to ignore.
BroadcastStatsReceiver allows for sending metrics to two or more
StatsReceiver
s.
Viewing per-node metrics¶
This is possible, however the mechanism varies depending on which “application” framework you are using.
Via TwitterServer/finagle-stats — the HTTP admin interface responds with json at
/admin/metrics.json
and there is a web UI for watching them in real-time at /admin/metrics
.