User Guide¶

Choosing the stats implementation to use for Finagle¶

Finagle uses util-stats heavily throughout and it is important for a well-run service to have this wired up properly. Finagle uses LoadService in order to discover implementations on the classpath. The standard implementation comes via finagle-stats.

If multiple implementations are found, metrics are reported to all of them via a BroadcastStatsReceiver. This is useful when transitioning from one implementation to another. Once you are done migrating, it is wise to validate that only a single implementation is loaded via your dependencies, since every extra implementation adds CPU overhead and memory pressure. This also implies that library developers should depend only on util-stats and not on a specific implementation.

If no implementations are found, metrics are reported to a NullStatsReceiver which is essentially /dev/null.

Measuring the latency of operations¶

A common need for clients is to capture the latency distribution, often of asynchronous calls such as those returning Futures. This can be done easily via the Stat.time and Stat.timeFuture methods.
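As a minimal sketch of both methods, the snippet below times one synchronous and one asynchronous call into the same Stat. The metric name and the lookupSync and lookupAsync functions are hypothetical stand-ins for real work, and an InMemoryStatsReceiver is used only so the example is self-contained.

```scala
import com.twitter.finagle.stats.{InMemoryStatsReceiver, Stat}
import com.twitter.util.{Await, Future}

val statsReceiver = new InMemoryStatsReceiver

// Create the Stat once, up front.
val lookupLatency: Stat = statsReceiver.stat("lookup_latency_ms")

// Hypothetical operations standing in for real work.
def lookupSync(key: String): Int = key.length
def lookupAsync(key: String): Future[Int] = Future.value(key.length)

// Stat.time records how long the block takes to run
// (in milliseconds unless another unit is passed).
val syncResult: Int = Stat.time(lookupLatency) { lookupSync("abc") }

// Stat.timeFuture records the time until the returned Future is satisfied.
val asyncResult: Int =
  Await.result(Stat.timeFuture(lookupLatency) { lookupAsync("abcd") })
```

Both calls return the result of the timed code unchanged, so they can usually be wrapped around existing code without restructuring it.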

Prefer storing counters and stats in member variables¶

While it may be tempting to create your metrics inline, whenever possible you should initialize these up front, typically in a class constructor. This is because while capturing usage via Stat.add and Counter.incr is optimized for performance, the construction of a Stat and Counter is not.

class CountingExample(statsReceiver: StatsReceiver) {
  // create the Counter once and store it in a member
  private val numSheep = statsReceiver.counter("sheep")

  // incr() called on the member variable many times
  // in the application's lifecycle.
  def bedtime(n: Int): Unit = {
    numSheep.incr(n)
    // nifty business logic here
  }
}

Note that sometimes this is not possible due to business logic, for example, when a counter is named based on the current context. In that case, take care to ensure that the cardinality of the metrics created is bounded. Otherwise, for example if you include application ids in your metric names, you run the risk of a memory leak.
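One way to keep cardinality bounded when names depend on context is to map every value onto a small, fixed set of names. The sketch below is an illustration only: the OutcomeCounters class, the outcome names and the "other" bucket are all hypothetical.

```scala
import com.twitter.finagle.stats.{Counter, InMemoryStatsReceiver, StatsReceiver}

// Hypothetical example: the set of outcomes is small and fixed, so the
// number of counters stays bounded no matter how many requests arrive.
class OutcomeCounters(statsReceiver: StatsReceiver) {
  private val outcomes = Seq("success", "failure", "timeout")

  // All counters are created once, up front.
  private val byOutcome: Map[String, Counter] =
    outcomes.map(o => o -> statsReceiver.counter("outcome", o)).toMap

  // Anything unexpected is folded into a single "other" counter
  // rather than creating a new metric per distinct value.
  private val other: Counter = statsReceiver.counter("outcome", "other")

  def record(outcome: String): Unit =
    byOutcome.getOrElse(outcome, other).incr()
}

val sr = new InMemoryStatsReceiver
val outcomeCounters = new OutcomeCounters(sr)
outcomeCounters.record("success")
outcomeCounters.record("user-12345") // unbounded value, lands in "other"
```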

Store gauges in member variables¶

While your application code is not likely to refer to the value returned from StatsReceiver.addGauge, you should store it in a member variable. This is due to the common implementation of Gauge being CumulativeGauge, which uses java.lang.ref.WeakReference. If a CumulativeGauge is not held by a strong reference, the gauge is eligible for garbage collection, which would cause your metric to disappear unexpectedly at runtime.

// bad, don't do this:
class BadExample(queue: BlockingQueue[Work], statsReceiver: StatsReceiver) {
  statsReceiver.addGauge("num_waiting") { queue.size }
  // code that may do incredible things but you'd never know
  // the size of the queue.
}

// do this instead:
class GoodExample(queue: BlockingQueue[Work], statsReceiver: StatsReceiver) {
  private val queueDepth =
    statsReceiver.addGauge("num_waiting") { queue.size }
  // code that does incredible things AND you know the
  // size of the queue.
}

Prefer addGauge over provideGauge¶

StatsReceiver offers two similar methods, addGauge and provideGauge, and addGauge should be preferred whenever possible. provideGauge is essentially a call to addGauge along with code that holds a strong reference to the gauge in a global linked list. Recall from the previous note that Gauges need to be held by strong references; as such, you should only rely on provideGauge when you do not have a place to keep a strong reference yourself.
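A sketch of a case where provideGauge may be the pragmatic choice: code with no natural owner for the Gauge reference. The ConnectionTracker object and the metric name are hypothetical.

```scala
import com.twitter.finagle.stats.{InMemoryStatsReceiver, StatsReceiver}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical singleton with nowhere natural to store a Gauge reference.
object ConnectionTracker {
  val open = new AtomicInteger(0)

  def register(statsReceiver: StatsReceiver): Unit =
    // provideGauge returns Unit: the strong reference is kept for you,
    // so the gauge lives for the lifetime of the process.
    statsReceiver.provideGauge("open_connections") { open.get.toFloat }
}

val sr = new InMemoryStatsReceiver
ConnectionTracker.register(sr)
ConnectionTracker.open.set(3)
```

The trade-off is that a gauge registered this way is never released, which is why addGauge with a member variable remains the better default.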

Prefer fully-qualified names over scoping¶

There is a convenient API method (scope) that creates a “scoped” version of a given StatsReceiver. Although it enables effortless code reuse, scoping comes at the cost of extra allocations. Passing the fully-qualified metric name directly to the constructing method yields the same outcome while avoiding the overhead of the intermediate structures.

Put another way, where possible, prefer this

statsReceiver.counter("foo", "bar", "baz")

over this

statsReceiver.scope("foo").scope("bar").counter("baz")

It’s important to note that while this optimization may look like an easy win, it shouldn’t be applied universally. There are many legitimate use cases where scoping a StatsReceiver is a very reasonable thing to do. Avoiding it there would require an alternative channel for propagating the scope between components, presumably introducing the overhead somewhere else.
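As an illustration of a legitimate use of scoping, the sketch below reuses one component under two different scopes. The RetryPolicy class and the scope names are hypothetical; the component does not need to know where it sits in the metrics hierarchy.

```scala
import com.twitter.finagle.stats.{InMemoryStatsReceiver, StatsReceiver}

// Hypothetical reusable component: the caller supplies an already-scoped
// receiver, so the component only knows its own metric names.
class RetryPolicy(statsReceiver: StatsReceiver) {
  private val retries = statsReceiver.counter("retries")
  def onRetry(): Unit = retries.incr()
}

val root = new InMemoryStatsReceiver
val forUsers = new RetryPolicy(root.scope("users_client"))
val forOrders = new RetryPolicy(root.scope("orders_client"))

forUsers.onRetry()
forOrders.onRetry()
forOrders.onRetry()
```

Here the scope is applied once per component instance rather than on every metric operation, so the extra allocations are paid only at construction time.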

Testing code that uses StatsReceivers¶

If your tests do not need to verify the value of stats, you should use a NullStatsReceiver which provides a no-op implementation. If your tests need to verify the value of stats, you should use an InMemoryStatsReceiver which provides ReadableCounters and ReadableStats that enable simpler testing.
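A minimal sketch of both approaches, written as plain assertions rather than for any particular test framework. The SheepCounter class and the metric names are hypothetical.

```scala
import com.twitter.finagle.stats.{InMemoryStatsReceiver, NullStatsReceiver, StatsReceiver}

// Hypothetical class under test.
class SheepCounter(statsReceiver: StatsReceiver) {
  private val numSheep = statsReceiver.counter("sheep")
  private val batchSize = statsReceiver.stat("batch_size")

  def bedtime(n: Int): Unit = {
    numSheep.incr(n)
    batchSize.add(n.toFloat)
  }
}

// When the stats are irrelevant to the test, discard them.
new SheepCounter(NullStatsReceiver).bedtime(1)

// When the stats matter, record them in memory and inspect the maps,
// which are keyed by the full metric name as a Seq[String].
val sr = new InMemoryStatsReceiver
val sheepCounter = new SheepCounter(sr)
sheepCounter.bedtime(3)
sheepCounter.bedtime(4)

assert(sr.counters(Seq("sheep")) == 7L)
assert(sr.stats(Seq("batch_size")) == Seq(3.0f, 4.0f))
```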

Usage from Java¶

There are Java-friendly mechanisms in the StatsReceivers object (note the trailing s) for creating counters, gauges and stats. In addition, JStats is available for measuring latency.

Thread-safety¶

Implementations of StatsReceiver, along with their associated counters, gauges and stats, are expected to be thread-safe.

The caveat is that because Gauges run a function when they are read, the code you provide as the function must also be thread-safe.
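One simple way to satisfy this is to have the gauge read from a structure that is itself safe for concurrent access. In the sketch below, the InFlightTracker class and metric name are hypothetical; the gauge function only reads an AtomicInteger, so it is safe to call from whichever thread samples the gauge.

```scala
import com.twitter.finagle.stats.{Gauge, InMemoryStatsReceiver, StatsReceiver}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical tracker of in-flight requests.
class InFlightTracker(statsReceiver: StatsReceiver) {
  private val inFlight = new AtomicInteger(0)

  // Held in a member so the gauge is not garbage collected.
  // The function body is a single atomic read, so it needs no locking.
  private val inFlightGauge: Gauge =
    statsReceiver.addGauge("in_flight") { inFlight.get.toFloat }

  def begin(): Unit = { inFlight.incrementAndGet(); () }
  def end(): Unit = { inFlight.decrementAndGet(); () }
}

val sr = new InMemoryStatsReceiver
val tracker = new InFlightTracker(sr)
tracker.begin()
tracker.begin()
tracker.end()
```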

Leveraging Verbosity Levels¶

Introducing a new application metric (e.g., a histogram or a counter) is always a trade-off between its operational value and its observability cost. Verbosity levels for StatsReceivers aim to reduce the observability cost of Finagle-based services by allowing granular control over which metrics are exported in the application’s steady state.

Similar to log levels, verbosity levels provide a means to mark a given metric as “debug” and, if the implementation supports it, prevent that metric from being exported during standard operation.

Limiting the number of exported metrics via verbosity levels can reduce an application’s operational cost. However, taking this to extremes may drastically affect the operability of your service. We recommend using your judgment to make sure that denylisting a given metric will not reduce visibility into the process.
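As a sketch, a metric can be marked as debug by passing a Verbosity when it is created. The CacheStats class and metric names below are hypothetical, and this assumes the counter overload that takes a Verbosity as its first argument is available in the util-stats version you use.

```scala
import com.twitter.finagle.stats.{InMemoryStatsReceiver, StatsReceiver, Verbosity}

// Hypothetical set of cache metrics.
class CacheStats(statsReceiver: StatsReceiver) {
  // Default verbosity: exported under normal operation.
  val hits = statsReceiver.counter("cache", "hits")

  // Debug verbosity: an implementation that supports verbosity levels
  // may leave this out of the export unless it is explicitly enabled.
  val evictionScans =
    statsReceiver.counter(Verbosity.Debug, "cache", "eviction_scans")
}

val sr = new InMemoryStatsReceiver
val cacheStats = new CacheStats(sr)
cacheStats.hits.incr()
cacheStats.evictionScans.incr()
```

Marking a metric as debug does not change how the application records it; whether it is exported is up to the StatsReceiver implementation and its configuration.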

Custom Histogram Percentiles¶

If you need a custom histogram percentile, you can use the MetricBuilder interface, and populate it via MetricBuilder#withPercentiles.

import com.twitter.finagle.stats.{DefaultStatsReceiver, StatsReceiver, MetricBuilder}

val sr: StatsReceiver = DefaultStatsReceiver
val mb: MetricBuilder = sr.metricBuilder().withPercentiles(Seq(0.99, 0.999, 0.9999, 0.90, 0.6))
val stat = mb.histogram("my", "cool", "histo")

Access needed to a StatsReceiver in an inconvenient place¶

Ideally classes would be passed a properly scoped StatsReceiver in their constructor but this isn’t always simple or feasible. This may be due to various reasons such as legacy code, code in a static initializer or a Scala object. In these cases, if you are depending on finagle-core, you should consider using one of DefaultStatsReceiver, ClientStatsReceiver or ServerStatsReceiver. These are initialized via Finagle’s LoadService mechanism.
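For instance, a Scala object has no constructor through which a StatsReceiver could be passed. The sketch below assumes a dependency on finagle-core; the LegacyParser object, scope and metric name are hypothetical.

```scala
import com.twitter.finagle.stats.{Counter, DefaultStatsReceiver}

// Hypothetical legacy code living in a Scala object.
object LegacyParser {
  // Created once when the object is initialized, and scoped so the
  // metric does not collide with others on the shared receiver.
  private val parseFailures: Counter =
    DefaultStatsReceiver.scope("legacy_parser").counter("parse_failures")

  def parse(raw: String): Option[Int] =
    try Some(raw.trim.toInt)
    catch {
      case _: NumberFormatException =>
        parseFailures.incr()
        None
    }
}

val ok = LegacyParser.parse(" 42 ")
val failed = LegacyParser.parse("nope")
```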

Useful StatsReceivers¶

There are a few StatsReceivers that work across implementations and that developers may find useful.

InMemoryStatsReceiver is useful for unit testing.
NullStatsReceiver is for when you do not care about the metrics at all.
DenylistStatsReceiver lets you programmatically decide which metrics to ignore.
BroadcastStatsReceiver allows for sending metrics to two or more StatsReceivers.
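As a small sketch of how these compose, the snippet below broadcasts a counter to two in-memory receivers. The metric name is hypothetical, and the in-memory receivers stand in for whatever real implementations you would broadcast to.

```scala
import com.twitter.finagle.stats.{BroadcastStatsReceiver, InMemoryStatsReceiver, StatsReceiver}

val first = new InMemoryStatsReceiver
val second = new InMemoryStatsReceiver

// Every metric created through `both` is reported to each underlying receiver.
val both: StatsReceiver = BroadcastStatsReceiver(Seq(first, second))

both.counter("requests").incr()
```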

Viewing per-node metrics¶

This is possible, but the mechanism varies depending on which “application” framework you are using.

Via TwitterServer/finagle-stats: the HTTP admin interface responds with JSON at /admin/metrics.json, and there is a web UI for watching the metrics in real time at /admin/metrics.