Migration from Ostrich
Ostrich is a library used to maintain and export statistics and track services. It is obsoleted by TwitterServer.
Commons Metrics replaces Ostrich’s stats library.
Note
Ostrich stats are still present on /stats if finagle-ostrich4 is in your runtime classpath.
The admin HTTP endpoint /admin/stats.json supports an Ostrich compatibility mode, enabled with two flags:
-com.twitter.finagle.stats.useCounterDeltas=true
-com.twitter.finagle.stats.format=ostrich
If both flags are set, HTTP requests to /admin/stats.json with the period=60 query string parameter replicate Ostrich’s behavior: counter deltas are computed every minute and histograms are formatted with the same labels Ostrich uses.
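To make "deltas on counters" concrete, here is a minimal Scala sketch of the idea: instead of reporting a cumulative total, each counter reports how much it changed since the previous snapshot. This only illustrates the concept and is not the library's implementation; `CounterDeltas` and `deltas` are hypothetical names, and treating a counter that is missing from the previous snapshot as starting at 0 is an assumption.

```scala
object CounterDeltas {
  /** Per-counter change between two cumulative snapshots.
    * Assumption: a counter absent from `previous` started at 0.
    */
  def deltas(previous: Map[String, Long], current: Map[String, Long]): Map[String, Long] =
    current.map { case (name, value) =>
      name -> (value - previous.getOrElse(name, 0L))
    }
}
```

For example, if "finagle/closes" was 500 one minute ago and is 576 now, `CounterDeltas.deltas(Map("finagle/closes" -> 500L), Map("finagle/closes" -> 576L))` reports 76 for that counter rather than 576.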
Stats format
Ostrich:
{
  "counters": {
    "finagle/closes": 576
  },
  "gauges": {
    "finagle/connections": 2,
    "finagle/http/failfast/unhealthy_for_ms": 0,
    "finagle/http/failfast/unhealthy_num_tries": 0
  },
  "labels": {},
  "metrics": {
    "finagle/connection_duration": {
      "average": 1076,
      "count": 594,
      "maximum": 315467,
      "minimum": 3,
      "p50": 32,
      "p90": 116,
      "p95": 116,
      "p99": 386,
      "p999": 315467,
      "p9999": 315467,
      "sum": 639346
    }
  }
}
Commons Metrics:
{
"finagle/closes": 575,
"finagle/connection_duration.avg": 561,
"finagle/connection_duration.count": 592,
"finagle/connection_duration.max": 299986,
"finagle/connection_duration.min": 3,
"finagle/connection_duration.p25": 29,
"finagle/connection_duration.p50": 31,
"finagle/connection_duration.p75": 58,
"finagle/connection_duration.p90": 111,
"finagle/connection_duration.p95": 120,
"finagle/connection_duration.p99": 197,
"finagle/connection_duration.p9990": 2038,
"finagle/connection_duration.p9999": 2038,
"finagle/connection_duration.sum": 332690,
"finagle/connections": 2,
"finagle/http/failfast/unhealthy_for_ms": 0,
"finagle/http/failfast/unhealthy_num_tries": 0,
"finagle/success": 0
...
}
With -com.twitter.finagle.stats.format=ostrich:
{
"finagle/closes": 575,
"finagle/connection_duration.average": 561,
"finagle/connection_duration.count": 592,
"finagle/connection_duration.maximum": 299986,
"finagle/connection_duration.minimum": 3,
"finagle/connection_duration.p25": 29,
"finagle/connection_duration.p50": 31,
"finagle/connection_duration.p75": 58,
"finagle/connection_duration.p90": 111,
"finagle/connection_duration.p95": 120,
"finagle/connection_duration.p99": 197,
"finagle/connection_duration.p999": 2038,
"finagle/connection_duration.p9999": 2038,
"finagle/connection_duration.sum": 332690,
"finagle/connections": 2,
"finagle/http/failfast/unhealthy_for_ms": 0,
"finagle/http/failfast/unhealthy_num_tries": 0,
"finagle/success": 0
...
}
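Comparing the two flat outputs above, the only keys that differ are the histogram suffixes: avg, max, min and p9990 become average, maximum, minimum and p999. If a dashboard or collector needs to translate key names between the two formats, the relabeling can be sketched as below. This is an illustrative sketch based only on the sample output shown here, not the library's own code; `OstrichLabels` and `toOstrichKey` are hypothetical names.

```scala
object OstrichLabels {
  // Histogram suffixes that differ between the two sample outputs above.
  // Assumption: every other suffix is identical in both formats.
  private val renamed: Map[String, String] = Map(
    "avg"   -> "average",
    "max"   -> "maximum",
    "min"   -> "minimum",
    "p9990" -> "p999"
  )

  /** Rewrites the suffix after the last '.' when it has an Ostrich label. */
  def toOstrichKey(key: String): String = {
    val dot = key.lastIndexOf('.')
    if (dot < 0) {
      key // counters and gauges have no histogram suffix
    } else {
      renamed.get(key.substring(dot + 1)) match {
        case Some(label) => key.substring(0, dot + 1) + label
        case None        => key
      }
    }
  }
}
```

With this sketch, "finagle/connection_duration.avg" maps to "finagle/connection_duration.average", while "finagle/closes" and "finagle/connection_duration.p50" are returned unchanged. A counter or gauge whose own name happens to end in one of these suffixes after a dot would also be rewritten, so a real tool would need to know which keys are histograms.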
Note
Stats exported by Ostrich are still computed by the Ostrich library. This is not a format conversion but a genuine dual collection and export of stats. You can compare the two sets of stats, but note that histogram values may differ because Commons Metrics is far more precise than Ostrich.
For example, here is the difference between the two libraries for 10k random numbers between 1 and 10,000:
scala> "real p50:%d ostrich:%d metrics:%d".format(p50, op50, mp50)
res7: String = real p50:5066 ostrich:5210 metrics:5066
scala> "real p90:%d ostrich:%d metrics:%d".format(p90, op90, mp90)
res8: String = real p90:9072 ostrich:9498 metrics:9072
scala> "real p99:%d ostrich:%d metrics:%d".format(p99, op99, mp99)
res9: String = real p99:9911 ostrich:9498 metrics:9910
You can run this code on your machine to see for yourself.
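The "real" percentiles in a comparison like this can be computed exactly by sorting the sample. The Scala sketch below shows one way to do that using the nearest-rank definition. It is not the script that produced the figures above, it does not exercise Ostrich or Commons Metrics, and `ExactPercentile` and `percentile` are hypothetical names.

```scala
object ExactPercentile {
  /** Nearest-rank percentile of a non-empty sample.
    * `perTenThousand` expresses the percentile in basis points
    * (5000 = p50, 9000 = p90, 9900 = p99), which avoids floating-point rounding.
    */
  def percentile(values: Seq[Int], perTenThousand: Int): Int = {
    require(values.nonEmpty, "values must not be empty")
    require(perTenThousand > 0 && perTenThousand <= 10000,
      "perTenThousand must be in (0, 10000]")
    val sorted = values.sorted
    // Integer ceiling of n * p / 10000 gives the 1-based nearest rank.
    val rank = (sorted.length.toLong * perTenThousand + 9999L) / 10000L
    sorted((rank - 1).toInt)
  }
}
```

A sample like the one described could be generated with `Seq.fill(10000)(scala.util.Random.nextInt(10000) + 1)` and passed to `ExactPercentile.percentile(sample, 5000)` to obtain a reference p50 against which a library's histogram estimate can be compared.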
Step-by-step guide
Convert your code to TwitterServer
Your server will run as before and expose stats through Ostrich’s /stats endpoint.
Update your dashboard
Some stats generated by Ostrich will have changed names. For example, JVM stats in Ostrich are of the form “jvm_gc_ParNew_msec”; with TwitterServer they become “jvm/gc/ParNew/msec”.
Update your collecting system to collect stats from the new URL.
Disable the Ostrich stats
Exclude the finagle-ostrich4 dependency from your runtime classpath.
Enable the Commons Metrics stats
Add the finagle-stats dependency to your classpath. This can be done before removing Ostrich stats in order to ensure that your new alerts and dashboards are correct.