Names and Naming in Finagle
Finagle uses names [1] to identify network locations. Names must be supplied when constructing a Finagle client through ClientBuilder.dest or through implementations of Client.
Names are represented by the data type Name, which comprises two variants:
case class Name.Bound(va: Var[Addr])
Identifies a set of network locations. Var[Addr] (see Addr) represents a changeable set of Internet addresses (host, port pairs), described below.
case class Name.Path(path: Path)
Represents a name denoted by a hierarchical path, represented by a sequence of byte strings.
Resolver.eval parses strings into Names. Strings of the form
scheme!arg
use the given scheme to interpret arg; they always evaluate to instances of Name.Bound. For example,
inet!twitter.com:80
uses the inet resolver to interpret the address twitter.com:80. The inet resolver, predictably, uses DNS for this job.
The name
zk!myzkhost.mycompany.com:2181!/my/zk/path
reads a serverset at the path /my/zk/path on the ZooKeeper ensemble named by myzkhost.mycompany.com:2181.
When a scheme is absent, inet is used; thus
twitter.com:8080
is equivalent to
inet!twitter.com:8080
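The splitting rule described above can be illustrated with a small standalone model. This is only a sketch, not Finagle's parser: NameStringModel, Parsed, and parse are hypothetical names, and splitting at the first ! is an assumption about how the scheme is separated from its argument. The model resolves nothing; in Finagle itself, Resolver.eval performs the interpretation.

```scala
object NameStringModel {
  // Hypothetical stand-in for a parsed "scheme!arg" string.
  final case class Parsed(scheme: String, arg: String)

  // Split at the first '!'; when no scheme is present, assume inet.
  def parse(s: String): Parsed = s.indexOf('!') match {
    case -1 => Parsed("inet", s)
    case i  => Parsed(s.substring(0, i), s.substring(i + 1))
  }
}
```

Under this model, parse("twitter.com:8080") and parse("inet!twitter.com:8080") yield the same value, mirroring the equivalence stated above.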
Paths
Names beginning with the character / are hierarchical paths in the tradition of Unix. They denote an abstract location, i.e. what you want. Path names must be bound by the current namespace in order to identify where the network locations are.
The name
/s/crawler
might denote the crawler service.
Interpreting Paths With Delegation Tables
A delegation table (or “dtab”) defines the namespace for a Finagle transaction. A dtab comprises an ordered list of delegations which together define how a path is interpreted. A delegation is a rewrite rule src => dest. When a name has src as a prefix, the prefix is replaced with dest; otherwise the rule does not apply.
Delegations have the concrete syntax
src => dest
where src and dest are paths. As an example, the delegation
/s => /s#/foo/bar
rewrites the path
/s/crawler
to
/s#/foo/bar/crawler
Note that prefixes match on path components, not characters; e.g. /s is a prefix of /s/crawler, but not of /s#/foo/bar/crawler.
Furthermore, prefixes may contain the wildcard character * to match any component. For example
/s#/*/bar => /t/bah
rewrites the paths
/s#/foo/bar/baz
or
/s#/boo/bar/baz
to
/t/bah/baz
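The prefix rule, including the * wildcard, can be sketched as a standalone model. This is an illustration under stated assumptions rather than Finagle's implementation: DelegationModel and its functions are hypothetical names, and paths are modelled simply as vectors of components.

```scala
object DelegationModel {
  type Path = Vector[String]

  // Split "/a/b/c" into its components.
  def read(s: String): Path = s.split('/').filter(_.nonEmpty).toVector

  def show(p: Path): String = p.mkString("/", "/", "")

  // A prefix matches component by component; "*" matches any single component.
  def matches(prefix: Path, path: Path): Boolean =
    prefix.length <= path.length &&
      prefix.zip(path).forall { case (a, b) => a == "*" || a == b }

  // Apply the delegation `src => dest` to `name`, if src is a prefix of name.
  def rewrite(src: String, dest: String, name: String): Option[String] = {
    val (s, d, n) = (read(src), read(dest), read(name))
    if (matches(s, n)) Some(show(d ++ n.drop(s.length))) else None
  }
}
```

For instance, rewrite("/s", "/s#/foo/bar", "/s/crawler") produces the rewritten path from the example above, while applying the same delegation to /s#/foo/bar/crawler yields None because /s and /s# are different components.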
Paths beginning with /$/ are called “system paths.” They are interpreted specially by Finagle, similarly to resolver schemes. Paths of the form
/$/namer/path..
use the given Namer to interpret the remaining path. This allows Finagle to translate paths into addresses. For example
/$/inet/localhost/8080
is bound by Finagle to the Internet address localhost:8080. Similarly,
/$/com.twitter.serverset/zk.local.twitter.com:2181/foo/bar
is the path describing the serverset /foo/bar on the ZooKeeper ensemble zk.local.twitter.com:2181.
Dtabs may contain line-oriented comments beginning with #. The # must be preceded by a whitespace character or a delimiter such as ;, |, or &. For example, this Dtab with commentary:
# delegation for /s
/s => /a # prefer /a
| ( /b # or share traffic between /b and /c
& /c
);
is equivalent to this Dtab without commentary:
/s => /a | (/b & /c);
We use dtabs to define how logical names (e.g. /s/crawler) translate into addresses. Because rewriting is abstracted away, we can adapt a Finagle process to its environment by manipulating its dtab. For example, this allows us to define /s/crawler to mean one set of hosts when a process is running in a production setting, and another set of hosts when developing or testing. A more complete example follows.
With the dtab
/zk# => /$/com.twitter.serverset;
/zk => /zk#;
/s## => /zk/zk.local.twitter.com:2181;
/s# => /s##/prod;
/s => /s#;
the path
/s/crawler
is rewritten thus:
1. /s/crawler
2. /s#/crawler
3. /s##/prod/crawler
4. /zk/zk.local.twitter.com:2181/prod/crawler
5. /zk#/zk.local.twitter.com:2181/prod/crawler
6. /$/com.twitter.serverset/zk.local.twitter.com:2181/prod/crawler
We’ve turned the path /s/crawler into the serverset /prod/crawler on zk.local.twitter.com:2181. We use the # character to denote handlers: the path /s# “handles” /s and so on. To see why this indirection is necessary, consider redefining /s by adding a prefix, a common namespacing operation. The entry
/s => /s/prefix
would recurse; for example, the name /s/crawler would be rewritten
/s/crawler
/s/prefix/crawler
/s/prefix/prefix/crawler
...
and so on. With /s#, we’d instead add
/s => /s#/prefix
to get the desired effect
/s/crawler
/s#/prefix/crawler
...
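The rewrite sequences shown in this section can be reproduced with a small standalone model. It is a sketch, not Finagle's algorithm: RewriteTrace and its functions are hypothetical names, wildcards and backtracking are left out, and the maxSteps bound is an assumption added so that a recursive entry such as /s => /s/prefix terminates.

```scala
object RewriteTrace {
  type Path = Vector[String]

  def read(s: String): Path = s.split('/').filter(_.nonEmpty).toVector
  def show(p: Path): String = p.mkString("/", "/", "")

  // Build a model dtab from (src, dest) string pairs, keeping their order.
  def dtab(entries: (String, String)*): Vector[(Path, Path)] =
    entries.toVector.map { case (src, dst) => (read(src), read(dst)) }

  // Repeatedly apply the last matching delegation until a system path
  // (/$/...) is reached, nothing matches, or maxSteps rewrites were made.
  def trace(d: Vector[(Path, Path)], name: String, maxSteps: Int = 20): List[String] = {
    @annotation.tailrec
    def loop(path: Path, acc: List[String], steps: Int): List[String] = {
      val next =
        if (path.headOption.contains("$") || steps >= maxSteps) None
        else d.reverse.collectFirst {
          case (src, dst) if path.startsWith(src) => dst ++ path.drop(src.length)
        }
      next match {
        case Some(p) => loop(p, show(p) :: acc, steps + 1)
        case None    => acc.reverse
      }
    }
    loop(read(name), List(name), 0)
  }

  // The example dtab from the text.
  val example: Vector[(Path, Path)] = dtab(
    "/zk#" -> "/$/com.twitter.serverset",
    "/zk"  -> "/zk#",
    "/s##" -> "/zk/zk.local.twitter.com:2181",
    "/s#"  -> "/s##/prod",
    "/s"   -> "/s#"
  )
}
```

Calling trace(example, "/s/crawler") walks the six steps listed earlier, and tracing the single entry /s => /s/prefix keeps growing the path until the step bound is hit, which is the recursion that the /s# handler avoids.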
We can easily manipulate our Dtab to affect certain parts of the resolution. For example, if we wanted to use staging instances of services instead of their production ones, we’d append the delegation /s# => /s##/staging, making the Dtab
/zk# => /$/com.twitter.serverset; (a)
/zk => /zk#; (b)
/s## => /zk/zk.local.twitter.com:2181; (c)
/s# => /s##/prod; (d)
/s => /s#; (e)
/s# => /s##/staging; (f)
The path /s/crawler would then be rewritten as follows. Each step is labelled with the rule applied from the above Dtab.
/s/crawler
(e) /s#/crawler
(f) /s##/staging/crawler
(c) /zk/zk.local.twitter.com:2181/staging/crawler
(b) /zk#/zk.local.twitter.com:2181/staging/crawler
(a) /$/com.twitter.serverset/zk.local.twitter.com:2181/staging/crawler
Simply adding a new delegation is sufficient. Later entries are attempted before earlier ones; if a rewrite rooted at a delegation fails to produce an address, rewriting resumes from the next matching delegation.
The combined effect is a fallback mechanism: if the crawler exists in the staging environment, it is used; otherwise we fall back to its production definition.
In the above example, if /staging/crawler did not exist on zk.local.twitter.com:2181, the search would backtrack from (a), producing the following sequence of rewrites:
/s/crawler
(e) /s#/crawler
(f) /s##/staging/crawler
(c) /zk/zk.local.twitter.com:2181/staging/crawler
(b) /zk#/zk.local.twitter.com:2181/staging/crawler
(a) /$/com.twitter.serverset/zk.local.twitter.com:2181/staging/crawler
(d) /s##/prod/crawler
(c) /zk/zk.local.twitter.com:2181/prod/crawler
(b) /zk#/zk.local.twitter.com:2181/prod/crawler
(a) /$/com.twitter.serverset/zk.local.twitter.com:2181/prod/crawler
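The fallback behaviour can be sketched as a depth-first search over the dtab. This is a simplified model rather than Finagle's implementation: BacktrackModel is a hypothetical name, the exists function is a hypothetical oracle standing in for the real lookup of a system path, and the depth limit is an assumption added to guard against recursive dtabs.

```scala
object BacktrackModel {
  type Path = Vector[String]

  def read(s: String): Path = s.split('/').filter(_.nonEmpty).toVector
  def show(p: Path): String = p.mkString("/", "/", "")

  // The dtab (a)-(f) from the text, in order.
  val dtab: Vector[(Path, Path)] = Vector(
    "/zk#" -> "/$/com.twitter.serverset",
    "/zk"  -> "/zk#",
    "/s##" -> "/zk/zk.local.twitter.com:2181",
    "/s#"  -> "/s##/prod",
    "/s"   -> "/s#",
    "/s#"  -> "/s##/staging"
  ).map { case (src, dst) => (read(src), read(dst)) }

  // Returns the first system path for which `exists` holds, trying later
  // delegations before earlier ones and backtracking on failure.
  def resolve(path: Path, exists: Path => Boolean, depth: Int = 0): Option[Path] =
    if (path.headOption.contains("$")) Some(path).filter(exists)
    else if (depth >= 32) None // assumed guard against recursive dtabs
    else
      dtab.reverseIterator
        .collect { case (src, dst) if path.startsWith(src) => dst ++ path.drop(src.length) }
        .map(resolve(_, exists, depth + 1))
        .collectFirst { case Some(bound) => bound }
}
```

With an oracle that accepts only paths ending in /prod/crawler, resolving /s/crawler first fails along the staging branch and then succeeds through rule (d), matching the sequence above.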
We now see that delegations provide a simple and flexible means by which to define a namespace. Their effect is similar to that of a Unix mount table: names stand on their own, but the minutiae of binding are handled by the environment, i.e. the dtab.
Delegations are passed between servers if a supported protocol is used. Thus a server alters the interpretation of names in the context of the entire request graph, allowing a server to affect downstream behavior for the current transaction. As an example, a developer might want to replace an individual component in a distributed system with a development version of that component. This can be done by having the originator (for example, an HTTP frontend) add a delegation expressing this override.
Finagle has protocol support for delegation passing in TTwitter, Mux, its variant ThriftMux, and HTTP. When these protocols are used, delegations that are added dynamically to a request can be in effect throughout the distributed request graph, i.e. the scope of the namespace is a transaction. Delegations are added dynamically through the Dtab API.
(This is a powerful facility that should be used with care.)
The Dtab API
Delegations can be added or overridden dynamically through the Dtab API – specifically by way of scoped delegation tables Dtab.local and Dtab.limited.
The local delegation is defined as the “per-request,” propagated scope. It’s ideal for overrides you’d like to apply to the entire request graph, as it applies to downstream services.
The limited delegation is the “per-request,” non-propagated scope. Unlike Dtab.local, Dtab.limited applies only to the current request and does not affect the rest of the call graph. Furthermore, when a Dtab.limited conflicts with a Dtab.local, only the Dtab.local is respected.
The higher granularity of a limited delegation allows for more fine-grained control over fallback and failover behavior. This is ideal for large request graphs in which only some endpoints need to be rerouted. As more endpoints begin to fail, the local granularity becomes a more useful “broad strokes” approach to rerouting.
To demonstrate this further, consider the following example service graph: ServiceA -> ServiceB -> ServiceC
A Dtab.local set on a request to ServiceA will also be reflected in ServiceB and ServiceC. A Dtab.limited set on a request to ServiceA, however, will only exist for the scope of the request in ServiceA. The state of the Dtab.limited will not be visible to ServiceB or ServiceC.
Let’s consider a few more examples:
1. ServiceA sets a Dtab.local for ServiceC for a request, rerouting it to ServiceD. ServiceB will, accordingly, propagate the Dtab.local and re-route requests to ServiceD.
2. ServiceA sets a Dtab.limited for ServiceC for a request, rerouting it to ServiceD. This has no effect because ServiceA does not call ServiceC directly and the limited state is not propagated to ServiceB.
3. ServiceA sets a Dtab.local for ServiceC, rerouting it to ServiceD. ServiceA also sets a Dtab.limited for ServiceC, rerouting it to ServiceE. The behavior of example 1 will be seen again.
4. ServiceA sets a Dtab.limited for ServiceB for a request, rerouting it to ServiceD. The request to ServiceB will be rerouted, but the state will not be propagated.
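As a rough sketch of how the two scopes might be set in code, the following assumes finagle-core is on the classpath and that Dtab.limited offers the same getter and setter shape as Dtab.local. DtabScopes and its method names are hypothetical, and the override reuses the /s# => /s##/staging delegation from the earlier example. Dtab.unwind is used so that the change is scoped to the enclosed block.

```scala
import com.twitter.finagle.Dtab

object DtabScopes {
  // Delegation reused from the staging example earlier in this document.
  val stagingOverride: Dtab = Dtab.read("/s# => /s##/staging")

  // Propagated scope: intended to apply to the whole downstream request graph.
  def withPropagatedOverride[T](body: => T): T =
    Dtab.unwind {
      Dtab.local = Dtab.local ++ stagingOverride
      body
    }

  // Non-propagated scope: intended to apply only to calls made by this process.
  // Assumes Dtab.limited mirrors the accessor shape of Dtab.local.
  def withLimitedOverride[T](body: => T): T =
    Dtab.unwind {
      Dtab.limited = Dtab.limited ++ stagingOverride
      body
    }
}
```

A client call made inside withPropagatedOverride would correspond to example 1 above, and one made inside withLimitedOverride to examples 2 and 4.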
Addr
Name.Bound comprises a Var[Addr], representing a dynamically changing Addr. (Var implements a form of self-adjusting computation.) Addrs are in one of four states:
Addr.Pending
The binding is still pending, perhaps because we are awaiting a DNS answer or the completion of a ZooKeeper operation.
Addr.Neg
The binding was negative, meaning that the destination does not exist.
Addr.Failed(cause: Throwable)
The binding failed with the given cause.
Addr.Bound(addrs: Set[Address])
The binding succeeded with the given set of addresses, representing concrete endpoints.
We now see that a Var[Addr] is capable of representing a moving target, for example a dynamic serverset.
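The four states lend themselves to exhaustive pattern matching. The sketch below uses stand-in types that only mirror the shapes described above; AddrModel, its Address case class, and describe are hypothetical and are not Finagle's own classes, whose exact fields may differ.

```scala
object AddrModel {
  // Stand-in for a concrete endpoint.
  final case class Address(host: String, port: Int)

  // Stand-ins for the four Addr states described in the text.
  sealed trait Addr
  case object Pending extends Addr
  case object Neg extends Addr
  final case class Failed(cause: Throwable) extends Addr
  final case class Bound(addrs: Set[Address]) extends Addr

  // Because Addr is sealed, the compiler can check that every state is handled.
  def describe(addr: Addr): String = addr match {
    case Pending      => "pending"
    case Neg          => "destination does not exist"
    case Failed(e)    => s"failed: ${e.getMessage}"
    case Bound(addrs) => s"bound to ${addrs.size} endpoint(s)"
  }
}
```

A consumer observing a changing Addr would apply a function like describe to each new value as the underlying serverset or DNS answer changes.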
Footnotes
[1] A name identifies what you want; an address is a location, identifying where an object resides. Binding is the process that turns names into addresses.