Exactly the same as merge.
Exactly the same as merge. Here by analogy with the scala.collections API
Ensure this is scheduled, but return something equivalent to the argument
like the function par
in Haskell.
Ensure this is scheduled, but return something equivalent to the argument
like the function par
in Haskell.
This can be used to combine two independent Producers in a way that ensures
that the Platform will plan both into a single Plan.
Prefer to flatMap for transforming a subset of items like optionMap but convenient with case syntax in scala prod.
Prefer to flatMap for transforming a subset of items like optionMap but convenient with case syntax in scala prod.collect { case x if fn(x) => g(x) }
Builds a new KeyedProvider by applying a partial function to keys of elements of this one on which the function is defined.
Builds a new KeyedProvider by applying a partial function to keys of elements of this one on which the function is defined.
Builds a new KeyedProvider by applying a partial function to values of elements of this one on which the function is defined.
Builds a new KeyedProvider by applying a partial function to values of elements of this one on which the function is defined.
Merge a different type of Producer into a single stream
Merge a different type of Producer into a single stream
Keep only the items that satisfy the fn
Keep only the items that satisfy the fn
Prefer this to filter or flatMap/flatMapKeys if you are filtering.
Prefer this to filter or flatMap/flatMapKeys if you are filtering. This may be optimized in the future with an intrinsic node in the Producer graph. We know this never increases the number of items, and we know it does not rekey the partition.
Prefer this to filter or flatMap/flatMapValues if you are filtering.
Prefer this to filter or flatMap/flatMapValues if you are filtering. This may be optimized in the future with an intrinsic node in the Producer graph. We know this never increases the number of items, and we know it does not rekey the partition.
Only use this function if you may return more than 1 item sometimes.
Only use this function if you may return more than 1 item sometimes. otherwise use collect or optionMap, which can be pushed up the graph
Prefer to call this method to flatMap if you are expanding only keys.
Prefer to call this method to flatMap if you are expanding only keys. It may trigger optimizations, that can significantly improve performance
Prefer this to a raw map as this may be optimized to avoid a key reshuffle
Prefer this to a raw map as this may be optimized to avoid a key reshuffle
Return just the keys
Return just the keys
Do a windowed join on a stream.
Do a windowed join on a stream. You need to provide a sink that manages the buffer. Offline, this might be a bounded HDFS partition. Online it might be a cache that evicts after a period of time.
Do a lookup/join on a service.
Do a lookup/join on a service. This is how you trigger async computation is summingbird. Any remote API call, DB lookup, etc... happens here
This is identical to a certain leftJoin: map((_, ())).
This is identical to a certain leftJoin: map((_, ())).leftJoin(srv).mapValues{case (_, v) => v} Useful when you are looking up values from say a stream of inputs, such as IDs.
Map each item to a new value
Map each item to a new value
Prefer to call this method to flatMap/map if you are mapping only keys.
Prefer to call this method to flatMap/map if you are mapping only keys. It may trigger optimizations, that can significantly improve performance
Prefer this to a raw map as this may be optimized to avoid a key reshuffle
Prefer this to a raw map as this may be optimized to avoid a key reshuffle
Combine the output into one Producer
Combine the output into one Producer
Naming a node is so that you may give Options for that node that may change the run-time performance of the job (parameter tuning, etc.
Naming a node is so that you may give Options for that node that may change the run-time performance of the job (parameter tuning, etc...)
Prefer this or collect to flatMap if you are always emitting 0 or 1 items
Prefer this or collect to flatMap if you are always emitting 0 or 1 items
emits a KeyedProducer with a value that is the store value, just BEFORE a merge, and the right is a new delta (which may include, depending on the Platform, Store and Options, more than a single aggregated item).
emits a KeyedProducer with a value that is the store value, just BEFORE a merge, and the right is a new delta (which may include, depending on the Platform, Store and Options, more than a single aggregated item).
so, the sequence out of this has the property that: (v0, vdelta1), (v0 + vdelta1, vdelta2), (v0 + vdelta1 + vdelta2, vdelta3), ...
Exchange values for keys
Exchange values for keys
Keep only the values
Keep only the values
Cause some side effect on the sink, but pass through the values so they can be consumed downstream
Cause some side effect on the sink, but pass through the values so they can be consumed downstream