The serialized data will be smaller if every class passed between tasks in your job is listed here. Without this, the type is written as a string in each record. Compression probably removes most of that overhead, but compression acts only after the data has been serialized into buffers and spilling has been triggered.
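To make the per-record cost concrete, here is a rough sketch of the arithmetic. It assumes an unregistered class is tagged with its fully qualified name in every record and a registered class with a fixed 4-byte ID. Both the object `TypeTagSizes` and the 4-byte figure are illustrative assumptions, not the serializer's actual wire format.

```scala
import java.nio.charset.StandardCharsets

object TypeTagSizes {
  // Assumption: an unregistered class is tagged with its full name in each record.
  def nameTagBytes(cls: Class[_]): Int =
    cls.getName.getBytes(StandardCharsets.UTF_8).length

  // Assumption: a registered class is tagged with a fixed-size integer ID.
  // Real serializers may use fewer bytes.
  val idTagBytes: Int = 4

  // Total tag overhead, before compression, for a given number of records.
  def overhead(records: Long, tagBytes: Int): Long = records * tagBytes
}
```

Under these assumptions, a million records tagged with `java.lang.String` carry about 16 MB of type names in the uncompressed buffers, against about 4 MB with IDs.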
Configure flow listeners for observability.
Prepend an estimator so it will be tried first. If it returns None, the previously-set estimators will be tried in order.
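The fallback order described above can be sketched in plain Scala. The `Estimator` type alias, `prepend`, and `estimate` are hypothetical names chosen for this illustration and are not the library's API. The sketch only shows the "first non-None result wins" behaviour.

```scala
object EstimatorChain {
  // Hypothetical estimator: given an input size in bytes, optionally suggest a reducer count.
  type Estimator = Long => Option[Int]

  // Prepending puts the new estimator at the head, so it is consulted first.
  def prepend(first: Estimator, existing: List[Estimator]): List[Estimator] =
    first :: existing

  // Try each estimator in order and stop at the first one that returns a value.
  def estimate(estimators: List[Estimator], inputBytes: Long): Option[Int] =
    estimators.iterator.map(e => e(inputBytes)).collectFirst { case Some(n) => n }
}
```

Because the iterator is lazy, estimators after the first successful one are never invoked.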
Allocate a new UniqueID if there is not one present
Returns None if not set; otherwise the class is loaded by reflection using Class.forName.
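A minimal sketch of this lookup pattern, assuming the configuration is a plain Map[String, String]. The object `ClassLookup`, the method `classFor`, and its `key` parameter are hypothetical names for illustration only.

```scala
object ClassLookup {
  // Returns None when the key is absent; otherwise loads the named class by reflection.
  // Class.forName throws ClassNotFoundException if the stored name cannot be resolved.
  def classFor(conf: Map[String, String], key: String): Option[Class[_]] =
    conf.get(key).map(name => Class.forName(name))
}
```

A caller that cannot tolerate the exception could wrap the lookup in scala.util.Try. Whether the real method does so is not stated here.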
This is a name that, if present, is passed to flow.setName and should then appear in the job tracker.
This function gets the set of classes that have been registered to Kryo. They may or may not be used in this job, but Cascading might want to be made aware that these classes exist.
Get the number of reducers (this is the parameter Hadoop will use)
Non-fat-jar use cases require this, but using it with fat jars can cause problems. It is not set by default, but if you have problems you might need to set the Job class here. Consider also setting the same class with setScaldingFlowClass.
Set the username, taken from System, that is used for querying hRaven.
Set the entire list of reducer estimators (overriding the existing list)
Set this configuration option to require all grouping/cogrouping to use OrderedSerialization
Set an ID to be shared across this usage of run for Execution
Set to true to enable very verbose logging during FileSource's validation and planning. This can help record what files were present / missing at runtime. Should only be enabled for debugging.
This is a wrapper class on top of Map[String, String]
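As a rough picture of what such a wrapper looks like, the sketch below wraps an immutable Map[String, String] and adds typed helpers on top. `SimpleConfig`, `UniqueIdKey`, and the key string are hypothetical. They mirror the "allocate an ID if none is present" behaviour described earlier but are not the real class or its keys.

```scala
import java.util.UUID

// Hypothetical immutable wrapper; every update returns a new instance.
final case class SimpleConfig(toMap: Map[String, String]) {
  def get(key: String): Option[String] = toMap.get(key)

  def +(kv: (String, String)): SimpleConfig = SimpleConfig(toMap + kv)

  // Reuse the existing ID if one is set; otherwise allocate one and store it.
  def ensureUniqueId: (String, SimpleConfig) =
    get(SimpleConfig.UniqueIdKey) match {
      case Some(id) => (id, this)
      case None =>
        val id = UUID.randomUUID.toString
        (id, this + (SimpleConfig.UniqueIdKey -> id))
    }
}

object SimpleConfig {
  // Assumed key name, for illustration only.
  val UniqueIdKey: String = "example.unique.id"
  val empty: SimpleConfig = SimpleConfig(Map.empty)
}
```

Keeping the wrapper immutable means settings can be chained without affecting other holders of the original value.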