Transform this TypedSource into another by mapping after. Unfortunately, we don't call this map because of conflicts with Mappable.
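As a rough illustration of "mapping after", the sketch below uses a hypothetical stand-in trait, SimpleSource, which is not the real TypedSource. It assumes that a source is just something that yields an iterator, and it shows how a new source can be derived by applying a function to each element that is read.

```scala
// Stand-in for a readable source; NOT the real Scalding TypedSource.
trait SimpleSource[+T] { self =>
  def read: Iterator[T]

  // Derive a new source whose elements are fn applied to this source's elements.
  def andThen[U](fn: T => U): SimpleSource[U] =
    new SimpleSource[U] {
      def read: Iterator[U] = self.read.map(fn)
    }
}

object AndThenDemo {
  val numbers: SimpleSource[Int] = new SimpleSource[Int] {
    def read: Iterator[Int] = Iterator(1, 2, 3)
  }

  // The mapping happens lazily, each time the derived source is read.
  val labels: SimpleSource[String] = numbers.andThen(n => "item-" + n)
}
```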
Because TupleConverter cannot be covariant, we need to jump through this hoop. A typical implementation might take (implicit conv: TupleConverter[T]) and then define:
override def converter[U >: T] = TupleConverter.asSuperConverter[T, U](conv)
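The pattern can be sketched without the library. The Converter trait, HasConverter trait, and IntSource class below are hypothetical stand-ins, and this asSuperConverter is a simplified re-implementation for illustration, not the real TupleConverter API. The sketch assumes only that the converter type is invariant while the source type is covariant, so widening has to happen through a method with a lower-bounded type parameter.

```scala
// Stand-in for an invariant converter; NOT the real Scalding TupleConverter.
trait Converter[T] {
  def convert(fields: Seq[Any]): T
}

object Converter {
  // Widen a Converter[T] to a Converter[U] for any supertype U of T.
  def asSuperConverter[T, U >: T](c: Converter[T]): Converter[U] =
    new Converter[U] {
      def convert(fields: Seq[Any]): U = c.convert(fields)
    }
}

// Hypothetical covariant source-like trait; the lower bound keeps variance checks happy.
trait HasConverter[+T] {
  def converter[U >: T]: Converter[U]
}

class IntSource(implicit conv: Converter[Int]) extends HasConverter[Int] {
  override def converter[U >: Int]: Converter[U] =
    Converter.asSuperConverter[Int, U](conv)
}

object ConverterDemo {
  implicit val intConverter: Converter[Int] = new Converter[Int] {
    def convert(fields: Seq[Any]): Int = fields.head.asInstanceOf[Int]
  }

  val source = new IntSource
  // The same underlying converter, viewed at the supertype Any.
  val widened: Converter[Any] = source.converter[Any]
}
```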
Creates a local tap.
The mode for handling output conflicts.
A tap.
Subclasses of Source MUST override this method. They may call out to TestTapFactory for making Taps suitable for testing.
If you want to filter, you should use this and output a 0 or 1 length Iterable. Filter does not change column names, and we generally expect to change columns here.
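The idea of filtering through a flat-mapping function can be shown with plain Scala collections. This is only a sketch of the technique, not of the Scalding method itself; keepEvenDoubled is a hypothetical function that emits one element to keep a record and none to drop it, changing the value along the way.

```scala
object FlatMapFilterDemo {
  // Emit a single transformed element to keep a record, or an empty Iterable to drop it.
  def keepEvenDoubled(n: Int): Iterable[Int] =
    if (n % 2 == 0) List(n * 2) else Nil

  val input: List[Int] = List(1, 2, 3, 4)

  // Odd numbers are filtered out; even numbers are kept and doubled.
  val output: List[Int] = input.flatMap(keepEvenDoubled)
}
```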
The scheme to use if the source is on hdfs.
A path to use for the local tap.
The scheme to use if the source is local.
Determines if a path is 'valid' for this source. In strict mode all paths must be valid. In non-strict mode, all invalid paths will be filtered out.
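The difference between the two modes can be sketched as follows. The selectPaths helper, its parameters, and the use of Either for the strict-mode failure are all assumptions made for illustration; the actual source may signal invalid paths differently.

```scala
object PathValidationDemo {
  // Hypothetical helper: in strict mode any invalid path is an error;
  // in non-strict mode invalid paths are dropped and the rest are kept.
  def selectPaths(
      paths: Seq[String],
      isValid: String => Boolean,
      strict: Boolean
  ): Either[String, Seq[String]] = {
    val (good, bad) = paths.partition(isValid)
    if (strict && bad.nonEmpty) Left("invalid paths: " + bad.mkString(", "))
    else Right(good)
  }
}
```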
Subclasses can override this to validate paths.
The default implementation is a quick sanity check to look for missing or empty directories. It is necessary but not sufficient -- there are cases where this will return true but there is in fact missing data.
TODO: consider writing a more in-depth version of this method in TimePathedSource that looks for missing days / hours etc.
This is a name that refers to this exact instance of the source (put another way, if s1.sourceId == s2.sourceId, the job should work the same if one is replaced with the other).
Similar in behavior to TimePathedSource.writePathFor.
Strip out the trailing slash star.
Allows you to read a Tap on the submit node. NOT FOR USE IN THE MAPPERS OR REDUCERS. Typical use might be to read in Job.next to determine if another job is needed.
The mock passed in to scalding.JobTest may be considered as a mock of the Tap or the Source. By default, as of 0.9.0, it is considered as a Mock of the Source. If you set this to true, the mock in TestMode will be considered to be a mock of the Tap (which must be transformed) and not the Source.
Write the pipe but return the input so it can be chained into the next operation.
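A minimal sketch of this write-through style, using an in-memory Sink class and a writeThrough function that are both hypothetical and unrelated to the real pipe and Tap types: the function performs the write as a side effect and hands the unchanged input back so that further operations can follow.

```scala
import scala.collection.mutable.ArrayBuffer

object WriteThroughDemo {
  // Hypothetical in-memory sink standing in for a real output.
  final class Sink[T] {
    val written: ArrayBuffer[T] = ArrayBuffer.empty[T]
    def write(xs: Seq[T]): Unit = written ++= xs
  }

  // Write the input to the sink, then return the input unchanged for chaining.
  def writeThrough[T](input: Seq[T], sink: Sink[T]): Seq[T] = {
    sink.write(input)
    input
  }

  val sink = new Sink[Int]
  // The write happens first; the map then continues on the same input.
  val doubled: Seq[Int] = writeThrough(Seq(1, 2, 3), sink).map(_ * 2)
}
```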
(Since version 0.9.0) replace with Mappable.toIterator