Package com.twitter.scalding

package scalding

Source: package.scala
Linear Supertypes: AnyRef, Any

Type Members

  1. sealed trait AbsoluteDuration extends Duration with Ordered[AbsoluteDuration]

  2. case class AbsoluteDurationList(parts: List[AbsoluteDuration]) extends AbstractDurationList[AbsoluteDuration] with AbsoluteDuration with Product with Serializable

  3. abstract class AbstractDurationList[T <: Duration] extends Duration

  4. sealed abstract class AccessMode extends AnyRef

  5. class AdaptiveMapsideCache[K, V] extends MapsideCache[K, V]

  6. trait ArgHelper extends AnyRef

  7. class Args extends Serializable

  8. case class ArgsException(message: String) extends RuntimeException with Product with Serializable

  9. class BaseGlobifier extends Serializable

  10. trait BaseNullSource extends Source

  11. case class BooleanArg(key: String, description: String) extends DescribedArg with Product with Serializable

  12. class BufferOp[I, T, X] extends BaseOperation[Any] with Buffer[Any] with ScaldingPrepare[Any]

  13. abstract class CascadeJob extends Job

  14. class CascadeTest extends JobTest

  15. trait CascadingLocal extends Mode

  16. trait CaseClassPackers extends LowPriorityTuplePackers

  17. class CleanupIdentityFunction extends BaseOperation[Any] with Function[Any] with ScaldingPrepare[Any]

  18. class CoGroupBuilder extends GroupBuilder

    Builder classes used internally to implement coGroups (joins). Can also be used for more generalized joins, e.g., star joins.

  19. class CollectFunction[S, T] extends BaseOperation[Any] with Function[Any] with ScaldingPrepare[Any]

  20. trait Config extends Serializable

    This is a wrapper class on top of Map[String, String]

  21. trait CounterVerification extends Job

    Allows custom counter verification logic when the job completes.

  22. case class Csv(p: String, separator: String = ",", fields: Fields = Fields.ALL, skipHeader: Boolean = false, writeHeader: Boolean = false, quote: String = "\"", sinkMode: SinkMode = SinkMode.REPLACE) extends FixedPathSource with DelimitedScheme with Product with Serializable

    A CSV source: values are separated by commas, and quotes wrap all fields.

  23. trait DateParser extends Serializable

  24. case class DateRange(start: RichDate, end: RichDate) extends Product with Serializable

    Represents a closed interval of time.

    TODO: This should be Range[RichDate, Duration] for an appropriate notion of Range
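
    A closed interval includes both of its endpoints. As a minimal sketch in plain Scala, assuming timestamps in milliseconds since the epoch (the ClosedInterval type and its contains method are hypothetical stand-ins, not the DateRange API):

```scala
// Hypothetical stand-in for a closed time interval; not Scalding's DateRange.
// Timestamps are milliseconds since the epoch.
final case class ClosedInterval(start: Long, end: Long) {
  require(start <= end, "start must not be after end")

  // Closed: both endpoints belong to the interval.
  def contains(t: Long): Boolean = start <= t && t <= end
}

object ClosedIntervalDemo {
  // First to last millisecond of 1970-01-01 UTC.
  val day = ClosedInterval(0L, 86399999L)
  val includesStart: Boolean = day.contains(0L)
  val includesEnd: Boolean = day.contains(86399999L)
  val includesNextDay: Boolean = day.contains(86400000L)
}
```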

  25. case class DayGlob(pat: String)(implicit tz: TimeZone) extends BaseGlobifier with Product with Serializable

  26. case class Days(cnt: Int)(implicit tz: TimeZone) extends Duration with Product with Serializable

  27. trait DefaultDateRangeJob extends Job

    Sets up an implicit dateRange to use in your sources and an implicit timezone. Example args: --date 2011-10-02 2011-10-04 --tz UTC. If no timezone is given, Pacific is assumed.

  28. trait DelimitedScheme extends SchemedSource

    Mix this in for delimited schemes such as TSV or one-separated values. By default, TSV is used.

  29. sealed trait DescribedArg extends AnyRef

  30. class DescriptionValidationException extends RuntimeException

  31. abstract class Duration extends Serializable

  32. case class DurationList(parts: List[Duration]) extends AbstractDurationList[Duration] with Product with Serializable

  33. sealed trait Execution[+T] extends Serializable

    Execution[T] represents a computation that can be run and will produce a value T and keep track of counters incremented inside of TypedPipes using a Stat.

    Execution[T] is the recommended way to compose multistep computations that involve branching (if/then), intermediate calls to remote services, file operations, or looping (e.g. testing for convergence).

    Library functions are encouraged to implement functions from TypedPipes or ValuePipes to Execution[R] for some result R. Refrain from calling run in library code. Let the caller of your library call run.

    Note this is a Monad, meaning flatMap composes in series as you expect. It is also an applicative functor, which means zip (called join in some libraries) composes two Executions in parallel. Prefer zip to flatMap if you want to run two Executions in parallel.
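
    The difference between composing with flatMap and with zip can be illustrated with scala.concurrent.Future, which offers methods of the same names. The sketch below is an analogy in plain Scala, not the Execution API itself; ZipVsFlatMap, stepA, and stepB are hypothetical names.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ZipVsFlatMap {
  // Two independent steps: neither needs the other's result.
  def stepA(): Future[Int] = Future(1)
  def stepB(): Future[Int] = Future(2)

  // flatMap composes in series: stepB is only started once stepA has produced its value.
  def inSeries(): Future[Int] =
    stepA().flatMap(a => stepB().map(b => a + b))

  // zip composes independent steps: both are started before either result is needed.
  def inParallel(): Future[Int] =
    stepA().zip(stepB()).map { case (a, b) => a + b }

  // Blocks for the results; both compositions compute the same value.
  def run(): (Int, Int) =
    (Await.result(inSeries(), 10.seconds), Await.result(inParallel(), 10.seconds))
}
```

    Both compositions yield the same value; they differ only in whether the second step has to wait for the first.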

  34. trait ExecutionApp extends Serializable

  35. trait ExecutionContext extends AnyRef

  36. trait ExecutionCounters extends AnyRef

    This represents the counters portion of the JobStats that are returned. Counters are just a vector of longs with counter name, group keys.

  37. abstract class ExecutionJob[+T] extends Job

    This is a simple job that allows you to launch Execution[T] instances using scalding.Tool and scald.rb. You cannot print the graph.

  38. sealed trait Field[T] extends Serializable

  39. trait FieldConversions extends LowPriorityFieldConversions

  40. abstract class FileSource extends SchemedSource with LocalSourceOverride with HfsTapProvider

    This is a base class for File-based sources

  41. class FilterFunction[T] extends BaseOperation[Any] with Filter[Any] with ScaldingPrepare[Any]

  42. abstract class FixedPathSource extends FileSource

  43. class FlatMapFunction[S, T] extends BaseOperation[Any] with Function[Any] with ScaldingPrepare[Any]

  44. case class FlowState(sourceMap: Map[String, Source] = Map.empty, flowConfigUpdates: Set[(String, String)] = Set()) extends Product with Serializable

    Immutable state that we attach to the Flow using the FlowStateMap

  45. class FoldAggregator[T, X] extends BaseOperation[X] with Aggregator[X] with ScaldingPrepare[X]

  46. abstract class FoldFunctor[X] extends Functor

    This handles the mapReduceMap work on the map-side of the operation. The code below attempts to be optimal with respect to memory allocations and performance, not functional style purity.

  47. trait FoldOperations[+Self <: FoldOperations[Self]] extends ReduceOperations[Self] with Sortable[Self]

    Implements reductions on top of a simple abstraction for the Fields-API. We use the f-bounded polymorphism trick to return the type called Self in each operation.
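
    The f-bounded polymorphism pattern mentioned here can be sketched in plain Scala as follows. Ops, Builder, step, and describe are hypothetical names chosen for illustration; they only show the typing pattern and are not part of the FoldOperations API.

```scala
// Hypothetical names throughout; this only illustrates the typing pattern.
trait Ops[+Self <: Ops[Self]] {
  // Each operation returns the concrete builder type, not the base trait.
  def step(name: String): Self
}

final case class Builder(steps: List[String]) extends Ops[Builder] {
  def step(name: String): Builder = Builder(steps :+ name)

  // A method that exists only on the concrete type.
  def describe: String = steps.mkString(" -> ")
}

object FBoundedDemo {
  // Because step returns Builder (not Ops[_]), describe is still available after chaining.
  val description: String = Builder(Nil).step("sum").step("sort").describe
}
```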

  48. class FutureCache[-K, V] extends AnyRef

    This is a map for values that are produced in futures as is common in Execution

  49. trait GeneratedTupleAdders extends AnyRef

  50. trait GeneratedTupleConverters extends LowPriorityTupleConverters

  51. trait GeneratedTupleSetters extends LowPriorityTupleSetters

  52. case class Globifier(pat: String)(implicit tz: TimeZone) extends BaseGlobifier with Serializable with Product with Serializable

  53. class GroupBuilder extends FoldOperations[GroupBuilder] with StreamOperations[GroupBuilder]

    This controls the sequence of reductions that happen inside a particular grouping operation. Not all elements can be combined; for instance, a scanLeft/foldLeft generally requires a sort, but such sorts are (at least for now) incompatible with doing a combine, which includes some map-side reductions.

  54. type Grouped[K, +V] = scalding.typed.Grouped[K, V]

  55. case class HadoopArgs(toArray: Array[String]) extends Product with Serializable

  56. trait HadoopMode extends Mode

  57. case class HadoopTest(conf: Configuration, buffers: (Source) ⇒ Option[Buffer[Tuple]]) extends HadoopMode with TestMode with Product with Serializable

  58. case class Hdfs(strict: Boolean, conf: Configuration) extends HadoopMode with Product with Serializable

  59. class HelpException extends RuntimeException

  60. trait HfsConfPropertySetter extends HfsTapProvider

  61. trait HfsTapProvider extends AnyRef

  62. case class HourGlob(pat: String)(implicit tz: TimeZone) extends BaseGlobifier with Product with Serializable

  63. case class Hours(cnt: Int) extends Duration with AbsoluteDuration with Product with Serializable

  64. case class IntField[T](id: Integer)(implicit ord: Ordering[T], mf: Option[Manifest[T]]) extends Field[T] with Product with Serializable

    Annotations
    @DefaultSerializer()
  65. class IntegralComparator extends Comparator[AnyRef] with Hasher[AnyRef] with Serializable

  66. class InvalidJoinModeException extends Exception

  67. class InvalidSourceException extends RuntimeException

    Thrown when validateTaps fails.

  68. class InvalidSourceTap extends SourceTap[JobConf, RecordReader[_, _]]

    InvalidSourceTap is used in the createTap method when we want to defer failures to the validateTaps method.

    This is used because, for Job classes, the createTap method on sources is called when the class is initialized. In most cases, though, we want any exceptions to be thrown by the validateTaps method, which is called subsequently during flow planning.

    hdfsPaths represents the user-supplied list that was detected as not containing any valid paths.

  69. case class IterableSource[+T](iter: Iterable[T], inFields: Fields = Fields.NONE)(implicit set: TupleSetter[T], conv: TupleConverter[T]) extends Source with Mappable[T] with Product with Serializable

    Allows an iterable object defined in the job (on the submitter) to be used within a Job as you would a Pipe/RichPipe.

    These lists should probably be very tiny by Hadoop standards. If they are getting large, you should probably dump them to HDFS and use the normal mechanisms to address the data (a FileSource).

  70. class Job extends FieldConversions with Serializable

    Job is a convenience class to make using Scalding easier. Subclasses of Job automatically have a number of nice implicits to enable more concise syntax, including: conversion from Pipe, Source or Iterable to RichPipe; conversion from Source or Iterable to Pipe; and conversion of collections or Tuple[1-22] to cascading.tuple.Fields.

    Additionally, the job provides an implicit Mode and FlowDef so that functions that register starts or ends of a flow graph, specifically anything that reads or writes data on Hadoop, have the needed implicits available.

    If you want to write code outside of a Job, you will want to either:

    make all methods that may read or write data accept implicit FlowDef and Mode parameters.

    OR:

    write code that, rather than returning values, returns a (FlowDef, Mode) => T; these functions can be combined monadically using algebird.monad.Reader.
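
    The second option is the reader pattern: each function takes the environment as its argument, and the functions are chained so that the environment is threaded through implicitly. A minimal sketch in plain Scala, assuming a hypothetical Env type in place of the (FlowDef, Mode) pair and a hand-written Reader rather than algebird.monad.Reader:

```scala
// Hypothetical stand-ins: Env plays the role of the (FlowDef, Mode) pair,
// and Reader is a minimal version of the pattern, not algebird.monad.Reader itself.
final case class Env(name: String, strict: Boolean)

final case class Reader[A](run: Env => A) {
  def map[B](f: A => B): Reader[B] = Reader(env => f(run(env)))
  def flatMap[B](f: A => Reader[B]): Reader[B] = Reader(env => f(run(env)).run(env))
}

object ReaderDemo {
  val jobName: Reader[String] = Reader(_.name)
  val isStrict: Reader[Boolean] = Reader(_.strict)

  // Neither step mentions Env explicitly; flatMap and map thread it through.
  val summary: Reader[String] =
    for {
      n <- jobName
      s <- isStrict
    } yield s"$n (strict=$s)"

  // The environment is supplied once, at the very end.
  val result: String = summary.run(Env("wordcount", strict = true))
}
```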

  71. case class JobStats(toMap: Map[String, Any]) extends Product with Serializable

  72. class JobTest extends AnyRef

    This class is used to construct unit tests for scalding jobs. You should not use it unless you are writing tests. For examples of how to do that, see the tests included in the main scalding repository: https://github.com/twitter/scalding/tree/master/scalding-core/src/test/scala/com/twitter/scalding

  73. trait JoinAlgorithms extends AnyRef

  74. sealed abstract class JoinMode extends AnyRef

  75. type KeyedList[K, +V] = scalding.typed.KeyedList[K, V]

  76. case class ListArg(key: String, description: String) extends DescribedArg with Product with Serializable

  77. case class Local(strictSources: Boolean) extends CascadingLocal with Product with Serializable

  78. trait LocalSourceOverride extends SchemedSource

    A trait which provides a method to create a local tap.

  79. trait LocalTapSource extends SchemedSource with LocalSourceOverride

    Use this class to add support for Cascading local mode via the Hadoop tap. Put another way, this runs a Hadoop tap outside of Hadoop in the Cascading local mode.

  80. trait LowPriorityFieldConversions extends AnyRef

  81. trait LowPriorityTupleConverters extends Serializable

  82. trait LowPriorityTupleGetter extends Serializable

  83. trait LowPriorityTuplePackers extends Serializable

  84. trait LowPriorityTupleSetters extends Serializable

  85. trait LowPriorityTupleUnpackers extends AnyRef

  86. class MRMAggregator[T, X, U] extends BaseOperation[Tuple] with Aggregator[Tuple] with ScaldingPrepare[Tuple]

  87. class MRMBy[T, X, U] extends AggregateBy

    MapReduceMapBy Class

  88. class MRMFunctor[T, X] extends FoldFunctor[X]

    This handles the mapReduceMap work on the map-side of the operation. The code below attempts to be optimal with respect to memory allocations and performance, not functional style purity.

  89. class MapFunction[S, T] extends BaseOperation[Any] with Function[Any] with ScaldingPrepare[Any]

  90. trait Mappable[+T] extends Source with TypedSource[T]

    Usually, as soon as we open a source, we read and do some mapping operation on a single column or set of columns. T is the type of the single column. If doing multiple columns, T will be a TupleN representing the types, e.g. (Int,Long,String).

    Prefer to use TypedSource unless you are working with the fields API

    NOTE: If we don't make this extend Source, established implicits are ambiguous when TDsl is in scope.

  91. trait Mappable1[A] extends Source with Mappable[(A)]

  92. trait Mappable10[A, B, C, D, E, F, G, H, I, J] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J)]

  93. trait Mappable11[A, B, C, D, E, F, G, H, I, J, K] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J, K)]

  94. trait Mappable12[A, B, C, D, E, F, G, H, I, J, K, L] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J, K, L)]

  95. trait Mappable13[A, B, C, D, E, F, G, H, I, J, K, L, M] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J, K, L, M)]

  96. trait Mappable14[A, B, C, D, E, F, G, H, I, J, K, L, M, N] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J, K, L, M, N)]

  97. trait Mappable15[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O)]

  98. trait Mappable16[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P)]

  99. trait Mappable17[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q)]

  100. trait Mappable18[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R)]

  101. trait Mappable19[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S)]

  102. trait Mappable2[A, B] extends Source with Mappable[(A, B)]

  103. trait Mappable20[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T)]

  104. trait Mappable21[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U)]

  105. trait Mappable22[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] extends Source with Mappable[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V)]

  106. trait Mappable3[A, B, C] extends Source with Mappable[(A, B, C)]

  107. trait Mappable4[A, B, C, D] extends Source with Mappable[(A, B, C, D)]

  108. trait Mappable5[A, B, C, D, E] extends Source with Mappable[(A, B, C, D, E)]

  109. trait Mappable6[A, B, C, D, E, F] extends Source with Mappable[(A, B, C, D, E, F)]

  110. trait Mappable7[A, B, C, D, E, F, G] extends Source with Mappable[(A, B, C, D, E, F, G)]

  111. trait Mappable8[A, B, C, D, E, F, G, H] extends Source with Mappable[(A, B, C, D, E, F, G, H)]

  112. trait Mappable9[A, B, C, D, E, F, G, H, I] extends Source with Mappable[(A, B, C, D, E, F, G, H, I)]

  113. sealed trait MapsideCache[K, V] extends AnyRef

  114. class MapsideReduce[V] extends BaseOperation[MapsideCache[Tuple, V]] with Function[MapsideCache[Tuple, V]] with ScaldingPrepare[MapsideCache[Tuple, V]]

  115. class MemoryTap[In, Out] extends Tap[Properties, In, Out]

  116. class MemoryTupleEntryCollector extends TupleEntryCollector

  117. case class Millisecs(cnt: Int) extends Duration with AbsoluteDuration with Product with Serializable

  118. case class Minutes(cnt: Int) extends Duration with AbsoluteDuration with Product with Serializable

  119. trait Mode extends Serializable

  120. case class ModeException(message: String) extends RuntimeException with Product with Serializable

  121. case class ModeLoadException(message: String, origin: ClassNotFoundException) extends RuntimeException with Product with Serializable

  122. case class MonthGlob(pat: String)(implicit tz: TimeZone) extends BaseGlobifier with Product with Serializable

  123. case class Months(cnt: Int)(implicit tz: TimeZone) extends Duration with Product with Serializable

  124. abstract class MostRecentGoodSource extends TimePathedSource

  125. case class MultipleDelimitedFiles(f: Fields, separator: String, quote: String, skipHeader: Boolean, writeHeader: Boolean, p: String*) extends FixedPathSource with DelimitedScheme with Product with Serializable

    A delimited-files source that allows overriding the separator, the quotation characters, and the header configuration.

  126. case class MultipleSequenceFiles(p: String*) extends FixedPathSource with SequenceFileScheme with LocalTapSource with Product with Serializable

  127. case class MultipleTextLineFiles(p: String*) extends FixedPathSource with TextLineScheme with Product with Serializable

  128. case class MultipleTsvFiles(p: Seq[String], fields: Fields = Fields.ALL, skipHeader: Boolean = false, writeHeader: Boolean = false) extends FixedPathSource with DelimitedScheme with Product with Serializable

    Allows the use of multiple Tsv input paths. The Tsv files will be processed through your flow as if they were a single pipe. Tsv files must have the same schema. For more details on how multiple files are handled, check the Cascading docs.

  129. case class MultipleWritableSequenceFiles[K <: Writable, V <: Writable](p: Seq[String], f: Fields)(implicit evidence$7: Manifest[K], evidence$8: Manifest[V]) extends FixedPathSource with WritableSequenceFileScheme with LocalTapSource with Mappable[(K, V)] with Product with Serializable

    This is only a TypedSource (which is a superclass of Mappable) as sinking into multiple directories is not well defined

  130. class NamedPoolThreadFactory extends ThreadFactory

  131. case class NonHadoopArgs(toArray: Array[String]) extends Product with Serializable

  132. class NullTap[Config, Input, Output, SourceContext, SinkContext] extends SinkTap[Config, Output]

    A tap that outputs nothing. It is used to drive execution of a task for side effects only. This can be used to drive a pipe without actually writing to HDFS.

  133. class OffsetTextLine extends FixedPathSource with Mappable[(Long, String)] with TextSourceScheme

    Alternate typed TextLine source that keeps both 'offset and 'line fields.

  134. case class OptionalArg(key: String, description: String) extends DescribedArg with Product with Serializable

  135. case class OptionalSource[T](src: Mappable[T]) extends Source with Mappable[T] with Product with Serializable

  136. class OrderedConstructorConverter[T] extends TupleConverter[T]

  137. class OrderedTuplePacker[T] extends TuplePacker[T]

    This just blindly uses the first public constructor with the same arity as the fields size

  138. case class Osv(p: String, f: Fields = Fields.ALL, sinkMode: SinkMode = SinkMode.REPLACE) extends FixedPathSource with DelimitedScheme with Product with Serializable

    One separated value (commonly used by Pig)

  139. abstract class PartitionSource extends SchemedSource with HfsTapProvider

    This is a base class for partition-based output sources

  140. case class PartitionedSequenceFile(basePath: String, partition: Partition, sequenceFields: Fields, sinkMode: SinkMode) extends PartitionSource with SequenceFileScheme with Product with Serializable

    An implementation of SequenceFile output, split over a partition tap.

    basePath: The root path for the output.
    partition: The partitioning strategy to use.
    sequenceFields: The set of fields to use for the sequence file.
    sinkMode: How to handle conflicts with existing output.

  141. case class PartitionedTsv(basePath: String, partition: Partition, writeHeader: Boolean, tsvFields: Fields, sinkMode: SinkMode) extends PartitionSource with DelimitedScheme with Product with Serializable

    An implementation of TSV output, split over a partition tap.

    basePath: The root path for the output.
    partition: The partitioning strategy to use.
    writeHeader: Flag to indicate that the header should be written to the file.
    sinkMode: How to handle conflicts with existing output.

  142. case class PipeDebug(output: Output = Output.STDERR, prefix: String = null, printFieldsEvery: Option[Int] = None, printTuplesEvery: Int = 1) extends Product with Serializable

    This is a builder for Cascading's Debug object. The default instance is the same default as Cascading's new Debug() (see https://github.com/cwensel/cascading/blob/wip-2.5/cascading-core/src/main/java/cascading/operation/Debug.java#L46). This is based on work by https://github.com/granthenke (https://github.com/twitter/scalding/pull/559).

  143. case class Range[T](lower: T, upper: T)(implicit ord: Ordering[T]) extends Product with Serializable

  144. class RangedArgs extends AnyRef

  145. trait ReduceOperations[+Self <: ReduceOperations[Self]] extends Serializable

    Implements reductions on top of a simple abstraction for the Fields-API. This is for associative and commutative operations (particularly Monoids and Semigroups play a big role here).

    We use the f-bounded polymorphism trick to return the type called Self in each operation.

  146. class ReflectionSetter[T] extends TupleSetter[T]

  147. class ReflectionTupleConverter[T] extends TupleConverter[T]

  148. class ReflectionTuplePacker[T] extends TuplePacker[T]

    Packs a tuple into any object with set methods, e.g. thrift or proto objects. TODO: verify that protobuf setters for field camel_name are of the form setCamelName. In that case this code works for proto.
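
    The naming convention described here (field camel_name, setter setCamelName) can be sketched with plain Java reflection. This is an illustration under that assumption only, not the ReflectionTuplePacker implementation; UserRecord, SetterDemo, setterName, and set are hypothetical names.

```scala
// Hypothetical example bean with conventional setters; not a real thrift or proto class.
class UserRecord {
  private var userName: String = ""
  private var loginCount: Int = 0
  def setUserName(v: String): Unit = { userName = v }
  def setLoginCount(v: Int): Unit = { loginCount = v }
  def getUserName: String = userName
  def getLoginCount: Int = loginCount
}

object SetterDemo {
  // Maps a snake_case field name to its setter name, e.g. "user_name" to "setUserName".
  def setterName(field: String): String =
    "set" + field.split("_").filter(_.nonEmpty).map(_.capitalize).mkString

  // Finds a public one-argument method by name and invokes it reflectively.
  def set(target: AnyRef, field: String, value: AnyRef): Unit = {
    val name = setterName(field)
    val method = target.getClass.getMethods
      .find(m => m.getName == name && m.getParameterCount == 1)
      .getOrElse(throw new NoSuchMethodException(name))
    method.invoke(target, value)
  }

  def demo(): UserRecord = {
    val rec = new UserRecord
    set(rec, "user_name", "alice")
    set(rec, "login_count", Int.box(3)) // reflection unboxes the Integer for the Int parameter
    rec
  }
}
```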

  149. class ReflectionTupleUnpacker[T] extends TupleUnpacker[T]

  150. case class RequiredArg(key: String, description: String) extends DescribedArg with Product with Serializable

  151. case class RichDate(timestamp: Long) extends Ordered[RichDate] with Product with Serializable

    A value class wrapper for milliseconds since the epoch. It's tempting to extend this with AnyVal, but this causes problems with Java code.

  152. case class RichFields(toFieldList: List[Field[_]]) extends Fields with Product with Serializable

  153. class RichFlowDef extends AnyRef

    This is an enrichment-pattern class for cascading.flow.FlowDef. The rule is to never use this class directly in input or return types, but only to add methods to FlowDef.

  154. class RichPathFilter extends AnyRef

  155. class RichPipe extends Serializable with JoinAlgorithms

    This is an enrichment-pattern class for cascading.pipe.Pipe. The rule is to never use this class directly in input or return types, but only to add methods to Pipe.

  156. class SampleWithReplacement extends BaseOperation[Poisson] with Function[Poisson] with ScaldingPrepare[Poisson]

  157. class ScaldingMultiSourceTap extends MultiSourceTap[Tap[JobConf, RecordReader[_, _], OutputCollector[_, _]], JobConf, RecordReader[_, _]]

  158. trait ScaldingPrepare[C] extends Operation[C]

  159. class ScanLeftIterator[T, U] extends Iterator[U] with Serializable

    Scala 2.8 Iterators don't support scanLeft, so we have to reimplement it. The Scala 2.9 implementation creates an off-by-one bug with the unused fields in the Fields API.
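
    For reference, a scanLeft over an iterator emits the initial value first and then each running result. A minimal sketch of such an iterator in plain Scala follows; ScanLeftSketch is a hypothetical re-implementation for illustration and is not the library's ScanLeftIterator.

```scala
// A hypothetical re-implementation for illustration; not the library's ScanLeftIterator.
class ScanLeftSketch[T, U](it: Iterator[T], init: U, fn: (U, T) => U)
    extends Iterator[U] with Serializable {
  private var acc: U = init
  private var emittedInit = false

  // One more element than the underlying iterator: the initial value comes first.
  def hasNext: Boolean = !emittedInit || it.hasNext

  def next(): U = {
    if (!emittedInit) emittedInit = true // first call returns the initial value itself
    else acc = fn(acc, it.next())
    acc
  }
}

object ScanLeftDemo {
  // Running sums of 1, 2, 3 starting from 0.
  val runningSums: List[Int] =
    new ScanLeftSketch[Int, Int](Iterator(1, 2, 3), 0, _ + _).toList
}
```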

  160. abstract class SchemedSource extends Source

    A base class for sources that take a scheme trait.

  161. class ScriptJob extends Job

  162. case class Seconds(cnt: Int) extends Duration with AbsoluteDuration with Product with Serializable

  163. case class SequenceFile(p: String, f: Fields = Fields.ALL, sinkMode: SinkMode = SinkMode.REPLACE) extends FixedPathSource with SequenceFileScheme with LocalTapSource with Product with Serializable

    Permalink
  164. trait SequenceFileScheme extends SchemedSource

    Permalink
  165. abstract class SideEffectBaseOperation[C] extends BaseOperation[C] with ScaldingPrepare[C]

    Permalink
  166. class SideEffectBufferOp[I, T, C, X] extends SideEffectBaseOperation[C] with Buffer[C]

    Permalink
  167. class SideEffectFlatMapFunction[S, C, T] extends SideEffectBaseOperation[C] with Function[C]

    Permalink
  168. class SideEffectMapFunction[S, C, T] extends SideEffectBaseOperation[C] with Function[C]

    Permalink
  169. trait SingleMappable[T] extends Source with Mappable[T]

    Permalink

    Mappable extension that defines the proper converter implementation for a Mappable with a single item.

  170. sealed abstract class SkewReplication extends AnyRef

    Permalink

    Represents a strategy for replicating rows when performing skewed joins.

  171. case class SkewReplicationA(replicationFactor: Int = 1) extends SkewReplication with Product with Serializable

    Permalink

    See https://github.com/twitter/scalding/pull/229#issuecomment-10773810

  172. case class SkewReplicationB(maxKeysInMemory: Int = 1E6.toInt, maxReducerOutput: Int = 1E7.toInt) extends SkewReplication with Product with Serializable

    Permalink

    See https://github.com/twitter/scalding/pull/229#issuecomment-10792296

  173. trait Sortable[+Self] extends AnyRef

    Permalink
  174. abstract class Source extends Serializable

    Permalink

    Every source must have a correct toString method. If you use case classes for instances of sources, you will get this for free. This is one of the several reasons we recommend using case classes.

    java.io.Serializable is needed if the Source is going to have any methods attached that run on mappers or reducers, which will happen if you implement transformForRead or transformForWrite.

  175. trait Stat extends Serializable

    Permalink
  176. case class StatKey(counter: String, group: String) extends Serializable with Product with Serializable

    Permalink
  177. trait Stateful extends AnyRef

    Permalink

    A simple trait for a releasable resource. Provides a no-op implementation.

  178. class StatsFlowListener extends FlowListener

    Permalink

    FlowListener that checks counter values against a function.

  179. trait StreamOperations[+Self <: StreamOperations[Self]] extends Sortable[Self] with Serializable

    Permalink

    Implements reductions on top of a simple abstraction for the Fields API. We use the F-bounded polymorphism trick to return the type called Self in each operation.
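
    The F-bounded trick mentioned above can be sketched as follows. This is a self-contained illustration, not the trait's real definition: Ops, ListOps, take, drop and slice are hypothetical names chosen for the example.

```scala
object FBoundedSketch {
  // Self is bounded by the trait itself, so every operation can return
  // the concrete subtype instead of the base trait.
  trait Ops[+Self <: Ops[Self]] {
    def take(n: Int): Self
    def drop(n: Int): Self

    // Defined once here, yet still typed as the concrete Self.
    def slice(from: Int, count: Int): Self = drop(from).take(count)
  }

  final case class ListOps(items: List[Int]) extends Ops[ListOps] {
    def take(n: Int): ListOps = ListOps(items.take(n))
    def drop(n: Int): ListOps = ListOps(items.drop(n))
  }
}
```

    Because slice returns Self, chaining on a ListOps keeps the ListOps type without any casts.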

  180. case class StringField[T](id: String)(implicit ord: Ordering[T], mf: Option[Manifest[T]]) extends Field[T] with Product with Serializable

    Permalink
    Annotations
    @DefaultSerializer()
  181. trait SuccessFileSource extends FileSource

    Permalink

    Ensures that a _SUCCESS file is present in every directory included by a glob, as well as the requirements of FileSource.pathIsGood. The set of directories to check for _SUCCESS is determined by examining the list of all paths returned by globPaths and adding parent directories of the non-hidden files encountered. pathIsGood should still be considered just a best-effort test. As an illustration the following layout with an in-flight job is accepted for the glob dir*/*:

      dir1/_temporary
      dir2/file1
      dir2/_SUCCESS
    

    Similarly, if dir1 is physically empty, pathIsGood is still true for dir*/* above.

    On the other hand it will reject an empty output directory of a finished job:

      dir1/_SUCCESS
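
    The rule illustrated by these layouts can be modelled on plain path strings. The sketch below is a simplified model, not the trait's implementation: it assumes hidden files are those whose name starts with "_" or ".", and it assumes that the FileSource.pathIsGood requirement amounts to having at least one non-hidden file. The names SuccessFileSketch and pathIsGood(globbed) are made up for the example.

```scala
object SuccessFileSketch {
  private def parent(path: String): String = path.split('/').init.mkString("/")
  private def name(path: String): String = path.split('/').last

  // Assumption: names starting with "_" or "." count as hidden.
  private def isHidden(path: String): Boolean = {
    val n = name(path)
    n.startsWith("_") || n.startsWith(".")
  }

  /** globbed: every path matched by the glob, e.g. dir*/* */
  def pathIsGood(globbed: Seq[String]): Boolean = {
    val all = globbed.toSet
    // Only parent directories of non-hidden files are checked.
    val dirsToCheck = globbed.filterNot(isHidden).map(parent).distinct
    dirsToCheck.nonEmpty && dirsToCheck.forall(d => all.contains(d + "/_SUCCESS"))
  }
}
```

    With this model, the in-flight layout is accepted because dir1 contributes no non-hidden files, while a directory holding only _SUCCESS is rejected because there is nothing to read.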
    

  182. class SummingMapsideCache[K, V] extends MapsideCache[K, V]

    Permalink
  183. abstract class TemplateSource extends SchemedSource with HfsTapProvider

    Permalink

    This is a base class for template based output sources

  184. case class TemplatedSequenceFile(basePath: String, template: String, sequenceFields: Fields = Fields.ALL, pathFields: Fields = Fields.ALL, sinkMode: SinkMode = SinkMode.REPLACE) extends TemplateSource with SequenceFileScheme with Product with Serializable

    Permalink

    An implementation of SequenceFile output, split over a template tap.

    basePath

    The root path for the output.

    template

    The Java formatter-style string to use as the template, e.g. %s/%s.

    sequenceFields

    The set of fields to use for the sequence file.

    pathFields

    The set of fields to apply to the path.

    sinkMode

    How to handle conflicts with existing output.

  185. case class TemplatedTsv(basePath: String, template: String, pathFields: Fields = Fields.ALL, writeHeader: Boolean = false, sinkMode: SinkMode = SinkMode.REPLACE, fields: Fields = Fields.ALL) extends TemplateSource with DelimitedScheme with Product with Serializable

    Permalink

    An implementation of TSV output, split over a template tap.

    basePath

    The root path for the output.

    template

    The Java formatter-style string to use as the template, e.g. %s/%s.

    pathFields

    The set of fields to apply to the path.

    writeHeader

    Flag to indicate that the header should be written to the file.

    sinkMode

    How to handle conflicts with existing output.

    fields

    The set of fields to apply to the output.
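
    To make the template parameter concrete, the sketch below shows how a formatter-style template such as %s/%s could turn path-field values into an output directory under basePath. It is an assumption-laden illustration in plain Scala, not the source's code: outputDir is a hypothetical helper, and joining with "/" is assumed.

```scala
object TemplateSketch {
  /** Fills the template with the path-field values, in order. */
  def outputDir(basePath: String, template: String, pathValues: Seq[Any]): String =
    basePath + "/" + template.format(pathValues: _*)
}
```

    For example, with basePath "/out", template "%s/%s" and path values ("US", 2015), each tuple with those values would be routed to /out/US/2015 under this model.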

  186. case class Test(buffers: (Source) ⇒ Option[Buffer[Tuple]]) extends TestMode with CascadingLocal with Product with Serializable

    Permalink

    Memory only testing for unit tests

  187. trait TestMode extends Mode

    Permalink
  188. class TestTapFactory extends Serializable

    Permalink
  189. class TextLine extends FixedPathSource with TextLineScheme

    Permalink
  190. trait TextLineScheme extends SchemedSource with TextSourceScheme with SingleMappable[String]

    Permalink
  191. trait TextSourceScheme extends SchemedSource

    Permalink

    The fields here are ('offset, 'line)

  192. abstract class TimePathedSource extends TimeSeqPathedSource

    Permalink

    This will automatically produce a globbed version of the given path. THIS MEANS YOU MUST END WITH A / followed by * to match a file. For writing, we write to the directory specified by the END time.

  193. abstract class TimeSeqPathedSource extends FileSource

    Permalink
  194. class Tool extends Configured with org.apache.hadoop.util.Tool

    Permalink
  195. case class Tsv(p: String, fields: Fields = Fields.ALL, skipHeader: Boolean = false, writeHeader: Boolean = false, sinkMode: SinkMode = SinkMode.REPLACE) extends FixedPathSource with DelimitedScheme with Product with Serializable

    Permalink

    Tab separated value source

  196. trait TupleArity extends AnyRef

    Permalink

    Mixed in to both TupleConverter and TupleSetter to improve arity safety of cascading jobs before we run anything on Hadoop.

  197. trait TupleConverter[T] extends Serializable with TupleArity

    Permalink

    Typeclass to represent converting from cascading TupleEntry to some type T. The most common application is to convert to scala Tuple objects for use with the Fields API. The typed API internally manually handles its mapping to cascading Tuples, so the implicit resolution mechanism is not used.

    WARNING: if you are seeing issues with the singleConverter being found when you expect something else, you may have an issue where the enclosing scope needs to take an implicit TupleConverter of the correct type.

    Unfortunately, the semantics we want (prefer to flatten tuples, but otherwise put everything into one position in the tuple) are somewhat difficult to encode in Scala.

  198. trait TupleGetter[T] extends Serializable

    Permalink

    Typeclass roughly equivalent to a Lens, which allows getting items out of a tuple. This is useful because cascading has type coercion (string to int, for instance) that users expect in the Fields API. This code is not used in the typesafe API, which does not allow such silent coercion. See the generated TupleConverters for an example of where this is used.

  199. trait TuplePacker[T] extends Serializable

    Permalink

    Typeclass for packing a cascading Tuple into some type T. This is used to put fields of a cascading tuple into Thrift, Protobuf, or case classes, for instance, but you can add your own instances to control how this is done.

  200. trait TupleSetter[T] extends Serializable with TupleArity

    Permalink

    Typeclass to represent converting back to (setting into) a cascading Tuple. This looks like it can be contravariant, but it can't: because of our approach of falling back to the singleSetter, you really want the most specific setter you can get. Put more directly: a TupleSetter[Any] is not just as good as TupleSetter[(Int, Int)] from the scalding DSL's point of view. The latter will flatten the (Int, Int), but the former won't.
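
    The flattening behaviour described above can be reproduced with a small invariant typeclass and a low-priority fallback. This is a sketch in plain Scala under stated assumptions: Setter, LowPrioritySetters, tup2Setter and set are illustrative names, and a List[Any] stands in for a cascading Tuple.

```scala
object SetterSketch {
  // Invariant on purpose: a Setter[Any] must not be picked for (Int, Int).
  trait Setter[T] {
    def apply(t: T): List[Any]
    def arity: Int
  }

  trait LowPrioritySetters {
    // Fallback: put the whole value into a single position.
    implicit def singleSetter[A]: Setter[A] = new Setter[A] {
      def apply(a: A): List[Any] = List(a)
      def arity: Int = 1
    }
  }

  object Setter extends LowPrioritySetters {
    // More specific instance: flattens a pair into two positions.
    implicit def tup2Setter[A, B]: Setter[(A, B)] = new Setter[(A, B)] {
      def apply(t: (A, B)): List[Any] = List(t._1, t._2)
      def arity: Int = 2
    }
  }

  def set[T](t: T)(implicit s: Setter[T]): List[Any] = s(t)
}
```

    Calling set((1, 2)) resolves the pair instance and flattens the value, while set[Any]((1, 2)) can only find the fallback and keeps the pair in one position.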

  201. trait TupleUnpacker[T] extends Serializable

    Permalink
  202. class TupleUnpackerException extends Exception

    Permalink
  203. trait TypeDescriptor[T] extends Serializable

    Permalink

    This class is used to bind together a Fields instance which may contain a type array via getTypes, a TupleConverter and TupleSetter, which are inverses of one another. Note the size of the Fields object and the arity values for the converter and setter are all the same. Note in the com.twitter.scalding.macros package there are macros to generate this for case classes, which may be very convenient.

    Annotations
    @implicitNotFound( ... )
  204. class TypedBufferOp[K, V, U] extends BaseOperation[Any] with Buffer[Any] with ScaldingPrepare[Any]

    Permalink

    In the typed API every reduce operation is handled by this Buffer

  205. class TypedMapsideReduce[K, V] extends BaseOperation[MapsideCache[K, V]] with Function[MapsideCache[K, V]] with ScaldingPrepare[MapsideCache[K, V]]

    Permalink
  206. type TypedPipe[+T] = scalding.typed.TypedPipe[T]

    Permalink
  207. trait TypedSeperatedFile extends Serializable

    Permalink

    Trait to assist with creating objects such as TypedTsv to read from separated files. Override separator, skipHeader, writeHeader as needed.

  208. type TypedSink[-T] = scalding.typed.TypedSink[T]

    Permalink
  209. trait TypedSink1[A] extends TypedSink[(A)]

    Permalink
  210. trait TypedSink10[A, B, C, D, E, F, G, H, I, J] extends TypedSink[(A, B, C, D, E, F, G, H, I, J)]

    Permalink
  211. trait TypedSink11[A, B, C, D, E, F, G, H, I, J, K] extends TypedSink[(A, B, C, D, E, F, G, H, I, J, K)]

    Permalink
  212. trait TypedSink12[A, B, C, D, E, F, G, H, I, J, K, L] extends TypedSink[(A, B, C, D, E, F, G, H, I, J, K, L)]

    Permalink
  213. trait TypedSink13[A, B, C, D, E, F, G, H, I, J, K, L, M] extends TypedSink[(A, B, C, D, E, F, G, H, I, J, K, L, M)]

    Permalink
  214. trait TypedSink14[A, B, C, D, E, F, G, H, I, J, K, L, M, N] extends TypedSink[(A, B, C, D, E, F, G, H, I, J, K, L, M, N)]

    Permalink
  215. trait TypedSink15[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] extends TypedSink[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O)]

    Permalink
  216. trait TypedSink16[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] extends TypedSink[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P)]

    Permalink
  217. trait TypedSink17[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] extends TypedSink[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q)]

    Permalink
  218. trait TypedSink18[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] extends TypedSink[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R)]

    Permalink
  219. trait TypedSink19[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] extends TypedSink[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S)]

    Permalink
  220. trait TypedSink2[A, B] extends TypedSink[(A, B)]

    Permalink
  221. trait TypedSink20[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] extends TypedSink[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T)]

    Permalink
  222. trait TypedSink21[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] extends TypedSink[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U)]

    Permalink
  223. trait TypedSink22[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] extends TypedSink[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V)]

    Permalink
  224. trait TypedSink3[A, B, C] extends TypedSink[(A, B, C)]

    Permalink
  225. trait TypedSink4[A, B, C, D] extends TypedSink[(A, B, C, D)]

    Permalink
  226. trait TypedSink5[A, B, C, D, E] extends TypedSink[(A, B, C, D, E)]

    Permalink
  227. trait TypedSink6[A, B, C, D, E, F] extends TypedSink[(A, B, C, D, E, F)]

    Permalink
  228. trait TypedSink7[A, B, C, D, E, F, G] extends TypedSink[(A, B, C, D, E, F, G)]

    Permalink
  229. trait TypedSink8[A, B, C, D, E, F, G, H] extends TypedSink[(A, B, C, D, E, F, G, H)]

    Permalink
  230. trait TypedSink9[A, B, C, D, E, F, G, H, I] extends TypedSink[(A, B, C, D, E, F, G, H, I)]

    Permalink
  231. type TypedSource[+T] = scalding.typed.TypedSource[T]

    Permalink
  232. trait TypedSource1[A] extends TypedSource[(A)]

    Permalink
  233. trait TypedSource10[A, B, C, D, E, F, G, H, I, J] extends TypedSource[(A, B, C, D, E, F, G, H, I, J)]

    Permalink
  234. trait TypedSource11[A, B, C, D, E, F, G, H, I, J, K] extends TypedSource[(A, B, C, D, E, F, G, H, I, J, K)]

    Permalink
  235. trait TypedSource12[A, B, C, D, E, F, G, H, I, J, K, L] extends TypedSource[(A, B, C, D, E, F, G, H, I, J, K, L)]

    Permalink
  236. trait TypedSource13[A, B, C, D, E, F, G, H, I, J, K, L, M] extends TypedSource[(A, B, C, D, E, F, G, H, I, J, K, L, M)]

    Permalink
  237. trait TypedSource14[A, B, C, D, E, F, G, H, I, J, K, L, M, N] extends TypedSource[(A, B, C, D, E, F, G, H, I, J, K, L, M, N)]

    Permalink
  238. trait TypedSource15[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O] extends TypedSource[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O)]

    Permalink
  239. trait TypedSource16[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P] extends TypedSource[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P)]

    Permalink
  240. trait TypedSource17[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q] extends TypedSource[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q)]

    Permalink
  241. trait TypedSource18[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R] extends TypedSource[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R)]

    Permalink
  242. trait TypedSource19[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S] extends TypedSource[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S)]

    Permalink
  243. trait TypedSource2[A, B] extends TypedSource[(A, B)]

    Permalink
  244. trait TypedSource20[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T] extends TypedSource[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T)]

    Permalink
  245. trait TypedSource21[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U] extends TypedSource[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U)]

    Permalink
  246. trait TypedSource22[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V] extends TypedSource[(A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V)]

    Permalink
  247. trait TypedSource3[A, B, C] extends TypedSource[(A, B, C)]

    Permalink
  248. trait TypedSource4[A, B, C, D] extends TypedSource[(A, B, C, D)]

    Permalink
  249. trait TypedSource5[A, B, C, D, E] extends TypedSource[(A, B, C, D, E)]

    Permalink
  250. trait TypedSource6[A, B, C, D, E, F] extends TypedSource[(A, B, C, D, E, F)]

    Permalink
  251. trait TypedSource7[A, B, C, D, E, F, G] extends TypedSource[(A, B, C, D, E, F, G)]

    Permalink
  252. trait TypedSource8[A, B, C, D, E, F, G, H] extends TypedSource[(A, B, C, D, E, F, G, H)]

    Permalink
  253. trait TypedSource9[A, B, C, D, E, F, G, H, I] extends TypedSource[(A, B, C, D, E, F, G, H, I)]

    Permalink
  254. case class UniqueID(get: String) extends Product with Serializable

    Permalink

    Used to inject a typed unique identifier to uniquely name each scalding flow. This is here mostly to deal with the case of testing where there are many concurrent threads running Flows. Users should never have to worry about these.

  255. trait UtcDateRangeJob extends Job with DefaultDateRangeJob

    Permalink
  256. type ValuePipe[+T] = scalding.typed.ValuePipe[T]

    Permalink
  257. case class Weeks(cnt: Int)(implicit tz: TimeZone) extends Duration with Product with Serializable

    Permalink
  258. case class WritableSequenceFile[K <: Writable, V <: Writable](p: String, f: Fields, sinkMode: SinkMode = SinkMode.REPLACE)(implicit evidence$3: Manifest[K], evidence$4: Manifest[V]) extends FixedPathSource with WritableSequenceFileScheme with LocalTapSource with TypedSink[(K, V)] with Mappable[(K, V)] with Product with Serializable

    Permalink
  259. trait WritableSequenceFileScheme extends SchemedSource

    Permalink
  260. class XHandler extends AnyRef

    Permalink

    Provide handlers and mapping for exceptions

  261. case class Years(cnt: Int)(implicit tz: TimeZone) extends Duration with Product with Serializable

    Permalink
  262. class FixedPathTypedDelimited[T] extends FixedPathSource with TypedDelimited[T]

    Permalink
    Annotations
    @deprecated
    Deprecated

    (Since version 2015-07) Use FixedTypedText instead

  263. trait TupleConversions extends AnyRef

    Permalink
    Annotations
    @deprecated
    Deprecated

    (Since version 0.9.0) This trait does nothing now

  264. trait TypedDelimited[T] extends SchemedSource with DelimitedScheme with Mappable[T] with TypedSink[T]

    Permalink

    Allows you to set the types, prefer this: If T is a subclass of Product, we assume it is a tuple. If it is not, wrap T in a Tuple1: e.g. TypedTsv[Tuple1[List[Int]]]

    Annotations
    @deprecated
    Deprecated

    (Since version 2015-07) Use TypedTextDelimited instead

Value Members

  1. object AbsoluteDuration extends Serializable

    Permalink
  2. object AcceptAllPathFilter extends PathFilter

    Permalink
  3. object ArgHelp extends ArgHelper

    Permalink
  4. object Args extends Serializable

    Permalink

    The Args class does simple command-line parsing. The rules are: keys start with one or more "-". Each key has zero or more values following.
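
    The two parsing rules can be sketched in a few lines of plain Scala. This is an illustration of the rules only, not the Args implementation: ArgsSketch and parse are hypothetical names, values seen before any key are grouped under the empty key by assumption, and special cases such as negative numbers are deliberately not handled.

```scala
object ArgsSketch {
  def parse(args: Seq[String]): Map[String, List[String]] =
    args
      .foldLeft(List("" -> List.empty[String])) { (acc, arg) =>
        if (arg.startsWith("-")) {
          // A new key: strip every leading dash.
          (arg.dropWhile(_ == '-') -> List.empty[String]) :: acc
        } else {
          // A value: attach it to the most recent key.
          val (key, values) = acc.head
          (key -> (arg :: values)) :: acc.tail
        }
      }
      .map { case (key, values) => key -> values.reverse }
      .filterNot { case (key, values) => key.isEmpty && values.isEmpty }
      .toMap
}
```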

  5. object BijectedOrderedSerialization

    Permalink
  6. object CalendarOps

    Permalink

  7. object CascadeTest

    Permalink
  8. object CascadingTokenUpdater

    Permalink
  9. object CastHfsTap

    Permalink
  10. object Config extends Serializable

    Permalink
  11. object DateOps extends Serializable

    Permalink

    Holds some conversion functions for dealing with strings as RichDate objects.

  12. object DateParser extends Serializable

    Permalink
  13. object DateRange extends Serializable

    Permalink
  14. object Dsl extends FieldConversions with Serializable

    Permalink

    This object has all the implicit functions and values that are used to make the scalding DSL, which includes the functions for automatically creating cascading.tuple.Fields objects from scala tuples of Strings, Symbols or Ints, as well as the cascading.pipe.Pipe enrichment to RichPipe which adds the scala.collections-like API to Pipe.

    It's useful to import Dsl._ when you are writing scalding code outside of a Job.

  15. object Duration extends Serializable

    Permalink

    Represents millisecond-based durations (non-calendar based): seconds, minutes, hours. calField should be a java.util.Calendar field.

  16. object Execution extends Serializable

    Permalink

    Execution has many methods for creating Execution[T] instances, which are the preferred way to compose computations in scalding libraries.

  17. object ExecutionApp extends Serializable

    Permalink
  18. object ExecutionContext

    Permalink
  19. object ExecutionCounters

    Permalink

    The companion gives several ways to create ExecutionCounters from other CascadingStats, JobStats, or Maps

  20. object ExecutionUtil

    Permalink
  21. object ExpandLibJarsGlobs

    Permalink
  22. object Field extends Serializable

    Permalink
  23. object FileSource extends Serializable

    Permalink
  24. object FixedPathTypedDelimited extends Serializable

    Permalink
  25. object FlowStateMap

    Permalink

    This is a mutable threadsafe store for attaching scalding information to the mutable flowDef

    NOTE: there is a subtle bug in scala regarding case classes with multiple sets of arguments, and their equality. For this reason, we use Source.sourceId as the key in this map

  26. object FunctionImplicits

    Permalink
  27. object HadoopSchemeInstance

    Permalink
  28. object HiddenFileFilter extends PathFilter

    Permalink
  29. object IdentityFunction extends BaseOperation[Any] with Function[Any] with ScaldingPrepare[Any]

    Permalink
  30. object InnerJoinMode extends JoinMode with Product with Serializable

    Permalink
  31. object Job extends Serializable

    Permalink
  32. object JobStats extends Serializable

    Permalink
  33. object JobTest

    Permalink
  34. object JoinAlgorithms extends Serializable

    Permalink
  35. object LineNumber

    Permalink
  36. object MapsideCache

    Permalink
  37. object MapsideReduce extends Serializable

    Permalink

    An implementation of map-side combining which is appropriate for associative and commutative functions. If a cacheSize is given, it is used; otherwise we query the config for cascading.aggregateby.threshold (the standard cascading param for an equivalent case); otherwise we use a default value of 100,000.

    This keeps a cache of keys up to the cache size, summing values as keys collide. On eviction, or completion of this Operation, the key-value pairs are put into outputCollector.

    This NEVER spills to disk and should generally never be a performance penalty. If you have poor locality in the keys, you just don't get any benefit, but there is little added cost.

    Note this means that you may still have repeated keys in the output even on a single mapper since the key space may be so large that you can't fit all of them in the cache at the same time.

    You can use this with the Fields-API by doing:

    val msr = new MapsideReduce(Semigroup.from(fn), 'key, 'value, None)
    // MUST map onto the same key,value space (may be multiple fields)
    val mapSideReduced = pipe.eachTo(('key, 'value) -> ('key, 'value)) { _ => msr }

    That said, this is equivalent to AggregateBy, and the only value is that it is much simpler than AggregateBy. AggregateBy assumes several parallel reductions are happening, and thus has many loops, and array lookups to deal with that. Since this does many fewer allocations, and has a smaller code-path it may be faster for the typed-API.
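
    The caching behaviour described for MapsideReduce can be modelled in plain Scala. The sketch below is not the library's implementation: MapsideCacheSketch, resolveCacheSize and mapsideSum are hypothetical names, values are fixed to Long with addition as the combining function, and evicting the oldest-inserted key is an assumed policy. Only the config key and the 100,000 default come from the description above.

```scala
import scala.collection.mutable

object MapsideCacheSketch {
  /** Explicit size first, then the config value, then the default. */
  def resolveCacheSize(explicit: Option[Int], conf: Map[String, String]): Int =
    explicit
      .orElse(conf.get("cascading.aggregateby.threshold").map(_.toInt))
      .getOrElse(100000)

  /** Sums values per key in a bounded cache; evicted and leftover pairs are emitted. */
  def mapsideSum[K](input: Iterator[(K, Long)], cacheSize: Int): List[(K, Long)] = {
    require(cacheSize > 0, "cacheSize must be positive")
    val cache = mutable.LinkedHashMap.empty[K, Long]
    val out = List.newBuilder[(K, Long)]
    input.foreach { case (k, v) =>
      cache.get(k) match {
        case Some(old) => cache.update(k, old + v) // key collision: sum in place
        case None =>
          if (cache.size >= cacheSize) {
            // Cache is full: emit the oldest entry to make room (assumed policy).
            val evicted = cache.head
            cache.remove(evicted._1)
            out += evicted
          }
          cache.update(k, v)
      }
    }
    out ++= cache // completion: flush whatever is still cached
    out.result()
  }
}
```

    With a cache smaller than the key space, the same key can be emitted more than once, which is why a downstream reduce is still needed to combine the partial sums.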

  38. object Mode extends Serializable

    Permalink
  39. object MultipleWritableSequenceFiles extends Serializable

    Permalink
  40. object NullSource extends Source with BaseNullSource

    Permalink

    A source that outputs nothing. It is used to drive execution of a task for side effects only.

  41. object OffsetTextLine extends Serializable

    Permalink

    Alternate typed TextLine source that keeps both 'offset and 'line fields.

  42. object OuterJoinMode extends JoinMode with Product with Serializable

    Permalink
  43. object PartitionedSequenceFile extends Serializable

    Permalink

    An implementation of SequenceFile output, split over a partition tap.

    apply assumes user wants a DelimitedPartition (the only strategy bundled with Cascading).

  44. object PartitionedTsv extends Serializable

    Permalink

    An implementation of TSV output, split over a partition tap.

    Similar to TemplateSource, but with addition of tsvFields, to let users explicitly specify which fields they want to see in the TSV (allows user to discard path fields).

    apply assumes user wants a DelimitedPartition (the only strategy bundled with Cascading).

  45. object RangedArgs

    Permalink
  46. object Read extends AccessMode with Product with Serializable

    Permalink
  47. object ReflectionUtils

    Permalink

    A helper for working with class reflection. Allows us to avoid code repetition.

  48. object RichDate extends Serializable

    Permalink

    RichDate adds some nice convenience functions to the Java date/calendar classes. We commonly do Date/Time work in analysis jobs, so having these operations convenient is very helpful.

  49. object RichFields extends Serializable

    Permalink
  50. object RichPathFilter

    Permalink
  51. object RichPipe extends Serializable

    Permalink
  52. object RichXHandler

    Permalink

    Provides an apply method for creating XHandlers with default or custom settings, and contains the messages and mapping.

  53. object RuntimeStats extends Serializable

    Permalink

    Wrapper around a FlowProcess, useful for, e.g., incrementing counters.

  54. object Stat extends Serializable

    Permalink
  55. object StatKey extends Serializable

    Permalink
  56. object Stats

    Permalink
  57. object StringUtility

    Permalink
  58. object SuccessFileFilter extends PathFilter

    Permalink
  59. val TDsl: scalding.typed.TDsl.type

    Permalink

    The objects for the Typed-API live in the scalding.typed package but are aliased here.

  60. object TestTapFactory extends Serializable

    Permalink

    Use this to create Taps for testing.

  61. object TextLine extends Serializable

    Permalink
  62. object TimePathedSource extends Serializable

    Permalink
  63. object Tool

    Permalink
  64. object Tracing

    Permalink

    Calling init registers "com.twitter.scalding" as a "tracing boundary" for Cascading. That means that when Cascading sends trace information to a DocumentService such as Driven, the trace will have information about the caller of Scalding instead of about the internals of Scalding. com.twitter.scalding.Job and its subclasses will automatically initialize Tracing.

    The register and unregister methods are provided for testing, but should not be needed for most development.

  65. object TupleConverter extends GeneratedTupleConverters

    Permalink
  66. object TupleGetter extends LowPriorityTupleGetter

    Permalink
  67. object TuplePacker extends CaseClassPackers

    Permalink
  68. object TupleSetter extends GeneratedTupleSetters

    Permalink
  69. object TupleUnpacker extends LowPriorityTupleUnpackers with Serializable

    Permalink

    Typeclass for objects which unpack an object into a tuple. The packer can verify the arity, types, and also the existence of the getter methods at plan time, without having the job blow up in the middle of a run.

  70. object TypeDescriptor extends Serializable

    Permalink
  71. object TypedCsv extends TypedSeperatedFile

    Permalink

    Typed comma separated values file

  72. object TypedOsv extends TypedSeperatedFile

    Permalink

    Typed one separated values file (commonly used by Pig)

  73. val TypedPipe: scalding.typed.TypedPipe.type

    Permalink
  74. object TypedPipeChecker

    Permalink

    This class is used to assist with testing a TypedPipe

  75. object TypedPsv extends TypedSeperatedFile

    Permalink

    Typed pipe separated values file

  76. object TypedTsv extends TypedSeperatedFile

    Permalink

    Typed tab separated values file

  77. object UniqueID extends Serializable

    Permalink
  78. object WritableSequenceFile extends Serializable

    Permalink
  79. object Write extends AccessMode with Product with Serializable

    Permalink
  80. package bdd

    Permalink
  81. package cascading_interop

    Permalink
  82. package filecache

    Permalink
  83. package macros

    Permalink
  84. package mathematics

    Permalink
  85. package reducer_estimation

    Permalink
  86. val scaldingVersion: String

    Permalink

    Make sure this is in sync with version.sbt

  87. package serialization

    Permalink
  88. package source

    Permalink
  89. package typed

    Permalink
