Class

com.twitter.scalding.typed

PartitionedDelimitedSource

Related Doc: package typed


case class PartitionedDelimitedSource[P, T](path: String, template: String, separator: String, fields: Fields, skipHeader: Boolean = false, writeHeader: Boolean = false, quote: String = "\"", strict: Boolean = true, safe: Boolean = true)(implicit mt: Manifest[T], valueSetter: TupleSetter[T], valueConverter: TupleConverter[T], partitionSetter: TupleSetter[P], partitionConverter: TupleConverter[P]) extends SchemedSource with PartitionSchemed[P, T] with Serializable with Product with Serializable

Scalding source to read or write partitioned delimited text.

When writing, it expects pairs of (P, T), where P is the data used for partitioning and T is the value to write out. For example:

val data = List(
  (("a", "x"), ("i", 1)),
  (("a", "y"), ("j", 2)),
  (("b", "z"), ("k", 3))
)
IterablePipe(data, flowDef, mode)
  .write(PartitionedDelimited[(String, String), (String, Int)](args("out"), "col1=%s/col2=%s"))

When reading, it produces pairs of (P, T), where P is the partition data and T is the data read from the files. For example:

val in: TypedPipe[((String, String), (String, Int))] =
  PartitionedDelimited[(String, String), (String, Int)](args("in"), "col1=%s/col2=%s")
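
The template argument determines the directory layout under the given path. The following is a minimal plain-Scala sketch of that idea, assuming each %s placeholder is filled with one partition field in order (in the manner of String.format) and that the partition is recovered from a directory name by pattern matching. TemplateSketch, toPath and fromPath are hypothetical names used for illustration only; they are not part of Scalding.

```scala
object TemplateSketch {
  // Fill each %s placeholder with one partition field, in order.
  def toPath(template: String, partition: Seq[String]): String =
    template.format(partition: _*)

  // Recover the partition fields from a relative directory path.
  // Literal parts of the template are quoted so that only the
  // %s placeholders act as capture groups.
  def fromPath(template: String, path: String): Option[Seq[String]] = {
    val regex = template
      .split("%s", -1)
      .map(java.util.regex.Pattern.quote)
      .mkString("([^/]*)")
      .r
    regex.unapplySeq(path)
  }

  def main(args: Array[String]): Unit = {
    val template = "col1=%s/col2=%s"
    println(toPath(template, Seq("a", "x")))     // col1=a/col2=x
    println(fromPath(template, "col1=b/col2=z")) // Some(List(b, z))
  }
}
```

Under this assumption, the three records in the writing example would land in the directories col1=a/col2=x, col1=a/col2=y and col1=b/col2=z.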
Source
PartitionedDelimitedSource.scala
Linear Supertypes
Serializable, Product, Equals, PartitionSchemed[P, T], HfsTapProvider, Mappable[(P, T)], TypedSource[(P, T)], TypedSink[(P, T)], SchemedSource, Source, Serializable, AnyRef, Any

Instance Constructors

  1. new PartitionedDelimitedSource(path: String, template: String, separator: String, fields: Fields, skipHeader: Boolean = false, writeHeader: Boolean = false, quote: String = "\"", strict: Boolean = true, safe: Boolean = true)(implicit mt: Manifest[T], valueSetter: TupleSetter[T], valueConverter: TupleConverter[T], partitionSetter: TupleSetter[P], partitionConverter: TupleConverter[P])


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def andThen[U](fn: ((P, T)) ⇒ U): TypedSource[U]

    Transform this TypedSource into another by mapping after reading. This is not called map because that name would conflict with Mappable.

    Definition Classes
    TypedSource
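
    The idea of mapping after reading can be sketched with a toy model. MiniSource below is a hypothetical stand-in written for this illustration and is not Scalding's TypedSource; only the shape of andThen mirrors the signature shown above.

    ```scala
    // Toy stand-in for a typed source: something that yields values of type T.
    trait MiniSource[T] { self =>
      def read(): Iterator[T]

      // Build a new source that applies fn to every element after reading.
      def andThen[U](fn: T => U): MiniSource[U] = new MiniSource[U] {
        def read(): Iterator[U] = self.read().map(fn)
      }
    }

    object AndThenSketch {
      // A partition (col1, col2) paired with a value (name, count).
      type Row = ((String, String), (String, Int))

      val rows: MiniSource[Row] = new MiniSource[Row] {
        def read(): Iterator[Row] = Iterator(
          (("a", "x"), ("i", 1)),
          (("a", "y"), ("j", 2))
        )
      }

      // Keep only the first partition column and the count.
      val counts: MiniSource[(String, Int)] =
        rows.andThen { case ((col1, _), (_, n)) => (col1, n) }

      def main(args: Array[String]): Unit =
        counts.read().foreach(println)
    }
    ```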
  5. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  6. def checkFlowDefNotNull()(implicit flowDef: FlowDef, mode: Mode): Unit

    Attributes
    protected
    Definition Classes
    Source
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. def contraMap[U](fn: (U) ⇒ (P, T)): TypedSink[U]

    Transform this sink into another type by applying a function first.

    Definition Classes
    TypedSink
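
    Applying a function before writing can be sketched with a toy model. MiniSink and Event below are hypothetical types written for this illustration and are not part of Scalding; only the shape of contraMap mirrors the signature shown above.

    ```scala
    import scala.collection.mutable.ArrayBuffer

    // Toy stand-in for a typed sink: something that accepts values of type T.
    trait MiniSink[T] { self =>
      def write(value: T): Unit

      // Build a sink for U by converting each U to a T before writing.
      def contraMap[U](fn: U => T): MiniSink[U] = new MiniSink[U] {
        def write(value: U): Unit = self.write(fn(value))
      }
    }

    object ContraMapSketch {
      // A partition (col1, col2) paired with a value (name, count).
      type Row = ((String, String), (String, Int))

      // Hypothetical domain type that callers would rather write.
      final case class Event(col1: String, col2: String, name: String, count: Int)

      // Collects everything written, so the effect is easy to inspect.
      val written: ArrayBuffer[Row] = ArrayBuffer.empty[Row]

      val rowSink: MiniSink[Row] = new MiniSink[Row] {
        def write(value: Row): Unit = { written += value; () }
      }

      // A sink of Event that splits each event into (partition, value).
      val eventSink: MiniSink[Event] =
        rowSink.contraMap[Event](e => ((e.col1, e.col2), (e.name, e.count)))

      def main(args: Array[String]): Unit = {
        eventSink.write(Event("a", "x", "i", 1))
        println(written.toList)
      }
    }
    ```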
  9. def converter[U >: (P, T)]: TupleConverter[U]

    Combines the partition and value converters to extract the data from a flat Cascading tuple into a pair of P and T.

    Definition Classes
    PartitionSchemed → TypedSource
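
    The relationship between a flat row and a (P, T) pair can be sketched in plain Scala. FlatTupleSketch is a hypothetical illustration that uses a List[Any] in place of a Cascading tuple, and the column order chosen here (value columns first, then partition columns) is an assumption made for the example, not something this page confirms.

    ```scala
    object FlatTupleSketch {
      type P = (String, String) // partition data
      type T = (String, Int)    // value data

      // Flatten a (P, T) pair into one flat row.
      // Assumed order: value columns first, then partition columns.
      def flatten(pair: (P, T)): List[Any] = {
        val ((col1, col2), (name, count)) = pair
        List(name, count, col1, col2)
      }

      // Rebuild the (P, T) pair from a flat row laid out as above.
      def unflatten(row: List[Any]): (P, T) = row match {
        case List(name: String, count: Int, col1: String, col2: String) =>
          ((col1, col2), (name, count))
        case other =>
          throw new IllegalArgumentException(s"unexpected row: $other")
      }

      def main(args: Array[String]): Unit = {
        val pair: (P, T) = (("a", "x"), ("i", 1))
        println(flatten(pair))            // List(i, 1, a, x)
        println(unflatten(flatten(pair))) // ((a,x),(i,1))
      }
    }
    ```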
  10. def createHfsTap(scheme: Scheme[JobConf, RecordReader[_, _], OutputCollector[_, _], _, _], path: String, sinkMode: SinkMode): Hfs

    Definition Classes
    HfsTapProvider
  11. def createTap(readOrWrite: AccessMode)(implicit mode: Mode): Tap[_, _, _]

    Creates the taps for local and HDFS mode.

    Definition Classes
    PartitionSchemed → Source
  12. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  13. val fields: Fields

  14. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  15. final def flatMapTo[U](out: Fields)(mf: ((P, T)) ⇒ TraversableOnce[U])(implicit flowDef: FlowDef, mode: Mode, setter: TupleSetter[U]): Pipe

    To filter, use this method and return an Iterable of length 0 or 1. Filter does not change column names, whereas this method is generally expected to change the columns.

    Definition Classes
    Mappable
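
    Filtering by returning zero or one element can be sketched with ordinary Scala collections. The example below uses List.flatMap as a stand-in for the mf argument and does not call Scalding; FilterViaFlatMapSketch and keepLargeCounts are hypothetical names.

    ```scala
    object FilterViaFlatMapSketch {
      // A partition (col1, col2) paired with a value (name, count).
      type Row = ((String, String), (String, Int))

      // Return no element to drop a row, or one element to keep it.
      // The kept element is reshaped to just the columns of interest.
      def keepLargeCounts(row: Row): Option[(String, Int)] = row match {
        case ((col1, _), (_, n)) if n >= 2 => Some((col1, n))
        case _                             => None
      }

      def main(args: Array[String]): Unit = {
        val rows: List[Row] = List(
          (("a", "x"), ("i", 1)),
          (("a", "y"), ("j", 2)),
          (("b", "z"), ("k", 3))
        )
        println(rows.flatMap(keepLargeCounts)) // List((a,2), (b,3))
      }
    }
    ```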
  16. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  17. def hdfsScheme: Scheme[JobConf, RecordReader[_, _], OutputCollector[_, _], _, _]

    The scheme to use if the source is on HDFS.

    Definition Classes
    PartitionedDelimitedSource → SchemedSource
  18. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  19. def localScheme: Scheme[Properties, InputStream, OutputStream, _, _]

    The scheme to use if the source is local.

    Definition Classes
    PartitionedDelimitedSource → SchemedSource
  20. final def mapTo[U](out: Fields)(mf: ((P, T)) ⇒ U)(implicit flowDef: FlowDef, mode: Mode, setter: TupleSetter[U]): Pipe

    Definition Classes
    Mappable
  21. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  22. final def notify(): Unit

    Definition Classes
    AnyRef
  23. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  24. implicit val partitionConverter: TupleConverter[P]

  25. def partitionFields: Fields

    Definition Classes
    PartitionSchemed
  26. implicit val partitionSetter: TupleSetter[P]

  27. val path: String

  28. val quote: String

  29. def read(implicit flowDef: FlowDef, mode: Mode): Pipe

    Definition Classes
    Source
  30. val safe: Boolean

  31. val separator: String

  32. def setter[U <: (P, T)]: TupleSetter[U]

    Flattens a pair of P and T into a Cascading tuple.

    Definition Classes
    PartitionSchemed → TypedSink
  33. def sinkFields: Fields

    Definition Classes
    PartitionSchemed → TypedSink
  34. val sinkMode: SinkMode

    Definition Classes
    SchemedSource
  35. val skipHeader: Boolean

  36. def sourceFields: Fields

    Definition Classes
    TypedSource
  37. def sourceId: String

    This is a name that refers to this exact instance of the source. Put another way, if s1.sourceId == s2.sourceId, the job should work the same if one is replaced with the other.

    Definition Classes
    Source
  38. val strict: Boolean

  39. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  40. val template: String

  41. def toIterator(implicit config: Config, mode: Mode): Iterator[(P, T)]

    Allows you to read a Tap on the submit node. NOT FOR USE IN THE MAPPERS OR REDUCERS. A typical use might be to read in Job.next to determine whether another job is needed.

    Definition Classes
    Mappable
  42. def transformForRead(pipe: Pipe): Pipe

    Attributes
    protected
    Definition Classes
    Source
  43. def transformForWrite(pipe: Pipe): Pipe

    Attributes
    protected
    Definition Classes
    Source
  44. def transformInTest: Boolean

    The mock passed in to scalding.JobTest may be considered a mock of either the Tap or the Source. By default, as of 0.9.0, it is considered a mock of the Source. If you set this to true, the mock in TestMode will be considered a mock of the Tap (which must be transformed) and not the Source.

    Definition Classes
    Source
  45. val types: Array[Class[_]]

  46. def validateTaps(mode: Mode): Unit

    Definition Classes
    Source
  47. implicit val valueConverter: TupleConverter[T]

  48. implicit val valueSetter: TupleSetter[T]

  49. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  50. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  51. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  52. def writeFrom(pipe: Pipe)(implicit flowDef: FlowDef, mode: Mode): Pipe

    Writes the pipe but returns the input so it can be chained into the next operation.

    Definition Classes
    Source
  53. val writeHeader: Boolean


Deprecated Value Members

  1. def readAtSubmitter[T](implicit mode: Mode, conv: TupleConverter[T]): Stream[T]

    Definition Classes
    Source
    Annotations
    @deprecated
    Deprecated

    (Since version 0.9.0) Replace with Mappable.toIterator.
