Object

com.twitter.penguin.korean.tokenizer

KoreanTokenizer

Related Doc: package tokenizer

object KoreanTokenizer

Provides Korean tokenization.

Chunk (어절): a unit delimited by whitespace (사랑하는사람을)
Word (단어): a single sentence constituent (사랑하는, 사람을)
Token (토큰): a unit similar to a morpheme, though not necessarily grammatically exact (사랑, 하는, 사람, 을)

Whenever the behavior of KoreanParser changes, the initial cache has to be regenerated by running tools.CreateInitialCache.
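
The three levels of granularity described above can be illustrated with a small self-contained sketch. It does not call the library: the object `GranularitySketch`, the helper `chunks`, and the second chunk `만났다` are hypothetical names and data introduced only for illustration, and the word and token segmentations are copied by hand from the description rather than computed.

```scala
// Hypothetical illustration only; nothing here is part of KoreanTokenizer.
object GranularitySketch {
  // A chunk (어절) is whatever sits between runs of whitespace.
  def chunks(text: String): Seq[String] =
    text.trim.split("\\s+").toSeq.filter(_.nonEmpty)

  // The example chunk from the description, with its hand-written segmentations.
  val chunk: String = "사랑하는사람을"
  val words: Seq[String] = Seq("사랑하는", "사람을")
  val tokens: Seq[String] = Seq("사랑", "하는", "사람", "을")

  def main(args: Array[String]): Unit = {
    // Splitting on whitespace only recovers chunks, not words or tokens.
    println(chunks("사랑하는사람을 만났다"))
    // Both finer segmentations concatenate back to the original chunk.
    println(words.mkString == chunk)
    println(tokens.mkString == chunk)
  }
}
```

Whitespace splitting alone never separates 사랑하는 from 사람을, which is why a dictionary-driven tokenizer is needed below the chunk level.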

Linear Supertypes
AnyRef, Any

Type Members

  1. case class CandidateParse(parse: ParsedChunk, curTrie: List[KoreanPosTrie], ending: Option[KoreanPos]) extends Product with Serializable

  2. case class KoreanToken(text: String, pos: KoreanPos, offset: Int, length: Int, unknown: Boolean = false) extends Product with Serializable

  3. case class ParsedChunk(posNodes: Seq[KoreanToken], words: Int) extends Product with Serializable

    A candidate parse for a chunk.

    posNodes: Sequence of KoreanTokens.

    words: Number of words in this candidate parse.

  4. case class ParsedChunkWithMinScore(parsedChunk: Option[ParsedChunk], score: Float) extends Product with Serializable

  5. case class PossibleTrie(curTrie: KoreanPosTrie, words: Int) extends Product with Serializable

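
The `offset` and `length` fields of `KoreanToken` can be read as a span into the original input. The sketch below shows that reading with a local stand-in class, under stated assumptions: `offset` is taken to be a zero-based character index and `length` a character count, which the signature alone does not confirm. `TokenOffsetSketch`, `Token`, `surface`, the sample data, and the string POS tags are all hypothetical; the real `pos` field is a `KoreanPos`.

```scala
// Hypothetical stand-in; the real class is KoreanTokenizer.KoreanToken.
object TokenOffsetSketch {
  // Mirrors the documented field list, with pos simplified to a String tag.
  case class Token(text: String, pos: String, offset: Int, length: Int,
                   unknown: Boolean = false)

  // Assumption: offset/length describe a character span of the input.
  def surface(input: CharSequence, t: Token): String =
    input.subSequence(t.offset, t.offset + t.length).toString

  val input: String = "사랑하는사람을"

  // Hand-written tokens for the example chunk; the tags are placeholders.
  val sample: Seq[Token] = Seq(
    Token("사랑", "Noun", 0, 2),
    Token("하는", "Verb", 2, 2),
    Token("사람", "Noun", 4, 2),
    Token("을", "Josa", 6, 1)
  )

  def main(args: Array[String]): Unit =
    sample.foreach { t =>
      val end = t.offset + t.length
      println(s"${t.text} pos=${t.pos} span=[${t.offset}, $end) " +
        s"matches=${surface(input, t) == t.text}")
    }
}
```

If the assumption holds for the real class, slicing the input by each token's span recovers that token's `text`.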

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  10. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  11. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  12. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  13. final def notify(): Unit

    Definition Classes
    AnyRef
  14. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  15. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  16. def toString(): String

    Definition Classes
    AnyRef → Any
  17. def tokenize(text: CharSequence): Seq[KoreanToken]

    Parse Korean text into a sequence of KoreanTokens

    text: Input Korean chunk

    returns: sequence of KoreanTokens

  18. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  19. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  20. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
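
Of the members listed above, `tokenize` is the only one specific to this object. A minimal usage sketch follows, assuming the library is on the classpath; the wrapper object `TokenizeExample` is hypothetical, and the import path for `KoreanToken` assumes it is nested in `KoreanTokenizer`, as the Type Members list suggests. No expected output is shown because the exact tokens depend on the library version and its dictionaries.

```scala
import com.twitter.penguin.korean.tokenizer.KoreanTokenizer
import com.twitter.penguin.korean.tokenizer.KoreanTokenizer.KoreanToken

// Hypothetical wrapper object; only tokenize and KoreanToken come from the page.
object TokenizeExample {
  def main(args: Array[String]): Unit = {
    // tokenize accepts any CharSequence, so a plain String works.
    val tokens: Seq[KoreanToken] = KoreanTokenizer.tokenize("사랑하는사람을")

    // Print every documented field of each token.
    tokens.foreach { t =>
      println(s"${t.text}\t${t.pos}\toffset=${t.offset}\t" +
        s"length=${t.length}\tunknown=${t.unknown}")
    }
  }
}
```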
