Object

com.twitter.penguin.korean

TwitterKoreanProcessor

object TwitterKoreanProcessor

TwitterKoreanProcessor provides error- and slang-tolerant Korean tokenization.

Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def extractPhrases(tokens: Seq[KoreanToken], filterSpam: Boolean = false, enableHashtags: Boolean = true): Seq[KoreanPhrase]

    Extract noun phrases from a sequence of Korean tokens.

    tokens

    Korean tokens

    filterSpam

    true if spam/slang terms should be filtered out (default: false)

    enableHashtags

    true if #hashtags should be included (default: true)

    returns

    A sequence of extracted phrases
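
    A minimal usage sketch, assuming the library is on the classpath. The sample sentence and the object name PhraseExample are illustrative assumptions, not part of the API. The tokens are produced by tokenize on this same object.

    ```scala
    import com.twitter.penguin.korean.TwitterKoreanProcessor

    // PhraseExample is a hypothetical wrapper used only for this sketch.
    object PhraseExample {
      // Sample input chosen for illustration only.
      val text = "한국어를 처리하는 예시입니다 #한국어"

      // extractPhrases works on tokens, so tokenize the text first.
      val tokens = TwitterKoreanProcessor.tokenize(text)

      // Both optional flags are passed by name to make the choice explicit.
      val phrases = TwitterKoreanProcessor.extractPhrases(
        tokens, filterSpam = true, enableHashtags = true)

      def main(args: Array[String]): Unit = phrases.foreach(println)
    }
    ```

    Passing the flags by name keeps the call readable, since both are Booleans with different defaults.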

  9. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  11. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  12. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  13. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  14. def normalize(text: CharSequence): CharSequence

    Normalize Korean text. Uses KoreanNormalizer.normalize().

    text

    Input text

    returns

    Normalized Korean text
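
    A short sketch of calling normalize, assuming the library is on the classpath. The noisy sample string and the object name NormalizeExample are assumptions made for illustration. The exact normalized output depends on the library version, so it is printed rather than stated here.

    ```scala
    import com.twitter.penguin.korean.TwitterKoreanProcessor

    // NormalizeExample is a hypothetical wrapper used only for this sketch.
    object NormalizeExample {
      // Informal text with a slang ending and repeated characters (illustrative input).
      val raw = "한국어를 처리하는 예시입니닼ㅋㅋㅋㅋㅋ"

      // normalize accepts and returns a CharSequence.
      val normalized: CharSequence = TwitterKoreanProcessor.normalize(raw)

      def main(args: Array[String]): Unit = println(normalized)
    }
    ```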

  15. final def notify(): Unit

    Definition Classes
    AnyRef
  16. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  17. def splitSentences(text: CharSequence): Seq[Sentence]

    Split input text into sentences.

    text

    input text

    returns

    A sequence of sentences.
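
    A brief sketch of sentence splitting, assuming the library is on the classpath. The three-sentence input and the object name SplitExample are illustrative assumptions. Each returned Sentence is printed using its own toString.

    ```scala
    import com.twitter.penguin.korean.TwitterKoreanProcessor

    // SplitExample is a hypothetical wrapper used only for this sketch.
    object SplitExample {
      // Illustrative input containing three short sentences.
      val text = "안녕하세요. 반갑습니다! 오늘 날씨가 좋네요."

      // The element type (Sentence) is inferred from the method signature.
      val sentences = TwitterKoreanProcessor.splitSentences(text)

      def main(args: Array[String]): Unit = sentences.foreach(println)
    }
    ```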

  18. def stem(tokens: Seq[KoreanToken]): Seq[KoreanToken]

    Stem a sequence of Korean tokens. This is a wrapper for the Korean stemmer.

    tokens

    Korean tokens

    returns

    A sequence of stemmed tokens
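
    A small sketch showing stem applied to the output of tokenize, assuming the library is on the classpath. The sample sentence and the object name StemExample are assumptions for illustration. Printing both sequences makes it easy to compare tokens before and after stemming.

    ```scala
    import com.twitter.penguin.korean.TwitterKoreanProcessor

    // StemExample is a hypothetical wrapper used only for this sketch.
    object StemExample {
      // Illustrative input with an inflected predicate.
      val text = "한국어를 처리하는 예시입니다"

      // stem takes tokens, not raw text.
      val tokens  = TwitterKoreanProcessor.tokenize(text)
      val stemmed = TwitterKoreanProcessor.stem(tokens)

      def main(args: Array[String]): Unit = {
        println("tokens:  " + tokens.mkString(", "))
        println("stemmed: " + stemmed.mkString(", "))
      }
    }
    ```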

  19. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  20. def toString(): String

    Definition Classes
    AnyRef → Any
  21. def tokenize(text: CharSequence): Seq[KoreanToken]

    Tokenize text into a sequence of KoreanTokens. Each token carries part-of-speech information and a flag indicating whether it is an out-of-vocabulary term.

    text

    input text

    returns

    A sequence of KoreanTokens.
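
    A minimal tokenization sketch, assuming the library is on the classpath. The sample sentence and the object name TokenizeExample are illustrative assumptions. Tokens are printed with their default toString, so no KoreanToken field names are assumed.

    ```scala
    import com.twitter.penguin.korean.TwitterKoreanProcessor

    // TokenizeExample is a hypothetical wrapper used only for this sketch.
    object TokenizeExample {
      // Illustrative input.
      val text = "한국어를 처리하는 예시입니다"

      // A String is a CharSequence, so it can be passed directly.
      val tokens = TwitterKoreanProcessor.tokenize(text)

      def main(args: Array[String]): Unit = tokens.foreach(println)
    }
    ```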

  22. def tokensToStrings(tokens: Seq[KoreanToken]): Seq[String]

    Convert a sequence of KoreanTokens into a sequence of token strings. Space tokens are excluded.

    tokens

    Korean tokens

    returns

    A sequence of token strings.
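
    A short sketch of turning tokens into plain strings, assuming the library is on the classpath. The sample sentence and the object name StringsExample are assumptions for illustration. Because spaces are excluded, the result has at most as many elements as the token sequence.

    ```scala
    import com.twitter.penguin.korean.TwitterKoreanProcessor

    // StringsExample is a hypothetical wrapper used only for this sketch.
    object StringsExample {
      // Illustrative input with single spaces between words.
      val text = "한국어를 처리하는 예시입니다"

      val tokens  = TwitterKoreanProcessor.tokenize(text)
      // Drops space tokens and keeps only the surface text of each token.
      val strings = TwitterKoreanProcessor.tokensToStrings(tokens)

      def main(args: Array[String]): Unit = println(strings.mkString(" | "))
    }
    ```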

  23. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  24. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
