com.twitter.common.text.token
Class TokenizedCharSequenceStream

java.lang.Object
  extended by org.apache.lucene.util.AttributeSource
      extended by com.twitter.common.text.token.TokenStream
          extended by com.twitter.common.text.token.TokenizedCharSequenceStream

public class TokenizedCharSequenceStream
extends TokenStream

Reproduces the result of tokenization if the input text is an instance of TokenizedCharSequence. Otherwise, passes the input text to the downstream TokenStream.
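
The replay-or-delegate behavior described above can be illustrated with a minimal, self-contained sketch. It does not use the real library classes: ReplaySketch, SimpleTokenStream, PreTokenized, WhitespaceStream and ReplayingStream are hypothetical stand-ins invented for this example, and throwing IllegalArgumentException for untokenized input when no downstream stream exists is an assumption, not documented behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class ReplaySketch {
    /** Hypothetical stand-in for TokenStream: reset with new input, then pull tokens. */
    public interface SimpleTokenStream {
        void reset(CharSequence input);
        boolean incrementToken();
        String currentToken();
    }

    /** Hypothetical stand-in for TokenizedCharSequence: text plus its precomputed tokens. */
    public static final class PreTokenized implements CharSequence {
        final String text;
        final List<String> tokens;
        public PreTokenized(String text, List<String> tokens) {
            this.text = text;
            this.tokens = tokens;
        }
        @Override public int length() { return text.length(); }
        @Override public char charAt(int i) { return text.charAt(i); }
        @Override public CharSequence subSequence(int s, int e) { return text.subSequence(s, e); }
        @Override public String toString() { return text; }
    }

    /** Whitespace tokenizer playing the role of the downstream stream. */
    public static final class WhitespaceStream implements SimpleTokenStream {
        private Iterator<String> it = Collections.<String>emptyIterator();
        private String current;
        @Override public void reset(CharSequence input) {
            String trimmed = input.toString().trim();
            it = trimmed.isEmpty()
                ? Collections.<String>emptyIterator()
                : Arrays.asList(trimmed.split("\\s+")).iterator();
        }
        @Override public boolean incrementToken() {
            if (!it.hasNext()) return false;
            current = it.next();
            return true;
        }
        @Override public String currentToken() { return current; }
    }

    /** Replays stored tokens for pre-tokenized input; otherwise delegates downstream. */
    public static final class ReplayingStream implements SimpleTokenStream {
        private final SimpleTokenStream downstream; // null when no fallback tokenizer is given
        private Iterator<String> replay;            // non-null only while replaying stored tokens
        private String current;

        public ReplayingStream() { this(null); }
        public ReplayingStream(SimpleTokenStream downstream) { this.downstream = downstream; }

        @Override public void reset(CharSequence input) {
            current = null;
            if (input instanceof PreTokenized) {
                replay = ((PreTokenized) input).tokens.iterator();
            } else if (downstream != null) {
                replay = null;
                downstream.reset(input);
            } else {
                // Assumption: the sketch rejects untokenized input without a fallback.
                throw new IllegalArgumentException("input is not pre-tokenized");
            }
        }

        @Override public boolean incrementToken() {
            if (replay != null) {
                if (!replay.hasNext()) return false;
                current = replay.next();
                return true;
            }
            if (downstream == null || !downstream.incrementToken()) return false;
            current = downstream.currentToken();
            return true;
        }

        @Override public String currentToken() { return current; }
    }

    /** Resets the stream on the input and drains every token into a list. */
    public static List<String> collect(SimpleTokenStream stream, CharSequence input) {
        stream.reset(input);
        List<String> out = new ArrayList<>();
        while (stream.incrementToken()) out.add(stream.currentToken());
        return out;
    }
}
```

In this sketch, a PreTokenized input whose stored tokens are "New York" and "rocks" is replayed exactly as stored, even though the whitespace fallback would have split the same text into three tokens. This is the point of replaying: the earlier tokenization result is preserved rather than recomputed.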


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State
 
Constructor Summary
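TokenizedCharSequenceStream()
          Creates a stream that accepts only already-tokenized text (TokenizedCharSequence) as input.
TokenizedCharSequenceStream(TokenStream inputStream)
          Creates a stream that uses inputStream to tokenize input text that is not already tokenized.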
 
Method Summary
 boolean incrementToken()
          Consumers call this method to advance the stream to the next token.
 void reset(CharSequence input)
          Resets this TokenStream (and any downstream TokenStreams, if they exist) to parse a new input.
 
Methods inherited from class com.twitter.common.text.token.TokenStream
getInstanceOf, toStringList
 
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

TokenizedCharSequenceStream

public TokenizedCharSequenceStream(TokenStream inputStream)
Constructor. If the input text is not already tokenized (that is, not an instance of TokenizedCharSequence), this stream uses inputStream to tokenize it.

Parameters:
inputStream - a token stream used to tokenize text that is not already tokenized.

TokenizedCharSequenceStream

public TokenizedCharSequenceStream()
Constructor. A stream created this way accepts only already-tokenized text (TokenizedCharSequence) as input.

Method Detail

incrementToken

public boolean incrementToken()
Description copied from class: TokenStream
Consumers call this method to advance the stream to the next token.

Specified by:
incrementToken in class TokenStream
Returns:
false for end of stream; true otherwise

reset

public void reset(CharSequence input)
Description copied from class: TokenStream
Resets this TokenStream (and any downstream TokenStreams, if they exist) to parse a new input.

Specified by:
reset in class TokenStream
Parameters:
input - new text to parse.
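
The two methods above are typically used together: call reset once per input, then call incrementToken until it returns false. The following minimal sketch shows that loop and the reuse of a single stream across inputs. ResetLoopSketch, WordStream, term and drain are hypothetical names for illustration only; the real classes expose token data through Lucene attributes rather than a term() accessor.

```java
import java.util.ArrayList;
import java.util.List;

public class ResetLoopSketch {
    /** Hypothetical stand-in exposing the two documented calls plus a token accessor. */
    public static class WordStream {
        private CharSequence input = "";
        private int pos = 0;
        private String current;

        /** Points the stream at new input and rewinds to its start. */
        public void reset(CharSequence newInput) {
            input = newInput;
            pos = 0;
            current = null;
        }

        /** Advances to the next space-separated word; returns false at end of stream. */
        public boolean incrementToken() {
            while (pos < input.length() && input.charAt(pos) == ' ') pos++;
            if (pos >= input.length()) return false;
            int start = pos;
            while (pos < input.length() && input.charAt(pos) != ' ') pos++;
            current = input.subSequence(start, pos).toString();
            return true;
        }

        /** Returns the token produced by the last successful incrementToken() call. */
        public String term() { return current; }
    }

    /** Resets the stream on the text, then drains it with the usual consumer loop. */
    public static List<String> drain(WordStream stream, CharSequence text) {
        stream.reset(text);
        List<String> out = new ArrayList<>();
        while (stream.incrementToken()) {
            out.add(stream.term());
        }
        return out;
    }
}
```

Because reset replaces the input and rewinds the position, the same stream instance can be drained repeatedly with different texts, which avoids allocating a new stream per input.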