com.twitter.common.text.token
Class TokenProcessor

java.lang.Object
  extended by org.apache.lucene.util.AttributeSource
      extended by com.twitter.common.text.token.TokenStream
          extended by com.twitter.common.text.token.TokenProcessor
Direct Known Subclasses:
ExtractorBasedTokenCombiner, LookAheadTokenCombiner, RegexDetector, TokenFilter, TokenStreamDuplicator, TokenStreamDuplicator.DuplicatedTokenStream

public abstract class TokenProcessor
extends TokenStream

A TokenStream whose input is another TokenStream. This class plays the same role as org.apache.lucene.analysis.TokenFilter in Lucene; it is not to be confused with the TokenFilter subclass in this package.
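The wrapping relationship described above can be sketched as follows. This is only an illustration of the pattern: SketchStream, WhitespaceSource, LowerCaseSketchProcessor and ProcessorSketchDemo are hypothetical stand-ins invented for this example, and their methods are assumptions that do not reproduce the real TokenStream or TokenProcessor API (which works with Lucene attributes).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a token stream; NOT the real TokenStream API.
abstract class SketchStream {
  /** Advances to the next token; returns false once the input is exhausted. */
  abstract boolean incrementToken();

  /** Returns the text of the current token. */
  abstract String term();
}

// A source stream: it has no input stream and splits text on whitespace.
class WhitespaceSource extends SketchStream {
  private final String[] tokens;
  private int next;

  WhitespaceSource(String text) {
    String trimmed = text.trim();
    this.tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
  }

  @Override boolean incrementToken() {
    if (next >= tokens.length) return false;
    next++;
    return true;
  }

  @Override String term() { return tokens[next - 1]; }
}

// Plays the role of a TokenProcessor subclass: its input is another stream.
class LowerCaseSketchProcessor extends SketchStream {
  private final SketchStream input;

  LowerCaseSketchProcessor(SketchStream input) { this.input = input; }

  // Advancing this stream advances the wrapped stream.
  @Override boolean incrementToken() { return input.incrementToken(); }

  // The current token is the wrapped stream's token, transformed.
  @Override String term() { return input.term().toLowerCase(Locale.ROOT); }
}

class ProcessorSketchDemo {
  /** Collects every remaining token of the stream into a list. */
  static List<String> drain(SketchStream stream) {
    List<String> out = new ArrayList<>();
    while (stream.incrementToken()) out.add(stream.term());
    return out;
  }

  public static void main(String[] args) {
    SketchStream chain = new LowerCaseSketchProcessor(new WhitespaceSource("Hello Token World"));
    System.out.println(drain(chain)); // prints [hello, token, world]
  }
}
```

Because a processor is itself a stream, processors built this way can be stacked to any depth, which is the same property Lucene's TokenFilter provides.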


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State
 
Constructor Summary
TokenProcessor(TokenStream inputStream)
          Constructs a new TokenProcessor.
 
Method Summary
protected  TokenStream getInputStream()
          Returns the input TokenStream of this TokenProcessor.
<T extends TokenStream> T getInstanceOf(Class<T> cls)
          Searches for and returns an instance of the specified class in this TokenStream chain.
 void reset(CharSequence input)
          Resets this TokenStream (and also downstream tokens if they exist) to parse a new input.
 
Methods inherited from class com.twitter.common.text.token.TokenStream
incrementToken, toStringList
 
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

TokenProcessor

public TokenProcessor(TokenStream inputStream)
Constructs a new TokenProcessor.

Parameters:
inputStream - input TokenStream
Method Detail

reset

public void reset(CharSequence input)
Description copied from class: TokenStream
Resets this TokenStream (and also downstream tokens if they exist) to parse a new input.

Specified by:
reset in class TokenStream
Parameters:
input - new text to parse.
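A minimal sketch of how a reset can travel through a chain so that the same chain is reused for a new input. It assumes that a processor forwards the new input to its input stream and then clears its own state; that ordering is an assumption, not something this page confirms. ResettableStream, SplitSource, NumberingProcessor and ResetSketchDemo are hypothetical stand-ins, not the real API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a resettable token stream; NOT the real TokenStream API.
abstract class ResettableStream {
  abstract void reset(CharSequence input);
  abstract boolean incrementToken();
  abstract String term();
}

// Source stream: re-tokenizes on whitespace every time it is reset.
class SplitSource extends ResettableStream {
  private String[] tokens = new String[0];
  private int next;

  @Override void reset(CharSequence input) {
    String text = input.toString().trim();
    tokens = text.isEmpty() ? new String[0] : text.split("\\s+");
    next = 0;
  }

  @Override boolean incrementToken() {
    if (next >= tokens.length) return false;
    next++;
    return true;
  }

  @Override String term() { return tokens[next - 1]; }
}

// Processor with per-input state (a counter) that must be cleared on reset.
class NumberingProcessor extends ResettableStream {
  private final ResettableStream input;
  private int seen;

  NumberingProcessor(ResettableStream input) { this.input = input; }

  @Override void reset(CharSequence text) {
    input.reset(text); // assumed behavior: forward the new input along the chain
    seen = 0;          // then clear this processor's own state
  }

  @Override boolean incrementToken() {
    if (!input.incrementToken()) return false;
    seen++;
    return true;
  }

  @Override String term() { return seen + ":" + input.term(); }
}

class ResetSketchDemo {
  /** Resets the chain to the given text and collects all tokens. */
  static List<String> parse(ResettableStream chain, CharSequence text) {
    chain.reset(text);
    List<String> out = new ArrayList<>();
    while (chain.incrementToken()) out.add(chain.term());
    return out;
  }

  public static void main(String[] args) {
    ResettableStream chain = new NumberingProcessor(new SplitSource());
    System.out.println(parse(chain, "a b")); // prints [1:a, 2:b]
    System.out.println(parse(chain, "c"));   // prints [1:c]
  }
}
```

The second call starts numbering from 1 again, which shows that a single reset on the outermost stream was enough to reinitialize every stage in this sketch.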

getInputStream

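protected TokenStream getInputStream()
Returns the input TokenStream of this TokenProcessor.

Returns:
the input TokenStream passed to the constructor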

getInstanceOf

public <T extends TokenStream> T getInstanceOf(Class<T> cls)
Description copied from class: TokenStream
Searches for and returns an instance of the specified class in this TokenStream chain.

Overrides:
getInstanceOf in class TokenStream
Parameters:
cls - class to search for
Returns:
instance of the class cls if found or null if not found
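One way such a chain search could work is sketched below. It assumes each stage first checks whether it is itself an instance of the requested class and otherwise delegates to its input stream; the actual search order used by TokenProcessor is not stated on this page. ChainStream, ChainSource, ChainProcessor, StageA, StageB and UnusedStage are hypothetical stand-ins, not the real classes.

```java
// Hypothetical stand-in for the base stream type; NOT the real TokenStream API.
abstract class ChainStream {
  /** Base case: a stream with no input can only match itself. */
  <T extends ChainStream> T getInstanceOf(Class<T> cls) {
    return cls.isInstance(this) ? cls.cast(this) : null;
  }
}

// End of the chain: a stream that has no input stream.
class ChainSource extends ChainStream {}

// Stand-in for a processor: a stream whose input is another stream.
abstract class ChainProcessor extends ChainStream {
  private final ChainStream input;

  ChainProcessor(ChainStream input) { this.input = input; }

  protected ChainStream getInputStream() { return input; }

  /** Checks this stage first (assumed order), then continues along the input chain. */
  @Override <T extends ChainStream> T getInstanceOf(Class<T> cls) {
    T self = super.getInstanceOf(cls);
    return self != null ? self : input.getInstanceOf(cls);
  }
}

class StageA extends ChainProcessor {
  StageA(ChainStream input) { super(input); }
}

class StageB extends ChainProcessor {
  StageB(ChainStream input) { super(input); }
}

// A stream type that never appears in the demo chain.
class UnusedStage extends ChainStream {}

class ChainSearchDemo {
  public static void main(String[] args) {
    ChainSource source = new ChainSource();
    StageA a = new StageA(source);
    StageB b = new StageB(a); // chain: b -> a -> source

    System.out.println(b.getInstanceOf(StageA.class) == a);         // prints true
    System.out.println(b.getInstanceOf(ChainSource.class) == source); // prints true
    System.out.println(b.getInstanceOf(UnusedStage.class) == null); // prints true
  }
}
```

Passing the Class<T> token lets the caller receive a correctly typed result without casting, and a null result signals that no stage of the requested type is in the chain.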