com.twitter.common.text.token.attribute
Class CharSequenceTermAttributeImpl

java.lang.Object
  extended by org.apache.lucene.util.AttributeImpl
      extended by com.twitter.common.text.token.attribute.CharSequenceTermAttributeImpl
All Implemented Interfaces:
CharSequenceTermAttribute, Serializable, Cloneable, org.apache.lucene.util.Attribute

public class CharSequenceTermAttributeImpl
extends org.apache.lucene.util.AttributeImpl
implements CharSequenceTermAttribute, Cloneable, Serializable

Implementation of CharSequenceTermAttribute. This implementation differs from Lucene's TermAttributeImpl, which relies on an internal, growable char[] termBuffer. Extracting a token with TermAttributeImpl involves copying the token's characters into that buffer and then setting the length of the term. With this class, the client instead refers to a span in the underlying CharSequence by start index (offset) and end index, so no characters are copied.
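
The span idea described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the library's code: SpanTerm and its fields are invented names, and the real class stores its state in its own way. The point is only that a term is identified by an offset and a length into a shared CharSequence rather than by a private copy of its characters.

```java
// Hypothetical illustration; SpanTerm is not part of the library.
final class SpanTerm {
    private final CharSequence source; // shared text, never copied here
    private final int offset;          // index of the first character of the span
    private final int length;          // number of characters in the span

    SpanTerm(CharSequence source, int offset, int length) {
        this.source = source;
        this.offset = offset;
        this.length = length;
    }

    // Returns the span as a CharSequence. Whether subSequence copies
    // characters depends on the CharSequence implementation in use.
    CharSequence text() {
        return source.subSequence(offset, offset + length);
    }
}

final class SpanTermDemo {
    public static void main(String[] args) {
        CharSequence input = "hello lucene world";
        SpanTerm term = new SpanTerm(input, 6, 6);
        System.out.println(term.text()); // prints: lucene
    }
}
```

Several SpanTerm objects can point into the same input, which is what makes the approach cheaper than filling a separate char[] for every token.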

Note that this class explicitly suppresses the ability for its instances to be serialized, an ability it would otherwise inherit via AttributeImpl.

See Also:
Serialized Form

Constructor Summary
CharSequenceTermAttributeImpl()
           
 
Method Summary
 void clear()
           
 void copyTo(org.apache.lucene.util.AttributeImpl target)
          Copies this attribute's state into target; passing a CharSequenceTermAttribute instead of a TermAttribute obviates the construction of an extra String.
 boolean equals(Object other)
           
 CharSequence getCharSequence()
          Provides access to the encapsulated CharSequence.
 int getLength()
          The length is the length in characters of the span referenced by this CharSequenceTermAttribute.
 int getOffset()
          The offset is the character index, with respect to the underlying CharSequence, of the first character in the span referenced by this CharSequenceTermAttribute.
 CharSequence getTermCharSequence()
          Returns the term text as a CharSequence, without needing to construct a String.
 String getTermString()
          Returns the term text as a String.
 int hashCode()
          This is largely based on ArrayUtil.hashCode(char[], int, int).
 void setCharSequence(CharSequence originalCharSequence)
          Sets the encapsulated CharSequence.
 void setLength(int length)
          Assigns the length to the specified value.
 void setOffset(int offset)
          Assigns the offset to the specified value.
 void setTermBuffer(CharSequence seq)
          Assigns the backing CharSequence for this attribute to the specified CharSequence.
 void setTermBuffer(CharSequence seq, int offset, int length)
          Assigns the backing CharSequence for this attribute to the specified CharSequence.
 
Methods inherited from class org.apache.lucene.util.AttributeImpl
clone, toString
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

CharSequenceTermAttributeImpl

public CharSequenceTermAttributeImpl()

Method Detail

getTermCharSequence

public CharSequence getTermCharSequence()
Description copied from interface: CharSequenceTermAttribute
Returns the term text as a CharSequence, without needing to construct a String. This method is preferred over CharSequenceTermAttribute.getTermString().

Specified by:
getTermCharSequence in interface CharSequenceTermAttribute
Returns:
CharSequence representing the term text.

getTermString

public String getTermString()
Description copied from interface: CharSequenceTermAttribute
Returns the term text as a String. CharSequenceTermAttribute.getTermCharSequence() is preferred over this method.

Specified by:
getTermString in interface CharSequenceTermAttribute
Returns:
String representing the term text.
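
As a rough illustration of why the CharSequence accessor is preferred, the sketch below compares a term against a keyword without first converting the term to a String. It is an assumption-laden stand-in: TermCompareSketch and isKeyword are hypothetical names, and a java.nio.CharBuffer view plays the role of the value that getTermCharSequence() would return.

```java
// Hypothetical helper; not part of the library.
final class TermCompareSketch {
    // String.contentEquals accepts any CharSequence, so the term
    // does not have to be turned into a String first.
    static boolean isKeyword(CharSequence term, String keyword) {
        return keyword.contentEquals(term);
    }

    public static void main(String[] args) {
        // A read-only view of characters 6..12 of the input, standing in
        // for the CharSequence returned by getTermCharSequence().
        CharSequence term = java.nio.CharBuffer.wrap("hello lucene world", 6, 12);
        System.out.println(isKeyword(term, "lucene")); // prints: true
    }
}
```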

setTermBuffer

public void setTermBuffer(CharSequence seq)
Description copied from interface: CharSequenceTermAttribute
Assigns the backing CharSequence for this attribute to the specified CharSequence. The start character index is set to zero, and the end character index is set to the length of the specified CharSequence.

Specified by:
setTermBuffer in interface CharSequenceTermAttribute
Parameters:
seq - CharSequence that will become the new underlying CharSequence for this attribute.

setTermBuffer

public void setTermBuffer(CharSequence seq,
                          int offset,
                          int length)
Description copied from interface: CharSequenceTermAttribute
Assigns the backing CharSequence for this attribute to the specified CharSequence. The start character index is set to the specified offset, and the end character index is set to offset plus length.

Specified by:
setTermBuffer in interface CharSequenceTermAttribute
Parameters:
seq - CharSequence that will become the new underlying CharSequence for this attribute.
offset - character index with respect to the specified CharSequence that will become the new start character index for this attribute.
length - this value will be added to the specified offset value, and the result will become the new end character index for this attribute.
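
The relationship between the two setTermBuffer overloads can be sketched as follows. TermBufferSketch is a hypothetical stand-in that mirrors only the documented contract (start index, and end index equal to offset plus length); it is not the library class, and the field names are assumptions.

```java
// Stand-in that mirrors the documented setTermBuffer contract; not the library class.
final class TermBufferSketch {
    private CharSequence seq = "";
    private int start; // start character index
    private int end;   // end character index (exclusive)

    // One-argument form: the span covers the whole sequence.
    void setTermBuffer(CharSequence seq) {
        setTermBuffer(seq, 0, seq.length());
    }

    // Three-argument form: start = offset, end = offset + length.
    void setTermBuffer(CharSequence seq, int offset, int length) {
        this.seq = seq;
        this.start = offset;
        this.end = offset + length;
    }

    int getOffset() { return start; }
    int getLength() { return end - start; }
    CharSequence getTermCharSequence() { return seq.subSequence(start, end); }
}
```

With this model, setTermBuffer("hello world", 6, 5) yields an offset of 6, a length of 5, and the term text "world".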

clear

public void clear()
Specified by:
clear in class org.apache.lucene.util.AttributeImpl

copyTo

public void copyTo(org.apache.lucene.util.AttributeImpl target)
Copies this attribute's state into target. Passing a CharSequenceTermAttribute instead of a TermAttribute obviates the construction of an extra String.

Specified by:
copyTo in class org.apache.lucene.util.AttributeImpl

equals

public boolean equals(Object other)
Specified by:
equals in class org.apache.lucene.util.AttributeImpl

hashCode

public int hashCode()
This is largely based on ArrayUtil.hashCode(char[], int, int).

Specified by:
hashCode in class org.apache.lucene.util.AttributeImpl
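
A hash over a character span, in the general style of an array-range hash, might look like the sketch below. This is an assumption about the overall shape only: SpanHashSketch is a hypothetical name, the multiplier 31 and the iteration order are illustrative choices, and the actual method may differ in its details.

```java
// Hypothetical sketch of a span hash; not the library's implementation.
final class SpanHashSketch {
    // Hashes the characters in [start, end) of seq, walking from the
    // last character of the span back to the first.
    static int hashCode(CharSequence seq, int start, int end) {
        int code = 0;
        for (int i = end - 1; i >= start; i--) {
            code = code * 31 + seq.charAt(i);
        }
        return code;
    }
}
```

Because only the characters inside the span contribute, two spans with the same text hash to the same value even when they point into different underlying sequences.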

getOffset

public int getOffset()
Description copied from interface: CharSequenceTermAttribute
The offset is the character index, with respect to the underlying CharSequence, of the first character in the span referenced by this CharSequenceTermAttribute. The offset may point to the end of the underlying CharSequence when length is zero.

Specified by:
getOffset in interface CharSequenceTermAttribute
Returns:
the current offset

getLength

public int getLength()
Description copied from interface: CharSequenceTermAttribute
The length is the length in characters of the span referenced by this CharSequenceTermAttribute.

Specified by:
getLength in interface CharSequenceTermAttribute
Returns:
the current length

setOffset

public void setOffset(int offset)
Description copied from interface: CharSequenceTermAttribute
Assigns the offset to the specified value.

Specified by:
setOffset in interface CharSequenceTermAttribute
Parameters:
offset - new value for the offset, which must be at least zero, and less than or equal to the length of the underlying CharSequence

setLength

public void setLength(int length)
Description copied from interface: CharSequenceTermAttribute
Assigns the length to the specified value.

Specified by:
setLength in interface CharSequenceTermAttribute
Parameters:
length - new value for the length, which must be at least zero, and at most equal to the length of the underlying CharSequence
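
The limits stated for setOffset and setLength can be expressed as a small check. This is a hedged sketch: SpanBoundsSketch and isValidSpan are hypothetical names, the documentation above does not say whether or how the class enforces these limits, and the final condition (the span must end inside the sequence) is an additional assumption rather than a documented rule.

```java
// Hypothetical validation helper; not part of the library.
final class SpanBoundsSketch {
    static boolean isValidSpan(CharSequence seq, int offset, int length) {
        int n = seq.length();
        boolean offsetOk = offset >= 0 && offset <= n; // documented limit for offset
        boolean lengthOk = length >= 0 && length <= n; // documented limit for length
        // Assumption: the span must also fit, i.e. offset + length <= n.
        // Written as a subtraction to avoid integer overflow.
        return offsetOk && lengthOk && length <= n - offset;
    }

    public static void main(String[] args) {
        System.out.println(isValidSpan("abc", 3, 0)); // prints: true
        System.out.println(isValidSpan("abc", 2, 2)); // prints: false
    }
}
```

The first call shows the case noted under getOffset: an offset equal to the length of the sequence is acceptable when the span is empty.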

getCharSequence

public CharSequence getCharSequence()
Description copied from interface: CharSequenceTermAttribute
Provides access to the encapsulated CharSequence.

Specified by:
getCharSequence in interface CharSequenceTermAttribute
Returns:
the underlying CharSequence object

setCharSequence

public void setCharSequence(CharSequence originalCharSequence)
Description copied from interface: CharSequenceTermAttribute
Sets the encapsulated CharSequence.

Specified by:
setCharSequence in interface CharSequenceTermAttribute
Parameters:
originalCharSequence - CharSequence encapsulated by this CharSequenceTermAttribute