com.twitter.common.text.util
Class TokenStreamSerializer

java.lang.Object
  extended by com.twitter.common.text.util.TokenStreamSerializer

public class TokenStreamSerializer
extends Object

Helper class that serializes a TokenStream into a byte array. The AttributeSerializers that serialize and deserialize the individual attributes must be registered through the Builder. Use the same TokenStreamSerializer configuration for both serialization and deserialization, because the order of the TokenStreamSerializer.AttributeSerializers must be identical on both sides.


Nested Class Summary
static class TokenStreamSerializer.AttributeInputStream
          A DataInputStream that supports VInt-encoding.
static class TokenStreamSerializer.AttributeOutputStream
          A DataOutputStream that supports VInt-encoding.
static interface TokenStreamSerializer.AttributeSerializer
          Defines how individual attributes are (de)serialized.
static class TokenStreamSerializer.Builder
          Builds a TokenStreamSerializer.
static class TokenStreamSerializer.Version
           
 
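The AttributeInputStream and AttributeOutputStream entries above mention VInt-encoding. As a rough illustration only, the sketch below shows the common variable-length int scheme (7 data bits per byte, high bit set when more bytes follow) on top of the standard DataOutputStream and DataInputStream. The class VIntSketch and its methods are hypothetical names, and the exact byte layout used by this library is an assumption, not something this page confirms.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch; not the library's AttributeOutputStream/AttributeInputStream.
final class VIntSketch {
  // Writes an int using 7 data bits per byte; the high bit means "more bytes follow".
  static void writeVInt(DataOutputStream out, int value) throws IOException {
    while ((value & ~0x7F) != 0) {
      out.writeByte((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.writeByte(value);
  }

  // Reads an int written by writeVInt, least significant 7-bit group first.
  static int readVInt(DataInputStream in) throws IOException {
    byte b = in.readByte();
    int value = b & 0x7F;
    for (int shift = 7; (b & 0x80) != 0; shift += 7) {
      b = in.readByte();
      value |= (b & 0x7F) << shift;
    }
    return value;
  }

  static byte[] encode(int value) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      writeVInt(out, value);
    }
    return bytes.toByteArray();
  }

  static int decode(byte[] data) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      return readVInt(in);
    }
  }

  public static void main(String[] args) throws IOException {
    for (int value : new int[] {0, 127, 128, 300, 1 << 20}) {
      byte[] encoded = encode(value);
      System.out.println(value + " takes " + encoded.length + " byte(s), decodes to " + decode(encoded));
    }
  }
}
```

Small values, such as typical token offsets and lengths, fit in a single byte under this scheme, which is the usual motivation for preferring it over fixed four-byte ints.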
Field Summary
protected static TokenStreamSerializer.Version CURRENT_VERSION
           
 
Constructor Summary
TokenStreamSerializer(List<TokenStreamSerializer.AttributeSerializer> attributeSerializers)
           
 
Method Summary
 int attributeSerializersFingerprint()
          The fingerprint of the attribute serializers that are attached to this TokenStreamSerializer.
static TokenStreamSerializer.Builder builder()
          Returns a new Builder to build a TokenStreamSerializer.
static int computeFingerprint(List<TokenStreamSerializer.AttributeSerializer> attributeSerializers)
           
 TokenStream deserialize(byte[] data, CharSequence charSequence)
          Deserializes the previously serialized TokenStream using the provided AttributeSerializer(s).
 TokenStream deserialize(byte[] data, int offset, int length, CharSequence charSequence)
           
 TokenStream deserialize(ByteArrayInputStream bais, CharSequence charSequence)
           
static TokenStreamSerializer.Version readVersionAndCheckFingerprint(TokenStreamSerializer.AttributeInputStream input, int attributeSerializersFingerprint)
           
 byte[] serialize(TokenStream tokenStream)
          Serializes the given TokenStream into a byte array using the provided AttributeSerializer(s).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CURRENT_VERSION

protected static final TokenStreamSerializer.Version CURRENT_VERSION

Constructor Detail

TokenStreamSerializer

public TokenStreamSerializer(List<TokenStreamSerializer.AttributeSerializer> attributeSerializers)

Method Detail

computeFingerprint

public static int computeFingerprint(List<TokenStreamSerializer.AttributeSerializer> attributeSerializers)

attributeSerializersFingerprint

public int attributeSerializersFingerprint()
The fingerprint of the attribute serializers that are attached to this TokenStreamSerializer.
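
This page does not say how the fingerprint is computed. The sketch below assumes, purely for illustration, an order-sensitive hash over serializer identifiers, which would let a reader detect that data was written with a different serializer list or a different serializer order. FingerprintSketch and the identifier strings are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; the real computeFingerprint may use a different scheme.
final class FingerprintSketch {
  // Combines the identifiers in list order, so reordering changes the result.
  static int fingerprint(List<String> serializerIds) {
    int hash = 1;
    for (String id : serializerIds) {
      hash = 31 * hash + id.hashCode();
    }
    return hash;
  }

  public static void main(String[] args) {
    List<String> writerOrder = Arrays.asList("term", "offset");
    List<String> readerOrder = Arrays.asList("offset", "term");
    // A mismatch signals that the reader's serializers do not line up with the writer's.
    System.out.println("fingerprints match: "
        + (fingerprint(writerOrder) == fingerprint(readerOrder)));
  }
}
```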


serialize

public final byte[] serialize(TokenStream tokenStream)
                       throws IOException
Serializes the given TokenStream into a byte array using the provided AttributeSerializer(s). This method does not serialize the CharSequence of the TokenStream; the caller must serialize the text separately if it is needed.

Throws:
IOException

deserialize

public final TokenStream deserialize(byte[] data,
                                     CharSequence charSequence)
                              throws IOException
Deserializes a previously serialized TokenStream using the provided AttributeSerializer(s). This method deserializes only the attributes; the CharSequence instance containing the text must be provided separately.

Throws:
IOException
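
Because serialize omits the text and deserialize requires it, a caller has to keep the two together. The sketch below shows one possible way to do that with the standard library only: it packs the text and an opaque token byte array into a single record and unpacks them again. TextAndTokensSketch, pack and unpack are hypothetical names, and the byte array stands in for whatever serialize(TokenStream) returned.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch of storing the text next to the serialized token bytes.
final class TextAndTokensSketch {
  static final class Packed {
    final String text;
    final byte[] tokenBytes;

    Packed(String text, byte[] tokenBytes) {
      this.text = text;
      this.tokenBytes = tokenBytes;
    }
  }

  // Layout: modified-UTF-8 text, then a 4-byte length, then the token bytes.
  // Note: writeUTF rejects strings whose encoded form exceeds 65535 bytes.
  static byte[] pack(String text, byte[] tokenBytes) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeUTF(text);
    out.writeInt(tokenBytes.length);
    out.write(tokenBytes);
    out.flush();
    return bytes.toByteArray();
  }

  static Packed unpack(byte[] packed) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(packed));
    String text = in.readUTF();
    byte[] tokenBytes = new byte[in.readInt()];
    in.readFully(tokenBytes);
    return new Packed(text, tokenBytes);
  }

  public static void main(String[] args) throws IOException {
    byte[] tokenBytes = {1, 2, 3}; // placeholder for the output of serialize(...)
    Packed restored = unpack(pack("hello world", tokenBytes));
    // restored.text and restored.tokenBytes would then be passed to deserialize(...).
    System.out.println(restored.text + " / " + restored.tokenBytes.length + " token bytes");
  }
}
```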

deserialize

public final TokenStream deserialize(byte[] data,
                                     int offset,
                                     int length,
                                     CharSequence charSequence)
                              throws IOException
Throws:
IOException

readVersionAndCheckFingerprint

public static TokenStreamSerializer.Version readVersionAndCheckFingerprint(TokenStreamSerializer.AttributeInputStream input,
                                                                           int attributeSerializersFingerprint)
                                                                    throws IOException
Throws:
IOException
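
The method name suggests that the serialized data begins with a version marker and a fingerprint, and that reading fails when the stored fingerprint differs from the expected one. The sketch below illustrates that idea with plain DataInputStream and DataOutputStream. The header layout (one version byte followed by a four-byte fingerprint), the class HeaderCheckSketch and the constant VERSION are all assumptions made for illustration, not the library's actual format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch of a version-plus-fingerprint header check.
final class HeaderCheckSketch {
  static final int VERSION = 1; // assumed value, for illustration only

  static byte[] writeHeader(int fingerprint) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeByte(VERSION);
    out.writeInt(fingerprint);
    out.flush();
    return bytes.toByteArray();
  }

  // Returns the stored version, or throws if the data was written with other serializers.
  static int readVersionAndCheck(DataInputStream in, int expectedFingerprint) throws IOException {
    int version = in.readUnsignedByte();
    int storedFingerprint = in.readInt();
    if (storedFingerprint != expectedFingerprint) {
      throw new IOException("Fingerprint mismatch: stored " + storedFingerprint
          + ", expected " + expectedFingerprint);
    }
    return version;
  }

  public static void main(String[] args) throws IOException {
    byte[] header = writeHeader(42);
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(header));
    System.out.println("version read: " + readVersionAndCheck(in, 42));
  }
}
```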

deserialize

public final TokenStream deserialize(ByteArrayInputStream bais,
                                     CharSequence charSequence)
                              throws IOException
Throws:
IOException
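
The three deserialize overloads differ only in how the bytes are supplied. A plausible reading, not confirmed by this page, is that the offset/length overload reads a region of a larger buffer the way java.io.ByteArrayInputStream does. The sketch below shows that standard-library behavior; RegionSketch and readRegion are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.util.Arrays;

// Hypothetical sketch: how a ByteArrayInputStream exposes a region of a buffer.
final class RegionSketch {
  // Returns the bytes a reader would see for data[offset .. offset + length).
  static byte[] readRegion(byte[] data, int offset, int length) {
    ByteArrayInputStream bais = new ByteArrayInputStream(data, offset, length);
    byte[] region = new byte[bais.available()];
    bais.read(region, 0, region.length);
    return region;
  }

  public static void main(String[] args) {
    byte[] buffer = {10, 20, 30, 40, 50};
    // Only the middle three bytes are visible through the stream.
    System.out.println(Arrays.toString(readRegion(buffer, 1, 3)));
  }
}
```

Reading from a region avoids copying when several serialized streams are stored back to back in one buffer.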

builder

public static TokenStreamSerializer.Builder builder()
Returns a new Builder to build a TokenStreamSerializer.