class documentation
class EncoderState:
Known subclasses: tivars.tokenizer.state.Line, tivars.tokenizer.state.MaxMode, tivars.tokenizer.state.MinMode, tivars.tokenizer.state.SmartMode
Constructor: EncoderState(length)
Base class for encoder states.
Each state represents some encoding context which affects tokenization.
Method | __init__ | Undocumented |
Method | munch | Munch the input string and determine the resulting token, encoder state, and remainder of the string |
Method | next | Determines the next encode state given a token |
Class Variable | max | The maximum number of tokens to emit before leaving this state |
Class Variable | mode | Whether to munch maximally (0) or minimally (-1) |
Instance Variable | length | Undocumented |
Munch the input string and determine the resulting token, encoder state, and remainder of the string
Parameters | |
string:str | The text string to tokenize |
trie:TokenTrie | The TokenTrie object to use for tokenization |
Returns | |
tuple[ | A tuple of the output Token, the remainder of string, and a list of states to add to the stack |
overridden in tivars.tokenizer.state.Line, tivars.tokenizer.state.SmartMode
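The difference between maximal and minimal munching can be illustrated with a toy sketch. This is not the tivars implementation: `candidate_matches`, `toy_munch`, and the `TOKENS` list are hypothetical stand-ins, and the sketch assumes that candidate matches are ordered longest first so that a mode of 0 selects the longest match and -1 the shortest.

```python
def candidate_matches(string, tokens):
    """Return every token that is a prefix of `string`, longest first.

    Hypothetical helper; a real tokenizer would walk a trie instead.
    """
    matches = [t for t in tokens if string.startswith(t)]
    return sorted(matches, key=len, reverse=True)


def toy_munch(string, tokens, mode):
    """Pick one candidate by index and return (token, remainder).

    Assumption: mode 0 indexes the longest match, mode -1 the shortest.
    """
    matches = candidate_matches(string, tokens)
    if not matches:
        raise ValueError("no token matches the start of the input")
    token = matches[mode]
    return token, string[len(token):]


# Hypothetical token set in which several tokens share a prefix.
TOKENS = ["s", "si", "sin("]

maximal = toy_munch("sin(X", TOKENS, 0)    # longest candidate wins
minimal = toy_munch("sin(X", TOKENS, -1)   # shortest candidate wins
```

Under these assumptions, the maximal call consumes `sin(` as a single token, while the minimal call consumes only `s` and leaves `in(X` for the next step.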
Determines the next encode state given a token
The current state is popped from the stack, and the states returned by this method are pushed.
If the list of returned states is...
- empty, then the encoder is exiting the current state.
- length one, then the encoder's current state is being replaced by a new state.
- length two, then the encoder is entering a new state, able to exit back to this one.
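The three cases above can be sketched with a plain list acting as the stack. This is a minimal illustration rather than the library's code: `apply_transition` is a hypothetical helper, the state names are plain strings standing in for state objects, and the sketch assumes returned states are pushed in list order so that the last one ends up on top.

```python
def apply_transition(stack, returned):
    """Pop the current state, then push the returned states in order.

    Hypothetical helper; returns a new list and leaves `stack` untouched.
    """
    new_stack = list(stack)
    new_stack.pop()              # the current state is always popped
    new_stack.extend(returned)   # assumed push order: last item ends on top
    return new_stack


# Strings stand in for state objects; the current state is the last item.
stack = ["Line", "String"]

# Empty list: the encoder exits the current state.
exited = apply_transition(stack, [])

# Length one: the current state is replaced by a new state.
replaced = apply_transition(stack, ["Name"])

# Length two: the current state is pushed back, with a new state above it.
entered = apply_transition(stack, ["String", "InterpolationStart"])
```

In the length-two case the original state remains directly beneath the new one, which is what allows the encoder to exit back to it later.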
Parameters | |
token:Token | The current token |
Returns | |
list[ | A list of encoder states to add to the stack |
overridden in tivars.tokenizer.state.InterpolationStart, tivars.tokenizer.state.MaxMode, tivars.tokenizer.state.MinMode, tivars.tokenizer.state.Name, tivars.tokenizer.state.SmartMode, tivars.tokenizer.state.String