tivars.tokenizer.state.EncoderState

class documentation

class EncoderState:

Known subclasses: tivars.tokenizer.state.Line, tivars.tokenizer.state.MaxMode, tivars.tokenizer.state.MinMode, tivars.tokenizer.state.SmartMode

Constructor: EncoderState(length)

Base class for encoder states

Each state represents some encoding context which affects tokenization.

Method	`__init__`	Undocumented
Method	`munch`	Munch the input string and determine the resulting token, encoder state, and remainder of the string
Method	`next`	Determine the next encoder state given a token
Class Variable	`max_length`	The maximum number of tokens to emit before leaving this state
Class Variable	`mode`	Whether to munch maximally (`0`) or minimally (`-1`)
Instance Variable	`length`	Undocumented

def __init__(self, length: int = 0):

Undocumented

def munch(self, string: str, trie: TITokenTrie) -> tuple[TIToken, str, list[EncoderState]]:

Munch the input string and determine the resulting token, encoder state, and remainder of the string

Parameters
string:`str`	The text string to tokenize
trie:`TITokenTrie`	The `TITokenTrie` object to use for tokenization

Returns
`tuple[TIToken, str, list[EncoderState]]`	A tuple of the output `TIToken`, the remainder of `string`, and a list of states to add to the stack
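
The three-part return value suggests a driver loop that pops the current state, calls `munch`, and pushes the returned states. The following is a minimal sketch of such a loop, not the library's implementation: `ToyState`, `encode`, and the use of a plain list of strings in place of a `TITokenTrie` are all hypothetical stand-ins chosen so the example runs on its own.

```python
class ToyState:
    """Hypothetical stand-in for an encoder state; not the tivars class."""

    def __init__(self, length=0):
        # Number of tokens this toy state has emitted so far (an assumption).
        self.length = length

    def munch(self, string, vocabulary):
        # Collect every vocabulary entry that is a prefix of the input.
        matches = [word for word in vocabulary if string.startswith(word)]
        if not matches:
            raise ValueError(f"cannot tokenize {string!r}")
        # This toy always takes the longest match.
        token = max(matches, key=len)
        return token, string[len(token):], self.next(token)

    def next(self, token):
        # Replace this state with a fresh copy that has emitted one more token.
        return [type(self)(self.length + 1)]


def encode(string, vocabulary):
    """Consume `string` token by token, maintaining a stack of states."""
    stack = [ToyState()]
    tokens = []
    while string:
        state = stack.pop()                      # current state leaves the stack
        token, string, new_states = state.munch(string, vocabulary)
        tokens.append(token)
        stack.extend(new_states)                 # returned states are pushed
    return tokens


result = encode("sin(X", ["s", "i", "n", "sin(", "X"])
```

Because `ToyState.next` always returns exactly one state, the stack never empties here; a real driver would also need to handle a state that returns an empty list.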

def next(self, token: TIToken) -> list[EncoderState]:

overridden in tivars.tokenizer.state.Line, tivars.tokenizer.state.SmartMode

Determine the next encoder state given a token

The current state is popped from the stack, and the states returned by this method are pushed.

If the list of returned states is

- empty, then the encoder is exiting the current state.
- of length one, then the encoder's current state is being replaced by a new state.
- of length two, then the encoder is entering a new state and can later exit back to this one.

Parameters
token:`TIToken`	The current token

Returns
`list[EncoderState]`	A list of encoder states to add to the stack
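
The pop-then-push rule described for `next` can be illustrated with a plain Python list acting as the stack. This is only a sketch of the three cases: `apply_transition` is a hypothetical helper, and the state names are strings used as placeholders rather than real `EncoderState` instances.

```python
def apply_transition(stack, new_states):
    """Pop the current state, then push the states returned by next()."""
    stack = list(stack)       # copy so the caller's list is left untouched
    stack.pop()               # the current state always leaves the stack
    stack.extend(new_states)  # the returned states take its place
    return stack


# Empty list: the encoder exits the current state.
exited = apply_transition(["Line", "String"], [])

# Length one: the current state is replaced by a new state.
replaced = apply_transition(["Line"], ["Line'"])

# Length two: the current state is re-pushed with a new state on top,
# so the encoder can later exit back to it.
entered = apply_transition(["Line"], ["Line", "String"])
```

After these calls, `exited` is `["Line"]`, `replaced` is `["Line'"]`, and `entered` is `["Line", "String"]`.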

max_length

overridden in tivars.tokenizer.state.ListName, tivars.tokenizer.state.ProgramName

The maximum number of tokens to emit before leaving this state

mode: int

overridden in tivars.tokenizer.state.InterpolationStart, tivars.tokenizer.state.MaxMode, tivars.tokenizer.state.MinMode, tivars.tokenizer.state.Name, tivars.tokenizer.state.SmartMode, tivars.tokenizer.state.String

Whether to munch maximally (`0`) or minimally (`-1`)
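
The values `0` and `-1` read naturally as indices into a list of candidate matches ordered from longest to shortest. The sketch below works under that assumption, which this page does not confirm; `candidate_matches` and `munch_with_mode` are hypothetical helpers, and a list of strings stands in for the token trie.

```python
def candidate_matches(string, vocabulary):
    """Return every vocabulary entry that is a prefix of `string`, longest first."""
    matches = [word for word in vocabulary if string.startswith(word)]
    return sorted(matches, key=len, reverse=True)


def munch_with_mode(string, vocabulary, mode):
    """Pick a match by index: 0 selects the longest, -1 the shortest (assumed)."""
    matches = candidate_matches(string, vocabulary)
    if not matches:
        raise ValueError(f"cannot tokenize {string!r}")
    token = matches[mode]
    return token, string[len(token):]


vocabulary = ["s", "si", "sin("]

maximal = munch_with_mode("sin(X", vocabulary, 0)    # ("sin(", "X")
minimal = munch_with_mode("sin(X", vocabulary, -1)   # ("s", "in(X")
```

Maximal munching consumes the longest known token at each step, while minimal munching consumes the shortest, which matters in contexts where text should be kept as individual characters.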

length

Undocumented