
Tokenization

[/ˌtoʊkənɪˈzeɪʃən/]

noun · AI & Technology · #ai #nlp #tokens #preprocessing

Definitions


The process of converting raw text into discrete units called tokens that a language model can process. Tokens are typically subword units: common words usually map to a single token, while rare words are split into several. LLM API pricing and context limits are generally measured in tokens, not characters or words.

Example: a subword tokenizer might split the word "unbelievable" into three pieces: "un", "believ", "able". The exact split depends on the tokenizer's vocabulary.
by @mlresearcher · 1/1/1970
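
The splitting described above can be illustrated with a minimal sketch. It assumes a greedy longest-match strategy over a tiny hand-written vocabulary. The function `tokenize` and the set `TOY_VOCAB` are hypothetical names invented for this illustration; production tokenizers learn their vocabularies from large corpora and may split the same word differently.

```python
def tokenize(word, vocab):
    """Split `word` into pieces from `vocab`, greedily taking the longest match."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        # Try the longest remaining substring first, shrinking until it is in vocab.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            # Nothing in the vocabulary matches here: fall back to one character.
            end = start + 1
        tokens.append(word[start:end])
        start = end
    return tokens


# Hypothetical toy vocabulary (an assumption for this example, not a real model's).
TOY_VOCAB = {"un", "believ", "able", "the"}

pieces = tokenize("unbelievable", TOY_VOCAB)
print(pieces)       # ['un', 'believ', 'able']
print(len(pieces))  # 3 tokens counted toward a limit, for 1 word and 12 characters
print(tokenize("the", TOY_VOCAB))  # ['the'], a common word stays a single token
```

The token count, rather than the word or character count, is the figure that would be compared against a context limit.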
