Large Language Models (LLMs)

Tokenization

Tokenization is the process of breaking text into smaller pieces called tokens—such as words or subwords—that a language model can understand. For example, "ChatGPT" might become "Chat" and "GPT." These tokens are then converted into numbers the model uses to process language. Tokenization affects how much text a model can handle at once, how fast it runs, and how accurate its output is. In short, it's the first step in helping AI read and work with language.
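The splitting step can be illustrated with a toy greedy longest-match tokenizer over a tiny hand-written vocabulary. This is only a sketch for intuition: real tokenizers (e.g. byte-pair encoding) learn their vocabularies from large corpora, and the `VOCAB` entries and IDs below are invented for the example.

```python
# Hypothetical toy vocabulary; real vocabularies hold tens of thousands
# of learned subwords, each mapped to a numeric ID.
VOCAB = {"Chat": 0, "GPT": 1, "Token": 2, "ization": 3}

def tokenize(text, vocab):
    """Greedily match the longest vocabulary piece at each position."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: fall back to itself
            i += 1
    return tokens

tokens = tokenize("ChatGPT", VOCAB)          # ["Chat", "GPT"]
ids = [VOCAB.get(t, -1) for t in tokens]     # [0, 1]
```

The token strings are then replaced by their numeric IDs, which is the form the model actually processes.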

FAQ

What is tokenization?

Tokenization is the process of breaking text into smaller pieces called tokens—such as words or subwords—so a language model can understand it. It's the first step that helps AI read and work with language.