Existence of Under-Trained and Unused Tokens, and Identification Techniques Using GPT-2 Small as an Example
In our exploration of transformer-based large language models (LLMs) such as ChatGPT, in which tokenization and model training are two separate processes, we have observed the existence of both unused and under-trained tokens. The two kinds of tokens behave differently (a simple identification heuristic is sketched after the list):
- Unused tokens are present in the tokenizer's vocabulary but (almost) never appear in the training data. Their embedding vectors consequently receive no gradient updates, remain close to their random initialization, and the model neither emits nor interprets them reliably.
- Under-trained tokens do occur in the training data, but so rarely that their embeddings are only weakly trained. Prompting the model with them can trigger anomalous "glitch" behavior (the token " SolidGoldMagikarp" in the GPT-2 vocabulary is a well-known example).