Dynamic Vocabulary Pruning: Stable LLM-RL by Taming the Tail

Published:

Yingru Li, Jiawei Xu, Jiacai Liu, Yuxuan Tong, Ziniu Li, Tianle Cai, Ge Zhang, Qian Liu, Baoxiang Wang.