TLT: Training-time Speculative Decoding to Speed LLM Training
MIT researchers propose a way to turn idle GPU time into training signal for reasoning LLMs. By verifying tokens from a small draft model during RL rollouts, teams can accelerate training without sacrificing accuracy.





