Modern LLMs typically return answers through a token-by-token streaming process, where each token is packaged into a separate response chunk and delivered to the client as soon as it is generated, rather than after the full answer is complete.
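A minimal sketch of what that chunked delivery can look like on the wire, assuming a server-sent-events (SSE) style transport; the `{"delta": ...}` payload shape and the `[DONE]` sentinel are illustrative conventions modeled on common streaming APIs, not any single vendor's exact format:

```python
import json
from typing import Iterator

def stream_tokens(tokens: Iterator[str]) -> Iterator[str]:
    """Package each token as a separate SSE event, mirroring the
    chunk-per-token delivery used by typical LLM streaming APIs."""
    for token in tokens:
        payload = json.dumps({"delta": token})  # hypothetical payload shape
        yield f"data: {payload}\n\n"
    yield "data: [DONE]\n\n"  # end-of-stream sentinel used by some APIs

if __name__ == "__main__":
    # A pre-tokenized answer stands in for a model's incremental output.
    answer = ["Modern", " LLMs", " stream", " answers", " token", " by", " token", "."]
    for event in stream_tokens(iter(answer)):
        print(event, end="")
```

Because each event carries only the newest token, a client can render partial output immediately and simply concatenate the deltas as they arrive.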