ConsistentChat: Building Skeleton-Guided Consistent Dialogues for Large Language Models from Scratch • Paper • 2506.03558 • Published Jun 4
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding • Paper • 2505.22618 • Published May 28
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers • Paper • 2504.00502 • Published Apr 1