shi-labs's Collections

VisPer-LM

Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation