manohar03's Collections

VLM

updated Nov 15, 2025

  • OmniParser V2

    Featured • Runtime error • 453

    OmniParser, turn your LLM into GUI agent

  • ScreenAI: A Vision-Language Model for UI and Infographics Understanding

    Paper • 2402.04615 • Published Feb 7, 2024 • 44

  • microsoft/Magma-8B

    Robotics • 9B • Updated Dec 10, 2025 • 1.14k • 412

  • mlfoundations/Gelato-30B-A3B

    Image-Text-to-Text • 31B • Updated Nov 15, 2025 • 146 • 28