PulseAugur

NemoStation releases Marlin-2B, a compact VLM for video analysis

NemoStation has released Marlin-2B, a compact vision-language model (VLM) designed to extract structured information from video. The 2-billion-parameter model targets dense captioning and temporal grounding, and reportedly outperforms other models in its weight class on benchmarks such as CaReBench and TimeLens-Bench. Marlin-2B is optimized for deployment: it can run on a single consumer GPU and offers developer-friendly APIs for integration into applications.

IMPACT Provides a highly efficient, deployable VLM for structured video analysis, potentially lowering costs for video processing applications.

RANK_REASON New open-source model release with benchmark results.

Read on Hugging Face Trending Models →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. Hugging Face Trending Models · TIER_1 · English (EN) · NemoStation

    NemoStation/Marlin-2B

    video-text-to-text · 6,032 downloads · 309 likes