NemoStation/Marlin-2B
NemoStation has released Marlin-2B, a compact video large model (VLM) designed for extracting structured information from videos. This 2-billion parameter model excels at dense captioning and temporal grounding, outperforming other models in its weight class on benchmarks like CaReBench and TimeLens-Bench. Marlin-2B is optimized for deployment, capable of running on a single consumer GPU and offering developer-friendly APIs for easy integration into applications. AI
IMPACT Provides a highly efficient, deployable VLM for structured video analysis, potentially lowering costs for video processing applications.