Brief

last 24h

[3/3] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
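As a rough illustration of how four such factors might be folded into a single 0–100 score, here is a minimal sketch. The brief names the factors but not how they are combined, so the weighted sum, the weights in `WEIGHTS`, the 24-hour half-life, and the function names are all assumptions made for this example.

```python
# Hypothetical weights: the brief lists the four factors, not their weighting.
WEIGHTS = {
    "authority": 0.35,
    "cluster_strength": 0.30,
    "headline_signal": 0.20,
    "recency": 0.15,
}
HALF_LIFE_HOURS = 24.0  # assumed half-life for the time-decay factor


def time_decay(age_hours, half_life_hours=HALF_LIFE_HOURS):
    """Exponential decay: 1.0 for a brand-new story, 0.5 after one half-life."""
    return 0.5 ** (max(age_hours, 0.0) / half_life_hours)


def score_story(authority, cluster_strength, headline_signal, age_hours):
    """Combine four factors (each expected in [0, 1]) into a 0-100 score."""
    factors = {
        "authority": authority,
        "cluster_strength": cluster_strength,
        "headline_signal": headline_signal,
        "recency": time_decay(age_hours),
    }
    # Clamp each factor to [0, 1] before weighting so the result stays in range.
    total = sum(
        WEIGHTS[name] * min(max(value, 0.0), 1.0)
        for name, value in factors.items()
    )
    return round(100.0 * total, 1)
```

With these assumed settings, two stories with identical authority, cluster strength, and headline signal differ only through the decay term, so the 9-hour-old item ranks above the 3-week-old one.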

TOOL · arXiv cs.AI English(EN) · 9h

Clin-JEPA: A Multi-Phase Co-Training Framework for Joint-Embedding Predictive Pretraining on EHR Patient Trajectories

Researchers have developed Clin-JEPA, a multi-phase co-training framework for joint-embedding predictive pretraining on electronic health records (EHR). It targets the challenge of building a single model that can both forecast patient trajectories and handle a range of downstream risk-prediction tasks without task-specific fine-tuning. Clin-JEPA uses a five-phase pretraining curriculum to stably co-train a Qwen3-8B encoder and a latent trajectory predictor. The authors report improved performance on EHR data, attributing it to convergent latent rollout drift and clinically discriminative latent geometries.

IMPACT Could advance healthcare AI models that both forecast patient trajectories and support risk assessment without task-specific fine-tuning.

TOOL · arXiv cs.CV English(EN) · 3w

Zero-Shot Object Re-Identification in Egocentric Kitchen Videos via Multi-Stage SAM3 Feature Fusion

Researchers have developed a zero-shot object re-identification pipeline for egocentric kitchen videos, where viewpoint changes and occlusions make matching difficult. The method is built around the SAM3 segmentation model, integrates it with DINOv2 and CLIP, and adds geometric consistency checks. The authors report higher accuracy than existing feature extractors.
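To make the idea of combining several feature extractors concrete, here is a minimal sketch of one common fusion scheme: L2-normalize each extractor's embedding, concatenate them, and match by cosine similarity. The summary does not say how the pipeline actually fuses SAM3, DINOv2, and CLIP features, so concatenation is an assumption, the toy vectors stand in for real embeddings, the function names are hypothetical, and the geometric consistency checks are omitted.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(features):
    """Concatenate per-extractor embeddings after normalizing each one.

    `features` is a list of vectors, one per extractor. Normalizing first
    keeps any single extractor from dominating the fused descriptor.
    """
    fused = []
    for vec in features:
        fused.extend(l2_normalize(vec))
    return l2_normalize(fused)


def cosine(a, b):
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def best_match(query, gallery):
    """Return (object_id, similarity) for the gallery entry closest to the query."""
    q = fuse(query)
    scored = ((oid, cosine(q, fuse(feats))) for oid, feats in gallery.items())
    return max(scored, key=lambda pair: pair[1])


# Toy 2-D vectors standing in for three extractors' embeddings per object.
gallery = {
    "mug": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "pan": [[0.0, 1.0], [1.0, 0.0], [1.0, -1.0]],
}
query = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.9]]
match_id, similarity = best_match(query, gallery)
```

Because the approach is zero-shot, nothing here is trained: a query crop is simply assigned the identity of its nearest gallery descriptor.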

IMPACT This research offers a more robust method for identifying objects in complex, egocentric video data, potentially improving applications in robotics and assistive technologies.
- SAM3
- DINOv2
- DreamSim
- I-JEPA
- EPIC-Kitchens

RESEARCH · arXiv cs.CV English(EN) · 1mo · [3 sources]

Text-Conditional JEPA for Learning Semantically Rich Visual Representations

Researchers have introduced Text-Conditional JEPA (TC-JEPA), an approach to visual self-supervised learning that uses image captions to enrich semantic understanding. Text guides the prediction of masked image features, which is meant to overcome the limitations of purely visual prediction methods. The authors report gains in downstream task performance, training stability, and scaling properties, and present the technique as a new vision-language pretraining paradigm.

IMPACT Introduces a new vision-language pretraining paradigm that outperforms contrastive methods on tasks requiring fine-grained visual understanding.
- TC-JEPA
- I-JEPA
- arXiv
