PulseAugur

New method improves semi-supervised learning with noisy data

Researchers have proposed a semi-supervised regression method for settings where noisy proxy covariates, such as pretrained representations, are abundant but task-specific labels are scarce. The two-stage estimator first learns kernel eigenfeatures from all covariates, labeled and unlabeled, and then fits a ridge predictor on those features using the limited labeled data. The paper's generalization bounds indicate that the approach can achieve efficient learning rates when unlabeled proxy data is plentiful and the perturbation in the proxies is controlled. Distribution regression arises as a special case.
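The two-stage idea in the summary can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the RBF kernel, the Nyström-style feature extension, the toy data, and every name and hyperparameter (`fit_eigenfeatures`, `eigenfeatures`, `fit_ridge`, `gamma`, `k`, `lam`) are assumptions chosen for illustration only.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Gaussian (RBF) kernel matrix between rows of A and rows of B (assumed kernel choice).
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_eigenfeatures(X_all, k, gamma=0.5):
    # Stage 1: eigendecompose the normalized kernel matrix built from ALL
    # covariates (labeled and unlabeled) and keep the top-k eigenpairs.
    n = X_all.shape[0]
    K = rbf_kernel(X_all, X_all, gamma) / n
    evals, evecs = np.linalg.eigh(K)          # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:k]         # indices of the k largest
    return {"X": X_all, "evals": evals[top], "evecs": evecs[:, top], "gamma": gamma}

def eigenfeatures(model, X):
    # Nystrom-style extension: evaluate the empirical eigenfunctions at rows of X.
    n = model["X"].shape[0]
    K_new = rbf_kernel(X, model["X"], model["gamma"])
    return K_new @ model["evecs"] / (np.sqrt(n) * model["evals"])

def fit_ridge(Phi, y, lam):
    # Stage 2: ridge regression on the learned features, using labeled data only.
    m, k = Phi.shape
    return np.linalg.solve(Phi.T @ Phi / m + lam * np.eye(k), Phi.T @ y / m)

# Toy data (hypothetical): 300 noisy proxies, of which only the first 40 are labeled.
rng = np.random.default_rng(0)
z = rng.uniform(-3, 3, size=(300, 1))              # clean covariates, unobserved in practice
X_all = z + 0.1 * rng.normal(size=z.shape)         # observed noisy proxy covariates
y_lab = np.sin(z[:40, 0]) + 0.1 * rng.normal(size=40)

model = fit_eigenfeatures(X_all, k=8)
w = fit_ridge(eigenfeatures(model, X_all[:40]), y_lab, lam=1e-3)
y_hat = eigenfeatures(model, X_all) @ w            # predictions for every point
```

The design point the sketch tries to capture is that the unlabeled proxies enter only through the first stage, where they sharpen the estimate of the feature map, while the scarce labels are spent on a low-dimensional ridge fit.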

IMPACT Introduces a novel approach to semi-supervised learning that could improve model performance in data-scarce environments.

RANK_REASON The cluster contains a research paper published on arXiv detailing a new machine learning methodology.


AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv stat.ML TIER_1 English(EN) · Kwangho Kim, Jisu Kim ·

    Semi-Supervised Learning with Noisy Proxy Covariates: Generalization Bounds and Distribution Regression

    arXiv:2606.00512v1 · Abstract: In many modern machine learning pipelines, abundant pretrained representations serve as noisy proxy covariates, while task-specific labels remain scarce. We study semi-supervised regression in this setting, and propose a simple tw…

  2. arXiv stat.ML TIER_1 English(EN) · Jisu Kim ·

    Semi-Supervised Learning with Noisy Proxy Covariates: Generalization Bounds and Distribution Regression

    In many modern machine learning pipelines, abundant pretrained representations serve as noisy proxy covariates, while task-specific labels remain scarce. We study semi-supervised regression in this setting, and propose a simple two stage estimator that learns kernel eigenfeatures…