New algorithm efficiently finds most influential data sets

By PulseAugur Editorial · [4 sources] · 2026-06-03 04:00

Researchers have developed an algorithm for efficiently identifying most influential sets: size-k subsets of a dataset whose removal maximally changes a target estimand. Rather than searching over all possible subsets, which is typically infeasible, the method reduces the search to a sequence of top-k problems. The approach is based on Dinkelbach's method and applies to estimands with linear-fractional leave-set-out effects. Influential sets of this kind can significantly alter statistical estimates and model conclusions.
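To illustrate how a Dinkelbach-style iteration can turn a subset search into repeated top-k selections, here is a minimal Python sketch. It is a generic textbook-style illustration, not the authors' algorithm: the objective form (a ratio of two sums over the chosen indices), the function names `ratio`, `dinkelbach_top_k`, and `brute_force`, and the parameters `a`, `b`, `a0`, `b0` are all assumptions made for this example. The sketch also assumes the denominator is strictly positive for every size-k subset.

```python
from heapq import nlargest
from itertools import combinations


def ratio(subset, a, b, a0, b0):
    """Hypothetical linear-fractional objective evaluated on one index subset."""
    num = a0 + sum(a[i] for i in subset)
    den = b0 + sum(b[i] for i in subset)
    return num / den


def dinkelbach_top_k(a, b, k, a0=0.0, b0=0.0, tol=1e-12, max_iter=100):
    """Maximize ratio(S) over subsets S with |S| = k.

    Assumes b0 + sum(b[i] for i in S) > 0 for every feasible S.
    """
    n = len(a)
    subset = list(range(k))  # any feasible starting subset works
    lam = ratio(subset, a, b, a0, b0)
    for _ in range(max_iter):
        # Parametric subproblem: maximize sum(a[i] - lam * b[i]) over |S| = k.
        # Taking the k largest scores solves it exactly.
        cand = nlargest(k, range(n), key=lambda i: a[i] - lam * b[i])
        gap = (a0 + sum(a[i] for i in cand)) - lam * (b0 + sum(b[i] for i in cand))
        if gap <= tol:  # no subset improves on the current ratio
            break
        subset = cand
        lam = ratio(subset, a, b, a0, b0)
    return sorted(subset), lam


def brute_force(a, b, k, a0=0.0, b0=0.0):
    """Exhaustive search over all size-k subsets, for checking small cases."""
    best = max(combinations(range(len(a)), k),
               key=lambda s: ratio(s, a, b, a0, b0))
    return sorted(best), ratio(best, a, b, a0, b0)


if __name__ == "__main__":
    a = [0.3, -1.2, 2.0, 0.7, -0.4, 1.5, 0.1, -2.1]
    b = [1.0, 0.5, 2.5, 0.8, 1.2, 3.0, 0.6, 0.9]
    print(dinkelbach_top_k(a, b, k=3, a0=1.0, b0=10.0))
    print(brute_force(a, b, k=3, a0=1.0, b0=10.0))
```

Each pass costs one top-k selection instead of a scan over all subsets, and the ratio increases at every update until no candidate improves on it. The brute-force helper is only there to sanity-check small inputs.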

IMPACT Provides a more efficient method for identifying influential data points, potentially improving the robustness and interpretability of machine learning models.

RANK_REASON The cluster contains two arXiv papers on a statistical method for identifying influential data sets.

AI-generated summary · Google Gemini · from 4 sources.

COVERAGE [4]

arXiv stat.ML TIER_1 English(EN) · Lucas D. Konrad, Nikolas Kuschnig · 2026-06-05 04:00

Finding Most Influential Sets

arXiv:2606.05919v1 Announce Type: new Abstract: Identifying most influential sets (MIS) - size-$k$ subsets whose removal maximally changes a target estimand - is typically infeasible because it requires searching over $\binom{n}{k}$ subsets. For estimands with linear-fractional l…

arXiv stat.ML TIER_1 English(EN) · Nikolas Kuschnig · 2026-06-04 09:24

Finding Most Influential Sets

Identifying most influential sets (MIS) - size-$k$ subsets whose removal maximally changes a target estimand - is typically infeasible because it requires searching over $\binom{n}{k}$ subsets. For estimands with linear-fractional leave-set-out effects, we show that MIS selection…

arXiv stat.ML TIER_1 English(EN) · Lucas D. Konrad, Nikolas Kuschnig · 2026-06-03 04:00

Testing Most Influential Sets

arXiv:2510.20372v4 Announce Type: replace Abstract: Small influential data subsets can dramatically impact model conclusions, with a few data points overturning key findings. While recent work identifies these most influential sets, there is no formal way to tell when maximum inf…
