CASP: Support-Aware Offline Policy Selection for Two-Stage Recommender Systems
Researchers have introduced CASP (Coupled Action-Set Pessimism), a method for offline policy selection in two-stage recommender systems. It addresses a coupling problem: changing the initial item generator can alter both a policy's estimated value and the data that supports that estimate. CASP combines doubly robust value estimation with a penalty for weak data support, aiming to select more reliable policies by taking the credibility of the underlying data into account.
AI IMPACT: Introduces a new offline selection method for two-stage recommender systems, potentially improving recommendation accuracy by accounting for data support.
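
To make the idea of "doubly robust value estimation with a penalty for weak data support" concrete, here is a minimal sketch in Python. It is not CASP's actual formulation, which this summary does not specify. The penalty form (a constant `lam` divided by the square root of the effective sample size of the importance weights), the function names, the constant reward model `q_hat`, and the toy logs are all assumptions made for illustration.

```python
from math import sqrt

def dr_value(logs, policy, q_hat):
    """Doubly robust estimate of a policy's value.

    logs: list of (context, action, reward, logging_propensity) tuples.
    policy: function context -> dict mapping action to probability.
    q_hat: function (context, action) -> predicted reward.
    Returns the estimate and the per-sample importance weights.
    """
    terms, weights = [], []
    for x, a, r, mu in logs:
        probs = policy(x)
        w = probs.get(a, 0.0) / mu  # importance weight
        baseline = sum(p * q_hat(x, b) for b, p in probs.items())
        terms.append(baseline + w * (r - q_hat(x, a)))
        weights.append(w)
    return sum(terms) / len(terms), weights

def effective_sample_size(weights):
    """Kish effective sample size; small when few logs support the policy."""
    sq = sum(w * w for w in weights)
    return (sum(weights) ** 2) / sq if sq > 0 else 0.0

def pessimistic_score(logs, policy, q_hat, lam=1.0):
    """DR value minus an assumed support penalty lam / sqrt(ESS)."""
    value, weights = dr_value(logs, policy, q_hat)
    ess = effective_sample_size(weights)
    penalty = lam / sqrt(ess) if ess > 0 else float("inf")
    return value - penalty

def select_policy(logs, candidates, q_hat, lam=1.0):
    """Return the name of the candidate with the highest penalized score."""
    return max(candidates,
               key=lambda name: pessimistic_score(logs, candidates[name], q_hat, lam))

# Toy logs (hypothetical): action "a" was logged often, action "c" only once.
logs = [(0, "a", 0.6, 0.9)] * 9 + [(0, "c", 1.0, 0.1)]
q_hat = lambda x, a: 0.5  # deliberately crude reward model
candidates = {
    "always_a": lambda x: {"a": 1.0},
    "always_c": lambda x: {"c": 1.0},
}

unpenalized_choice = select_policy(logs, candidates, q_hat, lam=0.0)
penalized_choice = select_policy(logs, candidates, q_hat, lam=1.0)
```

On this toy data, the unpenalized estimate prefers `always_c`, whose value rests on a single logged sample, while the penalized score prefers `always_a`, which is supported by nine samples. That is the kind of reversal a support-aware selector is meant to produce, under the assumptions stated above.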