Brief · PulseAugur

TOOL · arXiv cs.AI

FOUNDv2: Learning Unified User Quantized Tokenizers for User Representation

Researchers have introduced FOUNDv2, a framework for user representation learning designed to address limitations of traditional continuous embedding methods. It uses a Unified User Quantized Tokenizer (U2QT) to convert heterogeneous user data into a standardized, discrete token space, which the authors say reduces storage and computational costs. FOUNDv2 employs a two-stage architecture for feature extraction and discretization, with multi-scale alignment objectives intended to capture both fine-grained behaviors and temporal patterns. According to the paper, large-scale deployment on Alipay demonstrates its scalability and efficiency in industrial settings.
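
To make the idea of a discrete token space concrete, here is a minimal sketch of how a continuous user embedding could be mapped to a few integer tokens. It uses a simple product-quantization-style nearest-codebook lookup, which is an assumption for illustration only: the brief does not say how U2QT quantizes. The function names, the random codebooks, and the sizes (16 dimensions, 4 codebooks of 256 entries) are all hypothetical.

```python
import math
import random


def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared Euclidean distance)."""
    best_index, best_dist = 0, math.inf
    for index, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def tokenize(embedding, codebooks):
    """Split `embedding` into equal chunks and quantize each chunk with its own codebook."""
    chunk = len(embedding) // len(codebooks)
    return [
        nearest_code(embedding[i * chunk:(i + 1) * chunk], book)
        for i, book in enumerate(codebooks)
    ]


# Hypothetical sizes, chosen only for illustration (not taken from FOUNDv2).
DIM, NUM_CODEBOOKS, CODEBOOK_SIZE = 16, 4, 256

# Random codebooks stand in for learned ones.
rng = random.Random(0)
codebooks = [
    [[rng.gauss(0, 1) for _ in range(DIM // NUM_CODEBOOKS)] for _ in range(CODEBOOK_SIZE)]
    for _ in range(NUM_CODEBOOKS)
]

user_embedding = [rng.gauss(0, 1) for _ in range(DIM)]
tokens = tokenize(user_embedding, codebooks)  # four integers, each in [0, 255]

float32_bytes = DIM * 4          # continuous embedding stored as float32
token_bytes = NUM_CODEBOOKS * 1  # each token fits in one byte when CODEBOOK_SIZE <= 256
print(tokens, float32_bytes, token_bytes)
```

With these toy sizes the token form takes 4 bytes per user instead of 64, which illustrates where the storage saving of a discrete representation can come from. The actual ratio in FOUNDv2 depends on details the brief does not give.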

IMPACT This research offers a more efficient method for user representation, potentially improving personalization services and reducing infrastructure costs for large-scale platforms.

Alipay
FOUNDv2
U2QT
Chuan He