Researchers have developed ctdGAN, a novel conditional Generative Adversarial Network designed to address class imbalance in tabular datasets. The model partitions input samples into clusters and employs a probabilistic sampling strategy to generate synthetic data within these identified subspaces. The method also incorporates a cluster-wise scaling technique to capture multiple feature modes and a loss function that penalizes mis-predictions at both the cluster and class levels. Evaluations on 14 imbalanced datasets showed ctdGAN's effectiveness in producing high-fidelity samples and improving classification accuracy.
AI IMPACT: This research offers a new method for improving the performance of machine learning models on imbalanced tabular datasets.
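To make the cluster-wise scaling and probabilistic sampling ideas more concrete, here is a minimal NumPy sketch. It is not ctdGAN's implementation. The function names (`clusterwise_minmax`, `sample_conditions`), the choice of min-max scaling, and the inverse-frequency sampling weights are all assumptions made for illustration, since the summary does not specify them. Cluster assignments are taken as given rather than computed.

```python
import numpy as np

def clusterwise_minmax(X, cluster_ids):
    """Scale every feature to [0, 1] separately inside each cluster.

    Assumption: min-max scaling stands in for whatever cluster-wise
    scaling ctdGAN actually uses.
    """
    X = np.asarray(X, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    scaled = np.empty_like(X)
    stats = {}
    for c in np.unique(cluster_ids):
        mask = cluster_ids == c
        lo = X[mask].min(axis=0)
        hi = X[mask].max(axis=0)
        # Constant features get a span of 1 to avoid division by zero.
        span = np.where(hi > lo, hi - lo, 1.0)
        scaled[mask] = (X[mask] - lo) / span
        stats[int(c)] = (lo, span)  # kept so samples can be mapped back
    return scaled, stats

def sample_conditions(cluster_ids, y, n, rng):
    """Draw n (cluster, class) pairs to condition a generator on.

    Assumption: pairs are weighted by inverse frequency, so rare
    class/cluster combinations are requested more often.
    """
    observed = np.column_stack([cluster_ids, y])
    pairs, counts = np.unique(observed, axis=0, return_counts=True)
    weights = 1.0 / counts
    probs = weights / weights.sum()
    idx = rng.choice(len(pairs), size=n, p=probs)
    return pairs[idx]

# Toy data: two clusters with very different feature ranges.
X = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0],
              [100.0, 5.0], [110.0, 5.0], [120.0, 5.0]])
cluster_ids = np.array([0, 0, 0, 1, 1, 1])
y = np.array([0, 0, 1, 0, 0, 0])  # class 1 is the minority

rng = np.random.default_rng(0)
scaled, stats = clusterwise_minmax(X, cluster_ids)
conditions = sample_conditions(cluster_ids, y, 5, rng)
```

Scaling inside each cluster keeps a feature's separate modes from being squashed into one global range. The sampled (cluster, class) pairs would then be fed to the generator as its conditioning input.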
AI-generated summary · Google Gemini · from 1 source.