Parcae: Doing more with fewer parameters using stable looped models
Together AI has introduced Parcae, a stable architecture for looped language models. The design lets a model match the quality of larger Transformers with significantly fewer parameters by increasing recurrence, that is, passing activations through the same layers multiple times, rather than relying solely on scaling data. Parcae is reported to be more stable than previous looped models, and it establishes the first scaling laws for this type of architecture, which suggests a more efficient frontier for training memory-constrained on-device models.
IMPACT: Introduces a more parameter-efficient model architecture, potentially enabling higher-quality on-device AI with a smaller memory footprint.
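
To make the "recurrence instead of parameters" idea concrete, here is a minimal sketch of generic weight tying: one block applied several times versus a stack of distinct blocks of the same depth. It is not Parcae's architecture and omits whatever stabilization Parcae uses. The toy residual block and all names (`make_block`, `apply_block`, `looped_forward`, `stacked_forward`, `count_params`) are hypothetical, chosen only for illustration.

```python
import math
import random

def make_block(dim, rng):
    """One toy block: a single dim x dim weight matrix (hypothetical stand-in for a layer)."""
    scale = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]

def apply_block(weights, h):
    """Residual update: h + tanh(W h)."""
    return [h_i + math.tanh(sum(w * x for w, x in zip(row, h)))
            for h_i, row in zip(h, weights)]

def looped_forward(block, h, num_loops):
    """Apply the SAME block num_loops times (weights are shared across passes)."""
    for _ in range(num_loops):
        h = apply_block(block, h)
    return h

def stacked_forward(blocks, h):
    """Apply a sequence of DISTINCT blocks once each (conventional stacking)."""
    for block in blocks:
        h = apply_block(block, h)
    return h

def count_params(blocks):
    """Total number of scalar weights across the given blocks."""
    return sum(len(row) for block in blocks for row in block)

rng = random.Random(0)
dim, depth = 8, 4

shared = make_block(dim, rng)                           # one block, reused
stacked = [make_block(dim, rng) for _ in range(depth)]  # depth distinct blocks

x = [1.0] * dim
looped_out = looped_forward(shared, x, num_loops=depth)
stacked_out = stacked_forward(stacked, x)

looped_params = count_params([shared])   # dim * dim
stacked_params = count_params(stacked)   # depth * dim * dim
print(looped_params, stacked_params)
```

Both paths perform the same number of block applications, so compute per input is comparable, but the looped version stores only one block's weights. That storage saving is what makes the approach attractive when memory is the constraint; whether quality holds up is the empirical question the scaling laws address.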