Sequence Parallelism
PulseAugur coverage of Sequence Parallelism — every cluster mentioning Sequence Parallelism across labs, papers, and developer communities, ranked by signal.
-
Spotlight system cuts DiT RL post-training costs using spot GPUs
Researchers have developed Spotlight, a system designed to significantly reduce the cost of reinforcement learning (RL) post-training for Diffusion Transformers (DiTs). By leveraging insights into exploration tolerance…
-
Zyphra's TSP strategy boosts LLM training throughput by 2.6x
Zyphra has developed a new technique called Tensor and Sequence Parallelism (TSP) designed to optimize the training and inference of large transformer models. This hardware-aware strategy combines aspects of Tensor Para…
-
New TSP strategy folds tensor and sequence parallelism for memory-efficient training
Researchers have introduced a new parallel execution strategy called Tensor and Sequence Parallelism (TSP) designed to enhance memory efficiency during the training and inference of Transformer models. TSP combines tens…