RTPrune boosts DeepSeek-OCR inference speed by 1.23x with novel token pruning

By PulseAugur Editorial Team · [2 sources] · 2026-05-01 04:30

Researchers have developed RTPrune, a two-stage token pruning method for making DeepSeek-OCR inference more efficient. The method mimics the model's two-stage reading process: it first prioritizes high-norm tokens, which carry the most salient information, and then merges the remaining tokens using optimal transport theory. RTPrune also incorporates a dynamic pruning ratio tailored to OCR tasks and is reported to achieve a better balance between accuracy and efficiency.
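The two stages described above can be sketched roughly as follows. This is an illustrative toy and not the paper's implementation: the function names (`prune_and_merge`, `sinkhorn_plan`), the cosine cost, the uniform token masses, the Sinkhorn regularization `eps`, and the fixed `keep_ratio` are all assumptions made for the example. The summary does not say how RTPrune builds its transport plan or how its dynamic pruning ratio is chosen.

```python
import math


def l2_norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def cosine_cost(u, v):
    # Cosine distance; a zero vector falls back to a denominator of 1.0.
    denom = l2_norm(u) * l2_norm(v) or 1.0
    return 1.0 - sum(x * y for x, y in zip(u, v)) / denom


def sinkhorn_plan(cost, row_mass, col_mass, eps=0.1, n_iters=200):
    """Entropy-regularized transport plan for a small dense cost matrix."""
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    n, m = len(row_mass), len(col_mass)
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def prune_and_merge(tokens, keep_ratio=0.5, eps=0.1):
    """Keep high-norm tokens, then fold the rest into them via a transport plan."""
    n = len(tokens)
    k = max(1, round(n * keep_ratio))  # fixed ratio here (assumption)

    # Stage 1: keep the k tokens with the largest L2 norm, in original order.
    ranked = sorted(range(n), key=lambda i: l2_norm(tokens[i]), reverse=True)
    keep_idx = sorted(ranked[:k])
    rest_idx = sorted(ranked[k:])
    if not rest_idx:
        return [list(tokens[i]) for i in keep_idx], keep_idx

    # Stage 2: transport the remaining tokens onto the kept ones.
    cost = [[cosine_cost(tokens[i], tokens[j]) for j in keep_idx] for i in rest_idx]
    row_mass = [1.0 / len(rest_idx)] * len(rest_idx)
    col_mass = [1.0 / k] * k
    plan = sinkhorn_plan(cost, row_mass, col_mass, eps)

    merged = []
    for col, j in enumerate(keep_idx):
        weight = col_mass[col]  # the kept token's own mass
        acc = [weight * x for x in tokens[j]]
        for row, i in enumerate(rest_idx):
            p = plan[row][col]  # mass moved from pruned token i to kept token j
            weight += p
            acc = [a + p * x for a, x in zip(acc, tokens[i])]
        merged.append([a / weight for a in acc])
    return merged, keep_idx


if __name__ == "__main__":
    toy_tokens = [[3.0, 0.0], [0.1, 0.0], [0.0, 2.0], [0.0, 0.2]]
    merged_tokens, kept = prune_and_merge(toy_tokens, keep_ratio=0.5)
    print(kept, merged_tokens)
```

In this sketch each kept token becomes a weighted average of itself and the pruned tokens most similar to it, so low-norm tokens contribute to the output instead of being discarded outright.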

Impact: Improves inference speed and efficiency for OCR tasks, potentially reducing the computational cost of processing long documents.

Ranking rationale: This is a research paper detailing a new method for optimizing inference efficiency in an existing OCR model.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CV TIER_1 English(EN) · Ben Wan, Yan Feng, Zihan Tang, Weizhe Huang, Yuting Zeng, Jia Wang, Tongxuan Liu · 2026-05-04 04:00

RTPrune: Reading-Twice Inspired Token Pruning for Efficient DeepSeek-OCR Inference

arXiv:2605.00392v1 Announce Type: new Abstract: DeepSeek-OCR leverages visual-text compression to reduce long-text processing costs and accelerate inference, yet visual tokens remain prone to redundant textual and structural information. Moreover, current token pruning methods fo…

arXiv cs.CV TIER_1 English(EN) · Tongxuan Liu · 2026-05-01 04:30

RTPrune: Reading-Twice Inspired Token Pruning for Efficient DeepSeek-OCR Inference

DeepSeek-OCR leverages visual-text compression to reduce long-text processing costs and accelerate inference, yet visual tokens remain prone to redundant textual and structural information. Moreover, current token pruning methods for conventional vision-language models (VLMs) fai…
