PulseAugur

DocQT dataset improves document forgery detection robustness

Researchers have developed DocQT, a dataset and method for making document forgery localization models more robust. These models often fail in real-world use because the JPEG compression seen during training does not match the compression applied in operational document workflows. DocQT addresses this by using diverse JPEG quantization tables sampled from real-world insurance documents. The reported result is significant gains in localization accuracy and fewer false positives, particularly for architectures that explicitly process quantization table information.
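To make the idea of "diverse quantization tables" concrete, here is a minimal Python sketch of building a pool of tables and sampling one per training image. It is not the DocQT method: the paper samples tables from real insurance documents, which are not reproduced in this summary, so the sketch substitutes the standard JPEG luminance table scaled with the libjpeg quality convention. The names `scale_table`, `sample_table`, and `TABLE_POOL` are hypothetical.

```python
import random

# Standard JPEG luminance quantization table (JPEG spec, Annex K), row-major.
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def scale_table(base, quality):
    """Scale a base table using the libjpeg quality convention (1..100)."""
    if not 1 <= quality <= 100:
        raise ValueError("quality must be in 1..100")
    # libjpeg maps quality to a percentage scale factor.
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Round to nearest and clamp to the baseline JPEG range 1..255.
    return [min(255, max(1, (v * scale + 50) // 100)) for v in base]


def sample_table(pool, rng):
    """Pick one quantization table at random for the next training image."""
    return rng.choice(pool)


# Assumption: a stand-in pool built from quality levels 30, 35, ..., 95.
# DocQT instead draws its tables from real-world documents.
TABLE_POOL = [scale_table(BASE_LUMA, q) for q in range(30, 100, 5)]

if __name__ == "__main__":
    rng = random.Random(0)
    table = sample_table(TABLE_POOL, rng)
    print(len(TABLE_POOL), "tables in pool; first coefficient:", table[0])
```

In a training pipeline, the sampled table would be handed to a JPEG encoder that accepts custom quantization tables, so that each image is recompressed with a different table before it reaches the model.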

IMPACT Enhances the reliability of AI models used for detecting manipulated documents in real-world applications.

RANK_REASON The cluster contains an academic paper detailing a new dataset and methodology for improving AI model performance.


AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. Hugging Face Daily Papers · TIER_1 · English (EN)

    DocQT: Improving Document Forgery Localization Robustness via Diverse JPEG Quantization Tables

    Document manipulation localization models achieve strong performance on public benchmarks yet fail to generalize to operational document workflows. We identify a critical and overlooked source of this gap: the mismatch between the narrow distribution of JPEG quantization tables u…

  2. arXiv cs.CV · TIER_1 · English (EN) · Nicolas Sidère

    DocQT: Improving Document Forgery Localization Robustness via Diverse JPEG Quantization Tables

    Document manipulation localization models achieve strong performance on public benchmarks yet fail to generalize to operational document workflows. We identify a critical and overlooked source of this gap: the mismatch between the narrow distribution of JPEG quantization tables u…