PulseAugur

New library and corpus streamline German legal reference processing

Researchers have developed "bundesrecht," an open-source library and corpus for the automated processing of German statutory references. Legal citations are hard to handle automatically because they vary in formatting and rely on abbreviations; the library parses them, normalizes them, and resolves them to specific legal provisions. The system includes a structured corpus of German federal law and has been evaluated on a dataset of annotated references, linking raw citation strings to canonical forms and statutory text.
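The parse, normalize, and resolve steps described above can be illustrated with a small sketch. This is not the bundesrecht library's API: the regular expression, the `StatuteRef` class, the function names, and the one-entry `TOY_CORPUS` are all hypothetical, and the pattern only covers simple single-target citations of the form "§ 433 Abs. 1 S. 2 BGB".

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern for simple single-target citations only, e.g.
# "§ 433 Abs. 1 S. 2 BGB" or "§433 Absatz 1 Satz 2 BGB".
CITATION_RE = re.compile(
    r"§\s*(?P<section>\d+[a-z]?)"
    r"(?:\s*(?:Abs\.|Absatz)\s*(?P<paragraph>\d+))?"
    r"(?:\s*(?:S\.|Satz)\s*(?P<sentence>\d+))?"
    r"\s+(?P<law>[A-ZÄÖÜ][A-Za-zÄÖÜäöü]*)"
)


@dataclass(frozen=True)
class StatuteRef:
    """Structured form of one citation (illustrative, not the paper's schema)."""
    law: str
    section: str
    paragraph: Optional[int] = None
    sentence: Optional[int] = None

    def canonical(self) -> str:
        """Normalize to one surface form using the short abbreviations."""
        parts = [f"§ {self.section}"]
        if self.paragraph is not None:
            parts.append(f"Abs. {self.paragraph}")
        if self.sentence is not None:
            parts.append(f"S. {self.sentence}")
        parts.append(self.law)
        return " ".join(parts)


def parse_reference(raw: str) -> Optional[StatuteRef]:
    """Parse the first citation found in a raw string, or return None."""
    m = CITATION_RE.search(raw)
    if m is None:
        return None
    return StatuteRef(
        law=m.group("law"),
        section=m.group("section"),
        paragraph=int(m.group("paragraph")) if m.group("paragraph") else None,
        sentence=int(m.group("sentence")) if m.group("sentence") else None,
    )


# Toy stand-in for a statute corpus; the value is a placeholder, not real text.
TOY_CORPUS = {("BGB", "433"): "<text of § 433 BGB>"}


def resolve(ref: StatuteRef) -> Optional[str]:
    """Look up the section-level provision for a parsed reference."""
    return TOY_CORPUS.get((ref.law, ref.section))


ref = parse_reference("vgl. §433 Absatz 1 Satz 2 BGB")
print(ref.canonical())
print(resolve(ref))
```

Real citations also combine multiple targets in one string and point to lower-level units, as the abstracts in the coverage list note; a single regular expression like this one does not handle those cases.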

IMPACT Enhances NLP capabilities for legal tech, potentially improving efficiency in legal research and document analysis.

RANK_REASON The cluster contains an academic paper detailing a new open-source library and corpus for a specific NLP task.


AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

  1. arXiv cs.CL TIER_1 English(EN) · Luca Foppiano, Christian Boulanger

    Digging Up Citations: FOSSIL, a Dataset and Workflow for Reference Extraction in Law and the Humanities

    arXiv:2606.01109v1 (announce type: cross). Abstract: Citation extraction tools are designed for the structured end-of-document bibliographies of the natural sciences, but law and humanities scholarship cites references primarily in footnotes, where bibliographic data is interleaved …

  2. arXiv cs.CL TIER_1 English(EN) · Harshil Darji, Martin Heckelmann, Christina Kratsch, Gerard de Melo

    Bundesrecht: An Open Library and Corpus for German Statutory Reference Processing

    arXiv:2605.31338v1 (announce type: new). Abstract: Statutory references are central to legal language understanding, but are difficult to process automatically, as they appear in compact and variable surface forms, may combine multiple targets, use special abbreviations, and often p…

  3. arXiv cs.CL TIER_1 English(EN) · Gerard de Melo

    Bundesrecht: An Open Library and Corpus for German Statutory Reference Processing

    Statutory references are central to legal language understanding, but are difficult to process automatically, as they appear in compact and variable surface forms, may combine multiple targets, use special abbreviations, and often point to lower-level units. Existing tools for Ge…