New method detects functional memorization in code LLMs

By PulseAugur Editorial · [1 source] · 2026-06-12 04:00

Researchers have developed a method for detecting functional memorization in code language models, one that goes beyond simple textual overlap. They compare a mid-trained model that was exposed to the target code against a reference model to test whether functional logic, not just verbatim text, is being reproduced. The study used Olmo-3-32B and Python code, and measured both textual similarity and execution-based functional similarity to demonstrate the presence of functional memorization. The findings underscore the need for auditing metrics that capture functional equivalence in code generation.
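To illustrate the difference between the two kinds of metric, here is a minimal Python sketch. It is not the paper's implementation: the `difflib` ratio stands in for whatever textual metric the authors use, the helper names (`textual_similarity`, `functional_similarity`, `load_function`), the `clamp` example, and the test inputs are all hypothetical, and the comparison between a mid-trained and a reference model is not reproduced here.

```python
import difflib


def textual_similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1], computed with difflib.

    Assumption: a stand-in for a textual-overlap metric, not the paper's.
    """
    return difflib.SequenceMatcher(None, a, b).ratio()


def load_function(source: str, name: str):
    """Run `source` in a fresh namespace and return the callable `name`.

    Warning: exec runs arbitrary code; real model output should be sandboxed.
    """
    namespace = {}
    exec(source, namespace)
    return namespace[name]


def functional_similarity(src_a: str, src_b: str, name: str, inputs) -> float:
    """Fraction of inputs on which both implementations return equal outputs."""
    f_a = load_function(src_a, name)
    f_b = load_function(src_b, name)
    matches = 0
    for args in inputs:
        try:
            out_a = f_a(*args)
            out_b = f_b(*args)
        except Exception:
            continue  # an exception on either side counts as a mismatch
        matches += out_a == out_b
    return matches / len(inputs)


# Hypothetical "target" code and a hypothetical model generation that uses
# different text but implements the same logic.
TARGET = '''
def clamp(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x
'''

GENERATED = '''
def clamp(value, lower, upper):
    return max(lower, min(value, upper))
'''

INPUTS = [(5, 0, 10), (-3, 0, 10), (42, 0, 10), (0, 0, 0)]

text_score = textual_similarity(TARGET, GENERATED)
func_score = functional_similarity(TARGET, GENERATED, "clamp", INPUTS)
print(f"textual similarity:    {text_score:.2f}")
print(f"functional similarity: {func_score:.2f}")
```

In this toy case the two snippets agree on every test input while sharing only part of their text, which is the kind of gap a purely textual audit would miss. An audit along the lines described above would presumably compute such scores for generations from both the exposed model and the reference model and compare them.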

IMPACT Highlights the need for more sophisticated evaluation metrics for code generation models, impacting how their safety and originality are assessed.

RANK_REASON This is a research paper detailing a new method for evaluating code language models.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.CL TIER_1 English(EN) · Matthieu Meeus, Anil Ramakrishna, Matthew Grange, Zheng Xu, Luca Melis · 2026-06-12 04:00

Detecting Functional Memorization in Code Language Models

arXiv:2606.12764v1 Abstract: Large language models (LLMs) are increasingly used to generate code at scale. Meanwhile, prior work has investigated whether training data may be recoverable from model outputs, by auditing the textual overlap between training exa…

