PulseAugur

Researchers distill DeepSeek-R1 reasoning into compact models for code clone detection

Researchers have developed a knowledge distillation framework that makes compact open-source models more reliable and practical for cross-language code clone detection. The method transfers reasoning capabilities from a larger model, DeepSeek-R1, to smaller models such as Phi3 and Qwen-Coder. It incorporates response stabilization techniques and trains on synthetic data derived from Project CodeNet, and is reported to improve performance while reducing inference time.
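The summary does not say how the response stabilization works, so the following is only a minimal sketch under stated assumptions: it supposes the teacher is sampled several times per code pair, that each response ends with a fixed "Final answer: yes/no" line, and that a majority vote picks the label used to train the student. The names `extract_verdict`, `stabilize`, and `build_record` are hypothetical and do not come from the paper.

```python
import re
from collections import Counter

# Assumed output convention (not from the paper): the teacher ends its
# reasoning with a line such as "Final answer: yes".
VERDICT_RE = re.compile(r"final answer\s*:\s*(yes|no)", re.IGNORECASE)


def extract_verdict(response):
    """Return 'yes' or 'no' from a response, or None if no verdict is found."""
    matches = VERDICT_RE.findall(response)
    return matches[-1].lower() if matches else None


def stabilize(responses):
    """Majority vote over parseable verdicts; None on a tie or no verdicts."""
    votes = Counter(v for v in map(extract_verdict, responses) if v is not None)
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # ambiguous pair: drop it rather than guess a label
    return ranked[0][0]


def build_record(code_a, lang_a, code_b, lang_b, teacher_responses):
    """Build one student training record, or None if the teacher is unstable."""
    verdict = stabilize(teacher_responses)
    if verdict is None:
        return None
    # Keep the first reasoning trace that agrees with the voted verdict.
    rationale = next(
        r for r in teacher_responses if extract_verdict(r) == verdict
    )
    prompt = (
        "Are these two programs semantic clones?\n"
        f"[{lang_a}]\n{code_a}\n[{lang_b}]\n{code_b}\n"
    )
    return {"prompt": prompt, "completion": rationale, "label": verdict}


if __name__ == "__main__":
    responses = [
        "Both sum a list of integers. Final answer: yes",
        "The loops look different. Final answer: no",
        "Same logic, different syntax. FINAL ANSWER: Yes",
    ]
    record = build_record(
        "print(sum(xs))", "Python",
        "System.out.println(total);", "Java",
        responses,
    )
    print(record["label"])
```

Dropping tied or unparseable pairs trades training-set size for label consistency; a real pipeline might instead re-query the teacher.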

Impact: Enhances the utility of smaller, open-source models for specialized code analysis tasks, potentially reducing reliance on larger, proprietary systems.

Ranking rationale: This is a research paper detailing a new method for improving open-source models for a specific task.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · From 2 sources. How we write summaries →


Sources [2]

  1. arXiv cs.LG · TIER_1 · English (EN) · Mohamad Khajezade, Fatemeh H. Fard, Mohamed Sami Shehata

    Standing on the Shoulders of Giants: Stabilized Knowledge Distillation for Cross-Language Code Clone Detection

    arXiv:2605.02860v1 Announce Type: cross Abstract: Cross-language code clone detection (X-CCD) is challenging because semantically equivalent programs written in different languages often share little surface similarity. Although large language models (LLMs) have shown promise for…

  2. arXiv cs.AI · TIER_1 · English (EN) · Mohamed Sami Shehata

    Standing on the Shoulders of Giants: Stabilized Knowledge Distillation for Cross-Language Code Clone Detection

    Cross-language code clone detection (X-CCD) is challenging because semantically equivalent programs written in different languages often share little surface similarity. Although large language models (LLMs) have shown promise for semantic clone detection, their use as black-box …