New frameworks enhance Text-to-SQL models with flexible interaction and fine-grained feedback
By PulseAugur Editorial·
Summary by gemini-2.5-flash-lite
from 22 sources
Researchers have developed several new frameworks to improve Text-to-SQL generation, particularly for smaller language models and complex database interactions. FineStep and FINER-SQL introduce novel reinforcement learning approaches with step-level credit assignment and fine-grained execution feedback to enhance accuracy and efficiency. Rose-SQL leverages in-context learning with small reasoning models for multi-turn queries, while FlexSQL focuses on flexible database interaction and exploration for better query interpretation. Additionally, EGRefine addresses schema ambiguity by optimizing naming conventions to improve downstream Text-to-SQL performance across various models.
AI
IMPACT
These advancements offer more efficient, accurate, and privacy-preserving Text-to-SQL solutions, potentially enabling wider adoption of natural language database querying.
RANK_REASON
Multiple research papers introduce novel frameworks and techniques for improving Text-to-SQL generation.
SQL dialects vary in syntax, types, and functions across database engines. Text-to-SQL benchmarks, however, predominantly support only SQLite. This creates a critical evaluation gap: cross-dialect evaluation reveals weak per-query agreement (Cohen's κ), showing that SQLite perform…
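For readers unfamiliar with the agreement statistic mentioned above, here is a minimal sketch of how Cohen's κ could be computed over per-query outcomes on two dialects. The function name `cohens_kappa` and the outcome lists are hypothetical illustrations, not data or code from the paper; the sketch assumes each query is simply marked correct (1) or incorrect (0) on each engine.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items that received identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each sequence's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both sequences are constant and identical
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical per-query outcomes (1 = correct) for one model on two dialects.
sqlite_correct = [1, 1, 0, 1, 0, 1, 1, 0]
postgres_correct = [1, 0, 0, 1, 1, 1, 0, 0]
kappa = cohens_kappa(sqlite_correct, postgres_correct)
print(f"kappa = {kappa:.2f}")
```

A κ near 1 would mean the same queries succeed or fail on both engines. A value near 0 means agreement is no better than chance, even if overall accuracy looks similar on both.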
arXiv cs.CL
TIER_1·Vicki Stover Hertzberg, Eduardo Valverde, Joyce C. Ho·
arXiv:2605.05525v1 Announce Type: cross Abstract: Natural language interfaces to databases have gained popularity, yet the theoretical foundations for evaluating and designing these systems remain underdeveloped. We present QUEST (Query Understanding Evaluation through Semantic T…
arXiv:2605.04719v1 Announce Type: new Abstract: Tool-integrated Text-to-SQL parsing has emerged as a promising paradigm, framing SQL generation as a sequential decision-making process interleaved with tool execution. However, existing reinforcement learning approaches mainly rely…
Tool-integrated Text-to-SQL parsing has emerged as a promising paradigm, framing SQL generation as a sequential decision-making process interleaved with tool execution. However, existing reinforcement learning approaches mainly rely on coarse-grained outcome supervision, resultin…
arXiv:2605.03720v1 Announce Type: new Abstract: Recent advances in Large Reasoning Models (LRMs) trained with Long Chain-of-Thought have demonstrated remarkable capabilities in code generation and mathematical reasoning. However, their potential in multi-turn Text-to-SQL tasks re…
arXiv cs.CL
TIER_1·Thanh Dat Hoang, Thanh Trung Huynh, Matthias Weidlich, Thanh Tam Nguyen, Tong Chen, Hongzhi Yin, Quoc Viet Hung Nguyen·
arXiv:2605.03465v1 Announce Type: cross Abstract: Large language models have driven major advances in Text-to-SQL generation. However, they suffer from high computational cost, long latency, and data privacy concerns, which make them impractical for many real-world applications. …
Recent advances in Large Reasoning Models (LRMs) trained with Long Chain-of-Thought have demonstrated remarkable capabilities in code generation and mathematical reasoning. However, their potential in multi-turn Text-to-SQL tasks remains largely underexplored. Existing approaches…
Large language models have driven major advances in Text-to-SQL generation. However, they suffer from high computational cost, long latency, and data privacy concerns, which make them impractical for many real-world applications. A natural alternative is to use small language mod…
arXiv cs.CL
TIER_1·Quang Hieu Pham, Yang He, Ping Nie, Canwen Xu, Davood Rafiei, Yuepeng Wang, Xi Ye, Jocelyn Qiaochu Chen·
arXiv:2605.02815v1 Announce Type: new Abstract: Text-to-SQL over large analytical databases requires navigating complex schemas, resolving ambiguous queries, and grounding decisions in actual data. Most current systems follow a fixed pipeline where schema elements are retrieved o…
Text-to-SQL over large analytical databases requires navigating complex schemas, resolving ambiguous queries, and grounding decisions in actual data. Most current systems follow a fixed pipeline where schema elements are retrieved once upfront and the database is only revisited f…
arXiv:2605.00628v1 Announce Type: cross Abstract: Text-to-SQL enables non-expert users to query databases in natural language, yet real-world schemas often suffer from ambiguous, abbreviated, or inconsistent naming conventions that degrade model accuracy. Existing approaches trea…
Text-to-SQL enables non-expert users to query databases in natural language, yet real-world schemas often suffer from ambiguous, abbreviated, or inconsistent naming conventions that degrade model accuracy. Existing approaches treat schemas as fixed and address errors downstream. …
arXiv:2604.28028v1 Announce Type: cross Abstract: Large language models (LLMs) have revolutionized Text-to-SQL generation, allowing users to query structured data using natural language with growing ease. Yet, real-world deployment remains challenging, especially in complex or un…
arXiv:2604.28049v1 Announce Type: new Abstract: Text-to-SQL (T2SQL) evaluation in production environments poses fundamental challenges that existing benchmarks do not address. Current evaluation methodologies, whether rule-based SQL matching or schema-dependent semantic parsers, as…
Text-to-SQL (T2SQL) evaluation in production environments poses fundamental challenges that existing benchmarks do not address. Current evaluation methodologies, whether rule-based SQL matching or schema-dependent semantic parsers, assume access to ground-truth queries and structur…
Large language models (LLMs) have revolutionized Text-to-SQL generation, allowing users to query structured data using natural language with growing ease. Yet, real-world deployment remains challenging, especially in complex or unseen schemas, due to inconsistent accuracy and the…
arXiv:2604.25325v1 Announce Type: cross Abstract: Modern Text-to-SQL systems generate multiple candidate SQL queries and rank them to judge a final prediction. However, existing methods face two limitations. First, they often score functionally equivalent SQL queries inconsistent…
arXiv cs.CL
TIER_1·Yusuf Denizay Dönder, Derek Hommel, Andrea W Wen-Yi, David Mimno, Unso Eun Seo Jo·
arXiv:2505.14174v2 Announce Type: replace Abstract: LLMs are effective at code generation tasks like text-to-SQL, but is it worth the cost? Many state-of-the-art approaches use non-task-specific LLM techniques including Chain-of-Thought (CoT), self-consistency, and fine-tuning. T…
Modern Text-to-SQL systems generate multiple candidate SQL queries and rank them to judge a final prediction. However, existing methods face two limitations. First, they often score functionally equivalent SQL queries inconsistently despite identical execution results. Second, ra…
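To make the first limitation concrete, the sketch below groups candidate queries by their execution results, so that candidates returning the same rows fall into one group and can be scored together. This is an illustrative assumption about how such grouping might look, not the method of the paper (whose abstract is truncated here); the table `emp`, the candidate queries, and the helper `group_by_execution` are all hypothetical. Identical results on one database instance do not prove that two queries are equivalent in general.

```python
import sqlite3
from collections import defaultdict


def group_by_execution(conn, candidates):
    """Group candidate SQL strings by their order-insensitive result set."""
    groups = defaultdict(list)
    for sql in candidates:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # skip candidates that fail to execute
        # Sort rows by their repr so row order does not affect the key.
        key = tuple(sorted(rows, key=repr))
        groups[key].append(sql)
    return groups


# Hypothetical toy database and candidate queries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (id INTEGER, dept TEXT, salary INTEGER);
    INSERT INTO emp VALUES (1, 'a', 10), (2, 'a', 20), (3, 'b', 30);
""")
candidates = [
    "SELECT COUNT(*) FROM emp WHERE dept = 'a'",
    "SELECT COUNT(id) FROM emp WHERE dept IN ('a')",
    "SELECT COUNT(*) FROM emp",
    "SELECT COUNT(*) FROM missing_table",
]
groups = group_by_execution(conn, candidates)
largest = max(groups.values(), key=len)
print(len(groups), "result groups; largest has", len(largest), "candidates")
```

Here the first two candidates differ textually but return the same row, so a ranker that scores them differently is being inconsistent with respect to execution.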
arXiv cs.AI
TIER_1·Sepideh Abedini, M. Tamer Özsu·
arXiv:2604.21214v2 Announce Type: replace-cross Abstract: Text-to-SQL models have significantly improved with the adoption of Large Language Models (LLMs), leading to their increasing use in real-world applications. Although many benchmarks exist for evaluating the performance of…
arXiv cs.CL
TIER_1·Tanmay Parekh, Ella Hofmann-Coyle, Shuyi Wang, Sachith Sri Ram Kothur, Srivas Prasad, Yunmo Chen·
arXiv:2604.22934v1 Announce Type: cross Abstract: LLM-based agents for text-to-SQL often struggle with a latency-performance trade-off, where performance improvements come at the cost of latency or vice versa. We reformulate text-to-SQL generation through the lens of software test c…