New frameworks enhance Text-to-SQL models with flexible interaction and fine-grained feedback

By PulseAugur Editorial · [22 sources] · 2026-04-28 04:00

Researchers have released several new frameworks for Text-to-SQL generation, targeting smaller language models and complex database interactions. FineStep and FINER-SQL apply reinforcement learning with step-level credit assignment and fine-grained execution feedback to improve accuracy and efficiency. Rose-SQL uses in-context learning with small reasoning models for multi-turn queries, while FlexSQL focuses on flexible database interaction and exploration for better query interpretation. EGREFINE addresses schema ambiguity by refining naming conventions to improve downstream Text-to-SQL performance across models.

IMPACT These advancements offer more efficient, accurate, and privacy-preserving Text-to-SQL solutions, potentially enabling wider adoption of natural language database querying.

RANK_REASON Multiple research papers introduce novel frameworks and techniques for improving Text-to-SQL generation.


AI-generated summary · Google Gemini · from 22 sources.

COVERAGE [22]

arXiv cs.CL TIER_1 English(EN) · Andrea Giovannini · 2026-05-08 14:32

PolySQL: Scaling Text-to-SQL Evaluation Across SQL Dialects via Automated Backend Isomorphism

SQL dialects vary in syntax, types, and functions across database engines. Text-to-SQL benchmarks, however, predominantly support only SQLite. This creates a critical evaluation gap: cross-dialect evaluation reveals weak per-query agreement (Cohen's κ), showing that SQLite perform…
arXiv cs.CL TIER_1 English(EN) · Vicki Stover Hertzberg, Eduardo Valverde, Joyce C. Ho · 2026-05-08 04:00

Anatomy of a Query: W5H Dimensions and FAR Patterns for Text-to-SQL Evaluation

arXiv:2605.05525v1 · Natural language interfaces to databases have gained popularity, yet the theoretical foundations for evaluating and designing these systems remain underdeveloped. We present QUEST (Query Understanding Evaluation through Semantic T…
arXiv cs.CL TIER_1 English(EN) · Yaxun Dai, Baolin Sun, Junying Wang, Pengfei Wang, Yingqi Gao, Xuemei Dong, Mengdie Chu, Xiang Qi, Pingfu Chao · 2026-05-07 04:00

Every Step Counts: Step-Level Credit Assignment for Tool-Integrated Text-to-SQL

arXiv:2605.04719v1 · Tool-integrated Text-to-SQL parsing has emerged as a promising paradigm, framing SQL generation as a sequential decision-making process interleaved with tool execution. However, existing reinforcement learning approaches mainly rely…
arXiv cs.CL TIER_1 English(EN) · Pingfu Chao · 2026-05-06 10:10

Every Step Counts: Step-Level Credit Assignment for Tool-Integrated Text-to-SQL

Tool-integrated Text-to-SQL parsing has emerged as a promising paradigm, framing SQL generation as a sequential decision-making process interleaved with tool execution. However, existing reinforcement learning approaches mainly rely on coarse-grained outcome supervision, resultin…
arXiv cs.CL TIER_1 English(EN) · Le Zhou, Feng Yao, Fengcai Qiao, Bo Xu, Fangyuan Wang, Boyan Xu · 2026-05-06 04:00

Rose-SQL: Role-State Evolution Guided Structured Reasoning for Multi-Turn Text-to-SQL

arXiv:2605.03720v1 · Recent advances in Large Reasoning Models (LRMs) trained with Long Chain-of-Thought have demonstrated remarkable capabilities in code generation and mathematical reasoning. However, their potential in multi-turn Text-to-SQL tasks re…
arXiv cs.CL TIER_1 English(EN) · Thanh Dat Hoang, Thanh Trung Huynh, Matthias Weidlich, Thanh Tam Nguyen, Tong Chen, Hongzhi Yin, Quoc Viet Hung Nguyen · 2026-05-06 04:00

FINER-SQL: Boosting Small Language Models for Text-to-SQL

arXiv:2605.03465v1 · Large language models have driven major advances in Text-to-SQL generation. However, they suffer from high computational cost, long latency, and data privacy concerns, which make them impractical for many real-world applications. …
arXiv cs.CL TIER_1 English(EN) · Boyan Xu · 2026-05-05 13:06

Rose-SQL: Role-State Evolution Guided Structured Reasoning for Multi-Turn Text-to-SQL

Recent advances in Large Reasoning Models (LRMs) trained with Long Chain-of-Thought have demonstrated remarkable capabilities in code generation and mathematical reasoning. However, their potential in multi-turn Text-to-SQL tasks remains largely underexplored. Existing approaches…
arXiv cs.CL TIER_1 English(EN) · Quoc Viet Hung Nguyen · 2026-05-05 07:51

FINER-SQL: Boosting Small Language Models for Text-to-SQL

Large language models have driven major advances in Text-to-SQL generation. However, they suffer from high computational cost, long latency, and data privacy concerns, which make them impractical for many real-world applications. A natural alternative is to use small language mod…
arXiv cs.CL TIER_1 English(EN) · Quang Hieu Pham, Yang He, Ping Nie, Canwen Xu, Davood Rafiei, Yuepeng Wang, Xi Ye, Jocelyn Qiaochu Chen · 2026-05-05 04:00

FlexSQL: Flexible Exploration and Execution Make Better Text-to-SQL Agents

arXiv:2605.02815v1 · Text-to-SQL over large analytical databases requires navigating complex schemas, resolving ambiguous queries, and grounding decisions in actual data. Most current systems follow a fixed pipeline where schema elements are retrieved o…
arXiv cs.CL TIER_1 English(EN) · Jocelyn Qiaochu Chen · 2026-05-04 16:51

FlexSQL: Flexible Exploration and Execution Make Better Text-to-SQL Agents

Text-to-SQL over large analytical databases requires navigating complex schemas, resolving ambiguous queries, and grounding decisions in actual data. Most current systems follow a fixed pipeline where schema elements are retrieved once upfront and the database is only revisited f…
arXiv cs.CL TIER_1 English(EN) · Jiaqian Wang, Yutao Qi, Wenjin Hou, Yu Pang, Rui Yang · 2026-05-04 04:00

EGREFINE: An Execution-Grounded Optimization Framework for Text-to-SQL Schema Refinement

arXiv:2605.00628v1 · Text-to-SQL enables non-expert users to query databases in natural language, yet real-world schemas often suffer from ambiguous, abbreviated, or inconsistent naming conventions that degrade model accuracy. Existing approaches trea…
arXiv cs.CL TIER_1 English(EN) · Rui Yang · 2026-05-01 13:01

EGREFINE: An Execution-Grounded Optimization Framework for Text-to-SQL Schema Refinement

Text-to-SQL enables non-expert users to query databases in natural language, yet real-world schemas often suffer from ambiguous, abbreviated, or inconsistent naming conventions that degrade model accuracy. Existing approaches treat schemas as fixed and address errors downstream. …
arXiv cs.AI TIER_1 English(EN) · Smit Jivani, Sarvam Maheshwari, Sunita Sarawagi · 2026-05-01 04:00

Reliable Answers for Recurring Questions: Boosting Text-to-SQL Accuracy with Template Constrained Decoding

arXiv:2604.28028v1 · Large language models (LLMs) have revolutionized Text-to-SQL generation, allowing users to query structured data using natural language with growing ease. Yet, real-world deployment remains challenging, especially in complex or un…
arXiv cs.AI TIER_1 English(EN) · Taslim Jamal Arif, Kuldeep Singh · 2026-05-01 04:00

Agent-Agnostic Evaluation of SQL Accuracy in Production Text-to-SQL Systems

arXiv:2604.28049v1 · Text-to-SQL (T2SQL) evaluation in production environments poses fundamental challenges that existing benchmarks do not address. Current evaluation methodologies, whether rule-based SQL matching or schema-dependent semantic parsers, as…
arXiv cs.AI TIER_1 English(EN) · Kuldeep Singh · 2026-04-30 15:59

Agent-Agnostic Evaluation of SQL Accuracy in Production Text-to-SQL Systems

Text-to-SQL (T2SQL) evaluation in production environments poses fundamental challenges that existing benchmarks do not address. Current evaluation methodologies, whether rule-based SQL matching or schema-dependent semantic parsers, assume access to ground-truth queries and structur…
arXiv cs.CL TIER_1 English(EN) · Sunita Sarawagi · 2026-04-30 15:44

Reliable Answers for Recurring Questions: Boosting Text-to-SQL Accuracy with Template Constrained Decoding

Large language models (LLMs) have revolutionized Text-to-SQL generation, allowing users to query structured data using natural language with growing ease. Yet, real-world deployment remains challenging, especially in complex or unseen schemas, due to inconsistent accuracy and the…
arXiv cs.CL TIER_1 English(EN) · Hojae Han, Yeonseok Jeong, Seung-won Hwang, Zhewei Yao, Yuxiong He · 2026-04-29 04:00

R³-SQL: Ranking Reward and Resampling for Text-to-SQL

arXiv:2604.25325v1 · Modern Text-to-SQL systems generate multiple candidate SQL queries and rank them to judge a final prediction. However, existing methods face two limitations. First, they often score functionally equivalent SQL queries inconsistent…
arXiv cs.CL TIER_1 English(EN) · Yusuf Denizay Dönder, Derek Hommel, Andrea W Wen-Yi, David Mimno, Unso Eun Seo Jo · 2026-04-29 04:00

Cheaper, Better, Faster, Stronger: Robust Text-to-SQL without Chain-of-Thought or Fine-Tuning

arXiv:2505.14174v2 · LLMs are effective at code generation tasks like text-to-SQL, but is it worth the cost? Many state-of-the-art approaches use non-task-specific LLM techniques including Chain-of-Thought (CoT), self-consistency, and fine-tuning. T…
arXiv cs.CL TIER_1 English(EN) · Yuxiong He · 2026-04-28 07:40

R³-SQL: Ranking Reward and Resampling for Text-to-SQL

Modern Text-to-SQL systems generate multiple candidate SQL queries and rank them to judge a final prediction. However, existing methods face two limitations. First, they often score functionally equivalent SQL queries inconsistently despite identical execution results. Second, ra…
Hugging Face Daily Papers TIER_1 English(EN) · 2026-04-28 07:40

R³-SQL: Ranking Reward and Resampling for Text-to-SQL

Modern Text-to-SQL systems generate multiple candidate SQL queries and rank them to judge a final prediction. However, existing methods face two limitations. First, they often score functionally equivalent SQL queries inconsistently despite identical execution results. Second, ra…
arXiv cs.AI TIER_1 English(EN) · Sepideh Abedini, M. Tamer Özsu · 2026-04-28 04:00

SQLyzr: A Comprehensive Benchmark and Evaluation Platform for Text-to-SQL

arXiv:2604.21214v2 · Text-to-SQL models have significantly improved with the adoption of Large Language Models (LLMs), leading to their increasing use in real-world applications. Although many benchmarks exist for evaluating the performance of…
arXiv cs.CL TIER_1 English(EN) · Tanmay Parekh, Ella Hofmann-Coyle, Shuyi Wang, Sachith Sri Ram Kothur, Srivas Prasad, Yunmo Chen · 2026-04-28 04:00

PExA: Parallel Exploration Agent for Complex Text-to-SQL

arXiv:2604.22934v1 · LLM-based agents for text-to-SQL often struggle with a latency-performance trade-off, where performance improvements come at the cost of latency or vice versa. We reformulate text-to-SQL generation within the lens of software test c…
