New frameworks enhance Text-to-SQL models with flexible interaction and fine-grained feedback

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 22 sources

Researchers have proposed several new frameworks for Text-to-SQL generation, targeting smaller language models and complex database interactions. FineStep and FINER-SQL apply reinforcement learning with step-level credit assignment and fine-grained execution feedback to improve accuracy and efficiency. Rose-SQL uses in-context learning with small reasoning models for multi-turn queries, while FlexSQL lets agents interact with and explore the database flexibly so they can interpret queries better. EGREFINE addresses schema ambiguity by refining naming conventions, which improves downstream Text-to-SQL performance across various models.
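To make "execution feedback" concrete, here is a minimal sketch of the general idea: run the predicted query and a reference query against the same database and grade the prediction by what actually comes back. It is not the method of FINER-SQL or any other paper listed here. The function names, the toy `employees` table, and the three reward values are all assumptions chosen for illustration.

```python
import sqlite3
from collections import Counter

# Illustrative reward levels; placeholder numbers, not values from any paper.
REWARD_ERROR, REWARD_MISMATCH, REWARD_MATCH = 0.0, 0.1, 1.0


def run_query(conn, sql):
    """Return (multiset of rows, None) on success or (None, error message) on failure."""
    try:
        return Counter(conn.execute(sql).fetchall()), None
    except sqlite3.Error as exc:
        return None, str(exc)


def execution_feedback(conn, predicted_sql, reference_sql):
    """Grade a predicted query by comparing its result rows with the reference's."""
    reference_rows, reference_error = run_query(conn, reference_sql)
    if reference_error is not None:
        raise ValueError(f"reference query failed: {reference_error}")

    predicted_rows, predicted_error = run_query(conn, predicted_sql)
    if predicted_error is not None:
        # The query did not execute; pass the database error back as feedback.
        return {"status": "error", "reward": REWARD_ERROR, "detail": predicted_error}
    if predicted_rows == reference_rows:
        # Same rows, ignoring order: differently written but equivalent SQL scores the same.
        return {"status": "match", "reward": REWARD_MATCH, "detail": ""}
    detail = f"{sum(predicted_rows.values())} rows returned, {sum(reference_rows.values())} expected"
    return {"status": "mismatch", "reward": REWARD_MISMATCH, "detail": detail}


# Hypothetical toy database used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    INSERT INTO employees VALUES (1, 'Ada', 'eng'), (2, 'Lin', 'ops'), (3, 'Sam', 'eng');
    """
)
REFERENCE = "SELECT name FROM employees WHERE dept = 'eng'"

for candidate in (
    "SELECT e.name FROM employees AS e WHERE e.dept IN ('eng') ORDER BY e.name DESC",
    "SELECT name FROM employees",
    "SELECT nam FROM employees",
):
    print(execution_feedback(conn, candidate, REFERENCE)["status"], "<-", candidate)
```

Comparing rows as a multiset means two queries that are written differently but return the same rows receive the same score, which is the consistency property the R³-SQL abstract in the coverage list says existing rankers often lack.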


IMPACT These advancements offer more efficient, accurate, and privacy-preserving Text-to-SQL solutions, potentially enabling wider adoption of natural language database querying.

RANK_REASON Multiple research papers introduce novel frameworks and techniques for improving Text-to-SQL generation.


COVERAGE [22]

arXiv cs.CL TIER_1 · Andrea Giovannini · 2026-05-08 14:32

PolySQL: Scaling Text-to-SQL Evaluation Across SQL Dialects via Automated Backend Isomorphism

SQL dialects vary in syntax, types, and functions across database engines. Text-to-SQL benchmarks, however, predominantly support only SQLite. This creates a critical evaluation gap: cross-dialect evaluation reveals weak per-query agreement (Cohen's κ), showing that SQLite perform…
arXiv cs.CL TIER_1 · Vicki Stover Hertzberg, Eduardo Valverde, Joyce C. Ho · 2026-05-08 04:00

Anatomy of a Query: W5H Dimensions and FAR Patterns for Text-to-SQL Evaluation

arXiv:2605.05525v1 Announce Type: cross Abstract: Natural language interfaces to databases have gained popularity, yet the theoretical foundations for evaluating and designing these systems remain underdeveloped. We present QUEST (Query Understanding Evaluation through Semantic T…
arXiv cs.CL TIER_1 · Yaxun Dai, Baolin Sun, Junying Wang, Pengfei Wang, Yingqi Gao, Xuemei Dong, Mengdie Chu, Xiang Qi, Pingfu Chao · 2026-05-07 04:00

Every Step Counts: Step-Level Credit Assignment for Tool-Integrated Text-to-SQL

arXiv:2605.04719v1 Announce Type: new Abstract: Tool-integrated Text-to-SQL parsing has emerged as a promising paradigm, framing SQL generation as a sequential decision-making process interleaved with tool execution. However, existing reinforcement learning approaches mainly rely…
arXiv cs.CL TIER_1 · Pingfu Chao · 2026-05-06 10:10

Every Step Counts: Step-Level Credit Assignment for Tool-Integrated Text-to-SQL

Tool-integrated Text-to-SQL parsing has emerged as a promising paradigm, framing SQL generation as a sequential decision-making process interleaved with tool execution. However, existing reinforcement learning approaches mainly rely on coarse-grained outcome supervision, resultin…
arXiv cs.CL TIER_1 · Le Zhou, Feng Yao, Fengcai Qiao, Bo Xu, Fangyuan Wang, Boyan Xu · 2026-05-06 04:00

Rose-SQL: Role-State Evolution Guided Structured Reasoning for Multi-Turn Text-to-SQL

arXiv:2605.03720v1 Announce Type: new Abstract: Recent advances in Large Reasoning Models (LRMs) trained with Long Chain-of-Thought have demonstrated remarkable capabilities in code generation and mathematical reasoning. However, their potential in multi-turn Text-to-SQL tasks re…
arXiv cs.CL TIER_1 · Thanh Dat Hoang, Thanh Trung Huynh, Matthias Weidlich, Thanh Tam Nguyen, Tong Chen, Hongzhi Yin, Quoc Viet Hung Nguyen · 2026-05-06 04:00

FINER-SQL: Boosting Small Language Models for Text-to-SQL

arXiv:2605.03465v1 Announce Type: cross Abstract: Large language models have driven major advances in Text-to-SQL generation. However, they suffer from high computational cost, long latency, and data privacy concerns, which make them impractical for many real-world applications. …
arXiv cs.CL TIER_1 · Boyan Xu · 2026-05-05 13:06

Rose-SQL: Role-State Evolution Guided Structured Reasoning for Multi-Turn Text-to-SQL

Recent advances in Large Reasoning Models (LRMs) trained with Long Chain-of-Thought have demonstrated remarkable capabilities in code generation and mathematical reasoning. However, their potential in multi-turn Text-to-SQL tasks remains largely underexplored. Existing approaches…
arXiv cs.CL TIER_1 · Quoc Viet Hung Nguyen · 2026-05-05 07:51

FINER-SQL: Boosting Small Language Models for Text-to-SQL

Large language models have driven major advances in Text-to-SQL generation. However, they suffer from high computational cost, long latency, and data privacy concerns, which make them impractical for many real-world applications. A natural alternative is to use small language mod…
arXiv cs.CL TIER_1 · Quang Hieu Pham, Yang He, Ping Nie, Canwen Xu, Davood Rafiei, Yuepeng Wang, Xi Ye, Jocelyn Qiaochu Chen · 2026-05-05 04:00

FlexSQL: Flexible Exploration and Execution Make Better Text-to-SQL Agents

arXiv:2605.02815v1 Announce Type: new Abstract: Text-to-SQL over large analytical databases requires navigating complex schemas, resolving ambiguous queries, and grounding decisions in actual data. Most current systems follow a fixed pipeline where schema elements are retrieved o…
arXiv cs.CL TIER_1 · Jocelyn Qiaochu Chen · 2026-05-04 16:51

FlexSQL: Flexible Exploration and Execution Make Better Text-to-SQL Agents

Text-to-SQL over large analytical databases requires navigating complex schemas, resolving ambiguous queries, and grounding decisions in actual data. Most current systems follow a fixed pipeline where schema elements are retrieved once upfront and the database is only revisited f…
arXiv cs.CL TIER_1 · Jiaqian Wang, Yutao Qi, Wenjin Hou, Yu Pang, Rui Yang · 2026-05-04 04:00

EGREFINE: An Execution-Grounded Optimization Framework for Text-to-SQL Schema Refinement

arXiv:2605.00628v1 Announce Type: cross Abstract: Text-to-SQL enables non-expert users to query databases in natural language, yet real-world schemas often suffer from ambiguous, abbreviated, or inconsistent naming conventions that degrade model accuracy. Existing approaches trea…
arXiv cs.CL TIER_1 · Rui Yang · 2026-05-01 13:01

EGREFINE: An Execution-Grounded Optimization Framework for Text-to-SQL Schema Refinement

Text-to-SQL enables non-expert users to query databases in natural language, yet real-world schemas often suffer from ambiguous, abbreviated, or inconsistent naming conventions that degrade model accuracy. Existing approaches treat schemas as fixed and address errors downstream. …
arXiv cs.AI TIER_1 · Smit Jivani, Sarvam Maheshwari, Sunita Sarawagi · 2026-05-01 04:00

Reliable Answers for Recurring Questions: Boosting Text-to-SQL Accuracy with Template Constrained Decoding

arXiv:2604.28028v1 Announce Type: cross Abstract: Large language models (LLMs) have revolutionized Text-to-SQL generation, allowing users to query structured data using natural language with growing ease. Yet, real-world deployment remains challenging, especially in complex or un…
arXiv cs.AI TIER_1 · Taslim Jamal Arif, Kuldeep Singh · 2026-05-01 04:00

Agent-Agnostic Evaluation of SQL Accuracy in Production Text-to-SQL Systems

arXiv:2604.28049v1 Announce Type: new Abstract: Text-to-SQL (T2SQL) evaluation in production environments poses fundamental challenges that existing benchmarks do not address. Current evaluation methodologies, whether rule-based SQL matching or schema-dependent semantic parsers, as…
arXiv cs.AI TIER_1 · Kuldeep Singh · 2026-04-30 15:59

Agent-Agnostic Evaluation of SQL Accuracy in Production Text-to-SQL Systems

Text-to-SQL (T2SQL) evaluation in production environments poses fundamental challenges that existing benchmarks do not address. Current evaluation methodologies, whether rule-based SQL matching or schema-dependent semantic parsers, assume access to ground-truth queries and structur…
arXiv cs.CL TIER_1 · Sunita Sarawagi · 2026-04-30 15:44

Reliable Answers for Recurring Questions: Boosting Text-to-SQL Accuracy with Template Constrained Decoding

Large language models (LLMs) have revolutionized Text-to-SQL generation, allowing users to query structured data using natural language with growing ease. Yet, real-world deployment remains challenging, especially in complex or unseen schemas, due to inconsistent accuracy and the…
arXiv cs.CL TIER_1 · Hojae Han, Yeonseok Jeong, Seung-won Hwang, Zhewei Yao, Yuxiong He · 2026-04-29 04:00

R³-SQL: Ranking Reward and Resampling for Text-to-SQL

arXiv:2604.25325v1 Announce Type: cross Abstract: Modern Text-to-SQL systems generate multiple candidate SQL queries and rank them to judge a final prediction. However, existing methods face two limitations. First, they often score functionally equivalent SQL queries inconsistent…
arXiv cs.CL TIER_1 · Yusuf Denizay Dönder, Derek Hommel, Andrea W Wen-Yi, David Mimno, Unso Eun Seo Jo · 2026-04-29 04:00

Cheaper, Better, Faster, Stronger: Robust Text-to-SQL without Chain-of-Thought or Fine-Tuning

arXiv:2505.14174v2 Announce Type: replace Abstract: LLMs are effective at code generation tasks like text-to-SQL, but is it worth the cost? Many state-of-the-art approaches use non-task-specific LLM techniques including Chain-of-Thought (CoT), self-consistency, and fine-tuning. T…
arXiv cs.CL TIER_1 · Yuxiong He · 2026-04-28 07:40

R³-SQL: Ranking Reward and Resampling for Text-to-SQL

Modern Text-to-SQL systems generate multiple candidate SQL queries and rank them to judge a final prediction. However, existing methods face two limitations. First, they often score functionally equivalent SQL queries inconsistently despite identical execution results. Second, ra…
Hugging Face Daily Papers TIER_1 · 2026-04-28 07:40

R³-SQL: Ranking Reward and Resampling for Text-to-SQL

Modern Text-to-SQL systems generate multiple candidate SQL queries and rank them to judge a final prediction. However, existing methods face two limitations. First, they often score functionally equivalent SQL queries inconsistently despite identical execution results. Second, ra…
arXiv cs.AI TIER_1 · Sepideh Abedini, M. Tamer Özsu · 2026-04-28 04:00

SQLyzr: A Comprehensive Benchmark and Evaluation Platform for Text-to-SQL

arXiv:2604.21214v2 Announce Type: replace-cross Abstract: Text-to-SQL models have significantly improved with the adoption of Large Language Models (LLMs), leading to their increasing use in real-world applications. Although many benchmarks exist for evaluating the performance of…
arXiv cs.CL TIER_1 · Tanmay Parekh, Ella Hofmann-Coyle, Shuyi Wang, Sachith Sri Ram Kothur, Srivas Prasad, Yunmo Chen · 2026-04-28 04:00

PExA: Parallel Exploration Agent for Complex Text-to-SQL

arXiv:2604.22934v1 Announce Type: cross Abstract: LLM-based agents for text-to-SQL often struggle with latency-performance trade-off, where performance improvements come at the cost of latency or vice versa. We reformulate text-to-SQL generation within the lens of software test c…
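
The PolySQL entry in the list above reports per-query agreement between dialects using Cohen's kappa. For readers unfamiliar with the statistic, the sketch below computes it for two lists of per-query outcomes (1 = the generated query was judged correct on that backend, 0 = incorrect). The outcome lists are made-up toy values, not results from any paper, and the function name is an assumption.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: share of items on which both sequences give the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability of matching if labels were drawn independently
    # from each sequence's own label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a.keys() | counts_b.keys()) / (n * n)

    if expected == 1.0:
        # Both sequences use a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical per-query correctness of one model on two backends (toy data).
sqlite_correct = [1, 1, 1, 0, 1, 0, 1, 1]
other_dialect_correct = [1, 0, 1, 1, 0, 0, 1, 1]

kappa = cohens_kappa(sqlite_correct, other_dialect_correct)
print(f"kappa = {kappa:.3f}")
```

A kappa near 1 means the two backends agree on which queries are right and wrong; a value near 0 means they agree about as often as chance would predict, even when the overall accuracies look similar.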
