DuoRAT: Towards Simpler Text-to-SQL Models

Torsten Scholak, Raymond Li, Dzmitry Bahdanau, Harm de Vries, Chris Pal

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Published on Jun 1, 2021

Official link: https://aclanthology.org/2021.naacl-main.103/

Tagged as: research

TL;DR: It's like RAT-SQL, but simpler and faster.

Recent neural text-to-SQL models can effectively translate natural language questions into corresponding SQL queries on unseen databases. Working mostly on the Spider dataset, researchers have proposed increasingly sophisticated solutions to the problem. Contrary to this trend, in this paper we focus on simplifications. We begin by building DuoRAT, a re-implementation of the state-of-the-art RAT-SQL model that, unlike RAT-SQL, uses only relation-aware or vanilla transformers as its building blocks. We perform several ablation experiments using DuoRAT as the baseline model. Our experiments confirm the usefulness of some techniques and point out the redundancy of others, including structural SQL features and features that link the question with the schema.
