This post is a brief summary of a paper I read for study and out of curiosity: Improving Text-to-SQL Evaluation Methodology (Finegan-Dollak et al., ACL 2018).

The authors reorganize existing text-to-SQL datasets to address a problem with the traditional question-based split: the same SQL query can appear in both the training set and the test set.
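To make the idea concrete, here is a minimal sketch of a split that keeps each query on one side only. It assumes the examples are (question, SQL) pairs whose SQL strings have already been normalized so that identical queries compare equal. The function name `query_based_split`, the `test_fraction` parameter, and the toy data are my own illustrations, not taken from the paper.

```python
import random
from collections import defaultdict


def query_based_split(examples, test_fraction=0.2, seed=0):
    """Split (question, sql) pairs so that no SQL query lands in both sets.

    Illustrative helper, not the paper's code: examples are grouped by their
    SQL string, and whole groups are assigned to either train or test.
    """
    groups = defaultdict(list)
    for question, sql in examples:
        groups[sql].append((question, sql))

    # Shuffle the distinct queries, not the individual questions.
    queries = sorted(groups)
    random.Random(seed).shuffle(queries)

    n_test = max(1, round(len(queries) * test_fraction))
    test = [ex for q in queries[:n_test] for ex in groups[q]]
    train = [ex for q in queries[n_test:] for ex in groups[q]]
    return train, test


# Toy data (hypothetical): two questions share the same query.
examples = [
    ("How many students are there?", "SELECT COUNT(*) FROM student"),
    ("What is the number of students?", "SELECT COUNT(*) FROM student"),
    ("List all course names.", "SELECT name FROM course"),
    ("Show every department.", "SELECT name FROM department"),
]
train, test = query_based_split(examples, test_fraction=0.34)
```

A question-based split would shuffle the four examples individually, so the two paraphrases of the student-count question could end up on opposite sides. Grouping by query first rules that out.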

Finegan-Dollak et al., ACL 2018

For the detailed experiments and explanations, refer to the paper: Improving Text-to-SQL Evaluation Methodology (Finegan-Dollak et al., ACL 2018).