AtScale Introduces Industry-First Public Leaderboard for Text-to-SQL Solutions, Setting Transparent Standards for Natural Language Query (NLQ) Evaluation

Media:
Nicole Francoeur
nicole.francoeur@atscale.com
AtScale

AtScale, a leader in semantic layer technology, has launched an open, public leaderboard for Text-to-SQL (T2SQL) solutions, addressing a critical need for transparency and standardization in evaluating natural language data query capabilities. This resource enables academia, vendors, and developers to measure and compare T2SQL performance on a consistent, replicable benchmark using an industry-standard open dataset, schema, and evaluation methods.

Interest in T2SQL solutions has surged with advances in generative AI, because these solutions let non-technical users ask complex questions of proprietary data without SQL skills. However, inconsistent and proprietary evaluation methods make it challenging to validate or compare these solutions. AtScale’s public benchmark addresses this issue by providing an objective framework inspired by canonical benchmarks such as TPC-DS, with metrics that account for query and schema complexity.

“AtScale’s leaderboard sets a new standard for transparency in Text-to-SQL evaluation,” said John Langton, Head of Engineering at AtScale. “By creating an open, objective framework, we’re enabling the industry to validate and improve solutions that make natural language data queries more accessible and reliable for everyone.”

The AtScale Text-to-SQL Leaderboard includes:

Open Benchmarking Environment: A public GitHub repository with detailed download instructions for the TPC-DS dataset, evaluation questions, KPI definitions, and scoring methods that serve as a replicable standard for T2SQL evaluations.
Objective Complexity Metrics: Evaluation metrics that consider question and schema complexity, with scores across two dimensions:
- Question Complexity: Measures the complexity of KPIs required to answer a question, from simple selections to complex aggregations.
- Schema Complexity: Captures the number of tables needed to answer a question accurately, with questions requiring five or more tables rated as high complexity.
Real-Time Public Leaderboard: An industry-first live performance tracker that displays the scores of T2SQL solutions, fostering transparency and competition.
Community Collaboration: As a community resource, the leaderboard encourages participation, feedback, and collaborative improvement, allowing the industry to continuously refine the evaluation framework.
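
As an illustration of how the two complexity dimensions described above could be used to break down results, the following Python snippet is a minimal sketch and not AtScale's actual scoring code. The five-table threshold for high schema complexity comes from this announcement; the `Result` record, the "low"/"high" question complexity labels, and the per-bucket accuracy calculation are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

# From the announcement: questions needing five or more tables are high complexity.
HIGH_SCHEMA_TABLES = 5


@dataclass
class Result:
    """One evaluated question (hypothetical record layout)."""
    question_id: str
    tables_required: int       # number of tables needed to answer the question
    question_complexity: str   # assumed label, e.g. "low" or "high"
    correct: bool              # whether the generated SQL gave the right answer


def schema_complexity(tables_required: int) -> str:
    """Label a question "high" at five or more tables, otherwise "lower"."""
    return "high" if tables_required >= HIGH_SCHEMA_TABLES else "lower"


def accuracy_by_bucket(results):
    """Return accuracy per (question complexity, schema complexity) bucket."""
    totals = defaultdict(lambda: [0, 0])  # bucket -> [correct count, total count]
    for r in results:
        key = (r.question_complexity, schema_complexity(r.tables_required))
        totals[key][0] += int(r.correct)
        totals[key][1] += 1
    return {key: correct / total for key, (correct, total) in totals.items()}


if __name__ == "__main__":
    sample = [
        Result("q1", 2, "low", True),
        Result("q2", 6, "high", False),
        Result("q3", 5, "high", True),
        Result("q4", 1, "low", True),
    ]
    for bucket, accuracy in sorted(accuracy_by_bucket(sample).items()):
        print(bucket, accuracy)
```

Reporting accuracy per bucket, rather than as a single number, keeps a solution that only handles simple, single-table questions from appearing equivalent to one that handles multi-table aggregations.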

For Participation and Feedback

Organizations or developers interested in submitting a Text-to-SQL solution to the leaderboard or providing feedback can contact ailink@atscale.com.

For more detail, read the full whitepaper, “Enable Natural Language Prompting with AtScale’s Semantic Layer & Generative AI.”

Read the announcement blog for more information on AtScale’s NLQ Benchmark Release.

About AtScale:

AtScale enables smarter data-driven decisions by bridging the gap between data and analytics, simplifying and extending BI and AI capabilities. With its Universal Semantic Layer, AtScale empowers enterprises to create business-friendly data models that ensure consistency and accuracy across analytics tools. With over a decade of innovation, AtScale continues to lead the industry, transforming how enterprises utilize and analyze their data. For more information, please visit www.atscale.com and follow us on LinkedIn.