Text-to-SQL

Before the Query Comes the Company

Text-to-SQL made a beautiful promise: ask a question in English, get an answer from your company's data.

And for the first time, that promise felt believable. Modern AI could read table names, understand columns, write joins, and produce SQL that often worked. The interface to data was no longer a query editor. It was a sentence.

But inside a real company, the sentence is only the beginning. "Did onboarding improve?" is not one question. It hides five others.

text-to-sql · company-to-query · 7 min read

The promise was real

The first breakthrough was SQL

Text-to-SQL proved that language could become executable data work. That was the breakthrough. But the company question still had to be translated before the SQL could be trusted.

Visual 1: the beautiful old promise

English to executable data work
01 Question: Did onboarding improve?
02 Database: Find users, accounts, events, and timestamps.
03 SQL: Write joins and counts that often work.
04 Answer: Activation is up 4%.

This was not a naive dream. Text-to-SQL was a real breakthrough: language could become executable data work. It solved the last mile first: the act of writing the query.

The hidden question

Company questions are not born precise

A person asks, "Did onboarding improve?" The old system sees users, events, and timestamps. The company sees launch exposure, metric definitions, product paths, customer eligibility, and exceptions.

Visual 2: what the question hides

Did onboarding improve?

What the database can see

user_signed_up
onboarding_started
setup_completed
invite_sent
first_project_created

What the company has to know

Improve what?

A signup, a setup step, an invite, a first project, or a return visit can each tell a different story.

For whom?

Self-serve users, enterprise accounts, trial teams, and legacy customers may belong in different populations.

Compared to when?

A launch week, the previous flow, the last full cohort, and seasonality can all change the baseline.

Who saw the launch?

Plan type, feature flags, region, fallback logic, and product paths decide who actually touched the redesign.

Which event is trusted?

The database may show several nearby events. Only one may still match the business definition of activation.

The query is not wrong because the SQL is bad. The query is wrong because it answered the shallow version of the question.

Meaning

The database has facts. The company gives them meaning.

A table can tell you that something happened. It cannot tell you whether that thing is the metric. It can show a signup, a setup step, an invite, a payment, or a return visit. The company has to decide which of those means activation.
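To make that concrete, here is a minimal sketch of how the same event log yields different "activation" numbers depending on which event the company picks. The event names are the ones listed above; the rows and the `activation_rate` helper are invented purely for illustration.

```python
# Hypothetical event log: (user_id, event_name).
# Event names come from the example above; the rows are invented.
EVENTS = [
    (1, "user_signed_up"), (1, "setup_completed"), (1, "first_project_created"),
    (2, "user_signed_up"), (2, "setup_completed"),
    (3, "user_signed_up"), (3, "setup_completed"), (3, "invite_sent"),
    (4, "user_signed_up"),
]


def activation_rate(events, activation_event):
    """Share of signed-up users who fired `activation_event` at least once."""
    signed_up = {user for user, name in events if name == "user_signed_up"}
    activated = {user for user, name in events if name == activation_event}
    activated &= signed_up  # only count users who are in the signup population
    return len(activated) / len(signed_up) if signed_up else 0.0


# The database holds all three candidates; only the company can say which counts.
for candidate in ("setup_completed", "invite_sent", "first_project_created"):
    print(f"{candidate}: {activation_rate(EVENTS, candidate):.0%}")
```

On this toy log the rate is 75% if activation means completing setup and 25% if it means creating a first project. Nothing in the table says which reading is correct.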

Business meaning

What does "activation" actually mean?

Is it signing up, completing setup, inviting a teammate, creating a first project, or coming back the next day?

Product behavior

Who actually experienced the launch?

Some users saw the redesigned onboarding. Some stayed on the old path because of plan type, feature flags, region, or fallback logic.

Data reality

Which signal should be trusted?

The database may contain several nearby events. Only one may match the business definition.

Team memory

What does everyone know but nobody wrote down?

Launch caveats, renamed events, customer exceptions, broken tracking, and "do not use that table after March" knowledge often live in people's heads.

The old pipeline

The old pipeline answered too early

The old pipeline ran in this order: ask the question, generate SQL, get an answer, and only then ask humans whether the definitions were right. That can produce a clean answer to the shallow version of the question.

Visual 3: same question, different context

Did onboarding improve?

Database-only answer

Activation is up 4%.

Clean, fast, and plausible. But it mixes customers who saw the redesign with customers still routed through the old flow.

Company-aware answer

Activation improved for self-serve accounts that saw the redesigned onboarding. Enterprise accounts stayed on the old path, which muted the top-line effect.

Eligible users · Launch exposure · Trusted event · Legacy exclusions
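The muting effect can be sketched with a few lines of arithmetic. The numbers below are invented and are not the figures behind the 4% above; the `Cohort` type, segment labels, and `lift` helper are hypothetical. For simplicity the sketch assumes each cohort has the same number of accounts before and after launch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cohort:
    segment: str           # hypothetical label, e.g. "self_serve" or "enterprise"
    saw_redesign: bool     # was this cohort routed through the new onboarding?
    accounts: int          # cohort size (assumed equal before and after launch)
    activated_before: int  # accounts activated under the previous flow
    activated_after: int   # accounts activated after launch


# Invented numbers, chosen only to show the mixing effect.
COHORTS = [
    Cohort("self_serve", True, 1000, 400, 500),
    Cohort("enterprise", False, 1000, 300, 300),
]


def lift(cohorts):
    """Change in activation rate (after minus before), in percentage points."""
    total = sum(c.accounts for c in cohorts)
    before = sum(c.activated_before for c in cohorts) / total
    after = sum(c.activated_after for c in cohorts) / total
    return round((after - before) * 100, 1)


top_line = lift(COHORTS)                                     # everyone mixed together
exposed_only = lift([c for c in COHORTS if c.saw_redesign])  # only accounts that saw the launch
print(f"top line: +{top_line} pts, exposed only: +{exposed_only} pts")
```

With these toy cohorts the top line moves by 5 points while the exposed population moves by 10. Both numbers are arithmetically correct; only the second answers the question the company was asking.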

The new pipeline

The new pipeline reframes first

The next step is not more SQL tricks. It is not just a bigger prompt or a longer schema dump. The next step is to assemble company context before writing the query.

Visual 4: the query arrives late

SQL is not removed. SQL is delayed until the question is ready.

01 Business meaning: Define what improvement and activation mean for this company.
02 Product behavior: Understand who saw the redesigned onboarding path and who did not.
03 Data reality: Choose the trusted tables, events, rows, and filters.
04 Team memory: Bring in launch caveats, renamed events, and customer exceptions.
05 Reframed question: Among eligible accounts that saw the new path, did the real activation event improve?
06 SQL: Now write the query with the population, metric, joins, and caveats in place.
07 Evidenced answer: Activation improved for self-serve accounts; enterprise fallback muted the top line.

That context lives across the business, the product, the codebase, the database, and the memory of the people who built the system. It tells the agent what teams mean by activation, which users actually saw the launch, which table carries the trusted signal, and which exceptions would make a clean-looking answer misleading.
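One way to picture "SQL is delayed until the question is ready" is a simple gate that refuses to generate a query while any kind of context is still missing. This is only a sketch: the `CompanyContext` type, its field names, and the example strings are hypothetical, and a real system would hold far richer context than one string per category.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class CompanyContext:
    # One slot per kind of context; None means "not assembled yet".
    business_meaning: Optional[str] = None
    product_behavior: Optional[str] = None
    data_reality: Optional[str] = None
    team_memory: Optional[str] = None


def missing_context(ctx):
    """Names of the context slots that are still empty, in declaration order."""
    return [f.name for f in fields(ctx) if getattr(ctx, f.name) is None]


def ready_for_sql(ctx):
    """True only when every kind of context has been filled in."""
    return not missing_context(ctx)


# Hypothetical partial context: meaning and exposure are known, the rest is not.
ctx = CompanyContext(
    business_meaning="activation = first_project_created",
    product_behavior="redesign shipped to self-serve accounts only",
)
print(missing_context(ctx))
print(ready_for_sql(ctx))
```

Here the gate reports that `data_reality` and `team_memory` are still open, so query generation would wait.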

Once that context is assembled, the original question changes. "Did onboarding improve?" becomes: "Among eligible accounts that saw the redesigned onboarding path, did the real activation event improve compared with the previous flow, excluding customers still routed through legacy onboarding?"

Only now is it time to write SQL.
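As a rough sketch, the reframed question might translate into a query like the one below. The schema (`accounts`, `events`), the column names, the segment and flow labels, and the `QueryContext` and `build_query` helpers are all assumptions made for illustration. The sketch computes the activation rate for the exposed population only; comparing against the previous flow would mean running the same query with the baseline flow.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryContext:
    activation_event: str    # business meaning: the trusted activation event
    exposed_flow: str        # product behavior: the flow that counts as "saw the launch"
    excluded_segments: tuple # team memory: populations to leave out


def build_query(ctx):
    """Return (sql, params): activation rate among exposed, eligible accounts."""
    placeholders = ", ".join("?" for _ in ctx.excluded_segments)
    sql = f"""
        SELECT AVG(EXISTS (
            SELECT 1 FROM events e
            WHERE e.account_id = a.id AND e.name = ?
        ))
        FROM accounts a
        WHERE a.onboarding_flow = ?
          AND a.segment NOT IN ({placeholders})
    """
    return sql, (ctx.activation_event, ctx.exposed_flow, *ctx.excluded_segments)


# Hypothetical schema and rows, loaded into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, segment TEXT, onboarding_flow TEXT);
    CREATE TABLE events (account_id INTEGER, name TEXT);
    INSERT INTO accounts VALUES
        (1, 'self_serve', 'redesigned'),
        (2, 'self_serve', 'redesigned'),
        (3, 'enterprise', 'legacy'),
        (4, 'legacy_customer', 'redesigned');
    INSERT INTO events VALUES
        (1, 'first_project_created'),
        (2, 'setup_completed'),
        (3, 'first_project_created'),
        (4, 'first_project_created');
""")

ctx = QueryContext("first_project_created", "redesigned", ("legacy_customer",))
sql, params = build_query(ctx)
rate = conn.execute(sql, params).fetchone()[0]
print(rate)
```

Every clause in the `WHERE` maps back to a piece of context: the flow filter encodes launch exposure, the segment exclusion encodes a team caveat, and the event name encodes the metric definition. Values are passed as bound parameters rather than interpolated into the SQL string.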

The category lesson

Text-to-SQL succeeded well enough to reveal the real problem.

The bottleneck was not only query generation. It was question interpretation.

In a company, the hard part is not turning English into SQL. The hard part is turning a business question into the right data question.

The future is not Text-to-SQL. It is company-to-query.