

# Introduction
Data has become an indispensable resource for any successful business, as it provides valuable insights for informed decision-making. Given the importance of data, many companies are building systems to store and analyze it. However, it is often hard to acquire and analyze the necessary data, especially as data systems grow more complex.
With the advent of generative AI, data work has become significantly easier, as we can now use simple natural language to receive mostly accurate output that closely follows the input we provide. This also applies to data processing and analysis with SQL, where we can ask the model to write queries for us.
In this article, we’ll develop a simple API application that translates natural language into SQL queries that our database understands. We’ll use three main tools: OpenAI, FastAPI, and SQLite.
Here’s the plan.
# Text-to-SQL App Development
First, we’ll prepare everything needed for our project. All you need to provide is the OpenAI API key, which we’ll use to access the generative model. To containerize the application, we’ll use Docker, which you can get for local use by installing Docker Desktop.
Other components, such as SQLite, are already available once you install Python, and FastAPI will be installed later.
For the overall project structure, we’ll use the following:
text_to_sql_app/
├── app/
│   ├── __init__.py
│   ├── database.py
│   ├── openai_utils.py
│   └── main.py
├── demo.db
├── init_db.sql
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
└── .env
Create the structure as shown above, or you can use the following repository to make things easier. We’ll still go through each file to understand how to develop the application.
Let’s start by populating the .env file with the OpenAI API key we previously acquired. You can do that with the following code:
OPENAI_API_KEY=YOUR-API-KEY
Then, go to requirements.txt and fill in the necessary libraries we’ll use:
fastapi
uvicorn
sqlalchemy
openai
pydantic
python-dotenv
Next, we move on to the __init__.py file, and we’ll put the following code inside:
from pathlib import Path
from dotenv import load_dotenv
load_dotenv(dotenv_path=Path(__file__).resolve().parent.parent / ".env", override=False)
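The code above loads the variables from the .env file in the project root into the environment, so the keys we need are available to the application. Because override=False, variables already set in the environment are left unchanged.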
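As an optional addition that is not part of the original project, a small helper can fail fast when a key is missing instead of surfacing an authentication error on the first request. The sketch below assumes the .env file has already been loaded; `require_env` is a hypothetical name used only for illustration.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    `require_env` is a hypothetical helper for illustration; it is not part
    of python-dotenv or of the project files described in this article.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

One possible use would be calling `require_env("OPENAI_API_KEY")` right after `load_dotenv(...)`, so a missing key stops the application at startup.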
Then, we’ll develop Python code in the database.py file to connect to the SQLite database we’ll create later (called demo.db) and provide a way to run SQL queries.
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session

ENGINE = create_engine("sqlite:///demo.db", future=True, echo=False)

def run_query(sql: str) -> list[dict]:
    """Execute a SQL statement and return the rows as a list of dicts."""
    with Session(ENGINE) as session:
        rows = session.execute(text(sql)).mappings().all()
        return [dict(r) for r in rows]
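Since the application is only meant to read data, one extra layer of protection could be to open the database in SQLite's read-only mode, so writes are rejected by the database itself. The sketch below is an assumption, not the article's method: it swaps SQLAlchemy for the standard-library `sqlite3` module to stay self-contained, and `run_query_readonly` is a hypothetical counterpart to `run_query`.

```python
import sqlite3


def run_query_readonly(sql: str, db_path: str = "demo.db") -> list[dict]:
    """Run a query on a connection opened in SQLite's read-only mode.

    Hypothetical variant of run_query, using the standard library only.
    """
    # The mode=ro URI parameter makes SQLite refuse any write on this connection.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        conn.row_factory = sqlite3.Row  # rows can be converted to dicts by column name
        rows = conn.execute(sql).fetchall()
        return [dict(row) for row in rows]
    finally:
        conn.close()
```

With this approach, an `INSERT` or `DROP` that slips past other checks should fail with `sqlite3.OperationalError` rather than modify the data.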
After that, we’ll prepare the openai_utils.py file, which accepts the database schema and the input question. The output will be JSON containing the SQL query (the system prompt includes a guard against write operations).
import os
import json
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

_SYSTEM_PROMPT = """
You convert natural-language questions into read-only SQLite SQL.
Never output INSERT / UPDATE / DELETE.
Return JSON: { "sql": "..." }.
"""

def text_to_sql(question: str, schema: str) -> str:
    """Ask the model for a SQL query that answers the question for the given schema."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0.1,
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": _SYSTEM_PROMPT},
            {"role": "user",
             "content": f"schema:\n{schema}\n\nquestion: {question}"}
        ]
    )
    payload = json.loads(response.choices[0].message.content)
    return payload["sql"]
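The function above reads `payload["sql"]` directly, so a reply that is not valid JSON or lacks the key would surface as a raw `JSONDecodeError` or `KeyError`. A slightly more defensive parsing step might look like the sketch below; `extract_sql` is a hypothetical helper, not part of the OpenAI SDK, and it only uses the standard `json` module.

```python
import json


def extract_sql(raw_content: str) -> str:
    """Pull the "sql" field out of a JSON string, with explicit error messages.

    Hypothetical helper for illustration; it does not call the OpenAI API.
    """
    try:
        payload = json.loads(raw_content)
    except json.JSONDecodeError as exc:
        raise ValueError("Model response was not valid JSON") from exc
    # The prompt asks for an object like {"sql": "..."}; anything else is rejected.
    sql = payload.get("sql") if isinstance(payload, dict) else None
    if not isinstance(sql, str) or not sql.strip():
        raise ValueError('Model response has no usable "sql" field')
    return sql.strip()
```

In `text_to_sql`, the last two lines could then be replaced by `return extract_sql(response.choices[0].message.content)`, which keeps error messages readable when they reach the API caller.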
With both the code and the connection ready, we’ll prepare the application using FastAPI in the main.py file. The application will accept natural language questions, combine them with the database schema captured at startup, convert them into SQL SELECT queries, run them against the SQLite database, and return the results as JSON. The application will be an API we can access via the CLI.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from sqlalchemy import inspect
from .database import ENGINE, run_query
from .openai_utils import text_to_sql

app = FastAPI(title="Text-to-SQL Demo")

class NLRequest(BaseModel):
    question: str

@app.on_event("startup")
def capture_schema() -> None:
    """Build a compact schema description once, when the app starts."""
    insp = inspect(ENGINE)
    global SCHEMA_STR
    SCHEMA_STR = "\n".join(
        f"CREATE TABLE {t} ({', '.join(c['name'] for c in insp.get_columns(t))});"
        for t in insp.get_table_names()
    )

@app.post("/query")
def query(req: NLRequest):
    """Translate the question to SQL, allow only SELECT, and return the rows."""
    try:
        sql = text_to_sql(req.question, SCHEMA_STR)
        if not sql.lstrip().lower().startswith("select"):
            raise ValueError("Only SELECT statements are allowed")
        return {"sql": sql, "result": run_query(sql)}
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))
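The prefix check in the endpoint only looks at how the statement starts, so it rejects read-only queries that begin with `WITH` and does not inspect the rest of the statement. A somewhat stricter check could be sketched as follows. This is an illustrative assumption, not the article's implementation: `is_read_only` is a hypothetical name, and the keyword list is deliberately conservative, so it may also reject harmless queries (for example, ones that call SQLite's `replace()` function or contain a semicolon inside a string literal).

```python
import re

# Keywords that modify data or schema. Illustrative and not exhaustive.
_FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|replace|attach|detach|pragma|vacuum)\b",
    re.IGNORECASE,
)


def is_read_only(sql: str) -> bool:
    """Return True only for a single SELECT (or WITH ... SELECT) statement."""
    statement = sql.strip().rstrip(";").strip()
    # Reject empty input and anything that still contains a statement separator.
    if not statement or ";" in statement:
        return False
    # The statement must start with SELECT or a WITH clause.
    if not re.match(r"(select|with)\b", statement, re.IGNORECASE):
        return False
    # Reject statements that mention a write or schema keyword anywhere.
    return _FORBIDDEN.search(statement) is None
```

In the endpoint, the `startswith` condition could then become `if not is_read_only(sql):`. A keyword filter like this is best treated as one layer among several, not as a complete defense.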
That’s everything we need for the main application. Next, we’ll prepare the database. Use the script below in init_db.sql as an example, but you can always change it if you want.
DROP TABLE IF EXISTS order_items;
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS payments;
DROP TABLE IF EXISTS products;
DROP TABLE IF EXISTS customers;
CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    country TEXT,
    signup_date DATE
);
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    category TEXT,
    price REAL
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    order_date DATE,
    total REAL,
    FOREIGN KEY (customer_id) REFERENCES customers(id)
);
CREATE TABLE order_items (
    order_id INTEGER,
    product_id INTEGER,
    quantity INTEGER,
    unit_price REAL,
    PRIMARY KEY (order_id, product_id),
    FOREIGN KEY (order_id) REFERENCES orders(id),
    FOREIGN KEY (product_id) REFERENCES products(id)
);
CREATE TABLE payments (
    id INTEGER PRIMARY KEY,
    order_id INTEGER,
    payment_date DATE,
    amount REAL,
    method TEXT,
    FOREIGN KEY (order_id) REFERENCES orders(id)
);
INSERT INTO customers (id, name, country, signup_date) VALUES
  (1,'Alice','USA','2024-01-05'),
  (2,'Bob','UK','2024-03-10'),
  (3,'Choi','KR','2024-06-22'),
  (4,'Dara','ID','2025-01-15');
INSERT INTO products (id, name, category, price) VALUES
  (1,'Laptop Pro','Electronics',1500.00),
  (2,'Noise-Canceling Headphones','Electronics',300.00),
  (3,'Standing Desk','Furniture',450.00),
  (4,'Ergonomic Chair','Furniture',250.00),
  (5,'Monitor 27"','Electronics',350.00);
INSERT INTO orders (id, customer_id, order_date, total) VALUES
  (1,1,'2025-02-01',1850.00),
  (2,2,'2025-02-03',600.00),
  (3,3,'2025-02-05',350.00),
  (4,1,'2025-02-07',450.00);
INSERT INTO order_items (order_id, product_id, quantity, unit_price) VALUES
  (1,1,1,1500.00),
  (1,2,1,300.00),
  (1,5,1,350.00),
  (2,3,1,450.00),
  (2,4,1,250.00),
  (3,5,1,350.00),
  (4,3,1,450.00);
INSERT INTO payments (id, order_id, payment_date, amount, method) VALUES
  (1,1,'2025-02-01',1850.00,'Credit Card'),
  (2,2,'2025-02-03',600.00,'PayPal'),
  (3,3,'2025-02-05',350.00,'Credit Card'),
  (4,4,'2025-02-07',450.00,'Bank Transfer');
Then, run the following command in your CLI to create the SQLite database for our project.
sqlite3 demo.db < init_db.sql
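If the `sqlite3` command-line tool is not installed, the same database could be created from Python, since the `sqlite3` module ships with the standard library. The sketch below is one possible equivalent of the command above; `init_db` is a hypothetical helper name, and the default file names are taken from the project structure.

```python
import sqlite3
from pathlib import Path


def init_db(db_path: str = "demo.db", script_path: str = "init_db.sql") -> None:
    """Create (or reset) the SQLite database from a SQL script file.

    Hypothetical helper mirroring `sqlite3 demo.db < init_db.sql`.
    """
    script = Path(script_path).read_text(encoding="utf-8")
    conn = sqlite3.connect(db_path)
    try:
        conn.executescript(script)  # runs every statement in the file, in order
        conn.commit()
    finally:
        conn.close()
```

Calling `init_db()` from the project root should produce the same demo.db file. Because the script starts with `DROP TABLE IF EXISTS`, running it again resets the data.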
With the database ready, we’ll create a Dockerfile to containerize our application.
FROM python:3.12-slim
WORKDIR /code
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
We will also create a docker-compose.yml file to run the application more smoothly.
services:
  text2sql:
    build: .
    env_file: .env
    ports:
      - "8000:8000"
    restart: unless-stopped
    volumes:
      - ./demo.db:/code/demo.db
With everything ready, start Docker Desktop and run the following commands to build and start the application.
docker compose build --no-cache
docker compose up -d
If everything went well, you can test the application with the following command. We’ll ask how many customers we have in the data.
curl -X POST "http://localhost:8000/query" -H "Content-Type: application/json" -d "{\"question\":\"How many customers?\"}"
The output will look like this.
{"sql":"SELECT COUNT(*) AS customer_count FROM customers;","result":[{"customer_count":4}]}
We can try something more complex, like the number of orders for each customer:
curl -X POST "http://localhost:8000/query" -H "Content-Type: application/json" -d "{\"question\":\"What is the number of orders placed by each customer\"}"
With output like below.
{"sql":"SELECT customer_id, COUNT(*) AS number_of_orders FROM orders GROUP BY customer_id;","result":[{"customer_id":1,"number_of_orders":2},{"customer_id":2,"number_of_orders":1},{"customer_id":3,"number_of_orders":1}]}
That’s all you need to build a basic Text-to-SQL application. You can enhance it further with a front-end interface and a more complex system tailored to your needs.
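As a first step toward a front end, the curl calls could also be made from Python. The sketch below uses only the standard library and assumes the service is reachable at `http://localhost:8000/query` and expects a JSON body with a `question` field, as in the curl examples; `build_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request

# Endpoint assumed from the curl examples; adjust if the port mapping differs.
API_URL = "http://localhost:8000/query"


def build_request(question: str, url: str = API_URL) -> urllib.request.Request:
    """Build the POST request that carries the natural-language question."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, url: str = API_URL) -> dict:
    """Send the question to the running service and return the decoded JSON."""
    with urllib.request.urlopen(build_request(question, url), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the container running, `ask("How many customers?")` should return a dictionary with the generated SQL and the query result, matching the shape of the curl output.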
# Wrapping Up
Data is at the heart of any data work, and companies use it to make decisions. Many times, the system we have is too complex, and we need to rely on generative AI to help us navigate it.
In this article, we have learned how to develop a simple Text-to-SQL application using the OpenAI model, FastAPI, and SQLite.
I hope this has helped!
Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.

