Using AI to Build SQL Queries | Original
Database Schema (PostgreSQL):
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
password_hash VARCHAR(255) NOT NULL,
first_name VARCHAR(100),
last_name VARCHAR(100),
date_of_birth DATE,
gender VARCHAR(10),
phone_number VARCHAR(20),
address TEXT,
city VARCHAR(100),
state VARCHAR(100),
country VARCHAR(100),
zip_code VARCHAR(20),
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP WITH TIME ZONE,
last_login TIMESTAMP WITH TIME ZONE,
is_active BOOLEAN DEFAULT TRUE,
role VARCHAR(50),
profile_picture_url VARCHAR(255),
bio TEXT,
failed_login_attempts INTEGER DEFAULT 0,
account_balance NUMERIC(10, 2) DEFAULT 0.00
);
above is postgre database, below is table definition. here is natural language query instructions.
I want to get some users who’s firstname is andy, created_ at around the day last month, and who are 20 years old as of 2025, and whose last_login like recent week, give me the sql to query such users
Generated SQL Query:
SELECT *
FROM users
WHERE first_name ILIKE 'andy'
AND CAST(created_at AS DATE) BETWEEN
(CURRENT_DATE - INTERVAL '1 MONTH' - INTERVAL '1 DAY')
AND
(CURRENT_DATE - INTERVAL '1 MONTH' + INTERVAL '1 DAY')
AND EXTRACT(YEAR FROM AGE(date_of_birth)) = 20
AND last_login >= CURRENT_TIMESTAMP - INTERVAL '7 DAYS';
Below is a PostgreSQL table definition for a conversations
table that could be used to store user interactions with an AI system like Grok. The table is designed to track conversations, including the user involved, the conversation content, timestamps, and relevant metadata.
CREATE TABLE conversations (
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL,
title VARCHAR(255),
conversation_type VARCHAR(50) DEFAULT 'chat',
started_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
ended_at TIMESTAMP WITH TIME ZONE,
status VARCHAR(50) DEFAULT 'active',
message_count INTEGER DEFAULT 0,
last_message_at TIMESTAMP WITH TIME ZONE,
is_archived BOOLEAN DEFAULT FALSE,
ai_model_version VARCHAR(100),
total_tokens_used INTEGER DEFAULT 0,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP WITH TIME ZONE,
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);
Explanation:
- id: Unique identifier for each conversation.
- user_id: References the
users
table to link the conversation to a specific user. - title: Optional title for the conversation (e.g., user-defined or auto-generated summary).
- conversation_type: Indicates the type of conversation (e.g., ‘chat’, ‘query’, ‘voice’).
- started_at: Timestamp when the conversation began.
- ended_at: Timestamp when the conversation was closed (nullable if ongoing).
- status: Tracks the conversation state (e.g., ‘active’, ‘completed’, ‘paused’).
- message_count: Tracks the number of messages exchanged in the conversation.
- last_message_at: Timestamp of the most recent message.
- is_archived: Boolean to indicate if the conversation is archived.
- ai_model_version: Stores the version of the AI model used (e.g., ‘Grok 3’).
- total_tokens_used: Tracks token usage for the conversation (useful for API billing or analytics).
- created_at and updated_at: Timestamps for when the conversation record was created and last updated.
- FOREIGN KEY: Ensures referential integrity with the
users
table, withON DELETE CASCADE
to remove conversations if the associated user is deleted.
This table is designed to be flexible and can be extended based on specific needs, such as storing individual messages in a separate messages
table or adding fields for sentiment analysis or conversation tags. Let me know if you need a related table (e.g., for messages) or a query to interact with this table!
To enable a frontend where users can query their own conversations using natural language (e.g., “give me my latest 5 conversations” or “conversations whose title has word about java”), you need to ensure the SQL query restricts results to the authenticated user’s conversations. This involves incorporating the user’s identity (e.g., user_id
) into the query and translating the natural language into SQL conditions. Below are prompts and corresponding SQL queries for the conversations
table, ensuring users can only access their own data.
Key Considerations
- User Authentication: The frontend must pass the authenticated
user_id
to the backend to filter conversations. - Natural Language Parsing: The natural language input is parsed to identify key conditions (e.g., “latest 5” or “title has word java”).
- Security: Always filter by
user_id
to prevent users from accessing others’ conversations. - Table Reference: Using the
conversations
table defined earlier.
Natural Language Prompts and SQL Queries
Prompt 1: “Give me my latest 5 conversations”
Parsed Requirements:
- Filter conversations by the authenticated user’s
user_id
. - Order by
started_at
(orlast_message_at
for recency) in descending order. - Limit to 5 results.
SQL Query:
SELECT *
FROM conversations
WHERE user_id = :authenticated_user_id
ORDER BY started_at DESC
LIMIT 5;
:authenticated_user_id
is a placeholder for the logged-in user’s ID, passed securely from the frontend/backend.started_at DESC
ensures the most recent conversations are returned.LIMIT 5
restricts the output to 5 conversations.
Prompt 2: “Conversations whose title has word about java”
Parsed Requirements:
- Filter conversations by the authenticated user’s
user_id
. - Search for conversations where the
title
contains “java” (case-insensitive). - Use
ILIKE
for partial, case-insensitive matching in PostgreSQL.
SQL Query:
SELECT *
FROM conversations
WHERE user_id = :authenticated_user_id
AND title ILIKE '%java%';
:authenticated_user_id
ensures only the user’s conversations are queried.ILIKE '%java%'
matches titles containing “java” (e.g., “Java Tips”, “About Java Programming”).
Implementation Notes for the Frontend
- User Authentication:
- The frontend must send the authenticated user’s ID (e.g., from a JWT token or session) to the backend API.
- The backend should validate the
user_id
and use it in theWHERE
clause to filter results.
- Natural Language Processing:
- Use a simple rule-based parser or an AI model (like Grok) to interpret the natural language query and extract:
- Intent: List conversations.
- Filters: Keywords (e.g., “java”), sorting (e.g., “latest”), or limits (e.g., “5”).
- For example, map “latest 5” to
ORDER BY started_at DESC LIMIT 5
and “title has word java” totitle ILIKE '%java%'
.
- Use a simple rule-based parser or an AI model (like Grok) to interpret the natural language query and extract:
- API Design:
- Create an endpoint (e.g.,
/api/conversations
) that accepts:user_id
: From the authenticated session.query
: The natural language input or structured parameters (e.g.,{ filter: "title contains java" }
).
- The backend translates the query into SQL, ensuring
user_id
is always included in theWHERE
clause.
- Create an endpoint (e.g.,
- Security:
- Never allow users to directly input SQL or modify the
user_id
filter. - Use parameterized queries (e.g.,
:authenticated_user_id
) to prevent SQL injection. - Validate inputs to avoid malicious patterns in
ILIKE
searches.
- Never allow users to directly input SQL or modify the
- Frontend Display:
- Return fields like
id
,title
,started_at
,last_message_at
, andstatus
to display a conversation list. - Allow sorting or filtering options in the UI (e.g., dropdowns for “latest” or text input for title search).
- Return fields like
Example Workflow
- User logs in, and the frontend retrieves their
user_id
(e.g.,123
). - User types “give me my latest 5 conversations” in a search bar.
- Frontend sends a request to the backend:
/api/conversations?user_id=123&query=latest 5
. - Backend parses the query, generates the SQL:
SELECT * FROM conversations WHERE user_id = 123 ORDER BY started_at DESC LIMIT 5;
- Backend returns the results as JSON, and the frontend displays the conversations.
Additional Notes
- If
title
is nullable, addAND title IS NOT NULL
to the second query to avoid matching null titles. - For more complex queries (e.g., “conversations about java from last week”), combine conditions:
SELECT * FROM conversations WHERE user_id = :authenticated_user_id AND title ILIKE '%java%' AND started_at >= CURRENT_TIMESTAMP - INTERVAL '7 DAYS';
- If you need a more advanced natural language parser, consider integrating an AI model (e.g., Grok via xAI’s API) to map complex queries to SQL. For API details, visit xAI API.
Let me know if you need help with the frontend code, backend API setup, or additional query examples!