Using AI to Build SQL Queries


Database Schema (PostgreSQL):

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    first_name VARCHAR(100),
    last_name VARCHAR(100),
    date_of_birth DATE,
    gender VARCHAR(10),
    phone_number VARCHAR(20),
    address TEXT,
    city VARCHAR(100),
    state VARCHAR(100),
    country VARCHAR(100),
    zip_code VARCHAR(20),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE,
    last_login TIMESTAMP WITH TIME ZONE,
    is_active BOOLEAN DEFAULT TRUE,
    role VARCHAR(50),
    profile_picture_url VARCHAR(255),
    bio TEXT,
    failed_login_attempts INTEGER DEFAULT 0,
    account_balance NUMERIC(10, 2) DEFAULT 0.00
);

Above is the PostgreSQL table definition. Below is the natural-language query instruction given to the AI.

I want to get some users whose first name is Andy, whose created_at is around the same day last month, who are 20 years old as of 2025, and whose last_login is within the recent week. Give me the SQL to query such users.


Generated SQL Query:

SELECT *
FROM users
WHERE first_name ILIKE 'andy'
  AND CAST(created_at AS DATE) BETWEEN 
      (CURRENT_DATE - INTERVAL '1 MONTH' - INTERVAL '1 DAY') 
      AND 
      (CURRENT_DATE - INTERVAL '1 MONTH' + INTERVAL '1 DAY')
  AND EXTRACT(YEAR FROM AGE(date_of_birth)) = 20
  AND last_login >= CURRENT_TIMESTAMP - INTERVAL '7 DAYS';
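One caveat: AGE(date_of_birth) with a single argument measures age as of the current date, so the generated query really means "20 years old today", not "as of 2025". If the fixed-year reading is intended, a variant like the following pins the reference date instead. This is a sketch under an assumption: it reads "as of 2025" as age measured at 2025-12-31.

SELECT *
FROM users
WHERE first_name ILIKE 'andy'
  AND CAST(created_at AS DATE) BETWEEN
      (CURRENT_DATE - INTERVAL '1 month' - INTERVAL '1 day')
      AND
      (CURRENT_DATE - INTERVAL '1 month' + INTERVAL '1 day')
  -- age computed against a fixed reference date instead of CURRENT_DATE
  AND EXTRACT(YEAR FROM AGE(DATE '2025-12-31', date_of_birth)) = 20
  AND last_login >= CURRENT_TIMESTAMP - INTERVAL '7 days';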

Below is a PostgreSQL table definition for a conversations table that could be used to store user interactions with an AI system like Grok. The table is designed to track conversations, including the user involved, the conversation content, timestamps, and relevant metadata.

CREATE TABLE conversations (
    id SERIAL PRIMARY KEY,
    user_id INTEGER NOT NULL,
    title VARCHAR(255),
    conversation_type VARCHAR(50) DEFAULT 'chat',
    started_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    ended_at TIMESTAMP WITH TIME ZONE,
    status VARCHAR(50) DEFAULT 'active',
    message_count INTEGER DEFAULT 0,
    last_message_at TIMESTAMP WITH TIME ZONE,
    is_archived BOOLEAN DEFAULT FALSE,
    ai_model_version VARCHAR(100),
    total_tokens_used INTEGER DEFAULT 0,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE,
    FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);

Explanation:

This table is designed to be flexible and can be extended based on specific needs, such as storing individual messages in a separate messages table or adding fields for sentiment analysis or conversation tags. Let me know if you need a related table (e.g., for messages) or a query to interact with this table!
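As a starting point, a minimal sketch of such a messages table might look like the following. The exact columns are assumptions for illustration, not part of the schema above:

CREATE TABLE messages (
    id SERIAL PRIMARY KEY,
    conversation_id INTEGER NOT NULL,
    sender VARCHAR(20) NOT NULL,          -- assumed values: 'user' or 'assistant'
    content TEXT NOT NULL,
    token_count INTEGER DEFAULT 0,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);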


To enable a frontend where users can query their own conversations using natural language (e.g., “give me my latest 5 conversations” or “conversations whose title has word about java”), you need to ensure the SQL query restricts results to the authenticated user’s conversations. This involves incorporating the user’s identity (e.g., user_id) into the query and translating the natural language into SQL conditions. Below are prompts and corresponding SQL queries for the conversations table, ensuring users can only access their own data.

Key Considerations

  • Every query must be scoped to the authenticated user's user_id; that filter comes from the session, never from user input.
  • Translate natural language into a small, fixed set of SQL patterns (sort, limit, keyword filter) rather than executing free-form SQL.
  • Use parameterized queries throughout to prevent SQL injection.

Natural Language Prompts and SQL Queries

Prompt 1: “Give me my latest 5 conversations”

Parsed Requirements:

  • Scope: only the authenticated user's conversations (user_id filter).
  • Sort: "latest" means newest first, ordered by started_at DESC.
  • Limit: return at most 5 rows.

SQL Query:

SELECT *
FROM conversations
WHERE user_id = :authenticated_user_id
ORDER BY started_at DESC
LIMIT 5;

Prompt 2: “Conversations whose title has word about java”

Parsed Requirements:

  • Scope: only the authenticated user's conversations (user_id filter).
  • Filter: title contains the word "java", matched case-insensitively (ILIKE).

SQL Query:

SELECT *
FROM conversations
WHERE user_id = :authenticated_user_id
  AND title ILIKE '%java%';

Implementation Notes for the Frontend

  1. User Authentication:
    • The frontend must send the authenticated user’s ID (e.g., from a JWT token or session) to the backend API.
    • The backend should validate the user_id and use it in the WHERE clause to filter results.
  2. Natural Language Processing:
    • Use a simple rule-based parser or an AI model (like Grok) to interpret the natural language query and extract:
      • Intent: List conversations.
      • Filters: Keywords (e.g., “java”), sorting (e.g., “latest”), or limits (e.g., “5”).
    • For example, map “latest 5” to ORDER BY started_at DESC LIMIT 5 and “title has word java” to title ILIKE '%java%', as in the parser sketch after this list.
  3. API Design:
    • Create an endpoint (e.g., /api/conversations) that accepts:
      • user_id: From the authenticated session.
      • query: The natural language input or structured parameters (e.g., { filter: "title contains java" }).
    • The backend translates the query into SQL, ensuring user_id is always included in the WHERE clause.
  4. Security:
    • Never allow users to directly input SQL or modify the user_id filter.
    • Use parameterized queries (e.g., :authenticated_user_id) to prevent SQL injection; see the node-postgres sketch after this list.
    • Validate inputs to avoid malicious patterns in ILIKE searches.
  5. Frontend Display:
    • Return fields like id, title, started_at, last_message_at, and status to display a conversation list.
    • Allow sorting or filtering options in the UI (e.g., dropdowns for “latest” or text input for title search).
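For note 2, a minimal rule-based parser could look like the TypeScript sketch below. The ParsedQuery shape and parseQuery name are illustrative assumptions, not an established API; a real system would cover more phrasings or hand off to an AI model.

// Minimal rule-based parser sketch: maps a natural-language query to
// structured filter parameters. Names here are assumptions for this example.
interface ParsedQuery {
  limit?: number;
  orderBy?: "started_at DESC";
  titleContains?: string;
}

function parseQuery(input: string): ParsedQuery {
  const parsed: ParsedQuery = {};
  // "latest 5" -> order newest-first and cap the row count
  const latest = input.match(/latest\s+(\d+)/i);
  if (latest) {
    parsed.orderBy = "started_at DESC";
    parsed.limit = Number(latest[1]);
  }
  // "title has word java" / "title contains java" -> title keyword filter
  const title = input.match(/title\s+(?:has\s+word|contains)\s+(\w+)/i);
  if (title) parsed.titleContains = title[1];
  return parsed;
}

// Example: parseQuery("give me my latest 5 conversations")
// -> { orderBy: "started_at DESC", limit: 5 }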
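For note 4, here is a hedged sketch of a parameterized query using node-postgres (the pg package). Note that pg uses positional $1/$2 placeholders rather than the named :authenticated_user_id style shown above; the pool configuration and function name are assumptions.

import { Pool } from "pg";

const pool = new Pool(); // connection settings read from environment variables

// user_id is always bound server-side from the authenticated session;
// the $1/$2 placeholders keep user input out of the SQL text entirely.
async function searchConversationsByTitle(userId: number, keyword: string) {
  // In production you would also escape % and _ inside keyword,
  // per the ILIKE validation note above.
  const result = await pool.query(
    `SELECT id, title, started_at, last_message_at, status
     FROM conversations
     WHERE user_id = $1
       AND title ILIKE '%' || $2 || '%'
     ORDER BY started_at DESC`,
    [userId, keyword]
  );
  return result.rows;
}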

Example Workflow

  1. User logs in, and the frontend retrieves their user_id (e.g., 123).
  2. User types “give me my latest 5 conversations” in a search bar.
  3. Frontend sends a request to the backend, e.g. /api/conversations?query=latest 5; the user’s identity travels in the session or JWT, not as a user-editable parameter.
  4. Backend parses the query, generates the SQL:
    SELECT *
    FROM conversations
    WHERE user_id = 123
    ORDER BY started_at DESC
    LIMIT 5;
    
  5. Backend returns the results as JSON, and the frontend displays the conversations.
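To make this workflow concrete, below is a minimal Express endpoint sketch in TypeScript. It is a sketch under stated assumptions, not a definitive implementation: the x-user-id header is only a stand-in for real auth middleware (JWT/session verification), and the parsing handles only the “latest N” case.

import express from "express";
import { Pool } from "pg";

const app = express();
const pool = new Pool(); // connection settings read from environment variables

app.get("/api/conversations", async (req, res) => {
  // Stand-in for real auth middleware: in production the user id would be
  // extracted from a verified JWT or session, never trusted from the client.
  const userId = Number(req.header("x-user-id"));
  if (!userId) {
    res.status(401).json({ error: "not authenticated" });
    return;
  }

  // Map "latest N" to a row limit; default to 20 if no limit was given.
  const match = String(req.query.query ?? "").match(/latest\s+(\d+)/i);
  const limit = match ? Number(match[1]) : 20;

  // user_id is always bound server-side, so a user can only ever
  // see their own conversations.
  const result = await pool.query(
    `SELECT id, title, started_at, last_message_at, status
     FROM conversations
     WHERE user_id = $1
     ORDER BY started_at DESC
     LIMIT $2`,
    [userId, limit]
  );
  res.json(result.rows);
});

app.listen(3000);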

Additional Notes

Let me know if you need help with the frontend code, backend API setup, or additional query examples!

