Common Table Expressions (CTEs) in IBM Db2 | Generated by AI
Common Table Expressions (CTEs) are temporary result sets that exist only within the execution scope of a single SQL statement. They’re particularly useful for improving code readability and working with recursive queries. Let me explain how CTEs work in IBM Db2 with examples.
Basic CTE Syntax
The basic syntax for a CTE in Db2 follows this pattern:
WITH cte_name (column_list) AS (
SELECT statement
)
SELECT * FROM cte_name;
Your example demonstrates a multi-level CTE that:
- Creates
regional_sales
to calculate total sales by region - Uses that to create
top_regions
containing the top 3 regions by sales - Finally joins back to the original orders table to get orders over $1000 from those top regions
Simple CTE Example
Let’s start with a basic example:
WITH employee_salaries AS (
SELECT department, AVG(salary) as avg_salary
FROM employees
GROUP BY department
)
SELECT department, avg_salary
FROM employee_salaries
WHERE avg_salary > 50000
ORDER BY avg_salary DESC;
This query calculates the average salary by department and then filters to show only departments with average salaries above $50,000.
Multiple CTEs
As shown in your example, you can define multiple CTEs separated by commas:
WITH dept_summary AS (
SELECT department, COUNT(*) as emp_count
FROM employees
GROUP BY department
),
salary_summary AS (
SELECT department, AVG(salary) as avg_salary
FROM employees
GROUP BY department
)
SELECT d.department, d.emp_count, s.avg_salary
FROM dept_summary d
JOIN salary_summary s ON d.department = s.department
ORDER BY emp_count DESC;
Recursive CTEs
One of the most powerful features of CTEs in Db2 is recursive queries. Here’s an example to generate a sequence of dates:
WITH RECURSIVE date_sequence (date_value) AS (
-- Anchor member
SELECT DATE '2025-01-01' FROM SYSIBM.SYSDUMMY1
UNION ALL
-- Recursive member
SELECT date_value + 1 DAY
FROM date_sequence
WHERE date_value < DATE '2025-01-10'
)
SELECT date_value FROM date_sequence;
This will generate dates from January 1, 2025, to January 10, 2025.
Employee Hierarchy Example
A classic use of recursive CTEs is traversing hierarchical data:
WITH RECURSIVE emp_hierarchy (emp_id, name, manager_id, level) AS (
-- Anchor: Select the CEO (employee with no manager)
SELECT emp_id, name, manager_id, 1 as level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive: Join to get direct reports
SELECT e.emp_id, e.name, e.manager_id, h.level + 1
FROM employees e
JOIN emp_hierarchy h ON e.manager_id = h.emp_id
)
SELECT emp_id, name, level,
REPEAT(' ', level-1) || name as org_chart
FROM emp_hierarchy
ORDER BY level, name;
This query builds an organizational chart showing the hierarchy of employees.
CTEs for Data Analysis
CTEs are excellent for breaking down complex analytical queries:
WITH monthly_sales AS (
SELECT MONTH(order_date) as month,
SUM(amount) as total_sales
FROM orders
WHERE YEAR(order_date) = 2024
GROUP BY MONTH(order_date)
),
monthly_growth AS (
SELECT month,
total_sales,
LAG(total_sales) OVER (ORDER BY month) as prev_month_sales
FROM monthly_sales
)
SELECT month,
total_sales,
prev_month_sales,
(total_sales - prev_month_sales) as sales_change,
CASE
WHEN prev_month_sales IS NULL THEN NULL
ELSE DECIMAL((total_sales - prev_month_sales) * 100.0 / prev_month_sales, 5, 2)
END as growth_percentage
FROM monthly_growth
ORDER BY month;
This example calculates month-over-month sales growth using CTEs to break the calculation into clear steps.
Performance Considerations
In Db2, CTEs are materialized (stored in memory) if they’re referenced multiple times in the query. Some tips:
- If a CTE is used only once, the optimizer typically inlines it (treats it like a subquery)
- For complex queries, CTEs can improve performance by avoiding redundant calculations
- Use the EXPLAIN command to understand how Db2 is processing your CTE queries
- Consider creating indexes on commonly joined or filtered columns in the base tables
Db2-Specific CTE Features
IBM Db2 supports some specific features for CTEs:
-- Using XMLTABLE with a CTE
WITH xml_data(doc) AS (
SELECT XMLPARSE(DOCUMENT '<root><item id="1">Apple</item><item id="2">Orange</item></root>')
FROM SYSIBM.SYSDUMMY1
)
SELECT x.id, x.name
FROM xml_data,
XMLTABLE('//item' PASSING doc
COLUMNS id INTEGER PATH '@id',
name VARCHAR(50) PATH 'text()') AS x;
Using CTEs with INSERT, UPDATE, and DELETE
CTEs can be used with modification statements in Db2:
-- Using a CTE with INSERT
WITH top_customers AS (
SELECT customer_id, SUM(amount) as total_spent
FROM orders
GROUP BY customer_id
ORDER BY total_spent DESC
FETCH FIRST 100 ROWS ONLY
)
INSERT INTO premium_customers (customer_id, total_spent, join_date)
SELECT customer_id, total_spent, CURRENT DATE
FROM top_customers;
-- Using a CTE with UPDATE
WITH avg_dept_salaries AS (
SELECT department, AVG(salary) as dept_avg
FROM employees
GROUP BY department
)
UPDATE employees e
SET salary = salary * 1.1
WHERE EXISTS (
SELECT 1 FROM avg_dept_salaries a
WHERE e.department = a.department
AND a.dept_avg < 50000
);
These examples show how CTEs can simplify data manipulation operations by clearly separating the data selection logic from the actual modification.
Would you like me to elaborate further on any specific aspect of CTEs in Db2?