Common Table Expressions (CTEs) in IBM Db2 | Generated by AI

Home PDF

Common Table Expressions (CTEs) are temporary result sets that exist only within the execution scope of a single SQL statement. They’re particularly useful for improving code readability and working with recursive queries. Let me explain how CTEs work in IBM Db2 with examples.

Basic CTE Syntax

The basic syntax for a CTE in Db2 follows this pattern:

WITH cte_name (column_list) AS (
    SELECT statement
)
SELECT * FROM cte_name;

Your example demonstrates a multi-level CTE that:

  1. Creates regional_sales to calculate total sales by region
  2. Uses that to create top_regions containing the top 3 regions by sales
  3. Finally joins back to the original orders table to get orders over $1000 from those top regions

Simple CTE Example

Let’s start with a basic example:

WITH employee_salaries AS (
    SELECT department, AVG(salary) as avg_salary
    FROM employees
    GROUP BY department
)
SELECT department, avg_salary
FROM employee_salaries
WHERE avg_salary > 50000
ORDER BY avg_salary DESC;

This query calculates the average salary by department and then filters to show only departments with average salaries above $50,000.

Multiple CTEs

As shown in your example, you can define multiple CTEs separated by commas:

WITH dept_summary AS (
    SELECT department, COUNT(*) as emp_count
    FROM employees
    GROUP BY department
),
salary_summary AS (
    SELECT department, AVG(salary) as avg_salary
    FROM employees
    GROUP BY department
)
SELECT d.department, d.emp_count, s.avg_salary
FROM dept_summary d
JOIN salary_summary s ON d.department = s.department
ORDER BY emp_count DESC;

Recursive CTEs

One of the most powerful features of CTEs in Db2 is recursive queries. Here’s an example to generate a sequence of dates:

WITH RECURSIVE date_sequence (date_value) AS (
    -- Anchor member
    SELECT DATE '2025-01-01' FROM SYSIBM.SYSDUMMY1
    UNION ALL
    -- Recursive member
    SELECT date_value + 1 DAY
    FROM date_sequence
    WHERE date_value < DATE '2025-01-10'
)
SELECT date_value FROM date_sequence;

This will generate dates from January 1, 2025, to January 10, 2025.

Employee Hierarchy Example

A classic use of recursive CTEs is traversing hierarchical data:

WITH RECURSIVE emp_hierarchy (emp_id, name, manager_id, level) AS (
    -- Anchor: Select the CEO (employee with no manager)
    SELECT emp_id, name, manager_id, 1 as level
    FROM employees
    WHERE manager_id IS NULL
    
    UNION ALL
    
    -- Recursive: Join to get direct reports
    SELECT e.emp_id, e.name, e.manager_id, h.level + 1
    FROM employees e
    JOIN emp_hierarchy h ON e.manager_id = h.emp_id
)
SELECT emp_id, name, level, 
       REPEAT('    ', level-1) || name as org_chart
FROM emp_hierarchy
ORDER BY level, name;

This query builds an organizational chart showing the hierarchy of employees.

CTEs for Data Analysis

CTEs are excellent for breaking down complex analytical queries:

WITH monthly_sales AS (
    SELECT MONTH(order_date) as month, 
           SUM(amount) as total_sales
    FROM orders
    WHERE YEAR(order_date) = 2024
    GROUP BY MONTH(order_date)
),
monthly_growth AS (
    SELECT month, 
           total_sales,
           LAG(total_sales) OVER (ORDER BY month) as prev_month_sales
    FROM monthly_sales
)
SELECT month, 
       total_sales, 
       prev_month_sales,
       (total_sales - prev_month_sales) as sales_change,
       CASE 
           WHEN prev_month_sales IS NULL THEN NULL
           ELSE DECIMAL((total_sales - prev_month_sales) * 100.0 / prev_month_sales, 5, 2)
       END as growth_percentage
FROM monthly_growth
ORDER BY month;

This example calculates month-over-month sales growth using CTEs to break the calculation into clear steps.

Performance Considerations

In Db2, CTEs are materialized (stored in memory) if they’re referenced multiple times in the query. Some tips:

  1. If a CTE is used only once, the optimizer typically inlines it (treats it like a subquery)
  2. For complex queries, CTEs can improve performance by avoiding redundant calculations
  3. Use the EXPLAIN command to understand how Db2 is processing your CTE queries
  4. Consider creating indexes on commonly joined or filtered columns in the base tables

Db2-Specific CTE Features

IBM Db2 supports some specific features for CTEs:

-- Using XMLTABLE with a CTE
WITH xml_data(doc) AS (
    SELECT XMLPARSE(DOCUMENT '<root><item id="1">Apple</item><item id="2">Orange</item></root>')
    FROM SYSIBM.SYSDUMMY1
)
SELECT x.id, x.name
FROM xml_data,
     XMLTABLE('//item' PASSING doc
              COLUMNS id INTEGER PATH '@id',
                      name VARCHAR(50) PATH 'text()') AS x;

Using CTEs with INSERT, UPDATE, and DELETE

CTEs can be used with modification statements in Db2:

-- Using a CTE with INSERT
WITH top_customers AS (
    SELECT customer_id, SUM(amount) as total_spent
    FROM orders
    GROUP BY customer_id
    ORDER BY total_spent DESC
    FETCH FIRST 100 ROWS ONLY
)
INSERT INTO premium_customers (customer_id, total_spent, join_date)
SELECT customer_id, total_spent, CURRENT DATE
FROM top_customers;

-- Using a CTE with UPDATE
WITH avg_dept_salaries AS (
    SELECT department, AVG(salary) as dept_avg
    FROM employees
    GROUP BY department
)
UPDATE employees e
SET salary = salary * 1.1
WHERE EXISTS (
    SELECT 1 FROM avg_dept_salaries a
    WHERE e.department = a.department
    AND a.dept_avg < 50000
);

These examples show how CTEs can simplify data manipulation operations by clearly separating the data selection logic from the actual modification.

Would you like me to elaborate further on any specific aspect of CTEs in Db2?


Back 2025.03.24 Donate