Explaining MLflow SQLite Queries | Generated by AI

Home 2026.06

Question: Explain these SQL queries used to inspect an MLflow SQLite database.

Answer:

These queries are directly reading the MLflow tracking database (mlflow.db) using SQLite.


1. Calculate Run Duration

(r.end_time - r.start_time)/1000/60 as duration_min
FROM runs r
JOIN experiments e ON r.experiment_id = e.experiment_id
ORDER BY r.start_time;

What it does

MLflow stores timestamps in milliseconds since Unix epoch.

Example:

start_time = 1749500000000
end_time   = 1749503600000

Difference:

3600000 ms

Convert to minutes:

(end_time - start_time)
/1000      -- ms → seconds
/60        -- seconds → minutes

Result:

duration_min = 60

Meaning

Shows how long each training run took.

Example:

run
duration_min
12.3
45.8
102.4

Useful for:


2. Show All Parameters

SELECT run_uuid, key, value
FROM params
ORDER BY run_uuid, key;

MLflow params table

Stores hyperparameters.

Example:

run_uuid key value
abc123 depth 24
abc123 lr 0.001
abc123 batch_size 32

Meaning:

mlflow.log_param("depth", 24)
mlflow.log_param("lr", 0.001)

creates rows in this table.

Typical NanoChat parameters

You may see:

depth=24
window_pattern=L
device_batch_size=32
max_seq_len=2048
num_iterations=5

3. Show Metric History

SELECT
    m.run_uuid,
    substr(r.run_uuid,1,8) as run_short,
    t.value as run_name,
    m.key,
    m.value,
    m.step
FROM metrics m
JOIN runs r ON m.run_uuid = r.run_uuid
LEFT JOIN tags t
  ON t.run_uuid = r.run_uuid
 AND t.key = 'mlflow.runName'
ORDER BY r.run_uuid, m.key, m.step
LIMIT 60;

Tables involved

metrics

Stores every logged metric point.

Example:

run_uuid key value step
abc123 loss 3.2 0
abc123 loss 2.8 1
abc123 loss 2.4 2

runs

Contains run metadata.


tags

Stores extra information.

Example:

mlflow.runName = "depth24_batch32"

What the query produces

Example output:

run_name key value step
depth24 loss 3.2 0
depth24 loss 2.8 1
depth24 loss 2.4 2

This shows the full metric history.

Useful for plotting training curves.


4. Show Latest Metrics Only

SELECT
    t.value as run_name,
    lm.key,
    lm.value as latest_value,
    lm.step
FROM latest_metrics lm
JOIN runs r ON lm.run_uuid = r.run_uuid
LEFT JOIN tags t
  ON t.run_uuid = r.run_uuid
 AND t.key = 'mlflow.runName'
ORDER BY run_name, lm.key;

latest_metrics table

MLflow maintains a special table:

latest_metrics

containing only the newest metric value for each metric.

Instead of:

loss step 0
loss step 1
loss step 2
loss step 3

it keeps:

loss step 3

Example

Metrics history:

step loss
0 4.0
100 3.1
200 2.7
300 2.4

latest_metrics stores:

key value step
loss 2.4 300

Why this is useful

Quickly compare runs:

run_name metric value
run1 loss 2.4
run2 loss 2.1
run3 loss 1.9

No need to scan millions of metric rows.


MLflow Database Structure (Simplified)

experiments
    │
    └── runs
            │
            ├── params
            ├── metrics
            ├── latest_metrics
            └── tags

Relationship

Experiment
   └── Run
         ├── Parameters
         ├── Metrics History
         ├── Latest Metrics
         └── Tags

For NanoChat training runs:

The next step after launching the MLflow UI is usually:

mlflow ui --backend-store-uri sqlite:///mlflow.db

Then open:

http://127.0.0.1:5000

to compare NanoChat runs visually, inspect parameters, and view loss curves without querying SQLite manually.


Back Donate