Persistent Storage Versus Snapshotting for Data | Generated by AI
Question: Would using an attached block storage volume make snapshot creation faster compared to snapshotting the root disk?
Answer:
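Not by itself: creating a snapshot takes about as long either way. The real benefit of a volume is that you may not need a snapshot at all. Here is the breakdown: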
Block Storage Volume Snapshots vs Droplet Snapshots
| | Droplet Snapshot | Volume Snapshot |
|---|---|---|
| What’s captured | Entire root disk | Just the volume |
| Speed to create | Slow (full disk) | Also slow (same mechanism) |
| Speed to restore | Fast (CoW lazy) | Also fast (CoW lazy) |
| Cost | $0.06/GB/month | $0.06/GB/month |
| Reattach to new droplet | ❌ Not directly | ✅ Yes, detach/reattach |
So snapshot creation speed is roughly the same, since both go through the same copy-on-write (CoW) block storage backend.
Where Block Storage Volume Actually Wins for Your Workflow
The real advantage is persistence and reattachability:
Droplet Snapshot workflow (current):
destroy droplet → snapshot stored → create new droplet → lazy hydrate
Block Storage Volume workflow:
destroy droplet → volume persists → create new droplet → attach volume → done
With a volume, no snapshot needed at all between sessions:
# End of session
doctl compute droplet delete amd-gpu --force
# Volume still exists, data intact, no snapshot creation time
# Next session (the size slug below is illustrative; confirm it with `doctl compute size list`)
doctl compute droplet create amd-gpu \
--image <base-image-id> \
--size gpu-h100x80-1 \
--region ... \
--ssh-keys ...
# Then attach the volume (it must be in the same region as the droplet)
doctl compute volume-action attach <volume-id> <new-droplet-id>
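The commands above leave `<new-droplet-id>` to be looked up by hand. A minimal sketch of capturing it in a script, assuming doctl's `--wait`, `--format ID`, and `--no-header` flags on `droplet create`. The function name `start_session` and the `DOCTL` override variable are hypothetical, and every argument is a placeholder you supply:

```bash
#!/usr/bin/env bash
# Sketch only: create a droplet, capture its ID, attach an existing volume.
set -euo pipefail

# Hypothetical override so the commands can be previewed or stubbed.
DOCTL="${DOCTL:-doctl}"

start_session() {
  local name="$1" image="$2" size="$3" region="$4" ssh_keys="$5" volume_id="$6"
  local droplet_id

  # --wait blocks until the droplet is active; --format/--no-header print only the ID.
  droplet_id="$("$DOCTL" compute droplet create "$name" \
    --image "$image" --size "$size" --region "$region" --ssh-keys "$ssh_keys" \
    --wait --format ID --no-header)"

  # Attach output goes to stderr so stdout carries only the droplet ID.
  "$DOCTL" compute volume-action attach "$volume_id" "$droplet_id" >&2

  echo "$droplet_id"
}
```

A call would look like `start_session amd-gpu <base-image-id> <size-slug> <region> <ssh-key-id> <volume-id>`, with the printed ID saved for the end of the session.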
What to Put Where
Root disk (small, cheap droplet snapshot):
- OS
- conda/pip environments
- dotfiles, SSH keys, tmux config
- your CLI tools (ww, zz, iclaw)
Block storage volume (persistent, never snapshotted):
- Model weights (70B models, checkpoints)
- Datasets
- Training outputs
- Git repos with large artifacts
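One way to keep this split without changing the paths your tools expect is to move a large directory onto the volume once and leave a symlink in its place on the root disk. A rough sketch, where `relocate_to_volume` is a hypothetical helper and both paths are whatever you choose:

```bash
#!/usr/bin/env bash
# Sketch only: move a large directory onto the volume and leave a symlink behind.
set -euo pipefail

relocate_to_volume() {
  local src="$1" volume_root="$2"
  local dest
  dest="$volume_root/$(basename "$src")"

  # Already relocated in an earlier session: nothing to do.
  if [ -L "$src" ]; then return 0; fi

  # Refuse to overwrite data that already lives on the volume.
  if [ -e "$dest" ]; then
    echo "refusing to overwrite $dest" >&2
    return 1
  fi

  mkdir -p "$volume_root"
  mv "$src" "$dest"
  ln -s "$dest" "$src"
}
```

For example, `relocate_to_volume "$HOME/models" /mnt/data` would leave `~/models` pointing at `/mnt/data/models`. The symlink is captured in the small root snapshot, while the data stays on the volume.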
The Optimal Workflow for You
Session start (~2 min total):
1. Create droplet from small root snapshot (fast, lazy CoW)
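2. Attach the persistent volume (takes seconds; no data is copied)
3. Mount it on the droplet: mount /dev/sda /mnt/data (confirm the device name with lsblk first, since it can differ)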
4. All your models/datasets immediately available
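The mount step can be made less dependent on device ordering. The sketch below assumes DigitalOcean exposes attached volumes under `/dev/disk/by-id/scsi-0DO_Volume_<volume-name>` and that the volume already carries a filesystem. The helper names `volume_device` and `mount_volume` are hypothetical, and the script runs as root on the droplet:

```bash
#!/usr/bin/env bash
# Sketch only: mount an attached volume by its stable by-id path.
set -euo pipefail

# Assumption: volumes appear as /dev/disk/by-id/scsi-0DO_Volume_<name>.
volume_device() {
  echo "/dev/disk/by-id/scsi-0DO_Volume_$1"
}

mount_volume() {
  local volume_name="$1" mount_point="$2"
  local device
  device="$(volume_device "$volume_name")"

  # Fail early if the volume is not attached yet.
  if [ ! -e "$device" ]; then
    echo "device $device not found; is the volume attached?" >&2
    return 1
  fi

  mkdir -p "$mount_point"
  # Skip the mount if something is already mounted there.
  mountpoint -q "$mount_point" || mount -o defaults,noatime "$device" "$mount_point"
}
```

Calling `mount_volume <volume-name> /mnt/data` after the attach completes would replace step 3.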
Session end:
1. Save any new model checkpoints to volume
2. Destroy droplet (no snapshot needed)
3. Volume persists, billed at ~$0.10/GB/month at rest
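The session-end steps might be scripted as follows. This is a sketch under assumptions: `end_session` and the `DOCTL` override are hypothetical names, the `--wait` flag on `volume-action detach` is assumed to block until the detach finishes, and the explicit detach is a precaution rather than something the steps above require:

```bash
#!/usr/bin/env bash
# Sketch only: detach the volume, then delete the droplet.
set -euo pipefail

# Hypothetical override so the commands can be previewed or stubbed.
DOCTL="${DOCTL:-doctl}"

end_session() {
  local droplet_id="$1" volume_id="$2"

  # Run `sync && umount /mnt/data` on the droplet first so pending writes are flushed.
  "$DOCTL" compute volume-action detach "$volume_id" "$droplet_id" --wait
  "$DOCTL" compute droplet delete "$droplet_id" --force
}
```

Detaching before deleting keeps the order explicit: checkpoints are flushed, the volume is released, and only then is the droplet destroyed.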
Cost Reality Check
For 130GB of model weights on a volume:
130GB × $0.10/GB/month = $13/month
vs
130GB snapshot × $0.06/GB/month = $7.80/month
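The same arithmetic as a small script, in case you want to plug in a different size. The rates are the ones quoted above, and `monthly_cost` is a hypothetical helper:

```bash
#!/usr/bin/env bash
# Sketch only: monthly storage cost from a size in GB and a per-GB rate.
set -euo pipefail

monthly_cost() {
  awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", gb * rate }'
}

volume_cost="$(monthly_cost 130 0.10)"    # 13.00
snapshot_cost="$(monthly_cost 130 0.06)"  # 7.80
echo "volume:   \$${volume_cost}/month"
echo "snapshot: \$${snapshot_cost}/month"
```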
The volume costs $5.20/month more, but it saves you the 20+ minutes of snapshot creation every session, and the heavy data is available as soon as the volume is attached, with no hydration lag.
Bottom Line
- Snapshot speed: no real difference between a root disk snapshot and a volume snapshot
- Real win: use volume as persistent storage so you skip snapshotting entirely for your large model weights
- Best setup: thin root disk snapshot (OS + env) + persistent volume (models + data)
This is essentially the same pattern as cloud ML workflows on AWS (EBS volume for datasets, AMI snapshot just for the base environment).