Remote CUDA Driver Installation Guide | Generated by AI

2026.03

📋 Review: CUDA Driver Installation on Remote Server


Step 1 — Connect & discover hardware

SSH’d into root@47.107.70.224 and ran lscpu, free, lspci, and nvidia-smi. Found 2× RTX 3090 but no driver loaded. Key insight: lspci enumerates devices on the PCIe bus without needing a driver, so it showed the GPUs even though nvidia-smi couldn’t see them.
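A minimal sketch of that driver-independent check. It assumes lspci labels each GPU as a "VGA compatible controller" or "3D controller"; count_nvidia_gpus is a hypothetical helper name, not something from the original session.

```shell
# Count NVIDIA display devices in lspci-style text read from stdin.
# Assumption: GPUs appear as "VGA compatible controller" or "3D controller"
# lines; the GPUs' audio functions are deliberately not counted.
count_nvidia_gpus() {
    grep -i 'nvidia' | grep -Eic 'vga compatible controller|3d controller'
}

# On the server:
#   lspci | count_nvidia_gpus   # PCIe-level view, needs no driver
#   nvidia-smi -L               # lists GPUs only once the driver is loaded
```

If the first command reports GPUs and the second fails, the hardware is present and only the driver is missing.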

Step 2 — Set up passwordless SSH

Used sshpass to copy the local ed25519 public key to ~/.ssh/authorized_keys on the remote. This let me chain multiple commands without re-authenticating. Verified with ssh -o BatchMode=yes.
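Roughly what this step can look like, as a sketch. It assumes ssh-copy-id was the tool sshpass wrapped (the notes above only say sshpass was used) and that the key lives at ~/.ssh/id_ed25519.pub; push_key and check_key_login are hypothetical helper names.

```shell
REMOTE="root@47.107.70.224"   # host from Step 1

# Install the local public key on the remote.
# sshpass -e reads the password from the SSHPASS environment variable,
# which keeps it off the command line.
push_key() {
    sshpass -e ssh-copy-id -i "$HOME/.ssh/id_ed25519.pub" "$1"
}

# BatchMode=yes forbids password prompts, so this succeeds only
# if key-based login really works.
check_key_login() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Usage (run locally):
#   SSHPASS='...' push_key "$REMOTE" && check_key_login "$REMOTE"
```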

Step 3 — Check available drivers

Ran apt update && ubuntu-drivers devices on the remote. This queries Ubuntu’s driver database and lists compatible NVIDIA drivers. The recommended one was nvidia-driver-580-open. Chose nvidia-driver-580 instead (the proprietary kernel modules rather than the open ones), on the assumption that it would give better CUDA compatibility.
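For scripting this step, a small sketch that pulls the recommended package out of the tool’s output. The line format in the comment is an assumption about how ubuntu-drivers devices prints its list, and recommended_driver is a hypothetical helper name.

```shell
# Print the package(s) that `ubuntu-drivers devices` marks as recommended.
# Reads the tool's output on stdin. Assumed line format:
#   driver   : nvidia-driver-580-open - distro non-free recommended
recommended_driver() {
    # sort -u collapses the repeated entries printed once per GPU
    awk '$1 == "driver" && /recommended/ { print $3 }' | sort -u
}

# On the server:
#   ubuntu-drivers devices | recommended_driver
```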

Step 4 — Install NVIDIA driver

DEBIAN_FRONTEND=noninteractive apt install -y nvidia-driver-580 nvidia-utils-580

Step 5 — Install CUDA toolkit (the messy part)

First attempt: apt install cuda-toolkit-12-8 — failed, package not in default Ubuntu repos.

Fixed by adding NVIDIA’s official repo:

wget cuda-keyring_1.1-1_all.deb   # Sets up NVIDIA's apt source + GPG key
dpkg -i cuda-keyring_1.1-1_all.deb
apt update
apt install cuda-toolkit-12-8
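A quick way to confirm that a configured source actually provides the package before installing, sketched under the assumption that apt-cache policy prints a "Candidate:" line for known packages; has_candidate is a hypothetical helper name.

```shell
# Succeeds if `apt-cache policy <pkg>` output (on stdin) shows an installable
# candidate version. apt prints "Candidate: (none)", or nothing at all, when
# no configured source provides the package.
has_candidate() {
    awk '$1 == "Candidate:" { found = ($2 != "(none)") } END { exit !found }'
}

# On the server, after adding the repo and running apt update:
#   apt-cache policy cuda-toolkit-12-8 | has_candidate && echo "ok to install"
```

Running the same check before the first attempt would have flagged the missing repo without starting an install.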

Second attempt: SSH connection dropped mid-install (ssh exited with code 255, which it uses for its own errors such as a lost connection rather than for the remote command’s status). Long-running apt over SSH is risky.

Third attempt: Used nohup to background the install on the remote:

nohup apt install -y cuda-toolkit-12-8 > /tmp/cuda-install.log 2>&1 &

But hit a lock conflict — the earlier apt process (PID 23704) was still alive from the dropped SSH session. The duplicate waited on /var/lib/dpkg/lock-frontend forever.

Fix: Killed the duplicate, waited for the original apt to finish, confirmed with dpkg -l cuda-toolkit-12-8.
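The "wait for the original apt" part can be scripted. This is a sketch, not what was actually typed: wait_for_pid is a hypothetical helper, and it assumes you run it as root (or as the owner of the process), since kill -0 also fails on processes you have no permission to signal.

```shell
# wait_for_pid PID [INTERVAL]: block until PID no longer exists.
# kill -0 sends no signal; it only tests whether the process can be signalled.
wait_for_pid() {
    local pid interval
    pid="$1"
    interval="${2:-5}"
    while kill -0 "$pid" 2>/dev/null; do
        sleep "$interval"
    done
}

# On the server (PID taken from the lock conflict above):
#   wait_for_pid 23704 && dpkg -l cuda-toolkit-12-8
```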

Step 6 — Configure PATH

CUDA installs to /usr/local/cuda-12.8, and its bin directory isn’t in PATH by default. Added to ~/.bashrc:

export PATH=/usr/local/cuda-12.8/bin:$PATH
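# ${VAR:+...} avoids a trailing ':' (an empty entry means "current directory") when LD_LIBRARY_PATH is unset
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}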
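One refinement worth considering: every time .bashrc is re-sourced, plain exports like these prepend the same directory again. A sketch of an idempotent variant follows; prepend_once is a hypothetical helper, not part of CUDA or bash.

```shell
# prepend_once DIR CURRENT: print CURRENT with DIR prepended,
# unless DIR is already one of its colon-separated entries.
prepend_once() {
    local dir current
    dir="$1"
    current="$2"
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;
        *)          printf '%s\n' "$dir${current:+:$current}" ;;
    esac
}

PATH="$(prepend_once /usr/local/cuda-12.8/bin "$PATH")"
LD_LIBRARY_PATH="$(prepend_once /usr/local/cuda-12.8/lib64 "${LD_LIBRARY_PATH:-}")"
export PATH LD_LIBRARY_PATH
```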

🤔 What went wrong / what I’d do differently

| Issue | Root cause | Better approach |
| --- | --- | --- |
| CUDA package not found | Tried installing from Ubuntu’s repos, which don’t carry cuda-toolkit-12-8 | Add NVIDIA’s repo via cuda-keyring first |
| SSH dropped during install | Ran a long apt install in the foreground of a plain SSH session | Use nohup or tmux/screen from the start for any install > 30s |
| Lock conflict | Two apt instances running simultaneously | Check for a still-running apt before retrying (kill it only if it is truly stale), or use flock |
| nvcc not found in non-interactive shell | Ubuntu’s default .bashrc returns early in non-interactive shells, so exports appended to it never run for commands passed to ssh | Use explicit export PATH=... in scripts, or add to /etc/environment |
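For the last row, a sketch of the "explicit export" approach when running one-off commands over SSH. remote_cmd is a hypothetical helper; the CUDA path is the one from Step 6.

```shell
CUDA_BIN="/usr/local/cuda-12.8/bin"

# Build a command string that sets PATH itself instead of relying on the
# remote shell's startup files. $PATH stays literal here (single-quoted
# format string) so that it expands on the remote side, not locally.
remote_cmd() {
    printf 'export PATH=%s:$PATH; %s\n' "$CUDA_BIN" "$1"
}

# Usage (run locally):
#   ssh root@47.107.70.224 "$(remote_cmd 'nvcc --version')"
```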

✅ What went well

TL;DR: Should’ve used nohup/tmux from the start for the CUDA install, and should’ve added NVIDIA’s repo before trying to install. The rest was smooth. Lesson learned: never run long apt installs over raw SSH without a session wrapper.
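A session wrapper in that sense could be as small as the sketch below. The session name is made up, the log path is reused from the nohup attempt, and start_install is a hypothetical helper; it assumes tmux is installed on the server.

```shell
SESSION="cuda-install"        # hypothetical session name
LOG="/tmp/cuda-install.log"   # same log path as the nohup attempt

# Start the install inside a detached tmux session so it survives a dropped
# SSH connection. Run this on the server.
start_install() {
    tmux new-session -d -s "$SESSION" \
        "DEBIAN_FRONTEND=noninteractive apt install -y $1 2>&1 | tee $LOG"
}

# Usage:
#   start_install cuda-toolkit-12-8
#   tmux attach -t cuda-install     # reattach after reconnecting
#   tail -f /tmp/cuda-install.log   # or just follow the log
```

Compared with nohup, this keeps a terminal you can reattach to, which makes it obvious that the first install is still running before you start a second one.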

