TL;DR

Linux hardware hotplug events let your system detect and configure GPUs automatically when they appear or change state. For local LLM deployments with Ollama and LM Studio, proper hotplug handling ensures your models can leverage GPU acceleration without manual intervention after driver updates, system reboots, or hardware changes.

The kernel’s udev subsystem monitors hardware events and triggers rules stored in /etc/udev/rules.d/. When a GPU appears, udev can run scripts, restart services, or send notifications. For Ollama running as a systemd service, a udev rule can trigger a restart when an NVIDIA or AMD GPU becomes available, so Ollama re-probes the hardware at startup and picks up the GPU instead of silently staying on the CPU.

A practical example: create /etc/udev/rules.d/99-gpu-ollama.rules with a rule that matches your GPU’s vendor ID and triggers a systemd service restart. When the GPU driver loads, udev executes your custom script that verifies GPU availability via nvidia-smi or rocm-smi, then restarts the Ollama service. This eliminates the common issue where Ollama starts before GPU drivers finish initializing, forcing CPU-only inference.
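A hedged sketch of such a handler (the script name and check logic are illustrative, not an Ollama convention):

```shell
#!/bin/bash
# Illustrative /usr/local/bin/gpu-ollama-restart.sh: restart Ollama only
# after confirming a GPU is actually enumerable.

gpu_ready() {
    # True if either vendor tool can list at least one GPU.
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L 2>/dev/null | grep -q "GPU" && return 0
    fi
    if command -v rocm-smi >/dev/null 2>&1; then
        rocm-smi --showid >/dev/null 2>&1 && return 0
    fi
    return 1
}

if gpu_ready; then
    logger "GPU verified, restarting Ollama"
    systemctl restart ollama
else
    logger "GPU uevent received but no GPU visible yet"
fi
```

The check is deliberately conservative: if neither tool can see a device, the service is left alone rather than bounced into CPU-only mode.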

For LM Studio users, hotplug events matter less since it’s a GUI application you launch manually. However, system-level GPU detection still determines which models LM Studio can load. If the GPU driver finishes initializing before your desktop session starts, LM Studio will correctly detect available VRAM on first launch.

Caution: Always test udev rules in a development environment first. Incorrect rules can trigger service restart loops or prevent system boot. Validate any AI-generated udev syntax against the udev manual pages before deploying to production systems. Use udevadm test to dry-run rules without triggering actual actions, and monitor the journal (journalctl -f) or /var/log/syslog for udev errors during testing.

Understanding Linux Hardware Hotplug and GPU Detection

Linux hardware hotplug events occur when the kernel detects physical device changes – GPUs being added, removed, or reset. The udev subsystem manages these events and triggers rules that can restart services or reconfigure applications. For local LLM deployments, proper hotplug handling ensures Ollama and LM Studio detect GPU changes without manual intervention.

When you insert a GPU or the driver reloads, the kernel generates a uevent. The udev daemon reads rules from /etc/udev/rules.d/ and /lib/udev/rules.d/, executing actions based on device attributes. GPU events typically include subsystem type (pci, drm), vendor ID, and device class.

Check current GPU detection with:

udevadm monitor --environment --udev

Then trigger a GPU event by reloading the driver (unloading fails if the GPU is in use, for example by a display server or a loaded model):

sudo rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia
sudo modprobe nvidia

You’ll see events like ACTION=add and SUBSYSTEM=pci in the monitor output.
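To discover the exact attributes your rules can match on (vendor, device, class), walk the GPU’s sysfs entry. The PCI slot below is an example; find yours with lspci -D:

```shell
udevadm info --attribute-walk /sys/bus/pci/devices/0000:01:00.0 | \
    grep -E 'ATTRS?\{(vendor|device|class)\}' | head
```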

Integration with Ollama

Ollama reads GPU availability at startup through CUDA or ROCm libraries. It does not automatically detect mid-session GPU changes. A udev rule can restart the Ollama service when GPU events occur:

# /etc/udev/rules.d/99-ollama-gpu.rules
ACTION=="add", SUBSYSTEM=="pci", ATTR{class}=="0x030000", RUN+="/bin/systemctl restart ollama"

Reload rules with sudo udevadm control --reload-rules. This ensures Ollama reinitializes and re-detects the available GPUs after hardware changes. Note that class 0x030000 covers VGA-compatible controllers; headless datacenter GPUs enumerate as 3D controllers (class 0x030200) and need a second rule.
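Before relying on the rule, you can dry-run it against the GPU’s device path; udevadm test prints which rules match and what RUN+= commands would fire, without executing them (substitute your PCI slot):

```shell
sudo udevadm test --action=add /sys/bus/pci/devices/0000:01:00.0 2>&1 | grep -i ollama
```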

LM Studio Considerations

LM Studio detects GPUs at application launch. Since it’s a GUI application, automatic restarts are less practical. Users typically need to manually restart LM Studio after GPU hotplug events to refresh hardware detection.

Caution: Always validate udev rules in a test environment before deploying to production systems. Incorrect rules can trigger restart loops or system instability.

Why GPU Detection Matters for Local LLM Inference

When you run local LLMs with Ollama or LM Studio, GPU detection directly determines whether your inference runs in seconds or minutes. Modern language models like Llama 3 or Mistral require substantial compute resources – a 7B parameter model running on CPU might generate 2-3 tokens per second, while the same model on a mid-range GPU can produce 30-50 tokens per second.

Linux systems rely on hardware hotplug events to detect and initialize GPUs at boot time and during runtime. If your system fails to properly detect an NVIDIA or AMD GPU, Ollama will silently fall back to CPU inference without warning. You might launch ollama run llama3 and wonder why responses crawl along, unaware that your RTX 4070 sits idle because the CUDA runtime never initialized.

Hotplug event failures typically occur after kernel updates, driver reinstalls, or when running containerized workloads. A fresh kernel might load nouveau instead of the proprietary NVIDIA driver, or udev rules might fail to trigger the correct device permissions. When Ollama starts, it queries available compute devices – if none appear, it proceeds with CPU-only mode.
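A quick way to spot the nouveau case is to check which kernel driver is actually bound to the card:

```shell
# "Kernel driver in use" should read nvidia (or amdgpu), not nouveau:
lspci -nnk | grep -A 3 -Ei 'vga|3d controller'
```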

You can verify GPU availability before launching Ollama:

nvidia-smi  # For NVIDIA GPUs
rocm-smi    # For AMD GPUs

If these commands fail or show no devices, Ollama cannot use GPU acceleration, regardless of how the service is configured. The hotplug subsystem must successfully enumerate and initialize the GPU before any inference tool can access it.

LM Studio provides visual feedback in its GUI when GPU detection fails, but Ollama operates silently. Monitoring udev events and kernel logs helps catch detection failures before they impact inference performance. A properly configured hotplug system ensures your GPU becomes available immediately after driver load, eliminating the need to restart Ollama or LM Studio after hardware changes.

Setting Up Udev Rules for GPU Hotplug Events

Udev rules provide a declarative way to trigger actions when hardware changes occur. For GPU hotplug scenarios, you can configure udev to restart Ollama or notify LM Studio when a new GPU appears or an existing one becomes available after driver reload.

Create a new rule file in /etc/udev/rules.d/. Rule files from /etc/udev/rules.d/ and /lib/udev/rules.d/ are processed together in lexical order, so the numeric prefix controls when your rule runs relative to the others; a prefix of 80 places it after most of the default device-setup rules:

sudo nano /etc/udev/rules.d/80-gpu-hotplug.rules

Add a rule that matches NVIDIA GPU devices and triggers a script:

ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", RUN+="/usr/local/bin/gpu-hotplug-handler.sh"

For AMD GPUs, replace the vendor ID with 0x1002. The ACTION=="add" filter ensures the rule only fires when devices appear, not on every udev event.

Writing the Handler Script

Create the handler script that restarts Ollama when a GPU becomes available:

sudo nano /usr/local/bin/gpu-hotplug-handler.sh
#!/bin/bash
sleep 2
systemctl restart ollama
logger "GPU hotplug detected, restarted Ollama service"

Make it executable:

sudo chmod +x /usr/local/bin/gpu-hotplug-handler.sh

The two-second sleep allows the kernel driver to fully initialize before Ollama attempts to access the GPU. Without this delay, Ollama may start before the CUDA or ROCm libraries detect the device. Keep such delays short: udev expects RUN+= programs to finish quickly and kills any processes still running after event handling completes.
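If a fixed sleep proves flaky on your hardware, a small polling loop is more robust. This sketch (function name and device node are illustrative) waits for a path with a timeout, so the handler never hangs on a CPU-only boot:

```shell
# Poll for a path until it appears or the timeout (seconds) expires.
wait_for_path() {
    local path=$1 timeout=${2:-15} waited=0
    until [ -e "$path" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage inside the handler:
# wait_for_path /dev/nvidia0 15 && systemctl restart ollama
```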

Reloading and Testing Rules

Apply the new rules without rebooting:

sudo udevadm control --reload-rules
sudo udevadm trigger --subsystem-match=pci

Monitor the system log to verify the rule fires:

journalctl -f | grep "GPU hotplug"

Caution: Validate any AI-generated udev rules carefully before deploying to production systems. Incorrect rules can cause boot failures or system instability. Test in a VM or development environment first, and always keep a backup of working configurations.

Configuring Ollama Service Restart on GPU Changes

When GPU hardware changes occur – whether from driver updates, physical card swaps, or power state transitions – Ollama may continue running with stale device references. Automating service restarts ensures the runtime detects new GPU configurations without manual intervention.

Most Ollama installations run as a systemd service. Create a drop-in configuration to handle restart behavior:

sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/restart.conf <<EOF
[Service]
Restart=on-failure
RestartSec=5
EOF
sudo systemctl daemon-reload

This configuration tells systemd to restart Ollama automatically if the process exits unexpectedly, which can happen when GPU device nodes disappear during hotplug events.
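You can also keep Ollama from racing the driver at boot. The drop-in below is a sketch (the device node and timeout are assumptions); CPU-only systems still start because the leading - tells systemd to ignore a failed ExecStartPre:

```shell
sudo tee /etc/systemd/system/ollama.service.d/wait-gpu.conf <<'EOF'
[Service]
# Wait up to 30 s for the NVIDIA device node before starting Ollama.
ExecStartPre=-/bin/sh -c 'timeout 30 sh -c "until test -e /dev/nvidia0; do sleep 1; done"'
EOF
sudo systemctl daemon-reload
```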

Triggering Restarts from Udev Rules

Link your udev rules directly to the Ollama service. Modify your GPU detection rule to include a restart action:

ACTION=="add", SUBSYSTEM=="pci", ATTR{class}=="0x030000", \
  RUN+="/bin/systemctl restart ollama.service"

For multi-GPU systems, add a debounce mechanism to prevent rapid restart loops:

ACTION=="add", SUBSYSTEM=="pci", ATTR{class}=="0x030000", \
  RUN+="/usr/local/bin/ollama-restart-debounce.sh"

Create the debounce script:

#!/bin/bash
# Serialize restarts with flock: the lock is released automatically when the
# subshell exits, so a killed script cannot leave a stale lock behind.
LOCKFILE=/var/lock/ollama-restart.lock
(
    flock -n 9 || exit 0          # a restart is already pending; skip
    sleep 10                      # let all GPUs in the event settle
    systemctl restart ollama.service
) 9>"$LOCKFILE"

Caution: Automated restarts interrupt active inference requests. Test restart behavior with your workload before deploying to production environments. Consider implementing health checks that verify GPU availability before restarting, especially in systems running continuous inference tasks.
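One possible health check, assuming the default Ollama API on localhost:11434 (its /api/ps endpoint lists currently loaded models): restart only when a GPU is visible and no model appears to be serving requests.

```shell
should_restart() {
    # No GPU visible: a restart gains nothing.
    nvidia-smi -L >/dev/null 2>&1 || return 1
    # A loaded model is likely serving requests; defer the restart.
    if curl -sf http://localhost:11434/api/ps 2>/dev/null | grep -q '"name"'; then
        return 1
    fi
    return 0
}

should_restart && systemctl restart ollama.service
```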

Verify the service restarts correctly by checking logs after a GPU event:

journalctl -u ollama.service -f

Handling LM Studio GPU Detection

LM Studio handles GPU detection through its graphical interface, but you can monitor and troubleshoot hardware changes from the command line to ensure the application recognizes new or re-enabled GPUs without requiring a full system restart.

When a GPU becomes available after a hotplug event, verify that the system recognizes it before launching LM Studio:

# Check NVIDIA GPU status
nvidia-smi --query-gpu=index,name,driver_version --format=csv

# Verify CUDA runtime detection
ldconfig -p | grep libcuda

# Check for GPU device nodes
ls -l /dev/nvidia*

If LM Studio was already running when you added or enabled a GPU, the application typically requires a restart to detect the new hardware. Unlike CLI tools that can reload configuration, LM Studio’s GUI architecture means hardware detection happens at startup.

Automating LM Studio Restarts

Create a udev rule that triggers a notification when GPU state changes:

# /etc/udev/rules.d/99-gpu-notify.rules
ACTION=="add", SUBSYSTEM=="pci", ATTR{class}=="0x030000", RUN+="/usr/local/bin/notify-gpu-change.sh"

The notification script can alert you to restart LM Studio manually:

#!/bin/bash
# /usr/local/bin/notify-gpu-change.sh
# udev runs this as root; deliver the notification on the desktop user's
# session bus instead (assumes a single session for UID 1000 on display :0).
DESKTOP_USER=$(id -nu 1000)
sudo -u "$DESKTOP_USER" DISPLAY=:0 \
    DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus \
    notify-send "GPU Detected" "New GPU available -- restart LM Studio to use it"

Make the script executable with chmod +x /usr/local/bin/notify-gpu-change.sh. This approach keeps you informed without attempting automated restarts of GUI applications, which can cause session management issues.

Caution: Always verify udev rules in a test environment before deploying to production systems. Incorrect rules can prevent boot or cause hardware initialization failures.

For headless servers running LM Studio’s local API server, consider using systemd to manage the process lifecycle and enable automatic restarts when hardware changes occur.
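A sketch of that setup, assuming LM Studio’s lms CLI is installed (it bootstraps to ~/.lmstudio/bin/lms; verify the server command and whether it stays in the foreground against your version’s docs):

```shell
mkdir -p ~/.config/systemd/user
tee ~/.config/systemd/user/lmstudio-server.service <<'EOF'
[Unit]
Description=LM Studio local API server

[Service]
ExecStart=%h/.lmstudio/bin/lms server start
Restart=on-failure
RestartSec=5

[Install]
WantedBy=default.target
EOF
systemctl --user daemon-reload
systemctl --user enable --now lmstudio-server.service
```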

Installation and Configuration Steps

Most modern Linux distributions include udev by default. Verify your installation with udevadm --version. For systems missing udev tools, install via your package manager:

# Debian/Ubuntu
sudo apt install udev

# Fedora/RHEL
sudo dnf install systemd-udev

# Arch Linux
sudo pacman -S systemd

Creating GPU Hotplug Detection Rules

Create a custom udev rule to trigger actions when GPU devices appear. Place rules in /etc/udev/rules.d/ with numeric prefixes determining execution order:

sudo nano /etc/udev/rules.d/99-gpu-hotplug.rules

Add this rule to detect NVIDIA GPU hotplug events:

ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", RUN+="/usr/local/bin/gpu-detected.sh"

For AMD GPUs, replace the vendor ID with 0x1002. The script path points to your custom handler that restarts AI services.

Building the GPU Detection Handler

Create the handler script that udev executes:

sudo nano /usr/local/bin/gpu-detected.sh
#!/bin/bash
logger "GPU hotplug detected, restarting Ollama"
systemctl restart ollama

Make it executable:

sudo chmod +x /usr/local/bin/gpu-detected.sh

Reloading Rules and Testing

Apply new udev rules without rebooting:

sudo udevadm control --reload-rules
sudo udevadm trigger

Test your rule by simulating a hotplug event or checking logs after a real GPU insertion. Monitor with journalctl -f to see the logger output.
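To re-fire the add event for just your GPU instead of every PCI device, udevadm trigger accepts attribute matches (0x10de for NVIDIA, 0x1002 for AMD):

```shell
sudo udevadm trigger --action=add --subsystem-match=pci --attr-match=vendor=0x10de
```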

Caution: Validate all scripts before production deployment. AI-generated udev rules can cause boot failures if they reference non-existent binaries or create infinite restart loops. Test in a VM or development environment first.

For LM Studio users, the GUI application handles GPU detection automatically on startup, but you can still use udev rules to trigger desktop notifications when new GPUs become available.