---
title: cuda
description: Platform-specific setup guide
# This file is auto-generated from manpages. Do not edit manually.
# Source: ramalama-cuda.7.md
---

# cuda

# Setting Up RamaLama with CUDA Support on Linux systems

This guide walks through the steps required to set up RamaLama with CUDA support.

## Install the NVIDIA Container Toolkit

Follow the installation instructions provided in the [NVIDIA Container Toolkit installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).

### Installation using dnf/yum (for RPM-based distros like Fedora)

* Install the NVIDIA Container Toolkit packages

```bash
sudo dnf install -y nvidia-container-toolkit
```

:::note
The NVIDIA Container Toolkit is required on the host for running CUDA in containers.
:::

:::note
If the above installation does not work for you and you are running Fedora, try removing the package and installing it from the [COPR](https://copr.fedorainfracloud.org/coprs/g/ai-ml/nvidia-container-toolkit/) instead, as sketched below.
:::
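
For example, switching to the COPR build might look like this (a minimal sketch, assuming the `dnf copr` plugin from `dnf-plugins-core` is installed; the project name is taken from the COPR link above):

```bash
# Remove the toolkit installed from the NVIDIA repository
sudo dnf remove -y nvidia-container-toolkit

# Enable the @ai-ml group COPR and reinstall the toolkit from it
sudo dnf copr enable @ai-ml/nvidia-container-toolkit
sudo dnf install -y nvidia-container-toolkit
```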

### Installation using APT (for Debian-based distros like Ubuntu)

* Configure the Production Repository

```bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
```

* Update the packages list from the repository

```bash
sudo apt-get update
```

* Install the NVIDIA Container Toolkit packages

```bash
sudo apt-get install -y nvidia-container-toolkit
```

:::note
The NVIDIA Container Toolkit is required for containers running under WSL to access CUDA resources.
:::
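
Whichever package manager you used, a quick way to confirm the toolkit CLI is present on the host (a simple sanity check, not part of the upstream guide) is:

```bash
# Prints the installed NVIDIA Container Toolkit CLI version
nvidia-ctk --version
```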

## Setting Up CUDA Support

For additional information see: [Support for Container Device Interface](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html)

### Generate the CDI specification file

```bash
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```

### Check the names of the generated devices

List the generated CDI devices to verify the specification was created:

```bash
nvidia-ctk cdi list
INFO[0000] Found 1 CDI devices
nvidia.com/gpu=all
```

:::note
Generate a new CDI specification after any configuration change, most notably when the driver is upgraded!
:::
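
For example, after upgrading the NVIDIA driver you would regenerate the specification and confirm the devices are still listed:

```bash
# Recreate the CDI spec for the new driver, then re-check the device names
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
nvidia-ctk cdi list
```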

## Testing the Setup

**Based on this Documentation:** [Running a Sample Workload](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/sample-workload.html)

---

### Test the Installation

Run the following command to verify the setup:

```bash
podman run --rm --device=nvidia.com/gpu=all fedora nvidia-smi
```

### Expected Output

If everything is configured correctly, you should see output similar to this:

```text
Thu Dec  5 19:58:40 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.72                 Driver Version: 566.14         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080        On  |   00000000:09:00.0  On |                  N/A |
| 34%   24C    P5              31W / 380W |      867MiB / 10240MiB |      7%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A                35      G   /Xwayland                              N/A   |
|    0   N/A  N/A                35      G   /Xwayland                              N/A   |
+-----------------------------------------------------------------------------------------+
```

:::note
On systems that have SELinux enabled, it may be necessary to turn on the `container_use_devices` boolean in order to run the `nvidia-smi` command successfully from a container.
:::

To check the status of the boolean, run the following:

```bash
getsebool container_use_devices
```

If the result of the command shows that the boolean is `off`, run the following to turn the boolean on:

```bash
sudo setsebool -P container_use_devices 1
```
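
If `nvidia-ctk cdi list` reported per-GPU device names in addition to `nvidia.com/gpu=all` (for example `nvidia.com/gpu=0`), you can run the same smoke test against a single GPU. This is only a sketch; the device index must match one actually listed on your system:

```bash
# Test only the first GPU; substitute an index reported by `nvidia-ctk cdi list`
podman run --rm --device=nvidia.com/gpu=0 fedora nvidia-smi
```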

### CUDA_VISIBLE_DEVICES

RamaLama respects the `CUDA_VISIBLE_DEVICES` environment variable if it's already set in your environment. If not set, RamaLama will default to using all GPUs detected by nvidia-smi.

You can specify which GPU devices should be visible to RamaLama by setting this variable before running RamaLama commands:

```bash
export CUDA_VISIBLE_DEVICES="0,1" # Use GPUs 0 and 1
ramalama run granite
```

This is particularly useful in multi-GPU systems where you want to dedicate specific GPUs to different workloads.
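
As an illustration (a sketch only; the model names and ports are placeholders, and it assumes the `--port` option of `ramalama serve`), two models could be pinned to different GPUs like this:

```bash
# Serve one model per GPU on separate ports (hypothetical model names)
CUDA_VISIBLE_DEVICES=0 ramalama serve --port 8080 granite &
CUDA_VISIBLE_DEVICES=1 ramalama serve --port 8081 tinyllama &
```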

If the `CUDA_VISIBLE_DEVICES` environment variable is set to an empty string, RamaLama will default to using the CPU.

```bash
export CUDA_VISIBLE_DEVICES="" # Defaults to CPU
ramalama run granite
```

To revert to using all available GPUs, unset the environment variable:

```bash
unset CUDA_VISIBLE_DEVICES
```

## Troubleshooting

### CUDA Updates

After some CUDA software updates, RamaLama stops working and complains about missing shared NVIDIA libraries, for example:

```bash
ramalama run granite
Error: crun: cannot stat `/lib64/libEGL_nvidia.so.565.77`: No such file or directory: OCI runtime attempted to invoke a command that was not found
```

Because the CUDA version has changed, the CDI specification file needs to be regenerated:

```bash
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```
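
After regenerating the specification, re-running the earlier smoke test should confirm that the libraries resolve again:

```bash
podman run --rm --device=nvidia.com/gpu=all fedora nvidia-smi
```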

## See Also

[ramalama(1)](/docs/commands/ramalama/), [podman(1)](https://github.com/containers/podman/blob/main/docs/source/markdown/podman.1.md)

---

*Jan 2025, Originally compiled by Dan Walsh <dwalsh@redhat.com>*