How to Use GPU

Ollama supports GPU acceleration for model inference. Here's how to configure it on Windows.

NVIDIA

Supported GPUs

  • NVIDIA GeForce RTX 20/30/40/50 series and newer
  • NVIDIA GeForce GTX 16 series and above
  • NVIDIA Tesla series
  • 6GB+ VRAM recommended
  • CUDA Compute Capability 7.0 or higher
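
If you are not sure of your card's compute capability, recent NVIDIA drivers can report it directly. A minimal check (the compute_cap query field requires a reasonably new driver; older drivers will reject it):

  nvidia-smi --query-gpu=name,compute_cap --format=csv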

Install CUDA

  1. Visit the NVIDIA website to download the CUDA Toolkit (https://developer.nvidia.com/cuda-downloads)
  2. Select Windows and your Windows version
  3. Download and install the CUDA Toolkit (v11.7 or later recommended)
  4. Verify the installation by running:
     nvidia-smi
  5. Restart Ollama to enable GPU acceleration
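
After restarting, you can verify that inference actually runs on the GPU. A minimal check, assuming you have already pulled a model (llama3.2 here is only an example name):

  ollama run llama3.2 "hello"
  ollama ps
  nvidia-smi

ollama ps reports a PROCESSOR column, which should read 100% GPU when the model is fully offloaded, and nvidia-smi should show the ollama process holding VRAM.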

AMD

Supported GPUs

Officially supported:

  • AMD Radeon RX 9000 series
  • AMD Radeon RX 7000 series
  • AMD Radeon RX 6000 series
  • AMD Instinct series
  • 6GB+ VRAM recommended

Install HIP

  1. Download and install the latest AMD drivers
  2. Install HIP SDK (https://www.amd.com/en/developer/resources/rocm-hub/hip-sdk.html)
  3. Restart Ollama to enable GPU acceleration
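
To confirm the GPU was detected, run a model and check ollama ps as in the NVIDIA section, or tail Ollama's server log, which on Windows is written under %LOCALAPPDATA%\Ollama, and look for the detected AMD GPU. A minimal PowerShell sketch, assuming a default install:

  Get-Content "$env:LOCALAPPDATA\Ollama\server.log" -Tail 50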

Unsupported AMD GPUs

Some AMD GPUs (the RX 500 series, the RDNA 1 RX 5000 series, the Radeon 680M iGPU, etc.) lack official ROCm support. Use the following workaround:

Ollama-for-AMD

  1. Visit https://github.com/likelovewant/ollama-for-amd
  2. Download pre-compiled binaries or build from source
  3. Download pre-compiled rocblas and library files
  4. Replace Ollama's rocblas.dll and rocblas library folder with the downloaded files (see the example paths after this list)
  5. Restart Ollama
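
The exact locations vary with the Ollama version, but on a default Windows install the files to replace typically sit under the install directory. The layout below is illustrative only, so check it against the ollama-for-amd README:

  C:\Users\<you>\AppData\Local\Programs\Ollama\lib\ollama\rocblas.dll
  C:\Users\<you>\AppData\Local\Programs\Ollama\lib\ollama\rocblas\library\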

Easier Method

  1. Download and run Ollama-For-AMD-Installer
  2. Select your GPU model and click "Check latest version"
  3. The tool completes the remaining configuration automatically

Important Notes

  1. If the GPU still isn't used (common on dual-GPU laptops), set an environment variable such as CUDA_VISIBLE_DEVICES (NVIDIA) or HIP_VISIBLE_DEVICES (AMD) to force Ollama onto a specific GPU; see the sketch after this list
  2. Set your power plan to "High Performance" mode
  3. Keep your GPU drivers up to date
  4. Monitor VRAM usage to avoid overflow
  5. Close other GPU-intensive applications when using large models
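
For note 1, a minimal PowerShell sketch that pins Ollama to the first GPU via a user-level environment variable (device indices start at 0; use HIP_VISIBLE_DEVICES instead for AMD):

  [Environment]::SetEnvironmentVariable("CUDA_VISIBLE_DEVICES", "0", "User")

Restart Ollama afterwards so the new variable is picked up. For note 4, nvidia-smi -l 1 refreshes the VRAM readout every second on NVIDIA cards.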