nvidia-nim-single 2026-03-12 false AI: Tools: AI,LLM,local-ai,NVIDIA,NIM,CUDA,inference,OpenAI API,chatbot,language model nvcr.io/nim/meta/llama-3.2-3b-instruct:latest nvcr.io https://github.com/PikkonMG/unraid-docker-templates/ https://github.com/PikkonMG/unraid-docker-templates/blob/main/docs/NVIDIA-NIM_SINGLE.md bridge bash false https://forums.unraid.net/topic/197721-support-nvidia-nim-on-unraid-gpu-ai-inference-server/ https://build.nvidia.com/ NVIDIA NIM AI inference server for running LLMs locally on NVIDIA GPUs with CUDA acceleration and an OpenAI-compatible API. Be sure to check out for NIM related support https://developer.nvidia.com/nim DEFAULT MODEL: meta/llama-3.2-3b-instruct -- recommended for GPUs with 12 GB VRAM or less (RTX 3060, 3070, etc). TO CHANGE MODELS: Update BOTH the Repository image tag AND the NIM_MODEL_NAME variable to matching values. Browse available models at https://build.nvidia.com/models VRAM REQUIREMENTS (approximate): - Llama 3.2 3B ~6 GB -- fits 8-12 GB cards - Mistral 7B ~14 GB -- needs 16 GB+ (fp16 uses more than expected) - Llama 3.1 8B ~22 GB -- needs 24 GB+ - Llama 3.1 70B ~80 GB -- multi-GPU only BEFORE FIRST START -- REQUIRED STEPS (run once in Unraid terminal): Before you can pull the image you must have a NVIDIA API key from https://build.nvidia.com. Generate a Personal API Key from your profile. Step 1: Login to NGC registry (only needed once, persists until reboot): docker login nvcr.io Username: $oauthtoken Password: YOUR_NGC_API_KEY REQUIRED BEFORE FIRST START -- run in Unraid terminal: Step 2: Fix cache directory permissions: chown -R 1000:1000 /mnt/user/appdata/nvidia-nim/cache chmod -R 775 /mnt/user/appdata/nvidia-nim/cache REQUIRES: NVIDIA GPU (Turing/RTX 20 series or newer) | nvidia-driver Unraid plugin | NGC API key from build.nvidia.com URLS (replace YOUR_SERVER_IP with your Unraid IP): WebUI / Swagger docs : http://YOUR_SERVER_IP:8000/docs API base URL : http://YOUR_SERVER_IP:8000/v1 (use this in AnythingLLM, Open WebUI, etc.) Models list : http://YOUR_SERVER_IP:8000/v1/models http://[IP]:[PORT:8000]/docs https://raw.githubusercontent.com/PikkonMG/unraid-docker-templates/main/templates/nvidia-nim-single.xml https://raw.githubusercontent.com/PikkonMG/unraid-docker-templates/main/templates/img/nvidia-nim-single.png --gpus all --shm-size=16gb --ulimit memlock=-1 --ulimit stack=67108864 NVIDIA GPU (Turing or newer) | nvidia-driver plugin (Community Applications) | NGC API Key (build.nvidia.com) 8000 /mnt/user/appdata/nvidia-nim/cache meta/llama-3.2-3b-instruct 16384 /opt/nim/.cache 0 1 expandable_segments:True INFO