{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Deploy VSS 3.1 (Video Search & Summarization)\n", "\n", "This notebook deploys the NVIDIA VSS 3.1 Blueprint on GPU-equipped cloud instances.\n", "\n", "**What it does:**\n", "1. Validates GPU hardware and Docker prerequisites\n", "2. Installs and configures the NGC CLI\n", "3. Configures Docker storage for large image pulls\n", "4. Gets the deployment code (from a local path or GitHub)\n", "5. Detects network configuration (internal + external IPs)\n", "6. Runs `dev-profile.sh` to deploy the selected profile\n", "7. Verifies all services are healthy\n", "\n", "**Supported profiles:** `base`, `search`, `alerts`, `lvs` \n", "**Supported hardware:** H100, L40S, RTX PRO 6000 Blackwell, DGX SPARK, IGX Thor, AGX Thor (DGX SPARK and the Thor platforms support only the `base` and `alerts` profiles)\n", "\n", "---\n", "\n", "## Prerequisites\n", "\n", "- Linux instance with 2+ NVIDIA GPUs (H100, L40S, RTX PRO 6000 BW, or DGX SPARK)\n", "- NVIDIA driver 550+ and CUDA 12.x installed\n", "- Docker Engine 24+ with Docker Compose v2\n", "- NGC API key from [ngc.nvidia.com](https://ngc.nvidia.com)\n", "- **500GB+ disk space** for Docker images and models. Most GPU cloud instances have a small root disk (~200-250GB) plus a large ephemeral NVMe. Section 4 will auto-detect this and move Docker/containerd storage to the NVMe." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Configuration\n", "\n", "Set your NGC API key, deployment profile, and hardware below. These variables are used by all subsequent cells."
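, "\n",
"If you prefer not to paste the key into the notebook file, it can be read from the environment instead (a minimal sketch; the `VSS_NGC_KEY` name is illustrative, not something the deploy scripts require, and you would adapt the assignment in the next cell if you use it):\n",
"\n",
"```python\n",
"import os\n",
"\n",
"# Fall back through two env var names; empty string if neither is set\n",
"NGC_CLI_API_KEY = os.environ.get('VSS_NGC_KEY', '') or os.environ.get('NGC_CLI_API_KEY', '')\n",
"```\n"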
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ============================================================\n", "# REQUIRED: Set these before running anything else\n", "# ============================================================\n", "\n", "NGC_CLI_API_KEY = \"\" # Your NGC API key — get one at https://ngc.nvidia.com\n", "\n", "PROFILE = \"base\" # Deployment profile: base, search, alerts, lvs\n", "\n", "HARDWARE_PROFILE = \"RTXPRO6000BW\" # Hardware: RTXPRO6000BW, H100, L40S, DGX-SPARK, IGX-THOR, AGX-THOR, OTHER\n", "\n", "# ============================================================\n", "# OPTIONAL: Override defaults if needed\n", "# ============================================================\n", "\n", "# Deployment source — set ONE of these:\n", "# DEPLOY_SOURCE_PATH: Path to a pre-extracted repo on this machine (e.g. copied via scp/rsync).\n", "# Must contain scripts/dev-profile.sh and deployments/.\n", "# If empty, clones from GitHub (requires network access).\n", "DEPLOY_SOURCE_PATH = \"\" # e.g. \"/home/ubuntu/video-search-and-summarization\"\n", "\n", "GIT_BRANCH = \"3.1.0\" # Git branch or tag (only used when cloning from GitHub)\n", "\n", "ALERTS_MODE = \"verification\" # Only used when PROFILE=alerts: verification or real-time\n", "\n", "USE_REMOTE_LLM = False # Set True to use a remote LLM endpoint instead of local\n", "USE_REMOTE_VLM = False # Set True to use a remote VLM endpoint instead of local\n", "\n", "# Network overrides (auto-detected in Section 7 if left empty)\n", "HOST_IP_OVERRIDE = \"\" # Internal IP — leave empty for auto-detect\n", "EXTERNAL_IP_OVERRIDE = \"\" # External IP — leave empty for auto-detect" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "# ---- Validate configuration ----\n", "import os, sys\n", "\n", "assert NGC_CLI_API_KEY, \"NGC_CLI_API_KEY is required. 
Get one at https://ngc.nvidia.com\"\n", "assert PROFILE in (\"base\", \"search\", \"alerts\", \"lvs\"), f\"Invalid PROFILE: {PROFILE}\"\n", "assert HARDWARE_PROFILE in (\"H100\", \"L40S\", \"RTXPRO6000BW\", \"DGX-SPARK\", \"IGX-THOR\", \"AGX-THOR\", \"OTHER\"), \\\n", " f\"Invalid HARDWARE_PROFILE: {HARDWARE_PROFILE}\"\n", "\n", "if PROFILE == \"alerts\":\n", " assert ALERTS_MODE in (\"verification\", \"real-time\"), f\"Invalid ALERTS_MODE: {ALERTS_MODE}\"\n", "\n", "if HARDWARE_PROFILE in (\"DGX-SPARK\", \"IGX-THOR\", \"AGX-THOR\"):\n", " assert PROFILE in (\"base\", \"alerts\"), \\\n", " f\"{HARDWARE_PROFILE} only supports base and alerts profiles, not {PROFILE}\"\n", "\n", "if DEPLOY_SOURCE_PATH:\n", " assert os.path.isdir(DEPLOY_SOURCE_PATH), f\"DEPLOY_SOURCE_PATH does not exist: {DEPLOY_SOURCE_PATH}\"\n", "\n", "# Export NGC key to environment for shell cells and dev-profile.sh\n", "os.environ[\"NGC_CLI_API_KEY\"] = NGC_CLI_API_KEY\n", "\n", "print(\"Configuration valid.\")\n", "print(f\" Profile: {PROFILE}\")\n", "print(f\" Hardware: {HARDWARE_PROFILE}\")\n", "print(f\" Source: {DEPLOY_SOURCE_PATH or f'GitHub (branch: {GIT_BRANCH})'}\")\n", "print(f\" LLM: {'remote' if USE_REMOTE_LLM else 'local'}\")\n", "print(f\" VLM: {'remote' if USE_REMOTE_VLM else 'local'}\")\n", "if PROFILE == \"alerts\":\n", " print(f\" Alerts: {ALERTS_MODE}\")\n", "print(f\" NGC key: {NGC_CLI_API_KEY[:4]}...{NGC_CLI_API_KEY[-4:]}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Prerequisites Check\n", "\n", "Verify that the NVIDIA driver, CUDA, Docker, and Docker Compose are installed and functional." 
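, "\n",
"The cell below prints versions for manual inspection. If you want a programmatic gate on the driver requirement, the comparison reduces to a small pure function (a sketch; `driver_ok` is a hypothetical helper, not part of the deploy scripts):\n",
"\n",
"```python\n",
"def driver_ok(version, minimum=(550, 0)):\n",
"    # Compare a 'major.minor[.patch]' driver string such as '550.54' against a floor\n",
"    parts = tuple(int(p) for p in version.split('.')[:2])\n",
"    return parts >= minimum\n",
"\n",
"print(driver_ok('550.54'))   # True - meets the 550+ requirement\n",
"print(driver_ok('535.183'))  # False - too old for VSS 3.1\n",
"```\n"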
] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "%%bash\n", "set -e\n", "\n", "echo \"=== NVIDIA Driver & GPU ===\"\n", "nvidia-smi --query-gpu=index,name,driver_version,memory.total --format=csv,noheader\n", "echo \"\"\n", "\n", "echo \"=== GPU Count ===\"\n", "GPU_COUNT=$(nvidia-smi --query-gpu=index --format=csv,noheader | wc -l)\n", "echo \"Detected $GPU_COUNT GPU(s)\"\n", "echo \"\"\n", "\n", "echo \"=== Docker ===\"\n", "docker --version\n", "docker compose version\n", "echo \"\"\n", "\n", "echo \"=== NVIDIA Container Toolkit ===\"\n", "if docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi > /dev/null 2>&1; then\n", " echo \"NVIDIA Container Toolkit: OK\"\n", "else\n", " echo \"WARNING: NVIDIA Container Toolkit may not be installed.\"\n", " echo \"Install: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html\"\n", "fi\n", "echo \"\"\n", "\n", "echo \"=== Disk Space ===\"\n", "df -h / | tail -1 | awk '{print \"Root:\", $4, \"available of\", $2}'\n", "echo \"\"\n", "echo \"Prerequisites check complete.\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Install NGC CLI\n", "\n", "The NGC CLI is required to download models during deployment. This cell installs it if not already present, then configures it with your API key." 
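, "\n",
"The architecture-to-filename selection the install cell performs can be isolated as a small function (a sketch mirroring the logic in the cell below):\n",
"\n",
"```python\n",
"import platform\n",
"\n",
"def ngc_cli_filename(arch=None):\n",
"    # NGC ships separate Linux CLI bundles for arm64 and x86_64\n",
"    arch = arch or platform.machine()\n",
"    if arch in ('aarch64', 'arm64'):\n",
"        return 'ngccli_linux_arm64.zip'\n",
"    return 'ngccli_linux.zip'\n",
"\n",
"print(ngc_cli_filename('aarch64'))  # ngccli_linux_arm64.zip\n",
"print(ngc_cli_filename('x86_64'))   # ngccli_linux.zip\n",
"```\n"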
] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "import subprocess, os, shutil\n", "\n", "def run(cmd, **kwargs):\n", " \"\"\"Run a shell command, raise on failure with output.\"\"\"\n", " r = subprocess.run(cmd, shell=True, capture_output=True, text=True, **kwargs)\n", " if r.returncode != 0:\n", " raise RuntimeError(f\"Command failed: {cmd}\\n{r.stderr}\\n{r.stdout}\")\n", " return r.stdout.strip()\n", "\n", "# Check if NGC CLI is already installed\n", "ngc_path = shutil.which(\"ngc\")\n", "if ngc_path:\n", " ver = run(\"ngc --version 2>&1 | head -1\")\n", " print(f\"NGC CLI already installed: {ver}\")\n", "else:\n", " import platform\n", " arch = platform.machine()\n", " if arch in (\"aarch64\", \"arm64\"):\n", " filename = \"ngccli_linux_arm64.zip\"\n", " else:\n", " filename = \"ngccli_linux.zip\"\n", "\n", " # Use version-pinned URL (update this if a newer version is needed)\n", " NGC_CLI_VERSION = \"4.13.0\"\n", " url = f\"https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/{NGC_CLI_VERSION}/files/{filename}\"\n", "\n", " print(f\"Installing NGC CLI {NGC_CLI_VERSION} ...\")\n", " run(f\"cd /tmp && wget -q --content-disposition '{url}' -O ngc_cli.zip\")\n", "\n", " # Verify download is not empty\n", " size = os.path.getsize(\"/tmp/ngc_cli.zip\")\n", " if size < 1000:\n", " raise RuntimeError(f\"NGC CLI download failed — file is only {size} bytes. 
Check the version URL.\")\n", " print(f\" Downloaded {size / 1024 / 1024:.1f} MB\")\n", "\n", " run(\"cd /tmp && unzip -o ngc_cli.zip\")\n", " # NGC bundles its own Python — copy the entire directory\n", " run(\"sudo cp -r /tmp/ngc-cli/* /usr/local/bin/\")\n", " run(\"rm -rf /tmp/ngc_cli.zip /tmp/ngc-cli\")\n", "\n", " ver = run(\"ngc --version 2>&1 | head -1\")\n", " print(f\" Installed: {ver}\")\n", "\n", "# Configure NGC CLI with API key and org\n", "print(\"Configuring NGC CLI...\")\n", "ngc_dir = os.path.expanduser(\"~/.ngc\")\n", "os.makedirs(ngc_dir, exist_ok=True)\n", "\n", "with open(os.path.join(ngc_dir, \"config\"), \"w\") as f:\n", " f.write(f\"\"\";WARNING - This is a machine generated file. Do not edit manually.\n", ";WARNING - To update local config settings, see 'ngc config set -h'.\n", "\n", "[CURRENT]\n", "apikey = {NGC_CLI_API_KEY}\n", "format_type = ascii\n", "org = nvstaging\n", "\"\"\")\n", "\n", "print(\"NGC CLI configured.\")\n", "print(run(\"ngc config current\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Docker & Containerd Storage\n", "\n", "Docker images and containerd layers for VSS require **~250GB** (NIM models, DeepStream, ELK, etc.). Most GPU cloud instances ship with a small root disk (200-250GB) that **will run out of space** during deployment.\n", "\n", "This cell auto-detects whether your root disk is too small and moves Docker and containerd storage to a larger mount. Docker **volumes** (Elasticsearch indices, uploaded videos, Kafka data) are kept on the root disk so your data persists even if the instance is stopped and the ephemeral NVMe is wiped. Images and layers are re-pulled automatically on next deploy.\n", "\n", "**Common NVMe mount points** (auto-detected):\n", "- AWS DLAMI: `/opt/dlami/nvme`\n", "- Brev/Crusoe: `/ephemeral`\n", "- Custom RAID: `/data`\n", "\n", "To override auto-detection, set `STORAGE_ROOT` below." 
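, "\n",
"The detection boils down to one measurement per mount: free gigabytes. A minimal standalone version (a sketch; the cell below uses `os.statvfs`, which reports the same numbers on Linux):\n",
"\n",
"```python\n",
"import shutil\n",
"\n",
"def free_gb(path):\n",
"    # Free space in GiB on the filesystem containing `path`\n",
"    return shutil.disk_usage(path).free / (1024 ** 3)\n",
"\n",
"print('root free: %.0f GB' % free_gb('/'))\n",
"```\n"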
] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "import subprocess, json, os, shutil\n", "\n", "STORAGE_ROOT = \"\" # Override: set to a mount path (e.g. \"/mnt/data\") to skip auto-detection\n", "\n", "MIN_ROOT_FREE_GB = 350 # If root has less than this free, move storage\n", "\n", "# --- Auto-detect large mount ---\n", "\n", "def get_disk_free_gb(path):\n", " \"\"\"Return free space in GB for the filesystem containing path.\"\"\"\n", " st = os.statvfs(path)\n", " return (st.f_bavail * st.f_frsize) / (1024 ** 3)\n", "\n", "def get_disk_total_gb(path):\n", " st = os.statvfs(path)\n", " return (st.f_blocks * st.f_frsize) / (1024 ** 3)\n", "\n", "def find_large_mount():\n", " \"\"\"Look for a large non-root mount suitable for Docker storage.\"\"\"\n", " candidates = [\"/opt/dlami/nvme\", \"/ephemeral\", \"/data\"]\n", " for path in candidates:\n", " if os.path.isdir(path) and os.path.ismount(path):\n", " free = get_disk_free_gb(path)\n", " if free > 200:\n", " return path, free\n", " return None, 0\n", "\n", "def find_mount_unit(mount_path):\n", " \"\"\"Convert a mount path to a systemd mount unit name (e.g. 
/opt/dlami/nvme -> opt-dlami-nvme.mount).\"\"\"\n", " # Strip leading slash, replace remaining slashes with dashes\n", " unit = mount_path.strip(\"/\").replace(\"/\", \"-\") + \".mount\"\n", " # Verify this unit exists on the system\n", " r = subprocess.run([\"systemctl\", \"cat\", unit], capture_output=True, text=True)\n", " if r.returncode == 0:\n", " return unit\n", " return None\n", "\n", "root_free = get_disk_free_gb(\"/\")\n", "root_total = get_disk_total_gb(\"/\")\n", "\n", "print(f\"Root disk: {root_free:.0f} GB free / {root_total:.0f} GB total\")\n", "\n", "if STORAGE_ROOT:\n", " large_mount = STORAGE_ROOT\n", " mount_free = get_disk_free_gb(STORAGE_ROOT)\n", " print(f\"Using override: {STORAGE_ROOT} ({mount_free:.0f} GB free)\")\n", " need_move = True\n", "else:\n", " large_mount, mount_free = find_large_mount()\n", " need_move = root_free < MIN_ROOT_FREE_GB and large_mount is not None\n", "\n", " if large_mount:\n", " print(f\"Large mount: {large_mount} ({mount_free:.0f} GB free)\")\n", " else:\n", " print(\"No large ephemeral mount detected.\")\n", "\n", " if root_free >= MIN_ROOT_FREE_GB:\n", " print(f\"\\nRoot disk has enough space ({root_free:.0f} GB free). No storage move needed.\")\n", " elif not large_mount:\n", " print(f\"\\nWARNING: Root disk only has {root_free:.0f} GB free and no large mount was found.\")\n", " print(\"Deployment may fail due to disk space. 
Consider attaching a larger volume.\")\n", "\n", "if need_move:\n", " DOCKER_DATA_ROOT = os.path.join(large_mount, \"docker\")\n", " CONTAINERD_ROOT = os.path.join(large_mount, \"containerd\")\n", " VOLUMES_DIR = \"/var/lib/docker/volumes\" # Keep volumes on persistent root disk\n", "\n", " print(f\"\\nMoving Docker and containerd storage to {large_mount}\")\n", " print(f\" Docker images/layers: {DOCKER_DATA_ROOT}\")\n", " print(f\" Containerd: {CONTAINERD_ROOT}\")\n", " print(f\" Docker volumes: {VOLUMES_DIR} (stays on root for persistence)\")\n", "\n", " # --- Check what needs changing ---\n", " daemon_json = \"/etc/docker/daemon.json\"\n", " config = {}\n", " try:\n", " with open(daemon_json) as f:\n", " config = json.load(f)\n", " except (FileNotFoundError, json.JSONDecodeError):\n", " pass\n", "\n", " need_daemon_json = config.get(\"data-root\") != DOCKER_DATA_ROOT\n", "\n", " subprocess.run([\"sudo\", \"mkdir\", \"-p\", DOCKER_DATA_ROOT], check=True)\n", " subprocess.run([\"sudo\", \"mkdir\", \"-p\", VOLUMES_DIR], check=True)\n", "\n", " volumes_link = os.path.join(DOCKER_DATA_ROOT, \"volumes\")\n", " need_volumes_symlink = not (os.path.islink(volumes_link) and os.readlink(volumes_link) == VOLUMES_DIR)\n", "\n", " containerd_link = \"/var/lib/containerd\"\n", " need_containerd = not (os.path.islink(containerd_link) and os.readlink(containerd_link) == CONTAINERD_ROOT)\n", "\n", " # Even if symlinks are correct, ensure NVMe target dirs actually exist\n", " # (they get wiped when ephemeral NVMe is reset on instance stop/start)\n", " need_target_dirs = not os.path.isdir(DOCKER_DATA_ROOT) or not os.path.isdir(CONTAINERD_ROOT)\n", " if need_target_dirs:\n", " print(f\"\\n NVMe target dir(s) missing — recreating...\")\n", " subprocess.run([\"sudo\", \"mkdir\", \"-p\", DOCKER_DATA_ROOT, CONTAINERD_ROOT], check=True)\n", "\n", " if not need_daemon_json and not need_volumes_symlink and not need_containerd:\n", " print(f\"\\n Docker data-root already set to 
{DOCKER_DATA_ROOT}\")\n", " print(f\" Volumes symlink already correct: {volumes_link} -> {VOLUMES_DIR}\")\n", " print(f\" Containerd already symlinked: {containerd_link} -> {CONTAINERD_ROOT}\")\n", "\n", " # Always ensure the boot-time restore service is up to date\n", " # (handles the case where service exists but is missing mount dependencies)\n", " _update_restore_service = True\n", " _need_restart = need_target_dirs # Restart Docker/containerd if we had to recreate dirs\n", " else:\n", " _update_restore_service = True\n", " _need_restart = True\n", "\n", " # Stop Docker AND docker.socket (socket can reactivate Docker and recreate dirs),\n", " # but only when a change actually requires a restart; a no-op run would otherwise\n", " # leave Docker stopped, since the start below is also gated on _need_restart.\n", " if _need_restart:\n", "  print(\"\\n Stopping Docker and containerd for storage reconfiguration...\")\n", "  subprocess.run([\"sudo\", \"systemctl\", \"stop\", \"docker.socket\"], check=False)\n", "  subprocess.run([\"sudo\", \"systemctl\", \"stop\", \"docker\"], check=True)\n", "  subprocess.run([\"sudo\", \"systemctl\", \"stop\", \"containerd\"], check=True)\n", "\n", " # --- Docker daemon.json ---\n", " if need_daemon_json:\n", " config[\"data-root\"] = DOCKER_DATA_ROOT\n", " new_config = json.dumps(config, indent=2)\n", " subprocess.run(\n", " f\"echo '{new_config}' | sudo tee {daemon_json}\",\n", " shell=True, check=True, capture_output=True\n", " )\n", " print(f\" Docker data-root set to {DOCKER_DATA_ROOT}\")\n", " else:\n", " print(f\" Docker data-root already set to {DOCKER_DATA_ROOT}\")\n", "\n", " # --- Volumes symlink (use ln -sfn for idempotency) ---\n", " if need_volumes_symlink:\n", " # ln -sfn: force, no-dereference (replaces existing dir/symlink atomically)\n", " subprocess.run([\"sudo\", \"rm\", \"-rf\", volumes_link], check=True)\n", " subprocess.run([\"sudo\", \"ln\", \"-sfn\", VOLUMES_DIR, volumes_link], check=True)\n", " print(f\" Created symlink: {volumes_link} -> {VOLUMES_DIR}\")\n", " else:\n", " print(f\" Volumes symlink already correct: {volumes_link} -> {VOLUMES_DIR}\")\n", "\n", " # --- Containerd ---\n", " if 
need_containerd:\n", " subprocess.run([\"sudo\", \"mkdir\", \"-p\", CONTAINERD_ROOT], check=True)\n", " if os.path.isdir(containerd_link) and not os.path.islink(containerd_link):\n", " # Move existing containerd data\n", " subprocess.run(f\"sudo mv {containerd_link}/* {CONTAINERD_ROOT}/ 2>/dev/null; true\",\n", " shell=True, check=False)\n", " subprocess.run([\"sudo\", \"rm\", \"-rf\", containerd_link], check=True)\n", " print(f\" Containerd data moved to {CONTAINERD_ROOT}\")\n", " elif os.path.lexists(containerd_link):\n", " subprocess.run([\"sudo\", \"rm\", \"-f\", containerd_link], check=True)\n", " subprocess.run([\"sudo\", \"ln\", \"-sfn\", CONTAINERD_ROOT, containerd_link], check=True)\n", " print(f\" Containerd symlinked: {containerd_link} -> {CONTAINERD_ROOT}\")\n", " else:\n", " print(f\" Containerd already symlinked: {containerd_link} -> {CONTAINERD_ROOT}\")\n", "\n", " # --- Install/update boot-time restore service ---\n", " # Ephemeral NVMe is wiped on instance stop/start. This systemd service\n", " # recreates the directories before Docker/containerd start so they don't crash-loop.\n", " # We use RequiresMountsFor= so the service waits for the NVMe to actually be mounted.\n", " if _update_restore_service:\n", " unit_name = \"docker-nvme-restore.service\"\n", " unit_path = f\"/etc/systemd/system/{unit_name}\"\n", "\n", " # Build After= line — include the mount unit if systemd knows about it\n", " after_targets = \"local-fs.target\"\n", " mount_unit = find_mount_unit(large_mount)\n", " if mount_unit:\n", " after_targets += f\" {mount_unit}\"\n", "\n", " unit_content = f\"\"\"[Unit]\n", "Description=Restore Docker/containerd dirs on ephemeral NVMe\n", "Before=containerd.service docker.service\n", "After={after_targets}\n", "RequiresMountsFor={large_mount}\n", "\n", "[Service]\n", "Type=oneshot\n", "ExecStart=/bin/bash -c 'mkdir -p {DOCKER_DATA_ROOT} {CONTAINERD_ROOT}'\n", "\n", "[Install]\n", "WantedBy=multi-user.target\n", "\"\"\"\n", " import 
tempfile\n", " with tempfile.NamedTemporaryFile(mode='w', suffix='.service', delete=False) as tmp:\n", " tmp.write(unit_content)\n", " tmp_path = tmp.name\n", " subprocess.run([\"sudo\", \"cp\", tmp_path, unit_path], check=True)\n", " os.unlink(tmp_path)\n", " subprocess.run([\"sudo\", \"systemctl\", \"daemon-reload\"], check=True)\n", " subprocess.run([\"sudo\", \"systemctl\", \"enable\", unit_name], check=True, capture_output=True)\n", " print(f\" Installed {unit_name} (restores NVMe dirs on boot, waits for mount)\")\n", "\n", " # --- Restart if needed ---\n", " if _need_restart:\n", " print(\"\\n Starting containerd and Docker...\")\n", " subprocess.run([\"sudo\", \"systemctl\", \"start\", \"containerd\"], check=True)\n", " subprocess.run([\"sudo\", \"systemctl\", \"start\", \"docker.socket\"], check=True)\n", " subprocess.run([\"sudo\", \"systemctl\", \"start\", \"docker\"], check=True)\n", "\n", " r = subprocess.run([\"docker\", \"info\", \"--format\", \"{{.DockerRootDir}}\"],\n", " capture_output=True, text=True)\n", " print(f\"\\n Docker data-root: {r.stdout.strip()}\")\n", " target = os.readlink(containerd_link) if os.path.islink(containerd_link) else containerd_link\n", " print(f\" Containerd root: {target}\")\n", " print(f\"\\n Storage configuration complete.\")\n", "else:\n", " if not STORAGE_ROOT and root_free >= MIN_ROOT_FREE_GB:\n", " print(\"Skipping storage move.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Docker Login\n", "\n", "Authenticate with the NVIDIA Container Registry (`nvcr.io`) to pull deployment images." 
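, "\n",
"The cell below passes the key as a process argument. Docker also accepts the password on stdin via `--password-stdin`, which keeps the key out of the host's process list; a sketch of that variant (not what the cell below does):\n",
"\n",
"```python\n",
"import subprocess\n",
"\n",
"def docker_login_cmd(registry='nvcr.io'):\n",
"    # '$oauthtoken' is the literal username NGC expects; the key itself\n",
"    # arrives on stdin rather than in argv\n",
"    return ['docker', 'login', registry, '--username', '$oauthtoken', '--password-stdin']\n",
"\n",
"# Usage (requires Docker):\n",
"# subprocess.run(docker_login_cmd(), input=NGC_CLI_API_KEY, text=True, check=True)\n",
"print(docker_login_cmd())\n",
"```\n"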
] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "import subprocess\n", "\n", "result = subprocess.run(\n", " [\"docker\", \"login\", \"nvcr.io\",\n", " \"--username\", \"$oauthtoken\",\n", " \"--password\", NGC_CLI_API_KEY],\n", " capture_output=True, text=True\n", ")\n", "if result.returncode == 0:\n", " print(\"Docker login to nvcr.io: OK\")\n", "else:\n", " print(f\"Docker login FAILED:\\n{result.stderr}\")\n", " raise RuntimeError(\"Docker login to nvcr.io failed\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6. Get Deployment Code\n", "\n", "This cell locates the deployment code (scripts, compose files, configs). There are two ways to get it onto the server:\n", "\n", "### Option 1 — Brev Launchable (automatic)\n", "\n", "When configured as a Brev Launchable, the git repository is cloned onto the instance automatically. Set `DEPLOY_SOURCE_PATH` in Section 1 to the path where Brev placed it (typically `~/video-search-and-summarization`).\n", "\n", "### Option 2 — Manual tarball\n", "\n", "If the code isn't already on the server, create a tarball from your local checkout and copy it over:\n", "\n", "```bash\n", "# On your local machine:\n", "cd /path/to/video-search-and-summarization\n", "tar czf ~/vss-deploy-3.1.0.tar.gz --exclude='.git' .\n", "scp ~/vss-deploy-3.1.0.tar.gz <user>@<server-ip>:~/\n", "\n", "# On the server:\n", "mkdir -p ~/video-search-and-summarization\n", "tar xzf ~/vss-deploy-3.1.0.tar.gz -C ~/video-search-and-summarization\n", "```\n", "\n", "Then set in **Section 1**: `DEPLOY_SOURCE_PATH = \"/home/<user>/video-search-and-summarization\"`\n", "\n", "---\n", "\n", "If `DEPLOY_SOURCE_PATH` is set, the next cell uses the code at that path directly. Otherwise, it attempts to clone from GitHub (requires the repo to be accessible)."
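, "\n",
"The decision the next cell makes can be summarized as (a sketch; `resolve_source` is an illustrative name, not a function in the repo):\n",
"\n",
"```python\n",
"import os\n",
"\n",
"def resolve_source(deploy_source_path, git_branch):\n",
"    # Prefer a pre-extracted local copy; otherwise fall back to cloning the tag/branch\n",
"    if deploy_source_path:\n",
"        if not os.path.isdir(deploy_source_path):\n",
"            raise FileNotFoundError(deploy_source_path)\n",
"        return ('local', deploy_source_path)\n",
"    return ('github', git_branch)\n",
"\n",
"print(resolve_source('', '3.1.0'))  # ('github', '3.1.0')\n",
"```\n"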
] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "import subprocess, os\n", "\n", "if DEPLOY_SOURCE_PATH:\n", " # --- Use pre-extracted local repo ---\n", " REPO_DIR = DEPLOY_SOURCE_PATH\n", " print(f\"Using local deployment source: {REPO_DIR}\")\n", "else:\n", " # --- Clone from GitHub ---\n", " DEPLOY_DIR = os.path.expanduser(\"~/deployments\")\n", " REPO_DIR = os.path.join(DEPLOY_DIR, \"video-search-and-summarization\")\n", " GITHUB_REPO = \"https://github.com/NVIDIA-AI-IOT/video-search-and-summarization.git\"\n", "\n", " os.makedirs(DEPLOY_DIR, exist_ok=True)\n", "\n", " if os.path.isdir(REPO_DIR):\n", " print(f\"Repo already exists at {REPO_DIR}\")\n", " print(f\"Fetching latest and checking out {GIT_BRANCH}...\")\n", " subprocess.run([\"git\", \"fetch\", \"--all\", \"--prune\"], cwd=REPO_DIR, check=True,\n", " capture_output=True)\n", " result = subprocess.run([\"git\", \"checkout\", GIT_BRANCH], cwd=REPO_DIR,\n", " capture_output=True, text=True)\n", " if result.returncode != 0:\n", " raise RuntimeError(f\"Failed to checkout {GIT_BRANCH}:\\n{result.stderr}\")\n", " subprocess.run([\"git\", \"pull\", \"--ff-only\"], cwd=REPO_DIR, check=False,\n", " capture_output=True)\n", " else:\n", " print(f\"Cloning from GitHub (branch: {GIT_BRANCH})...\")\n", " result = subprocess.run(\n", " [\"git\", \"clone\", \"--branch\", GIT_BRANCH, \"--single-branch\", GITHUB_REPO, REPO_DIR],\n", " capture_output=True, text=True\n", " )\n", " if result.returncode != 0:\n", " raise RuntimeError(f\"Clone failed:\\n{result.stderr}\")\n", " print(\"Clone complete.\")\n", "\n", "# Validate repo structure\n", "SCRIPT_DIR = os.path.join(REPO_DIR, \"scripts\")\n", "assert os.path.isfile(os.path.join(SCRIPT_DIR, \"dev-profile.sh\")), \\\n", " f\"dev-profile.sh not found in {SCRIPT_DIR}\"\n", "assert os.path.isdir(os.path.join(REPO_DIR, \"deployments\")), \\\n", " f\"deployments/ not found in 
{REPO_DIR}\"\n", "\n", "# Show commit info (if it's a git repo)\n", "commit = \"(not a git repo)\"\n", "branch = \"\"\n", "if os.path.isdir(os.path.join(REPO_DIR, \".git\")):\n", " commit = subprocess.run(\n", " [\"git\", \"log\", \"--oneline\", \"-1\"],\n", " cwd=REPO_DIR, capture_output=True, text=True\n", " ).stdout.strip()\n", " branch = subprocess.run(\n", " [\"git\", \"branch\", \"--show-current\"],\n", " cwd=REPO_DIR, capture_output=True, text=True\n", " ).stdout.strip()\n", "\n", "print(f\"\\nRepo: {REPO_DIR}\")\n", "if branch:\n", " print(f\"Branch: {branch}\")\n", "print(f\"Commit: {commit}\")\n", "print(f\"Scripts: {SCRIPT_DIR}\")\n", "print(f\"\\nContents of deployments/:\")\n", "for entry in sorted(os.listdir(os.path.join(REPO_DIR, \"deployments\"))):\n", " print(f\" {entry}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 7. Detect Network Configuration\n", "\n", "Auto-detects internal (`HOST_IP`) and external (`EXTERNAL_IP`) addresses. On NAT'd cloud instances (Brev, AWS), these are different — the internal IP is used for inter-container communication while the external IP is used for browser access.\n", "\n", "If auto-detection fails or gives the wrong result, set the overrides in Section 1." 
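, "\n",
"Auto-detection shells out to `ip route get 1.1.1.1` and extracts the address after the `src` keyword. The parsing step on its own (a sketch over a sample output line):\n",
"\n",
"```python\n",
"def src_ip(route_line):\n",
"    # Return the token following 'src' in `ip route get` output, or '' if absent\n",
"    fields = route_line.split()\n",
"    for i, tok in enumerate(fields):\n",
"        if tok == 'src' and i + 1 < len(fields):\n",
"            return fields[i + 1]\n",
"    return ''\n",
"\n",
"sample = '1.1.1.1 via 10.0.0.1 dev eth0 src 10.0.0.5 uid 1000'\n",
"print(src_ip(sample))  # 10.0.0.5\n",
"```\n"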
] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": "import subprocess, os\n\ndef detect_internal_ip():\n \"\"\"Detect internal IP via ip route (same method as dev-profile.sh).\"\"\"\n try:\n out = subprocess.run(\n [\"bash\", \"-c\", \"ip route get 1.1.1.1 | awk '/src/ {for (i=1;i<=NF;i++) if ($i==\\\"src\\\") print $(i+1)}'\"],\n capture_output=True, text=True, timeout=5\n )\n return out.stdout.strip()\n except Exception:\n return \"\"\n\ndef detect_external_ip():\n \"\"\"Detect external IP via public service.\"\"\"\n for cmd in [\"curl -s --max-time 5 ifconfig.me\", \"curl -s --max-time 5 icanhazip.com\"]:\n try:\n out = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=10)\n ip = out.stdout.strip()\n if ip:\n return ip\n except Exception:\n continue\n return \"\"\n\ndef read_etc_environment():\n \"\"\"Read key=value pairs from /etc/environment (Brev sets BREV_ENV_ID there).\"\"\"\n env = {}\n try:\n with open(\"/etc/environment\") as f:\n for line in f:\n line = line.strip()\n if \"=\" in line and not line.startswith(\"#\"):\n key, _, value = line.partition(\"=\")\n env[key.strip()] = value.strip().strip('\"')\n except FileNotFoundError:\n pass\n return env\n\nHOST_IP = HOST_IP_OVERRIDE or detect_internal_ip()\nEXTERNAL_IP = EXTERNAL_IP_OVERRIDE or detect_external_ip()\n\nprint(f\"Internal IP (HOST_IP): {HOST_IP}\")\nprint(f\"External IP: {EXTERNAL_IP}\")\n\nif HOST_IP == EXTERNAL_IP:\n print(\"\\nInternal == External (direct connection, no NAT)\")\nelse:\n print(\"\\nNAT detected — internal and external IPs differ.\")\n print(\"The deploy script will set EXTERNAL_IP automatically.\")\n\nif not HOST_IP:\n print(\"\\nWARNING: Could not detect internal IP. Set HOST_IP_OVERRIDE in Section 1.\")\nif not EXTERNAL_IP:\n print(\"\\nWARNING: Could not detect external IP. 
Set EXTERNAL_IP_OVERRIDE in Section 1.\")\n\n# --- Brev Secure Links ---\n# On Brev, all browser-facing traffic routes through an nginx reverse proxy\n# on a single port (default 7777). This avoids CORS issues with Cloudflare\n# Access when each port gets its own hostname.\n# Check os.environ first, then fall back to /etc/environment (Jupyter kernels\n# may not inherit /etc/environment depending on how the notebook server starts).\n_etc_env = read_etc_environment()\nBREV_ENV_ID = os.environ.get(\"BREV_ENV_ID\") or _etc_env.get(\"BREV_ENV_ID\", \"\")\nif BREV_ENV_ID:\n # Ensure it's in os.environ so dev-profile.sh picks it up\n os.environ[\"BREV_ENV_ID\"] = BREV_ENV_ID\n proxy_port = os.environ.get(\"PROXY_PORT\", \"7777\")\n # Brev launchables create secure links with a \"0\" suffix on the port name\n # (e.g. port 7777 → \"77770-xxx.brevlab.com\"). Set BREV_LINK_PREFIX to\n # override if your setup differs (e.g. manually created links use \"7777\").\n brev_link_prefix = os.environ.get(\"BREV_LINK_PREFIX\", f\"{proxy_port}0\")\n os.environ[\"BREV_LINK_PREFIX\"] = brev_link_prefix\n brev_ui_url = f\"https://{brev_link_prefix}-{BREV_ENV_ID}.brevlab.com\"\n print(f\"\\n=== Brev Environment Detected ===\")\n print(f\" BREV_ENV_ID: {BREV_ENV_ID}\")\n print(f\" Secure link prefix: {brev_link_prefix} (set BREV_LINK_PREFIX to override)\")\n print(f\" All browser-facing URLs route through nginx proxy (port {proxy_port})\")\n print(f\" UI will be available at: {brev_ui_url}\")\nelse:\n BREV_ENV_ID = \"\" # ensure defined for later cells" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 8. Deploy Profile\n", "\n", "This is the main deployment cell. 
It runs `dev-profile.sh up` with all the configuration from Section 1 and the network settings from Section 7.\n", "\n", "This will:\n", "- Generate environment files\n", "- Download required models from NGC\n", "- Pull and build Docker images\n", "- Start all containers\n", "\n", "**This cell takes 10-30 minutes** depending on network speed and whether images are cached.\n", "\n", "The cell shows a live progress summary. Full output is captured to `~/deploy_vss.log` — if something fails, check that file for details." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "import subprocess, os, re, time, datetime\n", "from IPython.display import display, clear_output, HTML\n", "\n", "LOG_FILE = os.path.expanduser(\"~/deploy_vss.log\")\n", "\n", "# Build the dev-profile.sh command\n", "# NGC_CLI_API_KEY is passed via environment (no longer a CLI flag)\n", "cmd = [\n", " \"bash\", os.path.join(SCRIPT_DIR, \"dev-profile.sh\"), \"up\",\n", " \"--profile\", PROFILE,\n", " \"--hardware-profile\", HARDWARE_PROFILE,\n", " \"--host-ip\", HOST_IP,\n", "]\n", "\n", "if EXTERNAL_IP and EXTERNAL_IP != HOST_IP:\n", " cmd += [\"--external-ip\", EXTERNAL_IP]\n", "\n", "if USE_REMOTE_LLM:\n", " cmd += [\"--use-remote-llm\"]\n", "\n", "if USE_REMOTE_VLM:\n", " cmd += [\"--use-remote-vlm\"]\n", "\n", "if PROFILE == \"alerts\":\n", " cmd += [\"--mode\", ALERTS_MODE]\n", "\n", "# Print the command\n", "display_cmd = \" \".join(cmd)\n", "print(f\"Command: {display_cmd}\")\n", "print(f\"Full log: {LOG_FILE}\\n\")\n", "\n", "# --- Phase detection and filtering ---\n", "\n", "# Lines matching these patterns are noise — suppress them\n", "SUPPRESS_PATTERNS = [\n", " re.compile(r\"^(\\s*[\\u2800-\\u28FF]|⠋|⠙|⠹|⠸|⠴|⠦|⠧|⠏)\"), # NGC spinner frames\n", " re.compile(r\"^\\s*M\\u2026|^\\s*$\"), # Truncated NGC progress fragments\n", " re.compile(r\"^\\s*#\\d+\\s+sha256:\"), # Docker buildkit layer download/extract 
progress\n", " re.compile(r\"^\\s*#\\d+\\s+extracting\\s\"), # Docker buildkit layer extraction\n", " re.compile(r\"^\\s*#\\d+\\s+\\.\\.\\.\"), # Docker buildkit continuation\n", " re.compile(r\"^time=.*level=warning\"), # Docker compose unset variable warnings\n", " re.compile(r\"^WARNING! Using --password\"), # Docker login warning\n", " re.compile(r\"Login Succeeded\"), # Docker login success (we print our own)\n", " re.compile(r\"^\\s*Getting files to download\"), # NGC download preamble\n", " re.compile(r\"^\\s*━\"), # NGC progress bars\n", " re.compile(r\"^\\s*[0-9a-f]{12}\\s+(Downloading|Extracting|Waiting|Verifying|Pull complete)\"), # Docker layer progress\n", "]\n", "\n", "# Lines matching these indicate phase transitions — always show\n", "PHASE_PATTERNS = [\n", " (re.compile(r\"\\[INFO\\] Generating environment\"), \"Generating environment\"),\n", " (re.compile(r\"\\[INFO\\] Downloading.*models\"), \"Downloading models from NGC\"),\n", " (re.compile(r\"\\[INFO\\].*models downloaded\"), \"Models downloaded\"),\n", " (re.compile(r\"\\[INFO\\] Logging into nvcr\"), \"Docker login\"),\n", " (re.compile(r\"\\[INFO\\] Starting docker compose\"), \"Starting Docker Compose\"),\n", " (re.compile(r\"\\[INFO\\] State up completed\"), \"Deployment complete\"),\n", "]\n", "\n", "# Image pull tracking — service-level \"Pulling \" / \"Pulled \"\n", "PULLING_RE = re.compile(r\"^\\s*Pulling\\s+(\\S+)\")\n", "PULLED_RE = re.compile(r\"^\\s*Pulled\\s+(\\S+)\")\n", "\n", "# Image build tracking — \"#N [service-name step/total] COMMAND\" / \"#N DONE Ns\"\n", "BUILD_STEP_RE = re.compile(r\"^\\s*#\\d+\\s+\\[(\\S+)\\s+(\\d+/\\d+)\\]\")\n", "BUILD_DONE_RE = re.compile(r\"^\\s*#\\d+\\s+DONE\\s+[\\d.]+s\")\n", "IMAGE_BUILT_RE = re.compile(r\"^\\s*Image\\s+(\\S+)\\s+Built\")\n", "\n", "# Container lifecycle — track creating/starting/healthy\n", "CONTAINER_RE = re.compile(r\"^\\s*Container\\s+(\\S+)\\s+(Creating|Created|Starting|Started|Healthy|Waiting|Exited.*)\")\n", "\n", 
"phases_seen = []\n", "images_pulling = set() # images we've seen \"Pulling\" for\n", "images_pulled = set() # images we've seen \"Pulled\" for\n", "builds = {} # service -> \"step/total\" for active builds\n", "builds_done = set() # services that finished building\n", "containers = {}\n", "errors = []\n", "start_time = time.time()\n", "\n", "def elapsed():\n", " s = int(time.time() - start_time)\n", " return f\"{s // 60}m {s % 60:02d}s\"\n", "\n", "def print_status():\n", " clear_output(wait=True)\n", " print(f\"Command: {display_cmd}\")\n", " print(f\"Full log: {LOG_FILE}\\n\")\n", "\n", " # Phases\n", " for p in phases_seen:\n", " print(f\" [done] {p}\")\n", " if phases_seen:\n", " print()\n", "\n", " # Image pull progress\n", " if images_pulling:\n", " total = len(images_pulling)\n", " done = len(images_pulled)\n", " if done < total:\n", " still_pulling = sorted(images_pulling - images_pulled)\n", " print(f\" Pulling images: {done}/{total} complete ({elapsed()})\")\n", " for img in still_pulling:\n", " print(f\" {img:<45s} pulling...\")\n", " print()\n", " else:\n", " print(f\" Pulling images: {total}/{total} complete\\n\")\n", "\n", " # Image build progress\n", " active_builds = {s: step for s, step in builds.items() if s not in builds_done}\n", " if builds:\n", " done_count = len(builds_done)\n", " total_count = len(builds)\n", " if active_builds:\n", " print(f\" Building images: {done_count}/{total_count} complete ({elapsed()})\")\n", " for svc, step in sorted(active_builds.items()):\n", " print(f\" {svc:<45s} [{step}]\")\n", " print()\n", " else:\n", " print(f\" Building images: {total_count}/{total_count} complete\\n\")\n", "\n", " # Container summary\n", " if containers:\n", " healthy = sum(1 for s in containers.values() if s == \"Healthy\")\n", " started = sum(1 for s in containers.values() if s in (\"Started\", \"Healthy\"))\n", " total = len(containers)\n", " print(f\" Containers: {started}/{total} started, {healthy}/{total} healthy 
({elapsed()})\")\n", "\n", " # Show containers that aren't healthy yet\n", " pending = {n: s for n, s in containers.items() if s != \"Healthy\" and s not in (\"Exited\",)}\n", " if pending:\n", " # Only show non-trivial pending (skip init containers that exited)\n", " waiting = {n: s for n, s in pending.items() if \"Exited\" not in s}\n", " if waiting:\n", " print()\n", " for name, status in sorted(waiting.items()):\n", " print(f\" {name:<45s} {status}\")\n", " print()\n", "\n", " # Errors\n", " for e in errors:\n", " print(f\" ERROR: {e}\")\n", "\n", "# Run the process\n", "process = subprocess.Popen(\n", " cmd,\n", " stdout=subprocess.PIPE,\n", " stderr=subprocess.STDOUT,\n", " text=True,\n", " bufsize=1,\n", " cwd=SCRIPT_DIR,\n", " env={**os.environ, \"NGC_CLI_API_KEY\": NGC_CLI_API_KEY}\n", ")\n", "\n", "last_refresh = 0\n", "with open(LOG_FILE, \"w\") as log:\n", " for line in process.stdout:\n", " log.write(line)\n", " log.flush()\n", " stripped = line.rstrip()\n", "\n", " # Capture errors\n", " if \"[ERROR]\" in stripped:\n", " errors.append(stripped)\n", " print_status()\n", " continue\n", "\n", " # Track image pulls (before suppression check)\n", " m_pulling = PULLING_RE.match(stripped)\n", " if m_pulling:\n", " images_pulling.add(m_pulling.group(1))\n", " now = time.time()\n", " if now - last_refresh > 2:\n", " last_refresh = now\n", " print_status()\n", " continue\n", "\n", " m_pulled = PULLED_RE.match(stripped)\n", " if m_pulled:\n", " images_pulled.add(m_pulled.group(1))\n", " now = time.time()\n", " if now - last_refresh > 2:\n", " last_refresh = now\n", " print_status()\n", " continue\n", "\n", " # Track image builds\n", " m_build = BUILD_STEP_RE.match(stripped)\n", " if m_build:\n", " svc, step = m_build.group(1), m_build.group(2)\n", " builds[svc] = step\n", " now = time.time()\n", " if now - last_refresh > 2:\n", " last_refresh = now\n", " print_status()\n", " continue\n", "\n", " m_built = IMAGE_BUILT_RE.match(stripped)\n", " if m_built:\n", " 
svc = m_built.group(1)\n", " builds_done.add(svc)\n", " now = time.time()\n", " if now - last_refresh > 2:\n", " last_refresh = now\n", " print_status()\n", " continue\n", "\n", " # Suppress noise\n", " if any(p.search(stripped) for p in SUPPRESS_PATTERNS):\n", " continue\n", "\n", " # Detect phase transitions\n", " for pattern, label in PHASE_PATTERNS:\n", " if pattern.search(stripped):\n", " if label not in phases_seen:\n", " phases_seen.append(label)\n", " print_status()\n", " break\n", "\n", " # Track container lifecycle\n", " m = CONTAINER_RE.match(stripped)\n", " if m:\n", " name, status = m.group(1), m.group(2)\n", " # Normalize \"Exited (0) ...\" to \"Exited\"\n", " if status.startswith(\"Exited\"):\n", " status = \"Exited\"\n", " containers[name] = status\n", " # Refresh display at most every 2 seconds to avoid flicker\n", " now = time.time()\n", " if now - last_refresh > 2:\n", " last_refresh = now\n", " print_status()\n", "\n", "process.wait()\n", "\n", "# Final status\n", "print_status()\n", "print(\"=\" * 50)\n", "if process.returncode == 0 and not errors:\n", " print(f\"Deployment complete in {elapsed()}.\")\n", "else:\n", " print(f\"\\nDeployment FAILED (exit code {process.returncode}).\")\n", " if errors:\n", " print(f\"\\n{len(errors)} error(s) found — see above.\")\n", " print(f\"\\nFull log: {LOG_FILE}\")\n", " print(f\" View with: cat {LOG_FILE}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9. Verify Deployment\n", "\n", "Check that all containers are running and core services are healthy. The health checks poll with retries since some services take a few minutes to fully start." 
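] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before running the full check, you can spot-check a single endpoint with a minimal sketch of the same poll-with-retries pattern the verification cell uses. The URL, retry count, and interval below are illustrative assumptions, not values required by the deployment:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time, urllib.request\n", "\n", "def wait_for(url, retries=5, interval=2):\n", "    # Poll url until it answers with any HTTP status code; give up after `retries` tries.\n", "    for _ in range(retries):\n", "        try:\n", "            return urllib.request.urlopen(url, timeout=5).getcode()\n", "        except Exception:\n", "            time.sleep(interval)\n", "    return None\n", "\n", "# Example (hypothetical — adjust the port to your proxy port from Section 1):\n", "# wait_for(\"http://localhost:7777/health\")" 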
] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "import subprocess, time, urllib.request, urllib.error, os\n", "\n", "# Show running containers\n", "print(\"=== Running Containers ===\")\n", "subprocess.run([\"docker\", \"ps\", \"--format\", \"table {{.Names}}\\t{{.Status}}\\t{{.Ports}}\"])\n", "print()\n", "\n", "# Determine proxy port\n", "proxy_port = os.environ.get(\"PROXY_PORT\", \"7777\")\n", "\n", "# Health check endpoints by profile\n", "checks = [\n", " (\"Proxy\", f\"http://localhost:{proxy_port}/health\"),\n", " (\"Agent\", \"http://localhost:8000/health\"),\n", " (\"VST\", \"http://localhost:30888/vst/api/v1/sensor/list\"),\n", " (\"UI\", \"http://localhost:3000\"),\n", "]\n", "if PROFILE in (\"search\", \"alerts\", \"lvs\"):\n", " checks += [\n", " (\"Elasticsearch\", \"http://localhost:9200\"),\n", " (\"Kibana\", \"http://localhost:5601/api/status\"),\n", " ]\n", "if PROFILE == \"alerts\":\n", " checks.append((\"Video Analytics API\", \"http://localhost:8081/livez\"))\n", "\n", "# Poll with retries\n", "MAX_RETRIES = 30\n", "RETRY_INTERVAL = 10\n", "results = {}\n", "\n", "print(f\"=== Health Checks (up to {MAX_RETRIES * RETRY_INTERVAL}s) ===\")\n", "pending = list(checks)\n", "\n", "for attempt in range(1, MAX_RETRIES + 1):\n", " still_pending = []\n", " for name, url in pending:\n", " try:\n", " req = urllib.request.urlopen(url, timeout=5)\n", " results[name] = f\"OK ({req.getcode()})\"\n", " except Exception:\n", " still_pending.append((name, url))\n", " pending = still_pending\n", " if not pending:\n", " break\n", " waiting = \", \".join(n for n, _ in pending)\n", " print(f\" [{attempt}/{MAX_RETRIES}] Waiting for: {waiting}\")\n", " time.sleep(RETRY_INTERVAL)\n", "\n", "for name, url in pending:\n", " results[name] = \"FAILED\"\n", "\n", "print()\n", "all_ok = True\n", "for name, status in results.items():\n", " marker = \"OK\" if \"OK\" in status else 
\"FAIL\"\n", " if marker == \"FAIL\":\n", " all_ok = False\n", " print(f\" {name:.<30s} {status}\")\n", "\n", "# Check perception container status (no HTTP health endpoint — DeepStream pipeline)\n", "if PROFILE == \"search\":\n", " print()\n", " r = subprocess.run(\n", " [\"docker\", \"ps\", \"--filter\", \"name=perception\", \"--format\", \"{{.Names}}: {{.Status}}\"],\n", " capture_output=True, text=True\n", " )\n", " if r.stdout.strip():\n", " print(\" Perception containers:\")\n", " for line in r.stdout.strip().splitlines():\n", " print(f\" {line}\")\n", " else:\n", " print(\" WARNING: No perception containers found (required for search profile).\")\n", " all_ok = False\n", "\n", "print()\n", "if all_ok:\n", " print(\"All services healthy.\")\n", "else:\n", " print(\"Some services failed to start. Check container logs:\")\n", " print(\" docker compose -p mdx logs <service>\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 10. Access the UI\n", "\n", "Once deployment is verified, open the VSS UI in your browser. The front-end opens to the chat interface, where you can interact with the agent; it should look like this:\n", "\n", "![VSS UI Chat Interface](images/vss_ui_chat.png)\n", "\n", "Run the cell below to generate your VSS UI URL.\n", "\n", "**On Brev:** All browser-facing traffic routes through an nginx reverse proxy on a single port (default 7777). Create **one** Brev secure link for port 7777 in the dashboard — no individual port forwarding needed. For the **search**, **alerts**, and **lvs** profiles, you will also need separate secure links for Kibana and other services (see table below).\n", "\n", "**On other cloud providers:** Depending on your CSP's firewall and security group configuration, you may need to expose or forward ports to access the UI and other services from your browser. 
The following ports are used by VSS:\n", "\n", "| Port | Service | Profiles | Brev Secure Link |\n", "|------|---------|----------|------------------|\n", "| 7777 | Nginx proxy (consolidates UI, Agent, VST) | all | Required (primary) |\n", "| 3000 | VSS UI | all | Not needed (behind proxy) |\n", "| 8000 | VSS Agent API | all | Not needed (behind proxy) |\n", "| 30888 | VST (Video Storage Toolkit) | all | Not needed (behind proxy) |\n", "| 5601 | Kibana | search, alerts, lvs | Required (separate link) |\n", "| 6006 | Phoenix (LLM tracing/observability) | all | Optional |\n", "| 9200 | Elasticsearch | search, alerts, lvs | Not needed |\n", "| 8081 | Video Analytics API | alerts | Not needed |\n", "| 31000 | nvstreamer (WebRTC live view) | search, alerts | Required for live camera view |\n", "| 8554 | RTSP (if using test stream) | alerts | Not needed |\n", "\n", "**Brev summary:** For the **base** profile, create 1 secure link (port 7777). For **search** or **alerts**, create secure links for ports 7777, 5601, and 31000. For **lvs**, create secure links for ports 7777 and 5601 (31000 is not used). Port 6006 (Phoenix) is optional for debugging.\n", "\n", "If direct access is not possible, use SSH port forwarding or your CSP's port sharing/tunneling feature." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": "import os\n\nif BREV_ENV_ID:\n proxy_port = os.environ.get(\"PROXY_PORT\", \"7777\")\n brev_link_prefix = os.environ.get(\"BREV_LINK_PREFIX\", f\"{proxy_port}0\")\n ui_url = f\"https://{brev_link_prefix}-{BREV_ENV_ID}.brevlab.com\"\n print(f\"VSS UI (via Brev secure link): {ui_url}\")\n print()\n print(\"Setup:\")\n print(f\" 1. Ensure a Brev secure link exists for port {proxy_port}\")\n print(f\" 2. 
Open: {ui_url}\")\n print()\n print(\"All services (Agent API, VST, UI) are consolidated behind the proxy.\")\n print(\"No individual port forwarding is needed.\")\n if PROFILE in (\"search\", \"alerts\", \"lvs\"):\n print()\n print(f\"=== Additional Secure Links ({PROFILE}) ===\")\n print(\"Create these additional secure links in the Brev dashboard:\")\n print()\n kibana_url = f\"https://56010-{BREV_ENV_ID}.brevlab.com\"\n print(f\" Kibana (port 5601): {kibana_url}\")\n if PROFILE == \"alerts\":\n nvstreamer_url = f\"https://310000-{BREV_ENV_ID}.brevlab.com\"\n print(f\" nvstreamer (port 31000): {nvstreamer_url}\")\n phoenix_url = f\"https://60060-{BREV_ENV_ID}.brevlab.com\"\n print(f\" Phoenix (port 6006): {phoenix_url} (optional, for LLM tracing)\")\nelse:\n ui_url = f\"http://{EXTERNAL_IP or HOST_IP}:3000\"\n print(f\"VSS UI: {ui_url}\")\n print()\n print(\"If the URL is not directly accessible, use one of these methods:\")\n print()\n print(\" SSH port forwarding (works everywhere):\")\n print(f\" ssh -L 3000:localhost:3000 <user>@{EXTERNAL_IP or HOST_IP}\")\n print(f\" Then open: http://localhost:3000\")\n print()\n print(\" VSCode Remote SSH:\")\n print(\" Connect to the instance via Remote-SSH; ports forward automatically.\")\n print()\n if PROFILE in (\"search\", \"alerts\", \"lvs\"):\n print(f\" Kibana dashboard: http://{EXTERNAL_IP or HOST_IP}:5601\")\n if PROFILE == \"alerts\":\n print(f\" nvstreamer (live view): http://{EXTERNAL_IP or HOST_IP}:31000\")\n print(f\" Phoenix (LLM tracing): http://{EXTERNAL_IP or HOST_IP}:6006\")" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 11. 
Next Steps\n", "\n", "Once you can access the VSS front-end, continue with the QuickStart example, which walks through uploading a video, engaging in Q&A, and generating a report: [Quickstart - Upload a Video](https://docs.nvidia.com/vss/3.1.0/quickstart.html#step-2-upload-a-video)\n", "\n", "You can either use your own videos for these examples or download the [VSS Sample Data from NGC](https://docs.nvidia.com/vss/3.1.0/quickstart.html#download-sample-data-from-ngc).\n", "\n", "Once you've gone through the QuickStart example, you can follow **Section 12** in this notebook to deploy different [Agent Workflows](https://docs.nvidia.com/vss/3.1.0/adding-workflows.html).\n", "\n", "**Section 13** provides instructions on stopping the deployment." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 12. Profile-Specific Next Steps\n", "\n", "Quick-start instructions for your deployed profile." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "if PROFILE == \"base\":\n", " print(\"\"\"=== Base Profile — Quick Start ===\n", "\n", "1. Open the VSS UI (see Section 10 for the URL).\n", "\n", "2. Upload a video using the \"Upload\" button in the sidebar.\n", " Supported formats: MP4, AVI, MOV. Wait for the upload to complete.\n", "\n", "3. Once uploaded, select the video and start chatting about it.\n", " Try asking: \"What is happening in this video?\"\n", "\n", "4. The agent uses the VLM to analyze video frames and answers\n", " questions about the content.\n", "\"\"\")\n", "\n", "elif PROFILE == \"search\":\n", " print(\"\"\"=== Search Profile — Quick Start ===\n", "\n", "1. Open the VSS UI and switch to the \"Search\" tab.\n", "\n", "2. Upload a video using the upload button. The video will be\n", " split into chunks and embedded for semantic search. This\n", " takes a few minutes depending on video length.\n", "\n", "3. 
Once processing completes, use the search bar to find moments:\n", " - \"person walking\"\n", " - \"red car\"\n", " - \"someone carrying a box\"\n", "\n", "4. Click a search result to play the matching video clip.\n", "\n", "5. You can also chat about uploaded videos in the \"Chat\" tab.\n", "\n", "Note: The perception-2d container must be running for the embedding\n", "pipeline. Check with: docker ps | grep perception\n", "\"\"\")\n", "\n", "elif PROFILE == \"alerts\":\n", " print(\"\"\"=== Alerts Profile — Quick Start ===\n", "\n", "1. Open the VSS UI. The alerts profile needs an RTSP camera stream\n", " to generate detections and alerts.\n", "\n", "2. Add a camera sensor:\n", " - Go to the \"Sensors\" or camera management section in the UI\n", " - Add your RTSP stream URL (e.g. rtsp://IP:8554/stream)\n", " - The perception pipeline will begin analyzing the stream\n", "\n", "3. View live detections:\n", " - Open the \"Alerts\" tab to see real-time alerts as they're generated\n", " - Click an alert to view the video clip with bounding boxes\n", "\n", "4. Open the \"Dashboard\" tab to see the Kibana analytics dashboard\n", " with detection statistics, timelines, and heatmaps.\n", "\n", "5. Use the \"Chat\" tab to ask questions about detected events:\n", " - \"What alerts happened in the last hour?\"\n", " - \"How many people were detected today?\"\n", "\"\"\")\n", "\n", "elif PROFILE == \"lvs\":\n", " print(\"\"\"=== LVS Profile — Quick Start ===\n", "\n", "1. Open the VSS UI and upload a video via the sidebar.\n", "\n", "2. Once uploaded, use the chat to request a report:\n", " - \"Generate a report for my_video.mp4\"\n", " - \"Summarize what happens in this video\"\n", "\n", "3. The agent analyzes the full video and generates a structured\n", " report with timeline summaries, detected events, and analytics.\n", "\n", "4. Reports are saved and accessible via the \"Reports\" section.\n", "\"\"\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 13. 
Stop Deployment\n", "\n", "Stop all containers **without deleting data or volumes**. Use this when you want to:\n", "- Free up GPU/memory resources temporarily\n", "- Change to a different profile (update `PROFILE` in Section 1, then re-run from Section 8)\n", "- Restart the deployment later by re-running Section 8" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import subprocess, os, glob\n", "\n", "# Find the generated.env file for the current profile\n", "env_file = None\n", "gen_pattern = os.path.join(REPO_DIR, \"deployments\", \"developer-workflow\", f\"dev-profile-{PROFILE}\", \"generated.env\")\n", "matches = glob.glob(gen_pattern)\n", "if matches:\n", " env_file = matches[0]\n", "\n", "if not env_file or not os.path.isfile(env_file):\n", " print(f\"ERROR: Could not find generated.env at {gen_pattern}\")\n", " print(\"Has the deployment been run at least once (Section 8)?\")\n", " raise FileNotFoundError(gen_pattern)\n", "\n", "print(f\"Using env file: {env_file}\")\n", "print(\"Stopping all VSS containers (preserving data and volumes)...\\n\")\n", "\n", "result = subprocess.run(\n", " [\"docker\", \"compose\", \"--env-file\", env_file,\n", " \"-f\", os.path.join(REPO_DIR, \"deployments\", \"compose.yml\"),\n", " \"-p\", \"mdx\", \"stop\"],\n", " capture_output=True, text=True,\n", " cwd=os.path.join(REPO_DIR, \"deployments\")\n", ")\n", "print(result.stdout)\n", "if result.stderr:\n", " # Filter out the harmless \"variable is not set\" warnings\n", " for line in result.stderr.splitlines():\n", " if \"is not set\" not in line:\n", " print(line)\n", "\n", "if result.returncode == 0:\n", " print(\"\\nAll containers stopped. Re-run Section 8 to start them again.\")\n", "else:\n", " print(f\"\\nStop exited with code {result.returncode}.\")\n", " print(\"You can also stop manually: docker compose -p mdx stop\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 14. 
Teardown\n", "\n", "Stop all containers and **delete all data** (volumes, models, data directory). Run the cell below when you want to completely remove the deployment.\n", "\n", "This runs `dev-profile.sh down` which stops containers, removes networks, and deletes the data directory. If Docker storage was moved to NVMe (Section 4), volume cleanup requires an extra step because Docker can't remove volumes whose data lives outside its data-root (the symlink trick). The cell handles this automatically." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "import subprocess, os, json\n", "\n", "# --- Run dev-profile.sh down ---\n", "print(\"Tearing down VSS deployment...\")\n", "process = subprocess.Popen(\n", " [\"bash\", os.path.join(SCRIPT_DIR, \"dev-profile.sh\"), \"down\"],\n", " stdout=subprocess.PIPE,\n", " stderr=subprocess.STDOUT,\n", " text=True,\n", " cwd=SCRIPT_DIR\n", ")\n", "for line in process.stdout:\n", " print(line, end=\"\")\n", "process.wait()\n", "print(f\"\\nTeardown exit code: {process.returncode}\")\n", "\n", "# --- Clean up stuck volumes ---\n", "# When Docker's data-root is on NVMe but volumes are symlinked back to root,\n", "# `docker volume rm` fails with \"unable to remove a directory outside of the\n", "# local volume root\". Fall back to sudo rm for those, then restart Docker.\n", "\n", "result = subprocess.run([\"docker\", \"volume\", \"ls\", \"-q\"], capture_output=True, text=True)\n", "leftover = result.stdout.strip().splitlines()\n", "\n", "if leftover:\n", " print(f\"\\n{len(leftover)} leftover volume(s). 
Cleaning up...\")\n", " need_restart = False\n", " for vol in leftover:\n", " r = subprocess.run([\"docker\", \"volume\", \"rm\", \"-f\", vol],\n", " capture_output=True, text=True)\n", " if r.returncode == 0:\n", " print(f\" removed {vol}\")\n", " else:\n", " # Symlinked volume — remove directly from /var/lib/docker/volumes\n", " vol_path = f\"/var/lib/docker/volumes/{vol}\"\n", " r2 = subprocess.run([\"sudo\", \"rm\", \"-rf\", vol_path],\n", " capture_output=True, text=True)\n", " if r2.returncode == 0:\n", " print(f\" rm'd {vol}\")\n", " need_restart = True\n", " else:\n", " print(f\" FAILED {vol}: {r2.stderr.strip()}\")\n", "\n", " if need_restart:\n", " print(\"\\n Restarting Docker to clear volume metadata...\")\n", " subprocess.run([\"sudo\", \"systemctl\", \"restart\", \"docker\"],\n", " capture_output=True, check=True)\n", "\n", " # Verify\n", " result = subprocess.run([\"docker\", \"volume\", \"ls\", \"-q\"], capture_output=True, text=True)\n", " remaining = result.stdout.strip().splitlines()\n", " if remaining:\n", " print(f\"\\n {len(remaining)} volume(s) still stuck:\")\n", " for v in remaining:\n", " print(f\" {v}\")\n", " else:\n", " print(\"\\nAll volumes cleaned up.\")\n", "else:\n", " print(\"\\nAll volumes cleaned up.\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# To remove the deployment repo from disk:\n", "# import shutil\n", "# shutil.rmtree(REPO_DIR)\n", "# print(f\"Removed {REPO_DIR}\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 4 }