# Privacy and Data Flow

This page documents what NBI sends to external services, when, and how administrators can restrict it. NBI is a per-user tool that runs inside your Jupyter Server process; it has no central server of its own and collects no telemetry by default.

## What NBI sends, by provider

The table below describes what each LLM provider receives **when you actively use a feature** (chat message, inline completion, agent action). An idle JupyterLab does not contact the provider.

| Provider | What is sent | When | Destination |
| --- | --- | --- | --- |
| **GitHub Copilot** | Prompt, surrounding cell source, attached files (when you click _attach_) | Per request (chat) and as you type (inline complete) | `api.githubcopilot.com`, `api.github.com` (auth) |
| **OpenAI-compatible** | Prompt, surrounding cell source, attached files | Per request and per inline-completion request | The Base URL you configured (`api.openai.com` by default) |
| **LiteLLM-compatible** | Same as OpenAI-compatible; the LiteLLM proxy forwards to the upstream model you configured | Per request | The Base URL of your LiteLLM proxy |
| **Ollama (local)** | Prompt, surrounding cell source, attached files | Per request | Localhost (or the host you configured); **no external network** |
| **Anthropic API** (Claude mode) | Prompt, surrounding cell source, attached files | Per inline-chat or auto-complete request | `api.anthropic.com` (or your configured Base URL) |
| **Claude Code CLI** (Claude mode) | Prompt, working-directory file reads requested by Claude, shell-command output for tools Claude invokes | Per agent turn in the chat panel | Whatever the Claude Code CLI is configured to talk to (typically `api.anthropic.com`) |

### Cell outputs are included when the cell is attached

NBI does **not** automatically include rendered cell outputs in every prompt. Outputs go out only when:

- You attach a notebook or cell explicitly via the _attach files_ UI.
- The active context references a notebook and the agent (or inline chat) reads its source: the `.ipynb` JSON includes any saved outputs.

If your cells contain sensitive outputs (PHI, PII, secrets), clear them before invoking AI features, or use a local-only provider (Ollama).

Inline completion is keystroke-driven and sends only the cell source; it does not transmit unrelated cells or outputs.

## Egress allowlist

Hosts NBI may contact, depending on which features are enabled:

| Host | Purpose |
| --- | --- |
| `api.githubcopilot.com` | GitHub Copilot chat and inline completion |
| `api.github.com` | GitHub Copilot device-flow login; managed-skills manifest fetches when hosted on github.com; skill imports |
| `github.com`, `codeload.github.com` | Skill tarball downloads (Import from GitHub and the managed-skills reconciler) |
| `raw.githubusercontent.com` | Manifest fetches when `NBI_SKILLS_MANIFEST` points at a `raw.githubusercontent.com` URL |
| `api.anthropic.com` | Anthropic API for Claude-mode inline chat and auto-complete; also the default destination of the Claude Code CLI |
| `api.openai.com` | OpenAI-compatible provider (default Base URL) |
| Your configured Base URL | OpenAI-compatible, LiteLLM-compatible, or Claude when pointed at a self-hosted endpoint |
| `localhost:11434` (or your Ollama host) | Ollama local model serving |
| `registry.npmjs.org` and configured npm mirrors | Only if MCP servers are configured to launch via `npx -y`; `npx` fetches the package on first run |

For the configurable destinations above (Base URLs, Ollama host, MCP `npx` packages), the destination is whatever you or your admin set. There is no other implicit network activity.

For air-gapped or egress-restricted environments, see [`docs/admin-guide.md`](docs/admin-guide.md#air-gap-deployment).

## Data NBI stores locally

| Path | Contents |
| --- | --- |
| `~/.jupyter/nbi/config.json` | Provider selection, model choices, API keys (plaintext), MCP server config |
| `~/.jupyter/nbi/user-data.json` | Encrypted GitHub Copilot token (when "remember login" is enabled) |
| `~/.jupyter/nbi/rules/` | Your ruleset markdown files |
| `~/.jupyter/nbi/mcp.json` | MCP server config (if you used the file-based config) |
| `~/.claude/skills/` | User-scope Claude skills |
| `/.claude/skills/` | Project-scope Claude skills |
| `~/.claude/projects/` | Claude Code session transcripts (managed by Claude CLI, not NBI) |

> Treat `~/.jupyter/nbi/config.json` and `~/.jupyter/nbi/user-data.json` as secrets. They contain your API keys and (encrypted) GitHub token. Do not commit them to git, share them, or sync them across users. If a key leaks, rotate it at the provider immediately.

The encrypted GitHub token uses a default password (`nbi-access-token-password`) unless you set `NBI_GH_ACCESS_TOKEN_PASSWORD`. The default is **shared across installs** and provides obfuscation, not real protection. Set a custom password before enabling "remember login" on any shared or multi-tenant system. NBI logs a per-process WARNING when the default is in use and escalates the message when `~/.jupyter/nbi/` is group/other-accessible. Operators on shared filesystems can set `NBI_REFUSE_DEFAULT_TOKEN_PASSWORD_ON_SHARED_FS=1` to refuse the write entirely until a per-user password is configured, with `NBI_ALLOW_DEFAULT_TOKEN_PASSWORD=1` available as an explicit per-pod opt-out during a rollout.

## Telemetry

NBI does not collect telemetry, send analytics, or report usage.

The `enable_chat_feedback` traitlet (off by default) emits an internal `telemetry` event when a user gives thumbs-up/down feedback in chat. The event is **emitted in-process only**: nothing leaves the process unless you write a custom handler that listens for it. See [`docs/admin-guide.md`](docs/admin-guide.md#chat-feedback-event-hook).

## Reproducibility caveat

LLM outputs are non-deterministic. Pinning the model name, temperature, and seed does **not** guarantee identical output across runs: provider-side updates, load balancing, and silent model deprecation can all shift behavior. Treat AI-generated code as a draft to be reviewed, tested, and committed like any other contribution.

For research artifacts that need reproducibility, save the exact prompt, model name, and date alongside the generated output.

## Privacy-sensitive deployment recipes

For HIPAA, FedRAMP, classroom, or otherwise restricted environments:

- **Force local-only models.** Disable every cloud provider via `disabled_providers` and use Ollama. See the [HIPAA / sensitive-data preset](docs/admin-guide.md#hipaa--sensitive-data-preset) in the admin guide.
- **Restrict skill imports.** Block egress to `github.com` and serve managed skills from an internal manifest URL.
- **Disable "remember GitHub Copilot login"** for shared systems where users share home directories.
- **Pre-pull MCP servers** rather than allowing `npx -y` (which downloads from npmjs).

## Reporting privacy issues

Email `mbektasgh@outlook.com` with details. Privacy concerns are treated like security issues; see [SECURITY.md](SECURITY.md).
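
Before sending a report, it can help to check whether the conditions described under "Data NBI stores locally" apply to your own setup. The sketch below is one possible way to do that with the Python standard library. The function name `audit_nbi_dir` is hypothetical and not part of NBI, and the check is an approximation written for this page; it does not reproduce NBI's own warning logic. It assumes a POSIX filesystem where permission bits are meaningful.

```python
import os
import stat
from pathlib import Path

# Permission bits that grant any access to group or other users.
GROUP_OTHER_BITS = stat.S_IRWXG | stat.S_IRWXO


def audit_nbi_dir(nbi_dir, env=None):
    """Return a list of findings for an NBI config directory.

    Hypothetical helper, not an NBI API. `env` defaults to os.environ and
    can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    nbi_dir = Path(nbi_dir)
    findings = []

    # Without a custom password, the shared default would protect the token.
    if not env.get("NBI_GH_ACCESS_TOKEN_PASSWORD"):
        findings.append("NBI_GH_ACCESS_TOKEN_PASSWORD is not set")

    # Flag the directory itself if group/other have any access to it.
    if nbi_dir.is_dir():
        mode = stat.S_IMODE(nbi_dir.stat().st_mode)
        if mode & GROUP_OTHER_BITS:
            findings.append(f"{nbi_dir} is group/other-accessible (mode {mode:o})")

    # Flag the two files this page says to treat as secrets.
    for name in ("config.json", "user-data.json"):
        path = nbi_dir / name
        if path.is_file():
            mode = stat.S_IMODE(path.stat().st_mode)
            if mode & GROUP_OTHER_BITS:
                findings.append(f"{path} is group/other-accessible (mode {mode:o})")

    return findings
```

Calling `audit_nbi_dir(Path.home() / ".jupyter" / "nbi")` returns an empty list when none of these conditions hold. Any findings are worth resolving (for example with `chmod 700` on the directory) and worth mentioning if you do send a report.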