# 4DA Network Manifest

**This document lists every outbound network connection 4DA makes — nothing else leaves your machine.** Run Wireshark while 4DA is running and every host you see is accounted for below.

Your raw project files, source code, git history, file contents, and personal information **never leave your machine**. The single flow that transmits anything *derived* from your local content is the optional dependency-vulnerability sync, which sends only **package names** (not code) to OSV — it is disclosed explicitly in the "Conditional / opt-in" section and can be turned off.

4DA is local-first and privacy-first. There is **no telemetry, no analytics, and no crash reporting** (see "Never" at the bottom). The only 4DA-operated endpoint the app ever contacts is a user-initiated license-recovery lookup at `4da.ai`. Everything else is either a public third-party content API, a cloud service you explicitly configured (BYOK LLM / translation), or a one-time setup download.

---

## HTTP client identity

All shared outbound requests use a pooled `reqwest` client (`src-tauri/src/http_client.rs`, and `src-tauri/src/sources/mod.rs::SHARED_CLIENT`) with:

- **User-Agent:** `Mozilla/5.0 (compatible; desktop-app)` — the default for source fetching, connectivity checks, license validation, and article scraping.
- **Exceptions (purpose-built User-Agents):**
  - crates.io adapter → `4DA-Developer-OS/1.0 (https://4da.ai)` (`sources/crates_io.rs`)
  - HuggingFace / PapersWithCode / GitHub Advisory adapters → `4DA-Developer-OS/1.0` (`sources/huggingface.rs`, `sources/papers_with_code.rs`, `sources/cve.rs`)
  - OSV dependency sync → `4DA/1.0 (local-osv-mirror)` (`osv/sync.rs`)
  - Team relay (only if you join/create a team) → `4DA-TeamSync/1.0` (`http_client.rs`)
- No cookies, no login, no tracking headers are sent on any request.

---

## 1. Always-on

These run automatically while the app is open. Every source can be individually enabled/disabled in **Settings → Sources**.

### 1a. Connectivity pre-check (before every fetch cycle)

Before fetching sources, 4DA races a `HEAD` request against three targets and uses whichever responds first (`src-tauri/src/source_fetching/fetcher.rs`):

| Host | Request |
|---|---|
| `1.1.1.1` (Cloudflare) | `HEAD https://1.1.1.1/cdn-cgi/trace` |
| `dns.google` (Google DNS) | `HEAD https://dns.google/resolve?name=example.com` |
| `httpbin.org` | `HEAD https://httpbin.org/get` |

- **Trigger:** once per analysis cycle, before source fetching begins.
- **Data sent:** `HEAD` only — no body, no user data.
- **If offline:** 4DA falls back to cached content and keeps working.
- **Disable:** disabling all sources stops fetch cycles, which stops this check.

### 1b. Content source fetches

All retrieve **public** developer content. Trigger cadence is the default fetch interval per source; "deep fetch" is a larger one-time pull on first run.

| Source | Host | Endpoint(s) | Trigger | Auth / data sent |
|---|---|---|---|---|
| Hacker News | `hacker-news.firebaseio.com` | `/v0/topstories.json`, `/v0/{new,best,ask,show}stories.json` (deep), `/v0/item/{id}.json` | auto | None |
| HN fallback | `hn.algolia.com`, `hnrss.org` | `hn.algolia.com/api/v1/search?tags=front_page`, `hnrss.org/frontpage` | only if HN primary fails | None |
| Reddit | `www.reddit.com` | `/r/{subreddit}/hot.json?limit={n}` | auto | None (public JSON) |
| Reddit fallback | `www.reddit.com` | `/r/programming/.rss` | only if Reddit primary fails | None |
| arXiv | `export.arxiv.org` | `/api/query?search_query={categories}&...` | auto | None |
| arXiv fallback | `arxiv.org` | `/rss/cs.AI` | only if arXiv primary fails | None |
| GitHub | `api.github.com` | `/search/repositories?q={query}&sort=stars`, `/repos/{owner}/{repo}/readme` | auto | Unauthenticated. Query holds language names + a date filter — no user data |
| GitHub fallback | `github.com` | `/trending?since=daily` | only if GitHub primary fails | None |
| RSS feeds | 12 default hosts (below) | `GET {feed_url}` | auto | None |
| YouTube | `www.youtube.com` | `/feeds/videos.xml?channel_id={id}` (public Atom) | auto | None |
| Dev.to | `dev.to` | `/api/articles?per_page=30&top=7` (+ tag pulls on deep fetch) | auto | None |
| Lobsters | `lobste.rs` | `/hottest.json`, `/newest.json` (deep) | auto | None |
| Product Hunt | `www.producthunt.com` | `/feed` | auto | None |
| Bluesky | `public.api.bsky.app` | `/xrpc/app.bsky.feed.getFeed?feed=...whats-hot` | auto | None (public "What's Hot", no auth) |
| crates.io | `crates.io` | `/api/v1/...` | auto | None (UA `4DA-Developer-OS/1.0`) |
| npm | `registry.npmjs.org` | `/{package}` | auto | None |
| PyPI | `pypi.org` | `/pypi/{package}/json` | auto | None |
| HuggingFace | `huggingface.co` | `/api/models` | auto | None |
| PapersWithCode | `huggingface.co` | `/api/daily_papers` (PwC API now redirects here) | auto | None |
| Stack Overflow | `api.stackexchange.com` | `/2.3/questions?...&site=stackoverflow&tagged={tag}` | auto | None |
| Go modules | `index.golang.org` | `/index?limit={n}` | auto | None |

**RSS default hosts** (`sources/rss.rs`, user-customizable): `feeds.arstechnica.com`, `www.theverge.com`, `techcrunch.com`, `blog.rust-lang.org`, `engineering.fb.com`, `medium.com`, `github.blog`, `blog.cloudflare.com`, `martinfowler.com`, `simonwillison.net`, `jvns.ca`, `danluu.com`.

- **Disable any source:** Settings → Sources. A disabled source makes **zero** network calls.

### 1c. Article scraping

After fetching items, 4DA scrapes the linked article URL to extract **text only** (no images or media) for HN, Reddit link posts, Lobsters, and RSS items.

- **Hosts:** any domain linked by the sources above.
- **Data sent:** a plain `GET` with the default User-Agent. No cookies, no login.
- **Rate limit:** ~100 ms between requests; 2–10 s per-article timeout.

### 1d. Model-pricing refresh

On startup, 4DA refreshes its cost/context-window table for known LLM models (`src-tauri/src/model_registry.rs::refresh_registry`, called from `app_setup.rs`):

- **Host:** `raw.githubusercontent.com`
- **Endpoint:** `GET .../BerriAI/litellm/main/model_prices_and_context_window.json`
- **Trigger:** once at startup, fire-and-forget, **at most once per 24 h** (cached otherwise).
- **Data sent:** plain `GET`, no user data, no parameters. Falls back to bundled/cached pricing on failure.
- **Disable:** offline operation skips it silently; the bundled table is used.

---

## 2. Conditional / opt-in

None of these run unless you take an action (configure a key, join a team, or have local dependencies discovered).

### 2a. BYOK cloud LLM (relevance judging)

Only contacted when you configure an LLM API key. Default is **local** (Ollama / in-process embeddings) — with no key configured, zero LLM network calls leave the machine. (`src-tauri/src/llm.rs`, `src-tauri/src/llm_judge.rs`)

| Provider | Host | Endpoint |
|---|---|---|
| Anthropic | `api.anthropic.com` | `POST /v1/messages` |
| OpenAI | `api.openai.com` | `POST /v1/chat/completions`, `POST /v1/embeddings` |
| OpenAI-compatible (your `base_url`) | e.g. `api.groq.com`, `api.mistral.ai`, plus front-end-offered OpenRouter / Together / DeepSeek | `POST {base_url}/chat/completions` |
| Ollama (local) | `localhost:11434` (configurable) | `POST /api/chat`, `/api/embed`, `/api/embeddings`, `GET /api/version`, `/api/tags`, `POST /api/pull` |

- **What is sent:** the system prompt (scoring rubric) plus item **titles and content snippets**. Content per item is **capped at 2000 characters** (`llm_judge.rs`). **No raw project code, file contents, or git history is ever sent.**
- **Privacy control:** Settings → Privacy → `llm_content_level`.
Set to `titles_only` to send titles with **no** snippet body; default `full` sends the 2000-char-capped snippet.
- **Auth:** your key, sent as `x-api-key` (Anthropic) or `Authorization: Bearer` (OpenAI-compatible). Keys are stored only on your machine (keychain) and never sent anywhere but the provider you chose.
- **Retention (zero-retention defaults):** first-party **OpenAI** requests send `store: false`, opting out of OpenAI storing the completion for their dashboard/retrieval. **Anthropic** has no per-request retention control — zero-data-retention is an account-level agreement you make with Anthropic. **OpenAI-compatible** providers (Groq, Mistral, OpenRouter, …) are governed by each provider's own data policy; `store` is OpenAI-specific and is deliberately *not* sent to them (they may reject unknown fields, and it would have no effect). In every case your data goes only to the provider you chose — never to 4DA.
- **Fallback:** on a cloud failure (network, not rate-limit) 4DA falls back to local Ollama.
- **Disable:** remove the API key / select Ollama as the provider.

### 2b. Cloud translation (consent-gated)

Off by default. Only used if you enable a cloud translation provider and supply its key (`src-tauri/src/translation_providers.rs`).

| Provider | Host | Endpoint |
|---|---|---|
| DeepL | `api-free.deepl.com` / `api.deepl.com` | `POST /v2/translate` |
| Microsoft Translator | `api.cognitive.microsofttranslator.com` | `POST /translate` |
| Google Translate | `translation.googleapis.com` | `POST /language/translate/v2` |

- **What is sent:** the text you asked to translate, plus your key.
- **Disable:** leave cloud translation off (the default); local translation paths send nothing.

### 2c. Dependency-vulnerability sync (OSV) — discloses local package names

Runs only **if 4DA has discovered local dependencies** (from your projects via ACE / lockfile scanning).
This is the **one flow that transmits data derived from your local content** — it sends the **names of your packages and their ecosystems** (e.g. `tokio` / `crates.io`), **never your code or file contents**. (`src-tauri/src/osv/`)

| Host | Endpoint | Data sent |
|---|---|---|
| `api.osv.dev` | `POST /v1/querybatch` | Your discovered package names + ecosystems, batched (≤1000/batch) |
| `osv-vulnerabilities.storage.googleapis.com` | `GET /{ecosystem}/all.zip` | Plain `GET` — downloads the public advisory mirror (no user data) |

- **Trigger:** background, at startup, **only if dependencies exist** and the local mirror was last synced **> 6 hours ago** (`app_setup.rs`).
- **Disable:** if 4DA has discovered no dependencies, nothing is sent. Removing/disabling ACE project scanning prevents dependency discovery and therefore this sync.

### 2d. License validation (Keygen)

Only contacted when you enter a license key (`src-tauri/src/settings/license/keygen.rs`).

- **Host:** `api.keygen.sh`
- **Endpoint:** `POST /v1/accounts/runyourempirehq/licenses/actions/validate-key`
- **Data sent:** `{"meta":{"key": "<license key>"}}` — the **license key only**. No device fingerprint, no machine identifiers, no usage data, no telemetry.
- **Offline:** if unreachable, the current tier is preserved (never downgraded). Successful validations cache locally for 24 h.
- **Offline signature check:** signed keys are *also* verified locally against an embedded ed25519/minisign key — purely cryptographic, no server contact.
- **Disable:** don't enter a license key (free tier makes no call).

### 2e. License recovery (user-initiated — the only 4DA-operated endpoint)

(`src-tauri/src/settings_commands_license.rs`)

- **Host:** `4da.ai` — **the only 4DA-operated server the app ever contacts.**
- **Endpoint:** `GET https://4da.ai/api/streets/activate?email={email}`
- **Trigger:** only when you click "Recover License by Email" in Settings.
- **Data sent:** your purchase email (query parameter).
Returns the license key + tier.

### 2f. App updates (Tauri updater)

(`src-tauri/tauri.conf.json` → `plugins.updater`)

- **Host:** `github.com` (GitHub Releases)
- **Endpoint:** `GET /runyourempire/4DA/releases/latest/download/latest.json`
- **Trigger:** shortly after startup, once per session; silent on failure.
- **Data sent:** plain `GET` — no version reporting, no device info. Updates are **minisign-verified** against the embedded public key before install; the user must click to apply.

### 2g. Team relay (only if you create/join a team)

(`src-tauri/src/team_sync_scheduler.rs`) — there is **no hardcoded relay host**. The relay is a **user-configured `relay_url`**; encrypted metadata (XChaCha20-Poly1305) is synced to it only if you set up a team. With no team configured, zero calls are made. Disable by not joining a team.

### 2h. Developer Toolkit HTTP probe (manual only)

A user-triggered tool to test API endpoints. **Never automatic.** Restricted to an allowlist (`src-tauri/src/toolkit_http.rs`): `api.openai.com`, `api.anthropic.com`, `generativelanguage.googleapis.com`, `localhost`/`127.0.0.1`/`0.0.0.0`, `api.keygen.sh`, `hacker-news.firebaseio.com`, `www.reddit.com`, `oauth.reddit.com`, `api.github.com`, `api.x.com`, `export.arxiv.org`, `www.youtube.com`, `lobste.rs`, `dev.to`, `www.producthunt.com`. Anything off the allowlist is rejected.

### 2i. Twitter / X (BYOK)

(`src-tauri/src/sources/twitter.rs`) — completely silent unless you provide an X API Bearer Token.

- **Host:** `api.x.com`
- **Endpoints:** `/2/users/by/username/{handle}`, `/2/users/{id}/tweets`, `/2/tweets/search/recent` (deep fetch).
- **Auth:** your Bearer Token, sent only to `api.x.com`. No key = zero calls.

---

## 3. Setup-time (one-time downloads)

Embeddings run **in-process via fastembed (ONNX Runtime) with zero network by default**.
If the ONNX runtime and model weights are not already bundled/cached, they are fetched **once** on first use (`src-tauri/src/embeddings_providers/fastembed.rs`):

| What | Host | Endpoint |
|---|---|---|
| ONNX Runtime library | `github.com` | `/microsoft/onnxruntime/releases/download/v1.24.2/onnxruntime-{platform}.{zip,tgz}` |
| Embedding model weights (`snowflake-arctic-embed-m`, ~220 MB) | HuggingFace Hub CDN (`huggingface.co` + its LFS/Xet CDN) | downloaded by the fastembed/hf-hub client on first init if not bundled |

- **Trigger:** first embedding init only, and only if not already bundled in the install or cached.
- **After setup:** embeddings are 100% local — zero network. If no provider is available at all, scoring degrades to keyword-only (still no network).

---

## 4. Never

4DA does **not** do any of the following — verifiable in source and via Wireshark:

- **No crash reporting.** The previous third-party crash reporter (Sentry) was **removed entirely**. In its place, you can export a **local, scrubbed diagnostic bundle on demand** (Settings → Privacy → Export diagnostics). It is assembled from the on-device log tail, scrubbed of usernames and secret-shaped tokens, **written to disk**, and **never transmitted** — you choose whether to attach it to a bug report (`src-tauri/src/diagnostics.rs`).
- **No telemetry / analytics.** Zero usage tracking. All telemetry/metrics tables are **local SQLite only** and never leave the machine.
- **No phoning home.** The only 4DA-operated endpoint is the user-initiated `4da.ai` license recovery (§2e). There is no background 4DA backend receiving data.
- **No raw-content transmission.** Project files, source code, file contents, and git history never leave your machine. The only data *derived* from local content that is sent anywhere is OSV **package names** (§2c) — not code.
- **No device fingerprinting.** License validation sends the key only (§2d).
- **No cookies, no tracking pixels, no third-party scripts.** The frontend loads zero external resources.
- **No accounts required** to run the app. No social/share integrations.

---

## Content Security Policy (CSP)

The Tauri webview enforces a strict CSP (`src-tauri/tauri.conf.json` → `app.security.csp`). The `connect-src` allowlist for the frontend is exactly:

```
connect-src 'self'
  https://api.anthropic.com
  https://api.openai.com
  http://localhost:11434
  https://hacker-news.firebaseio.com
  https://export.arxiv.org
  https://www.reddit.com
  https://api.github.com
  https://api.keygen.sh
```

Any JavaScript attempting to contact a host outside this list is blocked by the webview engine. (Most outbound calls are made by the **Rust backend**, not the frontend — the CSP governs the webview layer; the full backend inventory is the tables above.)

---

## Deep link protocol

- **Scheme:** `4da://` — inbound only, lets external apps open 4DA to a specific view. No outbound network calls.

---

Everything above is verifiable: read the cited source files, or run Wireshark while 4DA is open and match every connection against this manifest.
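
The matching step can be partly automated. The following is a minimal sketch, not part of 4DA: it assumes you have already exported the hostnames seen in a capture into a plain list, and it hard-codes a host set transcribed by hand from the always-on tables above (connectivity pre-check, content sources and their fallbacks, and the model-pricing refresh). The names `MANIFEST_HOSTS`, `normalize`, and `unaccounted_hosts` are hypothetical, as is the sample host `tracker.example.net`.

```python
# Hosts copied by hand from sections 1a, 1b, and 1d of this manifest.
# Illustrative only: this set is NOT generated from 4DA's source code.
MANIFEST_HOSTS = {
    "1.1.1.1", "dns.google", "httpbin.org",
    "hacker-news.firebaseio.com", "hn.algolia.com", "hnrss.org",
    "www.reddit.com", "export.arxiv.org", "arxiv.org",
    "api.github.com", "github.com", "www.youtube.com", "dev.to",
    "lobste.rs", "www.producthunt.com", "public.api.bsky.app",
    "crates.io", "registry.npmjs.org", "pypi.org", "huggingface.co",
    "api.stackexchange.com", "index.golang.org",
    "raw.githubusercontent.com",
}


def normalize(host: str) -> str:
    """Lower-case a hostname and drop surrounding whitespace and a trailing dot."""
    return host.strip().rstrip(".").lower()


def unaccounted_hosts(observed, extra_allowed=()):
    """Return the sorted, de-duplicated hosts that are in neither allow set.

    `observed` is any iterable of hostnames taken from a capture.
    `extra_allowed` holds hosts you expect for your own setup, for example
    RSS feeds, a BYOK provider, or domains reached by article scraping.
    """
    allowed = {normalize(h) for h in MANIFEST_HOSTS}
    allowed |= {normalize(h) for h in extra_allowed}
    seen = {normalize(h) for h in observed if normalize(h)}
    return sorted(seen - allowed)


if __name__ == "__main__":
    capture = ["dns.google", "API.GitHub.com.", "tracker.example.net"]
    print(unaccounted_hosts(capture))  # ['tracker.example.net']
```

Article scraping (§1c) contacts whatever domains the sources link to, and the opt-in flows in §2 and the setup downloads in §3 depend on your configuration, so those hosts will show up as unaccounted unless you pass them through `extra_allowed`. A non-empty result is a prompt to check the relevant section, not proof of an undisclosed connection.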