# Run 1 000 NotebookLM questions overnight

This is the pattern we use to run very long batches of citation-backed Q&A against Google NotebookLM, from PhD literature reviews to market intelligence pipelines. Single laptop, single account, eight hours, one thousand structured answers in a JSONL file.

The whole thing fits in **one shell loop**, because the project exposes a plain REST API on `http://localhost:3000`. There is no SDK to learn, no agent harness to configure, no MCP client to wire.

## What you need

- This project running locally: `npm run setup-auth` (one-time Google login), then `npm run start:http`. [Install guide](/install).
- A list of questions in a text file, one per line.
- A notebook id. Either pick one from `GET /notebooks/scrape` or set a default with `PUT /notebooks/:id/activate`.
- Optionally: a second Google account for rotation. [Multi-account guide](/notebooklm-multi-account).

## The minimum viable batch (10 lines of bash)

```bash
NOTEBOOK_ID="paste-your-id-here"
INPUT="questions.txt"
OUTPUT="answers.jsonl"

while IFS= read -r question; do
  curl -sS -X POST http://localhost:3000/ask \
    -H 'Content-Type: application/json' \
    -d "$(jq -n --arg q "$question" --arg n "$NOTEBOOK_ID" \
        '{question: $q, notebook_id: $n, source_format: "json"}')" \
    >> "$OUTPUT"
  echo >> "$OUTPUT"
done < "$INPUT"
```

That works. It does not survive a session expiry at hour 4, it does not throttle, it does not resume after a network blip, and it makes 1 000 sequential blocking calls. So for real batches we wrap it.

## The production pattern

```python
#!/usr/bin/env python3
"""Run a batch of NotebookLM questions through the local REST API.

Resumes safely on restart, handles re-auth, rotates accounts,
throttles to respect rate limits, and writes one JSON line per
answer with citations.
"""
import json
import sys
import time
from pathlib import Path

import httpx

API = "http://localhost:3000"
NOTEBOOK_ID = "paste-your-id-here"
INPUT = Path("questions.txt")
OUTPUT = Path("answers.jsonl")
THROTTLE_SECONDS = 8  # average pace; tune to your account's quota
MAX_RETRIES = 3
ACCOUNTS = ["primary", "backup"]  # registered via `npm run accounts add`

# Index into ACCOUNTS of the account currently in use. Kept at module level
# so a rotation persists across questions instead of resetting on every call.
current_account = 0


def already_done() -> set[str]:
    """Resume support: skip questions already answered."""
    if not OUTPUT.exists():
        return set()
    done = set()
    for line in OUTPUT.read_text(encoding="utf-8").splitlines():
        try:
            done.add(json.loads(line)["question"])
        except (json.JSONDecodeError, KeyError):
            continue  # blank or half-written line from an interrupted run
    return done


def switch_account(name: str) -> None:
    httpx.post(f"{API}/re-auth", json={"account": name}, timeout=120).raise_for_status()


def ask(question: str) -> dict:
    global current_account
    payload = {
        "question": question,
        "notebook_id": NOTEBOOK_ID,
        "source_format": "json",
    }
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            r = httpx.post(f"{API}/ask", json=payload, timeout=180)
            r.raise_for_status()
            data = r.json()
            if data.get("success"):
                return data
            # rate-limited or quota: try the next account
            if "rate" in str(data.get("error", "")).lower():
                next_account = (current_account + 1) % len(ACCOUNTS)
                switch_account(ACCOUNTS[next_account])
                current_account = next_account
                continue
        except httpx.HTTPError as e:
            print(f"  attempt {attempt}: {e}", file=sys.stderr)
        time.sleep(2 ** attempt)  # exponential backoff before the next attempt
    raise RuntimeError(f"failed after {MAX_RETRIES} retries: {question}")


def main() -> None:
    done = already_done()
    questions = [q.strip() for q in INPUT.read_text(encoding="utf-8").splitlines() if q.strip()]
    todo = [q for q in questions if q not in done]
    print(f"{len(done)} already answered · {len(todo)} to go")

    with OUTPUT.open("a", encoding="utf-8") as f:
        for i, question in enumerate(todo, 1):
            t0 = time.time()
            answer = ask(question)
            row = {
                "question": question,
                "answer": answer["answer"],
                "citations": answer.get("citations", []),
                "session_id": answer.get("session_id"),
                "elapsed_s": round(time.time() - t0, 1),
            }
            f.write(json.dumps(row,
                               ensure_ascii=False) + "\n")
            f.flush()  # persist each answer immediately so a kill loses at most one
            print(f"[{i}/{len(todo)}] {row['elapsed_s']}s · {len(row['citations'])} cites")
            time.sleep(THROTTLE_SECONDS)


if __name__ == "__main__":
    main()
```

Save it as `batch.py`, drop your questions in `questions.txt`, and run `python batch.py`. Kill it any time and re-run it; it picks up where it left off.

## Why this pattern works

### One file in, one file out

Both ends are plain text. Your input is a `questions.txt` you can edit in any tool. Your output is `answers.jsonl`: JSON Lines, one answer per line, trivially loadable into pandas, jq, BigQuery, or another LLM for further processing:

```bash
jq '.answer' answers.jsonl | wc -l
jq -c '{q: .question, n: (.citations | length)}' answers.jsonl
```

### Resume on restart

Network drops, OS updates, batch scripts that get killed at 3am: they all happen. The first thing the script does is re-read the output file and skip questions whose text is already there. You lose the in-flight question and nothing else.

### Auto-reauth across multi-hour runs

Google sessions don't survive the night. The REST API checks the current URL on every call as the ground truth for the login state and logs in again automatically, using credentials stored in the AES-256-GCM vault (`npm run setup-auth` puts them there). TOTP codes are computed on the fly, so 2FA-protected accounts work transparently. [Multi-account configuration](/notebooklm-multi-account).

### Account rotation when one quota saturates

Free Google accounts hit a daily NotebookLM Q&A quota. The script flips to the next registered account on rate-limit errors via `POST /re-auth`. With two accounts you can typically push 1 500–2 000 questions in a 24-hour window without manual intervention.

### Citations come back structured

`source_format: "json"` returns a `citations` array of `{id, source, excerpt}` objects directly attached to the answer.
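
As a rough sketch of that shape in use, the Python below groups every excerpt in `answers.jsonl` by its `source` field. It assumes the row layout written by `batch.py` and the `{id, source, excerpt}` citation keys described here; `citations_by_source` is a hypothetical helper name, not part of the project.

```python
import json
from collections import defaultdict
from pathlib import Path


def citations_by_source(lines):
    """Group citation excerpts by source across JSONL rows.

    `lines` is any iterable of JSON strings shaped like the rows
    batch.py writes (assumed keys: "question", "citations").
    """
    index = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        for cite in row.get("citations", []):
            # "source" and "excerpt" are the citation keys assumed from the docs
            index[cite.get("source", "unknown")].append(
                {"question": row["question"], "excerpt": cite.get("excerpt", "")}
            )
    return dict(index)


if __name__ == "__main__":
    path = Path("answers.jsonl")
    if path.exists():
        with path.open(encoding="utf-8") as f:
            for source, hits in citations_by_source(f).items():
                print(f"{source}: {len(hits)} excerpt(s)")
```

Keying on `source` rather than `id` is a deliberate guess: it treats `id` as local to one answer, which may not hold for your data.
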
You can join citations back to your sources for downstream processing: fact-checking, page-number resolution, LaTeX `\cite{}` generation for a thesis bibliography.

## Sizing your throttle

NotebookLM does not document a public rate limit, so we picked **8 seconds between calls** based on hundreds of overnight runs. That sets a pacing ceiling of about 450 questions per hour (3 600 s / 8 s) and about 3 600 in eight hours per account. Real throughput is lower, because the sleep is added on top of each answer's response time and the daily quota still applies. If you see `rate limit` errors before that, raise the throttle to 12 to 15 seconds or add a third account. If you can sustain 5 seconds without hitting limits, go for it.

## When to switch to MCP mode instead

If your driver is a coding agent (Claude Code, Cursor, Codex) rather than a script, the same operations are exposed as MCP tools. Use that surface when you want the agent to reason about which question to ask next; use the REST API when you have a flat list to grind through. [Both modes ship from the same package](/install).

## What this gives you in practice

For one of our PhD use cases we run 100-200 questions per chapter across a thirty-chapter thesis library. That's 3 000 to 6 000 structured answers with citations, computed in overnight runs on a laptop and indexed back into the thesis as `\cite{}`-ready snippets. Total cost: zero (it uses the user's own NotebookLM account). Total infrastructure: one Node process and one Python script.

## Next steps

- [HTTP API reference](/notebooklm-rest-api): every endpoint, every parameter.
- [n8n integration guide](/notebooklm-n8n): same pattern but as a visual workflow.
- [Multi-account guide](/notebooklm-multi-account): register a second Google account for rotation.
- [Compare with PleasePrompto](/compare): when this project is the right pick over the upstream MCP-only server.
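
A footnote on the arithmetic in "Sizing your throttle": the sketch below estimates throughput and run time for a sequential batch. It assumes each question costs the fixed sleep plus an average answer latency; the function names and the 20.8-second latency are hypothetical values chosen for illustration, not measurements.

```python
def max_per_hour(throttle_s: float, mean_answer_s: float = 0.0) -> float:
    """Questions per hour when each one costs latency plus the fixed sleep."""
    return 3600 / (throttle_s + mean_answer_s)


def estimate_hours(questions: int, throttle_s: float, mean_answer_s: float = 0.0) -> float:
    """Rough wall-clock hours for a sequential batch of `questions`."""
    return questions * (throttle_s + mean_answer_s) / 3600


if __name__ == "__main__":
    # Pacing ceiling with the 8-second throttle and zero answer latency.
    print(f"ceiling: {max_per_hour(8):.0f} questions/hour")
    # Hypothetical 20.8 s average answer time, for illustration only.
    print(f"1000 questions: {estimate_hours(1000, 8, 20.8):.1f} hours")
```

Measure `elapsed_s` from your own `answers.jsonl` and substitute its average for the hypothetical latency before trusting the estimate.
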