# Run 1 000 NotebookLM questions overnight

This is the pattern we use to run very long batches of citation-backed Q&A against Google NotebookLM — from PhD literature reviews to market intelligence pipelines. Single laptop, single account, eight hours, one thousand structured answers in a JSONL file.

The whole thing fits in **one shell loop**, because the project exposes a plain REST API on `http://localhost:3000`. There is no SDK to learn, no agent harness to configure, no MCP client to wire.

## What you need

- This project running locally: `npm run setup-auth` (one-time Google login), then `npm run start:http`. [Install guide](/install).
- A list of questions in a text file, one per line.
- A notebook id. Either pick one from `GET /notebooks/scrape` or set a default with `PUT /notebooks/:id/activate`.
- Optionally: a second Google account for rotation. [Multi-account guide](/notebooklm-multi-account).

## The minimum viable batch (a dozen lines of bash)

```bash
NOTEBOOK_ID="paste-your-id-here"
INPUT="questions.txt"
OUTPUT="answers.jsonl"

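# `|| [ -n "$question" ]` keeps the last line even without a trailing newline.
while IFS= read -r question || [ -n "$question" ]; do
  [ -z "$question" ] && continue   # skip blank lines
  curl -sS -X POST http://localhost:3000/ask \
    -H 'Content-Type: application/json' \
    -d "$(jq -n --arg q "$question" --arg n "$NOTEBOOK_ID" \
        '{question: $q, notebook_id: $n, source_format: "json"}')" \
    | jq -c . >> "$OUTPUT"         # compact to exactly one JSON object per line
done < "$INPUT"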
```

That works for a handful of questions. It does not throttle, it does not retry a failed call, it cannot resume after a crash or a network blip, and it has no answer for a quota that runs out at hour 4. So for real batches we wrap it.

## The production pattern

```python
#!/usr/bin/env python3
"""Run a batch of NotebookLM questions through the local REST API.

Resumes safely on restart, handles re-auth, rotates accounts, throttles to
respect rate limits, and writes one JSON line per answer with citations.
"""

import json
import sys
import time
from pathlib import Path

import httpx

API = "http://localhost:3000"
NOTEBOOK_ID = "paste-your-id-here"
INPUT = Path("questions.txt")
OUTPUT = Path("answers.jsonl")
THROTTLE_SECONDS = 8           # pause after each answer; tune to your account's quota
MAX_RETRIES = 3
ACCOUNTS = ["primary", "backup"]  # registered via `npm run accounts add`

def already_done() -> set[str]:
    """Resume support: skip questions already answered."""
    if not OUTPUT.exists():
        return set()
    done = set()
    for line in OUTPUT.read_text(encoding="utf-8").splitlines():
        try:
            done.add(json.loads(line)["question"])
        except (json.JSONDecodeError, KeyError, TypeError):
            continue  # blank, truncated, or foreign line
    return done

def switch_account(name: str) -> None:
    httpx.post(f"{API}/re-auth", json={"account": name}, timeout=120).raise_for_status()

# Index of the account currently in use. Kept at module level so a rotation
# sticks for the following questions instead of resetting on every ask().
account_idx = 0

def ask(question: str) -> dict:
    global account_idx
    payload = {
        "question": question,
        "notebook_id": NOTEBOOK_ID,
        "source_format": "json",
    }
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            r = httpx.post(f"{API}/ask", json=payload, timeout=180)
            r.raise_for_status()
            data = r.json()
            if data.get("success"):
                return data
            error = str(data.get("error", "unknown error"))
            print(f"  attempt {attempt}: {error}", file=sys.stderr)
            if "rate" in error.lower():
                # rate-limited or quota: move to the next account and retry now
                account_idx = (account_idx + 1) % len(ACCOUNTS)
                switch_account(ACCOUNTS[account_idx])
                continue
        except (httpx.HTTPError, ValueError) as e:  # ValueError: body was not JSON
            print(f"  attempt {attempt}: {e}", file=sys.stderr)
        time.sleep(2 ** attempt)  # back off before the next attempt
    raise RuntimeError(f"failed after {MAX_RETRIES} retries: {question}")

def main() -> None:
    done = already_done()
    questions = [q.strip() for q in INPUT.read_text(encoding="utf-8").splitlines() if q.strip()]
    # dict.fromkeys drops duplicate questions while keeping their order
    todo = [q for q in dict.fromkeys(questions) if q not in done]
    print(f"{len(done)} already answered · {len(todo)} to go")

    with OUTPUT.open("a", encoding="utf-8") as f:
        for i, question in enumerate(todo, 1):
            t0 = time.time()
            answer = ask(question)
            row = {
                "question": question,
                "answer": answer["answer"],
                "citations": answer.get("citations", []),
                "session_id": answer.get("session_id"),
                "elapsed_s": round(time.time() - t0, 1),
            }
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
            f.flush()  # make sure a kill loses at most the in-flight question
            print(f"[{i}/{len(todo)}] {row['elapsed_s']}s · {len(row['citations'])} cites")
            time.sleep(THROTTLE_SECONDS)

if __name__ == "__main__":
    main()
```

Save it as `batch.py`, install its one dependency (`pip install httpx`), drop your questions in `questions.txt`, and run `python batch.py`. Kill it at any time and re-run: it picks up where it left off.

## Why this pattern works

### One file in, one file out

Both ends are plain text. Your input is a `questions.txt` you can edit in any tool. Your output is `answers.jsonl` — JSON Lines, one answer per line, trivially loadable into pandas, jq, BigQuery, or another LLM for further processing:

```bash
jq '.answer' answers.jsonl | wc -l
jq -c '{q: .question, n: (.citations | length)}' answers.jsonl
```
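The same file loads just as easily from Python. What follows is a minimal sketch, assuming the row layout written by the batch script above (`question`, `answer`, `citations`); `load_answers` and `uncited` are illustrative helper names, not part of the project.

```python
import json
from pathlib import Path


def load_answers(path):
    """Parse a JSON Lines file into a list of dicts, skipping unusable lines."""
    rows = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # blank line
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or corrupted line
        if isinstance(row, dict):
            rows.append(row)
    return rows


def uncited(rows):
    """Return the questions whose answer came back without any citation."""
    return [r.get("question") for r in rows if not r.get("citations")]
```

Calling `uncited(load_answers("answers.jsonl"))` gives a quick list of questions worth re-asking or rephrasing.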

### Resume on restart

Network drops, OS updates, batch scripts that get killed at 3am — they all happen. The first thing the script does is re-read the output file and skip questions whose text is already there. You lose the in-flight question and nothing else.
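One edge case the skip logic does not cover: if the process dies in the middle of a write, the output file can end with a truncated line and no newline, and the next appended row would be glued onto it. Below is a minimal sketch of a pre-flight repair, assuming the JSONL layout described above; `drop_partial_tail` is a hypothetical helper, not something the project ships.

```python
import json
from pathlib import Path


def drop_partial_tail(path):
    """Repair the final line of a JSONL file before appending to it.

    Returns True if the file was modified. Complete lines are left untouched.
    """
    p = Path(path)
    if not p.exists():
        return False
    data = p.read_bytes()
    if not data or data.endswith(b"\n"):
        return False  # empty, or the last row was written completely
    head, _, tail = data.rpartition(b"\n")
    try:
        json.loads(tail.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        # Truncated row: keep everything up to the last newline.
        p.write_bytes(head + b"\n" if head else b"")
        return True
    # Valid JSON that only lacks its newline: add it.
    p.write_bytes(data + b"\n")
    return True
```

Running it once at the top of `main()`, before the output file is opened for appending, would keep every later row on its own line.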

### Auto-reauth across multi-hour runs

Google sessions don't survive the night. The REST API checks the current URL as its ground truth on every call and logs in again automatically, using credentials stored in the AES-256-GCM vault (`npm run setup-auth` puts them there). TOTP codes are computed on the fly, so 2FA-protected accounts work transparently. [Multi-account configuration](/notebooklm-multi-account).
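For readers curious what "computed on the fly" involves, here is a sketch of the standard TOTP algorithm (RFC 6238, HMAC-SHA1, 30-second steps) using only the Python standard library. It illustrates the general technique; how this project implements it internally is not shown here, and the `totp` function name and its parameters are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, at=None, step=30, digits=6):
    """Compute an RFC 6238 time-based one-time password (HMAC-SHA1)."""
    s = secret_b32.upper().replace(" ", "")
    s += "=" * (-len(s) % 8)  # restore base32 padding if it was stripped
    key = base64.b32decode(s)
    now = time.time() if at is None else at
    counter = int(now // step)  # number of time steps since the Unix epoch
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as defined in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Because the code depends only on the shared secret and the clock, a stored secret is enough to pass a 2FA prompt unattended, which is also why that secret deserves the same protection as the password.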

### Account rotation when one quota saturates

Free Google accounts hit a daily NotebookLM Q&A quota. The script flips to the next registered account on rate-limit errors via `POST /re-auth`. With two accounts you can typically push 1 500–2 000 questions in a 24-hour window without manual intervention.

### Citations come back structured

`source_format: "json"` returns a `citations` array of `{id, source, excerpt}` objects directly attached to the answer. You can join citations back to your sources for downstream processing — fact-checking, page-number resolution, LaTeX `\cite{}` generation for a thesis bibliography.
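As an illustration of the join, the sketch below turns one answer's citations into a LaTeX `\cite{}` command. It assumes the `{id, source, excerpt}` shape described above and a mapping from NotebookLM source names to BibTeX keys that you maintain yourself; `cite_command` and the example keys are hypothetical.

```python
def cite_command(citations, bibkeys):
    """Build one LaTeX \\cite{...} command from an answer's citations.

    citations: list of {"id", "source", "excerpt"} dicts.
    bibkeys:   your own mapping from source name to BibTeX key (an assumption
               of this sketch; the API does not provide it).
    Sources without a mapping are skipped; keys are deduplicated in order.
    """
    keys = []
    for c in citations:
        key = bibkeys.get(c.get("source"))
        if key and key not in keys:
            keys.append(key)
    return "\\cite{" + ",".join(keys) + "}" if keys else ""


if __name__ == "__main__":
    demo = [
        {"id": 1, "source": "smith2020.pdf", "excerpt": "..."},
        {"id": 2, "source": "doe2019.pdf", "excerpt": "..."},
    ]
    print(cite_command(demo, {"smith2020.pdf": "smith2020", "doe2019.pdf": "doe2019"}))
```

Skipping unmapped sources rather than raising keeps a long post-processing pass running; logging them separately would show which bibliography entries are still missing.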

## Sizing your throttle

NotebookLM does not document a public rate limit, so we picked **8 seconds between calls** based on hundreds of overnight runs. That pause alone caps throughput at 450 questions an hour (3 600 s ÷ 8 s); real throughput is lower, because each answer also takes time to generate. The headline figure of 1 000 answers in eight hours works out to about 29 seconds per question, throttle included. If you see `rate limit` errors, raise the throttle to 12 to 15 seconds or add a third account. If you can sustain 5 seconds without hitting limits, go for it.
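To size a run before starting it, add the answer latency to the throttle. The sketch below does that arithmetic; `answer_s` is an assumption you should replace with the typical `elapsed_s` value from your own `answers.jsonl`, and both function names are illustrative.

```python
def questions_per_hour(throttle_s, answer_s):
    """Sequential throughput: each question costs its answer time plus the pause."""
    return 3600 / (throttle_s + answer_s)


def batch_hours(n_questions, throttle_s, answer_s):
    """Estimated wall-clock hours for a sequential batch of n_questions."""
    return n_questions * (throttle_s + answer_s) / 3600


if __name__ == "__main__":
    # Assumed latency of 20 s per answer; measure your own before trusting this.
    print(round(questions_per_hour(8, 20)), "questions/hour")
    print(round(batch_hours(1000, 8, 20), 1), "hours for 1 000 questions")
```

With a zero-latency answer the first function returns the 450-per-hour ceiling; any realistic latency pulls the figure well below that.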

## When to switch to MCP mode instead

If your driver is a coding agent (Claude Code, Cursor, Codex) rather than a script, the same operations are exposed as MCP tools. Use that surface when you want the agent to reason about which question to ask next; use the REST API when you have a flat list to grind through. [Both modes ship from the same package](/install).

## What this gives you in practice

For one of our PhD use cases we run 100 to 200 questions per chapter across a thirty-chapter thesis library. That's 3 000 to 6 000 structured answers with citations, computed over a few unattended overnight runs on a laptop and indexed back into the thesis as `\cite{}`-ready snippets. Total cost: zero (it uses the user's own NotebookLM account). Total infrastructure: one Node process and one Python script.

## Next steps

- [HTTP API reference](/notebooklm-rest-api) — every endpoint, every parameter.
- [n8n integration guide](/notebooklm-n8n) — same pattern but as a visual workflow.
- [Multi-account guide](/notebooklm-multi-account) — register a second Google account for rotation.
- [Compare with PleasePrompto](/compare) — when this project is the right pick over the upstream MCP-only server.