# Auto-Discovery Pattern ## Overview Auto-Discovery enables autonomous resource discovery for AI orchestrators. Instead of manually cataloging notebooks, the system queries NotebookLM to generate its own metadata. ## Architecture ### Three-Level Progressive Disclosure **Level 0 - Metadata Registry (Lightweight)** ``` library.json: 5KB for 50 notebooks ├── Loaded at startup ├── Cached in memory └── Zero NotebookLM queries ``` **Level 1 - Contextual Matching (Local)** ``` User query → Tag/description matching → Relevant notebooks Example: "gmail automation" → Match tags: ["gmail", "n8n", "automation"] Cost: 0 NotebookLM queries (local only) ``` **Level 2 - Deep Query (Rate-Limited)** ``` Selected notebook → Query NotebookLM → Precise answer Cost: 1 NotebookLM query (rate-limited by Google) ``` ## For Orchestrators ### Claude Code ```typescript // Autonomous workflow User: "Build n8n Gmail workflow" // Claude internally: 1. Scan library metadata (instant) 2. Match: "n8n-workflows-api" (tags: ["n8n", "gmail"]) 3. Query NotebookLM: "Gmail node configuration?" 4. Implement with accurate info ``` ### n8n ```yaml # Workflow node HTTP Request: GET /notebooks/match?task=gmail automation → Returns: relevant notebook HTTP Request: POST /notebooks/{id}/ask → Query: "Available Gmail triggers?" → Response: precise documentation ``` ### Cursor / IDEs ```typescript // Context-aware matching Developer: "How to use n8n API?" → MCP: find_relevant_notebook({ task: "n8n API" }) → Match: "n8n-workflows-api" → Query notebook for details ``` ## Metadata Generation ### Prompt to NotebookLM The system sends this prompt to each notebook: ``` Analyze your complete content and respond ONLY in JSON format: { "name": "kebab-case-name", "description": "Two sentences max.", "tags": ["8-10", "keywords", "covering", "concepts", "actions", "contexts"] } ``` ### Validation Rules - **name**: 3 words max, kebab-case, no spaces - **description**: 2 sentences, under 150 chars - **tags**: 8-10 keywords (concepts + actions + contexts) ### Example Generated Metadata ```json { "name": "n8n-workflows-api", "description": "Complete n8n API documentation covering workflow creation and node configuration. Includes authentication, webhooks, and error handling.", "tags": [ "n8n", "api", "workflows", "automation", "webhooks", "nodes", "integration", "documentation" ] } ``` ## Benefits ### For Users - **Zero friction**: 30 seconds to add notebook (vs 5 minutes manual) - **No metadata expertise**: System generates relevant tags automatically - **Scalable**: Works with 1 or 100 notebooks ### For Orchestrators - **Autonomous discovery**: Find relevant docs without human guidance - **Token efficient**: Metadata scan costs ~500 tokens (vs 50k+ full read) - **Rate limit preservation**: 1 query per notebook add (vs N queries per research) ### For Product - **Enhanced experience**: Auto-discovery enables frictionless notebook management - **Scalable architecture**: Handles growing documentation libraries efficiently - **Progressive disclosure**: Optimizes API usage and response times ## Comparison with Claude Skills | Aspect | Claude Skills | NotebookLM Auto-Discovery | | ----------------- | ------------------------ | ------------------------- | | Metadata source | SKILL.md file | NotebookLM query | | Access cost | Free (local files) | Rate-limited (Google API) | | Content depth | Full docs bundled | Query on-demand | | Discovery pattern | Read files progressively | Match → Query if needed | | Update mechanism | Edit files | Re-query notebook | **Key difference**: Skills are free/local, Notebooks are rate-limited/remote → **Auto-Discovery optimizes for minimal queries via smart metadata matching** ## Error Handling ### Common Issues **Invalid metadata format**: ```json { "error": "Invalid name format", "details": "Must be kebab-case, 3 words max" } ``` → System retries with same prompt (max 2 retries) **NotebookLM timeout**: ```json { "error": "NotebookLM query timeout", "hint": "Try again in a few seconds" } ``` → Rate limit hit, wait and retry **Notebook not accessible**: ```json { "error": "Notebook not found", "hint": "Check notebook URL and sharing settings" } ``` → Verify notebook is shared with "Anyone with link" ## Testing ### Test With Your Own Notebook ```bash # Test with a specific notebook URL you can access .\deployment\scripts\test-auto-discovery.ps1 -NotebookUrl "https://notebooklm.google.com/notebook/" # Or provide multiple notebook URLs through the environment $env:AUTO_DISCOVERY_NOTEBOOKS = "https://notebooklm.google.com/notebook/,https://notebooklm.google.com/notebook/" .\deployment\scripts\test-auto-discovery.ps1 -TestAll ``` ### Expected Results Each auto-discovery test should: - ✅ Return 200 OK - ✅ Generate valid kebab-case name (1-3 words) - ✅ Generate description ≤150 chars - ✅ Generate 8-10 relevant topics - ✅ Set `auto_generated: true` - ✅ Add notebook to library automatically ## Future Enhancements ### Phase 2: Smart Refresh ```typescript PATCH /notebooks/:id/refresh → Re-query NotebookLM to update metadata if content changed ``` ### Phase 3: Semantic Matching ```typescript GET /notebooks/match?query=gmail&semantic=true → Use embeddings for advanced matching beyond tags ``` ### Phase 4: Usage Analytics ```typescript GET /notebooks/:id/related → "Notebooks often queried together" ```