# From Node.js to Go: Rebuilding an MCP Server for Production This is the story of why I rebuilt [google-researcher-mcp](https://github.com/zoharbabin/google-researcher-mcp) (Node.js/TypeScript) from scratch as [web-researcher-mcp](https://github.com/zoharbabin/web-researcher-mcp) (Go), and what the lessons learned along the way. ## The Starting Point The original project — `google-researcher-mcp` — was a TypeScript/Node.js MCP server distributed via npm. It had real traction: GitHub stars, steady npm downloads, a broad test suite, and active users. But five critical issues kept surfacing that couldn't be solved within the existing architecture. ## Why Rewrite in Go (Not Refactored) ### Orphan Processes (Issue #108) npx spawns deeply nested process trees. When the parent MCP client (Claude Desktop, Cursor) crashes or closes unexpectedly, the Node.js process doesn't receive a signal — it keeps running, consuming memory and holding file locks. Myself and collaborators spent three versions (v6.2.0 through v6.4.0) building increasingly complex orphan detection: a Worker thread watchdog with CPU spin detection, three-layer parent-alive checks, and graceful degradation. It was all band-aids on a fundamental runtime limitation. **Go fix**: A single static binary. No runtime process tree. EOF on stdin = immediate exit. The entire problem category disappeared. ### Google Discontinuing "Entire Web" Search (Issue #107) Google announced it would be discontinuing support for Programmable Search Engines configured to search the "entire web." The project was named `google-researcher-mcp` — the dependency on a single search provider was an foundational risk. **Go fix**: A clean provider interface that lets you swap search backends freely, plus a routing layer that automatically switches to the next provider when one fails. ### Alternative Search Engines (Issue #55) Users wanted Brave, Bing (go figure), and other providers. But the TypeScript codebase was too tightly coupled to Google's API response format — the shared directory (41 files) made every change risky and far-reaching. **Go fix**: A clean `Provider` interface — each adapter normalizes provider-specific responses to common types (`SearchResult`, `ImageResult`, `NewsResult`). Adding a new provider is one file implementing one interface. ### Redis Caching (Issue #72) The in-memory cache was lost on every process restart — which happened frequently with npx-launched servers. The complex persistence manager offered four strategies (Periodic, WriteThrough, OnShutdown, Hybrid), but none reliably survived the volatile process lifecycle. **Go fix**: A `cache.Cache` interface with a hybrid implementation: an in-memory layer over an AES-encrypted disk layer, plus an optional shared Redis tier in HTTP mode for cross-pod deployments. Simple, testable, and it never loses data because the disk layer persists across restarts. ### Monolithic Architecture (Issue #40) The project had 100+ source files but a tightly coupled `shared/` directory with 41 files. Adding a single tool required touching 4+ documentation sections, and the import graph made refactoring perilous. **Go fix**: One package per concern. Tool handlers are self-contained files. Adding a tool means writing one file and one line in the registry. ## What Changed Architecturally | Aspect | Node.js (old) | Go (new) | |--------|---------------|----------| | Distribution | npm/npx (runtime required) | Single static binary | | Memory | 430MB idle (80MB after optimization) | ~25MB baseline | | Startup | 2-4 seconds (lazy imports) | <100ms | | Process lifecycle | Worker thread watchdog | EOF detection, no orphans | | Search providers | Google only | Multiple providers + fallback routing | | Concurrency | Event loop + async/await | Goroutines + semaphores | | Type safety | TypeScript + Zod | Go type system + struct tags | | Testing | Jest test suite | Table-driven tests + race detector | | Scraping | Playwright (heavy) | Multi-tier pipeline (lightweight first) | ## Key Lessons Learned ### 1. Don't Fight Your Runtime Node.js process management is fundamentally fragile for long-lived servers launched via npx. The runtime doesn't support robust parent-death detection, and the nested process tree (npx → node → worker) makes signal propagation unreliable. We spent three versions building increasingly complex orphan detection. Go's single binary eliminated the entire category of problems. **Takeaway**: If you're spending significant engineering effort working around your runtime's limitations, that's a signal to evaluate whether the runtime fits the problem. > Side note: looking for a better runtime I looked into both Go and Rust (isn't Rust awesome!?). Go won primarily for its lightweight goroutines excelling at I/O-bound operations, and the official `modelcontextprotocol/go-sdk` is superbly maintained. ### 2. Interface-Driven Design Enables Fearless Extension Adding Brave Search in the Go version was one file implementing one interface. In the Node.js version, the equivalent change would have touched 6+ files due to tightly coupled imports in the shared directory. **Takeaway**: When you know extension is likely (new providers, new tools), invest in clean interfaces upfront. The interface is the specification; implementations are interchangeable. ### 3. Memory Matters for MCP Servers MCP servers run alongside AI assistants on developer machines. They're always-on background processes. A 430MB idle memory footprint was unacceptable — users would notice and uninstall. Go's ~25MB baseline lets the server stay resident without impact. **Takeaway**: For developer tools that run continuously, memory efficiency is a feature, not an optimization. Choose your runtime accordingly. ### 4. Caching Architecture Should Be Boring The old project had four persistence strategies with complex heuristics for when to flush. The new one layers an in-memory cache over encrypted disk, with an optional shared Redis tier for multi-pod HTTP deployments. Each layer is simple and independently testable. No heuristics, no race conditions, no data loss. **Takeaway**: Boring infrastructure is reliable infrastructure. If your caching layer needs its own debugging session, it's too complex. ### 5. Documentation Should Be Drift-Resistant The old project required updating four separate documentation files per new tool. Inevitably, docs drifted from reality. The new project's test suite programmatically validates documentation claims — tool descriptions must mention alternatives, output schemas must match actual responses, and annotations must be consistent. **Takeaway**: If documentation can be wrong without a test failing, it will eventually be wrong. ## What We Kept The rewrite preserved the user-facing contract: - **Same tools** with identical semantics and parameter names - **Same MCP protocol** compatibility (Claude Desktop, Cursor, VS Code, any MCP client) - **Same environment variables** (drop-in replacement for existing configs) - **Same search lenses** (curated domain lists, identical JSON format) What improved (without breaking backwards compatibility): - OAuth 2.1 authentication for multi-client deployments - Multi-tenancy with per-tenant session isolation - Automatic failover between search providers when one goes down - Prometheus metrics for observability - Structured audit logging for compliance ## Results Since launching the Go version: - Zero orphan process reports (vs. recurring issue in Node.js version) - Multiple search providers with automatic failover (vs. single provider) - Multi-tier scraping pipeline that tries lightweight methods first (vs. Playwright-only) - Sub-100ms cold startup (vs. 2-4 seconds) - Production-ready: rate limiting, automatic failover, user isolation, and a compliance-ready audit trail ## Should You Rewrite? Probably not. Most rewrites fail because they're motivated by developer preference ("I want to use a new language") rather than architectural necessity. Ours succeeded because: 1. The problems were **architectural**, not implementational — no amount of refactoring within Node.js would fix process orphaning 2. The user-facing contract was **well-defined** — MCP provides a clean protocol boundary 3. The scope was **bounded** — we knew exactly what the server needed to do 4. We had **comprehensive tests** on the old version to validate behavioral equivalence If your problems are solvable within your current architecture, refactor. If they're fundamentally incompatible with your runtime or architecture, consider a rewrite — but only with clear success criteria and a well-defined boundary. --- *This article covers the migration from [google-researcher-mcp](https://github.com/zoharbabin/google-researcher-mcp) to [web-researcher-mcp](https://github.com/zoharbabin/web-researcher-mcp). The new project is open source under MIT and works with any MCP-compatible AI assistant.*