# Test Coverage Recommendations & Action Plan

**Weather MCP Server - v1.7.0**
**Date:** November 13, 2025

---

## Overview

This document provides actionable recommendations for improving the test coverage and reliability of the Weather MCP Server test suite. The current test foundation is excellent (A- grade), but specific areas need attention to achieve and maintain optimal quality.

---

## Current Status Summary

✅ **Strengths:**
- 1,070 comprehensive automated tests
- 99.6-99.9% pass rate, depending on the run (only external API timeouts failing)
- 100% coverage on critical utilities
- Fast unit test execution
- Strong security testing

⚠️ **Issues Identified:**
- Documentation claims "<2 seconds" execution time, but the full suite takes ~4-5 minutes
- 1-4 integration tests intermittently fail due to NOAA API timeouts
- Handler unit test coverage at 25% (relying on integration tests)
- New analytics module (v1.7.0) has 0% test coverage
- Service layer only 50% directly tested

---

## Priority 0: Critical & Immediate (This Week)

### 1. Fix Documentation Inaccuracy ⚠️

**Issue:** CLAUDE.md and README claim tests complete in "<2 seconds", but actual execution of the full suite is ~4-5 minutes.

**Impact:** Sets incorrect expectations and confuses contributors.

**Action Required:**
```markdown
Update documentation to clarify:
- **Unit tests only:** ~2 seconds (accurate)
- **Full suite (unit + integration):** 4-5 minutes (accurate)
```

**Files to Update:**
- `/home/dgahagan/work/personal/weather-mcp/weather-mcp/CLAUDE.md` (line ~8: "Performance: All tests must complete in < 2 seconds")
- `/home/dgahagan/work/personal/weather-mcp/weather-mcp/README.md` (if mentioned)
- `/home/dgahagan/work/personal/weather-mcp/weather-mcp/TEST_COVERAGE_REPORT_V1.0.md` (if mentioned)

**Recommended Text:**
```markdown
### Test Performance

- **Unit Tests:** ~2 seconds (1,008 tests)
  - Fast, deterministic, no external dependencies
  - Run on every commit for quick feedback

- **Integration Tests:** ~4 minutes (62 tests)
  - Test real API integrations with NOAA, NIFC, Blitzortung
  - May take longer due to network latency and API rate limits
  - Run before merge/release

- **Full Test Suite:** ~4-5 minutes (1,070 tests)
  - Comprehensive validation of all functionality
  - Parallel execution optimized
```

**Estimated Effort:** 15 minutes

---

### 2. Handle Flaky Integration Tests ⚠️

**Issue:** Tests in `tests/integration/safety-hazards.test.ts` intermittently fail due to NOAA API timeouts.

**Failing Tests:**
- `should find river gauges near St. Louis, MO (Mississippi River)`
- `should find river gauges near Houston, TX (near several rivers)`
- `should handle location with no nearby river gauges`
- `should clamp radius to valid range`

**Root Cause:** The external NOAA NWPS API is experiencing high latency (60+ seconds) or temporary unavailability.

**Impact:** CI/CD pipeline failures, spurious test failures, developer frustration

**Recommended Solutions:**

#### Option A: Increase Timeout (Quick Fix)
```typescript
// In tests/integration/safety-hazards.test.ts
it('should find river gauges near St. Louis, MO', async () => {
  // ... test code ...
}, 120000); // Increase from 60000 to 120000 (2 minutes)
```

**Pros:** Simple, one-line fix
**Cons:** Slower test execution, doesn't solve the underlying issue

#### Option B: Add Test Retry Logic (Recommended)
```typescript
// In vitest.config.ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    globals: true,
    environment: 'node',
    retry: 2, // Retry failed tests up to 2 times
    testTimeout: 120000, // 2 minutes default timeout
    coverage: {
      // ... existing config
    },
  },
});
```

**Pros:** Handles intermittent failures automatically
**Cons:** May hide real issues if tests always fail on the first try

#### Option C: Mock API Responses (Best Long-term)
```typescript
// Create tests/fixtures/noaa-responses.ts
export const mockRiverGaugeResponse = {
  // ... recorded real API response
};

// In tests/integration/safety-hazards.test.ts
import nock from 'nock';
import { mockRiverGaugeResponse } from '../fixtures/noaa-responses.js';

// Use nock or msw to mock HTTP responses
nock('https://api.weather.gov')
  .get('/nwps/gauges')
  .reply(200, mockRiverGaugeResponse);
```

**Pros:** Fast, deterministic, no external dependencies
**Cons:** More setup work, fixtures need to be maintained

**Recommendation:** Implement **Option B immediately** (5 minutes), then **Option C** for long-term reliability (2-4 hours).

**Estimated Effort:**
- Option B: 5 minutes
- Option C: 2-4 hours

---

### 3. Add Analytics Module Tests 🆕

**Issue:** The new analytics module (v1.7.0) introduced in `/home/dgahagan/work/personal/weather-mcp/weather-mcp/src/analytics/` has 0% test coverage.
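
Because the module handles location data, the privacy behaviour is probably the part most worth pinning down first. The following is a minimal sketch of the kind of property a redaction test would assert. The `AnalyticsEvent` shape and the `roundCoordinate` and `redactEvent` functions are hypothetical names used for illustration only; they are not the actual exports of `src/analytics/privacy.ts`, and the real implementation may work differently.

```typescript
// Hypothetical event shape; the real analytics payload may differ.
interface AnalyticsEvent {
  tool: string;
  latitude?: number;
  longitude?: number;
  locationName?: string;
  durationMs?: number;
}

/** Round a coordinate to 2 decimal places (about 1.1 km of latitude). */
function roundCoordinate(value: number): number {
  return Math.round(value * 100) / 100;
}

/**
 * Build a redacted copy of an event using an allow-list:
 * only fields that are explicitly copied here survive.
 */
function redactEvent(event: AnalyticsEvent): AnalyticsEvent {
  const redacted: AnalyticsEvent = { tool: event.tool };
  if (event.durationMs !== undefined) {
    redacted.durationMs = event.durationMs; // non-sensitive metadata is kept
  }
  if (event.latitude !== undefined) {
    redacted.latitude = roundCoordinate(event.latitude);
  }
  if (event.longitude !== undefined) {
    redacted.longitude = roundCoordinate(event.longitude);
  }
  // locationName is intentionally never copied.
  return redacted;
}

const redacted = redactEvent({
  tool: 'get_forecast',
  latitude: 40.7128,
  longitude: -74.006,
  locationName: 'New York, NY',
  durationMs: 42,
});
console.log(redacted);
```

In the real suite these checks would be written as `expect(...)` assertions in `tests/unit/analytics.test.ts` against whatever redaction function the module actually exports.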

**Risk:** Untested production code, potential bugs in privacy-sensitive functionality

**Files Needing Tests:**
```
src/analytics/
├── index.ts    # Main analytics entry point
├── mqtt.ts     # MQTT publishing
└── privacy.ts  # PII redaction
```

**Required Test Coverage:**

#### Create `tests/unit/analytics.test.ts`

```typescript
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
import { withAnalytics, analytics } from '../../src/analytics/index.js';

describe('Analytics Module', () => {
  describe('Event Collection', () => {
    it('should collect tool usage events', () => {
      // Test event collection
    });

    it('should redact PII from events', () => {
      // Test privacy redaction
    });

    it('should handle opt-out correctly', () => {
      // Test ANALYTICS_ENABLED=false
    });
  });

  describe('MQTT Publishing', () => {
    it('should publish events to MQTT broker', async () => {
      // Mock MQTT client
    });

    it('should handle MQTT connection failures gracefully', async () => {
      // Test error handling
    });

    it('should not publish when analytics disabled', async () => {
      // Test opt-out behavior
    });
  });

  describe('Privacy Redaction', () => {
    it('should redact coordinates to 2 decimal places', () => {
      // Test coordinate privacy
    });

    it('should remove location names', () => {
      // Test PII removal
    });

    it('should preserve non-sensitive metadata', () => {
      // Test selective redaction
    });
  });
});
```

**Estimated Test Count:** 20-30 tests
**Estimated Effort:** 4-6 hours

---

## Priority 1: Short-term Improvements (This Sprint)

### 4. Expand Handler Unit Tests

**Issue:** Only 3 of 12 handlers have dedicated unit tests (25% coverage).

**Current State:**
- ✅ weatherImageryHandler: 34 tests
- ✅ lightningHandler: 34 tests
- ⚠️ forecastHandler: Indirect tests only (bounds-checking)
- ⚠️ alertsHandler: Indirect tests only (alert-sorting)
- ❌ 8 other handlers: No direct unit tests

**Handlers Needing Tests:**

1. **currentConditionsHandler.ts** - Current weather + fire weather indices
2. **historicalWeatherHandler.ts** - Historical data retrieval
3. **statusHandler.ts** - Service health checks
4. **locationHandler.ts** - Location search/geocoding
5. **airQualityHandler.ts** - Air quality index
6. **marineConditionsHandler.ts** - Marine/wave conditions
7. **riverConditionsHandler.ts** - River gauge data
8. **wildfireHandler.ts** - Wildfire tracking

**Template for Handler Tests:**

```typescript
// tests/unit/currentConditions-handler.test.ts
import { describe, it, expect, vi } from 'vitest';
import { handleGetCurrentConditions } from '../../src/handlers/currentConditionsHandler.js';
import { NOAAService } from '../../src/services/noaa.js';

describe('Current Conditions Handler', () => {
  describe('Parameter Validation', () => {
    it('should accept valid coordinates', async () => {
      const mockService = createMockNOAAService();
      const result = await handleGetCurrentConditions(
        { latitude: 40.7128, longitude: -74.0060 },
        mockService
      );
      expect(result).toBeDefined();
    });

    it('should reject invalid coordinates', async () => {
      const mockService = createMockNOAAService();
      await expect(
        handleGetCurrentConditions(
          { latitude: 999, longitude: -74.0060 },
          mockService
        )
      ).rejects.toThrow('Invalid latitude');
    });
  });

  describe('Response Formatting', () => {
    it('should format temperature correctly', async () => {
      // Test response structure
    });

    it('should include fire weather indices when available', async () => {
      // Test fire weather inclusion
    });

    it('should handle missing data gracefully', async () => {
      // Test null/undefined handling
    });
  });

  describe('Error Handling', () => {
    it('should handle API failures', async () => {
      // Test error recovery
    });

    it('should return user-friendly error messages', async () => {
      // Test error formatting
    });
  });
});

// Only the service methods the handler calls need to be stubbed.
function createMockNOAAService() {
  return {
    getGridpoint: vi.fn().mockResolvedValue({ /* mock data */ }),
    getStationObservation: vi.fn().mockResolvedValue({ /* mock data */ }),
  } as unknown as NOAAService;
}
```

**Estimated Effort:** 2-3 hours per handler = **16-24 hours total**

**Prioritization:**
1. riverConditionsHandler (has failing integration tests)
2. currentConditionsHandler (most commonly used)
3. historicalWeatherHandler (complex logic)
4. Rest as time permits

---

### 5. Add Service Layer Unit Tests

**Issue:** Only 3 of 6 services have direct unit tests (50% coverage).

**Services Needing Tests:**

1. **noaa.ts** - NOAA Weather API client (most critical)
2. **openmeteo.ts** - Open-Meteo API client
3. **nifc.ts** - NIFC wildfire API client

**Why Important:**
- Services contain retry logic, error handling, response parsing
- Currently only tested via slow integration tests
- Fast unit tests enable rapid iteration

**Example Test Structure:**

```typescript
// tests/unit/noaa-service.test.ts
import { describe, it, expect, vi } from 'vitest';
import { NOAAService } from '../../src/services/noaa.js';
import axios from 'axios';

vi.mock('axios');

describe('NOAA Service', () => {
  describe('API Request Handling', () => {
    it('should make correct API request', async () => {
      const mockAxios = vi.mocked(axios);
      mockAxios.get.mockResolvedValue({ data: { /* mock response */ } });

      const service = new NOAAService({ userAgent: 'test' });
      await service.getGridpoint(40.7128, -74.0060);

      expect(mockAxios.get).toHaveBeenCalledWith(
        expect.stringContaining('weather.gov'),
        expect.objectContaining({ headers: { 'User-Agent': 'test' } })
      );
    });

    it('should retry on transient failures', async () => {
      const mockAxios = vi.mocked(axios);
      mockAxios.get
        .mockRejectedValueOnce(new Error('Timeout'))
        .mockResolvedValueOnce({ data: { /* success */ } });

      const service = new NOAAService({ userAgent: 'test' });
      const result = await service.getGridpoint(40.7128, -74.0060);

      expect(mockAxios.get).toHaveBeenCalledTimes(2);
      expect(result).toBeDefined();
    });

    it('should throw after max retries', async () => {
      const mockAxios = vi.mocked(axios);
      mockAxios.get.mockRejectedValue(new Error('Timeout'));

      const service = new NOAAService({ userAgent: 'test' });
      await expect(
        service.getGridpoint(40.7128, -74.0060)
      ).rejects.toThrow();
    });
  });

  describe('Response Parsing', () => {
    it('should parse gridpoint response', async () => {
      // Test response transformation
    });

    it('should handle malformed responses', async () => {
      // Test error handling
    });
  });
});
```

**Estimated Effort:** 3-4 hours per service = **9-12 hours total**

---

### 6. Implement VCR Pattern for Integration Tests

**Issue:** Integration tests depend on external APIs, causing slowness and flakiness.

**Solution:** Record real API responses and replay them in tests.

**Implementation Options:**

#### Option A: Manual Recording
```typescript
// tests/fixtures/api-responses/noaa-st-louis-river.json
{
  "timestamp": "2025-11-13T12:00:00Z",
  "request": {
    "url": "https://api.weather.gov/nwps/gauges",
    "params": { "lat": 38.6270, "lon": -90.1994 }
  },
  "response": {
    "status": 200,
    "data": { /* actual API response */ }
  }
}
```

#### Option B: Use nock Library
```bash
npm install --save-dev nock
```

nock ships its own TypeScript type definitions, so a separate `@types/nock` package is not needed.

```typescript
// tests/integration/safety-hazards.test.ts
import nock from 'nock';
import stLouisResponse from '../fixtures/api-responses/noaa-st-louis-river.json';

describe('River Conditions', () => {
  beforeEach(() => {
    nock('https://api.weather.gov')
      .get('/nwps/gauges')
      .query({ lat: 38.6270, lon: -90.1994 })
      // Reply with the recorded body from the Option A fixture format
      .reply(200, stLouisResponse.response.data);
  });

  afterEach(() => {
    // Remove interceptors so they do not leak into other tests
    nock.cleanAll();
  });

  it('should find river gauges near St. Louis', async () => {
    // Test runs with mocked response
  });
});
```

#### Option C: Use MSW (Mock Service Worker)
```bash
npm install --save-dev msw
```

```typescript
// tests/mocks/handlers.ts
import { http, HttpResponse } from 'msw';

export const handlers = [
  http.get('https://api.weather.gov/nwps/gauges', () => {
    return HttpResponse.json({ /* mock response */ });
  }),
];
```

**Recommendation:** Start with **Option B (nock)** - simple, well-established, good TypeScript support.

**Estimated Effort:** 6-8 hours

---

## Priority 2: Long-term Enhancements (Next Sprint)

### 7. Add Coverage Reporting to CI/CD

**Goal:** Automate coverage tracking and reporting.

**Implementation:**

```yaml
# .github/workflows/test.yml
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: '18'
      - run: npm ci
      - run: npm test
      - run: npm run test:coverage
      - uses: codecov/codecov-action@v3
        with:
          files: ./coverage/coverage-final.json
```

**Add Coverage Badge to README:**
```markdown
[![Coverage](https://codecov.io/gh/weather-mcp/weather-mcp/branch/main/graph/badge.svg)](https://codecov.io/gh/weather-mcp/weather-mcp)
```

**Estimated Effort:** 2-3 hours

---

### 8. Add Performance Regression Testing

**Goal:** Track test execution time and prevent performance degradation.

**Implementation:**

```typescript
// tests/performance/benchmarks.test.ts
import { describe, it, expect } from 'vitest';
import { performance } from 'perf_hooks';

describe('Performance Benchmarks', () => {
  it('cache lookup should complete in <1ms', () => {
    const start = performance.now();
    // ... cache operation ...
    const duration = performance.now() - start;
    expect(duration).toBeLessThan(1);
  });

  it('coordinate validation should complete in <0.1ms', () => {
    // ... validation benchmark ...
  });
});
```

**Track Trends:**
- Store benchmark results in CI artifacts
- Alert on >10% performance degradation
- Graph trends over time

**Estimated Effort:** 4-6 hours

---

### 9. Create Test Utilities & Helpers

**Goal:** Reduce test code duplication and make tests easier to write.

**Create Test Helpers:**

```typescript
// tests/helpers/mocks.ts
import { vi } from 'vitest';
import type { NOAAService } from '../../src/services/noaa.js';

// mockGridpointData and mockObservationData are placeholder fixtures
// that still need to be defined (for example in tests/helpers/fixtures.ts).
export function createMockNOAAService(overrides = {}) {
  return {
    getGridpoint: vi.fn().mockResolvedValue(mockGridpointData),
    getStationObservation: vi.fn().mockResolvedValue(mockObservationData),
    ...overrides
  } as unknown as NOAAService;
}

export function createMockOpenMeteoService(overrides = {}) {
  // ... similar pattern
}

// tests/helpers/fixtures.ts
export const VALID_COORDINATES = {
  NEW_YORK: { latitude: 40.7128, longitude: -74.0060 },
  LOS_ANGELES: { latitude: 34.0522, longitude: -118.2437 },
  CHICAGO: { latitude: 41.8781, longitude: -87.6298 },
};

export const MOCK_WEATHER_DATA = {
  temperature: 72,
  humidity: 65,
  windSpeed: 10,
  // ... common test data
};

// tests/helpers/assertions.ts
import { expect } from 'vitest';

// Minimal response shape assumed for illustration; `unknown` would not
// allow the property access below to type-check.
interface ToolResponse {
  content: Array<{ type: string; text: string }>;
}

export function expectValidWeatherResponse(response: ToolResponse) {
  expect(response).toBeDefined();
  expect(response.content).toBeDefined();
  expect(response.content[0].type).toBe('text');
  expect(response.content[0].text).toContain('Weather');
}
```

**Usage:**
```typescript
// In tests
// Assumes tests/helpers/index.ts re-exports the helpers defined above
import {
  createMockNOAAService,
  VALID_COORDINATES,
  expectValidWeatherResponse
} from '../helpers/index.js';

it('should get current conditions', async () => {
  const service = createMockNOAAService();
  const result = await handleGetCurrentConditions(
    VALID_COORDINATES.NEW_YORK,
    service
  );
  expectValidWeatherResponse(result);
});
```

**Estimated Effort:** 3-4 hours

---

### 10. Establish Test Maintenance Guidelines

**Goal:** Keep tests maintainable as the codebase grows.

**Create `tests/README.md`:**

```markdown
# Testing Guidelines

## Philosophy
- Tests should be fast, reliable, and maintainable
- Unit tests for logic, integration tests for APIs
- Mock external dependencies in unit tests
- Every bug fix should include a regression test

## Organization
- `tests/unit/` - Fast, isolated tests (no I/O)
- `tests/integration/` - Tests with real API calls
- `tests/fixtures/` - Mock API responses
- `tests/helpers/` - Shared test utilities

## Writing Tests
- Use descriptive test names: "should X when Y"
- Follow AAA pattern: Arrange, Act, Assert
- Test edge cases and error conditions
- Keep tests independent (no shared state)

## Running Tests
- `npm test` - All tests
- `npm test tests/unit/` - Unit tests only
- `npm run test:watch` - Watch mode
- `npm run test:coverage` - Coverage report

## Coverage Goals
- Critical utilities: 100%
- Handlers: 80%+
- Services: 80%+
- Overall: 80%+

## When Tests Fail
- Integration tests may fail due to external APIs
- Check service status at weather.gov
- Retry tests or run unit tests only
- Report persistent failures as issues
```

**Estimated Effort:** 1-2 hours

---

## Action Plan Timeline

### Week 1 (Immediate)
- [ ] Fix documentation (P0 #1) - 15 min
- [ ] Add test retry logic (P0 #2) - 5 min
- [ ] Start analytics module tests (P0 #3) - 4-6 hours

### Week 2 (Short-term)
- [ ] Finish analytics module tests (P0 #3) - remaining time
- [ ] Add riverConditionsHandler tests (P1 #4) - 3 hours
- [ ] Add currentConditionsHandler tests (P1 #4) - 3 hours
- [ ] Add NOAA service tests (P1 #5) - 4 hours

### Week 3 (Short-term continued)
- [ ] Add historicalWeatherHandler tests (P1 #4) - 3 hours
- [ ] Implement VCR pattern (P1 #6) - 6-8 hours

### Week 4 (Long-term start)
- [ ] Add remaining handler tests (P1 #4) - 10-15 hours
- [ ] Create test helpers (P2 #9) - 3-4 hours

### Month 2 (Long-term)
- [ ] Add coverage reporting to CI (P2 #7) - 2-3 hours
- [ ] Add performance benchmarks (P2 #8) - 4-6 hours
- [ ] Create test maintenance guidelines (P2 #10) - 1-2 hours
- [ ] Add remaining service tests (P1 #5) - 5-8 hours

---

## Success Metrics

### Target Metrics (End of Month 2)

| Metric | Current | Target | Status |
|--------|---------|--------|--------|
| Total Tests | 1,070 | 1,300+ | In Progress |
| Pass Rate | 99.6% | 100% | Needs Work |
| Unit Tests | 1,008 | 1,200+ | In Progress |
| Handler Coverage | 25% | 80% | Needs Work |
| Service Coverage | 50% | 80% | Needs Work |
| Analytics Coverage | 0% | 80% | Needs Work |
| Flaky Test Rate | 0.4% | 0% | Needs Work |
| Documentation Accuracy | 90% | 100% | Needs Work |

### Definition of Done

A test suite is considered "excellent" when:
- ✅ 100% pass rate on unit tests
- ✅ 99%+ pass rate on integration tests (allowing for external API issues)
- ✅ <2s execution for unit tests
- ✅ <5min execution for full suite
- ✅ 80%+ code coverage across all modules
- ✅ 0% flaky tests (excluding known external API issues)
- ✅ Comprehensive documentation
- ✅ CI/CD integration with coverage reporting

---

## Maintenance Schedule

### Daily
- Monitor CI/CD test results
- Investigate and fix failing tests within 24 hours

### Weekly
- Review test execution times for regressions
- Update test fixtures if APIs change
- Review and merge test improvements

### Monthly
- Generate coverage report and trend analysis
- Review and update test documentation
- Audit for flaky tests and fix/skip them
- Update test helpers and utilities

### Quarterly
- Comprehensive test suite audit
- Remove obsolete tests
- Refactor duplicated test code
- Update testing guidelines

---

## Resources & References

### Documentation
- Vitest: https://vitest.dev/
- Nock: https://github.com/nock/nock
- MSW: https://mswjs.io/
- Codecov: https://about.codecov.io/

### Internal Docs
- `/home/dgahagan/work/personal/weather-mcp/weather-mcp/CLAUDE.md` - Development guide
- `/home/dgahagan/work/personal/weather-mcp/weather-mcp/TEST_COVERAGE_ANALYSIS_2025.md` - Full analysis
- `/home/dgahagan/work/personal/weather-mcp/weather-mcp/TEST_COVERAGE_REPORT_V1.0.md` - v1.0 report

---

**Document Version:** 1.0
**Last Updated:** November 13, 2025
**Next Review:** December 13, 2025
**Prepared By:** Test Automation Engineer
**Status:** ✅ **Approved** - Ready for implementation