--- name: testing-principles description: Language-agnostic testing principles including TDD, test quality, coverage standards, and test design patterns. Use when writing tests, designing test strategies, or reviewing test quality. --- # Language-Agnostic Testing Principles ## Test-Driven Development (TDD) ### The RED-GREEN-REFACTOR Cycle **Always follow this cycle:** 1. **RED**: Write a failing test first - Write the test before implementation - Ensure the test fails for the right reason - Verify test can actually fail 2. **GREEN**: Write minimal code to pass - Implement just enough to make the test pass - Focus on making it work 3. **REFACTOR**: Improve code structure - Clean up implementation - Eliminate duplication - Improve naming and clarity - Keep all tests passing 4. **VERIFY**: Ensure all tests still pass - Run full test suite - Check for regressions - Validate refactoring didn't break anything ## Quality Requirements ### Coverage Standards - **Minimum 80% code coverage** for production code - Prioritize critical paths and business logic - Prioritize meaningful assertions over coverage percentage - Use coverage as a guide, not a goal ### Test Characteristics All tests must be: - **Independent**: No dependencies between tests (see Test Independence Verification for detailed criteria) - **Reproducible**: Same input always produces same output - **Fast**: Unit tests < 100ms each, integration tests < 1s each, full suite < 10 minutes - **Self-checking**: Clear pass/fail without manual verification - **Timely**: Written close to the code they test ## Test Types ### Unit Tests **Purpose**: Test individual components in isolation **Characteristics**: - Test single function, method, or class - Fast execution (milliseconds) - No external dependencies - Mock external services - Majority of your test suite ### Integration Tests **Purpose**: Test interactions between components **Characteristics**: - Test multiple components together - May include database, file system, or APIs - Slower than unit tests - Verify contracts between modules - Smaller portion of test suite ### End-to-End (E2E) Tests **Purpose**: Test complete workflows from user perspective **Characteristics**: - Test entire application stack - Simulate real user interactions - Slowest test type - Fewest in number - Highest confidence level ## Test Design Principles ### AAA Pattern (Arrange-Act-Assert) Structure every test in three clear phases: ``` // Arrange: Setup test data and conditions user = createTestUser() validator = createValidator() // Act: Execute the code under test result = validator.validate(user) // Assert: Verify expected outcome assert(result.isValid == true) ``` **Adaptation**: Apply this structure using your language's idioms (methods, functions, procedures) ### One Assertion Per Concept - Test one behavior per test case - Multiple assertions OK if testing single concept - Split unrelated assertions into separate tests Example: prefer `returns error when email is invalid` over `validates user`. ### Descriptive Test Names Test names should clearly describe: - What is being tested - Under what conditions - What the expected outcome is **Recommended format**: `"should [expected behavior] when [condition]"` **Examples**: ``` test("should return error when email is invalid") test("should calculate discount when user is premium") test("should throw exception when file not found") ``` **Adaptation**: Follow your project's naming convention (camelCase, snake_case, describe/it blocks) ## Test Independence ### Setup and Teardown - Use setup hooks to prepare test environment - Use teardown hooks to clean up resources - Keep setup minimal and focused - Ensure teardown runs even if test fails ## Mocking and Test Doubles ### When to Use Mocks - **Mock external dependencies**: APIs, databases, file systems - **Mock slow operations**: Network calls, heavy computations - **Mock unpredictable behavior**: Random values, current time - **Mock unavailable services**: Third-party services ### Mocking Principles - Mock at boundaries, not internally - Keep mocks simple and focused - Verify mock expectations when relevant - Wrap external libraries/frameworks behind adapters and mock the adapter ## Data Layer Testing ### Mock Limitations for Data Layer Mocks validate call patterns but cannot verify data layer correctness. The following pass through undetected with mock-only testing: - Schema mismatches (table names, column names, data types) - Query correctness (joins, filters, aggregations, grouping) - Database constraints (NOT NULL, UNIQUE, foreign keys) - Migration drift (schema changes that make code out of sync) ### When Mocks Are Appropriate for Data Access - Testing business logic that receives data from the data layer (mock the repository, test the service) - Testing error handling paths (simulating connection failures, timeouts) - Unit tests where data access is a dependency, not the subject under test ### When Mocks Are Insufficient for Data Access - Testing repository or data access implementations themselves - Verifying query correctness (joins, filters, aggregations, grouping) - Testing data integrity constraints - Testing migration compatibility ### Real Database Testing (Environment-Dependent) Options for verifying data layer correctness against a real database engine: - **Containerized databases** for CI environments - **In-memory databases** for fast feedback (note: dialect differences may mask issues) - **Dedicated test databases** with seed data The appropriate approach depends on project environment and CI/CD capabilities. ### AI-Generated Code and Schema Awareness - AI-generated data access code has heightened schema hallucination risk - Generated queries may use correct syntax but reference nonexistent schema elements - Mock-based tests pass regardless of schema accuracy - Mitigation: Design Docs should include explicit schema references so that documented schemas can be cross-checked against data access code during review ## Test Quality Practices ### Keep Tests Active - **Fix or delete failing tests**: Resolve failures immediately - **Remove commented-out tests**: Fix them or delete entirely - **Keep tests running**: Broken tests lose value quickly - **Maintain test suite**: Refactor tests as needed ### Test Helpers and Utilities - Create reusable test data builders - Extract common setup into helper functions - Build test utilities for complex scenarios - Share helpers across test files appropriately ## What to Test ### Focus on Behavior **Test observable behavior, not implementation**: ✓ **Good**: Test that function returns expected output ✓ **Good**: Test that correct API endpoint is called ✗ **Bad**: Test that internal variable was set ✗ **Bad**: Test order of private method calls ### Test Public APIs - Test through public interfaces - Avoid testing private methods directly - Test return values, outputs, exceptions - Test side effects (database, files, logs) ### Test Edge Cases Always test: - **Boundary conditions**: Min/max values, empty collections - **Error cases**: Invalid input, null values, missing data - **Edge cases**: Special characters, extreme values - **Happy path**: Normal, expected usage ## Test Quality Criteria These criteria ensure reliable, maintainable tests. ### Literal Expected Values - Use hardcoded literal values in assertions - Calculate expected values independently from the implementation - If the implementation has a bug, the test catches it through independent verification - If expected value equals mock return value unchanged, the test verifies nothing (no transformation occurred) ### Result-Based Verification - Verify final results and observable outcomes - Assert on return values, output data, or system state changes - For mock verification, check that correct arguments were passed ### Meaningful Assertions - Every test must include at least one assertion - Assertions must validate observable behavior - A test without assertions always passes and provides no value ### Appropriate Mock Scope - Mock direct external I/O dependencies: databases, HTTP clients, file systems - Use real implementations for internal utilities and business logic - Over-mocking reduces test value by verifying wiring instead of behavior ### Boundary Value Testing Test at boundaries of valid input ranges: - Minimum valid value - Maximum valid value - Just below minimum (invalid) - Just above maximum (invalid) - Empty input (where applicable) ### Test Independence Verification Each test must: - Create its own test data - Not depend on execution order - Clean up its own state - Pass when run in isolation ## Verification Requirements ### Before Commit - ✓ All tests pass — fix failing tests immediately - ✓ No tests skipped or commented — delete or fix - ✓ No debug code left in tests - ✓ Test coverage meets standards - ✓ No flaky tests — make deterministic - ✓ Tests run within performance thresholds ## Test Organization ### File Structure - **Mirror production structure**: Tests follow code organization - **Clear naming conventions**: Follow project's test file patterns - Examples: `UserService.test.*`, `user_service_test.*`, `test_user_service.*`, `UserServiceTests.*` - **Logical grouping**: Group related tests together - **Separate test types**: Unit, integration, e2e in separate directories ## Performance Considerations ### Test Speed - **Unit tests**: < 100ms each - **Integration tests**: < 1s each - **Full suite**: Should run frequently (< 10 minutes) ### Optimization Strategies - Run tests in parallel when possible - Use in-memory databases for tests - Mock expensive operations - Split slow test suites - Profile and optimize slow tests ## Continuous Integration ### CI/CD Requirements - Run full test suite on every commit - Block merges if tests fail - Run tests in isolated environments - Test on target platforms/versions ### Test Reports - Generate coverage reports - Track test execution time - Identify flaky tests - Monitor test trends ## Test Design Guardrails ### Every Test Must - Include at least one meaningful assertion - Create its own test data and clean up its own state - Pass when run in any order and in isolation - Test observable behavior through public interfaces - Keep test logic simple (no branching, no loops) - Mock only external I/O boundaries, use real implementations for internal logic ### Flaky Test Resolution - Use deterministic time mocking instead of real clocks - Use fixed seed values instead of random data - Ensure proper resource cleanup in teardown - Resolve race conditions with synchronization primitives ## Regression Testing ### Prevent Regressions - Add test for every bug fix - Maintain comprehensive test suite - Run full suite regularly - Keep all tests unless the tested functionality is removed ### Legacy Code - Add characterization tests before refactoring - Test existing behavior first - Gradually improve coverage - Refactor with confidence ## Testing Best Practices by Language Paradigm ### Type System Utilization **For languages with static type systems:** - Leverage compile-time verification for correctness - Focus tests on business logic and runtime behavior - Use language's type system to prevent invalid states **For languages with dynamic typing:** - Add comprehensive runtime validation tests - Explicitly test data contract validation - Consider property-based testing for broader coverage ### Programming Paradigm Considerations **Functional approach:** - Test pure functions thoroughly (deterministic, no side effects) - Test side effects at system boundaries - Leverage property-based testing for invariants **Object-oriented approach:** - Test behavior through public interfaces - Mock dependencies via abstraction layers - Test polymorphic behavior carefully **Common principle:** Adapt testing strategy to leverage language strengths while ensuring comprehensive coverage ## Documentation and Communication ### Tests as Documentation - Tests document expected behavior - Use clear, descriptive test names - Include examples of usage - Show edge cases and error handling ### Test Failure Messages - Provide clear, actionable error messages - Include actual vs expected values - Add context about what was being tested - Make debugging easier