---
name: seo-technical-audit
description: Expert guide for technical SEO auditing. Use when checking crawlability, indexing issues, page speed, Core Web Vitals, mobile-friendliness, HTTPS configuration, robots.txt, XML sitemaps, canonical tags, hreflang, redirect chains, HTTP status codes, or diagnosing technical SEO problems.
---
# Technical SEO Audit
---
## Crawlability
Search engines must be able to discover and access your pages. Crawlability issues block indexing entirely.
### Robots.txt
Controls which URLs crawlers can access.
```
# Allow all crawlers access to everything
User-agent: *
Allow: /
# Block specific paths
User-agent: *
Disallow: /admin/
Disallow: /api/
Disallow: /staging/
Disallow: /*?sort=
Disallow: /*?filter=
# Sitemap reference (always include)
Sitemap: https://example.com/sitemap.xml
```
**Critical rules:**
- Never block CSS/JS files (Google needs them to render pages)
- Never block important content paths accidentally
- Test with `analyze_robots_txt` before deploying changes
- `Crawl-delay` is ignored by Google (honored by Bing/Yandex)
### Meta Robots & X-Robots-Tag
Page-level crawl and index directives.
| Directive | Effect |
|-----------|--------|
| `index` | Allow indexing (default) |
| `noindex` | Prevent indexing — removes from search results |
| `follow` | Follow links on this page (default) |
| `nofollow` | Don't follow any links on this page |
| `noarchive` | Don't show cached version |
| `nosnippet` | Don't show text snippet in results |
| `max-snippet:N` | Limit snippet to N characters |
| `max-image-preview:large` | Allow large image previews |
```html
X-Robots-Tag: noindex, nofollow
```
**MCP Tool:** `analyze_page` extracts both meta robots and HTTP header directives.
---
## Indexing
### Canonical Tags
The definitive way to tell search engines which URL is the preferred version.
**Rules:**
- Every indexable page: self-referencing canonical
- Absolute URLs only (`https://example.com/page`, not `/page`)
- Must be consistent: canonical URL must return 200 (not redirect or 404)
- One canonical per page (if multiple, Google uses the first)
- HTTP header canonical overrides HTML canonical if both present
### Noindex vs Canonical
| Scenario | Use |
|----------|-----|
| Page should never appear in search | `noindex` |
| Duplicate page, prefer another version | `canonical` to preferred version |
| URL parameter variants | `canonical` to clean URL |
| Paginated content | Self-referencing canonical on each page |
### Sitemap Submission
- Submit via Google Search Console
- All indexable pages should be in the sitemap
- Noindex pages should NOT be in the sitemap
- Keep sitemaps under 50,000 URLs / 50MB per file (use sitemap index for larger sites)
**MCP Tools:** `analyze_sitemap` validates structure, `gsc_sitemaps` checks submission status.
---
## Core Web Vitals
Google's page experience signals. Measured from real user data (CrUX) and lab data (Lighthouse).
### Thresholds (2024+)
| Metric | Good | Needs Improvement | Poor |
|--------|------|-------------------|------|
| **LCP** (Largest Contentful Paint) | ≤ 2.5s | ≤ 4.0s | > 4.0s |
| **INP** (Interaction to Next Paint) | ≤ 200ms | ≤ 500ms | > 500ms |
| **CLS** (Cumulative Layout Shift) | ≤ 0.1 | ≤ 0.25 | > 0.25 |
### LCP Optimization
Largest Contentful Paint measures when the largest visible element finishes rendering.
**Common causes of poor LCP:**
1. Slow server response (TTFB > 800ms) → optimize server, use CDN
2. Render-blocking resources → defer non-critical CSS/JS
3. Slow resource load → optimize/compress LCP image, use `preload`
4. Client-side rendering → server-side render critical content
**Quick wins:**
- Preload LCP image: ``
- Use CDN for static assets
- Optimize server response (caching, database queries)
- Avoid lazy-loading the LCP element
### INP Optimization
Interaction to Next Paint measures responsiveness to user input.
**Common causes of poor INP:**
1. Long JavaScript tasks blocking main thread → break into smaller tasks
2. Heavy event handlers → debounce, use `requestAnimationFrame`
3. Large DOM size → reduce DOM nodes (target < 1,500)
4. Third-party scripts → defer or lazy-load
### CLS Optimization
Cumulative Layout Shift measures visual stability.
**Common causes of poor CLS:**
1. Images without dimensions → always set `width` and `height`
2. Ads/embeds without reserved space → use `aspect-ratio` or `min-height`
3. Web fonts causing FOIT/FOUT → use `font-display: swap` + preload
4. Dynamically injected content → reserve space before injection
**MCP Tool:** `check_core_web_vitals` returns all metrics with specific optimization suggestions.
---
## Page Speed Optimization
Beyond Core Web Vitals, overall page speed impacts user experience and crawl budget.
### Critical rendering path
1. Minimize render-blocking resources (critical CSS inline, defer JS)
2. Enable compression (Brotli preferred, gzip minimum)
3. Set cache headers (`Cache-Control: max-age=31536000` for static assets)
4. Use HTTP/2 or HTTP/3
5. Minimize main-thread work (reduce JS execution time)
### Image optimization
- Use modern formats (WebP/AVIF)
- Serve responsive images (`srcset`)
- Lazy load below-the-fold images
- Compress aggressively (80-85% quality)
### Resource hints
```html
```
---
## Mobile-Friendliness
Google uses mobile-first indexing — the mobile version of your site is what gets indexed.
### Requirements
- Viewport meta tag: ``
- Responsive design (content adapts to screen size)
- No horizontal scrolling
- Tap targets: minimum 48x48px with 8px spacing
- Font size: minimum 16px for body text
- No intrusive interstitials (popups covering content)
- Content parity: mobile version has same content as desktop
**MCP Tool:** `check_mobile_friendly` evaluates all mobile usability factors.
---
## HTTPS & Security
### HTTPS requirements
- All pages served over HTTPS (no HTTP pages in index)
- Valid SSL certificate (not expired, correct domain)
- No mixed content (HTTPS page loading HTTP resources)
- HSTS header recommended: `Strict-Transport-Security: max-age=31536000; includeSubDomains`
- HTTP pages 301 redirect to HTTPS
### Common HTTPS issues
| Issue | Impact | Fix |
|-------|--------|-----|
| Mixed content | Medium | Update all resource URLs to HTTPS |
| Expired certificate | Critical | Renew SSL certificate immediately |
| HTTP pages indexed | High | 301 redirect HTTP → HTTPS + update canonical |
| Missing HSTS | Low | Add HSTS header |
---
## Redirects
### Redirect types
| Code | Type | When to Use | SEO Impact |
|------|------|-------------|------------|
| 301 | Permanent | URL changed permanently, content moved | Passes ~95% link equity |
| 302 | Temporary | Temporary move (A/B test, maintenance) | Does not pass link equity |
| 307 | Temporary (strict) | Same as 302 but preserves HTTP method | Does not pass link equity |
| 308 | Permanent (strict) | Same as 301 but preserves HTTP method | Passes link equity |
### Redirect issues
- **Redirect chains:** A → B → C → D. Maximum 2 hops recommended. Fix by pointing A directly to D.
- **Redirect loops:** A → B → A. Critical error — page becomes inaccessible.
- **Soft 404s:** Page returns 200 but shows "not found" content. Should return actual 404.
- **302 where 301 needed:** Temporary redirect for permanent move wastes link equity.
---
## XML Sitemaps
### Requirements
- Located at `/sitemap.xml` (or referenced in robots.txt)
- Valid XML format
- Only indexable pages (no noindex, no 404s, no redirects)
- Include `` with accurate dates (ISO 8601)
- Maximum 50,000 URLs per sitemap file
- Maximum 50MB uncompressed per file
- Use sitemap index for larger sites
### Validation
```xml
https://example.com/page2026-01-15weekly0.8
```
**MCP Tool:** `analyze_sitemap` parses, validates, and reports issues.
---
## Hreflang (International SEO)
For multi-language or multi-region sites. Tells Google which language/region version to show.
### Syntax
```html
```
### Rules
- **Bidirectional:** If page A references page B, page B must reference page A
- **Self-referencing:** Each page must include a hreflang pointing to itself
- **x-default:** Include for the fallback/language selector page
- **Return tags:** Every referenced URL must return the same hreflang set
- Use ISO 639-1 language codes, optionally with ISO 3166-1 region codes
---
## HTTP Status Codes (SEO Impact)
| Code | Meaning | SEO Action |
|------|---------|------------|
| 200 | OK | Expected for all indexable pages |
| 301 | Moved Permanently | Passes link equity. Use for permanent URL changes |
| 302 | Found (Temporary) | Does NOT pass link equity. Use only for temporary moves |
| 304 | Not Modified | Good — efficient caching |
| 404 | Not Found | Remove from sitemap, fix internal links pointing here |
| 410 | Gone | Permanently removed. Stronger signal than 404 for deindexing |
| 500 | Server Error | Fix immediately — blocks crawling and indexing |
| 503 | Service Unavailable | Temporary. Google retries. Use for planned maintenance |
---
## Technical Audit Workflow
Recommended order for a complete technical audit:
1. **Crawlability:** `analyze_robots_txt` → check for blocking issues
2. **Sitemap:** `analyze_sitemap` → validate structure and URLs
3. **Page-level:** `analyze_page` → meta robots, canonical, redirects, status
4. **Performance:** `check_core_web_vitals` → LCP, INP, CLS scores
5. **Mobile:** `check_mobile_friendly` → viewport, tap targets, fonts
6. **Schema:** `extract_schema` → structured data validation
7. **GSC data:** `gsc_index_coverage` → real indexing status from Google
See [ERROR_CATALOG.md](ERROR_CATALOG.md) for the complete issue catalog.
See [AUDIT_WORKFLOW.md](AUDIT_WORKFLOW.md) for detailed audit methodology.
See [HTTP_STATUS_REFERENCE.md](HTTP_STATUS_REFERENCE.md) for complete status code reference.