---
name: network-meta-analysis-appraisal
description: Systematically appraise network meta-analysis papers using an integrated 200-point checklist (PRISMA-NMA, NICE DSU TSD 7, ISPOR-AMCP-NPC, CINeMA) with triple-validation methodology, automated PDF extraction, semantic evidence matching, and concordance analysis. Use when evaluating NMA quality for peer review, guideline development, HTA, or reimbursement decisions.
---

# Network Meta-Analysis Comprehensive Appraisal

## Overview

This skill enables systematic, reproducible appraisal of network meta-analysis (NMA) papers through:

1. **Automated PDF intelligence** - Extract text, tables, and statistical content from NMA PDFs
2. **Semantic evidence matching** - Map 200+ checklist criteria to PDF content using AI similarity
3. **Triple-validation methodology** - Two independent concurrent appraisals + meta-review consensus
4. **Comprehensive frameworks** - PRISMA-NMA, NICE DSU TSD 7, ISPOR-AMCP-NPC, CINeMA integration
5. **Professional reports** - Generate markdown checklists and structured YAML outputs

The skill transforms a complex, time-intensive manual process (~6-8 hours) into a systematic, partially automated workflow (~2-3 hours).

## When to Use This Skill

Apply this skill when:

- Conducting peer review for journal submissions containing NMA
- Evaluating evidence for clinical guideline development
- Assessing NMA for health technology assessment (HTA)
- Reviewing NMA for reimbursement/formulary decisions
- Training on systematic NMA critical appraisal methodology
- Comparing Bayesian vs Frequentist NMA approaches

## Workflow: PDF to Appraisal Report

Follow this sequential 5-step workflow for comprehensive appraisal:

### Step 1: Setup & Prerequisites

**Install Required Libraries:**

```bash
cd scripts/
pip install -r requirements.txt

# Download semantic model (first time only)
python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"
```

**Verify Checklist Availability:**

Confirm all 8 checklist sections are in `references/checklist_sections/`:

- SECTION I - STUDY RELEVANCE and APPLICABILITY.md
- SECTION II - REPORTING TRANSPARENCY and COMPLETENESS - PRISMA-NMA.md
- SECTION III - METHODOLOGICAL RIGOR - NICE DSU TSD 7.md
- SECTION IV - CREDIBILITY ASSESSMENT - ISPOR-AMCP-NPC.md
- SECTION V - CERTAINTY OF EVIDENCE - CINeMA Framework.md
- SECTION VI - SYNTHESIS and OVERALL JUDGMENT.md
- SECTION VII - APPRAISER INFORMATION.md
- SECTION VIII - APPENDICES.md

**Select Framework Scope:**

Choose based on appraisal purpose (see `references/frameworks_overview.md` for details):

- `comprehensive`: All 4 frameworks (~200 items, 4-6 hours)
- `reporting`: PRISMA-NMA only (~90 items, 2-3 hours)
- `methodology`: NICE + CINeMA (~30 items, 2-3 hours)
- `decision`: Relevance + ISPOR + CINeMA (~30 items, 2-3 hours)

### Step 2: Extract PDF Content

Run `pdf_intelligence.py` to extract structured content from the NMA paper:

```bash
python scripts/pdf_intelligence.py path/to/nma_paper.pdf --output pdf_extraction.json
```

**What This Does:**

- Extracts text with section detection (abstract, methods, results, discussion)
- Parses tables using multiple libraries (Camelot, pdfplumber)
- Extracts metadata (title, page count, etc.)
- Calculates extraction quality scores

**Outputs:**

- `pdf_extraction.json` - Structured PDF content for evidence matching

**Quality Check:**

- Verify `extraction_quality` scores ≥ 0.6 for `text_coverage` and `sections_detected`
- Low scores indicate poor PDF quality and may require manual supplementation
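It can help to gate the rest of the workflow on those scores before proceeding. The following is a minimal sketch, assuming `pdf_extraction.json` exposes an `extraction_quality` object keyed by `text_coverage` and `sections_detected` as described above; the exact schema of your extraction output may differ.

```python
import json

# Minimal quality gate, assuming an "extraction_quality" object with the
# score fields described above (schema is an assumption, not guaranteed).
with open("pdf_extraction.json") as f:
    extraction = json.load(f)

quality = extraction.get("extraction_quality", {})
for metric in ("text_coverage", "sections_detected"):
    score = quality.get(metric, 0.0)
    status = "OK" if score >= 0.6 else "LOW - supplement manually"
    print(f"{metric}: {score:.2f} [{status}]")
```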
### Step 3: Match Evidence to Checklist Criteria

**Prepare Checklist Criteria JSON:**

Extract checklist items from the markdown sections into machine-readable format:

```python
import json
import re
from pathlib import Path

# Example: Extract criteria from Section II
criteria = []
section_file = Path("references/checklist_sections/SECTION II - REPORTING TRANSPARENCY and COMPLETENESS - PRISMA-NMA.md")

# Parse markdown table rows to extract item IDs and criteria text.
# Assumes rows shaped like: | 4.1 | Does the title identify ... | ... |
# Adjust the pattern to the section's actual table layout.
for line in section_file.read_text(encoding="utf-8").splitlines():
    match = re.match(r"\|\s*(\d+\.\d+)\s*\|\s*([^|]+?)\s*\|", line)
    if match:
        criteria.append({"id": match.group(1), "text": match.group(2)})

# Format: [{"id": "4.1", "text": "Does the title identify the study as a systematic review and network meta-analysis?"}, ...]
Path("checklist_criteria.json").write_text(json.dumps(criteria, indent=2))
```

**Run Semantic Evidence Matching:**

```bash
python scripts/semantic_search.py pdf_extraction.json checklist_criteria.json --output evidence_matches.json
```

**What This Does:**

- Encodes each checklist criterion as a semantic vector
- Searches PDF sections for matching paragraphs
- Calculates similarity scores (0.0-1.0)
- Assigns confidence levels (high/moderate/low/unable)

**Outputs:**

- `evidence_matches.json` - Evidence mapped to each criterion with confidence scores

### Step 4: Conduct Triple-Validation Appraisal

**Manual Appraisal with Evidence Support:**

For each checklist section:

1. Load evidence matches for that section's criteria
2. Review PDF content highlighted by semantic search
3. Apply the triple-validation methodology (see `references/triple_validation_methodology.md`):

   **Appraiser #1 (Critical Reviewer):**
   - Evidence threshold: 0.75 (high)
   - Stance: Skeptical, conservative
   - For each item: Assign a rating (✓/⚠/✗/N/A) based on evidence quality

   **Appraiser #2 (Methodologist):**
   - Evidence threshold: 0.70 (moderate)
   - Stance: Emphasis on technical rigor
   - For each item: Assign a rating independently

4. **Meta-Review Concordance Analysis:**
   - Compare ratings between appraisers
   - Calculate agreement levels (perfect/minor/major discordance)
   - Apply a resolution strategy (evidence-weighted by default)
   - Flag major discordances for manual review

**Structure Appraisal Results:**

```json
{
  "pdf_metadata": {...},
  "appraisal": {
    "sections": [
      {
        "id": "section_ii",
        "name": "REPORTING TRANSPARENCY & COMPLETENESS",
        "items": [
          {
            "id": "4.1",
            "criterion": "Title identification...",
            "rating": "✓",
            "confidence": "high",
            "evidence": "The title explicitly states...",
            "source": "methods section",
            "appraiser_1_rating": "✓",
            "appraiser_2_rating": "✓",
            "concordance": "perfect"
          },
          ...
        ]
      },
      ...
    ]
  }
}
```

Save as `appraisal_results.json`.
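To make the concordance step concrete, here is an illustrative sketch only, not the skill's actual implementation: it uses the rating symbols above and assumes hypothetical `appraiser_N_evidence_score` fields carrying each appraiser's semantic-match score for the item.

```python
# Illustrative concordance classification and evidence-weighted resolution.
# The authoritative rules live in references/triple_validation_methodology.md;
# the *_evidence_score fields used by resolve() are hypothetical.

def concordance(r1: str, r2: str) -> str:
    """Classify agreement between the two appraisers' ratings."""
    if r1 == r2:
        return "perfect"
    if {r1, r2} <= {"✓", "⚠"} or {r1, r2} <= {"⚠", "✗"}:
        return "minor"   # adjacent ratings
    return "major"       # e.g. ✓ vs ✗ - flag for manual meta-review

def resolve(item: dict) -> str:
    """Evidence-weighted default: prefer the rating backed by stronger evidence."""
    r1, r2 = item["appraiser_1_rating"], item["appraiser_2_rating"]
    if concordance(r1, r2) == "perfect":
        return r1
    # Hypothetical fields: each appraiser's evidence (similarity) score
    if item["appraiser_1_evidence_score"] >= item["appraiser_2_evidence_score"]:
        return r1
    return r2

item = {
    "appraiser_1_rating": "✓",
    "appraiser_2_rating": "⚠",
    "appraiser_1_evidence_score": 0.81,
    "appraiser_2_evidence_score": 0.72,
}
print(concordance(item["appraiser_1_rating"], item["appraiser_2_rating"]))  # minor
print(resolve(item))  # ✓
```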
### Step 5: Generate Reports

**Create Markdown and YAML Reports:**

```bash
python scripts/report_generator.py appraisal_results.json --format both --output-dir ./reports
```

**Outputs:**

- `reports/nma_appraisal_report.md` - Human-readable checklist with ratings, evidence, and concordance
- `reports/nma_appraisal_report.yaml` - Machine-readable structured data

**Report Contents:**

- Executive summary with overall quality ratings
- Detailed checklist tables (all 8 sections)
- Concordance analysis summary
- Recommendations for decision-makers and authors
- Evidence citations and confidence scores

**Quality Validation:**

- Review major-discordance items flagged in the concordance analysis
- Verify evidence confidence is ≥ moderate for ≥ 50% of items
- Check that the overall agreement rate is ≥ 65%
- Manually review any critical items with low confidence

## Methodological Decision Points

### Bayesian vs Frequentist Detection

The skill automatically detects the statistical approach by scanning for keywords (the idea is sketched below):

**Bayesian Indicators:** MCMC, posterior, prior, credible interval, WinBUGS, JAGS, Stan, burn-in, convergence diagnostic

**Frequentist Indicators:** confidence interval, p-value, I², τ², netmeta, prediction interval
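The detection itself happens inside the skill's scripts; as a rough illustration of the keyword-scan idea, a minimal sketch using the indicator lists above:

```python
# Minimal sketch of keyword-based approach detection, using the indicator
# lists above. The thresholds and tie handling here are illustrative.
BAYESIAN = ["mcmc", "posterior", "prior", "credible interval", "winbugs",
            "jags", "stan", "burn-in", "convergence diagnostic"]
FREQUENTIST = ["confidence interval", "p-value", "i²", "τ²", "netmeta",
               "prediction interval"]

def detect_approach(text: str) -> str:
    """Return 'bayesian', 'frequentist', or 'unclear' from keyword counts."""
    lower = text.lower()
    b = sum(lower.count(k) for k in BAYESIAN)
    f = sum(lower.count(k) for k in FREQUENTIST)
    if b > f:
        return "bayesian"
    if f > b:
        return "frequentist"
    return "unclear"  # mixed or no signal - inspect the methods section manually
```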
Apply the appropriate checklist items based on the detected approach:

- Item 18.3 (Bayesian specifications) - only if a Bayesian approach is detected
- Items on heterogeneity metrics (I², τ²) - primarily Frequentist
- Convergence diagnostics - Bayesian only

### Handling Missing Evidence

When semantic search returns low confidence (<0.45):

1. Manually search the PDF for the criterion
2. Check supplementary materials (if accessible)
3. If the evidence is truly absent, rate the item ⚠ or ✗ depending on its criticality
4. Document "No evidence found in main text" in the evidence field

### Resolution Strategy Selection

Choose a concordance resolution strategy based on the appraisal purpose:

- **Evidence-weighted** (default): Most objective; prefers the stronger evidence
- **Conservative**: For high-stakes decisions (e.g., regulatory submissions)
- **Optimistic**: For formative assessments or educational purposes

See `references/triple_validation_methodology.md` for detailed guidance.

## Resources

### scripts/

Production-ready Python scripts for automated tasks:

- **pdf_intelligence.py** - Multi-library PDF extraction (PyMuPDF, pdfplumber, Camelot)
- **semantic_search.py** - AI-powered evidence-to-criterion matching
- **report_generator.py** - Markdown + YAML report generation
- **requirements.txt** - Python dependencies

**Usage:** Scripts can be run standalone via the CLI or orchestrated programmatically.

### references/

Comprehensive documentation for the appraisal methodology:

- **checklist_sections/** - All 8 integrated checklist sections (PRISMA/NICE/ISPOR/CINeMA)
- **frameworks_overview.md** - Framework selection guide, rating scales, key references
- **triple_validation_methodology.md** - Appraiser roles, concordance analysis, resolution strategies

**Usage:** Load the relevant references when conducting specific appraisal steps or interpreting results.

## Best Practices

1. **Always run pdf_intelligence.py first** - Extraction quality affects all downstream steps
2. **Review low-confidence matches manually** - Semantic search is not perfect
3. **Document resolution rationale** - For major discordances, explain the meta-review decision
4. **Maintain appraiser independence** - Conduct Appraiser #1 and #2 evaluations without cross-reference
5. **Validate critical items** - Manually verify evidence for high-impact methodological criteria
6. **Use an appropriate framework scope** - Comprehensive for peer review, targeted for specific assessments

## Limitations

- **PDF quality dependent**: Poor scans or complex layouts reduce extraction accuracy
- **Semantic matching not perfect**: May miss evidence phrased in unexpected ways
- **No external validation**: Cannot verify PROSPERO registration or check author COI databases
- **Language**: Optimized for English-language papers
- **Human oversight required**: The final appraisal should be reviewed by a domain expert