---name: epigenomics-methylgpt-agent description: AI-powered DNA methylation analysis using MethylGPT foundation models for epigenomic profiling, differential methylation detection, and cancer epigenome characterization. license: MIT metadata: author: AI Group version: "1.0.0" created: "2026-01-19" compatibility: - system: Python 3.10+ allowed-tools: - run_shell_command - read_file - write_file keywords: - epigenomics-methylgpt-agent - automation - biomedical measurable_outcome: execute task with >95% success rate. ---" # Epigenomics MethylGPT Agent The **Epigenomics MethylGPT Agent** leverages foundation models for comprehensive DNA methylation analysis. It integrates MethylGPT and DiffuCpG for methylation profiling, differential methylation region (DMR) detection, and cancer epigenome characterization at single-base resolution. ## When to Use This Skill * When analyzing whole-genome bisulfite sequencing (WGBS) data for methylation patterns. * To identify differentially methylated regions (DMRs) between conditions (e.g., tumor vs. normal). * For cancer epigenome profiling and epigenetic biomarker discovery. * When predicting CpG methylation states using deep learning models. * To impute missing methylation data in high-throughput studies. ## Core Capabilities 1. **MethylGPT Foundation Model**: Leverages transformer-based architecture trained on large-scale methylome data for methylation state prediction and pattern recognition. 2. **Differential Methylation Analysis**: Identifies DMRs with increased sensitivity using AI-enhanced detection compared to traditional statistical methods. 3. **Cancer Epigenome Profiling**: Specialized analysis for tumor methylation signatures, including hypermethylation of tumor suppressors and global hypomethylation patterns. 4. **Missing Data Imputation**: Uses DiffuCpG generative AI model to address missing data in methylation arrays and sequencing studies. 5. **Single-Base Resolution**: Deep learning models capture sequence context and long-range dependencies for accurate CpG methylation identification. 6. **Multi-Platform Support**: Analyzes data from Illumina methylation arrays (450K, EPIC), WGBS, RRBS, and targeted bisulfite sequencing. ## Workflow 1. **Input**: Provide methylation data (beta values, WGBS BAM files, or raw intensity data) and sample metadata. 2. **Preprocessing**: Quality control, normalization, and batch effect correction. 3. **Analysis**: Apply MethylGPT for methylation prediction, DMR calling, and pattern discovery. 4. **Interpretation**: Annotate DMRs to genomic features (promoters, enhancers, gene bodies) and pathways. 5. **Output**: DMR reports, methylation heatmaps, pathway enrichment, and epigenetic age estimates. ## Example Usage **User**: "Identify differentially methylated regions between tumor and normal samples in this WGBS dataset." **Agent Action**: ```bash python3 Skills/Genomics/Epigenomics_MethylGPT_Agent/methylgpt_analyzer.py \ --input tumor_normal_methylation.csv \ --groups tumor,normal \ --model methylgpt-base \ --analysis dmr \ --min_cpgs 5 \ --delta_beta 0.2 \ --output dmr_results.json ``` ## Key Methods and Tools | Method | Application | Reference | |--------|-------------|-----------| | MethylGPT | Foundation model for methylome analysis | 2025 Nature Methods | | DiffuCpG | Generative AI for missing data imputation | 2025 Bioinformatics | | DeepMethyl | WGBS analysis for DMR detection | 2024 Genome Biology | | minfi | Illumina array preprocessing | Bioconductor | | DSS | Statistical DMR calling | Bioconductor | ## Prerequisites * Python 3.10+ * PyTorch 2.0+ * Transformers library * methylgpt-model weights * Bioconductor R packages (optional) ## Related Skills * Single_Cell_Foundation_Models - For single-cell methylation analysis * Variant_Interpretation - For methylation-variant associations * Multi_Omics_Integration - For combining methylation with expression data ## Methodology DNA methylation analysis leverages CNNs and transformers to capture sequence context and long-range dependencies. The MethylGPT foundation model is pre-trained on millions of CpG sites across diverse tissues and conditions, enabling transfer learning for specific applications. DiffuCpG uses diffusion-based generative modeling to impute missing methylation values while preserving biological structure. ## Author AI Group - Biomedical AI Platform