{ "cells": [ { "cell_type": "markdown", "id": "6c3fa849", "metadata": {}, "source": [ "# Drug responses - Background traits, PharmGKB" ] }, { "cell_type": "markdown", "id": "ed4d72db", "metadata": {}, "source": [ "## Table of contents\n", "\n", "1. [ClinVar](#Data-from-ClinVar)\n", " 1. [Thoughts](#Thoughts)\n", "2. [PharmGKB](#PharmGKB-data)\n", " 1. [Clinical annotations](#Clinical-annotations)\n", " 2. [Example extraction](#Example-extraction)\n", " 2. [Connecting with ClinVar](#Connecting-with-ClinVar)\n", " 3. [Star alleles](#Star-alleles)\n", " 3. [Notes](#Notes)\n", "3. [General](#General)\n", " 1. [Meeting notes](#Meeting-notes)" ] }, { "cell_type": "code", "execution_count": 1, "id": "e598c978", "metadata": {}, "outputs": [], "source": [ "from collections import Counter\n", "import sys\n", "\n", "sys.path.append('..')" ] }, { "cell_type": "code", "execution_count": 2, "id": "9afa2e97", "metadata": {}, "outputs": [], "source": [ "from filter_clinvar_xml import filter_xml, pprint, iterate_cvs_from_xml\n", "from clinvar_xml_io.clinvar_xml_io import *" ] }, { "cell_type": "markdown", "id": "35684cff", "metadata": {}, "source": [ "## Data from ClinVar\n", "\n", "[Top of page](#Table-of-contents)\n", "\n", "Questions to address:\n", "\n", "* Can we reliably get the background trait, i.e. the disease that the drug acts on?\n", "* How many records are explicitly reporting efficacy phenotypes?" 
] }, { "cell_type": "code", "execution_count": 3, "id": "86d81a4b", "metadata": {}, "outputs": [], "source": [ "# July 2022 data\n", "drug_xml = '/home/april/projects/opentargets/drug-response.xml.gz'" ] }, { "cell_type": "code", "execution_count": 4, "id": "b6ee82e0", "metadata": {}, "outputs": [], "source": [ "dataset = ClinVarDataset(drug_xml)" ] }, { "cell_type": "code", "execution_count": 237, "id": "65fe438f", "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", " current\n", " NM_000769.4(CYP2C19):c.-806C>A AND clopidogrel response - Dosage, Efficacy, Toxicity/ADR\n", " \n", " \n", " current\n", " \n", " reviewed by expert panel\n", " drug response\n", " \n", " \n", " \n", " \n", " germline\n", " human\n", " yes\n", " \n", " \n", " curation\n", " \n", " \n", " not provided\n", " \n", " \n", " \n", " \n", " \n", " NM_000769.4(CYP2C19):c.-806C>A\n", " \n", " \n", " NM_000769.2(CYP2C19):c.-806C>A\n", " \n", " NC_000010.11:94761899:C:A\n", " \n", " NG_055436.1:g.1260C>A\n", " \n", " \n", " NG_008384.3:g.4220C>A\n", " \n", " \n", " NC_000010.11:g.94761900C>A\n", " \n", " \n", " NC_000010.10:g.96521657C>A\n", " \n", " \n", " 10q23.33\n", " \n", " \n", " \n", " \n", " cytochrome P450 family 2 subfamily C member 19\n", " \n", " \n", " CYP2C19\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " CYP2C19 promoter\n", " \n", " \n", " LOC110599570\n", " \n", " \n", " \n", " \n", " \n", " 21716271\n", " 3234301\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " NM_000769.4(CYP2C19):c.-806C>A\n", " \n", " \n", " NM_000769.4(CYP2C19):c.-806C>A\n", " \n", " \n", " NM_000769.4(CYP2C19):c.-806C>A\n", " \n", " \n", " \n", " \n", " \n", " \n", " clopidogrel response - Dosage, Efficacy, Toxicity/ADR\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " current\n", " \n", " reviewed by expert panel\n", " drug response\n", " \n", " 19463375\n", " \n", " \n", " 
20083681\n", " \n", " \n", " 20492469\n", " \n", " \n", " 20801498\n", " \n", " \n", " 20826260\n", " \n", " \n", " 21392617\n", " \n", " \n", " 22028352\n", " \n", " \n", " 22190063\n", " \n", " \n", " 22228204\n", " \n", " \n", " 22462746\n", " \n", " \n", " 22704413\n", " \n", " \n", " 22955794\n", " \n", " \n", " 22990067\n", " \n", " \n", " 23364775\n", " \n", " \n", " 23726091\n", " \n", " \n", " 23809542\n", " \n", " \n", " 23922007\n", " \n", " \n", " 24019397\n", " \n", " PharmGKB Level of Evidence 1A: Annotation for a variant-drug combination in a CPIC or medical society-endorsed PGx guideline, or implemented at a PGRN site or in another major health system.\n", " \n", " \n", " \n", " \n", " Pharmacogenomics knowledge for personalized medicine\n", " \n", " 22992668\n", " \n", " \n", " \n", " \n", " germline\n", " human\n", " yes\n", " \n", " \n", " curation\n", " \n", " \n", " not provided\n", " \n", " \n", " \n", " \n", " \n", " NC_000010.10:g.96521657C>A\n", " \n", " \n", " \n", " CYP2C19\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " clopidogrel response - Dosage, Efficacy, Toxicity/ADR\n", " \n", " \n", " \n", " Acute coronary syndrome\n", " \n", " \n", " \n", " \n", " Coronary Artery Disease\n", " \n", " \n", " \n", " \n", " Myocardial Infarction\n", " \n", " \n", " \n", " \n", " \n", " https://www.pharmgkb.org/clinicalAnnotation/655386913\n", " \n", " Drug is not necessarily used to treat response condition\n", " \n", "\n", "\n", "\n" ] } ], "source": [ "# Entire CVS record (RCV + SCV) for reference\n", "for raw_cvs_xml in iterate_cvs_from_xml(drug_xml):\n", " pprint(raw_cvs_xml)\n", " break" ] }, { "cell_type": "markdown", "id": "9dfc45d1", "metadata": {}, "source": [ "Example [RCV000211201](https://www.ncbi.nlm.nih.gov/clinvar/RCV000211201/) - contains trait relationship between drug and disease but only in SCV not RCV record. 
(Note also there's only one SCV for this RCV.)\n", "\n", "**SCV:**\n", "\n", "```\n", "\n", " \n", " \n", " clopidogrel response - Dosage, Efficacy, Toxicity/ADR\n", " \n", " \n", " \n", " Acute coronary syndrome\n", " \n", " \n", " \n", " \n", " Coronary Artery Disease\n", " \n", " \n", " \n", " \n", " Myocardial Infarction\n", " \n", " \n", " \n", "\n", "```\n", "\n", "**RCV:**\n", "```\n", "\n", " \n", " \n", " clopidogrel response - Dosage, Efficacy, Toxicity/ADR\n", " \n", " \n", " \n", "\n", "```" ] }, { "cell_type": "code", "execution_count": 19, "id": "aec271c5", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "RCV001824998\n", "['Cabozantinib resistance', 'Entrectinib resistance', 'Larotrectinib resistance', 'Repotrectinib resistance', 'Selitrectinib resistance']\n" ] } ], "source": [ "# Check whether any of the RCV records have this kind of information\n", "for record in dataset:\n", " if len(record.trait_set) > 1:\n", " # No trait set with both a drug and a disease\n", " print(record.accession)\n", " print([trait.preferred_or_other_valid_name for trait in record.trait_set])\n", " for trait in record.trait_set:\n", " # No traits in RCV with relationship element\n", " relationships = find_elements(trait.trait_xml, './TraitRelationship')\n", " if relationships:\n", " print(record.accession)\n", " pprint(trait.trait_xml)" ] }, { "cell_type": "code", "execution_count": 49, "id": "19bd85f7", "metadata": {}, "outputs": [], "source": [ "def get_name(x):\n", " return ClinVarTrait(x, None).preferred_or_other_valid_name\n", "\n", "\n", "def is_pgkb(raw_cvs_xml):\n", " scvs = find_elements(raw_cvs_xml, './ClinVarAssertion/ClinVarSubmissionID')\n", " submitters = {scv.attrib.get('submitter') for scv in scvs}\n", " return 'PharmGKB' in submitters" ] }, { "cell_type": "code", "execution_count": 239, "id": "2bd06c38", "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "*hmg coa 
reductase inhibitors response - Toxicity => ['statin-related myopathy']\n", "*nicotine response - Toxicity => ['Tobacco Use Disorder']\n", "*azathioprine response - Toxicity => ['Inflammatory Bowel Diseases', 'Myelosuppression']\n", "Piroxicam response => ['Pain', 'Inflammation', 'Osteoarthritis', 'Rheumatoid arthritis']\n", "*halothane response - Toxicity => ['Malignant Hyperthermia']\n", "*warfarin response - Toxicity/ADR => ['Over-anticoagulation']\n", "*efavirenz response - Metabolism/PK => ['HIV Infections']\n", "Prednisolone response => ['Minimal change disease']\n", "efavirenz response => ['HIV']\n", "Deutetrabenazine response => ['Chorea', 'Huntington disease', 'Tardive dyskinesia']\n", "Lesinurad response => ['Gout']\n", "*rosuvastatin response - Efficacy => ['Hypercholesterolemia', 'Myocardial Infarction']\n", "Dabrafenib response => ['Pancreatic Adenocarcinoma']\n", "*tobramycin response - Toxicity => ['Ototoxicity']\n", "*peginterferon alfa-2b and ribavirin response - Toxicity => ['Anemia', 'Hepatitis C, Chronic']\n", "*captopril response - Efficacy => ['Diabetes Mellitus, Type 2', 'Heart Failure', 'Pulmonary Disease, Chronic Obstructive']\n", "Everolimus response => [None]\n", "Dopamine agonist response => ['Macroprolactinoma']\n", "Imatinib response => [None]\n", "Corticosteroid response => ['Chronic kidney disease']\n", "*Platinum compounds response - Efficacy => ['Neoplasms']\n", "*streptomycin response - Toxicity => ['Ototoxicity']\n", "Warfarin response => ['hemorrhage']\n", "*atorvastatin response - Toxicity => ['statin-related myopathy']\n", "Anti-PDL1 response => ['Cancer']\n", "*simvastatin response - Toxicity => ['statin-related myopathy']\n", "*gefitinib response - Efficacy => ['Carcinoma, Non-Small-Cell Lung', 'Drug Resistance']\n", "*hydrochlorothiazide response - Efficacy => ['Essential hypertension', 'Hypertension']\n", "*interferons, peginterferon alfa-2a, peginterferon alfa-2b and ribavirin response - Efficacy => ['Hepatitis C, 
Chronic']\n", "*fluorouracil response - Toxicity => ['Neoplasms']\n", "*desflurane response - Toxicity => ['Malignant Hyperthermia']\n", "*methotrexate response - Metabolism/PK => ['Burkitt Lymphoma', 'Leukemia', 'Lymphoma', 'Lymphoma, T-Cell', 'Precursor Cell Lymphoblastic Leukemia-Lymphoma']\n", "*nevirapine response - Toxicity => ['Epidermal Necrolysis, Toxic', 'Stevens-Johnson Syndrome']\n", "Phenytoin response => ['status epilepticus']\n", "Regorafenib response => ['Colorectal Neoplasms']\n", "None => ['Non-small cell lung cancer']\n", "*atorvastatin response - Efficacy => ['Coronary Disease', 'Hyperlipidemias']\n", "*ivacaftor / lumacaftor response - Efficacy => ['Cystic Fibrosis']\n", "Histone Methylation Therapy response => ['Cancer']\n", "*peginterferon alfa-2a, peginterferon alfa-2b, ribavirin and telaprevir response - Efficacy => ['Hepatitis C, Chronic']\n", "RAS Inhibitor response => ['Cancer']\n", "*pravastatin response - Efficacy => ['Coronary Disease', 'Myocardial Infarction']\n", "deoxygalactonojirimycin response => ['Fabry disease']\n", "*methoxyflurane response - Toxicity => ['Malignant Hyperthermia']\n", "*phenprocoumon response - Toxicity => ['Hemorrhage', 'over-anticoagulation', 'time above therapeutic range']\n", "*efavirenz response - Toxicity => ['HIV Infections']\n", "*tegafur response - Toxicity => ['Neoplasms']\n", "MEK Inhibitor response => ['Cancer']\n", "*ivacaftor / tezacaftor response - Efficacy => ['Cystic Fibrosis']\n", "*enflurane response - Toxicity => ['Malignant Hyperthermia']\n", "AKT1 Inhibitor response => ['Cancer']\n", "*rosuvastatin response - Metabolism/PK => ['Hypercholesterolemia']\n", "*methotrexate response - Toxicity => ['Arthritis, Juvenile Rheumatoid', 'Arthritis, Psoriatic', 'Arthritis, Rheumatoid', 'Drug Toxicity', 'Leukopenia', 'Neoplasms', 'Neutropenia', 'Osteosarcoma', 'Precursor Cell Lymphoblastic Leukemia-Lymphoma', 'Thrombocytopenia', 'Toxic liver disease', 'hematotoxicity', 'mucositis', 'primary central 
nervous system lymphoma']\n", "*salmeterol response - Efficacy => ['Asthma']\n", "*peginterferon alfa-2a, peginterferon alfa-2b and ribavirin response - Efficacy => ['Hepatitis C']\n", "*acenocoumarol response - Dosage => ['Atrial Fibrillation']\n", "Corticosteroid response => ['Minimal Change disease']\n", "Flurbiprofen response => ['Pain', 'Inflammation', 'Osteoarthritis', 'Rheumatoid Arthritis', 'Bursitis', 'Tendinitis']\n", "WEE1 Inhibitor response => ['Cancer']\n", "*peginterferon alfa-2b response - Efficacy => ['HIV Infections', 'Hepatitis C']\n", "*ethanol response - Toxicity => ['Alcoholism']\n", "*etanercept response - Efficacy => ['Arthritis, Psoriatic', 'Arthritis, Rheumatoid', 'Crohn Disease', 'Inflammation', 'Psoriasis', 'Spondylitis, Ankylosing']\n", "*carbamazepine response - Dosage => ['Epilepsy']\n", "*boceprevir, peginterferon alfa-2a, peginterferon alfa-2b and ribavirin response - Efficacy => ['Hepatitis C, Chronic']\n", "*nevirapine response - Metabolism/PK => ['HIV Infections']\n", "PARP Inhibitor response => ['Cancer']\n", "*warfarin response - Toxicity => ['Hemorrhage', 'over-anticoagulation']\n", "*capecitabine response - Toxicity => ['Neoplasms']\n", "Azathioprine intolerance => ['myasthenia gravis']\n", "Corticosteroid response => ['Minimal change disease']\n", "mTOR Inhibitor response => ['Cancer']\n", "*ribavirin response - Efficacy => ['HIV Infections', 'Hepatitis C']\n", "Gentamicin response => ['Bacterial infection', 'Neonatal sepsis']\n", "Androgen deprivation therapy response => ['Prostate neoplasm']\n", "*succinylcholine response - Toxicity => ['Malignant Hyperthermia']\n", "VEGF Inhibitors response => ['Cancer']\n", "all trans retinoic acid (ATRA) response => ['Acute promyelocytic leukemia']\n", "*tacrolimus response - Metabolism/PK => ['Kidney Transplantation', 'Proteinuria', 'liver transplantation']\n", "Vemurafenib-Cobimetinib Response => ['Melanoma']\n", "Corticosteroid response => ['Focal segmental glomerulosclerosis']\n", 
"Trametinib-Dabrafenib Response => ['Melanoma']\n", "*gentamicin response - Toxicity => ['Ototoxicity']\n", "*aminoglycoside antibacterials response - Toxicity => ['Ototoxicity']\n", "*clopidogrel response - Dosage, Efficacy, Toxicity/ADR => ['Acute coronary syndrome', 'Coronary Artery Disease', 'Myocardial Infarction']\n", "Gemcitabine response => ['non-small cell lung cancer']\n", "Corticosteroid response => ['Nephrotic syndrome']\n", "*kanamycin response - Toxicity => ['Ototoxicity']\n", "Pazopanib response => ['malignant granular cell tumor']\n", "*ivacaftor response - Efficacy => ['Cystic Fibrosis']\n", "*methotrexate response - Efficacy => ['Arthritis, Rheumatoid']\n", "*erlotinib response - Efficacy => ['Adenocarcinoma', 'Carcinoma, Non-Small-Cell Lung', 'Drug Resistance', 'Lung Neoplasms']\n", "*amikacin response - Toxicity => ['Ototoxicity']\n", "*isoflurane response - Toxicity => ['Malignant Hyperthermia']\n", "Gefitinib Response => ['Non-small cell lung carcinoma']\n", "Erlotinib Response => ['Non-small cell lung carcinoma']\n", "None => ['Leukemia', 'Inflammatory bowel disease', 'Rheumatoid arthritis', 'Non-Hodgkin lymphoma']\n", "*gefitinib response - Efficacy => ['Carcinoma, Non-Small-Cell Lung']\n", "*sevoflurane response - Toxicity => ['Malignant Hyperthermia']\n", "Tamoxifen response => ['Breast cancer']\n", "*irinotecan response - Toxicity => ['Neutropenia']\n", "*peginterferon alfa-2a response - Efficacy => ['HIV Infections', 'Hepatitis C']\n", "Doxorubicin response => [None]\n", "Prednisolone response => ['Focal segmental glomerulosclerosis 2']\n", "Suxamethonium response - slow metabolism => ['Butyrylcholinesterase deficiency']\n" ] } ], "source": [ "# Check whether all the SCV records have this kind of information\n", "n = 0\n", "count_all = 0\n", "count_pgkb = 0\n", "all_strs = set()\n", "for raw_cvs_xml in iterate_cvs_from_xml(drug_xml):\n", " n += 1\n", " elts = find_elements(raw_cvs_xml, './ClinVarAssertion/TraitSet/Trait')\n", " for e in 
elts:\n", " if e.attrib['Type'] == 'DrugResponse':\n", " relations = find_elements(e, './TraitRelationship')\n", " name = get_name(e)\n", " background_traits = []\n", " for r in relations:\n", " if r.attrib['Type'] == 'DrugResponseAndDisease':\n", " background_traits.append(get_name(r))\n", " if background_traits:\n", " count_all += 1\n", " if is_pgkb(raw_cvs_xml):\n", " count_pgkb += 1\n", " all_strs.add(f'*{get_name(e)} => {background_traits}')\n", " else:\n", " all_strs.add(f'{get_name(e)} => {background_traits}')\n", "\n", "for s in all_strs:\n", " print(s)" ] }, { "cell_type": "code", "execution_count": 60, "id": "034e22bb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Out of 4970 records, found 576 with drug response & disease relationship (361 from PharmGKB).\n" ] } ], "source": [ "print(f'Out of {n} records, found {count_all} with drug response & disease relationship ({count_pgkb} from PharmGKB).')" ] }, { "cell_type": "code", "execution_count": 235, "id": "9f1b0156", "metadata": { "scrolled": true }, "outputs": [], "source": [ "count_all = 0\n", "count_pgkb = 0\n", "for raw_cvs_xml in iterate_cvs_from_xml(drug_xml):\n", " elts = find_elements(raw_cvs_xml, './ClinVarAssertion/TraitSet/Trait')\n", " for e in elts:\n", " if e.attrib['Type'] == 'DrugResponse':\n", " name = get_name(e)\n", " if name and 'efficacy' in name.lower():\n", " count_all += 1\n", " if is_pgkb(raw_cvs_xml):\n", " count_pgkb += 1" ] }, { "cell_type": "code", "execution_count": 236, "id": "a9ac0e22", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Out of 4970 records, found 54 with efficacy phenotype (54 from PharmGKB).\n" ] } ], "source": [ "print(f'Out of {n} records, found {count_all} with efficacy phenotype ({count_pgkb} from PharmGKB).')" ] }, { "cell_type": "markdown", "id": "96095f19", "metadata": {}, "source": [ "### Thoughts\n", "\n", "[Top of page](#Table-of-contents)\n", "\n", "* Is it worth starting 
to parse SCV for drug response / disease trait relationships?\n", " * Might be relatively straightforward to do in this restricted case\n", " * Opens up a can of worms, e.g. what happens if SCVs don't agree? Do we end up redoing the work of aggregation?\n", "* Why does ClinVar exclude this info from the RCV anyway?\n", "* Is it worth trying other ways of linking drug & disease within ClinVar?\n", " * e.g. different RCV with same VCV, one for drug and one for disease\n", " * same SCV associated with different RCVs via different traits?\n", "* Counts summary: **4970** drug response records\n", " * **401** with PharmGKB submission (previous notebook)\n", " * **576** with drug response & disease relationship (in SCV only)\n", " * Of these, **361** from PharmGKB\n", " * **54** with explicit efficacy phenotype, all from PharmGKB" ] }, { "cell_type": "markdown", "id": "d8680977", "metadata": {}, "source": [ "## PharmGKB data\n", "\n", "[Top of page](#Table-of-contents)\n", "\n", "* Compare this with what PharmGKB submissions contain in ClinVar\n", "* Also consider how we would get consequences and how we'd connect to ClinVar data" ] }, { "cell_type": "markdown", "id": "7ef6c4ab", "metadata": {}, "source": [ "General PharmGKB notes:\n", "* [Multiple datasets](https://www.pharmgkb.org/downloads) that we could cross-reference\n", " * I looked at some of the others but the clinical annotations are probably all we need/can use\n", "* \"PharmGKB submits Level 1 & 2 Clinical Annotations PGx into ClinVar\" - see [levels](https://www.pharmgkb.org/page/clinAnnLevels)" ] }, { "cell_type": "code", "execution_count": 115, "id": "1f68c3b8", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import os\n", "from IPython.display import display" ] }, { "cell_type": "code", "execution_count": 84, "id": "d5a832f0", "metadata": {}, "outputs": [], "source": [ "pd.set_option('display.max_colwidth', None)" ] }, { "cell_type": "code", "execution_count": 64, "id": "4ef959c6", 
"metadata": {}, "outputs": [], "source": [ "pharmgkb_root = '/home/april/projects/opentargets/pharmgkb'" ] }, { "cell_type": "markdown", "id": "3443b7dd", "metadata": {}, "source": [ "### Clinical annotations\n", "\n", "[Top of page](#Table-of-contents)" ] }, { "cell_type": "code", "execution_count": 72, "id": "5613fc2a", "metadata": {}, "outputs": [], "source": [ "clinical_annotations = pd.read_csv(os.path.join(pharmgkb_root, 'clinical', 'clinical_annotations.tsv'), sep='\\t')\n", "clinical_alleles = pd.read_csv(os.path.join(pharmgkb_root, 'clinical', 'clinical_ann_alleles.tsv'), sep='\\t')\n", "clinical_evidence = pd.read_csv(os.path.join(pharmgkb_root, 'clinical', 'clinical_ann_evidence.tsv'), sep='\\t')" ] }, { "cell_type": "code", "execution_count": 132, "id": "d8183890", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5013" ] }, "execution_count": 132, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(clinical_annotations)" ] }, { "cell_type": "code", "execution_count": 112, "id": "4d97c071", "metadata": {}, "outputs": [], "source": [ "def show_id(i):\n", " for t in (clinical_annotations[clinical_annotations['Clinical Annotation ID'] == i],\n", " clinical_alleles[clinical_alleles['Clinical Annotation ID'] == i],\n", " clinical_evidence[clinical_evidence['Clinical Annotation ID'] == i]):\n", " display(t)" ] }, { "cell_type": "markdown", "id": "e60a5eda", "metadata": {}, "source": [ "Two examples: one with RS ID (981755803) and one with star allele only (1451243980)" ] }, { "cell_type": "code", "execution_count": 113, "id": "78f3063a", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
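<table>
<tr><th></th><th>Clinical Annotation ID</th><th>Variant/Haplotypes</th><th>Gene</th><th>Level of Evidence</th><th>Level Override</th><th>Level Modifiers</th><th>Score</th><th>Phenotype Category</th><th>PMID Count</th><th>Evidence Count</th><th>Drug(s)</th><th>Phenotype(s)</th><th>Latest History Date (YYYY-MM-DD)</th><th>URL</th><th>Specialty Population</th></tr>
<tr><th>0</th><td>981755803</td><td>rs75527207</td><td>CFTR</td><td>1A</td><td>NaN</td><td>Rare Variant; Tier 1 VIP</td><td>234.875</td><td>Efficacy</td><td>28</td><td>30</td><td>ivacaftor</td><td>Cystic Fibrosis</td><td>2021-03-24</td><td>https://www.pharmgkb.org/clinicalAnnotation/981755803</td><td>Pediatric</td></tr>
</table>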
\n", "
" ], "text/plain": [ " Clinical Annotation ID Variant/Haplotypes Gene Level of Evidence \\\n", "0 981755803 rs75527207 CFTR 1A \n", "\n", " Level Override Level Modifiers Score Phenotype Category \\\n", "0 NaN Rare Variant; Tier 1 VIP 234.875 Efficacy \n", "\n", " PMID Count Evidence Count Drug(s) Phenotype(s) \\\n", "0 28 30 ivacaftor Cystic Fibrosis \n", "\n", " Latest History Date (YYYY-MM-DD) \\\n", "0 2021-03-24 \n", "\n", " URL Specialty Population \n", "0 https://www.pharmgkb.org/clinicalAnnotation/981755803 Pediatric " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
<table>
<tr><th></th><th>Clinical Annotation ID</th><th>Genotype/Allele</th><th>Annotation Text</th><th>Allele Function</th></tr>
<tr><th>0</th><td>981755803</td><td>AA</td><td>Patients with the rs75527207 AA genotype (two copies of the CFTR G551D variant) and cystic fibrosis may respond to ivacaftor treatment. FDA-approved drug labeling information and CPIC guidelines indicate use of ivacaftor in cystic fibrosis patients with at least one copy of a list of 33 CFTR genetic variants, including G551D. Other genetic and clinical factors may also influence response to ivacaftor.</td><td>NaN</td></tr>
<tr><th>1</th><td>981755803</td><td>AG</td><td>Patients with the rs75527207 AG genotype (one copy of the CFTR G551D variant) and cystic fibrosis may respond to ivacaftor treatment. FDA-approved drug labeling information and CPIC guidelines indicate use of ivacaftor in cystic fibrosis patients with at least one copy of a list of 33 CFTR genetic variants, including G551D. Other genetic and clinical factors may also influence response to ivacaftor.</td><td>NaN</td></tr>
<tr><th>2</th><td>981755803</td><td>GG</td><td>Patients with the rs75527207 GG genotype (do not have a copy of the CFTR G551D variant) and cystic fibrosis have an unknown response to ivacaftor treatment, as response may depend on the presence of other CFTR variants. FDA-approved drug labeling information and CPIC guidelines indicate use of ivacaftor in cystic fibrosis patients with at least one copy of a list of 33 CFTR genetic variants, including G551D. Other genetic and clinical factors may also influence response to ivacaftor.</td><td>NaN</td></tr>
</table>
\n", "
" ], "text/plain": [ " Clinical Annotation ID Genotype/Allele \\\n", "0 981755803 AA \n", "1 981755803 AG \n", "2 981755803 GG \n", "\n", " Annotation Text \\\n", "0 Patients with the rs75527207 AA genotype (two copies of the CFTR G551D variant) and cystic fibrosis may respond to ivacaftor treatment. FDA-approved drug labeling information and CPIC guidelines indicate use of ivacaftor in cystic fibrosis patients with at least one copy of a list of 33 CFTR genetic variants, including G551D. Other genetic and clinical factors may also influence response to ivacaftor. \n", "1 Patients with the rs75527207 AG genotype (one copy of the CFTR G551D variant) and cystic fibrosis may respond to ivacaftor treatment. FDA-approved drug labeling information and CPIC guidelines indicate use of ivacaftor in cystic fibrosis patients with at least one copy of a list of 33 CFTR genetic variants, including G551D. Other genetic and clinical factors may also influence response to ivacaftor. \n", "2 Patients with the rs75527207 GG genotype (do not have a copy of the CFTR G551D variant) and cystic fibrosis have an unknown response to ivacaftor treatment, as response may depend on the presence of other CFTR variants. FDA-approved drug labeling information and CPIC guidelines indicate use of ivacaftor in cystic fibrosis patients with at least one copy of a list of 33 CFTR genetic variants, including G551D. Other genetic and clinical factors may also influence response to ivacaftor. \n", "\n", " Allele Function \n", "0 NaN \n", "1 NaN \n", "2 NaN " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
<table>
<tr><th></th><th>Clinical Annotation ID</th><th>Evidence ID</th><th>Evidence Type</th><th>Evidence URL</th><th>PMID</th><th>Summary</th><th>Score</th></tr>
<tr><th>0</th><td>981755803</td><td>PA166114461</td><td>Guideline Annotation</td><td>https://www.pharmgkb.org/guidelineAnnotation/PA166114461</td><td>NaN</td><td>Annotation of CPIC Guideline for ivacaftor and CFTR</td><td>100</td></tr>
<tr><th>1</th><td>981755803</td><td>PA166104890</td><td>Label Annotation</td><td>https://www.pharmgkb.org/labelAnnotation/PA166104890</td><td>NaN</td><td>Annotation of FDA Label for ivacaftor and CFTR</td><td>100</td></tr>
<tr><th>2</th><td>981755803</td><td>981755665</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/981755665</td><td>21083385.0</td><td>Genotypes AA + AG are associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>0.25</td></tr>
<tr><th>3</th><td>981755803</td><td>981755678</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/981755678</td><td>22047557.0</td><td>Genotypes AA + AG are associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>2.0</td></tr>
<tr><th>4</th><td>981755803</td><td>982006840</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/982006840</td><td>23313410.0</td><td>Allele A is associated with response to ivacaftor in men with Cystic Fibrosis.</td><td>0.25</td></tr>
<tr><th>5</th><td>981755803</td><td>982009991</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/982009991</td><td>23590265.0</td><td>Allele A is associated with response to ivacaftor in children with Cystic Fibrosis.</td><td>2.25</td></tr>
<tr><th>6</th><td>981755803</td><td>1043737597</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1043737597</td><td>23757359.0</td><td>Allele A is associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>2.0</td></tr>
<tr><th>7</th><td>981755803</td><td>1043737620</td><td>Variant Functional Assay Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1043737620</td><td>23757361.0</td><td>Allele A is associated with increased activity of CFTR when treated with ivacaftor in transfected CHO cells.</td><td>0.0</td></tr>
<tr><th>8</th><td>981755803</td><td>1043737636</td><td>Variant Functional Assay Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1043737636</td><td>23891399.0</td><td>Allele A is associated with activity of CFTR when treated with ivacaftor in FRT cell lines.</td><td>0.0</td></tr>
<tr><th>9</th><td>981755803</td><td>1183629335</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1183629335</td><td>24066763.0</td><td>Genotype AA is associated with response to ivacaftor in women with Cystic Fibrosis.</td><td>0.25</td></tr>
<tr><th>10</th><td>981755803</td><td>1448267532</td><td>Variant Phenotype Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1448267532</td><td>27745802.0</td><td>Genotypes AA + AG is associated with decreased severity of bone density when treated with ivacaftor in people with Cystic Fibrosis as compared to genotype GG.</td><td>1.5</td></tr>
<tr><th>11</th><td>981755803</td><td>1448423752</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1448423752</td><td>27773592.0</td><td>Genotypes AA + AG is associated with increased response to ivacaftor in people with Cystic Fibrosis as compared to genotype GG.</td><td>0.875</td></tr>
<tr><th>12</th><td>981755803</td><td>1449191908</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1449191908</td><td>25682022.0</td><td>Allele A is associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>0.25</td></tr>
<tr><th>13</th><td>981755803</td><td>1449192031</td><td>Variant Phenotype Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1449192031</td><td>28651844.0</td><td>Allele A is associated with decreased likelihood of cystic fibrosis pulmonary exacerbation when treated with ivacaftor in people with Cystic Fibrosis.</td><td>3.0</td></tr>
<tr><th>14</th><td>981755803</td><td>1449192055</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1449192055</td><td>28711222.0</td><td>Allele A is associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>2.25</td></tr>
<tr><th>15</th><td>981755803</td><td>1449192093</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1449192093</td><td>25311995.0</td><td>Allele A is associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>0.0</td></tr>
<tr><th>16</th><td>981755803</td><td>1449192439</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1449192439</td><td>28611235.0</td><td>Allele A is associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>1.5</td></tr>
<tr><th>17</th><td>981755803</td><td>1449192481</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1449192481</td><td>26135562.0</td><td>Allele A is associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>2.0</td></tr>
<tr><th>18</th><td>981755803</td><td>1449192494</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1449192494</td><td>25171465.0</td><td>Allele A is associated with response to ivacaftor in children with Cystic Fibrosis.</td><td>0.25</td></tr>
<tr><th>19</th><td>981755803</td><td>1449192576</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1449192576</td><td>25755212.0</td><td>Allele A is associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>2.0</td></tr>
<tr><th>20</th><td>981755803</td><td>1449192615</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1449192615</td><td>26568242.0</td><td>Allele A is associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>2.5</td></tr>
<tr><th>21</th><td>981755803</td><td>1449192709</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1449192709</td><td>25473543.0</td><td>Allele A is associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>0.25</td></tr>
<tr><th>22</th><td>981755803</td><td>1449192721</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1449192721</td><td>25145599.0</td><td>Allele A is associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>2.5</td></tr>
<tr><th>23</th><td>981755803</td><td>1450043422</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1450043422</td><td>23628510.0</td><td>Allele A is associated with response to ivacaftor in children with Cystic Fibrosis.</td><td>3.0</td></tr>
<tr><th>24</th><td>981755803</td><td>1184512440</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1184512440</td><td>25049054.0</td><td>Allele A is associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>1.5</td></tr>
<tr><th>25</th><td>981755803</td><td>981755746</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/981755746</td><td>22942289.0</td><td>Allele A is associated with increased response to ivacaftor.</td><td>This annotation is not used for clinical annotation scoring.</td></tr>
<tr><th>26</th><td>981755803</td><td>981755699</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/981755699</td><td>19846789.0</td><td>Allele A is associated with increased response to ivacaftor.</td><td>This annotation is not used for clinical annotation scoring.</td></tr>
<tr><th>27</th><td>981755803</td><td>981755787</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/981755787</td><td>22293084.0</td><td>Allele A is associated with increased response to ivacaftor.</td><td>This annotation is not used for clinical annotation scoring.</td></tr>
<tr><th>28</th><td>981755803</td><td>1446903789</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1446903789</td><td>24461666.0</td><td>Genotypes AA + AG are associated with response to ivacaftor in people with Cystic Fibrosis.</td><td>2.5</td></tr>
<tr><th>29</th><td>981755803</td><td>1448099051</td><td>Variant Drug Annotation</td><td>https://www.pharmgkb.org/variantAnnotation/1448099051</td><td>27158673.0</td><td>Genotypes AA + AG are associated with increased response to ivacaftor in people with Cystic Fibrosis as compared to genotype GG.</td><td>2.0</td></tr>
</table>
\n", "
" ], "text/plain": [ " Clinical Annotation ID Evidence ID Evidence Type \\\n", "0 981755803 PA166114461 Guideline Annotation \n", "1 981755803 PA166104890 Label Annotation \n", "2 981755803 981755665 Variant Drug Annotation \n", "3 981755803 981755678 Variant Drug Annotation \n", "4 981755803 982006840 Variant Drug Annotation \n", "5 981755803 982009991 Variant Drug Annotation \n", "6 981755803 1043737597 Variant Drug Annotation \n", "7 981755803 1043737620 Variant Functional Assay Annotation \n", "8 981755803 1043737636 Variant Functional Assay Annotation \n", "9 981755803 1183629335 Variant Drug Annotation \n", "10 981755803 1448267532 Variant Phenotype Annotation \n", "11 981755803 1448423752 Variant Drug Annotation \n", "12 981755803 1449191908 Variant Drug Annotation \n", "13 981755803 1449192031 Variant Phenotype Annotation \n", "14 981755803 1449192055 Variant Drug Annotation \n", "15 981755803 1449192093 Variant Drug Annotation \n", "16 981755803 1449192439 Variant Drug Annotation \n", "17 981755803 1449192481 Variant Drug Annotation \n", "18 981755803 1449192494 Variant Drug Annotation \n", "19 981755803 1449192576 Variant Drug Annotation \n", "20 981755803 1449192615 Variant Drug Annotation \n", "21 981755803 1449192709 Variant Drug Annotation \n", "22 981755803 1449192721 Variant Drug Annotation \n", "23 981755803 1450043422 Variant Drug Annotation \n", "24 981755803 1184512440 Variant Drug Annotation \n", "25 981755803 981755746 Variant Drug Annotation \n", "26 981755803 981755699 Variant Drug Annotation \n", "27 981755803 981755787 Variant Drug Annotation \n", "28 981755803 1446903789 Variant Drug Annotation \n", "29 981755803 1448099051 Variant Drug Annotation \n", "\n", " Evidence URL PMID \\\n", "0 https://www.pharmgkb.org/guidelineAnnotation/PA166114461 NaN \n", "1 https://www.pharmgkb.org/labelAnnotation/PA166104890 NaN \n", "2 https://www.pharmgkb.org/variantAnnotation/981755665 21083385.0 \n", "3 
https://www.pharmgkb.org/variantAnnotation/981755678 22047557.0 \n", "4 https://www.pharmgkb.org/variantAnnotation/982006840 23313410.0 \n", "5 https://www.pharmgkb.org/variantAnnotation/982009991 23590265.0 \n", "6 https://www.pharmgkb.org/variantAnnotation/1043737597 23757359.0 \n", "7 https://www.pharmgkb.org/variantAnnotation/1043737620 23757361.0 \n", "8 https://www.pharmgkb.org/variantAnnotation/1043737636 23891399.0 \n", "9 https://www.pharmgkb.org/variantAnnotation/1183629335 24066763.0 \n", "10 https://www.pharmgkb.org/variantAnnotation/1448267532 27745802.0 \n", "11 https://www.pharmgkb.org/variantAnnotation/1448423752 27773592.0 \n", "12 https://www.pharmgkb.org/variantAnnotation/1449191908 25682022.0 \n", "13 https://www.pharmgkb.org/variantAnnotation/1449192031 28651844.0 \n", "14 https://www.pharmgkb.org/variantAnnotation/1449192055 28711222.0 \n", "15 https://www.pharmgkb.org/variantAnnotation/1449192093 25311995.0 \n", "16 https://www.pharmgkb.org/variantAnnotation/1449192439 28611235.0 \n", "17 https://www.pharmgkb.org/variantAnnotation/1449192481 26135562.0 \n", "18 https://www.pharmgkb.org/variantAnnotation/1449192494 25171465.0 \n", "19 https://www.pharmgkb.org/variantAnnotation/1449192576 25755212.0 \n", "20 https://www.pharmgkb.org/variantAnnotation/1449192615 26568242.0 \n", "21 https://www.pharmgkb.org/variantAnnotation/1449192709 25473543.0 \n", "22 https://www.pharmgkb.org/variantAnnotation/1449192721 25145599.0 \n", "23 https://www.pharmgkb.org/variantAnnotation/1450043422 23628510.0 \n", "24 https://www.pharmgkb.org/variantAnnotation/1184512440 25049054.0 \n", "25 https://www.pharmgkb.org/variantAnnotation/981755746 22942289.0 \n", "26 https://www.pharmgkb.org/variantAnnotation/981755699 19846789.0 \n", "27 https://www.pharmgkb.org/variantAnnotation/981755787 22293084.0 \n", "28 https://www.pharmgkb.org/variantAnnotation/1446903789 24461666.0 \n", "29 https://www.pharmgkb.org/variantAnnotation/1448099051 27158673.0 \n", "\n", " Summary 
\\\n", "0 Annotation of CPIC Guideline for ivacaftor and CFTR \n", "1 Annotation of FDA Label for ivacaftor and CFTR \n", "2 Genotypes AA + AG are associated with response to ivacaftor in people with Cystic Fibrosis. \n", "3 Genotypes AA + AG are associated with response to ivacaftor in people with Cystic Fibrosis. \n", "4 Allele A is associated with response to ivacaftor in men with Cystic Fibrosis. \n", "5 Allele A is associated with response to ivacaftor in children with Cystic Fibrosis. \n", "6 Allele A is associated with response to ivacaftor in people with Cystic Fibrosis. \n", "7 Allele A is associated with increased activity of CFTR when treated with ivacaftor in transfected CHO cells. \n", "8 Allele A is associated with activity of CFTR when treated with ivacaftor in FRT cell lines. \n", "9 Genotype AA is associated with response to ivacaftor in women with Cystic Fibrosis. \n", "10 Genotypes AA + AG is associated with decreased severity of bone density when treated with ivacaftor in people with Cystic Fibrosis as compared to genotype GG. \n", "11 Genotypes AA + AG is associated with increased response to ivacaftor in people with Cystic Fibrosis as compared to genotype GG. \n", "12 Allele A is associated with response to ivacaftor in people with Cystic Fibrosis. \n", "13 Allele A is associated with decreased likelihood of cystic fibrosis pulmonary exacerbation when treated with ivacaftor in people with Cystic Fibrosis. \n", "14 Allele A is associated with response to ivacaftor in people with Cystic Fibrosis. \n", "15 Allele A is associated with response to ivacaftor in people with Cystic Fibrosis. \n", "16 Allele A is associated with response to ivacaftor in people with Cystic Fibrosis. \n", "17 Allele A is associated with response to ivacaftor in people with Cystic Fibrosis. \n", "18 Allele A is associated with response to ivacaftor in children with Cystic Fibrosis. \n", "19 Allele A is associated with response to ivacaftor in people with Cystic Fibrosis. 
\n", "20 Allele A is associated with response to ivacaftor in people with Cystic Fibrosis. \n", "21 Allele A is associated with response to ivacaftor in people with Cystic Fibrosis. \n", "22 Allele A is associated with response to ivacaftor in people with Cystic Fibrosis. \n", "23 Allele A is associated with response to ivacaftor in children with Cystic Fibrosis. \n", "24 Allele A is associated with response to ivacaftor in people with Cystic Fibrosis. \n", "25 Allele A is associated with increased response to ivacaftor. \n", "26 Allele A is associated with increased response to ivacaftor. \n", "27 Allele A is associated with increased response to ivacaftor. \n", "28 Genotypes AA + AG are associated with response to ivacaftor in people with Cystic Fibrosis. \n", "29 Genotypes AA + AG are associated with increased response to ivacaftor in people with Cystic Fibrosis as compared to genotype GG. \n", "\n", " Score \n", "0 100 \n", "1 100 \n", "2 0.25 \n", "3 2.0 \n", "4 0.25 \n", "5 2.25 \n", "6 2.0 \n", "7 0.0 \n", "8 0.0 \n", "9 0.25 \n", "10 1.5 \n", "11 0.875 \n", "12 0.25 \n", "13 3.0 \n", "14 2.25 \n", "15 0.0 \n", "16 1.5 \n", "17 2.0 \n", "18 0.25 \n", "19 2.0 \n", "20 2.5 \n", "21 0.25 \n", "22 2.5 \n", "23 3.0 \n", "24 1.5 \n", "25 This annotation is not used for clinical annotation scoring. \n", "26 This annotation is not used for clinical annotation scoring. \n", "27 This annotation is not used for clinical annotation scoring. \n", "28 2.5 \n", "29 2.0 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_id(981755803)" ] }, { "cell_type": "code", "execution_count": 114, "id": "c19bec99", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Clinical Annotation IDVariant/HaplotypesGeneLevel of EvidenceLevel OverrideLevel ModifiersScorePhenotype CategoryPMID CountEvidence CountDrug(s)Phenotype(s)Latest History Date (YYYY-MM-DD)URLSpecialty Population
49961451243980CYP2B6*1, CYP2B6*2, CYP2B6*6, CYP2B6*18, CYP2B6*38CYP2B61ANaNTier 1 VIP211.5Toxicity1214efavirenzHIV Infections2021-03-24https://www.pharmgkb.org/clinicalAnnotation/1451243980NaN
\n", "
" ], "text/plain": [ " Clinical Annotation ID \\\n", "4996 1451243980 \n", "\n", " Variant/Haplotypes Gene \\\n", "4996 CYP2B6*1, CYP2B6*2, CYP2B6*6, CYP2B6*18, CYP2B6*38 CYP2B6 \n", "\n", " Level of Evidence Level Override Level Modifiers Score \\\n", "4996 1A NaN Tier 1 VIP 211.5 \n", "\n", " Phenotype Category PMID Count Evidence Count Drug(s) \\\n", "4996 Toxicity 12 14 efavirenz \n", "\n", " Phenotype(s) Latest History Date (YYYY-MM-DD) \\\n", "4996 HIV Infections 2021-03-24 \n", "\n", " URL \\\n", "4996 https://www.pharmgkb.org/clinicalAnnotation/1451243980 \n", "\n", " Specialty Population \n", "4996 NaN " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Clinical Annotation IDGenotype/AlleleAnnotation TextAllele Function
154041451243980*1The CYP2B6*1 allele is assigned as a normal function allele by CPIC. Patients carrying CYP2B6*1 allele in combination with another normal function allele may have decreased risk of adverse events (eg. liver toxicity or CNS side effects) when treated with efavirenz as compared to patients with a no or decreased function allele in combination with a normal or increased function allele or with two no or decreased function alleles. However, conflicting evidence has been reported. Other genetic and clinical factors may also influence the toxicity of efavirenz.Normal function
154051451243980*2The CYP2B6*2 allele is assigned as a normal function allele by CPIC. Patients carrying CYP2B6*2 allele in combination with another normal function allele may have decreased risk of adverse events (eg. liver toxicity or CNS side effects) when treated with efavirenz as compared to patients with a no or decreased function allele in combination with a normal or increased function allele or with two no or decreased function alleles. However, conflicting evidence has been reported. Other genetic and clinical factors may also influence the toxicity of efavirenz.Normal function
154061451243980*6The CYP2B6*6 allele is assigned as a decreased function allele by CPIC. Patients carrying the CYP2B6*6 allele in combination with a normal, decreased, no, or increased function allele may have increased risk of adverse events (eg. liver toxicity or CNS side effects) when treated with efavirenz as compared to patients with two normal function alleles. However, conflicting evidence has been reported. Other genetic and clinical factors may also influence toxicity of efavirenz.Decreased function
154071451243980*18The CYP2B6*18 allele is assigned as a no function allele by CPIC. Patients carrying the CYP2B6*18 allele in combination with a normal, decreased, no, or increased function allele may have increased risk of adverse events (eg. liver toxicity or CNS side effects) when treated with efavirenz as compared to patients with two normal function alleles. However, conflicting evidence has been reported. Other genetic and clinical factors may also influence toxicity of efavirenz.No function
154081451243980*38The CYP2B6*38 allele is assigned as a no function allele by CPIC. Patients carrying the CYP2B6*38 allele in combination with a normal, decreased, no, or increased function allele may have increased risk of adverse events (eg. liver toxicity or CNS side effects) when treated with efavirenz as compared to patients with two normal function alleles. Other genetic and clinical factors may also influence toxicity of efavirenz.No function
\n", "
" ], "text/plain": [ " Clinical Annotation ID Genotype/Allele \\\n", "15404 1451243980 *1 \n", "15405 1451243980 *2 \n", "15406 1451243980 *6 \n", "15407 1451243980 *18 \n", "15408 1451243980 *38 \n", "\n", " Annotation Text \\\n", "15404 The CYP2B6*1 allele is assigned as a normal function allele by CPIC. Patients carrying CYP2B6*1 allele in combination with another normal function allele may have decreased risk of adverse events (eg. liver toxicity or CNS side effects) when treated with efavirenz as compared to patients with a no or decreased function allele in combination with a normal or increased function allele or with two no or decreased function alleles. However, conflicting evidence has been reported. Other genetic and clinical factors may also influence the toxicity of efavirenz. \n", "15405 The CYP2B6*2 allele is assigned as a normal function allele by CPIC. Patients carrying CYP2B6*2 allele in combination with another normal function allele may have decreased risk of adverse events (eg. liver toxicity or CNS side effects) when treated with efavirenz as compared to patients with a no or decreased function allele in combination with a normal or increased function allele or with two no or decreased function alleles. However, conflicting evidence has been reported. Other genetic and clinical factors may also influence the toxicity of efavirenz. \n", "15406 The CYP2B6*6 allele is assigned as a decreased function allele by CPIC. Patients carrying the CYP2B6*6 allele in combination with a normal, decreased, no, or increased function allele may have increased risk of adverse events (eg. liver toxicity or CNS side effects) when treated with efavirenz as compared to patients with two normal function alleles. However, conflicting evidence has been reported. Other genetic and clinical factors may also influence toxicity of efavirenz. \n", "15407 The CYP2B6*18 allele is assigned as a no function allele by CPIC. 
Patients carrying the CYP2B6*18 allele in combination with a normal, decreased, no, or increased function allele may have increased risk of adverse events (eg. liver toxicity or CNS side effects) when treated with efavirenz as compared to patients with two normal function alleles. However, conflicting evidence has been reported. Other genetic and clinical factors may also influence toxicity of efavirenz. \n", "15408 The CYP2B6*38 allele is assigned as a no function allele by CPIC. Patients carrying the CYP2B6*38 allele in combination with a normal, decreased, no, or increased function allele may have increased risk of adverse events (eg. liver toxicity or CNS side effects) when treated with efavirenz as compared to patients with two normal function alleles. Other genetic and clinical factors may also influence toxicity of efavirenz. \n", "\n", " Allele Function \n", "15404 Normal function \n", "15405 Normal function \n", "15406 Decreased function \n", "15407 No function \n", "15408 No function " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Clinical Annotation IDEvidence IDEvidence TypeEvidence URLPMIDSummaryScore
146951451243980PA166182603Guideline Annotationhttps://www.pharmgkb.org/guidelineAnnotation/PA166182603NaNAnnotation of CPIC Guideline for efavirenz and CYP2B6100
146961451243980PA166182846Guideline Annotationhttps://www.pharmgkb.org/guidelineAnnotation/PA166182846NaNAnnotation of DPWG Guideline for efavirenz and CYP2B6100
1469714512439801451289240Variant Phenotype Annotationhttps://www.pharmgkb.org/variantAnnotation/145128924025889207.0Allele C is not associated with increased likelihood of Central Nervous System Diseases when treated with efavirenz in people with HIV Infections as compared to allele T.-1.5
1469814512439801183634232Variant Phenotype Annotationhttps://www.pharmgkb.org/variantAnnotation/118363423224080498.0Genotypes CC + CT are not associated with risk of Neurotoxicity Syndromes when treated with efavirenz in people with HIV Infections as compared to genotype TT.-1.75
1469914512439801184473287Variant Phenotype Annotationhttps://www.pharmgkb.org/variantAnnotation/118447328724517233.0Genotype TT is associated with increased risk of Central Nervous System Diseases when treated with efavirenz in people with HIV Infections.2.0
1470014512439801448636199Variant Phenotype Annotationhttps://www.pharmgkb.org/variantAnnotation/144863619928692529.0Genotype CC is associated with decreased likelihood of Drug Toxicity when treated with efavirenz in people with HIV Infections as compared to genotype TT.2.0
1470114512439801448993810Variant Phenotype Annotationhttps://www.pharmgkb.org/variantAnnotation/144899381026715213.0Genotypes CC + CT are associated with decreased risk of Central Nervous System Diseases when treated with efavirenz in people with HIV Infections as compared to genotype TT.3.5
147021451243980827707534Variant Phenotype Annotationhttps://www.pharmgkb.org/variantAnnotation/82770753421862974.0CYP2B6 *6/*6 is associated with increased risk of drug-induced liver injury when treated with efavirenz in people with HIV as compared to CYP2B6 *1/*1.2.5
1470314512439801184168515Variant Phenotype Annotationhttps://www.pharmgkb.org/variantAnnotation/118416851523734829.0CYP2B6 *1 is not associated with Neurotoxicity Syndromes when treated with efavirenz in people with HIV as compared to CYP2B6 *6.-1.5
1470414512439801448993721Variant Phenotype Annotationhttps://www.pharmgkb.org/variantAnnotation/144899372122808112.0CYP2B6 *6 is associated with increased risk of Toxic liver disease when treated with efavirenz in people with HIV as compared to CYP2B6 *1/*1.2.25
1470514512439801448993746Variant Phenotype Annotationhttps://www.pharmgkb.org/variantAnnotation/144899374627333947.0CYP2B6 *6/*6 is associated with increased risk of Long QT Syndrome when exposed to efavirenz in healthy individuals as compared to CYP2B6 *1/*1.1.75
1470614512439801448994067Variant Phenotype Annotationhttps://www.pharmgkb.org/variantAnnotation/144899406717686225.0CYP2B6 *2/*2 is associated with increased risk of Central Nervous System Diseases when treated with efavirenz in people with HIV as compared to CYP2B6 *1/*1.0.25
1470714512439801449156721Variant Phenotype Annotationhttps://www.pharmgkb.org/variantAnnotation/144915672123640958.0CYP2B6 *6 + *38 are associated with increased risk of Neurotoxicity Syndromes when treated with efavirenz as compared to CYP2B6 *1/*1.0.0
1470814512439801449156770Variant Phenotype Annotationhttps://www.pharmgkb.org/variantAnnotation/144915677024359841.0CYP2B6 *6/*6 is associated with increased likelihood of Toxic liver disease when treated with efavirenz in people with HIV as compared to CYP2B6 *1/*1.2.0
\n", "
" ], "text/plain": [ " Clinical Annotation ID Evidence ID Evidence Type \\\n", "14695 1451243980 PA166182603 Guideline Annotation \n", "14696 1451243980 PA166182846 Guideline Annotation \n", "14697 1451243980 1451289240 Variant Phenotype Annotation \n", "14698 1451243980 1183634232 Variant Phenotype Annotation \n", "14699 1451243980 1184473287 Variant Phenotype Annotation \n", "14700 1451243980 1448636199 Variant Phenotype Annotation \n", "14701 1451243980 1448993810 Variant Phenotype Annotation \n", "14702 1451243980 827707534 Variant Phenotype Annotation \n", "14703 1451243980 1184168515 Variant Phenotype Annotation \n", "14704 1451243980 1448993721 Variant Phenotype Annotation \n", "14705 1451243980 1448993746 Variant Phenotype Annotation \n", "14706 1451243980 1448994067 Variant Phenotype Annotation \n", "14707 1451243980 1449156721 Variant Phenotype Annotation \n", "14708 1451243980 1449156770 Variant Phenotype Annotation \n", "\n", " Evidence URL PMID \\\n", "14695 https://www.pharmgkb.org/guidelineAnnotation/PA166182603 NaN \n", "14696 https://www.pharmgkb.org/guidelineAnnotation/PA166182846 NaN \n", "14697 https://www.pharmgkb.org/variantAnnotation/1451289240 25889207.0 \n", "14698 https://www.pharmgkb.org/variantAnnotation/1183634232 24080498.0 \n", "14699 https://www.pharmgkb.org/variantAnnotation/1184473287 24517233.0 \n", "14700 https://www.pharmgkb.org/variantAnnotation/1448636199 28692529.0 \n", "14701 https://www.pharmgkb.org/variantAnnotation/1448993810 26715213.0 \n", "14702 https://www.pharmgkb.org/variantAnnotation/827707534 21862974.0 \n", "14703 https://www.pharmgkb.org/variantAnnotation/1184168515 23734829.0 \n", "14704 https://www.pharmgkb.org/variantAnnotation/1448993721 22808112.0 \n", "14705 https://www.pharmgkb.org/variantAnnotation/1448993746 27333947.0 \n", "14706 https://www.pharmgkb.org/variantAnnotation/1448994067 17686225.0 \n", "14707 https://www.pharmgkb.org/variantAnnotation/1449156721 23640958.0 \n", "14708 
https://www.pharmgkb.org/variantAnnotation/1449156770 24359841.0 \n", "\n", " Summary \\\n", "14695 Annotation of CPIC Guideline for efavirenz and CYP2B6 \n", "14696 Annotation of DPWG Guideline for efavirenz and CYP2B6 \n", "14697 Allele C is not associated with increased likelihood of Central Nervous System Diseases when treated with efavirenz in people with HIV Infections as compared to allele T. \n", "14698 Genotypes CC + CT are not associated with risk of Neurotoxicity Syndromes when treated with efavirenz in people with HIV Infections as compared to genotype TT. \n", "14699 Genotype TT is associated with increased risk of Central Nervous System Diseases when treated with efavirenz in people with HIV Infections. \n", "14700 Genotype CC is associated with decreased likelihood of Drug Toxicity when treated with efavirenz in people with HIV Infections as compared to genotype TT. \n", "14701 Genotypes CC + CT are associated with decreased risk of Central Nervous System Diseases when treated with efavirenz in people with HIV Infections as compared to genotype TT. \n", "14702 CYP2B6 *6/*6 is associated with increased risk of drug-induced liver injury when treated with efavirenz in people with HIV as compared to CYP2B6 *1/*1. \n", "14703 CYP2B6 *1 is not associated with Neurotoxicity Syndromes when treated with efavirenz in people with HIV as compared to CYP2B6 *6. \n", "14704 CYP2B6 *6 is associated with increased risk of Toxic liver disease when treated with efavirenz in people with HIV as compared to CYP2B6 *1/*1. \n", "14705 CYP2B6 *6/*6 is associated with increased risk of Long QT Syndrome when exposed to efavirenz in healthy individuals as compared to CYP2B6 *1/*1. \n", "14706 CYP2B6 *2/*2 is associated with increased risk of Central Nervous System Diseases when treated with efavirenz in people with HIV as compared to CYP2B6 *1/*1. 
\n", "14707 CYP2B6 *6 + *38 are associated with increased risk of Neurotoxicity Syndromes when treated with efavirenz as compared to CYP2B6 *1/*1. \n", "14708 CYP2B6 *6/*6 is associated with increased likelihood of Toxic liver disease when treated with efavirenz in people with HIV as compared to CYP2B6 *1/*1. \n", "\n", " Score \n", "14695 100 \n", "14696 100 \n", "14697 -1.5 \n", "14698 -1.75 \n", "14699 2.0 \n", "14700 2.0 \n", "14701 3.5 \n", "14702 2.5 \n", "14703 -1.5 \n", "14704 2.25 \n", "14705 1.75 \n", "14706 0.25 \n", "14707 0.0 \n", "14708 2.0 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_id(1451243980)" ] }, { "cell_type": "markdown", "id": "48bb09fa", "metadata": {}, "source": [ "### Example extraction\n", "\n", "[Top of page](#Table-of-contents)\n", "\n", "New data model extracted from the PharmGKB clinical annotations download file:\n", "* The trait in the evidence will be PharmGKB's “Phenotypes”\n", "* The drug will be extracted from PharmGKB's “Drugs”\n", "* The target will be the target associated with the variant, PharmGKB's “Gene”\n", "* Keep only rows whose category is `Efficacy` and that have associated `Phenotypes`" ] }, { "cell_type": "code", "execution_count": 134, "id": "cf8105dc", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['Clinical Annotation ID', 'Variant/Haplotypes', 'Gene',\n", " 'Level of Evidence', 'Level Override', 'Level Modifiers', 'Score',\n", " 'Phenotype Category', 'PMID Count', 'Evidence Count', 'Drug(s)',\n", " 'Phenotype(s)', 'Latest History Date (YYYY-MM-DD)', 'URL',\n", " 'Specialty Population'],\n", " dtype='object')" ] }, "execution_count": 134, "metadata": {}, "output_type": "execute_result" } ], "source": [ "clinical_annotations.columns" ] }, { "cell_type": "code", "execution_count": 139, "id": "b7be01a7", "metadata": {}, "outputs": [], "source": [ "# Filter by efficacy\n", "efficacy_annotations = clinical_annotations[clinical_annotations['Phenotype Category'] == 
'Efficacy']" ] }, { "cell_type": "code", "execution_count": 150, "id": "4995b340", "metadata": {}, "outputs": [], "source": [ "# Keep relevant columns\n", "efficacy_annotations = efficacy_annotations[\n", " ['Clinical Annotation ID', 'Variant/Haplotypes', 'Gene',\n", " 'Level of Evidence', 'Drug(s)', 'Phenotype(s)']]" ] }, { "cell_type": "code", "execution_count": 162, "id": "5c364dd6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1931" ] }, "execution_count": 162, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(efficacy_annotations)" ] }, { "cell_type": "code", "execution_count": 153, "id": "471220b2", "metadata": {}, "outputs": [], "source": [ "# Join on alleles data\n", "efficacy_with_alleles = efficacy_annotations.set_index('Clinical Annotation ID').join(clinical_alleles.set_index('Clinical Annotation ID'))" ] }, { "cell_type": "code", "execution_count": 154, "id": "51ebdc1b", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Variant/HaplotypesGeneLevel of EvidenceDrug(s)Phenotype(s)Genotype/AlleleAnnotation TextAllele Function
Clinical Annotation ID
| 613979021 | rs1042714 | ADRB2 | 3 | carvedilol | Heart Failure | CC | Patients with the CC genotype and heart failure may have a poorer response to carvedilol treatment as compared to patients with the CG or GG genotype. Other genetic and clinical factors may also influence a patient's chance of response. | NaN |
| 613979021 | rs1042714 | ADRB2 | 3 | carvedilol | Heart Failure | CG | Patients with the CG genotype and heart failure may have a poorer response to carvedilol treatment as compared to patients with the GG genotype and a better response as compared to patients with the CC genotype. Patients with the CG genotype may still be at risk for non-response to carvedilol treatment based on their genotype. Other genetic and clinical factors may also influence a patient's chance of response. | NaN |
| 613979021 | rs1042714 | ADRB2 | 3 | carvedilol | Heart Failure | GG | Patients with the GG genotype and heart failure may have a better response to carvedilol treatment as compared to patients with the CC or CG genotype. Patients with the GG genotype may still be at risk for non-response to carvedilol treatment based on their genotype. Other genetic and clinical factors may also influence a patient's chance of response. | NaN |
| 613979403 | rs5443 | GNB3 | 3 | sumatriptan | Cluster Headache | CC | Patients with the CC genotype and cluster headache who are treated with triptans may be less likely to have reduced pain or attack frequency as compared to patients with the CT genotype. Other genetic and clinical factors may also influence a patient's response to sumatriptan. | NaN |
| 613979403 | rs5443 | GNB3 | 3 | sumatriptan | Cluster Headache | CT | Patients with the CT genotype and cluster headache who are treated with triptans may be more likely to have reduced pain or attack frequency as compared to patients with the CC genotype. Other genetic and clinical factors may also influence a patient's response to sumatriptan. | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1451868520 | rs11198893 | GRK5 | 3 | Beta Blocking Agents | Coronary Artery Disease | AG | Patients with the rs11198893 AG genotype and coronary artery disease may have decreased response when treated with beta blocking agents as compared to patients with the GG genotype. Other genetic and clinical factors may also influence response to beta blocking agents. | NaN |
| 1451868520 | rs11198893 | GRK5 | 3 | Beta Blocking Agents | Coronary Artery Disease | GG | Patients with the rs11198893 GG genotype and coronary artery disease may have increased response when treated with beta blocking agents as compared to patients with the AA or AG genotypes. Other genetic and clinical factors may also influence response to beta blocking agents. | NaN |
| 1451868540 | rs4752292 | GRK5 | 3 | Beta Blocking Agents | Coronary Artery Disease | GG | Patients with the rs4752292 GG genotype and coronary artery disease may have increased response when treated with beta blocking agents as compared to patients with the TT or GT genotypes. Other genetic and clinical factors may also influence response to beta blocking agents. | NaN |
| 1451868540 | rs4752292 | GRK5 | 3 | Beta Blocking Agents | Coronary Artery Disease | GT | Patients with the rs4752292 GT genotype and coronary artery disease may have decreased response when treated with beta blocking agents as compared to patients with the GG genotype. Other genetic and clinical factors may also influence response to beta blocking agents. | NaN |
| 1451868540 | rs4752292 | GRK5 | 3 | Beta Blocking Agents | Coronary Artery Disease | TT | Patients with the rs4752292 TT genotype and coronary artery disease may have decreased response when treated with beta blocking agents as compared to patients with the GG genotype. Other genetic and clinical factors may also influence response to beta blocking agents. | NaN |
\n", "

5881 rows × 8 columns

\n", "
" ], "text/plain": [ " Variant/Haplotypes Gene Level of Evidence \\\n", "Clinical Annotation ID \n", "613979021 rs1042714 ADRB2 3 \n", "613979021 rs1042714 ADRB2 3 \n", "613979021 rs1042714 ADRB2 3 \n", "613979403 rs5443 GNB3 3 \n", "613979403 rs5443 GNB3 3 \n", "... ... ... ... \n", "1451868520 rs11198893 GRK5 3 \n", "1451868520 rs11198893 GRK5 3 \n", "1451868540 rs4752292 GRK5 3 \n", "1451868540 rs4752292 GRK5 3 \n", "1451868540 rs4752292 GRK5 3 \n", "\n", " Drug(s) Phenotype(s) \\\n", "Clinical Annotation ID \n", "613979021 carvedilol Heart Failure \n", "613979021 carvedilol Heart Failure \n", "613979021 carvedilol Heart Failure \n", "613979403 sumatriptan Cluster Headache \n", "613979403 sumatriptan Cluster Headache \n", "... ... ... \n", "1451868520 Beta Blocking Agents Coronary Artery Disease \n", "1451868520 Beta Blocking Agents Coronary Artery Disease \n", "1451868540 Beta Blocking Agents Coronary Artery Disease \n", "1451868540 Beta Blocking Agents Coronary Artery Disease \n", "1451868540 Beta Blocking Agents Coronary Artery Disease \n", "\n", " Genotype/Allele \\\n", "Clinical Annotation ID \n", "613979021 CC \n", "613979021 CG \n", "613979021 GG \n", "613979403 CC \n", "613979403 CT \n", "... ... \n", "1451868520 AG \n", "1451868520 GG \n", "1451868540 GG \n", "1451868540 GT \n", "1451868540 TT \n", "\n", " Annotation Text \\\n", "Clinical Annotation ID \n", "613979021 Patients with the CC genotype and heart failure may have a poorer response to carvedilol treatment as compared to patients with the CG or GG genotype. Other genetic and clinical factors may also influence a patient's chance of response. \n", "613979021 Patients with the CG genotype and heart failure may have a poorer response to carvedilol treatment as compared to patients with the GG genotype and a better response as compared to patients with the CC genotype. Patients with the CG genotype may still be at risk for non-response to carvedilol treatment based on their genotype. 
Other genetic and clinical factors may also influence a patient's chance of response. \n", "613979021 Patients with the GG genotype and heart failure may have a better response to carvedilol treatment as compared to patients with the CC or CG genotype. Patients with the GG genotype may still be at risk for non-response to carvedilol treatment based on their genotype. Other genetic and clinical factors may also influence a patient's chance of response. \n", "613979403 Patients with the CC genotype and cluster headache who are treated with triptans may be less likely to have reduced pain or attack frequency as compared to patients with the CT genotype. Other genetic and clinical factors may also influence a patient's response to sumatriptan. \n", "613979403 Patients with the CT genotype and cluster headache who are treated with triptans may be more likely to have reduced pain or attack frequency as compared to patients with the CC genotype. Other genetic and clinical factors may also influence a patient's response to sumatriptan. \n", "... ... \n", "1451868520 Patients with the rs11198893 AG genotype and coronary artery disease may have decreased response when treated with beta blocking agents as compared to patients with the GG genotype. Other genetic and clinical factors may also influence response to beta blocking agents. \n", "1451868520 Patients with the rs11198893 GG genotype and coronary artery disease may have increased response when treated with beta blocking agents as compared to patients with the AA or AG genotypes. Other genetic and clinical factors may also influence response to beta blocking agents. \n", "1451868540 Patients with the rs4752292 GG genotype and coronary artery disease may have increased response when treated with beta blocking agents as compared to patients with the TT or GT genotypes. Other genetic and clinical factors may also influence response to beta blocking agents. 
\n", "1451868540 Patients with the rs4752292 GT genotype and coronary artery disease may have decreased response when treated with beta blocking agents as compared to patients with the GG genotype. Other genetic and clinical factors may also influence response to beta blocking agents. \n", "1451868540 Patients with the rs4752292 TT genotype and coronary artery disease may have decreased response when treated with beta blocking agents as compared to patients with the GG genotype. Other genetic and clinical factors may also influence response to beta blocking agents. \n", "\n", " Allele Function \n", "Clinical Annotation ID \n", "613979021 NaN \n", "613979021 NaN \n", "613979021 NaN \n", "613979403 NaN \n", "613979403 NaN \n", "... ... \n", "1451868520 NaN \n", "1451868520 NaN \n", "1451868540 NaN \n", "1451868540 NaN \n", "1451868540 NaN \n", "\n", "[5881 rows x 8 columns]" ] }, "execution_count": 154, "metadata": {}, "output_type": "execute_result" } ], "source": [ "efficacy_with_alleles" ] }, { "cell_type": "code", "execution_count": 161, "id": "853b2e89", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5881" ] }, "execution_count": 161, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Number of alleles (as opposed to variants)\n", "len(efficacy_with_alleles)" ] }, { "cell_type": "code", "execution_count": 158, "id": "2f11b4a9", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "126" ] }, "execution_count": 158, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Number of entries with allele function\n", "len(efficacy_with_alleles[pd.notna(efficacy_with_alleles['Allele Function'])])" ] }, { "cell_type": "code", "execution_count": 160, "id": "5c4242f8", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5659" ] }, "execution_count": 160, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Number of entries with RS\n", 
"len(efficacy_with_alleles[efficacy_with_alleles['Variant/Haplotypes'].str.contains('rs')])" ] }, { "cell_type": "markdown", "id": "9549ee4f", "metadata": {}, "source": [ "### Connecting with ClinVar\n", "\n", "[Top of page](#Table-of-contents)" ] }, { "cell_type": "code", "execution_count": 203, "id": "70fcc847", "metadata": {}, "outputs": [], "source": [ "import re" ] }, { "cell_type": "code", "execution_count": 230, "id": "170de42a", "metadata": { "scrolled": false }, "outputs": [], "source": [ "# Can use Clinical Annotation ID which should appear in xrefs\n", "all_pgkb_ids = []\n", "for raw_cvs_xml in iterate_cvs_from_xml(drug_xml):\n", " if is_pgkb(raw_cvs_xml):\n", " record = ClinVarRecord(find_mandatory_unique_element(raw_cvs_xml, 'ReferenceClinVarAssertion'))\n", " if record.measure:\n", " # this is the soundest approach\n", " pgkb_ids = [\n", " int(elem.attrib['ID']) \n", " for elem in find_elements(record.measure.measure_xml, './XRef[@DB=\"PharmGKB Clinical Annotation\"]')\n", " ]\n", " if not pgkb_ids:\n", " # this yields a lot of redundancy\n", " pgkb_ids = [\n", " int(re.split(r'[a-zA-Z]+', elem.attrib['ID'])[0])\n", " for elem in find_elements(record.measure.measure_xml, './XRef[@DB=\"PharmGKB\"]')\n", " ]\n", " if not pgkb_ids:\n", " # this is stupid - probably don't do this\n", " pgkb_ids = [\n", " int(elem.text.split('/')[-1])\n", " for elem in find_elements(raw_cvs_xml, './ClinVarAssertion/ClinicalSignificance/Citation/URL')\n", " ]\n", " if not pgkb_ids:\n", " pprint(raw_cvs_xml)\n", " break\n", " all_pgkb_ids.extend(pgkb_ids)" ] }, { "cell_type": "code", "execution_count": 231, "id": "4bbc17b5", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2000" ] }, "execution_count": 231, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(all_pgkb_ids)" ] }, { "cell_type": "code", "execution_count": 232, "id": "c9926163", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "167" ] }, "execution_count": 232, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "# Cf. 401 records with PGKB submissions\n", "len(set(all_pgkb_ids))" ] }, { "cell_type": "code", "execution_count": 234, "id": "39bcfdb5", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| | Clinical Annotation ID | Variant/Haplotypes | Gene | Level of Evidence | Level Override | Level Modifiers | Score | Phenotype Category | PMID Count | Evidence Count | Drug(s) | Phenotype(s) | Latest History Date (YYYY-MM-DD) | URL | Specialty Population |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 981755803 | rs75527207 | CFTR | 1A | NaN | Rare Variant; Tier 1 VIP | 234.875 | Efficacy | 28 | 30 | ivacaftor | Cystic Fibrosis | 2021-03-24 | https://www.pharmgkb.org/clinicalAnnotation/981755803 | Pediatric |
| 3 | 1449191690 | rs141033578 | CFTR | 1A | NaN | Rare Variant; Tier 1 VIP | 200.000 | Efficacy | 1 | 3 | ivacaftor | Cystic Fibrosis | 2021-03-24 | https://www.pharmgkb.org/clinicalAnnotation/1449191690 | NaN |
| 4 | 1449191746 | rs78769542 | CFTR | 1A | NaN | Rare Variant; Tier 1 VIP | 200.000 | Efficacy | 1 | 3 | ivacaftor | Cystic Fibrosis | 2021-03-24 | https://www.pharmgkb.org/clinicalAnnotation/1449191746 | NaN |
| 27 | 655386913 | CYP2C19*1, CYP2C19*17 | CYP2C19 | 3 | NaN | Tier 1 VIP | 6.000 | Toxicity | 15 | 16 | clopidogrel | Acute coronary syndrome;Coronary Artery Disease;Hemorrhage;Myocardial Infarction | 2021-03-24 | https://www.pharmgkb.org/clinicalAnnotation/655386913 | NaN |
| 159 | 981201854 | rs28399499 | CYP2B6 | 3 | NaN | Tier 1 VIP | 5.250 | Metabolism/PK | 7 | 7 | nevirapine | HIV Infections | 2021-03-24 | https://www.pharmgkb.org/clinicalAnnotation/981201854 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4531 | 1451237940 | rs9923231 | VKORC1 | 1A | NaN | Tier 1 VIP | 117.000 | Dosage | 10 | 11 | phenprocoumon | NaN | 2021-03-24 | https://www.pharmgkb.org/clinicalAnnotation/1451237940 | Pediatric |
| 4533 | 1451243676 | rs9923231 | VKORC1 | 2A | NaN | Tier 1 VIP | 8.250 | Toxicity | 3 | 4 | phenprocoumon | Hemorrhage;over-anticoagulation;time above therapeutic range | 2021-03-24 | https://www.pharmgkb.org/clinicalAnnotation/1451243676 | NaN |
| 4535 | 1451245360 | rs1051266 | SLC19A1 | 2A | NaN | Tier 1 VIP | 14.125 | Efficacy | 9 | 10 | methotrexate | Arthritis, Rheumatoid | 2021-03-24 | https://www.pharmgkb.org/clinicalAnnotation/1451245360 | NaN |
| 4762 | 1449191758 | rs75541969 | CFTR | 1A | NaN | Rare Variant; Tier 1 VIP | 200.000 | Efficacy | 1 | 3 | ivacaftor | Cystic Fibrosis | 2021-03-24 | https://www.pharmgkb.org/clinicalAnnotation/1449191758 | NaN |
| 5001 | 1451289660 | rs59086055 | DPYD | 1A | NaN | Rare Variant; Tier 1 VIP | 100.000 | Toxicity | 1 | 2 | fluorouracil | Neoplasms | 2021-03-24 | https://www.pharmgkb.org/clinicalAnnotation/1451289660 | NaN |
\n", "

161 rows × 15 columns

\n", "
" ], "text/plain": [ " Clinical Annotation ID Variant/Haplotypes Gene \\\n", "0 981755803 rs75527207 CFTR \n", "3 1449191690 rs141033578 CFTR \n", "4 1449191746 rs78769542 CFTR \n", "27 655386913 CYP2C19*1, CYP2C19*17 CYP2C19 \n", "159 981201854 rs28399499 CYP2B6 \n", "... ... ... ... \n", "4531 1451237940 rs9923231 VKORC1 \n", "4533 1451243676 rs9923231 VKORC1 \n", "4535 1451245360 rs1051266 SLC19A1 \n", "4762 1449191758 rs75541969 CFTR \n", "5001 1451289660 rs59086055 DPYD \n", "\n", " Level of Evidence Level Override Level Modifiers Score \\\n", "0 1A NaN Rare Variant; Tier 1 VIP 234.875 \n", "3 1A NaN Rare Variant; Tier 1 VIP 200.000 \n", "4 1A NaN Rare Variant; Tier 1 VIP 200.000 \n", "27 3 NaN Tier 1 VIP 6.000 \n", "159 3 NaN Tier 1 VIP 5.250 \n", "... ... ... ... ... \n", "4531 1A NaN Tier 1 VIP 117.000 \n", "4533 2A NaN Tier 1 VIP 8.250 \n", "4535 2A NaN Tier 1 VIP 14.125 \n", "4762 1A NaN Rare Variant; Tier 1 VIP 200.000 \n", "5001 1A NaN Rare Variant; Tier 1 VIP 100.000 \n", "\n", " Phenotype Category PMID Count Evidence Count Drug(s) \\\n", "0 Efficacy 28 30 ivacaftor \n", "3 Efficacy 1 3 ivacaftor \n", "4 Efficacy 1 3 ivacaftor \n", "27 Toxicity 15 16 clopidogrel \n", "159 Metabolism/PK 7 7 nevirapine \n", "... ... ... ... ... \n", "4531 Dosage 10 11 phenprocoumon \n", "4533 Toxicity 3 4 phenprocoumon \n", "4535 Efficacy 9 10 methotrexate \n", "4762 Efficacy 1 3 ivacaftor \n", "5001 Toxicity 1 2 fluorouracil \n", "\n", " Phenotype(s) \\\n", "0 Cystic Fibrosis \n", "3 Cystic Fibrosis \n", "4 Cystic Fibrosis \n", "27 Acute coronary syndrome;Coronary Artery Disease;Hemorrhage;Myocardial Infarction \n", "159 HIV Infections \n", "... ... \n", "4531 NaN \n", "4533 Hemorrhage;over-anticoagulation;time above therapeutic range \n", "4535 Arthritis, Rheumatoid \n", "4762 Cystic Fibrosis \n", "5001 Neoplasms \n", "\n", " Latest History Date (YYYY-MM-DD) \\\n", "0 2021-03-24 \n", "3 2021-03-24 \n", "4 2021-03-24 \n", "27 2021-03-24 \n", "159 2021-03-24 \n", "... 
... \n", "4531 2021-03-24 \n", "4533 2021-03-24 \n", "4535 2021-03-24 \n", "4762 2021-03-24 \n", "5001 2021-03-24 \n", "\n", " URL \\\n", "0 https://www.pharmgkb.org/clinicalAnnotation/981755803 \n", "3 https://www.pharmgkb.org/clinicalAnnotation/1449191690 \n", "4 https://www.pharmgkb.org/clinicalAnnotation/1449191746 \n", "27 https://www.pharmgkb.org/clinicalAnnotation/655386913 \n", "159 https://www.pharmgkb.org/clinicalAnnotation/981201854 \n", "... ... \n", "4531 https://www.pharmgkb.org/clinicalAnnotation/1451237940 \n", "4533 https://www.pharmgkb.org/clinicalAnnotation/1451243676 \n", "4535 https://www.pharmgkb.org/clinicalAnnotation/1451245360 \n", "4762 https://www.pharmgkb.org/clinicalAnnotation/1449191758 \n", "5001 https://www.pharmgkb.org/clinicalAnnotation/1451289660 \n", "\n", " Specialty Population \n", "0 Pediatric \n", "3 NaN \n", "4 NaN \n", "27 NaN \n", "159 NaN \n", "... ... \n", "4531 Pediatric \n", "4533 NaN \n", "4535 NaN \n", "4762 NaN \n", "5001 NaN \n", "\n", "[161 rows x 15 columns]" ] }, "execution_count": 234, "metadata": {}, "output_type": "execute_result" } ], "source": [ "clinical_annotations[clinical_annotations['Clinical Annotation ID'].isin(set(all_pgkb_ids))]" ] }, { "cell_type": "markdown", "id": "e0495ff2", "metadata": {}, "source": [ "### Star alleles\n", "\n", "[Top of page](#Table-of-contents)\n", "\n", "e.g. [CYP2D6](https://www.ncbi.nlm.nih.gov/books/NBK574601/) - corresponds to\n", "> specific combinations of single nucleotide polymorphisms (SNPs) and/or small insertions and deletions (indels).... 
In addition, the CYP2D6 gene locus contains a number of complex structural variants including full gene deletions, gene duplications and multiplications [[via](https://www.nature.com/articles/s41525-020-0135-2)]\n", "\n", "`CYP2D6*1` is the reference allele; `CYP2D6*(gene variant)xN` refers to `N` copies of the gene.\n", "\n", "Nomenclature is really heterogeneous (compare [HLA](http://hla.alleles.org/nomenclature/naming.html)) - there are lots of rabbit holes we could go down!\n", "\n", "Conversion to rs IDs / HGVS? e.g. in [PharmVar](https://www.pharmvar.org/gene/CYP2D6)\n", "* has [data download](https://www.pharmvar.org/download)\n", "* also has an [API](https://www.pharmvar.org/documentation)!" ] }, { "cell_type": "code", "execution_count": 170, "id": "35cef091", "metadata": {}, "outputs": [], "source": [ "no_rs = efficacy_with_alleles[~efficacy_with_alleles['Variant/Haplotypes'].str.contains('rs')]['Variant/Haplotypes'].tolist()" ] }, { "cell_type": "code", "execution_count": 238, "id": "7ea17e44", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "{'CYP2B6*1, CYP2B6*4, CYP2B6*5, CYP2B6*6, CYP2B6*7',\n", " 'CYP2B6*1, CYP2B6*5',\n", " 'CYP2B6*1, CYP2B6*6',\n", " 'CYP2C19*1, CYP2C19*2',\n", " 'CYP2C19*1, CYP2C19*2, CYP2C19*3',\n", " 'CYP2C19*1, CYP2C19*2, CYP2C19*3, CYP2C19*17',\n", " 'CYP2C8*1, CYP2C8*2, CYP2C8*3, CYP2C8*4',\n", " 'CYP2C8*1, CYP2C8*3',\n", " 'CYP2C9*1, CYP2C9*2, CYP2C9*3',\n", " 'CYP2C9*1, CYP2C9*2, CYP2C9*3, CYP2C9*13, CYP2C9*14',\n", " 'CYP2C9*1, CYP2C9*3',\n", " 'CYP2D6*1, CYP2D6*10',\n", " 'CYP2D6*1, CYP2D6*1xN',\n", " 'CYP2D6*1, CYP2D6*1xN, CYP2D6*2, CYP2D6*2xN, CYP2D6*3, CYP2D6*4, CYP2D6*6',\n", " 'CYP2D6*1, CYP2D6*1xN, CYP2D6*2, CYP2D6*2xN, CYP2D6*4, CYP2D6*5, CYP2D6*10, CYP2D6*35xN',\n", " 'CYP2D6*1, CYP2D6*1xN, CYP2D6*2xN',\n", " 'CYP2D6*1, CYP2D6*2, CYP2D6*2xN, CYP2D6*3, CYP2D6*4, CYP2D6*6',\n", " 'CYP2D6*1, CYP2D6*2, CYP2D6*3, CYP2D6*4, CYP2D6*5, CYP2D6*6, CYP2D6*7, CYP2D6*9, CYP2D6*10, CYP2D6*10x2, CYP2D6*11,
CYP2D6*17, CYP2D6*21, CYP2D6*36, CYP2D6*41',\n", " 'CYP2D6*1, CYP2D6*3, CYP2D6*4',\n", " 'CYP2D6*1, CYP2D6*3, CYP2D6*4, CYP2D6*5, CYP2D6*6, CYP2D6*10, CYP2D6*17',\n", " 'CYP2D6*1, CYP2D6*4',\n", " 'CYP2D6*1, CYP2D6*4, CYP2D6*5, CYP2D6*6, CYP2D6*17, CYP2D6*40',\n", " 'CYP2D6*5, CYP2D6*17',\n", " 'CYP3A4*1, CYP3A4*22',\n", " 'CYP3A4*1, CYP3A4*36',\n", " 'CYP3A4*1, CYP3A4*4',\n", " 'CYP3A5*1, CYP3A5*3',\n", " 'GSTM1 non-null, GSTM1 null',\n", " 'GSTT1 non-null, GSTT1 null',\n", " 'HLA-B*15:01:01:01',\n", " 'HLA-B*38:01:01',\n", " 'HLA-B*44:02:01:01',\n", " 'HLA-C*01:02:01, HLA-C*02:02:01, HLA-C*03:02, HLA-C*04:01:01:01, HLA-C*05:01:01:01, HLA-C*06:02:01:01, HLA-C*07:01:01, HLA-C*08:01, HLA-C*12:02:01, HLA-C*14:02:01, HLA-C*15:02:01, HLA-C*16:01:01, HLA-C*17:01:01:01',\n", " 'HLA-C*06:02:01:01',\n", " 'HLA-DRB1*04:01:01',\n", " 'NAT2*4, NAT2*5D, NAT2*6B, NAT2*7A, NAT2*12A, NAT2*13A, NAT2*14A',\n", " 'SLC6A4 HTTLPR long form (L allele), SLC6A4 HTTLPR short form (S allele)',\n", " 'SLCO1B1*1, SLCO1B1*14',\n", " 'TPMT*1, TPMT*3B, TPMT*3C',\n", " 'UGT1A1*1, UGT1A1*28',\n", " 'UGT1A1*60',\n", " 'UGT1A3*1, UGT1A3*2',\n", " 'UGT2B15*1, UGT2B15*2'}" ] }, "execution_count": 238, "metadata": {}, "output_type": "execute_result" } ], "source": [ "set(no_rs)" ] }, { "cell_type": "code", "execution_count": 172, "id": "67627cc5", "metadata": {}, "outputs": [], "source": [ "import requests" ] }, { "cell_type": "code", "execution_count": 173, "id": "63d9cbdb", "metadata": {}, "outputs": [], "source": [ "def get_pharmvar_result(allele):\n", " return requests.get(f'https://www.pharmvar.org/api-service/alleles/{allele}').json()" ] }, { "cell_type": "code", "execution_count": 174, "id": "3281a20d", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "[{'geneSymbol': 'CYP2C9',\n", " 'alleleName': 'CYP2C9*2',\n", " 'pvId': 'PV00538',\n", " 'legacyLabel': None,\n", " 'coreAllele': None,\n", " 'evidenceLevel': '0',\n", " 'description': None,\n", " 'function': 'decreased 
function',\n", " 'activeInd': True,\n", " 'references': [{'citation': 'Rettie et al. 1994',\n", " 'url': 'http://www.ncbi.nlm.nih.gov/pubmed/8004131'},\n", " {'citation': 'Crespi et al. 1997',\n", " 'url': 'http://www.ncbi.nlm.nih.gov/pubmed/9241660'},\n", " {'citation': 'deposited by Gaedigk et al.', 'url': None},\n", " {'citation': 'King et al. 2004',\n", " 'url': 'http://www.ncbi.nlm.nih.gov/pubmed/15608560'},\n", " {'citation': 'Takahashi et al. 2004',\n", " 'url': 'http://www.ncbi.nlm.nih.gov/pubmed/15070684'},\n", " {'citation': 'deposited by Campos et al.', 'url': None}],\n", " 'variants': [{'referenceSequence': 'NC_000010.11',\n", " 'referenceLocation': 'Sequence Start',\n", " 'referenceCollections': ['GRCh38'],\n", " 'hgvs': 'NC_000010.11:g.94942290C>T',\n", " 'rsId': 'rs1799853',\n", " 'impact': 'R144C',\n", " 'variantFrequency': [{'source': '1000Genomes', 'frequency': 0.047923},\n", " {'source': 'GnomAD', 'frequency': 0.092016}],\n", " 'url': 'https://www.pharmvar.org/variant/29',\n", " 'variantId': '8',\n", " 'position': 'NC_000010.11:g.94942290C>T'},\n", " {'referenceSequence': 'NC_000010.10',\n", " 'referenceLocation': 'Sequence Start',\n", " 'referenceCollections': ['GRCh37'],\n", " 'hgvs': 'NC_000010.10:g.96702047C>T',\n", " 'rsId': 'rs1799853',\n", " 'impact': 'R144C',\n", " 'variantFrequency': [{'source': '1000Genomes', 'frequency': 0.047923},\n", " {'source': 'GnomAD', 'frequency': 0.092016}],\n", " 'url': 'https://www.pharmvar.org/variant/31',\n", " 'variantId': '8',\n", " 'position': 'NC_000010.10:g.96702047C>T'},\n", " {'referenceSequence': 'NM_000771.4',\n", " 'referenceLocation': 'Sequence Start',\n", " 'referenceCollections': ['RefSeqTranscript'],\n", " 'hgvs': 'NM_000771.4:c.430C>T',\n", " 'rsId': 'rs1799853',\n", " 'impact': 'R144C',\n", " 'variantFrequency': [{'source': '1000Genomes', 'frequency': 0.047923},\n", " {'source': 'GnomAD', 'frequency': 0.092016}],\n", " 'url': 'https://www.pharmvar.org/variant/13748',\n", " 'variantId': 
'8',\n", " 'position': 'NM_000771.4:c.455C>T'},\n", " {'referenceSequence': 'NM_000771.4',\n", " 'referenceLocation': 'ATG Start',\n", " 'referenceCollections': ['RefSeqTranscript'],\n", " 'hgvs': 'NM_000771.4:c.430C>T',\n", " 'rsId': 'rs1799853',\n", " 'impact': 'R144C',\n", " 'variantFrequency': [{'source': '1000Genomes', 'frequency': 0.047923},\n", " {'source': 'GnomAD', 'frequency': 0.092016}],\n", " 'url': 'https://www.pharmvar.org/variant/13747',\n", " 'variantId': '8',\n", " 'position': 'NM_000771.4:c.430C>T'},\n", " {'referenceSequence': 'NG_008385.2',\n", " 'referenceLocation': 'ATG Start',\n", " 'referenceCollections': ['RefSeqGene'],\n", " 'hgvs': 'NG_008385.2:g.9133C>T',\n", " 'rsId': 'rs1799853',\n", " 'impact': 'R144C',\n", " 'variantFrequency': [{'source': '1000Genomes', 'frequency': 0.047923},\n", " {'source': 'GnomAD', 'frequency': 0.092016}],\n", " 'url': 'https://www.pharmvar.org/variant/13590',\n", " 'variantId': '8',\n", " 'position': 'NG_008385.2:g.3608C>T'},\n", " {'referenceSequence': 'NG_008385.2',\n", " 'referenceLocation': 'Sequence Start',\n", " 'referenceCollections': ['RefSeqGene'],\n", " 'hgvs': 'NG_008385.2:g.9133C>T',\n", " 'rsId': 'rs1799853',\n", " 'impact': 'R144C',\n", " 'variantFrequency': [{'source': '1000Genomes', 'frequency': 0.047923},\n", " {'source': 'GnomAD', 'frequency': 0.092016}],\n", " 'url': 'https://www.pharmvar.org/variant/13589',\n", " 'variantId': '8',\n", " 'position': 'NG_008385.2:g.9133C>T'}],\n", " 'alleleType': 'Core',\n", " 'url': 'https://www.pharmvar.org/haplotype/PV00538',\n", " 'hgvs': 'NG_008385.2:g.9133C>T',\n", " 'variantGroups': []}]" ] }, "execution_count": 174, "metadata": {}, "output_type": "execute_result" } ], "source": [ "get_pharmvar_result('CYP2C9*2')" ] }, { "cell_type": "code", "execution_count": 178, "id": "43987077", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'errorMessage': 'Allele NAT2*6 could not be located in the PharmVar database.',\n", " 'errorCode': 404}" ] }, 
"execution_count": 178, "metadata": {}, "output_type": "execute_result" } ], "source": [ "get_pharmvar_result('NAT2*6')" ] }, { "cell_type": "markdown", "id": "e3ee8cc6", "metadata": {}, "source": [ "### Notes\n", "\n", "[Top of page](#Table-of-contents)\n", "\n", "* More data than submitted to ClinVar\n", " * only top 2 tiers of evidence are submitted, most data is in the 3rd\n", "* Data is richer than ClinVar, but a fair amount of it is buried in free text annotations\n", " * in particular direction of effect\n", "* Can connect with ClinVar RCVs via their internal identifiers\n", "* Most data seems to use RS IDs\n", " * in theory get consequences via alleles data (assuming we can get reference allele I guess)\n", "* Pharmacogenes with star alleles are few but important\n", " * will need some special treatment and possibly use of more resources like PharmVar\n", " * maybe parallels with how we handle other complex events in ClinVar" ] }, { "cell_type": "markdown", "id": "a90d7f83", "metadata": {}, "source": [ "## General\n", "\n", "[Top of page](#Table-of-contents)\n", "\n", "Thinking both about PharmGKB data and the more general question of other data sources. 
Options:\n", "\n", "* Add a new data source pipeline\n", " * most likely more data even from submitters to ClinVar\n", " * can also generalise to sources that don't submit to ClinVar at all\n", " * can be used as additional annotations to ClinVar or entirely separate submissions\n", " * probably more work for us\n", "* Start parsing submitted records in ClinVar\n", " * beneficial if it's common that SCVs have more info than the RCV\n", " * potentially can get data from multiple upstream sources with a single SCV parser\n", " * lends itself to enriching \"core\" ClinVar data - ClinVar takes care of linkage\n", " * potential for extra/duplicate work aggregating submissions to ClinVar\n" ] }, { "cell_type": "markdown", "id": "0ebd2fcc", "metadata": {}, "source": [ "### Questions for 29/9 meeting\n", "\n", "* Should we start to parse SCVs in ClinVar?\n", "* Is it worth trying other ways of linking drug & disease within ClinVar?\n", "* What would be useful to get directly from PharmGKB (besides just more data)?\n", "* Any familiarity with pharmacogenes, star alleles and other nomenclature?\n", "* Other questions you have, other info that would be helpful for decision making" ] }, { "cell_type": "markdown", "id": "60dc896e", "metadata": {}, "source": [ "### Meeting notes\n", "\n", "[Top of page](#Table-of-contents)\n", "\n", "* existence of drugResponse field changes the meaning of disease from source - check OT is ok with this\n", "* maybe this is why CV doesn't include disease traits in RCV - can't confidently associate the variant with the disease, only the drug response\n", "* disease traits are potentially more ambiguous - free text, not annotated by CV with xrefs\n", " * probably extra manual curation for us\n", "* are there other terms for efficacy we can consider - depends on how efficacy is measured\n", "* same question as for ClinVar - if drug & disease occur in same record, does it mean the drug is specifically targeting that disease\n", "* other things to 
highlight - really low number of exact efficacy terms, can provide evidence levels from PharmGKB\n", "* next steps - basically investigation into PharmGKB and/or SCV, but pending some questions for OT to raise at next meeting" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 5 }