{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Abstract:\n", "\n", "The aim of this notebook is to show to create an ISA document for depositing Stable Isotope Resolved Metabolomics Study metadata using the ISA API.\n", "\n", "This notebook highlights key steps of the deposition, including:\n", "- declaration of study variables and treatment groups\n", "- declaration of SIRM specific protocols, assays and annotation requirements for a given data modality.\n", "- ISA roundtrip (write, reading, writing).\n", "- Serialization to TAB and JSON\n", "- Validation\n", " \n", " Stable Isotope Resolved Metabolomics Studies are a type of studies using MS and NMR acquisition techniques to decypher biochemical reactions using `tracer molecule`, i.e. molecules for which certain positions carry an isotope (e.g. 13C, 15N). Specific data acquisition and data processing techniques are required and dedicated software is used to make sense of the data. Software such as `IsoSolve` [1], `Ramid`[2](for primary processing of 13C mass isotopomer data obtained with GCMS) or `midcor`[3] (for natural abundance correction processes on13C mass isotopomers spectra), may be used to accomplish those tasks. The output of such tools are tables which may comply with a new specifications devised to better support the reporting of SIRM study results.\n", " \n", " \n", " - [1]. IsoSolve https://doi.org/10.1021/acs.analchem.1c01064\n", " - [2]. https://github.com/seliv55/ramid\n", " - [3]. https://github.com/seliv55/midcor\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Loading the ISA-API" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "from isatools.model import (\n", " Comment,\n", " Investigation,\n", " Study,\n", " StudyFactor,\n", " FactorValue,\n", " OntologyAnnotation,\n", " Characteristic,\n", " OntologySource,\n", " Material,\n", " Sample,\n", " Source,\n", " Protocol,\n", " ProtocolParameter,\n", " ProtocolComponent,\n", " ParameterValue,\n", " Process,\n", " Publication,\n", " Person,\n", " Assay,\n", " DataFile,\n", " plink\n", ")\n", "import datetime\n", "import os\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Programmatic reporting of a 13C Stable Isotope Resolved Metabolomics (SIRM) study" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "#### Declaring the Ontologies and Vocabularies used in the ISA Study" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "investigation = Investigation()\n", "\n", "chebi=OntologySource(name=\"CHEBI\",description=\"Chemical Entity of Biological Interest\")\n", "efo=OntologySource(name=\"EFO\", description=\"Experimental Factor Ontology\")\n", "msio=OntologySource(name=\"MSIO\", description=\"Metabolomics Standards Initiative Ontology\")\n", "obi = OntologySource(name='OBI', description=\"Ontology for Biomedical Investigations\")\n", "pato = OntologySource(name='PATO', description=\"Phenotype and Trait Ontology\")\n", "ncbitaxon = OntologySource(name=\"NCIBTaxon\", description=\"NCBI Taxonomy\")\n", "\n", "investigation.ontology_source_references=[chebi,efo,obi,pato,ncbitaxon]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Basic Study description: declaring Study Factor and and Study Design type" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "study = Study(filename=\"s_13C-SIRM-study.txt\")\n", "study.identifier = \"MTBLS-XXXX-SIRM\"\n", "study.title = \"[U-13C6]-D-glucose labeling experiment in MCF7 cancer cell line\"\n", "study.description = \"Probing cancer pathways of MCF7 cell line using 13C stable isotope resolved metabolomics study using isotopologue distribution analysis with mass spectrometry and isotopomer analysis by 1D 1H NMR.\"\n", "study.submission_date = \"15/08/2021\"\n", "study.public_release_date = \"15/08/2021\"\n", "\n", "# These EMBL-EBI Metabolights (MTBLS) related ISA Comments fields may be used for deposition to EMBL-EBI\n", "src_comment_mtbls1 = Comment(name=\"MTBLS Broker Name\", value=\"OXFORD\")\n", "src_comment_mtbls2 = Comment(name=\"MTBLS Center Name\", value=\"OXFORD\")\n", "src_comment_mtbls3 = Comment(name=\"MTBLS Center Project Name\", value=\"OXFORD\")\n", "src_comment_mtbls4 = Comment(name=\"MTBLS Lab Name\", value=\"Oxford e-Research Centre\")\n", "src_comment_mtbls5 = Comment(name=\"MTBLS Submission Action\", value=\"ADD\")\n", "study.comments.append(src_comment_mtbls1)\n", "study.comments.append(src_comment_mtbls2)\n", "study.comments.append(src_comment_mtbls3)\n", "study.comments.append(src_comment_mtbls4)\n", "study.comments.append(src_comment_mtbls5)\n", "\n", "# These ISA Comments are optional and may be used to report funding information\n", "src_comment_st1 = Comment(name=\"Study Funding Agency\", value=\"\")\n", "src_comment_st2 = Comment(name=\"Study Grant Number\", value=\"\")\n", "study.comments.append(src_comment_st1)\n", "study.comments.append(src_comment_st2)\n", " \n", "\n", "# Adding a Study Design descriptor to the ISA Study object\n", "intervention_design = OntologyAnnotation(term_source=obi)\n", "intervention_design.term = \"intervention design\"\n", "intervention_design.term_accession = \"http://purl.obolibrary.org/obo/OBI_0000115\"\n", "\n", "study_design = OntologyAnnotation(term_source=msio)\n", "study_design.term = \"stable isotope resolved metabolomics study\"\n", "study_design.term_accession = \"http://purl.obolibrary.org/obo/MSIO_0000096\"\n", "\n", "\n", "study.design_descriptors.append(intervention_design)\n", "study.design_descriptors.append(study_design)\n", "\n", "\n", "# Declaring the Study Factors\n", "study.factors = [\n", " StudyFactor(name=\"compound\",factor_type=OntologyAnnotation(term=\"chemical substance\",\n", " term_accession=\"http://purl.obolibrary.org/obo/CHEBI_59999\",\n", " term_source=chebi)),\n", " StudyFactor(name=\"dose\",factor_type=OntologyAnnotation(term=\"dose\", term_accession=\"http://www.ebi.ac.uk/efo/EFO_0000428\",term_source=efo)),\n", " StudyFactor(name=\"duration\",factor_type=OntologyAnnotation(term=\"time\", term_accession=\"http://purl.obolibrary.org/obo/PATO_0000165\", term_source=pato))\n", "]\n", "\n", "# Associating the levels to each of the Study Factor.\n", "fv1 = FactorValue(factor_name=study.factors[0], value=OntologyAnnotation(term=\"dioxygen\"))\n", "fv2 = FactorValue(factor_name=study.factors[1], value=OntologyAnnotation(term=\"high\"))\n", "fv3 = FactorValue(factor_name=study.factors[1], value=OntologyAnnotation(term=\"normal\"))\n", "fv4 = FactorValue(factor_name=study.factors[2], value=OntologyAnnotation(term=\"hour\"))\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Adding the publications associated to the study" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "study.publications = [\n", " Publication(doi=\"10.1371/journal.pone.0000000\",pubmed_id=\"\",\n", " title=\"Decyphering new cancer pathways with stable isotope resolved metabolomics in MCF7 cell lines\",\n", " status=OntologyAnnotation(term=\"indexed in PubMed\"),\n", " author_list=\"Min,W. and Everest H\"),\n", " \n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Adding the authors of the study" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "study.contacts = [\n", " Person(first_name=\"Weng\", last_name=\"Min\", affiliation=\"Beijing Institute of Metabolism\", email=\"weng.min@bim.edu.cn\",\n", " address=\"Prospect Street, Beijing, People's Republic of China\",\n", " comments=[Comment(name=\"Study Person REF\", value=\"\")],\n", " roles=[OntologyAnnotation(term=\"principal investigator role\"),\n", " OntologyAnnotation(term=\"SRA Inform On Status\"),\n", " OntologyAnnotation(term=\"SRA Inform On Error\")]\n", " ),\n", " Person(first_name=\"Hillary\", last_name=\"Everest\", affiliation=\"Centre for Cell Metabolism\",\n", " address=\"CCM, Edinborough, United Kingdom\",\n", " comments=[Comment(name=\"Study Person REF\", value=\"\")],\n", " roles=[OntologyAnnotation(term=\"principal investigator role\")]\n", " )\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Declaring all the protocols used in the ISA study. Note also the declaration of Protocol Parameters when needed." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "study.protocols = [ \n", " #Protocol #0\n", " Protocol(name=\"cell culture and isotopic labeling\",\n", " description=\"SOP for growing MCF7 cells and incubating them with the tracer molecule\",\n", " protocol_type=OntologyAnnotation(term=\"sample collection\"),\n", " parameters=[\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"tracer molecule\"))\n", " ]\n", " ),\n", " #Protocol #1\n", " Protocol(\n", " name=\"intracellular metabolite extraction\",\n", " description=\"SOP for extracting metabolites from harvested cells\",\n", " protocol_type=OntologyAnnotation(term=\"metabolite extraction\")\n", " ),\n", " #Protocol #2\n", " Protocol(\n", " name=\"extracellular metabolite extraction\",\n", " description=\"SOP for extracting metabolites from cell culture supernatant\",\n", " protocol_type=OntologyAnnotation(term=\"metabolite extraction\")\n", " ),\n", " #Protocol #3\n", " Protocol(\n", " name=\"liquid chromatography mass spectrometry\",\n", " description=\"SOP for LC-MS data acquisition\",\n", " protocol_type=OntologyAnnotation(term=\"mass spectrometry\"),\n", " parameters=[\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"chromatography column\")),\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"mass spectrometry instrument\")),\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"mass analyzer\"))\n", " ]\n", " ),\n", " #Protocol #4\n", " Protocol(\n", " name=\"1D 13C NMR spectroscopy for isotopomer analysis\",\n", " description=\"SOP for 1D 13C NMR data acquisition for isotopomer analysis\",\n", " protocol_type=OntologyAnnotation(term=\"nmr spectroscopy\"),\n", " parameters=[\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"magnetic field strength\")),\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"nmr tube\")),\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"pulse sequence\"))\n", " ]\n", " ),\n", " #Protocol #5\n", " Protocol(\n", " name=\"1D 13C NMR spectroscopy for metabolite profiling\",\n", " description=\"SOP for 1D 13C NMR data acquisition for metabolite profiling\",\n", " protocol_type=OntologyAnnotation(term=\"nmr spectroscopy\"),\n", " parameters=[\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"magnetic field strength\")),\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"nmr tube\")),\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"pulse sequence\"))\n", " ]\n", " ),\n", " #Protocol #6\n", " Protocol(\n", " name=\"MS metabolite identification\",\n", " description=\"SOP for MS signal processing and metabolite and isotopologue identification\",\n", " protocol_type=OntologyAnnotation(term=\"metabolite identification\"),\n", " parameters=[\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"ms software\"))\n", " ]\n", " ),\n", " #Protocol #7\n", " Protocol(\n", " name=\"NMR metabolite identification\",\n", " description=\"SOP for NMR signal processing and metabolite and isotopomer identification\",\n", " uri=\"https://doi.org/10.1021/acs.analchem.1c01064\",\n", " protocol_type=OntologyAnnotation(term=\"data transformation\"),\n", " parameters=[\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"nmr software\"))\n", " ]\n", " )\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Now, creating ISA Source and Sample objects and building the ISA study table\n", "\n", "\n", "In this fictional study, we assume the following underlying experimental setup:\n", "\n", "- Human MCF-7 breast cancer cell line will be grown in 2 distinct conditions, namely \"normal concentration of dioxygen for 72 hours\" and \"high concentration of dioxygen for 72 hours\".\n", "- Each cell culture batch will be grown in the presence of 80% [1-13C1]-D-glucose + 20% [U-13C6]-D-glucose tracer molecules\n", "- For each cell culture, 4 samples will be collected for characterisation.\n", "- 3 assay modalities will be used on each of the collected samples, namely:\n", " - isotopologue distribution analysis using LC-MS\n", " - isotopomer analysis using 1D 13C NMR spectrometry with 3 distinct pulse sequences (HSQC, ZQF-TOCSY, HCNA, and HACO-DIPSY)\n", " - metabolite profiling using 1D 13C NMR spectrometry with one pulse sequence (CPMG)\n", "- Data analysis for each data modality will be performed with dedicated data analysis protocols\n", "- Integrative analysis (cutting accross results coming from each assay modality) will be performed using a dedicated workflow." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Let's start by building the ISA Source and ISA Samples reflecting the experimental plan:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "# Creating the ISA Source Materials\n", "study.sources = [Source(name=\"culture-1\"), Source(name=\"culture-2\")]\n", "\n", "characteristic_organism = Characteristic(category=OntologyAnnotation(term=\"Organism\"),\n", " value=OntologyAnnotation(term=\"Homo sapiens\", term_source=ncbitaxon,\n", " term_accession=\"http://purl.obolibrary.org/obo/NCBITaxon_9606\"))\n", "\n", "characteristic_cell = Characteristic(category=OntologyAnnotation(term=\"cell line\"),\n", " value=OntologyAnnotation(term=\"MCF-7\", term_source=\"\",\n", " term_accession=\"\"))\n", "\n", "for i in range(len(study.sources)): \n", " study.sources[i].characteristics.append(characteristic_organism)\n", " study.sources[i].characteristics.append(characteristic_cell)\n", "\n", "\n", "# Note how the treatment groups are defined as sets of factor values attached to the ISA.Sample object\n", "treatment_1 = [fv1,fv2,fv4]\n", "treatment_2 = [fv1,fv3,fv4]\n", "\n", "\n", "# Ensuring the Tracer Molecule(s) used for the SIRM study is properly reported\n", "tracer_mol_C = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"tracer molecule\")),\n", " value=OntologyAnnotation(term=\"80% [1-13C1]-D-glucose + 20% [U-13C6]-D-glucose\"))\n", "\n", "\n", "tracers = [tracer_mol_C]\n", "\n", "# the number of samples collected from each culture condition\n", "replicates = 4\n", "# Now creating a Process showing a `Protocol Application` using Source as input and producing Sample as output.\n", "\n", "for k in range(replicates):\n", " study.samples.append(Sample(name=(study.sources[0].name + \"-sample-\" + str(k)), factor_values=treatment_1))\n", " \n", " study.samples.append(Sample(name=(study.sources[1].name + \"-sample-\" + str(k)), factor_values=treatment_2))\n", "\n", "\n", "study.process_sequence.append(Process(executes_protocol=study.protocols[0], # a sample collection\n", " inputs=[study.sources[0]],\n", " outputs=[study.samples[0],study.samples[2],study.samples[4],study.samples[6]],\n", " parameter_values= [tracer_mol_C]))\n", "\n", "study.process_sequence.append(Process(executes_protocol=study.protocols[0], # a sample collection\n", " inputs=[study.sources[1]],\n", " outputs=[study.samples[1],study.samples[3],study.samples[5],study.samples[7]],\n", " parameter_values= [tracer_mol_C]))\n", " \n", "# Now appending the ISA Study object to the ISA Investigation object \n", "investigation.studies = [study]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Now, creating the ISA objects needed to represent assays and raw data acquisition." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Let's start by declaring the 3 modalities as 3 ISA Assays." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "# Starting by declaring the 2 types of assays used in BII-S-3 as coded with ISAcreator tool\n", "assay = Assay(filename=\"a_isotopologue-ms-assay.txt\")\n", "assay.measurement_type = OntologyAnnotation(term=\"isotopologue distribution analysis\",term_accession=\"http://purl.obolibrary.org/obo/msio.owl#mass_isotopologue_distribution_analysis\", term_source=msio)\n", "assay.technology_type = OntologyAnnotation(term=\"mass spectrometry\", term_accession=\"http://purl.obolibrary.org/obo/CHMO_0000470\", term_source=msio)\n", "\n", "\n", "assay_nmr_topo = Assay(filename=\"a_isotopomer-nmr-assay.txt\")\n", "assay_nmr_topo.measurement_type = OntologyAnnotation(term=\"isotopomer analysis\",term_accession=\"http://purl.obolibrary.org/obo/msio.owl#isotopomer_analysis\", term_source=msio)\n", "assay_nmr_topo.technology_type = OntologyAnnotation(term=\"NMR spectroscopy\",term_accession=\"http://purl.obolibrary.org/obo/CHMO_0000591\", term_source=msio)\n", "\n", "assay_nmr_metpro = Assay(filename=\"a_metabolite-profiling-nmr-assay.txt\")\n", "assay_nmr_metpro.measurement_type = OntologyAnnotation(term=\"untargeted metabolite profiling\",term_accession=\"http://purl.obolibrary.org/obo/MSIO_0000101\", term_source=msio)\n", "assay_nmr_metpro.technology_type = OntologyAnnotation(term=\"NMR spectroscopy\",term_accession=\"http://purl.obolibrary.org/obo/CHMO_0000591\", term_source=msio)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Warning**\n", "\n", "- The current release of ISA-API throws an error if Assay `technology type` OntologyAnnotation.term is left empty\n", "- The coming release 10.13 will address this issue.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The mass isotopologue distribution analysis assay using MS acquisitions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i, sample in enumerate(study.samples):\n", " \n", " # create an extraction process that executes the extraction protocol\n", "\n", " extraction_process = Process(executes_protocol=study.protocols[1])\n", "\n", " # extraction process takes as input a sample, and produces an extract material as output\n", " \n", " char_ext = Characteristic(category=OntologyAnnotation(term=\"Material Type\"),\n", " value=OntologyAnnotation(term=\"pellet\"))\n", " \n", " char_ext1 = Characteristic(category=OntologyAnnotation(term=\"quantity\"),\n", " value=40, unit=OntologyAnnotation(term=\"mg\"))\n", "\n", " extraction_process.inputs.append(sample)\n", " ms_material = Material(name=\"extract-ms-{}\".format(i))\n", " ms_material.type = \"Extract Name\"\n", " ms_material.characteristics.append(char_ext)\n", " ms_material.characteristics.append(char_ext1)\n", " extraction_process.outputs.append(ms_material)\n", "\n", " # create a ms acquisition process that executes the ms acquisition protocol\n", " column = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"chromatography column\")),\n", " value=OntologyAnnotation(term=\"Agilent C18 TTX\"))\n", " ms_inst = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"mass spectrometry instrument\")),\n", " value=OntologyAnnotation(term=\"Agilent QTOF XL\"))\n", " ms_anlzr = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"mass analyzer\")),\n", " value=OntologyAnnotation(term=\"Agilent MassDiscovery\"))\n", " \n", " isotopologue_process = Process(executes_protocol=study.protocols[3], parameter_values=[column, ms_inst, ms_anlzr] )\n", " isotopologue_process.name = \"assay-name-ms-{}\".format(i)\n", " isotopologue_process.inputs.append(extraction_process.outputs[0])\n", "\n", "\n", " # ms acquisition process usually has an output mzml data file\n", "\n", " datafile = DataFile(filename=\"ms-data-{}.mzml\".format(i), label=\"Spectral Raw Data File\")\n", " data_comment = Comment(name=\"data_comment\",value=\"data_value\")\n", " datafile.comments.append(data_comment)\n", " isotopologue_process.outputs.append(datafile)\n", "\n", " # Ensure Processes are linked forward and backward. plink(from_process, to_process) is a function to set\n", " # these links for you. It is found in the isatools.model package\n", " \n", " assay.samples.append(sample)\n", " assay.other_material.append(ms_material)\n", " assay.data_files.append(datafile)\n", " \n", " assay.process_sequence.append(extraction_process)\n", " assay.process_sequence.append(isotopologue_process)\n", " \n", " ms_sw = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"ms software\")),\n", " value=OntologyAnnotation(term=\"IsoSolve\"))\n", " ms_da_process = Process(executes_protocol=study.protocols[6], parameter_values=[ms_sw])\n", " ms_da_process.name = \"MS-DT-ident\"\n", " ms_da_process.inputs.append(datafile)\n", " ms_da_process.outputs.append(DataFile(filename=\"isotopologue-distribution-analysis.txt\", label=\"Derived Data File\"))\n", " \n", " assay.process_sequence.append(ms_da_process)\n", " # create an extraction process that executes the extraction protocol\n", "\n", " # plink(aliquoting_process, sequencing_process)\n", " plink(extraction_process, isotopologue_process)\n", " plink(isotopologue_process, ms_da_process)\n", " # make sure the extract, data file, and the processes are attached to the assay" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**NOTE**\n", "make sure to used `ISA API plink function` to connects the protocols in a chain." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The mass isotopomer distribution analysis assay using 1D 13C NMR acquisitions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "for i, sample in enumerate(study.samples):\n", " \n", " extraction_process_nmr = Process(executes_protocol=study.protocols[1])\n", "\n", " # extraction process takes as input a sample, and produces an extract material as output\n", " extraction_process_nmr.inputs.append(sample)\n", " material_nmr = Material(name=\"extract-nmr-topo-{}\".format(i))\n", " material_nmr.type = \"Extract Name\"\n", " extraction_process_nmr.outputs.append(material_nmr) \n", " \n", " # create a nmr acquisition process that executes the nmr protocol\n", " magnet = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"magnetic field strength\")),\n", " value=6, unit=OntologyAnnotation(term=\"Tesla\"))\n", " tube = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"nmr tube\")),\n", " value=OntologyAnnotation(term=\"Brucker 14 mm Oscar\"))\n", " pulse_a = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"pulse sequence\")),\n", " value=OntologyAnnotation(term=\"HSQC\"))\n", " pulse_b = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"pulse sequence\")),\n", " value=OntologyAnnotation(term=\"ZQF-TOCSY\"))\n", " pulse_c = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"pulse sequence\")),\n", " value=OntologyAnnotation(term=\"HNCA\"))\n", " pulse_d = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"pulse sequence\")),\n", " value=OntologyAnnotation(term=\"HACO-DIPSY\"))\n", " \n", " pulses=[pulse_a,pulse_b,pulse_c,pulse_d]\n", " \n", " for j in range(len(pulses)):\n", " \n", " isotopomer_process = Process(executes_protocol=study.protocols[4],parameter_values=[magnet,tube,pulses[j]])\n", " isotopomer_process.name = \"assay-name-nmr-topo-\"+ pulses[j].value.term +\"-{}\".format(i+1)\n", " isotopomer_process.inputs.append(extraction_process_nmr.outputs[0])\n", "\n", " # Sequencing process usually has an output data file\n", "\n", " datafile_nmr = DataFile(filename=\"nmr-data-topo\"+ pulses[j].value.term +\"-{}.nmrml\".format(i+1), label=\"Free Induction Decay File\")\n", " isotopomer_process.outputs.append(datafile_nmr)\n", "\n", " # Ensure Processes are linked forward and backward. plink(from_process, to_process) is a function to set\n", " # these links for you. It is found in the isatools.model package\n", "\n", " assay_nmr_topo.samples.append(sample)\n", " assay_nmr_topo.other_material.append(material_nmr)\n", " assay_nmr_topo.data_files.append(datafile_nmr)\n", "\n", " assay_nmr_topo.process_sequence.append(extraction_process_nmr)\n", " assay_nmr_topo.process_sequence.append(isotopomer_process) \n", "\n", " nmr_sw = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"nmr software\")),\n", " value=OntologyAnnotation(term=\"https://pypi.org/project/IsoSolve\"))\n", " nmr_topo_da_process = Process(executes_protocol=study.protocols[7], parameter_values=[nmr_sw])\n", " nmr_topo_da_process.name = \"NMR-TOPO-DT-ident\"\n", " nmr_topo_da_process.inputs.append(datafile)\n", " nmr_topo_da_process.outputs.append(DataFile(filename=\"isotopomer-analysis.txt\", label=\"Derived Data File\"))\n", "\n", " plink(extraction_process_nmr, isotopomer_process)\n", " plink(isotopomer_process, nmr_topo_da_process)\n", " # make sure the extract, data file, and the processes are attached to the assay\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Conventional Metabolite Profiling using dedicated 1D 13C NMR acquisitions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "for i, sample in enumerate(study.samples):\n", " extraction_process_nmr_metpro = Process(executes_protocol=study.protocols[1])\n", "\n", " # extraction process takes as input a sample, and produces an extract material as output\n", " extraction_process_nmr_metpro.inputs.append(sample)\n", " material_nmr_metpro = Material(name=\"extract-nmr-metpro-{}\".format(i))\n", " material_nmr_metpro.type = \"Extract Name\"\n", " extraction_process_nmr_metpro.outputs.append(material_nmr_metpro)\n", " \n", " # create a nmr acquisition process that executes the nmr protocol\n", " magnet = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"magnetic field strength\")),\n", " value=6, unit=OntologyAnnotation(term=\"Tesla\"))\n", " tube = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"nmr tube\")),\n", " value=OntologyAnnotation(term=\"Brucker 14 mm Oscar\"))\n", " pulse_a = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"pulse sequence\")),\n", " value=OntologyAnnotation(term=\"CPMG\"))\n", " \n", " pulses=[pulse_a]\n", " \n", " for j in range(len(pulses)):\n", " \n", " metpro_process = Process(executes_protocol=study.protocols[5],parameter_values=[magnet,tube,pulses[j]])\n", " metpro_process.name = \"assay-name-nmr-metpro-\"+ pulses[j].value.term +\"-{}\".format(i+1)\n", " metpro_process.inputs.append(extraction_process_nmr_metpro.outputs[0])\n", "\n", " # a Data acquisition process usually has an output data file\n", "\n", " datafile_nmr_metpro = DataFile(filename=\"nmr-data-metpro\"+ pulses[j].value.term +\"-{}.nmrml\".format(i+1), label=\"Free Induction Decay File\")\n", " metpro_process.outputs.append(datafile_nmr)\n", "\n", " # Ensure Processes are linked forward and backward. plink(from_process, to_process) is a function to set\n", " # these links for you. It is found in the isatools.model package\n", "\n", " assay_nmr_metpro.samples.append(sample)\n", " assay_nmr_metpro.other_material.append(material_nmr_metpro)\n", " assay_nmr_metpro.data_files.append(datafile_nmr_metpro)\n", "\n", " assay_nmr_metpro.process_sequence.append(extraction_process_nmr_metpro)\n", " assay_nmr_metpro.process_sequence.append(metpro_process) \n", "\n", " nmr_sw = ParameterValue(category=ProtocolParameter(parameter_name=OntologyAnnotation(term=\"nmr software\")),\n", " value=OntologyAnnotation(term=\"Batman\"))\n", " nmr_da_process = Process(executes_protocol=study.protocols[7], parameter_values=[nmr_sw])\n", " nmr_da_process.name = \"NMR-metpro-DT-ident\"\n", " nmr_da_process.inputs.append(datafile_nmr_metpro)\n", " nmr_da_process.outputs.append(DataFile(filename=\"metpro-analysis.txt\", label=\"Derived Data File\"))\n", "\n", " plink(extraction_process_nmr_metpro, metpro_process)\n", " plink(metpro_process, nmr_da_process)\n", " # make sure the extract, data file, and the processes are attached to the assay\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adding all ISA Assays declarations to the ISA Study object" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "study.assays.append(assay)\n", "study.assays.append(assay_nmr_topo)\n", "study.assays.append(assay_nmr_metpro)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reporting a cross-technique integrative analysis by referencing a workflow (e.g. snakemake, galaxy) with an ISA protocol and using ISA.Protocol.uri attribute to do so." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Protocol #*\n", "workflow_ref =Protocol(\n", " name=\"13C SIRM MS and NMR integrative analysis\",\n", " description=\"a workflow for integrating data from NMR and MS acquisition into a consolidated result\",\n", " uri=\"https://doi.org/10.1021/acs.analchem.1c01064\",\n", " protocol_type=OntologyAnnotation(term=\"data transformation\"),\n", " parameters=[\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"software\"))\n", " ])\n", "study.protocols.append(workflow_ref)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Serializing (writing) the ISA object representation to file with the ISA-API `dump` function" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "from isatools.isatab import dump\n", "\n", "# note the use of the flag for explicit serialization on factor values on assay tables\n", "dump(investigation, \"./output/MTBLS-XXXX-SIRM/\", write_factor_values_in_assay_table=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Validating the ISA object representation with the ISA `validate` function" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from isatools import isatab\n", "\n", "my_json_report_isa_flux = isatab.validate(open(os.path.join(\"./output/MTBLS-XXXX-SIRM/\",\"i_investigation.txt\")))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_json_report_isa_flux[\"errors\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NOTE: The error report indicates the need to add new configurations files matching the assay definitions.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reading the ISA document from disk back in, loading it into memory and writing to disk again to check that the ISA-API load function works nominally" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "from isatools.isatab import load\n", "with open(os.path.join(\"./output/MTBLS-XXXX-SIRM/\",\"i_investigation.txt\")) as isa_sirm_test:\n", " roundtrip = load(isa_sirm_test)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "from isatools.convert import isatab2json\n", "from isatools import isajson\n", "import json\n", "isa_json = isatab2json.convert('./output/MTBLS-XXXX-SIRM/', validate_first=False, use_new_parser=True)\n", "\n", "isa_j = json.dumps(\n", " isa_json, cls=isajson.ISAJSONEncoder, sort_keys=True, indent=4, separators=(',', ': ')\n", " )\n", "\n", "\n", "with open(os.path.join('./output/MTBLS-XXXX-SIRM/', 'isa-sirm-test.json'), 'w') as out_fp:\n", " out_fp.write(isa_j)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# note the use of the flag for explicit serialization on factor values on assay tables\n", "# dump(roundtrip, \"./notebook-output/MTBLS-0000-SIRM-roundtrip/\", write_factor_values_in_assay_table=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## About this notebook\n", "\n", "- authors: Philippe Rocca-Serra (philippe.rocca-serra@oerc.ox.ac.uk)\n", "- license: CC-BY 4.0\n", "- support: isatools@googlegroups.com\n", "- issue tracker: https://github.com/ISA-tools/isa-api/issues" ] } ], "metadata": { "kernelspec": { "display_name": "isa-api-py39", "language": "python", "name": "isa-api-py39" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 4 }