{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## How To Pool and Merge Nodes (Material or Data) with ISA-API\n", "\n", "- author: https://orcid.org/0000-0001-9853-5668\n", "- email: philippe.rocca-serra@oerc.ox.ac.uk\n", "- license: CC-BY-4.0\n", "- createdOn: 2021-04-27" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This example shows how to use the process sequence (`process_sequence`) to build an ISA graph with node merging (pooling) events.\n", "The notebook shows two examples:\n", "- pooling samples, as in the case of using two Source materials to create one pooled Sample (for example, pooling soil samples)\n", "- pooling data, as in the case of a data normalization event (a data transformation) acting on 2 raw data files\n", "\n", "The notebook also shows how to serialize (write) the ISA Model content to the ISA-Tab and ISA-JSON formats." ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [], "source": [ "from isatools.model import *\n", "from isatools.create.model import *\n", "import datetime" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating basic ISA objects: Investigation, Study, Protocols" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [], "source": [ "# creating an ISA.Investigation object\n", "investigation = Investigation()\n", "\n", "# creating an ISA.Study object\n", "study = Study(filename=\"s_study.txt\")\n", "study.identifier = \"S1\"\n", "study.title = \"ISA Study example: creating sample pools\"\n", "study.description = \"a jupytern notebook showing how to create pooled samples (a node merging event with material nodes)\"\n", "\n", "# creating the necessary ISA.Protocol objects\n", "study.protocols = [Protocol(name=\"sample collection\",protocol_type=\"pooling\"),\n", " Protocol(name=\"intracellular fraction extraction\",\n", " protocol_type=OntologyAnnotation(term=\"extraction\"),\n", " 
parameters=[ProtocolParameter(parameter_name=OntologyAnnotation(term=\"concentration\")),\n", " ProtocolParameter(parameter_name=OntologyAnnotation(term=\"sample QC\"))]),\n", " Protocol(name=\"data collection\",\n", " protocol_type=OntologyAnnotation(term=\"data acquisition\")),\n", " Protocol(name=\"data transformation\",\n", " protocol_type=OntologyAnnotation(term=\"data normalization\"))\n", " ]" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [], "source": [ "# creating 4 ISA.Source objects\n", "study.sources = [Source(name=\"source1\"),Source(name=\"source2\"),Source(name=\"source3\"),Source(name=\"source4\")]\n", "\n", "# creating 2 ISA.Sample objects\n", "study.samples = [Sample(name=\"sample1\"),Sample(name=\"sample2\")]\n", "\n", "# creating an ISA.Process (a protocol application) pooling source1 and source2 into sample1\n", "study.process_sequence = [Process(executes_protocol=study.protocols[0], inputs=[study.sources[0],study.sources[1]], outputs=[study.samples[0]])]\n", "\n", "\n", "# doing the same again to pool source3 and source4 into sample2\n", "study.process_sequence.append(Process(executes_protocol=study.protocols[0], inputs=[study.sources[2],study.sources[3]], outputs=[study.samples[1]]))\n", "\n", "investigation.studies = [study]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Writing the Study without Assay to ISA-Tab" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/var/folders/5n/rl6lqnks4rqb59pbtpvvntqw0000gr/T/tmp5xyrzxr6/i_investigation.txt\n", "ONTOLOGY SOURCE REFERENCE\n", "Term Source Name\n", "Term Source File\n", "Term Source Version\n", "Term Source Description\n", "INVESTIGATION\n", "Investigation Identifier\t\n", "Investigation Title\t\n", "Investigation Description\t\n", "Investigation Submission Date\t\n", "Investigation Public Release Date\t\n", "INVESTIGATION PUBLICATIONS\n", "Investigation PubMed 
ID\n", "Investigation Publication DOI\n", "Investigation Publication Author List\n", "Investigation Publication Title\n", "Investigation Publication Status\n", "Investigation Publication Status Term Accession Number\n", "Investigation Publication Status Term Source REF\n", "INVESTIGATION CONTACTS\n", "Investigation Person Last Name\n", "Investigation Person First Name\n", "Investigation Person Mid Initials\n", "Investigation Person Email\n", "Investigation Person Phone\n", "Investigation Person Fax\n", "Investigation Person Address\n", "Investigation Person Affiliation\n", "Investigation Person Roles\n", "Investigation Person Roles Term Accession Number\n", "Investigation Person Roles Term Source REF\n", "STUDY\n", "Study Identifier\tS1\n", "Study Title\tISA Study example: creating sample pools\n", "Study Description\ta jupytern notebook showing how to create pooled samples (a node merging event with material nodes)\n", "Study Submission Date\t\n", "Study Public Release Date\t\n", "Study File Name\ts_study.txt\n", "STUDY DESIGN DESCRIPTORS\n", "Study Design Type\n", "Study Design Type Term Accession Number\n", "Study Design Type Term Source REF\n", "STUDY PUBLICATIONS\n", "Study PubMed ID\n", "Study Publication DOI\n", "Study Publication Author List\n", "Study Publication Title\n", "Study Publication Status\n", "Study Publication Status Term Accession Number\n", "Study Publication Status Term Source REF\n", "STUDY FACTORS\n", "Study Factor Name\n", "Study Factor Type\n", "Study Factor Type Term Accession Number\n", "Study Factor Type Term Source REF\n", "STUDY ASSAYS\n", "Study Assay File Name\n", "Study Assay Measurement Type\n", "Study Assay Measurement Type Term Accession Number\n", "Study Assay Measurement Type Term Source REF\n", "Study Assay Technology Type\n", "Study Assay Technology Type Term Accession Number\n", "Study Assay Technology Type Term Source REF\n", "Study Assay Technology Platform\n", "STUDY PROTOCOLS\n", "Study Protocol Name\tsample 
collection\tintracellular fraction extraction\tdata collection\tdata transformation\n", "Study Protocol Type\tpooling\textraction\tdata acquisition\tdata normalization\n", "Study Protocol Type Term Accession Number\t\t\t\t\n", "Study Protocol Type Term Source REF\t\t\t\t\n", "Study Protocol Description\t\t\t\t\n", "Study Protocol URI\t\t\t\t\n", "Study Protocol Version\t\t\t\t\n", "Study Protocol Parameters Name\t\tconcentration;sample QC\t\t\n", "Study Protocol Parameters Name Term Accession Number\t\t;\t\t\n", "Study Protocol Parameters Name Term Source REF\t\t;\t\t\n", "Study Protocol Components Name\t\t\t\t\n", "Study Protocol Components Type\t\t\t\t\n", "Study Protocol Components Type Term Accession Number\t\t\t\t\n", "Study Protocol Components Type Term Source REF\t\t\t\t\n", "STUDY CONTACTS\n", "Study Person Last Name\n", "Study Person First Name\n", "Study Person Mid Initials\n", "Study Person Email\n", "Study Person Phone\n", "Study Person Fax\n", "Study Person Address\n", "Study Person Affiliation\n", "Study Person Roles\n", "Study Person Roles Term Accession Number\n", "Study Person Roles Term Source REF\n", "--------\n", "/var/folders/5n/rl6lqnks4rqb59pbtpvvntqw0000gr/T/tmp5xyrzxr6/s_study.txt\n", "Source Name\tProtocol REF\tSample Name\n", "source1\tsample collection\tsample1\n", "source2\tsample collection\tsample1\n", "source3\tsample collection\tsample2\n", "source4\tsample collection\tsample2\n", "\n" ] } ], "source": [ "# let's check how this looks in ISA-Tab\n", "from isatools.isatab import dumps\n", "print(dumps(investigation))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Writing the ISA Study object to ISA-JSON" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " \"comments\": [],\n", " \"description\": \"\",\n", " \"identifier\": \"\",\n", " \"ontologySourceReferences\": [],\n", " \"people\": [],\n", " \"publicReleaseDate\": 
\"\",\n", " \"publications\": [],\n", " \"studies\": [\n", " {\n", " \"assays\": [],\n", " \"characteristicCategories\": [],\n", " \"comments\": [],\n", " \"description\": \"a jupytern notebook showing how to create pooled samples (a node merging event with material nodes)\",\n", " \"factors\": [],\n", " \"filename\": \"s_study.txt\",\n", " \"identifier\": \"S1\",\n", " \"materials\": {\n", " \"otherMaterials\": [],\n", " \"samples\": [\n", " {\n", " \"@id\": \"#sample/5116543904\",\n", " \"characteristics\": [],\n", " \"factorValues\": [],\n", " \"name\": \"sample1\"\n", " },\n", " {\n", " \"@id\": \"#sample/5116545584\",\n", " \"characteristics\": [],\n", " \"factorValues\": [],\n", " \"name\": \"sample2\"\n", " }\n", " ],\n", " \"sources\": [\n", " {\n", " \"@id\": \"#source/5116545344\",\n", " \"characteristics\": [],\n", " \"name\": \"source1\"\n", " },\n", " {\n", " \"@id\": \"#source/5116545824\",\n", " \"characteristics\": [],\n", " \"name\": \"source2\"\n", " },\n", " {\n", " \"@id\": \"#source/5119072912\",\n", " \"characteristics\": [],\n", " \"name\": \"source3\"\n", " },\n", " {\n", " \"@id\": \"#source/5119071040\",\n", " \"characteristics\": [],\n", " \"name\": \"source4\"\n", " }\n", " ]\n", " },\n", " \"people\": [],\n", " \"processSequence\": [\n", " {\n", " \"@id\": \"#process/5119073488\",\n", " \"comments\": [],\n", " \"date\": \"\",\n", " \"executesProtocol\": {\n", " \"@id\": \"#5109903952\"\n", " },\n", " \"inputs\": [\n", " {\n", " \"@id\": \"#source/5116545344\"\n", " },\n", " {\n", " \"@id\": \"#source/5116545824\"\n", " }\n", " ],\n", " \"name\": \"\",\n", " \"outputs\": [\n", " {\n", " \"@id\": \"#sample/5116543904\"\n", " }\n", " ],\n", " \"parameterValues\": [],\n", " \"performer\": \"\"\n", " },\n", " {\n", " \"@id\": \"#process/5119071088\",\n", " \"comments\": [],\n", " \"date\": \"\",\n", " \"executesProtocol\": {\n", " \"@id\": \"#5109903952\"\n", " },\n", " \"inputs\": [\n", " {\n", " \"@id\": \"#source/5119072912\"\n", " },\n", 
" {\n", " \"@id\": \"#source/5119071040\"\n", " }\n", " ],\n", " \"name\": \"\",\n", " \"outputs\": [\n", " {\n", " \"@id\": \"#sample/5116545584\"\n", " }\n", " ],\n", " \"parameterValues\": [],\n", " \"performer\": \"\"\n", " }\n", " ],\n", " \"protocols\": [\n", " {\n", " \"@id\": \"#5109903952\",\n", " \"comments\": [],\n", " \"components\": [],\n", " \"description\": \"\",\n", " \"name\": \"sample collection\",\n", " \"parameters\": [],\n", " \"protocolType\": {\n", " \"@id\": \"#87632ca1-1109-4f42-bdc9-fd56c30cead8\",\n", " \"annotationValue\": \"pooling\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " },\n", " \"uri\": \"\",\n", " \"version\": \"\"\n", " },\n", " {\n", " \"@id\": \"#5118472000\",\n", " \"comments\": [],\n", " \"components\": [],\n", " \"description\": \"\",\n", " \"name\": \"intracellular fraction extraction\",\n", " \"parameters\": [\n", " {\n", " \"@id\": \"#5119051952\",\n", " \"parameterName\": {\n", " \"@id\": \"#590aaeb5-2e11-4aff-a2c9-9b34bf676249\",\n", " \"annotationValue\": \"concentration\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " }\n", " },\n", " {\n", " \"@id\": \"#5119050704\",\n", " \"parameterName\": {\n", " \"@id\": \"#33dfce38-e133-4bc0-9666-9a81bfb39362\",\n", " \"annotationValue\": \"sample QC\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " }\n", " }\n", " ],\n", " \"protocolType\": {\n", " \"@id\": \"#9a41779d-eb92-467d-88ac-d1595fe189de\",\n", " \"annotationValue\": \"extraction\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " },\n", " \"uri\": \"\",\n", " \"version\": \"\"\n", " },\n", " {\n", " \"@id\": \"#5118472096\",\n", " \"comments\": [],\n", " \"components\": [],\n", " \"description\": \"\",\n", " \"name\": \"data collection\",\n", " \"parameters\": [],\n", " \"protocolType\": {\n", " \"@id\": \"#26e7d865-e6be-454b-bd91-a8ee33a3cbcd\",\n", 
" \"annotationValue\": \"data acquisition\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " },\n", " \"uri\": \"\",\n", " \"version\": \"\"\n", " },\n", " {\n", " \"@id\": \"#5116464528\",\n", " \"comments\": [],\n", " \"components\": [],\n", " \"description\": \"\",\n", " \"name\": \"data transformation\",\n", " \"parameters\": [],\n", " \"protocolType\": {\n", " \"@id\": \"#e3ed694a-215c-45f5-bb09-cf4c2a6154e7\",\n", " \"annotationValue\": \"data normalization\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " },\n", " \"uri\": \"\",\n", " \"version\": \"\"\n", " }\n", " ],\n", " \"publicReleaseDate\": \"\",\n", " \"publications\": [],\n", " \"studyDesignDescriptors\": [],\n", " \"submissionDate\": \"\",\n", " \"title\": \"ISA Study example: creating sample pools\",\n", " \"unitCategories\": []\n", " }\n", " ],\n", " \"submissionDate\": \"\",\n", " \"title\": \"\"\n", "}\n" ] } ], "source": [ "import json\n", "from isatools.isajson import ISAJSONEncoder\n", "print(json.dumps(investigation, cls=ISAJSONEncoder, sort_keys=True, indent=4, separators=(',', ': ')))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating ISA Assays, Data Acquisition and Data Transformation events\n", "\n", "Let's now augment the ISA.Study by adding an Assay table where\n", "- `raw data` will be collected *independently* on each of the samples created in the previous.\n", "- `derived data` resulting from a data transformation acting on the raw data (node merging)" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [], "source": [ "\n", "# This creates intermediate ISA.Materials (Extracts) from Samples. 
\n", "# The extracts will be used as input to the next protocol application\n", "extraction_process1 = Process(executes_protocol=study.protocols[1])\n", "extraction_process1.inputs.append(study.samples[0])\n", "\n", "material1 = Material(name=\"extract-1\")\n", "material1.type = \"Extract Name\"\n", "\n", "extraction_process2 = Process(executes_protocol=study.protocols[1])\n", "extraction_process2.inputs.append(study.samples[1])\n", "\n", "material2 = Material(name=\"extract-2\")\n", "material2.type = \"Extract Name\"\n", "\n", "extraction_process1.outputs=[material1]\n", "extraction_process2.outputs=[material2]\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data Acquisition Events" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [], "source": [ "metprof_assay = Assay(measurement_type=OntologyAnnotation(term=\"metabolite profiling\"),\n", " technology_type=OntologyAnnotation(term=\"mass spectrometry\"),filename=\"a_mp_by_ms.txt\")\n", "\n", "metprof_assay.samples.append(study.samples[0])\n", "metprof_assay.samples.append(study.samples[1])\n", "\n", "# metprof_assay.data_files.append(DataFile(filename=\"sequenced-data-1\", label=\"Raw Data File\"))\n", "\n", "datafile1=DataFile(filename=\"file-1\",label=\"Spectral Raw Data File\")\n", "datafile2=DataFile(filename=\"file-2\",label=\"Spectral Raw Data File\")\n", "metprof_assay.data_files.append(datafile1)\n", "metprof_assay.data_files.append(datafile2)\n", "\n", "metprof_assay.other_material.append(material1)\n", "metprof_assay.other_material.append(material2)\n", " \n", "metprof_assay.process_sequence.append(extraction_process1)\n", "metprof_assay.process_sequence.append(extraction_process2)\n", "\n", "da_process1 = Process(executes_protocol=study.protocols[2],inputs=[material1], outputs=[datafile1], date_=\"2021-03-30\", performer=\"Bob Louis\")\n", "da_process1.name = \"assay-name-test-1\"\n", "da_process2 = 
Process(executes_protocol=study.protocols[2],inputs=[material2], outputs=[datafile2], date_=\"2021-04-10\", performer=\"Yu Wong\")\n", "da_process2.name = \"assay-name-test-2\"\n", " \n", "\n", "metprof_assay.process_sequence.append(da_process1)\n", "metprof_assay.process_sequence.append(da_process2)\n", "\n", "# IMPORTANT: explicitly set the linking/sequence between processes\n", "# NOTE: one-to-one mapping between protocol applications\n", "plink(extraction_process1, da_process1)\n", "plink(extraction_process2, da_process2)\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data Transformation Event acting on 2 ISA Data Nodes and resulting in 1 ISA Data Node." ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [], "source": [ "datafile3 = DataFile(filename=\"analysis-output1.txt\", label=\"Derived Spectral Data File\")\n", "dt_process1 = Process(executes_protocol=study.protocols[3], inputs=[datafile1,datafile2],outputs=[datafile3], date_=\"2021-04-25\", performer=\"Data Science Officer\")\n", "\n", "dt_process1.name = \"data transformation 1\"\n", "\n", "metprof_assay.process_sequence.append(dt_process1)\n", "\n", "# IMPORTANT: explicitly set the linking/sequence between processes\n", "# NOTE: many-to-one mapping between protocol applications ~ pooling/merging event\n", "plink(da_process1,dt_process1)\n", "plink(da_process2,dt_process1)\n", "\n", "\n", "study.assays.append(metprof_assay)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Writing the full ISA Study complete with Assay to ISA-Tab" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/var/folders/5n/rl6lqnks4rqb59pbtpvvntqw0000gr/T/tmp3hybi37n/i_investigation.txt\n", "ONTOLOGY SOURCE REFERENCE\n", "Term Source Name\n", "Term Source File\n", "Term Source Version\n", "Term Source Description\n", "INVESTIGATION\n", "Investigation Identifier\t\n", 
"Investigation Title\t\n", "Investigation Description\t\n", "Investigation Submission Date\t\n", "Investigation Public Release Date\t\n", "INVESTIGATION PUBLICATIONS\n", "Investigation PubMed ID\n", "Investigation Publication DOI\n", "Investigation Publication Author List\n", "Investigation Publication Title\n", "Investigation Publication Status\n", "Investigation Publication Status Term Accession Number\n", "Investigation Publication Status Term Source REF\n", "INVESTIGATION CONTACTS\n", "Investigation Person Last Name\n", "Investigation Person First Name\n", "Investigation Person Mid Initials\n", "Investigation Person Email\n", "Investigation Person Phone\n", "Investigation Person Fax\n", "Investigation Person Address\n", "Investigation Person Affiliation\n", "Investigation Person Roles\n", "Investigation Person Roles Term Accession Number\n", "Investigation Person Roles Term Source REF\n", "STUDY\n", "Study Identifier\tS1\n", "Study Title\tISA Study example: creating sample pools\n", "Study Description\ta jupytern notebook showing how to create pooled samples (a node merging event with material nodes)\n", "Study Submission Date\t\n", "Study Public Release Date\t\n", "Study File Name\ts_study.txt\n", "STUDY DESIGN DESCRIPTORS\n", "Study Design Type\n", "Study Design Type Term Accession Number\n", "Study Design Type Term Source REF\n", "STUDY PUBLICATIONS\n", "Study PubMed ID\n", "Study Publication DOI\n", "Study Publication Author List\n", "Study Publication Title\n", "Study Publication Status\n", "Study Publication Status Term Accession Number\n", "Study Publication Status Term Source REF\n", "STUDY FACTORS\n", "Study Factor Name\n", "Study Factor Type\n", "Study Factor Type Term Accession Number\n", "Study Factor Type Term Source REF\n", "STUDY ASSAYS\n", "Study Assay File Name\ta_mp_by_ms.txt\n", "Study Assay Measurement Type\tmetabolite profiling\n", "Study Assay Measurement Type Term Accession Number\t\n", "Study Assay Measurement Type Term Source REF\t\n", 
"Study Assay Technology Type\tmass spectrometry\n", "Study Assay Technology Type Term Accession Number\t\n", "Study Assay Technology Type Term Source REF\t\n", "Study Assay Technology Platform\t\n", "STUDY PROTOCOLS\n", "Study Protocol Name\tsample collection\tintracellular fraction extraction\tdata collection\tdata transformation\n", "Study Protocol Type\tpooling\textraction\tdata acquisition\tdata normalization\n", "Study Protocol Type Term Accession Number\t\t\t\t\n", "Study Protocol Type Term Source REF\t\t\t\t\n", "Study Protocol Description\t\t\t\t\n", "Study Protocol URI\t\t\t\t\n", "Study Protocol Version\t\t\t\t\n", "Study Protocol Parameters Name\t\tconcentration;sample QC\t\t\n", "Study Protocol Parameters Name Term Accession Number\t\t;\t\t\n", "Study Protocol Parameters Name Term Source REF\t\t;\t\t\n", "Study Protocol Components Name\t\t\t\t\n", "Study Protocol Components Type\t\t\t\t\n", "Study Protocol Components Type Term Accession Number\t\t\t\t\n", "Study Protocol Components Type Term Source REF\t\t\t\t\n", "STUDY CONTACTS\n", "Study Person Last Name\n", "Study Person First Name\n", "Study Person Mid Initials\n", "Study Person Email\n", "Study Person Phone\n", "Study Person Fax\n", "Study Person Address\n", "Study Person Affiliation\n", "Study Person Roles\n", "Study Person Roles Term Accession Number\n", "Study Person Roles Term Source REF\n", "--------\n", "/var/folders/5n/rl6lqnks4rqb59pbtpvvntqw0000gr/T/tmp3hybi37n/s_study.txt\n", "Source Name\tProtocol REF\tSample Name\n", "source1\tsample collection\tsample1\n", "source2\tsample collection\tsample1\n", "source3\tsample collection\tsample2\n", "source4\tsample collection\tsample2\n", "--------\n", "/var/folders/5n/rl6lqnks4rqb59pbtpvvntqw0000gr/T/tmp3hybi37n/a_mp_by_ms.txt\n", "Sample Name\tProtocol REF\tExtract Name\tProtocol REF\tAssay Name\tDate\tPerformer\tSpectral Raw Data File\tProtocol REF\tDate\tPerformer\tDerived Spectral Data File\n", "sample1\tintracellular fraction 
extraction\textract-1\tdata collection\tassay-name-test-1\t2021-03-30\tBob Louis\tfile-1\tdata transformation\t2021-04-25\tData Science Officer\tanalysis-output1.txt\n", "sample2\tintracellular fraction extraction\textract-2\tdata collection\tassay-name-test-2\t2021-04-10\tYu Wong\tfile-2\tdata transformation\t2021-04-25\tData Science Officer\tanalysis-output1.txt\n", "\n" ] } ], "source": [ "from isatools.isatab import dumps\n", "print(dumps(investigation))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Writing the same ISA Study to ISA-JSON" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " \"comments\": [],\n", " \"description\": \"\",\n", " \"identifier\": \"\",\n", " \"ontologySourceReferences\": [],\n", " \"people\": [],\n", " \"publicReleaseDate\": \"\",\n", " \"publications\": [],\n", " \"studies\": [\n", " {\n", " \"assays\": [\n", " {\n", " \"characteristicCategories\": [],\n", " \"comments\": [],\n", " \"dataFiles\": [\n", " {\n", " \"@id\": \"#data/spectralrawdatafile-5120399872\",\n", " \"comments\": [],\n", " \"name\": \"file-1\",\n", " \"type\": \"Spectral Raw Data File\"\n", " },\n", " {\n", " \"@id\": \"#data/spectralrawdatafile-5119233712\",\n", " \"comments\": [],\n", " \"name\": \"file-2\",\n", " \"type\": \"Spectral Raw Data File\"\n", " }\n", " ],\n", " \"filename\": \"a_mp_by_ms.txt\",\n", " \"materials\": {\n", " \"otherMaterials\": [\n", " {\n", " \"@id\": \"#material/extract-5116526304\",\n", " \"characteristics\": [],\n", " \"name\": \"extract-1\",\n", " \"type\": \"Extract Name\"\n", " },\n", " {\n", " \"@id\": \"#material/extract-5116523040\",\n", " \"characteristics\": [],\n", " \"name\": \"extract-2\",\n", " \"type\": \"Extract Name\"\n", " }\n", " ],\n", " \"samples\": [\n", " {\n", " \"@id\": \"#sample/5116543904\",\n", " \"characteristics\": [],\n", " \"factorValues\": [],\n", " \"name\": \"sample1\"\n", " },\n", " 
{\n", " \"@id\": \"#sample/5116545584\",\n", " \"characteristics\": [],\n", " \"factorValues\": [],\n", " \"name\": \"sample2\"\n", " }\n", " ]\n", " },\n", " \"measurementType\": {\n", " \"@id\": \"#08e97f57-79b7-45e1-ba49-fec0232deb22\",\n", " \"annotationValue\": \"metabolite profiling\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " },\n", " \"processSequence\": [\n", " {\n", " \"@id\": \"#process/5116524192\",\n", " \"comments\": [],\n", " \"date\": \"\",\n", " \"executesProtocol\": {\n", " \"@id\": \"#5118472000\"\n", " },\n", " \"inputs\": [\n", " {\n", " \"@id\": \"#sample/5116543904\"\n", " }\n", " ],\n", " \"name\": \"\",\n", " \"nextProcess\": {\n", " \"@id\": \"#process/5119233568\"\n", " },\n", " \"outputs\": [\n", " {\n", " \"@id\": \"#material/extract-5116526304\"\n", " }\n", " ],\n", " \"parameterValues\": [],\n", " \"performer\": \"\"\n", " },\n", " {\n", " \"@id\": \"#process/5116526016\",\n", " \"comments\": [],\n", " \"date\": \"\",\n", " \"executesProtocol\": {\n", " \"@id\": \"#5118472000\"\n", " },\n", " \"inputs\": [\n", " {\n", " \"@id\": \"#sample/5116545584\"\n", " }\n", " ],\n", " \"name\": \"\",\n", " \"nextProcess\": {\n", " \"@id\": \"#process/5119234000\"\n", " },\n", " \"outputs\": [\n", " {\n", " \"@id\": \"#material/extract-5116523040\"\n", " }\n", " ],\n", " \"parameterValues\": [],\n", " \"performer\": \"\"\n", " },\n", " {\n", " \"@id\": \"#process/5119233568\",\n", " \"comments\": [],\n", " \"date\": \"2021-03-30\",\n", " \"executesProtocol\": {\n", " \"@id\": \"#5118472096\"\n", " },\n", " \"inputs\": [\n", " {\n", " \"@id\": \"#material/extract-5116526304\"\n", " }\n", " ],\n", " \"name\": \"assay-name-test-1\",\n", " \"nextProcess\": {\n", " \"@id\": \"#process/5116526112\"\n", " },\n", " \"outputs\": [\n", " {\n", " \"@id\": \"#data/spectralrawdatafile-5120399872\"\n", " }\n", " ],\n", " \"parameterValues\": [],\n", " \"performer\": \"Bob Louis\",\n", " \"previousProcess\": {\n", " 
\"@id\": \"#process/5116524192\"\n", " }\n", " },\n", " {\n", " \"@id\": \"#process/5119234000\",\n", " \"comments\": [],\n", " \"date\": \"2021-04-10\",\n", " \"executesProtocol\": {\n", " \"@id\": \"#5118472096\"\n", " },\n", " \"inputs\": [\n", " {\n", " \"@id\": \"#material/extract-5116523040\"\n", " }\n", " ],\n", " \"name\": \"assay-name-test-2\",\n", " \"nextProcess\": {\n", " \"@id\": \"#process/5116526112\"\n", " },\n", " \"outputs\": [\n", " {\n", " \"@id\": \"#data/spectralrawdatafile-5119233712\"\n", " }\n", " ],\n", " \"parameterValues\": [],\n", " \"performer\": \"Yu Wong\",\n", " \"previousProcess\": {\n", " \"@id\": \"#process/5116526016\"\n", " }\n", " },\n", " {\n", " \"@id\": \"#process/5116526112\",\n", " \"comments\": [],\n", " \"date\": \"2021-04-25\",\n", " \"executesProtocol\": {\n", " \"@id\": \"#5116464528\"\n", " },\n", " \"inputs\": [\n", " {\n", " \"@id\": \"#data/spectralrawdatafile-5120399872\"\n", " },\n", " {\n", " \"@id\": \"#data/spectralrawdatafile-5119233712\"\n", " }\n", " ],\n", " \"name\": \"data transformation 1\",\n", " \"outputs\": [\n", " {\n", " \"@id\": \"#data/derivedspectraldatafile-5116523904\"\n", " }\n", " ],\n", " \"parameterValues\": [],\n", " \"performer\": \"Data Science Officer\",\n", " \"previousProcess\": {\n", " \"@id\": \"#process/5119234000\"\n", " }\n", " }\n", " ],\n", " \"technologyPlatform\": \"\",\n", " \"technologyType\": {\n", " \"@id\": \"#4eb6ea6c-c1a9-4b8c-ae88-a59f328cfcd7\",\n", " \"annotationValue\": \"mass spectrometry\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " },\n", " \"unitCategories\": []\n", " }\n", " ],\n", " \"characteristicCategories\": [],\n", " \"comments\": [],\n", " \"description\": \"a jupytern notebook showing how to create pooled samples (a node merging event with material nodes)\",\n", " \"factors\": [],\n", " \"filename\": \"s_study.txt\",\n", " \"identifier\": \"S1\",\n", " \"materials\": {\n", " \"otherMaterials\": [],\n", " 
\"samples\": [\n", " {\n", " \"@id\": \"#sample/5116543904\",\n", " \"characteristics\": [],\n", " \"factorValues\": [],\n", " \"name\": \"sample1\"\n", " },\n", " {\n", " \"@id\": \"#sample/5116545584\",\n", " \"characteristics\": [],\n", " \"factorValues\": [],\n", " \"name\": \"sample2\"\n", " }\n", " ],\n", " \"sources\": [\n", " {\n", " \"@id\": \"#source/5116545344\",\n", " \"characteristics\": [],\n", " \"name\": \"source1\"\n", " },\n", " {\n", " \"@id\": \"#source/5116545824\",\n", " \"characteristics\": [],\n", " \"name\": \"source2\"\n", " },\n", " {\n", " \"@id\": \"#source/5119072912\",\n", " \"characteristics\": [],\n", " \"name\": \"source3\"\n", " },\n", " {\n", " \"@id\": \"#source/5119071040\",\n", " \"characteristics\": [],\n", " \"name\": \"source4\"\n", " }\n", " ]\n", " },\n", " \"people\": [],\n", " \"processSequence\": [\n", " {\n", " \"@id\": \"#process/5119073488\",\n", " \"comments\": [],\n", " \"date\": \"\",\n", " \"executesProtocol\": {\n", " \"@id\": \"#5109903952\"\n", " },\n", " \"inputs\": [\n", " {\n", " \"@id\": \"#source/5116545344\"\n", " },\n", " {\n", " \"@id\": \"#source/5116545824\"\n", " }\n", " ],\n", " \"name\": \"\",\n", " \"outputs\": [\n", " {\n", " \"@id\": \"#sample/5116543904\"\n", " }\n", " ],\n", " \"parameterValues\": [],\n", " \"performer\": \"\"\n", " },\n", " {\n", " \"@id\": \"#process/5119071088\",\n", " \"comments\": [],\n", " \"date\": \"\",\n", " \"executesProtocol\": {\n", " \"@id\": \"#5109903952\"\n", " },\n", " \"inputs\": [\n", " {\n", " \"@id\": \"#source/5119072912\"\n", " },\n", " {\n", " \"@id\": \"#source/5119071040\"\n", " }\n", " ],\n", " \"name\": \"\",\n", " \"outputs\": [\n", " {\n", " \"@id\": \"#sample/5116545584\"\n", " }\n", " ],\n", " \"parameterValues\": [],\n", " \"performer\": \"\"\n", " }\n", " ],\n", " \"protocols\": [\n", " {\n", " \"@id\": \"#5109903952\",\n", " \"comments\": [],\n", " \"components\": [],\n", " \"description\": \"\",\n", " \"name\": \"sample collection\",\n", " 
\"parameters\": [],\n", " \"protocolType\": {\n", " \"@id\": \"#87632ca1-1109-4f42-bdc9-fd56c30cead8\",\n", " \"annotationValue\": \"pooling\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " },\n", " \"uri\": \"\",\n", " \"version\": \"\"\n", " },\n", " {\n", " \"@id\": \"#5118472000\",\n", " \"comments\": [],\n", " \"components\": [],\n", " \"description\": \"\",\n", " \"name\": \"intracellular fraction extraction\",\n", " \"parameters\": [\n", " {\n", " \"@id\": \"#5119051952\",\n", " \"parameterName\": {\n", " \"@id\": \"#590aaeb5-2e11-4aff-a2c9-9b34bf676249\",\n", " \"annotationValue\": \"concentration\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " }\n", " },\n", " {\n", " \"@id\": \"#5119050704\",\n", " \"parameterName\": {\n", " \"@id\": \"#33dfce38-e133-4bc0-9666-9a81bfb39362\",\n", " \"annotationValue\": \"sample QC\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " }\n", " }\n", " ],\n", " \"protocolType\": {\n", " \"@id\": \"#9a41779d-eb92-467d-88ac-d1595fe189de\",\n", " \"annotationValue\": \"extraction\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " },\n", " \"uri\": \"\",\n", " \"version\": \"\"\n", " },\n", " {\n", " \"@id\": \"#5118472096\",\n", " \"comments\": [],\n", " \"components\": [],\n", " \"description\": \"\",\n", " \"name\": \"data collection\",\n", " \"parameters\": [],\n", " \"protocolType\": {\n", " \"@id\": \"#26e7d865-e6be-454b-bd91-a8ee33a3cbcd\",\n", " \"annotationValue\": \"data acquisition\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " },\n", " \"uri\": \"\",\n", " \"version\": \"\"\n", " },\n", " {\n", " \"@id\": \"#5116464528\",\n", " \"comments\": [],\n", " \"components\": [],\n", " \"description\": \"\",\n", " \"name\": \"data transformation\",\n", " \"parameters\": [],\n", " \"protocolType\": {\n", " \"@id\": 
\"#e3ed694a-215c-45f5-bb09-cf4c2a6154e7\",\n", " \"annotationValue\": \"data normalization\",\n", " \"comments\": [],\n", " \"termAccession\": \"\",\n", " \"termSource\": \"\"\n", " },\n", " \"uri\": \"\",\n", " \"version\": \"\"\n", " }\n", " ],\n", " \"publicReleaseDate\": \"\",\n", " \"publications\": [],\n", " \"studyDesignDescriptors\": [],\n", " \"submissionDate\": \"\",\n", " \"title\": \"ISA Study example: creating sample pools\",\n", " \"unitCategories\": []\n", " }\n", " ],\n", " \"submissionDate\": \"\",\n", " \"title\": \"\"\n", "}\n" ] } ], "source": [ "import json\n", "from isatools.isajson import ISAJSONEncoder\n", "print(json.dumps(investigation, cls=ISAJSONEncoder, sort_keys=True, indent=4, separators=(',', ': ')))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "isa-api-py38", "language": "python", "name": "isa-api-py38" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.0" } }, "nbformat": 4, "nbformat_minor": 4 }