{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Reading ISA-Tab from files and Validating ISA-Tab files " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Abstract:\n", "\n", "The aim of this notebook is to:\n", " - show the essential functions to read and load an ISA-Tab file into memory.\n", " - navigate key objects and pull key attributes.\n", " - learn how to invoke the ISA-Tab validation function.\n", " - interpret the output of the validation report.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Getting the tools" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# If executing the notebooks on `Google Colab`, uncomment the following command \n", "# and run it to install the required Python libraries. Also, make the test datasets available.\n", "\n", "# !pip install -r requirements.txt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import isatools\n", "import os\n", "import sys\n", "from isatools import isatab" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. 
Reading and loading an ISA Investigation into memory from an ISA-Tab instance" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with open(os.path.join('./BII-S-3', 'i_gilbert.txt')) as fp:\n", " ISA = isatab.load(fp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's check the description of the first study object present in the ISA Investigation object" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ISA.studies[0].description" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's check the protocols declared in the ISA study (using a Python list comprehension):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[protocol.description for protocol in ISA.studies[0].protocols]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's now check which ISA Assay Measurement and Technology Types are used in this ISA Study object" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[f'{assay.measurement_type.term} using {assay.technology_type.term}' for assay in ISA.studies[0].assays]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's now check the `ISA Study Source` Material:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[source.name for source in ISA.studies[0].sources]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Let's check what the first `ISA Study Source property` is:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# here, we get all the characteristics of the first Source object\n", "first_source_characteristics = ISA.studies[0].sources[0].characteristics" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ 
"first_source_characteristics[0].category.term" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Let's now check which `value` is associated with that first `ISA Study Source property`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "first_source_characteristics[0].value.term" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Let's now list all the properties associated with this first `ISA Study Source`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[char.category.term for char in first_source_characteristics]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### And the corresponding values are:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[char.value for char in first_source_characteristics]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Invoking the Python ISA-Tab Validator" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_json_report_bii_i_1 = isatab.validate(open(os.path.join('./BII-I-1/', 'i_investigation.txt')))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_json_report_bii_s_3 = isatab.validate(open(os.path.join('./BII-S-3/', 'i_gilbert.txt')))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_json_report_bii_s_4 = isatab.validate(open(os.path.join('./BII-S-4/', 'i_investigation.txt')))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_json_report_bii_s_7 = isatab.validate(open(os.path.join('./BII-S-7/', 'i_matteo.txt')))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_json_report_bii_s_7" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- This `Validation 
Report` shows that no error has been logged.\n", "- The rest of the report consists of warnings meant to draw the curator's attention to elements that may be provided but that do not break the ISA syntax.\n", "- Notice the `study group` information reported on both study and assay files. If ISA `Factor Value[]` fields are present in the `ISA Study` or `ISA Assay` tables, the validator tries to identify the set of unique `Factor Value` combinations defining a `Study Group`.\n", " - When no `Factor Value` is found in an ISA `Study` or `Assay` table, the value is left at its default, -1, which means that no `Study Group` has been found.\n", " - ISA **strongly** encourages declaring Study Groups using ISA Factor Values to unambiguously identify the independent variables of an experiment.\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. What does a validation failure look like?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### BII-S-5 contains an error located in the `i_investigation.txt` file of the submission" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_json_report_bii_s_5 = isatab.validate(open(os.path.join('./BII-S-5/', 'i_investigation.txt')))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_json_report_bii_s_5[\"errors\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- The validator report shows that the `errors` array is not empty and gives the root cause of the syntactic validation error.\n", "- There is a typo in the Investigation file that affects two positions in the file, for both the Investigation and Study objects: \n", "Publication **l**ist. 
vs Publication **L**ist" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## About this notebook\n", "\n", "- authors: philippe.rocca-serra@oerc.ox.ac.uk, massimiliano.izzo@oerc.ox.ac.uk\n", "- license: CC-BY 4.0\n", "- support: isatools@googlegroups.com\n", "- issue tracker: https://github.com/ISA-tools/isa-api/issues" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.1" } }, "nbformat": 4, "nbformat_minor": 4 }