{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Reading ISA-Tab from files and validating ISA-Tab files"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Abstract:\n",
"\n",
"The aim of this notebook is to:\n",
" - show the essential functions for reading an ISA-Tab file and loading it into memory.\n",
" - navigate key objects and pull key attributes.\n",
" - learn how to invoke the ISA-Tab validation function.\n",
" - interpret the output of the validation report.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Getting the tools"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# If executing the notebooks on `Google Colab`, uncomment the following command\n",
"# and run it to install the required Python libraries. Also, make the test datasets available.\n",
"\n",
"# !pip install -r requirements.txt"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import isatools\n",
"import os\n",
"import sys\n",
"from isatools import isatab"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Reading and loading an ISA Investigation in memory from an ISA-Tab instance"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open(os.path.join('./BII-S-3', 'i_gilbert.txt')) as fp:\n",
"    ISA = isatab.load(fp)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Let's check the description of the first Study object in the ISA Investigation object"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ISA.studies[0].description"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Let's check the protocols declared in the ISA Study (using a Python list comprehension):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"[protocol.description for protocol in ISA.studies[0].protocols]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Let's now check which ISA Assay Measurement and Technology Types are used in this ISA Study object"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"[f'{assay.measurement_type.term} using {assay.technology_type.term}' for assay in ISA.studies[0].assays]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Let's now check the `ISA Study Source` Material:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"[source.name for source in ISA.studies[0].sources]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Let's check what the first `ISA Study Source property` is:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# here, we get all the characteristics of the first Source object\n",
"first_source_characteristics = ISA.studies[0].sources[0].characteristics"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"first_source_characteristics[0].category.term"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Let's now check the `value` associated with that first `ISA Study Source property`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"first_source_characteristics[0].value.term"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Let's now list all the properties associated with this first `ISA Study Source`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"[char.category.term for char in first_source_characteristics]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### And the corresponding values are:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"[char.value for char in first_source_characteristics]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Invoking the Python ISA-Tab Validator"
]
},
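{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# open the investigation file in a context manager so the handle is closed after validation\n",
"with open(os.path.join('./BII-I-1/', 'i_investigation.txt')) as fp:\n",
"    my_json_report_bii_i_1 = isatab.validate(fp)"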
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open(os.path.join('./BII-S-3/', 'i_gilbert.txt')) as fp:\n",
"    my_json_report_bii_s_3 = isatab.validate(fp)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open(os.path.join('./BII-S-4/', 'i_investigation.txt')) as fp:\n",
"    my_json_report_bii_s_4 = isatab.validate(fp)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open(os.path.join('./BII-S-7/', 'i_matteo.txt')) as fp:\n",
"    my_json_report_bii_s_7 = isatab.validate(fp)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"my_json_report_bii_s_7"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- This `Validation Report` shows that no error has been logged.\n",
"- The rest of the report consists of warnings meant to draw the curator's attention to elements that may be provided but that do not break the ISA syntax.\n",
"- Notice the `study group` information reported for both study and assay files. If ISA `Factor Value[]` fields are present in the `ISA Study` or `ISA Assay` tables, the validator tries to identify the set of unique `Factor Value` combinations, each defining a `Study Group`.\n",
"    - When no `Factor Value` is found in an ISA `Study` or `Assay` table, the value is left at its default of -1, which means that no `Study Group` has been found.\n",
"    - ISA **strongly** encourages declaring Study Groups using ISA Factor Values to unambiguously identify the independent variables of an experiment."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. What does a validation failure look like?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### BII-S-5 contains an error located in the `i_investigation.txt` file of the submission"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open(os.path.join('./BII-S-5/', 'i_investigation.txt')) as fp:\n",
"    my_json_report_bii_s_5 = isatab.validate(fp)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"my_json_report_bii_s_5[\"errors\"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- The validator reports that the `errors` array is not empty and shows the root cause of the syntactic validation error.\n",
"- There is a typo in the Investigation file that affects two positions in the file, one for the Investigation object and one for the Study object: \n",
"Publication **l**ist vs. Publication **L**ist"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## About this notebook\n",
"\n",
"- authors: philippe.rocca-serra@oerc.ox.ac.uk, massimiliano.izzo@oerc.ox.ac.uk\n",
"- license: CC-BY 4.0\n",
"- support: isatools@googlegroups.com\n",
"- issue tracker: https://github.com/ISA-tools/isa-api/issues"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}