{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Reading and Writing Files\n", "\n", "## HDF5\n", "\n", "Scipp supports writing variables, data arrays, and dataset to [HDF5](https://portal.hdfgroup.org/documentation/index.html) files.\n", "Reading of HDF5 is supported *only* for these scipp-specific files.\n", "Other HDF5-based formats are not supported at this point.\n", "For reading the HDF5-based [NeXus](https://www.nexusformat.org/) files, see [scippneutron](https://scipp.github.io/scippneutron/).\n", "\n", "
\n", "\n", "**Warning**\n", " \n", "We do not recommend to use Scipp HDF5 files for archiving or as the sole means of storing valuable data.\n", "The current Scipp HDF5 schema is not a standard and will likely be subject to change due to the early development status of scipp.\n", "**Future versions of Scipp may not be able to read older files.**\n", " \n", "That being said, the file format is quite simple and based on the HDF5 standard so it would still be possible to recover data from such files in such a case.\n", "Note that the Scipp version is stored as an HDF5 attribute of the saved objects.\n", " \n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import scipp as sc\n", "\n", "x = sc.Variable(dims=['x'], values=np.arange(10))\n", "var = sc.Variable(dims=['x', 'y'], values=np.random.rand(9, 3))\n", "a = sc.DataArray(data=var, coords={'x': x})\n", "\n", "a.save_hdf5(filename='test.hdf5')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b = sc.io.load_hdf5(filename='test.hdf5')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CSV\n", "\n", "
\n", "\n", "**Note**\n", "\n", "CSV support requires [pandas](https://pandas.pydata.org/) which must be installed separately.\n", " \n", "
\n", "\n", "CSV files can be read into datasets with [scipp.io.load_csv](../generated/modules/scipp.io.csv.load_csv.rst).\n", "For example, given the following CSV-encoded data can be read into a dataset as shown:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "csv_content = '''a [m],b [s],c\n", "1,5,9\n", "2,6,10\n", "3,7,11\n", "4,8,12'''" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from io import StringIO\n", "\n", "ds = sc.io.load_csv(StringIO(csv_content), header_parser='bracket')\n", "ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This example uses `StringIO` to load the data directly from a string.\n", "But `load_csv` can also load from a file on your hard drive or even from a remote server.\n", "Simply pass the path or URL of the file as the first argument.\n", "See also [pandas.read_csv](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html).\n", "\n", "See [scipp.io.load_csv](../generated/modules/scipp.io.csv.load_csv.rst) for more options to customize how the data is structured in the dataset." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using pandas\n", "\n", "The CSV reader shown above is a wrapper around `pandas.read_csv` and provides commonly used functionality.\n", "But pandas supports many more file readers for, among others, Excel, JSON, and XML files.\n", "See [pandas IO tools](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html) for a complete list.\n", "\n", "It is possible to use pandas manually to load these files and then convert the result to a Scipp dataset using [from_pandas](../generated/modules/scipp.compat.pandas_compat.from_pandas.rst).\n", "For example, JSON can be read as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "json = '''{\"A [m]\": {\"0\": 1, \"1\": 3, \"2\": 5},\n", "\"B [m/s]\": {\"0\": 2, \"1\": 4, \"2\": 6}}'''" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import pandas as pd\n", "\n", "df = pd.read_json(json)\n", "df" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "sc.compat.from_pandas(df, header_parser='bracket')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## NeXus\n", "\n", "Scipp has no built-in support for loading [NeXus](https://www.nexusformat.org/) files.\n", "However, the `scippneutron` package can internally use [Mantid](https://www.mantidproject.org) to load such files, or any other Mantid-supported file type, see [scippneutron](https://scipp.github.io/scippneutron/) and in particular [scippneutron.load_with_mantid](https://scipp.github.io/scippneutron/generated/functions/scippneutron.load_with_mantid.html)." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.15" } }, "nbformat": 4, "nbformat_minor": 4 }