{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Python interface to the SuiteSparse Matrix Collection\n", "\n", "This notebook walks you through some of the features of the `ssgetpy` package, which provides a search and download interface for the [SuiteSparse](https://suitesparse.com) matrix collection. \n", "\n", "The simplest way to install `ssgetpy` is via:\n", "```\n", "pip install ssgetpy\n", "```\n", "\n", "This installs both the `ssgetpy` Python module and the `ssgetpy` command-line script. \n", "\n", "\n", "This notebook only covers the library version of `ssgetpy`. To get more information on the command-line script, run:\n", "```\n", "$ ssgetpy --help\n", "```\n", "\n", "Before proceeding with the rest of this notebook, please install `ssgetpy` into your environment. If you are running this notebook under Binder, `ssgetpy` will already be installed for you. If you are running this notebook in Google Colaboratory, the following cell will install `ssgetpy`: " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "scrolled": false }, "outputs": [], "source": [ "# Install ssgetpy only when running under Google Colab\n", "ipy = get_ipython()\n", "if 'google.colab' in str(ipy):\n", "    import sys\n", "    ipy.run_cell('!{sys.executable} -m pip install ssgetpy')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, import `ssgetpy`:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "scrolled": false }, "outputs": [], "source": [ "import ssgetpy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Basic query interface\n", "\n", "The primary interface to `ssgetpy` is `ssgetpy.search`. Running `search` without any arguments returns the first 10 matrices in the collection:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
| Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot |
|----|-------|------|------|------|-----|-------|-----------------------|------|------------------|--------------------|------|----------|
| 1 | HB | 1138_bus | 1138 | 1138 | 4054 | real | No | Yes | 1.0 | 1.0 | power network problem | |
| 2 | HB | 494_bus | 494 | 494 | 1666 | real | No | Yes | 1.0 | 1.0 | power network problem | |
| 3 | HB | 662_bus | 662 | 662 | 2474 | real | No | Yes | 1.0 | 1.0 | power network problem | |
| 4 | HB | 685_bus | 685 | 685 | 3249 | real | No | Yes | 1.0 | 1.0 | power network problem | |
| 5 | HB | abb313 | 313 | 176 | 1557 | binary | No | No | 0.0 | 0.0 | least squares problem | |
| 6 | HB | arc130 | 130 | 130 | 1037 | real | Yes | No | 0.76 | 0.0 | materials problem | |
| 7 | HB | ash219 | 219 | 85 | 438 | binary | No | No | 0.0 | 0.0 | least squares problem | |
| 8 | HB | ash292 | 292 | 292 | 2208 | binary | No | No | 1.0 | 1.0 | least squares problem | |
| 9 | HB | ash331 | 331 | 104 | 662 | binary | No | No | 0.0 | 0.0 | least squares problem | |
| 10 | HB | ash608 | 608 | 188 | 1216 | binary | No | No | 0.0 | 0.0 | least squares problem | |
" ], "text/plain": [ "[Matrix(1, 'HB', '1138_bus', 1138, 1138, 4054, 'real', False, True, 1.0, 1.0, 'power network problem', 'https://sparse.tamu.edu/files/HB/1138_bus.png'),\n", " Matrix(2, 'HB', '494_bus', 494, 494, 1666, 'real', False, True, 1.0, 1.0, 'power network problem', 'https://sparse.tamu.edu/files/HB/494_bus.png'),\n", " Matrix(3, 'HB', '662_bus', 662, 662, 2474, 'real', False, True, 1.0, 1.0, 'power network problem', 'https://sparse.tamu.edu/files/HB/662_bus.png'),\n", " Matrix(4, 'HB', '685_bus', 685, 685, 3249, 'real', False, True, 1.0, 1.0, 'power network problem', 'https://sparse.tamu.edu/files/HB/685_bus.png'),\n", " Matrix(5, 'HB', 'abb313', 313, 176, 1557, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/abb313.png'),\n", " Matrix(6, 'HB', 'arc130', 130, 130, 1037, 'real', True, False, 0.7586805555555556, 0.0, 'materials problem', 'https://sparse.tamu.edu/files/HB/arc130.png'),\n", " Matrix(7, 'HB', 'ash219', 219, 85, 438, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash219.png'),\n", " Matrix(8, 'HB', 'ash292', 292, 292, 2208, 'binary', False, False, 1.0, 1.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash292.png'),\n", " Matrix(9, 'HB', 'ash331', 331, 104, 662, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash331.png'),\n", " Matrix(10, 'HB', 'ash608', 608, 188, 1216, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash608.png')]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ssgetpy.search()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the search result comes with minimal Jupyter integration that shows some metadata along with the distribution of the non-zero values. 
Click on the group or name link to go to a web page in the SuiteSparse matrix collection that has much more information about the group or the matrix, respectively.\n", "\n", "### Query filters\n", "\n", "You can add more filters via keyword arguments as follows:\n", "\n", "|Argument | Description | Type | Default | Notes |\n", "|---------|-------------|------|---------|-------| \n", "|`rowbounds` | Number of rows | `tuple`: `(min_value, max_value)` | `(None, None)`| `min_value` or `max_value` can be `None`, which implies \"don't care\" |\n", "|`colbounds` | Number of columns | `tuple`: `(min_value, max_value)` | `(None, None)` | |\n", "|`nzbounds` | Number of non-zeros | `tuple`: `(min_value, max_value)` | `(None, None)`| |\n", "|`isspd` | Symmetric positive definite (SPD)? | `bool` or `None` | `None` | `None` implies \"don't care\" |\n", "|`is2d3d` | 2D/3D Discretization? | `bool` or `None` | `None` | |\n", "|`dtype` | Non-zero data type | `real`, `complex`, `binary` or `None` | `None` | |\n", "|`group` | Matrix group | `str` or `None` | `None` | Supports partial matches; `None` implies \"don't care\" |\n", "|`kind` | Problem domain | `str` or `None` | `None` | Supports partial matches; `None` implies \"don't care\" |\n", "|`limit` | Max number of results | `int` | `10` | |\n", "\n", "> Note that numerical and pattern symmetry filters are not yet supported.\n", "\n", "As an example of using the above filters, here is a query that returns five non-SPD matrices with $1000\\leq \\text{NNZ} \\leq 10000$:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
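| Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot |
|----|-------|------|------|------|-----|-------|-----------------------|------|------------------|--------------------|------|----------|
| 5 | HB | abb313 | 313 | 176 | 1557 | binary | No | No | 0.0 | 0.0 | least squares problem | |
| 6 | HB | arc130 | 130 | 130 | 1037 | real | Yes | No | 0.76 | 0.0 | materials problem | |
| 8 | HB | ash292 | 292 | 292 | 2208 | binary | No | No | 1.0 | 1.0 | least squares problem | |
| 10 | HB | ash608 | 608 | 188 | 1216 | binary | No | No | 0.0 | 0.0 | least squares problem | |
| 12 | HB | ash958 | 958 | 292 | 1916 | binary | No | No | 0.0 | 0.0 | least squares problem | |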
" ], "text/plain": [ "[Matrix(5, 'HB', 'abb313', 313, 176, 1557, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/abb313.png'),\n", " Matrix(6, 'HB', 'arc130', 130, 130, 1037, 'real', True, False, 0.7586805555555556, 0.0, 'materials problem', 'https://sparse.tamu.edu/files/HB/arc130.png'),\n", " Matrix(8, 'HB', 'ash292', 292, 292, 2208, 'binary', False, False, 1.0, 1.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash292.png'),\n", " Matrix(10, 'HB', 'ash608', 608, 188, 1216, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash608.png'),\n", " Matrix(12, 'HB', 'ash958', 958, 292, 1916, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash958.png')]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ssgetpy.search(nzbounds=(1000,10000), isspd=False, limit=5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Working with search results\n", "The result of a search query is a collection of `Matrix` objects. The collection can be sliced using the same syntax as for vanilla Python `list`s as shown below:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
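| Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot |
|----|-------|------|------|------|-----|-------|-----------------------|------|------------------|--------------------|------|----------|
| 24 | HB | bcsstk02 | 66 | 66 | 4356 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |
| 26 | HB | bcsstk04 | 132 | 132 | 3648 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |
| 27 | HB | bcsstk05 | 153 | 153 | 2423 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |
| 28 | HB | bcsstk06 | 420 | 420 | 7860 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |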
" ], "text/plain": [ "[Matrix(24, 'HB', 'bcsstk02', 66, 66, 4356, 'real', True, True, 1.0, 1.0, 'structural problem', 'https://sparse.tamu.edu/files/HB/bcsstk02.png'),\n", " Matrix(26, 'HB', 'bcsstk04', 132, 132, 3648, 'real', True, True, 1.0, 1.0, 'structural problem', 'https://sparse.tamu.edu/files/HB/bcsstk04.png'),\n", " Matrix(27, 'HB', 'bcsstk05', 153, 153, 2423, 'real', True, True, 1.0, 1.0, 'structural problem', 'https://sparse.tamu.edu/files/HB/bcsstk05.png'),\n", " Matrix(28, 'HB', 'bcsstk06', 420, 420, 7860, 'real', True, True, 1.0, 1.0, 'structural problem', 'https://sparse.tamu.edu/files/HB/bcsstk06.png')]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result = ssgetpy.search(kind='structural', nzbounds=(1000,10000))\n", "result[:4]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An individual element in the collection can be used as follows:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
| Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot |
|----|-------|------|------|------|-----|-------|-----------------------|------|------------------|--------------------|------|----------|
| 24 | HB | bcsstk02 | 66 | 66 | 4356 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |
" ], "text/plain": [ "Matrix(24, 'HB', 'bcsstk02', 66, 66, 4356, 'real', True, True, 1.0, 1.0, 'structural problem', 'https://sparse.tamu.edu/files/HB/bcsstk02.png')" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "small_matrix = result[0]\n", "small_matrix" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "4356" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "small_matrix.nnz" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can download a matrix locally using the `download` method:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "('C:\\\\Users\\\\drdar\\\\AppData\\\\Roaming\\\\ssgetpy\\\\MM\\\\HB\\\\bcsstk02.tar.gz',\n", " 'C:\\\\Users\\\\drdar\\\\AppData\\\\Roaming\\\\ssgetpy\\\\MM\\\\HB\\\\bcsstk02.tar.gz')" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "small_matrix.download()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `download` method supports the following arguments:\n", "\n", "|Argument| Description | Data type | Default value | Notes|\n", "|--------|-------------|-----------|---------------|------|\n", "|`format`| Sparse matrix storage format | One of (`'MM', 'RB', 'MAT'`) | `MM` | `MM` is Matrix Market, `RB` is Rutherford-Boeing, and `MAT` is MATLAB MAT-file format|\n", "|`destpath` | Path to download to | `str` | `~/.ssgetpy` on Unix, `%APPDATA%\\ssgetpy` on Windows | The full filename for the matrix is obtained via `os.path.join(destpath, format, group_name, matrix_name + extension)`, where `extension` is `.tar.gz` for `MM` and `RB` and `.mat` for `MAT`|\n", "|`extract` | Extract TGZ archive? | `bool` | `False` | Only applicable to `MM` and `RB` formats |\n", "\n", "The return value is a two-element `tuple` of local paths. In the outputs shown here, the second element is the downloaded archive, and the first is the extracted directory when `extract=True` (otherwise it repeats the archive path). \n", "\n", "Note that `download` does not download the file again if it already exists at the destination path. " ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "('C:\\\\Users\\\\drdar\\\\AppData\\\\Roaming\\\\ssgetpy\\\\MM\\\\HB\\\\bcsstk02.tar.gz',\n", " 'C:\\\\Users\\\\drdar\\\\AppData\\\\Roaming\\\\ssgetpy\\\\MM\\\\HB\\\\bcsstk02.tar.gz')" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "small_matrix.download()" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "('C:\\\\Users\\\\drdar\\\\AppData\\\\Roaming\\\\ssgetpy\\\\MM\\\\HB\\\\bcsstk02',\n", " 'C:\\\\Users\\\\drdar\\\\AppData\\\\Roaming\\\\ssgetpy\\\\MM\\\\HB\\\\bcsstk02.tar.gz')" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "small_matrix.download(extract=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, `download` also works directly on the output of `search`, so you don't have to download one matrix at a time. 
For example, to download the first five matrices in the previous query, you could use:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "scrolled": false }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b9ad495a1ef5462a889164dcb19a4d92", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(FloatProgress(value=0.0, description='Overall progress', max=5.0, style=ProgressStyle(descripti…" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "result[:5].download()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }