{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Python interface to the SuiteSparse Matrix Collection\n",
"\n",
"This notebook walks you through some of the features of the `ssgetpy` package that provides a search and download interface for the [Suite Sparse](https://suitesparse.com) matrix collection. \n",
"\n",
"The simplest way to install `ssgetpy` is via:\n",
"```\n",
"pip install ssgetpy\n",
"```\n",
"\n",
"This installs both the `ssgetpy` Python module as well as the `ssgetpy` command-line script. \n",
"\n",
"\n",
"This notebook only covers the library version of `ssgetpy`. To get more information on the command-line script run:\n",
"```\n",
"$ ssgetpy --help\n",
"```\n",
"\n",
"Before proceeding with the rest of this notebook, please install `ssgetpy` into your environment. If you are running this notebook under Binder, `ssgetpy` will already be installed for you. If you are running this notebook in Google Colaboratory, the following cell will install `ssgetpy`: "
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"ipy = get_ipython()\n",
"if 'google.colab' in str(ipy):\n",
" import sys\n",
" ipy.run_cell('!{sys.executable} -m pip install ssgetpy')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First import `ssgetpy` via:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"import ssgetpy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Basic query interface\n",
"\n",
"The primary interface to `ssgetpy` is via `ssgetpy.search`. Running `search` without any arguments returns the first 10 matrices in the collection:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/html": [
"
Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot | 1 | HB | 1138_bus | 1138 | 1138 | 4054 | real | No | Yes | 1.0 | 1.0 | power network problem | |
2 | HB | 494_bus | 494 | 494 | 1666 | real | No | Yes | 1.0 | 1.0 | power network problem | |
3 | HB | 662_bus | 662 | 662 | 2474 | real | No | Yes | 1.0 | 1.0 | power network problem | |
4 | HB | 685_bus | 685 | 685 | 3249 | real | No | Yes | 1.0 | 1.0 | power network problem | |
5 | HB | abb313 | 313 | 176 | 1557 | binary | No | No | 0.0 | 0.0 | least squares problem | |
6 | HB | arc130 | 130 | 130 | 1037 | real | Yes | No | 0.76 | 0.0 | materials problem | |
7 | HB | ash219 | 219 | 85 | 438 | binary | No | No | 0.0 | 0.0 | least squares problem | |
8 | HB | ash292 | 292 | 292 | 2208 | binary | No | No | 1.0 | 1.0 | least squares problem | |
9 | HB | ash331 | 331 | 104 | 662 | binary | No | No | 0.0 | 0.0 | least squares problem | |
10 | HB | ash608 | 608 | 188 | 1216 | binary | No | No | 0.0 | 0.0 | least squares problem | |
"
],
"text/plain": [
"[Matrix(1, 'HB', '1138_bus', 1138, 1138, 4054, 'real', False, True, 1.0, 1.0, 'power network problem', 'https://sparse.tamu.edu/files/HB/1138_bus.png'),\n",
" Matrix(2, 'HB', '494_bus', 494, 494, 1666, 'real', False, True, 1.0, 1.0, 'power network problem', 'https://sparse.tamu.edu/files/HB/494_bus.png'),\n",
" Matrix(3, 'HB', '662_bus', 662, 662, 2474, 'real', False, True, 1.0, 1.0, 'power network problem', 'https://sparse.tamu.edu/files/HB/662_bus.png'),\n",
" Matrix(4, 'HB', '685_bus', 685, 685, 3249, 'real', False, True, 1.0, 1.0, 'power network problem', 'https://sparse.tamu.edu/files/HB/685_bus.png'),\n",
" Matrix(5, 'HB', 'abb313', 313, 176, 1557, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/abb313.png'),\n",
" Matrix(6, 'HB', 'arc130', 130, 130, 1037, 'real', True, False, 0.7586805555555556, 0.0, 'materials problem', 'https://sparse.tamu.edu/files/HB/arc130.png'),\n",
" Matrix(7, 'HB', 'ash219', 219, 85, 438, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash219.png'),\n",
" Matrix(8, 'HB', 'ash292', 292, 292, 2208, 'binary', False, False, 1.0, 1.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash292.png'),\n",
" Matrix(9, 'HB', 'ash331', 331, 104, 662, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash331.png'),\n",
" Matrix(10, 'HB', 'ash608', 608, 188, 1216, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash608.png')]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ssgetpy.search()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice that search result comes with minimal Jupyter integration that shows some metadata along with the distribution of the non-zero values. Click on the group or name link to go a web page in the SuiteSparse matrix collection that has much more information about the group or the matrix respectively.\n",
"\n",
"### Query filters\n",
"\n",
"You can add more filters via keyword arguments as follows:\n",
"\n",
"|Argument | Description | Type | Default | Notes |\n",
"|---------|-------------|------|---------|-------| \n",
"|`rowbounds` | Number of rows | `tuple`: `(min_value, max_value)` | `(None, None)`| `min_value` or `max_value` can be `None` which implies \"don't care\" |\n",
"|`colbounds` | Number of columns | `tuple`: `(min_value, max_value)` | `(None, None)` | |\n",
"|`nzbounds` | Number of non-zeros | `tuple`: `(min_value, max_value)` | `(None, None)`| |\n",
"|`isspd` | SPD? | `bool` or `None` | `None` | `None` implies \"don't care\" |\n",
"|`is2d3d` | 2D/3D Discretization? | `bool` or `None` | `None` | |\n",
"| `dtype` | Non-zero data type | `real`, `complex`, `binary` or `None` | `None` | |\n",
"| `group` | Matrix group | `str` or `None` | `None` | Supports partial matches; `None` implies \"don't care\" |\n",
"| `kind` | Problem domain | `str` or `None` | `None` | Supports partial matches; `None` implies \"don't care\" |\n",
"| `limit` | Max number of results | `int` | 10 | |\n",
"\n",
"> Note that numerical and pattern symmetry filters are not yet supported.\n",
"\n",
"As an example of using the above filters, here is a query that returns five, non-SPD matrices with $1000\\leq \\text{NNZ} \\leq 10000$:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/html": [
"Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot | 5 | HB | abb313 | 313 | 176 | 1557 | binary | No | No | 0.0 | 0.0 | least squares problem | |
6 | HB | arc130 | 130 | 130 | 1037 | real | Yes | No | 0.76 | 0.0 | materials problem | |
8 | HB | ash292 | 292 | 292 | 2208 | binary | No | No | 1.0 | 1.0 | least squares problem | |
10 | HB | ash608 | 608 | 188 | 1216 | binary | No | No | 0.0 | 0.0 | least squares problem | |
12 | HB | ash958 | 958 | 292 | 1916 | binary | No | No | 0.0 | 0.0 | least squares problem | |
"
],
"text/plain": [
"[Matrix(5, 'HB', 'abb313', 313, 176, 1557, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/abb313.png'),\n",
" Matrix(6, 'HB', 'arc130', 130, 130, 1037, 'real', True, False, 0.7586805555555556, 0.0, 'materials problem', 'https://sparse.tamu.edu/files/HB/arc130.png'),\n",
" Matrix(8, 'HB', 'ash292', 292, 292, 2208, 'binary', False, False, 1.0, 1.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash292.png'),\n",
" Matrix(10, 'HB', 'ash608', 608, 188, 1216, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash608.png'),\n",
" Matrix(12, 'HB', 'ash958', 958, 292, 1916, 'binary', False, False, 0.0, 0.0, 'least squares problem', 'https://sparse.tamu.edu/files/HB/ash958.png')]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ssgetpy.search(nzbounds=(1000,10000), isspd=False, limit=5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Working with search results\n",
"The result of a search query is a collection of `Matrix` objects. The collection can be sliced using the same syntax as for vanilla Python `list`s as shown below:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/html": [
"Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot | 24 | HB | bcsstk02 | 66 | 66 | 4356 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |
26 | HB | bcsstk04 | 132 | 132 | 3648 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |
27 | HB | bcsstk05 | 153 | 153 | 2423 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |
28 | HB | bcsstk06 | 420 | 420 | 7860 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |
"
],
"text/plain": [
"[Matrix(24, 'HB', 'bcsstk02', 66, 66, 4356, 'real', True, True, 1.0, 1.0, 'structural problem', 'https://sparse.tamu.edu/files/HB/bcsstk02.png'),\n",
" Matrix(26, 'HB', 'bcsstk04', 132, 132, 3648, 'real', True, True, 1.0, 1.0, 'structural problem', 'https://sparse.tamu.edu/files/HB/bcsstk04.png'),\n",
" Matrix(27, 'HB', 'bcsstk05', 153, 153, 2423, 'real', True, True, 1.0, 1.0, 'structural problem', 'https://sparse.tamu.edu/files/HB/bcsstk05.png'),\n",
" Matrix(28, 'HB', 'bcsstk06', 420, 420, 7860, 'real', True, True, 1.0, 1.0, 'structural problem', 'https://sparse.tamu.edu/files/HB/bcsstk06.png')]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result = ssgetpy.search(kind='structural', nzbounds=(1000,10000))\n",
"result[:4]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An individual element in the collection can be used as follows:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/html": [
"Id | Group | Name | Rows | Cols | NNZ | DType | 2D/3D Discretization? | SPD? | Pattern Symmetry | Numerical Symmetry | Kind | Spy Plot | 24 | HB | bcsstk02 | 66 | 66 | 4356 | real | Yes | Yes | 1.0 | 1.0 | structural problem | |
"
],
"text/plain": [
"Matrix(24, 'HB', 'bcsstk02', 66, 66, 4356, 'real', True, True, 1.0, 1.0, 'structural problem', 'https://sparse.tamu.edu/files/HB/bcsstk02.png')"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"small_matrix = result[0]\n",
"small_matrix"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"4356"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"small_matrix.nnz"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can download a matrix locally using the `download` method:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"('C:\\\\Users\\\\drdar\\\\AppData\\\\Roaming\\\\ssgetpy\\\\MM\\\\HB\\\\bcsstk02.tar.gz',\n",
" 'C:\\\\Users\\\\drdar\\\\AppData\\\\Roaming\\\\ssgetpy\\\\MM\\\\HB\\\\bcsstk02.tar.gz')"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"small_matrix.download()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `download` methods supports the following arguments:\n",
"\n",
"|Argument| Description | Data type | Default value | Notes|\n",
"|--------|-------------|-----------|---------------|------|\n",
"|`format`| Sparse matrix storage format | One of (`'MM', 'RB', 'MAT'`) | `MM` | `MM` is Matrix Market, `RB` is Rutherford-Boeing and `MAT` is MATLAB MAT-file format|\n",
"|`destpath` | Path to download | `str` | `~/.ssgetpy` on Unix `%APPDATA%\\ssgetpy` on Windows | The full filename for the matrix is obtained via `os.path.join(destpath, format, group_name, matrix_name + extension)`where `extention` is `.tar.gz` for `MM` and `RB` and `.mat` for `MAT`|\n",
"|`extract` | Extract TGZ archive? | `bool` | `False` | Only applicable to `MM` and `RB` formats |\n",
"\n",
"The return value is a two-element `tuple` containing the local path where the matrix was downloaded to along with the path for the extracted file, if applicable. \n",
"\n",
"Note that `download` does not actually download the file again if it already exists in the path. "
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"('C:\\\\Users\\\\drdar\\\\AppData\\\\Roaming\\\\ssgetpy\\\\MM\\\\HB\\\\bcsstk02.tar.gz',\n",
" 'C:\\\\Users\\\\drdar\\\\AppData\\\\Roaming\\\\ssgetpy\\\\MM\\\\HB\\\\bcsstk02.tar.gz')"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"small_matrix.download()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"('C:\\\\Users\\\\drdar\\\\AppData\\\\Roaming\\\\ssgetpy\\\\MM\\\\HB\\\\bcsstk02',\n",
" 'C:\\\\Users\\\\drdar\\\\AppData\\\\Roaming\\\\ssgetpy\\\\MM\\\\HB\\\\bcsstk02.tar.gz')"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"small_matrix.download(extract=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, `download` also works directly on the output of `search`, so you don't have to download one matrix at a time. For example, to download the first five matrices in the previous query, you could use:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "b9ad495a1ef5462a889164dcb19a4d92",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(FloatProgress(value=0.0, description='Overall progress', max=5.0, style=ProgressStyle(descripti…"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
}
],
"source": [
"result[:5].download()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}