{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# datasets" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This module provides the functions needed to download several useful datasets that we might want to use in our models." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [], "source": [ "from fastai.gen_doc.nbdoc import *\n", "from fastai.datasets import * \n", "from fastai.datasets import Config\n", "from pathlib import Path" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "
class URLs()"
],
"text/plain": [
"untar_data(`url`:`str`, `fname`:`PathOrStr`=`None`, `dest`:`PathOrStr`=`None`, `data`=`True`)\n",
"\n",
"Download `url` if it doesn't exist to `fname` and un-tgz to folder `dest` "
],
"text/plain": [
"download_data(`url`:`str`, `fname`:`PathOrStr`=`None`, `data`:`bool`=`True`)\n",
"\n",
"Download `url` to destination `fname` "
],
"text/plain": [
"`data` directory inside the notebook, that data file will be used instead of `~/.fastai/data`. Paths are resolved by calling the function [`datapath4file`](/datasets.html#datapath4file), which checks whether the data exists locally (`data/`) first, before downloading to the `~/.fastai/data` directory in your home folder.\n",
"\n",
"Example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"PosixPath('/home/jhoward/.fastai/data/planet_sample.tgz')"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"download_data(URLs.PLANET_SAMPLE)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"hide_input": true
},
"outputs": [
{
"data": {
"text/markdown": [
"datapath4file(`filename`)\n",
"\n",
"Returns URLs.DATA path if file exists. Otherwise returns config path "
],
"text/plain": [
"`data` directory in the same place as the calling notebook/script, that is used as the parent directory; otherwise `~/.fastai/config.yml` is read to see what path to use, which defaults to `~/.fastai/data`. To override this default, simply modify the value in your `~/.fastai/config.yml`:\n",
"\n",
" data_path: ~/.fastai/data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"hide_input": true
},
"outputs": [
{
"data": {
"text/markdown": [
"class Config()\n",
"\n",
"Creates a default config file at `~/.fastai/config.yml` "
],
"text/plain": [
"get_path(`path`)"
],
"text/plain": [
"data_path()"
],
"text/plain": [
"model_path()"
],
"text/plain": [
"create(`fpath`)"
],
"text/plain": [
"url2name(`url`)"
],
"text/plain": [
"get_key(`key`)"
],
"text/plain": [
"get(`fpath`=`None`, `create_missing`=`True`)"
],
"text/plain": [
"