{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "2ecf2e51",
   "metadata": {},
   "source": [
    "# Exploring the Dataset Zoo\n",
    "\n",
    "This experience introduces you to the core components of the FiftyOne Zoo:\n",
    "- The **Dataset Zoo** for accessing and exploring public datasets\n",
    "- The **Model Zoo** for running pre-trained models on your data\n",
    "- Creating your **own remotely-sourced datasets** for reuse and collaboration\n",
    "\n",
    "Whether you're a researcher, engineer, or educator, these tools help streamline your computer vision workflows in FiftyOne.\n",
    "\n",
    "> 💡 Make sure to run `pip install fiftyone torch torchvision` before starting."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "bat"
    }
   },
   "outputs": [],
   "source": [
    "!pip install fiftyone\n",
    "!pip install torch torchvision"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## FiftyOne Zoo: A Hub for Datasets and Models\n",
    "\n",
    "The FiftyOne Zoo provides access to a collection of pre-built datasets and pre-trained models. This notebook guides you through exploring and using these resources.\n",
    "\n",
    "### Key Components:\n",
    "\n",
    "* **Dataset Zoo:** Offers a range of computer vision datasets that can be downloaded and loaded with a single call.\n",
    "* **Model Zoo:** Provides pre-trained models for various tasks, enabling quick experimentation and deployment.\n",
    "\n",
    "Let's dive in!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import fiftyone as fo\n",
    "import fiftyone.zoo as foz"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Dataset Zoo\n",
    "\n",
    "### Exploring the Dataset Zoo\n",
    "\n",
    "The Dataset Zoo simplifies the process of loading and working with popular datasets.\n",
    "\n",
    "#### Listing Available Datasets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Available Datasets:\n",
      "- activitynet-100\n",
      "- activitynet-200\n",
      "- bdd100k\n",
      "- caltech101\n",
      "- caltech256\n",
      "- cifar10\n",
      "- cifar100\n",
      "- cityscapes\n",
      "- coco-2014\n",
      "- coco-2017\n",
      "- fashion-mnist\n",
      "- fiw\n",
      "- hmdb51\n",
      "- imagenet-2012\n",
      "- imagenet-sample\n",
      "- kinetics-400\n",
      "- kinetics-600\n",
      "- kinetics-700\n",
      "- kinetics-700-2020\n",
      "- kitti\n",
      "- kitti-multiview\n",
      "- lfw\n",
      "- mnist\n",
      "- open-images-v6\n",
      "- open-images-v7\n",
      "- places\n",
      "- quickstart\n",
      "- quickstart-3d\n",
      "- quickstart-geo\n",
      "- quickstart-groups\n",
      "- quickstart-video\n",
      "- sama-coco\n",
      "- ucf101\n",
      "- voc-2007\n",
      "- voc-2012\n"
     ]
    }
   ],
   "source": [
    "print(\"Available Datasets:\")\n",
    "for dataset_name in foz.list_zoo_datasets():\n",
    "    print(f\"- {dataset_name}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Loading a Dataset (Example: MNIST)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = foz.load_zoo_dataset(\"mnist\")\n",
    "print(dataset)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Visualizing the Dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Session launched. Run `session.show()` to open the App in a cell output.\n",
      "\n",
      "Welcome to\n",
      "\n",
      "███████╗██╗███████╗████████╗██╗   ██╗ ██████╗ ███╗   ██╗███████╗\n",
      "██╔════╝██║██╔════╝╚══██╔══╝╚██╗ ██╔╝██╔═══██╗████╗  ██║██╔════╝\n",
      "█████╗  ██║█████╗     ██║    ╚████╔╝ ██║   ██║██╔██╗ ██║█████╗\n",
      "██╔══╝  ██║██╔══╝     ██║     ╚██╔╝  ██║   ██║██║╚██╗██║██╔══╝\n",
      "██║     ██║██║        ██║      ██║   ╚██████╔╝██║ ╚████║███████╗\n",
      "╚═╝     ╚═╝╚═╝        ╚═╝      ╚═╝    ╚═════╝ ╚═╝  ╚═══╝╚══════╝ v1.3.1\n",
      "\n",
      "If you're finding FiftyOne helpful, here's how you can get involved:\n",
      "\n",
      "|\n",
      "|  ⭐⭐⭐ Give the project a star on GitHub ⭐⭐⭐\n",
      "|  https://github.com/voxel51/fiftyone\n",
      "|\n",
      "|  🚀🚀🚀 Join the FiftyOne Discord community 🚀🚀🚀\n",
      "|  https://community.voxel51.com/\n",
      "|\n",
      "\n"
     ]
    }
   ],
   "source": [
    "session = fo.launch_app(dataset)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Visualizing the dataset](https://cdn.voxel51.com/getting_started_model_dataset_zoo/notebook1/visualizate_dataset.webp)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Loading a Specific Split (Example: COCO)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
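    "try:\n",
    "    # Note: the full COCO-2017 train split is a large download\n",
    "    coco_train = foz.load_zoo_dataset(\"coco-2017\", split=\"train\")\n",
    "    print(coco_train)\n",
    "except Exception as e:\n",
    "    # Surface the actual error instead of swallowing it\n",
    "    print(f\"Could not load coco-2017: {e}\")"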
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Downloading and Loading a Dataset with Specific Splits and Downsampling (Example: open-images-v6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    dataset = foz.load_zoo_dataset(\n",
    "        \"open-images-v6\",\n",
    "        splits=[\"train\", \"validation\"],\n",
    "        label_types=[\"detections\", \"segmentations\"],\n",
    "        classes=[\"Car\", \"Person\"],\n",
    "        max_samples=50,\n",
    "    )\n",
    "    print(dataset)\n",
    "except Exception as e:\n",
    "    print(f\"Could not load open-images-v6: {e}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Working with Dataset Metadata"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
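    "try:\n",
    "    # Look up the zoo entry; this does not download the dataset\n",
    "    zoo_dataset = foz.get_zoo_dataset(\"coco-2017\")\n",
    "    print(zoo_dataset.name)\n",
    "    print(zoo_dataset.tags)\n",
    "    print(zoo_dataset.supported_splits)\n",
    "except Exception as e:\n",
    "    print(f\"Could not look up coco-2017 in the zoo: {e}\")"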
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example: Loading a Remote Image Dataset\n",
    "\n",
    "FiftyOne can also load, and lets you create, zoo datasets whose download and preparation methods are hosted in GitHub repositories or at public URLs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Downloading https://github.com/voxel51/coco-2017...\n",
      "   33.7Kb [1.4ms elapsed, ? remaining, 350.4Mb/s]  \n",
      "Downloading split 'validation' to '/home/paula/fiftyone/voxel51/coco-2017/validation' if necessary\n",
      "Downloading annotations to '/home/paula/fiftyone/voxel51/coco-2017/tmp-download/annotations_trainval2017.zip'\n",
      " 100% |██████|    1.9Gb/1.9Gb [1.4m elapsed, 0s remaining, 20.7Mb/s]       \n",
      "Extracting annotations to '/home/paula/fiftyone/voxel51/coco-2017/raw/instances_val2017.json'\n",
      "Downloading images to '/home/paula/fiftyone/voxel51/coco-2017/tmp-download/val2017.zip'\n",
      "  20% |█-----|    1.2Gb/6.1Gb [1.0m elapsed, 4.0m remaining, 22.8Mb/s]    "
     ]
    }
   ],
   "source": [
    "dataset = foz.load_zoo_dataset(\n",
    "    \"https://github.com/voxel51/coco-2017\",\n",
    "    split=\"validation\",\n",
    ")\n",
    "\n",
    "session = fo.launch_app(dataset, port=5152, auto=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Other Loading Examples with Remote Datasets"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load 50 random samples from the validation split.\n",
    "\n",
    "Only the required images are downloaded, and only if they are not already on disk. By default, only detections are loaded."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = foz.load_zoo_dataset(\n",
    "    \"https://github.com/voxel51/coco-2017\",\n",
    "    split=\"validation\",\n",
    "    max_samples=50,\n",
    "    shuffle=True,\n",
    ")\n",
    "\n",
    "session = fo.launch_app(dataset, port=5152, auto=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load segmentations for 25 samples from the validation split that contain cats and dogs.\n",
    "\n",
    "Images that contain all `classes` are prioritized, followed by images that contain at least one of the required `classes`. If the split has too few images matching `classes` to meet `max_samples`, only the available images are loaded. Images are downloaded only if necessary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = foz.load_zoo_dataset(\n",
    "    \"https://github.com/voxel51/coco-2017\",\n",
    "    split=\"validation\",\n",
    "    label_types=[\"segmentations\"],\n",
    "    classes=[\"cat\", \"dog\"],\n",
    "    max_samples=25,\n",
    ")\n",
    "\n",
    "session = fo.launch_app(dataset, port=5152, auto=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Download the entire validation split and load both detections and segmentations.\n",
    "\n",
    "Subsequent partial loads of the validation split will not need to download any images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = foz.load_zoo_dataset(\n",
    "    \"https://github.com/voxel51/coco-2017\",\n",
    "    split=\"validation\",\n",
    "    label_types=[\"detections\", \"segmentations\"],\n",
    ")\n",
    "\n",
    "session = fo.launch_app(dataset, port=5152, auto=False)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "env",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}