{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 2: Analyzing with Model Evaluation Panel\n", "\n", "In our last step we showed some basic ways on how to evaluate models. In this step, we will show how to take it even further with the [Model Eval](https://docs.voxel51.com/plugins/api/plugins.panels.model_evaluation.html) Panel in the app. With the Model Eval Panel you can:\n", "\n", "- See all evaluation runs on a dataset\n", "- View summary statistics of each run\n", "- Filter dataset based on FP, TP, and more\n", "- Analyze class-wise evaluation metrics and filter based on them\n", "- View confusion matrices and histograms of evaluation results\n", "\n", "Let's hop into an example to see how to get started!" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Dataset already downloaded\n", "Loading existing dataset 'quickstart'. To reload from disk, either delete the existing dataset or provide a custom `dataset_name` to use\n", "Name: quickstart\n", "Media type: image\n", "Num samples: 200\n", "Persistent: False\n", "Tags: []\n", "Sample fields:\n", " id: fiftyone.core.fields.ObjectIdField\n", " filepath: fiftyone.core.fields.StringField\n", " tags: fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)\n", " metadata: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)\n", " created_at: fiftyone.core.fields.DateTimeField\n", " last_modified_at: fiftyone.core.fields.DateTimeField\n", " ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)\n", " uniqueness: fiftyone.core.fields.FloatField\n", " predictions: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)\n", " eval_tp: fiftyone.core.fields.IntField\n", " eval_fp: fiftyone.core.fields.IntField\n", " eval_fn: fiftyone.core.fields.IntField\n", " eval_high_conf_tp: fiftyone.core.fields.IntField\n", " eval_high_conf_fp: fiftyone.core.fields.IntField\n", " eval_high_conf_fn: fiftyone.core.fields.IntField\n" ] } ], "source": [ "import fiftyone as fo\n", "import fiftyone.zoo as foz\n", "\n", "dataset = foz.load_zoo_dataset(\"quickstart\")\n", "\n", "# View summary info about the dataset\n", "print(dataset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's quickly rerun evaluation in case we do not have it from the previous step:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Evaluating detections...\n", " 100% |█████████████████| 200/200 [7.2s elapsed, 0s remaining, 18.5 samples/s] \n", "Performing IoU sweep...\n", " 100% |█████████████████| 200/200 [2.3s elapsed, 0s remaining, 74.4 samples/s] \n" ] } ], "source": [ "results = dataset.evaluate_detections(\n", " \"predictions\",\n", " gt_field=\"ground_truth\",\n", " eval_key=\"eval\",\n", " compute_mAP=True,\n", ")" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Evaluating detections...\n", " 100% |█████████████████| 200/200 [1.3s elapsed, 0s remaining, 127.8 samples/s] \n", "Performing IoU sweep...\n", " 100% |█████████████████| 200/200 [924.4ms elapsed, 0s remaining, 216.4 samples/s] \n" ] } ], "source": [ "from fiftyone import ViewField as F\n", "\n", "# Only contains detections with confidence >= 0.75\n", "high_conf_view = dataset.filter_labels(\"predictions\", F(\"confidence\") > 0.75, 
{ "cell_type": "markdown", "metadata": {}, "source": [ "We can also load the exact view on which an evaluation was performed via `load_evaluation_view()`:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Name: quickstart\n", "Media type: image\n", "Num samples: 200\n", "Persistent: False\n", "Tags: []\n", "Sample fields:\n", " id: fiftyone.core.fields.ObjectIdField\n", " filepath: fiftyone.core.fields.StringField\n", " tags: fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)\n", " metadata: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)\n", " created_at: fiftyone.core.fields.DateTimeField\n", " last_modified_at: fiftyone.core.fields.DateTimeField\n", " ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)\n", " uniqueness: fiftyone.core.fields.FloatField\n", " predictions: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)\n", " eval_tp: fiftyone.core.fields.IntField\n", " eval_fp: fiftyone.core.fields.IntField\n", " eval_fn: fiftyone.core.fields.IntField\n", " eval_high_conf_tp: fiftyone.core.fields.IntField\n", " eval_high_conf_fp: fiftyone.core.fields.IntField\n", " eval_high_conf_fn: fiftyone.core.fields.IntField" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dataset.load_evaluation_view(\"eval\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can open up our dataset and view our results in the Model Eval Panel! I recommend opening the App in the browser at `http://localhost:5151/` for the best experience!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "session = fo.launch_app(dataset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![model_evaluation](https://cdn.voxel51.com/getting_started_model_evaluation/notebook2/model_evaluation.webp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Explore the Model Evaluation Panel\n", "\n", "Now that you have the Model Evaluation Panel open, you can explore all the powerful features it offers:\n", "\n", "- **Browse evaluation runs**: Switch between different evaluation runs (like `eval` and `eval_high_conf`) to compare model performance\n", "- **Analyze metrics**: View precision, recall, F1-score, and mAP metrics for your models\n", "- **Class-wise analysis**: Examine performance metrics for individual classes to identify which objects your model struggles with\n", "- **Confusion matrices**: Visualize classification errors and understand common misclassifications\n", "- **Interactive filtering**: Filter your dataset to show only true positives, false positives, false negatives, or specific confidence ranges (see the sketch after this list)\n", "- **Histogram analysis**: Explore distributions of confidence scores, IoU values, and other evaluation metrics\n", "- **Sample-level insights**: Click on specific samples to understand why certain predictions were classified as TP, FP, or FN\n", "\n", "Take some time to explore these features and gain deeper insights into your model's performance. 
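\n", "\n", "For example, the panel's TP/FP filters correspond to ordinary FiftyOne view stages that you can also build in code. A minimal sketch, assuming the `eval` run created above and the `session` launched earlier:\n", "\n", "```python\n", "from fiftyone import ViewField as F\n", "\n", "# Show only samples with at least one false positive from the `eval` run\n", "session.view = dataset.match(F(\"eval_fp\") > 0)\n", "```\n", "\n", "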
The Model Evaluation Panel makes it easy to identify patterns, debug model issues, and make data-driven decisions for model improvement!\n" ] } ], "metadata": { "kernelspec": { "display_name": "env", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.23" } }, "nbformat": 4, "nbformat_minor": 2 }