{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Exploratory Data Analysis and Visualization" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal of __exploratory data analysis (EDA)__ is to explore attributes across multiple entities to decide what statistical or machine learning techniques to apply to the data. Visualizations are used to assist in understanding the data." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: pandas in /opt/conda/lib/python3.8/site-packages (1.2.2)\n", "Requirement already satisfied: python-dateutil>=2.7.3 in /opt/conda/lib/python3.8/site-packages (from pandas) (2.8.1)\n", "Requirement already satisfied: pytz>=2017.3 in /opt/conda/lib/python3.8/site-packages (from pandas) (2021.1)\n", "Requirement already satisfied: numpy>=1.16.5 in /opt/conda/lib/python3.8/site-packages (from pandas) (1.19.5)\n", "Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.8/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)\n" ] }, { "data": { "text/html": [ "
|  | Unnamed: 0 | Unnamed: 0.1 | Form | State | Security_Grade | Area_Number | Terrain_Description | Favorable_Influences | Detrimental_Influences | INHABITANTS_Type | ... | max_annual_income | terrain_rolling | white_collar | mixture_or_jewish | professional | business_or_executive | laborer | clerks | mechanics | industrial |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | NS FORM-8 6-1-37 | Maryland | A | 1 | undulating | Very nicely planned residential area of medium... | No | executives professional men | ... | 5000.0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
| 1 | 1 | 0 | NS FORM-8 6-1-37 | Maryland | A | 2 | rolling | Fairly new suburban area of homogeneous charac... | No | substantial middle class | ... | 5000.0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 2 | 2 | NS FORM-8 6-1-37 | Maryland | A | 3 | rolling | Good residential area. Well planned. | Distance to City | executives professional men | ... | 7000.0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |

3 rows × 43 columns
\n", "