{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Let's Clean Some Data\n", "\n", "FiveThirtyEight publishes some great data sets, and this Star Wars survey is one of them. Some light cleaning should make it more usable!" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "scrolled": false }, "outputs": [], "source": [ "# import libraries\n", "import pandas as pd\n", "import numpy as np\n", "\n", "# load the survey responses; the CSV is encoded as ISO-8859-1 (Latin-1), not UTF-8\n", "star_wars = pd.read_csv(\"star_wars.csv\", encoding=\"ISO-8859-1\")" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | RespondentID | Have you seen any of the 6 films in the Star Wars franchise? | Do you consider yourself to be a fan of the Star Wars film franchise? | Which of the following Star Wars films have you seen? Please select all that apply. | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Please rank the Star Wars films in order of preference with 1 being your favorite film in the franchise and 6 being your least favorite film. | ... | Unnamed: 28 | Which character shot first? | Are you familiar with the Expanded Universe? | Do you consider yourself to be a fan of the Expanded Universe? | Do you consider yourself to be a fan of the Star Trek franchise? | Gender | Age | Household Income | Education | Location (Census Region) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3292879998 | Yes | Yes | Star Wars: Episode I The Phantom Menace | Star Wars: Episode II Attack of the Clones | Star Wars: Episode III Revenge of the Sith | Star Wars: Episode IV A New Hope | Star Wars: Episode V The Empire Strikes Back | Star Wars: Episode VI Return of the Jedi | 3.0 | ... | Very favorably | I don't understand this question | Yes | No | No | Male | 18-29 | NaN | High school degree | South Atlantic |
| 1 | 3292879538 | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | Yes | Male | 18-29 | $0 - $24,999 | Bachelor degree | West South Central |
| 2 | 3292765271 | Yes | No | Star Wars: Episode I The Phantom Menace | Star Wars: Episode II Attack of the Clones | Star Wars: Episode III Revenge of the Sith | NaN | NaN | NaN | 1.0 | ... | Unfamiliar (N/A) | I don't understand this question | No | NaN | No | Male | 18-29 | $0 - $24,999 | High school degree | West North Central |
| 3 | 3292763116 | Yes | Yes | Star Wars: Episode I The Phantom Menace | Star Wars: Episode II Attack of the Clones | Star Wars: Episode III Revenge of the Sith | Star Wars: Episode IV A New Hope | Star Wars: Episode V The Empire Strikes Back | Star Wars: Episode VI Return of the Jedi | 5.0 | ... | Very favorably | I don't understand this question | No | NaN | Yes | Male | 18-29 | $100,000 - $149,999 | Some college or Associate degree | West North Central |
| 4 | 3292731220 | Yes | Yes | Star Wars: Episode I The Phantom Menace | Star Wars: Episode II Attack of the Clones | Star Wars: Episode III Revenge of the Sith | Star Wars: Episode IV A New Hope | Star Wars: Episode V The Empire Strikes Back | Star Wars: Episode VI Return of the Jedi | 5.0 | ... | Somewhat favorably | Greedo | Yes | No | No | Male | 18-29 | $100,000 - $149,999 | Some college or Associate degree | West North Central |

5 rows × 38 columns
\n", "