{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "

Analyzing Star Wars Survey

\n", "

Studying people's perception of the Star Wars franchise

\n", "\n", "*Star Wars* is an epic space-opera media franchise. It quickly became a worldwide pop-culture phenomenon. There have been a total of 9 movies, called episodes, since its first release in 1977.
\n", "While waiting for *Star Wars: The Force Awakens* to come out, the team at *FiveThirtyEight* became interested in answering some questions about *Star Wars* fans. In particular, they wondered: does the rest of America realize that *\"The Empire Strikes Back\"* is clearly the best of the bunch?
\n", "They surveyed *Star Wars* fans using the online tool SurveyMonkey. They received 835 total responses.\n", "\n", "The aim of this project is to clean and analyze this survey to answer the following questions:\n", "\n", "* How many respondents like the *Star Wars* franchise?\n", "* Which *Star Wars* film is most popular among the fans?\n", "* How many of the respondents are super fans?\n", "* How many of the respondents like space-opera media franchises (Star Wars and Star Trek)?\n", "* Which characters do the fans view favorably and unfavorably?\n", "* Which character is controversial, split between likes and dislikes?\n", "\n", "These questions give an insight into the perception of the respondents and are key to finding the popular movies and characters of the franchise.
\n", "The analysis to answer the above questions is split into 5 parts:\n", "\n", " Analyzing Star Wars film franchise fans on a granular level.\n", " Finding the most viewed and most popular movie of the Star Wars franchise.\n", " Analyzing super fans of the franchise on a granular level.\n", " Perceptions of characters from the Star Wars franchise.\n", " Analyzing Space-Opera media franchises (Star Wars and Star Trek) fans on a granular level.\n", "\n", "The dataset has been picked up from this Link
\n", "A few columns are described below:-\n", "\n", " * RespondentID - An anonymized ID for the respondent (person taking the survey)\n", " * Gender - The respondent's gender\n", " * Age - The respondent's age\n", " * Household Income - The respondent's income\n", " * Education - The respondent's education level\n", " * Location (Census Region) - The respondent's location\n", " * Have you seen any of the 6 films in the Star Wars franchise? - Has a Yes or No response\n", " * Do you consider yourself to be a fan of the Star Wars film franchise? - Has a Yes or No response\n", "\n", "There are several more columns in the dataset containing answers to questions about the *Star Wars* franchise." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "from pywaffle.waffle import Waffle\n", "import plotly as py\n", "import plotly.graph_objs as go\n", "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
RespondentIDHave you seen any of the 6 films in the Star Wars franchise?Do you consider yourself to be a fan of the Star Wars film franchise?Which of the following Star Wars films have you seen? Please select all that apply.Unnamed: 4Unnamed: 5Unnamed: 6Unnamed: 7Unnamed: 8Please rank the Star Wars films in order of preference with 1 being your favorite film in the franchise and 6 being your least favorite film....Unnamed: 28Which character shot first?Are you familiar with the Expanded Universe?Do you consider yourself to be a fan of the Expanded Universe?ξDo you consider yourself to be a fan of the Star Trek franchise?GenderAgeHousehold IncomeEducationLocation (Census Region)
0NaNResponseResponseStar Wars: Episode I The Phantom MenaceStar Wars: Episode II Attack of the ClonesStar Wars: Episode III Revenge of the SithStar Wars: Episode IV A New HopeStar Wars: Episode V The Empire Strikes BackStar Wars: Episode VI Return of the JediStar Wars: Episode I The Phantom Menace...YodaResponseResponseResponseResponseResponseResponseResponseResponseResponse
13.292880e+09YesYesStar Wars: Episode I The Phantom MenaceStar Wars: Episode II Attack of the ClonesStar Wars: Episode III Revenge of the SithStar Wars: Episode IV A New HopeStar Wars: Episode V The Empire Strikes BackStar Wars: Episode VI Return of the Jedi3...Very favorablyI don't understand this questionYesNoNoMale18-29NaNHigh school degreeSouth Atlantic
23.292880e+09NoNaNNaNNaNNaNNaNNaNNaNNaN...NaNNaNNaNNaNYesMale18-29$0 - $24,999Bachelor degreeWest South Central
33.292765e+09YesNoStar Wars: Episode I The Phantom MenaceStar Wars: Episode II Attack of the ClonesStar Wars: Episode III Revenge of the SithNaNNaNNaN1...Unfamiliar (N/A)I don't understand this questionNoNaNNoMale18-29$0 - $24,999High school degreeWest North Central
43.292763e+09YesYesStar Wars: Episode I The Phantom MenaceStar Wars: Episode II Attack of the ClonesStar Wars: Episode III Revenge of the SithStar Wars: Episode IV A New HopeStar Wars: Episode V The Empire Strikes BackStar Wars: Episode VI Return of the Jedi5...Very favorablyI don't understand this questionNoNaNYesMale18-29$100,000 - $149,999Some college or Associate degreeWest North Central
\n", "

5 rows × 38 columns

\n", "
" ], "text/plain": [ " RespondentID Have you seen any of the 6 films in the Star Wars franchise? \\\n", "0 NaN Response \n", "1 3.292880e+09 Yes \n", "2 3.292880e+09 No \n", "3 3.292765e+09 Yes \n", "4 3.292763e+09 Yes \n", "\n", " Do you consider yourself to be a fan of the Star Wars film franchise? \\\n", "0 Response \n", "1 Yes \n", "2 NaN \n", "3 No \n", "4 Yes \n", "\n", " Which of the following Star Wars films have you seen? Please select all that apply. \\\n", "0 Star Wars: Episode I The Phantom Menace \n", "1 Star Wars: Episode I The Phantom Menace \n", "2 NaN \n", "3 Star Wars: Episode I The Phantom Menace \n", "4 Star Wars: Episode I The Phantom Menace \n", "\n", " Unnamed: 4 \\\n", "0 Star Wars: Episode II Attack of the Clones \n", "1 Star Wars: Episode II Attack of the Clones \n", "2 NaN \n", "3 Star Wars: Episode II Attack of the Clones \n", "4 Star Wars: Episode II Attack of the Clones \n", "\n", " Unnamed: 5 \\\n", "0 Star Wars: Episode III Revenge of the Sith \n", "1 Star Wars: Episode III Revenge of the Sith \n", "2 NaN \n", "3 Star Wars: Episode III Revenge of the Sith \n", "4 Star Wars: Episode III Revenge of the Sith \n", "\n", " Unnamed: 6 \\\n", "0 Star Wars: Episode IV A New Hope \n", "1 Star Wars: Episode IV A New Hope \n", "2 NaN \n", "3 NaN \n", "4 Star Wars: Episode IV A New Hope \n", "\n", " Unnamed: 7 \\\n", "0 Star Wars: Episode V The Empire Strikes Back \n", "1 Star Wars: Episode V The Empire Strikes Back \n", "2 NaN \n", "3 NaN \n", "4 Star Wars: Episode V The Empire Strikes Back \n", "\n", " Unnamed: 8 \\\n", "0 Star Wars: Episode VI Return of the Jedi \n", "1 Star Wars: Episode VI Return of the Jedi \n", "2 NaN \n", "3 NaN \n", "4 Star Wars: Episode VI Return of the Jedi \n", "\n", " Please rank the Star Wars films in order of preference with 1 being your favorite film in the franchise and 6 being your least favorite film. \\\n", "0 Star Wars: Episode I The Phantom Menace \n", "1 3 \n", "2 NaN \n", "3 1 \n", "4 5 \n", "\n", " ... 
Unnamed: 28 Which character shot first? \\\n", "0 ... Yoda Response \n", "1 ... Very favorably I don't understand this question \n", "2 ... NaN NaN \n", "3 ... Unfamiliar (N/A) I don't understand this question \n", "4 ... Very favorably I don't understand this question \n", "\n", " Are you familiar with the Expanded Universe? \\\n", "0 Response \n", "1 Yes \n", "2 NaN \n", "3 No \n", "4 No \n", "\n", " Do you consider yourself to be a fan of the Expanded Universe?ξ \\\n", "0 Response \n", "1 No \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", " Do you consider yourself to be a fan of the Star Trek franchise? Gender \\\n", "0 Response Response \n", "1 No Male \n", "2 Yes Male \n", "3 No Male \n", "4 Yes Male \n", "\n", " Age Household Income Education \\\n", "0 Response Response Response \n", "1 18-29 NaN High school degree \n", "2 18-29 $0 - $24,999 Bachelor degree \n", "3 18-29 $0 - $24,999 High school degree \n", "4 18-29 $100,000 - $149,999 Some college or Associate degree \n", "\n", " Location (Census Region) \n", "0 Response \n", "1 South Atlantic \n", "2 West South Central \n", "3 West North Central \n", "4 West North Central \n", "\n", "[5 rows x 38 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "survey = pd.read_csv(\"star_wars.csv\", encoding=\"ISO-8859-1\")\n", "survey.head(5)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['RespondentID',\n", " 'Have you seen any of the 6 films in the Star Wars franchise?',\n", " 'Do you consider yourself to be a fan of the Star Wars film franchise?',\n", " 'Which of the following Star Wars films have you seen? 
Please select all that apply.',\n", " 'Unnamed: 4', 'Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8',\n", " 'Please rank the Star Wars films in order of preference with 1 being your favorite film in the franchise and 6 being your least favorite film.',\n", " 'Unnamed: 10', 'Unnamed: 11', 'Unnamed: 12', 'Unnamed: 13',\n", " 'Unnamed: 14',\n", " 'Please state whether you view the following characters favorably, unfavorably, or are unfamiliar with him/her.',\n", " 'Unnamed: 16', 'Unnamed: 17', 'Unnamed: 18', 'Unnamed: 19',\n", " 'Unnamed: 20', 'Unnamed: 21', 'Unnamed: 22', 'Unnamed: 23',\n", " 'Unnamed: 24', 'Unnamed: 25', 'Unnamed: 26', 'Unnamed: 27',\n", " 'Unnamed: 28', 'Which character shot first?',\n", " 'Are you familiar with the Expanded Universe?',\n", " 'Do you consider yourself to be a fan of the Expanded Universe?',\n", " 'Do you consider yourself to be a fan of the Star Trek franchise?',\n", " 'Gender', 'Age', 'Household Income', 'Education', 'Location'],\n", " dtype='object')" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cols = ['RespondentID',\n", " 'Have you seen any of the 6 films in the Star Wars franchise?',\n", " 'Do you consider yourself to be a fan of the Star Wars film franchise?',\n", " 'Which of the following Star Wars films have you seen? 
Please select all that apply.',\n", " 'Unnamed: 4', 'Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8',\n", " 'Please rank the Star Wars films in order of preference with 1 being your favorite film in the franchise and 6 being your least favorite film.',\n", " 'Unnamed: 10', 'Unnamed: 11', 'Unnamed: 12', 'Unnamed: 13',\n", " 'Unnamed: 14',\n", " 'Please state whether you view the following characters favorably, unfavorably, or are unfamiliar with him/her.',\n", " 'Unnamed: 16', 'Unnamed: 17', 'Unnamed: 18', 'Unnamed: 19',\n", " 'Unnamed: 20', 'Unnamed: 21', 'Unnamed: 22', 'Unnamed: 23',\n", " 'Unnamed: 24', 'Unnamed: 25', 'Unnamed: 26', 'Unnamed: 27',\n", " 'Unnamed: 28', 'Which character shot first?',\n", " 'Are you familiar with the Expanded Universe?',\n", " 'Do you consider yourself to be a fan of the Expanded Universe?',\n", " 'Do you consider yourself to be a fan of the Star Trek franchise?',\n", " 'Gender', 'Age', 'Household Income', 'Education',\n", " 'Location']\n", "\n", "survey.columns = cols\n", "survey.columns" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "RespondentID 1\n", "Have you seen any of the 6 films in the Star Wars franchise? 0\n", "Do you consider yourself to be a fan of the Star Wars film franchise? 350\n", "Which of the following Star Wars films have you seen? Please select all that apply. 513\n", "Unnamed: 4 615\n", "Unnamed: 5 636\n", "Unnamed: 6 579\n", "Unnamed: 7 428\n", "Unnamed: 8 448\n", "Please rank the Star Wars films in order of preference with 1 being your favorite film in the franchise and 6 being your least favorite film. 351\n", "Unnamed: 10 350\n", "Unnamed: 11 351\n", "Unnamed: 12 350\n", "Unnamed: 13 350\n", "Unnamed: 14 350\n", "Please state whether you view the following characters favorably, unfavorably, or are unfamiliar with him/her. 
357\n", "Unnamed: 16 355\n", "Unnamed: 17 355\n", "Unnamed: 18 363\n", "Unnamed: 19 361\n", "Unnamed: 20 372\n", "Unnamed: 21 360\n", "Unnamed: 22 366\n", "Unnamed: 23 374\n", "Unnamed: 24 359\n", "Unnamed: 25 356\n", "Unnamed: 26 365\n", "Unnamed: 27 372\n", "Unnamed: 28 360\n", "Which character shot first? 358\n", "Are you familiar with the Expanded Universe? 358\n", "Do you consider yourself to be a fan of the Expanded Universe? 973\n", "Do you consider yourself to be a fan of the Star Trek franchise? 118\n", "Gender 140\n", "Age 140\n", "Household Income 328\n", "Education 150\n", "Location 143\n", "dtype: int64" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "survey.isna().sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The *RespondentID* column contains one invalid row where the *RespondentID* is *NaN*. This row is not a response: it holds the options presented to the respondent for questions with checkboxes, so it is removed from the dataset." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Copy so that later column assignments work on an independent frame rather than a slice of survey\n", "df = survey.dropna(axis=0, subset=['RespondentID']).copy()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The two columns *Have you seen any of the 6 films in the Star Wars franchise?* and *Do you consider yourself to be a fan of the Star Wars film franchise?* hold the answers to those questions. They are important for this analysis, which focuses on the people who have seen the movies and/or are fans of the franchise." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
Have you seen any of the 6 films in the Star Wars franchise?Do you consider yourself to be a fan of the Star Wars film franchise?
1YesYes
2NoNaN
3YesNo
4YesYes
5YesYes
6YesYes
7YesYes
8YesYes
9YesYes
10YesNo
11YesNaN
12NoNaN
13YesNo
14YesYes
15YesYes
\n", "
" ], "text/plain": [ " Have you seen any of the 6 films in the Star Wars franchise? \\\n", "1 Yes \n", "2 No \n", "3 Yes \n", "4 Yes \n", "5 Yes \n", "6 Yes \n", "7 Yes \n", "8 Yes \n", "9 Yes \n", "10 Yes \n", "11 Yes \n", "12 No \n", "13 Yes \n", "14 Yes \n", "15 Yes \n", "\n", " Do you consider yourself to be a fan of the Star Wars film franchise? \n", "1 Yes \n", "2 NaN \n", "3 No \n", "4 Yes \n", "5 Yes \n", "6 Yes \n", "7 Yes \n", "8 Yes \n", "9 Yes \n", "10 No \n", "11 NaN \n", "12 NaN \n", "13 No \n", "14 Yes \n", "15 Yes " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[[\n", " 'Have you seen any of the 6 films in the Star Wars franchise?',\n", " 'Do you consider yourself to be a fan of the Star Wars film franchise?'\n", "]].head(15)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The two columns contain either Yes or No values, with some missing values in between. For ease of usage throughout the analysis, these values are mapped to boolean.
\n", "\n", " 'Yes' - True\n", " 'No' - False\n", " \n", "The column names have also been changed to:\n", "\n", " Have you seen any of the 6 films in the Star Wars franchise? - \n", " seen_any\n", " Do you consider yourself to be a fan of the Star Wars film franchise? - is_fan" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "mappings = {\n", " 'Yes':True,\n", " 'No':False\n", "}\n", "\n", "df[\n", " 'Have you seen any of the 6 films in the Star Wars franchise?'\n", "] = df[\n", " 'Have you seen any of the 6 films in the Star Wars franchise?'\n", "].map(mappings)\n", "\n", "df[\n", " 'Do you consider yourself to be a fan of the Star Wars film franchise?'\n", "] = df[\n", " 'Do you consider yourself to be a fan of the Star Wars film franchise?'\n", "].map(mappings)\n", "\n", "df.rename(columns = {\n", " 'Have you seen any of the 6 films in the Star Wars franchise?' : 'seen_any',\n", " 'Do you consider yourself to be a fan of the Star Wars film franchise?' : 'is_fan' \n", "}, inplace=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next 6 columns, from *Which of the following Star Wars films have you seen? Please select all that apply.* to *Unnamed: 8*, hold the answers to the question *Which of the following Star Wars films have you seen? Please select all that apply.* The respondent checked off a series of boxes as a response.
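
As a minimal sketch of how checkbox answers of this kind can be turned into booleans: a ticked box holds the film title and an unticked box is missing, so `notna()` gives a seen / not-seen flag directly. The toy frame below, including its column names and values, is made up for illustration and is not the survey data.

```python
import pandas as pd

# Toy stand-in for two checkbox columns; names and values are illustrative only.
toy = pd.DataFrame({
    'Ep_1': ['Star Wars: Episode I The Phantom Menace', None, None],
    'Ep_2': ['Star Wars: Episode II Attack of the Clones',
             'Star Wars: Episode II Attack of the Clones',
             None],
})

# A filled cell means the box was ticked; a missing cell means it was not.
seen = toy.notna()

# Number of toy respondents who ticked each film.
seen_counts = seen.sum()
print(seen_counts)
```

Summing a boolean column counts its `True` values, which is why the cleaned columns are convenient for finding the most viewed film.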
\n", "\n", "Since the aim of the survey, and eventually of the analysis, is to identify which *Star Wars* film the public likes the most, it is imperative that these columns be cleaned. Each of the six columns represents one movie, from *Star Wars: Episode I The Phantom Menace* to *Star Wars: Episode VI Return of the Jedi*. A column holds `NaN` if the respondent either hasn't watched the movie or hasn't answered. Treating `NaN` as `False` and any text as `True` makes the columns much easier to work with. The columns are also renamed to *Ep_1* through *Ep_6*." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
Ep_1Ep_2Ep_3Ep_4Ep_5Ep_6
1TrueTrueTrueTrueTrueTrue
2FalseFalseFalseFalseFalseFalse
3TrueTrueTrueFalseFalseFalse
4TrueTrueTrueTrueTrueTrue
5TrueTrueTrueTrueTrueTrue
6TrueTrueTrueTrueTrueTrue
7TrueTrueTrueTrueTrueTrue
8TrueTrueTrueTrueTrueTrue
9TrueTrueTrueTrueTrueTrue
10FalseTrueFalseFalseFalseFalse
\n", "
" ], "text/plain": [ " Ep_1 Ep_2 Ep_3 Ep_4 Ep_5 Ep_6\n", "1 True True True True True True\n", "2 False False False False False False\n", "3 True True True False False False\n", "4 True True True True True True\n", "5 True True True True True True\n", "6 True True True True True True\n", "7 True True True True True True\n", "8 True True True True True True\n", "9 True True True True True True\n", "10 False True False False False False" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cols = df.iloc[:,3:9].columns\n", "\n", "for col in cols:\n", " # A ticked box holds the film title; an unticked box is NaN\n", " df[col] = df[col].notna()\n", "\n", "df.rename(columns={\n", " 'Which of the following Star Wars films have you seen? Please select all that apply.':'Ep_1',\n", " \"Unnamed: 4\":'Ep_2',\n", " \"Unnamed: 5\":'Ep_3',\n", " \"Unnamed: 6\":'Ep_4',\n", " \"Unnamed: 7\":'Ep_5',\n", " \"Unnamed: 8\":'Ep_6'\n", "}, inplace=True)\n", "\n", "df.iloc[:,3:9].head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The columns *Please rank the Star Wars films in order of preference with 1 being your favorite film in the franchise and 6 being your least favorite film.* to *Unnamed: 14* ask the respondent to rank the movies from 1 to 6, with rank *1* being the most favorite and rank *6* the least favorite.
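
A hedged sketch of how ranking columns like these could be summarised, assuming the ranks may arrive as text: `pd.to_numeric` is one way to convert them, and a lower mean rank then points to a better-liked film. The toy values below are invented for illustration and are not taken from the survey.

```python
import pandas as pd

# Toy rankings stored as strings; column names mirror the notebook's naming, values are invented.
ranks = pd.DataFrame({
    'Ep_1_rank': ['3', '1', None],
    'Ep_5_rank': ['1', '2', '1'],
})

# Convert every column to a numeric dtype; missing answers become NaN.
ranks = ranks.apply(pd.to_numeric)

# Mean rank per film (NaN is skipped); the smallest value marks the best-liked film.
mean_rank = ranks.mean()
best = mean_rank.idxmin()
print(mean_rank)
print(best)
```

Because rank 1 is the favorite, the film with the smallest mean rank is the one to look for, not the largest.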
\n", "Because the first row of the file held the film titles, these columns were read as text. They are converted to float, and the column names are changed for ease of analysis." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "rank_cols = df.columns[9:15]\n", "# Assign by column label so the float dtype is kept (an iloc assignment may retain the object dtype)\n", "df[rank_cols] = df[rank_cols].astype('float')\n", "\n", "df.rename(columns= {\n", " 'Please rank the Star Wars films in order of preference with 1 being your favorite film in the franchise and 6 being your least favorite film.' : 'Ep_1_rank',\n", " 'Unnamed: 10' : 'Ep_2_rank',\n", " 'Unnamed: 11' : 'Ep_3_rank',\n", " 'Unnamed: 12' : 'Ep_4_rank',\n", " 'Unnamed: 13' : 'Ep_5_rank',\n", " 'Unnamed: 14' : 'Ep_6_rank'\n", "}, inplace = True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The columns *Please state whether you view the following characters favorably, unfavorably, or are unfamiliar with him/her.* to *Unnamed: 28* hold the respondents' opinions of individual characters, and the column *Which character shot first?* follows them. These columns are not needed for the analysis at hand, so they are removed from the dataset for now." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
RespondentIDseen_anyis_fanEp_1Ep_2Ep_3Ep_4Ep_5Ep_6Ep_1_rank...Ep_5_rankEp_6_rankAre you familiar with the Expanded Universe?Do you consider yourself to be a fan of the Expanded Universe?Do you consider yourself to be a fan of the Star Trek franchise?GenderAgeHousehold IncomeEducationLocation
13.292880e+09TrueTrueTrueTrueTrueTrueTrueTrue3.0...5.06.0YesNoNoMale18-29NaNHigh school degreeSouth Atlantic
23.292880e+09FalseNaNFalseFalseFalseFalseFalseFalseNaN...NaNNaNNaNNaNYesMale18-29$0 - $24,999Bachelor degreeWest South Central
33.292765e+09TrueFalseTrueTrueTrueFalseFalseFalse1.0...5.06.0NoNaNNoMale18-29$0 - $24,999High school degreeWest North Central
43.292763e+09TrueTrueTrueTrueTrueTrueTrueTrue5.0...4.03.0NoNaNYesMale18-29$100,000 - $149,999Some college or Associate degreeWest North Central
53.292731e+09TrueTrueTrueTrueTrueTrueTrueTrue5.0...1.03.0YesNoNoMale18-29$100,000 - $149,999Some college or Associate degreeWest North Central
\n", "

5 rows × 23 columns

\n", "
" ], "text/plain": [ " RespondentID seen_any is_fan Ep_1 Ep_2 Ep_3 Ep_4 Ep_5 Ep_6 \\\n", "1 3.292880e+09 True True True True True True True True \n", "2 3.292880e+09 False NaN False False False False False False \n", "3 3.292765e+09 True False True True True False False False \n", "4 3.292763e+09 True True True True True True True True \n", "5 3.292731e+09 True True True True True True True True \n", "\n", " Ep_1_rank ... Ep_5_rank Ep_6_rank \\\n", "1 3.0 ... 5.0 6.0 \n", "2 NaN ... NaN NaN \n", "3 1.0 ... 5.0 6.0 \n", "4 5.0 ... 4.0 3.0 \n", "5 5.0 ... 1.0 3.0 \n", "\n", " Are you familiar with the Expanded Universe? \\\n", "1 Yes \n", "2 NaN \n", "3 No \n", "4 No \n", "5 Yes \n", "\n", " Do you consider yourself to be a fan of the Expanded Universe? \\\n", "1 No \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "5 No \n", "\n", " Do you consider yourself to be a fan of the Star Trek franchise? Gender \\\n", "1 No Male \n", "2 Yes Male \n", "3 No Male \n", "4 Yes Male \n", "5 No Male \n", "\n", " Age Household Income Education \\\n", "1 18-29 NaN High school degree \n", "2 18-29 $0 - $24,999 Bachelor degree \n", "3 18-29 $0 - $24,999 High school degree \n", "4 18-29 $100,000 - $149,999 Some college or Associate degree \n", "5 18-29 $100,000 - $149,999 Some college or Associate degree \n", "\n", " Location \n", "1 South Atlantic \n", "2 West South Central \n", "3 West North Central \n", "4 West North Central \n", "5 West North Central \n", "\n", "[5 rows x 23 columns]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rm_cols = df.iloc[:,15:30].columns\n", "df.drop(rm_cols, axis=1, inplace=True)\n", "df.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All the non-film material produced, such as novels, comic books, TV series and other supporting films, is referred to as *The Star Wars Expanded Universe*, which was later rebranded to *Star Wars Legends*. 
There are two columns in the dataset that touch on this topic: \n", "*Are you familiar with the Expanded Universe?* and *Do you consider yourself to be a fan of the Expanded Universe?*.
\n", "These columns are preserved for now, for further analysis. Similar to the approach for the first two columns, the answers are mapped to Boolean :-\n", "\n", " 'Yes' - True\n", " 'No' - False\n", "\n", "and column names are changed for ease of access to :-\n", "\n", " Are you familiar with the Expanded Universe? - knows_EU\n", " Do you consider yourself to be a fan of the Expanded Universe? - likes_EU" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
RespondentIDseen_anyis_fanEp_1Ep_2Ep_3Ep_4Ep_5Ep_6Ep_1_rank...Ep_5_rankEp_6_rankknows_EUlikes_EUDo you consider yourself to be a fan of the Star Trek franchise?GenderAgeHousehold IncomeEducationLocation
13.292880e+09TrueTrueTrueTrueTrueTrueTrueTrue3.0...5.06.0TrueFalseNoMale18-29NaNHigh school degreeSouth Atlantic
23.292880e+09FalseNaNFalseFalseFalseFalseFalseFalseNaN...NaNNaNNaNNaNYesMale18-29$0 - $24,999Bachelor degreeWest South Central
33.292765e+09TrueFalseTrueTrueTrueFalseFalseFalse1.0...5.06.0FalseNaNNoMale18-29$0 - $24,999High school degreeWest North Central
43.292763e+09TrueTrueTrueTrueTrueTrueTrueTrue5.0...4.03.0FalseNaNYesMale18-29$100,000 - $149,999Some college or Associate degreeWest North Central
53.292731e+09TrueTrueTrueTrueTrueTrueTrueTrue5.0...1.03.0TrueFalseNoMale18-29$100,000 - $149,999Some college or Associate degreeWest North Central
\n", "

5 rows × 23 columns

\n", "
" ], "text/plain": [ " RespondentID seen_any is_fan Ep_1 Ep_2 Ep_3 Ep_4 Ep_5 Ep_6 \\\n", "1 3.292880e+09 True True True True True True True True \n", "2 3.292880e+09 False NaN False False False False False False \n", "3 3.292765e+09 True False True True True False False False \n", "4 3.292763e+09 True True True True True True True True \n", "5 3.292731e+09 True True True True True True True True \n", "\n", " Ep_1_rank ... Ep_5_rank Ep_6_rank knows_EU likes_EU \\\n", "1 3.0 ... 5.0 6.0 True False \n", "2 NaN ... NaN NaN NaN NaN \n", "3 1.0 ... 5.0 6.0 False NaN \n", "4 5.0 ... 4.0 3.0 False NaN \n", "5 5.0 ... 1.0 3.0 True False \n", "\n", " Do you consider yourself to be a fan of the Star Trek franchise? Gender \\\n", "1 No Male \n", "2 Yes Male \n", "3 No Male \n", "4 Yes Male \n", "5 No Male \n", "\n", " Age Household Income Education \\\n", "1 18-29 NaN High school degree \n", "2 18-29 $0 - $24,999 Bachelor degree \n", "3 18-29 $0 - $24,999 High school degree \n", "4 18-29 $100,000 - $149,999 Some college or Associate degree \n", "5 18-29 $100,000 - $149,999 Some college or Associate degree \n", "\n", " Location \n", "1 South Atlantic \n", "2 West South Central \n", "3 West North Central \n", "4 West North Central \n", "5 West North Central \n", "\n", "[5 rows x 23 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mappings = {\n", " 'Yes':True,\n", " 'No':False\n", "}\n", "\n", "cols = [\n", " 'Are you familiar with the Expanded Universe?',\n", " 'Do you consider yourself to be a fan of the Expanded Universe?'\n", "]\n", "\n", "for col in cols:\n", " df[col] = df[col].map(mappings)\n", "\n", "df.rename(columns= {\n", " 'Are you familiar with the Expanded Universe?' : 'knows_EU',\n", " 'Do you consider yourself to be a fan of the Expanded Universe?' 
: 'likes_EU'\n", "}, inplace=True)\n", "\n", "df.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To add some spice to the data, the respondents were also asked whether they like the *Star Trek* franchise. Their answers are in the column *Do you consider yourself to be a fan of the Star Trek franchise?*.
\n", "Taking a similar approach, the column is cleaned by making the following mappings:\n", " \n", " 'Yes' - True\n", " 'No' - False\n", " \n", "and changing the column name to *likes_star_trek* for ease of analysis." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
RespondentIDseen_anyis_fanEp_1Ep_2Ep_3Ep_4Ep_5Ep_6Ep_1_rank...Ep_5_rankEp_6_rankknows_EUlikes_EUlikes_star_trekGenderAgeHousehold IncomeEducationLocation
13.292880e+09TrueTrueTrueTrueTrueTrueTrueTrue3.0...5.06.0TrueFalseFalseMale18-29NaNHigh school degreeSouth Atlantic
23.292880e+09FalseNaNFalseFalseFalseFalseFalseFalseNaN...NaNNaNNaNNaNTrueMale18-29$0 - $24,999Bachelor degreeWest South Central
33.292765e+09TrueFalseTrueTrueTrueFalseFalseFalse1.0...5.06.0FalseNaNFalseMale18-29$0 - $24,999High school degreeWest North Central
43.292763e+09TrueTrueTrueTrueTrueTrueTrueTrue5.0...4.03.0FalseNaNTrueMale18-29$100,000 - $149,999Some college or Associate degreeWest North Central
53.292731e+09TrueTrueTrueTrueTrueTrueTrueTrue5.0...1.03.0TrueFalseFalseMale18-29$100,000 - $149,999Some college or Associate degreeWest North Central
\n", "

5 rows × 23 columns

\n", "
" ], "text/plain": [ " RespondentID seen_any is_fan Ep_1 Ep_2 Ep_3 Ep_4 Ep_5 Ep_6 \\\n", "1 3.292880e+09 True True True True True True True True \n", "2 3.292880e+09 False NaN False False False False False False \n", "3 3.292765e+09 True False True True True False False False \n", "4 3.292763e+09 True True True True True True True True \n", "5 3.292731e+09 True True True True True True True True \n", "\n", " Ep_1_rank ... Ep_5_rank Ep_6_rank knows_EU likes_EU likes_star_trek \\\n", "1 3.0 ... 5.0 6.0 True False False \n", "2 NaN ... NaN NaN NaN NaN True \n", "3 1.0 ... 5.0 6.0 False NaN False \n", "4 5.0 ... 4.0 3.0 False NaN True \n", "5 5.0 ... 1.0 3.0 True False False \n", "\n", " Gender Age Household Income Education \\\n", "1 Male 18-29 NaN High school degree \n", "2 Male 18-29 $0 - $24,999 Bachelor degree \n", "3 Male 18-29 $0 - $24,999 High school degree \n", "4 Male 18-29 $100,000 - $149,999 Some college or Associate degree \n", "5 Male 18-29 $100,000 - $149,999 Some college or Associate degree \n", "\n", " Location \n", "1 South Atlantic \n", "2 West South Central \n", "3 West North Central \n", "4 West North Central \n", "5 West North Central \n", "\n", "[5 rows x 23 columns]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mappings = {\n", " 'Yes':True,\n", " 'No':False\n", "}\n", "\n", "df[\n", " 'Do you consider yourself to be a fan of the Star Trek franchise?'\n", "] = df[\n", " 'Do you consider yourself to be a fan of the Star Trek franchise?'\n", "].map(mappings)\n", "\n", "df.rename(columns= {\n", " 'Do you consider yourself to be a fan of the Star Trek franchise?' : 'likes_star_trek'\n", "}, inplace=True)\n", "\n", "df.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The remaining columns - \n", "\n", "* Gender\n", "* Age\n", "* Household Income\n", "* Education\n", "* Location \n", "\n", "describe personal attributes of the respondant. 
These columns are useful for segmenting the analysis by groups of respondents. They require neither cleaning nor renaming.
\n", "\n", "Now that all the columns are clean and ready for analysis, the first question to answer is: *How many respondents are fans of the Star Wars franchise?*\n", "To answer it, the columns *seen_any*, *is_fan*, *Gender* and *Age* are used.\n", "\n", "The *is_fan* column identifies which respondents are fans of the *Star Wars* franchise. Since this is a survey, the answers cannot be trusted blindly. A respondent may claim to be a fan without having watched any of the movies. Such rows are inconsistent and can be treated as outliers, so it is worth checking for them first." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [RespondentID, seen_any, is_fan, Ep_1, Ep_2, Ep_3, Ep_4, Ep_5, Ep_6, Ep_1_rank, Ep_2_rank, Ep_3_rank, Ep_4_rank, Ep_5_rank, Ep_6_rank, knows_EU, likes_EU, likes_star_trek, Gender, Age, Household Income, Education, Location]\n", "Index: []\n", "\n", "[0 rows x 23 columns]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[(df.seen_any == False) & (df.is_fan == True)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "No such rows exist, which suggests that the answers are at least internally consistent." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True 552\n", "NaN 350\n", "False 284\n", "Name: is_fan, dtype: int64" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.is_fan.value_counts(dropna=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are 350 `NaN` values in *is_fan*. It is better to classify them as either `True` or `False` so that the analysis covers every respondent. Leaving these 350 respondents out would skew the percentages computed from the remaining rows. " ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False 250\n", "True 100\n", "Name: seen_any, dtype: int64" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nulls = df[df.is_fan.isna()]\n", "nulls.seen_any.value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Of these, 250 respondents have not watched any of the 6 *Star Wars* movies (*seen_any* is `False`). Since they have not watched any movie, it is fair to assume that they are not fans of the franchise. The other 100 respondents answered yes (*seen_any* is `True`).
Interestingly, every one of these 100 rows has `NaN` in nearly every other column. These 100 rows consist almost entirely of missing data.
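
The rule described above can be sketched on a small toy table. This is only an illustration under stated assumptions: the frame `toy`, its values, and the names `unanswered`, `missing_share` and `cleaned` are hypothetical and not part of the survey data. It measures how empty the unanswered rows are, drops them, and counts the remaining blanks as not fans.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the survey (hypothetical values, not the real data).
toy = pd.DataFrame({
    'seen_any': [True, True, False, True, False],
    'is_fan': [True, np.nan, np.nan, False, np.nan],
    'Gender': ['Male', np.nan, 'Female', 'Female', 'Male'],
    'Age': ['18-29', np.nan, '> 60', '30-44', '45-60'],
})

# Rows that report having seen a film but never answered the fan question.
unanswered = toy['seen_any'] & toy['is_fan'].isna()

# Share of missing cells per row, ignoring the seen_any flag itself.
missing_share = toy.drop(columns='seen_any').isna().mean(axis=1)
print(missing_share[unanswered])

# Drop those rows, then treat the remaining blanks as not a fan.
cleaned = toy[~unanswered].copy()
cleaned['is_fan'] = cleaned['is_fan'].fillna(False).astype(bool)
print(cleaned['is_fan'].value_counts(normalize=True))
```

Working on an explicit `.copy()` and assigning the filled column back avoids relying on in-place modification of a filtered slice.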
\n", "\n", "NOTE - Only the first 15 of these 100 rows are shown below; the remaining rows follow the same pattern." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " RespondentID seen_any is_fan Ep_1 Ep_2 Ep_3 Ep_4 Ep_5 Ep_6 \\\n", "11 3.292638e+09 True NaN False False False False False False \n", "81 3.291669e+09 True NaN False False False False False False \n", "97 3.291570e+09 True NaN False False False False False False \n", "106 3.291470e+09 True NaN False False False False False False \n", "128 3.291420e+09 True NaN False False False False False False \n", "130 3.291406e+09 True NaN False False False False False False \n", "146 3.291341e+09 True NaN False False False False False False \n", "181 3.291038e+09 True NaN False False False False False False \n", "191 3.291022e+09 True NaN False False False False False False \n", "198 3.291007e+09 True NaN False False False False False False \n", "209 3.290981e+09 True NaN False False False False False False \n", "211 3.290977e+09 True NaN False False False False False False \n", "223 3.290950e+09 True NaN False False False False False False \n", "231 3.290940e+09 True NaN False False False False False False \n", "244 3.290912e+09 True NaN False False False False False False \n", "\n", " Ep_1_rank ... Ep_5_rank Ep_6_rank knows_EU likes_EU \\\n", "11 NaN ... NaN NaN NaN NaN \n", "81 NaN ... NaN NaN NaN NaN \n", "97 NaN ... NaN NaN NaN NaN \n", "106 NaN ... NaN NaN NaN NaN \n", "128 NaN ... NaN NaN NaN NaN \n", "130 NaN ... NaN NaN NaN NaN \n", "146 NaN ... NaN NaN NaN NaN \n", "181 NaN ... NaN NaN NaN NaN \n", "191 NaN ... NaN NaN NaN NaN \n", "198 NaN ... NaN NaN NaN NaN \n", "209 NaN ... NaN NaN NaN NaN \n", "211 NaN ... NaN NaN NaN NaN \n", "223 NaN ... NaN NaN NaN NaN \n", "231 NaN ... NaN NaN NaN NaN \n", "244 NaN ... 
NaN NaN NaN NaN \n", "\n", " likes_star_trek Gender Age Household Income Education Location \n", "11 NaN NaN NaN NaN NaN NaN \n", "81 NaN NaN NaN NaN NaN NaN \n", "97 NaN NaN NaN NaN NaN NaN \n", "106 NaN NaN NaN NaN NaN NaN \n", "128 NaN NaN NaN NaN NaN NaN \n", "130 NaN NaN NaN NaN NaN NaN \n", "146 NaN NaN NaN NaN NaN NaN \n", "181 NaN NaN NaN NaN NaN NaN \n", "191 NaN NaN NaN NaN NaN NaN \n", "198 NaN NaN NaN NaN NaN NaN \n", "209 NaN NaN NaN NaN NaN NaN \n", "211 NaN NaN NaN NaN NaN NaN \n", "223 NaN NaN NaN NaN NaN NaN \n", "231 NaN NaN NaN NaN NaN NaN \n", "244 NaN NaN NaN NaN NaN NaN \n", "\n", "[15 rows x 23 columns]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nulls[nulls.seen_any == True].head(15)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on the arguments above, the following actions are taken:\n", "\n", "* The 100 rows that consist almost entirely of missing values are removed from the dataset.\n", "* The remaining `NaN` values in *is_fan*, which all belong to rows where *seen_any* is `False`, are filled with `False`." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "df = df[~((df.seen_any == True) & (df.is_fan.isna()))].copy()\n", "df['is_fan'] = df.is_fan.fillna(False)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True 0.508287\n", "False 0.491713\n", "Name: is_fan, dtype: float64" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.is_fan.value_counts(normalize=True)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0.5, 1.0, 'Percentage of respondents who are fans of the Star Wars franchise')" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fans_dist = df.is_fan.value_counts(normalize=True)\n", "\n", "plt.style.use('fivethirtyeight')\n", "plt.figure(figsize=(12,8))\n", "sns.barplot(x= fans_dist.index, y= fans_dist.values)\n", "plt.ylabel('percentage of respondents')\n", "plt.xlabel('respondent is a fan?')\n", "plt.title('Percentage of respondents who are fans of the Star Wars franchise')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Counting people who have not watched any of the movies as \"not fans\", the percentages of respondents who are and are not fans are very close. Slightly more than half of the respondents (about 51%) are fans of the *Star Wars* franchise.
\n", "\n", "These findings are incomplete unless the analysis goes one level deeper and focuses only on the \"Fans\", i.e. the respondents who claim to be fans of the franchise.
\n", "This analysis uses two columns: *Gender* and *Age*.
\n", "Starting with *Gender*:
\n", "There may be no causal link between *Gender* and being a \"Fan\" of the franchise. *Gender* is considered here only to describe the respondents." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Male 0.548913\n", "Female 0.431159\n", "NaN 0.019928\n", "Name: Gender, dtype: float64" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fans = df[df.is_fan == True]\n", "\n", "fans.Gender.value_counts(dropna=False, normalize=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "About 2% of the fans (11 respondents) have a `NaN` value for *Gender*, either because the respondent did not want to reveal their gender or because no option represented them.
\n", "To be fair and not ignore data, these `NaN` values are filled with 'Others' out of respect.\n", "The stats above show that the percentages of males and females among the \"Fans\" are not very far apart, with males ahead by about 12 percentage points." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Male 0.548913\n", "Female 0.431159\n", "Others 0.019928\n", "Name: Gender, dtype: float64\n" ] }, { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hovertemplate": "%{label} : %{percent}", "labels": [ "Male", "Female", "Others" ], "marker": { "colors": [ "#009999", "#ff9933", "#99004C" ], "line": { "width": 1 } }, "type": "pie", "values": [ 0.5489130434782609, 0.4311594202898551, 0.019927536231884056 ] } ], "layout": { "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], 
[ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], 
"mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { 
"colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": 
"white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Distribution of Gender among Fans
Percentage of Males, Females or others in the Fan population", "x": 0.5, "xref": "paper", "yanchor": "top" } } }, "text/html": [ "
\n", " \n", " \n", "
\n", " \n", "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fans.Gender.fillna('Others',inplace=True)\n", "gender_counts = fans.Gender.value_counts(normalize=True)\n", "print(gender_counts)\n", "\n", "# plt.style.use('seaborn')\n", "# plt.figure(figsize=(10,8))\n", "# plt.pie(\n", "# x= gender_counts,\n", "# labels = ['Male','Female','Others'],\n", "# colors = ['#009999','#ff9933','#99004C'],\n", "# autopct=\"%1.1f%%\",\n", "# textprops=dict(color='w',fontsize=10),\n", "# shadow= True,\n", "# wedgeprops = {'linewidth': 1},\n", "# pctdistance= 0.7\n", "# )\n", "# plt.legend(['Male','Female','Others'],loc='upper right', bbox_to_anchor=(1, 0.5, 0.5, 0.5))\n", "# plt.show()\n", "\n", "layout = go.Layout(\n", " title={\n", " 'text':\"Distribution of Gender among Fans
Percentage of Males, Females or others in the Fan population\",\n", "        'yanchor':'top',\n", "        'xref':'paper',\n", "        'x':0.5\n", "    }\n", ")\n", "\n", "data = [\n", "    go.Pie(\n", "        labels= gender_counts.index,\n", "        values= gender_counts.values,\n", "        marker= dict(\n", "            colors= ['#009999','#ff9933','#99004C'],\n", "            line= dict(width=1)\n", "        ),\n", "        hovertemplate= \"%{label} : %{percent}\"\n", "    )\n", "]\n", "\n", "fig = go.Figure(data= data, layout= layout)\n", "fig.show()" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.figure(\n", "    FigureClass=Waffle,\n", "    figsize=(12,6),\n", "    rows = 4,\n", "    columns = 10,\n", "    values = fans.Gender.value_counts(normalize=True),\n", "    legend={'loc': 'upper left', 'bbox_to_anchor': (1.05, 0.5), 'labels':['Male','Female','Others']},\n", "    icons='child',\n", "    font_size=65,\n", "    title={'label': 'Number of Fans by Gender (per 40 people)', 'loc': 'center','fontsize':24}\n", ")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following conclusions are drawn from the plots:\n", "\n", "* Male fans outnumber the rest. Out of every 40 fans, about 22 are male.\n", "* The percentage of female fans is not far behind. Out of every 40 fans, about 17 are female.\n", "* The split is uneven but not extreme: males lead by roughly 12 percentage points.\n", "\n", "Thus the franchise is popular among both males and females, with a moderate tilt towards males.
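
The per-40 figures quoted above follow from scaling each gender share to the 40 cells of the waffle grid. A minimal sketch of that arithmetic, assuming simple rounding to the nearest whole icon (PyWaffle may round differently internally); the names `shares`, `cells` and `per_grid` are illustrative only:

```python
# Shares of fans by gender, as printed earlier in the notebook.
shares = {'Male': 0.548913, 'Female': 0.431159, 'Others': 0.019928}

# Size of the waffle grid: 4 rows by 10 columns.
cells = 4 * 10

# Scale each share to the grid and round to whole icons.
per_grid = {label: round(share * cells) for label, share in shares.items()}
print(per_grid)
```

The three rounded counts add up to the 40 cells of the grid, so no icon is left unassigned in this case.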
\n", "\n", "Age seems a more appropriate characteristic for differentiating between the Fans. Different age groups usually have different tastes, and the results of this analysis are expected to reflect that. However, *Star Wars* was first released in 1977 with *Episode IV - A New Hope*, decades before the survey was conducted, so the age group 40 and above is also likely to contain a good number of Fans." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "45-60 291\n", "> 60 269\n", "30-44 268\n", "18-29 218\n", "NaN 40\n", "Name: Age, dtype: int64" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.Age.value_counts(dropna=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The age group is an ordinal variable with the intervals listed below. There are also 40 `NaN` values; note that the `encode` function below does not drop them but places them in its fallback category, Elder. For ease of analysis, the intervals are converted to text categories as follows:\n", "\n", "    18-29 -> Young\n", "    30-44 -> Middle\n", "    45-60 -> Senior\n", "    > 60 -> Elder\n", "\n", "The barplot below shows the percentage of fans per age group." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Young 0.224638\n", "Middle 0.271739\n", "Senior 0.278986\n", "Elder 0.224638\n", "Name: age_label, dtype: float64\n" ] }, { "data": { "text/plain": [ "Text(0.5, 1.0, 'Percentage of fans of Star Wars per Age category')" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "def encode(age_grp):\n", "    \"\"\"Convert age groups in interval forms to Labels\n", "    :param age_grp: the age interval\n", "    \"\"\"\n", "    if age_grp == '18-29':\n", "        return 'Young'\n", "    elif age_grp == '30-44':\n", "        return 'Middle'\n", "    elif age_grp == '45-60':\n", "        return 'Senior'\n", "    else:\n", "        # Any other value, including a missing age, falls through to 'Elder'\n", "        return 'Elder'\n", " \n", "fans['age_label'] = fans.Age.apply(encode)\n", "fans_age = fans.age_label.value_counts(normalize=True)\n", "# Order the categories by age rather than by frequency\n", "fans_age = fans_age.reindex(['Young', 'Middle', 'Senior', 'Elder'])\n", "print(fans_age)\n", "\n", "plt.style.use('fivethirtyeight')\n", "plt.figure(figsize=(12,8))\n", "sns.barplot(x= fans_age.index, y=fans_age.values)\n", "plt.ylabel('percentage of fans')\n", "plt.xlabel('age category')\n", "plt.title('Percentage of fans of Star Wars per Age category')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The plot shows that the share of fans rises from the Young category to the Senior category, which has the largest share (about 28%), and then falls again for the Elder category. This supports the reasoning above: the young generation of the 1980s are the Seniors of today, so this category is likely, as shown, to have more representatives. \n", "It is important to note that, over the years, the fan base of the *Star Wars* franchise has been consistent.
All age categories have a good number of fans" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hoverinfo": "x", "marker": { "color": "#ff9933" }, "name": "Male", "orientation": "h", "type": "bar", "x": [ 74, 91, 58, 80 ], "y": [ "Young", "Middle", "Elder", "Senior" ] }, { "hoverinfo": "text", "marker": { "color": "#009999" }, "name": "Female", "orientation": "h", "text": [ "50", "59", "55", "74" ], "type": "bar", "x": [ -50, -59, -55, -74 ], "y": [ "Young", "Middle", "Elder", "Senior" ] } ], "layout": { "bargap": 0.1, "barmode": "overlay", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 
0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { 
"outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, 
"#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", 
"linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "font": { "size": 22 }, "text": "Gender distribution of Fans between various Age categories
Distribution of male and female fans for every age category", "xanchor": "left" }, "xaxis": { "range": [ -80, 100 ], "showticklabels": false, "title": { "text": "Number of Fans" } }, "yaxis": { "title": { "text": "Age category" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "age_catgs = fans.age_label.unique()\n", "\n", "Males = []\n", "Females = []\n", "\n", "for catg in age_catgs:\n", "    gender_counts = fans[fans.age_label == catg].Gender.value_counts()\n", "    # value_counts() sorts by frequency: position 0 is the more frequent gender\n", "    # (plotted as Male), position 1 the less frequent one (plotted as Female)\n", "    Males.append(gender_counts.iloc[0])\n", "    Females.append(gender_counts.iloc[1])\n", "\n", "layout = go.Layout(\n", "    title = {\n", "        'text':\"Gender distribution of Fans between various Age categories<br>Distribution of male and female fans for every age category\",\n", "        'xanchor':'left',\n", "        'font':{'size':22}\n", "    },\n", "    yaxis=go.layout.YAxis(\n", "        title='Age category'\n", "    ),\n", "    xaxis=go.layout.XAxis(\n", "        range=[-80, 100],\n", "#         tickvals=[-100, -70, -30, 0, 30, 70, 100],\n", "#         ticktext=[100, 70, 30, 0, 30, 70, 100],\n", "        title='Number of Fans',\n", "        showticklabels=False\n", "    ),\n", "    barmode='overlay',\n", "    bargap=0.1\n", ")\n", "\n", "data = [\n", "    go.Bar(\n", "        y=age_catgs,\n", "        x=Males,\n", "        orientation='h',\n", "        name='Male',\n", "        hoverinfo='x',\n", "        marker=dict(color='#ff9933')\n", "    ),\n", "    go.Bar(\n", "        y=age_catgs,\n", "        x=[-1 * f for f in Females],\n", "        orientation='h',\n", "        name='Female',\n", "        text= Females,\n", "        hoverinfo='text',\n", "        marker=dict(color='#009999')\n", "    )\n", "]\n", "\n", "fig = go.Figure(data= data, layout= layout)\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The plot above illustrates the distribution of gender across the age categories for fans of the *Star Wars* franchise.
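
The per-category counts behind a plot like this can also be tabulated by label rather than by position. The sketch below is only an illustration: it uses a small made-up frame (not the survey data) with the same column names, `age_label` and `Gender`, and assumes the gender labels are `Male` and `Female`.

```python
import pandas as pd

# Hypothetical toy rows, not the survey data; column names follow the notebook.
toy_fans = pd.DataFrame({
    'age_label': ['Young', 'Young', 'Young', 'Middle', 'Middle', 'Senior'],
    'Gender': ['Male', 'Female', 'Male', 'Female', 'Female', 'Male'],
})

# One row per age category, one column per gender label;
# combinations that never occur are reported as 0.
counts = pd.crosstab(toy_fans['age_label'], toy_fans['Gender'])

# Label-based lookups do not depend on which gender is more frequent.
males = counts['Male'].tolist()
females = counts['Female'].tolist()
print(counts)
```

Looking counts up by label keeps the male and female series aligned even if one age category has more female than male fans.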
\n", "\n", "The next part of the analysis asks which of the six films has been seen by the most respondents and which is the most popular. Here, popularity is defined by the average rank a movie received: the most popular movie is the one with the lowest (best) average rank on a scale of 1 to 6.
\n", "Two sets of columns are required for this analysis. The columns *Ep_1* through *Ep_6* are boolean columns indicating whether the respondent has watched that movie. Each respondent also ranked the movies from 1 to 6, with 1 being the best rank and 6 the worst; these ranks are stored in the *Ep_1_rank* through *Ep_6_rank* columns.
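
As a minimal sketch of how these two column sets can be aggregated, the toy frame below (three made-up respondents and three episodes, not the survey data) sums the boolean columns to count viewers and averages the rank columns. The toy uses a maximum rank of 3, whereas the survey uses 6.

```python
import pandas as pd

# Hypothetical toy data: 3 respondents, 3 episodes; column names mimic the notebook.
toy = pd.DataFrame({
    'Ep_1': [True, False, True],
    'Ep_2': [True, True, True],
    'Ep_3': [False, False, True],
    'Ep_1_rank': [3, 2, 3],
    'Ep_2_rank': [1, 1, 2],
    'Ep_3_rank': [2, 3, 1],
})

seen_cols = ['Ep_1', 'Ep_2', 'Ep_3']
rank_cols = ['Ep_1_rank', 'Ep_2_rank', 'Ep_3_rank']

views = toy[seen_cols].sum()        # True counts as 1, so the sum is the number of viewers
mean_ranks = toy[rank_cols].mean()  # a lower mean rank means a more popular movie
best = mean_ranks.idxmin()          # column holding the best (lowest) mean rank

# Flip the scale for plotting so that the best-ranked movie gets the largest value.
max_rank = len(rank_cols)
inverted = max_rank - mean_ranks

print(views.to_dict(), best)
```

The same three steps (sum, mean, subtract from the maximum rank) carry over unchanged to six episodes.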
\n", "\n", "Summing the first set of columns gives the number of views per movie, and averaging the second set gives a fair metric for its popularity: a popular movie has a lower mean, which corresponds to a better average rank. For ease of understanding, the movies are mapped as follows:\n", "\n", " Episode I -> The Phantom Menace (1999)\n", " Episode II -> Attack of the Clones (2002)\n", " Episode III -> Revenge of the Sith (2005)\n", " Episode IV -> A New Hope (1977)\n", " Episode V -> The Empire Strikes Back (1980)\n", " Episode VI -> Return of the Jedi (1983)\n", "\n", "The mean rank of each movie is subtracted from the maximum possible rank, 6, so that a movie with a lower (better) mean rank gets a higher value. This inverts the scale and makes the plot easier to interpret." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hovertemplate": "Views: %{x:.f}", "marker": { "color": [ "#009999", "#009999", "#009999", "#009999", "#ff9933", "#009999" ] }, "orientation": "h", "type": "bar", "x": [ 673, 571, 550, 607, 758, 738 ], "y": [ "The phantom Menace", "Attack of the Clones", "Revenge of the Sith", "A New Hope", "The Empire Strikes Back", "Return of the Jedi" ] } ], "layout": { "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], 
"contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" 
], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, 
"#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", 
"gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "font": { "size": 22 }, "text": "Most viewed Star Wars film in the franchise
Views recieved by each movie in the Star Wars franchise" }, "xaxis": { "showticklabels": false, "title": { "text": "Number of views" } }, "yaxis": { "title": { "text": "Star Wars movie" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "max_rank = 6\n", "movies = [\n", "    'The Phantom Menace',\n", "    'Attack of the Clones',\n", "    'Revenge of the Sith',\n", "    'A New Hope',\n", "    'The Empire Strikes Back',\n", "    'Return of the Jedi'\n", "]\n", "\n", "# Columns 3-8 are the boolean Ep_1..Ep_6 flags, columns 9-14 the Ep_1_rank..Ep_6_rank ranks\n", "views = df.iloc[:,3:9].sum()\n", "views.index = movies\n", "ranks = df.iloc[:,9:15].mean()\n", "# Invert the scale so that a better (lower) mean rank gives a longer bar\n", "invert_ranks = max_rank - ranks\n", "invert_ranks.index = movies\n", "\n", "colors=['#009999','#009999','#009999','#009999','#ff9933','#009999']\n", "\n", "layout = go.Layout(\n", "    title = {\n", "        'text':'Most viewed Star Wars film in the franchise<br>Views received by each movie in the Star Wars franchise',\n", "        'font':{'size':22}\n", "    },\n", "    yaxis=go.layout.YAxis(title='Star Wars movie'),\n", "    xaxis=go.layout.XAxis(title='Number of views',showticklabels=False)\n", ")\n", "\n", "data = [\n", "    go.Bar(\n", "        x= views.values,\n", "        y= views.index,\n", "        marker_color= colors,\n", "        hovertemplate='Views: %{x:.f}',\n", "        orientation= 'h'\n", "    )\n", "]\n", "\n", "fig = go.Figure(data= data, layout= layout)\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The plot highlights the most viewed movie in the *Star Wars* franchise: *The Empire Strikes Back*, which 758 respondents have watched. A close second is *Return of the Jedi* with 738 views. Since these numbers cover all respondents rather than only fans, the high view counts give a sense of the popularity of these movies." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hovertemplate": "Average rank: %{text:.2f}", "marker": { "color": [ "#009999", "#009999", "#009999", "#009999", "#ff9933", "#009999" ] }, "orientation": "h", "text": [ 3.7329341317365268, 4.087320574162679, 4.341317365269461, 3.272727272727273, 2.513157894736842, 3.047846889952153 ], "type": "bar", "x": [ 2.2670658682634732, 1.9126794258373208, 1.658682634730539, 2.727272727272727, 3.486842105263158, 2.952153110047847 ], "y": [ "The phantom Menace", "Attack of the Clones", "Revenge of the Sith", "A New Hope", "The Empire Strikes Back", "Return of the Jedi" ] } ], "layout": { "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], 
"carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, 
"#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": 
"scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": 
"white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "font": { "size": 22 }, "text": "Most popular Star Wars film in the franchise
average rank recieved by each movie in the Star Wars franchise" }, "xaxis": { "showticklabels": false, "title": { "text": "Average rank" } }, "yaxis": { "title": { "text": "Star Wars movie" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "colors=['#009999','#009999','#009999','#009999','#ff9933','#009999']\n", "\n", "layout = go.Layout(\n", "    title = {\n", "        'text':'Most popular Star Wars film in the franchise<br>'+\n", "        'average rank received by each movie in the Star Wars franchise',\n", "        'font':{'size':22},\n", "    },\n", "    yaxis=go.layout.YAxis(title='Star Wars movie'),\n", "    xaxis=go.layout.XAxis(title='Average rank',showticklabels=False)\n", ")\n", "\n", "data = [\n", "    go.Bar(\n", "        x= invert_ranks.values,\n", "        y= invert_ranks.index,\n", "        marker_color= colors,\n", "        orientation= 'h',\n", "        # bar length uses the inverted score; the hover text shows the actual mean rank\n", "        hovertemplate='Average rank: %{text:.2f}',\n", "        text = ranks\n", "    )\n", "]\n", "\n", "fig = go.Figure(data= data, layout= layout)\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As the plot shows, *The Empire Strikes Back* is the most popular film in the *Star Wars* franchise. It can also be inferred that the original trilogy, released between 1977 and 1983, is the most popular overall. The prequel trilogy, released from 1999 through 2005, is less popular among the respondents.
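
One rough way to put a number on the trilogy comparison is to average the per-movie mean ranks within each trilogy. The sketch below takes the mean ranks from the plot above, rounded to two decimals. It is only an approximation: an average of averages ignores that each movie may have been ranked by a different number of respondents.

```python
# Mean rank per movie, rounded from the plot above (1 = best, 6 = worst).
mean_ranks = {
    'The Phantom Menace': 3.73,
    'Attack of the Clones': 4.09,
    'Revenge of the Sith': 4.34,
    'A New Hope': 3.27,
    'The Empire Strikes Back': 2.51,
    'Return of the Jedi': 3.05,
}

prequels = ['The Phantom Menace', 'Attack of the Clones', 'Revenge of the Sith']
originals = ['A New Hope', 'The Empire Strikes Back', 'Return of the Jedi']

def trilogy_mean(titles):
    # Unweighted average of the per-movie mean ranks.
    return sum(mean_ranks[t] for t in titles) / len(titles)

prequel_avg = trilogy_mean(prequels)
original_avg = trilogy_mean(originals)
print(round(prequel_avg, 2), round(original_avg, 2))
```

With these values the original trilogy averages about 2.94 and the prequel trilogy about 4.05, which is consistent with the reading of the plot.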
\n", "\n", "The *Star Wars* franchise also includes novels, comic books, TV series and other media apart from the movies. These were grouped under the *Star Wars Expanded Universe*, which was later rebranded as *Star Wars Legends*. Two columns hold the respondents' views on it: *knows_EU* and *likes_EU*.
\n", "A super fan for this analysis is defined as someone who is a fan of the movies and also likes the *Expanded Universe*. The two columns also have to be checked for faulty data, where the respondent answered `False` for *knows_EU* but `True` for *likes_EU*." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
0 rows × 23 columns" ], "text/plain": [ "Empty DataFrame\n", "Columns: [RespondentID, seen_any, is_fan, Ep_1, Ep_2, Ep_3, Ep_4, Ep_5, Ep_6, Ep_1_rank, Ep_2_rank, Ep_3_rank, Ep_4_rank, Ep_5_rank, Ep_6_rank, knows_EU, likes_EU, likes_star_trek, Gender, Age, Household Income, Education, Location]\n", "Index: []\n", "\n", "[0 rows x 23 columns]" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[(df.knows_EU == False ) & (df.likes_EU == True)]" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False 615\n", "Name: knows_EU, dtype: int64" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[df.likes_EU.isna()].knows_EU.value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are no such faulty rows. All `NaN` values in the *likes_EU* column have *knows_EU* as `False`, hence these rows can be filled with `False`.
\n" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False 987\n", "True 99\n", "Name: likes_EU, dtype: int64" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['likes_EU'] = df.likes_EU.fillna(False)\n", "df['knows_EU'] = df.knows_EU.fillna(False)\n", "df.likes_EU.value_counts(dropna=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Among all the respondents, only 99 like the *Star Wars Expanded Universe*, which suggests that it is not a big hit with the public. To understand this better, the following plots show the share of respondents who know about the *Expanded Universe* and, among those who know about it, the share who like it. Respondents who were not aware of its existence are left out of the second plot, since it cannot be said whether they would like it once they came to know of it." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0.5, 1.0, 'Percentage of respondants who like the Extended Universe')" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "knows_fans = df.knows_EU.value_counts(dropna=False, normalize=True)\n", "likes_fans = df[df.knows_EU == True].likes_EU.value_counts(dropna=False, normalize=True)\n", "\n", "plt.subplots(figsize=(20,8))\n", "\n", "plt.subplot(1,2,1)\n", "sns.barplot(x= knows_fans.index, y= knows_fans.values)\n", "plt.xlabel('knows about the Extended Universe?')\n", "plt.ylabel('percentage of respondants')\n", "plt.title('Percentage of respondants who know about the Extended Universe')\n", "\n", "plt.subplot(1,2,2)\n", "sns.barplot(x= likes_fans.index, y= likes_fans.values)\n", "plt.xlabel('likes the Extended Universe?')\n", "plt.ylabel('percentage of respondants')\n", "plt.title('Percentage of respondants who like the Extended Universe')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The conclusions drawn from the two plots above are:\n", "\n", "* Of all the respondents, only a minority (about 19%) are aware that the *Expanded Universe* exists.\n", "* Of the respondents who are aware of the *Expanded Universe*, only a minority actually like it.\n", "\n", "This leads to the conclusion that the *Star Wars Expanded Universe* is less popular than the *Star Wars* film franchise.
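
The two shares compared above can also be computed directly as numbers. The sketch below is an illustration under assumptions: the frame `toy`, its values and the helper `eu_shares` are hypothetical, while the column names *is_fan*, *knows_EU* and *likes_EU* follow the notebook.

```python
import pandas as pd

# Hypothetical toy data standing in for the survey DataFrame.
toy = pd.DataFrame({
    'is_fan':   [True, True, True, False, False],
    'knows_EU': [True, True, False, True, False],
    'likes_EU': [True, False, False, False, False],
})

def eu_shares(frame):
    # Share of rows that know the EU, and share of those rows that also like it.
    knows = frame['knows_EU'].mean()
    likes_given_knows = frame.loc[frame['knows_EU'], 'likes_EU'].mean()
    return knows, likes_given_knows

# The same helper works for the whole sample and for any subset, such as the fans.
all_knows, all_likes = eu_shares(toy)
fan_knows, fan_likes = eu_shares(toy[toy['is_fan']])
print(all_knows, all_likes, fan_knows, fan_likes)
```

Taking the mean of a boolean column gives the fraction of `True` values, which is the same quantity that `value_counts(normalize=True)` reports for the `True` label.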
\n", "As per the definition of \"super fan\" in this analysis, a respondent who likes both the movies and the *Expanded Universe* is considered a super fan, so the super fans are a subset of the fans. The conclusions above were drawn for the entire population of respondents. The plots below repeat the comparison with the sample limited to the respondents who are fans." ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0.5, 1.0, 'Percentage of respondants who like the Extended Universe')" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "knows_fans = fans.knows_EU.value_counts(dropna=False, normalize=True)\n", "likes_fans = fans[fans.knows_EU == True].likes_EU.value_counts(dropna=False, normalize=True)\n", "\n", "plt.subplots(figsize=(20,8))\n", "\n", "plt.subplot(1,2,1)\n", "sns.barplot(x= knows_fans.index, y= knows_fans.values)\n", "plt.xlabel('knows about the Extended Universe?')\n", "plt.ylabel('percentage of respondants')\n", "plt.title('Percentage of respondants who know about the Extended Universe')\n", "\n", "plt.subplot(1,2,2)\n", "sns.barplot(x= likes_fans.index, y= likes_fans.values)\n", "plt.xlabel('likes the Extended Universe?')\n", "plt.ylabel('percentage of respondants')\n", "plt.title('Percentage of respondants who like the Extended Universe')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first plot shows the share of fans who know about the *Expanded Universe*. The result is in line with the previous conclusions: not many fans know about it. Among the fans who do know about it, a slightly higher percentage actually like it; these are the \"super fans\". The super fans are analyzed below in the same way as the fans, i.e. by *gender* and *age*."
] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hovertemplate": "%{label} : %{percent}", "labels": [ "Male", "Female", "Others" ], "marker": { "colors": [ "#009999", "#ff9933", "#99004C" ], "line": { "width": 1 } }, "type": "pie", "values": [ 0.6774193548387096, 0.3118279569892473, 0.010752688172043012 ] } ], "layout": { "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 
0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": 
"scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 
0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, 
"gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Distribution of Gender among Super Fans
Percentage of Males, Females or others in the Super Fan population", "x": 0.5, "xref": "paper", "yanchor": "top" } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "super_fans = fans[fans.likes_EU == True]\n", "gender_counts = super_fans.Gender.value_counts(normalize=True)\n", "\n", "layout = go.Layout(\n", " title={\n", " 'text':\"Distribution of Gender among Super Fans
Percentage of Males, Females or others in the Super Fan population\",\n", " 'yanchor':'top',\n", " 'xref':'paper',\n", " 'x':0.5\n", " }\n", ")\n", "\n", "data = [\n", " go.Pie(\n", " labels= gender_counts.index,\n", " values= gender_counts.values,\n", " marker= dict(\n", " colors= ['#009999','#ff9933','#99004C'],\n", " line= dict(width=1)\n", " ),\n", " hovertemplate= \"%{label} : %{percent}\"\n", " )\n", "]\n", "\n", "fig = go.Figure(data= data, layout= layout)\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is interesting to note that males make up a clearly larger share of the super fans (about 68%), whereas males and females were almost equally represented among the fans.
\n", "The same breakdown is repeated below for the age categories." ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0.5, 1.0, 'Percentage of super fans of Star Wars per Age category')" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "super_fans_age = super_fans.age_label.value_counts(normalize=True)\n", "\n", "plt.style.use('fivethirtyeight')\n", "plt.figure(figsize=(12,8))\n", "sns.barplot(x= super_fans_age.index, y=super_fans_age.values)\n", "plt.ylabel('percentage of super fans')\n", "plt.xlabel('age category')\n", "plt.title('Percentage of super fans of Star Wars per Age category')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In contrast to the earlier version of this plot, which considered all the fans of the *Star Wars* franchise, the plot above tells a different story.
\n", "\n", "* Most of the Super Fans belong to the Young or Middle age category.\n", "* The Senior and Elder age categories have noticeably fewer representatives among the Super Fans.\n" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hoverinfo": "x", "marker": { "color": "#ff9933" }, "name": "Male", "orientation": "h", "type": "bar", "x": [ 19, 22, 8, 14 ], "y": [ "Young", "Middle", "Elder", "Senior" ] }, { "hoverinfo": "text", "marker": { "color": "#009999" }, "name": "Female", "orientation": "h", "text": [ "11", "5", "8", "5" ], "type": "bar", "x": [ -11, -5, -8, -5 ], "y": [ "Young", "Middle", "Elder", "Senior" ] } ], "layout": { "bargap": 0.1, "barmode": "overlay", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": 
"contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, 
"type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, 
"#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, 
"ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "font": { "size": 22 }, "text": "Gender distribution of Super Fans between various Age categories
Distribution of male and female super fans for every age category", "xanchor": "left" }, "xaxis": { "range": [ -15, 25 ], "showticklabels": false, "title": { "text": "Number of Fans" } }, "yaxis": { "title": { "text": "Age category" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "age_catgs = super_fans.age_label.unique()\n", "\n", "Males = []\n", "Females = []\n", "\n", "for catg in age_catgs:\n", "    gender_counts = super_fans[super_fans.age_label == catg].Gender.value_counts()\n", "    # Look the counts up by label: value_counts() sorts by frequency, so position 0 is not always Male\n", "    Males.append(gender_counts.get('Male', 0))\n", "    Females.append(gender_counts.get('Female', 0))\n", " \n", "layout = go.Layout(\n", " title = {\n", " 'text':\"Gender distribution of Super Fans between various Age categories
\"+\n", " \"Distribution of male and female super fans for every age category\",\n", " 'xanchor':'left',\n", " 'font':{'size':22}\n", " },\n", " yaxis=go.layout.YAxis(title='Age category'),\n", " xaxis=go.layout.XAxis(\n", " range=[-15, 25],\n", "# tickvals=[-10, 0, 10, 20],\n", "# ticktext=[10, 0, 10 ,20],\n", " showticklabels= False,\n", " title='Number of Fans'\n", " ),\n", " barmode='overlay',\n", " bargap=0.1\n", ")\n", "\n", "data = [\n", " go.Bar(\n", " y=age_catgs,\n", " x=Males,\n", " orientation='h',\n", " name='Male',\n", " hoverinfo='x',\n", " marker=dict(color='#ff9933')\n", " ),\n", " go.Bar(\n", " y=age_catgs,\n", " x=[-1 * f for f in Females],\n", " orientation='h',\n", " name='Female',\n", " text= Females,\n", " hoverinfo='text',\n", " marker=dict(color='#009999')\n", " )\n", "]\n", "\n", "fig = go.Figure(data= data, layout= layout)\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As the pie plot suggested, male super fans outnumber female super fans in the Young, Middle and Senior age categories; only in the Elder category are the counts equal (8 each).
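
The per-category counts behind such a plot can also be built in one step, keyed by label instead of by position. The sketch below uses `pd.crosstab` as an alternative technique, not the method of the notebook; the frame `toy` and its values are hypothetical, and only the column names *age_label* and *Gender* come from the analysis.

```python
import pandas as pd

# Hypothetical toy data standing in for the super_fans DataFrame.
toy = pd.DataFrame({
    'age_label': ['Young', 'Young', 'Middle', 'Elder', 'Elder', 'Senior'],
    'Gender':    ['Male', 'Female', 'Male', 'Male', 'Female', 'Male'],
})

# One row per age category, one column per gender, cells hold the counts.
counts = pd.crosstab(toy['age_label'], toy['Gender'])

# reindex keeps both columns even if one gender is absent from the sample.
counts = counts.reindex(columns=['Male', 'Female'], fill_value=0)

males = counts['Male'].tolist()
females = counts['Female'].tolist()
print(counts)
```

Categories with no respondents of one gender then appear as an explicit zero, so the two lists always line up with the age categories in `counts.index`.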
\n", "\n", "The *Expanded Universe* is a newer entity than the film franchise. Its animated movies, TV series, comic books and video games are likely to appeal more to the Young and Middle age categories than to the older ones, and the plot above is consistent with this explanation.\n", "\n", "Several columns hold the answers to the question *Please state whether you view the following characters favorably, unfavorably, or are unfamiliar with him/her.* The responses to this question fall on the following scale:\n", "\n", "* Very favorably\n", "* Somewhat favorably\n", "* Neither favorably nor unfavorably (neutral)\n", "* Unfamiliar\n", "* Somewhat unfavorably\n", "* Very unfavorably\n", "\n", "All the `NaN` values in these columns are filled with Unfamiliar, and each column is converted to an ordered categorical type." ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(['Han Solo', 'Luke Skywalker', 'Princess Leia Organa',\n", " 'Anakin Skywalker', 'Obi Wan Kenobi', 'Emperor Palpatine',\n", " 'Darth Vader', 'Lando Calrissian', 'Boba Fett', 'C-3P0', 'R2 D2',\n", " 'Jar Jar Binks', 'Padme Amidala', 'Yoda'], dtype=object)" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "characters = survey.iloc[0,15:29].values\n", "char_ratings = survey.iloc[1:,15:29].fillna('Unfamiliar')\n", "characters" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "cols = char_ratings.columns\n", "\n", "for col in cols:\n", " char_ratings[col] = char_ratings[col].astype(pd.api.types.CategoricalDtype(ordered=True, categories = [\n", " 'Very favorably',\n", " 'Somewhat favorably',\n", " 'Neither favorably nor unfavorably (neutral)',\n", " 'Unfamiliar',\n", " 'Somewhat unfavorably',\n", " 'Very unfavorably'\n", " ]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The categories - Unfamiliar and Neither favorably 
nor unfavorably, intuitively do not give insight into the character popularity or perception of the charater by the public (respondants). Thus these two categories are ignored." ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hoverinfo": "text", "hovertemplate": "Very favorable %{text:.1f}", "marker": { "color": "powderblue" }, "name": "Very Favorable", "orientation": "h", "text": [ 10.399257195914577, 10.679611650485436, 13.092979127134724, 13.680154142581888, 16.43835616438356, 21.604938271604937, 26.360544217687078, 40.47822374039283, 46.434634974533104, 46.779661016949156, 47.7891156462585, 50.556030795551756, 51.44557823129252, 52.09222886421861 ], "type": "bar", "x": [ 10.399257195914577, 10.679611650485436, 13.092979127134724, 13.680154142581888, 16.43835616438356, 21.604938271604937, 26.360544217687078, 40.47822374039283, 46.434634974533104, 46.779661016949156, 47.7891156462585, 50.556030795551756, 51.44557823129252, 52.09222886421861 ], "y": [ "Jar Jar Binks", "Emperor Palpatine", "Boba Fett", "Lando Calrissian", "Padme Amidala", "Anakin Skywalker", "Darth Vader", "C-3P0", "Princess Leia Organa", "Luke Skywalker", "R2 D2", "Obi Wan Kenobi", "Yoda", "Han Solo" ] }, { "hoverinfo": "text", "hovertemplate": "Somewhat favorable %{text:.1f}", "marker": { "color": "#009999" }, "name": "Somewhat Favorable", "orientation": "h", "text": [ 12.070566388115136, 13.883495145631066, 14.516129032258066, 21.483622350674374, 17.90606653620352, 23.721340388007054, 14.540816326530612, 19.55593509820666, 17.826825127334462, 18.559322033898304, 15.731292517006803, 13.60136869118905, 12.244897959183673, 12.894961571306576 ], "type": "bar", "x": [ 12.070566388115136, 13.883495145631066, 14.516129032258066, 21.483622350674374, 17.90606653620352, 23.721340388007054, 14.540816326530612, 19.55593509820666, 17.826825127334462, 
18.559322033898304, 15.731292517006803, 13.60136869118905, 12.244897959183673, 12.894961571306576 ], "y": [ "Jar Jar Binks", "Emperor Palpatine", "Boba Fett", "Lando Calrissian", "Padme Amidala", "Anakin Skywalker", "Darth Vader", "C-3P0", "Princess Leia Organa", "Luke Skywalker", "R2 D2", "Obi Wan Kenobi", "Yoda", "Han Solo" ] }, { "hoverinfo": "text", "hovertemplate": "Very Unfavorable %{text:.1f}", "marker": { "color": "#FC4040" }, "name": "Very Unfavorable", "orientation": "h", "text": [ 18.94150417827298, 12.03883495145631, 4.269449715370019, 0.7707129094412332, 3.326810176125244, 3.439153439153439, 12.670068027210885, 0.5977796754910333, 0.5093378607809848, 0.2542372881355932, 0.5102040816326531, 0.5988023952095809, 0.6802721088435374, 0.08539709649871904 ], "type": "bar", "x": [ -18.94150417827298, -12.03883495145631, -4.269449715370019, -0.7707129094412332, -3.326810176125244, -3.439153439153439, -12.670068027210885, -0.5977796754910333, -0.5093378607809848, -0.2542372881355932, -0.5102040816326531, -0.5988023952095809, -0.6802721088435374, -0.08539709649871904 ], "y": [ "Jar Jar Binks", "Emperor Palpatine", "Boba Fett", "Lando Calrissian", "Padme Amidala", "Anakin Skywalker", "Darth Vader", "C-3P0", "Princess Leia Organa", "Luke Skywalker", "R2 D2", "Obi Wan Kenobi", "Yoda", "Han Solo" ] }, { "hoverinfo": "text", "hovertemplate": "Somewhat Unfavorable %{text:.1f}", "marker": { "color": "crimson" }, "name": "Somewhat Unfavorable", "orientation": "h", "text": [ 9.47075208913649, 6.601941747572816, 9.108159392789373, 6.069364161849711, 5.6751467710371815, 7.319223985890652, 8.673469387755102, 1.964133219470538, 1.0186757215619695, 1.1016949152542372, 0.8503401360544218, 0.6843455945252352, 0.6802721088435374, 0.6831767719897524 ], "type": "bar", "x": [ -9.47075208913649, -6.601941747572816, -9.108159392789373, -6.069364161849711, -5.6751467710371815, -7.319223985890652, -8.673469387755102, -1.964133219470538, -1.0186757215619695, -1.1016949152542372, 
-0.8503401360544218, -0.6843455945252352, -0.6802721088435374, -0.6831767719897524 ], "y": [ "Jar Jar Binks", "Emperor Palpatine", "Boba Fett", "Lando Calrissian", "Padme Amidala", "Anakin Skywalker", "Darth Vader", "C-3P0", "Princess Leia Organa", "Luke Skywalker", "R2 D2", "Obi Wan Kenobi", "Yoda", "Han Solo" ] } ], "layout": { "bargap": 0.1, "barmode": "overlay", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], 
"heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": 
"scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, 
"#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 
}, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Percentage of Favourability for every Character
Viewers perception of the popular characters from the Star Wars franchise", "x": 0.5 }, "xaxis": { "range": [ -25, 55 ], "showticklabels": false, "title": { "text": "Percentage of favorability" } }, "yaxis": { "title": { "text": "Characters" } } } }, "text/html": [ "
\n", " \n", " \n", "
\n", " \n", "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "char_very_favorable = []\n", "char_somewhat_favorable = []\n", "char_somewhat_unfavorable = []\n", "char_very_unfavorable = []\n", "\n", "for col in cols:\n", " counts = char_ratings[col].value_counts(normalize=True) * 100\n", " char_very_favorable.append(counts['Very favorably'])\n", " char_somewhat_favorable.append(counts['Somewhat favorably'])\n", " char_somewhat_unfavorable.append(counts['Somewhat unfavorably'])\n", " char_very_unfavorable.append(counts['Very unfavorably'])\n", " \n", "\n", "ratings = pd.DataFrame({\n", " 'character':characters,\n", " 'very_fav':char_very_favorable,\n", " 'somewhat_fav':char_somewhat_favorable,\n", " 'somewhat_unfav':char_somewhat_unfavorable,\n", " 'very_unfav':char_very_unfavorable\n", "})\n", "ratings.sort_values(by='very_fav',ascending=True,inplace=True)\n", " \n", "layout = go.Layout(\n", " title={\n", " \"text\":\"Percentage of Favourability for every Character
\"+\n", " \"Viewers perception of the popular characters from the Star Wars franchise\",\n", " 'x':.5\n", " },\n", " yaxis=go.layout.YAxis(title='Characters'),\n", " xaxis=go.layout.XAxis(\n", " range=[-25, 55],\n", "# tickvals=[-25, -10, 0, 15, 30, 55],\n", "# ticktext=[25, 10, 0, 15, 30, 55],\n", " title='Percentage of favorability',\n", " showticklabels= False\n", " ),\n", " barmode='overlay',\n", " bargap=0.1\n", ")\n", "\n", "data = [\n", " go.Bar(\n", " y=ratings.character,\n", " x= ratings.very_fav,\n", " orientation= 'h',\n", " name= 'Very Favorable',\n", " hoverinfo= 'text',\n", " hovertemplate=\"Very favorable %{text:.1f}\",\n", " text = ratings.very_fav,\n", " marker= dict(color= 'powderblue')\n", " ),\n", " go.Bar(\n", " y=ratings.character,\n", " x= ratings.somewhat_fav,\n", " orientation= 'h',\n", " name= 'Somewhat Favorable',\n", " hovertemplate=\"Somewhat favorable %{text:.1f}\",\n", " hoverinfo= 'text',\n", " text= ratings.somewhat_fav,\n", " marker= dict(color= '#009999')\n", " ),\n", " go.Bar(\n", " y=ratings.character,\n", " x= [-1 * r for r in ratings.very_unfav],\n", " orientation= 'h',\n", " name= 'Very Unfavorable',\n", " hovertemplate=\"Very Unfavorable %{text:.1f}\",\n", " hoverinfo= 'text',\n", " text= ratings.very_unfav,\n", " marker= dict(color= '#FC4040')\n", " ),\n", " go.Bar(\n", " y=ratings.character,\n", " x= [-1 * r for r in ratings.somewhat_unfav],\n", " orientation= 'h',\n", " name= 'Somewhat Unfavorable',\n", " hovertemplate=\"Somewhat Unfavorable %{text:.1f}\",\n", " hoverinfo= 'text',\n", " text= ratings.somewhat_unfav,\n", " marker= dict(color= 'crimson')\n", " )\n", "]\n", "\n", "fig = go.Figure(data= data, layout= layout)\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The interactive plot above draws the following conclusions :-\n", "\n", "* The most popular and favorable characters are :-\n", "\n", " * Han Solo\n", " * Yoda\n", " * Obi Wan Kenobi\n", " \n", " These characters have more than 
50% of \"very favorable\" ratings and more than 10% of \"somewhat favorable\" ratings from the respondents.\n", "\n", "* No character is completely unfavored. A few characters crossing the 10% mark for unfavorability are :-\n", " \n", " * Jar Jar Binks\n", " * Emperor Palpatine\n", " * Darth Vader\n", " \n", "* An interesting point to notice is that the two characters with the highest unfavorability percentages also have comparable favorability percentages. This suggests that no character in the *Star Wars* film franchise is overwhelmingly disliked.\n", "\n", "* Some characters have controversial ratings, such as :-\n", "\n", " * Emperor Palpatine\n", " * Darth Vader\n", " \n", " These characters are villains and have a high unfavorability percentage, but at the same time their favorability percentage is equal or higher.\n", " \n", "* The character list includes both Anakin Skywalker and Darth Vader, which shows the viewers' perception of the same character at two different points in the *Star Wars* timeline.\n", "\n", "\n", "To conclude this analysis, the final question to answer is how many of the respondents like the space-opera media franchises. A respondent is counted as a sci-fi fan if that person has said they are a fan of the *Star Wars* franchise or the *Star Trek* franchise. The columns considered for this are *is_fan* and *likes_star_trek*." ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0.5, 1.0, 'Percentage of respondents who are fans of the Space-Opera Media franchises')" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df['scifi_fan'] = df.apply(\n", " lambda x: True if x.likes_star_trek == True or x.is_fan == True else False,\n", " axis=1\n", ")\n", "scifi_fans = df[df.scifi_fan == True]\n", "scifi = df.scifi_fan.value_counts(normalize=True)\n", "\n", "plt.style.use('fivethirtyeight')\n", "plt.figure(figsize=(12,8))\n", "sns.barplot(x= scifi.index, y= scifi.values)\n", "plt.ylabel('precentage of respondants')\n", "plt.xlabel('respondant is a fan?')\n", "plt.title('Percentage of respondants who are fans of the Space-Opera Media franchises')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From all the fans of the *Star Wars* franchise, a very high percentage of respondants are fans of both *Star Wars* and *Star Trek* - Space-Opera Media franchises. A more granular analysis of these fans reveals," ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hovertemplate": "%{label} : %{percent}", "labels": [ "Male", "Female" ], "marker": { "colors": [ "#009999", "#ff9933", "#99004C" ], "line": { "width": 1 } }, "type": "pie", "values": [ 0.5544388609715243, 0.4455611390284757 ] } ], "layout": { "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" 
} ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, 
"#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 
1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", 
"gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Distribution of Gender among Sci-Fi Fans
Percentage of Males, Females or others in the Sci-Fi Fan population", "x": 0.5, "xref": "paper", "yanchor": "top" } } }, "text/html": [ "
\n", " \n", " \n", "
\n", " \n", "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "gender_counts = scifi_fans.Gender.value_counts(normalize=True)\n", "\n", "layout = go.Layout(\n", " title={\n", " 'text':\"Distribution of Gender among Sci-Fi Fans
\"+\n", " \"Percentage of Males, Females or others in the Sci-Fi Fan population\",\n", " 'yanchor':'top',\n", " 'xref':'paper',\n", " 'x':0.5\n", " }\n", ")\n", "\n", "data = [\n", " go.Pie(\n", " labels= gender_counts.index,\n", " values= gender_counts.values,\n", " marker= dict(\n", " colors= ['#009999','#ff9933','#99004C'],\n", " line= dict(width=1)\n", " ),\n", " hovertemplate= \"%{label} : %{percent}\"\n", " )\n", "]\n", "\n", "fig = go.Figure(data= data, layout= layout)\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The percentage of Males and Females are close to equal, with the Males slightly larger." ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0.5, 1.0, 'Percentage of fans of Space-Opera Media per Age category')" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "scifi_fans['age_label'] = scifi_fans.Age.apply(encode)\n", "fans_age = scifi_fans.age_label.value_counts(dropna=False, normalize=True)\n", "fans_age = fans_age.iloc[[3,1,0,2]]\n", "\n", "plt.style.use('fivethirtyeight')\n", "plt.figure(figsize=(12,8))\n", "sns.barplot(x= fans_age.index, y=fans_age.values)\n", "plt.ylabel('precentage of fans')\n", "plt.xlabel('age category')\n", "plt.title('Percentage of fans of Space-Opera Media per Age category')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Interesting to note that the plot shows, the fans of the space-media opera franchises are mostly older than 30 years, with a peak in the number of fans aged 45 and older. The two space-opera media franchises mentioned are *Star Wars* and *Star Trek*. The former first released in 1977 whereas the latter first released in 1966. The Young of that era are now the Senior and Elder of today, thus it makes sense that most of the fans are 45 or older." 
] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hoverinfo": "x", "marker": { "color": "#ff9933" }, "name": "Male", "orientation": "h", "type": "bar", "x": [ 78, 95, 65, 93 ], "y": [ "Young", "Middle", "Elder", "Senior" ] }, { "hoverinfo": "text", "marker": { "color": "#009999" }, "name": "Female", "orientation": "h", "text": [ "57", "64", "62", "83" ], "type": "bar", "x": [ -57, -64, -62, -83 ], "y": [ "Young", "Middle", "Elder", "Senior" ] } ], "layout": { "bargap": 0.1, "barmode": "overlay", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 
0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, 
"type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, 
"#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, 
"bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "font": { "size": 22 }, "text": "Gender distribution of Sapce-Opera Media Fans between various Age categories
Distribution of male and female space-opera media fans for every age category", "xanchor": "left" }, "xaxis": { "range": [ -100, 100 ], "showticklabels": false, "title": { "text": "Number of Fans" } }, "yaxis": { "title": { "text": "Age category" } } } }, "text/html": [ "
\n", " \n", " \n", "
\n", " \n", "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "age_catgs = scifi_fans.age_label.unique()\n", "\n", "Males = []\n", "Females = []\n", "\n", "for catg in age_catgs:\n", " gender_counts = scifi_fans[scifi_fans.age_label == catg].Gender.value_counts()\n", " Males.append(gender_counts[0])\n", " Females.append(gender_counts[1])\n", " \n", "layout = go.Layout(\n", " title = {\n", " 'text':\"Gender distribution of Sapce-Opera Media Fans between various Age categories
\"+\n", " \"Distribution of male and female space-opera media fans for every age category\",\n", " 'xanchor':'left',\n", " 'font':{'size':22}\n", " },\n", " yaxis=go.layout.YAxis(title='Age category'),\n", " xaxis=go.layout.XAxis(\n", " range=[-100, 100],\n", "# tickvals=[-60, -40, -20, 0, 25, 50, 75],\n", "# ticktext=[60, 40, 20, 0, 25, 50, 75],\n", " title='Number of Fans',\n", " showticklabels= False\n", " ),\n", " barmode='overlay',\n", " bargap=0.1\n", ")\n", "\n", "data = [\n", " go.Bar(\n", " y=age_catgs,\n", " x=Males,\n", " orientation='h',\n", " name='Male',\n", " hoverinfo='x',\n", " marker=dict(color='#ff9933')\n", " ),\n", " go.Bar(\n", " y=age_catgs,\n", " x=[-1 * f for f in Females],\n", " orientation='h',\n", " name='Female',\n", " text= Females,\n", " hoverinfo='text',\n", " marker=dict(color='#009999')\n", " )\n", "]\n", "\n", "fig = go.Figure(data= data, layout= layout)\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is an increase in the number of fans by age category as shown by the plot. It is inline with the results obtained above this. The plot below summarizes the data into categories and makes it easier to gain a view of the entire distribution." 
] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "branchvalues": "total", "ids": [ "Total", "Male", "Female", "Young Males", "Middle Males", "Senior Males", "Elder Males", "Young Females", "Middle Females", "Senior Females", "Elder Females", "Star Wars Fans Young Males", "Star Wars Fans Middle Males", "Star Wars Fans Senior Males", "Star Wars Fans Elder Males", "Star Wars Fans Young Females", "Star Wars Fans Middle Females", "Star Wars Fans Senior Females", "Star Wars Fans Elder Females" ], "labels": [ "Total", "Male", "Female", "Young", "Middle", "Senior", "Elder", "Young", "Middle", "Senior", "Elder", "Star Wars Fans", "Star Wars Fans", "Star Wars Fans", "Star Wars Fans", "Star Wars Fans", "Star Wars Fans", "Star Wars Fans", "Star Wars Fans" ], "parents": [ "", "Total", "Total", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Young Males", "Middle Males", "Senior Males", "Elder Males", "Young Females", "Middle Females", "Senior Females", "Elder Females" ], "type": "sunburst", "values": [ 1086, 497, 549, 104, 132, 140, 121, 114, 148, 151, 136, 74, 91, 80, 58, 50, 59, 74, 55 ] } ], "layout": { "autosize": true, "margin": { "b": 0, "l": 0, "r": 0, "t": 85 }, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, 
"ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, 
"ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 
0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { 
"backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Distribution of Respondants in various categories
Summarizing the data into various categories", "x": 0.5, "y": 0.95 } } }, "text/html": [ "
\n", " \n", " \n", "
\n", " \n", "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df['age_label'] = df.Age.apply(encode)\n", "\n", "total_resp = len(df)\n", "\n", "gender_counts = df.Gender.value_counts(dropna=False)\n", "no_male = gender_counts[1]\n", "no_female = gender_counts[0]\n", "\n", "age_counts_male = df[df.Gender == 'Male'].age_label.value_counts(dropna=False)\n", "no_male_young = age_counts_male[3]\n", "no_male_middle = age_counts_male[1]\n", "no_male_senior = age_counts_male[0]\n", "no_male_elder = age_counts_male[2]\n", "\n", "age_counts_female = df[df.Gender == 'Female'].age_label.value_counts(dropna=False)\n", "no_female_young = age_counts_female[3]\n", "no_female_middle = age_counts_female[1]\n", "no_female_senior = age_counts_female[0]\n", "no_female_elder = age_counts_female[2]\n", "\n", "sw_count_young_male = scifi_fans[(df.Gender=='Male') & (df.age_label == 'Young')].is_fan.sum()\n", "sw_count_middle_male = scifi_fans[(df.Gender=='Male') & (df.age_label == 'Middle')].is_fan.sum()\n", "sw_count_senior_male = scifi_fans[(df.Gender=='Male') & (df.age_label == 'Senior')].is_fan.sum()\n", "sw_count_elder_male = scifi_fans[(df.Gender=='Male') & (df.age_label == 'Elder')].is_fan.sum()\n", "\n", "sw_count_young_female = scifi_fans[(df.Gender=='Female') & (df.age_label == 'Young')].is_fan.sum()\n", "sw_count_middle_female = scifi_fans[(df.Gender=='Female') & (df.age_label == 'Middle')].is_fan.sum()\n", "sw_count_senior_female = scifi_fans[(df.Gender=='Female') & (df.age_label == 'Senior')].is_fan.sum()\n", "sw_count_elder_female = scifi_fans[(df.Gender=='Female') & (df.age_label == 'Elder')].is_fan.sum()\n", "\n", "labels= [\n", " 'Total',\n", " 'Male',\n", " 'Female',\n", " 'Young',\n", " 'Middle',\n", " 'Senior',\n", " 'Elder',\n", " 'Young',\n", " 'Middle',\n", " 'Senior',\n", " 'Elder',\n", " 'Star Wars Fans',\n", " 'Star Wars Fans',\n", " 'Star Wars Fans',\n", " 'Star Wars Fans',\n", " 'Star Wars Fans',\n", " 'Star Wars Fans',\n", " 'Star Wars Fans',\n", " 
'Star Wars Fans'\n", "]\n", "ids = [\n", " 'Total',\n", " 'Male',\n", " 'Female',\n", " 'Young Males',\n", " 'Middle Males',\n", " 'Senior Males',\n", " 'Elder Males',\n", " 'Young Females',\n", " 'Middle Females',\n", " 'Senior Females',\n", " 'Elder Females',\n", " 'Star Wars Fans Young Males',\n", " 'Star Wars Fans Middle Males',\n", " 'Star Wars Fans Senior Males',\n", " 'Star Wars Fans Elder Males',\n", " 'Star Wars Fans Young Females',\n", " 'Star Wars Fans Middle Females',\n", " 'Star Wars Fans Senior Females',\n", " 'Star Wars Fans Elder Females'\n", "]\n", "parents = [\n", " \"\",\n", " 'Total',\n", " 'Total',\n", " 'Male',\n", " 'Male',\n", " 'Male',\n", " 'Male',\n", " 'Female',\n", " 'Female',\n", " 'Female',\n", " 'Female',\n", " 'Young Males',\n", " 'Middle Males',\n", " 'Senior Males',\n", " 'Elder Males',\n", " 'Young Females',\n", " 'Middle Females',\n", " 'Senior Females',\n", " 'Elder Females',\n", "]\n", "values = [\n", " total_resp,\n", " no_male,\n", " no_female,\n", " no_male_young,\n", " no_male_middle,\n", " no_male_senior,\n", " no_male_elder,\n", " no_female_young,\n", " no_female_middle,\n", " no_female_senior,\n", " no_female_elder,\n", " sw_count_young_male,\n", " sw_count_middle_male,\n", " sw_count_senior_male,\n", " sw_count_elder_male,\n", " sw_count_young_female,\n", " sw_count_middle_female,\n", " sw_count_senior_female,\n", " sw_count_elder_female\n", "]\n", "\n", "data= [\n", " go.Sunburst(\n", " ids= ids,\n", " labels = labels,\n", " parents= parents,\n", " values= values,\n", " branchvalues= 'total'\n", " )\n", "]\n", "\n", "layout = go.Layout(\n", " title= {\n", " 'text': 'Distribution of Respondents in various categories
'+\n", " 'Summarizing the data into various categories',\n", " 'x':0.5,\n", " 'y':0.95\n", " },\n", " autosize= True,\n", " margin = dict(t=85, l=0, r=0, b=0)\n", ")\n", "\n", "fig = go.Figure(data= data,layout=layout)\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The conclusions of the analysis of the survey respondents are as follows:\n", "\n", "* Slightly more than half of the respondents are fans of the *Star Wars* film franchise.\n", "\n", "* The fans of the *Star Wars* film franchise have about equal percentages of Males and Females, with the Males slightly tipping the balance.\n", "\n", "* The fans of the *Star Wars* film franchise fall in the age bracket of 30 to 60 years, reflecting the period when *Star Wars* was in vogue.\n", "\n", "* *The Empire Strikes Back* is the most viewed and the most popular movie of the entire *Star Wars* film franchise.\n", "\n", "* The *Star Wars Extended Universe* is not a great hit with the public, but among the fans who know about its existence, slightly more than half like the *Extended Universe*.\n", "\n", "* Super fans include a substantially higher percentage of Males than Females.\n", "\n", "* Super fans are mostly from the Young generation, since the *Extended Universe* contains TV series and video games which are new and a hit with the Young.\n", "\n", "* The following characters are rated the most favorable from the *Star Wars* film franchise:\n", " * Han Solo\n", " * Yoda\n", " * Obi Wan Kenobi\n", "\n", "* Similarly, the following characters are rated the most unfavorable from the *Star Wars* film franchise:\n", " * Jar Jar Binks\n", " * Darth Vader\n", " * Emperor Palpatine\n", "\n", "* The following characters are the most controversial of the *Star Wars* film franchise: although they are villains and among the most unfavorable, they still have a good percentage of favorability:\n", " * Emperor Palpatine\n", " * Darth Vader\n", " 
\n", "* A good percentage of fans of the *Star Wars* franchise are fans of the Space-Opera Media franchises.\n", "\n", "* The percentages of Males and Females among the fans of Space-Opera Media franchises are very close, with Males slightly outnumbering Females.\n", "\n", "* The fans of the Space-Opera Media franchises are mostly above the age of 30 years, with a peak in the age bracket of 45 to 60 years. This is plausible since *Star Wars* and *Star Trek* date from the 1970s and 1960s respectively.\n", "\n", "* The final sunburst plot summarizes how the respondents are distributed among the various categories." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 2 }