{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "

\n", "# Case Study: Movies Dataset

\n", "
This notebook uses a dataset from Kaggle. We will describe the dataset in more detail as we explore it with *pandas*. \n", "\n", "## Download the Dataset\n", "\n", "Please note that **you will need to download the dataset** from the course website. \n", "\n", "You can find the data at https://junyounglim.github.io/. Unzip the file to a location of your choice. \n", "\n", "Here are instructions on how to unzip a file in Windows: https://support.microsoft.com/en-us/help/14200/windows-compress-uncompress-zip-files. \n", "On a Mac, simply double-click the file. \n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

\n", "## Use Pandas to Read the Dataset
\n", "

\n", "
\n", "In this notebook, we will be using a single CSV file:\n", "* **tmdb_5000_movies.csv**\n", "\n", "The dataset contains about 5,000 movies (4,803 rows). \n", "\n", "It has the following columns (features): \n", " budget, genres, homepage, id, keywords, original_language, original_title, overview, popularity, production_companies, production_countries, release_date, revenue, runtime, spoken_languages, status, tagline, title, vote_average, vote_count\n", "\n", "\n", "We use the pandas *read_csv* function to load the file into a DataFrame. " ] }, { "cell_type": "code", "execution_count": 95, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# import pandas and load the data\n", "import pandas as pd\n", "\n", "# adjust this path to wherever you unzipped the file\n", "filepath = 'tmdb_5000_movies.csv'\n", "movies = pd.read_csv(filepath)\n", "\n", "# or, equivalently, pass the path directly\n", "movies = pd.read_csv('tmdb_5000_movies.csv')" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " budget genres \\\n", "0 237000000 [{\"id\": 28, \"name\": \"Action\"}, {\"id\": 12, \"nam... \n", "1 300000000 [{\"id\": 12, \"name\": \"Adventure\"}, {\"id\": 14, \"... \n", "2 245000000 [{\"id\": 28, \"name\": \"Action\"}, {\"id\": 12, \"nam... \n", "3 250000000 [{\"id\": 28, \"name\": \"Action\"}, {\"id\": 80, \"nam... \n", "4 260000000 [{\"id\": 28, \"name\": \"Action\"}, {\"id\": 12, \"nam... \n", "\n", " homepage id \\\n", "0 http://www.avatarmovie.com/ 19995 \n", "1 http://disney.go.com/disneypictures/pirates/ 285 \n", "2 http://www.sonypictures.com/movies/spectre/ 206647 \n", "3 http://www.thedarkknightrises.com/ 49026 \n", "4 http://movies.disney.com/john-carter 49529 \n", "\n", " keywords original_language \\\n", "0 [{\"id\": 1463, \"name\": \"culture clash\"}, {\"id\":... en \n", "1 [{\"id\": 270, \"name\": \"ocean\"}, {\"id\": 726, \"na... en \n", "2 [{\"id\": 470, \"name\": \"spy\"}, {\"id\": 818, \"name... en \n", "3 [{\"id\": 849, \"name\": \"dc comics\"}, {\"id\": 853,... en \n", "4 [{\"id\": 818, \"name\": \"based on novel\"}, {\"id\":... en \n", "\n", " original_title \\\n", "0 Avatar \n", "1 Pirates of the Caribbean: At World's End \n", "2 Spectre \n", "3 The Dark Knight Rises \n", "4 John Carter \n", "\n", " overview popularity \\\n", "0 In the 22nd century, a paraplegic Marine is di... 150.437577 \n", "1 Captain Barbossa, long believed to be dead, ha... 139.082615 \n", "2 A cryptic message from Bond’s past sends him o... 107.376788 \n", "3 Following the death of District Attorney Harve... 112.312950 \n", "4 John Carter is a war-weary, former military ca... 43.926995 \n", "\n", " production_companies \\\n", "0 [{\"name\": \"Ingenious Film Partners\", \"id\": 289... \n", "1 [{\"name\": \"Walt Disney Pictures\", \"id\": 2}, {\"... \n", "2 [{\"name\": \"Columbia Pictures\", \"id\": 5}, {\"nam... \n", "3 [{\"name\": \"Legendary Pictures\", \"id\": 923}, {\"... 
\n", "4 [{\"name\": \"Walt Disney Pictures\", \"id\": 2}] \n", "\n", " production_countries release_date revenue \\\n", "0 [{\"iso_3166_1\": \"US\", \"name\": \"United States o... 12/10/09 2787965087 \n", "1 [{\"iso_3166_1\": \"US\", \"name\": \"United States o... 5/19/07 961000000 \n", "2 [{\"iso_3166_1\": \"GB\", \"name\": \"United Kingdom\"... 10/26/15 880674609 \n", "3 [{\"iso_3166_1\": \"US\", \"name\": \"United States o... 7/16/12 1084939099 \n", "4 [{\"iso_3166_1\": \"US\", \"name\": \"United States o... 3/7/12 284139100 \n", "\n", " runtime spoken_languages status \\\n", "0 162.0 [{\"iso_639_1\": \"en\", \"name\": \"English\"}, {\"iso... Released \n", "1 169.0 [{\"iso_639_1\": \"en\", \"name\": \"English\"}] Released \n", "2 148.0 [{\"iso_639_1\": \"fr\", \"name\": \"Fran\\u00e7ais\"},... Released \n", "3 165.0 [{\"iso_639_1\": \"en\", \"name\": \"English\"}] Released \n", "4 132.0 [{\"iso_639_1\": \"en\", \"name\": \"English\"}] Released \n", "\n", " tagline \\\n", "0 Enter the World of Pandora. \n", "1 At the end of the world, the adventure begins. \n", "2 A Plan No One Escapes \n", "3 The Legend Ends \n", "4 Lost in our world, found in another. \n", "\n", " title vote_average vote_count \n", "0 Avatar 7.2 11800 \n", "1 Pirates of the Caribbean: At World's End 6.9 4500 \n", "2 Spectre 6.3 4466 \n", "3 The Dark Knight Rises 7.6 9106 \n", "4 John Carter 6.1 2124 " ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Now that the dataset is loaded, preview the first rows to get a feel for its layout\n", "movies.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The dataset loaded correctly, but it needs some cleaning. Notice that several columns (genres, keywords, production_companies, production_countries, spoken_languages) hold JSON-like strings of objects rather than plain values." ] }, { "cell_type": "code", "execution_count": 96, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " budget genres \\\n", "0 237000000 [Action, Adventure, Fantasy, Science Fiction] \n", "1 300000000 [Adventure, Fantasy, Action] \n", "2 245000000 [Action, Adventure, Crime] \n", "3 250000000 [Action, Crime, Drama, Thriller] \n", "4 260000000 [Action, Adventure, Science Fiction] \n", "\n", " homepage id \\\n", "0 http://www.avatarmovie.com/ 19995 \n", "1 http://disney.go.com/disneypictures/pirates/ 285 \n", "2 http://www.sonypictures.com/movies/spectre/ 206647 \n", "3 http://www.thedarkknightrises.com/ 49026 \n", "4 http://movies.disney.com/john-carter 49529 \n", "\n", " keywords original_language \\\n", "0 [culture clash, future, space war, space colon... en \n", "1 [ocean, drug abuse, exotic island, east india ... en \n", "2 [spy, based on novel, secret agent, sequel, mi... en \n", "3 [dc comics, crime fighter, terrorist, secret i... en \n", "4 [based on novel, mars, medallion, space travel... en \n", "\n", " original_title \\\n", "0 Avatar \n", "1 Pirates of the Caribbean: At World's End \n", "2 Spectre \n", "3 The Dark Knight Rises \n", "4 John Carter \n", "\n", " overview popularity \\\n", "0 In the 22nd century, a paraplegic Marine is di... 150.437577 \n", "1 Captain Barbossa, long believed to be dead, ha... 139.082615 \n", "2 A cryptic message from Bond’s past sends him o... 107.376788 \n", "3 Following the death of District Attorney Harve... 112.312950 \n", "4 John Carter is a war-weary, former military ca... 43.926995 \n", "\n", " production_companies production_countries \\\n", "0 [Ingenious Film Partners, Twentieth Century Fo... [US, GB] \n", "1 [Walt Disney Pictures, Jerry Bruckheimer Films... [US] \n", "2 [Columbia Pictures, Danjaq, B24] [GB, US] \n", "3 [Legendary Pictures, Warner Bros., DC Entertai... 
[US] \n", "4 [Walt Disney Pictures] [US] \n", "\n", " release_date revenue runtime spoken_languages status \\\n", "0 12/10/09 2787965087 162.0 [en, es] Released \n", "1 5/19/07 961000000 169.0 [en] Released \n", "2 10/26/15 880674609 148.0 [fr, en, es, it, de] Released \n", "3 7/16/12 1084939099 165.0 [en] Released \n", "4 3/7/12 284139100 132.0 [en] Released \n", "\n", " tagline \\\n", "0 Enter the World of Pandora. \n", "1 At the end of the world, the adventure begins. \n", "2 A Plan No One Escapes \n", "3 The Legend Ends \n", "4 Lost in our world, found in another. \n", "\n", " title vote_average vote_count \n", "0 Avatar 7.2 11800 \n", "1 Pirates of the Caribbean: At World's End 6.9 4500 \n", "2 Spectre 6.3 4466 \n", "3 The Dark Knight Rises 7.6 9106 \n", "4 John Carter 6.1 2124 " ] }, "execution_count": 96, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import re\n", "\n", "# Several columns hold JSON-like strings. Splitting on the double quote\n", "# puts the value we want at a fixed offset within each repeating group.\n", "\n", "# {\"id\": ..., \"name\": ...} entries: 6 pieces per entry, the name is the 6th\n", "def to_list_4(strng):\n", " return [item for index, item in enumerate(strng.split('\"')) if (index + 1) % 6 == 0]\n", "\n", "# {\"iso_...\": ..., \"name\": ...} entries: 8 pieces per entry, the ISO code is the 4th\n", "def to_list_2(strng):\n", " return [item for index, item in enumerate(strng.split('\"')) if (index + 5) % 8 == 0]\n", "\n", "# {\"name\": ..., \"id\": ...} entries: 6 pieces per entry, the name is the 4th\n", "def to_list_2_mod(strng):\n", " return [item for index, item in enumerate(strng.split('\"')) if (index + 3) % 6 == 0]\n", "\n", "# replace each string column with a plain Python list of names or codes\n", "movies.genres = movies.genres.apply(to_list_4)\n", "movies.keywords = movies.keywords.apply(to_list_4)\n", "movies.production_companies = movies.production_companies.apply(to_list_2_mod)\n", "movies.production_countries = movies.production_countries.apply(to_list_2)\n", "movies.spoken_languages = movies.spoken_languages.apply(to_list_2)\n", "movies.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

## Descriptive Statistics

\n", "\n", "pandas also provides basic summary-statistics methods, such as *describe*, *mean*, and *corr*, that help us understand our data. " ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "count 4.803000e+03\n", "mean 2.904504e+07\n", "std 4.072239e+07\n", "min 0.000000e+00\n", "25% 7.900000e+05\n", "50% 1.500000e+07\n", "75% 4.000000e+07\n", "max 3.800000e+08\n", "Name: budget, dtype: float64" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies['budget'].describe()" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " budget id popularity revenue runtime \\\n", "count 4.803000e+03 4803.000000 4803.000000 4.803000e+03 4753.000000 \n", "mean 2.904504e+07 57165.484281 21.492301 8.226064e+07 106.853777 \n", "std 4.072239e+07 88694.614033 31.816650 1.628571e+08 22.614586 \n", "min 0.000000e+00 5.000000 0.000000 0.000000e+00 0.000000 \n", "25% 7.900000e+05 9014.500000 4.668070 0.000000e+00 94.000000 \n", "50% 1.500000e+07 14629.000000 12.921594 1.917000e+07 104.000000 \n", "75% 4.000000e+07 58610.500000 28.313505 9.291719e+07 117.000000 \n", "max 3.800000e+08 459488.000000 875.581305 2.787965e+09 338.000000 \n", "\n", " vote_average vote_count \n", "count 4803.000000 4803.000000 \n", "mean 6.092172 690.217989 \n", "std 1.194612 1234.585891 \n", "min 0.000000 0.000000 \n", "25% 5.600000 54.000000 \n", "50% 6.200000 235.000000 \n", "75% 6.800000 737.000000 \n", "max 10.000000 13752.000000 " ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies.describe()" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6.092171559442011" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies['vote_average'].mean()" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "budget 2.904504e+07\n", "id 5.716548e+04\n", "popularity 2.149230e+01\n", "revenue 8.226064e+07\n", "runtime 1.068538e+02\n", "vote_average 6.092172e+00\n", "vote_count 6.902180e+02\n", "dtype: float64" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies.mean()" ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " budget id popularity revenue runtime \\\n", "budget 1.000000 -0.089377 0.505414 0.730823 0.267915 \n", "id -0.089377 1.000000 0.031202 -0.050425 -0.152379 \n", "popularity 0.505414 0.031202 1.000000 0.644724 0.224205 \n", "revenue 0.730823 -0.050425 0.644724 1.000000 0.249986 \n", "runtime 0.267915 -0.152379 0.224205 0.249986 1.000000 \n", "vote_average 0.093146 -0.270595 0.273952 0.197150 0.377139 \n", "vote_count 0.593180 -0.004128 0.778130 0.781487 0.272182 \n", "\n", " vote_average vote_count \n", "budget 0.093146 0.593180 \n", "id -0.270595 -0.004128 \n", "popularity 0.273952 0.778130 \n", "revenue 0.197150 0.781487 \n", "runtime 0.377139 0.272182 \n", "vote_average 1.000000 0.312997 \n", "vote_count 0.312997 1.000000 " ] }, "execution_count": 80, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies.corr()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also filter information conditionally. " ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n" ] }, { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " budget genres \\\n", "65 185000000 [Drama, Action, Crime, Thriller] \n", "95 165000000 [Adventure, Drama, Science Fiction] \n", "96 160000000 [Action, Thriller, Science Fiction, Mystery, A... \n", "329 94000000 [Adventure, Fantasy, Action] \n", "662 63000000 [Drama] \n", "690 60000000 [Fantasy, Drama, Crime] \n", "809 55000000 [Comedy, Drama, Romance] \n", "1553 33000000 [Crime, Mystery, Thriller] \n", "1663 30000000 [Drama, Crime] \n", "1818 22000000 [Drama, History, War] \n", "1847 25000000 [Drama, Crime] \n", "1881 25000000 [Drama, Crime] \n", "1987 24000000 [Fantasy, Animation, Adventure] \n", "1990 18000000 [Adventure, Action, Science Fiction] \n", "2091 19000000 [Crime, Drama, Thriller] \n", "2170 806948 [Drama, Horror, Thriller] \n", "2247 26500000 [Adventure, Fantasy, Animation] \n", "2284 19000000 [Horror, Thriller] \n", "2294 15000000 [Fantasy, Adventure, Animation, Family] \n", "2386 0 [Western, Action, Drama, History] \n", "2453 16400000 [Drama] \n", "2731 13000000 [Drama, Crime] \n", "2760 6000000 [Drama, Thriller] \n", "2796 0 [Adventure, Drama, Romance] \n", "2912 11000000 [Adventure, Action, Science Fiction] \n", "2947 10000000 [Drama, History] \n", "2970 10500000 [Drama, Comedy] \n", "3041 10000000 [Comedy, Drama, Romance] \n", "3057 20000000 [Drama] \n", "3232 8000000 [Thriller, Crime] \n", "3337 6000000 [Drama, Crime] \n", "3454 6000000 [Drama, Crime, Thriller] \n", "3519 0 [Comedy] \n", "3573 9000000 [Mystery, Thriller] \n", "3622 5000000 [Western] \n", "3719 3000000 [Drama] \n", "3723 0 [Drama, Family] \n", "3788 4000000 [Drama] \n", "3865 3300000 [Drama] \n", "3866 3300000 [Drama, Crime] \n", "3906 3000000 [Comedy, Drama, Romance] \n", "3992 0 [] \n", "4045 0 [Comedy, Drama, Family] \n", "4238 1 [Drama, Comedy] \n", "4247 1 [Romance, Comedy, Drama] \n", "4302 1200000 [Western] \n", "4535 2000000 [Action, Drama] \n", "4602 350000 [Drama] \n", "4662 0 [Comedy] \n", "4755 50000 [Documentary] \n", "\n", " homepage id \\\n", 
"65 http://thedarkknight.warnerbros.com/dvdsite/ 155 \n", "95 http://www.interstellarmovie.net/ 157336 \n", "96 http://inceptionmovie.warnerbros.com/ 27205 \n", "329 http://www.lordoftherings.net 122 \n", "662 http://www.foxmovies.com/movies/fight-club 550 \n", "690 http://thegreenmile.warnerbros.com/ 497 \n", "809 NaN 13 \n", "1553 http://www.sevenmovie.com/ 807 \n", "1663 NaN 311 \n", "1818 http://www.schindlerslist.com/ 424 \n", "1847 http://www.warnerbros.com/goodfellas 769 \n", "1881 NaN 278 \n", "1987 NaN 4935 \n", "1990 http://www.starwars.com/films/star-wars-episod... 1891 \n", "2091 NaN 274 \n", "2170 NaN 539 \n", "2247 NaN 128 \n", "2284 NaN 694 \n", "2294 http://movies.disney.com/spirited-away 129 \n", "2386 NaN 69848 \n", "2453 NaN 207 \n", "2731 NaN 240 \n", "2760 http://www.roomthemovie.com 264644 \n", "2796 NaN 43867 \n", "2912 http://www.starwars.com/films/star-wars-episod... 11 \n", "2947 NaN 30973 \n", "2970 NaN 88641 \n", "3041 NaN 27322 \n", "3057 http://www.historyx.com/ 73 \n", "3232 NaN 680 \n", "3337 http://www.thegodfather.com/ 238 \n", "3454 http://www.mgm.com/#/our-titles/2083/The-Usual... 629 \n", "3519 NaN 89861 \n", "3573 http://www.otnemem.com/ 77 \n", "3622 NaN 335 \n", "3719 NaN 510 \n", "3723 http://www.anneofgreengables.com/ 17663 \n", "3788 NaN 242575 \n", "3865 http://sonyclassics.com/whiplash/ 244786 \n", "3866 http://cidadededeus.globo.com/ 598 \n", "3906 NaN 284 \n", "3992 NaN 346081 \n", "4045 NaN 78373 \n", "4238 NaN 3082 \n", "4247 NaN 361505 \n", "4302 http://www.mgm.com/#/our-titles/766/The-Good,-... 429 \n", "4535 NaN 346 \n", "4602 NaN 389 \n", "4662 NaN 40963 \n", "4755 NaN 322745 \n", "\n", " keywords original_language \\\n", "65 [dc comics, crime fighter, secret identity, sc... en \n", "95 [saving the world, artificial intelligence, fa... en \n", "96 [loss of lover, dream, kidnapping, sleep, subc... en \n", "329 [elves, orcs, middle-earth (tolkien), based on... 
en \n", "662 [support group, dual identity, nihilism, rage ... en \n", "690 [southern usa, black people, mentally disabled... en \n", "809 [vietnam veteran, hippie, mentally disabled, r... en \n", "1553 [self-fulfilling prophecy, detective, s.w.a.t.... en \n", "1663 [life and death, corruption, street gang, rape... en \n", "1818 [factory, concentration camp, hero, holocaust,... en \n", "1847 [prison, based on novel, florida, 1970s, mass ... en \n", "1881 [prison, corruption, police brutality, prison ... en \n", "1987 [flying, witch, rain, castle, scarecrow, body ... ja \n", "1990 [rebel, android, asteroid, space battle, snow ... en \n", "2091 [based on novel, psychopath, horror, suspense,... en \n", "2170 [hotel, clerk, arizona, shower, rain, motel, m... en \n", "2247 [fight, wolf, village and town, iron, pan, wil... ja \n", "2284 [hotel, isolation, hotelier, colorado, maze, b... en \n", "2294 [witch, parents kids relationship, magic, twil... ja \n", "2386 [war, army, battlefield, chivalry] en \n", "2453 [individual, philosophy, poetry, shakespeare, ... en \n", "2731 [italo-american, cuba, vororte, melancholy, pr... en \n", "2760 [based on novel, carpet, isolation, kidnapping... en \n", "2796 [kidnapping, coronation, villain, kingdom, hei... en \n", "2912 [android, galaxy, hermit, death star, lightsab... en \n", "2947 [miracle, jesus christ, last supper, bible, ch... en \n", "2970 [] en \n", "3041 [sex, ex-boyfriend, independent film, african ... en \n", "3057 [usa, neo-nazi, prison, skinhead, brother brot... en \n", "3232 [transporter, brothel, drug dealer, boxer, mas... en \n", "3337 [italy, love at first sight, loss of father, p... en \n", "3454 [law, relatives, theft, criminal, criminal mas... en \n", "3519 [italy, victorian england, young woman] en \n", "3573 [individual, insulin, tattoo, waitress, amnesi... en \n", "3622 [showdown, bounty, bounty hunter, loss of brot... it \n", "3719 [individual, rebel, self-destruction, wheelcha... 
en \n", "3723 [based on novel, brother sister relationship, ... en \n", "3788 [germany, mexican] en \n", "3865 [jazz, obsession, conservatory, music teacher,... en \n", "3866 [male nudity, street gang, brazilian, photogra... pt \n", "3906 [new york, new year's eve, lovesickness, age d... en \n", "3992 [] en \n", "4045 [small town, texas] en \n", "4238 [factory, ambulance, invention, tramp, great d... en \n", "4247 [] en \n", "4302 [bounty hunter, refugee, gold, anti hero, gall... it \n", "4535 [japan, samurai, peasant, looting, rice, fenci... ja \n", "4602 [judge, jurors, sultriness, death penalty, fat... en \n", "4662 [independent film] en \n", "4755 [] en \n", "\n", " original_title \\\n", "65 The Dark Knight \n", "95 Interstellar \n", "96 Inception \n", "329 The Lord of the Rings: The Return of the King \n", "662 Fight Club \n", "690 The Green Mile \n", "809 Forrest Gump \n", "1553 Se7en \n", "1663 Once Upon a Time in America \n", "1818 Schindler's List \n", "1847 GoodFellas \n", "1881 The Shawshank Redemption \n", "1987 ハウルの動く城 \n", "1990 The Empire Strikes Back \n", "2091 The Silence of the Lambs \n", "2170 Psycho \n", "2247 もののけ姫 \n", "2284 The Shining \n", "2294 千と千尋の神隠し \n", "2386 One Man's Hero \n", "2453 Dead Poets Society \n", "2731 The Godfather: Part II \n", "2760 Room \n", "2796 The Prisoner of Zenda \n", "2912 Star Wars \n", "2947 The Visual Bible: The Gospel of John \n", "2970 There Goes My Baby \n", "3041 Love Jones \n", "3057 American History X \n", "3232 Pulp Fiction \n", "3337 The Godfather \n", "3454 The Usual Suspects \n", "3519 Stiff Upper Lips \n", "3573 Memento \n", "3622 C'era una volta il West \n", "3719 One Flew Over the Cuckoo's Nest \n", "3723 Anne of Green Gables \n", "3788 Guten Tag, Ramón \n", "3865 Whiplash \n", "3866 Cidade de Deus \n", "3906 The Apartment \n", "3992 Sardaarji \n", "4045 Dancer, Texas Pop. 
81 \n", "4238 Modern Times \n", "4247 Me You and Five Bucks \n", "4302 Il buono, il brutto, il cattivo \n", "4535 七人の侍 \n", "4602 12 Angry Men \n", "4662 Little Big Top \n", "4755 Counting \n", "\n", " overview popularity \\\n", "65 Batman raises the stakes in his war on crime. ... 187.322927 \n", "95 Interstellar chronicles the adventures of a gr... 724.247784 \n", "96 Cobb, a skilled thief who commits corporate es... 167.583710 \n", "329 Aragorn is revealed as the heir to the ancient... 123.630332 \n", "662 A ticking-time-bomb insomniac and a slippery s... 146.757391 \n", "690 A supernatural tale set on death row in a Sout... 103.698022 \n", "809 A man with a low IQ has accomplished great thi... 138.133331 \n", "1553 Two homicide detectives are on a desperate hun... 79.579532 \n", "1663 A former Prohibition-era Jewish gangster retur... 49.336397 \n", "1818 The true story of how businessman Oskar Schind... 104.469351 \n", "1847 The true story of Henry Hill, a half-Irish, ha... 63.654244 \n", "1881 Framed in the 1940s for the double murder of h... 136.747729 \n", "1987 When Sophie, a shy young woman, is cursed with... 49.549984 \n", "1990 The epic saga continues as Luke Skywalker, in ... 78.517830 \n", "2091 FBI trainee, Clarice Starling ventures into a ... 18.174804 \n", "2170 When larcenous real estate clerk Marion Crane ... 93.067866 \n", "2247 Ashitaka, a prince of the disappearing Ainu tr... 60.732738 \n", "2284 Jack Torrance accepts a caretaker job at the O... 78.699993 \n", "2294 A ten year old girl who wanders away from her ... 118.968562 \n", "2386 One Man's Hero tells the little-known story of... 0.910529 \n", "2453 At an elite, old-fashioned boarding school in ... 55.458584 \n", "2731 In the continuing saga of the Corleone crime f... 105.792936 \n", "2760 Jack is a young boy of 5 years old who has liv... 66.113340 \n", "2796 An Englishman on a Ruritarian holiday must imp... 4.021389 \n", "2912 Princess Leia is captured and held hostage by ... 
126.393695 \n", "2947 A word for word depiction of the life of Jesus... 3.208172 \n", "2970 A group of high school seniors meets in the su... 0.037073 \n", "3041 Darius Lovehall is a young black poet in Chica... 1.000178 \n", "3057 Derek Vineyard is paroled after serving 3 year... 73.567232 \n", "3232 A burger-loving hit man, his philosophical par... 121.463076 \n", "3337 Spanning the years 1945 to 1955, a chronicle o... 143.659698 \n", "3454 Held in an L.A. interrogation room, Verbal Kin... 64.025031 \n", "3519 Stiff Upper Lips is a broad parody of British ... 0.356495 \n", "3573 Suffering short-term memory loss after a head ... 60.715151 \n", "3622 This classic western masterpiece is an epic fi... 49.333137 \n", "3719 While serving time for insanity at a state men... 127.525581 \n", "3723 At the turn of the century on Prince Edward Is... 8.772574 \n", "3788 After five failed attempts to go to the United... 1.771584 \n", "3865 Under the direction of a ruthless instructor, ... 192.528841 \n", "3866 Cidade de Deus is a shantytown that started du... 44.356711 \n", "3906 Bud Baxter is a minor clerk in a huge New York... 22.889294 \n", "3992 A ghost hunter uses bottles to capture trouble... 0.296981 \n", "4045 Four guys, best friends, have grown up togethe... 0.376662 \n", "4238 The Tramp struggles to live in modern industri... 28.276480 \n", "4247 A womanizing yet lovable loser, Charlie, a wai... 0.094105 \n", "4302 While the Civil War rages between the Union an... 88.377076 \n", "4535 A samurai answers a village's request for prot... 39.756748 \n", "4602 The defense and the prosecution have rested an... 59.259204 \n", "4662 An aging out of work clown returns to his smal... 0.092100 \n", "4755 An associative collection of visual impression... 0.293587 \n", "\n", " production_companies production_countries \\\n", "65 [DC Comics, Legendary Pictures, Warner Bros., ... [GB, US] \n", "95 [Paramount Pictures, Legendary Pictures, Warne... 
[CA, US, GB] \n", "96 [Legendary Pictures, Warner Bros., Syncopy] [GB, US] \n", "329 [WingNut Films, New Line Cinema] [NZ, US] \n", "662 [Regency Enterprises, Fox 2000 Pictures, Tauru... [DE, US] \n", "690 [Castle Rock Entertainment, Darkwoods Producti... [US] \n", "809 [Paramount Pictures] [US] \n", "1553 [New Line Cinema, Juno Pix, Cecchi Gori Pictures] [US] \n", "1663 [Warner Bros., The Ladd Company] [US, IT] \n", "1818 [Universal Pictures, Amblin Entertainment] [US] \n", "1847 [Winkler Films] [US] \n", "1881 [Castle Rock Entertainment] [US] \n", "1987 [Studio Ghibli, Buena Vista Home Entertainment... [JP] \n", "1990 [Lucasfilm, Twentieth Century Fox Film Corpora... [US] \n", "2091 [Orion Pictures, Strong Heart/Demme Production] [US] \n", "2170 [Paramount Pictures, Universal Pictures, Shaml... [US] \n", "2247 [Miramax Films, Studio Ghibli, Nibariki, Nippo... [JP] \n", "2284 [Hawk Films, Warner Bros., Peregrine] [US, GB] \n", "2294 [Studio Ghibli] [JP] \n", "2386 [Filmax, Silver Lion Films, Televisa S.A. de C... [US, MX, ES] \n", "2453 [Touchstone Pictures, Silver Screen Partners IV] [US] \n", "2731 [Paramount Pictures, The Coppola Company] [US] \n", "2760 [T\\u00e9l\\u00e9film Canada, Film 4, Element Pi... [CA, IE] \n", "2796 [United Artists, Selznick International Pictures] [US] \n", "2912 [Lucasfilm, Twentieth Century Fox Film Corpora... [US] \n", "2947 [Gospel of John Ltd., Toronto Film Studios, Vi... [CA, GB] \n", "2970 [Nelson Entertainment] [US] \n", "3041 [New Line Cinema, Addis Wechsler Pictures] [US] \n", "3057 [New Line Cinema, The Turman-Morrissey Company... [US] \n", "3232 [Miramax Films, A Band Apart, Jersey Films] [US] \n", "3337 [Paramount Pictures, Alfran Productions] [US] \n", "3454 [Blue Parrot Productions, Bad Hat Harry Produc... [US] \n", "3519 [] [GB, IN] \n", "3573 [Summit Entertainment, Newmarket Capital Group... [US] \n", "3622 [Paramount Pictures, Rafran Cinematografica, F... 
[IT, ES, US] \n", "3719 [United Artists, Fantasy Films, Warner Bros.] [US] \n", "3723 [Sullivan Entertainment] [US] \n", "3788 [Eficine 226, Beanca Films, Fondo de Inversi\\u... [MX, DE] \n", "3865 [Bold Films, Blumhouse Productions, Right of W... [US] \n", "3866 [O2 Filmes, VideoFilmes, Wild Bunch, Globo fil... [BR, FR] \n", "3906 [United Artists, The Mirisch Company] [US] \n", "3992 [] [IN] \n", "4045 [HSX Films, Chase Productions] [US] \n", "4238 [United Artists, Charles Chaplin Productions] [US] \n", "4247 [] [] \n", "4302 [United Artists, Constantin Film Produktion, P... [US, IT, ES, DE] \n", "4535 [Toho Company] [JP] \n", "4602 [United Artists, Orion-Nova Productions] [US] \n", "4662 [Fly High Films] [US] \n", "4755 [] [US] \n", "\n", " release_date revenue runtime spoken_languages status \\\n", "65 7/16/08 1004558444 152.0 [en, zh] Released \n", "95 11/5/14 675120017 169.0 [en] Released \n", "96 7/14/10 825532764 148.0 [en, ja, fr] Released \n", "329 12/1/03 1118888979 201.0 [en] Released \n", "662 10/15/99 100853753 NaN [en] Released \n", "690 12/10/99 284600000 189.0 [fr, en] Released \n", "809 7/6/94 677945399 142.0 [en] Released \n", "1553 9/22/95 327311859 127.0 [en] Released \n", "1663 2/16/84 0 229.0 [en, fr, it] Released \n", "1818 11/29/93 321365567 195.0 [de, pl, he, en] Released \n", "1847 9/12/90 46836394 145.0 [it, en] Released \n", "1881 9/23/94 28341469 142.0 [en] Released \n", "1987 11/19/04 234710455 119.0 [ja] Released \n", "1990 5/17/80 538400000 124.0 [en] Released \n", "2091 2/1/91 272742922 119.0 [en] Released \n", "2170 6/16/60 32000000 109.0 [en] Released \n", "2247 7/12/97 159375308 134.0 [ja] Released \n", "2284 5/22/80 44017374 144.0 [en] Released \n", "2294 7/20/01 274925095 125.0 [ja] Released \n", "2386 8/2/99 0 121.0 [en] Released \n", "2453 6/2/89 235860116 129.0 [en] Released \n", "2731 12/20/74 47542841 200.0 [en, it, la, es] Released \n", "2760 10/16/15 35401758 117.0 [en] Released \n", "2796 9/3/37 0 NaN [es, en] Released 
\n", "2912 5/25/77 775398007 121.0 [en] Released \n", "2947 9/11/03 4069090 125.0 [en] Released \n", "2970 9/2/94 123509 99.0 [en] Released \n", "3041 3/14/97 0 104.0 [en] Released \n", "3057 10/30/98 23875127 119.0 [en] Released \n", "3232 10/8/94 213928762 154.0 [en, es, fr] Released \n", "3337 3/14/72 245066411 175.0 [en, it, la] Released \n", "3454 7/19/95 23341568 NaN [es, en, fr, hu] Released \n", "3519 6/12/98 0 99.0 [en] Released \n", "3573 10/11/00 39723096 113.0 [en] Released \n", "3622 12/21/68 5321508 175.0 [en] Released \n", "3719 11/18/75 108981275 133.0 [en] Released \n", "3723 12/1/85 0 199.0 [en] Released \n", "3788 10/18/13 0 119.0 [de, es] Released \n", "3865 10/10/14 13092000 105.0 [en] Released \n", "3866 2/5/02 30641770 130.0 [pt] Released \n", "3906 6/15/60 25000000 125.0 [en] Released \n", "3992 6/26/15 0 0.0 [] Released \n", "4045 5/1/98 565592 97.0 [en] Released \n", "4238 2/5/36 8500000 87.0 [en] Released \n", "4247 7/7/15 0 90.0 [] Released \n", "4302 12/23/66 6000000 161.0 [it] Released \n", "4535 4/26/54 271841 207.0 [ja] Released \n", "4602 3/25/57 1000000 96.0 [en] Released \n", "4662 1/1/06 0 0.0 [en] Rumored \n", "4755 2/9/15 0 111.0 [en] Released \n", "\n", " tagline \\\n", "65 Why So Serious? \n", "95 Mankind was born on Earth. It was never meant ... \n", "96 Your mind is the scene of the crime. \n", "329 The eye of the enemy is moving. \n", "662 Mischief. Mayhem. Soap. \n", "690 Miracles do happen. \n", "809 The world will never be the same, once you've ... \n", "1553 Seven deadly sins. Seven ways to die. \n", "1663 Crime, passion and lust for power - Sergio Leo... \n", "1818 Whoever saves one life, saves the world entire. \n", "1847 Three Decades of Life in the Mafia. \n", "1881 Fear can hold you prisoner. Hope can set you f... \n", "1987 The two lived there \n", "1990 The Adventure Continues... \n", "2091 To enter the mind of a killer she must challen... \n", "2170 The master of suspense moves his cameras into ... 
\n", "2247 The Fate Of The World Rests On The Courage Of ... \n", "2284 A masterpiece of modern horror. \n", "2294 The tunnel led Chihiro to a mysterious town... \n", "2386 One man's hero is another man's traitor. \n", "2453 He was their inspiration. He made their lives ... \n", "2731 I don't feel I have to wipe everybody out, Tom... \n", "2760 Love knows no boundaries \n", "2796 The most thrilling swordfight ever filmed... \n", "2912 A long time ago in a galaxy far, far away... \n", "2947 For God loved the world So much... \n", "2970 NaN \n", "3041 Get Together. Fall Apart. Start Over. \n", "3057 Some Legacies Must End. \n", "3232 Just because you are a character doesn't mean ... \n", "3337 An offer you can't refuse. \n", "3454 Five Criminals. One Line Up. No Coincidence. \n", "3519 NaN \n", "3573 Some memories are best forgotten. \n", "3622 There were three men in her life. One to take ... \n", "3719 If he's crazy, what does that make you? \n", "3723 NaN \n", "3788 NaN \n", "3865 The road to greatness can take you to the edge. \n", "3866 If you run you're dead... if you stay, you're ... \n", "3906 Movie-wise, there has never been anything like... \n", "3992 NaN \n", "4045 in the middle of nowhere they had everything \n", "4238 He stands alone as the greatest entertainer of... \n", "4247 A story about second, second chances \n", "4302 For three men the Civil War wasn't hell. It wa... \n", "4535 The Mighty Warriors Who Became the Seven Natio... \n", "4602 Life is in their hands. Death is on their minds. 
\n", "4662 NaN \n", "4755 NaN \n", "\n", " title vote_average vote_count \n", "65 The Dark Knight 8.2 12002 \n", "95 Interstellar 8.1 10867 \n", "96 Inception 8.1 13752 \n", "329 The Lord of the Rings: The Return of the King 8.1 8064 \n", "662 Fight Club 8.3 9413 \n", "690 The Green Mile 8.2 4048 \n", "809 Forrest Gump 8.2 7927 \n", "1553 Se7en 8.1 5765 \n", "1663 Once Upon a Time in America 8.2 1069 \n", "1818 Schindler's List 8.3 4329 \n", "1847 GoodFellas 8.2 3128 \n", "1881 The Shawshank Redemption 8.5 8205 \n", "1987 Howl's Moving Castle 8.2 1991 \n", "1990 The Empire Strikes Back 8.2 5879 \n", "2091 The Silence of the Lambs 8.1 4443 \n", "2170 Psycho 8.2 2320 \n", "2247 Princess Mononoke 8.2 1983 \n", "2284 The Shining 8.1 3757 \n", "2294 Spirited Away 8.3 3840 \n", "2386 One Man's Hero 9.3 2 \n", "2453 Dead Poets Society 8.1 2705 \n", "2731 The Godfather: Part II 8.3 3338 \n", "2760 Room 8.1 2757 \n", "2796 The Prisoner of Zenda 8.4 11 \n", "2912 Star Wars 8.1 6624 \n", "2947 The Visual Bible: The Gospel of John 8.2 12 \n", "2970 There Goes My Baby 8.5 2 \n", "3041 Love Jones 8.1 12 \n", "3057 American History X 8.2 3016 \n", "3232 Pulp Fiction 8.3 8428 \n", "3337 The Godfather 8.4 5893 \n", "3454 The Usual Suspects 8.1 3254 \n", "3519 Stiff Upper Lips 10.0 1 \n", "3573 Memento 8.1 4028 \n", "3622 Once Upon a Time in the West 8.1 1128 \n", "3719 One Flew Over the Cuckoo's Nest 8.2 2919 \n", "3723 Anne of Green Gables 8.2 68 \n", "3788 Guten Tag, Ramón 8.1 18 \n", "3865 Whiplash 8.3 4254 \n", "3866 City of God 8.1 1814 \n", "3906 The Apartment 8.1 483 \n", "3992 Sardaarji 9.5 2 \n", "4045 Dancer, Texas Pop. 
81 10.0 1 \n", "4238 Modern Times 8.1 856 \n", "4247 Me You and Five Bucks 10.0 2 \n", "4302 The Good, the Bad and the Ugly 8.1 2311 \n", "4535 Seven Samurai 8.2 878 \n", "4602 12 Angry Men 8.2 2078 \n", "4662 Little Big Top 10.0 1 \n", "4755 Counting 8.3 3 " ] }, "execution_count": 81, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filter_1 = movies['vote_average'] > 8.0\n", "print(filter_1.any())\n", "movies[movies['vote_average'] > 8.0]" ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 82, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filter_2 = movies['vote_average'] > 8.0\n", "filter_2.all()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Handling Missing Data
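
The workflow in this section can be previewed on a tiny table first. The sketch below is only an illustration: the `toy` DataFrame and its rows are made up for this example and are not part of the TMDB data, and filling text with an empty string and numbers with the median is one reasonable choice among several.

```python
import pandas as pd

# Hypothetical miniature stand-in for the movies table (made-up rows)
toy = pd.DataFrame({
    'title': ['A', 'B', 'C', 'D'],
    'tagline': ['First', None, 'Third', None],
    'runtime': [100.0, None, 120.0, 90.0],
    'release_date': ['1/1/00', '2/2/01', None, '4/4/03'],
})

# 1. Find which columns contain at least one missing value
has_missing = toy.isnull().any()

# 2. Work on a copy so the original frame is left untouched
toy_filled = toy.copy()

# 3. Text column: replace missing entries with an empty string
toy_filled['tagline'] = toy_filled['tagline'].fillna('')

# 4. Numeric column: replace missing entries with the column median
toy_filled['runtime'] = toy_filled['runtime'].fillna(toy_filled['runtime'].median())

# 5. Drop any row that still has a missing value (here, the missing release_date)
toy_filled = toy_filled.dropna()

print(has_missing)
print(toy_filled)
```

Copying before filling keeps `toy` available for comparison, and `dropna()` is left for last so that only rows that cannot be sensibly filled are removed.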

" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(4803, 20)" ] }, "execution_count": 83, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies.shape" ] }, { "cell_type": "code", "execution_count": 97, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "budget False\n", "genres False\n", "homepage True\n", "id False\n", "keywords False\n", "original_language False\n", "original_title False\n", "overview True\n", "popularity False\n", "production_companies False\n", "production_countries False\n", "release_date True\n", "revenue False\n", "runtime True\n", "spoken_languages False\n", "status False\n", "tagline True\n", "title False\n", "vote_average False\n", "vote_count False\n", "dtype: bool" ] }, "execution_count": 97, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Check if there are Null values in each row\n", "movies.isnull().any()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start with fixing the categorical NaN values." ] }, { "cell_type": "code", "execution_count": 98, "metadata": { "collapsed": true }, "outputs": [], "source": [ "movies_filled = movies\n", "\n", "movies_filled['homepage'] = movies_filled['homepage'].fillna(value='None')\n", "movies_filled['overview'] = movies_filled['overview'].fillna(value='')\n", "movies_filled['tagline'] = movies_filled['tagline'].fillna(value='')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's do the numerical NaN values." 
] }, { "cell_type": "code", "execution_count": 99, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "50\n", "1\n" ] } ], "source": [ "# Count the missing values in each of the two columns\n", "print(movies_filled['runtime'].isnull().sum())\n", "print(movies_filled['release_date'].isnull().sum())" ] }, { "cell_type": "code", "execution_count": 102, "metadata": {}, "outputs": [], "source": [ "# Fill missing runtimes with the median runtime\n", "movies_filled['runtime'] = movies_filled['runtime'].fillna(value=movies_filled['runtime'].median())" ] }, { "cell_type": "code", "execution_count": 103, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Drop any rows that still contain a missing value, such as the one with no release_date\n", "movies_filled = movies_filled.dropna()" ] }, { "cell_type": "code", "execution_count": 104, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "budget False\n", "genres False\n", "homepage False\n", "id False\n", "keywords False\n", "original_language False\n", "original_title False\n", "overview False\n", "popularity False\n", "production_companies False\n", "production_countries False\n", "release_date False\n", "revenue False\n", "runtime False\n", "spoken_languages False\n", "status False\n", "tagline False\n", "title False\n", "vote_average False\n", "vote_count False\n", "dtype: bool" ] }, "execution_count": 104, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies_filled.isnull().any()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's nice! No column contains null values anymore." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 2 }