{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## IDENTIFYING THE MOST SIMILAR NBA PLAYERS TO THE 2017 DRAFT CLASS USING PCA DIMENSIONALITY REDUCTION AND THE K NEAREST NEIGHBORS ALGORITHM\n", "\n", "**It's so common to hear talking heads tell us how Lonzo Ball looks like the next Jason Kidd, or how Josh Jackson is a better-shooting version of Kawhi Leonard. But there's a lot of inherent bias in these prognostications. First, they're limited to players we're familiar with; second, we choose to see certain aspects of a player's game, wanting to describe someone as a great shooter, or passer, or rebounder. The goal of this analysis is to strip away those biases and get the most accurate comparisons possible, using the best data we have available.**\n", "\n", "The approach is pretty straightforward, and is outlined below before digging into all of the code.\n", "\n", "* Take every player drafted since 2010 who also played in the NCAA\n", "* Append their basic and advanced college stats from CBB Reference\n", "* Take all NCAA players from this season and retrieve their advanced stats as well.
\n", "* Since we have 38 different statistics, there's a lot of covariance among our features, so we'll perform \"dimensionality reduction\" (PCA) to reduce them to the fewest components that still explain most of the variance in our dataset\n", "* Take every player and measure their Euclidean distance to every other player in the dataset\n", "* Limit the dataset to NCAA players compared to NBA players (this is the comparison we wanted to make from the beginning)\n", "* Return every NBA player for each prospect, sorted by ascending distance\n", "* Limit to Chad Ford's top 50 and export to CSV" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#import necessary libraries\n", "import pandas as pd\n", "import numpy as np\n", "from time import sleep \n", "from scipy.spatial import distance\n", "from scipy.spatial.distance import squareform\n", "from scipy.spatial.distance import pdist\n", "from sklearn.decomposition import PCA" ] }, { "cell_type": "code", "execution_count": 158, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#read in datasets\n", "nba = pd.read_csv(\"nba_draft_picks_final.csv\")\n", "\n", "df = pd.read_csv(\"ncaa_stats.csv\")\n", "df.drop('Unnamed: 0',axis=1,inplace=True)" ] }, { "cell_type": "code", "execution_count": 228, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#scrape data from bbref if pulling data for the first time\n", "#df = pd.read_html(\"http://www.sports-reference.com/cbb/play-index/psl_finder.cgi?request=1&match=single&year_min=2011&year_max=&conf_id=&school_id=&class_is_fr=Y&class_is_so=Y&class_is_jr=Y&class_is_sr=Y&pos_is_g=Y&pos_is_gf=Y&pos_is_fg=Y&pos_is_f=Y&pos_is_fc=Y&pos_is_cf=Y&pos_is_c=Y&games_type=A&qual=pts_per_g&c1stat=mp_per_g&c1comp=gt&c1val=10&c2stat=pts_per_g&c2comp=gt&c2val=5&c3stat=&c3comp=&c3val=&c4stat=&c4comp=&c4val=&order_by=bpm&order_by_asc=\")[0]\n", "#i = 100\n", "\n", "#while(i < 15000):\n", "# 
print(\"Number of players retrieved:\", str(i))\n", "# df = df.append(pd.read_html(\"http://www.sports-reference.com/cbb/play-index/psl_finder.cgi?request=1&match=single&year_min=2011&year_max=&conf_id=&school_id=&class_is_fr=Y&class_is_so=Y&class_is_jr=Y&class_is_sr=Y&pos_is_g=Y&pos_is_gf=Y&pos_is_fg=Y&pos_is_f=Y&pos_is_fc=Y&pos_is_cf=Y&pos_is_c=Y&games_type=A&qual=pts_per_g&c1stat=mp_per_g&c1comp=gt&c1val=10&c2stat=pts_per_g&c2comp=gt&c2val=5&c3stat=&c3comp=&c3val=&c4stat=&c4comp=&c4val=&order_by=bpm&order_by_asc=&offset=\"+str(i))[0])\n", "# i = i+100\n", "# sleep(10)\n", "\n", "#realign columns properly\n", "#cols = ['Rk', 'Player', 'Class', 'Season',\n", "# 'Pos', 'School', 'Conf', 'G', 'MP', 'MP.1', 'FG', 'FGA', '2P', '2PA',\n", "# '3P', '3PA', 'FT', 'FTA', 'ORB', 'DRB', 'TRB', 'AST', 'STL', 'BLK',\n", "# 'TOV', 'PF', 'PTS', 'PER', 'TS%', 'eFG%', 'ORB%', 'DRB%', 'TRB%',\n", "# 'AST%', 'STL%', 'BLK%', 'TOV%', 'USG%', 'PProd', 'ORtg', 'DRtg', 'OWS',\n", "# 'DWS', 'WS', 'OBPM', 'DBPM', 'BPM','drop1','drop2','drop3']\n", "\n", "#df.columns = cols\n", "\n", "#df.drop(['drop1','drop2','drop3'], axis=1, inplace=True)\n", "\n", "#df = df.drop(df[df.Class == 'Advanced'].index)\n", "#df = df.drop(df[df.Class == 'Class'].index)\n", "\n", "#send to CSV for perpetuity\n", "#df.to_csv(\"ncaa_stats.csv\")" ] }, { "cell_type": "code", "execution_count": 229, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#get the max year for each player (only one row for the season)\n", "df_new = df.groupby(['Player'])['Season'].transform(max) == df['Season']\n", "\n", "stats = df[df_new]" ] }, { "cell_type": "code", "execution_count": 230, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#limit to the few columns we need from the NBA dataset\n", "nba = nba[['Player','Rd','Pk','Year']]" ] }, { "cell_type": "code", "execution_count": 231, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#merge nba players and get their ncaa stats\n", "draft_stats = 
pd.merge(nba,stats,on='Player',how='left')\n", "\n", "draft_stats = draft_stats.dropna(subset=['Class']) " ] }, { "cell_type": "code", "execution_count": 233, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#limit to the most recent NCAA season and drop the raw MP column (we only want Min/Gm)\n", "#drop() without inplace returns a new DataFrame, which avoids the SettingWithCopyWarning\n", "test = stats[stats['Season'] == '2016-17'].drop('MP',axis=1)\n", "\n", "draft_stats_test = draft_stats[draft_stats['Year'] < 2017]" ] }, { "cell_type": "code", "execution_count": 236, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#drop unwanted columns\n", "draft_stats_vars = draft_stats_test.drop(['Rd','Pk','Year','Rk','Class','Season','Pos','School','Conf','G','MP'],axis=1)\n", "test_16_vars = test.drop(['Class','Rk','Season','Pos','School','Conf','G'],axis=1)" ] }, { "cell_type": "code", "execution_count": 237, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "test df shape: (2080, 39)\n", "draft df shape: (272, 39)\n" ] } ], "source": [ "#verify they have the same number of columns\n", "print (\"test df shape:\",test_16_vars.shape)\n", "print (\"draft df shape:\",draft_stats_vars.shape)" ] }, { "cell_type": "code", "execution_count": 238, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#concat the two DFs together into one big DF\n", "final = pd.concat([test_16_vars,draft_stats_vars])\n", "final = final.reset_index()" ] }, { "cell_type": "code", "execution_count": 239, "metadata": { "collapsed": false }, "outputs": [], "source": [ 
"#finally drop the player and additional index column\n", "df_final = final.drop(['Player','index'],axis=1)" ] }, { "cell_type": "code", "execution_count": 240, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#normalize the data so that distance is equalized regardless of the scale of the metric\n", "final_normal = (df_final - df_final.mean()) / df_final.std()" ] }, { "cell_type": "code", "execution_count": 241, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " MP.1 FG FGA 2P 2PA 3P 3PA \\\n", "0 1.214320 1.676989 1.813782 1.160537 1.524675 0.978063 0.792355 \n", "1 0.184873 1.405143 0.496129 2.210807 1.755843 -1.315460 -1.417786 \n", "2 -1.393613 -0.905546 -1.464283 -0.169806 -0.594367 -1.315460 -1.417786 \n", "3 0.339290 0.385722 -0.500146 1.090519 0.407362 -1.194749 -1.229689 \n", "4 0.510865 -0.157970 -0.628698 -0.309842 -0.787007 0.133081 0.039967 \n", "\n", " FT FTA ORB ... USG% PProd ORtg \\\n", "0 3.794487 3.365690 1.594710 ... 1.726118 2.079515 1.316056 \n", "1 -0.014472 0.844939 2.430493 ... 1.105218 1.340251 0.764449 \n", "2 -1.149055 -0.856568 0.400733 ... -1.292741 -1.222051 0.291643 \n", "3 -0.095513 -0.037324 2.072300 ... -0.736072 0.708647 1.473658 \n", "4 -0.662805 -0.919586 -0.076857 ... -1.314151 0.012447 2.133616 \n", "\n", " DRtg OWS DWS WS OBPM DBPM BPM \n", "0 -2.210414 2.258310 2.564215 2.687228 2.628517 2.274177 3.040330 \n", "1 -2.560484 1.446197 2.875438 2.201929 1.684870 3.228015 2.956779 \n", "2 -2.088651 -0.584086 0.541266 -0.224570 0.363764 4.108481 2.580799 \n", "3 -2.179973 1.283774 2.875438 2.080604 1.087227 3.117957 2.497248 \n", "4 -1.525496 1.202563 1.941769 1.655967 1.684870 2.347549 2.455473 \n", "\n", "[5 rows x 38 columns]\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " component_1 component_2 component_3 component_4 component_5 \\\n", "0 10.276067 1.605826 0.772921 2.503615 -1.112311 \n", "1 8.860138 -3.794910 -0.077645 4.185647 -0.216325 \n", "2 0.206886 -6.265915 2.390440 4.415517 0.160699 \n", "3 6.413919 -5.715263 2.318523 3.536236 0.269141 \n", "4 2.651144 -1.205150 4.764476 4.285265 0.440931 \n", "\n", " component_6 component_7 component_8 component_9 component_10 \\\n", "0 -1.594821 -0.589460 0.639170 -0.174985 1.860121 \n", "1 -0.925672 -1.518631 -0.144931 0.604341 -1.232700 \n", "2 -1.142673 2.502830 3.200884 -0.197537 -0.942359 \n", "3 0.516624 1.059161 0.196717 0.208271 -1.431863 \n", "4 0.745651 -0.708535 1.183168 -0.092450 -0.207216 \n", "\n", " ... component_29 component_30 component_31 component_32 \\\n", "0 ... -0.173023 -0.133946 -0.141375 0.002717 \n", "1 ... -0.267700 -0.014189 0.101492 -0.010417 \n", "2 ... 0.176014 -0.076339 -0.066579 -0.015415 \n", "3 ... 0.027190 0.038664 0.004368 -0.008195 \n", "4 ... 0.120202 0.048320 0.073231 0.011929 \n", "\n", " component_33 component_34 component_35 component_36 component_37 \\\n", "0 0.017823 -0.004699 -0.041912 0.010904 -0.005366 \n", "1 0.005115 -0.000098 0.002593 -0.001163 0.001736 \n", "2 -0.015631 -0.000041 0.002867 -0.022992 -0.002697 \n", "3 0.003281 0.008977 0.004088 -0.017837 0.001622 \n", "4 0.018293 -0.037068 -0.039336 -0.005705 0.000903 \n", "\n", " component_38 \n", "0 0.002974 \n", "1 -0.000472 \n", "2 -0.002196 \n", "3 -0.000914 \n", "4 -0.001435 \n", "\n", "[5 rows x 38 columns]" ] }, "execution_count": 241, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#introduce PCA Dimensionality Reduction to get the best features that explain the variance in our matrix\n", "pca = PCA()\n", "transformed_pca_x = pca.fit_transform(final_normal)\n", "#create component indices\n", "component_names = [\"component_\"+str(comp) for comp in range(1, len(pca.explained_variance_)+1)]\n", "\n", "#generate new component dataframe\n", 
"transformed_pca_x = pd.DataFrame(transformed_pca_x,columns=component_names)" ] }, { "cell_type": "code", "execution_count": 264, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "explained variance running sum by component: component_1 0.361325\n", "component_2 0.583178\n", "component_3 0.699289\n", "component_4 0.779868\n", "component_5 0.822355\n", "component_6 0.852970\n", "component_7 0.879692\n", "component_8 0.904938\n", "component_9 0.925857\n", "component_10 0.942240\n", "component_11 0.955429\n", "component_12 0.967679\n", "component_13 0.977788\n", "component_14 0.985724\n", "component_15 0.989414\n", "component_16 0.991526\n", "component_17 0.993078\n", "component_18 0.994361\n", "component_19 0.995431\n", "component_20 0.996239\n", "component_21 0.996847\n", "component_22 0.997343\n", "component_23 0.997810\n", "component_24 0.998226\n", "component_25 0.998636\n", "component_26 0.998989\n", "component_27 0.999272\n", "component_28 0.999514\n", "component_29 0.999744\n", "component_30 0.999874\n", "component_31 0.999943\n", "component_32 0.999957\n", "component_33 0.999970\n", "component_34 0.999982\n", "component_35 0.999991\n", "component_36 0.999996\n", "component_37 0.999998\n", "component_38 1.000000\n", "Name: explained_variance_ratio, dtype: float64\n" ] } ], "source": [ "#generate component loadings on original features\n", "component_matrix = pd.DataFrame(pca.components_,index=component_names,columns = df_final.columns)\n", "#add additional columns to describe what\n", "component_matrix[\"explained_variance_ratio\"] = pca.explained_variance_ratio_\n", "component_matrix[\"eigenvalue\"] = pca.explained_variance_\n", "\n", "print(\"explained variance running sum by component:\",component_matrix.explained_variance_ratio.cumsum())" ] }, { "cell_type": "code", "execution_count": 243, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#so letes perform the KNN 
algorithm on components 1-14, since they explain 98.57% of the variance in the dataset\n", "pca_final = transformed_pca_x.iloc[:,:14]" ] }, { "cell_type": "code", "execution_count": 245, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#get the pairwise distance between every observation in the final DF\n", "distances_euclidean = pdist(pca_final, metric='euclidean')" ] }, { "cell_type": "code", "execution_count": 246, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0. , 20.88732888, 35.13218736, ..., 25.94866902,\n", " 24.31348246, 24.06270058],\n", " [ 20.88732888, 0. , 25.82637323, ..., 26.99952771,\n", " 23.93688809, 24.37278189],\n", " [ 35.13218736, 25.82637323, 0. , ..., 24.78983909,\n", " 27.13142578, 28.85847513],\n", " ..., \n", " [ 25.94866902, 26.99952771, 24.78983909, ..., 0. ,\n", " 14.00979177, 16.17721438],\n", " [ 24.31348246, 23.93688809, 27.13142578, ..., 14.00979177,\n", " 0. , 12.05671275],\n", " [ 24.06270058, 24.37278189, 28.85847513, ..., 16.17721438,\n", " 12.05671275, 0. ]])" ] }, "execution_count": 246, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#expand the condensed distances into a square pairwise matrix\n", "distances_matrix = squareform(distances_euclidean)" ] }, { "cell_type": "code", "execution_count": 247, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " 0 1 2 3 4 5 \\\n", "0 0.000000 20.887329 35.132187 26.087636 26.702912 27.043791 \n", "1 20.887329 0.000000 25.826373 16.572213 22.913603 19.197107 \n", "2 35.132187 25.826373 0.000000 18.614549 20.941981 13.929478 \n", "3 26.087636 16.572213 18.614549 0.000000 19.288614 12.727091 \n", "4 26.702912 22.913603 20.941981 19.288614 0.000000 15.406146 \n", "\n", " 6 7 8 9 ... 2343 2344 \\\n", "0 25.089999 29.283917 20.236402 16.541398 ... 23.732446 30.058939 \n", "1 19.541230 27.641119 21.757821 18.055367 ... 17.035436 25.575648 \n", "2 29.310328 33.334112 29.194351 33.430191 ... 23.729913 25.141384 \n", "3 19.429013 21.493559 22.047909 21.372450 ... 16.091108 16.148922 \n", "4 21.812691 21.842560 15.930106 20.433810 ... 20.206717 24.155608 \n", "\n", " 2345 2346 2347 2348 2349 2350 \\\n", "0 27.202155 21.132404 25.292953 30.492015 25.948669 24.313482 \n", "1 21.268822 23.118253 29.960605 26.592151 26.999528 23.936888 \n", "2 23.215902 40.104858 24.370200 35.457359 24.789839 27.131426 \n", "3 24.852787 29.462924 26.806176 27.935244 26.345192 24.074822 \n", "4 16.821444 28.147994 21.506791 27.476980 16.496903 20.102148 \n", "\n", " 2351 id_a \n", "0 24.062701 0 \n", "1 24.372782 1 \n", "2 28.858475 2 \n", "3 26.579903 3 \n", "4 23.606290 4 \n", "\n", "[5 rows x 2353 columns]" ] }, "execution_count": 247, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#transform that pairwise matrix into a dataframe and add an index field for joining\n", "distances = pd.DataFrame(distances_matrix)\n", "distances['id_a'] = range(0, len(distances))" ] }, { "cell_type": "code", "execution_count": 248, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id_a variable value\n", "0 0 0 0.000000\n", "1 1 0 20.887329\n", "2 2 0 35.132187\n", "3 3 0 26.087636\n", "4 4 0 26.702912" ] }, "execution_count": 248, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#pivot this data so that we can have one row per player comparison\n", "distances_final = pd.melt(distances,id_vars='id_a')\n", "\n", "#rename columns\n", "cols = ['player_a','player_b','eucl_dist']\n", "distances_final.columns = cols" ] }, { "cell_type": "code", "execution_count": 251, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#merge over the players' names'\n", "final1 = pd.merge(distances_final, final, how='inner',left_on='player_a', right_index=True)\n", "\n", "final2 = pd.merge(final1,final,how='inner',left_on='player_b',right_index=True)\n", "\n", "final2 = final2[['eucl_dist','Player_y','Player_x']]" ] }, { "cell_type": "code", "execution_count": 255, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\coreyjez\\Anaconda3\\lib\\site-packages\\ipykernel\\__main__.py:2: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame.\n", "Try using .loc[row_indexer,col_indexer] = value instead\n", "\n", "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n", " from ipykernel import kernelapp as app\n", "C:\\Users\\coreyjez\\Anaconda3\\lib\\site-packages\\ipykernel\\__main__.py:5: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame.\n", "Try using .loc[row_indexer,col_indexer] = value instead\n", "\n", "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n" ] } ], "source": [ "#create lookup tables \n", "ncaa_lookup = test[['Player']]\n", "ncaa_lookup['player_type'] = 'NCAA'\n", "\n", "nba_lookup = 
draft_stats_test[['Player']]\n", "nba_lookup['player_type'] = 'NBA'" ] }, { "cell_type": "code", "execution_count": 256, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#label each side of every comparison as an NCAA player or an NBA player\n", "final3 = pd.merge(final2,ncaa_lookup,how='left',left_on='Player_y',right_on='Player')\n", "final3 = final3.drop('Player',axis=1)\n", "final3.rename(columns={'player_type':'player_y_type'},inplace=True)\n", "\n", "final4 = pd.merge(final3,ncaa_lookup,how='left',left_on='Player_x',right_on='Player')\n", "final4.drop('Player',axis=1,inplace=True)\n", "final4.rename(columns={'player_type':'player_x_type'},inplace=True)\n", "\n", "#anyone not found in the NCAA lookup is an NBA player\n", "final4.fillna('NBA',inplace=True)" ] }, { "cell_type": "code", "execution_count": 259, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#create the final dataframe: each NCAA player compared against every NBA player\n", "final_final = final4[(final4.player_y_type =='NCAA') & (final4.player_x_type=='NBA')]" ] }, { "cell_type": "code", "execution_count": 265, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " eucl_dist Player_y Player_x player_y_type player_x_type\n", "18847 13.716531 Lonzo Ball Shane Larkin NCAA NBA\n", "18767 15.257080 Lonzo Ball Reggie Jackson NCAA NBA\n", "18843 15.593754 Lonzo Ball Jerian Grant NCAA NBA\n", "18929 16.083855 Lonzo Ball Lamar Patterson NCAA NBA\n", "18870 16.613314 Lonzo Ball Patrick McCaw NCAA NBA\n", "18886 17.019915 Lonzo Ball Shabazz Napier NCAA NBA\n", "18865 17.092628 Lonzo Ball Tyus Jones NCAA NBA\n", "18962 17.203179 Lonzo Ball Lorenzo Brown NCAA NBA\n", "18853 17.436172 Lonzo Ball Reggie Bullock NCAA NBA\n", "18757 18.096593 Lonzo Ball Isaiah Thomas NCAA NBA\n", "18869 18.109116 Lonzo Ball Delon Wright NCAA NBA\n", "18882 18.261786 Lonzo Ball D'Angelo Russell NCAA NBA\n", "18751 18.609615 Lonzo Ball Trey Burke NCAA NBA\n", "18836 18.690465 Lonzo Ball Ben McLemore NCAA NBA\n", "18903 18.895548 Lonzo Ball Denzel Valentine NCAA NBA\n", "18901 18.964175 Lonzo Ball Tyler Ulis NCAA NBA\n", "18973 19.131855 Lonzo Ball Wade Baldwin NCAA NBA\n", "18958 19.170505 Lonzo Ball Michael Gbinije NCAA NBA\n", "18913 19.406235 Lonzo Ball Pat Connaughton NCAA NBA\n", "18984 19.436834 Lonzo Ball Marcus Denmon NCAA NBA\n", "18931 19.696883 Lonzo Ball Kris Dunn NCAA NBA\n", "19006 19.841475 Lonzo Ball Isaiah Cousins NCAA NBA\n", "18800 19.850398 Lonzo Ball Allen Crabbe NCAA NBA\n", "18808 19.928488 Lonzo Ball Solomon Hill NCAA NBA\n", "18878 19.954606 Lonzo Ball Jamal Murray NCAA NBA\n", "18861 19.981715 Lonzo Ball Ray McCallum NCAA NBA\n", "18876 19.984895 Lonzo Ball Orlando Johnson NCAA NBA\n", "18832 19.996917 Lonzo Ball Josh Richardson NCAA NBA\n", "18835 20.028378 Lonzo Ball Malcolm Brogdon NCAA NBA\n", "18895 20.102194 Lonzo Ball Cameron Payne NCAA NBA\n", "... ... ... ... ... 
...\n", "19000 31.974613 Lonzo Ball Cady Lalanne NCAA NBA\n", "18845 32.226368 Lonzo Ball Jordan Hamilton NCAA NBA\n", "18996 32.377422 Lonzo Ball Dakari Johnson NCAA NBA\n", "18902 32.601743 Lonzo Ball Jarnell Stokes NCAA NBA\n", "18946 33.082097 Lonzo Ball Damian Jones NCAA NBA\n", "18814 33.309737 Lonzo Ball Shabazz Muhammad NCAA NBA\n", "18920 33.330938 Lonzo Ball Tony Mitchell NCAA NBA\n", "18829 33.377107 Lonzo Ball Julius Randle NCAA NBA\n", "18759 33.571917 Lonzo Ball Tristan Thompson NCAA NBA\n", "18855 33.813413 Lonzo Ball Jimmer Fredette NCAA NBA\n", "18924 34.171508 Lonzo Ball Joel Bolomboy NCAA NBA\n", "18823 34.273858 Lonzo Ball Festus Ezeli NCAA NBA\n", "18955 34.390509 Lonzo Ball Cameron Bairstow NCAA NBA\n", "18788 34.604572 Lonzo Ball John Henson NCAA NBA\n", "18991 34.906913 Lonzo Ball Alec Brown NCAA NBA\n", "18909 35.015887 Lonzo Ball Rakeem Christmas NCAA NBA\n", "18939 35.619096 Lonzo Ball Fab Melo NCAA NBA\n", "18758 36.065498 Lonzo Ball Andre Drummond NCAA NBA\n", "18873 36.104126 Lonzo Ball Pascal Siakam NCAA NBA\n", "18819 36.443424 Lonzo Ball Jeff Withey NCAA NBA\n", "18793 36.852015 Lonzo Ball Myles Turner NCAA NBA\n", "18818 37.646551 Lonzo Ball Mike Muscala NCAA NBA\n", "18833 37.727792 Lonzo Ball Thomas Robinson NCAA NBA\n", "18737 38.120727 Lonzo Ball Anthony Davis NCAA NBA\n", "18922 38.253961 Lonzo Ball Jordan Mickey NCAA NBA\n", "18937 39.284590 Lonzo Ball Keith Benson NCAA NBA\n", "18740 39.583770 Lonzo Ball Kenneth Faried NCAA NBA\n", "18817 39.696040 Lonzo Ball T.J. 
Warren NCAA NBA\n", "18884 40.637818 Lonzo Ball Skal Labissiere NCAA NBA\n", "18777 41.570897 Lonzo Ball James Johnson NCAA NBA\n", "\n", "[271 rows x 5 columns]" ] }, "execution_count": 265, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Examine Lonzo Ball's NBA comparisons, sorted from most similar (smallest distance) to least similar\n", "final_final[final_final['Player_y'] == 'Lonzo Ball'].sort_values(by='eucl_dist', ascending=True)" ] }, { "cell_type": "code", "execution_count": 263, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>eucl_dist</th><th>Player_y</th><th>Player_x</th><th>player_y_type</th><th>player_x_type</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>553466</th><td>4.897065</td><td>Steve Vasturia</td><td>Caris LeVert</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>1066727</th><td>5.062801</td><td>Jared Terrell</td><td>Malcolm Lee</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>230272</th><td>5.379588</td><td>Rawle Alkins</td><td>Solomon Hill</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>589021</th><td>5.415472</td><td>Dylan Ennis</td><td>Cory Joseph</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>1361372</th><td>5.521101</td><td>Rodney Purvis</td><td>Malachi Richardson</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>2157231</th><td>5.577684</td><td>Christian Terrell</td><td>Zach LaVine</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>503535</th><td>5.852017</td><td>Tracy Abrams</td><td>Zach LaVine</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>762657</th><td>5.911154</td><td>Khadeen Carrington</td><td>Darius Johnson-Odom</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>116274</th><td>5.911482</td><td>Sterling Brown</td><td>Justise Winslow</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>413247</th><td>6.009668</td><td>Sviatoslav Mykhailiuk</td><td>Zach LaVine</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>676827</th><td>6.016287</td><td>Justin Robinson</td><td>Isaiah Canaan</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>676826</th><td>6.016287</td><td>Justin Robinson</td><td>Isaiah Canaan</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>966792</th><td>6.018570</td><td>Jordan Murphy</td><td>Marcus Thornton</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>967018</th><td>6.018570</td><td>Jordan Murphy</td><td>Marcus Thornton</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>762514</th><td>6.030128</td><td>Khadeen Carrington</td><td>Austin Rivers</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>2285549</th><td>6.036571</td><td>Darrell Davis</td><td>Jordan Hamilton</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>696001</th><td>6.080880</td><td>Jack Gibbs</td><td>Isaiah Canaan</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>1197419</th><td>6.142668</td><td>Xavier Johnson</td><td>Malik Beasley</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>1069023</th><td>6.158689</td><td>Christian Vital</td><td>Zach LaVine</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>1674917</th><td>6.229392</td><td>Cullen VanLeer</td><td>Jordan Hamilton</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>2468661</th><td>6.258655</td><td>Alex Murphy</td><td>Abdel Nader</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>2656239</th><td>6.387709</td><td>Rob Edwards</td><td>Jordan Williams</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>1553807</th><td>6.409944</td><td>D.J. Fenner</td><td>Malcolm Lee</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>120981</th><td>6.443540</td><td>Edrice Adebayo</td><td>Quincy Acy</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>2000483</th><td>6.444821</td><td>Brynton Lemar</td><td>James Young</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>2000410</th><td>6.457194</td><td>Brynton Lemar</td><td>Austin Rivers</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>2439989</th><td>6.606846</td><td>Alexander Aka Gorski</td><td>Jordan Hamilton</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>33068</th><td>6.608452</td><td>Donovan Mitchell</td><td>Gary Harris</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>914535</th><td>6.698608</td><td>Kavin Gilder-Tilbury</td><td>Terrence Ross</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>280171</th><td>6.719336</td><td>London Perrantes</td><td>Tony Snell</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>4723217</th><td>55.659372</td><td>Sebastian Townes</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4834889</th><td>55.806828</td><td>Darius Moore</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4808753</th><td>55.818913</td><td>Eliel Gonzalez</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4792121</th><td>55.901396</td><td>Mike Green</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4017545</th><td>55.908765</td><td>Christian Ellis</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4616297</th><td>56.092542</td><td>Isaiah Walton</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4956065</th><td>56.164070</td><td>Darrell Riley</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4915673</th><td>56.267556</td><td>Sam Hunt</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4958441</th><td>56.304151</td><td>Raheem Watts</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4899041</th><td>56.318796</td><td>Sam Burmeister</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4918049</th><td>56.357529</td><td>Jermaine Marrow</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4925177</th><td>56.468464</td><td>Reggie Dillard</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4637681</th><td>56.498150</td><td>Matthew Butler</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4685201</th><td>56.636098</td><td>Asante Gist</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4908545</th><td>56.642419</td><td>Jo'Vontae Millner</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4682825</th><td>56.740089</td><td>Josh Boyd</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4782617</th><td>56.883457</td><td>Reggie Oliver</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4877657</th><td>56.946361</td><td>Charles Tucker Jr.</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4963193</th><td>56.987312</td><td>Max Heidegger</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4825385</th><td>57.088632</td><td>Delante Jones</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4991705</th><td>57.289504</td><td>Amos Given</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4939433</th><td>57.606328</td><td>Tyson Batiste</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4708961</th><td>57.632143</td><td>Taylor Johnson</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4998833</th><td>57.667914</td><td>Marcus Merriweather</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4970321</th><td>57.728321</td><td>Rakim Lubin</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4763609</th><td>57.844177</td><td>August Haas</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4982201</th><td>58.519947</td><td>Rakiya Battle</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4865777</th><td>58.674243</td><td>Junior Lomomba</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4984577</th><td>59.255539</td><td>Elijah Pughsley</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"<tr><th>4977449</th><td>59.585905</td><td>Chris Shields</td><td>Anthony Davis</td><td>NCAA</td><td>NBA</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>570455 rows × 5 columns</p>\n",
"</div>
" ], "text/plain": [ " eucl_dist Player_y Player_x player_y_type \\\n", "553466 4.897065 Steve Vasturia Caris LeVert NCAA \n", "1066727 5.062801 Jared Terrell Malcolm Lee NCAA \n", "230272 5.379588 Rawle Alkins Solomon Hill NCAA \n", "589021 5.415472 Dylan Ennis Cory Joseph NCAA \n", "1361372 5.521101 Rodney Purvis Malachi Richardson NCAA \n", "2157231 5.577684 Christian Terrell Zach LaVine NCAA \n", "503535 5.852017 Tracy Abrams Zach LaVine NCAA \n", "762657 5.911154 Khadeen Carrington Darius Johnson-Odom NCAA \n", "116274 5.911482 Sterling Brown Justise Winslow NCAA \n", "413247 6.009668 Sviatoslav Mykhailiuk Zach LaVine NCAA \n", "676827 6.016287 Justin Robinson Isaiah Canaan NCAA \n", "676826 6.016287 Justin Robinson Isaiah Canaan NCAA \n", "966792 6.018570 Jordan Murphy Marcus Thornton NCAA \n", "967018 6.018570 Jordan Murphy Marcus Thornton NCAA \n", "762514 6.030128 Khadeen Carrington Austin Rivers NCAA \n", "2285549 6.036571 Darrell Davis Jordan Hamilton NCAA \n", "696001 6.080880 Jack Gibbs Isaiah Canaan NCAA \n", "1197419 6.142668 Xavier Johnson Malik Beasley NCAA \n", "1069023 6.158689 Christian Vital Zach LaVine NCAA \n", "1674917 6.229392 Cullen VanLeer Jordan Hamilton NCAA \n", "2468661 6.258655 Alex Murphy Abdel Nader NCAA \n", "2656239 6.387709 Rob Edwards Jordan Williams NCAA \n", "1553807 6.409944 D.J. Fenner Malcolm Lee NCAA \n", "120981 6.443540 Edrice Adebayo Quincy Acy NCAA \n", "2000483 6.444821 Brynton Lemar James Young NCAA \n", "2000410 6.457194 Brynton Lemar Austin Rivers NCAA \n", "2439989 6.606846 Alexander Aka Gorski Jordan Hamilton NCAA \n", "33068 6.608452 Donovan Mitchell Gary Harris NCAA \n", "914535 6.698608 Kavin Gilder-Tilbury Terrence Ross NCAA \n", "280171 6.719336 London Perrantes Tony Snell NCAA \n", "... ... ... ... ... 
\n", "4723217 55.659372 Sebastian Townes Anthony Davis NCAA \n", "4834889 55.806828 Darius Moore Anthony Davis NCAA \n", "4808753 55.818913 Eliel Gonzalez Anthony Davis NCAA \n", "4792121 55.901396 Mike Green Anthony Davis NCAA \n", "4017545 55.908765 Christian Ellis Anthony Davis NCAA \n", "4616297 56.092542 Isaiah Walton Anthony Davis NCAA \n", "4956065 56.164070 Darrell Riley Anthony Davis NCAA \n", "4915673 56.267556 Sam Hunt Anthony Davis NCAA \n", "4958441 56.304151 Raheem Watts Anthony Davis NCAA \n", "4899041 56.318796 Sam Burmeister Anthony Davis NCAA \n", "4918049 56.357529 Jermaine Marrow Anthony Davis NCAA \n", "4925177 56.468464 Reggie Dillard Anthony Davis NCAA \n", "4637681 56.498150 Matthew Butler Anthony Davis NCAA \n", "4685201 56.636098 Asante Gist Anthony Davis NCAA \n", "4908545 56.642419 Jo'Vontae Millner Anthony Davis NCAA \n", "4682825 56.740089 Josh Boyd Anthony Davis NCAA \n", "4782617 56.883457 Reggie Oliver Anthony Davis NCAA \n", "4877657 56.946361 Charles Tucker Jr. 
Anthony Davis NCAA \n", "4963193 56.987312 Max Heidegger Anthony Davis NCAA \n", "4825385 57.088632 Delante Jones Anthony Davis NCAA \n", "4991705 57.289504 Amos Given Anthony Davis NCAA \n", "4939433 57.606328 Tyson Batiste Anthony Davis NCAA \n", "4708961 57.632143 Taylor Johnson Anthony Davis NCAA \n", "4998833 57.667914 Marcus Merriweather Anthony Davis NCAA \n", "4970321 57.728321 Rakim Lubin Anthony Davis NCAA \n", "4763609 57.844177 August Haas Anthony Davis NCAA \n", "4982201 58.519947 Rakiya Battle Anthony Davis NCAA \n", "4865777 58.674243 Junior Lomomba Anthony Davis NCAA \n", "4984577 59.255539 Elijah Pughsley Anthony Davis NCAA \n", "4977449 59.585905 Chris Shields Anthony Davis NCAA \n", "\n", " player_x_type \n", "553466 NBA \n", "1066727 NBA \n", "230272 NBA \n", "589021 NBA \n", "1361372 NBA \n", "2157231 NBA \n", "503535 NBA \n", "762657 NBA \n", "116274 NBA \n", "413247 NBA \n", "676827 NBA \n", "676826 NBA \n", "966792 NBA \n", "967018 NBA \n", "762514 NBA \n", "2285549 NBA \n", "696001 NBA \n", "1197419 NBA \n", "1069023 NBA \n", "1674917 NBA \n", "2468661 NBA \n", "2656239 NBA \n", "1553807 NBA \n", "120981 NBA \n", "2000483 NBA \n", "2000410 NBA \n", "2439989 NBA \n", "33068 NBA \n", "914535 NBA \n", "280171 NBA \n", "... ... 
\n", "4723217 NBA \n", "4834889 NBA \n", "4808753 NBA \n", "4792121 NBA \n", "4017545 NBA \n", "4616297 NBA \n", "4956065 NBA \n", "4915673 NBA \n", "4958441 NBA \n", "4899041 NBA \n", "4918049 NBA \n", "4925177 NBA \n", "4637681 NBA \n", "4685201 NBA \n", "4908545 NBA \n", "4682825 NBA \n", "4782617 NBA \n", "4877657 NBA \n", "4963193 NBA \n", "4825385 NBA \n", "4991705 NBA \n", "4939433 NBA \n", "4708961 NBA \n", "4998833 NBA \n", "4970321 NBA \n", "4763609 NBA \n", "4982201 NBA \n", "4865777 NBA \n", "4984577 NBA \n", "4977449 NBA \n", "\n", "[570455 rows x 5 columns]" ] }, "execution_count": 263, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#write this final_final table out to CSV\n", "final_final.to_csv(\"player_comparison_2017_pca.csv\")" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.0" } }, "nbformat": 4, "nbformat_minor": 1 }