K-Means Clustering Output



After merging order_id and user_id:



  1. merging rows


Load the products CSV file:



  1. products csv


Load the aisles CSV file:



  1. aisles csv


Print shape of aisles:



  1. aisles shape


Merge the products, orders and aisles tables on aisle_id, product_id and order_id:



  1. merging rows aisle product order
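
The merge step above can be sketched with pandas as follows. This is a minimal illustration, not the project's actual code: the tiny tables, the variable names and the `product_name` and `aisle` columns are assumptions made for the example; only the key columns `order_id`, `user_id`, `product_id` and `aisle_id` come from the text.

```python
import pandas as pd

# Hypothetical toy tables; the real CSV files have many more rows and columns.
order_products = pd.DataFrame({
    "order_id": [1, 1, 2],
    "product_id": [10, 11, 10],
})
orders = pd.DataFrame({"order_id": [1, 2], "user_id": [100, 200]})
products = pd.DataFrame({
    "product_id": [10, 11],
    "product_name": ["Banana", "Spinach"],   # assumed column name
    "aisle_id": [24, 83],
})
aisles = pd.DataFrame({
    "aisle_id": [24, 83],
    "aisle": ["fresh fruits", "fresh vegetables"],  # assumed column name
})

# Chain the joins: order lines -> users -> product details -> aisle names.
merged = (
    order_products
    .merge(orders, on="order_id")
    .merge(products, on="product_id")
    .merge(aisles, on="aisle_id")
)
print(merged.shape)
```

Each `merge` call defaults to an inner join, so rows without a match in the next table would be dropped.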


Top 10 rows in the products list:



  1. top 10 products


Total number of unique products:



  1. unique products


Shape of prior data:



  1. unique products


Total number of unique aisles:



  1. unique aisles


Fresh fruits and fresh vegetables are the two best-selling products in the products list:



  1. best selling
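
One way such a best-seller ranking could be obtained is by counting occurrences in the merged table. The sketch below assumes a column named `aisle` and uses made-up rows purely for illustration.

```python
import pandas as pd

# Hypothetical merged order lines; only the assumed "aisle" column is needed here.
merged = pd.DataFrame({
    "aisle": [
        "fresh fruits", "fresh vegetables", "fresh fruits",
        "yogurt", "fresh vegetables", "fresh fruits",
    ]
})

# value_counts sorts by frequency in descending order; head(10) keeps the top ten.
top_sellers = merged["aisle"].value_counts().head(10)
print(top_sellers)
```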


Build a cross tabulation (contingency table) of customer products from user_id and aisle_id:



  1. cross tabulation
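
A cross tabulation of this kind can be built with `pandas.crosstab`. The following is a minimal sketch on invented data; the variable name `cust_prod` is an assumption for the example.

```python
import pandas as pd

# Hypothetical order lines: one row per purchased item.
merged = pd.DataFrame({
    "user_id": [100, 100, 100, 200, 200],
    "aisle_id": [24, 24, 83, 83, 120],
})

# Rows are users, columns are aisles, cells count how often a user bought from an aisle.
cust_prod = pd.crosstab(merged["user_id"], merged["aisle_id"])
print(cust_prod)
print(cust_prod.shape)
```

Each row of the resulting table is a per-customer purchase profile, which is the kind of feature matrix that PCA and K-Means can work on.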


Shape of customer products:



  1. shape of cust_product


Perform principal component analysis (PCA) on customer products (from sklearn.decomposition import PCA):



  1. pca
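
The PCA step might look like the sketch below. The random count matrix stands in for the customer products table, and `n_components=6` is an assumed value chosen for illustration; the source does not state how many components were kept.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for the customer products matrix: 50 users x 8 aisles.
rng = np.random.default_rng(0)
cust_prod = rng.poisson(2.0, size=(50, 8))

# Project each user's purchase profile onto a few principal components.
pca = PCA(n_components=6)  # assumed number of components
components = pca.fit_transform(cust_prod)

print(components.shape)
print(pca.explained_variance_ratio_)
```

Inspecting `explained_variance_ratio_` is a common way to decide how many components are worth keeping.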


Plot the cluster for class 1 (from matplotlib import pyplot as plt):



  1. cluster


Provide random centers for each cluster:



  1. day of week vs hour


Predict cluster labels for the 150 data points:



  1. predict 1 to 150 data points
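
The random initialisation and prediction steps could be sketched with scikit-learn's `KMeans` as below. The input points are synthetic, and `n_clusters=5`, `n_init=10` and `random_state=0` are assumptions for the example rather than settings taken from the project.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical 2-D points standing in for two PCA components per customer.
rng = np.random.default_rng(0)
points = rng.normal(size=(300, 2))

# init="random" draws the starting centers at random from the data points.
kmeans = KMeans(n_clusters=5, init="random", n_init=10, random_state=0)
kmeans.fit(points)

# Assign the first 150 points to their nearest learned center.
labels = kmeans.predict(points[:150])
centers = kmeans.cluster_centers_
print(labels[:10])
print(centers)
```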


Plot clusters belonging to class 1, class 2, class 3 and class 4:



  1. predict 100 data points


Plot subplots of each cluster belonging to the different classes:



  1. sub plots
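
A per-cluster subplot grid could be drawn roughly as follows. The points and labels are randomly generated placeholders, and the 2 x 2 layout with four clusters is an assumption for the sketch.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
from matplotlib import pyplot as plt
import numpy as np

# Hypothetical 2-D points and cluster labels in the range 0..3.
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))
labels = rng.integers(0, 4, size=200)

# One panel per cluster, sharing axes so the panels are directly comparable.
fig, axes = plt.subplots(2, 2, figsize=(8, 8), sharex=True, sharey=True)
for cluster_id, ax in enumerate(axes.ravel()):
    mask = labels == cluster_id
    ax.scatter(points[mask, 0], points[mask, 1], s=10)
    ax.set_title(f"cluster {cluster_id}")
```

In an interactive session, dropping the `matplotlib.use("Agg")` line and calling `plt.show()` would display the figure.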


Top 10 products belonging to cluster 0:



  1. cluster 0


Top 10 products belonging to cluster 1:



  1. cluster 1


Top 10 products belonging to cluster 2:



  1. cluster 2


Top 10 products belonging to cluster 3:



  1. cluster 3


Top 10 products belonging to cluster 4:



  1. cluster 4
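
The per-cluster top 10 lists above could be derived by averaging the customer products table within each cluster and sorting. This is a sketch under assumptions: the table contents, the column names and the cluster labels are all invented for illustration.

```python
import pandas as pd

# Hypothetical customer products table (users x product groups).
cust_prod = pd.DataFrame(
    {
        "fresh fruits": [5, 4, 0, 1],
        "fresh vegetables": [3, 5, 1, 0],
        "yogurt": [0, 1, 4, 5],
    },
    index=pd.Index([100, 200, 300, 400], name="user_id"),
)
# Hypothetical cluster label for each user, e.g. as returned by K-Means.
labels = pd.Series([0, 0, 1, 1], index=cust_prod.index, name="cluster")

# Average purchase counts per cluster, then rank the columns within each cluster.
cluster_means = cust_prod.groupby(labels).mean()
top_per_cluster = {
    cluster_id: row.sort_values(ascending=False).head(10)
    for cluster_id, row in cluster_means.iterrows()
}
print(top_per_cluster[0])
```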


A first analysis of the clusters confirms the initial hypothesis that some products are generically bought by the majority of customers:





Ratio of purchasing orders of most frequently bought products:



  1. ratio of purchasing orders


Percentage of purchasing orders of most frequently bought products:



  1. Percentage of purchasing orders
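
The source does not say exactly how the ratio and percentage were computed. One plausible reading, sketched below on invented numbers, is each product's share of the average purchases within its cluster, expressed first as a fraction and then as a percentage.

```python
import pandas as pd

# Hypothetical mean purchase counts per cluster (rows) and product group (columns).
cluster_means = pd.DataFrame(
    {
        "fresh fruits": [4.5, 0.5],
        "fresh vegetables": [4.0, 0.5],
        "yogurt": [0.5, 4.0],
    },
    index=pd.Index([0, 1], name="cluster"),
)

# Ratio: divide each row by its row total so every cluster's shares sum to 1.
ratio = cluster_means.div(cluster_means.sum(axis=1), axis=0)

# Percentage: the same shares scaled to 0..100.
percentage = ratio * 100
print(percentage.round(1))
```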