{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Improving superconductivity in BaFe2As2-based crystals by cobalt clustering and electronic uniformity\n",
    "\n",
    "Li Li, Qiang Zheng, Qiang Zou, Shivani Rajput, Anota Ijaduola, Zhiming Wu, Xiaoping Wang, Huibo Cao, Suhas Somnath, Stephen Jesse, Miaofang Chi, Zheng Gai, David Parker, and Athena Sefat\n",
    "<br>**Scientific Reports**\n",
    "\n",
    "Notebook written by: Suhas Somnath<br>\n",
    "4/9/2017\n",
    "\n",
    "### References:\n",
    "This Jupyter notebook uses [pycroscopy](https://pycroscopy.github.io/pycroscopy/about.html) to analyze Band Excitation data. We request you to reference the following papers if you use this notebook for your research: \n",
    "\n",
    "* [Arxiv paper](https://arxiv.org/abs/1903.09515) titled \"*USID and Pycroscopy - Open frameworks for storing and analyzing spectroscopic and imaging data*\"\n",
    "* [Nature Communications paper](https://www.nature.com/articles/s41598-017-00984-1) titled \"*Improving superconductivity in BaFe2As2-based crystals by cobalt clustering and electronic uniformity*\"\n",
    "\n",
    "### About the notebook\n",
    "\n",
    "This is a Jupyter Notebook - it contains text and executable code `cells`. To learn more about how to use it, please see [this video](https://www.youtube.com/watch?v=jZ952vChhuI). Please see the image below for some basic tips on using this notebook.\n",
    "\n",
    "![notebook_rules.png](https://raw.githubusercontent.com/pycroscopy/pycroscopy/master/jupyter_notebooks/notebook_rules.png)\n",
    "\n",
    "Image courtesy of Jean Bilheux from the [neutron imaging](https://github.com/neutronimaging/python_notebooks) GitHub repository.\n",
    "\n",
    "### A note about software versions:\n",
    "**Note: This notebook was written for the pycroscopy version listed below and is not guaranteed to work on past or future versions of the package**\n",
    "\n",
    "`pycroscopy` version: `0.0a36`\n",
    "\n",
    "If you have a different version of `pycroscopy` installed, you may consider using the notebook as is and accept the possibility of errors. The cell below will attempt to install the correct versions of the packages. However, if you experience trouble, uninstall the existing version of pycroscopy and install the required version above by executing the following commands in a terminal (Linux / MacOS) / Anaconda prompt (Windows): \n",
    "\n",
    "```bash\n",
    "pip uninstall pycroscopy\n",
    "pip install -I pycroscopy==0.0a36\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import sys\n",
    "!conda install --yes --prefix {sys.prefix} numpy scipy matplotlib scikit-learn Ipython ipywidgets h5py\n",
    "!pip install pycroscopy==0.0a36"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Ensure python 3 compatibility:\n",
    "from __future__ import division, print_function, absolute_import\n",
    "\n",
    "from os import path\n",
    "import numpy as np\n",
    "import h5py\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib.patches as patches\n",
    "from mpl_toolkits.axes_grid1 import make_axes_locatable\n",
    "\n",
    "from scipy.signal import medfilt\n",
    "from scipy.ndimage.filters import gaussian_filter\n",
    "from sklearn.utils.extmath import randomized_svd\n",
    "from sklearn.cluster import KMeans\n",
    "\n",
    "import pycroscopy as px\n",
    "from pycroscopy.io.translators.omicron_asc import AscTranslator\n",
    "\n",
    "# set up notebook to show plots within the notebook\n",
    "% matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "#%% Load file\n",
    "file_path = r\"E:\\Shivani\\STS_Au-Ba122 crystal\\I(V) TraceUp Tue Sep 20 09_17_08 2016 [14-1]  STM_Spectroscopy STM.asc\"\n",
    "folder_path, file_name = path.split(file_path)\n",
    "file_name = file_name[:-4] + '_'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "tran = AscTranslator()\n",
    "h5_path = tran.translate(file_path)\n",
    "\n",
    "hdf = px.ioHDF5(h5_path)\n",
    "h5_main = px.hdf_utils.getDataSet(hdf.file, 'Raw_Data')[-1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "x_label = 'Bias (V)'\n",
    "y_label = 'Current (a.u.)'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "print('Datasets and datagroups within the file:\\n------------------------------------')\n",
    "px.hdf_utils.print_tree(h5_main.file)\n",
    " \n",
    "print('\\nThe main dataset:\\n------------------------------------')\n",
    "print(h5_main)\n",
    "print('\\nThe ancillary datasets:\\n------------------------------------')\n",
    "print(hdf.file['/Measurement_000/Channel_000/Position_Indices'])\n",
    "print(hdf.file['/Measurement_000/Channel_000/Position_Values'])\n",
    "print(hdf.file['/Measurement_000/Channel_000/Spectroscopic_Indices'])\n",
    "print(hdf.file['/Measurement_000/Channel_000/Spectroscopic_Values'])\n",
    "\n",
    "print('\\nMetadata or attributes in a datagroup\\n------------------------------------')\n",
    "for key in hdf.file['/Measurement_000'].attrs:\n",
    "    print('{} : {}'.format(key, hdf.file['/Measurement_000'].attrs[key]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Read some basic parameters\n",
    "h5_meas_grp = hdf.file['/Measurement_000']\n",
    "num_rows = int(h5_meas_grp.attrs['y-pixels'])\n",
    "num_cols = int(h5_meas_grp.attrs['x-pixels'])\n",
    "num_pos = num_rows * num_cols\n",
    "spectra_length = int(h5_meas_grp.attrs['z-points'])\n",
    "volt_vec = px.hdf_utils.getAuxData(h5_main, auxDataName='Spectroscopic_Values')[-1][0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "raw_data_3d = np.reshape(h5_main[()], (num_rows, num_cols, spectra_length))\n",
    "fig, axes = plt.subplots(ncols=2, figsize=(8, 4))\n",
    "px.plot_utils.plot_map(axes[0], raw_data_3d[:, :, 0], cmap=px.plot_utils.cmap_jet_white_center())\n",
    "axes[0].set_aspect(1)\n",
    "axes[0].invert_yaxis()\n",
    "axes[0].set_title('slice of data - use this for cropping the data')\n",
    "axes[1].plot(volt_vec, raw_data_3d[10, 10, :])\n",
    "axes[1].set_xlabel('Bias (V)')\n",
    "axes[1].set_ylabel('Current (nA)')\n",
    "axes[1].set_title('Response at a single pixel')\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "#%% User supplied information for cropping data:\n",
    "start_row = 30  # 0 is the first row\n",
    "end_row = num_rows  # num_rows is the last row\n",
    "start_col = 0\n",
    "end_col = num_cols  # num_cols is the last column\n",
    "volt_chop = 0.3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# now crop the data according the specifications above:\n",
    "volt_index = np.where(np.logical_and(volt_vec >= -volt_chop, volt_vec <= volt_chop))[0]\n",
    "volt_vec = volt_vec[volt_index]\n",
    "\n",
    "raw_data_3d = raw_data_3d[start_row:end_row, start_col:end_col, volt_index]\n",
    "num_rows, num_cols, spectra_length = raw_data_3d.shape\n",
    "raw_data_2d = raw_data_3d.reshape(-1, spectra_length)\n",
    "\n",
    "fig, axes = plt.subplots(ncols=2, figsize=(8, 4))\n",
    "px.plot_utils.plot_map(axes[0], raw_data_3d[:, :, 0], cmap=px.plot_utils.cmap_jet_white_center())\n",
    "axes[0].set_aspect(1)\n",
    "axes[0].invert_yaxis()\n",
    "axes[0].set_title('Cropped data')\n",
    "axes[1].plot(volt_vec, raw_data_3d[10, 10, :])\n",
    "axes[1].set_xlabel('Bias (V)')\n",
    "axes[1].set_ylabel('Current (nA)')\n",
    "axes[1].set_title('Cropped response at a single pixel')\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "#%% Do k-means to identify the principal response types and where (spatially) the occur in the data\n",
    "num_clusters = 16  # change the number of clusters here. Something ~ 16 is easy to understand and visualize\n",
    "estimators = KMeans(num_clusters)\n",
    "results = estimators.fit(raw_data_2d)\n",
    "labels, centroids = px.processing.cluster.reorder_clusters(results.labels_, results.cluster_centers_)\n",
    "labels_mat = np.reshape(labels, (raw_data_3d.shape[0], raw_data_3d.shape[1]))\n",
    "fig_KM2, axes_KM2 = px.plot_utils.plot_cluster_results_together(labels_mat, centroids, spec_val=volt_vec,\n",
    "                                                                spec_label=x_label, resp_label=y_label)\n",
    "axes_KM2[0].invert_yaxis()\n",
    "axes_KM2[0].set_aspect(1)\n",
    "fig_KM2.savefig(path.join(folder_path, file_name + 'K_means_V_' + str(num_clusters) + '_clusters.png'),\n",
    "                format='png', dpi=300)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# throw out bad points using a threshold. Works better than finding outliers using K-means\n",
    "current_threshold = 0.2\n",
    "bad_pixels = np.where(np.max(np.abs(raw_data_2d), axis=1) > current_threshold)[0]\n",
    "\n",
    "# find the pixels that are good:\n",
    "good_pixels = [item for item in np.arange(raw_data_2d.shape[0]) if item not in bad_pixels]\n",
    "\n",
    "# find the average of all the good pixels\n",
    "good_resp = np.mean(raw_data_2d[good_pixels], axis=0)\n",
    "\n",
    "fig, axis = plt.subplots(ncols=2, figsize=(10, 5))\n",
    "axis[0].set_title('Bad pixels')\n",
    "px.plot_utils.plot_line_family(axis[0], volt_vec, raw_data_2d[bad_pixels], label_prefix='Cluster')\n",
    "axis[1].set_title('Mean of all other pixels')\n",
    "axis[1].plot(volt_vec, good_resp)\n",
    "fig.savefig(path.join(folder_path, file_name + '_outliers_removal.png'), format='png', dpi=300)\n",
    "\n",
    "# now replace bad pixels by the mean of the good pixels\n",
    "outlier_free_2d = np.copy(raw_data_2d)\n",
    "outlier_free_2d[bad_pixels] = good_resp\n",
    "outlier_free_3d = np.reshape(outlier_free_2d, raw_data_3d.shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "estimators = KMeans(num_clusters)\n",
    "results = estimators.fit(outlier_free_2d)\n",
    "labels, centroids = px.processing.cluster.reorder_clusters(results.labels_, results.cluster_centers_)\n",
    "labels_mat = np.reshape(labels, (outlier_free_3d.shape[0], outlier_free_3d.shape[1]))\n",
    "fig_KM2, axes_KM2 = px.plot_utils.plot_cluster_results_together(labels_mat, centroids, spec_val=volt_vec,\n",
    "                                                                spec_label=x_label, resp_label=y_label)\n",
    "axes_KM2[0].invert_yaxis()\n",
    "axes_KM2[0].set_aspect(1)\n",
    "fig_KM2.savefig(path.join(folder_path, file_name + 'After_removing_outliers_K_means_V_' + str(num_clusters) + '_clusters.png'),\n",
    "                format='png', dpi=300)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "num_comps = 256\n",
    "U, S, V = randomized_svd(outlier_free_2d, num_comps, n_iter=3)\n",
    "\n",
    "fig_V, axes_V = px.plot_utils.plot_loops(volt_vec, V, title='SVD Eigenvectors',\n",
    "                                         evenly_spaced=False, plots_on_side=4, use_rainbow_plots=False,\n",
    "                                         x_label=x_label, y_label=y_label, subtitles='Eigenvector')\n",
    "fig_V.savefig(path.join(folder_path, file_name + 'SVD_Eigenvectors.png'), format='png', dpi=200)\n",
    "\n",
    "loadings = np.reshape(U, (raw_data_3d.shape[0], raw_data_3d.shape[1], -1))\n",
    "fig_U, axes_U = px.plot_utils.plot_map_stack(loadings, num_comps=16, cmap=px.plot_utils.cmap_jet_white_center())\n",
    "fig_U.savefig(path.join(folder_path, file_name + 'SVD_Eigenvalues.png'), format='png', dpi=150)\n",
    "\n",
    "fig_S, axes_S = px.plot_utils.plotScree(S)\n",
    "fig_S.savefig(path.join(folder_path, file_name + 'SVD_S.png'), format='png', dpi=300)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "#%% SVD filtering - Reconstructing the original data with a subset of the SVD components to remove noise present in the senior components\n",
    "# Note that this method of filtering does NOT downsample the data\n",
    "filter_comps = 32\n",
    "filtered_2d = np.dot(np.dot(U[:, :filter_comps], np.eye(filter_comps)*S[:filter_comps]), V[:filter_comps, :])\n",
    "filtered_3d = np.reshape(filtered_2d, (raw_data_3d.shape[0], raw_data_3d.shape[1], -1))\n",
    "\n",
    "# Compare the raw and filtered data\n",
    "fig_filt, axes = px.plot_utils.plot_loops(volt_vec, [outlier_free_2d, filtered_2d], line_colors=['k', 'r'],\n",
    "                                          dataset_names=['Raw','SVD Filtered']\n",
    "                                          y_label='Current (a.u.)', title='SVD filtering')\n",
    "fig_filt.savefig(path.join(folder_path, file_name + '_SVD_Filtered_data_I_' + str(filter_comps) + '_comps.png'),\n",
    "                 format='png', dpi=300)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Try a Gaussian filter on the stack:\n",
    "gaus_sigma = 0.35\n",
    "gaus_cycles = 1\n",
    "source_dset = filtered_3d.copy()\n",
    "for cyc_ind in range(gaus_cycles):\n",
    "    sink_dset = np.zeros(raw_data_3d.shape)\n",
    "    for volt_ind in range(spectra_length):\n",
    "        volt_slice = source_dset[:, :, volt_ind]\n",
    "        sink_dset[:, :, volt_ind] = gaussian_filter(volt_slice, gaus_sigma)\n",
    "    source_dset = sink_dset.copy()\n",
    "gaus_filt_3d = sink_dset\n",
    "# See some example loops:\n",
    "gaus_filt_2d = np.reshape(gaus_filt_3d, raw_data_2d.shape)\n",
    "fig, axes = px.viz.plot_utils.plot_loops(volt_vec, [filtered_2d, gaus_filt_2d], line_colors=['k', 'r'],\n",
    "                                         dataset_names=['Raw', 'Filtered'], x_label=x_label, y_label=y_label,\n",
    "                                         title='Gaussian Filter')\n",
    "fig.savefig(path.join(folder_path, file_name + 'Gaussian_Filter_sigma_' + str(gaus_sigma) + '_cycles_' +\n",
    "                      str(gaus_cycles) + '.png'), format='png', dpi=300)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "#%% Offset the data in the current axis by setting the current at 0V to 0\n",
    "\n",
    "mid_pt = int(volt_vec.size * 0.5)\n",
    "offsets = np.mean(gaus_filt_2d[:, mid_pt-1:mid_pt+1], axis=1)\n",
    "offseted_data_2d = gaus_filt_2d - np.repeat(np.atleast_2d(offsets).transpose(), spectra_length, axis=1)\n",
    "offseted_data_3d = np.reshape(offseted_data_2d, (raw_data_3d.shape[0], raw_data_3d.shape[1], -1))\n",
    "\n",
    "#%% Plotting an example IV to check correction\n",
    "row_ind = 15\n",
    "col_ind = 12\n",
    "\n",
    "fig, ax = plt.subplots()\n",
    "ax.plot(volt_vec, filtered_3d[row_ind, col_ind], 'bo-', label='Raw')\n",
    "ax.plot(volt_vec, offseted_data_3d[row_ind, col_ind], 'r*-', label='Offsetted')\n",
    "ax.plot([volt_vec[0], volt_vec[-1]], [0, 0], 'k')\n",
    "ax.plot([0, 0], [np.min(filtered_3d[row_ind, col_ind]), np.max(filtered_3d[row_ind, col_ind])], 'k')\n",
    "# ax.set_xlim([volt_vec[mid_pt-5], volt_vec[mid_pt + 5]])\n",
    "ax.set_xlabel(x_label)\n",
    "ax.set_ylabel(y_label)\n",
    "extents = 5\n",
    "ax.set_xlim([volt_vec[mid_pt - extents], volt_vec[mid_pt + extents]])\n",
    "chosen_sections_1 = filtered_3d[row_ind, col_ind, mid_pt - extents:mid_pt + extents]\n",
    "chosen_sections_2 = offseted_data_3d[row_ind, col_ind, mid_pt - extents:mid_pt + extents]\n",
    "ax.set_ylim([np.min(np.hstack([chosen_sections_1, chosen_sections_2])),\n",
    "             np.max(np.hstack([chosen_sections_1, chosen_sections_2]))])\n",
    "ax.set_title('Ofsetting the current')\n",
    "ax.legend(loc='best')\n",
    "fig.savefig(path.join(folder_path, file_name + '_offsetting_current.png'), format='png', dpi=300)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "#%% Calculating dI/dV now:\n",
    "dIdV_3d = np.diff(offseted_data_3d, axis=2)\n",
    "# Adding 0 to spectral axis since we lose one index due to differential\n",
    "dIdV_3d = np.dstack((dIdV_3d, np.zeros(shape=(offseted_data_3d.shape[0], offseted_data_3d.shape[1]))))\n",
    "dIdV_2d = np.reshape(dIdV_3d, offseted_data_2d.shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Apply some median filters:\n",
    "med_filt_size = 3\n",
    "dIdV_filt_3d = medfilt(dIdV_3d, [1, 1, med_filt_size])\n",
    "dIdV_filt_2d = np.reshape(dIdV_filt_3d, raw_data_2d.shape)\n",
    "fig, axes = px.viz.plot_utils.plot_loops(volt_vec, [dIdV_2d, dIdV_filt_2d], line_colors=['k', 'r'],\n",
    "                                         dataset_names=['Raw', 'Filtered'], x_label=x_label, y_label=y_label,\n",
    "                                         title='Median Filter')\n",
    "fig.savefig(path.join(folder_path, file_name + 'Median_Filter_size_' + str(med_filt_size) + '.png'), format='png', dpi=300)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "num_clusters = 8  # change the number of clusters here. Something ~ 16 is easy to understand and visualize\n",
    "estimators = KMeans(num_clusters)\n",
    "results = estimators.fit(dIdV_filt_2d)\n",
    "labels, centroids = px.processing.cluster.reorder_clusters(results.labels_, results.cluster_centers_)\n",
    "labels_mat = np.reshape(labels, (dIdV_filt_3d.shape[0], dIdV_filt_3d.shape[1]))\n",
    "fig_KM2, axes_KM2 = px.plot_utils.plot_cluster_results_together(labels_mat, centroids, spec_val=volt_vec,\n",
    "                                                                spec_label=x_label, resp_label=y_label)\n",
    "axes_KM2[0].invert_yaxis()\n",
    "axes_KM2[0].set_aspect(1)\n",
    "fig_KM2.savefig(path.join(folder_path, file_name + 'K_means_filtered_dIdV_' +\n",
    "                          str(num_clusters) + '_clusters.png'), format='png', dpi=300)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Downsample and average\n",
    "box_size = 3\n",
    "down_samp_didv_3d = np.zeros((dIdV_filt_3d.shape[0]-box_size + 1,\n",
    "                              dIdV_filt_3d.shape[1] - box_size + 1,\n",
    "                              dIdV_filt_3d.shape[2]))\n",
    "down_samp_iv_3d = np.zeros(down_samp_didv_3d.shape)\n",
    "for row_ind in range(down_samp_didv_3d.shape[0]):\n",
    "    row_start = row_ind\n",
    "    row_end = row_ind + box_size\n",
    "    for col_ind in range(down_samp_didv_3d.shape[1]):\n",
    "        col_start = col_ind\n",
    "        col_end = col_ind + box_size\n",
    "\n",
    "        box_didv_data = np.reshape(dIdV_filt_3d[row_start: row_end, col_start: col_end], (-1, spectra_length))\n",
    "        mean_di_dv = np.mean(box_didv_data, axis=0)\n",
    "        box_iv_data = np.reshape(offseted_data_3d[row_start: row_end, col_start: col_end], (-1, spectra_length))\n",
    "        mean_iv = np.mean(box_iv_data, axis=0)\n",
    "        offset_iv = mean_iv - np.mean(mean_iv[mid_pt - 1:mid_pt + 1])\n",
    "        #shifted_di_dv = mean_di_dv - np.min(mean_di_dv[int(0.5*spectra_length) - 2: int(0.5*spectra_length) + 2])\n",
    "        ldos = volt_vec * mean_di_dv / offset_iv\n",
    "        down_samp_iv_3d[row_ind, col_ind] = offset_iv\n",
    "        down_samp_didv_3d[row_ind, col_ind] = mean_di_dv\n",
    "\n",
    "down_samp_didv_2d = np.reshape(down_samp_didv_3d, (-1, spectra_length))\n",
    "down_samp_iv_2d = np.reshape(down_samp_didv_3d, (-1, spectra_length))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "num_clusters = 8  # change the number of clusters here. Something ~ 16 is easy to understand and visualize\n",
    "estimators = KMeans(num_clusters)\n",
    "results = estimators.fit(down_samp_didv_2d)\n",
    "labels, centroids = px.processing.cluster.reorder_clusters(results.labels_, results.cluster_centers_)\n",
    "labels_mat = np.reshape(labels, (down_samp_didv_3d.shape[0], down_samp_didv_3d.shape[1]))\n",
    "fig_KM2, axes_KM2 = px.plot_utils.plot_cluster_results_together(labels_mat, centroids, spec_val=volt_vec,\n",
    "                                                                spec_label=x_label, resp_label=y_label)\n",
    "axes_KM2[0].invert_yaxis()\n",
    "axes_KM2[0].set_aspect(1)\n",
    "fig_KM2.savefig(path.join(folder_path, file_name + 'K_means_downsampled_dIdV_' + str(box_size) + '_box_' +\n",
    "                          str(num_clusters) + '_clusters.png'), format='png', dpi=300)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "y = centroids[-1]\n",
    "x = volt_vec\n",
    "from scipy.interpolate import interp1d\n",
    "f2 = interp1d(x, y, kind='cubic')\n",
    "xnew = np.linspace(volt_vec[0], volt_vec[-1], num=250, endpoint=True)\n",
    "plt.plot(x, y, 'k', xnew, f2(xnew), 'r')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}