{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# [PyBroMo](http://opensmfs.github.io/PyBroMo/) - 2. Generate smFRET data, including mixtures"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<small><i>\n",
    "This notebook is part of <a href=\"http://opensmfs.github.io/PyBroMo\" target=\"_blank\">PyBroMo</a> a \n",
    "python-based single-molecule Brownian motion diffusion simulator \n",
    "that simulates confocal smFRET\n",
    "experiments.\n",
    "</i></small>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## *Overview*\n",
    "\n",
    "*In this notebook we show how to generated smFRET data files from the diffusion trajectories*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loading the software\n",
    "\n",
    "Import all the relevant libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "from pathlib import Path\n",
    "import numpy as np\n",
    "import tables\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "import pybromo as pbm\n",
    "print('Numpy version:', np.__version__)\n",
    "print('PyTables version:', tables.__version__)\n",
    "print('PyBroMo version:', pbm.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Create smFRET data-files"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create a file for a single FRET efficiency\n",
    "\n",
    "In this section we show how to save a single smFRET data file. In the next section we will perform the same steps in a loop to generate a sequence of smFRET data files.\n",
    "\n",
    "Here we load a diffusion simulation opening a file to save\n",
    "timstamps in *write* mode. Use `'a'` (i.e. *append*) to keep \n",
    "previously simulated timestamps for the given diffusion."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "code = 'e30f'  # 1s simulation, NumericPSF\n",
    "#code = 'e6fe'  # 1s simulation, GaussianPSF\n",
    "S = pbm.ParticlesSimulation.from_datafile(code, mode='w')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Number of populations with distinct diffusion coefficient\n",
    "S.particles.num_populations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Number of particles in each population\n",
    "S.particles.particles_counts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Diffusion coefficient paired with the number particles in each population\n",
    "S.particles.diffusion_coeff_counts"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Note on particles population\n",
    "\n",
    "Population defined in the diffusion simulations are not necessarily equal\n",
    "to the populations used for timestamps simulation.\n",
    "\n",
    "For example, you may decide to assign the same diffusion coefficient \n",
    "to all particles which requires only one population during the diffusion\n",
    "simulation. When simulating timestamps, however, you can split your\n",
    "particles assigning different \"brightness\" or FRET, creating many populations\n",
    "from a single diffusion population.\n",
    "\n",
    "You may also, simulate timestamps for fewer particles than present \n",
    "in the diffusion trajectory file.\n",
    "\n",
    "To avoid errors, I suggest to follow one of these rule of thumbs:\n",
    "\n",
    "1. Use the same particles per population both during diffusion and during timestamps simulations\n",
    "\n",
    "2. Use only one population during diffusion and split populations during\n",
    "   the timestamps simulation.\n",
    "   \n",
    "Other scenarios are possible, but you should carefully read the code \n",
    "to make sure pybromo is doing exactly what you intend to do.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Simulate timestamps of smFRET\n",
    "\n",
    "### Example1: single FRET population\n",
    "\n",
    "Define the simulation parameters with the following syntax:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "params = dict(\n",
    "    em_rates = (200e3,),    # Peak emission rates (cps) for each population (D+A)\n",
    "    E_values = (0.75,),     # FRET efficiency for each population\n",
    "    num_particles = (3,),   # Number of particles in each population\n",
    "    bg_rate_d = 1500,       # Poisson background rate (cps) Donor channel\n",
    "    bg_rate_a = 800,        # Poisson background rate (cps) Acceptor channel\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create the object that will run the simulation and print a summary:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mix_sim = pbm.TimestampSimulation(S, **params)\n",
    "mix_sim.summarize()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run the simualtion:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rs = np.random.RandomState(1234)\n",
    "mix_sim.run(rs=rs, \n",
    "            overwrite=True,      # overwite existing timstamp arrays\n",
    "            skip_existing=True,  # skip simulation of existing timestamps arrays to save time\n",
    "            save_pos=True,       # save particle position at emission time\n",
    "           )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Save simulation to a smFRET [Photon-HDF5](http://photon-hdf5.org) file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mix_sim.save_photon_hdf5(identity=dict(author='John Doe', \n",
    "                                       author_affiliation='Planet Mars'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example 2: 2 FRET populations\n",
    "\n",
    "To simulate 2 population we just define the parameters with \n",
    "one value per population, except for the Poisson background \n",
    "rate that is a single value for each channel."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "params = dict(\n",
    "    em_rates = (200e3, 180e3),   # Peak emission rates (cps) for each population (D+A)\n",
    "    E_values = (0.75, 0.35),     # FRET efficiency for each population\n",
    "    num_particles = (3, 1),      # Number of particles in each population\n",
    "    bg_rate_d = 1500,       # Poisson background rate (cps) Donor channel\n",
    "    bg_rate_a = 800,        # Poisson background rate (cps) Acceptor channel\n",
    "    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mix_sim = pbm.TimestampSimulation(S, **params)\n",
    "mix_sim.summarize()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rs = np.random.RandomState(1234)\n",
    "mix_sim.run(rs=rs, overwrite=False, save_pos=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mix_sim.save_photon_hdf5()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Burst analysis\n",
    "\n",
    "The generated Photon-HDF5 files can be analyzed by any smFRET burst\n",
    "analysis program. Here we show an example using the opensource\n",
    "[FRETBursts](https://github.com/OpenSMFS/FRETBursts/) program:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import fretbursts as fb"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "filepath = list(Path('./').glob(f'smFRET_{code}*'))\n",
    "filepath"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "d = fb.loader.photon_hdf5(str(filepath[0]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "d"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "d.A_em"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fb.dplot(d, fb.timetrace);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "d.calc_bg(fun=fb.bg.exp_fit, tail_min_us='auto', F_bg=1.7)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "d.bg_dd, d.bg_ad"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "d.burst_search(F=7)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "d.num_bursts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = d.select_bursts(fb.select_bursts.size, th1=20)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.num_bursts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fb.dplot(d, fb.timetrace, bursts=True);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fb.dplot(ds, fb.hist_fret, pdf=False)\n",
    "plt.axvline(0.75);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **NOTE:** Unless you simulated a diffusion of 30s or more the previous histogram will be very poor."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fb.bext.burst_data(ds)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Play with photon positons"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The smFRET file name:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "d.fname"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Read the positions array:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tables\n",
    "with tables.open_file(d.fname, 'r') as h5file:\n",
    "    positions = h5file.root.photon_data.user.positions.read()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We also get the timestamps and particles arrays:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "timestamps = d.ph_times_m[0]\n",
    "particles = d.particles[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These are all the particles ID in the file, the last ID is not a real particle\n",
    "but is associated with timestamps from background:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.unique(particles)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Positions must have the same number of rows as number of timestamps or particles.\n",
    "Let's test it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert positions.shape[0] == timestamps.size == particles.size"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "positions.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Print first 5 positions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "positions[:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **NOTE**: positions are NaN when the timestamp is from background.\n",
    "\n",
    "Check that we have NaNs if and only if the timestamp is from background:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bg_timestamp = particles == np.unique(particles)[-1]\n",
    "\n",
    "assert np.all(bg_timestamp == np.isnan(positions[:, 0]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we take the burst data including index of burst start and stop \n",
    "(`i_start` and `i_stop` columns):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bursts = fb.bext.burst_data(ds, include_ph_index=True)\n",
    "bursts.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "burstph = fb.bext.burst_photons(ds)\n",
    "burstph['particle'] = np.hstack(\n",
    "    fb.burstlib.iter_bursts_ph(ds.particles[0], ds.mburst[0]))\n",
    "burstpos = np.vstack(\n",
    "    fb.burstlib.iter_bursts_ph(positions, ds.mburst[0]))\n",
    "burstph['x_um'] = burstpos[:, 0] * 1e6\n",
    "burstph['y_um'] = burstpos[:, 1] * 1e6\n",
    "burstph['z_um'] = burstpos[:, 2] * 1e6\n",
    "burstph['r_um'] = np.linalg.norm(burstpos[:, :2], axis=1) * 1e6\n",
    "burstph.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can assigning to each burst the particle with most photons:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bursts['particle'] = np.zeros(bursts.shape[0], dtype='uint8')\n",
    "for iburst, bph in burstph.groupby('burst'):\n",
    "    par, counts = np.unique(bph.particle, return_counts=True)\n",
    "    bursts.loc[iburst, 'particle'] = par[counts.argmax()]    \n",
    "bursts.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Positions where each burst starts:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "positions[bursts.i_start]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "particles[bursts.i_start]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "All photon emission positions in **one** burst:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "burst_idx = 0\n",
    "pos_burst = burstph.loc[burst_idx]\n",
    "pos_burst"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Photon emission positions in **all** bursts:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Increase the resolution of the figures displayed in the notebook\n",
    "%config InlineBackend.figure_format = 'retina' # 'png' for default res"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "S.psf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if S.psf.kind == 'numeric':\n",
    "    PSF = pbm.NumericPSF()\n",
    "    psf = PSF.hdata\n",
    "    z_peak = PSF.zi[PSF.zm]  # z position of PSF peak in μm\n",
    "else:\n",
    "    x = np.arange(0, 4, 0.01) * 1e-6\n",
    "    z = np.arange(-6, 6, 0.01) * 1e-6\n",
    "    X, Z = np.meshgrid(x, z)\n",
    "    psf = S.psf.eval_xz(X, Z)\n",
    "cmap = plt.cm.YlGnBu\n",
    "cmap.set_under(alpha=0)\n",
    "kwargs = dict(interpolation='bicubic', origin='lower', cmap=cmap, vmin=1e-1, zorder=1)\n",
    "\n",
    "fig, ax = plt.subplots(figsize=(8, 2))\n",
    "ax.imshow(psf.T, extent=(-6, 6, 0, 4), **kwargs)\n",
    "ax.plot(burstph.z_um, burstph.r_um, '.', ms=5, color='C1', alpha=0.3)\n",
    "ax.set(ylabel='R (μm)', xlabel='Z (μm)', ylim=(0, 1), xlim=(-2, 2),\n",
    "       title=f'Position of photon emission (PSF {S.psf.kind})');"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots(figsize=(8, 2))\n",
    "ax.imshow(psf.T, extent=(-6, 6, 0, 4), **kwargs)\n",
    "sns.scatterplot(x=\"z_um\", y=\"r_um\", hue=\"burst\", data=burstph.reset_index(),\n",
    "                zorder=10, ax=ax, marker='o', linewidth=0,\n",
    "                palette='Spectral', alpha=0.3, s=20)\n",
    "if S.psf.kind == 'numeric':\n",
    "    ax.axvline(PSF.zi[PSF.zm], color='k')\n",
    "ax.set(ylabel='R (μm)', xlabel='Z (μm)', ylim=(0, 1), xlim=(-2, 2),\n",
    "       title='Position of photon emission by burst');"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "S.store.close()\n",
    "S.ts_store.close()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "591.867px",
    "left": "0px",
    "right": "959px",
    "top": "111.133px",
    "width": "212px"
   },
   "toc_section_display": "block",
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}