{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# [PyBroMo](http://tritemio.github.io/PyBroMo/) - B.2 Disk-single-core - Generate smFRET data files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "This notebook is part of PyBroMo a \n", "python-based single-molecule Brownian motion diffusion simulator \n", "that simulates confocal smFRET\n", "experiments.\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## *Overview*\n", "\n", "*In this notebook we show how to generated smFRET data files from raw timestamps*." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline\n", "from pathlib import Path\n", "import numpy as np\n", "import tables\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "import pybromo as pbm\n", "print('Numpy version:', np.__version__)\n", "print('PyTables version:', tables.__version__)\n", "print('PyBroMo version:', pbm.__version__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Create smFRET data-files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create a file for a single FRET efficiency\n", "\n", "In this section we show how to save a single smFRET data file. In the next section we will perform the same steps in a loop to generate a sequence of smFRET data files." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 1: Create a the timestamp array" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The start by loading the timestamps for donor and acceptor channel. \n", "The FRET efficiency is determined by the **max emission rate ratio (*k*)**. We also need to choose the background rate." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a memo, let's write some formulas related to the FRET efficiency:\n", "\n", "$$ k = \\frac{F_a}{F_d} \\quad,\\qquad E = \\frac{k}{k+1} \\qquad\\Rightarrow\\qquad k = \\frac{E}{1-E}$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "S = pbm.ParticlesSimulation.from_datafile('0168', mode='w')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#S = pbm.ParticlesSimulation.from_datafile('0168', mode)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Simulate timestamps" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def em_rates_DA_from_E(em_rate_tot, E_values):\n", " E_values = np.asarray(E_values)\n", " em_rates_a = E_values * em_rate_tot\n", " em_rates_d = em_rate_tot - em_rates_a\n", " return em_rates_d, em_rates_a\n", "\n", "def em_rates_from_E(em_rate_tot, E_values):\n", " em_rates_d, em_rates_a = em_rates_DA_from_E(em_rate_tot, E_values)\n", " return np.unique(np.hstack([em_rates_d, em_rates_a]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "em_rate_tot = 300e3\n", "E_list = np.array([0, 0.2, 0.3, 0.4, 0.49, 0.6, 0.7, 0.8])\n", "\n", "em_rate_list = em_rates_from_E(em_rate_tot, E_list)\n", "em_rate_list" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "bg_rate_d, bg_rate_a = 900, 700\n", "bg_rates = [bg_rate_a, bg_rate_d]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "rs = 
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "rs = np.random.RandomState(123)\n", "\n", "for bg in bg_rates:\n", "    for em_rate in em_rate_list:\n", "        print(\"- Simulating timestamps @%3d kcps, background %.1f kcps\" % (\n", "              em_rate*1e-3, bg*1e-3), flush=True)\n", "        S.simulate_timestamps_mix(max_rates=(em_rate,), populations=(slice(0, 20),),\n", "                                  bg_rate=bg, rs=rs, overwrite=True)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "for k in S.ts_store.h5file.root.timestamps._v_children.keys():\n", "    if not k.endswith('_par'):\n", "        print(k)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Compose timestamps for FRET" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "E_sim = 0.49\n", "\n", "em_rate_d, em_rate_a = em_rates_DA_from_E(em_rate_tot, E_sim)\n", "em_rate_d, em_rate_a" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "donor_names = S.timestamps_match_mix((em_rate_d,), populations=(slice(0, 20),), bg_rate=bg_rate_d)\n", "donor_names" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "acceptor_names = S.timestamps_match_mix((em_rate_a,), populations=(slice(0, 20),), bg_rate=bg_rate_a)\n", "acceptor_names" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "ts_d, ts_par_d = S.get_timestamps_part(donor_names[0])\n", "ts_a, ts_par_a = S.get_timestamps_part(acceptor_names[0])" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "ts_d.attrs['clk_p']" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Now, we need to create a single array with donor + acceptor timestamps:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "ts, a_ch, part = pbm.timestamps.merge_da(ts_d, ts_par_d, ts_a, ts_par_a)\n", "ts.shape, a_ch.shape, part" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Perform some safety checks and plot:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "assert a_ch.sum() == ts_a.shape[0]\n", "assert (~a_ch).sum() == ts_d.shape[0]\n", "assert a_ch.size == ts_a.shape[0] + ts_d.shape[0]" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "plt.plot(ts)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "bins = np.arange(0, 1, 1e-3)\n", "plt.hist(ts*ts_d.attrs['clk_p'], bins=bins, histtype='step');" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "bins = np.arange(0, 1, 1e-3)\n", "counts_d, _ = np.histogram(ts[~a_ch]*ts_d.attrs['clk_p'], bins=bins)\n", "counts_a, _ = np.histogram(ts[a_ch]*ts_d.attrs['clk_p'], bins=bins)\n", "plt.plot(bins[:-1], counts_d, 'g')\n", "plt.plot(bins[:-1], -counts_a, 'r')" ] },
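{ "cell_type": "markdown", "metadata": {}, "source": [ "As an additional rough check (a sketch only), we can look at the raw fraction of photons in the acceptor channel: burst photons contribute a fraction `E_sim`, while background photons contribute `bg_rate_a / (bg_rate_d + bg_rate_a)`, so the raw value should fall between these two numbers." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Rough check (sketch): the raw acceptor fraction mixes burst photons\n", "# (fraction E_sim) and background photons (fraction bg_a / (bg_d + bg_a)).\n", "E_raw = a_ch.sum() / a_ch.size\n", "print('E_sim = %.3f, background A fraction = %.3f, raw acceptor fraction = %.3f'\n", "      % (E_sim, bg_rate_a / (bg_rate_d + bg_rate_a), E_raw))" ] },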
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Step 2: Saving to Photon-HDF5 format\n", "\n", "To save the data in [Photon-HDF5 format](http://photon-hdf5.org) we use\n", "the library [**phconvert**](http://photon-hdf5.github.io/phconvert/):" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import phconvert as phc\n", "print('Phconvert version: ', phc.__version__)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "We need a file name. We could use a random name, but it is better to generate it programmatically by joining the file name of the Brownian motion simulation with the specific FRET simulation parameters:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fret_string = '_E%03d_EmD%dk_EmA%03dk_BgD%d_BgA%d' %\\\n", "    (E_sim*100, em_rate_d*1e-3, em_rate_a*1e-3,\n", "     bg_rate_d, bg_rate_a)\n", "fret_string" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "filename_smfret = S.store.filepath.stem.replace('pybromo', 'smFRET') + fret_string + '.hdf5'\n", "filename_smfret" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fret_sim_fname = Path(filename_smfret)\n", "fret_sim_fname" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# inputs: E_sim, ts, a_ch, ts_d (clk_p), S.ts_store.filename, S.t_max\n", "photon_data = dict(\n", "    timestamps = ts,\n", "    timestamps_specs = dict(timestamps_unit=ts_d.attrs['clk_p']),\n", "    detectors = a_ch,\n", "    measurement_specs = dict(\n", "        measurement_type = 'smFRET',\n", "        detectors_specs = dict(spectral_ch1 = np.atleast_1d(False),\n", "                               spectral_ch2 = np.atleast_1d(True))))\n", "\n", "setup = dict(\n", "    num_pixels = 2,\n", "    num_spots = 1,\n", "    num_spectral_ch = 2,\n", "    num_polarization_ch = 1,\n", "    num_split_ch = 1,\n", "    modulated_excitation = False,\n", "    lifetime = False)\n", "\n", "provenance = dict(filename=S.ts_store.filename,\n", "                  software='PyBroMo', software_version=pbm.__version__)\n", "\n", "identity = dict(\n", "    author='Author Name',\n", "    author_affiliation='Research Institution or Company')\n", "\n", "description = 'Simulated freely-diffusing smFRET experiment, E = %.2f%%' % (E_sim * 100)\n", "acquisition_duration = S.t_max\n", "data = dict(\n", "    acquisition_duration = round(acquisition_duration),\n", "    description = description,\n", "    photon_data = photon_data,\n", "    setup=setup,\n", "    provenance=provenance,\n", "    identity=identity)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "phc.hdf5.save_photon_hdf5(data, h5_fname=str(fret_sim_fname), overwrite=True)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "h5file = tables.open_file(str(fret_sim_fname))" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "phc.hdf5.print_children(h5file.root.photon_data)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "h5file.close()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Batch creation of smFRET files\n", "\n", "We have seen how to create a single smFRET file.\n", "In this section we generate a sequence of smFRET files\n", "for different FRET efficiencies."
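, "\n", "\n", "Each output file name encodes the simulated FRET efficiency together with the donor/acceptor emission and background rates, so the files can be told apart at a glance. As a worked example (a sketch only, assuming E = 0.4 with `em_rate_tot = 300e3`, `bg_rate_d = 900` and `bg_rate_a = 700` as set above), the suffix built by the loop below is:\n", "\n", "```python\n", "# Worked example of the file-name suffix used in the batch loop below.\n", "E, em_d, em_a = 0.4, 180e3, 120e3   # em_d/em_a follow from em_rates_DA_from_E\n", "print('_E%03d_EmD%dk_EmA%03dk_BgD%d_BgA%d' % (E*100, em_d*1e-3, em_a*1e-3, 900, 700))\n", "# -> '_E040_EmD180k_EmA120k_BgD900_BgA700'\n", "```"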
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def make_photon_hdf5(ts, a_ch, clk_p, E_sim):\n", " # globals: S.ts_store.filename, S.t_max\n", " photon_data = dict(\n", " timestamps = ts,\n", " timestamps_specs = dict(timestamps_unit=clk_p),#ts_d.attrs['clk_p']),\n", " detectors = a_ch,\n", " measurement_specs = dict(\n", " measurement_type = 'smFRET',\n", " detectors_specs = dict(spectral_ch1 = np.atleast_1d(False),\n", " spectral_ch2 = np.atleast_1d(True))))\n", "\n", " setup = dict(\n", " num_pixels = 2,\n", " num_spots = 1,\n", " num_spectral_ch = 2,\n", " num_polarization_ch = 1,\n", " num_split_ch = 1,\n", " modulated_excitation = False,\n", " lifetime = False)\n", "\n", " provenance = dict(filename=S.ts_store.filename, \n", " software='PyBroMo', software_version=pbm.__version__)\n", "\n", " identity = dict(\n", " author='Author Name',\n", " author_affiliation='Research Institution or Company')\n", "\n", " description = 'Simulated freely-diffusing smFRET experiment, E = %.2f%%' % E_sim\n", " acquisition_duration = S.t_max\n", " data = dict(\n", " acquisition_duration = round(acquisition_duration),\n", " description = description,\n", " photon_data = photon_data,\n", " setup=setup,\n", " provenance=provenance,\n", " identity=identity)\n", " return data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "E_list" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "em_rates_d, em_rates_a = em_rates_DA_from_E(em_rate_tot, E_list)\n", "em_rates_d, em_rates_a" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%timeit -n1 -r1\n", "\n", "for E_sim, em_d, em_a in zip(E_list, em_rates_d, em_rates_a):\n", " print('E = %d%%, em_d = %6.1f, em_a = %6.1f' % \\\n", " (E_sim*100, em_d, em_a))\n", " \n", " # Build the file name\n", " fret_string = '_E%03d_EmD%dk_EmA%03dk_BgD%d_BgA%d' %\\\n", " (E_sim*100, em_rate_d*1e-3, em_rate_a*1e-3, \n", " bg_rate_d, bg_rate_a)\n", " filename_smfret = S.store.filepath.stem.replace('pybromo', 'smFRET') + fret_string + '.hdf5'\n", " fret_sim_fname = Path(filename_smfret)\n", "\n", " # Merge D and A timestamps\n", " donor_name = S.timestamps_match_mix((em_rate_d,), populations=(slice(0, 20),), bg_rate=bg_rate_d)[0]\n", " accept_name = S.timestamps_match_mix((em_rate_a,), populations=(slice(0, 20),), bg_rate=bg_rate_a)[0]\n", " ts_d, ts_par_d = S.get_timestamps_part(donor_name)\n", " ts_a, ts_par_a = S.get_timestamps_part(accept_name)\n", " ts, a_ch, ts_part = pbm.timestamps.merge_da(ts_d, ts_par_d, ts_a, ts_par_a)\n", " assert a_ch.sum() == ts_a.shape[0]\n", " assert (-a_ch).sum() == ts_d.shape[0]\n", " assert a_ch.size == ts_a.shape[0] + ts_d.shape[0]\n", " \n", " # Save to Photon-HDF5\n", " data = make_photon_hdf5(ts, a_ch, ts_d.attrs['clk_p'], E_sim)\n", " phc.hdf5.save_photon_hdf5(data, h5_fname=str(fret_sim_fname), overwrite=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "S.ts_store.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Burst analysis\n", "\n", "As a final check we analyze the created files with \n", "[FRETBursts](https://github.com/tritemio/FRETBursts/) \n", "smFRET burst analysis program." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import fretbursts as fb" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "filepath = list(Path('./').glob('smFRET_016*E020*'))[0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "str(filepath)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "d = fb.loader.photon_hdf5(str(filepath))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "d" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "d.A_em" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fb.dplot(d, fb.timetrace);" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "d.calc_bg(fun=fb.bg.exp_fit, tail_min_us='auto', F_bg=1.7)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "d.bg_dd, d.bg_ad" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "d.burst_search(F=7)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "d.num_bursts" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "ds = d.select_bursts(fb.select_bursts.size, th1=200)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "ds.num_bursts" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fb.dplot(ds, fb.hist_fret)\n", "plt.axvline(0.2);" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fb.dplot(ds, fb.timetrace, bursts=True);\n", "plt.ylim(-100, 150);\n", "plt.xlim(0.25, 0.5);" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fb.bext.burst_data(ds)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.0" } }, "nbformat": 4, "nbformat_minor": 0 }