{ "cells": [ { "cell_type": "markdown", "id": "specific-efficiency", "metadata": {}, "source": [ "# Análisis de resultados Smina\n", "\n", "Importa las librerías necesarias" ] }, { "cell_type": "code", "execution_count": null, "id": "light-reflection", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import matplotlib as plt" ] }, { "cell_type": "markdown", "id": "liable-shock", "metadata": {}, "source": [ "Carga los resultados del scrining. Asumiendo que tu archivo lleva por nombe `vs_results.csv`" ] }, { "cell_type": "code", "execution_count": null, "id": "grave-citizenship", "metadata": {}, "outputs": [], "source": [ "# Carga de los resultados de vina\n", "dk_res = pd.read_csv('./vs_results.csv')\n", "dk_res.head()" ] }, { "cell_type": "markdown", "id": "exceptional-amsterdam", "metadata": {}, "source": [ "Carga el archivo con las etiquetas de actividad de las moléculas:\n", "- 1 = molécula activa para CDK2 (*true binder*)\n", "- 0 = molécula señuelo para CDK2 (*inactive/decoy*)" ] }, { "cell_type": "code", "execution_count": null, "id": "developing-jamaica", "metadata": {}, "outputs": [], "source": [ "# Carga de las actividades reales de los ligandos\n", "df_act = pd.read_csv('./cdk2_activity_mols.csv')" ] }, { "cell_type": "markdown", "id": "hidden-disclosure", "metadata": {}, "source": [ "Utiliza el comando `merge` de *[pandas](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html)* para unir la tabla de resultados con la de etiquetas de actividad. El parámetro `inner` permite conservar sólo las filas con valores en común en ambas tablas." ] }, { "cell_type": "code", "execution_count": null, "id": "modified-timeline", "metadata": {}, "outputs": [], "source": [ "# Unión de ambas tablas\n", "df = dk_res.merge(df_act, how='inner')\n", "df.head()" ] }, { "cell_type": "markdown", "id": "integral-radiation", "metadata": {}, "source": [ "## Visualización de los resultados" ] }, { "cell_type": "code", "execution_count": null, "id": "genetic-insertion", "metadata": {}, "outputs": [], "source": [ "%run plot_metrics.py" ] }, { "cell_type": "markdown", "id": "enhanced-wisdom", "metadata": {}, "source": [ "### Ranking de moléculas" ] }, { "cell_type": "code", "execution_count": null, "id": "unsigned-motorcycle", "metadata": {}, "outputs": [], "source": [ "vs_res = PlotMetric(y_true = df.actividad, \n", " y_pred_dict = {'smina': df.score})\n", "vs_res.plot_actives_distribution(add_to_title='')" ] }, { "cell_type": "markdown", "id": "fatal-mirror", "metadata": {}, "source": [ "### Curva ROC y Área bajo la curva ROC" ] }, { "cell_type": "code", "execution_count": null, "id": "japanese-browser", "metadata": {}, "outputs": [], "source": [ "vs_res.plot_roc_auc(title='VS results',\n", " fontsize='x-small', \n", " show_by_itself = False)" ] }, { "cell_type": "markdown", "id": "opening-letter", "metadata": {}, "source": [ "### Distribución de los scores entre Activos e Inactivos" ] }, { "cell_type": "code", "execution_count": null, "id": "progressive-shame", "metadata": { "scrolled": false }, "outputs": [], "source": [ "compare_two_distributions(df)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.9" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 5 }