{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "b12b5440-e982-4e4c-9edc-bf9c1e10daf5",
   "metadata": {},
   "source": [
    "# 🌉 3. Build Bridges, Not Walls"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ea17fcaa-2b2a-45e0-9837-be1414278154",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pykx as kx"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1f824b8e-d1a8-4f10-8a2c-44fdb5b6f70c",
   "metadata": {},
   "source": [
    "## Traffic"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "289cc2c9-6fe6-4deb-9905-a742174f477d",
   "metadata": {},
   "source": [
    "The q language incorporates SQL-like functionality, making it easy to query tables. This module serves as a great starting point for learning q."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0d189d2-267e-4bad-80d0-f4c2142115ee",
   "metadata": {},
   "outputs": [],
   "source": [
    "traffic = kx.q.read.csv(\"data/traffic.csv\", \"IPSJS\", \",\", True)\n",
    "traffic = kx.q.qsql.select(traffic, where='error=`N')\n",
    "traffic_mad = kx.q('{select avg carga by fecha from x}', traffic)\n",
    "traffic_mad"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "92d3c0c2-d8eb-4cb1-aa6d-4af0e87efbef",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-warning\">\n",
    "In this case, we recommend using <i>kx.q.sql</i> instead of <i>kx.q</i>. Putting code in strings is generally considered a bad practice due to increased vulnerability, error-proneness, and difficulty in debugging.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16c4ec16-c04d-4e6b-b86a-2498535678d0",
   "metadata": {},
   "source": [
    "## Weather"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fc4a30fc-5386-4fdc-ac3d-04bd5bd965a0",
   "metadata": {},
   "outputs": [],
   "source": [
    "weather = kx.q.read.csv(\"data/weather.csv\", \"DUIFFFFFFFF\", \",\", True)\n",
    "weather_mad = kx.q.qsql.select(weather, columns={'precipitacion': 'avg precipitacion'}, where='not null precipitacion', by={'fecha': 'fecha+hora'})\n",
    "weather_mad"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "65be2f13-9f69-4072-ad4f-9b5ad11f327b",
   "metadata": {},
   "source": [
    "## All Together"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a9fc5c45-c409-496e-9444-548473ce5cb0",
   "metadata": {},
   "source": [
    "The `aj` (as-of join) operator stands out as one of the corona's jewels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e1758e0c-94b2-443e-a177-9751bd40f2f7",
   "metadata": {},
   "outputs": [],
   "source": [
    "traffic_weather = kx.q.aj('fecha', traffic_mad, weather_mad)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5f5a58af-afee-4f67-a86c-cb7a9bdeae9b",
   "metadata": {},
   "source": [
    "## Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9d956528-ae06-4f68-9d85-f6c81fdb2ede",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.preprocessing import MinMaxScaler\n",
    "from sklearn.model_selection import train_test_split\n",
    "from tensorflow.keras.models import Sequential\n",
    "from tensorflow.keras.layers import Dense\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "traffic_weather = traffic_weather.pd().reset_index()\n",
    "\n",
    "to_quarter = lambda x: int((x.hour * 60 + x.minute) / 15)\n",
    "traffic_weather['hora'] = traffic_weather['fecha'].dt.time.apply(to_quarter)\n",
    "\n",
    "X = traffic_weather[['hora', 'precipitacion']]\n",
    "y = traffic_weather['carga']\n",
    "\n",
    "scaler = MinMaxScaler()\n",
    "X_normalized = scaler.fit_transform(X)\n",
    "\n",
    "X_train, X_test, y_train, y_test = train_test_split(X_normalized, y, test_size=0.2, random_state=42)\n",
    "\n",
    "model = Sequential()\n",
    "model.add(Dense(64, input_dim=X_train.shape[1], activation='relu'))\n",
    "model.add(Dense(32, activation='relu'))\n",
    "model.add(Dense(1, activation='linear'))\n",
    "\n",
    "model.compile(optimizer='adam', loss='mean_squared_error')\n",
    "history = model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test), verbose=1)\n",
    "predictions = model.predict(X_test)\n",
    "\n",
    "plt.scatter(y_test, predictions, color='blue')\n",
    "plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], linestyle='--', color='red', linewidth=2)  # Diagonal line for reference\n",
    "plt.xlabel('Actual Traffic Load')\n",
    "plt.ylabel('Predicted Traffic Load')\n",
    "plt.title('Actual vs. Predicted Traffic Load')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f8b4f674-2fbe-4f23-99d3-edca6842de40",
   "metadata": {},
   "source": [
    "## What Next?\n",
    "* Gradual Acquaintance with the Core Interface\n",
    "* Usage of `kx.q('...')` could determine readiness to turn it all upside down"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1bff3593-e781-4e70-849d-759b16d2362e",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "pykx",
   "language": "python",
   "name": "pykx"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}