{ "cells": [ { "cell_type": "markdown", "id": "b12b5440-e982-4e4c-9edc-bf9c1e10daf5", "metadata": {}, "source": [ "# 🌉 3. Build Bridges, Not Walls" ] }, { "cell_type": "code", "execution_count": null, "id": "ea17fcaa-2b2a-45e0-9837-be1414278154", "metadata": {}, "outputs": [], "source": [ "import pykx as kx" ] }, { "cell_type": "markdown", "id": "1f824b8e-d1a8-4f10-8a2c-44fdb5b6f70c", "metadata": {}, "source": [ "## Traffic" ] }, { "cell_type": "markdown", "id": "289cc2c9-6fe6-4deb-9905-a742174f477d", "metadata": {}, "source": [ "The q language incorporates SQL-like functionality, making it easy to query tables. This module serves as a great starting point for learning q." ] }, { "cell_type": "code", "execution_count": null, "id": "c0d189d2-267e-4bad-80d0-f4c2142115ee", "metadata": {}, "outputs": [], "source": [ "traffic = kx.q.read.csv(\"data/traffic.csv\", \"IPSJS\", \",\", True)\n", "traffic = kx.q.qsql.select(traffic, where='error=`N')\n", "traffic_mad = kx.q('{select avg carga by fecha from x}', traffic)\n", "traffic_mad" ] }, { "cell_type": "markdown", "id": "92d3c0c2-d8eb-4cb1-aa6d-4af0e87efbef", "metadata": {}, "source": [ "<div class=\"alert alert-warning\">\n", "In this case, we recommend using <i>kx.q.sql</i> instead of <i>kx.q</i>. Putting code in strings is generally considered a bad practice due to increased vulnerability, error-proneness, and difficulty in debugging.\n", "</div>" ] }, { "cell_type": "markdown", "id": "16c4ec16-c04d-4e6b-b86a-2498535678d0", "metadata": {}, "source": [ "## Weather" ] }, { "cell_type": "code", "execution_count": null, "id": "fc4a30fc-5386-4fdc-ac3d-04bd5bd965a0", "metadata": {}, "outputs": [], "source": [ "weather = kx.q.read.csv(\"data/weather.csv\", \"DUIFFFFFFFF\", \",\", True)\n", "weather_mad = kx.q.qsql.select(weather, columns={'precipitacion': 'avg precipitacion'}, where='not null precipitacion', by={'fecha': 'fecha+hora'})\n", "weather_mad" ] }, { "cell_type": "markdown", "id": "65be2f13-9f69-4072-ad4f-9b5ad11f327b", "metadata": {}, "source": [ "## All Together" ] }, { "cell_type": "markdown", "id": "a9fc5c45-c409-496e-9444-548473ce5cb0", "metadata": {}, "source": [ "The `aj` (as-of join) operator stands out as one of the corona's jewels." ] }, { "cell_type": "code", "execution_count": null, "id": "e1758e0c-94b2-443e-a177-9751bd40f2f7", "metadata": {}, "outputs": [], "source": [ "traffic_weather = kx.q.aj('fecha', traffic_mad, weather_mad)" ] }, { "cell_type": "markdown", "id": "5f5a58af-afee-4f67-a86c-cb7a9bdeae9b", "metadata": {}, "source": [ "## Model" ] }, { "cell_type": "code", "execution_count": null, "id": "9d956528-ae06-4f68-9d85-f6c81fdb2ede", "metadata": {}, "outputs": [], "source": [ "from sklearn.preprocessing import MinMaxScaler\n", "from sklearn.model_selection import train_test_split\n", "from tensorflow.keras.models import Sequential\n", "from tensorflow.keras.layers import Dense\n", "import matplotlib.pyplot as plt\n", "\n", "traffic_weather = traffic_weather.pd().reset_index()\n", "\n", "to_quarter = lambda x: int((x.hour * 60 + x.minute) / 15)\n", "traffic_weather['hora'] = traffic_weather['fecha'].dt.time.apply(to_quarter)\n", "\n", "X = traffic_weather[['hora', 'precipitacion']]\n", "y = traffic_weather['carga']\n", "\n", "scaler = MinMaxScaler()\n", "X_normalized = scaler.fit_transform(X)\n", "\n", "X_train, X_test, y_train, y_test = train_test_split(X_normalized, y, test_size=0.2, random_state=42)\n", "\n", "model = Sequential()\n", "model.add(Dense(64, input_dim=X_train.shape[1], activation='relu'))\n", "model.add(Dense(32, activation='relu'))\n", "model.add(Dense(1, activation='linear'))\n", "\n", "model.compile(optimizer='adam', loss='mean_squared_error')\n", "history = model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test), verbose=1)\n", "predictions = model.predict(X_test)\n", "\n", "plt.scatter(y_test, predictions, color='blue')\n", "plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], linestyle='--', color='red', linewidth=2) # Diagonal line for reference\n", "plt.xlabel('Actual Traffic Load')\n", "plt.ylabel('Predicted Traffic Load')\n", "plt.title('Actual vs. Predicted Traffic Load')\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "f8b4f674-2fbe-4f23-99d3-edca6842de40", "metadata": {}, "source": [ "## What Next?\n", "* Gradual Acquaintance with the Core Interface\n", "* Usage of `kx.q('...')` could determine readiness to turn it all upside down" ] }, { "cell_type": "code", "execution_count": null, "id": "1bff3593-e781-4e70-849d-759b16d2362e", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "pykx", "language": "python", "name": "pykx" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.16" } }, "nbformat": 4, "nbformat_minor": 5 }