{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "name": "keras_implementation.ipynb",
      "provenance": [],
      "collapsed_sections": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.5.7"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "J4YkDhwVTyWr",
        "colab_type": "text"
      },
      "source": [
        "# Keras implementation of the proposed model\n",
        "This note describes how to implement the proposed model with Keras."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "1VMXUhJxtj1t"
      },
      "source": [
        "## Import modules"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "iAfdhVt_f146",
        "colab": {}
      },
      "source": [
        "import numpy as np\n",
        "\n",
        "################################### for neural network modeling\n",
        "import tensorflow as tf\n",
        "import tensorflow.keras as keras\n",
        "from tensorflow.keras import layers\n",
        "from tensorflow.keras import backend as K\n",
        "from tensorflow.keras.models import Model"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XRxRMv2LHoaR",
        "colab_type": "text"
      },
      "source": [
        "## Generate synthetic data (stationary Poisson process)\n",
        "A code to generate the other synthetic sequences used in the paper is given in \"code.ipynb\"."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "sf1IByutd1zV",
        "colab": {}
      },
      "source": [
        "## simmulate a stationary Poisson process\n",
        "T_train = np.random.exponential(size=80000).cumsum() # training data \n",
        "T_test  = np.random.exponential(size=20000).cumsum() # test data"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "RpT5eaJYIIEc",
        "colab_type": "text"
      },
      "source": [
        "## Implementation of the proposed model\n",
        "The following keras model receives the event history and the elapsed time from the most recent event and outputs the hazard function and the cumulative hazard function."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "G41zRy3JABwj",
        "colab_type": "code",
        "outputId": "f7c13781-9032-40b7-f4c5-09ac20dac5f5",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 88
        }
      },
      "source": [
        "## hyper parameters\n",
        "time_step = 20 # truncation depth of RNN \n",
        "size_rnn = 64 # the number of units in the RNN\n",
        "size_nn = 64 # the nubmer of units in each hidden layer in the cumulative hazard function network\n",
        "size_layer_chfn = 2 # the number of the hidden layers in the cumulative hazard function network\n",
        "\n",
        "## mean and std of the log of the inter-event interval, which will be used for the data standardization\n",
        "mu = np.log(np.ediff1d(T_train)).mean()\n",
        "sigma = np.log(np.ediff1d(T_train)).std()\n",
        "\n",
        "## kernel initializer for positive weights\n",
        "def abs_glorot_uniform(shape, dtype=None, partition_info=None): \n",
        "    return K.abs(keras.initializers.glorot_uniform(seed=None)(shape,dtype=dtype))\n",
        "\n",
        "## Inputs \n",
        "event_history  = layers.Input(shape=(time_step,1)) # input to RNN (event history)\n",
        "elapsed_time = layers.Input(shape=(1,)) # input to cumulative hazard function network (the elapsed time from the most recent event)\n",
        "\n",
        "## log-transformation and standardization\n",
        "event_history_nmlz = layers.Lambda(lambda x: (K.log(x)-mu)/sigma )(event_history)\n",
        "elapsed_time_nmlz = layers.Lambda(lambda x: (K.log(x)-mu)/sigma )(elapsed_time) \n",
        "\n",
        "## RNN\n",
        "output_rnn = layers.SimpleRNN(size_rnn,input_shape=(time_step,1),activation='tanh')(event_history_nmlz)\n",
        "\n",
        "## the first hidden layer in the cummulative hazard function network\n",
        "hidden_tau = layers.Dense(size_nn,kernel_initializer=abs_glorot_uniform,kernel_constraint=keras.constraints.NonNeg(),use_bias=False)(elapsed_time_nmlz) # elapsed time -> the 1st hidden layer, positive weights\n",
        "hidden_rnn = layers.Dense(size_nn)(output_rnn) # rnn output -> the 1st hidden layer\n",
        "hidden = layers.Lambda(lambda inputs: K.tanh(inputs[0]+inputs[1]) )([hidden_tau,hidden_rnn])\n",
        "\n",
        "## the second and higher hidden layers\n",
        "for i in range(size_layer_chfn-1):\n",
        "    hidden = layers.Dense(size_nn,activation='tanh',kernel_initializer=abs_glorot_uniform,kernel_constraint=keras.constraints.NonNeg())(hidden) # positive weights\n",
        "\n",
        "## Outputs\n",
        "Int_l = layers.Dense(1, activation='softplus',kernel_initializer=abs_glorot_uniform, kernel_constraint=keras.constraints.NonNeg() )(hidden) # cumulative hazard function, positive weights\n",
        "l = layers.Lambda( lambda inputs: K.gradients(inputs[0],inputs[1])[0] )([Int_l,elapsed_time]) # hazard function\n",
        "\n",
        "## define model\n",
        "model = Model(inputs=[event_history,elapsed_time],outputs=[l,Int_l])\n",
        "model.add_loss( -K.mean( K.log( 1e-10 + l ) - Int_l ) ) # set loss function to be the negative log-likelihood function"
      ],
      "execution_count": 3,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.\n",
            "Instructions for updating:\n",
            "If using Keras pass *_constraint arguments to layers.\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BDTXptTmJW7K",
        "colab_type": "text"
      },
      "source": [
        "## Training"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "TzCdfZkPJUy0",
        "colab_type": "code",
        "outputId": "c08092e2-090f-43c4-ca98-e643a3d96b8b",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 425
        }
      },
      "source": [
        "## format the input data\n",
        "dT_train = np.ediff1d(T_train) # transform a series of timestamps to a series of interevent intervals: T_train -> dT_train\n",
        "n = dT_train.shape[0]\n",
        "input_RNN = np.array( [ dT_train[i:i+time_step] for i in range(n-time_step) ]).reshape(n-time_step,time_step,1)\n",
        "input_CHFN = dT_train[-n+time_step:].reshape(n-time_step,1)\n",
        "\n",
        "## training \n",
        "model.compile(keras.optimizers.Adam(lr=0.001))\n",
        "model.fit([input_RNN,input_CHFN],epochs=10,batch_size=256,validation_split=0.2) # In our study, we have set epochs = 100 and employed early stopping. Please see code.ipynb for more details."
      ],
      "execution_count": 4,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:Output lambda_3 missing from loss dictionary. We assume this was done on purpose. The fit and evaluate APIs will not be expecting any data to be passed to lambda_3.\n",
            "WARNING:tensorflow:Output dense_3 missing from loss dictionary. We assume this was done on purpose. The fit and evaluate APIs will not be expecting any data to be passed to dense_3.\n",
            "Train on 63983 samples, validate on 15996 samples\n",
            "Epoch 1/10\n",
            "63983/63983 [==============================] - 6s 86us/sample - loss: 1.3159 - val_loss: 1.0593\n",
            "Epoch 2/10\n",
            "63983/63983 [==============================] - 3s 54us/sample - loss: 1.0645 - val_loss: 1.0571\n",
            "Epoch 3/10\n",
            "63983/63983 [==============================] - 3s 53us/sample - loss: 1.0507 - val_loss: 1.0373\n",
            "Epoch 4/10\n",
            "63983/63983 [==============================] - 3s 54us/sample - loss: 1.0360 - val_loss: 1.0328\n",
            "Epoch 5/10\n",
            "63983/63983 [==============================] - 3s 52us/sample - loss: 1.0263 - val_loss: 1.0180\n",
            "Epoch 6/10\n",
            "63983/63983 [==============================] - 3s 52us/sample - loss: 1.0216 - val_loss: 1.0168\n",
            "Epoch 7/10\n",
            "63983/63983 [==============================] - 3s 53us/sample - loss: 1.0192 - val_loss: 1.0120\n",
            "Epoch 8/10\n",
            "63983/63983 [==============================] - 3s 52us/sample - loss: 1.0152 - val_loss: 1.0149\n",
            "Epoch 9/10\n",
            "63983/63983 [==============================] - 3s 53us/sample - loss: 1.0139 - val_loss: 1.0081\n",
            "Epoch 10/10\n",
            "63983/63983 [==============================] - 3s 54us/sample - loss: 1.0127 - val_loss: 1.0126\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tensorflow.python.keras.callbacks.History at 0x7f1ff043e5c0>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 4
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pYYu7qrgMJjB",
        "colab_type": "text"
      },
      "source": [
        "## Test 1\n",
        "We here evaluate the performance of the trained model using the test data. We use the mean negative log-likelihood (MNLL) for the evaluation."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "S7zLJrIFO3qN",
        "colab_type": "code",
        "outputId": "651b183c-4d22-4869-f10a-b7b721a2eb60",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        }
      },
      "source": [
        "## format the input data\n",
        "dT_test = np.ediff1d(T_test) # transform a series of timestamps to a series of interevent intervals: T_test -> dT_test\n",
        "n = dT_test.shape[0]\n",
        "input_RNN_test = np.array( [ dT_test[i:i+time_step] for i in range(n-time_step) ]).reshape(n-time_step,time_step,1)\n",
        "input_CHFN_test = dT_test[-n+time_step:].reshape(n-time_step,1)\n",
        "\n",
        "## testing\n",
        "[l_test,Int_l_test] = model.predict([input_RNN_test,input_CHFN_test],batch_size=input_RNN_test.shape[0])\n",
        "LL = np.log(l_test+1e-10) - Int_l_test # log-liklihood\n",
        "print(\"Mean negative log-likelihood per event: \",-LL.mean())"
      ],
      "execution_count": 5,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Mean negative log-likelihood per event:  1.0063448\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jRXhd7tHTLF4",
        "colab_type": "text"
      },
      "source": [
        "## Test 2\n",
        "Next, we predict the timing of the next event at each time step using the median of the predictive distribution. We then evaluate the prediction in terms of mean absolute error (MAE)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "alUBeH5CTKHA",
        "colab_type": "code",
        "outputId": "784150fc-0152-4c64-bd78-d902fa23a4ea",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        }
      },
      "source": [
        "# The median of the predictive distribution is determined using the bisection method.\n",
        "x_left = 1e-4  * np.mean(dT_train) * np.ones_like(input_CHFN_test)\n",
        "x_right = 100 * np.mean(dT_train) * np.ones_like(input_CHFN_test)\n",
        "\n",
        "for i in range(13):\n",
        "    x_center = (x_left+x_right)/2\n",
        "    v = model.predict([input_RNN_test,x_center],batch_size=x_center.shape[0])[1]\n",
        "    x_left = np.where(v<np.log(2),x_center,x_left)\n",
        "    x_right = np.where(v>=np.log(2),x_center,x_right)\n",
        "    \n",
        "tau_pred = (x_left+x_right)/2 # predicted interevent interval\n",
        "AE = np.abs(input_CHFN_test-tau_pred) # absolute error\n",
        "print(\"Mean absolute error: \", AE.mean() )    "
      ],
      "execution_count": 6,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Mean absolute error:  0.6963987826906138\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "2mo5ZWPrivfs",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        ""
      ],
      "execution_count": 0,
      "outputs": []
    }
  ]
}