{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Probabilistic Programming 3: Regression and classification\n", "\n", "#### Goal \n", " - Learn how to infer a posterior distribution for a linear regression model using a probabilistic programming language.\n", " - Learn how to infer a posterior distribution for a linear classification model using a probabilistic programming language.\n", " \n", "#### Materials \n", " - Mandatory\n", " - This notebook.\n", " - Lecture notes on [regression](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Regression.ipynb).\n", " - Lecture notes on [discriminative classification](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Discriminative-Classification.ipynb).\n", " - Optional\n", " - Bayesian linear regression (Section 3.3 [Bishop](https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf))\n", " - Bayesian logistic regression (Section 4.5 [Bishop](https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf))\n", " - [Cheatsheets: how does Julia differ from Matlab / Python](https://docs.julialang.org/en/v1/manual/noteworthy-differences/index.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "using Pkg\n", "Pkg.activate(\"../../../lessons/\")\n", "Pkg.instantiate();\n", "IJulia.clear_output();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem: Economic growth\n", "\n", "In 2008, the credit crisis sparked a recession in the US, which spread to other countries in the ensuing years. It took most countries a couple of years to recover. \n", "Now, the year is 2011. The Turkish government is asking you to estimate whether Turkey is out of the recession. 
You decide to look at the data of the national stock exchange to see if there's a positive trend. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "using CSV\n", "using DataFrames\n", "using LinearAlgebra\n", "using Distributions\n", "using StatsFuns\n", "using RxInfer\n", "using Plots\n", "default(label=\"\", margin=10Plots.pt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data\n", "\n", "We start by loading a data set. It contains daily measurements from the Istanbul stock exchange, from the 5th of January 2009 until the 22nd of February 2011. The data set comes from an online resource for machine learning data sets: the [UCI ML Repository](https://archive.ics.uci.edu/ml/datasets/ISTANBUL+STOCK+EXCHANGE)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Row | date (String15) | ISE (Float64) |
---|---|---|
1 | 5-Jan-09 | 0.0357537 |
2 | 6-Jan-09 | 0.0254259 |
3 | 7-Jan-09 | -0.0288617 |
4 | 8-Jan-09 | -0.0622081 |
5 | 9-Jan-09 | 0.00985991 |
6 | 12-Jan-09 | -0.029191 |
7 | 13-Jan-09 | 0.0154453 |
8 | 14-Jan-09 | -0.0411676 |
9 | 15-Jan-09 | 0.000661905 |
10 | 16-Jan-09 | 0.0220373 |
11 | 19-Jan-09 | -0.0226925 |
12 | 20-Jan-09 | -0.0137087 |
13 | 21-Jan-09 | 0.000864697 |
⋮ | ⋮ | ⋮ |
525 | 7-Feb-11 | -0.0061961 |
526 | 8-Feb-11 | 0.00535559 |
527 | 9-Feb-11 | 0.00482299 |
528 | 10-Feb-11 | -0.0176644 |
529 | 11-Feb-11 | 0.00478229 |
530 | 14-Feb-11 | -0.00249793 |
531 | 15-Feb-11 | 0.00360638 |
532 | 16-Feb-11 | 0.00859906 |
533 | 17-Feb-11 | 0.00931031 |
534 | 18-Feb-11 | 0.000190969 |
535 | 21-Feb-11 | -0.013069 |
536 | 22-Feb-11 | -0.00724632 |