{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Consider a ridge regression problem as follows:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\\text{minimize } \\ \\frac{1}{2} \\sum_{i=1}^n (w^T x_i - y_i)^2 + \\frac{\\lambda}{2} \\|w\\|^2$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This problem is a special case of a more general family of problems called *regularized empirical risk minimization*, where the objective function usually consists of two parts: a set of *loss terms* and a *regularization term*."
   ]
  },
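  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sketch of that general form (the symbols $f$, $\\ell$, and $r$ are notation assumed here for illustration, not names from the package), with a prediction function $f$, a loss $\\ell$, and a regularizer $r$:\n",
    "\n",
    "$$\\text{minimize } \\ \\sum_{i=1}^n \\ell(f(x_i; w), y_i) + r(w)$$\n",
    "\n",
    "Ridge regression corresponds to $f(x; w) = w^T x$, $\\ell(u, y) = \\frac{1}{2}(u - y)^2$, and $r(w) = \\frac{\\lambda}{2} \\|w\\|^2$."
   ]
  },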
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we show how to use the package *SGDOptim* to solve such a problem. First, we prepare some simulated data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3-element Array{Float64,1}:\n",
       "  3.0\n",
       " -4.0\n",
       "  5.0"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "w = [3.0, -4.0, 5.0];   # the underlying model coefficients"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3x10000 Array{Float64,2}:\n",
       "  1.20754  -0.106163  1.2656   -0.195827  …   0.926012   1.33045    0.24129 \n",
       "  1.36097  -0.86756   1.05529   0.609302      1.11368   -1.57341   -1.165   \n",
       " -1.05901   0.648764  0.67563   0.276843     -0.262188   0.425759  -0.052067"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "n = 10000; X = randn(3, n);  # generate 10000 sample features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10000-element Array{Float64,1}:\n",
       "  -7.16593\n",
       "   6.56201\n",
       "   2.89612\n",
       "  -1.69577\n",
       "   4.10703\n",
       "  11.8532 \n",
       "  14.9927 \n",
       "   3.83893\n",
       " -22.1367 \n",
       " -10.1685 \n",
       "  -6.24833\n",
       "  -3.17315\n",
       "   5.49029\n",
       "   ⋮      \n",
       "  10.8841 \n",
       "  11.0071 \n",
       "  -7.784  \n",
       "  12.7698 \n",
       "  -1.27293\n",
       "  -2.37607\n",
       "  -5.78557\n",
       "  -8.26047\n",
       "  -9.313  \n",
       "  -3.02949\n",
       "  12.331  \n",
       "   5.12746"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sig = 0.1; y = vec(w'X) + sig * randn(n); # generate the responses, adding some noise"
   ]
  },
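  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In symbols, the three cells above simulate the following model, with $\\sigma = 0.1$ and $n = 10000$ (the noise symbol $\\varepsilon_i$ is notation introduced here for illustration):\n",
    "\n",
    "$$x_i \\sim \\mathcal{N}(0, I_3), \\qquad y_i = w^T x_i + \\sigma \\varepsilon_i, \\qquad \\varepsilon_i \\sim \\mathcal{N}(0, 1)$$\n",
    "\n",
    "Because the noise is small relative to the signal, a good estimate should land close to $w = (3, -4, 5)$."
   ]
  },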
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's try to estimate $w$ from the data. We start by loading the package:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "using SGDOptim"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first step is to construct a risk model, which comprises a prediction model and a loss function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "EmpiricalRisks.SupervisedRiskModel{EmpiricalRisks.LinearPred,EmpiricalRisks.SqrLoss}(EmpiricalRisks.LinearPred(3),EmpiricalRisks.SqrLoss())"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rmodel = riskmodel(LinearPred(3),  # use linear prediction x -> w'x, 3 is the input dimension\n",
    "                   SqrLoss())      # use squared loss: loss(u, y) = (u - y)^2/2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we are ready to solve the problem:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Iter 100: avg.loss = 6.5156e-03\n",
      "Iter 200: avg.loss = 4.5992e-03\n",
      "Iter 300: avg.loss = 4.8039e-03\n",
      "Iter 400: avg.loss = 8.1380e-03\n",
      "Iter 500: avg.loss = 7.6406e-03\n",
      "Iter 600: avg.loss = 1.0267e-02\n",
      "Iter 700: avg.loss = 1.6335e-02\n",
      "Iter 800: avg.loss = 1.3647e-02\n",
      "Iter 900: avg.loss = 1.1578e-02\n",
      "Iter 1000: avg.loss = 1.0762e-02\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "3-element Array{Float64,1}:\n",
       "  3.00284\n",
       " -3.99607\n",
       "  5.0021 "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "w_e = sgd(rmodel,\n",
    "    zeros(3),      # the initial guess\n",
    "    minibatch_seq(X, y, 10),    # supply the data in mini-batches, each with 10 samples\n",
    "    reg = SqrL2Reg(1.0e-4),     # add squared L2 regularization with coefficient 1.0e-4\n",
    "    lrate = t->1.0/(100.0 + t), # set the learning-rate schedule\n",
    "    cbinterval = 100,           # invoke the callback every 100 iterations\n",
    "    callback = simple_trace)    # print the optimization trace in callback"
   ]
  },
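  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the iteration concrete, one step on a minibatch $B_t$ can be sketched as follows. This assumes a plain SGD update with the loss gradient averaged over the minibatch, and that `SqrL2Reg(c)` denotes $\\frac{c}{2} \\|w\\|^2$ as in the objective at the top. The averaging and the symbols $B_t$, $g_t$, and $\\eta_t$ are assumptions made for illustration, not confirmed details of how SGDOptim implements the update:\n",
    "\n",
    "$$g_t = \\frac{1}{|B_t|} \\sum_{i \\in B_t} (w_t^T x_i - y_i) x_i + \\lambda w_t, \\qquad w_{t+1} = w_t - \\eta_t g_t$$\n",
    "\n",
    "Here $\\lambda = 10^{-4}$ is the coefficient passed to `SqrL2Reg` and $\\eta_t = 1/(100 + t)$ is the learning rate given by `lrate`."
   ]
  },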
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The 10000 samples are partitioned into 1000 minibatches of size 10, so the solver ran 1000 iterations, each using a single minibatch.\n",
    "\n",
    "Now let's compare the estimated solution with the ground truth:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5.582303599714307e-7"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sumabs2(w_e - w) / sumabs2(w)   # relative squared error of the estimate"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The relative squared error is on the order of $10^{-7}$, so the estimate is very close to the ground truth. We are done!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 0.4.0-dev",
   "language": "julia",
   "name": "julia 0.4"
  },
  "language_info": {
   "name": "julia",
   "version": "0.4.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}