{ "cells": [ { "cell_type": "markdown", "source": [ "# Solving Sparse Regression using `NExOS.jl`\n", "**Shuvomoy Das Gupta**\n", "\n", "The sparse regression problem (also known as regressor selection problem) is concerned with approximating a vector $b\\in\\mathbf{R}^{m}$ with a linear combination of at most $k$ columns of a matrix $A\\in\\mathbf{R}^{m\\times d}$ with bounded coefficients. The problem can be written as the following optimization\n", "problem\n", "$$\n", "\\begin{equation}\n", "\\begin{array}{ll}\n", "\\textrm{minimize} & \\|Ax-b\\|_{2}^{2}+\\frac{\\beta}{2}\\|x\\|^{2}\\\\\n", "\\textrm{subject to} & \\mathbf{card}(x)\\leq k\\\\\n", " & \\|x\\|_{\\infty}\\leq M,\n", "\\end{array}\n", "\\end{equation}\n", "$$\n", "where $x\\in\\mathbf{R}^{d}$ is the decision variable, and $A\\in\\mathbf{R}^{m\\times d},b\\in\\mathbf{R}^{m},$ and $M>0$ are problem data.\n", "\n", "First, load the packages." ], "metadata": {} }, { "cell_type": "code", "source": [ "using Random, NExOS, ProximalOperators" ], "outputs": [], "execution_count": 1, "metadata": { "execution": { "iopub.status.busy": "2020-11-11T18:35:58.213Z", "iopub.execute_input": "2020-11-11T18:35:58.684Z", "iopub.status.idle": "2020-11-11T18:36:09.892Z" } } }, { "cell_type": "markdown", "source": [ "Let us generate some random data for this problem." ], "metadata": {} }, { "cell_type": "code", "source": [ "m = 25\n", "n = 50\n", "A = randn(m,n)\n", "b = randn(m)\n", "M = 100\n", "k = convert(Int64, round(m/3))\n", "beta = 10^-10" ], "outputs": [ { "output_type": "execute_result", "execution_count": 2, "data": { "text/plain": "1.0000000000000006e-10" }, "metadata": {} } ], "execution_count": 2, "metadata": { "execution": { "iopub.status.busy": "2020-11-11T18:36:09.902Z", "iopub.execute_input": "2020-11-11T18:36:10.269Z", "iopub.status.idle": "2020-11-11T18:36:11.502Z" } } }, { "cell_type": "markdown", "source": [ "Create the problem instance in `NExOS`." ], "metadata": {} }, { "cell_type": "code", "source": [ "C = SparseSet(M, k) # Create the set\n", "f = LeastSquares(A, b, iterative = true) # Create the function\n", "settings = Settings(μ_max = 2, μ_mult_fact = 0.85, verbose = false, freq = 250, γ_updt_rule = :adaptive, β = beta) # settings\n", "z0 = zeros(n) # create an initial point\n", "problem = Problem(f, C, settings.β, z0) # problem instance" ], "outputs": [ { "output_type": "execute_result", "execution_count": 3, "data": { "text/plain": "Problem{ProximalOperators.LeastSquaresIterative{1,Float64,Float64,Array{Float64,2},Array{Float64,1},ProximalOperators.AAc},SparseSet{Int64,Int64},Float64,Array{Float64,1}}(description : Least squares penalty\ndomain : n/a\nexpression : n/a\nparameters : n/a, SparseSet{Int64,Int64}(100, 8), 1.0000000000000006e-10, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 … 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])" }, "metadata": {} } ], "execution_count": 3, "metadata": { "execution": { "iopub.status.busy": "2020-11-11T18:36:11.512Z", "iopub.execute_input": "2020-11-11T18:36:11.519Z", "iopub.status.idle": "2020-11-11T18:36:13.171Z" } } }, { "cell_type": "markdown", "source": [ "Time to solve the problem." ], "metadata": {} }, { "cell_type": "code", "source": [ "state_final = solve!(problem, settings)" ], "outputs": [ { "output_type": "execute_result", "execution_count": 4, "data": { "text/plain": "State{Array{Float64,1},Int64,Float64}([0.24701000766550982, -0.24356676418901654, 5.398080115334404e-8, -3.4929684602205246e-9, 2.6065156412524142e-8, 1.5946069144306067e-8, -7.630900321164644e-9, 3.907322609359914e-9, -3.972535918634318e-8, -2.317536490058187e-9 … -3.069384498367188e-9, 9.744408143847951e-9, -1.2217929273801725e-8, -2.9938485516205667e-8, 1.2370000675166513e-9, -0.2324999125397446, -5.372515086168684e-9, 0.5657579002049802, -3.4057851523858884e-8, 2.9848750351434576e-8], [0.24701000772586093, -0.24356676425942392, 5.7244326295707174e-8, -3.7041633897196132e-9, 2.7640953104422848e-8, 1.691009427084463e-8, -8.092314755556757e-9, 4.1435552164323004e-9, -4.2127065930870655e-8, -2.4576877899582207e-9 … -3.2550197141275017e-9, 1.0333505843181107e-8, -1.2956586048500635e-8, -3.1748567496928025e-8, 1.3116831310921374e-9, -0.23249991218033036, -5.697292475655342e-9, 0.5657579004409481, -3.611700315797661e-8, 3.165346638765637e-8], [0.24701000766550976, -0.2435667641890165, -5.3690129855456e-7, 3.474180488783803e-8, -2.592477377230451e-7, -1.586017446844432e-7, 7.589884391165743e-8, -3.886290048478037e-8, 3.9511475131091067e-7, 2.305098150516354e-8 … 3.052927946946281e-8, -9.691915181179844e-8, 1.2152134699676747e-7, 2.977736466590621e-7, -1.2302334444824045e-8, -0.2324999125397446, 5.3435551293638174e-8, 0.5657579002049802, 3.3874574390849054e-7, -2.9688173793408416e-7], 2, 3.9235795081516326e-9, 6.882125117023454e-8, 9.385625688121253e-9, 9.68794389337658e-8)" }, "metadata": {} } ], "execution_count": 4, "metadata": { "execution": { "iopub.status.busy": "2020-11-11T18:36:13.180Z", "iopub.execute_input": "2020-11-11T18:36:13.186Z", "iopub.status.idle": "2020-11-11T18:36:17.403Z" } } }, { "cell_type": "markdown", "source": [ "Let us take a look at the quality of the solution." ], "metadata": {} }, { "cell_type": "code", "source": [ "log10(state_final.fxd_pnt_gap) <= -4 # if the fixed point gap is less than 10^-4 (to determin if the algorithm has converged)" ], "outputs": [ { "output_type": "execute_result", "execution_count": 5, "data": { "text/plain": "true" }, "metadata": {} } ], "execution_count": 5, "metadata": { "execution": { "iopub.status.busy": "2020-11-11T18:36:17.427Z", "iopub.execute_input": "2020-11-11T18:36:17.438Z", "iopub.status.idle": "2020-11-11T18:36:17.675Z" } } }, { "cell_type": "code", "source": [ "log10(state_final.fsblt_gap) <= -4 # this is to test if the found solution by NExOS is locally optimal" ], "outputs": [ { "output_type": "execute_result", "execution_count": 6, "data": { "text/plain": "true" }, "metadata": {} } ], "execution_count": 6, "metadata": { "execution": { "iopub.status.busy": "2020-11-11T18:36:17.686Z", "iopub.execute_input": "2020-11-11T18:36:17.691Z", "iopub.status.idle": "2020-11-11T18:36:17.706Z" } } }, { "cell_type": "code", "source": [ "f(state_final.x) # this gives the objective value of the solution found by NExOS" ], "outputs": [ { "output_type": "execute_result", "execution_count": 7, "data": { "text/plain": "3.510658260563844" }, "metadata": {} } ], "execution_count": 7, "metadata": { "execution": { "iopub.status.busy": "2020-11-11T18:36:17.717Z", "iopub.execute_input": "2020-11-11T18:36:17.724Z", "iopub.status.idle": "2020-11-11T18:36:17.753Z" } } } ], "metadata": { "language_info": { "file_extension": ".jl", "name": "julia", "mimetype": "application/julia", "version": "1.5.0" }, "kernelspec": { "name": "julia-1.5", "display_name": "Julia 1.5.0", "language": "julia" }, "nteract": { "version": "0.28.0" } }, "nbformat": 4, "nbformat_minor": 2 }