{ "cells": [ { "cell_type": "markdown", "metadata": { "toc": "true" }, "source": [ "# Table of Contents\n", "

1  Optimization in Julia
1.1  Flowchart
1.2  Modeling tools and solvers
1.3  DCP Using Convex.jl
1.3.1  Example: microbiome regression analysis
1.3.2  Sum-to-zero regression
1.3.2.1  Modeling using Convex.jl
1.3.2.2  Mosek
1.3.2.3  Gurobi
1.3.2.4  Cplex
1.3.2.5  SCS
1.3.3  Sum-to-zero lasso
1.3.4  Sum-to-zero group lasso
1.3.5  Example: matrix completion
1.4  Nonlinear programming (NLP)
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Optimization in Julia\n", "\n", "This lecture gives an overview of some optimization tools in Julia." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Julia Version 1.1.0\n", "Commit 80516ca202 (2019-01-21 21:24 UTC)\n", "Platform Info:\n", " OS: macOS (x86_64-apple-darwin14.5.0)\n", " CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz\n", " WORD_SIZE: 64\n", " LIBM: libopenlibm\n", " LLVM: libLLVM-6.0.1 (ORCJIT, skylake)\n", "Environment:\n", " JULIA_EDITOR = code\n" ] } ], "source": [ "versioninfo()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Flowchart\n", "\n", "* Statisticians do optimization in daily life: maximum likelihood estimation, machine learning, ...\n", "\n", "* Categories of optimization problems:\n", "\n", " 1. Problems with analytical solutions: least squares, principal component analysis, canonical correlation analysis, ...\n", " \n", " 2. Problems subject to Disciplined Convex Programming (DCP): linear programming (LP), quadratic programming (QP), second-order cone programming (SOCP), semidefinite programming (SDP), and geometric programming (GP).\n", " \n", " 3. Nonlinear programming (NLP): Newton type algorithms, Fisher scoring algorithm, EM algorithm, MM algorithms. \n", " \n", " 4. Large scale optimization: ADMM, SGD, ...\n", " \n", "![Flowchart](./optimization_flowchart.png) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Modeling tools and solvers\n", "\n", "Getting familiar with **good** optimization software broadens the scope and scale of problems we are able to solve in statistics. The following table lists some of the best optimization software. 
\n", "\n", "\n", "| | | LP | MILP | SOCP | MISOCP | SDP | GP | NLP | MINLP | | R | Matlab | Julia | Python | | Cost | \n", "|:---------:|:-:|:--:|:----:|:----:|:--------------:|:---:|:--:|:---:|:-----:|:-:|:-:|:------:|:-----:|:------:|:-:|:----:| \n", "| **modeling tools** | | | | | | | | | | | | | | | | | \n", "| cvx | | x | x | x | x | x | x | | | | | x | | x | | A | \n", "| Convex.jl | | x | x | x | x | x | | | | | | | x | | | O | \n", "| JuMP.jl | | x | x | x | x | | | x | x | | | | x | | | O | \n", "| MathProgBase.jl | | x | x | x | x | | | x | x | | | | x | | | O | \n", "| MathOptInterface.jl | | x | x | x | x | | | x | x | | | | x | | | O | \n", "| **convex solvers** | | | | | | | | | | | | | | | | | \n", "| Mosek | | x | x | x | x | x | x | x | | | x | x | x | x | | A | \n", "| Gurobi | | x | x | x | x | | | | | | x | x | x | x | | A | \n", "| CPLEX | | x | x | x | x | | | | | | x | x | x | x | | A | \n", "| SCS | | x | | x | | x | | | | | | x | x | x | | O | \n", "| **NLP solvers** | | | | | | | | | | | | | | | | | \n", "| NLopt | | x | | | | | | x | | | | x | x | x | | O | \n", "| Ipopt | | x | | | | | | x | | | | x | x | x | | O | \n", "| KNITRO | | x | x | | | | | x | x | | x | x | x | x | | $ | \n", "\n", "* O: open source \n", "* A: free academic license \n", "* $: commercial" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Difference between **modeling tools** and **solvers**\n", "\n", " - **Modeling tools** such as cvx (for Matlab) and Convex.jl (Julia analog of cvx) implement the disciplined convex programming (DCP) paradigm proposed by Grant and Boyd (2008). DCP prescribes a set of simple rules from which users can construct convex optimization problems easily.\n", " \n", " - **Solvers** (Mosek, Gurobi, Cplex, SCS, ...) are concrete software implementations of optimization algorithms. My favorite ones are: Mosek/Gurobi/SCS for DCP and Ipopt/NLopt for nonlinear programming. 
Mosek and Gurobi are commercial software but free for academic use. SCS/Ipopt/NLopt are open source. \n", " \n", " - Modeling tools can usually call a variety of solvers. Because modeling tools are solver agnostic, users do not have to worry about solver-specific interfaces.\n", " \n", "* For this course, **install** the following tools:\n", " - Gurobi: 1. Download Gurobi at [link](http://www.gurobi.com/downloads/gurobi-optimizer). 2. Request free academic license at [link](https://user.gurobi.com/download/licenses/free-academic). 3. Run `grbgetkey XXXXXXXXX` command on terminal as suggested. It'll retrieve a license file and put it under the home folder.\n", " - Mosek: 1. Request free academic license at [link](https://www.mosek.com/products/academic-licenses/). The license file will be sent to your edu email within minutes. Check Spam folder if necessary. 2. Put the license file at the default location `/home/YOURNAME/mosek/`.\n", " - Convex.jl, SCS.jl, Gurobi.jl, Mosek.jl, MathProgBase.jl, NLopt.jl, Ipopt.jl." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## DCP Using Convex.jl" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Standard convex problem classes like LP (linear programming), QP (quadratic programming), SOCP (second-order cone programming), SDP (semidefinite programming), and GP (geometric programming), are becoming a **technology**.\n", "\n", "![DCP Hierarchy](./convex-hierarchy.png)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Example: microbiome regression analysis\n", "\n", "We illustrate optimization tools in Julia using microbiome analysis as an example.\n", "\n", "16S microbiome sequencing technology generates sequence counts of various organisms (OTUs, operational taxonomic units) in samples. 
\n", "\n", "![Microbiome Data](./microbiome_data.png)\n", "\n", "For statistical analysis, counts are normalized into **proportions** for each sample, resulting in a covariate matrix $\\mathbf{X}$ with all rows summing to 1. For identifiability, we need to add a sum-to-zero constraint to the regression coefficients. In other words, we need to solve a **constrained least squares problem** \n", "$$\n", " \\text{minimize} \\frac{1}{2} \\|\\mathbf{y} - \\mathbf{X} \\beta\\|_2^2\n", "$$\n", "subject to the constraint $\\sum_{j=1}^p \\beta_j = 0$. For simplicity we ignore intercept and non-OTU covariates in this presentation.\n", "\n", "Let's first generate an artificial data set." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "using Random, LinearAlgebra, SparseArrays\n", "\n", "Random.seed!(123) # seed\n", "\n", "n, p = 100, 50\n", "X = rand(n, p)\n", "lmul!(Diagonal(1 ./ vec(sum(X, dims=2))), X)\n", "β = sprandn(p, 0.1) # sparse vector with about 10% non-zero entries\n", "y = X * β + randn(n);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sum-to-zero regression\n", "\n", "The sum-to-zero constrained least squares is a standard quadratic programming (QP) problem, so it should be solved easily by any QP solver." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Modeling using Convex.jl\n", "\n", "We use the Convex.jl package to model this QP problem. For a complete list of operations supported by Convex.jl, see the Convex.jl documentation." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "using Convex\n", "\n", "β̂cls = Variable(size(X, 2))\n", "problem = minimize(0.5sumsquares(y - X * β̂cls)) # objective\n", "problem.constraints += sum(β̂cls) == 0; # constraint" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Mosek\n", "\n", "We first use the Mosek solver to solve this QP."
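, "\n", "\n", "Because the only constraint is a single linear equality, this particular QP can also be solved directly from its KKT conditions, which gives a solver-independent reference for the solver results reported below. The following is a minimal sketch, assuming `X` and `y` are the simulated data from the cell above and that `X` has full column rank; the helper name `cls_kkt` is hypothetical and not part of Convex.jl or any solver package.\n", "\n", "```julia\n", "using LinearAlgebra\n", "\n", "# Hypothetical helper: solve min 0.5‖y - Xβ‖² subject to sum(β) = 0\n", "# via the KKT system\n", "#   [X'X  1] [β]   [X'y]\n", "#   [1'   0] [ν] = [ 0 ]\n", "# where ν is the Lagrange multiplier of the sum-to-zero constraint.\n", "# Assumes X has full column rank so that the KKT matrix is invertible.\n", "function cls_kkt(X::AbstractMatrix, y::AbstractVector)\n", "    p = size(X, 2)\n", "    XtX = X' * X\n", "    Xty = X' * y\n", "    K = [XtX ones(p); ones(1, p) 0.0]   # (p+1) × (p+1) KKT matrix\n", "    rhs = [Xty; 0.0]\n", "    sol = K \\ rhs\n", "    return sol[1:p]                     # drop the multiplier ν\n", "end\n", "\n", "β̂kkt = cls_kkt(X, y)   # X, y from the simulated data above\n", "sum(β̂kkt)              # should be zero up to rounding error\n", "```\n", "\n", "Comparing `β̂kkt` with `β̂cls.value` after each solve is one way to judge how accurate a given solver's answer is."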
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Problem\n", " Name : \n", " Objective sense : min \n", " Type : CONIC (conic optimization problem)\n", " Constraints : 107 \n", " Cones : 2 \n", " Scalar variables : 157 \n", " Matrix variables : 0 \n", " Integer variables : 0 \n", "\n", "Optimizer started.\n", "Presolve started.\n", "Linear dependency checker started.\n", "Linear dependency checker terminated.\n", "Eliminator started.\n", "Freed constraints in eliminator : 0\n", "Eliminator terminated.\n", "Eliminator - tries : 1 time : 0.00 \n", "Lin. dep. - tries : 1 time : 0.00 \n", "Lin. dep. - number : 0 \n", "Presolve terminated. Time: 0.00 \n", "Problem\n", " Name : \n", " Objective sense : min \n", " Type : CONIC (conic optimization problem)\n", " Constraints : 107 \n", " Cones : 2 \n", " Scalar variables : 157 \n", " Matrix variables : 0 \n", " Integer variables : 0 \n", "\n", "Optimizer - threads : 8 \n", "Optimizer - solved problem : the dual \n", "Optimizer - Constraints : 52\n", "Optimizer - Cones : 3\n", "Optimizer - Scalar variables : 106 conic : 106 \n", "Optimizer - Semi-definite variables: 0 scalarized : 0 \n", "Factor - setup time : 0.00 dense det. time : 0.00 \n", "Factor - ML order time : 0.00 GP order time : 0.00 \n", "Factor - nonzeros before factor : 1328 after factor : 1328 \n", "Factor - dense dim. 
: 0 flops : 3.08e+05 \n", "ITE PFEAS DFEAS GFEAS PRSTATUS POBJ DOBJ MU TIME \n", "0 1.5e+00 7.5e-01 1.5e+00 0.00e+00 0.000000000e+00 -2.000000000e+00 1.0e+00 0.00 \n", "1 1.7e-01 8.4e-02 7.0e-02 -7.80e-01 4.353550341e+00 2.260083736e+01 1.1e-01 0.00 \n", "2 2.8e-02 1.3e-02 1.1e-02 -4.28e-01 2.091400155e+01 4.281754629e+01 1.8e-02 0.00 \n", "3 2.0e-03 9.7e-04 4.9e-03 9.83e-01 2.693910885e+01 2.746907590e+01 1.3e-03 0.00 \n", "4 1.9e-06 9.3e-07 1.2e-04 1.05e+00 2.645057050e+01 2.645136710e+01 1.2e-06 0.00 \n", "5 8.0e-10 3.9e-10 2.5e-06 1.00e+00 2.645074168e+01 2.645074202e+01 5.2e-10 0.00 \n", "Optimizer terminated. Time: 0.00 \n", "\n", "\n", "Interior-point solution summary\n", " Problem status : PRIMAL_AND_DUAL_FEASIBLE\n", " Solution status : OPTIMAL\n", " Primal. obj: 2.6450741681e+01 nrm: 5e+01 Viol. con: 8e-10 var: 0e+00 cones: 3e-08 \n", " Dual. obj: 2.6450742020e+01 nrm: 1e+01 Viol. con: 0e+00 var: 1e-08 cones: 0e+00 \n", " 0.006196 seconds (3.52 k allocations: 1.643 MiB)\n" ] } ], "source": [ "using Mosek\n", "solver = MosekSolver(LOG=1)\n", "@time solve!(problem, solver)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(:Optimal, 26.450741680579814, [20.1736; 2.11868; … ; 24.1402; -4.51783])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check the status, optimal value, and minimizer of the problem\n", "problem.status, problem.optval, β̂cls.value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Gurobi\n", "\n", "Switch to Gurobi solver:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Academic license - for non-commercial use only\n", "Optimize a model with 107 rows, 157 columns and 5160 nonzeros\n", "Model has 2 quadratic constraints\n", "Coefficient statistics:\n", " Matrix range [1e-05, 2e+00]\n", " QMatrix range [1e+00, 1e+00]\n", " 
Objective range [1e+00, 1e+00]\n", " Bounds range [0e+00, 0e+00]\n", " RHS range [4e-03, 3e+00]\n", "Presolve removed 2 rows and 1 columns\n", "Presolve time: 0.00s\n", "Presolved: 105 rows, 156 columns, 5158 nonzeros\n", "Presolved model has 2 second-order cone constraints\n", "Ordering time: 0.00s\n", "\n", "Barrier statistics:\n", " Free vars : 50\n", " AA' NZ : 5.154e+03\n", " Factor NZ : 5.262e+03\n", " Factor Ops : 3.590e+05 (less than 1 second per iteration)\n", " Threads : 1\n", "\n", " Objective Residual\n", "Iter Primal Dual Primal Dual Compl Time\n", " 0 1.18451802e+01 -5.01000000e-01 2.40e+01 1.00e-01 2.10e-01 0s\n", " 1 3.33648461e+00 -3.24029170e-02 4.58e+00 4.72e-05 4.06e-02 0s\n", " 2 3.12915622e+00 4.52501039e+00 3.61e+00 4.47e-05 1.99e-02 0s\n", " 3 1.51263772e+01 7.51185121e+00 1.79e+00 4.02e-05 1.02e-01 0s\n", " 4 1.38248014e+01 2.01142309e+01 1.08e+00 9.67e-05 1.68e-02 0s\n", " 5 3.30715906e+01 2.40146272e+01 3.35e-03 1.01e-05 8.59e-02 0s\n", " 6 2.71094293e+01 2.63988011e+01 3.68e-09 2.91e-06 6.71e-03 0s\n", " 7 2.64515175e+01 2.64506426e+01 1.43e-10 4.75e-08 8.22e-06 0s\n", " 8 2.64507432e+01 2.64507419e+01 5.58e-09 1.54e-09 1.39e-08 0s\n", "\n", "Barrier solved model in 8 iterations and 0.01 seconds\n", "Optimal objective 2.64507432e+01\n", "\n", " 0.009493 seconds (2.00 k allocations: 1.699 MiB)\n" ] } ], "source": [ "using Gurobi\n", "solver = GurobiSolver(OutputFlag=1)\n", "@time solve!(problem, solver)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(:Optimal, 26.45074321661315, [20.1748; 2.11866; … ; 24.1422; -4.51908])" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check the status, optimal value, and minimizer of the problem\n", "problem.status, problem.optval, β̂cls.value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Cplex\n", "\n", "Switch to IBM Cplex solver:" ] }, { "cell_type": "code", 
"execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Tried aggregator 1 time.\n", "QCP Presolve eliminated 2 rows and 1 columns.\n", "Aggregator did 2 substitutions.\n", "Reduced QCP has 103 rows, 204 columns, and 5154 nonzeros.\n", "Reduced QCP has 52 quadratic constraints.\n", "Presolve time = 0.00 sec. (0.42 ticks)\n", "Parallel mode: using up to 8 threads for barrier.\n", "Number of nonzeros in lower triangle of A*A' = 5151\n", "Using Approximate Minimum Degree ordering\n", "Total time for automatic ordering = 0.00 sec. (1.13 ticks)\n", "Summary statistics for Cholesky factor:\n", " Threads = 8\n", " Rows in Factor = 103\n", " Integer space required = 104\n", " Total non-zeros in factor = 5255\n", " Total FP ops to factor = 358959\n", " Itn Primal Obj Dual Obj Prim Inf Upper Inf Dual Inf Inf Ratio\n", " 0 2.0710678e-01 -5.0000000e-01 8.08e+01 0.00e+00 7.30e+01 1.00e+00\n", " 1 3.8378132e+00 1.1892083e+01 8.08e+01 0.00e+00 7.30e+01 1.04e-01\n", " 2 1.4760051e+01 2.6113460e+01 7.19e+01 0.00e+00 6.50e+01 7.99e-02\n", " 3 3.2279497e+01 4.4035225e+01 5.51e+01 0.00e+00 4.98e+01 8.38e-02\n", " 4 2.7177376e+01 3.1140706e+01 8.25e+00 0.00e+00 7.46e+00 2.50e-01\n", " 5 2.6492206e+01 2.8891271e+01 1.58e+00 0.00e+00 1.43e+00 4.13e-01\n", " 6 2.6546001e+01 2.6870962e+01 9.88e-01 0.00e+00 8.93e-01 3.05e+00\n", " 7 2.6485052e+01 2.6785273e+01 1.38e-01 0.00e+00 1.24e-01 3.30e+00\n", " 8 2.6477246e+01 2.6565213e+01 1.27e-01 0.00e+00 1.15e-01 1.13e+01\n", " 9 2.6467787e+01 2.6525581e+01 3.83e-02 0.00e+00 3.46e-02 1.71e+01\n", " 10 2.6453149e+01 2.6474596e+01 2.66e-02 0.00e+00 2.41e-02 4.61e+01\n", " 11 2.6450770e+01 2.6450903e+01 1.11e-02 0.00e+00 1.00e-02 7.44e+03\n", " 12 2.6450744e+01 2.6450748e+01 6.54e-05 0.00e+00 5.91e-05 2.05e+05\n", " 13 2.6450742e+01 2.6450743e+01 2.36e-06 0.00e+00 2.14e-06 1.95e+06\n", " 0.026841 seconds (1.99 k allocations: 1.698 MiB)\n" ] } ], "source": [ "# Use Cplex solver\n", "using CPLEX\n", 
"solver = CplexSolver(CPXPARAM_ScreenOutput=1)\n", "@time solve!(problem, solver)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(:Optimal, 26.45074204102406, [20.1739; 2.11874; … ; 24.1406; -4.51808])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check the status, optimal value, and minimizer of the problem\n", "problem.status, problem.optval, β̂cls.value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### SCS\n", "\n", "Switch to the open source SCS solver:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.027974 seconds (2.25 k allocations: 2.028 MiB)\n", "----------------------------------------------------------------------------\n", "\tSCS v2.0.2 - Splitting Conic Solver\n", "\t(c) Brendan O'Donoghue, Stanford University, 2012-2017\n", "----------------------------------------------------------------------------\n", "Lin-sys: sparse-indirect, nnz in A = 5056, CG tol ~ 1/iter^(2.00)\n", "eps = 1.00e-05, alpha = 1.50, max_iters = 5000, normalize = 1, scale = 1.00\n", "acceleration_lookback = 20, rho_x = 1.00e-03\n", "Variables n = 53, constraints m = 107\n", "Cones:\tprimal zero / dual free vars: 2\n", "\tlinear vars: 1\n", "\tsoc vars: 104, soc blks: 2\n", "Setup time: 2.20e-04s\n", "----------------------------------------------------------------------------\n", " Iter | pri res | dua res | rel gap | pri obj | dua obj | kap/tau | time (s)\n", "----------------------------------------------------------------------------\n", " 0| 4.23e+20 2.27e+20 1.00e+00 -5.02e+21 3.29e+21 8.65e+20 7.80e-04 \n", " 100| 1.57e-05 1.89e-05 8.31e-07 2.64e+01 2.64e+01 1.62e-14 1.88e-02 \n", " 120| 9.72e-07 4.25e-06 1.73e-09 2.65e+01 2.65e+01 5.98e-15 2.05e-02 \n", "----------------------------------------------------------------------------\n", "Status: Solved\n", 
"Timing: Solve time: 2.05e-02s\n", "\tLin-sys: avg # CG iterations: 7.60, avg solve time: 1.39e-04s\n", "\tCones: avg projection time: 2.97e-07s\n", "\tAcceleration: avg step time: 2.51e-05s\n", "----------------------------------------------------------------------------\n", "Error metrics:\n", "dist(s, K) = 1.6815e-11, dist(y, K*) = 1.6816e-11, s'y/|s||y| = -1.7609e-13\n", "primal res: |Ax + s - b|_2 / (1 + |b|_2) = 9.7212e-07\n", "dual res: |A'y + c|_2 / (1 + |c|_2) = 4.2535e-06\n", "rel gap: |c'x + b'y| / (1 + |c'x| + |b'y|) = 1.7334e-09\n", "----------------------------------------------------------------------------\n", "c'x = 26.4506, -b'y = 26.4506\n", "============================================================================\n" ] } ], "source": [ "# Use SCS solver\n", "using SCS\n", "solver = SCSSolver(verbose=1)\n", "@time solve!(problem, solver)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(:Optimal, 26.45062094817076, [20.1736; 2.11862; … ; 24.1401; -4.51784])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check the status, optimal value, and minimizer of the problem\n", "problem.status, problem.optval, β̂cls.value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sum-to-zero lasso\n", "\n", "Suppose we want to know which organisms (OTU) are associated with the response. We can answer this question using a sum-to-zero constrained lasso\n", "$$\n", " \\text{minimize} \\frac 12 \\|\\mathbf{y} - \\mathbf{X} \\beta\\|_2^2 + \\lambda \\|\\beta\\|_1\n", "$$\n", "subject to the constraint $\\sum_{j=1}^p \\beta_j = 0$. Varying $\\lambda$ from small to large values will generate a solution path."
] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1.364175 seconds (1.25 M allocations: 838.979 MiB, 26.14% gc time)\n" ] } ], "source": [ "# Use Mosek solver\n", "using Mosek\n", "solver = MosekSolver(LOG=0)\n", "\n", "# # Use Gurobi solver\n", "# using Gurobi\n", "# solver = GurobiSolver(OutputFlag=0)\n", "\n", "# # Use Cplex solver\n", "# using CPLEX\n", "# solver = CplexSolver(CPXPARAM_ScreenOutput=0)\n", "\n", "# # Use SCS solver\n", "# using SCS\n", "# solver = SCSSolver(verbose=0)\n", "\n", "# solve at a grid of λ\n", "λgrid = 0:0.01:0.35\n", "β̂path = zeros(length(λgrid), size(X, 2)) # each row is β̂ at a λ\n", "β̂classo = Variable(size(X, 2))\n", "@time for i in 1:length(λgrid)\n", " λ = λgrid[i]\n", " # objective\n", " problem = minimize(0.5sumsquares(y - X * β̂classo) + λ * sum(abs, β̂classo))\n", " # constraint\n", " problem.constraints += sum(β̂classo) == 0\n", " solve!(problem, solver)\n", " β̂path[i, :] = β̂classo.value\n", "end" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "[Figure: Sum-to-Zero Lasso solution paths; horizontal axis λ, vertical axis β̂]" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using Plots; gr()\n", "using LaTeXStrings\n", "\n", "p = plot(collect(λgrid), β̂path, legend=:none)\n", "xlabel!(p, L\"\\lambda\")\n", "ylabel!(p, L\"\\hat \\beta\")\n", "title!(p, \"Sum-to-Zero Lasso\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sum-to-zero group lasso\n", "\n", "Suppose we want to do variable selection not at the OTU level, but at the Phylum level. OTUs are clustered into various Phyla. We can answer this question using a sum-to-zero constrained group lasso\n", "$$\n", " \\text{minimize} \\frac 12 \\|\\mathbf{y} - \\mathbf{X} \\beta\\|_2^2 + \\lambda \\sum_j \\|\\mathbf{\\beta}_j\\|_2\n", "$$\n", "subject to the constraint $\\sum_{j=1}^p \\beta_j = 0$, where $\\mathbf{\\beta}_j$ are regression coefficients corresponding to the $j$-th phylum. This is a second-order cone programming (SOCP) problem readily modeled by Convex.jl.\n", "\n", "Let's assume every 10 contiguous OTUs belong to one phylum."
] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.801165 seconds (400.55 k allocations: 235.576 MiB, 10.67% gc time)\n" ] } ], "source": [ "# Use Mosek solver\n", "using Mosek\n", "solver = MosekSolver(LOG=0)\n", "\n", "# # Use Gurobi solver\n", "# using Gurobi\n", "# solver = GurobiSolver(OutputFlag=0)\n", "\n", "# # Use Cplex solver\n", "# using CPLEX\n", "# solver = CplexSolver(CPXPARAM_ScreenOutput=0)\n", "\n", "# # Use SCS solver\n", "# using SCS\n", "# solver = SCSSolver(verbose=0)\n", "\n", "# solve at a grid of λ\n", "λgrid = 0.1:0.005:0.5\n", "β̂pathgrp = zeros(length(λgrid), size(X, 2)) # each row is β̂ at a λ\n", "β̂classo = Variable(size(X, 2))\n", "@time for i in 1:length(λgrid)\n", " λ = λgrid[i]\n", " # loss\n", " obj = 0.5sumsquares(y - X * β̂classo)\n", " # group lasso penalty term (integer division keeps the group index an Int)\n", " for j in 1:(size(X, 2) ÷ 10)\n", " βj = β̂classo[(10(j-1)+1):10j]\n", " obj = obj + λ * norm(βj)\n", " end\n", " problem = minimize(obj)\n", " # constraint\n", " problem.constraints += sum(β̂classo) == 0\n", " solve!(problem, solver)\n", " β̂pathgrp[i, :] = β̂classo.value\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It took Mosek less than 1 second to solve this seemingly hard optimization problem at **81** different $\\lambda$ values."
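, "\n", "\n", "One way to see why the group lasso problem is tractable is to write it in epigraph form. The following is a sketch of a standard reformulation; the auxiliary variables $s$ and $t_j$ are introduced here for illustration only, and the conic transformation that Convex.jl actually performs internally may differ.\n", "$$\n", " \\text{minimize } s + \\lambda \\sum_j t_j\n", "$$\n", "$$\n", " \\text{subject to } \\frac 12 \\|\\mathbf{y} - \\mathbf{X} \\beta\\|_2^2 \\le s, \\quad \\|\\mathbf{\\beta}_j\\|_2 \\le t_j \\text{ for each phylum } j, \\quad \\mathbf{1}^T \\beta = 0.\n", "$$\n", "The first constraint is a rotated second-order cone constraint and each $\\|\\mathbf{\\beta}_j\\|_2 \\le t_j$ is a standard second-order cone constraint, so the problem has a linear objective over an intersection of second-order cones and an affine set, which is exactly the SOCP template."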
] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "[Figure: Sum-to-Zero Group Lasso solution paths; horizontal axis λ, vertical axis β̂]" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "p2 = plot(collect(λgrid), β̂pathgrp, legend=:none)\n", "xlabel!(p2, L\"\\lambda\")\n", "ylabel!(p2, L\"\\hat \\beta\")\n", "title!(p2, \"Sum-to-Zero Group Lasso\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example: matrix completion\n", "\n", "Load the $128 \\times 128$ Lena picture with missing pixels."
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "128×128 Array{Gray{N0f8},2} with eltype Gray{Normed{UInt8,8}}:\n", " Gray{N0f8}(0.0) Gray{N0f8}(0.0) … Gray{N0f8}(0.627)\n", " Gray{N0f8}(0.627) Gray{N0f8}(0.624) Gray{N0f8}(0.388)\n", " Gray{N0f8}(0.612) Gray{N0f8}(0.612) Gray{N0f8}(0.0) \n", " Gray{N0f8}(0.0) Gray{N0f8}(0.0) Gray{N0f8}(0.192)\n", " Gray{N0f8}(0.612) Gray{N0f8}(0.0) Gray{N0f8}(0.0) \n", " Gray{N0f8}(0.0) Gray{N0f8}(0.0) … Gray{N0f8}(0.2) \n", " Gray{N0f8}(0.608) Gray{N0f8}(0.0) Gray{N0f8}(0.0) \n", " Gray{N0f8}(0.0) Gray{N0f8}(0.0) Gray{N0f8}(0.216)\n", " Gray{N0f8}(0.62) Gray{N0f8}(0.62) Gray{N0f8}(0.208)\n", " Gray{N0f8}(0.0) Gray{N0f8}(0.0) Gray{N0f8}(0.188)\n", " Gray{N0f8}(0.635) Gray{N0f8}(0.0) … Gray{N0f8}(0.0) \n", " Gray{N0f8}(0.631) Gray{N0f8}(0.0) Gray{N0f8}(0.0) \n", " Gray{N0f8}(0.0) Gray{N0f8}(0.627) Gray{N0f8}(0.184)\n", " ⋮ ⋱ \n", " Gray{N0f8}(0.0) Gray{N0f8}(0.129) Gray{N0f8}(0.0) \n", " Gray{N0f8}(0.149) Gray{N0f8}(0.129) Gray{N0f8}(0.0) \n", " Gray{N0f8}(0.216) Gray{N0f8}(0.0) Gray{N0f8}(0.208)\n", " Gray{N0f8}(0.345) Gray{N0f8}(0.341) Gray{N0f8}(0.231)\n", " Gray{N0f8}(0.0) Gray{N0f8}(0.0) … Gray{N0f8}(0.259)\n", " Gray{N0f8}(0.298) Gray{N0f8}(0.416) Gray{N0f8}(0.259)\n", " Gray{N0f8}(0.0) Gray{N0f8}(0.369) Gray{N0f8}(0.235)\n", " Gray{N0f8}(0.0) Gray{N0f8}(0.0) Gray{N0f8}(0.208)\n", " Gray{N0f8}(0.22) Gray{N0f8}(0.0) Gray{N0f8}(0.2) \n", " Gray{N0f8}(0.0) Gray{N0f8}(0.22) … Gray{N0f8}(0.0) \n", " Gray{N0f8}(0.196) Gray{N0f8}(0.208) Gray{N0f8}(0.345)\n", " Gray{N0f8}(0.192) Gray{N0f8}(0.0) Gray{N0f8}(0.0) " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using Images\n", "\n", "lena = load(\"lena128missing.png\")" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "128×128 Array{Float64,2}:\n", " 0.0 0.0 0.635294 0.0 … 0.0 0.0 0.627451\n", " 0.627451 
0.623529 0.0 0.611765 0.0 0.0 0.388235\n", " 0.611765 0.611765 0.0 0.0 0.403922 0.219608 0.0 \n", " 0.0 0.0 0.611765 0.0 0.223529 0.176471 0.192157\n", " 0.611765 0.0 0.615686 0.615686 0.0 0.0 0.0 \n", " 0.0 0.0 0.0 0.619608 … 0.0 0.0 0.2 \n", " 0.607843 0.0 0.623529 0.0 0.176471 0.192157 0.0 \n", " 0.0 0.0 0.623529 0.0 0.0 0.0 0.215686\n", " 0.619608 0.619608 0.0 0.0 0.2 0.0 0.207843\n", " 0.0 0.0 0.635294 0.635294 0.2 0.192157 0.188235\n", " 0.635294 0.0 0.0 0.0 … 0.192157 0.180392 0.0 \n", " 0.631373 0.0 0.0 0.0 0.0 0.0 0.0 \n", " 0.0 0.627451 0.635294 0.666667 0.172549 0.0 0.184314\n", " ⋮ ⋱ ⋮ \n", " 0.0 0.129412 0.0 0.541176 0.0 0.286275 0.0 \n", " 0.14902 0.129412 0.196078 0.537255 0.345098 0.0 0.0 \n", " 0.215686 0.0 0.262745 0.0 0.301961 0.0 0.207843\n", " 0.345098 0.341176 0.356863 0.513725 0.0 0.0 0.231373\n", " 0.0 0.0 0.0 0.0 … 0.0 0.243137 0.258824\n", " 0.298039 0.415686 0.458824 0.0 0.0 0.0 0.258824\n", " 0.0 0.368627 0.4 0.0 0.0 0.0 0.235294\n", " 0.0 0.0 0.34902 0.0 0.0 0.239216 0.207843\n", " 0.219608 0.0 0.0 0.0 0.0 0.0 0.2 \n", " 0.0 0.219608 0.235294 0.356863 … 0.0 0.0 0.0 \n", " 0.196078 0.207843 0.211765 0.0 0.0 0.270588 0.345098\n", " 0.192157 0.0 0.196078 0.309804 0.266667 0.356863 0.0 " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# convert to real matrices\n", "Y = Float64.(lena)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We fill in the missing pixels using a **matrix completion** technique developed by Candes and Tao\n", "$$\n", " \\text{minimize } \\|\\mathbf{X}\\|_*\n", "$$\n", "$$\n", " \\text{subject to } x_{ij} = y_{ij} \\text{ for all observed entries } (i, j).\n", "$$\n", "Here $\\|\\mathbf{M}\\|_* = \\sum_i \\sigma_i(\\mathbf{M})$ is the nuclear norm. In words, we seek the matrix with minimal nuclear norm that agrees with the observed entries. 
This is a semidefinite programming (SDP) problem readily modeled by Convex.jl.\n", "\n", "This example takes a long time to solve because of the high dimensionality." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Problem\n", " Name : \n", " Objective sense : min \n", " Type : CONIC (conic optimization problem)\n", " Constraints : 73665 \n", " Cones : 0 \n", " Scalar variables : 49153 \n", " Matrix variables : 1 \n", " Integer variables : 0 \n", "\n", "Optimizer started.\n", "Presolve started.\n", "Linear dependency checker started.\n", "Linear dependency checker terminated.\n", "Eliminator started.\n", "Freed constraints in eliminator : 0\n", "Eliminator terminated.\n", "Eliminator - tries : 1 time : 0.00 \n", "Lin. dep. - tries : 1 time : 0.01 \n", "Lin. dep. - number : 0 \n", "Presolve terminated. Time: 0.05 \n", "Problem\n", " Name : \n", " Objective sense : min \n", " Type : CONIC (conic optimization problem)\n", " Constraints : 73665 \n", " Cones : 0 \n", " Scalar variables : 49153 \n", " Matrix variables : 1 \n", " Integer variables : 0 \n", "\n", "Optimizer - threads : 8 \n", "Optimizer - solved problem : the primal \n", "Optimizer - Constraints : 32896\n", "Optimizer - Cones : 1\n", "Optimizer - Scalar variables : 24769 conic : 24769 \n", "Optimizer - Semi-definite variables: 1 scalarized : 32896 \n", "Factor - setup time : 231.01 dense det. time : 0.00 \n", "Factor - ML order time : 206.00 GP order time : 0.00 \n", "Factor - nonzeros before factor : 5.41e+08 after factor : 5.41e+08 \n", "Factor - dense dim. 
: 2 flops : 1.19e+13 \n", "ITE PFEAS DFEAS GFEAS PRSTATUS POBJ DOBJ MU TIME \n", "0 1.0e+00 1.0e+00 1.0e+00 0.00e+00 0.000000000e+00 0.000000000e+00 1.0e+00 231.11\n", "1 3.7e-01 3.7e-01 2.3e-01 -9.40e-01 2.723707186e+02 2.783663417e+02 3.7e-01 358.60\n", "2 3.1e-01 3.1e-01 5.4e-01 2.89e-01 1.669310739e+02 1.662436137e+02 3.1e-01 499.40\n", "3 5.4e-02 5.4e-02 5.3e-01 8.69e-01 1.629170332e+02 1.627092805e+02 5.4e-02 646.21\n", "4 4.3e-03 4.3e-03 7.1e-02 1.20e+00 1.486177530e+02 1.486148653e+02 4.3e-03 796.20\n", "5 1.6e-03 1.6e-03 4.2e-02 1.01e+00 1.483660057e+02 1.483651653e+02 1.6e-03 944.00\n", "6 4.5e-05 4.5e-05 2.9e-02 1.00e+00 1.479825741e+02 1.479824023e+02 4.5e-05 1105.40\n", "7 1.4e-05 1.4e-05 5.5e-03 1.00e+00 1.479752237e+02 1.479751933e+02 1.4e-05 1254.12\n", "8 3.5e-07 3.5e-07 2.4e-03 1.00e+00 1.479711707e+02 1.479711694e+02 3.5e-07 1416.05\n", "9 9.0e-08 8.3e-08 3.6e-04 1.00e+00 1.479711088e+02 1.479711087e+02 8.4e-08 1559.26\n", "10 1.6e-09 1.5e-09 9.1e-05 1.00e+00 1.479710826e+02 1.479710826e+02 1.3e-09 1728.32\n", "Optimizer terminated. Time: 1728.63 \n", "\n", "\n", "Interior-point solution summary\n", " Problem status : PRIMAL_AND_DUAL_FEASIBLE\n", " Solution status : OPTIMAL\n", " Primal. obj: 1.4797108260e+02 nrm: 1e+02 Viol. con: 3e-09 var: 0e+00 barvar: 0e+00 \n", " Dual. obj: 1.4797108259e+02 nrm: 1e+00 Viol. con: 0e+00 var: 6e-10 barvar: 3e-09 \n", "1735.908026 seconds (18.59 M allocations: 977.941 MiB, 0.04% gc time)\n" ] } ], "source": [ "# Use Mosek solver\n", "using Mosek\n", "solver = MosekSolver(LOG=1)\n", "\n", "# Linear indices of obs. 
entries\n", "obsidx = findall(Y[:] .≠ 0.0)\n", "# Create optimization variables\n", "X = Convex.Variable(size(Y))\n", "# Set up optimization problem\n", "problem = minimize(nuclearnorm(X))\n", "problem.constraints += X[obsidx] == Y[obsidx]\n", "# Solve the problem by calling solve!\n", "@time solve!(problem, solver)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "128×128 reinterpret(Gray{Float64}, ::Array{Float64,2}):\n", " Gray{Float64}(0.510136) Gray{Float64}(0.541837) … Gray{Float64}(0.627451)\n", " Gray{Float64}(0.627451) Gray{Float64}(0.623529) Gray{Float64}(0.388235)\n", " Gray{Float64}(0.611765) Gray{Float64}(0.611765) Gray{Float64}(0.337453)\n", " Gray{Float64}(0.58598) Gray{Float64}(0.58162) Gray{Float64}(0.192157)\n", " Gray{Float64}(0.611765) Gray{Float64}(0.581892) Gray{Float64}(0.277462)\n", " Gray{Float64}(0.557051) Gray{Float64}(0.520124) … Gray{Float64}(0.2) \n", " Gray{Float64}(0.607843) Gray{Float64}(0.582899) Gray{Float64}(0.197011)\n", " Gray{Float64}(0.61125) Gray{Float64}(0.637072) Gray{Float64}(0.215686)\n", " Gray{Float64}(0.619608) Gray{Float64}(0.619608) Gray{Float64}(0.207843)\n", " Gray{Float64}(0.617212) Gray{Float64}(0.588928) Gray{Float64}(0.188235)\n", " Gray{Float64}(0.635294) Gray{Float64}(0.594274) … Gray{Float64}(0.206484)\n", " Gray{Float64}(0.631373) Gray{Float64}(0.592405) Gray{Float64}(0.184312)\n", " Gray{Float64}(0.588056) Gray{Float64}(0.627451) Gray{Float64}(0.184314)\n", " ⋮ ⋱ \n", " Gray{Float64}(0.149179) Gray{Float64}(0.129412) Gray{Float64}(0.241798)\n", " Gray{Float64}(0.14902) Gray{Float64}(0.129412) Gray{Float64}(0.265207)\n", " Gray{Float64}(0.215686) Gray{Float64}(0.222663) Gray{Float64}(0.207843)\n", " Gray{Float64}(0.345098) Gray{Float64}(0.341176) Gray{Float64}(0.231373)\n", " Gray{Float64}(0.263512) Gray{Float64}(0.334629) … Gray{Float64}(0.258824)\n", " Gray{Float64}(0.298039) Gray{Float64}(0.415686) Gray{Float64}(0.258824)\n", " 
Gray{Float64}(0.268499) Gray{Float64}(0.368627) Gray{Float64}(0.235294)\n", " Gray{Float64}(0.249184) Gray{Float64}(0.289783) Gray{Float64}(0.207843)\n", " Gray{Float64}(0.219608) Gray{Float64}(0.251988) Gray{Float64}(0.2) \n", " Gray{Float64}(0.229011) Gray{Float64}(0.219608) … Gray{Float64}(0.203565)\n", " Gray{Float64}(0.196078) Gray{Float64}(0.207843) Gray{Float64}(0.345098)\n", " Gray{Float64}(0.192157) Gray{Float64}(0.202593) Gray{Float64}(0.304792)" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "colorview(Gray, X.value)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Nonlinear programming (NLP)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "We use maximum likelihood estimation (MLE) for the gamma distribution to illustrate some rudiments of nonlinear programming (NLP) in Julia. \n", "\n", "Let $x_1,\\ldots,x_m$ be a random sample from the gamma density with shape $\\alpha$ and rate $\\beta$\n", "$$\n", "f(x) = \\Gamma(\\alpha)^{-1} \\beta^{\\alpha} x^{\\alpha-1} e^{-\\beta x}\n", "$$\n", "on $(0,\\infty)$. The log-likelihood function is\n", "$$\n", " L(\\alpha, \\beta) = m [- \\ln \\Gamma(\\alpha) + \\alpha \\ln \\beta + (\\alpha - 1)\\overline{\\ln x} - \\beta \\overline{x}],\n", "$$\n", "where $\\overline{x} = \\frac{1}{m} \\sum_{i=1}^m x_i$ and \n", "$\\overline{\\ln x} = \\frac{1}{m} \\sum_{i=1}^m \\ln x_i$." 
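, "\n", "\n", "As a worked step (a sketch derived from the log-likelihood above; the symbol $\\psi$ is introduced here and is not part of the original notes), write $\\psi(\\alpha) = \\frac{d}{d\\alpha} \\ln \\Gamma(\\alpha)$ for the digamma function. Differentiating $L$ term by term gives the score equations\n", "$$\n", "\\frac{\\partial L}{\\partial \\alpha} = m [- \\psi(\\alpha) + \\ln \\beta + \\overline{\\ln x}], \\quad \\frac{\\partial L}{\\partial \\beta} = m \\left[ \\frac{\\alpha}{\\beta} - \\overline{x} \\right],\n", "$$\n", "and the second derivatives\n", "$$\n", "\\frac{\\partial^2 L}{\\partial \\alpha^2} = - m \\psi'(\\alpha), \\quad \\frac{\\partial^2 L}{\\partial \\alpha \\partial \\beta} = \\frac{m}{\\beta}, \\quad \\frac{\\partial^2 L}{\\partial \\beta^2} = - \\frac{m \\alpha}{\\beta^2}.\n", "$$\n", "Setting $\\partial L / \\partial \\beta = 0$ yields $\\beta = \\alpha / \\overline{x}$, and substituting into $\\partial L / \\partial \\alpha = 0$ leaves the one-dimensional equation $\\ln \\alpha - \\psi(\\alpha) = \\ln \\overline{x} - \\overline{\\ln x}$, which has no closed-form solution, hence the need for an iterative algorithm. As a sanity check, at $\\alpha = \\beta = 1$ with $m = 5$ the second derivatives are $-5 \\pi^2 / 6 \\approx -8.22467$, $5$, and $-5$, using $\\psi'(1) = \\pi^2 / 6$; this agrees with the automatic-differentiation Hessian computed later in this section."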
] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-2.7886541275400365" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using Random, Statistics, SpecialFunctions\n", "Random.seed!(280)\n", "\n", "# Log-likelihood of an i.i.d. sample x from the gamma density with shape α and rate β\n", "function gamma_logpdf(x::Vector, α::Real, β::Real)\n", " m = length(x)\n", " avg = mean(x)\n", " logavg = sum(log, x) / m\n", " m * (- lgamma(α) + α * log(β) + (α - 1) * logavg - β * avg)\n", "end\n", "\n", "x = rand(5)\n", "gamma_logpdf(x, 1.0, 1.0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Many optimization algorithms involve taking derivatives of the objective function. The `ForwardDiff.jl` package implements forward-mode automatic differentiation. For example, the next two cells compute the gradient and Hessian of the log-likelihood with data `x` at `α=1.0` and `β=1.0`." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Array{Float64,1}:\n", " -1.058800554530917 \n", " 2.2113458724599635" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using ForwardDiff\n", "ForwardDiff.gradient(θ -> gamma_logpdf(x, θ...), [1.0; 1.0])" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Array{Float64,2}:\n", " -8.22467 5.0\n", " 5.0 -5.0" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ForwardDiff.hessian(θ -> gamma_logpdf(x, θ...), [1.0; 1.0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate data. Distributions.jl parameterizes `Gamma(α, θ)` by shape `α` and scale `θ`, so the second parameter drawn here is a scale, the reciprocal of the rate $\\beta$ in the density above:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True parameter values:\n", "α = 0.5535947086407722, β = 4.637963827225865\n" ] } ], "source": [ "using Distributions, Random\n", "\n", "Random.seed!(280)\n", "(n, p) = (1000, 2)\n", "(α, β) = 5.0 * rand(p)\n", "x = 
rand(Gamma(α, β), n)\n", "println(\"True parameter values:\")\n", "println(\"α = \", α, \", β = \", β)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use JuMP.jl to define and solve our NLP problem." ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Max myf(α, β)\n", "Subject to\n", " α ≥ 1.0e-8\n", " β ≥ 1.0e-8\n", "Total number of variables............................: 2\n", " variables with only lower bounds: 2\n", " variables with lower and upper bounds: 0\n", " variables with only upper bounds: 0\n", "Total number of equality constraints.................: 0\n", "Total number of inequality constraints...............: 0\n", " inequality constraints with only lower bounds: 0\n", " inequality constraints with lower and upper bounds: 0\n", " inequality constraints with only upper bounds: 0\n", "\n", "\n", "Number of Iterations....: 14\n", "\n", " (scaled) (unscaled)\n", "Objective...............: 1.8533848152383021e+03 1.8533848152383021e+03\n", "Dual infeasibility......: 5.2040090087569420e-09 5.2040090087569420e-09\n", "Constraint violation....: 0.0000000000000000e+00 0.0000000000000000e+00\n", "Complementarity.........: 9.9999999999999994e-12 9.9999999999999994e-12\n", "Overall NLP error.......: 5.2040090087569420e-09 5.2040090087569420e-09\n", "\n", "\n", "Number of objective function evaluations = 25\n", "Number of objective gradient evaluations = 15\n", "Number of equality constraint evaluations = 0\n", "Number of inequality constraint evaluations = 0\n", "Number of equality constraint Jacobian evaluations = 0\n", "Number of inequality constraint Jacobian evaluations = 0\n", "Number of Lagrangian Hessian evaluations = 0\n", "Total CPU secs in IPOPT (w/o function evaluations) = 0.019\n", "Total CPU secs in NLP function evaluations = 0.001\n", "\n", "EXIT: Optimal Solution Found.\n", "MLE (JuMP):\n", "α = α, β = β\n", "Objective value: -1853.384815238302\n", 
"α = 0.5489317142212758, β = 4.9909639237115\n", "MLE (Distribution package):\n", "Gamma{Float64}(α=0.5489317142213413, θ=4.990963923701522)\n" ] } ], "source": [ "using JuMP, Ipopt, NLopt\n", "\n", "m = Model(with_optimizer(Ipopt.Optimizer, print_level=3))\n", "# m = Model(with_optimizer(NLopt.Optimizer, algorithm=:LD_MMA))\n", "\n", "myf(a, b) = gamma_logpdf(x, a, b)\n", "# Register the user-defined objective; autodiff=true lets JuMP differentiate it automatically\n", "JuMP.register(m, :myf, 2, myf, autodiff=true)\n", "@variable(m, α >= 1e-8)\n", "@variable(m, β >= 1e-8)\n", "@NLobjective(m, Max, myf(α, β))\n", "\n", "print(m)\n", "status = JuMP.optimize!(m)\n", "\n", "println(\"MLE (JuMP):\")\n", "# α and β are now JuMP variables (they shadow the true values), so this line prints their names\n", "println(\"α = \", α, \", β = \", β)\n", "println(\"Objective value: \", JuMP.objective_value(m))\n", "# Report 1/β (the scale) to match the shape-scale parameterization used by Distributions.jl\n", "println(\"α = \", JuMP.value(α), \", β = \", 1 / JuMP.value(β))\n", "println(\"MLE (Distribution package):\")\n", "println(fit_mle(Gamma, x))" ] } ], "metadata": { "@webio": { "lastCommId": null, "lastKernelId": null }, "kernelspec": { "display_name": "Julia 1.1.0", "language": "julia", "name": "julia-1.1" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.1.0" }, "toc": { "colors": { "hover_highlight": "#DAA520", "running_highlight": "#FF0000", "selected_highlight": "#FFD700" }, "moveMenuLeft": true, "nav_menu": { "height": "153px", "width": "252px" }, "navigate_menu": true, "number_sections": true, "sideBar": true, "skip_h1_title": true, "threshold": 4, "toc_cell": true, "toc_position": { "height": "510.6000061035156px", "left": "0px", "right": "812px", "top": "109.4000015258789px", "width": "179.39999389648438px" }, "toc_section_display": "block", "toc_window_display": true, "widenNotebook": false } }, "nbformat": 4, "nbformat_minor": 1 }