{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Convex Nonparametric Least Squares (`CNLS`): An alternative formulation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CNLS" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Hildreth (1954) was the first to consider nonparametric regression subject to monotonicity and concavity constraints in the case of a single input variable $x$ (see also Hanson and Pledger 1976). Kuosmanen (2008) extended Hidreth’s approach to the multivariate setting with a vector-valued $\\bf{x}$, and coined the term convex nonparametric least squares (`CNLS`) for this method. CNLS builds upon the assumption that the true but unknown production function f belongs to the set of continuous, monotonic increasing and globally concave functions, $F_2$, imposing exactly the same production axioms as standard DEA." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `CNLS` estimator of function $f$ is obtained as the optimal solution to the infinite dimensional least squares problem\n", "\n", "\\begin{align}\n", "& \\underset{f} min \\sum_{i=1}^n(y_i-f(\\bf{x}_i))^2 \\\\\n", "& \\text{s.t.} \\\\\n", "& f \\in F_2\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CNLS estimator with the observed data points ($\\mathbf{x}_i$, $y_i$)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\\begin{align}\n", "& \\underset{\\alpha, \\beta, \\varepsilon} {min} \\sum_{i=1}^n\\varepsilon_i^2 \\\\\n", "& \\text{s.t.} \\\\\n", "& y_i = \\alpha_i + \\beta_i^{'}X_i + \\varepsilon_i \\quad \\forall i \\\\\n", "& \\alpha_i + \\beta_i^{'}X_i \\le \\alpha_j + \\beta_j^{'}X_i \\quad \\forall i, j\\\\\n", "& \\beta_i \\ge 0 \\quad \\forall i \\\\\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "where $\\alpha_i$ and $\\beta_i$ define the intercept and slope parameters of tangent hyperplanes that characterize the estimated piece-wise linear frontier. $\\varepsilon_i$ denotes the CNLS residuals." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Alternative formulation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To estimate the above `CNLS` model on Python, we have to resort to an alternative `CNLS` formulation.\n", "\n", "\\begin{align}\n", "& \\underset{\\hat{y}, \\beta} {min} \\frac{1}{2} \\mid\\mid y_i-\\hat{y}_i\\mid\\mid_2^2 \\\\\n", "& \\text{s.t.} \\\\\n", "& \\hat{y}_i - \\hat{y}_j \\ge \\beta_i^{'}(X_i -X_j) \\quad \\forall i, j\\\\\n", "& \\beta_i \\ge 0 \\quad \\forall i \\\\\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Proof 1: The constraints in the two formulations are equivalent." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " + Replace the CNLS residuals by: $\\varepsilon_i = y_i − \\alpha_i - X_i \\beta_i^{'} = y_i - \\hat{y}_i$\n", " + Left-hand side in second constraint: $\\hat{y}_i$\n", " + Right-hand side in second constraint: $\\hat{y}_j - X_j \\beta_j^{'} + X_i \\beta_j^{'} = \\hat{y}_j + \\beta_j^{'} (X_i-X_j)$\n", "\n", " Therefore, the second constraint is redefined as:\n", "\n", "\\begin{align}\n", "& \\alpha_i + \\beta_iX_i \\le \\alpha_j + \\beta_jX_i \\quad \\forall i, j\\\\\n", "& \\hat{y}_i \\le \\hat{y}_j + \\beta_j^{'} (X_i-X_j) \\\\\n", "& \\hat{y}_j - \\hat{y}_i \\ge \\beta_j^{'} (X_j-X_i) \\\\\n", "& \\hat{y}_i - \\hat{y}_j \\ge \\beta_i^{'} (X_i-X_j) \\quad \\blacksquare \\\\\n", "\\end{align} " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Proof 2: The objective function is a standard quadratic problem (QP)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The standard `QP` formulation:\n", "\\begin{align}\n", "\\text{min} f(x) = \\frac{1}{2}x^TAx + q^Tx \n", "\\end{align}\n", "\n", "The objective function in the alternative `CNLS` formulation:\n", "\\begin{align}\n", "& \\text{min} \\frac{1}{2} \\mid\\mid y-\\hat{y}\\mid\\mid_2^2 \\\\\n", "& = \\frac{1}{2} (\\hat{y} - y)^T(\\hat{y} - y) \\\\\n", "& = \\frac{1}{2}(\\hat{y}^T\\hat{y} -y^T\\hat{y} -\\hat{y}^Ty+ y^Ty) \\\\\n", "& = \\frac{1}{2}\\hat{y}^T\\hat{y} - y^T\\hat{y} + \\frac{1}{2}y^Ty \\\\\n", "\\end{align}\n", "\n", "since $\\frac{1}{2}y^Ty$ is a contant term, it is sufficient to solve the QP-form objective function. Note that A is a identity matrix, q= -y. $\\blacksquare$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we can estimate the alternative `CNLS` model based on convex quadratic problem solver in Python, eg., CVXOPT." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### References" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[1] Johnson, A. L. and Kuosmanen, T. (2015) An Introduction to CNLS and StoNED Methods for Efficiency Analysis: Economic Insights and Computational Aspects, in Ray, S. C., Kumbhakar, S. C., and Dua, P. (eds) Benchmarking for Performance Evaluation: A Production Frontier Approach. Springer, pp. 117–186.\n", "\n", "[2] Kuosmanen, T., Johnson, A. and Saastamoinen, A. (2015) Stochastic Nonparametric Approach to Efficiency Analysis: A unified Framework, in Zhu, J. (ed.) Data Envelopment Analysis. Springer, pp. 191–244." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }