{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Quadrat Based Statistical Method for Planar Point Patterns\n", "\n", "**Authors: Serge Rey, Wei Kang and Hu Shao**\n", "\n", "## Introduction\n", "\n", "In this notebook, we show how to apply quadrat statistics to a point pattern to infer whether it comes from a CSR process.\n", "\n", "1. In [Quadrat Statistic](#Quadrat-Statistic) we introduce the concept of the quadrat-based method.\n", "2. In [Juvenile Example](#Juvenile-Example) we illustrate how to use the module **quadrat_statistics.py** on the example dataset **juvenile**." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Quadrat Statistic\n", "\n", "In the previous notebooks, we introduced the Complete Spatial Randomness (CSR) process, which serves as the benchmark process. Using the properties of CSR, we can identify point patterns that are unlikely to come from a CSR process. The quadrat statistic is one such method. A CSR process has two major characteristics:\n", "1. Uniform: each location has an equal probability of receiving a point (where an event happens).\n", "2. Independent: the locations of event points are independent of one another.\n", "\n", "For any point pattern, if the underlying process is CSR, the expected point count inside any cell of area $|A|$ is $\\lambda |A|$, where $\\lambda$ is the intensity, which is uniform across the study area under CSR. Thus, if we impose an $m \\times k$ rectangular tessellation over the study area (window), we can easily calculate the expected number of points inside each cell under the null of CSR. By comparing the observed point counts against the expected counts and calculating a $\\chi^2$ test statistic, we can decide whether to reject the null based on the position of the $\\chi^2$ test statistic in the sampling distribution. 
\n", "\n", "$$\\chi^2 = \\sum^m_{i=1} \\sum^k_{j=1} \\frac{[x_{i,j}-E(x_{i,j})]^2}{\\lambda |A_{i,j}|}$$\n", "\n", "Here $x_{i,j}$ is the observed point count in cell $(i,j)$ and $E(x_{i,j}) = \\lambda |A_{i,j}|$ is its expected count under CSR.\n", "\n", "There are two ways to construct the sampling distribution and obtain a p-value:\n", "1. Analytical sampling distribution: a $\\chi^2$ distribution with $m \\times k -1$ degrees of freedom. We can refer to the $\\chi^2$ distribution table to obtain the p-value. If it is smaller than $0.05$, we reject the null at the $5\\%$ significance level.\n", "2. Empirical sampling distribution: a distribution constructed from a large number of $\\chi^2$ test statistics computed for point patterns simulated under the null of CSR. If the $\\chi^2$ test statistic for the observed point pattern is among the largest $5\\%$ of the simulated test statistics, the observed pattern is very unlikely to be the outcome of a CSR process, and the null is rejected at the $5\\%$ significance level. A pseudo p-value can then be calculated and used with the same decision rule as the analytical p-value:\n", "$$p(\\chi^2) = \\frac{1+\\sum^{nsim}_{i=1}\\phi_i}{nsim+1}$$\n", "where \n", "$$\n", "\\phi_i =\n", " \\begin{cases}\n", " 1 & \\quad \\text{if } \\psi_i^2 \\geq \\chi^2 \\\\\n", " 0 & \\quad \\text{otherwise } \\\\\n", " \\end{cases}\n", "$$\n", "\n", "Here $nsim$ is the number of simulations, $\\psi_i^2$ is the $\\chi^2$ test statistic calculated for the $i$-th simulated point pattern, $\\chi^2$ is the test statistic calculated for the observed point pattern, and $\\phi_i$ is an indicator variable.\n", "\n", "Next, we show how to use the **quadrat_statistics.py** module to apply the quadrat-based method, using either of the two approaches above to construct the sampling distribution and obtain a p-value.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Juvenile Example" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import libpysal as ps\n", "import numpy as np\n", "from pointpats import PointPattern, as_window\n", "from pointpats 
import PoissonPointProcess as csr\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import the quadrat_statistics module to apply the quadrat-based method. \n", "\n", "The module has three major classes: **RectangleM, HexagonM, QStatistic**. The first two impose a tessellation (rectangular or hexagonal) over the minimum bounding rectangle of the point pattern and count the number of points falling in each cell. **QStatistic** is the main class, with which we can calculate a p-value as well as a pseudo p-value to help us decide whether to reject the null." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import pointpats.quadrat_statistics as qs" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['HexagonM',\n", " 'QStatistic',\n", " 'RectangleM',\n", " '__all__',\n", " '__author__',\n", " '__builtins__',\n", " '__cached__',\n", " '__doc__',\n", " '__file__',\n", " '__loader__',\n", " '__name__',\n", " '__package__',\n", " '__spec__',\n", " 'math',\n", " 'np',\n", " 'plt',\n", " 'scipy']" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dir(qs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Open the point shapefile \"juvenile.shp\"." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "juv = ps.io.open(ps.examples.get_path(\"juvenile.shp\"))" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "168" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(juv) # 168 point events in total" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[94., 93.],\n", " [80., 95.],\n", " [79., 90.],\n", " [78., 92.],\n", " [76., 92.],\n", " [66., 93.],\n", " [64., 90.],\n", " [27., 70.],\n", " [58., 88.],\n", " [57., 92.],\n", " [53., 92.],\n", " [50., 90.],\n", " [49., 90.],\n", " [32., 90.],\n", " [31., 87.],\n", " [22., 87.],\n", " [21., 87.],\n", " [21., 86.],\n", " [22., 81.],\n", " [23., 83.],\n", " [27., 85.],\n", " [27., 84.],\n", " [27., 83.],\n", " [27., 82.],\n", " [30., 84.],\n", " [31., 84.],\n", " [31., 84.],\n", " [32., 83.],\n", " [33., 81.],\n", " [32., 79.],\n", " [32., 76.],\n", " [33., 77.],\n", " [34., 86.],\n", " [34., 84.],\n", " [38., 82.],\n", " [39., 81.],\n", " [40., 80.],\n", " [41., 83.],\n", " [43., 75.],\n", " [44., 81.],\n", " [46., 81.],\n", " [47., 82.],\n", " [47., 81.],\n", " [48., 80.],\n", " [48., 81.],\n", " [50., 85.],\n", " [51., 84.],\n", " [52., 83.],\n", " [55., 85.],\n", " [57., 88.],\n", " [57., 81.],\n", " [60., 87.],\n", " [69., 80.],\n", " [71., 82.],\n", " [72., 81.],\n", " [74., 82.],\n", " [75., 81.],\n", " [77., 88.],\n", " [80., 88.],\n", " [82., 77.],\n", " [66., 62.],\n", " [64., 71.],\n", " [59., 63.],\n", " [55., 64.],\n", " [53., 68.],\n", " [52., 59.],\n", " [51., 61.],\n", " [50., 75.],\n", " [50., 74.],\n", " [45., 61.],\n", " [44., 60.],\n", " [43., 59.],\n", " [42., 61.],\n", " [39., 71.],\n", " [37., 67.],\n", " [35., 70.],\n", " [31., 68.],\n", " [30., 71.],\n", " [29., 61.],\n", " [26., 69.],\n", " [24., 68.],\n", " [ 7., 52.],\n", " [11., 
53.],\n", " [34., 50.],\n", " [36., 47.],\n", " [37., 45.],\n", " [37., 56.],\n", " [38., 55.],\n", " [38., 50.],\n", " [39., 52.],\n", " [41., 52.],\n", " [47., 49.],\n", " [50., 57.],\n", " [52., 56.],\n", " [53., 55.],\n", " [56., 57.],\n", " [69., 52.],\n", " [69., 50.],\n", " [71., 51.],\n", " [71., 51.],\n", " [73., 48.],\n", " [74., 48.],\n", " [75., 46.],\n", " [75., 46.],\n", " [86., 51.],\n", " [87., 51.],\n", " [87., 52.],\n", " [90., 52.],\n", " [91., 51.],\n", " [87., 42.],\n", " [81., 39.],\n", " [80., 43.],\n", " [79., 37.],\n", " [78., 38.],\n", " [75., 44.],\n", " [73., 41.],\n", " [71., 44.],\n", " [68., 29.],\n", " [62., 33.],\n", " [61., 35.],\n", " [60., 34.],\n", " [58., 36.],\n", " [54., 30.],\n", " [52., 38.],\n", " [52., 36.],\n", " [47., 37.],\n", " [46., 36.],\n", " [45., 33.],\n", " [36., 32.],\n", " [22., 39.],\n", " [21., 38.],\n", " [22., 35.],\n", " [21., 36.],\n", " [22., 30.],\n", " [19., 29.],\n", " [17., 40.],\n", " [14., 41.],\n", " [13., 36.],\n", " [10., 34.],\n", " [ 7., 37.],\n", " [ 2., 39.],\n", " [21., 16.],\n", " [22., 14.],\n", " [29., 17.],\n", " [30., 25.],\n", " [32., 26.],\n", " [39., 28.],\n", " [40., 26.],\n", " [40., 26.],\n", " [42., 25.],\n", " [43., 24.],\n", " [43., 16.],\n", " [48., 16.],\n", " [51., 25.],\n", " [52., 26.],\n", " [57., 27.],\n", " [60., 22.],\n", " [63., 24.],\n", " [64., 23.],\n", " [64., 27.],\n", " [71., 25.],\n", " [50., 10.],\n", " [48., 12.],\n", " [45., 14.],\n", " [33., 8.],\n", " [31., 7.],\n", " [32., 6.],\n", " [31., 8.]])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "juv_points = np.array([event for event in juv]) # get x,y coordinates for all the points\n", "juv_points" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Construct a point pattern from numpy array **juv_points**." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pp_juv = PointPattern(juv_points)\n", "pp_juv" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Point Pattern\n", "168 points\n", "Bounding rectangle [(2.0,6.0), (94.0,95.0)]\n", "Area of window: 8188.0\n", "Intensity estimate for window: 0.02051783097215437\n", " x y\n", "0 94.0 93.0\n", "1 80.0 95.0\n", "2 79.0 90.0\n", "3 78.0 92.0\n", "4 76.0 92.0\n" ] } ], "source": [ "pp_juv.summary()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "