{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Singular Spectrum Analysis for NH Monthly Land Temperature\n", "\n", "Singular Spectrum Analysis (SSA) is a nonparametric method for analyzing and forecasting time series. It addresses the finite length and noisiness of sampled time series not by fitting an assumed model to the available data, but by using a data-adaptive basis set instead of the fixed sines and cosines of the Blackman-Tukey (BT) method.\n", "\n", "SSA has a wide range of applications, from nonparametric time series decomposition and filtering to parameter estimation and forecasting. Unlike traditional time series methods, SSA and related methods can also be applied to problems that fall outside classical time series analysis, such as exploratory analysis for data mining and parameter estimation in signal processing. For a rigorous and advanced introduction to SSA, please refer to the review paper Ghil et al. (2002) “Advanced spectral methods for climatic time series”, Rev Geophys 40(1): 3.1–3.41.\n", "\n", "In this notebook, we use the GISTEMP Northern Hemisphere mean dataset (1880 to present) as an example. The GISS Surface Temperature Analysis (GISTEMP) is an estimate of global surface temperature change (https://data.giss.nasa.gov/gistemp/). The data are updated around the middle of every month using current data files from NOAA GHCN v3 (meteorological stations), ERSST v5 (ocean areas), and SCAR (Antarctic stations), combined as described in Hansen et al. (2010). The data are presented as anomalies, i.e. deviations from the corresponding 1951-1980 means.\n", "\n", "SSA implementations are available for several languages, including R, MATLAB, and Python. This notebook uses the Python package [pyssa](https://github.com/aj-cloete/pySSA)."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Load all needed libraries" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import pandas as pd\n", "import numpy as np\n", "\n", "from mySSA import mySSA  # local module: mySSA.py\n", "from matplotlib import rcParams\n", "\n", "# Default figure size (width, height) in inches\n", "rcParams['figure.figsize'] = 11, 4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Read data" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |\n", "
|---|---|---|---|---|---|---|---|---|---|---|---|---|\n", "
| 2012-01-01 | 0.79 | 0.75 | 0.98 | 1.35 | 1.24 | 1.16 | 0.99 | 0.95 | 1.05 | 1.09 | 1.15 | 0.63 |\n", "
| 2013-01-01 | 1.16 | 0.95 | 1.11 | 0.86 | 0.92 | 0.95 | 0.84 | 0.81 | 0.81 | 1.03 | 1.20 | 1.01 |\n", "
| 2014-01-01 | 1.30 | 0.98 | 1.44 | 1.39 | 1.15 | 1.00 | 0.95 | 0.96 | 0.93 | 1.17 | 1.02 | 1.43 |\n", "
| 2015-01-01 | 1.42 | 1.36 | 1.58 | 1.24 | 1.17 | 1.21 | 1.03 | 1.12 | 1.22 | 1.39 | 1.58 | 1.91 |\n", "
| 2016-01-01 | 1.95 | 2.49 | 2.44 | 1.98 | 1.40 | 1.37 | 1.30 | 1.40 | 1.45 | 1.36 | 1.37 | 1.40 |\n", "