{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Statistical seismology and the Parkfield region\n", "> Herein, you'll use your statistical thinking skills to study the frequency and magnitudes of earthquakes. Along the way, you'll learn some basic statistical seismology, including the Gutenberg-Richter law. This exercise exposes two key ideas about data science: 1) as a data scientist, you wander into all sorts of domain-specific analyses, which is exciting because you constantly get to learn; 2) you are sometimes faced with limited data, as is the case for many of these earthquake studies. This is a summary of the lecture \"Case Studies in Statistical Thinking\", via DataCamp.\n", "\n", "- toc: true \n", "- badges: true\n", "- comments: true\n", "- author: Chanseok Kang\n", "- categories: [Python, Datacamp, Statistics]\n", "- image: images/compare_exp_norm.png" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import dc_stat_think as dcst\n", "\n", "plt.rcParams['figure.figsize'] = (10, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction to statistical seismology and the Parkfield experiment\n", "- The Gutenberg-Richter Law\n", "  The magnitudes of earthquakes above the completeness threshold in a given region over a given time period are exponentially distributed.\n", "\n", "  One parameter, given by $\\bar{m} - m_t$ (the mean magnitude of those earthquakes minus the completeness threshold), describes the distribution of earthquake magnitudes.\n", "- Completeness threshold\n", "  The magnitude $m_t$ above which all earthquakes in a region can be detected." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Parkfield earthquake magnitudes\n", "As usual, you will start with exploratory data analysis (EDA) and plot the ECDF of the magnitudes of earthquakes detected in the Parkfield region from 1950 to 2016. 
The magnitudes of all earthquakes in the region from the ANSS ComCat are stored in the NumPy array `mags`.\n", "\n", "This time, though, take a shortcut in generating the ECDF. You may recall that putting an asterisk before an argument in a function call unpacks it into separate positional arguments. Since `dcst.ecdf()` returns two values, we can pass them as the x, y positional arguments to `plt.plot()` as `plt.plot(*dcst.ecdf(data_you_want_to_plot))`.\n", "\n", "You will use this shortcut in this exercise and going forward." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| time | latitude | longitude | depth | mag | magType | nst | gap | dmin | rms | net | ... | depthError | magError | magNst | status | locationSource | magSource | loc_name | loc_admin1 | loc_admin2 | loc_cc |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 1951-10-03 13:44:33.170 | 35.869333 | -120.451000 | 6.0 | 3.67 | ml | 6.0 | 259.0 | 1.5480 | 0.43 | ci | ... | 31.61 | 0.154 | 10.0 | reviewed | ci | ci | Shandon | California | San Luis Obispo County | US |\n",
"| 1953-05-28 07:58:34.510 | 36.004167 | -120.501167 | 6.0 | 3.61 | ml | 7.0 | 296.0 | 0.9139 | 0.39 | ci | ... | 31.61 | NaN | 1.0 | reviewed | ci | ci | Coalinga | California | Fresno County | US |\n",
"| 1961-12-14 11:51:15.410 | 35.970000 | -120.470167 | 6.0 | 3.95 | ml | 12.0 | 297.0 | 0.8718 | 0.51 | ci | ... | 31.61 | 0.070 | 11.0 | reviewed | ci | ci | Coalinga | California | Fresno County | US |\n",
"| 1965-02-21 18:39:24.500 | 35.881000 | -120.383500 | 6.0 | 3.54 | ml | 10.0 | 257.0 | 1.5380 | 0.56 | ci | ... | 31.61 | 0.048 | 11.0 | reviewed | ci | ci | Shandon | California | San Luis Obispo County | US |\n",
"| 1966-06-28 04:18:36.180 | 35.856500 | -120.446167 | 6.0 | 3.15 | ml | 7.0 | 259.0 | 1.3120 | 0.32 | ci | ... | 31.61 | 0.105 | 7.0 | reviewed | ci | ci | Shandon | California | San Luis Obispo County | US |\n",
"\n",
"5 rows × 25 columns\n", "