{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "#### PYT-DS SAISOFT\n", "\n", "* [Overview 2](https://github.com/4dsolutions/Python5/blob/master/OverviewNotes_PYTDS_2.ipynb)\n", "* [Overview 3](https://github.com/4dsolutions/Python5/blob/master/OverviewNotes_PYTDS_3.ipynb)\n", "\n", "\n", "\n", "# DATA SCIENCE WITH PYTHON\n", "\n", "## Where Have We Been, What Have We Seen?\n", "\n", "My focus in this course is two-track:\n", "\n", "* develop high-level intuitions about statistical and [machine learning concepts](https://goo.gl/z9xgQz)\n", "\n", "* practice with nuts-and-bolts tools of the trade, namely pandas, matplotlib, numpy, other visualization tools (seaborn, bokeh...), and geospatial tools (geopandas, which extends pandas, and basemap, a matplotlib toolkit).\n", "\n", "However, these two tracks are not strictly distinct, as navigating one's way through the extensive APIs associated with nuts-and-bolts tools requires developing high-level intuitions. These tracks are complementary and require each other.\n", "\n", "### HIGH LEVEL INTUITIONS\n", "\n", "What are some examples of high-level intuitions?\n", "\n", "I talked at some length about long-raging debates between two schools of thought in statistics: frequentist and Bayesian. Some of these debates have been concealed from us, because the successes of Bayesian thinking, also known as subjectivist, tend to feature early electronic computers and prototypical examples of machine learning, which emerged in the UK and US, especially during WW2, and were highly classified.\n", "\n", "Here in 2018, we're getting more of a picture of what went on at Bletchley Park. Neal Stephenson's *Cryptonomicon*, a work of historical science fiction, helped break the ice around sharing these stories. 
I learned a lot [about cryptography](http://www.4dsolutions.net/ocn/clubhouse.html) simply from reading about the history of RSA.\n", "\n", "Frequentists focus on sampling enough to make reliable estimates about a larger population, whose true values are approached asymptotically as the sample grows, but with diminishing returns. Why sample a million people if choosing the right few hundred gives the same predictions? Find out which sampling techniques give the most bang for the buck, and then consider yourself ready to predict what will happen on the larger scale. The focus is on finding correlating patterns, whether or not causation might be implied.\n", "\n", "\n", "
\n",
 "|   | wR | wKn | wB | K | Q | eB | eKn | eR |\n",
 "|---|---|---|---|---|---|---|---|---|\n",
 "| 1 | ♖ |  |  |  |  |  |  | ♖ |\n",
 "| 2 |  |  |  |  |  |  |  |  |\n",
 "| 3 |  |  |  |  |  |  |  |  |\n",
 "| 4 |  |  |  |  |  |  |  |  |\n",
 "| 5 |  |  |  |  |  |  |  |  |\n",
 "| 6 |  |  |  |  |  |  |  |  |\n",
 "| 7 |  |  |  |  |  |  |  |  |\n",
 "| 8 | ♖ |  |  |  |  |  |  | ♖ |\n", "