{
"metadata": {
"name": "",
"signature": "sha256:aaf441757ac12f69c726c423393aad9c733dffa72be335b64a190565acee2421"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Select Rows When Columns Contain Certain Values\n",
"\n",
"- **Author:** [Chris Albon](http://www.chrisalbon.com/), [@ChrisAlbon](https://twitter.com/chrisalbon)\n",
"- **Date:** -\n",
"- **Repo:** [Python 3 code snippets for data science](https://github.com/chrisalbon/code_py)\n",
"- **Note:**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Preliminaries"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Import modules\n",
"import pandas as pd\n",
"\n",
"# Set ipython's max row display\n",
"pd.set_option('display.max_row', 1000)\n",
"\n",
"# Set iPython's max column width to 50\n",
"pd.set_option('display.max_columns', 50)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 12
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create an example dataframe"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Create an example dataframe\n",
"data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], \n",
" 'year': [2012, 2012, 2013, 2014, 2014], \n",
" 'reports': [4, 24, 31, 2, 3]}\n",
"df = pd.DataFrame(data, index = ['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])\n",
"df"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"
\n",
"
\n",
" \n",
" \n",
" | \n",
" name | \n",
" reports | \n",
" year | \n",
"
\n",
" \n",
" \n",
" \n",
" Cochice | \n",
" Jason | \n",
" 4 | \n",
" 2012 | \n",
"
\n",
" \n",
" Pima | \n",
" Molly | \n",
" 24 | \n",
" 2012 | \n",
"
\n",
" \n",
" Santa Cruz | \n",
" Tina | \n",
" 31 | \n",
" 2013 | \n",
"
\n",
" \n",
" Maricopa | \n",
" Jake | \n",
" 2 | \n",
" 2014 | \n",
"
\n",
" \n",
" Yuma | \n",
" Amy | \n",
" 3 | \n",
" 2014 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 13,
"text": [
" name reports year\n",
"Cochice Jason 4 2012\n",
"Pima Molly 24 2012\n",
"Santa Cruz Tina 31 2013\n",
"Maricopa Jake 2 2014\n",
"Yuma Amy 3 2014"
]
}
],
"prompt_number": 13
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Grab rows based on column values"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"value_list = ['Tina', 'Molly', 'Jason']"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 16
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#Grab DataFrame rows where column has certain values\n",
"df[df.name.isin(value_list)]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" name | \n",
" reports | \n",
" year | \n",
"
\n",
" \n",
" \n",
" \n",
" Cochice | \n",
" Jason | \n",
" 4 | \n",
" 2012 | \n",
"
\n",
" \n",
" Pima | \n",
" Molly | \n",
" 24 | \n",
" 2012 | \n",
"
\n",
" \n",
" Santa Cruz | \n",
" Tina | \n",
" 31 | \n",
" 2013 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 17,
"text": [
" name reports year\n",
"Cochice Jason 4 2012\n",
"Pima Molly 24 2012\n",
"Santa Cruz Tina 31 2013"
]
}
],
"prompt_number": 17
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#Grab DataFrame rows where column doesn't have certain values\n",
"df[~df.name.isin(value_list)]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" name | \n",
" reports | \n",
" year | \n",
"
\n",
" \n",
" \n",
" \n",
" Maricopa | \n",
" Jake | \n",
" 2 | \n",
" 2014 | \n",
"
\n",
" \n",
" Yuma | \n",
" Amy | \n",
" 3 | \n",
" 2014 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 19,
"text": [
" name reports year\n",
"Maricopa Jake 2 2014\n",
"Yuma Amy 3 2014"
]
}
],
"prompt_number": 19
}
],
"metadata": {}
}
]
}