{ "metadata": { "name": "", "signature": "sha256:798fda98bebba5b506ed6281d9e76b081ce3088ef2aff43ccafaea9870e2b6b2" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Create a Column Based on a Conditional in pandas\n", "\n", "- **Author:** [Chris Albon](http://www.chrisalbon.com/), [@ChrisAlbon](https://twitter.com/chrisalbon)\n", "- **Date:** -\n", "- **Repo:** [Python 3 code snippets for data science](https://github.com/chrisalbon/code_py)\n", "- **Note:**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Preliminaries" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Import required modules\n", "import pandas as pd\n", "import numpy as np" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make a dataframe" ] }, { "cell_type": "code", "collapsed": false, "input": [ "data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], \n", " 'age': [42, 52, 36, 24, 73], \n", " 'preTestScore': [4, 24, 31, 2, 3],\n", " 'postTestScore': [25, 94, 57, 62, 70]}\n", "df = pd.DataFrame(data, columns = ['name', 'age', 'preTestScore', 'postTestScore'])\n", "df" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
name age preTestScore postTestScore
0 Jason 42 4 25
1 Molly 52 24 94
2 Tina 36 31 57
3 Jake 24 2 62
4 Amy 73 3 70
\n", "

5 rows \u00d7 4 columns

\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ " name age preTestScore postTestScore\n", "0 Jason 42 4 25\n", "1 Molly 52 24 94\n", "2 Tina 36 31 57\n", "3 Jake 24 2 62\n", "4 Amy 73 3 70\n", "\n", "[5 rows x 4 columns]" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add a new column for elderly" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Create a new column called df.elderly where the value is yes\n", "# if df.age is greater than 50 and no if not\n", "df['elderly'] = np.where(df['age']>=50, 'yes', 'no')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 8 }, { "cell_type": "code", "collapsed": false, "input": [ "# View the dataframe\n", "df" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
name age preTestScore postTestScore elderly
0 Jason 42 4 25 no
1 Molly 52 24 94 yes
2 Tina 36 31 57 no
3 Jake 24 2 62 no
4 Amy 73 3 70 yes
\n", "

5 rows \u00d7 5 columns

\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ " name age preTestScore postTestScore elderly\n", "0 Jason 42 4 25 no\n", "1 Molly 52 24 94 yes\n", "2 Tina 36 31 57 no\n", "3 Jake 24 2 62 no\n", "4 Amy 73 3 70 yes\n", "\n", "[5 rows x 5 columns]" ] } ], "prompt_number": 9 } ], "metadata": {} } ] }