{ "metadata": { "name": "", "signature": "sha256:ce5d7a0b02c96fce4e26f8a138da75f1b4d747ee72bc3c70fac094991c183266" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Iterating Through The Rows Of Multiple Columns In Pandas\n", "\n", "- **Author:** [Chris Albon](http://www.chrisalbon.com/), [@ChrisAlbon](https://twitter.com/chrisalbon)\n", "- **Date:** -\n", "- **Repo:** [Python 3 code snippets for data science](https://github.com/chrisalbon/code_py)\n", "- **Note:**" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Import modules\n", "import pandas as pd\n", "import numpy as np" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 8 }, { "cell_type": "code", "collapsed": false, "input": [ "raw_data = {'first_name': ['Jason', 'Jason', 'Tina', 'Jake', 'Amy'], \n", " 'last_name': ['Miller', 'Miller', 'Ali', 'Milner', 'Cooze'], \n", " 'age': [42, 42, 36, 24, 73], \n", " 'preTestScore': [4, 4, 31, 2, 3],\n", " 'postTestScore': [25, 25, 57, 62, 70]}\n", "df = pd.DataFrame(raw_data, columns = ['first_name', 'last_name', 'age', 'preTestScore', 'postTestScore'])\n", "df" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>first_name</th><th>last_name</th><th>age</th><th>preTestScore</th><th>postTestScore</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Jason</td><td>Miller</td><td>42</td><td>4</td><td>25</td></tr>\n",
"    <tr><th>1</th><td>Jason</td><td>Miller</td><td>42</td><td>4</td><td>25</td></tr>\n",
"    <tr><th>2</th><td>Tina</td><td>Ali</td><td>36</td><td>31</td><td>57</td></tr>\n",
"    <tr><th>3</th><td>Jake</td><td>Milner</td><td>24</td><td>2</td><td>62</td></tr>\n",
"    <tr><th>4</th><td>Amy</td><td>Cooze</td><td>73</td><td>3</td><td>70</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ " first_name last_name age preTestScore postTestScore\n", "0 Jason Miller 42 4 25\n", "1 Jason Miller 42 4 25\n", "2 Tina Ali 36 31 57\n", "3 Jake Milner 24 2 62\n", "4 Amy Cooze 73 3 70" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Iterate Over The Rows Of Two Columns" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Create an empty column for the full names\n", "df['full_name'] = np.NaN\n", "\n", "# Create an iteration counter\n", "i = 0\n", "\n", "# For each element in first_name and last_name,\n", "for first, last in zip(df['first_name'], df['last_name']):\n", " # Change the value of the i'th row in full_name \n", " # to the combination of the first and last name\n", " df['full_name'][i] = first + ' ' + last\n", " \n", " # Add one to the iteration counter\n", " i = i+1" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 27 }, { "cell_type": "code", "collapsed": false, "input": [ "# View the dataframe\n", "df" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>first_name</th><th>last_name</th><th>age</th><th>preTestScore</th><th>postTestScore</th><th>full_name</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Jason</td><td>Miller</td><td>42</td><td>4</td><td>25</td><td>Jason Miller</td></tr>\n",
"    <tr><th>1</th><td>Jason</td><td>Miller</td><td>42</td><td>4</td><td>25</td><td>Jason Miller</td></tr>\n",
"    <tr><th>2</th><td>Tina</td><td>Ali</td><td>36</td><td>31</td><td>57</td><td>Tina Ali</td></tr>\n",
"    <tr><th>3</th><td>Jake</td><td>Milner</td><td>24</td><td>2</td><td>62</td><td>Jake Milner</td></tr>\n",
"    <tr><th>4</th><td>Amy</td><td>Cooze</td><td>73</td><td>3</td><td>70</td><td>Amy Cooze</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 28, "text": [ " first_name last_name age preTestScore postTestScore full_name\n", "0 Jason Miller 42 4 25 Jason Miller\n", "1 Jason Miller 42 4 25 Jason Miller\n", "2 Tina Ali 36 31 57 Tina Ali\n", "3 Jake Milner 24 2 62 Jake Milner\n", "4 Amy Cooze 73 3 70 Amy Cooze" ] } ], "prompt_number": 28 } ], "metadata": {} } ] }