{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "data = pd.read_csv(\"person_info.csv\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can find the data here: https://github.com/lizhouf/oscr2019/blob/master/person_info.csv" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr><th></th><th>first_name</th><th>last_name</th><th>birthday</th><th>age</th><th>state</th><th>address</th><th>City</th><th>phone</th><th>email</th><th>car_1</th><th>gpa</th><th>year</th><th>class_of</th><th>online_signiture</th></tr>\n", " </thead>\n", " <tbody>\n",
" <tr><th>0</th><td>Carol</td><td>Davis</td><td>9/29/1996</td><td>23</td><td>Illinois</td><td>1674 Carolyns Circle</td><td>Burr Ridge, Illinois(IL), 60527</td><td>312-295-1941</td><td>curt1995@gmail.com</td><td>6ZUA618</td><td>2.85</td><td>1</td><td>2022</td><td>Don't aim for success if you want it just do w...</td></tr>\n",
" <tr><th>1</th><td>Bruno</td><td>Horan</td><td>6/11/1995</td><td>24</td><td>California</td><td>1561 Still Street</td><td>San Diego, California(CA), 92111</td><td>858-449-3324</td><td>guadalupe1974@yahoo.com</td><td>982KRK</td><td>3.47</td><td>2</td><td>2021</td><td>In any investment, you expect to have fun and ...</td></tr>\n",
" <tr><th>2</th><td>William</td><td>Moody</td><td>2/27/1997</td><td>22</td><td>Illinois</td><td>541 Jade wood Drive</td><td>Arlington Heights, Illinois(IL), 60004</td><td>979-614-4038</td><td>roosevelt.fee@hotmail.com</td><td>PS9-S917</td><td>2.78</td><td>2</td><td>2021</td><td>It's not my fault that people don't appreciate...</td></tr>\n",
" <tr><th>3</th><td>Robin</td><td>Steel</td><td>8/3/1989</td><td>57</td><td>Texas</td><td>1674 Caroly ns Circle</td><td>Josephine, Texas(TX), 75173</td><td>214-694-7864</td><td>lloyd2009@hotmail.com</td><td>na</td><td>4.33</td><td>4</td><td>2019</td><td>The press is the hired agent of a monied system</td></tr>\n",
" <tr><th>4</th><td>Michelle</td><td>Roberts</td><td>7/17/1995</td><td>24</td><td>Oregon</td><td>1372 Gateway Road</td><td>Portland, Oregon(OR), 97217\\n\\n</td><td>503-283-2255</td><td>ben1972@gmail.com</td><td>6XNK620</td><td>3.75</td><td>NaN</td><td>2019</td><td>I am desperate for change - now - not in 8 yea...</td></tr>\n",
" <tr><th>5</th><td>June</td><td>Sneed</td><td>3/27/2000</td><td>19</td><td>Arizona</td><td>2411 Clarksburg Park Road</td><td>Phoenix, Arizona(AZ), 85003</td><td>256-286-5628</td><td>kathlyn_runolf@yahoo.com</td><td>NaN</td><td>3.60</td><td>Jr</td><td>2021</td><td>Civilization is the progress toward a society ...</td></tr>\n",
" <tr><th>6</th><td>Curtis</td><td>Campbell</td><td>3/15/1991</td><td>28</td><td>Idahol</td><td>2760 Science Center Drive</td><td>Pocatello, Idaho(ID), 83201</td><td>979-614-4038</td><td>justen_schust@yahoo.com</td><td>PS9-S917</td><td>2.32</td><td>3</td><td>2020</td><td>I have an incredible amount of basketball know...</td></tr>\n",
" <tr><th>7</th><td>Dorothy</td><td>Schott</td><td>1/2/1997</td><td>21</td><td>California</td><td>2742 Sunny Day Drive</td><td>Santa Ana, California(CA), 92770</td><td>501-281-4074</td><td>megane_purd1@hotmail.com</td><td>NaN</td><td>3.93</td><td>NaN</td><td>2020</td><td>A lawyer with his briefcase can steal more tha...</td></tr>\n",
" <tr><th>8</th><td>Mae</td><td>Skinner</td><td>3/16/1995</td><td>24</td><td>Pennsylvania</td><td>NaN</td><td>Newark, Pennsylvania(PA), 19714</td><td>501-334-8502</td><td>enrique.berni@gmail.com</td><td>WCE-2823</td><td>3.85</td><td>3</td><td>2020</td><td>When humor can be made to alternate with melan...</td></tr>\n",
" <tr><th>9</th><td>David</td><td>Victoria</td><td>8/2/1996</td><td>23</td><td>Maine</td><td>3327 Chipmunk Lane</td><td>Harpswell, Maine(ME), 04079</td><td>207-570-1895</td><td>carolina1977@hotmail.com</td><td>VDS-5639</td><td>1.74</td><td>S</td><td>2021</td><td>The difference between a beautifully made fail...</td></tr>\n",
" </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " first_name last_name birthday age state \\\n", "0 Carol Davis 9/29/1996 23 Illinois \n", "1 Bruno Horan 6/11/1995 24 California \n", "2 William Moody 2/27/1997 22 Illinois \n", "3 Robin Steel 8/3/1989 57 Texas \n", "4 Michelle Roberts 7/17/1995 24 Oregon \n", "5 June Sneed 3/27/2000 19 Arizona \n", "6 Curtis Campbell 3/15/1991 28 Idahol \n", "7 Dorothy Schott 1/2/1997 21 California \n", "8 Mae Skinner 3/16/1995 24 Pennsylvania \n", "9 David Victoria 8/2/1996 23 Maine \n", "\n", " address City \\\n", "0 1674 Carolyns Circle Burr Ridge, Illinois(IL), 60527 \n", "1 1561 Still Street San Diego, California(CA), 92111 \n", "2 541 Jade wood Drive Arlington Heights, Illinois(IL), 60004 \n", "3 1674 Caroly ns Circle Josephine, Texas(TX), 75173 \n", "4 1372 Gateway Road Portland, Oregon(OR), 97217\\n\\n \n", "5 2411 Clarksburg Park Road Phoenix, Arizona(AZ), 85003 \n", "6 2760 Science Center Drive Pocatello, Idaho(ID), 83201 \n", "7 2742 Sunny Day Drive Santa Ana, California(CA), 92770 \n", "8 NaN Newark, Pennsylvania(PA), 19714 \n", "9 3327 Chipmunk Lane Harpswell, Maine(ME), 04079 \n", "\n", " phone email car_1 gpa year class_of \\\n", "0 312-295-1941 curt1995@gmail.com 6ZUA618 2.85 1 2022 \n", "1 858-449-3324 guadalupe1974@yahoo.com 982KRK 3.47 2 2021 \n", "2 979-614-4038 roosevelt.fee@hotmail.com PS9-S917 2.78 2 2021 \n", "3 214-694-7864 lloyd2009@hotmail.com na 4.33 4 2019 \n", "4 503-283-2255 ben1972@gmail.com 6XNK620 3.75 NaN 2019 \n", "5 256-286-5628 kathlyn_runolf@yahoo.com NaN 3.60 Jr 2021 \n", "6 979-614-4038 justen_schust@yahoo.com PS9-S917 2.32 3 2020 \n", "7 501-281-4074 megane_purd1@hotmail.com NaN 3.93 NaN 2020 \n", "8 501-334-8502 enrique.berni@gmail.com WCE-2823 3.85 3 2020 \n", "9 207-570-1895 carolina1977@hotmail.com VDS-5639 1.74 S 2021 \n", "\n", " online_signiture \n", "0 Don't aim for success if you want it just do w... \n", "1 In any investment, you expect to have fun and ... 
\n", "2 It's not my fault that people don't appreciate... \n", "3 The press is the hired agent of a monied system \n", "4 I am desperate for change - now - not in 8 yea... \n", "5 Civilization is the progress toward a society ... \n", "6 I have an incredible amount of basketball know... \n", "7 A lawyer with his briefcase can steal more tha... \n", "8 When humor can be made to alternate with melan... \n", "9 The difference between a beautifully made fail... " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## How to Check and Convert Data Types Using Python" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Check Data Types" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In pandas, we can use the .info() method to check each column's data type and non-null count (note that the \"object\" dtype includes strings):" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "RangeIndex: 10 entries, 0 to 9\n", "Data columns (total 14 columns):\n", "first_name 10 non-null object\n", "last_name 10 non-null object\n", "birthday 10 non-null object\n", "age 10 non-null int64\n", "state 10 non-null object\n", "address 9 non-null object\n", "City 10 non-null object\n", "phone 10 non-null object\n", "email 10 non-null object\n", "car_1 8 non-null object\n", "gpa 10 non-null float64\n", "year 8 non-null object\n", "class_of 10 non-null int64\n", "online_signiture 10 non-null object\n", "dtypes: float64(1), int64(2), object(11)\n", "memory usage: 1.2+ KB\n" ] } ], "source": [ "data.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Convert Data Type" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we want the class_of column to be all string types (i.e. 
categorical labels rather than numbers), we can use .astype(str):" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# astype(str) converts each value to its own string.\n", "# Series.to_string() would instead render the whole column as one string.\n", "data[\"class_of\"] = data[\"class_of\"].astype(str)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After this operation, .info() reports the dtype of the class_of column as object instead of int64." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "RangeIndex: 10 entries, 0 to 9\n", "Data columns (total 14 columns):\n", "first_name 10 non-null object\n", "last_name 10 non-null object\n", "birthday 10 non-null object\n", "age 10 non-null int64\n", "state 10 non-null object\n", "address 9 non-null object\n", "City 10 non-null object\n", "phone 10 non-null object\n", "email 10 non-null object\n", "car_1 8 non-null object\n", "gpa 10 non-null float64\n", "year 8 non-null object\n", "class_of 10 non-null object\n", "online_signiture 10 non-null object\n", "dtypes: float64(1), int64(1), object(12)\n", "memory usage: 1.2+ KB\n" ] } ], "source": [ "data.info()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }