{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "3b938485",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'C:\\\\Users\\\\Karishma\\\\anaconda3\\\\envs\\\\ickle\\\\python.exe'"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import sys\n",
"sys.executable"
]
},
{
"cell_type": "markdown",
"id": "accurate-secretariat",
"metadata": {},
"source": [
"# Ickle\n",
"*Data Analysis Library for Python*\n",
"
\n",
"\n",
"This Jupyter Notebook serves as the documentation for Ickle.\n",
"\n",
"You can contribute to Ickle here: https://github.com/karishmashuklaa/ickle "
]
},
{
"cell_type": "markdown",
"id": "educational-neighbor",
"metadata": {},
"source": [
"## Table Of Contents\n",
"1. [Getting Started](#Getting-Started)\n",
"2. [DataFrame and Visual Representation](#DataFrame-and-Visual-Representation)\n",
"3. [Basic Properties](#Basic-Properties)\n",
"4. [Selection of Subsets](#Selection-of-Subsets)\n",
"5. [Basic and Aggregation Methods](#Basic-And-Aggregation-Methods)\n",
"6. [Non-Aggregation Methods](#Non-Aggregation-Methods)\n",
"7. [Other Methods](#Other-Methods)\n",
"8. [Arithmetic and Comparison Operators](#Arithmetic-And-Comparison-Operators)\n",
"9. [String-Only Methods](#String-Only-Methods)\n",
"10. [Pivot Table](#Pivot-Table)\n",
"11. [Read CSV](#Read-CSV)\n",
"12. [Read SQL](#Read-SQL)\n",
"13. [Read Excel](#Read-Excel)"
]
},
{
"cell_type": "markdown",
"id": "interstate-mainstream",
"metadata": {},
"source": [
"## Getting Started"
]
},
{
"cell_type": "markdown",
"id": "upper-genre",
"metadata": {},
"source": [
"### Installation\n",
"\n",
"Ickle can be installed via pip.\n",
"\n",
"`pip install ickle`"
]
},
{
"cell_type": "markdown",
"id": "technical-climb",
"metadata": {},
"source": [
"### Import\n",
"\n",
"`import ickle as ick`"
]
},
{
"cell_type": "markdown",
"id": "meaning-yemen",
"metadata": {},
"source": [
"## DataFrame and Visual Representation\n",
"\n",
"*[Go to table of contents](#Table-Of-Contents)*"
]
},
{
"cell_type": "markdown",
"id": "6043688e",
"metadata": {},
"source": [
"### DataFrame\n",
"A `DataFrame` holds two dimensional heterogenous data. It accepts dictionary as input, with Numpy arrays as values and strings as column names.\n",
"\n",
"Parameters:\n",
"- `data`: A dictionary of strings mapped to Numpy arrays. The key will become the column name."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "b0df5039",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import ickle as ick"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "6c0f7f1f",
"metadata": {},
"outputs": [],
"source": [
"name = np.array(['John', 'Sam', 'Tina', 'Josh', 'Jack', 'Jill'])\n",
"place = np.array(['Kolkata', 'Mumbai', 'Delhi', 'Mumbai', 'Mumbai', 'Mumbai'])\n",
"weight = np.array([57, 70, 54, 59, 62, 70])\n",
"married = np.array([True, False, True, False, False, False])\n",
"\n",
"data = {'name': name, 'place': place, 'weight': weight, 'married': married}\n",
"df = ick.DataFrame(data)"
]
},
{
"cell_type": "markdown",
"id": "9ab1033f",
"metadata": {},
"source": [
"### Visual Representation"
]
},
{
"cell_type": "markdown",
"id": "5fdae139",
"metadata": {},
"source": [
"`DataFrame` can be displayed in the following manner"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "90b4e248",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | name | place | weight | married |
---|
0 | John | Kolkata | 57 | True |
1 | Sam | Mumbai | 70 | False |
2 | Tina | Delhi | 54 | True |
3 | Josh | Mumbai | 59 | False |
4 | Jack | Mumbai | 62 | False |
5 | Jill | Mumbai | 70 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"id": "developing-webster",
"metadata": {},
"source": [
"We will use the above `DataFrame` throughout the notebook"
]
},
{
"cell_type": "markdown",
"id": "d75257e4",
"metadata": {},
"source": [
"## Basic Properties\n",
"1. [len](#len)\n",
"2. [columns](#columns)\n",
"3. [shape](#shape)\n",
"4. [values](#values)\n",
"5. [dtypes](#dtypes)\n",
"\n",
"*[Go to table of contents](#Table-Of-Contents)*"
]
},
{
"cell_type": "markdown",
"id": "e2f62efd",
"metadata": {},
"source": [
"### `len`\n",
"returns: the number of rows in the `DataFrame`"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "926a015f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"6"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(df)"
]
},
{
"cell_type": "markdown",
"id": "8889ccc9",
"metadata": {},
"source": [
"### `columns`\n",
"returns: list of column names"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "1b833cb8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['name', 'place', 'weight', 'married']"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.columns"
]
},
{
"cell_type": "markdown",
"id": "deac7b62",
"metadata": {},
"source": [
"### Modify exisiting column names"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "2331a1dd",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED |
---|
0 | John | Kolkata | 57 | True |
1 | Sam | Mumbai | 70 | False |
2 | Tina | Delhi | 54 | True |
3 | Josh | Mumbai | 59 | False |
4 | Jack | Mumbai | 62 | False |
5 | Jill | Mumbai | 70 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.columns = ['NAME', 'PLACE', 'WEIGHT', 'MARRIED']\n",
"df"
]
},
{
"cell_type": "markdown",
"id": "a73feb23",
"metadata": {},
"source": [
"### `shape` \n",
"returns: two-item tuple of number of rows and columns in the DataFrame"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "854a584d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(6, 4)"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.shape"
]
},
{
"cell_type": "markdown",
"id": "7bb8f220",
"metadata": {},
"source": [
"### `values`\n",
"returns: a single 2D NumPy array of all the columns of data."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "c88f9fbe",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([['John', 'Kolkata', 57, True],\n",
" ['Sam', 'Mumbai', 70, False],\n",
" ['Tina', 'Delhi', 54, True],\n",
" ['Josh', 'Mumbai', 59, False],\n",
" ['Jack', 'Mumbai', 62, False],\n",
" ['Jill', 'Mumbai', 70, False]], dtype=object)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.values"
]
},
{
"cell_type": "markdown",
"id": "d845021f",
"metadata": {},
"source": [
"### `dtypes`\n",
"returns: a two-column `DataFrame` of column names in one column and their data type in the other"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "384ddf93",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | Column Name | Data Type |
---|
0 | NAME | string |
1 | PLACE | string |
2 | WEIGHT | int |
3 | MARRIED | bool |
"
],
"text/plain": [
""
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.dtypes"
]
},
{
"cell_type": "markdown",
"id": "ad6fdb90",
"metadata": {},
"source": [
"## Selection of Subsets\n",
"1. [Select a single column](#Select-a-single-column)\n",
"2. [Select multiple columns](#Select-multiple-columns)\n",
"3. [Boolean selection](#Boolean-selection)\n",
"4. [Simultaneuous selection of row and column](#Simultaneuous-selection-of-row-and-column)\n",
"6. [Add new / Overwrite existing columns](#Add-new-/-Overwrite-existing-columns)\n",
"\n",
"*[Go to table of contents](#Table-Of-Contents)*"
]
},
{
"cell_type": "markdown",
"id": "82b21a31",
"metadata": {},
"source": [
"### Select a single column\n",
"by passing the name of column as a string"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "992355b5",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | John |
1 | Sam |
2 | Tina |
3 | Josh |
4 | Jack |
5 | Jill |
"
],
"text/plain": [
""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['NAME']"
]
},
{
"cell_type": "markdown",
"id": "679d01ce",
"metadata": {},
"source": [
"### Select multiple columns \n",
"by passing column names as a list of strings"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "4c2b9eb9",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE |
---|
0 | John | Kolkata |
1 | Sam | Mumbai |
2 | Tina | Delhi |
3 | Josh | Mumbai |
4 | Jack | Mumbai |
5 | Jill | Mumbai |
"
],
"text/plain": [
""
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[['NAME', 'PLACE']]"
]
},
{
"cell_type": "markdown",
"id": "11d85dc0",
"metadata": {},
"source": [
"### Boolean Selection"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "adb77b5d",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED |
---|
0 | Sam | Mumbai | 70 | False |
1 | Jack | Mumbai | 62 | False |
2 | Jill | Mumbai | 70 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bool_sel = df['WEIGHT'] > 60\n",
"\n",
"df[bool_sel]"
]
},
{
"cell_type": "markdown",
"id": "8c280a69",
"metadata": {},
"source": [
"### Simultaneuous selection of row and column\n",
"df[row, col]"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "b376c0f8",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[0,2]"
]
},
{
"cell_type": "markdown",
"id": "mathematical-twins",
"metadata": {},
"source": [
"### Select columns as strings"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "rotary-drunk",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[0, 'WEIGHT']"
]
},
{
"cell_type": "markdown",
"id": "white-incidence",
"metadata": {},
"source": [
"### Select rows as slices"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "convenient-drilling",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[:1, 'WEIGHT']"
]
},
{
"cell_type": "markdown",
"id": "faced-figure",
"metadata": {},
"source": [
"### Select rows as booleans and lists"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "facial-lucas",
"metadata": {},
"outputs": [],
"source": [
"bool_row = df['MARRIED']"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "magnetic-offering",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[bool_row, 'WEIGHT']"
]
},
{
"cell_type": "markdown",
"id": "4e16ad13",
"metadata": {},
"source": [
"### Add new / Overwrite existing columns"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "338eb506",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | John | Kolkata | 57 | True | 21 |
1 | Sam | Mumbai | 70 | False | 41 |
2 | Tina | Delhi | 54 | True | 22 |
3 | Josh | Mumbai | 59 | False | 42 |
4 | Jack | Mumbai | 62 | False | 32 |
5 | Jill | Mumbai | 70 | False | 25 |
"
],
"text/plain": [
""
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['AGE'] = np.array([21, 41, 22, 42, 32, 25])\n",
"df"
]
},
{
"cell_type": "markdown",
"id": "8dc613ac",
"metadata": {},
"source": [
"## Basic And Aggregation Methods\n",
"Basic Methods:\n",
"1. [head()](#head)\n",
"2. [tail()](#tail)\n",
"\n",
"Aggregation Methods:\n",
"1. [min()](#min)\n",
"2. [max()](#max)\n",
"3. [mean()](#mean)\n",
"4. [median()](#median)\n",
"5. [sum()](#sum)\n",
"6. [var()](#var)\n",
"7. [std()](#std)\n",
"8. [all()](#all)\n",
"9. [any()](#any)\n",
"10. [argmax()](#argmax)\n",
"11. [argmin()](#argmin)\n",
"\n",
"\n",
"*[Go to table of contents](#Table-Of-Contents)*"
]
},
{
"cell_type": "markdown",
"id": "f9ef40b9",
"metadata": {},
"source": [
"## head\n",
"\n",
"**head(n)**\n",
"\n",
"returns: the first n rows. By default n=5 "
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "708ef787",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | John | Kolkata | 57 | True | 21 |
1 | Sam | Mumbai | 70 | False | 41 |
2 | Tina | Delhi | 54 | True | 22 |
3 | Josh | Mumbai | 59 | False | 42 |
4 | Jack | Mumbai | 62 | False | 32 |
"
],
"text/plain": [
""
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "markdown",
"id": "moving-briefing",
"metadata": {},
"source": [
"## tail\n",
"\n",
"**tail(n)**\n",
"return the last n rows.By default n=5"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "fdebc49f",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | Sam | Mumbai | 70 | False | 41 |
1 | Tina | Delhi | 54 | True | 22 |
2 | Josh | Mumbai | 59 | False | 42 |
3 | Jack | Mumbai | 62 | False | 32 |
4 | Jill | Mumbai | 70 | False | 25 |
"
],
"text/plain": [
""
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.tail()"
]
},
{
"cell_type": "markdown",
"id": "a00bfe6b",
"metadata": {},
"source": [
"## Aggregation Methods\n",
"\n",
"**All aggregation methods are applied only column-wise and not row-wise.**\n",
"\n",
"1. [min()](#min)\n",
"2. [max()](#max)\n",
"3. [median()](#median)\n",
"4. [mean()](#mean)\n",
"5. [sum()](#sum)\n",
"6. [var()](#var)\n",
"7. [std()](#std)\n",
"8. [all()](#all)\n",
"9. [any()](#any)\n",
"10. [argmax()](#argmax)\n",
"11. [argmin()](#argmin)"
]
},
{
"cell_type": "markdown",
"id": "described-namibia",
"metadata": {},
"source": [
"## min\n",
"\n",
"**min()**\n",
"\n",
"used to get a minimum value for each column"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "d1a31bc0",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | Jack | Delhi | 54 | False | 21 |
"
],
"text/plain": [
""
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.min()"
]
},
{
"cell_type": "markdown",
"id": "invalid-resolution",
"metadata": {},
"source": [
"## max\n",
"\n",
"**max()**\n",
"\n",
"used to get a maximum value for each column"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "54a793d4",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | Tina | Mumbai | 70 | True | 42 |
"
],
"text/plain": [
""
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.max()"
]
},
{
"cell_type": "markdown",
"id": "covered-charter",
"metadata": {},
"source": [
"## median\n",
"\n",
"**median()**\n",
"\n",
"computes median for each numeric column"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "06e6994d",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT | MARRIED | AGE |
---|
0 | 60.500 | 0.000 | 28.500 |
"
],
"text/plain": [
""
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.median()"
]
},
{
"cell_type": "markdown",
"id": "occupational-lease",
"metadata": {},
"source": [
"## mean\n",
"\n",
"**mean()**\n",
"\n",
"computes mean for each numeric column"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "a595c55d",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT | MARRIED | AGE |
---|
0 | 62.000 | 0.333 | 30.500 |
"
],
"text/plain": [
""
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.mean()"
]
},
{
"cell_type": "markdown",
"id": "collective-limitation",
"metadata": {},
"source": [
"## sum\n",
"\n",
"**sum()**\n",
"\n",
"returns the sum of all values for each column"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "366c80e9",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | JohnSamTinaJoshJackJill | KolkataMumbaiDelhiMumbaiMumbaiMumbai | 372 | 2 | 183 |
"
],
"text/plain": [
""
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.sum()"
]
},
{
"cell_type": "markdown",
"id": "pediatric-auditor",
"metadata": {},
"source": [
"## var\n",
"\n",
"**var()**\n",
"\n",
"computes variance for each numeric column"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "b94e56d0",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT | MARRIED | AGE |
---|
0 | 37.667 | 0.222 | 72.917 |
"
],
"text/plain": [
""
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.var()"
]
},
{
"cell_type": "markdown",
"id": "working-rugby",
"metadata": {},
"source": [
"## std\n",
"\n",
"**std()**\n",
"\n",
"computes standard deviation for each numeric column"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "9527eac3",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT | MARRIED | AGE |
---|
0 | 6.137 | 0.471 | 8.539 |
"
],
"text/plain": [
""
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.std()"
]
},
{
"cell_type": "markdown",
"id": "collect-modeling",
"metadata": {},
"source": [
"## all\n",
"\n",
"**all()**\n",
"\n",
"it tests whether all array elements along the columns evaluate to True"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "5269b024",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | Jill | Mumbai | True | False | True |
"
],
"text/plain": [
""
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.all()"
]
},
{
"cell_type": "markdown",
"id": "physical-beads",
"metadata": {},
"source": [
"## any\n",
"\n",
"**any()**\n",
"\n",
"it tests whether any array element along the columns evaluate to True"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "ad0f9aad",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | John | Kolkata | True | True | True |
"
],
"text/plain": [
""
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.any()"
]
},
{
"cell_type": "markdown",
"id": "palestinian-murder",
"metadata": {},
"source": [
"## argmax\n",
"\n",
"**argmax()**\n",
"\n",
"returns the indices of the maximum values"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "b165d213",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | 2 | 1 | 1 | 0 | 3 |
"
],
"text/plain": [
""
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.argmax()"
]
},
{
"cell_type": "markdown",
"id": "spoken-animation",
"metadata": {},
"source": [
"## argmin\n",
"\n",
"**argmin()**\n",
"\n",
"returns the indices of the minimum values"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "f1671979",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | 4 | 2 | 2 | 1 | 0 |
"
],
"text/plain": [
""
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.argmin()"
]
},
{
"cell_type": "markdown",
"id": "other-diving",
"metadata": {},
"source": [
"## Other Methods \n",
"1. [isna()](#isna)\n",
"2. [count()](#count)\n",
"3. [unique()](#unique)\n",
"4. [nunique()](#nunique)\n",
"5. [value_counts()](#value_counts)\n",
"6. [rename()](#rename)\n",
"7. [drop()](#drop)\n",
"8. [diff()](#diff)\n",
"9. [pct_change()](#pct_change)\n",
"10. [sort_values()](#sort_values)\n",
"11. [sample()](#sample)"
]
},
{
"cell_type": "markdown",
"id": "dcdf622d",
"metadata": {},
"source": [
"### isna\n",
"\n",
"**isna()**\n",
"\n",
"Determines whether each value in the DataFrame is missing or not"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "8b47a2a3",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | False | False | False | False | False |
1 | False | False | False | False | False |
2 | False | False | False | False | False |
3 | False | False | False | False | False |
4 | False | False | False | False | False |
5 | False | False | False | False | False |
"
],
"text/plain": [
""
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.isna()"
]
},
{
"cell_type": "markdown",
"id": "221c1433",
"metadata": {},
"source": [
"### count\n",
"\n",
"**count()**\n",
"\n",
"Counts the number of non-missing values per column"
]
},
{
"cell_type": "code",
"execution_count": 107,
"id": "da271425",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | 6 | 6 | 6 | 6 | 6 |
"
],
"text/plain": [
""
]
},
"execution_count": 107,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.count()"
]
},
{
"cell_type": "markdown",
"id": "3010fd76",
"metadata": {},
"source": [
"### unique\n",
"\n",
"**unique()**\n",
"\n",
"Finds the unique values of each column"
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "8abf8bd4",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dfs = df.unique()\n",
"dfs[3]"
]
},
{
"cell_type": "markdown",
"id": "6ddbe1c1",
"metadata": {},
"source": [
"### nunique\n",
"\n",
"**nunique()**\n",
"\n",
"Finds the **number** of unique values in each column.\n",
"\n",
"Identical to `unique()`"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "ac159969",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | 6 | 3 | 5 | 2 | 6 |
"
],
"text/plain": [
""
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.nunique()"
]
},
{
"cell_type": "markdown",
"id": "9894920b",
"metadata": {},
"source": [
"### value_counts\n",
"\n",
"**value_counts(normalize=False)**\n",
"\n",
"Returns the frequency of each unique value for each column.\n",
"\n",
"Parameters\n",
"\n",
"`normalize`: bool\n",
" \n",
" If True, returns the relative frequencies(percent)"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "54b3942e",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | PLACE | count |
---|
0 | Mumbai | 4 |
1 | Delhi | 1 |
2 | Kolkata | 1 |
"
],
"text/plain": [
""
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dfs = df.value_counts()\n",
"dfs[1]"
]
},
{
"cell_type": "markdown",
"id": "acf33c05",
"metadata": {},
"source": [
"### rename\n",
"\n",
"**rename(columns)**\n",
"\n",
"Renames columns in the DataFrame\n",
"\n",
"Parameters\n",
" \n",
"`columns`: dict\n",
"\n",
" A dictionary mapping the old column name to the new column name\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "e052881c",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT (kg) | MARRIED | AGE |
---|
0 | John | Kolkata | 57 | True | 21 |
1 | Sam | Mumbai | 70 | False | 41 |
2 | Tina | Delhi | 54 | True | 22 |
3 | Josh | Mumbai | 59 | False | 42 |
4 | Jack | Mumbai | 62 | False | 32 |
5 | Jill | Mumbai | 70 | False | 25 |
"
],
"text/plain": [
""
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.rename({'WEIGHT': 'WEIGHT (kg)'})"
]
},
{
"cell_type": "markdown",
"id": "815e0c44",
"metadata": {},
"source": [
"### drop\n",
"\n",
"**drop(columns)**\n",
"\n",
"Drops one or more columns from a DataFrame\n",
"\n",
"Parameters\n",
"\n",
"`columns`: str or list of strings"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "9fde87bb",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED |
---|
0 | John | Kolkata | 57 | True |
1 | Sam | Mumbai | 70 | False |
2 | Tina | Delhi | 54 | True |
3 | Josh | Mumbai | 59 | False |
4 | Jack | Mumbai | 62 | False |
5 | Jill | Mumbai | 70 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.drop('AGE')"
]
},
{
"cell_type": "markdown",
"id": "22cb2eb0",
"metadata": {},
"source": [
"### diff\n",
"\n",
"**diff(n=1)**\n",
"\n",
"Take the difference between the current value and the nth value above it\n",
"\n",
"Parameters\n",
"\n",
"`n`: int"
]
},
{
"cell_type": "code",
"execution_count": 111,
"id": "fee1e7ae",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | John | Kolkata | nan | nan | nan |
1 | Sam | Mumbai | nan | nan | nan |
2 | Tina | Delhi | -3.000 | 0.000 | 1.000 |
3 | Josh | Mumbai | -11.000 | 0.000 | 1.000 |
4 | Jack | Mumbai | 8.000 | -1.000 | 10.000 |
5 | Jill | Mumbai | 11.000 | 0.000 | -17.000 |
"
],
"text/plain": [
""
]
},
"execution_count": 111,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.diff(n=2)"
]
},
{
"cell_type": "markdown",
"id": "a53ae36a",
"metadata": {},
"source": [
"### pct_change\n",
"\n",
"**pct_change(n=1)**\n",
"\n",
"Take the percentage difference between the current value and the nth value above it\n",
"\n",
"Parameters\n",
"\n",
"`n`: int"
]
},
{
"cell_type": "code",
"execution_count": 104,
"id": "d787b41f",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | John | Kolkata | nan | nan | nan |
1 | Sam | Mumbai | 0.228 | -1.000 | 0.952 |
2 | Tina | Delhi | -0.229 | inf | -0.463 |
3 | Josh | Mumbai | 0.093 | -1.000 | 0.909 |
4 | Jack | Mumbai | 0.051 | nan | -0.238 |
5 | Jill | Mumbai | 0.129 | nan | -0.219 |
"
],
"text/plain": [
""
]
},
"execution_count": 104,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.pct_change(n=1)"
]
},
{
"cell_type": "markdown",
"id": "d5704449",
"metadata": {},
"source": [
"### sort_values\n",
"\n",
"**sort_values(by, asc=True)**\n",
"\n",
"Sort the DataFrame by one or more values\n",
"\n",
"Parameters\n",
"\n",
"`by`: str or list of column names\n",
" \n",
"`asc`: boolean of sorting order\n"
]
},
{
"cell_type": "code",
"execution_count": 115,
"id": "2aaaf0a7",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | Josh | Mumbai | 59 | False | 42 |
1 | Sam | Mumbai | 70 | False | 41 |
2 | Jack | Mumbai | 62 | False | 32 |
3 | Jill | Mumbai | 70 | False | 25 |
4 | Tina | Delhi | 54 | True | 22 |
5 | John | Kolkata | 57 | True | 21 |
"
],
"text/plain": [
""
]
},
"execution_count": 115,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.sort_values('AGE', asc=False)"
]
},
{
"cell_type": "markdown",
"id": "d18179b1",
"metadata": {},
"source": [
"### sample\n",
"\n",
"**sample(n=None, frac=None, replace=False, seed=None)**\n",
"\n",
"Randomly samples rows of the DataFrame\n",
"\n",
"Parameters\n",
"\n",
"`n`: int\n",
"\n",
" number of rows to return\n",
" \n",
"`frac`: float\n",
"\n",
" Proportion of the data to sample\n",
" \n",
"`replace`: bool\n",
"\n",
" Whether or not to sample with replacement\n",
" \n",
"`seed`: int\n",
"\n",
" Seed the random number generator"
]
},
{
"cell_type": "code",
"execution_count": 116,
"id": "c34ffa3b",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | Jack | Mumbai | 62 | False | 32 |
1 | Sam | Mumbai | 70 | False | 41 |
2 | Tina | Delhi | 54 | True | 22 |
"
],
"text/plain": [
""
]
},
"execution_count": 116,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.sample(n=4, frac=0.5, replace=True)"
]
},
{
"cell_type": "markdown",
"id": "bfd30c5a",
"metadata": {},
"source": [
"## Non Aggregation Methods\n",
"1. [abs()](#abs)\n",
"2. [cummin()](#cummin)\n",
"3. [cummax()](#cummax)\n",
"4. [cumsum()](#cumsum)\n",
"5. [clip()](#clip)\n",
"6. [round()](#round)\n",
"7. [copy()](#copy)\n",
"\n",
"*[Go to table of contents](#Table-Of-Contents)*"
]
},
{
"cell_type": "markdown",
"id": "39abfea4",
"metadata": {},
"source": [
"### abs\n",
"\n",
"**abs()**\n",
"\n",
"Takes the absolute value of each value in the DataFrame"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "153a3d1a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | John | Kolkata | 57 | True | 21 |
1 | Sam | Mumbai | 70 | False | 41 |
2 | Tina | Delhi | 54 | True | 22 |
3 | Josh | Mumbai | 59 | False | 42 |
4 | Jack | Mumbai | 62 | False | 32 |
5 | Jill | Mumbai | 70 | False | 25 |
"
],
"text/plain": [
""
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.abs()"
]
},
{
"cell_type": "markdown",
"id": "74f0a742",
"metadata": {},
"source": [
"### cummin\n",
"\n",
"**cummin()**\n",
"\n",
"Finds cumulative minimum by column"
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "d7d02237",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | John | Kolkata | 57 | True | 21 |
1 | Sam | Mumbai | 57 | False | 21 |
2 | Tina | Delhi | 54 | False | 21 |
3 | Josh | Mumbai | 54 | False | 21 |
4 | Jack | Mumbai | 54 | False | 21 |
5 | Jill | Mumbai | 54 | False | 21 |
"
],
"text/plain": [
""
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.cummin()"
]
},
{
"cell_type": "markdown",
"id": "63ab86e9",
"metadata": {},
"source": [
"### cummax\n",
"\n",
"**cummax()**\n",
"\n",
"Finds cumulative maximum by column"
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "966e80a5",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | John | Kolkata | 57 | True | 21 |
1 | Sam | Mumbai | 70 | True | 41 |
2 | Tina | Delhi | 70 | True | 41 |
3 | Josh | Mumbai | 70 | True | 42 |
4 | Jack | Mumbai | 70 | True | 42 |
5 | Jill | Mumbai | 70 | True | 42 |
"
],
"text/plain": [
""
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.cummax()"
]
},
{
"cell_type": "markdown",
"id": "e5d356de",
"metadata": {},
"source": [
"### cumsum\n",
"\n",
"**cumsum()**\n",
"\n",
"Finds cumulative sum by column"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "12b48747",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | John | Kolkata | 57 | 1 | 21 |
1 | Sam | Mumbai | 127 | 1 | 62 |
2 | Tina | Delhi | 181 | 2 | 84 |
3 | Josh | Mumbai | 240 | 2 | 126 |
4 | Jack | Mumbai | 302 | 2 | 158 |
5 | Jill | Mumbai | 372 | 2 | 183 |
"
],
"text/plain": [
""
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.cumsum()"
]
},
{
"cell_type": "markdown",
"id": "496ea5b3",
"metadata": {},
"source": [
"### clip\n",
"\n",
"**clip(lower=None, upper=None)**\n",
"\n",
"All values less than `lower` will be set to `lower`\n",
"\n",
"All values greater than `upper` will be set to `upper`\n",
"\n",
"Parameters\n",
"\n",
"`lower`: number or None\n",
" \n",
"`upper`: number or None"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "f786de6f",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | John | Kolkata | 57 | 55 | 55 |
1 | Sam | Mumbai | 60 | 55 | 55 |
2 | Tina | Delhi | 55 | 55 | 55 |
3 | Josh | Mumbai | 59 | 55 | 55 |
4 | Jack | Mumbai | 60 | 55 | 55 |
5 | Jill | Mumbai | 60 | 55 | 55 |
"
],
"text/plain": [
""
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.clip(lower=55, upper=60)"
]
},
{
"cell_type": "markdown",
"id": "272ca128",
"metadata": {},
"source": [
"### round\n",
"\n",
"**round(n)**\n",
"\n",
"Rounds values to the nearest `n` decimals"
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "29e5b1c0",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | John | Kolkata | 57 | True | 21 |
1 | Sam | Mumbai | 70 | False | 41 |
2 | Tina | Delhi | 54 | True | 22 |
3 | Josh | Mumbai | 59 | False | 42 |
4 | Jack | Mumbai | 62 | False | 32 |
5 | Jill | Mumbai | 70 | False | 25 |
"
],
"text/plain": [
""
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.round(n=1)"
]
},
{
"cell_type": "markdown",
"id": "439ed9ad",
"metadata": {},
"source": [
"### copy\n",
"\n",
"**copy()**\n",
"\n",
"Copies the DataFrame"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "97de4076",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | PLACE | WEIGHT | MARRIED | AGE |
---|
0 | John | Kolkata | 57 | True | 21 |
1 | Sam | Mumbai | 70 | False | 41 |
2 | Tina | Delhi | 54 | True | 22 |
3 | Josh | Mumbai | 59 | False | 42 |
4 | Jack | Mumbai | 62 | False | 32 |
5 | Jill | Mumbai | 70 | False | 25 |
"
],
"text/plain": [
""
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.copy()"
]
},
{
"cell_type": "markdown",
"id": "5beea9aa",
"metadata": {},
"source": [
"## Arithmetic And Comparison Operators \n",
"1. [Addition](#Addition)\n",
"2. [Subtraction](#Subtraction)\n",
"3. [Multiplication](#Multiplication)\n",
"4. [Division](#Division)\n",
"5. [Floor Division](#Floor-Division)\n",
"6. [Power](#Power)\n",
"7. [Greater than](#Greater-than)\n",
"8. [Less than](#Less-than)\n",
"9. [Greater than equal to](#Greater-than-equal)\n",
"10. [Lesser than equal to](#Lesser-than-equal)\n",
"11. [Not Equal](#Not-Equal)\n",
"12. [Equal](#Equal)\n",
"\n",
"*Arithmetic and Comparison Operators only work with numerical columns*\n",
"\n",
"*[Go to table of contents](#Table-Of-Contents)*"
]
},
{
"cell_type": "code",
"execution_count": 110,
"id": "assumed-burner",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | 57 |
1 | 70 |
2 | 54 |
3 | 59 |
4 | 62 |
5 | 70 |
"
],
"text/plain": [
""
]
},
"execution_count": 110,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op = df['WEIGHT']\n",
"df_op"
]
},
{
"cell_type": "markdown",
"id": "hungarian-jacob",
"metadata": {},
"source": [
"### Addition"
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "catholic-wiring",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | 59 |
1 | 72 |
2 | 56 |
3 | 61 |
4 | 64 |
5 | 72 |
"
],
"text/plain": [
""
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op + 2"
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "imported-screen",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | 59 |
1 | 72 |
2 | 56 |
3 | 61 |
4 | 64 |
5 | 72 |
"
],
"text/plain": [
""
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"2 + df_op"
]
},
{
"cell_type": "markdown",
"id": "individual-respect",
"metadata": {},
"source": [
"### Subtraction"
]
},
{
"cell_type": "code",
"execution_count": 52,
"id": "gentle-boring",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | 55 |
1 | 68 |
2 | 52 |
3 | 57 |
4 | 60 |
5 | 68 |
"
],
"text/plain": [
""
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op - 2"
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "duplicate-cancer",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | -55 |
1 | -68 |
2 | -52 |
3 | -57 |
4 | -60 |
5 | -68 |
"
],
"text/plain": [
""
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"2 - df_op"
]
},
{
"cell_type": "markdown",
"id": "offensive-praise",
"metadata": {},
"source": [
"### Multiplication "
]
},
{
"cell_type": "code",
"execution_count": 54,
"id": "surrounded-dealing",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | 114 |
1 | 140 |
2 | 108 |
3 | 118 |
4 | 124 |
5 | 140 |
"
],
"text/plain": [
""
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op * 2"
]
},
{
"cell_type": "code",
"execution_count": 55,
"id": "devoted-warrior",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | 114 |
1 | 140 |
2 | 108 |
3 | 118 |
4 | 124 |
5 | 140 |
"
],
"text/plain": [
""
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"2 * df_op"
]
},
{
"cell_type": "markdown",
"id": "stupid-cuisine",
"metadata": {},
"source": [
"### Division "
]
},
{
"cell_type": "code",
"execution_count": 56,
"id": "express-interim",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | 28.500 |
1 | 35.000 |
2 | 27.000 |
3 | 29.500 |
4 | 31.000 |
5 | 35.000 |
"
],
"text/plain": [
""
]
},
"execution_count": 56,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op / 2"
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "lonely-kitty",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | 0.035 |
1 | 0.029 |
2 | 0.037 |
3 | 0.034 |
4 | 0.032 |
5 | 0.029 |
"
],
"text/plain": [
""
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"2 / df_op "
]
},
{
"cell_type": "markdown",
"id": "activated-arrival",
"metadata": {},
"source": [
"### Floor Division "
]
},
{
"cell_type": "code",
"execution_count": 58,
"id": "moderate-madison",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | 28 |
1 | 35 |
2 | 27 |
3 | 29 |
4 | 31 |
5 | 35 |
"
],
"text/plain": [
""
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op // 2"
]
},
{
"cell_type": "code",
"execution_count": 59,
"id": "expensive-england",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | 0 |
1 | 0 |
2 | 0 |
3 | 0 |
4 | 0 |
5 | 0 |
"
],
"text/plain": [
""
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"2 // df_op"
]
},
{
"cell_type": "markdown",
"id": "accepted-birthday",
"metadata": {},
"source": [
"### Power "
]
},
{
"cell_type": "code",
"execution_count": 60,
"id": "instructional-happening",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | 185193 |
1 | 343000 |
2 | 157464 |
3 | 205379 |
4 | 238328 |
5 | 343000 |
"
],
"text/plain": [
""
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op ** 3"
]
},
{
"cell_type": "code",
"execution_count": 61,
"id": "false-parking",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | 886634019 |
1 | -102221863 |
2 | 32838297 |
3 | -610228421 |
4 | 703701817 |
5 | -102221863 |
"
],
"text/plain": [
""
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"3 ** df_op"
]
},
{
"cell_type": "markdown",
"id": "prompt-bacteria",
"metadata": {},
"source": [
"### Greater than"
]
},
{
"cell_type": "code",
"execution_count": 62,
"id": "hungarian-image",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | True |
1 | True |
2 | True |
3 | True |
4 | True |
5 | True |
"
],
"text/plain": [
""
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op > 50"
]
},
{
"cell_type": "markdown",
"id": "asian-drinking",
"metadata": {},
"source": [
"### Less than"
]
},
{
"cell_type": "code",
"execution_count": 63,
"id": "efficient-hearts",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | False |
1 | False |
2 | True |
3 | False |
4 | False |
5 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op < 55"
]
},
{
"cell_type": "markdown",
"id": "confirmed-railway",
"metadata": {},
"source": [
"### Greater than equal "
]
},
{
"cell_type": "code",
"execution_count": 64,
"id": "funny-referral",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | False |
1 | False |
2 | False |
3 | False |
4 | False |
5 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op >= 75"
]
},
{
"cell_type": "markdown",
"id": "elementary-morgan",
"metadata": {},
"source": [
"### Lesser than equal "
]
},
{
"cell_type": "code",
"execution_count": 65,
"id": "overall-friend",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | False |
1 | False |
2 | True |
3 | False |
4 | False |
5 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op <= 55"
]
},
{
"cell_type": "markdown",
"id": "considerable-volleyball",
"metadata": {},
"source": [
"### Not Equal "
]
},
{
"cell_type": "code",
"execution_count": 66,
"id": "instrumental-simon",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | True |
1 | True |
2 | True |
3 | True |
4 | True |
5 | True |
"
],
"text/plain": [
""
]
},
"execution_count": 66,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op != 55"
]
},
{
"cell_type": "markdown",
"id": "automated-memphis",
"metadata": {},
"source": [
"### Equal "
]
},
{
"cell_type": "code",
"execution_count": 67,
"id": "dried-genesis",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | WEIGHT |
---|
0 | False |
1 | True |
2 | False |
3 | False |
4 | False |
5 | True |
"
],
"text/plain": [
""
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_op == 70"
]
},
{
"cell_type": "markdown",
"id": "flying-tumor",
"metadata": {},
"source": [
"## String Only Methods \n",
"All the strings behave in the same manner as built-in string functions in Python.\n",
"These methods can be used only with `columns`\n",
"\n",
"1. [capitalize()](#capitalize)\n",
"2. [center()](#center)\n",
"3. [count()](#count)\n",
"4. [endswith()](#endswith)\n",
"5. [startswith()](#startswith)\n",
"6. [find()](#find)\n",
"7. [len()](#len)\n",
"8. [get()](#get)\n",
"9. [index()](#index)\n",
"10. [isalnum()](#isalnum)\n",
"11. [isalpha()](#isalpha)\n",
"12. [isdecimal()](#isdecimal)\n",
"13. [islower()](#islower)\n",
"14. [isnumeric()](#isnumeric)\n",
"15. [isspace()](#isspace)\n",
"16. [istitle()](#istitle)\n",
"17. [isupper()](#isupper)\n",
"18. [lstrip()](#lstrip)\n",
"19. [rstrip()](#rstrip)\n",
"20. [strip()](#strip)\n",
"21. [replace()](#replace)\n",
"22. [swapcase()](#swapcase)\n",
"23. [title()](#title)\n",
"24. [lower()](#lower)\n",
"25. [upper()](#upper)\n",
"26. [zfill()](#zfill)\n",
"27. [encode()](#encode)\n",
"\n",
"\n",
"*[Go to table of contents](#Table-Of-Contents)*"
]
},
{
"cell_type": "markdown",
"id": "turkish-bruce",
"metadata": {},
"source": [
"### capitalize\n",
"\n",
"**capitalize(col)**"
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "laughing-delaware",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | John |
1 | Sam |
2 | Tina |
3 | Josh |
4 | Jack |
5 | Jill |
"
],
"text/plain": [
""
]
},
"execution_count": 68,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.capitalize('NAME')"
]
},
{
"cell_type": "markdown",
"id": "atmospheric-youth",
"metadata": {},
"source": [
"### center\n",
"\n",
"**center(col, width, fillchar=None)**"
]
},
{
"cell_type": "code",
"execution_count": 69,
"id": "historic-millennium",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | aaaJohnaaa |
1 | aaaSamaaaa |
2 | aaaTinaaaa |
3 | aaaJoshaaa |
4 | aaaJackaaa |
5 | aaaJillaaa |
"
],
"text/plain": [
""
]
},
"execution_count": 69,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.center('NAME', 10, 'a')"
]
},
{
"cell_type": "markdown",
"id": "moral-psychiatry",
"metadata": {},
"source": [
"### count\n",
"\n",
"**count(col, sub, start=None, stop=None)**"
]
},
{
"cell_type": "code",
"execution_count": 70,
"id": "senior-ethiopia",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 70,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.count('PLACE', 'Mumbai')"
]
},
{
"cell_type": "markdown",
"id": "material-fashion",
"metadata": {},
"source": [
"### endswith\n",
"\n",
"**endswith(col, suffix, start=None, stop=None)**"
]
},
{
"cell_type": "code",
"execution_count": 71,
"id": "plastic-retirement",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | True |
1 | False |
2 | False |
3 | False |
4 | False |
5 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 71,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.endswith('NAME', 'n')"
]
},
{
"cell_type": "markdown",
"id": "danish-property",
"metadata": {},
"source": [
"### startswith\n",
"\n",
"**startswith(col, suffix, start=None, stop=None)**"
]
},
{
"cell_type": "code",
"execution_count": 72,
"id": "collective-decade",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | True |
1 | False |
2 | False |
3 | True |
4 | True |
5 | True |
"
],
"text/plain": [
""
]
},
"execution_count": 72,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.startswith('NAME', 'J')"
]
},
{
"cell_type": "markdown",
"id": "democratic-courage",
"metadata": {},
"source": [
"### find\n",
"\n",
"**find(col, sub, start=None, stop=None)**"
]
},
{
"cell_type": "code",
"execution_count": 73,
"id": "comfortable-humanitarian",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | -1 |
1 | -1 |
2 | 0 |
3 | -1 |
4 | -1 |
5 | -1 |
"
],
"text/plain": [
""
]
},
"execution_count": 73,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.find('NAME', 'Tina')"
]
},
{
"cell_type": "markdown",
"id": "limiting-developer",
"metadata": {},
"source": [
"### len\n",
"\n",
"**len(col)**"
]
},
{
"cell_type": "code",
"execution_count": 74,
"id": "overall-nashville",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 74,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.len('NAME')"
]
},
{
"cell_type": "markdown",
"id": "deadly-destiny",
"metadata": {},
"source": [
"### get\n",
"\n",
"**get(col, item)**"
]
},
{
"cell_type": "code",
"execution_count": 75,
"id": "hundred-maine",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 75,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.get('NAME', 0)"
]
},
{
"cell_type": "markdown",
"id": "placed-service",
"metadata": {},
"source": [
"### index\n",
"\n",
"**index(col, sub, start=None, stop=None)**"
]
},
{
"cell_type": "code",
"execution_count": 76,
"id": "illegal-cattle",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 76,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.index('NAME', '')"
]
},
{
"cell_type": "markdown",
"id": "legendary-murder",
"metadata": {},
"source": [
"### isalnum\n",
"\n",
"**isalnum(col)**"
]
},
{
"cell_type": "code",
"execution_count": 77,
"id": "completed-praise",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | True |
1 | True |
2 | True |
3 | True |
4 | True |
5 | True |
"
],
"text/plain": [
""
]
},
"execution_count": 77,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.isalnum('NAME')"
]
},
{
"cell_type": "markdown",
"id": "structured-introduction",
"metadata": {},
"source": [
"### isalpha\n",
"\n",
"**isalpha(col)**"
]
},
{
"cell_type": "code",
"execution_count": 78,
"id": "recent-globe",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | True |
1 | True |
2 | True |
3 | True |
4 | True |
5 | True |
"
],
"text/plain": [
""
]
},
"execution_count": 78,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.isalpha('NAME')"
]
},
{
"cell_type": "markdown",
"id": "developed-setting",
"metadata": {},
"source": [
"### isdecimal\n",
"\n",
"**isdecimal(col)**"
]
},
{
"cell_type": "code",
"execution_count": 79,
"id": "double-links",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | False |
1 | False |
2 | False |
3 | False |
4 | False |
5 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 79,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.isdecimal('NAME')"
]
},
{
"cell_type": "markdown",
"id": "mature-exploration",
"metadata": {},
"source": [
"### isnumeric\n",
"\n",
"**isnumeric(col)**"
]
},
{
"cell_type": "code",
"execution_count": 80,
"id": "early-result",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | False |
1 | False |
2 | False |
3 | False |
4 | False |
5 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 80,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.isnumeric('NAME')"
]
},
{
"cell_type": "markdown",
"id": "surprised-audit",
"metadata": {},
"source": [
"### isspace\n",
"\n",
"**isspace(col)**"
]
},
{
"cell_type": "code",
"execution_count": 81,
"id": "persistent-dialogue",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | False |
1 | False |
2 | False |
3 | False |
4 | False |
5 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 81,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.isspace('NAME')"
]
},
{
"cell_type": "markdown",
"id": "published-delicious",
"metadata": {},
"source": [
"### istitle\n",
"\n",
"**istitle(col)**"
]
},
{
"cell_type": "code",
"execution_count": 82,
"id": "seven-signature",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | True |
1 | True |
2 | True |
3 | True |
4 | True |
5 | True |
"
],
"text/plain": [
""
]
},
"execution_count": 82,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.istitle('NAME')"
]
},
{
"cell_type": "markdown",
"id": "brutal-shower",
"metadata": {},
"source": [
"### isupper\n",
"\n",
"**isupper(col)**"
]
},
{
"cell_type": "code",
"execution_count": 84,
"id": "ranking-origin",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | False |
1 | False |
2 | False |
3 | False |
4 | False |
5 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 84,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.isupper('NAME')"
]
},
{
"cell_type": "markdown",
"id": "important-knowing",
"metadata": {},
"source": [
"### islower\n",
"\n",
"**islower(col)**"
]
},
{
"cell_type": "code",
"execution_count": 83,
"id": "designed-memorial",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | False |
1 | False |
2 | False |
3 | False |
4 | False |
5 | False |
"
],
"text/plain": [
""
]
},
"execution_count": 83,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.islower('NAME')"
]
},
{
"cell_type": "markdown",
"id": "45a61a3e",
"metadata": {},
"source": [
"### lstrip\n",
"\n",
"**lstrip(col, chars)**"
]
},
{
"cell_type": "code",
"execution_count": 85,
"id": "crucial-texas",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | John |
1 | Sam |
2 | Tina |
3 | Josh |
4 | Jack |
5 | Jill |
"
],
"text/plain": [
""
]
},
"execution_count": 85,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.lstrip('NAME', 'o')"
]
},
{
"cell_type": "markdown",
"id": "3dc0e805",
"metadata": {},
"source": [
"### rstrip\n",
"\n",
"**rstrip(col, chars)**"
]
},
{
"cell_type": "code",
"execution_count": 86,
"id": "effective-internship",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | John |
1 | Sam |
2 | Tina |
3 | Josh |
4 | Jack |
5 | Jill |
"
],
"text/plain": [
""
]
},
"execution_count": 86,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.rstrip('NAME', 'o')"
]
},
{
"cell_type": "markdown",
"id": "2c209eb3",
"metadata": {},
"source": [
"### strip\n",
"\n",
"**strip(col, chars)**"
]
},
{
"cell_type": "code",
"execution_count": 87,
"id": "demographic-going",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | John |
1 | Sam |
2 | Tina |
3 | Josh |
4 | Jack |
5 | Jill |
"
],
"text/plain": [
""
]
},
"execution_count": 87,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.strip('NAME', 'o')"
]
},
{
"cell_type": "markdown",
"id": "attractive-focus",
"metadata": {},
"source": [
"### replace\n",
"\n",
"**replace(col, old, new, count=None)**"
]
},
{
"cell_type": "code",
"execution_count": 88,
"id": "bored-license",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | Cena |
1 | Sam |
2 | Tina |
3 | Josh |
4 | Jack |
5 | Jill |
"
],
"text/plain": [
""
]
},
"execution_count": 88,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.replace('NAME', 'John', 'Cena')"
]
},
{
"cell_type": "markdown",
"id": "knowing-lodge",
"metadata": {},
"source": [
"### swapcase\n",
"\n",
"**swapcase(col)**"
]
},
{
"cell_type": "code",
"execution_count": 89,
"id": "respected-sussex",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | jOHN |
1 | sAM |
2 | tINA |
3 | jOSH |
4 | jACK |
5 | jILL |
"
],
"text/plain": [
""
]
},
"execution_count": 89,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.swapcase('NAME')"
]
},
{
"cell_type": "markdown",
"id": "disabled-photograph",
"metadata": {},
"source": [
"### title\n",
"\n",
"**title(col)**"
]
},
{
"cell_type": "code",
"execution_count": 90,
"id": "covered-advisory",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | John |
1 | Sam |
2 | Tina |
3 | Josh |
4 | Jack |
5 | Jill |
"
],
"text/plain": [
""
]
},
"execution_count": 90,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.title('NAME')"
]
},
{
"cell_type": "markdown",
"id": "entire-silly",
"metadata": {},
"source": [
"### lower\n",
"\n",
"**lower(col)**"
]
},
{
"cell_type": "code",
"execution_count": 91,
"id": "detected-charles",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | john |
1 | sam |
2 | tina |
3 | josh |
4 | jack |
5 | jill |
"
],
"text/plain": [
""
]
},
"execution_count": 91,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.lower('NAME')"
]
},
{
"cell_type": "markdown",
"id": "enclosed-philadelphia",
"metadata": {},
"source": [
"### upper\n",
"\n",
"**upper(col)**"
]
},
{
"cell_type": "code",
"execution_count": 92,
"id": "peaceful-stage",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | JOHN |
1 | SAM |
2 | TINA |
3 | JOSH |
4 | JACK |
5 | JILL |
"
],
"text/plain": [
""
]
},
"execution_count": 92,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.upper('NAME')"
]
},
{
"cell_type": "markdown",
"id": "framed-superior",
"metadata": {},
"source": [
"### zfill\n",
"\n",
"**zfill(col, width)**"
]
},
{
"cell_type": "code",
"execution_count": 93,
"id": "increased-professor",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | 000000John |
1 | 0000000Sam |
2 | 000000Tina |
3 | 000000Josh |
4 | 000000Jack |
5 | 000000Jill |
"
],
"text/plain": [
""
]
},
"execution_count": 93,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.zfill('NAME', 10)"
]
},
{
"cell_type": "markdown",
"id": "powerful-contact",
"metadata": {},
"source": [
"### encode\n",
"\n",
"**encode(col, encoding='utf-8', errors='strict')**"
]
},
{
"cell_type": "code",
"execution_count": 94,
"id": "beginning-converter",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME |
---|
0 | b'John' |
1 | b'Sam' |
2 | b'Tina' |
3 | b'Josh' |
4 | b'Jack' |
5 | b'Jill' |
"
],
"text/plain": [
""
]
},
"execution_count": 94,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.str.encode('NAME')"
]
},
{
"cell_type": "markdown",
"id": "unauthorized-chinese",
"metadata": {},
"source": [
"## Pivot Table\n",
"Creates a pivot table from one or two 'grouping' columns\n",
"\n",
"Parameters\n",
"\n",
"`rows`: str of column name to group by (Optional)\n",
"\n",
"`columns`: str of column name to group by (Optional)\n",
"\n",
"`values`: str of column name to aggregate (Required)\n",
"\n",
"`aggfunc`: str of aggregation function\n",
"\n",
"*[Go to table of contents](#Table-Of-Contents)*"
]
},
{
"cell_type": "code",
"execution_count": 95,
"id": "promotional-worse",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | NAME | Delhi | Kolkata | Mumbai |
---|
0 | Jack | nan | nan | 62.000 |
1 | Jill | nan | nan | 70.000 |
2 | John | nan | 57.000 | nan |
3 | Josh | nan | nan | 59.000 |
4 | Sam | nan | nan | 70.000 |
5 | Tina | 54.000 | nan | nan |
"
],
"text/plain": [
""
]
},
"execution_count": 95,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.pivot_table(rows='NAME', columns='PLACE', values='WEIGHT', aggfunc='mean')"
]
},
{
"cell_type": "markdown",
"id": "307d809e",
"metadata": {},
"source": [
"## Read CSV\n",
"Reads a CSV file into a dataframe\n",
"\n",
"Parameters\n",
"\n",
"`filelocation`: str of the file location\n",
"\n",
"`header`: int of the number of lines to skip in case of multi-header files, None in case of header-less files (Optional)\n",
"\n",
"**read_csv(file, header)**\n",
"\n",
"*[Go to table of contents](#Table-Of-Contents)*"
]
},
{
"cell_type": "code",
"execution_count": 96,
"id": "01015cbe",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | dept | race | gender | salary |
---|
0 | Houston Police Department-HPD | White | Male | 45279 |
1 | Houston Fire Department (HFD) | White | Male | 63166 |
2 | Houston Police Department-HPD | Black | Male | 66614 |
3 | Public Works & Engineering-PWE | Asian | Male | 71680 |
4 | Houston Airport System (HAS) | White | Male | 42390 |
"
],
"text/plain": [
""
]
},
"execution_count": 96,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = ick.read_csv('./dataset/employee.csv');\n",
"data.head()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "6e5c9b24",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | 0 | 1 | 2 | 3 |
---|
0 | Houston Police Department-HPD | White | Male | 45279 |
1 | Houston Fire Department (HFD) | White | Male | 63166 |
2 | Houston Police Department-HPD | Black | Male | 66614 |
3 | Public Works & Engineering-PWE | Asian | Male | 71680 |
4 | Houston Airport System (HAS) | White | Male | 42390 |
"
],
"text/plain": [
""
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = ick.read_csv('./dataset/employee.csv', header=None);\n",
"data.head()"
]
},
{
"cell_type": "markdown",
"id": "105f6782",
"metadata": {},
"source": [
"## Read SQL\n",
"Reads the data from SQL query into a dataframe\n",
"\n",
"Parameters\n",
"\n",
"`sql`: str of the sql statement to be executed\n",
"\n",
"`drivername`: str of driver engine of the database\n",
"\n",
"`username`: str of username to connect to the database\n",
"\n",
"`password`: str of password to connect to the database\n",
"\n",
"`host`: str of host to connect to the database\n",
"\n",
"`port`: int of port to connect to the database\n",
"\n",
"`database`: str of database name \n",
"\n",
"**read_sql(sql, drivername, username, password, host, port, database)**\n",
"\n",
"*[Go to table of contents](#Table-Of-Contents)*"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "47e12bf9",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | id | firstname | lastname | email |
---|
0 | 1 | John | Doe | upchh@example.com |
1 | 2 | John | Doe | upchh@example.com |
2 | 3 | John | Doe | upchh@example.com |
3 | 4 | John | Doe | upchh@example.com |
4 | 5 | John | Doe | upchh@example.com |
5 | 6 | John | Doe | upchh@example.com |
6 | 7 | John | Doe | upchh@example.com |
7 | 8 | John | Doe | upchh@example.com |
8 | 9 | John | Doe | upchh@example.com |
"
],
"text/plain": [
""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"drivername=\"postgresql\"\n",
"username=\"postgres\"\n",
"password=\"test123\" \n",
"host=\"localhost\"\n",
"port=\"5432\"\n",
"database=\"spotify_trends\"\n",
"sql=\"SELECT id, firstname, lastname, email FROM authors\"\n",
"\n",
"ick.read_sql(sql,drivername,username,password,host,port,database)"
]
},
{
"cell_type": "markdown",
"id": "c8ee926c",
"metadata": {},
"source": [
"## Read Excel\n",
"Reads the data from a simple Excel file into a dataframe\n",
"\n",
"Parameters\n",
"\n",
"`file_path`: str of the path to the Excel file that you want to read\n",
"`sheet_name`: str of sheet name\n",
"\n",
"Supported file extensions - xlsx, xlsm, xltx, xltm\n",
"\n",
"**read_excel(file_path, sheet_name)**\n",
"\n",
"*[Go to table of contents](#Table-Of-Contents)*"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "582b7c33",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" | Name | Age | Country |
---|
0 | John | 54 | USA |
1 | Sam | 23 | UK |
2 | Max | 44 | Pakistan |
"
],
"text/plain": [
""
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ick.read_excel(file_path=\"./dataset/Book1.xlsx\", sheet_name=\"Sheet1\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bf754bc7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
},
"vscode": {
"interpreter": {
"hash": "84acd9daa1976748b0cb1aac2d1d7e831b987877e0682bbd8df981985727d27e"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}