{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Start with some tabular data. We'll make up an employee database and a sales database." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "using CSV, DataFramesMeta, Statistics, Dates" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
8 rows × 5 columns
| | id | first_name | last_name | department | salary |
|---|---|---|---|---|---|
| | Int64 | String | String | String | Int64 |
| 1 | 1 | Michael | Scott | Management & Admin | 5100 |
| 2 | 2 | Dwight | Schrute | Sales | 4200 |
| 3 | 3 | Angela | Martin | Accounting | 3750 |
| 4 | 4 | Jim | Halpert | Sales | 4300 |
| 5 | 5 | Pam | Beesly | Management & Admin | 2200 |
| 6 | 6 | Oscar | Nunez | Accounting | 3400 |
| 7 | 7 | Meredith | Palmer | Purchasing | 3300 |
| 8 | 8 | Creed | Bratton | Purchasing | 3200 |
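
The cell that produced the table above is not preserved here. As a minimal sketch, the same eight rows can be rebuilt inline with plain DataFrames.jl. The variable name `employees` is an assumption, and so is building the table by hand: the notebook imports CSV, so the original may have read it from a file instead.

```julia
using DataFrames

# Values copied from the printed table above; `employees` is a hypothetical name.
employees = DataFrame(
    id = 1:8,
    first_name = ["Michael", "Dwight", "Angela", "Jim", "Pam", "Oscar", "Meredith", "Creed"],
    last_name = ["Scott", "Schrute", "Martin", "Halpert", "Beesly", "Nunez", "Palmer", "Bratton"],
    department = ["Management & Admin", "Sales", "Accounting", "Sales",
                  "Management & Admin", "Accounting", "Purchasing", "Purchasing"],
    salary = [5100, 4200, 3750, 4300, 2200, 3400, 3300, 3200],
)
```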
1 rows × 5 columns
| | id | first_name | last_name | department | salary |
|---|---|---|---|---|---|
| | Int64 | String | String | String | Int64 |
| 1 | 1 | Michael | Scott | Management & Admin | 5100 |
2 rows × 5 columns
| | id | first_name | last_name | department | salary |
|---|---|---|---|---|---|
| | Int64 | String | String | String | Int64 |
| 1 | 1 | Michael | Scott | Management & Admin | 5100 |
| 2 | 2 | Dwight | Schrute | Sales | 4200 |
2 rows × 5 columns
| | id | first_name | last_name | department | salary |
|---|---|---|---|---|---|
| | Int64 | String | String | String | Int64 |
| 1 | 1 | Michael | Scott | Management & Admin | 5100 |
| 2 | 5 | Pam | Beesly | Management & Admin | 2200 |
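
The three small tables above look like row selections: one employee by `id`, the first two rows, and everyone in one department. The original cells are not preserved, so the following is only a sketch of how such selections could be written with plain DataFrames.jl functions (the notebook itself may have used DataFramesMeta macros). The names `employees`, `by_id`, `top_two`, and `admin` are hypothetical.

```julia
using DataFrames

# Trimmed copy of the employee table (values taken from the output above).
employees = DataFrame(
    id = 1:8,
    first_name = ["Michael", "Dwight", "Angela", "Jim", "Pam", "Oscar", "Meredith", "Creed"],
    department = ["Management & Admin", "Sales", "Accounting", "Sales",
                  "Management & Admin", "Accounting", "Purchasing", "Purchasing"],
)

# One row: the employee whose id is 1.
by_id = filter(:id => ==(1), employees)

# The first two rows in their stored order.
top_two = first(employees, 2)

# Every employee in a single department.
admin = filter(:department => ==("Management & Admin"), employees)
```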
4 rows × 3 columns
| | department | Average Salary | count |
|---|---|---|---|
| | String | Float64 | Int64 |
| 1 | Management & Admin | 3650.0 | 2 |
| 2 | Sales | 4250.0 | 2 |
| 3 | Accounting | 3575.0 | 2 |
| 4 | Purchasing | 3250.0 | 2 |
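
A per-department summary like the one above is a split-apply-combine operation. As a hedged sketch (the original cell is not preserved, and it may have used a DataFramesMeta macro rather than these calls), `groupby` plus `combine` from DataFrames.jl can compute the mean salary and the row count per group. The names `employees` and `dept_summary` are hypothetical.

```julia
using DataFrames, Statistics

# Only the two columns needed for the summary (values taken from the employee table).
employees = DataFrame(
    department = ["Management & Admin", "Sales", "Accounting", "Sales",
                  "Management & Admin", "Accounting", "Purchasing", "Purchasing"],
    salary = [5100, 4200, 3750, 4300, 2200, 3400, 3300, 3200],
)

# One output row per department: mean salary and number of employees.
dept_summary = combine(
    groupby(employees, :department),
    :salary => mean => "Average Salary",
    nrow => :count,
)
```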
4 rows × 3 columns
| | department | Average Salary | count |
|---|---|---|---|
| | String | Float64 | Int64 |
| 1 | Management & Admin | 3650.0 | 2 |
| 2 | Sales | 4250.0 | 2 |
| 3 | Accounting | 3575.0 | 2 |
| 4 | Purchasing | 3250.0 | 2 |
8 rows × 6 columns
| | id | first_name | last_name | department | salary | Average Salary |
|---|---|---|---|---|---|---|
| | Int64 | String | String | String | Int64 | Float64 |
| 1 | 1 | Michael | Scott | Management & Admin | 5100 | 3650.0 |
| 2 | 2 | Dwight | Schrute | Sales | 4200 | 4250.0 |
| 3 | 3 | Angela | Martin | Accounting | 3750 | 3575.0 |
| 4 | 4 | Jim | Halpert | Sales | 4300 | 4250.0 |
| 5 | 5 | Pam | Beesly | Management & Admin | 2200 | 3650.0 |
| 6 | 6 | Oscar | Nunez | Accounting | 3400 | 3575.0 |
| 7 | 7 | Meredith | Palmer | Purchasing | 3300 | 3250.0 |
| 8 | 8 | Creed | Bratton | Purchasing | 3200 | 3250.0 |
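
The table above keeps all eight rows and attaches each department's average salary to every employee in it, which corresponds to a grouped transform rather than an aggregation. One way this could be done with DataFrames.jl is sketched below. This is an assumption about the approach, not the notebook's original cell, and `employees` and `with_avg` are hypothetical names.

```julia
using DataFrames, Statistics

# Trimmed copy of the employee table (values taken from the output above).
employees = DataFrame(
    first_name = ["Michael", "Dwight", "Angela", "Jim", "Pam", "Oscar", "Meredith", "Creed"],
    department = ["Management & Admin", "Sales", "Accounting", "Sales",
                  "Management & Admin", "Accounting", "Purchasing", "Purchasing"],
    salary = [5100, 4200, 3750, 4300, 2200, 3400, 3300, 3200],
)

# `transform` on a grouped frame keeps every row in its original order and
# repeats the per-group mean for each member of the group.
with_avg = transform(
    groupby(employees, :department),
    :salary => mean => "Average Salary",
)
```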
12 rows × 9 columns (omitted printing of 2 columns)
| | id | first_name | last_name | department | salary | id_1 | transaction_date |
|---|---|---|---|---|---|---|---|
| | Int64 | String | String | String | Int64 | Int64? | Date? |
| 1 | 2 | Dwight | Schrute | Sales | 4200 | 2 | 2006-01-29 |
| 2 | 2 | Dwight | Schrute | Sales | 4200 | 4 | 2006-02-14 |
| 3 | 2 | Dwight | Schrute | Sales | 4200 | 6 | 2006-03-20 |
| 4 | 4 | Jim | Halpert | Sales | 4300 | 1 | 2006-01-02 |
| 5 | 4 | Jim | Halpert | Sales | 4300 | 3 | 2006-02-01 |
| 6 | 4 | Jim | Halpert | Sales | 4300 | 5 | 2006-03-01 |
| 7 | 1 | Michael | Scott | Management & Admin | 5100 | missing | missing |
| 8 | 3 | Angela | Martin | Accounting | 3750 | missing | missing |
| 9 | 5 | Pam | Beesly | Management & Admin | 2200 | missing | missing |
| 10 | 6 | Oscar | Nunez | Accounting | 3400 | missing | missing |
| 11 | 7 | Meredith | Palmer | Purchasing | 3300 | missing | missing |
| 12 | 8 | Creed | Bratton | Purchasing | 3200 | missing | missing |
2 rows × 6 columns
| | id | first_name | last_name | department | total_quantity | number_of_customers |
|---|---|---|---|---|---|---|
| | Int64 | String | String | String | Int64? | Int64? |
| 1 | 4 | Jim | Halpert | Sales | 1100 | 3 |
| 2 | 2 | Dwight | Schrute | Sales | 950 | 3 |