# Getting Started

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/whylabs/whylogs/blob/1.0.x/python/examples/basic/Getting_Started.ipynb)

whylogs provides a standard to log any kind of data.

This example shows how to log data with whylogs, generating statistical summaries called *profiles*. Profiles can be used in a number of ways, such as:

* Data Visualization
* Data Validation
* Tracking changes in your datasets

## Table of Contents

In this example, we'll explore the basics of logging data with whylogs:
- Installing whylogs
- Profiling data
- Interacting with the profile
- Writing/Reading profiles to/from disk

## Installing whylogs

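whylogs is available as a Python package. You can get the latest version from PyPI with `pip install whylogs`. The cell below adds `-q` for quieter output and `--pre` so that pip also considers pre-release versions: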

In [1]:
!pip install -q whylogs --pre

## Loading a Pandas DataFrame

Before showing how we can log data, we first need the data itself. Let's create a simple Pandas DataFrame:

In [2]:
import pandas as pd
data = {
    "animal": ["cat", "hawk", "snake", "cat"],
    "legs": [4, 2, 0, 4],
    "weight": [4.3, 1.8, 1.3, 4.1],
}

df = pd.DataFrame(data)


## Profiling with whylogs

To obtain a profile of your data, call whylogs' `log` function, then retrieve the profile from the result with `profile()`:

In [3]:
import whylogs as why

results = why.log(df)
profile = results.profile()

## Analyzing Profiles

Once you're done logging the data, you can generate a `Profile View` and inspect it as a pandas DataFrame:

In [6]:
prof_view = profile.view()
prof_df = prof_view.to_pandas()

prof_df

| column | counts/n | counts/null | types/integral | types/fractional | types/boolean | types/string | types/object | cardinality/est | cardinality/upper_1 | cardinality/lower_1 | ... | distribution/n | distribution/max | distribution/min | distribution/q_10 | distribution/q_25 | distribution/median | distribution/q_75 | distribution/q_90 | ints/max | ints/min |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| animal | 8 | 0 | 0 | 0 | 0 | 8 | 0 | 6.0 | 6.0003 | 6.0 | ... |  |  |  |  |  |  |  |  |  |  |
| weight | 8 | 0 | 0 | 8 | 0 | 0 | 0 | 7.0 | 7.00035 | 7.0 | ... | 8.0 | 30.1 | 1.3 | 1.3 | 4.1 | 4.3 | 14.3 | 30.1 |  |  |
| legs | 8 | 0 | 8 | 0 | 0 | 0 | 0 | 3.0 | 3.00015 | 3.0 | ... | 8.0 | 4.0 | 0.0 | 0.0 | 2.0 | 4.0 | 4.0 | 4.0 | 4.0 | 0.0 |
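
Because the result is an ordinary pandas DataFrame, individual metrics can be pulled out with standard indexing. The sketch below assumes the DataFrame is indexed by column name, as in the output above. To stay runnable on its own, it rebuilds a small, hand-copied subset of that output rather than calling whylogs, so `prof_df` here is a stand-in for the real one.

```python
import pandas as pd

# Stand-in for the DataFrame returned by prof_view.to_pandas():
# a hand-copied subset of the output shown above (assumption: same
# column names, rows indexed by the original column name).
prof_df = pd.DataFrame(
    {
        "counts/n": [8, 8, 8],
        "types/string": [8, 0, 0],
        "distribution/max": [float("nan"), 30.1, 4.0],
    },
    index=pd.Index(["animal", "weight", "legs"], name="column"),
)

# Look up one metric for one column.
weight_max = prof_df.loc["weight", "distribution/max"]

# Find the columns in which string values were observed.
string_columns = prof_df[prof_df["types/string"] > 0].index.tolist()

print(weight_max)
print(string_columns)
```

The same indexing should apply to the full DataFrame, as long as the metric names match the column headers in your own output.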


This gives you statistics for each column (feature), such as:

- Counters, such as number of samples and null values
- Inferred types, such as integral, fractional and boolean
- Estimated Cardinality
- Frequent Items
- Distribution metrics: min, max, median, and quantile values
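
To build intuition for what these per-column statistics mean, here is a toy sketch that computes a few of them exactly in plain Python, using the same example data as the DataFrame created earlier. This is not how whylogs computes its metrics; `summarize_column` is a hypothetical helper, and the key names only borrow the naming style of the table above.

```python
from statistics import median

# Same toy data as the example DataFrame, as plain Python lists.
data = {
    "animal": ["cat", "hawk", "snake", "cat"],
    "legs": [4, 2, 0, 4],
    "weight": [4.3, 1.8, 1.3, 4.1],
}


def summarize_column(values):
    """Exact per-column statistics (illustrative helper, not a whylogs API)."""
    present = [v for v in values if v is not None]
    summary = {
        "counts/n": len(values),
        "counts/null": len(values) - len(present),
        # bool is a subclass of int in Python, so exclude it explicitly.
        "types/integral": sum(
            isinstance(v, int) and not isinstance(v, bool) for v in present
        ),
        "types/fractional": sum(isinstance(v, float) for v in present),
        "types/string": sum(isinstance(v, str) for v in present),
        "cardinality": len(set(present)),  # exact, not estimated
    }
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric:
        # Distribution metrics only make sense for numeric columns.
        summary.update(
            {"min": min(numeric), "max": max(numeric), "median": median(numeric)}
        )
    return summary


summaries = {name: summarize_column(values) for name, values in data.items()}
for name, summary in summaries.items():
    print(name, summary)
```

As in the profile output, the string column `animal` gets counters, type counts, and cardinality but no distribution metrics, while the numeric columns get all of them.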

## Writing to Disk

You can also store your profile on disk for later inspection:

In [7]:
why.write(profile, "profile.bin")

This creates a binary profile file on your local filesystem.

## Reading from Disk

You can read the profile back into memory with:

In [8]:
n_prof = why.read("profile.bin")

> Note: `write` expects a profile as its parameter, while `read` returns a `Profile View`. This means you can use the loaded profile for visualization and merging, but not for further tracking or updates.
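
The note above mentions merging. As a conceptual illustration only, the toy sketch below shows why summary statistics can be merged without access to the raw data: counts add up, and minimums and maximums combine directly. `NumericSummary` is a hypothetical class written for this example, not part of whylogs, and it makes no claim about how whylogs implements merging.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumericSummary:
    """Toy mergeable summary (hypothetical; not a whylogs class)."""

    count: int
    minimum: float
    maximum: float

    @classmethod
    def from_values(cls, values):
        values = list(values)
        return cls(len(values), min(values), max(values))

    def merge(self, other):
        # Only the summaries are needed here, never the original values.
        return NumericSummary(
            self.count + other.count,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )


# Summarize two batches of the "weight" column separately, then merge.
batch_1 = NumericSummary.from_values([4.3, 1.8])
batch_2 = NumericSummary.from_values([1.3, 4.1])
merged = batch_1.merge(batch_2)

# Merging the summaries matches summarizing all the data at once.
all_at_once = NumericSummary.from_values([4.3, 1.8, 1.3, 4.1])
print(merged == all_at_once)
```

This is the property that makes a stored summary useful after the original data is gone: two summaries can still be combined into one that covers both batches.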

## What's Next?

There's a lot you can do with the profiles you just created. Keep getting your hands dirty with the following examples!

- Basic
    - [Visualizing Profiles](https://whylogs-v1-doc-dev.netlify.app/examples/basic/notebook_profile_visualizer) - Compare profiles to detect distribution shifts, visualize histograms and bar charts, and explore your data
    - [Schema Configuration for Tracking Metrics](https://whylogs-v1-doc-dev.netlify.app/examples/basic/schema_configuration) - Configure tracking metrics according to data type or column features
    - More to Come!