[![Open in Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/justmarkham/scikit-learn-tips/master?filepath=notebooks%2F37_pipeline_diagram.ipynb)

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/justmarkham/scikit-learn-tips/blob/master/notebooks/37_pipeline_diagram.ipynb)

# 🤖⚡ scikit-learn tip #37 ([video](https://www.youtube.com/watch?v=_UKYxucD1Io&list=PL5-da3qGB5ID7YYAqireYEew2mWVvgmj6&index=37))

New in version 0.23: Create interactive diagrams of Pipelines (and other estimators) in Jupyter!

Click on any element to see more details. You can even export the diagram to an HTML file!

See example 👇

In [1]:
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectPercentile, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline

In [2]:
df = pd.read_csv('http://bit.ly/kaggletrain')
X = df[['Parch', 'Fare', 'Embarked', 'Sex', 'Name', 'Age']]
y = df['Survived']

In [3]:
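# for categorical columns: fill missing values with a constant placeholder
# (the string 'missing_value' by default), then one-hot encode
imp_constant = SimpleImputer(strategy='constant')
ohe = OneHotEncoder()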

In [4]:
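# chain imputation and encoding so they are applied together
imp_ohe = make_pipeline(imp_constant, ohe)
# bag-of-words features for the text column
vect = CountVectorizer()
# for numeric columns: fill missing values with the column mean (the default)
imp = SimpleImputer()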

In [5]:
# pipeline step 1
ct = make_column_transformer(
    (imp_ohe, ['Embarked', 'Sex']),
    (vect, 'Name'),
    (imp, ['Age', 'Fare']),
    ('passthrough', ['Parch']))

In [6]:
# pipeline step 2
selection = SelectPercentile(chi2, percentile=50)

In [7]:
# pipeline step 3
logreg = LogisticRegression(solver='liblinear')

In [8]:
# display estimators as diagrams
from sklearn import set_config
set_config(display='diagram')
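`set_config` changes the display setting globally. If you only want diagrams for part of a notebook, a scoped alternative is `config_context`. The sketch below is a minimal illustration rather than part of the original tip: it uses `get_config` to show that the setting applies inside the `with` block and is restored afterwards.

```python
from sklearn import config_context, get_config

# remember the global setting so we can confirm it is untouched afterwards
display_before = get_config()['display']

# inside the block, estimators are displayed as diagrams in Jupyter
with config_context(display='diagram'):
    display_inside = get_config()['display']

# outside the block, the previous setting is back in effect
display_after = get_config()['display']
print(display_inside, display_after == display_before)
```

To switch back to the plain-text representation globally, call `set_config(display='text')`.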

In [9]:
pipe = make_pipeline(ct, selection, logreg)
pipe

In [10]:
# export the diagram to a file
# (write as UTF-8 because the generated HTML can contain non-ASCII characters)
from sklearn.utils import estimator_html_repr
with open('pipeline.html', 'w', encoding='utf-8') as f:
    f.write(estimator_html_repr(pipe))
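To see what the export produces without downloading the Titanic data, here is a minimal self-contained sketch. The two-step `demo_pipe` and the temporary output path are assumptions made for illustration only; the one call taken from the tip is `estimator_html_repr`, which returns the diagram as an HTML string.

```python
import tempfile
from pathlib import Path

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils import estimator_html_repr

# hypothetical stand-in for the full pipeline built above
demo_pipe = make_pipeline(StandardScaler(), LogisticRegression())

# the diagram is returned as a plain string of HTML (the pipeline need not be fitted)
html = estimator_html_repr(demo_pipe)

# write it to a temporary location; open the file in a browser to view the diagram
out_path = Path(tempfile.gettempdir()) / 'demo_pipeline.html'
out_path.write_text(html, encoding='utf-8')
print(out_path)
```

Because the result is an ordinary string, it can also be embedded in a report or dashboard instead of being written to its own file.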

### Want more tips? [View all tips on GitHub](https://github.com/justmarkham/scikit-learn-tips) or [Sign up to receive 2 tips by email every week](https://scikit-learn.tips) 💌

© 2020 [Data School](https://www.dataschool.io). All rights reserved.