[![Open in Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/justmarkham/scikit-learn-tips/master?filepath=notebooks%2F33_function_transformer.ipynb)

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/justmarkham/scikit-learn-tips/blob/master/notebooks/33_function_transformer.ipynb)

# 🤖⚡ scikit-learn tip #33 ([video](https://www.youtube.com/watch?v=s1gL82BxKos&list=PL5-da3qGB5ID7YYAqireYEew2mWVvgmj6&index=33))

Want to do feature engineering within a ColumnTransformer or Pipeline?

1. Select an existing function (or write your own)
2. Convert it into a transformer using FunctionTransformer
3. 🥳

See example 👇

In [1]:
import pandas as pd
import numpy as np
from sklearn.compose import make_column_transformer

In [2]:
X = pd.DataFrame({'Fare':[200, 300, 50, 900],
                  'Code':['X12', 'Y20', 'Z7', np.nan],
                  'Deck':['A101', 'C102', 'A200', 'C300']})

In [3]:
from sklearn.preprocessing import FunctionTransformer

### Convert existing function into a transformer:

In [4]:
clip_values = FunctionTransformer(np.clip, kw_args={'a_min':100, 'a_max':600})
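
As a quick sanity check, the transformer can be applied on its own before it goes into a ColumnTransformer. This is a minimal sketch: the `fares` DataFrame is a hypothetical stand-in for the `Fare` column, and the result is passed through `np.asarray` so the check does not depend on whether a DataFrame or an array comes back.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import FunctionTransformer

# same transformer as above: values below 100 become 100, values above 600 become 600
clip_values = FunctionTransformer(np.clip, kw_args={'a_min': 100, 'a_max': 600})

# hypothetical one-column frame mirroring the 'Fare' column
fares = pd.DataFrame({'Fare': [200, 300, 50, 900]})

# FunctionTransformer is stateless, so fit does nothing and transform just calls np.clip
clipped = clip_values.fit_transform(fares)
print(np.asarray(clipped).ravel().tolist())  # [200, 300, 100, 600]
```

Anything passed in `kw_args` is forwarded to the wrapped function as keyword arguments, which is how the clipping bounds reach `np.clip`.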

### Convert custom function into a transformer:

In [5]:
# extract the first letter from each string
def first_letter(df):
    return df.apply(lambda x: x.str.slice(0, 1))

In [6]:
get_first_letter = FunctionTransformer(first_letter)
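
The custom transformer can be tried in isolation as well. The sketch below assumes a hypothetical `codes` DataFrame that mirrors the `Code` and `Deck` columns, including the missing value, to show that `.str.slice` leaves missing entries as missing rather than raising an error.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import FunctionTransformer

# extract the first letter from each string, column by column
def first_letter(df):
    return df.apply(lambda x: x.str.slice(0, 1))

get_first_letter = FunctionTransformer(first_letter)

# hypothetical frame mirroring the 'Code' and 'Deck' columns
codes = pd.DataFrame({'Code': ['X12', 'Y20', 'Z7', np.nan],
                      'Deck': ['A101', 'C102', 'A200', 'C300']})

# the function receives the DataFrame unchanged, so pandas string methods are available
letters = get_first_letter.fit_transform(codes)
print(letters)
```

Because the function relies on the `.str` accessor, it is only suitable for string columns. Numeric columns would need a different function.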

### Include them in a ColumnTransformer:

In [7]:
ct = make_column_transformer(
    (clip_values, ['Fare']),
    (get_first_letter, ['Code', 'Deck']))

### Apply the transformations:

In [8]:
X

|   | Fare | Code | Deck |
|---|------|------|------|
| 0 | 200  | X12  | A101 |
| 1 | 300  | Y20  | C102 |
| 2 | 50   | Z7   | A200 |
| 3 | 900  | NaN  | C300 |


In [9]:
ct.fit_transform(X)

array([[200, 'X', 'A'],
       [300, 'Y', 'C'],
       [100, 'Z', 'A'],
       [600, nan, 'C']], dtype=object)
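
The same transformers also work as steps in a Pipeline. The sketch below is an illustration rather than part of the original example: it chains the clipping step with a second FunctionTransformer wrapping `np.log1p`, a step chosen here only to show two function-based transformers running in sequence.

```python
import numpy as np
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer

# clipping step, as defined earlier
clip_values = FunctionTransformer(np.clip, kw_args={'a_min': 100, 'a_max': 600})

# hypothetical extra step (not in the original example): log(1 + x) of the clipped values
log_values = FunctionTransformer(np.log1p)

# each step receives the output of the previous one
pipe = make_pipeline(clip_values, log_values)

fares = pd.DataFrame({'Fare': [200, 300, 50, 900]})
result = np.asarray(pipe.fit_transform(fares)).ravel()
print(result)
```

The order of the steps matters: clipping happens first, so the logarithm is taken of the clipped values 200, 300, 100, and 600.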

### Want more tips? [View all tips on GitHub](https://github.com/justmarkham/scikit-learn-tips) or [Sign up to receive 2 tips by email every week](https://scikit-learn.tips) 💌

© 2020 [Data School](https://www.dataschool.io). All rights reserved.