# What makes a voice sound female or male

The data comes from Kaggle's [Gender Recognition by Voice](https://www.kaggle.com/primaryobjects/voicegender) dataset.

In [1]:
import pandas as pd

In [2]:
xy = pd.read_csv('data/voice.csv')

X = xy.drop('label', axis='columns')
y = xy['label']

In [3]:
from sklearn.model_selection import train_test_split

In [4]:
X_train, X_test, y_train, y_test = train_test_split(X, y)
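
Called with no extra arguments, `train_test_split` holds out 25% of the rows and shuffles them differently on every run, so the accuracy reported further down will vary slightly between runs. A minimal sketch of a reproducible, class-balanced split is shown here. The toy DataFrame, the `random_state=0` seed, and the use of `stratify` are assumptions for illustration and are not part of the original notebook.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for voice.csv (hypothetical values): one feature plus the label column.
toy = pd.DataFrame({
    'meanfun': [0.10, 0.11, 0.09, 0.12, 0.17, 0.18, 0.16, 0.19],
    'label': ['male'] * 4 + ['female'] * 4,
})
X_toy = toy.drop('label', axis='columns')
y_toy = toy['label']

# random_state fixes the shuffle; stratify keeps the class ratio the same in both parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.25, random_state=0, stratify=y_toy
)

print(len(X_tr), len(X_te))
print(y_te.value_counts().to_dict())
```

With eight rows and `test_size=0.25`, six rows go to training and two to testing, one from each class.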

We'll train a random forest classifier on the training split and keep the test split for evaluation.

In [5]:
from sklearn.ensemble import RandomForestClassifier

In [6]:
rf = RandomForestClassifier(n_estimators=100)

In [7]:
rf.fit(X_train, y_train)

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_split=1e-07, min_samples_leaf=1,
            min_samples_split=2, min_weight_fraction_leaf=0.0,
            n_estimators=100, n_jobs=1, oob_score=False, random_state=None,
            verbose=0, warm_start=False)

In [8]:
from sklearn.metrics import accuracy_score

In [9]:
accuracy_score(y_test, rf.predict(X_test))

0.98232323232323238

Nice! We got over 98% accuracy on the held-out test set.
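
That number comes from a single random split, so it is one sample of the model's performance rather than a stable estimate. One way to check how much it moves is k-fold cross-validation. The sketch below uses synthetic data from `make_classification` as a stand-in for the voice features (an assumption, since `data/voice.csv` is not bundled here); the names `X_demo`, `y_demo`, and `rf_demo` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the voice data (assumption): numeric features, two classes.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

rf_demo = RandomForestClassifier(n_estimators=100, random_state=0)

# 5-fold cross-validation: each fold serves once as the held-out set.
scores = cross_val_score(rf_demo, X_demo, y_demo, cv=5, scoring='accuracy')

# Mean accuracy and its spread across folds.
print(scores.mean(), scores.std())
```

On the real data, the same call with `X` and `y` from the notebook should give five accuracy values whose spread indicates how much to trust the single-split figure.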

## Explaining the classifier

In [10]:
from lime.lime_tabular import LimeTabularExplainer

In [11]:
features = list(X_train.columns)
explainer = LimeTabularExplainer(X_train.values, feature_names=features, class_names=['female', 'male'])

In [16]:
# randomly pick an example
example = X_train.sample(1).values[0]

In [17]:
exp = explainer.explain_instance(example, rf.predict_proba)

In [18]:
exp.show_in_notebook()

For this example, the mean fundamental frequency (the `meanfun` feature) is below 0.12. According to the LIME explanation, that is the main factor pushing the model to classify this person as male.
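
To make the idea behind such a local explanation concrete, here is a simplified sketch of a local surrogate model: perturb the example, ask the black-box model for probabilities, weight the perturbed points by how close they are to the example, and fit a weighted linear model. This is an illustration of the general technique, not the `lime` library's actual implementation. The synthetic data, the Gaussian perturbation, the kernel width, and the `Ridge` surrogate are all assumptions made for the sketch.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

# Synthetic stand-in data and model (assumptions for illustration only).
X_demo, y_demo = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=0
)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_demo, y_demo)

rng = np.random.default_rng(0)
instance = X_demo[0]
scale = X_demo.std(axis=0)

# 1. Perturb: sample points around the instance, scaled per feature.
samples = instance + rng.normal(size=(1000, X_demo.shape[1])) * scale

# 2. Query the black-box model for P(class 1) at each perturbed point.
probs = model.predict_proba(samples)[:, 1]

# 3. Weight samples by proximity to the instance (exponential kernel).
dist = np.linalg.norm((samples - instance) / scale, axis=1)
kernel_width = 0.75 * np.sqrt(X_demo.shape[1])  # assumed width for this sketch
weights = np.exp(-(dist ** 2) / kernel_width ** 2)

# 4. Fit a weighted linear model; its coefficients act as the local explanation.
surrogate = Ridge(alpha=1.0).fit(samples, probs, sample_weight=weights)

# Print features from most to least influential near this instance.
for i in np.argsort(-np.abs(surrogate.coef_)):
    print(f"feature_{i}: {surrogate.coef_[i]:+.3f}")
```

A positive coefficient means that increasing the feature near this example raises the predicted probability of class 1. The explanation is local: a different example can produce a different ranking.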

# Reference

https://github.com/marcotcr/lime

---
**dreamgonfly@gmail.com**
