{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Generalize Names" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A function that converts a name into a general format `<last_name><separator><firstname letter(s)> (all lowercase)`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> from mlxtend.text import generalize_names" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "A function that converts a name into a general format `<last_name><separator><firstname letter(s)> (all lowercase)`, which is useful when data collected from different sources must be compared or merged on name identifiers. For example, if names are stored in a pandas `DataFrame` column, the `apply` method can be used to generalize them: `df['name'] = df['name'].apply(generalize_names)`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 1 - Defaults" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from mlxtend.text import generalize_names" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'pozo j'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "generalize_names('Pozo, José Ángel')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'pozo j'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "generalize_names('José Pozo')" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'pozo j'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "generalize_names('José Ángel Pozo')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 2 - Optional 
Parameters" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from mlxtend.text import generalize_names" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'etoo sa'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "generalize_names(\"Eto'o, Samuel\", firstname_output_letters=2)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'etoo'" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "generalize_names(\"Eto'o, Samuel\", firstname_output_letters=0)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'etoo, s'" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "generalize_names(\"Eto'o, Samuel\", output_sep=', ')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## API" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "## generalize_names\n", "\n", "*generalize_names(name, output_sep=' ', firstname_output_letters=1)*\n", "\n", "Generalize a person's first and last name.\n", "\n", "Returns a person's name in the format\n", "`<last_name><separator><firstname letter(s)> (all lowercase)`\n", "\n", "**Parameters**\n", "\n", "- `name` : `str`\n", "\n", " Name of the player\n", "\n", "- `output_sep` : `str` (default: ' ')\n", "\n", " String for separating last name and first name in the output.\n", "\n", "- `firstname_output_letters` : `int`\n", "\n", " Number of letters in the abbreviated first name.\n", "\n", "**Returns**\n", "\n", "- `gen_name` : `str`\n", "\n", " The generalized name.\n", "\n", "**Examples**\n", "\n", "For usage examples, please see\n", " 
[http://rasbt.github.io/mlxtend/user_guide/text/generalize_names/](http://rasbt.github.io/mlxtend/user_guide/text/generalize_names/)\n", "\n", "\n" ] } ], "source": [ "with open('../../api_modules/mlxtend.text/generalize_names.md', 'r') as f:\n", " print(f.read())" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" }, "toc": { "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 1 }