{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# [Named Groups](http://www.regular-expressions.info/named.html) of [Regular Expressions](https://docs.python.org/3/library/re.html)\n", "\n", "Make regular expressions more readable with named groups." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "'Mackenzie'" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.group(2)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "'Mackenzie'" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.group('first_name')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- [Mackenzie (first name)](https://en.wikipedia.org/wiki/Mackenzie_%28given_name%29#People_with_the_given_name)\n", "- [Mackenzie (last name)](https://en.wikipedia.org/wiki/Mackenzie_%28surname%29#People_with_the_surname)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[The Zen of Python](https://www.python.org/dev/peps/pep-0020/), by Tim Peters\n", "\n", "Readability counts." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import re" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Regular expressions can indicate whether a string matches a pattern.\n", "\n", "Regular expressions can also be used to do some parsing.\n", "The substrings of interest are called groups.\n", "The traditional way of referring to a group is by index number.\n", "Python also lets you refer to a group by name.\n", "\n", "Using names gives both the regular expression\n", "and references to match groups more meaning.\n", "They make Python code more readable." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "foo_pattern = re.compile('''\n", " ^\n", " ([A-Za-z]+)\n", " ,[ ]\n", " ([A-Za-z]+)\n", " $\n", "''', re.VERBOSE)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "s = 'James, Mackenzie'" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "<_sre.SRE_Match object; span=(0, 16), match='James, Mackenzie'>" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m = re.match(foo_pattern, s)\n", "m" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "('James', 'Mackenzie')" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.groups()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'James, Mackenzie'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.group(0)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'James'" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.group(1)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'Mackenzie'" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.group(2)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [], "source": [ "foo_pattern = re.compile('''\n", " ^\n", " (?P<last_name>[A-Za-z]+)\n", " ,[ ]\n", " (?P<first_name>[A-Za-z]+)\n", " $\n", "''', re.VERBOSE)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { 
"data": { "text/plain": [ "<_sre.SRE_Match object; span=(0, 16), match='James, Mackenzie'>" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m = re.match(foo_pattern, s)\n", "m" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "('James', 'Mackenzie')" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.groups()" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'James, Mackenzie'" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.group(0)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'James'" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.group(1)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'Mackenzie'" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.group(2)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'James'" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.group('last_name')" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'Mackenzie'" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.group('first_name')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Questions\n", "\n", "1. Was Python\n", "[first to name groups](http://www.regular-expressions.info/named.html)\n", "in regular expressions?\n", "\n", "2. 
Catherine asked why there is a capital P in named group syntax.\n", "\n", "Eric found the [Named regular expression group “`(?P<group_name>regexp)`”: what does “P” stand for?](http://stackoverflow.com/questions/10059673/named-regular-expression-group-pgroup-nameregexp-what-does-p-stand-for)\n", "article, which addresses both questions.\n", "\n", "1. Yes, Python was first.\n", "2. The 'P' seems to stand for Python, but we do not really know." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.0" } }, "nbformat": 4, "nbformat_minor": 0 }