{ "metadata": { "name": "", "signature": "sha256:ded94f2baeac630507627dd813671f8a496b507b8c87b3afd934c9a290e337b0" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# String Operations\n", "\n", "- **Author:** [Chris Albon](http://www.chrisalbon.com/), [@ChrisAlbon](https://twitter.com/chrisalbon)\n", "- **Date:** -\n", "- **Repo:** [Python 3 code snippets for data science](https://github.com/chrisalbon/code_py)\n", "- **Note:**\n", "\n", "### Python 3 has three string types\n", "- str() is for Unicode text\n", "- bytes() is for immutable binary data\n", "- bytearray() is a mutable variant of bytes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create some simulated text." ] }, { "cell_type": "code", "collapsed": false, "input": [ "string = 'The quick brown fox jumped over the lazy brown bear.'" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Capitalize the first letter." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "string_capitalized = string.capitalize()\n", "string_capitalized" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ "'The quick brown fox jumped over the lazy brown bear.'" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Center the string with periods on either side, for a total of 79 characters" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_centered = string.center(79, '.')\n", "string_centered" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ "'..............The quick brown fox jumped over the lazy brown bear..............'" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Count the number of e's between the fifth and last character" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_counted = string.count('e', 4, len(string))\n", "string_counted" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ "4" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Find the index of the first e between the fifth and last character" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_find = string.find('e', 4, len(string))\n", "string_find" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ "24" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Are all characters alphabetic?" 
] }, { "cell_type": "code", "collapsed": false, "input": [ "string_isalpha = string.isalpha()\n", "string_isalpha" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ "False" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Are all characters digits?" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_isdigit = string.isdigit()\n", "string_isdigit" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ "False" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Are all characters lower case?" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_islower = string.islower()\n", "string_islower" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ "False" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Are all characters alphanumeric?" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_isalnum = string.isalnum()\n", "string_isalnum" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 11, "text": [ "False" ] } ], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Are all characters whitespace?" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_isspace = string.isspace()\n", "string_isspace" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 12, "text": [ "False" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Is the string properly title cased?" 
] }, { "cell_type": "code", "collapsed": false, "input": [ "string_istitle = string.istitle()\n", "string_istitle" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 13, "text": [ "False" ] } ], "prompt_number": 13 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Are all the characters uppercase?" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_isupper = string.isupper()\n", "string_isupper" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 14, "text": [ "False" ] } ], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Return the length of the string" ] }, { "cell_type": "code", "collapsed": false, "input": [ "len(string)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 15, "text": [ "52" ] } ], "prompt_number": 15 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Convert string to lower case" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_lower = string.lower()\n", "string_lower" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 16, "text": [ "'the quick brown fox jumped over the lazy brown bear.'" ] } ], "prompt_number": 16 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Convert string to upper case" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_upper = string.upper()\n", "string_upper" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 17, "text": [ "'THE QUICK BROWN FOX JUMPED OVER THE LAZY BROWN BEAR.'" ] } ], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Convert string to title case" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_title = string.title()\n", 
"string_title" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 18, "text": [ "'The Quick Brown Fox Jumped Over The Lazy Brown Bear.'" ] } ], "prompt_number": 18 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Convert string to the inverted case" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_swapcase = string.swapcase()\n", "string_swapcase" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 19, "text": [ "'tHE QUICK BROWN FOX JUMPED OVER THE LAZY BROWN BEAR.'" ] } ], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Remove all leading whitespace (i.e. to the left)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_lstrip = string.lstrip()\n", "string_lstrip" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 20, "text": [ "'The quick brown fox jumped over the lazy brown bear.'" ] } ], "prompt_number": 20 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Remove all leading and trailing whitespace (i.e. to the left and right)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_strip = string.strip()\n", "string_strip" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 21, "text": [ "'The quick brown fox jumped over the lazy brown bear.'" ] } ], "prompt_number": 21 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Remove all trailing whitespace (i.e. 
to the right)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_rstrip = string.rstrip()\n", "string_rstrip" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 22, "text": [ "'The quick brown fox jumped over the lazy brown bear.'" ] } ], "prompt_number": 22 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Replace lower case e's with upper case E's, to a maximum of 4" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string_replace = string.replace('e', 'E', 4)\n", "string_replace" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 23, "text": [ "'ThE quick brown fox jumpEd ovEr thE lazy brown bear.'" ] } ], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }