{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Python Regular Expression"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1. 일반적인 문자는 그 문자 자체로 매칭이 된다. 특수 문자는 다음과 같다.\n",
    "\n",
    "    \\       특수 문자 Escape (or start a sequence)\n",
    "    .       줄바꿈을 제외한 모든 문자 (참고 re.DOTALL)\n",
    "    ^       문자열의 시작 (참고 re.MULTILINE)\n",
    "    $       문자열의 마지막 (re.MULTILINE)\n",
    "    []      문자 집합\n",
    "    |       또는\n",
    "    ()      Capture 그룹 생성 (우선순위 지정)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. 집합기호 '``[``' 문자 이후에 사용할 수 있는 특수 문자는 다음과 같다. \n",
    "\n",
    "    ]       집합의 끝\n",
    "    -       범위 (예 a-c 는 a, b 또는 c를 의미)\n",
    "    ^       Negate를 의미"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Quantifiers (append '``?``' for non-greedy):\n",
    "\n",
    "    {m}     Exactly m repetitions\n",
    "    {m,n}   From m (default 0) to n (default infinity)\n",
    "    *       0 or more. Same as {,}\n",
    "    +       1 or more. Same as {1,}\n",
    "    ?       0 or 1. Same as {,1}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Special sequences::\n",
    "\n",
    "    \\A  Start of string\n",
    "    \\b  Match empty string at word (\\w+) boundary\n",
    "    \\B  Match empty string not at word boundary\n",
    "    \\d  Digit\n",
    "    \\D  Non-digit\n",
    "    \\s  Whitespace - [ \\t\\n\\r\\f\\v]과 동일 (참고: LOCALE, UNICODE)\n",
    "    \\S  Non-whitespace\n",
    "    \\w  Alphanumeric: [0-9a-zA-Z_], see LOCALE\n",
    "    \\W  Non-alphanumeric\n",
    "    \\Z  End of string\n",
    "    \\g<id>  Match prev named or numbered group,\n",
    "            '<' & '>' are literal, e.g. \\g<0>\n",
    "            or \\g<name> (not \\g0 or \\gname)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Special character escapes are much like those already escaped in Python string\n",
    "literals. Hence regex '``\\n``' is same as regex '``\\\\n``'::\n",
    "\n",
    "    \\a  ASCII Bell (BEL)\n",
    "    \\f  ASCII Formfeed\n",
    "    \\n  ASCII Linefeed\n",
    "    \\r  ASCII Carriage return\n",
    "    \\t  ASCII Tab\n",
    "    \\v  ASCII Vertical tab\n",
    "    \\\\  A single backslash\n",
    "    \\xHH   Two digit hexadecimal character goes here\n",
    "    \\OOO   Three digit octal char (or just use an\n",
    "           initial zero, e.g. \\0, \\09)\n",
    "    \\DD    Decimal number 1 to 99, match\n",
    "           previous numbered group\n",
    "\n",
    "Extensions. Do not cause grouping, except '``P<name>``'::\n",
    "\n",
    "    (?iLmsux)     Match empty string, sets re.X flags\n",
    "    (?:...)       Non-capturing version of regular parens\n",
    "    (?P<name>...) Create a named capturing group.\n",
    "    (?P=name)     Match whatever matched prev named group\n",
    "    (?#...)       A comment; ignored.\n",
    "    (?=...)       Lookahead assertion, match without consuming\n",
    "    (?!...)       Negative lookahead assertion\n",
    "    (?<=...)      Lookbehind assertion, match if preceded\n",
    "    (?<!...)      Negative lookbehind assertion\n",
    "    (?(id)y|n)    Match 'y' if group 'id' matched, else 'n'\n",
    "\n",
    "Flags for re.compile(), etc. Combine with ``'|'``::\n",
    "\n",
    "    re.I == re.IGNORECASE   Ignore case\n",
    "    re.L == re.LOCALE       Make \\w, \\b, and \\s locale dependent\n",
    "    re.M == re.MULTILINE    Multiline\n",
    "    re.S == re.DOTALL       Dot matches all (including newline)\n",
    "    re.U == re.UNICODE      Make \\w, \\b, \\d, and \\s unicode dependent\n",
    "    re.X == re.VERBOSE      Verbose (unescaped whitespace in pattern\n",
    "                            is ignored, and '#' marks comment lines)\n",
    "\n",
    "Module level functions::\n",
    "\n",
    "    compile(pattern[, flags]) -> RegexObject\n",
    "    match(pattern, string[, flags]) -> MatchObject\n",
    "    search(pattner, string[, flags]) -> MatchObject\n",
    "    findall(pattern, string[, flags]) -> list of strings\n",
    "    finditer(pattern, string[, flags]) -> iter of MatchObjects\n",
    "    split(pattern, string[, maxsplit, flags]) -> list of strings\n",
    "    sub(pattern, repl, string[, count, flags]) -> string\n",
    "    subn(pattern, repl, string[, count, flags]) -> (string, int)\n",
    "    escape(string) -> string\n",
    "    purge() # the re cache\n",
    "\n",
    "RegexObjects (returned from ``compile()``)::\n",
    "\n",
    "    .match(string[, pos, endpos]) -> MatchObject\n",
    "    .search(string[, pos, endpos]) -> MatchObject\n",
    "    .findall(string[, pos, endpos]) -> list of strings\n",
    "    .finditer(string[, pos, endpos]) -> iter of MatchObjects\n",
    "    .split(string[, maxsplit]) -> list of strings\n",
    "    .sub(repl, string[, count]) -> string\n",
    "    .subn(repl, string[, count]) -> (string, int)\n",
    "    .flags      # int, Passed to compile()\n",
    "    .groups     # int, Number of capturing groups\n",
    "    .groupindex # {}, Maps group names to ints\n",
    "    .pattern    # string, Passed to compile()\n",
    "\n",
    "MatchObjects (returned from ``match()`` and ``search()``)::\n",
    "\n",
    "    .expand(template) -> string, Backslash & group expansion\n",
    "    .group([group1...]) -> string or tuple of strings, 1 per arg\n",
    "    .groups([default]) -> tuple of all groups, non-matching=default\n",
    "    .groupdict([default]) -> {}, Named groups, non-matching=default\n",
    "    .start([group]) -> int, Start/end of substring match by group\n",
    "    .end([group]) -> int, Group defaults to 0, the whole match\n",
    "    .span([group]) -> tuple (match.start(group), match.end(group))\n",
    "    .pos       int, Passed to search() or match()\n",
    "    .endpos    int, \"\n",
    "    .lastindex int, Index of last matched capturing group\n",
    "    .lastgroup string, Name of last matched capturing group\n",
    "    .re        regex, As passed to search() or match()\n",
    "    .string    string, \""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}