{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "toc": "true"
   },
   "source": [
    "# Table of Contents\n",
    " <p><div class=\"lev1 toc-item\"><a href=\"#Python-常用模块\" data-toc-modified-id=\"Python-常用模块-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Python 常用模块</a></div><div class=\"lev2 toc-item\"><a href=\"#内置模块\" data-toc-modified-id=\"内置模块-11\"><span class=\"toc-item-num\">1.1&nbsp;&nbsp;</span>内置模块</a></div><div class=\"lev3 toc-item\"><a href=\"#导入模块\" data-toc-modified-id=\"导入模块-111\"><span class=\"toc-item-num\">1.1.1&nbsp;&nbsp;</span>导入模块</a></div><div class=\"lev3 toc-item\"><a href=\"#sys\" data-toc-modified-id=\"sys-112\"><span class=\"toc-item-num\">1.1.2&nbsp;&nbsp;</span>sys</a></div><div class=\"lev3 toc-item\"><a href=\"#os\" data-toc-modified-id=\"os-113\"><span class=\"toc-item-num\">1.1.3&nbsp;&nbsp;</span>os</a></div><div class=\"lev3 toc-item\"><a href=\"#time\" data-toc-modified-id=\"time-114\"><span class=\"toc-item-num\">1.1.4&nbsp;&nbsp;</span>time</a></div><div class=\"lev3 toc-item\"><a href=\"#datetime\" data-toc-modified-id=\"datetime-115\"><span class=\"toc-item-num\">1.1.5&nbsp;&nbsp;</span>datetime</a></div><div class=\"lev3 toc-item\"><a href=\"#random\" data-toc-modified-id=\"random-116\"><span class=\"toc-item-num\">1.1.6&nbsp;&nbsp;</span>random</a></div><div class=\"lev3 toc-item\"><a href=\"#hashlib\" data-toc-modified-id=\"hashlib-117\"><span class=\"toc-item-num\">1.1.7&nbsp;&nbsp;</span>hashlib</a></div><div class=\"lev3 toc-item\"><a href=\"#subprocess\" data-toc-modified-id=\"subprocess-118\"><span class=\"toc-item-num\">1.1.8&nbsp;&nbsp;</span>subprocess</a></div><div class=\"lev3 toc-item\"><a href=\"#shutil\" data-toc-modified-id=\"shutil-119\"><span class=\"toc-item-num\">1.1.9&nbsp;&nbsp;</span>shutil</a></div><div class=\"lev3 toc-item\"><a href=\"#re-正则表达式\" data-toc-modified-id=\"re-正则表达式-1110\"><span class=\"toc-item-num\">1.1.10&nbsp;&nbsp;</span>re 正则表达式</a></div><div class=\"lev3 toc-item\"><a href=\"#python-序列化\" data-toc-modified-id=\"python-序列化-1111\"><span 
class=\"toc-item-num\">1.1.11&nbsp;&nbsp;</span>python 序列化</a></div><div class=\"lev3 toc-item\"><a href=\"#configparser\" data-toc-modified-id=\"configparser-1112\"><span class=\"toc-item-num\">1.1.12&nbsp;&nbsp;</span>configparser</a></div><div class=\"lev3 toc-item\"><a href=\"#XML\" data-toc-modified-id=\"XML-1113\"><span class=\"toc-item-num\">1.1.13&nbsp;&nbsp;</span>XML</a></div><div class=\"lev3 toc-item\"><a href=\"#logging\" data-toc-modified-id=\"logging-1114\"><span class=\"toc-item-num\">1.1.14&nbsp;&nbsp;</span>logging</a></div><div class=\"lev2 toc-item\"><a href=\"#第三方模块\" data-toc-modified-id=\"第三方模块-12\"><span class=\"toc-item-num\">1.2&nbsp;&nbsp;</span>第三方模块</a></div><div class=\"lev3 toc-item\"><a href=\"#requests\" data-toc-modified-id=\"requests-121\"><span class=\"toc-item-num\">1.2.1&nbsp;&nbsp;</span>requests</a></div><div class=\"lev3 toc-item\"><a href=\"#paramiko\" data-toc-modified-id=\"paramiko-122\"><span class=\"toc-item-num\">1.2.2&nbsp;&nbsp;</span>paramiko</a></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Python 常用模块\n",
    "\n",
    "Python 模块是实现了某个功能的一堆代码的集合。包含n个.py文件实现一个复杂得功能。\n",
    "\n",
    "可以把模块理解为乐高积木，你用这些模块组合出一个模型，然后也可以用这个模块加上其他的模块组合成一个新的模型，Python 开发速度快很多受益于Python 有很多功能强大的第三方模块。\n",
    "\n",
    "模块分为三种：\n",
    "- 内置模块\n",
    "- 第三方模块\n",
    "- 自定义模块\n",
    "\n",
    "在这里我会竟可能详细介绍到各个模块常用的功能，如果想深入了解具体和最新的内容务必到查看官方文档。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## 内置模块\n",
    "\n",
    "### 导入模块"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# 单模块,在同一级目录下的\n",
    "import <模块名>\n",
    "\n",
    "# 嵌套在文件夹下的\n",
    "from lib import sa\n",
    "\n",
    "# 嵌套在多级文件夹下的(lib目录下,test目录下)\n",
    "from lib.test import sa\n",
    "\n",
    "# 不同文件夹,重名模块\n",
    "from lib import com as lib_com\n",
    "from  src import com as src_com"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "那么问题来了，导入模块时是根据哪个路径作为基准来进行的呢？即：sys.path"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['', '/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python35.zip', '/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5', '/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/plat-darwin', '/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/lib-dynload', '/usr/local/lib/python3.5/site-packages', '/usr/local/lib/python3.5/site-packages/IPython/extensions', '/Users/lianliang/.ipython']\n"
     ]
    }
   ],
   "source": [
    "import sys\n",
    "\n",
    "print(sys.path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "如果sys.path路径列表没有你想要的路径，可以通过 sys.path.append('路径') 添加。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/Users/lianliang/Desktop/python_notebook\n",
      "['', '/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python35.zip', '/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5', '/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/plat-darwin', '/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/lib-dynload', '/usr/local/lib/python3.5/site-packages', '/usr/local/lib/python3.5/site-packages/IPython/extensions', '/Users/lianliang/.ipython', '/Users/lianliang/Desktop/python_notebook']\n"
     ]
    }
   ],
   "source": [
    "import sys\n",
    "import os\n",
    "\n",
    "project_path = os.path.dirname(os.path.abspath('/Users/lianliang/Desktop/python_notebook/Python 常用模块.ipynb'))\n",
    "print(project_path)\n",
    "sys.path.append(project_path)\n",
    "print(sys.path)"
   ]
  },
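  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Re-running a cell like the one above appends the same directory to `sys.path` again. The following is a minimal sketch of one way to avoid that. The helper name `add_to_sys_path` is hypothetical, and using the current working directory as the project path is an assumption.\n",
    "\n",
    "```python\n",
    "import os\n",
    "import sys\n",
    "\n",
    "\n",
    "def add_to_sys_path(directory):\n",
    "    '''Append directory to sys.path only if it is missing; return the absolute path.'''\n",
    "    absolute = os.path.abspath(directory)\n",
    "    if absolute not in sys.path:\n",
    "        sys.path.append(absolute)\n",
    "    return absolute\n",
    "\n",
    "\n",
    "# Assumption: the notebook's working directory is the project directory.\n",
    "project_path = add_to_sys_path(os.getcwd())\n",
    "print(project_path in sys.path)\n",
    "```\n",
    "\n",
    "Calling the helper twice with the same directory leaves a single entry in the list."
   ]
  },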
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "上面看到了有导入两个模块，那么来详细看看这两个模块的功能吧。\n",
    "\n",
    "### sys\n",
    "\n",
    "用于提供对Python解释器相关的操作："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import sys"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "常用的操作："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "sys.argv           # 命令行参数List，第一个元素是程序本身路径\n",
    "sys.exit(n)        # 退出程序，正常退出时exit(0)\n",
    "sys.version        # 获取Python解释程序的版本信息\n",
    "sys.maxint         # 最大的Int值\n",
    "sys.path           # 返回模块的搜索路径，初始化时使用PYTHONPATH环境变量的值\n",
    "sys.platform       # 返回操作系统平台名称\n",
    "sys.stdin          # 输入相关\n",
    "sys.stdout         # 输出相关\n",
    "sys.stderror       # 错误相关"
   ]
  },
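  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "As a small illustration of the attributes listed above, the sketch below gathers a few of them into a dictionary. The function name `describe_interpreter` and the dictionary keys are made up for this example.\n",
    "\n",
    "```python\n",
    "import sys\n",
    "\n",
    "\n",
    "def describe_interpreter():\n",
    "    '''Collect a few facts about the running interpreter from the sys module.'''\n",
    "    return {\n",
    "        'script': sys.argv[0] if sys.argv else '',\n",
    "        'python': '.'.join(str(part) for part in sys.version_info[:3]),\n",
    "        'platform': sys.platform,\n",
    "        'maxsize': sys.maxsize,\n",
    "    }\n",
    "\n",
    "\n",
    "info = describe_interpreter()\n",
    "print(info)\n",
    "# Diagnostics can be sent to the error stream instead of standard output.\n",
    "print('finished collecting interpreter info', file=sys.stderr)\n",
    "```\n",
    "\n",
    "The values depend on the machine and interpreter that run the cell, so no fixed output is shown."
   ]
  },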
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### os\n",
    "\n",
    "用于系统级别的操作："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "os.getcwd()                 # 获取当前工作目录，即当前python脚本工作的目录路径\n",
    "os.chdir(\"dirname\")         # 改变当前脚本工作目录；相当于shell下cd\n",
    "os.curdir                   # 返回当前目录: ('.')\n",
    "os.pardir                   # 获取当前目录的父目录字符串名：('..')\n",
    "os.makedirs('dir1/dir2')    # 可生成多层递归目录\n",
    "os.removedirs('dirname1')   # 若目录为空，则删除，并递归到上一级目录，如若也为空，则删除，依此类推\n",
    "os.mkdir('dirname')         # 生成单级目录；相当于shell中mkdir dirname\n",
    "os.rmdir('dirname')         # 删除单级空目录，若目录不为空则无法删除，报错；相当于shell中rmdir dirname\n",
    "os.listdir('dirname')       # 列出指定目录下的所有文件和子目录，包括隐藏文件，并以列表方式打印\n",
    "os.remove()                 # 删除一个文件\n",
    "os.rename(\"oldname\",\"new\")  # 重命名文件/目录\n",
    "os.stat('path/filename')    # 获取文件/目录信息\n",
    "os.sep                      # 操作系统特定的路径分隔符，win下为\"\\\\\",Linux下为\"/\"\n",
    "os.linesep                  # 当前平台使用的行终止符，win下为\"\\t\\n\",Linux下为\"\\n\"\n",
    "os.pathsep                  # 用于分割文件路径的字符串\n",
    "os.name                     # 字符串指示当前使用平台。win->'nt'; Linux->'posix'\n",
    "os.system(\"bash command\")   # 运行shell命令，直接显示\n",
    "os.environ                  # 获取系统环境变量\n",
    "os.path.abspath(path)       # 返回path规范化的绝对路径\n",
    "os.path.split(path)         # 将path分割成目录和文件名二元组返回\n",
    "os.path.dirname(path)       # 返回path的目录。其实就是os.path.split(path)的第一个元素\n",
    "os.path.basename(path)      # 返回path最后的文件名。如何path以／或\\结尾，那么就会返回空值。即os.path.split(path)的第二个元素\n",
    "os.path.exists(path)        # 如果path存在，返回True；如果path不存在，返回False\n",
    "os.path.isabs(path)         # 如果path是绝对路径，返回True\n",
    "os.path.isfile(path)        # 如果path是一个存在的文件，返回True。否则返回False\n",
    "os.path.isdir(path)         # 如果path是一个存在的目录，则返回True。否则返回False\n",
    "os.path.join(path1[, path2[, ...]])  # 将多个路径组合后返回，第一个绝对路径之前的参数将被忽略\n",
    "os.path.getatime(path)      #返回path所指向的文件或者目录的最后存取时间\n",
    "os.path.getmtime(path)      #返回path所指向的文件或者目录的最后修改时间"
   ]
  },
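  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "To see several of these calls working together, here is a short sketch that runs inside a throwaway temporary directory. The directory and file names (`dir1`, `dir2`, `notes.txt`) are placeholders, and `tempfile` is used only so the example leaves nothing behind.\n",
    "\n",
    "```python\n",
    "import os\n",
    "import tempfile\n",
    "\n",
    "with tempfile.TemporaryDirectory() as base:\n",
    "    # Create base/dir1/dir2 in a single call.\n",
    "    nested = os.path.join(base, 'dir1', 'dir2')\n",
    "    os.makedirs(nested)\n",
    "\n",
    "    # Write a small file into the nested directory.\n",
    "    file_path = os.path.join(nested, 'notes.txt')\n",
    "    with open(file_path, 'w') as handle:\n",
    "        handle.write('hello')\n",
    "\n",
    "    # Inspect the path and the directory contents.\n",
    "    directory, filename = os.path.split(file_path)\n",
    "    listing = os.listdir(nested)\n",
    "    is_file = os.path.isfile(file_path)\n",
    "    print(filename, listing, is_file)\n",
    "\n",
    "# The temporary directory is removed when the with block ends.\n",
    "print(os.path.exists(base))\n",
    "```\n",
    "\n",
    "Building paths with `os.path.join` rather than string concatenation keeps the code independent of the platform's separator."
   ]
  },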
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### time\n",
    "\n",
    "时间相关的操作，时间有三种表示方式：\n",
    "- 时间戳\n",
    "- 格式化字符串\n",
    "- 结构化时间"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1504597649.343132"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 时间戳，1970年1月1日之后的秒\n",
    "time.time()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'2017-09-05 17:00:49'"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 格式化的字符串\n",
    "time.strftime('%Y-%m-%d %H:%M:%S')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "time.struct_time(tm_year=2017, tm_mon=9, tm_mday=5, tm_hour=17, tm_min=1, tm_sec=7, tm_wday=1, tm_yday=248, tm_isdst=0)"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "(2017, 9, 5, 17, 1, 7, 1, 248, 0)"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 结构化时间，元组包含了：年、日、星期等... time.struct_time    即：time.localtime()\n",
    "time.localtime()\n",
    "tuple(time.localtime())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "| 属性       | 含义     | 值                        |\n",
    "| -------- | :----- | ------------------------ |\n",
    "| tm_year  | 4位数年   | 2008                     |\n",
    "| tm_mon   | 月      | 1 到 12                   |\n",
    "| tm_mday  | 日      | 1 到 31                   |\n",
    "| tm_hour  | 小时     | 0 到 23                   |\n",
    "| tm_min   | 分钟     | 0 到 59                   |\n",
    "| tm_sec   | 秒      | 0 到 61 (60或61 是闰秒)       |\n",
    "| tm_wday  | 一周的第几日 | 0到6 (0是周日)               |\n",
    "| tm_yday  | 一年的第几日 | 1 到 366(儒略历)             |\n",
    "| tm_isdst | 夏令时    | -1, 0, 1, -1是决定是否为夏令时的旗帜 |"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "接下来看看常用方法："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1504602078.759796"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 返回当前系统时间戳\n",
    "time.time()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Tue Sep  5 17:01:19 2017'"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 返回当前系统时间\n",
    "time.ctime()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Mon Sep  4 17:01:20 2017'"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 将时间转换为字符串格式\n",
    "time.ctime(time.time()-86400)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "time.struct_time(tm_year=2017, tm_mon=9, tm_mday=4, tm_hour=9, tm_min=1, tm_sec=21, tm_wday=0, tm_yday=247, tm_isdst=0)"
      ]
     },
     "execution_count": 80,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 接收时间辍（1970纪元后经过的浮点秒数）并返回格林威治天文时间下的时间元组\n",
    "time.gmtime(time.time()-86400)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "time.struct_time(tm_year=2017, tm_mon=9, tm_mday=4, tm_hour=16, tm_min=57, tm_sec=51, tm_wday=0, tm_yday=247, tm_isdst=0)"
      ]
     },
     "execution_count": 74,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 接收时间辍（1970纪元后经过的浮点秒数）并返回当地时间下的时间元组\n",
    "time.localtime(time.time()-86400)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1504602397.0"
      ]
     },
     "execution_count": 82,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 与time.localtime() 功能相反,将struct_time格式转回成时间戳格式\n",
    "time.mktime(time.localtime())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'2017-09-05 17:31:56'"
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 将struct_time格式转成指定的字符串格式\n",
    "time.strftime(\"%Y-%m-%d %H:%M:%S\", time.localtime())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "time.struct_time(tm_year=2017, tm_mon=9, tm_mday=4, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=0, tm_yday=247, tm_isdst=-1)"
      ]
     },
     "execution_count": 86,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 将struct_time格式转成指定的字符串格式\n",
    "time.strptime(\"2017-9-4\", \"%Y-%m-%d\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3.243893"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 返回当前的CPU时间。用来衡量不同程序的耗时，比time.time()更有用。\n",
    "time.clock()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "#sleep,等待4秒\n",
    "time.sleep(4) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.0014300000000000423 seconds process time\n",
      "2.503314971923828 seconds wall time\n"
     ]
    }
   ],
   "source": [
    "def procedure():\n",
    "    time.sleep(2.5)\n",
    "\n",
    "# measure process time\n",
    "t0 = time.clock()\n",
    "procedure()\n",
    "print(time.clock() - t0, \"seconds process time\")\n",
    "\n",
    "# measure wall time\n",
    "t0 = time.time()\n",
    "procedure()\n",
    "print(time.time() - t0, \"seconds wall time\")"
   ]
  },
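  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "On Python 3.3 and later, `time.process_time()` measures CPU time and `time.perf_counter()` is a high-resolution clock intended for elapsed-time measurements. The sketch below repeats the comparison above with those two functions; the shortened 0.2 second sleep is an arbitrary choice to keep the cell fast.\n",
    "\n",
    "```python\n",
    "import time\n",
    "\n",
    "\n",
    "def procedure():\n",
    "    # Sleeping uses wall-clock time but almost no CPU time.\n",
    "    time.sleep(0.2)\n",
    "\n",
    "\n",
    "cpu_start = time.process_time()\n",
    "wall_start = time.perf_counter()\n",
    "procedure()\n",
    "cpu_elapsed = time.process_time() - cpu_start\n",
    "wall_elapsed = time.perf_counter() - wall_start\n",
    "\n",
    "print(cpu_elapsed, 'seconds process time')\n",
    "print(wall_elapsed, 'seconds wall time')\n",
    "```\n",
    "\n",
    "The process time should stay close to zero while the wall time is roughly the length of the sleep."
   ]
  },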
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### datetime"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import datetime"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 91,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.date(2017, 9, 5)"
      ]
     },
     "execution_count": 91,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2017-09-05\n"
     ]
    }
   ],
   "source": [
    "# 获取当天日期\n",
    "datetime.date.today()\n",
    "print(datetime.date.today())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2017-09-04\n"
     ]
    }
   ],
   "source": [
    "# 将时间戳转成日期格式\n",
    "print(datetime.date.fromtimestamp(time.time()-86400))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2017-09-05 17:43:32.869855\n"
     ]
    }
   ],
   "source": [
    "# 获取当前时间，精确到毫秒\n",
    "current_time = datetime.datetime.now()\n",
    "print(current_time)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "time.struct_time(tm_year=2017, tm_mon=9, tm_mday=5, tm_hour=17, tm_min=43, tm_sec=32, tm_wday=1, tm_yday=248, tm_isdst=-1)"
      ]
     },
     "execution_count": 98,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 返回struct_time格式\n",
    "current_time.timetuple()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 111,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2002-02-01 17:43:32.869855\n"
     ]
    }
   ],
   "source": [
    "# 返回当前时间,但指定的值将被替换\n",
    "current_time.replace(2002, 2, 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 110,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.datetime(2006, 11, 21, 16, 30)"
      ]
     },
     "execution_count": 110,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 将字符串转换成日期格式\n",
    "datetime.datetime.strptime(\"21/11/06 16:30\", \"%d/%m/%y %H:%M\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 114,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.datetime(2017, 9, 15, 18, 10, 5, 733945)"
      ]
     },
     "execution_count": 114,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "datetime.datetime(2017, 8, 26, 18, 10, 5, 736410)"
      ]
     },
     "execution_count": 114,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "datetime.datetime(2017, 9, 5, 8, 10, 5, 739484)"
      ]
     },
     "execution_count": 114,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "datetime.datetime(2017, 9, 5, 18, 12, 5, 741866)"
      ]
     },
     "execution_count": 114,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "datetime.datetime.now() + datetime.timedelta(days=10) #比现在加10天\n",
    "datetime.datetime.now() + datetime.timedelta(days=-10) #比现在减10天\n",
    "datetime.datetime.now() + datetime.timedelta(hours=-10) #比现在减10小时\n",
    "datetime.datetime.now() + datetime.timedelta(seconds=120) #比现在+120s"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "python中时间日期格式化符号：\n",
    "\n",
    "- %y 两位数的年份表示（00-99）\n",
    "- %Y 四位数的年份表示（000-9999）\n",
    "- %m 月份（01-12）\n",
    "- %d 月内中的一天（0-31）\n",
    "- %H 24小时制小时数（0-23）\n",
    "- %I 12小时制小时数（01-12）\n",
    "- %M 分钟数（00=59）\n",
    "- %S 秒（00-59）\n",
    "- %a 本地简化星期名称\n",
    "- %A 本地完整星期名称\n",
    "- %b 本地简化的月份名称\n",
    "- %B 本地完整的月份名称\n",
    "- %c 本地相应的日期表示和时间表示\n",
    "- %j 年内的一天（001-366）\n",
    "- %p 本地A.M.或P.M.的等价符\n",
    "- %U 一年中的星期数（00-53）星期天为星期的开始\n",
    "- %w 星期（0-6），星期天为星期的开始\n",
    "- %W 一年中的星期数（00-53）星期一为星期的开始\n",
    "- %x 本地相应的日期表示\n",
    "- %X 本地相应的时间表示\n",
    "- %Z 当前时区的名称\n",
    "- %% %号本身"
   ]
  },
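  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The format codes work in both directions: `strftime` turns a datetime into text and `strptime` parses text back. The sketch below round-trips a fixed moment (chosen here purely for illustration) so the result does not depend on when the cell is run.\n",
    "\n",
    "```python\n",
    "import datetime\n",
    "\n",
    "# A fixed example moment rather than datetime.now(), so the output is repeatable.\n",
    "moment = datetime.datetime(2017, 9, 5, 17, 43, 32)\n",
    "\n",
    "text = moment.strftime('%Y-%m-%d %H:%M:%S')                      # datetime -> string\n",
    "parsed = datetime.datetime.strptime(text, '%Y-%m-%d %H:%M:%S')   # string -> datetime\n",
    "day_of_year = moment.strftime('%j')                              # zero-padded day of the year\n",
    "\n",
    "print(text)              # 2017-09-05 17:43:32\n",
    "print(parsed == moment)  # True\n",
    "print(day_of_year)       # 248\n",
    "```\n",
    "\n",
    "The round trip is lossless here only because the format string covers every field that is set; microseconds, for example, would be dropped unless `%f` is included."
   ]
  },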
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### random\n",
    "\n",
    "生存随机数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import random"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.8702062522116073"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "random.random()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "random.randint(1, 5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "random.randrange(1, 5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "randint(self, a, b)\n",
    "\n",
    "    Return random integer in range [a, b], including both end points.\n",
    "    \n",
    "randrange(self, start, stop=None, step=1, _int=<class 'int'>)\n",
    "\n",
    "    Choose a random item from range(start, stop[, step]).\n",
    "  \n",
    "    This fixes the problem with randint() which includes the\n",
    "    endpoint; in Python this is usually not what you want.\n",
    "    \n",
    "看介绍这两个唯一的区别就是randint包含最后一个数字，randrange不包含。"
   ]
  },
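  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "One way to see the endpoint difference is to draw many values and collect the distinct results. This is only an empirical sketch: the sample size of 1000 and the fixed seed are arbitrary choices made for repeatability.\n",
    "\n",
    "```python\n",
    "import random\n",
    "\n",
    "random.seed(0)  # fixed seed only so that repeated runs give the same draws\n",
    "\n",
    "# Collect the distinct values produced by each function over many draws.\n",
    "randint_values = {random.randint(1, 5) for _ in range(1000)}\n",
    "randrange_values = {random.randrange(1, 5) for _ in range(1000)}\n",
    "\n",
    "print(sorted(randint_values))\n",
    "print(sorted(randrange_values))\n",
    "```\n",
    "\n",
    "With this many draws, 5 should appear among the `randint` results but never among the `randrange` results."
   ]
  },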
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**生成四位随机验证码：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "R4YY\n"
     ]
    }
   ],
   "source": [
    "import random\n",
    "checkcode = ''\n",
    "for i in range(4):\n",
    "    current = random.randrange(0,4)\n",
    "    if current != i:\n",
    "        # ASCII码中表示26个字母的数字，这里通过随机拿到一个数字生产一个随机字母\n",
    "        temp = chr(random.randint(65,90)) \n",
    "    else:\n",
    "        # 这里其实没什么含义，就是0～9各位数字随机再抽一个数字。\n",
    "        temp = random.randint(0,9)\n",
    "    checkcode += str(temp)  # 拼接每次生产的随机验证码\n",
    "print(checkcode)"
   ]
  },
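  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "A more compact alternative, shown here as a sketch rather than a replacement for the cell above, draws each character from a single alphabet with `random.choice`. The function name `make_checkcode` and its `length` parameter are hypothetical. Note that this swaps in a different technique: every character is equally likely to be any letter or digit, which is not the same distribution as the loop above.\n",
    "\n",
    "```python\n",
    "import random\n",
    "import string\n",
    "\n",
    "\n",
    "def make_checkcode(length=4):\n",
    "    '''Return a random code made of uppercase letters and digits.'''\n",
    "    alphabet = string.ascii_uppercase + string.digits\n",
    "    return ''.join(random.choice(alphabet) for _ in range(length))\n",
    "\n",
    "\n",
    "code = make_checkcode()\n",
    "print(code)\n",
    "```\n",
    "\n",
    "The `random` module is not designed for security purposes; for codes that protect anything sensitive, the standard-library `secrets` module (Python 3.6+) is the usual choice."
   ]
  },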
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### hashlib\n",
    "\n",
    "用户加密相关操作，代替了md5模块和sha模块，主要提供 SHA1, SHA224, SHA256, SHA384, SHA512 ，MD5 算法"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import hashlib"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**md5加密：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'cfdbb07149af95421c669df691fbef2f'"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "b'\\xcf\\xdb\\xb0qI\\xaf\\x95B\\x1cf\\x9d\\xf6\\x91\\xfb\\xef/'"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash = hashlib.md5()\n",
    "hash.update(bytes('liangxiansen', encoding='utf-8'))\n",
    "hash.hexdigest()\n",
    "hash.digest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**sha1加密：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'c085f76a451b740fbcc6981210c933980927e046'"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "b\"\\xc0\\x85\\xf7jE\\x1bt\\x0f\\xbc\\xc6\\x98\\x12\\x10\\xc93\\x98\\t'\\xe0F\""
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash = hashlib.sha1()\n",
    "hash.update(bytes('liangxiansen', encoding='utf-8'))\n",
    "hash.hexdigest()\n",
    "hash.digest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**sha256加密：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'342cb3e7bf3731583173d0fd8b69e6be5dbd663de38c6d06bc1c96c88e5b0a7c'"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "b'4,\\xb3\\xe7\\xbf71X1s\\xd0\\xfd\\x8bi\\xe6\\xbe]\\xbdf=\\xe3\\x8cm\\x06\\xbc\\x1c\\x96\\xc8\\x8e[\\n|'"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash = hashlib.sha256()\n",
    "hash.update(bytes('liangxiansen', encoding='utf-8'))\n",
    "hash.hexdigest()\n",
    "hash.digest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**sha384加密：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'ddeede7a8da65f88452b06a03fb7369820e0c7e320548df9d81ea6184776fc6b4cc94843baca672b0aa17c24aff270bb'"
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "b'\\xdd\\xee\\xdez\\x8d\\xa6_\\x88E+\\x06\\xa0?\\xb76\\x98 \\xe0\\xc7\\xe3 T\\x8d\\xf9\\xd8\\x1e\\xa6\\x18Gv\\xfckL\\xc9HC\\xba\\xcag+\\n\\xa1|$\\xaf\\xf2p\\xbb'"
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash = hashlib.sha384()\n",
    "hash.update(bytes('liangxiansen', encoding='utf-8'))\n",
    "hash.hexdigest()\n",
    "hash.digest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**sha512加密：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'bad8745b3e120b270a998fbecad35fc444ca472f107da159d536b93f70b1f68c08cda0be9d8a2305f2ea1efc4d16969d85011de8f021a55ecd1711d6a8646e9b'"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "b\"\\xba\\xd8t[>\\x12\\x0b'\\n\\x99\\x8f\\xbe\\xca\\xd3_\\xc4D\\xcaG/\\x10}\\xa1Y\\xd56\\xb9?p\\xb1\\xf6\\x8c\\x08\\xcd\\xa0\\xbe\\x9d\\x8a#\\x05\\xf2\\xea\\x1e\\xfcM\\x16\\x96\\x9d\\x85\\x01\\x1d\\xe8\\xf0!\\xa5^\\xcd\\x17\\x11\\xd6\\xa8dn\\x9b\""
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash = hashlib.sha512()\n",
    "hash.update(bytes('liangxiansen', encoding='utf-8'))\n",
    "hash.hexdigest()\n",
    "hash.digest()"
   ]
  },
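  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The hash algorithms above are strong, but they share a weakness: the plain hash of a common input can be reversed by looking it up in a table of precomputed hashes. It therefore helps to mix a custom key (a salt) into the input before hashing."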
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'5be4ded703228db7cba551de0f48bbb0'"
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "b'[\\xe4\\xde\\xd7\\x03\"\\x8d\\xb7\\xcb\\xa5Q\\xde\\x0fH\\xbb\\xb0'"
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import hashlib\n",
    "\n",
    "hash = hashlib.md5(bytes('www.liangxiansen.cn', encoding='utf-8'))\n",
    "hash.update(bytes('liangxiansen', encoding='utf-8'))\n",
    "hash.hexdigest()\n",
    "hash.digest()"
   ]
  },
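  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Passing the key to the constructor and then calling `update()` hashes the key and the content as one concatenated byte string. The key acts as a prefix that only you know, so the digest no longer matches the plain hash of the content."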
   ]
  },
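  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "A minimal sketch to check what the keyed example computes, assuming the same key and content as above (the names `key`, `content`, and `salted` are illustrative). Passing the key to the constructor and then calling `update()` should give the same digest as hashing the concatenated bytes in one call, and a different digest from the unsalted content."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import hashlib\n",
    "\n",
    "key = b'www.liangxiansen.cn'   # illustrative key (salt)\n",
    "content = b'liangxiansen'\n",
    "\n",
    "salted = hashlib.md5(key)\n",
    "salted.update(content)\n",
    "\n",
    "# Compare with hashing key + content in a single call\n",
    "print(salted.hexdigest() == hashlib.md5(key + content).hexdigest())\n",
    "# Compare with the plain, unsalted digest of the content\n",
    "print(salted.hexdigest() == hashlib.md5(content).hexdigest())"
   ]
  },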
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Python's standard library also includes the **hmac** module, which processes the key and the content further internally before hashing them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "12488557134747a02162f81bd0c4f0b4\n"
     ]
    }
   ],
   "source": [
    "import hmac\n",
    "\n",
    "# digestmod is required since Python 3.8; 'md5' was the old implicit default\n",
    "hash = hmac.new(bytes('www.liangxiansen.cn',encoding=\"utf-8\"), digestmod='md5')\n",
    "hash.update(bytes('liangxiansen',encoding=\"utf-8\"))\n",
    "print(hash.hexdigest())"
   ]
  },
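  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "As a hedged sketch of how an HMAC is typically used to verify a message: the sender computes a tag with a shared key, and the receiver recomputes it and compares. The choice of SHA-256 and the names `key`, `message`, `tag`, `expected`, and `forged` are assumptions for illustration, not part of the example above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import hashlib\n",
    "import hmac\n",
    "\n",
    "key = b'www.liangxiansen.cn'   # illustrative shared secret\n",
    "message = b'liangxiansen'\n",
    "\n",
    "# The sender computes a tag over the message with an explicit digest\n",
    "tag = hmac.new(key, message, digestmod=hashlib.sha256).hexdigest()\n",
    "\n",
    "# The receiver recomputes the tag and compares it in constant time\n",
    "expected = hmac.new(key, message, digestmod=hashlib.sha256).hexdigest()\n",
    "print(hmac.compare_digest(tag, expected))\n",
    "\n",
    "# A modified message yields a different tag, so the comparison fails\n",
    "forged = hmac.new(key, b'liangxiansen!', digestmod=hashlib.sha256).hexdigest()\n",
    "print(hmac.compare_digest(tag, forged))"
   ]
  },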
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### subprocess\n",
    "\n",
    "The **subprocess** module lets Python run Linux commands."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import subprocess"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "execution_count": 95,
     "metadata": {},
     "output_type": "execute_result"
    }
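   ],
   "source": [
    "# Run the command and return its exit status: 0 on success, otherwise the command's non-zero status\n",
    "subprocess.call(['ls', '-l'])  # pass the command and its arguments as a sequence"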
   ]
  },
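  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Passing a list feels unlike typing a command at a Linux prompt. With **shell=True** the command can be written as a single string, which the shell then interprets. Avoid passing untrusted input this way."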
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "execution_count": 83,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "subprocess.call(\"ls -l\", shell=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "execution_count": 98,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Run the command and return 0 on success; a non-zero exit status raises CalledProcessError\n",
    "subprocess.check_call(\"ls -l\", shell=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "So far `call` and `check_call` look identical. The difference shows up when the command fails:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 108,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "127"
      ]
     },
     "execution_count": 108,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from subprocess import CalledProcessError\n",
    "try:\n",
    "    subprocess.call(\"laa -l\", shell=True)   # no such command as laa; call() only returns the exit status\n",
    "except CalledProcessError as error:\n",
    "    print(error)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 109,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Command 'laa -l' returned non-zero exit status 127\n"
     ]
    }
   ],
   "source": [
    "from subprocess import CalledProcessError\n",
    "try:\n",
    "    subprocess.check_call(\"laa -l\", shell=True)    # no such command as laa; check_call() raises CalledProcessError\n",
    "except CalledProcessError as error:\n",
    "    print(error)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "The functions above suit commands where only the exit status matters. When the command's output is needed as well, use the following:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 114,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "b'lianliang\\n'\n",
      "lianliang\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Return the command's standard output; raise an exception if the command fails\n",
    "whoami = subprocess.check_output('whoami', shell=True)\n",
    "\n",
    "# The result is bytes by default; decode it to str\n",
    "print(whoami)\n",
    "print(str(whoami, encoding='utf-8'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "There is also a more powerful interface: **subprocess.Popen(...)**\n",
    "\n",
    "It is used to run more complex system commands."
   ]
  },
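  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Parameters:\n",
    "\n",
    "- args: the command to run, either a string or a sequence (such as a list or tuple)\n",
    "- bufsize: buffering policy. 0 means unbuffered, 1 means line buffered, any other positive value is the buffer size, and a negative value uses the system default\n",
    "- stdin, stdout, stderr: the child program's standard input, standard output, and standard error handles\n",
    "- preexec_fn: Unix only. A callable object that is called in the child process just before the command is executed\n",
    "- close_fds: on Windows, if close_fds is True, the child process does not inherit the parent's input, output, and error pipes,\n",
    "  so close_fds could not be True while stdin, stdout, or stderr were redirected (this restriction was lifted in Python 3.7).\n",
    "- shell: as above\n",
    "- cwd: the working directory of the child process\n",
    "- env: the environment variables of the child process. If env is None, they are inherited from the parent process.\n",
    "- universal_newlines: line endings differ between systems; if True, the streams are opened in text mode and line endings are normalized to \\n\n",
    "- startupinfo and creationflags: Windows only.\n",
    "  They are passed to the underlying CreateProcess() function to set properties of the child process, such as the main window's appearance and the process priority."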
   ]
  },
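  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "A small sketch of several of these parameters working together: `args` as a list, plus `cwd`, `env`, and `universal_newlines`. The variable name `DEMO_VAR` and the use of the system temporary directory are made up for illustration, and `sys.executable` is used only so the example does not depend on a particular command being installed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import subprocess\n",
    "import sys\n",
    "import tempfile\n",
    "\n",
    "workdir = tempfile.gettempdir()                  # illustrative working directory\n",
    "child_env = dict(os.environ, DEMO_VAR='hello')   # DEMO_VAR is a hypothetical variable\n",
    "\n",
    "# args as a sequence: no shell is involved, each item is one argument\n",
    "proc = subprocess.Popen(\n",
    "    [sys.executable, '-c', \"import os; print(os.environ['DEMO_VAR']); print(os.getcwd())\"],\n",
    "    cwd=workdir,\n",
    "    env=child_env,\n",
    "    stdout=subprocess.PIPE,\n",
    "    stderr=subprocess.PIPE,\n",
    "    universal_newlines=True,\n",
    ")\n",
    "\n",
    "# communicate() waits for the child to exit and returns (stdout, stderr) as str\n",
    "out, err = proc.communicate()\n",
    "print(out)"
   ]
  },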
  {
   "cell_type": "code",
   "execution_count": 123,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/Users/lianliang/Desktop/python_notebook\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Run the command with stdin, stdout and stderr connected to pipes\n",
    "cmd = subprocess.Popen(\"pwd\", shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)\n",
    "# Read standard output; with universal_newlines=True it is already a str, so no decoding is needed\n",
    "cmd_output = cmd.stdout.read()\n",
    "print(cmd_output)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 126,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/Users/lianliang\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Run the command in a given working directory (cwd), with stdin, stdout and stderr connected to pipes\n",
    "cmd = subprocess.Popen(\"pwd\", shell=True, cwd='/Users/lianliang', stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)\n",
    "# Read standard output; with universal_newlines=True it is already a str, so no decoding is needed\n",
    "cmd_output = cmd.stdout.read()\n",
    "print(cmd_output)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 127,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "9"
      ]
     },
     "execution_count": 127,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "9"
      ]
     },
     "execution_count": 127,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1\n",
      "2\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Start python3 as a child process\n",
    "cmd = subprocess.Popen(\"python3\", stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)\n",
    "# Send a few statements (strings) through standard input\n",
    "cmd.stdin.write(\"print(1)\\n\")\n",
    "cmd.stdin.write(\"print(2)\\n\")\n",
    "cmd.stdin.close()\n",
    "\n",
    "# Collect the command's output\n",
    "cmd_out = cmd.stdout.read()\n",
    "cmd.stdout.close()\n",
    "cmd_error = cmd.stderr.read()\n",
    "cmd.stderr.close()\n",
    "\n",
    "print(cmd_out)\n",
    "print(cmd_error)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 129,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "9"
      ]
     },
     "execution_count": 129,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "9"
      ]
     },
     "execution_count": 129,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('1\\n2\\n', '')\n"
     ]
    }
   ],
   "source": [
    "# Start python3 as a child process\n",
    "cmd = subprocess.Popen(\"python3\", stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)\n",
    "# Send a few statements (strings) through standard input\n",
    "cmd.stdin.write(\"print(1)\\n\")\n",
    "cmd.stdin.write(\"print(2)\\n\")\n",
    "\n",
    "# Collect the command's output\n",
    "cmd_out = cmd.communicate() # returns a tuple: (stdout, stderr)\n",
    "\n",
    "print(cmd_out)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 131,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('hello\\n123\\n', '')\n"
     ]
    }
   ],
   "source": [
    "# Start python3 as a child process\n",
    "cmd = subprocess.Popen(\"python3\", stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)\n",
    "\n",
    "# communicate() sends the input, waits for the process to exit, and returns its output\n",
    "cmd_out = cmd.communicate(\"print('hello')\\nprint(123)\") # returns a tuple: (stdout, stderr)\n",
    "\n",
    "print(cmd_out)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### shutil\n",
    "\n",
    "A module for high-level operations on files, directories, and archives."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 133,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import shutil"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**shutil.copyfileobj(fsrc, fdst[, length])**\n",
    "\n",
    "Copy the contents of one file object to another file object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 155,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-rw-r--r--  1 lianliang  staff  23  9  7 16:19 new_open.file\n",
      "-rw-r--r--  1 lianliang  staff  23  9  5 00:08 open.file\n",
      "\n"
     ]
    }
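   ],
   "source": [
    "# Use a with block so both file objects are flushed and closed before ls runs\n",
    "with open('open.file', 'r', encoding='utf-8') as fsrc, open('new_open.file', 'w', encoding='utf-8') as fdst:\n",
    "    shutil.copyfileobj(fsrc, fdst)\n",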
    "out = subprocess.check_output('ls -l open.file new_open.file', shell=True)\n",
    "print(str(out, encoding='utf-8'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**shutil.copyfile(src, dst)**\n",
    "\n",
    "Copy a file's contents, without its metadata, to the destination path."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 156,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'new_open.file'"
      ]
     },
     "execution_count": 156,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-rw-r--r--  1 lianliang  staff  23  9  7 16:20 new_open.file\n",
      "-rw-r--r--  1 lianliang  staff  23  9  5 00:08 open.file\n",
      "\n"
     ]
    }
   ],
   "source": [
    "shutil.copyfile('open.file', 'new_open.file')\n",
    "out = subprocess.check_output('ls -l open.file new_open.file', shell=True)\n",
    "print(str(out, encoding='utf-8'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**shutil.copy(src, dst)**\n",
    "\n",
    "Copy a file's contents and its permission bits."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 157,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'new_open.file'"
      ]
     },
     "execution_count": 157,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-rw-r--r--  1 lianliang  staff  23  9  7 16:20 new_open.file\n",
      "-rw-r--r--  1 lianliang  staff  23  9  5 00:08 open.file\n",
      "\n"
     ]
    }
   ],
   "source": [
    "shutil.copy('open.file', 'new_open.file')\n",
    "out = subprocess.check_output('ls -l open.file new_open.file', shell=True)\n",
    "print(str(out, encoding='utf-8'))"
   ]
  },
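  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**shutil.copymode(src, dst)**\n",
    "\n",
    "Copy only the permission bits. The file itself is not copied, so an exception is raised if the destination does not exist. Contents, group, and owner are unchanged."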
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 158,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-rw-r--r--  1 lianliang  staff  23  9  7 16:20 new_open.file\n",
      "-rw-r--r--  1 lianliang  staff  23  9  5 00:08 open.file\n",
      "\n"
     ]
    }
   ],
   "source": [
    "shutil.copymode('open.file', 'new_open.file')\n",
    "out = subprocess.check_output('ls -l open.file new_open.file', shell=True)\n",
    "print(str(out, encoding='utf-8'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**shutil.copystat(src, dst)**\n",
    "\n",
    "Copy only the status information: mode bits, atime, mtime, and flags. The file itself is not copied, so an exception is raised if the destination does not exist."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 160,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-rw-r--r--  1 lianliang  staff  23  9  5 00:08 new_open.file\n",
      "-rw-r--r--  1 lianliang  staff  23  9  5 00:08 open.file\n",
      "\n"
     ]
    }
   ],
   "source": [
    "shutil.copystat('open.file', 'new_open.file')\n",
    "out = subprocess.check_output('ls -l open.file new_open.file', shell=True)\n",
    "print(str(out, encoding='utf-8'))"
   ]
  },
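  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "If the effect of some of the calls above is hard to spot: after **shutil.copystat** runs, even the one remaining difference between the two files, the modification time shown by `ls -l`, is the same."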
   ]
  },
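  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Because the two files above start out with the same permissions, the effect of `copymode` is invisible in the listings. The sketch below makes it observable, assuming a POSIX system: it uses throwaway files in a temporary directory (the names `src.txt` and `dst.txt` are illustrative), gives them different permission bits and timestamps, and then applies `copymode` and `copystat`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import shutil\n",
    "import stat\n",
    "import tempfile\n",
    "\n",
    "with tempfile.TemporaryDirectory() as tmp:\n",
    "    src = os.path.join(tmp, 'src.txt')   # illustrative file names\n",
    "    dst = os.path.join(tmp, 'dst.txt')\n",
    "    for path in (src, dst):\n",
    "        with open(path, 'w', encoding='utf-8') as f:\n",
    "            f.write('demo')\n",
    "\n",
    "    os.chmod(src, 0o600)       # source: owner read/write only\n",
    "    os.chmod(dst, 0o644)       # destination: different permission bits\n",
    "    os.utime(src, (0, 0))      # set the source's atime and mtime to the epoch\n",
    "\n",
    "    # copymode copies only the permission bits\n",
    "    shutil.copymode(src, dst)\n",
    "    print(oct(stat.S_IMODE(os.stat(dst).st_mode)))\n",
    "\n",
    "    # copystat also copies the timestamps\n",
    "    shutil.copystat(src, dst)\n",
    "    print(os.stat(dst).st_mtime == os.stat(src).st_mtime)"
   ]
  },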
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "**shutil.copy2(src, dst)**\n",
    "\n",
    "Copy a file's contents together with its status information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 161,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'new_open.file'"
      ]
     },
     "execution_count": 161,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-rw-r--r--  1 lianliang  staff  23  9  5 00:08 new_open.file\n",
      "-rw-r--r--  1 lianliang  staff  23  9  5 00:08 open.file\n",
      "\n"
     ]
    }
   ],
   "source": [
    "shutil.copy2('open.file', 'new_open.file')\n",
    "out = subprocess.check_output('ls -l open.file new_open.file', shell=True)\n",
    "print(str(out, encoding='utf-8'))"
   ]
  },
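  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**shutil.ignore_patterns(*patterns)**\n",
    "\n",
    "**shutil.copytree(src, dst, symlinks=False, ignore=None)**\n",
    "\n",
    "Recursively copy a directory and everything in it. `ignore_patterns` builds the callable passed as `ignore` so that matching names are skipped."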
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 195,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'/Users/lianliang/Desktop/python_notebook_bak'"
      ]
     },
     "execution_count": 195,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "notebook_ignore = shutil.ignore_patterns('*.pyc', '.DS_Store')\n",
    "shutil.copytree('/Users/lianliang/Desktop/python_notebook', '/Users/lianliang/Desktop/python_notebook_bak', ignore=notebook_ignore)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**shutil.move(src, dst)**\n",
    "\n",
    "Move a file or directory, similar to the Linux `mv` command."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 165,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'/Users/lianliang/Desktop/python_notebook_111'"
      ]
     },
     "execution_count": 165,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "shutil.move('/Users/lianliang/Desktop/python_notebook_bak', '/Users/lianliang/Desktop/python_notebook_111')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**shutil.rmtree(path, ignore_errors=False, onerror=None)**\n",
    "\n",
    "Delete a directory tree."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 173,
   "metadata": {
    "collapsed": true,
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "shutil.rmtree('/Users/lianliang/Desktop/python_notebook_111')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**shutil.make_archive(base_name, format, root_dir=None, base_dir=None, verbose=0, dry_run=0, owner=None, group=None, logger=None)**\n",
    "\n",
    "Create an archive and return its path."
   ]
  },
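  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "- base_name: the archive file name, optionally with a path, without the format extension. A bare name is saved in the current directory; otherwise the archive is saved at the given path.\n",
    "- format: the archive format, one of `zip`, `tar`, `bztar`, `gztar`\n",
    "- root_dir: the directory to archive (defaults to the current directory)\n",
    "- owner: the owner, defaults to the current user\n",
    "- group: the group, defaults to the current group\n",
    "- logger: used for logging, usually a logging.Logger object"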
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 196,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'/Users/lianliang/Desktop/notebook.tar.gz'"
      ]
     },
     "execution_count": 196,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Create a gzip-compressed tar archive: archive name, format, directory to archive\n",
    "shutil.make_archive('/Users/lianliang/Desktop/notebook', 'gztar', '/Users/lianliang/Desktop/python_notebook_bak')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 214,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'/Users/lianliang/Desktop/notebook.zip'"
      ]
     },
     "execution_count": 214,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Create a zip archive: archive name, format, directory to archive\n",
    "shutil.make_archive('/Users/lianliang/Desktop/notebook', 'zip', '/Users/lianliang/Desktop/python_notebook_bak')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "shutil handles archives by calling the **zipfile** and **tarfile** modules. In detail:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 190,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import tarfile"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Extracting an archive:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
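   },
   "outputs": [],
   "source": [
    "# Open with mode 'r' to extract; mode 'w' would truncate an existing archive\n",
    "tar = tarfile.open('/Users/lianliang/Desktop/notebook.tar.gz', 'r')\n",
    "# List the members of the archive\n",
    "tar.getnames()\n",
    "# The member name must be given in full, exactly as it appears in the archive\n",
    "tar.extract('./open.file', '/Users/lianliang/Desktop/')\n",
    "# Extract everything into the given directory, which is created if it does not exist; the default is the current directory\n",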
    "tar.extractall('/Users/lianliang/Desktop/aaa/')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Creating an archive:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
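   },
   "outputs": [],
   "source": [
    "# Open in write mode, which truncates an existing file; 'w:gz' writes gzip-compressed output to match the .tar.gz name (plain 'w' writes an uncompressed tar)\n",
    "tar = tarfile.open('/Users/lianliang/Desktop/notebook.tar.gz', 'w:gz')\n",
    "# Add a file to the archive; arcname sets the name it is stored under\n",
    "tar.add('open.file', arcname='open_file.txt')\n",
    "# Finally, close the archive\n",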
    "tar.close()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 213,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "import zipfile"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Extracting an archive:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 215,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['./',\n",
       " '.git/',\n",
       " '.ipynb_checkpoints/',\n",
       " '.gitignore',\n",
       " 'deepcopy.png',\n",
       " 'filter-function.png',\n",
       " 'inside-function.png',\n",
       " 'lightcopy.png',\n",
       " 'map-function.png',\n",
       " 'new_open.file',\n",
       " 'open.file',\n",
       " 'Python 内置函数.ipynb',\n",
       " 'Python 内置函数.slides.html',\n",
       " 'Python 常用模块.ipynb',\n",
       " 'Python 数据类型.ipynb',\n",
       " 'Python 数据类型.slides.html',\n",
       " 'Python 文件操作.ipynb',\n",
       " 'Python 文件操作.slides.html',\n",
       " 'Python 流程控制.ipynb',\n",
       " 'Python 流程控制.slides.html',\n",
       " 'Python 输入输出.ipynb',\n",
       " 'Python 输入输出.slides.html',\n",
       " 'README.md',\n",
       " 'sys_path.gif',\n",
       " 'value1.png',\n",
       " 'value2.png',\n",
       " '位运算.png',\n",
       " '初识 Python.ipynb',\n",
       " '初识 Python.slides.html',\n",
       " '成员运算.png',\n",
       " '数学运算.png',\n",
       " '比较运算.png',\n",
       " '赋值运算.png',\n",
       " '身份运算.png',\n",
       " '运算优先级.png',\n",
       " '逻辑运算.png',\n",
       " '.git/hooks/',\n",
       " '.git/info/',\n",
       " '.git/logs/',\n",
       " '.git/objects/',\n",
       " '.git/refs/',\n",
       " '.git/COMMIT_EDITMSG',\n",
       " '.git/config',\n",
       " '.git/description',\n",
       " '.git/FETCH_HEAD',\n",
       " '.git/HEAD',\n",
       " '.git/index',\n",
       " '.git/hooks/applypatch-msg.sample',\n",
       " '.git/hooks/commit-msg.sample',\n",
       " '.git/hooks/post-update.sample',\n",
       " '.git/hooks/pre-applypatch.sample',\n",
       " '.git/hooks/pre-commit.sample',\n",
       " '.git/hooks/pre-push.sample',\n",
       " '.git/hooks/pre-rebase.sample',\n",
       " '.git/hooks/pre-receive.sample',\n",
       " '.git/hooks/prepare-commit-msg.sample',\n",
       " '.git/hooks/update.sample',\n",
       " '.git/info/exclude',\n",
       " '.git/logs/refs/',\n",
       " '.git/logs/HEAD',\n",
       " '.git/logs/refs/heads/',\n",
       " '.git/logs/refs/remotes/',\n",
       " '.git/logs/refs/heads/master',\n",
       " '.git/logs/refs/remotes/origin/',\n",
       " '.git/logs/refs/remotes/origin/master',\n",
       " '.git/objects/01/',\n",
       " '.git/objects/02/',\n",
       " '.git/objects/06/',\n",
       " '.git/objects/0a/',\n",
       " '.git/objects/11/',\n",
       " '.git/objects/12/',\n",
       " '.git/objects/15/',\n",
       " '.git/objects/19/',\n",
       " '.git/objects/1a/',\n",
       " '.git/objects/1b/',\n",
       " '.git/objects/1c/',\n",
       " '.git/objects/2a/',\n",
       " '.git/objects/2d/',\n",
       " '.git/objects/30/',\n",
       " '.git/objects/32/',\n",
       " '.git/objects/3f/',\n",
       " '.git/objects/43/',\n",
       " '.git/objects/44/',\n",
       " '.git/objects/4f/',\n",
       " '.git/objects/57/',\n",
       " '.git/objects/59/',\n",
       " '.git/objects/5a/',\n",
       " '.git/objects/5f/',\n",
       " '.git/objects/60/',\n",
       " '.git/objects/69/',\n",
       " '.git/objects/6d/',\n",
       " '.git/objects/6f/',\n",
       " '.git/objects/70/',\n",
       " '.git/objects/74/',\n",
       " '.git/objects/76/',\n",
       " '.git/objects/78/',\n",
       " '.git/objects/81/',\n",
       " '.git/objects/82/',\n",
       " '.git/objects/86/',\n",
       " '.git/objects/87/',\n",
       " '.git/objects/96/',\n",
       " '.git/objects/9d/',\n",
       " '.git/objects/a1/',\n",
       " '.git/objects/a3/',\n",
       " '.git/objects/a4/',\n",
       " '.git/objects/a7/',\n",
       " '.git/objects/a8/',\n",
       " '.git/objects/ab/',\n",
       " '.git/objects/ae/',\n",
       " '.git/objects/b2/',\n",
       " '.git/objects/b3/',\n",
       " '.git/objects/b9/',\n",
       " '.git/objects/ba/',\n",
       " '.git/objects/bf/',\n",
       " '.git/objects/c8/',\n",
       " '.git/objects/cb/',\n",
       " '.git/objects/d7/',\n",
       " '.git/objects/e0/',\n",
       " '.git/objects/e1/',\n",
       " '.git/objects/e4/',\n",
       " '.git/objects/e7/',\n",
       " '.git/objects/e8/',\n",
       " '.git/objects/ed/',\n",
       " '.git/objects/f2/',\n",
       " '.git/objects/f4/',\n",
       " '.git/objects/fa/',\n",
       " '.git/objects/fb/',\n",
       " '.git/objects/fc/',\n",
       " '.git/objects/info/',\n",
       " '.git/objects/pack/',\n",
       " '.git/objects/01/741f56bb1c310781be990d962d845a4bc07846',\n",
       " '.git/objects/02/772643c118ee3a7d1d1fcc32dedbadb49ba5d5',\n",
       " '.git/objects/06/b5e2771fac9117ae70a722316aeeb92fc94245',\n",
       " '.git/objects/0a/46242217d0202acd7ea7cb21feab13ed3abfa7',\n",
       " '.git/objects/11/8ef9e7bd3e3a2f5d2ed68b1dd39b774dfdf736',\n",
       " '.git/objects/12/b0a4325787e349b44c9aea7dfaf874f44e914e',\n",
       " '.git/objects/15/5904933a61cf23552cfb8013135f2b9caa3537',\n",
       " '.git/objects/19/79c34071c7730f2b726bd70d75112950032278',\n",
       " '.git/objects/1a/b59d86bcdec7dd6d7f9828f6d643c3f471b751',\n",
       " '.git/objects/1b/40f8b0c5194594123db439ea25afef347179cc',\n",
       " '.git/objects/1c/9e94dfe05556ab3b3de5836e0f023039d66771',\n",
       " '.git/objects/1c/b2bc1d73a17da33634f211cdf5a19055b45d2f',\n",
       " '.git/objects/2a/9c7011b792be51363c52cd298f40608290052f',\n",
       " '.git/objects/2d/c7fd0a270eae0145f03cccd01e771bdad8012b',\n",
       " '.git/objects/30/1f4e1974bf4590e40eda6cf6cd10224a7a662d',\n",
       " '.git/objects/32/332e62a0bc70f00e7e1c075f9464995e2e1e79',\n",
       " '.git/objects/3f/c1ffbbbbbf20df0cf0acba4c6c559012fb997c',\n",
       " '.git/objects/43/f1c687dc36e09b22473ad1e1b89e5abf77d1de',\n",
       " '.git/objects/44/2444cf886de4a70cd3937a22eb4a41838d253e',\n",
       " '.git/objects/4f/5b7bdf31733dabc4b39f974787c22421b43059',\n",
       " '.git/objects/57/ebeeb901544ccee542272fc695a1168070b96f',\n",
       " '.git/objects/59/f0728e17693c3046c7e6a36588e64c5f34e681',\n",
       " '.git/objects/5a/fd42d5475f0677ac6c8616143ade904cb8dca0',\n",
       " '.git/objects/5f/d6d4ac3f901494402973c5b92e648bdf733408',\n",
       " '.git/objects/60/c9481ec2eed2cafc234855c9c056758717c73a',\n",
       " '.git/objects/69/d4567ae8a431537ae51db7c65c776f25cb03dc',\n",
       " '.git/objects/6d/0dd2b198dc4ddd1c47becbe11eb1de3cb1c81b',\n",
       " '.git/objects/6d/ad413fa17c2a1e0be1c121b47dd94e0b0ef759',\n",
       " '.git/objects/6f/bc46d5daaad5c6c7bc83ebf6eeba5275a255b6',\n",
       " '.git/objects/70/a4f4a2726d5fa1d1319ebdc4ed4e7bf6a57120',\n",
       " '.git/objects/74/bbf6519cc448367cc0d3a16c9bf8306ba19248',\n",
       " '.git/objects/76/b98ef8098aac1973b116c0ec510900af5776f8',\n",
       " '.git/objects/78/19afb7187feb43730618527f5e1cef73694cbd',\n",
       " '.git/objects/81/d72747d458c649d8055308c63610ea8d2fa0e4',\n",
       " '.git/objects/82/64166f32930b6a25816cd9e7a4615dc8e2c423',\n",
       " '.git/objects/86/c572ae813f39e0bc36c8624bf4947758b3b03e',\n",
       " '.git/objects/87/31eb2449ca0a88c5c440f82120cb81775683ba',\n",
       " '.git/objects/96/9e51c5c51f5d1a367a58a98c596ffdc658c9ee',\n",
       " '.git/objects/96/c0ddf434d1ef65a33c9780e3c16e7066b0a8ff',\n",
       " '.git/objects/9d/ff8824420a41d65189eaf874a916364a26de92',\n",
       " '.git/objects/a1/4754c8556346239d5939331b28414238b532e2',\n",
       " '.git/objects/a3/a420bcc4b6a676db3d6942e733e517fd9c41cd',\n",
       " '.git/objects/a4/1909e5372b3072aaa04150676cacf30e14931b',\n",
       " '.git/objects/a7/7b9d1c3e16922c7082aa1d1f0de86227fa37ae',\n",
       " '.git/objects/a7/d3a128a788620b05a25ef08f6ff929dcaed2b8',\n",
       " '.git/objects/a8/8dd9b612d863cd4dfdc92f0df072654dad764e',\n",
       " '.git/objects/ab/31b5edd6a064dad6766d1350a020c2c23795a6',\n",
       " '.git/objects/ae/c776c01ebbfe5976282b56580640957f55d282',\n",
       " '.git/objects/b2/79da3c85c115fd78a68b58e21fc986c3f22b80',\n",
       " '.git/objects/b2/8982bcaa00f352ef4eedf32868cd9495b0084e',\n",
       " '.git/objects/b3/204256abe92d2261707be56f1300127c460662',\n",
       " '.git/objects/b9/0f5d54034ccfe32129b481de5da3845eeae762',\n",
       " '.git/objects/ba/502d573dab7d51505a8c80a1ea584792d6b567',\n",
       " '.git/objects/bf/5ba2190a0df732eda816d2de841a0a24a170fd',\n",
       " '.git/objects/c8/31b2723c6113790539cf352eb0eb06c01551f0',\n",
       " '.git/objects/cb/19df22366a741c4cd5a887caf90a06adba7358',\n",
       " '.git/objects/d7/c10da266ae300cfc566c49df55016a153d9188',\n",
       " '.git/objects/e0/c3141e23171afa2bcf39f47029513253248890',\n",
       " '.git/objects/e1/369b67dacbb437ba27a519005a329a8e7e8c31',\n",
       " '.git/objects/e4/26b2a948da3a0059ca5246102964feaa7862a9',\n",
       " '.git/objects/e7/0ab90dbdc58f9263507b624a58c6a8d8efad23',\n",
       " '.git/objects/e7/ff59934a22269d5b261dec9555e735d3fe3e27',\n",
       " '.git/objects/e8/afb26430aa2859631cfbbe2fbf614b53ff68ac',\n",
       " '.git/objects/ed/13c57c5aaed05947241c8e0b9c8a51b479ed13',\n",
       " '.git/objects/f2/2eaffe400dbc2bc1014f27519c14d5fec5e55c',\n",
       " '.git/objects/f4/67765015f63039a4e244ef56d3510d6dc3221a',\n",
       " '.git/objects/fa/fb15defbb22d9b23e5384c431f9db1da73268a',\n",
       " '.git/objects/fb/949f40f1edd9b59cd0380653d2cbcfc53166bd',\n",
       " '.git/objects/fb/de4d6b8faed378031b69d5a2e1cd782ab3ed44',\n",
       " '.git/objects/fc/695251eccc8172ce6f50d840cd4d82767c3591',\n",
       " '.git/refs/heads/',\n",
       " '.git/refs/remotes/',\n",
       " '.git/refs/tags/',\n",
       " '.git/refs/heads/master',\n",
       " '.git/refs/remotes/origin/',\n",
       " '.git/refs/remotes/origin/master',\n",
       " '.ipynb_checkpoints/Python 内置函数-checkpoint.ipynb',\n",
       " '.ipynb_checkpoints/Python 常用模块-checkpoint.ipynb',\n",
       " '.ipynb_checkpoints/Python 数据类型-checkpoint.ipynb',\n",
       " '.ipynb_checkpoints/Python 文件操作-checkpoint.ipynb',\n",
       " '.ipynb_checkpoints/Python 流程控制-checkpoint.ipynb',\n",
       " '.ipynb_checkpoints/初识 Python-checkpoint.ipynb']"
      ]
     },
     "execution_count": 215,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "'/Users/lianliang/Desktop/open.file'"
      ]
     },
     "execution_count": 215,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "b'9999\\n1\\nthis is new word'"
      ]
     },
     "execution_count": 215,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
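    "# To extract from an archive, open it in 'r' mode; 'w' mode would truncate an existing archive\n",
    "zfile = zipfile.ZipFile('/Users/lianliang/Desktop/notebook.zip', 'r')\n",
    "# List the members stored in the archive\n",
    "zfile.namelist()\n",
    "# Extract one member; its name must be given exactly as stored in the archive (slightly different from the tar example above)\n",
    "zfile.extract('open.file', '/Users/lianliang/Desktop/')\n",
    "# Read a member's bytes without extracting it to disk\n",
    "zfile.read('open.file')\n",
    "# Extract everything into the given directory, which is created if missing; without a path, the current directory is used\n",
    "zfile.extractall('/Users/lianliang/Desktop/aaa/')"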
   ]
  },
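  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The reading steps above can also be written with a `with` block, so the archive is closed automatically. This is a minimal sketch rather than the notebook's original code: it builds a throwaway archive in an `io.BytesIO` buffer (an assumption, so the example runs without the Desktop paths used above) and reuses the member name `open.file` purely for illustration.\n",
    "\n",
    "```python\n",
    "import io\n",
    "import zipfile\n",
    "\n",
    "# Build a small archive in memory so the example does not touch the disk.\n",
    "buffer = io.BytesIO()\n",
    "with zipfile.ZipFile(buffer, 'w') as zf:\n",
    "    zf.writestr('open.file', 'this is new word')\n",
    "\n",
    "# Re-open for reading; the with block closes the archive on exit.\n",
    "with zipfile.ZipFile(buffer, 'r') as zf:\n",
    "    names = zf.namelist()          # member names stored in the archive\n",
    "    data = zf.read('open.file')    # bytes, read without extracting to disk\n",
    "\n",
    "print(names)\n",
    "print(data.decode('utf-8'))\n",
    "```\n",
    "\n",
    "`read()` returns `bytes`, so decode it when text is needed."
   ]
  },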
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Creating an archive:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 220,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<ZipInfo filename='open_file.txt' filemode='-rw-r--r--' file_size=23>"
      ]
     },
     "execution_count": 220,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
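    "# Open the archive in 'w' mode to create it; if the file already exists, its contents are discarded\n",
    "zfile = zipfile.ZipFile('/Users/lianliang/Desktop/notebook.zip', 'w')\n",
    "# Add a file to the archive; arcname sets the name it gets inside the archive\n",
    "zfile.write('open.file', arcname='open_file.txt')\n",
    "# writestr() adds a member directly from a str or bytes value;\n",
    "# reusing an existing member name, as here, creates a duplicate entry and emits a warning\n",
    "zfile.writestr('open_file.txt', 'aaa')\n",
    "# Close the archive when done so the ending records are written\n",
    "zfile.close()"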
   ]
  },
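  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "`ZipFile` defaults to `ZIP_STORED`, which stores members without compressing them. As a hedged sketch (the in-memory buffer and the repetitive payload are assumptions for illustration only), passing `compression=zipfile.ZIP_DEFLATED` compresses each member, and a `with` block takes the place of the explicit `close()` call:\n",
    "\n",
    "```python\n",
    "import io\n",
    "import zipfile\n",
    "\n",
    "payload = 'aaa' * 1000   # highly repetitive text, so it compresses well\n",
    "\n",
    "buffer = io.BytesIO()\n",
    "# ZIP_DEFLATED needs the zlib module; the default ZIP_STORED does not compress.\n",
    "with zipfile.ZipFile(buffer, 'w', compression=zipfile.ZIP_DEFLATED) as zf:\n",
    "    zf.writestr('open_file.txt', payload)\n",
    "    info = zf.getinfo('open_file.txt')\n",
    "\n",
    "# Original size versus size inside the archive.\n",
    "print(info.file_size, info.compress_size)\n",
    "```\n",
    "\n",
    "Comparing `file_size` with `compress_size` on the `ZipInfo` object is a quick way to check that compression was actually applied."
   ]
  },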
  {
   "cell_type": "code",
   "execution_count": 193,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Help on module zipfile:\n",
      "\n",
      "NAME\n",
      "    zipfile - Read and write ZIP files.\n",
      "\n",
      "MODULE REFERENCE\n",
      "    http://docs.python.org/3.5/library/zipfile\n",
      "    \n",
      "    The following documentation is automatically generated from the Python\n",
      "    source files.  It may be incomplete, incorrect or include features that\n",
      "    are considered implementation detail and may vary between Python\n",
      "    implementations.  When in doubt, consult the module reference at the\n",
      "    location listed above.\n",
      "\n",
      "DESCRIPTION\n",
      "    XXX references to utf-8 need further investigation.\n",
      "\n",
      "CLASSES\n",
      "    builtins.Exception(builtins.BaseException)\n",
      "        BadZipFile\n",
      "        LargeZipFile\n",
      "    builtins.object\n",
      "        ZipFile\n",
      "            PyZipFile\n",
      "        ZipInfo\n",
      "    \n",
      "    class BadZipFile(builtins.Exception)\n",
      "     |  Common base class for all non-exit exceptions.\n",
      "     |  \n",
      "     |  Method resolution order:\n",
      "     |      BadZipFile\n",
      "     |      builtins.Exception\n",
      "     |      builtins.BaseException\n",
      "     |      builtins.object\n",
      "     |  \n",
      "     |  Data descriptors defined here:\n",
      "     |  \n",
      "     |  __weakref__\n",
      "     |      list of weak references to the object (if defined)\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Methods inherited from builtins.Exception:\n",
      "     |  \n",
      "     |  __init__(self, /, *args, **kwargs)\n",
      "     |      Initialize self.  See help(type(self)) for accurate signature.\n",
      "     |  \n",
      "     |  __new__(*args, **kwargs) from builtins.type\n",
      "     |      Create and return a new object.  See help(type) for accurate signature.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Methods inherited from builtins.BaseException:\n",
      "     |  \n",
      "     |  __delattr__(self, name, /)\n",
      "     |      Implement delattr(self, name).\n",
      "     |  \n",
      "     |  __getattribute__(self, name, /)\n",
      "     |      Return getattr(self, name).\n",
      "     |  \n",
      "     |  __reduce__(...)\n",
      "     |      helper for pickle\n",
      "     |  \n",
      "     |  __repr__(self, /)\n",
      "     |      Return repr(self).\n",
      "     |  \n",
      "     |  __setattr__(self, name, value, /)\n",
      "     |      Implement setattr(self, name, value).\n",
      "     |  \n",
      "     |  __setstate__(...)\n",
      "     |  \n",
      "     |  __str__(self, /)\n",
      "     |      Return str(self).\n",
      "     |  \n",
      "     |  with_traceback(...)\n",
      "     |      Exception.with_traceback(tb) --\n",
      "     |      set self.__traceback__ to tb and return self.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Data descriptors inherited from builtins.BaseException:\n",
      "     |  \n",
      "     |  __cause__\n",
      "     |      exception cause\n",
      "     |  \n",
      "     |  __context__\n",
      "     |      exception context\n",
      "     |  \n",
      "     |  __dict__\n",
      "     |  \n",
      "     |  __suppress_context__\n",
      "     |  \n",
      "     |  __traceback__\n",
      "     |  \n",
      "     |  args\n",
      "    \n",
      "    BadZipfile = class BadZipFile(builtins.Exception)\n",
      "     |  Common base class for all non-exit exceptions.\n",
      "     |  \n",
      "     |  Method resolution order:\n",
      "     |      BadZipFile\n",
      "     |      builtins.Exception\n",
      "     |      builtins.BaseException\n",
      "     |      builtins.object\n",
      "     |  \n",
      "     |  Data descriptors defined here:\n",
      "     |  \n",
      "     |  __weakref__\n",
      "     |      list of weak references to the object (if defined)\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Methods inherited from builtins.Exception:\n",
      "     |  \n",
      "     |  __init__(self, /, *args, **kwargs)\n",
      "     |      Initialize self.  See help(type(self)) for accurate signature.\n",
      "     |  \n",
      "     |  __new__(*args, **kwargs) from builtins.type\n",
      "     |      Create and return a new object.  See help(type) for accurate signature.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Methods inherited from builtins.BaseException:\n",
      "     |  \n",
      "     |  __delattr__(self, name, /)\n",
      "     |      Implement delattr(self, name).\n",
      "     |  \n",
      "     |  __getattribute__(self, name, /)\n",
      "     |      Return getattr(self, name).\n",
      "     |  \n",
      "     |  __reduce__(...)\n",
      "     |      helper for pickle\n",
      "     |  \n",
      "     |  __repr__(self, /)\n",
      "     |      Return repr(self).\n",
      "     |  \n",
      "     |  __setattr__(self, name, value, /)\n",
      "     |      Implement setattr(self, name, value).\n",
      "     |  \n",
      "     |  __setstate__(...)\n",
      "     |  \n",
      "     |  __str__(self, /)\n",
      "     |      Return str(self).\n",
      "     |  \n",
      "     |  with_traceback(...)\n",
      "     |      Exception.with_traceback(tb) --\n",
      "     |      set self.__traceback__ to tb and return self.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Data descriptors inherited from builtins.BaseException:\n",
      "     |  \n",
      "     |  __cause__\n",
      "     |      exception cause\n",
      "     |  \n",
      "     |  __context__\n",
      "     |      exception context\n",
      "     |  \n",
      "     |  __dict__\n",
      "     |  \n",
      "     |  __suppress_context__\n",
      "     |  \n",
      "     |  __traceback__\n",
      "     |  \n",
      "     |  args\n",
      "    \n",
      "    class LargeZipFile(builtins.Exception)\n",
      "     |  Raised when writing a zipfile, the zipfile requires ZIP64 extensions\n",
      "     |  and those extensions are disabled.\n",
      "     |  \n",
      "     |  Method resolution order:\n",
      "     |      LargeZipFile\n",
      "     |      builtins.Exception\n",
      "     |      builtins.BaseException\n",
      "     |      builtins.object\n",
      "     |  \n",
      "     |  Data descriptors defined here:\n",
      "     |  \n",
      "     |  __weakref__\n",
      "     |      list of weak references to the object (if defined)\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Methods inherited from builtins.Exception:\n",
      "     |  \n",
      "     |  __init__(self, /, *args, **kwargs)\n",
      "     |      Initialize self.  See help(type(self)) for accurate signature.\n",
      "     |  \n",
      "     |  __new__(*args, **kwargs) from builtins.type\n",
      "     |      Create and return a new object.  See help(type) for accurate signature.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Methods inherited from builtins.BaseException:\n",
      "     |  \n",
      "     |  __delattr__(self, name, /)\n",
      "     |      Implement delattr(self, name).\n",
      "     |  \n",
      "     |  __getattribute__(self, name, /)\n",
      "     |      Return getattr(self, name).\n",
      "     |  \n",
      "     |  __reduce__(...)\n",
      "     |      helper for pickle\n",
      "     |  \n",
      "     |  __repr__(self, /)\n",
      "     |      Return repr(self).\n",
      "     |  \n",
      "     |  __setattr__(self, name, value, /)\n",
      "     |      Implement setattr(self, name, value).\n",
      "     |  \n",
      "     |  __setstate__(...)\n",
      "     |  \n",
      "     |  __str__(self, /)\n",
      "     |      Return str(self).\n",
      "     |  \n",
      "     |  with_traceback(...)\n",
      "     |      Exception.with_traceback(tb) --\n",
      "     |      set self.__traceback__ to tb and return self.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Data descriptors inherited from builtins.BaseException:\n",
      "     |  \n",
      "     |  __cause__\n",
      "     |      exception cause\n",
      "     |  \n",
      "     |  __context__\n",
      "     |      exception context\n",
      "     |  \n",
      "     |  __dict__\n",
      "     |  \n",
      "     |  __suppress_context__\n",
      "     |  \n",
      "     |  __traceback__\n",
      "     |  \n",
      "     |  args\n",
      "    \n",
      "    class PyZipFile(ZipFile)\n",
      "     |  Class to create ZIP archives with Python library files and packages.\n",
      "     |  \n",
      "     |  Method resolution order:\n",
      "     |      PyZipFile\n",
      "     |      ZipFile\n",
      "     |      builtins.object\n",
      "     |  \n",
      "     |  Methods defined here:\n",
      "     |  \n",
      "     |  __init__(self, file, mode='r', compression=0, allowZip64=True, optimize=-1)\n",
      "     |      Open the ZIP file with mode read 'r', write 'w', exclusive create 'x',\n",
      "     |      or append 'a'.\n",
      "     |  \n",
      "     |  writepy(self, pathname, basename='', filterfunc=None)\n",
      "     |      Add all files from \"pathname\" to the ZIP archive.\n",
      "     |      \n",
      "     |      If pathname is a package directory, search the directory and\n",
      "     |      all package subdirectories recursively for all *.py and enter\n",
      "     |      the modules into the archive.  If pathname is a plain\n",
      "     |      directory, listdir *.py and enter all modules.  Else, pathname\n",
      "     |      must be a Python *.py file and the module will be put into the\n",
      "     |      archive.  Added modules are always module.pyc.\n",
      "     |      This method will compile the module.py into module.pyc if\n",
      "     |      necessary.\n",
      "     |      If filterfunc(pathname) is given, it is called with every argument.\n",
      "     |      When it is False, the file or directory is skipped.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Methods inherited from ZipFile:\n",
      "     |  \n",
      "     |  __del__(self)\n",
      "     |      Call the \"close()\" method in case the user forgot.\n",
      "     |  \n",
      "     |  __enter__(self)\n",
      "     |  \n",
      "     |  __exit__(self, type, value, traceback)\n",
      "     |  \n",
      "     |  __repr__(self)\n",
      "     |      Return repr(self).\n",
      "     |  \n",
      "     |  close(self)\n",
      "     |      Close the file, and for mode 'w', 'x' and 'a' write the ending\n",
      "     |      records.\n",
      "     |  \n",
      "     |  extract(self, member, path=None, pwd=None)\n",
      "     |      Extract a member from the archive to the current working directory,\n",
      "     |      using its full name. Its file information is extracted as accurately\n",
      "     |      as possible. `member' may be a filename or a ZipInfo object. You can\n",
      "     |      specify a different directory using `path'.\n",
      "     |  \n",
      "     |  extractall(self, path=None, members=None, pwd=None)\n",
      "     |      Extract all members from the archive to the current working\n",
      "     |      directory. `path' specifies a different directory to extract to.\n",
      "     |      `members' is optional and must be a subset of the list returned\n",
      "     |      by namelist().\n",
      "     |  \n",
      "     |  getinfo(self, name)\n",
      "     |      Return the instance of ZipInfo given 'name'.\n",
      "     |  \n",
      "     |  infolist(self)\n",
      "     |      Return a list of class ZipInfo instances for files in the\n",
      "     |      archive.\n",
      "     |  \n",
      "     |  namelist(self)\n",
      "     |      Return a list of file names in the archive.\n",
      "     |  \n",
      "     |  open(self, name, mode='r', pwd=None)\n",
      "     |      Return file-like object for 'name'.\n",
      "     |  \n",
      "     |  printdir(self, file=None)\n",
      "     |      Print a table of contents for the zip file.\n",
      "     |  \n",
      "     |  read(self, name, pwd=None)\n",
      "     |      Return file bytes (as a string) for name.\n",
      "     |  \n",
      "     |  setpassword(self, pwd)\n",
      "     |      Set default password for encrypted files.\n",
      "     |  \n",
      "     |  testzip(self)\n",
      "     |      Read all the files and check the CRC.\n",
      "     |  \n",
      "     |  write(self, filename, arcname=None, compress_type=None)\n",
      "     |      Put the bytes from filename into the archive under the name\n",
      "     |      arcname.\n",
      "     |  \n",
      "     |  writestr(self, zinfo_or_arcname, data, compress_type=None)\n",
      "     |      Write a file into the archive.  The contents is 'data', which\n",
      "     |      may be either a 'str' or a 'bytes' instance; if it is a 'str',\n",
      "     |      it is encoded as UTF-8 first.\n",
      "     |      'zinfo_or_arcname' is either a ZipInfo instance or\n",
      "     |      the name of the file in the archive.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Data descriptors inherited from ZipFile:\n",
      "     |  \n",
      "     |  __dict__\n",
      "     |      dictionary for instance variables (if defined)\n",
      "     |  \n",
      "     |  __weakref__\n",
      "     |      list of weak references to the object (if defined)\n",
      "     |  \n",
      "     |  comment\n",
      "     |      The comment text associated with the ZIP file.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Data and other attributes inherited from ZipFile:\n",
      "     |  \n",
      "     |  fp = None\n",
      "    \n",
      "    class ZipFile(builtins.object)\n",
      "     |  Class with methods to open, read, write, close, list zip files.\n",
      "     |  \n",
      "     |  z = ZipFile(file, mode=\"r\", compression=ZIP_STORED, allowZip64=True)\n",
      "     |  \n",
      "     |  file: Either the path to the file, or a file-like object.\n",
      "     |        If it is a path, the file will be opened and closed by ZipFile.\n",
      "     |  mode: The mode can be either read 'r', write 'w', exclusive create 'x',\n",
      "     |        or append 'a'.\n",
      "     |  compression: ZIP_STORED (no compression), ZIP_DEFLATED (requires zlib),\n",
      "     |               ZIP_BZIP2 (requires bz2) or ZIP_LZMA (requires lzma).\n",
      "     |  allowZip64: if True ZipFile will create files with ZIP64 extensions when\n",
      "     |              needed, otherwise it will raise an exception when this would\n",
      "     |              be necessary.\n",
      "     |  \n",
      "     |  Methods defined here:\n",
      "     |  \n",
      "     |  __del__(self)\n",
      "     |      Call the \"close()\" method in case the user forgot.\n",
      "     |  \n",
      "     |  __enter__(self)\n",
      "     |  \n",
      "     |  __exit__(self, type, value, traceback)\n",
      "     |  \n",
      "     |  __init__(self, file, mode='r', compression=0, allowZip64=True)\n",
      "     |      Open the ZIP file with mode read 'r', write 'w', exclusive create 'x',\n",
      "     |      or append 'a'.\n",
      "     |  \n",
      "     |  __repr__(self)\n",
      "     |      Return repr(self).\n",
      "     |  \n",
      "     |  close(self)\n",
      "     |      Close the file, and for mode 'w', 'x' and 'a' write the ending\n",
      "     |      records.\n",
      "     |  \n",
      "     |  extract(self, member, path=None, pwd=None)\n",
      "     |      Extract a member from the archive to the current working directory,\n",
      "     |      using its full name. Its file information is extracted as accurately\n",
      "     |      as possible. `member' may be a filename or a ZipInfo object. You can\n",
      "     |      specify a different directory using `path'.\n",
      "     |  \n",
      "     |  extractall(self, path=None, members=None, pwd=None)\n",
      "     |      Extract all members from the archive to the current working\n",
      "     |      directory. `path' specifies a different directory to extract to.\n",
      "     |      `members' is optional and must be a subset of the list returned\n",
      "     |      by namelist().\n",
      "     |  \n",
      "     |  getinfo(self, name)\n",
      "     |      Return the instance of ZipInfo given 'name'.\n",
      "     |  \n",
      "     |  infolist(self)\n",
      "     |      Return a list of class ZipInfo instances for files in the\n",
      "     |      archive.\n",
      "     |  \n",
      "     |  namelist(self)\n",
      "     |      Return a list of file names in the archive.\n",
      "     |  \n",
      "     |  open(self, name, mode='r', pwd=None)\n",
      "     |      Return file-like object for 'name'.\n",
      "     |  \n",
      "     |  printdir(self, file=None)\n",
      "     |      Print a table of contents for the zip file.\n",
      "     |  \n",
      "     |  read(self, name, pwd=None)\n",
      "     |      Return file bytes (as a string) for name.\n",
      "     |  \n",
      "     |  setpassword(self, pwd)\n",
      "     |      Set default password for encrypted files.\n",
      "     |  \n",
      "     |  testzip(self)\n",
      "     |      Read all the files and check the CRC.\n",
      "     |  \n",
      "     |  write(self, filename, arcname=None, compress_type=None)\n",
      "     |      Put the bytes from filename into the archive under the name\n",
      "     |      arcname.\n",
      "     |  \n",
      "     |  writestr(self, zinfo_or_arcname, data, compress_type=None)\n",
      "     |      Write a file into the archive.  The contents is 'data', which\n",
      "     |      may be either a 'str' or a 'bytes' instance; if it is a 'str',\n",
      "     |      it is encoded as UTF-8 first.\n",
      "     |      'zinfo_or_arcname' is either a ZipInfo instance or\n",
      "     |      the name of the file in the archive.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Data descriptors defined here:\n",
      "     |  \n",
      "     |  __dict__\n",
      "     |      dictionary for instance variables (if defined)\n",
      "     |  \n",
      "     |  __weakref__\n",
      "     |      list of weak references to the object (if defined)\n",
      "     |  \n",
      "     |  comment\n",
      "     |      The comment text associated with the ZIP file.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Data and other attributes defined here:\n",
      "     |  \n",
      "     |  fp = None\n",
      "    \n",
      "    class ZipInfo(builtins.object)\n",
      "     |  Class with attributes describing each file in the ZIP archive.\n",
      "     |  \n",
      "     |  Methods defined here:\n",
      "     |  \n",
      "     |  FileHeader(self, zip64=None)\n",
      "     |      Return the per-file header as a string.\n",
      "     |  \n",
      "     |  __init__(self, filename='NoName', date_time=(1980, 1, 1, 0, 0, 0))\n",
      "     |      Initialize self.  See help(type(self)) for accurate signature.\n",
      "     |  \n",
      "     |  __repr__(self)\n",
      "     |      Return repr(self).\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Data descriptors defined here:\n",
      "     |  \n",
      "     |  CRC\n",
      "     |  \n",
      "     |  comment\n",
      "     |  \n",
      "     |  compress_size\n",
      "     |  \n",
      "     |  compress_type\n",
      "     |  \n",
      "     |  create_system\n",
      "     |  \n",
      "     |  create_version\n",
      "     |  \n",
      "     |  date_time\n",
      "     |  \n",
      "     |  external_attr\n",
      "     |  \n",
      "     |  extra\n",
      "     |  \n",
      "     |  extract_version\n",
      "     |  \n",
      "     |  file_size\n",
      "     |  \n",
      "     |  filename\n",
      "     |  \n",
      "     |  flag_bits\n",
      "     |  \n",
      "     |  header_offset\n",
      "     |  \n",
      "     |  internal_attr\n",
      "     |  \n",
      "     |  orig_filename\n",
      "     |  \n",
      "     |  reserved\n",
      "     |  \n",
      "     |  volume\n",
      "    \n",
      "    error = class BadZipFile(builtins.Exception)\n",
      "     |  Common base class for all non-exit exceptions.\n",
      "     |  \n",
      "     |  Method resolution order:\n",
      "     |      BadZipFile\n",
      "     |      builtins.Exception\n",
      "     |      builtins.BaseException\n",
      "     |      builtins.object\n",
      "     |  \n",
      "     |  Data descriptors defined here:\n",
      "     |  \n",
      "     |  __weakref__\n",
      "     |      list of weak references to the object (if defined)\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Methods inherited from builtins.Exception:\n",
      "     |  \n",
      "     |  __init__(self, /, *args, **kwargs)\n",
      "     |      Initialize self.  See help(type(self)) for accurate signature.\n",
      "     |  \n",
      "     |  __new__(*args, **kwargs) from builtins.type\n",
      "     |      Create and return a new object.  See help(type) for accurate signature.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Methods inherited from builtins.BaseException:\n",
      "     |  \n",
      "     |  __delattr__(self, name, /)\n",
      "     |      Implement delattr(self, name).\n",
      "     |  \n",
      "     |  __getattribute__(self, name, /)\n",
      "     |      Return getattr(self, name).\n",
      "     |  \n",
      "     |  __reduce__(...)\n",
      "     |      helper for pickle\n",
      "     |  \n",
      "     |  __repr__(self, /)\n",
      "     |      Return repr(self).\n",
      "     |  \n",
      "     |  __setattr__(self, name, value, /)\n",
      "     |      Implement setattr(self, name, value).\n",
      "     |  \n",
      "     |  __setstate__(...)\n",
      "     |  \n",
      "     |  __str__(self, /)\n",
      "     |      Return str(self).\n",
      "     |  \n",
      "     |  with_traceback(...)\n",
      "     |      Exception.with_traceback(tb) --\n",
      "     |      set self.__traceback__ to tb and return self.\n",
      "     |  \n",
      "     |  ----------------------------------------------------------------------\n",
      "     |  Data descriptors inherited from builtins.BaseException:\n",
      "     |  \n",
      "     |  __cause__\n",
      "     |      exception cause\n",
      "     |  \n",
      "     |  __context__\n",
      "     |      exception context\n",
      "     |  \n",
      "     |  __dict__\n",
      "     |  \n",
      "     |  __suppress_context__\n",
      "     |  \n",
      "     |  __traceback__\n",
      "     |  \n",
      "     |  args\n",
      "\n",
      "FUNCTIONS\n",
      "    is_zipfile(filename)\n",
      "        Quickly see if a file is a ZIP file by checking the magic number.\n",
      "        \n",
      "        The filename argument may be a file or file-like object too.\n",
      "\n",
      "DATA\n",
      "    ZIP_BZIP2 = 12\n",
      "    ZIP_DEFLATED = 8\n",
      "    ZIP_LZMA = 14\n",
      "    ZIP_STORED = 0\n",
      "    __all__ = ['BadZipFile', 'BadZipfile', 'error', 'ZIP_STORED', 'ZIP_DEF...\n",
      "\n",
      "FILE\n",
      "    /usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/zipfile.py\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "help(zipfile)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['',\n",
       " '/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python35.zip',\n",
       " '/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5',\n",
       " '/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/plat-darwin',\n",
       " '/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/lib-dynload',\n",
       " '/usr/local/lib/python3.5/site-packages',\n",
       " '/usr/local/lib/python3.5/site-packages/IPython/extensions',\n",
       " '/Users/lianliang/.ipython']"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import sys\n",
    "sys.path"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### re: regular expressions\n",
    "\n",
    "Python's `re` module provides regular expression operations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import re"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "![re](re.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
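    "**match:**\n",
    "\n",
    "`match` looks for a match only at the beginning of the string. It returns a match object on success and `None` otherwise. Groups can be used in the pattern.\n",
    " \n",
    " match(pattern, string, flags=0)\n",
    " \n",
    "- pattern: the regular expression\n",
    "- string: the string to match against\n",
    "- flags: matching flags\n",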
    "\n",
    " \n",
    "     X  VERBOSE     Ignore whitespace and comments for nicer looking RE's.\n",
    "     I  IGNORECASE  Perform case-insensitive matching.\n",
    "     M  MULTILINE   \"^\" matches the beginning of lines (after a newline)\n",
    "                    as well as the string.\n",
    "                    \"$\" matches the end of lines (before a newline) as well\n",
    "                    as the end of the string.\n",
    "     S  DOTALL      \".\" matches any character at all, including the newline.\n",
    "  \n",
    "     A  ASCII       For string patterns, make \\w, \\W, \\b, \\B, \\d, \\D\n",
    "                    match the corresponding ASCII character categories\n",
    "                    (rather than the whole Unicode categories, which is the\n",
    "                    default).\n",
    "                    For bytes patterns, this flag is the only available\n",
    "                    behaviour and needn't be specified.\n",
    "       \n",
    "     L  LOCALE      Make \\w, \\W, \\b, \\B, dependent on the current locale.\n",
    "     U  UNICODE     For compatibility only. Ignored for string patterns (it\n",
    "                    is the default), and forbidden for bytes patterns."
   ]
  },
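  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The flags listed above can be combined with `|`. The sketch below is illustrative only: the sample text is made up, and `re.findall` is used just to show every position where `^` matches.\n",
    "\n",
    "```python\n",
    "import re\n",
    "\n",
    "text = 'Hello World\\nhello python'\n",
    "\n",
    "# Without flags, match() is case-sensitive, so 'hello' does not match 'Hello'.\n",
    "no_flag = re.match('hello', text)\n",
    "# IGNORECASE (re.I) makes the comparison case-insensitive.\n",
    "m = re.match('hello', text, re.I)\n",
    "# MULTILINE (re.M) lets '^' match at the start of every line.\n",
    "line_starts = re.findall(r'^hello', text, re.I | re.M)\n",
    "# DOTALL (re.S) lets '.' match the newline as well.\n",
    "whole = re.match(r'.+', text, re.S)\n",
    "\n",
    "print(no_flag)\n",
    "print(m.group())\n",
    "print(line_starts)\n",
    "print(whole.group() == text)\n",
    "```"
   ]
  },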
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<_sre.SRE_Match object; span=(0, 5), match='there'>\n",
      "there\n",
      "()\n",
      "0\n",
      "5\n",
      "(0, 5)\n",
      "<_sre.SRE_Match object; span=(0, 5), match='there'>\n",
      "there\n",
      "('here',)\n",
      "0\n",
      "5\n",
      "(0, 5)\n"
     ]
    }
   ],
   "source": [
    "# Without groups\n",
    "import re\n",
    "pattern = re.compile(r\"th\\w+\")\n",
    "m = pattern.match(\"there are some things...\")\n",
    "print(m)           # returns a Match object\n",
    "print(m.group())   # the full matched text\n",
    "print(m.groups())  # the captured groups\n",
    "print(m.start())   # start index of the match\n",
    "print(m.end())     # end index of the match (exclusive)\n",
    "print(m.span())    # (start, end) index tuple of the match\n",
    "\n",
    "# With groups\n",
    "pattern = re.compile(r\"t(h\\w+)\")  # wrap the part to capture in (); afterwards you can read either the whole match or just the group\n",
    "m = pattern.match(\"there are some things...\")\n",
    "print(m)           # returns a Match object\n",
    "print(m.group())   # the full matched text\n",
    "print(m.groups())  # the captured groups\n",
    "print(m.start())   # start index of the match\n",
    "print(m.end())     # end index of the match (exclusive)\n",
    "print(m.span())    # (start, end) index tuple of the match"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<_sre.SRE_Match object; span=(15, 20), match='thing'>\n"
     ]
    }
   ],
   "source": [
    "import re\n",
    "pattern = re.compile(r\"th\\w+\")\n",
    "m = pattern.match(\"there are some things...\", 15, 20)  # pos=15, endpos=20: matching starts at index 15 and the string is treated as ending at index 20\n",
    "print(m)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<_sre.SRE_Match object; span=(0, 11), match='Hello World'>\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'Hello World'"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "(0, 11)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "'Hello'"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "(0, 5)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "'World'"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "(6, 11)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "('Hello', 'World')"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import re\n",
    "pattern = re.compile(r'([a-z]+) ([a-z]+)', re.I)  # re.I makes matching case-insensitive\n",
    "m = pattern.match('Hello World Wide Web')\n",
    "print(m)\n",
    "m.group(0)   # the whole matched substring\n",
    "m.span(0)    # index span of the whole matched substring\n",
    "m.group(1)   # substring matched by the first group\n",
    "m.span(1)    # index span of the first group\n",
    "m.group(2)   # substring matched by the second group\n",
    "m.span(2)    # index span of the second group\n",
    "m.groups()   # equivalent to (m.group(1), m.group(2), ...)"
   ]
  },
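  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The other flags from the flag table can be tried in the same way as `re.I`. The following is a minimal sketch with made-up sample strings (`text` and the date pattern are illustrative assumptions, not part of the original examples):\n",
    "\n",
    "```python\n",
    "import re\n",
    "\n",
    "text = 'first line\\nsecond line'\n",
    "\n",
    "# Without re.M, ^ only matches at the very start of the string\n",
    "no_flag = re.findall(r'^\\w+', text)\n",
    "# With re.M, ^ also matches right after each newline\n",
    "multiline = re.findall(r'^\\w+', text, re.M)\n",
    "\n",
    "# Without re.S, . does not match a newline, so the match stops at the line end\n",
    "dot_default = re.match(r'.+', text).group()\n",
    "# With re.S, . matches the newline as well\n",
    "dot_all = re.match(r'.+', text, re.S).group()\n",
    "\n",
    "# re.X allows whitespace and comments inside the pattern\n",
    "verbose = re.compile(r'''\n",
    "    (\\d{4})  # year\n",
    "    -\n",
    "    (\\d{2})  # month\n",
    "''', re.X)\n",
    "year_month = verbose.match('2024-05').groups()\n",
    "\n",
    "print(no_flag)          # ['first']\n",
    "print(multiline)        # ['first', 'second']\n",
    "print(dot_default)      # first line\n",
    "print(dot_all == text)  # True\n",
    "print(year_month)       # ('2024', '05')\n",
    "```\n",
    "\n",
    "Flags can be combined with `|`, for example `re.I | re.M`."
   ]
  },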
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
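   },
   "source": [
    "**search:**\n",
    "\n",
    "`search` scans the string for a match at any position. Like `match`, it is a one-shot operation: it returns as soon as it finds the first match instead of collecting all matches, and it returns `None` if nothing matches. Groups are supported.\n",
    "\n",
    "search(pattern, string, flags=0)\n",
    "\n",
    "- pattern: the regular expression\n",
    "- string: the string to search\n",
    "- flags: matching flags"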
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'12'"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import re\n",
    "pattern = re.compile(r'\\d+')  # match() would not match here, because the string does not start with a digit\n",
    "m = pattern.search('number: 12 34 56 78')\n",
    "m.group()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "matching string: 123456\n",
      "position: (6, 12)\n"
     ]
    }
   ],
   "source": [
    "import re\n",
    " \n",
    "# Compile the regular expression into a Pattern object\n",
    "pattern = re.compile(r'\\d+')\n",
    " \n",
    "# Use search() to find a matching substring; it returns None if there is none\n",
    "# match() would fail here, because the string does not start with a digit\n",
    "m = pattern.search('hello 123456 789')\n",
    " \n",
    "if m:\n",
    "    # Read the match information\n",
    "    print('matching string:', m.group())\n",
    "    print('position:',m.span())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
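   },
   "source": [
    "**findall:**\n",
    "\n",
    "Both `match` and `search` are one-shot operations that return as soon as one match is found. Often, however, we need to scan the whole string and collect every match. That is what **findall** is for; groups are supported."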
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['123456', '789']\n",
      "['1', '2', '3', '4']\n"
     ]
    }
   ],
   "source": [
    "import re\n",
    " \n",
    "pattern = re.compile(r'\\d+')   # find runs of digits\n",
    "result1 = pattern.findall('hello 123456 789')\n",
    "result2 = pattern.findall('one1two2three3four4')\n",
    " \n",
    "print(result1)\n",
    "print(result2)"
   ]
  },
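  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The findall example uses a pattern without groups. As a small sketch of how groups change the result (the sample string `s` is made up for illustration): with no group, findall returns the whole matches; with one group, only that group's text; with several groups, a list of tuples.\n",
    "\n",
    "```python\n",
    "import re\n",
    "\n",
    "s = 'k1=v1, k2=v2'\n",
    "\n",
    "whole = re.findall(r'\\w+=\\w+', s)      # no group: whole matches\n",
    "one = re.findall(r'(\\w+)=\\w+', s)      # one group: only the group text\n",
    "pairs = re.findall(r'(\\w+)=(\\w+)', s)  # several groups: list of tuples\n",
    "\n",
    "print(whole)  # ['k1=v1', 'k2=v2']\n",
    "print(one)    # ['k1', 'k2']\n",
    "print(pairs)  # [('k1', 'v1'), ('k2', 'v2')]\n",
    "```\n",
    "\n",
    "When the whole match is needed alongside the groups, `re.finditer` yields match objects instead of strings."
   ]
  },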
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
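   },
   "source": [
    "**sub:**\n",
    "\n",
    "`sub` replaces the parts of the string that match the pattern.\n",
    "  \n",
    "sub(pattern, repl, string, count=0, flags=0)\n",
    "\n",
    "- pattern: the regular expression\n",
    "- repl: the replacement string, or a callable that receives the match object\n",
    "- string: the string to search\n",
    "- count: maximum number of replacements (0 replaces all matches)\n",
    "- flags: matching flags"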
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'hello world, hello world'"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "'123 hello, 456 hello'"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "'hi 123, hi 456'"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "'hi 123, hello 456'"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import re\n",
    " \n",
    "p = re.compile(r'(\\w+) (\\w+)')\n",
    "s = 'hello 123, hello 456'\n",
    "\n",
    "def func(m):\n",
    "    return 'hi' + ' ' + m.group(2)\n",
    "\n",
    "p.sub('hello world', s)  # replace 'hello 123' and 'hello 456' with 'hello world'\n",
    "p.sub(r'\\2 \\1', s)       # reference groups; the raw string keeps \\2 and \\1 as backreferences\n",
    "p.sub(func, s)           # repl can also be a function that receives the match object\n",
    "p.sub(func, s, 1)        # replace at most once"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**split:**\n",
    "\n",
    "`split` cuts the string at every substring that matches the pattern and returns the pieces as a list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['a', 'b', 'c', 'd']"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import re\n",
    " \n",
    "p = re.compile(r'[\\s,;]+')  # one or more whitespace characters, commas, or semicolons\n",
    "p.split('a,b;; c   d')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
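   },
   "source": [
    "### python 序列化\n",
    "\n",
    "Python has two commonly used modules for serialization:\n",
    "\n",
    "- json: converts between JSON strings and basic Python data types\n",
    "- pickle: converts between Python objects (including Python-specific types) and byte strings\n",
    "\n",
    "The json module provides four functions: dumps, dump, loads, load\n",
    "\n",
    "The pickle module provides four functions: dumps, dump, loads, load"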
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'k1': 'v1'} <class 'dict'>\n"
     ]
    }
   ],
   "source": [
    "import json\n",
    "dic = {'k1': 'v1'}\n",
    "print(dic, type(dic))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\"k1\": \"v1\"} <class 'str'>\n"
     ]
    }
   ],
   "source": [
    "# Convert a basic Python data type to its string form\n",
    "result = json.dumps(dic)\n",
    "print(result, type(result))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'k1': '123'} <class 'dict'>\n"
     ]
    }
   ],
   "source": [
    "s1 = '{\"k1\": \"123\"}'   # JSON requires double quotes (\") around strings; single-quoted strings are not valid JSON\n",
    "# Convert the string back into a basic Python data type\n",
    "dic = json.loads(s1)\n",
    "print(dic, type(dic))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "with open('test.json', 'w') as f:\n",
    "    json.dump(dic, f)        # write to a file\n",
    "with open('test.json', 'r') as f:\n",
    "    loaded = json.load(f)    # read back from the file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "b'\\x80\\x03]q\\x00(K\\x0bK\\x16K!e.'\n"
     ]
    }
   ],
   "source": [
    "import pickle\n",
    " \n",
    "li = [11, 22, 33]\n",
    "r = pickle.dumps(li)\n",
    "print(r)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[11, 22, 33]\n"
     ]
    }
   ],
   "source": [
    "result = pickle.loads(r)\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "with open('db', 'wb') as f:\n",
    "    pickle.dump(li, f)       # write to a file\n",
    "with open('db', 'rb') as f:\n",
    "    loaded = pickle.load(f)  # load back from the file"
   ]
  },
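  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "To make the difference between the two modules concrete, here is a small sketch with made-up sample data (the `data` dictionary is an assumption for illustration). json has no encoding for a set or a date, while pickle can round-trip both. Only unpickle data from sources you trust, because loading a pickle can execute arbitrary code.\n",
    "\n",
    "```python\n",
    "import datetime\n",
    "import json\n",
    "import pickle\n",
    "\n",
    "data = {'tags': {'a', 'b'}, 'day': datetime.date(2024, 5, 1)}\n",
    "\n",
    "# json cannot encode a set or a date, so dumps raises TypeError\n",
    "try:\n",
    "    json.dumps(data)\n",
    "    json_ok = True\n",
    "except TypeError:\n",
    "    json_ok = False\n",
    "\n",
    "# pickle turns the same object into bytes and restores an equal copy\n",
    "restored = pickle.loads(pickle.dumps(data))\n",
    "\n",
    "print(json_ok)           # False\n",
    "print(restored == data)  # True\n",
    "```\n",
    "\n",
    "In practice json is the usual choice for exchanging data with other languages, and pickle for Python-only storage."
   ]
  },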
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### configparser\n",
    "\n",
    "configparser handles files in a specific format (*.ini). Under the hood it simply uses open to read and write the file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "# 注释1\n",
      "; 注释2\n",
      "\n",
      "[section1]\n",
      "k1 = v1\n",
      "k2:v2\n",
      "\n",
      "[section2]\n",
      "k1 = v1\n"
     ]
    }
   ],
   "source": [
    "print(open('config.ini', 'r').read())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "1. Get all sections"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['config.ini']"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['section1', 'section2']\n"
     ]
    }
   ],
   "source": [
    "import configparser\n",
    " \n",
    "config = configparser.ConfigParser()\n",
    "config.read('config.ini', encoding='utf-8')\n",
    "sections = config.sections()\n",
    "print(sections)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "2. Get all key-value pairs in a given section"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['config.ini']"
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[('k1', 'v1'), ('k2', 'v2')]\n"
     ]
    }
   ],
   "source": [
    "import configparser\n",
    " \n",
    "config = configparser.ConfigParser()\n",
    "config.read('config.ini', encoding='utf-8')\n",
    "key_value = config.items('section1')\n",
    "print(key_value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "3. Get all keys in a given section"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['config.ini']"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['k1', 'k2']\n"
     ]
    }
   ],
   "source": [
    "import configparser\n",
    " \n",
    "config = configparser.ConfigParser()\n",
    "config.read('config.ini', encoding='utf-8')\n",
    "options = config.options('section1')\n",
    "print(options)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "4. Get the value of a given key in a given section"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['config.ini']"
      ]
     },
     "execution_count": 74,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "v1\n"
     ]
    }
   ],
   "source": [
    "import configparser\n",
    " \n",
    "config = configparser.ConfigParser()\n",
    "config.read('config.ini', encoding='utf-8')\n",
    "value= config.get('section1', 'k1')\n",
    "print(value)"
   ]
  },
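  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "`get` always returns a string. As a sketch that goes slightly beyond the example, configparser also offers typed getters and a fallback value. The `[server]` section and its keys are hypothetical, and the configuration is loaded from a string so that config.ini stays untouched.\n",
    "\n",
    "```python\n",
    "import configparser\n",
    "\n",
    "# Hypothetical configuration, loaded from a string instead of config.ini\n",
    "sample = '''\n",
    "[server]\n",
    "port = 8080\n",
    "debug = yes\n",
    "'''\n",
    "\n",
    "config = configparser.ConfigParser()\n",
    "config.read_string(sample)\n",
    "\n",
    "port = config.getint('server', 'port')        # '8080' -> 8080\n",
    "debug = config.getboolean('server', 'debug')  # 'yes' -> True\n",
    "# A missing key returns the fallback instead of raising NoOptionError\n",
    "timeout = config.get('server', 'timeout', fallback='30')\n",
    "\n",
    "print(port, debug, timeout)  # 8080 True 30\n",
    "```\n",
    "\n",
    "Without `fallback`, asking for a missing key raises `configparser.NoOptionError`."
   ]
  },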
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "5. Check, add, and remove sections"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['config.ini']"
      ]
     },
     "execution_count": 85,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "True\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 85,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import configparser\n",
    "  \n",
    "config = configparser.ConfigParser()\n",
    "config.read('config.ini', encoding='utf-8')\n",
    "\n",
    "\n",
    "# Check whether a section exists\n",
    "section = config.has_section('section1')\n",
    "print(section)\n",
    "  \n",
    "# Add a section\n",
    "config.add_section(\"section10\")\n",
    "config.write(open('config.ini', 'w+'))\n",
    "  \n",
    "# Remove a section\n",
    "config.remove_section(\"section1\")\n",
    "config.write(open('config.ini', 'w+'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[section2]\n",
      "k1 = v1\n",
      "\n",
      "[section10]\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(open('config.ini', 'r').read())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "6. Check, remove, and set key-value pairs in a given section"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['config.ini']"
      ]
     },
     "execution_count": 89,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "True\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 89,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import configparser\n",
    " \n",
    "config = configparser.ConfigParser()\n",
    "config.read('config.ini', encoding='utf-8')\n",
    " \n",
    "\n",
    "# Check whether an option exists\n",
    "option = config.has_option('section1', 'k1')\n",
    "print(option)\n",
    " \n",
    "# Remove an option\n",
    "config.remove_option(\"section1\", 'k1')\n",
    "config.write(open('config.ini', 'w'))\n",
    " \n",
    "# Set an option\n",
    "config.set(\"section1\", 'k10', '10000')\n",
    "config.write(open('config.ini', 'w'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[section1]\n",
      "k2 = v2\n",
      "k10 = 10000\n",
      "\n",
      "[section2]\n",
      "k1 = v1\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(open('config.ini', 'r').read())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### XML\n",
    "\n",
    "XML is a format for exchanging data between different languages or programs. An XML file looks like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<data>\n",
      "    <country name=\"Liechtenstein\">\n",
      "        <rank updated=\"yes\">2</rank>\n",
      "        <year>2023</year>\n",
      "        <gdppc>141100</gdppc>\n",
      "        <neighbor direction=\"E\" name=\"Austria\" />\n",
      "        <neighbor direction=\"W\" name=\"Switzerland\" />\n",
      "    </country>\n",
      "    <country name=\"Singapore\">\n",
      "        <rank updated=\"yes\">5</rank>\n",
      "        <year>2026</year>\n",
      "        <gdppc>59900</gdppc>\n",
      "        <neighbor direction=\"N\" name=\"Malaysia\" />\n",
      "    </country>\n",
      "    <country name=\"Panama\">\n",
      "        <rank updated=\"yes\">69</rank>\n",
      "        <year>2026</year>\n",
      "        <gdppc>13600</gdppc>\n",
      "        <neighbor direction=\"W\" name=\"Costa Rica\" />\n",
      "        <neighbor direction=\"E\" name=\"Colombia\" />\n",
      "    </country>\n",
      "</data>\n"
     ]
    }
   ],
   "source": [
    "print(open('data.xml', 'r').read())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "1. Parse XML"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "from xml.etree import ElementTree as ET\n",
    "\n",
    "############ Parsing method 1 ############\n",
    "\n",
    "# Open the file and read the XML content\n",
    "str_xml = open('data.xml', 'r').read()\n",
    "\n",
    "# Parse the string into an Element object; root is the root node of the XML document\n",
    "root = ET.XML(str_xml)\n",
    "\n",
    "\n",
    "############ Parsing method 2 ############\n",
    "\n",
    "# Parse the XML file directly\n",
    "tree = ET.parse(\"data.xml\")\n",
    "\n",
    "# Get the root node of the XML document\n",
    "root = tree.getroot()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
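   },
   "source": [
    "2. Work with XML\n",
    "\n",
    "An XML document is a tree of nested nodes. Every node is an `Element` object that offers the same attributes and methods for working with it, such as `tag`, `attrib`, `text`, `find`, `findall`, `iter`, `set`, `append`, and `remove`."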
   ]
  },
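  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "As a minimal sketch of the node API, the snippet below parses a small inline document (a shortened, hypothetical stand-in for data.xml) and reads and updates a node through a few common `Element` attributes and methods:\n",
    "\n",
    "```python\n",
    "from xml.etree import ElementTree as ET\n",
    "\n",
    "# Hypothetical inline document, so the example does not depend on data.xml\n",
    "xml_text = (\n",
    "    \"<data>\"\n",
    "    \"<country name='Liechtenstein'><rank>2</rank></country>\"\n",
    "    \"<country name='Panama'><rank>69</rank></country>\"\n",
    "    \"</data>\"\n",
    ")\n",
    "root = ET.fromstring(xml_text)\n",
    "\n",
    "first = root.find('country')           # first matching direct child\n",
    "print(root.tag)                        # data\n",
    "print(first.tag, first.attrib)         # country {'name': 'Liechtenstein'}\n",
    "print(first.get('name'))               # Liechtenstein\n",
    "print(first.find('rank').text)         # 2\n",
    "print(len(root.findall('country')))    # 2\n",
    "\n",
    "ranks = [r.text for r in root.iter('rank')]  # iter() walks the whole subtree\n",
    "print(ranks)                           # ['2', '69']\n",
    "\n",
    "first.set('checked', 'yes')            # add or update an attribute\n",
    "print(first.attrib)                    # {'name': 'Liechtenstein', 'checked': 'yes'}\n",
    "```\n",
    "\n",
    "`find` and `findall` only look at direct children unless the path says otherwise, while `iter` visits every descendant."
   ]
  },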
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
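   },
   "source": [
    "Because every node has these methods, and parsing in the previous step gave us root (the root node of the XML document), we can use them to work with the XML file.\n",
    "\n",
    "a. Traverse the whole XML document:"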
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "data\n",
      "country {'name': 'Liechtenstein'}\n",
      "rank 2\n",
      "year 2023\n",
      "gdppc 141100\n",
      "neighbor None\n",
      "neighbor None\n",
      "country {'name': 'Singapore'}\n",
      "rank 5\n",
      "year 2026\n",
      "gdppc 59900\n",
      "neighbor None\n",
      "country {'name': 'Panama'}\n",
      "rank 69\n",
      "year 2026\n",
      "gdppc 13600\n",
      "neighbor None\n",
      "neighbor None\n"
     ]
    }
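   ],
   "source": [
    "### Operations\n",
    "\n",
    "# Top-level tag\n",
    "print(root.tag)\n",
    "\n",
    "\n",
    "# Iterate over the second level of the XML document\n",
    "for child in root:\n",
    "    # Tag name and attributes of the second-level node\n",
    "    print(child.tag, child.attrib)\n",
    "    # Iterate over the third level of the XML document\n",
    "    for i in child:\n",
    "        # Tag name and text of the third-level node\n",
    "        print(i.tag, i.text)"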
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "data\n",
      "year 2023\n",
      "year 2026\n",
      "year 2026\n"
     ]
    }
   ],
   "source": [
    "### Operations\n",
    "\n",
    "# Top-level tag\n",
    "print(root.tag)\n",
    "\n",
    "\n",
    "# Iterate over all year nodes in the XML document\n",
    "for node in root.iter('year'):\n",
    "    # Tag name and text of the node\n",
    "    print(node.tag, node.text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "b. Modify node content:\n",
    "\n",
    "All modifications happen in memory and do not affect the file on disk. To persist a change, write the in-memory tree back to a file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "data\n"
     ]
    }
   ],
   "source": [
    "from xml.etree import ElementTree as ET\n",
    "\n",
    "############ Parsing method 1 ############\n",
    "\n",
    "# Open the file and read the XML content\n",
    "str_xml = open('data.xml', 'r').read()\n",
    "\n",
    "# Parse the string into an Element object; root is the root node of the XML document\n",
    "root = ET.XML(str_xml)\n",
    "\n",
    "############ Operations ############\n",
    "\n",
    "# Top-level tag\n",
    "print(root.tag)\n",
    "\n",
    "# Loop over all year nodes\n",
    "for node in root.iter('year'):\n",
    "    # Increment the content of the year node by one\n",
    "    new_year = int(node.text) + 1\n",
    "    node.text = str(new_year)\n",
    "\n",
    "    # Set attributes\n",
    "    node.set('name', 'Kevin')\n",
    "    node.set('age', '18')\n",
    "    # Delete an attribute\n",
    "    del node.attrib['name']\n",
    "\n",
    "\n",
    "############ Save to a file ############\n",
    "tree = ET.ElementTree(root)\n",
    "tree.write(\"new.xml\", encoding='utf-8')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<data>\n",
      "    <country name=\"Liechtenstein\">\n",
      "        <rank updated=\"yes\">2</rank>\n",
      "        <year age=\"18\">2024</year>\n",
      "        <gdppc>141100</gdppc>\n",
      "        <neighbor direction=\"E\" name=\"Austria\" />\n",
      "        <neighbor direction=\"W\" name=\"Switzerland\" />\n",
      "    </country>\n",
      "    <country name=\"Singapore\">\n",
      "        <rank updated=\"yes\">5</rank>\n",
      "        <year age=\"18\">2027</year>\n",
      "        <gdppc>59900</gdppc>\n",
      "        <neighbor direction=\"N\" name=\"Malaysia\" />\n",
      "    </country>\n",
      "    <country name=\"Panama\">\n",
      "        <rank updated=\"yes\">69</rank>\n",
      "        <year age=\"18\">2027</year>\n",
      "        <gdppc>13600</gdppc>\n",
      "        <neighbor direction=\"W\" name=\"Costa Rica\" />\n",
      "        <neighbor direction=\"E\" name=\"Colombia\" />\n",
      "    </country>\n",
      "</data>\n"
     ]
    }
   ],
   "source": [
    "print(open('new.xml', 'r').read())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "data\n"
     ]
    }
   ],
   "source": [
    "from xml.etree import ElementTree as ET\n",
    "\n",
    "############ Parsing method 2 ############\n",
    "\n",
    "# Parse the XML file directly\n",
    "tree = ET.parse(\"data.xml\")\n",
    "\n",
    "# Get the root node of the XML document\n",
    "root = tree.getroot()\n",
    "\n",
    "############ Operations ############\n",
    "\n",
    "# Top-level tag\n",
    "print(root.tag)\n",
    "\n",
    "# Loop over all year nodes\n",
    "for node in root.iter('year'):\n",
    "    # Increment the content of the year node by one\n",
    "    new_year = int(node.text) + 1\n",
    "    node.text = str(new_year)\n",
    "\n",
    "    # Set attributes\n",
    "    node.set('name', 'Kevin')\n",
    "    node.set('age', '18')\n",
    "    # Delete an attribute\n",
    "    del node.attrib['name']\n",
    "\n",
    "\n",
    "############ Save to a file ############\n",
    "tree.write(\"newnew.xml\", encoding='utf-8')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<data>\n",
      "    <country name=\"Liechtenstein\">\n",
      "        <rank updated=\"yes\">2</rank>\n",
      "        <year age=\"18\">2024</year>\n",
      "        <gdppc>141100</gdppc>\n",
      "        <neighbor direction=\"E\" name=\"Austria\" />\n",
      "        <neighbor direction=\"W\" name=\"Switzerland\" />\n",
      "    </country>\n",
      "    <country name=\"Singapore\">\n",
      "        <rank updated=\"yes\">5</rank>\n",
      "        <year age=\"18\">2027</year>\n",
      "        <gdppc>59900</gdppc>\n",
      "        <neighbor direction=\"N\" name=\"Malaysia\" />\n",
      "    </country>\n",
      "    <country name=\"Panama\">\n",
      "        <rank updated=\"yes\">69</rank>\n",
      "        <year age=\"18\">2027</year>\n",
      "        <gdppc>13600</gdppc>\n",
      "        <neighbor direction=\"W\" name=\"Costa Rica\" />\n",
      "        <neighbor direction=\"E\" name=\"Colombia\" />\n",
      "    </country>\n",
      "</data>\n"
     ]
    }
   ],
   "source": [
    "print(open('newnew.xml', 'r').read())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "# Iterate over all country nodes under data\n",
    "for country in root.findall('country'):\n",
    "    # Read the content of the rank node under each country node\n",
    "    rank = int(country.find('rank').text)\n",
    "\n",
    "    if rank > 50:\n",
    "        # Remove this country node\n",
    "        root.remove(country)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "3. Create an XML document:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Method 1:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "from xml.etree import ElementTree as ET\n",
    "\n",
    "\n",
    "# Create the root node\n",
    "root = ET.Element(\"family\")\n",
    "\n",
    "\n",
    "# Create the elder son node\n",
    "son1 = ET.Element('son', {'name': '儿1'})\n",
    "# Create the younger son node\n",
    "son2 = ET.Element('son', {\"name\": '儿2'})\n",
    "\n",
    "# Create two grandsons under the elder son\n",
    "grandson1 = ET.Element('grandson', {'name': '儿11'})\n",
    "grandson2 = ET.Element('grandson', {'name': '儿12'})\n",
    "son1.append(grandson1)\n",
    "son1.append(grandson2)\n",
    "\n",
    "\n",
    "# Append both sons to the root node\n",
    "root.append(son1)\n",
    "root.append(son2)\n",
    "\n",
    "tree = ET.ElementTree(root)\n",
    "tree.write('oooo.xml',encoding='utf-8', short_empty_elements=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Method 2:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "from xml.etree import ElementTree as ET\n",
    "\n",
    "# Create the root node\n",
    "root = ET.Element(\"family\")\n",
    "\n",
    "\n",
    "# Create the elder son\n",
    "# son1 = ET.Element('son', {'name': '儿1'})\n",
    "son1 = root.makeelement('son', {'name': '儿1'})\n",
    "# Create the younger son\n",
    "# son2 = ET.Element('son', {\"name\": '儿2'})\n",
    "son2 = root.makeelement('son', {\"name\": '儿2'})\n",
    "\n",
    "# Create two grandsons under the elder son\n",
    "# grandson1 = ET.Element('grandson', {'name': '儿11'})\n",
    "grandson1 = son1.makeelement('grandson', {'name': '儿11'})\n",
    "# grandson2 = ET.Element('grandson', {'name': '儿12'})\n",
    "grandson2 = son1.makeelement('grandson', {'name': '儿12'})\n",
    "\n",
    "son1.append(grandson1)\n",
    "son1.append(grandson2)\n",
    "\n",
    "\n",
    "# Append both sons to the root node\n",
    "root.append(son1)\n",
    "root.append(son2)\n",
    "\n",
    "tree = ET.ElementTree(root)\n",
    "tree.write('oooo.xml',encoding='utf-8', short_empty_elements=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Method 3:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "from xml.etree import ElementTree as ET\n",
    "\n",
    "\n",
    "# Create the root node\n",
    "root = ET.Element(\"family\")\n",
    "\n",
    "\n",
    "# Create the elder son node; SubElement attaches it to root directly\n",
    "son1 = ET.SubElement(root, \"son\", attrib={'name': '儿1'})\n",
    "# Create the younger son node\n",
    "son2 = ET.SubElement(root, \"son\", attrib={\"name\": \"儿2\"})\n",
    "\n",
    "# Create one grandson under the elder son\n",
    "grandson1 = ET.SubElement(son1, \"age\", attrib={'name': '儿11'})\n",
    "grandson1.text = '孙子'\n",
    "\n",
    "\n",
    "et = ET.ElementTree(root)  # build the document object\n",
    "et.write(\"test.xml\", encoding=\"utf-8\", xml_declaration=True, short_empty_elements=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "由于原生保存的XML时默认无缩进，如果想要设置缩进的话， 需要修改保存方式："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "复制代码\n",
    "from xml.etree import ElementTree as ET\n",
    "from xml.dom import minidom\n",
    "\n",
    "\n",
    "def prettify(elem):\n",
    "    \"\"\"\n",
    "    将节点转换成字符串，并添加缩进。\n",
    "    \"\"\"\n",
    "    rough_string = ET.tostring(elem, 'utf-8')\n",
    "    reparsed = minidom.parseString(rough_string)\n",
    "    return reparsed.toprettyxml(indent=\"\\t\")\n",
    "\n",
    "# 创建根节点\n",
    "root = ET.Element(\"famliy\")\n",
    "\n",
    "\n",
    "# 创建大儿子\n",
    "son1 = root.makeelement('son', {'name': '儿1'})\n",
    "# 创建小儿子\n",
    "son2 = root.makeelement('son', {\"name\": '儿2'})\n",
    "\n",
    "# 在大儿子中创建两个孙子\n",
    "grandson1 = son1.makeelement('grandson', {'name': '儿11'})\n",
    "grandson2 = son1.makeelement('grandson', {'name': '儿12'})\n",
    "\n",
    "son1.append(grandson1)\n",
    "son1.append(grandson2)\n",
    "\n",
    "\n",
    "# 把儿子添加到根节点中\n",
    "root.append(son1)\n",
    "root.append(son1)\n",
    "\n",
    "\n",
    "raw_str = prettify(root)\n",
    "\n",
    "f = open(\"xxxoo.xml\",'w',encoding='utf-8')\n",
    "f.write(raw_str)\n",
    "f.close()"
   ]
  },
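  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "As a possible alternative to the minidom round trip, the sketch below uses `ET.indent()`, which indents an element tree in place. It assumes Python 3.9 or newer (the kernel recorded in this notebook's metadata is older and does not have this function). The tag and attribute values are made up for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "from xml.etree import ElementTree as ET\n",
    "\n",
    "# Build a small tree (illustrative names, not taken from the examples above)\n",
    "root = ET.Element('family')\n",
    "son = ET.SubElement(root, 'son', {'name': 'son1'})\n",
    "ET.SubElement(son, 'grandson', {'name': 'grandson1'})\n",
    "\n",
    "# indent() rewrites the whitespace between elements in place (Python 3.9+)\n",
    "ET.indent(root, space='    ')\n",
    "\n",
    "# encoding='unicode' makes tostring() return str instead of bytes\n",
    "xml_text = ET.tostring(root, encoding='unicode')\n",
    "print(xml_text)"
   ]
  },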
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "4、命名空间:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "from xml.etree import ElementTree as ET\n",
    "\n",
    "ET.register_namespace('com',\"http://www.company.com\") #some name\n",
    "\n",
    "# build a tree structure\n",
    "root = ET.Element(\"{http://www.company.com}STUFF\")\n",
    "body = ET.SubElement(root, \"{http://www.company.com}MORE_STUFF\", attrib={\"{http://www.company.com}hhh\": \"123\"})\n",
    "body.text = \"STUFF EVERYWHERE!\"\n",
    "\n",
    "# wrap it in an ElementTree instance, and save as XML\n",
    "tree = ET.ElementTree(root)\n",
    "\n",
    "tree.write(\"page.xml\",\n",
    "           xml_declaration=True,\n",
    "           encoding='utf-8',\n",
    "           method=\"xml\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### logging\n",
    "\n",
    "python logging模块提供了很完善的日志记录功能"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "import logging\n",
    "import sys\n",
    "# 获取logger实例，如果参数为空则返回root logger\n",
    "logger = logging.getLogger(\"AppName\")\n",
    "\n",
    "# 指定logger输出格式\n",
    "formatter = logging.Formatter('%(asctime)s %(levelname)-8s: %(message)s')\n",
    "\n",
    "# 文件日志\n",
    "file_handler = logging.FileHandler(\"test.log\")\n",
    "file_handler.setFormatter(formatter)  # 可以通过setFormatter指定输出格式\n",
    "\n",
    "# 控制台日志\n",
    "console_handler = logging.StreamHandler(sys.stdout)\n",
    "console_handler.formatter = formatter  # 也可以直接给formatter赋值\n",
    "\n",
    "# 为logger添加的日志处理器\n",
    "logger.addHandler(file_handler)\n",
    "logger.addHandler(console_handler)\n",
    "\n",
    "# 指定日志的最低输出级别，默认为WARN级别\n",
    "logger.setLevel(logging.INFO)\n",
    "\n",
    "# 输出不同级别的log\n",
    "logger.debug('this is debug info')\n",
    "logger.info('this is information')\n",
    "logger.warning('this is warning message')\n",
    "logger.error('this is error message')\n",
    "logger.fatal('this is fatal message, it is same as logger.critical')\n",
    "logger.critical('this is critical message')\n",
    "\n",
    "# 2018-03-15 10:42:49,464 INFO    : this is information\n",
    "# 2018-03-15 10:42:49,464 WARNING : this is warning message\n",
    "# 2018-03-15 10:42:49,464 ERROR   : this is error message\n",
    "# 2018-03-15 10:42:49,465 CRITICAL: this is fatal message, it is same as logger.critical\n",
    "# 2018-03-15 10:42:49,465 CRITICAL: this is critical message\n",
    "\n",
    "# 移除一些日志处理器\n",
    "logger.removeHandler(file_handler)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "# 格式化输出\n",
    " \n",
    "service_name = \"Booking\"\n",
    "logger.error('%s service is down!' % service_name)  # 使用python自带的字符串格式化，不推荐\n",
    "logger.error('%s service is down!', service_name)  # 使用logger的格式化，推荐\n",
    "logger.error('%s service is %s!', service_name, 'down')  # 多参数格式化\n",
    "logger.error('{} service is {}'.format(service_name, 'down')) # 使用format函数，推荐"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**记录异常信息:**\n",
    "\n",
    "当你使用logging模块记录异常信息时，不需要传入该异常对象，只要你直接调用 **logger.error()** 或者 **logger.exception()**就可以将当前异常记录下来。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# 记录异常信息\n",
    " \n",
    "try:\n",
    "    1 / 0\n",
    "except:\n",
    "    # 等同于error级别，但是会额外记录当前抛出的异常堆栈信息\n",
    "    logger.exception('this is an exception message')\n",
    "\n",
    "# 2018-03-15 10:47:27,229 ERROR   : this is an exception message\n",
    "# Traceback (most recent call last):\n",
    "#   File \"/Users/lianliang/Desktop/yisuo-faceid/test.py\", line 288, in <module>\n",
    "#     1 / 0\n",
    "# ZeroDivisionError: division by zero"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**logging配置要点:**\n",
    "\n",
    "**GetLogger()**方法\n",
    "\n",
    "这是最基本的入口，该方法参数可以为空，默认的logger名称是root，如果在同一个程序中一直都使用同名的logger，其实会拿到同一个实例，使用这个技巧就可以跨模块调用同样的logger来记录日志。\n",
    "\n",
    "另外你也可以通过日志名称来区分同一程序的不同模块，比如这个例子。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "logger = logging.getLogger(\"App.UI\")\n",
    "logger = logging.getLogger(\"App.Service\")"
   ]
  },
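  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The sketch below illustrates the two points made about getLogger(): the same name returns the same instance, and dotted names form a parent/child hierarchy. The logger names `Demo` and `Demo.UI` are hypothetical and chosen so they do not interfere with the loggers used elsewhere in this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import logging\n",
    "\n",
    "ui_logger = logging.getLogger('Demo.UI')\n",
    "same_logger = logging.getLogger('Demo.UI')\n",
    "app_logger = logging.getLogger('Demo')\n",
    "\n",
    "# The same name always returns the same Logger instance\n",
    "print(ui_logger is same_logger)\n",
    "\n",
    "# A dotted name makes the logger a child of the logger named before the dot\n",
    "print(ui_logger.parent is app_logger)\n",
    "\n",
    "# A child without a level of its own inherits the effective level of its parent\n",
    "app_logger.setLevel(logging.ERROR)\n",
    "print(ui_logger.getEffectiveLevel() == logging.ERROR)"
   ]
  },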
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Formatter日志格式:**\n",
    "\n",
    "Formatter对象定义了log信息的结构和内容，构造时需要带两个参数：\n",
    "\n",
    "- fmt，默认会包含最基本的level和 message信息\n",
    "- datefmt，默认为 2003-07-08 16:49:45,896 (%Y-%m-%d %H:%M:%S)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "fmt中允许使用的变量可以参考下表。\n",
    "\n",
    "| %(name)s            | Logger的名字                                                 |\n",
    "| ------------------- | ------------------------------------------------------------ |\n",
    "| %(levelno)s         | 数字形式的日志级别                                           |\n",
    "| %(levelname)s       | 文本形式的日志级别                                           |\n",
    "| %(pathname)s        | 调用日志输出函数的模块的完整路径名，可能没有                 |\n",
    "| %(filename)s        | 调用日志输出函数的模块的文件名                               |\n",
    "| %(module)s          | 调用日志输出函数的模块名                                     |\n",
    "| %(funcName)s        | 调用日志输出函数的函数名                                     |\n",
    "| %(lineno)d          | 调用日志输出函数的语句所在的代码行                           |\n",
    "| %(created)f         | 当前时间，用UNIX标准的表示时间的浮 点数表示                  |\n",
    "| %(relativeCreated)d | 输出日志信息时的，自Logger创建以 来的毫秒数                  |\n",
    "| %(asctime)s         | 字符串形式的当前时间。默认格式是 “2003-07-08 16:49:45,896”。逗号后面的是毫秒 |\n",
    "| %(thread)d          | 线程ID。可能没有                                             |\n",
    "| %(threadName)s      | 线程名。可能没有                                             |\n",
    "| %(process)d         | 进程ID。可能没有                                             |\n",
    "| %(message)s         | 用户输出的消息                                               |"
   ]
  },
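  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "To see how fmt and datefmt work together, here is a minimal sketch that formats one record into an in-memory stream. The logger name `FormatterDemo` and the format strings are illustrative assumptions, not values used elsewhere in this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import io\n",
    "import logging\n",
    "\n",
    "# Capture the output in memory instead of the console\n",
    "stream = io.StringIO()\n",
    "handler = logging.StreamHandler(stream)\n",
    "handler.setFormatter(logging.Formatter(\n",
    "    fmt='%(asctime)s [%(levelname)s] %(name)s: %(message)s',\n",
    "    datefmt='%Y-%m-%d %H:%M:%S',  # an explicit datefmt drops the default milliseconds\n",
    "))\n",
    "\n",
    "demo_logger = logging.getLogger('FormatterDemo')\n",
    "demo_logger.setLevel(logging.INFO)\n",
    "demo_logger.propagate = False  # keep the record away from the root logger's handlers\n",
    "demo_logger.addHandler(handler)\n",
    "\n",
    "demo_logger.info('service started')\n",
    "formatted = stream.getvalue()\n",
    "print(formatted)"
   ]
  },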
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**SetLevel 日志级别:**\n",
    "\n",
    "Logging有如下级别: **DEBUG，INFO，WARNING，ERROR，CRITICAL**\n",
    "\n",
    "默认级别是WARNING，logging模块只会输出指定level以上的log。这样的好处, 就是在项目开发时debug用的log，在产品release阶段不用一一注释，只需要调整logger的级别就可以了，很方便。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "**Handler 日志处理器:**\n",
    "\n",
    "最常用的是StreamHandler和FileHandler, Handler用于向不同的输出端打log。\n",
    "Logging包含很多handler, 可能用到的有下面几种\n",
    "\n",
    "- StreamHandler instances send error messages to streams (file-like objects).\n",
    "- FileHandler instances send error messages to disk files.\n",
    "- RotatingFileHandler instances send error messages to disk files, with support for maximum log file sizes and log file rotation.\n",
    "- TimedRotatingFileHandler instances send error messages to disk files, rotating the log file at certain timed intervals.\n",
    "- SocketHandler instances send error messages to TCP/IP sockets.\n",
    "- DatagramHandler instances send error messages to UDP sockets.\n",
    "- SMTPHandler instances send error messages to a designated email address."
   ]
  },
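  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "As an illustration of one handler from the list, the sketch below attaches a RotatingFileHandler with a deliberately tiny size limit so that rotation is easy to observe. The logger name, file name, and limits are assumptions picked for the demo; real applications would normally use a much larger maxBytes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import logging\n",
    "import os\n",
    "import tempfile\n",
    "from logging.handlers import RotatingFileHandler\n",
    "\n",
    "# Write into a temporary directory so the demo leaves the working directory clean\n",
    "log_dir = tempfile.mkdtemp()\n",
    "log_path = os.path.join(log_dir, 'rotating_demo.log')\n",
    "\n",
    "# Roll over once the file would exceed 200 bytes; keep at most 2 backups\n",
    "handler = RotatingFileHandler(log_path, maxBytes=200, backupCount=2, encoding='utf-8')\n",
    "handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))\n",
    "\n",
    "rot_logger = logging.getLogger('RotatingDemo')\n",
    "rot_logger.setLevel(logging.INFO)\n",
    "rot_logger.propagate = False\n",
    "rot_logger.addHandler(handler)\n",
    "\n",
    "for i in range(50):\n",
    "    rot_logger.info('message number %d', i)\n",
    "\n",
    "handler.close()\n",
    "rot_logger.removeHandler(handler)\n",
    "\n",
    "# The current file plus the numbered backups\n",
    "log_files = sorted(os.listdir(log_dir))\n",
    "print(log_files)"
   ]
  },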
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**Configuration 配置方法:**\n",
    "\n",
    "logging的配置大致有下面几种方式。\n",
    "\n",
    "1. 通过代码进行完整配置，参考开头的例子，主要是通过getLogger方法实现。\n",
    "\n",
    "2. 通过代码进行简单配置，下面有例子，主要是通过basicConfig方法实现。\n",
    "\n",
    "3. 通过配置文件，下面有例子，主要是通过 logging.config.fileConfig(filepath)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "**logging.basicConfig**\n",
    "\n",
    "basicConfig()提供了非常便捷的方式让你配置logging模块并马上开始使用，可以参考下面的例子。具体可以配置的项目请查阅官方文档。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "import logging\n",
    " \n",
    "logging.basicConfig(filename='example.log',level=logging.DEBUG)\n",
    "logging.debug('This message should go to the log file')\n",
    " \n",
    "logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.DEBUG)\n",
    "logging.debug('This message should appear on the console')\n",
    " \n",
    "logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p')\n",
    "logging.warning('is when this event was logged.')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "备注： 其实你甚至可以什么都不配置直接使用默认值在控制台中打log，用这样的方式替换print语句对日后项目维护会有很大帮助。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**通过文件配置logging**\n",
    "\n",
    "有三种配置文件类型如果你希望通过配置文件来管理logging，详细可以参考这个[官方文档](https://docs.python.org/3.5/library/logging.config.html)。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# I met a trouble here, must add a root logger, because getlogger() mothd default give a root logger.\n",
    "[loggers]\n",
    "keys=root, file\n",
    "\n",
    "[logger_root]\n",
    "level=INFO\n",
    "handlers=baseHandler\n",
    "qualname=root\n",
    "\n",
    "[logger_file]\n",
    "level=DEBUG\n",
    "handlers=fileHandler\n",
    "qualname=file\n",
    "\n",
    "[handlers]\n",
    "keys=baseHandler, fileHandler\n",
    "\n",
    "[handler_baseHandler]\n",
    "class=StreamHandler\n",
    "level=INFO\n",
    "formatter=simpleFormatter\n",
    "args=(sys.stdout,)\n",
    "\n",
    "[handler_fileHandler]\n",
    "class=logging.handlers.RotatingFileHandler\n",
    "level=DEBUG\n",
    "formatter=multipleFormatter\n",
    "args=('./test.log', 10485760, 10)\n",
    "\n",
    "[formatters]\n",
    "keys=simpleFormatter, multipleFormatter\n",
    "\n",
    "[formatter_simpleFormatter]\n",
    "format=%(asctime)s - %(name)s - %(levelname)s - %(message)s\n",
    "datefmt=\n",
    "\n",
    "[formatter_multipleFormatter]\n",
    "format=%(asctime)s %(filename)s:%(funcName)s:%(lineno)d %(levelname)s %(message)s\n",
    "datefmt=%Y-%m-%d %H:%M:%S %p"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "假设以上的配置文件放在和模块相同的目录，代码中的调用如下:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import logging\n",
    "from logging import config\n",
    "\n",
    "config.fileConfig('logging.ini')\n",
    "logger = logging.getLogger()\n",
    "logger.info(\"123\")"
   ]
  },
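  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "For a version that runs end to end without a separate file on disk, the sketch below writes a reduced configuration to a temporary file and loads it with fileConfig(). The section names follow the format shown above; the handler name `consoleHandler`, the file name, and the trimmed-down settings are assumptions for the demo. Note that running it replaces the root logger's configuration for the current session."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import logging\n",
    "import logging.config\n",
    "import os\n",
    "import tempfile\n",
    "\n",
    "# A reduced configuration: one root logger, one console handler, one formatter\n",
    "INI_TEXT = '''\n",
    "[loggers]\n",
    "keys=root\n",
    "\n",
    "[handlers]\n",
    "keys=consoleHandler\n",
    "\n",
    "[formatters]\n",
    "keys=simpleFormatter\n",
    "\n",
    "[logger_root]\n",
    "level=INFO\n",
    "handlers=consoleHandler\n",
    "\n",
    "[handler_consoleHandler]\n",
    "class=StreamHandler\n",
    "level=INFO\n",
    "formatter=simpleFormatter\n",
    "args=(sys.stdout,)\n",
    "\n",
    "[formatter_simpleFormatter]\n",
    "format=%(name)s - %(levelname)s - %(message)s\n",
    "'''\n",
    "\n",
    "ini_path = os.path.join(tempfile.mkdtemp(), 'logging_demo.ini')\n",
    "with open(ini_path, 'w', encoding='utf-8') as f:\n",
    "    f.write(INI_TEXT)\n",
    "\n",
    "# disable_existing_loggers=False keeps loggers created earlier in the session usable\n",
    "logging.config.fileConfig(ini_path, disable_existing_loggers=False)\n",
    "\n",
    "root_logger = logging.getLogger()\n",
    "root_logger.info('configured from %s', os.path.basename(ini_path))"
   ]
  },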
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## 第三方模块\n",
    "\n",
    "内置模块是安装好python后就可以直接使用的，而第三方模块就是需要另外安装的，使用之前提到的pip 或 esay_install 安装。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "### requests \n",
    "\n",
    "Requests 是使用 Apache2 Licensed 许可证的 基于Python开发的HTTP 库，其在Python内置模块的基础上进行了高度的封装，从而使得Pythoner进行网络请求时，变得美好了许多，使用Requests可以轻而易举的完成浏览器可有的任何操作。让HTPP服务人类～"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "1、安装模块"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "pip3 install requests"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "2、使用模块"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**Get请求：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "http://wthrcdn.etouch.cn/weather_mini?city=%E5%8C%97%E4%BA%AC\n",
      "{\"data\":{\"yesterday\":{\"date\":\"19日星期一\",\"high\":\"高温 15℃\",\"fx\":\"东风\",\"low\":\"低温 2℃\",\"fl\":\"<![CDATA[<3级]]>\",\"type\":\"多云\"},\"city\":\"北京\",\"aqi\":\"26\",\"forecast\":[{\"date\":\"20日星期二\",\"high\":\"高温 10℃\",\"fengli\":\"<![CDATA[<3级]]>\",\"low\":\"低温 -2℃\",\"fengxiang\":\"西南风\",\"type\":\"多云\"},{\"date\":\"21日星期三\",\"high\":\"高温 13℃\",\"fengli\":\"<![CDATA[<3级]]>\",\"low\":\"低温 1℃\",\"fengxiang\":\"西南风\",\"type\":\"多云\"},{\"date\":\"22日星期四\",\"high\":\"高温 16℃\",\"fengli\":\"<![CDATA[<3级]]>\",\"low\":\"低温 4℃\",\"fengxiang\":\"西南风\",\"type\":\"多云\"},{\"date\":\"23日星期五\",\"high\":\"高温 20℃\",\"fengli\":\"<![CDATA[<3级]]>\",\"low\":\"低温 5℃\",\"fengxiang\":\"北风\",\"type\":\"多云\"},{\"date\":\"24日星期六\",\"high\":\"高温 21℃\",\"fengli\":\"<![CDATA[<3级]]>\",\"low\":\"低温 7℃\",\"fengxiang\":\"北风\",\"type\":\"晴\"}],\"ganmao\":\"将有一次强降温过程，天气寒冷，极易发生感冒，请特别注意增加衣服保暖防寒。\",\"wendu\":\"7\"},\"status\":1000,\"desc\":\"OK\"}\n"
     ]
    }
   ],
   "source": [
    "# 无参数实例\n",
    " \n",
    "import requests\n",
    "\n",
    "ret = requests.get('http://wthrcdn.etouch.cn/weather_mini?city=北京')\n",
    "\n",
    "print(ret.url)\n",
    "print(ret.text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "http://httpbin.org/get?key1=value1&key2=value2\n",
      "{\n",
      "  \"args\": {\n",
      "    \"key1\": \"value1\", \n",
      "    \"key2\": \"value2\"\n",
      "  }, \n",
      "  \"headers\": {\n",
      "    \"Accept\": \"*/*\", \n",
      "    \"Accept-Encoding\": \"gzip, deflate\", \n",
      "    \"Connection\": \"close\", \n",
      "    \"Host\": \"httpbin.org\", \n",
      "    \"User-Agent\": \"python-requests/2.18.4\"\n",
      "  }, \n",
      "  \"origin\": \"111.207.143.125\", \n",
      "  \"url\": \"http://httpbin.org/get?key1=value1&key2=value2\"\n",
      "}\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# 有参数实例\n",
    "\n",
    "import requests\n",
    " \n",
    "payload = {'key1': 'value1', 'key2': 'value2'}\n",
    "ret = requests.get(\"http://httpbin.org/get\", params=payload)\n",
    " \n",
    "print(ret.url)\n",
    "print(ret.text)"
   ]
  },
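  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "To inspect how `params` is encoded into the URL without touching the network, one option is to build the request and prepare it without sending it. This is a small sketch; the parameter names are illustrative, and the `city` value is borrowed from the weather example above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import requests\n",
    "\n",
    "# Build the request object; nothing is sent at this point\n",
    "req = requests.Request('GET', 'http://httpbin.org/get',\n",
    "                       params={'key1': 'value1', 'city': '北京'})\n",
    "\n",
    "# prepare() applies the same encoding that requests.get() would use\n",
    "prepared = req.prepare()\n",
    "\n",
    "# Non-ASCII values are percent-encoded as UTF-8\n",
    "print(prepared.url)"
   ]
  },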
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**POST请求：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"args\": {}, \n",
      "  \"data\": \"\", \n",
      "  \"files\": {}, \n",
      "  \"form\": {\n",
      "    \"key1\": \"value1\", \n",
      "    \"key2\": \"value2\"\n",
      "  }, \n",
      "  \"headers\": {\n",
      "    \"Accept\": \"*/*\", \n",
      "    \"Accept-Encoding\": \"gzip, deflate\", \n",
      "    \"Connection\": \"close\", \n",
      "    \"Content-Length\": \"23\", \n",
      "    \"Content-Type\": \"application/x-www-form-urlencoded\", \n",
      "    \"Host\": \"httpbin.org\", \n",
      "    \"User-Agent\": \"python-requests/2.18.4\"\n",
      "  }, \n",
      "  \"json\": null, \n",
      "  \"origin\": \"111.207.143.125\", \n",
      "  \"url\": \"http://httpbin.org/post\"\n",
      "}\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# 基本POST实例\n",
    " \n",
    "import requests\n",
    " \n",
    "payload = {'key1': 'value1', 'key2': 'value2'}\n",
    "ret = requests.post(\"http://httpbin.org/post\", data=payload)\n",
    " \n",
    "print(ret.text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"args\": {}, \n",
      "  \"data\": \"{\\\"some\\\": \\\"data\\\"}\", \n",
      "  \"files\": {}, \n",
      "  \"form\": {}, \n",
      "  \"headers\": {\n",
      "    \"Accept\": \"*/*\", \n",
      "    \"Accept-Encoding\": \"gzip, deflate\", \n",
      "    \"Connection\": \"close\", \n",
      "    \"Content-Length\": \"16\", \n",
      "    \"Content-Type\": \"application/json\", \n",
      "    \"Host\": \"httpbin.org\", \n",
      "    \"User-Agent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36\"\n",
      "  }, \n",
      "  \"json\": {\n",
      "    \"some\": \"data\"\n",
      "  }, \n",
      "  \"origin\": \"111.207.143.125\", \n",
      "  \"url\": \"http://httpbin.org/post\"\n",
      "}\n",
      "\n",
      "<RequestsCookieJar[]>\n"
     ]
    }
   ],
   "source": [
    "# 发送请求头和数据实例\n",
    " \n",
    "import requests\n",
    "import json\n",
    " \n",
    "url = \"http://httpbin.org/post\"\n",
    "payload = {'some': 'data'}\n",
    "headers = {'content-type': 'application/json', \"User-Agent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36\"}\n",
    " \n",
    "ret = requests.post(url, data=json.dumps(payload), headers=headers)\n",
    " \n",
    "print(ret.text)\n",
    "print(ret.cookies)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**其他方法：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "requests.get(url, params=None, **kwargs)\n",
    "requests.post(url, data=None, json=None, **kwargs)\n",
    "requests.put(url, data=None, **kwargs)\n",
    "requests.head(url, **kwargs)\n",
    "requests.delete(url, **kwargs)\n",
    "requests.patch(url, data=None, **kwargs)\n",
    "requests.options(url, **kwargs)\n",
    " \n",
    "# 以上方法均是在此方法的基础上构建\n",
    "requests.request(method, url, **kwargs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "更多细节请查看requests模块[官方文档](http://cn.python-requests.org/zh_CN/latest/)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**HTP请求和实例：**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "检测QQ账号是否在线"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 107,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "在线\n"
     ]
    }
   ],
   "source": [
    "import requests\n",
    "from xml.etree import ElementTree as ET\n",
    "\n",
    "# 使用第三方模块requests发送HTTP请求，或者XML格式内容\n",
    "qqCode = 787710500\n",
    "r = requests.get('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=%s' % qqCode)\n",
    "result = r.text\n",
    "\n",
    "# 解析XML格式内容\n",
    "node = ET.XML(result)\n",
    "\n",
    "# 获取内容\n",
    "if node.text == \"Y\":\n",
    "    print(\"在线\")\n",
    "else:\n",
    "    print(\"离线\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**列车时刻表查询：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "贵阳北（车次：G82） 08:37:00 TrainDetailInfo {'{urn:schemas-microsoft-com:xml-diffgram-v1}id': 'TrainDetailInfo1', '{urn:schemas-microsoft-com:xml-msdata}rowOrder': '0', '{urn:schemas-microsoft-com:xml-diffgram-v1}hasChanges': 'inserted'}\n",
      "怀化南 10:12:00 TrainDetailInfo {'{urn:schemas-microsoft-com:xml-diffgram-v1}id': 'TrainDetailInfo2', '{urn:schemas-microsoft-com:xml-msdata}rowOrder': '1', '{urn:schemas-microsoft-com:xml-diffgram-v1}hasChanges': 'inserted'}\n",
      "长沙南 11:37:00 TrainDetailInfo {'{urn:schemas-microsoft-com:xml-diffgram-v1}id': 'TrainDetailInfo3', '{urn:schemas-microsoft-com:xml-msdata}rowOrder': '2', '{urn:schemas-microsoft-com:xml-diffgram-v1}hasChanges': 'inserted'}\n",
      "武汉 12:58:00 TrainDetailInfo {'{urn:schemas-microsoft-com:xml-diffgram-v1}id': 'TrainDetailInfo4', '{urn:schemas-microsoft-com:xml-msdata}rowOrder': '3', '{urn:schemas-microsoft-com:xml-diffgram-v1}hasChanges': 'inserted'}\n",
      "郑州东 14:49:00 TrainDetailInfo {'{urn:schemas-microsoft-com:xml-diffgram-v1}id': 'TrainDetailInfo5', '{urn:schemas-microsoft-com:xml-msdata}rowOrder': '4', '{urn:schemas-microsoft-com:xml-diffgram-v1}hasChanges': 'inserted'}\n",
      "石家庄 16:14:00 TrainDetailInfo {'{urn:schemas-microsoft-com:xml-diffgram-v1}id': 'TrainDetailInfo6', '{urn:schemas-microsoft-com:xml-msdata}rowOrder': '5', '{urn:schemas-microsoft-com:xml-diffgram-v1}hasChanges': 'inserted'}\n",
      "北京西 None TrainDetailInfo {'{urn:schemas-microsoft-com:xml-diffgram-v1}id': 'TrainDetailInfo7', '{urn:schemas-microsoft-com:xml-msdata}rowOrder': '6', '{urn:schemas-microsoft-com:xml-diffgram-v1}hasChanges': 'inserted'}\n"
     ]
    }
   ],
   "source": [
    "import requests\n",
    "from xml.etree import ElementTree as ET\n",
    "\n",
    "# 使用第三方模块requests发送HTTP请求，或者XML格式内容\n",
    "train = \"G82\"\n",
    "r = requests.get('http://www.webxml.com.cn/WebServices/TrainTimeWebService.asmx/getDetailInfoByTrainCode?TrainCode=%s&UserID=' % train)\n",
    "result = r.text\n",
    "\n",
    "# 解析XML格式内容\n",
    "root = ET.XML(result)\n",
    "for node in root.iter('TrainDetailInfo'):\n",
    "    print(node.find('TrainStation').text,node.find('StartTime').text,node.tag,node.attrib)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### paramiko\n",
    "\n",
    "paramiko是一个用于做远程控制的模块，使用该模块可以对远程服务器进行命令或文件操作，值得一说的是，fabric和ansible内部的远程管理就是使用的paramiko来现实。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "1、下载安装"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "pip install paramiko"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "2、使用模块"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**执行命令（用户名+密码）：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import paramiko\n",
    "\n",
    "ssh = paramiko.SSHClient()\n",
    "ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())\n",
    "ssh.connect('10.201.102.199', 22, 'root', 'megvii')\n",
    "stdin, stdout, stderr = ssh.exec_command('df')\n",
    "print(str(stdout.read(), encoding='utf-8'))\n",
    "ssh.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**执行命令（密钥）：**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "a. 生成密钥对"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    ">>> ssh-keygen"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "b. 发送密钥给远端的主机"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
>> ssh-coppy-id">
    ">>> ssh-copy-id root@10.201.102.199"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "import paramiko\n",
    "\n",
    "private_key_path = '/root/.ssh/id_rsa'\n",
    "key = paramiko.RSAKey.from_private_key_file(private_key_path)\n",
    "\n",
    "ssh = paramiko.SSHClient()\n",
    "ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())\n",
    "ssh.connect('10.201.102.198', 22, 'root', key)\n",
    "\n",
    "stdin, stdout, stderr = ssh.exec_command('df')\n",
    "print(str(stdout.read(), encoding='utf-8'))\n",
    "ssh.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**上传和下载文件（用户名+密码）：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import paramiko\n",
    "\n",
    "t = paramiko.Transport(('10.201.102.198', 22))\n",
    "t.connect(username='root', password='megvii')\n",
    "sftp = paramiko.SFTPClient.from_transport(t)\n",
    "sftp.put('/root/testfile', '/root/text')  # 上传文件,目标路径要写全,写上文件名\n",
    "sftp.get('/root/text', '/root/testfile1')  # 下载文件.\n",
    "t.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "**上传和下载文件（密钥）：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import paramiko\n",
    "\n",
    "pravie_key_path = '/root/.ssh/id_rsa'\n",
    "key = paramiko.RSAKey.from_private_key_file(pravie_key_path)\n",
    "\n",
    "t = paramiko.Transport(('10.201.102.198',22))\n",
    "t.connect(username='root', pkey=key)\n",
    "\n",
    "sftp = paramiko.SFTPClient.from_transport(t)\n",
    "sftp.put('/root/testfile', '/root/text')\n",
    "sftp.get('/root/text', '/root/testfile1')\n",
    "t.close()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  },
  "toc": {
   "colors": {
    "hover_highlight": "#DAA520",
    "navigate_num": "#000000",
    "navigate_text": "#333333",
    "running_highlight": "#FF0000",
    "selected_highlight": "#FFD700",
    "sidebar_border": "#EEEEEE",
    "wrapper_background": "#FFFFFF"
   },
   "moveMenuLeft": true,
   "nav_menu": {
    "height": "156px",
    "width": "252px"
   },
   "navigate_menu": true,
   "number_sections": false,
   "sideBar": false,
   "threshold": 4,
   "toc_cell": true,
   "toc_position": {
    "height": "649px",
    "left": "1px",
    "right": "20px",
    "top": "111px",
    "width": "129px"
   },
   "toc_section_display": "block",
   "toc_window_display": false,
   "widenNotebook": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}