{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    },
    "toc": "true"
   },
   "source": [
    "# Table of Contents\n",
    " <p><div class=\"lev1 toc-item\"><a href=\"#Python-File-Operations\" data-toc-modified-id=\"Python-File-Operations-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Python File Operations</a></div><div class=\"lev2 toc-item\"><a href=\"#Opening-a-File\" data-toc-modified-id=\"Opening-a-File-11\"><span class=\"toc-item-num\">1.1&nbsp;&nbsp;</span>Opening a File</a></div><div class=\"lev2 toc-item\"><a href=\"#File-Open-Modes\" data-toc-modified-id=\"File-Open-Modes-12\"><span class=\"toc-item-num\">1.2&nbsp;&nbsp;</span>File Open Modes</a></div><div class=\"lev2 toc-item\"><a href=\"#The-open-Object\" data-toc-modified-id=\"The-open-Object-13\"><span class=\"toc-item-num\">1.3&nbsp;&nbsp;</span>The open Object</a></div><div class=\"lev2 toc-item\"><a href=\"#Operating-on-Files\" data-toc-modified-id=\"Operating-on-Files-14\"><span class=\"toc-item-num\">1.4&nbsp;&nbsp;</span>Operating on Files</a></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Python File Operations\n",
    "\n",
    "Working with a file in Python generally involves the following steps:\n",
    "- Open the file\n",
    "- Operate on the file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## Opening a File"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# file_handle = open('file_path', 'mode')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
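   },
   "source": [
    "Python's built-in open function opens a file on the system and creates a file object; the file is then read or written through that object.\n",
    "\n",
    "When opening a file, you specify the file path and the mode in which to open it. The call returns a handle to the file, and all further operations on the file go through that handle."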
   ]
  },
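  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "As a minimal sketch of the open-then-operate flow described above, the cell below writes a small file, reads it back through a new handle, and removes it again. The file name `demo_open.txt` is a hypothetical name chosen only for this illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "demo_path = 'demo_open.txt'  # hypothetical file name, used only for this illustration\n",
    "\n",
    "# Open for writing; open() returns the file object (the handle).\n",
    "handle = open(demo_path, 'w', encoding='utf-8')\n",
    "handle.write('hello\\n')\n",
    "handle.close()  # release the handle when done\n",
    "\n",
    "# Open again for reading and fetch the whole content.\n",
    "handle = open(demo_path, 'r', encoding='utf-8')\n",
    "text = handle.read()\n",
    "handle.close()\n",
    "\n",
    "os.remove(demo_path)  # clean up the demo file\n",
    "print(repr(text), handle.closed)"
   ]
  },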
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
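   },
   "source": [
    "## File Open Modes\n",
    "\n",
    "The modes for opening a file are:\n",
    "- r: read-only (the default); the file must exist\n",
    "- w: write-only (not readable); creates the file if it does not exist and empties it if it does, so use with care!\n",
    "- a: append (write-only, not readable); creates the file if it does not exist, otherwise writes go to the end of the file\n",
    "\n",
    "\"+\" update modes add the missing direction:\n",
    "- r+: read and write; the file must exist and is not truncated, and writes start at the current position\n",
    "- w+: write and read (an existing file is emptied first, careful!)\n",
    "- a+: append and read; writes go to the end of the file\n",
    "\n",
    "\"U\" universal-newline mode converts \\r, \\n and \\r\\n to \\n when reading. It is a Python 2 feature: Python 3 text mode performs this conversion by default, \"U\" is deprecated there and cannot be combined with \"+\", and it was removed in Python 3.11:\n",
    "- rU\n",
    "- r+U (Python 2 only)\n",
    "\n",
    "\"b\" binary mode, combined with the other modes, handles binary files such as images:\n",
    "- rb\n",
    "- wb\n",
    "- ab"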
   ]
  },
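  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "The differences between the modes listed above can be sketched on a scratch file. This is an illustration under Python 3, not an exhaustive test of every mode; `demo_modes.txt` and the variable names are hypothetical and chosen only for this example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import io\n",
    "import os\n",
    "\n",
    "path = 'demo_modes.txt'  # hypothetical file name\n",
    "\n",
    "f = open(path, 'w', encoding='utf-8')    # 'w' creates or truncates\n",
    "f.write('one\\n')\n",
    "f.close()\n",
    "\n",
    "f = open(path, 'a', encoding='utf-8')    # 'a' appends at the end\n",
    "f.write('two\\n')\n",
    "try:\n",
    "    f.read()                             # plain 'a' is write-only\n",
    "    append_readable = True\n",
    "except io.UnsupportedOperation:\n",
    "    append_readable = False\n",
    "f.close()\n",
    "\n",
    "f = open(path, 'r+', encoding='utf-8')   # 'r+' reads and writes, no truncation\n",
    "f.write('ONE')                           # overwrites from position 0\n",
    "f.seek(0)\n",
    "after_update = f.read()\n",
    "f.close()\n",
    "\n",
    "f = open(path, 'rb')                     # 'b' yields bytes instead of str\n",
    "raw = f.read()\n",
    "f.close()\n",
    "\n",
    "os.remove(path)  # clean up the demo file\n",
    "print(append_readable, repr(after_update), type(raw).__name__)"
   ]
  },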
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## The open Object"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "# Stub of the Python 2 built-in file type, shown for reference only.\n",
    "# In Python 3, open() returns io objects with mostly the same methods;\n",
    "# next() and xreadlines() are replaced by iterating over the file.\n",
    "class file(object):\n",
    "\n",
    "    def close(self): # real signature unknown; restored from __doc__\n",
    "        # Close the file.\n",
    "        \"\"\"\n",
    "        close() -> None or (perhaps) an integer.  Close the file.\n",
    "\n",
    "        Sets data attribute .closed to True.  A closed file cannot be used for\n",
    "        further I/O operations.  close() may be called more than once without\n",
    "        error.  Some kinds of file objects (for example, opened by popen())\n",
    "        may return an exit status upon closing.\n",
    "        \"\"\"\n",
    "\n",
    "    def fileno(self): # real signature unknown; restored from __doc__\n",
    "        # Return the file descriptor.\n",
    "        \"\"\"\n",
    "        fileno() -> integer \"file descriptor\".\n",
    "\n",
    "        This is needed for lower-level file interfaces, such as os.read().\n",
    "        \"\"\"\n",
    "        return 0\n",
    "\n",
    "    def flush(self): # real signature unknown; restored from __doc__\n",
    "        # Flush the file's internal buffer.\n",
    "        \"\"\" flush() -> None.  Flush the internal I/O buffer. \"\"\"\n",
    "        pass\n",
    "\n",
    "\n",
    "    def isatty(self): # real signature unknown; restored from __doc__\n",
    "        # Check whether the file is connected to a tty device.\n",
    "        \"\"\" isatty() -> true or false.  True if the file is connected to a tty device. \"\"\"\n",
    "        return False\n",
    "\n",
    "\n",
    "    def next(self): # real signature unknown; restored from __doc__\n",
    "        # Return the next line; raise StopIteration if there is none.\n",
    "        \"\"\" x.next() -> the next value, or raise StopIteration \"\"\"\n",
    "        pass\n",
    "\n",
    "    def read(self, size=None): # real signature unknown; restored from __doc__\n",
    "        # Read up to the specified number of bytes.\n",
    "        \"\"\"\n",
    "        read([size]) -> read at most size bytes, returned as a string.\n",
    "\n",
    "        If the size argument is negative or omitted, read until EOF is reached.\n",
    "        Notice that when in non-blocking mode, less data than what was requested\n",
    "        may be returned, even if no size parameter was given.\n",
    "        \"\"\"\n",
    "        pass\n",
    "\n",
    "    def readinto(self): # real signature unknown; restored from __doc__\n",
    "        # Read into a buffer; do not use, it may be removed.\n",
    "        \"\"\" readinto() -> Undocumented.  Don't use this; it may go away. \"\"\"\n",
    "        pass\n",
    "\n",
    "    def readline(self, size=None): # real signature unknown; restored from __doc__\n",
    "        # Read a single line.\n",
    "        \"\"\"\n",
    "        readline([size]) -> next line from the file, as a string.\n",
    "\n",
    "        Retain newline.  A non-negative size argument limits the maximum\n",
    "        number of bytes to return (an incomplete line may be returned then).\n",
    "        Return an empty string at EOF.\n",
    "        \"\"\"\n",
    "        pass\n",
    "\n",
    "    def readlines(self, size=None): # real signature unknown; restored from __doc__\n",
    "        # Read all data and return it as a list split on line breaks.\n",
    "        \"\"\"\n",
    "        readlines([size]) -> list of strings, each a line from the file.\n",
    "\n",
    "        Call readline() repeatedly and return a list of the lines so read.\n",
    "        The optional size argument, if given, is an approximate bound on the\n",
    "        total number of bytes in the lines returned.\n",
    "        \"\"\"\n",
    "        return []\n",
    "\n",
    "    def seek(self, offset, whence=None): # real signature unknown; restored from __doc__\n",
    "        # Set the position of the file pointer.\n",
    "        \"\"\"\n",
    "        seek(offset[, whence]) -> None.  Move to new file position.\n",
    "\n",
    "        Argument offset is a byte count.  Optional argument whence defaults to\n",
    "        0 (offset from start of file, offset should be >= 0); other values are 1\n",
    "        (move relative to current position, positive or negative), and 2 (move\n",
    "        relative to end of file, usually negative, although many platforms allow\n",
    "        seeking beyond the end of a file).  If the file is opened in text mode,\n",
    "        only offsets returned by tell() are legal.  Use of other offsets causes\n",
    "        undefined behavior.\n",
    "        Note that not all file objects are seekable.\n",
    "        \"\"\"\n",
    "        pass\n",
    "\n",
    "    def tell(self): # real signature unknown; restored from __doc__\n",
    "        # Return the current position of the file pointer.\n",
    "        \"\"\" tell() -> current file position, an integer (may be a long integer). \"\"\"\n",
    "        pass\n",
    "\n",
    "    def truncate(self, size=None): # real signature unknown; restored from __doc__\n",
    "        # Truncate the file, keeping only the data before the given size.\n",
    "        \"\"\"\n",
    "        truncate([size]) -> None.  Truncate the file to at most size bytes.\n",
    "\n",
    "        Size defaults to the current file position, as returned by tell().\n",
    "        \"\"\"\n",
    "        pass\n",
    "\n",
    "    def write(self, p_str): # real signature unknown; restored from __doc__\n",
    "        # Write content.\n",
    "        \"\"\"\n",
    "        write(str) -> None.  Write string str to file.\n",
    "\n",
    "        Note that due to buffering, flush() or close() may be needed before\n",
    "        the file on disk reflects the data written.\n",
    "        \"\"\"\n",
    "        pass\n",
    "\n",
    "    def writelines(self, sequence_of_strings): # real signature unknown; restored from __doc__\n",
    "        # Write a list of strings to the file.\n",
    "        \"\"\"\n",
    "        writelines(sequence_of_strings) -> None.  Write the strings to the file.\n",
    "\n",
    "        Note that newlines are not added.  The sequence can be any iterable object\n",
    "        producing strings. This is equivalent to calling write() for each string.\n",
    "        \"\"\"\n",
    "        pass\n",
    "\n",
    "    def xreadlines(self): # real signature unknown; restored from __doc__\n",
    "        # Read the file line by line instead of all at once.\n",
    "        \"\"\"\n",
    "        xreadlines() -> returns self.\n",
    "\n",
    "        For backward compatibility. File objects now include the performance\n",
    "        optimizations previously implemented in the xreadlines module.\n",
    "        \"\"\"\n",
    "        pass"
   ]
  },
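  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "A few of the methods listed above can be tried directly on the object that open returns in Python 3. The cell below is a small sketch of writelines, readline, readlines and plain iteration; `demo_methods.txt` and the variable names are hypothetical and used only for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "path = 'demo_methods.txt'  # hypothetical file name\n",
    "\n",
    "f = open(path, 'w', encoding='utf-8')\n",
    "# writelines() adds no newlines, so each string carries its own.\n",
    "f.writelines(['alpha\\n', 'beta\\n', 'gamma\\n'])\n",
    "f.close()\n",
    "\n",
    "f = open(path, 'r', encoding='utf-8')\n",
    "first = f.readline()   # one line, trailing newline kept\n",
    "rest = f.readlines()   # the remaining lines as a list\n",
    "at_eof = f.readline()  # an empty string signals end of file\n",
    "f.close()\n",
    "\n",
    "f = open(path, 'r', encoding='utf-8')\n",
    "stripped = [line.rstrip('\\n') for line in f]  # iterate line by line\n",
    "f.close()\n",
    "\n",
    "os.remove(path)  # clean up the demo file\n",
    "print(repr(first), rest, repr(at_eof), stripped)"
   ]
  },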
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Operating on Files"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "\n",
      "1\n",
      "\n",
      "2\n",
      "\n",
      "3\n",
      "\n",
      "4\n",
      "\n"
     ]
    }
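   ],
   "source": [
    "# Create the file\n",
    "create_file = open('open.file', 'w', encoding='utf-8')\n",
    "for i in range(5):\n",
    "    content = str(i) + '\\n'  # convert the number to a string and append a newline\n",
    "    create_file.write(content)\n",
    "create_file.close()  # close the file handle\n",
    "\n",
    "# Read the file content\n",
    "read_file = open('open.file', 'r', encoding=\"utf-8\")  # open the file with an explicit encoding\n",
    "for line in read_file:  # the file object created by open is iterable\n",
    "    print(line)  # each line keeps its newline, so print adds a blank line after it\n",
    "read_file.close()  # close the file handle; always close a file you have opened"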
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'0'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "'2'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "'\\n'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "16"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "'0\\n1\\nthis is new word'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
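   ],
   "source": [
    "new_read_file = open('open.file', 'r+', encoding='utf-8')\n",
    "new_read_file.read(1)  # read at most 1 character (text mode counts characters, not bytes)\n",
    "new_read_file.seek(4)  # move the file pointer to offset 4\n",
    "new_read_file.tell()   # get the current pointer position\n",
    "new_read_file.read(1)\n",
    "new_read_file.read(1)\n",
    "new_read_file.seek(4)  # move the file pointer to offset 4\n",
    "new_read_file.tell()   # get the current pointer position\n",
    "new_read_file.write('this is new word')  # after a seek, write overwrites the content from the pointer onward, careful!\n",
    "new_read_file.close()\n",
    "\n",
    "check_read_file = open('open.file', 'r+', encoding='utf-8')\n",
    "check_read_file.read()\n",
    "check_read_file.close()  # close the handle used for checking as well"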
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
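   },
   "source": [
    "Having to remember to close every file you open is tedious. Use **with** instead: it closes the file for you when the block exits, even if an exception is raised. **with** can also open several files at once."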
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "16"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "'9999\\n1\\nthis is new word'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import os\n",
    "\n",
    "# Modify a file \"in place\": copy it line by line into a new file, applying the\n",
    "# change to each line before it is written, then replace the old file with the new one.\n",
    "with open('open.file', 'r', encoding='utf-8') as old_file, open('newopen.file', 'w', encoding='utf-8') as new_file:\n",
    "    for line in old_file:  # iterate lazily instead of loading every line with readlines()\n",
    "        if '0' in line:\n",
    "            line = line.replace('0', '9999')\n",
    "        new_file.write(line)\n",
    "\n",
    "os.replace('newopen.file', 'open.file')  # the new file overwrites the old one (also works on Windows)\n",
    "\n",
    "with open('open.file', 'r', encoding='utf-8') as check_file:\n",
    "    check_file.read()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.2"
  },
  "toc": {
   "colors": {
    "hover_highlight": "#DAA520",
    "navigate_num": "#000000",
    "navigate_text": "#333333",
    "running_highlight": "#FF0000",
    "selected_highlight": "#FFD700",
    "sidebar_border": "#EEEEEE",
    "wrapper_background": "#FFFFFF"
   },
   "moveMenuLeft": true,
   "nav_menu": {
    "height": "12px",
    "width": "252px"
   },
   "navigate_menu": true,
   "number_sections": false,
   "sideBar": true,
   "threshold": 4,
   "toc_cell": true,
   "toc_section_display": "block",
   "toc_window_display": false,
   "widenNotebook": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}