{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" }, "toc": "true" }, "source": [ "# Table of Contents\n", "

1  Python File Operations
1.1  Opening a File
1.2  File Open Modes
1.3  The open Object
1.4  Working with Files
" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Python 文件操作\n", "\n", "Python操作文件时,一般要经历如下步骤\n", "- 打开文件\n", "- 操作文件" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## 打开文件" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# 文件句柄 = open('文件路径', ‘模式’)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Python内置open函数可以打开一个系统中存在的文件,会创建一个文件对象,通过这个文件对象就可以通过底文件进行操作。\n", "\n", "打开文件时,需要指定文件路径和以什么模式打开文件。 打开后获得该文件的句柄,通过问句柄对该文件操作。" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## 打开文件模式\n", "\n", "打开文件的模式有:\n", "- r,只读模式(默认)\n", "- w,只写模式(不可读;路径文件不存在将会创建文件,如果存在会清空里面的内容;注意!)\n", "- a,追加模式(可读;文件不存在将创建,文件存在则在末尾追加内容。)\n", "\n", "“+” 增强模式:\n", "- r+,可读,可写,可追加\n", "- w+,写读(文件存在会清空里面内容,注意!)\n", "- a+,同a\n", "\n", "“U” 兼容模式,在读取文件时,可以将\\r \\n \\r\\n自动转换成 \\n (与 r 或 r+ 模式同使用):\n", "- rU\n", "- r+U\n", "\n", "“b” 处理二进制文件,与其他模式可(如图片):\n", "- rb\n", "- wb\n", "- ab" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## open 对象" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class file(object):\n", " \n", " def close(self): # real signature unknown; restored from __doc__\n", " 关闭文件\n", " \"\"\"\n", " close() -> None or (perhaps) an integer. Close the file.\n", " \n", " Sets data attribute .closed to True. A closed file cannot be used for\n", " further I/O operations. close() may be called more than once without\n", " error. 
Some kinds of file objects (for example, opened by popen())\n", "        may return an exit status upon closing.\n", "        \"\"\"\n", "\n", "    def fileno(self): # real signature unknown; restored from __doc__\n", "        # File descriptor\n", "        \"\"\"\n", "        fileno() -> integer \"file descriptor\".\n", "\n", "        This is needed for lower-level file interfaces, such as os.read().\n", "        \"\"\"\n", "        return 0\n", "\n", "    def flush(self): # real signature unknown; restored from __doc__\n", "        # Flush the file's internal buffer\n", "        \"\"\" flush() -> None. Flush the internal I/O buffer. \"\"\"\n", "        pass\n", "\n", "\n", "    def isatty(self): # real signature unknown; restored from __doc__\n", "        # Check whether the file is connected to a tty device\n", "        \"\"\" isatty() -> true or false. True if the file is connected to a tty device. \"\"\"\n", "        return False\n", "\n", "\n", "    def next(self): # real signature unknown; restored from __doc__\n", "        # Get the next line; raises StopIteration if there is none\n", "        \"\"\" x.next() -> the next value, or raise StopIteration \"\"\"\n", "        pass\n", "\n", "    def read(self, size=None): # real signature unknown; restored from __doc__\n", "        # Read up to the given number of bytes\n", "        \"\"\"\n", "        read([size]) -> read at most size bytes, returned as a string.\n", "\n", "        If the size argument is negative or omitted, read until EOF is reached.\n", "        Notice that when in non-blocking mode, less data than what was requested\n", "        may be returned, even if no size parameter was given.\n", "        \"\"\"\n", "        pass\n", "\n", "    def readinto(self): # real signature unknown; restored from __doc__\n", "        # Read into a buffer; do not use, it may be removed\n", "        \"\"\" readinto() -> Undocumented. Don't use this; it may go away. \"\"\"\n", "        pass\n", "\n", "    def readline(self, size=None): # real signature unknown; restored from __doc__\n", "        # Read a single line only\n", "        \"\"\"\n", "        readline([size]) -> next line from the file, as a string.\n", "\n", "        Retain newline. 
A non-negative size argument limits the maximum\n", "        number of bytes to return (an incomplete line may be returned then).\n", "        Return an empty string at EOF.\n", "        \"\"\"\n", "        pass\n", "\n", "    def readlines(self, size=None): # real signature unknown; restored from __doc__\n", "        # Read all the data and return it as a list split on line breaks\n", "        \"\"\"\n", "        readlines([size]) -> list of strings, each a line from the file.\n", "\n", "        Call readline() repeatedly and return a list of the lines so read.\n", "        The optional size argument, if given, is an approximate bound on the\n", "        total number of bytes in the lines returned.\n", "        \"\"\"\n", "        return []\n", "\n", "    def seek(self, offset, whence=None): # real signature unknown; restored from __doc__\n", "        # Set the position of the file pointer\n", "        \"\"\"\n", "        seek(offset[, whence]) -> None. Move to new file position.\n", "\n", "        Argument offset is a byte count. Optional argument whence defaults to\n", "        0 (offset from start of file, offset should be >= 0); other values are 1\n", "        (move relative to current position, positive or negative), and 2 (move\n", "        relative to end of file, usually negative, although many platforms allow\n", "        seeking beyond the end of a file). If the file is opened in text mode,\n", "        only offsets returned by tell() are legal. Use of other offsets causes\n", "        undefined behavior.\n", "        Note that not all file objects are seekable.\n", "        \"\"\"\n", "        pass\n", "\n", "    def tell(self): # real signature unknown; restored from __doc__\n", "        # Get the current pointer position\n", "        \"\"\" tell() -> current file position, an integer (may be a long integer). \"\"\"\n", "        pass\n", "\n", "    def truncate(self, size=None): # real signature unknown; restored from __doc__\n", "        # Truncate the data, keeping only what precedes the given position\n", "        \"\"\"\n", "        truncate([size]) -> None. 
Truncate the file to at most size bytes.\n", "\n", "        Size defaults to the current file position, as returned by tell().\n", "        \"\"\"\n", "        pass\n", "\n", "    def write(self, p_str): # real signature unknown; restored from __doc__\n", "        # Write content\n", "        \"\"\"\n", "        write(str) -> None. Write string str to file.\n", "\n", "        Note that due to buffering, flush() or close() may be needed before\n", "        the file on disk reflects the data written.\n", "        \"\"\"\n", "        pass\n", "\n", "    def writelines(self, sequence_of_strings): # real signature unknown; restored from __doc__\n", "        # Write a list of strings to the file\n", "        \"\"\"\n", "        writelines(sequence_of_strings) -> None. Write the strings to the file.\n", "\n", "        Note that newlines are not added. The sequence can be any iterable object\n", "        producing strings. This is equivalent to calling write() for each string.\n", "        \"\"\"\n", "        pass\n", "\n", "    def xreadlines(self): # real signature unknown; restored from __doc__\n", "        # Read the file line by line instead of all at once\n", "        \"\"\"\n", "        xreadlines() -> returns self.\n", "\n", "        For backward compatibility. 
File objects now include the performance\n", "        optimizations previously implemented in the xreadlines module.\n", "        \"\"\"\n", "        pass" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Working with Files" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "scrolled": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "2" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "2" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "2" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "2" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "0\n", "\n", "1\n", "\n", "2\n", "\n", "3\n", "\n", "4\n", "\n" ] } ], "source": [ "# Create the file\n", "create_file = open('open.file', 'w', encoding='utf-8') # same encoding as the read below\n", "for i in range(5):\n", "    content = str(i) + '\\n' # convert the number to a string and append a newline\n", "    create_file.write(content)\n", "create_file.close() # close the file handle\n", "\n", "# Read the file contents\n", "read_file = open('open.file', 'r', encoding=\"utf-8\") # open the file with an explicit encoding\n", "for line in read_file: # the file object returned by open is iterable\n", "    print(line) # each line keeps its '\\n', so print adds a blank line\n", "read_file.close() # close the file handle; always close a file you have opened" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "scrolled": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "'0'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "4" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "4" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "'2'" ] }, "execution_count": 2, 
"metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "'\\n'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "4" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "4" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "16" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "'0\\n1\\nthis is new word'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "new_read_file = open('open.file', 'r+', encoding='utf-8')\n", "new_read_file.read(1) # 指定读取字节大小。\n", "new_read_file.seek(4) # 移动指针,到第几个字节。\n", "new_read_file.tell() # 获取当前指针位置\n", "new_read_file.read(1)\n", "new_read_file.read(1)\n", "new_read_file.seek(4) # 移动指针,到第几个字节。\n", "new_read_file.tell() # 获取当前指针位置\n", "new_read_file.write('this is new word') # 移动过指针后,写操作会把指针后面的内容覆盖掉,注意!\n", "new_read_file.close()\n", "\n", "check_read_file = open('open.file', 'r+', encoding='utf-8')\n", "check_read_file.read()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "open打开文件后总要记得关闭是不是特别麻烦,你可以使用**with**, with会帮你close。**with** 还支持同时打开多个文件。" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "2" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "16" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "'9999\\n1\\nthis is new word'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# 在原有文件上修改(其实就是将现在的文件递归逐行写到新文件,修改操作在写入前做修改然后再写入。)\n", "with 
open('open.file', 'r', encoding='utf-8') as old_file, open('newopen.file', 'w', encoding='utf-8') as new_file:\n", "    for line in old_file: # iterate line by line without loading the whole file\n", "        if '0' in line:\n", "            line = line.replace('0', '9999')\n", "        new_file.write(line)\n", "\n", "import os\n", "os.replace('newopen.file', 'open.file') # the new file overwrites the old one (unlike os.rename, this also works on Windows)\n", "\n", "with open('open.file', 'r', encoding='utf-8') as check_file:\n", "    check_file.read()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" }, "toc": { "colors": { "hover_highlight": "#DAA520", "navigate_num": "#000000", "navigate_text": "#333333", "running_highlight": "#FF0000", "selected_highlight": "#FFD700", "sidebar_border": "#EEEEEE", "wrapper_background": "#FFFFFF" }, "moveMenuLeft": true, "nav_menu": { "height": "12px", "width": "252px" }, "navigate_menu": true, "number_sections": false, "sideBar": true, "threshold": 4, "toc_cell": true, "toc_section_display": "block", "toc_window_display": false, "widenNotebook": false } }, "nbformat": 4, "nbformat_minor": 2 }