{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ " # 聪明办法学 Python 2nd Edition\n", "## Chapter 6 字符串 Strings\n", "---\n", "聪明办法学 Python 教学团队\n", "\n", "

learn.python.the.smart.way@gmail.com

" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# 字符串文字" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## 四种引号\n", "\n", "引号的作用就是将文字包裹起来，告诉 Python \"这是个字符串！\"\n", "\n", "单引号 `'` 和双引号 `\"` 是最常见的两种字符串引号" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:08:54.043737Z", "start_time": "2023-09-14T09:08:54.027673Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "单引号\n", "双引号\n" ] } ], "source": [ "print('单引号')\n", "print(\"双引号\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "三个引号的情况不太常见，但是它在一些场合有特定的作用（如函数文档 doc-strings）" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:08:55.742102Z", "start_time": "2023-09-14T09:08:55.735056Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "三个单引号\n", "三个双引号\n" ] } ], "source": [ "print('''三个单引号''')\n", "print(\"\"\"三个双引号\"\"\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**我们为什么需要两种不同的引号？**" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:08:57.780088Z", "start_time": "2023-09-14T09:08:57.773238Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "聪明办法学 Python 第二版的课程简称是 'P2S'\n" ] } ], "source": [ "# 为了写出这样的句子\n", "print(\"聪明办法学 Python 第二版的课程简称是 'P2S'\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "如果我们偏要只用一种引号呢？" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:08:59.917890Z", "start_time": "2023-09-14T09:08:59.903383Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": 
"SyntaxError", "evalue": "invalid syntax (3461323572.py, line 2)", "output_type": "error", "traceback": [ "\u001b[1;36m Cell \u001b[1;32mIn [4], line 2\u001b[1;36m\u001b[0m\n\u001b[1;33m print(\"聪明办法学 Python 第二版的课程简称是 \"P2S\"\")\u001b[0m\n\u001b[1;37m ^\u001b[0m\n\u001b[1;31mSyntaxError\u001b[0m\u001b[1;31m:\u001b[0m invalid syntax\n" ] } ], "source": [ "# 这会导致语法错误，Python 无法正确判断一个字符串的终止位置\n", "print(\"聪明办法学 Python 第二版的课程简称是 \"P2S\"\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## 字符串中的换行符号\n", "\n", "前面有反斜杠 `\\` 的字符，叫做**转义序列**\n", "\n", "比如 `\\n` 代表**换行**，尽管它看起来像两个字符，但是 Python 依然把它视为一个特殊的字符" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:02.141492Z", "start_time": "2023-09-14T09:09:02.124774Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Data\n", "whale\n" ] } ], "source": [ "# 这两个 print() 在做同样的事情 \n", "print(\"Data\\nwhale\") # \\n 是一个单独的换行符号" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:03.257491Z", "start_time": "2023-09-14T09:09:03.249970Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Data\n", "whale\n" ] } ], "source": [ "print(\"\"\"Data\n", "whale\"\"\")" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:04.590053Z", "start_time": "2023-09-14T09:09:04.580955Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "你可以在字符串后面使用反斜杠 `\\` 来排除后面的换行。比如这里是第二行文字，但是你会看到它会紧跟在上一行句号后面。这种做法在 CIL 里面经常使用（多个 Flag 并排保持美观），但是在编程中的应用比较少。\n" ] } ], "source": [ "print(\"\"\"你可以在字符串后面使用反斜杠 `\\` 来排除后面的换行。\\\n", "比如这里是第二行文字，但是你会看到它会紧跟在上一行句号后面。\\\n", "这种做法在 CIL 里面经常使用（多个 Flag 并排保持美观），\\\n", "但是在编程中的应用比较少。\\\n", "\"\"\")" ] }, { 
"cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:06.504664Z", "start_time": "2023-09-14T09:09:06.497632Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "你可以在字符串后面使用反斜杠 `\\` 来排除后面的换行。\n", "比如这里是第二行文字，但是你会看到它会紧跟在上一行句号后面。\n", "这种做法在 CIL 里面经常使用（多个 Flag 并排保持美观），\n", "但是在编程中的应用比较少。\n", "\n" ] } ], "source": [ "print(\"\"\"你可以在字符串后面使用反斜杠 `\\` 来排除后面的换行。\n", "比如这里是第二行文字，但是你会看到它会紧跟在上一行句号后面。\n", "这种做法在 CIL 里面经常使用（多个 Flag 并排保持美观），\n", "但是在编程中的应用比较少。\n", "\"\"\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## 其他的转义序列" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:10.235167Z", "start_time": "2023-09-14T09:09:10.221255Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "双引号：\"\n" ] } ], "source": [ "print(\"双引号：\\\"\")" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:11.334012Z", "start_time": "2023-09-14T09:09:11.319471Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "反斜线：\\\n" ] } ], "source": [ "print(\"反斜线：\\\\\")" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:12.318311Z", "start_time": "2023-09-14T09:09:12.301296Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "换\n", "行\n" ] } ], "source": [ "print(\"换\\n行\")" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:13.284332Z", "start_time": "2023-09-14T09:09:13.273784Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ 
"这个是\t制\t表\t符\n", "也叫\t跳\t格\t键\n" ] } ], "source": [ "print(\"这个是\\t制\\t表\\t符\\n也叫\\t跳\\t格\\t键\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "转义序列只作为一个字符存在" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:15.670090Z", "start_time": "2023-09-14T09:09:15.660067Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "s = D\\a\"t\ta\n", "\n", "s 的长度为： 7\n" ] } ], "source": [ "s = \"D\\\\a\\\"t\\ta\"\n", "print(\"s =\", s)\n", "print(\"\\ns 的长度为：\", len(s))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## repr() vs. print()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "我们现在有两个字符串" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:18.851095Z", "start_time": "2023-09-14T09:09:18.845148Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "s1 = \"Data\\tWhale\"\n", "s2 = \"Data Whale\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "它俩看起来似乎是一样的" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:20.879194Z", "start_time": "2023-09-14T09:09:20.869076Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "s1: Data\tWhale\n", "s2: Data Whale\n" ] } ], "source": [ "print(\"s1:\", s1)\n", "print(\"s2:\", s2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "但是它们真的一样吗？" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:22.771850Z", "start_time": "2023-09-14T09:09:22.755795Z" }, "slideshow": { 
"slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s1 == s2" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ " > ### 如来佛合掌道：“观音尊者，你看那两个行者，谁是真假？” " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "> ### “谛听，汝之神通，能分辨出谁是真身，可为我说之。”" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:25.690072Z", "start_time": "2023-09-14T09:09:25.685560Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "'Data\\tWhale'\n", "'Data Whale'\n" ] } ], "source": [ "print(repr(s1))\n", "print(repr(s2))" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:27.127485Z", "start_time": "2023-09-14T09:09:27.124484Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "hack_text = \"密码应当大于 8 个字符，小于 16 个字符，包含大写字母、小写字母、数字和特殊符号\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\"" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:28.373296Z", "start_time": "2023-09-14T09:09:28.364279Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "密码应当大于 8 个字符，小于 16 个字符，包含大写字母、小写字母、数字和特殊符号\t\t\t\t\t\t\t\t\t\t\t\t\t\n" ] } ], "source": [ "print(hack_text)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:29.463962Z", "start_time": "2023-09-14T09:09:29.451923Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "'密码应当大于 8 个字符，小于 16 个字符，包含大写字母、小写字母、数字和特殊符号\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t'\n" ] } ], "source": [ "print(repr(hack_text))" ] 
}, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "多行字符串作为注释\n" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:30.718225Z", "start_time": "2023-09-14T09:09:30.706686Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Amazing!\n" ] } ], "source": [ "\"\"\"\n", "Python 本身是没有多行注释的，\n", "但是你可以用多行字符串实现同样的操作，\n", "还记得我们之前学过的“表达式“吗？\n", "它的原理就是 Python 会运行它，\n", "但是马上扔掉！（垃圾回收机制）\n", "\"\"\"\n", "print(\"Amazing!\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# 一些字符串常量" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:33.967226Z", "start_time": "2023-09-14T09:09:33.949010Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ\n" ] } ], "source": [ "import string\n", "print(string.ascii_letters)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:34.956185Z", "start_time": "2023-09-14T09:09:34.945054Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "abcdefghijklmnopqrstuvwxyz\n" ] } ], "source": [ "print(string.ascii_lowercase)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:35.893654Z", "start_time": "2023-09-14T09:09:35.881602Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ABCDEFGHIJKLMNOPQRSTUVWXYZ\n" ] } ], "source": [ "print(string.ascii_uppercase) " ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:36.997888Z", "start_time": "2023-09-14T09:09:36.984320Z" }, "slideshow": { 
"slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0123456789\n" ] } ], "source": [ "print(string.digits)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:38.059200Z", "start_time": "2023-09-14T09:09:38.049236Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~\n" ] } ], "source": [ "print(string.punctuation) # < = >" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:38.994218Z", "start_time": "2023-09-14T09:09:38.977242Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n", "\r\n", "\u000b\n", "\n", "\f\n", "\n", "\n" ] } ], "source": [ "print(string.printable)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:40.023145Z", "start_time": "2023-09-14T09:09:40.007358Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " \t\n", "\r\n", "\u000b\n", "\n", "\f\n", "\n", "\n" ] } ], "source": [ "print(string.whitespace)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:43.249150Z", "start_time": "2023-09-14T09:09:43.235124Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "' \\t\\n\\r\\x0b\\x0c'\n" ] } ], "source": [ "print(repr(string.whitespace))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# 一些字符串的运算\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "字符串的加减" ] }, { 
"cell_type": "code", "execution_count": 30, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:45.521869Z", "start_time": "2023-09-14T09:09:45.506240Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "abcdef\n", "abcabcabc\n" ] } ], "source": [ "print(\"abc\" + \"def\")\n", "print(\"abc\" * 3)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:47.408043Z", "start_time": "2023-09-14T09:09:47.050339Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "TypeError", "evalue": "can only concatenate str (not \"int\") to str", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", "Cell \u001b[1;32mIn [31], line 1\u001b[0m\n\u001b[1;32m----> 1\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mabc\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;241;43m+\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;241;43m3\u001b[39;49m)\n", "\u001b[1;31mTypeError\u001b[0m: can only concatenate str (not \"int\") to str" ] } ], "source": [ "print(\"abc\" + 3)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "`in` 运算（超级好用！）" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:49.910658Z", "start_time": "2023-09-14T09:09:49.892029Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "False\n", "False\n", "True\n", "True\n" ] } ], "source": [ "print(\"ring\" in \"strings\") # True\n", "print(\"wow\" in \"amazing!\") # False\n", "print(\"Yes\" in \"yes!\") # False\n", "print(\"\" in \"No way!\") # True\n", "print(\"聪明\" in 
\"聪明办法学 Python\") # True" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### 字符串索引和切片" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "单个字符索引\n", "\n", "索引可以让我们在特定位置找到一个字符" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:52.801994Z", "start_time": "2023-09-14T09:09:52.783931Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Datawhale\n", "D\n", "a\n", "t\n", "a\n" ] } ], "source": [ "s = \"Datawhale\"\n", "print(s)\n", "print(s[0])\n", "print(s[1])\n", "print(s[2])\n", "print(s[3])" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:54.714638Z", "start_time": "2023-09-14T09:09:54.710629Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(s)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:55.822042Z", "start_time": "2023-09-14T09:09:55.812017Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "e\n" ] } ], "source": [ "print(s[len(s)-1])" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:56.927440Z", "start_time": "2023-09-14T09:09:56.902804Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "IndexError", "evalue": "string index out of range", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mIndexError\u001b[0m Traceback (most recent call last)", "Cell \u001b[1;32mIn [36], line 1\u001b[0m\n\u001b[1;32m----> 1\u001b[0m 
\u001b[38;5;28mprint\u001b[39m(\u001b[43ms\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;28;43mlen\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43ms\u001b[49m\u001b[43m)\u001b[49m\u001b[43m]\u001b[49m)\n", "\u001b[1;31mIndexError\u001b[0m: string index out of range" ] } ], "source": [ "print(s[len(s)])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "负数索引" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:09:58.771396Z", "start_time": "2023-09-14T09:09:58.751745Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Datawhale\n", "w\n", "h\n", "a\n", "l\n", "e\n" ] } ], "source": [ "print(s)\n", "print(s[-5])\n", "print(s[-4])\n", "print(s[-3])\n", "print(s[-2])\n", "print(s[-1])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "用切片来获取字符串的一部分" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:10:00.515352Z", "start_time": "2023-09-14T09:10:00.508269Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Data\n", "whale\n" ] } ], "source": [ "print(s[0:4])\n", "print(s[4:9])" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:10:01.783542Z", "start_time": "2023-09-14T09:10:01.774947Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Da\n", "ta\n", "ha\n", "le\n" ] } ], "source": [ "print(s[0:2])\n", "print(s[2:4])\n", "print(s[5:7])\n", "print(s[7:9])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "切片的默认参数" ] }, { "cell_type": "code", "execution_count": 85, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:05:34.573951Z", "start_time": 
"2023-05-18T11:05:34.565938Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Data\n", "whale\n", "Datawhale\n" ] } ], "source": [ "print(s[:4])\n", "print(s[4:])\n", "print(s[:])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "切片的第三个参数 `step`" ] }, { "cell_type": "code", "execution_count": 86, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:07:15.613361Z", "start_time": "2023-05-18T11:07:15.597348Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Daa\n", "aa\n" ] } ], "source": [ "print(s[:9:3])\n", "print(s[1:4:2])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "翻转字符串" ] }, { "cell_type": "code", "execution_count": 87, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:09:23.334713Z", "start_time": "2023-05-18T11:09:23.310710Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "elahwataD\n" ] } ], "source": [ "# 可以，但是不优雅\n", "print(s[::-1])" ] }, { "cell_type": "code", "execution_count": 88, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:10:11.219671Z", "start_time": "2023-05-18T11:10:11.203650Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "elahwataD\n" ] } ], "source": [ "# 也可以，但是还是不够优雅\n", "print(\"\".join(reversed(s)))" ] }, { "cell_type": "code", "execution_count": 89, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:11:29.389867Z", "start_time": "2023-05-18T11:11:29.381840Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "elahwataD\n" ] } ], "source": [ "# 实在是太优雅辣\n", "def reverseString(s):\n", " return s[::-1]\n", "\n", "print(reverseString(s))\n" ] }, { "cell_type": 
"markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# 字符串的循环" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "用索引的 for 循环" ] }, { "cell_type": "code", "execution_count": 90, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:12:32.914355Z", "start_time": "2023-05-18T11:12:32.898357Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 D\n", "1 a\n", "2 t\n", "3 a\n", "4 w\n", "5 h\n", "6 a\n", "7 l\n", "8 e\n" ] } ], "source": [ "for i in range(len(s)):\n", " print(i, s[i])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "其实也可以不用索引（超级好用的 `in`）" ] }, { "cell_type": "code", "execution_count": 91, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:13:16.499371Z", "start_time": "2023-05-18T11:13:16.483372Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "D\n", "a\n", "t\n", "a\n", "w\n", "h\n", "a\n", "l\n", "e\n" ] } ], "source": [ "for c in s:\n", " print(c)" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:11:13.403586Z", "start_time": "2023-09-14T09:11:13.397556Z" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "也可以使用 `enumerate()` 获得元素的序号" ] }, { "cell_type": "code", "execution_count": 92, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 D\n", "1 a\n", "2 t\n", "3 a\n", "4 w\n", "5 h\n", "6 a\n", "7 l\n", "8 e\n" ] } ], "source": [ "for idx, c in enumerate(s):\n", " print(idx, c)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "`zip(a, b)` 可以在一次循环中，分别从 `a` 和 `b` 里同时取出一个元素" ] }, { "cell_type": "code", "execution_count": 93, "metadata": { "slideshow": { "slide_type": "fragment" } }, 
"outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "D e\n", "a l\n", "t a\n", "a h\n", "w w\n", "h a\n", "a t\n", "l a\n", "e D\n" ] } ], "source": [ "for a, b in zip(s, reverseString(s)):\n", " print(a, b)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "用 `split()` 来循环" ] }, { "cell_type": "code", "execution_count": 94, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:14:18.418258Z", "start_time": "2023-05-18T11:14:18.402254Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "learn\n", "python\n", "the\n", "smart\n", "way\n", "2nd\n", "edition\n" ] } ], "source": [ "# class_name.split() 本身会产生一个新的叫做“列表”的东西，但是它不存储任何内容\n", "\n", "class_name = \"learn python the smart way 2nd edition\"\n", "for word in class_name.split():\n", " print(word)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "用 `splitlines()` 来循环" ] }, { "cell_type": "code", "execution_count": 95, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:15:26.429468Z", "start_time": "2023-05-18T11:15:26.413444Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Prepare To Be Smart，我们希望同学们学习这个教程后能学习到聪明的办法，从容的迈入人工智能的后续学习。\n" ] } ], "source": [ "# 跟上面一样，class_info.splitlines() 也会产生一个列表，但不存储任何内容\n", "\n", "class_info = \"\"\"\\\n", "聪明办法学 Python 第二版是 Datawhale 基于第一版教程的一次大幅更新。我们尝试在教程中融入更多计算机科学与人工智能相关的内容，制作“面向人工智能的 Python 专项教程”。\n", "\n", "我们的课程简称为 P2S，有两个含义：\n", "\n", "Learn Python The Smart Way V2，“聪明办法学 Python 第二版”的缩写。\n", "Prepare To Be Smart，我们希望同学们学习这个教程后能学习到聪明的办法，从容的迈入人工智能的后续学习。\n", "\"\"\"\n", "for line in class_info.splitlines():\n", " if (line.startswith(\"Prepare To Be Smart\")):\n", " print(line)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# 例子：回文判断" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "如果一个句子正着读、反着读都是一样的，那它就叫做“回文”" ] }, { "cell_type": "code", "execution_count": 96, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:17:14.547945Z", "start_time": "2023-05-18T11:17:14.523941Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def isPalindrome1(s):\n", " return (s == reverseString(s))" ] }, { "cell_type": "code", "execution_count": 97, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:19:30.818148Z", "start_time": "2023-05-18T11:19:30.802467Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def isPalindrome2(s):\n", " for i in range(len(s)):\n", " if (s[i] != s[len(s)-1-i]):\n", " return False\n", " return True" ] }, { "cell_type": "code", "execution_count": 98, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:19:32.028016Z", "start_time": "2023-05-18T11:19:32.019504Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def isPalindrome3(s):\n", " for i in range(len(s)):\n", " if (s[i] != s[-1-i]):\n", " return False\n", " return True" ] }, { "cell_type": "code", "execution_count": 99, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:19:33.489527Z", "start_time": "2023-05-18T11:19:33.473410Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def isPalindrome4(s):\n", " while (len(s) > 1):\n", " if (s[0] != s[-1]):\n", " return False\n", " s = s[1:-1]\n", " return True" ] }, { "cell_type": "code", "execution_count": 100, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:19:34.763872Z", "start_time": "2023-05-18T11:19:34.747861Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True False\n", "True False\n", "True False\n", "True False\n" ] } ], "source": [ "print(isPalindrome1(\"abcba\"), isPalindrome1(\"abca\"))\n", "print(isPalindrome2(\"abcba\"), isPalindrome2(\"abca\"))\n", 
"print(isPalindrome3(\"abcba\"), isPalindrome3(\"abca\"))\n", "print(isPalindrome4(\"abcba\"), isPalindrome4(\"abca\"))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# 一些跟字符串相关的内置函数" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "`str()` 和 `len()`\n" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:13:33.733039Z", "start_time": "2023-09-14T09:13:30.147004Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "输入你的名字: Datawhale\n", "Hi, Datawhale, 你的名字有 9 个字！\n" ] } ], "source": [ "name = input(\"输入你的名字: \")\n", "print(\"Hi, \" + name + \", 你的名字有 \" + str(len(name)) + \" 个字！\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "`chr()` 和 `ord()`\n" ] }, { "cell_type": "code", "execution_count": 102, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:21:11.615355Z", "start_time": "2023-05-18T11:21:11.607706Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "65\n" ] } ], "source": [ "print(ord(\"A\"))" ] }, { "cell_type": "code", "execution_count": 103, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:21:32.966529Z", "start_time": "2023-05-18T11:21:32.950524Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A\n" ] } ], "source": [ "print(chr(65))" ] }, { "cell_type": "code", "execution_count": 104, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:22:08.004859Z", "start_time": "2023-05-18T11:22:07.988839Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "B\n" ] } ], "source": [ "print(\n", " chr(\n", " ord(\"A\") + 1\n", " )\n", ")" ] }, { "cell_type": 
"code", "execution_count": 105, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:23:17.057398Z", "start_time": "2023-05-18T11:23:17.041398Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "a\n" ] } ], "source": [ "print(chr(ord(\"A\") + ord(\" \")))" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:14:09.392607Z", "start_time": "2023-09-14T09:14:09.381572Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5.0\n" ] } ], "source": [ "# 它可以正常运行，但是我们不推荐你使用这个方法\n", "s = \"(3**2 + 4**2)**0.5\"\n", "print(eval(s))" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:29:07.725053Z", "start_time": "2023-09-14T09:29:07.715056Z" }, "init_cell": true, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "def 电脑当场爆炸():\n", " from rich.progress import (\n", " Progress, \n", " TextColumn, \n", " BarColumn, \n", " TimeRemainingColumn)\n", " import time\n", " from rich.markdown import Markdown\n", " from rich import print as rprint\n", " from rich.panel import Panel\n", "\n", "\n", " with Progress(TextColumn(\"[progress.description]{task.description}\"),\n", " BarColumn(),\n", " TimeRemainingColumn()) as progress:\n", " epoch_tqdm = progress.add_task(description=\"爆炸倒计时！\", total=100)\n", " for ep in range(100):\n", " time.sleep(0.1)\n", " progress.advance(epoch_tqdm, advance=1)\n", "\n", " rprint(Panel.fit(\"[red]Boom! R.I.P\"))\n" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:14:22.287390Z", "start_time": "2023-09-14T09:14:11.300311Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "

爆炸倒计时！ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 0:00:01\n",
       "

\n" ], "text/plain": [ "爆炸倒计时！ \u001b[38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[38;2;249;38;114m╸\u001b[0m \u001b[36m0:00:01\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

\n" ], "text/plain": [] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

\n",
       "

\n" ], "text/plain": [ "\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

╭─────────────╮\n",
       "│ Boom! R.I.P │\n",
       "╰─────────────╯\n",
       "

\n" ], "text/plain": [ "╭─────────────╮\n", "│ \u001b[31mBoom! R.I.P\u001b[0m │\n", "╰─────────────╯\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "s = \"电脑当场爆炸()\"\n", "eval(s) # 如果这是一串让电脑爆炸的恶意代码，那会发生什么" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:14:28.579513Z", "start_time": "2023-09-14T09:14:28.565190Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['p', 2, 's']\n", "\n" ] } ], "source": [ "# 推荐使用 ast.literal_eval()\n", "\n", "import ast\n", "s_safe = \"['p', 2, 's']\"\n", "s_safe_result = ast.literal_eval(s_safe)\n", "print(s_safe_result)\n", "print(type(s_safe_result))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# 一些字符串方法" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:14:30.895074Z", "start_time": "2023-09-14T09:14:30.875322Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " s isalnum isalpha isdigit islower isspace isupper\n", " ABCD True True False False False True \n", " ABcd True True False False False False \n", " abcd True True False True False False \n", " ab12 True False False True False False \n", " 1234 True False True False False False \n", " False False False False True False \n", " AB?! 
False False False False False True \n" ] } ], "source": [ "def p(test):\n", "    print(\"True \" if test else \"False \", end=\"\")\n", "def printRow(s):\n", "    print(\" \" + s + \" \", end=\"\")\n", "    p(s.isalnum())\n", "    p(s.isalpha())\n", "    p(s.isdigit())\n", "    p(s.islower())\n", "    p(s.isspace())\n", "    p(s.isupper())\n", "    print()\n", "def printTable():\n", "    print(\" s isalnum isalpha isdigit islower isspace isupper\")\n", "    for s in \"ABCD,ABcd,abcd,ab12,1234, ,AB?!\".split(\",\"):\n", "        printRow(s)\n", "printTable()" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "ExecuteTime": { "end_time": "2023-09-14T09:14:32.391065Z", "start_time": "2023-09-14T09:14:32.386026Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "yyds yysy xswl dddd\n", "FBI! OPEN THE DOOR!!!\n" ] } ], "source": [ "print(\"YYDS YYSY XSWL DDDD\".lower())\n", "print(\"fbi! open the door!!!\".upper())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "![](../resources/slides/chap6/ddddddddd.jpg)" ] }, { "cell_type": "code", "execution_count": 112, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:27:51.499085Z", "start_time": "2023-05-18T11:27:51.491081Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "strip() removes whitespace from both ends of a string\n" ] } ], "source": [ "print(\" strip() removes whitespace from both ends of a string \".strip())" ] }, { "cell_type": "code", "execution_count": 113, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:28:32.474044Z", "start_time": "2023-05-18T11:28:32.458002Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "聪明办法学 C\n", "Hugging SD, Hugging Future\n" ] } ], "source": [ "print(\"聪明办法学 Python\".replace(\"Python\", \"C\"))\n", "print(\"Hugging LLM, Hugging Future\".replace(\"LLM\", \"SD\", 1)) # count = 1" ] }, { "cell_type": 
"code", "execution_count": 114, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:28:51.729205Z", "start_time": "2023-05-18T11:28:51.713179Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "学Python, 就找 Datawhale\n" ] } ], "source": [ "s = \"聪明办法学Python, 就找 Datawhale\"\n", "t = s.replace(\"聪明办法\", \"\")\n", "print(t)" ] }, { "cell_type": "code", "execution_count": 115, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:29:06.943514Z", "start_time": "2023-05-18T11:29:06.935480Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3\n", "2\n" ] } ], "source": [ "print(\"This is a history test\".count(\"is\"))\n", "print(\"This IS a history test\".count(\"is\"))" ] }, { "cell_type": "code", "execution_count": 116, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:29:35.317879Z", "start_time": "2023-05-18T11:29:35.301840Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "False\n" ] } ], "source": [ "print(\"Dogs and cats!\".startswith(\"Do\"))\n", "print(\"Dogs and cats!\".startswith(\"Don't\"))" ] }, { "cell_type": "code", "execution_count": 117, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:29:48.861851Z", "start_time": "2023-05-18T11:29:48.845851Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "False\n" ] } ], "source": [ "print(\"Dogs and cats!\".endswith(\"!\"))\n", "print(\"Dogs and cats!\".endswith(\"rats!\"))" ] }, { "cell_type": "code", "execution_count": 118, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:29:56.057130Z", "start_time": "2023-05-18T11:29:56.049110Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5\n", "-1\n" ] } ], "source": [ "print(\"Dogs and 
cats!\".find(\"and\"))\n", "print(\"Dogs and cats!\".find(\"or\"))" ] }, { "cell_type": "code", "execution_count": 119, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:30:13.730980Z", "start_time": "2023-05-18T11:30:13.706632Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5\n" ] }, { "ename": "ValueError", "evalue": "substring not found", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[1;32mc:\\Coding\\Datawhale\\Python_Tutorial\\learn-python-the-smart-way-v2\\slides\\chapter_6-Strings.ipynb Cell 112\u001b[0m line \u001b[0;36m2\n\u001b[0;32m 1\u001b[0m \u001b[39mprint\u001b[39m(\u001b[39m\"\u001b[39m\u001b[39mDogs and cats!\u001b[39m\u001b[39m\"\u001b[39m\u001b[39m.\u001b[39mindex(\u001b[39m\"\u001b[39m\u001b[39mand\u001b[39m\u001b[39m\"\u001b[39m))\n\u001b[1;32m----> 2\u001b[0m \u001b[39mprint\u001b[39m(\u001b[39m\"\u001b[39;49m\u001b[39mDogs and cats!\u001b[39;49m\u001b[39m\"\u001b[39;49m\u001b[39m.\u001b[39;49mindex(\u001b[39m\"\u001b[39;49m\u001b[39mor\u001b[39;49m\u001b[39m\"\u001b[39;49m))\n", "\u001b[1;31mValueError\u001b[0m: substring not found" ] } ], "source": [ "print(\"Dogs and cats!\".index(\"and\"))\n", "print(\"Dogs and cats!\".index(\"or\"))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Formatting strings with `f-string`" ] }, { "cell_type": "code", "execution_count": 120, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:30:26.129186Z", "start_time": "2023-05-18T11:30:26.113175Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Did you know that 42 + 99 is 141?\n" ] } ], "source": [ "x = 42\n", "y = 99\n", "\n", "print(f'Did you know that {x} + {y} is {x+y}?')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, 
"source": [ "# 其他格式化字符串的方法" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "如果要格式化字符串的话，**f-string 是个很棒的方法**，Python 还有其他方法去格式化字符串：\n", "- `%` 操作\n", "- `format()` 方法\n", "\n", "参考资料：\n", "- [Class Notes: String Formatting - CMU 15-112](http://www.cs.cmu.edu/~112-f22/notes/notes-string-formatting.html)\n", "- [Python 字符串 - 菜鸟教程](https://www.runoob.com/python3/python3-string.html#:~:text=%5Cn%0A%5Cn-,Python%20%E5%AD%97%E7%AC%A6%E4%B8%B2%E6%A0%BC%E5%BC%8F%E5%8C%96,-Python%20%E6%94%AF%E6%8C%81%E6%A0%BC%E5%BC%8F%E5%8C%96)\n", "- [Python format 格式化函数 - 菜鸟教程](https://www.runoob.com/python/att-string-format.html)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# 字符串是不可变的" ] }, { "cell_type": "code", "execution_count": 121, "metadata": { "ExecuteTime": { "end_time": "2023-05-18T11:30:49.649750Z", "start_time": "2023-05-18T11:30:49.618061Z" } }, "outputs": [ { "ename": "TypeError", "evalue": "'str' object does not support item assignment", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[1;32mc:\\Coding\\Datawhale\\Python_Tutorial\\learn-python-the-smart-way-v2\\slides\\chapter_6-Strings.ipynb Cell 118\u001b[0m line \u001b[0;36m2\n\u001b[0;32m 1\u001b[0m s \u001b[39m=\u001b[39m \u001b[39m\"\u001b[39m\u001b[39mDatawhale\u001b[39m\u001b[39m\"\u001b[39m\n\u001b[1;32m----> 2\u001b[0m s[\u001b[39m3\u001b[39m] \u001b[39m=\u001b[39m \u001b[39m\"\u001b[39m\u001b[39me\u001b[39m\u001b[39m\"\u001b[39m\n", "\u001b[1;31mTypeError\u001b[0m: 'str' object does not support item assignment" ] } ], "source": [ "s = \"Datawhale\"\n", "s[3] = \"e\" # Datewhale" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "你必须创建一个新的字符串" ] }, { "cell_type": "code", "execution_count": 122, "metadata": { "ExecuteTime": { 
"end_time": "2023-05-18T11:31:21.963982Z", "start_time": "2023-05-18T11:31:21.955966Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Datewhale\n" ] } ], "source": [ "s = s[:3] + \"e\" + s[4:]\n", "print(s)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# 字符串和别名" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "字符串是不可变的，所以它的别名也是不可变的" ] }, { "cell_type": "code", "execution_count": 123, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Datawhale\n", "Data\n" ] } ], "source": [ "s = 'Data' # s 引用了字符串 “Data”\n", "t = s # t 只是 “Data” 的一个只读别名\n", "s += 'whale'\n", "print(s)\n", "print(t)" ] }, { "cell_type": "code", "execution_count": 124, "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "'str' object does not support item assignment", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[1;32mc:\\Coding\\Datawhale\\Python_Tutorial\\learn-python-the-smart-way-v2\\slides\\chapter_6-Strings.ipynb Cell 124\u001b[0m line \u001b[0;36m1\n\u001b[1;32m----> 1\u001b[0m t[\u001b[39m3\u001b[39m] \u001b[39m=\u001b[39m \u001b[39m\"\u001b[39m\u001b[39me\u001b[39m\u001b[39m\"\u001b[39m\n", "\u001b[1;31mTypeError\u001b[0m: 'str' object does not support item assignment" ] } ], "source": [ "t[3] = \"e\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# 基础文件操作" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## `Open()` 函数 \n", "Python `open()` 函数用于打开一个文件，并返回文件对象，在对文件进行处理过程都需要使用到这个函数。" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "`open(file, mode)` 函数主要有 `file` 和 `mode` 两个参数，其中 
`file` 为需要读写文件的路径。`mode` 为读取文件时的模式，常用的模式有以下几个：\n", "\n", "- `r`：以字符串的形式读取文件。\n", "- `rb`：以二进制的形式读取文件。\n", "- `w`：写入文件。\n", "- `a`：追加写入文件。\n", "\n", "不同模式下返回的文件对象功能也会不同。" ] }, { "cell_type": "code", "execution_count": 125, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "file = open(\"chap6_demo.txt\", \"w\")\n", "dw_text = \"Datawhale\"\n", "file.write(dw_text)\n", "file.close()" ] }, { "cell_type": "code", "execution_count": 126, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "file = open('chap6_demo.txt', 'r')\n", "print(type(file))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## 文件对象" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`open` 函数会返回一个文件对象。在进行文件操作前，我们首先需要了解文件对象提供了哪些常用的方法：\n", "\n", "- `close( )`: 关闭文件\n", "- 在 **r** 与 **rb** 模式下：\n", " - `read()`: 读取整个文件\n", " - `readline()`: 读取文件的一行\n", " - `readlines()`: 读取文件的所有行\n", "- 在 **w** 与 **a** 模式下：\n", " - `write()`: \n", " - `writelines()`: " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "下面我们通过实例学习这几种方法：" ] }, { "cell_type": "code", "execution_count": 127, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Datawhale\n" ] } ], "source": [ "## 通过 read 方法读取整个文件\n", "content = file.read()\n", "print(content)" ] }, { "cell_type": "code", "execution_count": 128, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "## 通过 readline() 读取文件的一行\n", "content = file.readline()\n", "print(content)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**代码竟然什么也没输出，这是为什么？**" ] }, { "cell_type": "code", "execution_count": 129, 
"metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Datawhale\n" ] } ], "source": [ "## 关闭之前打开的 chap6_demo.txt 文件\n", "file.close()\n", "## 重新打开\n", "file = open('chap6_demo.txt', 'r')\n", "content = file.readline()\n", "print(content)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**注意每次操作结束后，及时通过 `close( )` 方法关闭文件**" ] }, { "cell_type": "code", "execution_count": 130, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "## 以 w 模式打开文件chap6_demo.txt\n", "file = open('chap6_demo.txt', 'w')\n", "## 创建需要写入的字符串变量在字符串中 \\n 代表换行（也就是回车）\n", "content = 'Data\\nwhale\\n'\n", "## 写入到 chap6_demo.txt 文件中\n", "file.write(content)\n", "## 关闭文件对象\n", "file.close()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "w 模式会覆盖之前的文件。如果你想在文件后面追加内容，可以使用 a 模式操作。" ] }, { "cell_type": "code", "execution_count": 131, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "## 以 w 模式打开文件chap6_demo.txt\n", "file = open('chap6_demo.txt', 'w')\n", "## 创建需要追加的字符串变量\n", "content = 'Hello smart way!!!'\n", "## 写入到 chap6_demo.txt 文件中\n", "file.write(content)\n", "## 关闭文件对象\n", "file.close()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## with 语句\n", "> 我不想写 close() 啦！" ] }, { "cell_type": "code", "execution_count": 132, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The Zen of Python, by Tim Peters\n", "\n", "Beautiful is better than ugly.\n", "Explicit is better than implicit.\n", "Simple is better than complex.\n", "Complex is better than complicated.\n", "Flat is better than nested.\n", "Sparse is better than dense.\n", "Readability counts.\n", "Special cases aren't special enough to break the rules.\n", 
"Although practicality beats purity.\n", "Errors should never pass silently.\n", "Unless explicitly silenced.\n", "In the face of ambiguity, refuse the temptation to guess.\n", "There should be one-- and preferably only one --obvious way to do it.\n", "Although that way may not be obvious at first unless you're Dutch.\n", "Now is better than never.\n", "Although never is often better than *right* now.\n", "If the implementation is hard to explain, it's a bad idea.\n", "If the implementation is easy to explain, it may be a good idea.\n", "Namespaces are one honking great idea -- let's do more of those!\n" ] } ], "source": [ "import this" ] }, { "cell_type": "code", "execution_count": 133, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "Caesar_cipher = \"\"\"s = \\\"\\\"\\\"Gur Mra bs Clguba, ol Gvz Crgref\n", "\n", "Ornhgvshy vf orggre guna htyl.\n", "Rkcyvpvg vf orggre guna vzcyvpvg.\n", "Fvzcyr vf orggre guna pbzcyrk.\n", "Pbzcyrk vf orggre guna pbzcyvpngrq.\n", "Syng vf orggre guna arfgrq.\n", "Fcnefr vf orggre guna qrafr.\n", "Ernqnovyvgl pbhagf.\n", "Fcrpvny pnfrf nera'g fcrpvny rabhtu gb oernx gur ehyrf.\n", "Nygubhtu cenpgvpnyvgl orngf chevgl.\n", "Reebef fubhyq arire cnff fvyragyl.\n", "Hayrff rkcyvpvgyl fvyraprq.\n", "Va gur snpr bs nzovthvgl, ershfr gur grzcgngvba gb thrff.\n", "Gurer fubhyq or bar-- naq cersrenoyl bayl bar --boivbhf jnl gb qb vg.\n", "Nygubhtu gung jnl znl abg or boivbhf ng svefg hayrff lbh'er Qhgpu.\n", "Abj vf orggre guna arire.\n", "Nygubhtu arire vf bsgra orggre guna *evtug* abj.\n", "Vs gur vzcyrzragngvba vf uneq gb rkcynva, vg'f n onq vqrn.\n", "Vs gur vzcyrzragngvba vf rnfl gb rkcynva, vg znl or n tbbq vqrn.\n", "Anzrfcnprf ner bar ubaxvat terng vqrn -- yrg'f qb zber bs gubfr!\\\"\\\"\\\"\n", "\n", "d = {}\n", "for c in (65, 97):\n", " for i in range(26):\n", " d[chr(i+c)] = chr((i+13) % 26 + c)\n", "\n", "print(\"\".join([d.get(c, c) for c in s]))\n", "\"\"\"" ] }, { "cell_type": "code", 
"execution_count": 134, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1003\n" ] } ], "source": [ "with open(\"ZenOfPy.py\", \"w\", encoding=\"utf-8\") as file:\n", " file.write(Caesar_cipher)\n", " print(len(Caesar_cipher))" ] }, { "cell_type": "code", "execution_count": 135, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The Zen of Python, by Tim Peters\n", "\n", "Beautiful is better than ugly.\n", "Explicit is better than implicit.\n", "Simple is better than complex.\n", "Complex is better than complicated.\n", "Flat is better than nested.\n", "Sparse is better than dense.\n", "Readability counts.\n", "Special cases aren't special enough to break the rules.\n", "Although practicality beats purity.\n", "Errors should never pass silently.\n", "Unless explicitly silenced.\n", "In the face of ambiguity, refuse the temptation to guess.\n", "There should be one-- and preferably only one --obvious way to do it.\n", "Although that way may not be obvious at first unless you're Dutch.\n", "Now is better than never.\n", "Although never is often better than *right* now.\n", "If the implementation is hard to explain, it's a bad idea.\n", "If the implementation is easy to explain, it may be a good idea.\n", "Namespaces are one honking great idea -- let's do more of those!\n" ] } ], "source": [ "import ZenOfPy" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# 总结\n", "\n", "- 单引号与双引号要适时出现，多行文本用三引号。\n", "- 字符串中可以包含转义序列。\n", "- `repr()` 能够显示出更多的信息。\n", "- 字符串本身包含许多内置方法，`in` 是一个特别好用的玩意。\n", "- 字符串是不可变的常量。\n", "- 文件操作推荐使用 `with open(\"xxx\") as yyy`，这样就不用写 `f.close()` 啦。" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "首先在这里恭喜各位同学**完成了《聪明办法学 Python 第二版》基础部分的全部学习内容！** 在这 6 章的学习中希望你可以体会到本教程简洁明快的风格，这种风格与 Python 
本身的代码设计风格是一致的。在 Tim Peters 编写的 “Python 之禅” 中的核心指导原则就是 “Simple is better than complex”。\n", "\n", "纵观历史，你会看到苹果创始人史蒂夫·乔布斯对 Less is More 的追求，看到无印良品“删繁就简，去其浮华”的核心设计理念，看到山下英子在《断舍离》中对生活做减法的观点，甚至看到苏东坡“竹杖芒鞋轻胜马，一蓑烟雨任平生”的人生态度。你会发现**极简主义不只存在于 Python 编程中，它本就是这个世界优雅的一条运行法则。**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "

我们进阶部分再见！

" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Thank You ;-) The End\n", "Datawhale 聪明办法学 Python 教学团队出品\n", "\n", "## 关注我们\n", "Datawhale 是一个专注 AI 领域的开源组织，以“for the learner，和学习者一起成长”为愿景，构建对学习者最有价值的开源学习社区。关注我们，一起学习成长。\n", "

" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" }, "rise": { "overlay": "

" }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }