{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Data Processing 03: Pandas, the Python Data Analysis Library\n", "Pandas is a widely used Python data analysis library. It is built on top of NumPy and offers a rich feature set, good performance, and a convenient API. Project website: http://pandas.pydata.org/\n", "\n", "\n", "Pandas ships with Anaconda. By convention, import the module under the name pd:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'0.24.2'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "pd.__version__ # check the version number" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The main object types Pandas provides are the Series and the DataFrame. A Series resembles a one-dimensional array and a DataFrame resembles a two-dimensional array; the key difference from arrays is that they carry a customizable Index, similar to the keys of a dictionary. Each column of a DataFrame is a Series object: every column has its own data type, but all columns share the same Index. Let's start by calling the constructor to create a Series:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 北京\n", "1 上海\n", "2 广州\n", "3 深圳\n", "dtype: object" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.Series([\"北京\", \"上海\", \"广州\", \"深圳\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The index attribute of a Series object refers to the index in use:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "RangeIndex(start=0, stop=4, step=1)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Out[2].index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, create a new Series that uses pinyin abbreviations of the city names as a custom index:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['bj', 'sh', 'gz', 'sz'], dtype='object')" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cityname = pd.Series([\"北京\", \"上海\", \"广州\", \"深圳\"], index=[\"bj\", \"sh\", \"gz\", \"sz\"])\n", "cityname.index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"Note that the default positional index still works: the custom labels are called the explicit index and the default positions the implicit index. This can be confusing when the explicit index itself consists of integers, so you can use the indexer attributes loc (by label) and iloc (by position) to state the indexing mode unambiguously. Newer pandas releases deprecate positional access through plain [] indexing on a labeled Series (as in cityname[-1] below), so iloc is the safer choice there:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'深圳'" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cityname[\"sz\"]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'深圳'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cityname[-1]" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'北京'" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cityname.loc[\"bj\"]" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'北京'" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cityname.iloc[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's create a DataFrame by passing the constructor a dictionary of indexable objects. The dictionary's values and keys become the column data and column labels of the resulting DataFrame (here 名称 means name and 人口 means population). The row index is an Index shared by all columns, and the column index is an Index made up of all column labels. Values are matched by label rather than by position:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>名称</th>\n", " <th>人口</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>bj</th>\n", " <td>北京</td>\n", " <td>1877.70</td>\n", " </tr>\n", " <tr>\n", " <th>gz</th>\n", " <td>广州</td>\n", " <td>1246.83</td>\n", " </tr>\n", " <tr>\n", " <th>sh</th>\n", " <td>上海</td>\n", " <td>2115.00</td>\n", " </tr>\n", " 
<tr>\n", " <th>sz</th>\n", " <td>深圳</td>\n", " <td>1137.89</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " 名称 人口\n", "bj 北京 1877.70\n", "gz 广州 1246.83\n", "sh 上海 2115.00\n", "sz 深圳 1137.89" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "citypop = {\"bj\": 1877.7, \"sh\": 2115, \"gz\": 1246.83, \"sz\": 1137.89}\n", "df = pd.DataFrame({\"名称\": cityname, \"人口\": citypop})\n", "df" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['bj', 'gz', 'sh', 'sz'], dtype='object')" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.index # the index attribute refers to the row index" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['名称', '人口'], dtype='object')" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.columns # the columns attribute refers to the column index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can operate on the data (reading, updating, computing, and so on) through the indexing, attributes, and methods of Series and DataFrame, as well as through module-level functions:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bj 北京\n", "gz 广州\n", "sh 上海\n", "sz 深圳\n", "Name: 名称, dtype: object" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[\"名称\"] # get a column by indexing" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bj 1877.70\n", "gz 1246.83\n", "sh 2115.00\n", "sz 1137.89\n", "Name: 人口, dtype: float64" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.人口 # get a column by attribute access (works when the label is a valid identifier)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " 
.dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>名称</th>\n", " <th>人口</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>sh</th>\n", " <td>上海</td>\n", " <td>2115.00</td>\n", " </tr>\n", " <tr>\n", " <th>sz</th>\n", " <td>深圳</td>\n", " <td>1137.89</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " 名称 人口\n", "sh 上海 2115.00\n", "sz 深圳 1137.89" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.iloc[2:] # get rows by positional slicing" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6377.42" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[\"人口\"].sum() # sum the column values" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>名称</th>\n", " <th>人口</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>sh</th>\n", " <td>上海</td>\n", " <td>2115.00</td>\n", " </tr>\n", " <tr>\n", " <th>bj</th>\n", " <td>北京</td>\n", " <td>1877.70</td>\n", " </tr>\n", " <tr>\n", " <th>gz</th>\n", " <td>广州</td>\n", " <td>1246.83</td>\n", " </tr>\n", " <tr>\n", " <th>sz</th>\n", " <td>深圳</td>\n", " <td>1137.89</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " 名称 人口\n", "sh 上海 2115.00\n", "bj 北京 1877.70\n", "gz 广州 1246.83\n", "sz 深圳 1137.89" 
] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.sort_values(\"人口\", ascending=False) # sort by the 人口 (population) column, descending" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that methods and functions return a new object by default, whereas assignment through indexing modifies the object in place:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>名称</th>\n", " <th>人口</th>\n", " <th>区号</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>bj</th>\n", " <td>北京</td>\n", " <td>1877.70</td>\n", " <td>010</td>\n", " </tr>\n", " <tr>\n", " <th>gz</th>\n", " <td>广州</td>\n", " <td>1246.83</td>\n", " <td>020</td>\n", " </tr>\n", " <tr>\n", " <th>sh</th>\n", " <td>上海</td>\n", " <td>2115.00</td>\n", " <td>021</td>\n", " </tr>\n", " <tr>\n", " <th>sz</th>\n", " <td>深圳</td>\n", " <td>1137.89</td>\n", " <td>0755</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " 名称 人口 区号\n", "bj 北京 1877.70 010\n", "gz 广州 1246.83 020\n", "sh 上海 2115.00 021\n", "sz 深圳 1137.89 0755" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[\"区号\"] = [\"010\", \"020\", \"021\", \"0755\"] # add a new column (区号 means area code)\n", "df" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr 
style=\"text-align: right;\">\n", " <th></th>\n", " <th>名称</th>\n", " <th>人口</th>\n", " <th>区号</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>bj</th>\n", " <td>北京</td>\n", " <td>1877.70</td>\n", " <td>010</td>\n", " </tr>\n", " <tr>\n", " <th>gz</th>\n", " <td>广州</td>\n", " <td>1246.83</td>\n", " <td>020</td>\n", " </tr>\n", " <tr>\n", " <th>sh</th>\n", " <td>上海</td>\n", " <td>2115.00</td>\n", " <td>021</td>\n", " </tr>\n", " <tr>\n", " <th>sz</th>\n", " <td>深圳</td>\n", " <td>1137.89</td>\n", " <td>0755</td>\n", " </tr>\n", " <tr>\n", " <th>tj</th>\n", " <td>天津</td>\n", " <td>875.24</td>\n", " <td>022</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " 名称 人口 区号\n", "bj 北京 1877.70 010\n", "gz 广州 1246.83 020\n", "sh 上海 2115.00 021\n", "sz 深圳 1137.89 0755\n", "tj 天津 875.24 022" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[\"tj\"] = [\"天津\", 875.24, \"022\"] # add a new row\n", "df" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>名称</th>\n", " <th>人口</th>\n", " <th>区号</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>bj</th>\n", " <td>北京</td>\n", " <td>1877.70</td>\n", " <td>010</td>\n", " </tr>\n", " <tr>\n", " <th>gz</th>\n", " <td>广州</td>\n", " <td>1246.83</td>\n", " <td>020</td>\n", " </tr>\n", " <tr>\n", " <th>sh</th>\n", " <td>上海</td>\n", " <td>2115.00</td>\n", " <td>021</td>\n", " </tr>\n", " <tr>\n", " <th>sz</th>\n", " <td>深圳</td>\n", " <td>1137.89</td>\n", " <td>0755</td>\n", " </tr>\n", " </tbody>\n", 
"</table>\n", "</div>" ], "text/plain": [ " 名称 人口 区号\n", "bj 北京 1877.70 010\n", "gz 广州 1246.83 020\n", "sh 上海 2115.00 021\n", "sz 深圳 1137.89 0755" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[df.人口 >= 1000] # filter rows by a condition" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bj True\n", "gz True\n", "sh True\n", "sz True\n", "tj False\n", "Name: 人口, dtype: bool" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.人口 >= 1000 # filtering works by indexing with a boolean Series" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>名称</th>\n", " <th>人口</th>\n", " <th>区号</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>bj</th>\n", " <td>北京</td>\n", " <td>1877.70</td>\n", " <td>010</td>\n", " </tr>\n", " <tr>\n", " <th>gz</th>\n", " <td>广州</td>\n", " <td>1246.83</td>\n", " <td>020</td>\n", " </tr>\n", " <tr>\n", " <th>sh</th>\n", " <td>上海</td>\n", " <td>2115.00</td>\n", " <td>021</td>\n", " </tr>\n", " <tr>\n", " <th>sz</th>\n", " <td>深圳</td>\n", " <td>1137.89</td>\n", " <td>0755</td>\n", " </tr>\n", " <tr>\n", " <th>tj</th>\n", " <td>天津</td>\n", " <td>875.24</td>\n", " <td>022</td>\n", " </tr>\n", " <tr>\n", " <th>cq</th>\n", " <td>重庆</td>\n", " <td>851.80</td>\n", " <td>023</td>\n", " </tr>\n", " <tr>\n", " <th>nj</th>\n", " <td>南京</td>\n", " <td>617.82</td>\n", " <td>025</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " 名称 人口 区号\n", "bj 北京 1877.70 010\n", "gz 广州 1246.83 020\n", 
"sh 上海 2115.00 021\n", "sz 深圳 1137.89 0755\n", "tj 天津 875.24 022\n", "cq 重庆 851.80 023\n", "nj 南京 617.82 025" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2 = pd.DataFrame([[\"重庆\", 851.8, \"023\"], [\"南京\", 617.82, \"025\"]], columns=[\"名称\", \"人口\", \"区号\"], index=[\"cq\", \"nj\"])\n", "pd.concat([df, df2]) # concatenate two DataFrames" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next example runs a statistical analysis of the lifespans of China's emperors, this time building the DataFrame from existing data. Pandas can read many kinds of resources, for example comma-separated text (CSV):" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "DataFrame shape: (302, 5)\n", "Column data types:\n", "num int64\n", "name object\n", "age int64\n", "year object\n", "dynasty object\n", "dtype: object\n" ] } ], "source": [ "# Original file behind the short URL\n", "# https://gitee.com/freesand/pyStudy/raw/master/data/emperor.csv\n", "# df = pd.read_csv(\"http://t.cn/EMl0NtB\")\n", "df = pd.read_csv(\"emperor.csv\")\n", "print(\"DataFrame shape:\", df.shape)\n", "print(\"Column data types:\")\n", "print(df.dtypes)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For a large DataFrame, it is a good idea to first inspect its shape and column data types with the shape and dtypes attributes. You can also preview the first 5 rows with the head() method; in Jupyter Notebook a DataFrame is rendered as a table:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>num</th>\n", " <th>name</th>\n", " <th>age</th>\n", " <th>year</th>\n", " <th>dynasty</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>1</td>\n", " <td>秦始皇嬴政</td>\n", " <td>50</td>\n", " <td>前259年—前210年</td>\n", 
<td>秦</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>2</td>\n", " <td>秦二世嬴胡亥</td>\n", " <td>24</td>\n", " <td>前230年—前207年</td>\n", " <td>秦</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>3</td>\n", " <td>汉高帝刘邦</td>\n", " <td>62</td>\n", " <td>前256年—前195年</td>\n", " <td>西汉</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>4</td>\n", " <td>汉惠帝刘盈</td>\n", " <td>23</td>\n", " <td>前210年—前188年</td>\n", " <td>西汉</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>5</td>\n", " <td>汉文帝刘恒</td>\n", " <td>46</td>\n", " <td>前202年—前157年</td>\n", " <td>西汉</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " num name age year dynasty\n", "0 1 秦始皇嬴政 50 前259年—前210年 秦\n", "1 2 秦二世嬴胡亥 24 前230年—前207年 秦\n", "2 3 汉高帝刘邦 62 前256年—前195年 西汉\n", "3 4 汉惠帝刘盈 23 前210年—前188年 西汉\n", "4 5 汉文帝刘恒 46 前202年—前157年 西汉" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once a DataFrame has been created, you can tidy it up (rename column labels, drop unneeded content, and so on) before starting the analysis. Here the columns are renamed to 序号 (number), 名号 (name and title), 寿命 (lifespan), 生卒 (birth and death years), and 朝代 (dynasty), and then the emperors who lived to 80 or older are listed:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>序号</th>\n", " <th>名号</th>\n", " <th>寿命</th>\n", " <th>生卒</th>\n", " <th>朝代</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>100</th>\n", " <td>100</td>\n", " <td>梁武帝萧衍</td>\n", " <td>86</td>\n", " <td>464年—549年</td>\n", " <td>南朝梁</td>\n", " </tr>\n", " <tr>\n", " <th>149</th>\n", " <td>149</td>\n", " <td>武则天武瞾</td>\n", " <td>82</td>\n", " <td>624年—705年</td>\n", " <td>唐</td>\n", " </tr>\n", " <tr>\n", 
<th>208</th>\n", " <td>207</td>\n", " <td>宋高宗赵构</td>\n", " <td>81</td>\n", " <td>1107年—1187年</td>\n", " <td>南宋</td>\n", " </tr>\n", " <tr>\n", " <th>253</th>\n", " <td>252</td>\n", " <td>元世祖孛儿只斤·忽必烈</td>\n", " <td>80</td>\n", " <td>1215年—1294年</td>\n", " <td>元</td>\n", " </tr>\n", " <tr>\n", " <th>295</th>\n", " <td>296</td>\n", " <td>清高宗(乾隆)爱新觉罗·弘历</td>\n", " <td>89</td>\n", " <td>1711年—1799年</td>\n", " <td>清</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " 序号 名号 寿命 生卒 朝代\n", "100 100 梁武帝萧衍 86 464年—549年 南朝梁\n", "149 149 武则天武瞾 82 624年—705年 唐\n", "208 207 宋高宗赵构 81 1107年—1187年 南宋\n", "253 252 元世祖孛儿只斤·忽必烈 80 1215年—1294年 元\n", "295 296 清高宗(乾隆)爱新觉罗·弘历 89 1711年—1799年 清" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.columns = [\"序号\", \"名号\", \"寿命\", \"生卒\", \"朝代\"]\n", "df[df.寿命 >= 80]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Select the emperors of the Ming (明) and Qing (清) dynasties:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>序号</th>\n", " <th>名号</th>\n", " <th>寿命</th>\n", " <th>生卒</th>\n", " <th>朝代</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>267</th>\n", " <td>266</td>\n", " <td>明太祖(洪武)朱元璋</td>\n", " <td>71</td>\n", " <td>1328年—1398年</td>\n", " <td>明</td>\n", " </tr>\n", " <tr>\n", " <th>268</th>\n", " <td>267</td>\n", " <td>明惠宗(建文)朱允炆</td>\n", " <td>26</td>\n", " <td>1377年—1402年</td>\n", " <td>明</td>\n", " </tr>\n", " <tr>\n", " <th>269</th>\n", " <td>268</td>\n", " <td>明成祖(永乐)朱棣</td>\n", " <td>65</td>\n", " <td>1360年—1424年</td>\n", 
<td>明</td>\n", " </tr>\n", " <tr>\n", " <th>270</th>\n", " <td>269</td>\n", " <td>明仁宗(洪熙)朱高炽</td>\n", " <td>48</td>\n", " <td>1378年—1425年</td>\n", " <td>明</td>\n", " </tr>\n", " <tr>\n", " <th>271</th>\n", " <td>270</td>\n", " <td>明宣宗(宣德)朱瞻基</td>\n", " <td>38</td>\n", " <td>1398年—1435年</td>\n", " <td>明</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " 序号 名号 寿命 生卒 朝代\n", "267 266 明太祖(洪武)朱元璋 71 1328年—1398年 明\n", "268 267 明惠宗(建文)朱允炆 26 1377年—1402年 明\n", "269 268 明成祖(永乐)朱棣 65 1360年—1424年 明\n", "270 269 明仁宗(洪熙)朱高炽 48 1378年—1425年 明\n", "271 270 明宣宗(宣德)朱瞻基 38 1398年—1435年 明" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mingqing = df[df.朝代.isin([\"明\", \"清\"])]\n", "mingqing.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare emperor lifespans between the Ming and Qing dynasties by aggregating per group: count, minimum, maximum, mean, and median:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>count</th>\n", " <th>min</th>\n", " <th>max</th>\n", " <th>mean</th>\n", " <th>median</th>\n", " </tr>\n", " <tr>\n", " <th>朝代</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>明</th>\n", " <td>16</td>\n", " <td>23</td>\n", " <td>71</td>\n", " <td>42.187500</td>\n", " <td>38.0</td>\n", " </tr>\n", " <tr>\n", " <th>清</th>\n", " <td>12</td>\n", " <td>19</td>\n", " <td>89</td>\n", " <td>53.333333</td>\n", " <td>59.5</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " count min max mean median\n", "朝代 \n", "明 16 
23 71 42.187500 38.0\n", "清 12 19 89 53.333333 59.5" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "compare = mingqing.groupby(\"朝代\").寿命.agg([\"count\", \"min\", \"max\", \"mean\", \"median\"])\n", "compare" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "还可以根据全体皇帝的寿命数据绘制直方图来显示值的分布:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(array([10., 25., 48., 62., 56., 49., 36., 11., 5., 0.]),\n", " array([ 0., 10., 20., 30., 40., 50., 60., 70., 80., 90., 100.]),\n", " <a list of 10 Patch objects>)" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAeAAAAFKCAYAAADFU4wdAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAFixJREFUeJzt3V1s0/fdhvE7JIueuYHMtmyS0ahVolaraMkOVlGjtlOdEYJCVatD6qaqB96mVFO1LKQgAVErbRpwsD6I7GSqhaaFSZ3QKHIqslZRTBkodKxbWTmADaES8SKII9sJJOYtxs/Bo2btSrDj2PnazvU5Gm74+9ZPkGuxwz8V6XQ6LQAAsKCWWA8AAGAxIsAAABggwAAAGCDAAAAYIMAAABggwAAAGKhayCcbG7ue1+s5nQ4lEsm8XnOx4Qzzg3OcP85w/jjD+cv3GXo8S2f9byX9FXBVVaX1hJLHGeYH5zh/nOH8cYbzt5BnWNIBBgCgVBFgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMLOgPYwDmKpWSRkYqrGdk5HJZLwBQaggwitrISIV+/utTctQW7094SU449Idda+R0Wi8BUEoIMIqeozapGueU9QwAyCveAwYAwAABBgDAAAEGAMAAAQYAwEBWAb527Zo6OzvV1tam9evX6+TJkxofH1cwGFRra6uCwaAmJiYKvRUAgLKRVYB37NihZ555Rh988IH6+/vV1NSkUCgkn8+nwcFB+Xw+hUKhQm8FAKBsZAzw5OSkPv74Y23cuFGSVF1drWXLlikSiSgQCEiSAoGAhoaGCrsUAIAykvHfAV+8eFEul0vbtm3Tv/71L61cuVI9PT2KxWLyer2SJK/Xq3g8nvHJnE6Hqqoq57/6CzyepXm93mJUzGeYSFgvyF4xn2Op4AznjzOcv4U6w4wBnp6e1unTp/XGG2+oublZv/rVr3J+uTmRyO/djDyepRobu57Xay42xX6G8Xjx34byc8V8jqWg2P8slgLOcP7yfYb3i3nGl6Dr6upUV1en5uZmSVJbW5tOnz4tt9utaDQqSYpGo3JxM1wAALKWMcAej0d1dXX67LPPJEkfffSRmpqa5Pf7FQ6HJUnhcFgtLS2FXQoAQBnJ6l7Qb7zxhjZv3qw7d+6ooaFBu3bt0t27d9XV1aUDBw6ovr5evb29hd4KAEDZyCrAjz32mA4ePPiVx/v6+vI+CACAxY
A7YQEAYIAAAwBggAADAGCAAAMAYIAAAwBggAADAGCAAAMAYIAAAwBggAADAGCAAAMAYIAAAwBggAADAGAgqx/GAGB26bvS+fNSPF5hPWVWDz+cVmWl9QoAX0SAgXm6cd2hN0PH5ahNWk+5p+SEQ71bVqmpKW09BcAXEGAgDxy1SdU4p6xnACghvAcMAIABAgwAgAECDACAAQIMAIABAgwAgAECDACAAQIMAIABAgwAgAECDACAAQIMAIABAgwAgAECDACAAQIMAIABAgwAgAECDACAAQIMAIABAgwAgAECDACAAQIMAIABAgwAgAECDACAAQIMAICBqmw+yO/364EHHtCSJUtUWVmpgwcPanx8XJs2bdLly5e1YsUK7dmzR7W1tYXeCwBAWcj6K+C+vj719/fr4MGDkqRQKCSfz6fBwUH5fD6FQqGCjQQAoNzk/BJ0JBJRIBCQJAUCAQ0NDeVtFAAA5S6rl6Al6cc//rEqKir00ksv6aWXXlIsFpPX65Ukeb1exePxjNdwOh2qqqrMfe09eDxL83q9xaiYzzCRsF5QHlyuGnk81isyK+Y/i6WCM5y/hTrDrAL8xz/+UcuXL1csFlMwGFRjY2NOT5ZIJHP6fbPxeJZqbOx6Xq+52BT7GcbjFdYTykI8PqmxsbT1jPsq9j+LpYAznL98n+H9Yp7VS9DLly+XJLndbq1du1anTp2S2+1WNBqVJEWjUblcrjxMBQBgccgY4GQyqcnJyZn/PTw8rEceeUR+v1/hcFiSFA6H1dLSUtilAACUkYwvQcdiMb322muSpFQqpQ0bNujZZ5/VE088oa6uLh04cED19fXq7e0t+FgAc5e+K124UPwv5fMiGhabjAFuaGjQe++995XHnU6n+vr6CjIKQP7cuO7Q/+7/VI7a/H4PRj4lJxz6w641cjqtlwALJ+vvggZQuhy1SdU4p6xnAPgCbkUJAIABvgJexFIp6ezZ4v6nPqXw3iUA5IIAL2IjIxX6+a+PF/V7g7FLLrkftF4BAPlHgBe5Yn9vMDnhsJ4AAAXBe8AAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABjIOsCpVEqBQECvvvqqJGl8fFzBYFCtra0KBoOamJgo2EgAAMpN1gHet2+fmpqaZn4dCoXk8/k0ODgon8+nUChUkIEAAJSjrAJ89epVHTlyRBs3bpx5LBKJKBAISJICgYCGhoYKsxAAgDKUVYB37typLVu2aMmS/3x4LBaT1+uVJHm9XsXj8cIsBACgDFVl+oAPP/xQLpdLjz/+uE6cODGvJ3M6HaqqqpzXNf6bx7M0r9dbTBIJ6wXAl/H3ef44w/lbqDPMGOBPPvlEhw8f1tGjR3Xr1i1NTk5q8+bNcrvdikaj8nq9ikajcrlcGZ8skUjmZfTnPJ6lGhu7ntdrLibxeIX1BOBL+Ps8P3xOnL98n+H9Yp7xJejXX39dR48e1eHDh7V792499dRTeuutt+T3+xUOhyVJ4XBYLS0teRsMAEC5y/nfAXd0dGh4eFitra0aHh5WR0dHPncBAFDWMr4E/UWrV6/W6tWrJUlOp1N9fX0FGQUAQLnjTlgAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYmNMPYwCAQkjflc6fL+6fUf3ww2lVVlqvQDkhwADM3bju0Juh43LUJq2n3FNywqHeLavU1JS2noIyQoABFAVHbVI1zinrGcCC4T1gAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEG
AAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMVGX6gFu3bunll1/W7du3lUqltG7dOnV2dmp8fFybNm3S5cuXtWLFCu3Zs0e1tbULsRkAgJKX8Svg6upq9fX16b333lM4HNaxY8f0z3/+U6FQSD6fT4ODg/L5fAqFQguxFwCAspAxwBUVFXrggQckSdPT05qenlZFRYUikYgCgYAkKRAIaGhoqLBLAQAoI1m9B5xKpfTCCy9ozZo1WrNmjZqbmxWLxeT1eiVJXq9X8Xi8oEMBACgnGd8DlqTKykr19/fr2rVreu2113T27NmcnszpdKiqqjKn3zsbj2dpXq+3mCQS1guA0uFy1cjjsV6RGZ8T52+hzjCrAH9u2bJlWr16tY4dOya3261oNCqv16toNCqXy5Xx9ycSyZyH3ovHs1RjY9fzes3FJB6vsJ4AlIx4fFJjY2nrGffF58T5y/cZ3i/mGV+CjsfjunbtmiTp5s2bOn78uBobG+X3+xUOhyVJ4XBYLS0teZoLAED5y/gVcDQa1datW5VKpZROp9XW1qbnnntO3/72t9XV1aUDBw6ovr5evb29C7EXAICykDHA3/rWt2a+0v0ip9Opvr6+gowCAKDccScsAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAxUWQ8oZ6mUNDJSYT1jVhcuFO82ACh3BLiARkYq9PNfn5KjNmk95Z5il1xyP2i9AgAWJwJcYI7apGqcU9Yz7ik54bCeAACLFu8BAwBggAADAGCAAAMAYIAAAwBggAADAGCAAAMAYIAAAwBggAADAGCAAAMAYIAAAwBggAADAGCAAAMAYIAAAwBggAADAGCAAAMAYIAAAwBggAADAGAgY4CvXLmiV155RevXr1d7e7v6+vokSePj4woGg2ptbVUwGNTExETBxwIAUC4yBriyslJbt27V+++/r/379+udd97RuXPnFAqF5PP5NDg4KJ/Pp1AotBB7AQAoCxkD7PV6tXLlSklSTU2NGhsbNTo6qkgkokAgIEkKBAIaGhoq7FIAAMrInN4DvnTpks6cOaPm5mbFYjF5vV5J/x/peDxekIEAAJSjqmw/cGpqSp2dndq+fbtqampyejKn06Gqqsqcfu9sPJ6leb1ePiUS1gsA5IvLVSOPx3pFZsX8ObFULNQZZhXgO3fuqLOzU88//7xaW1slSW63W9FoVF6vV9FoVC6XK+N1Eonk/Nb+F49nqcbGruf1mvkUj1dYTwCQJ/H4pMbG0tYz7qvYPyeWgnyf4f1invEl6HQ6rZ6eHjU2NioYDM487vf7FQ6HJUnhcFgtLS15mAoAwOKQ8Svgf/zjH+rv79ejjz6qF154QZLU3d2tjo4OdXV16cCBA6qvr1dvb2/BxwIAUC4yBvg73/mO/v3vf9/zv33+b4IBAMDccCcsAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAxUWQ8AgGKXvitduFBhPSMjl8t6AeaCAANABjeuO/S/+z+VozZpPWVWyQmH/rBrjZxO6yXIVsYAb9u2TUeOHJHb7dahQ4ckSePj49q0aZMuX76sFStWaM+ePaqtrS34WACw4qhNqsY5ZT0DZSTje8Avvvii9u7d+6XHQqGQfD6fBgcH5fP5FAqFCjYQAIBylDHATz755Fe+uo1EIgoEApKkQCCgoaGhwqwDAK
BM5fQecCwWk9frlSR5vV7F4/G8jspGKiWdPSvF48X7jRGl8E0bAAAbC/pNWE6nQ1VVlXm51tmz0ivbjhf1N0XELrnkftB6BYDFxONZaj2h5C3UGeYUYLfbrWg0Kq/Xq2g0KleW3/ueSOQvlvF4RdF/U0RywmE9AcAiMzZ23XpCSfN4lub1DO8X85xuxOH3+xUOhyVJ4XBYLS0tuS0DAGCRyhjg7u5u/eAHP9D58+f17LPP6k9/+pM6Ojo0PDys1tZWDQ8Pq6OjYyG2AgBQNjK+BL179+57Pt7X15f3MQAALBbcCxoAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAA1XWAwAA85e+K50/L8XjFdZTZvXww2lVVlqvKB4EGADKwI3rDr0ZOi5HbdJ6yj0lJxzq3bJKTU1p6ylFgwADQJlw1CZV45yynoEs8R4wAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAYIMAAABggwAAAGCDAAAAbmFeCjR49q3bp1Wrt2rUKhUL42AQBQ9nIOcCqV0i9/+Uvt3btXAwMDOnTokM6dO5fPbQAAlK2cA3zq1Ck99NBDamhoUHV1tdrb2xWJRPK5DQCAslWV628cHR1VXV3dzK+XL1+uU6dO5WVUtpITjgV9vrm6cf1/rCfcV7Hvk9iYD8W+Tyr+jcW+Tyr+jckJhy5cqLCekZHHs3DPlXOA0+n0Vx6rqLj/4Xo8S3N9untcSzrx7pq8XQ8AACm/rbqfnF+Crqur09WrV2d+PTo6Kq/Xm5dRAACUu5wD/MQTT2hkZEQXL17U7du3NTAwIL/fn89tAACUrZxfgq6qqtKbb76pn/zkJ0qlUvr+97+vRx55JJ/bAAAoWxXpe72ZCwAACoo7YQEAYIAAAwBgoGQDzG0w5+7KlSt65ZVXtH79erW3t6uvr0+SND4+rmAwqNbWVgWDQU1MTBgvLX6pVEqBQECvvvqqJM5wrq5du6bOzk61tbVp/fr1OnnyJGc4R7///e/V3t6uDRs2qLu7W7du3eIMs7Bt2zb5fD5t2LBh5rH7ndvbb7+ttWvXat26dTp27Fhet5RkgLkNZm4qKyu1detWvf/++9q/f7/eeecdnTt3TqFQSD6fT4ODg/L5fPwfmizs27dPTU1NM7/mDOdmx44deuaZZ/TBBx+ov79fTU1NnOEcjI6Oat++fXr33Xd16NAhpVIpDQwMcIZZePHFF7V3794vPTbbuZ07d04DAwMaGBjQ3r179Ytf/EKpVCpvW0oywNwGMzder1crV66UJNXU1KixsVGjo6OKRCIKBAKSpEAgoKGhIcuZRe/q1as6cuSINm7cOPMYZ5i9yclJffzxxzPnV11drWXLlnGGc5RKpXTz5k1NT0/r5s2b8nq9nGEWnnzySdXW1n7psdnOLRKJqL29XdXV1WpoaNBDDz2U1zs+lmSA73UbzNHRUcNFpefSpUs6c+aMmpubFYvFZm6i4vV6FY/HjdcVt507d2rLli1asuQ/f304w+xdvHhRLpdL27ZtUyAQUE9Pj5LJJGc4B8uXL9ePfvQjPffcc3r66adVU1Ojp59+mjPM0WznVujWlGSAc7kNJv5jampKnZ2d2r59u2pqaqznlJQPP/xQLpdLjz/+uPWUkjU9Pa3Tp0/rhz/8ocLhsL7+9a/zUukcTUxMKBKJKBKJ6NixY7px44b6+/utZ5WdQremJAPMbTBzd+fOHXV2dur5559Xa2urJMntdisajUqSotGoXC6X5cSi9sknn+jw4cPy+/3q7u7WX//6V23evJkznIO6ujrV1dWpublZktTW1qbTp09zhnNw/PhxPfjgg3K5XPra176m1tZWnTx5kjPM0WznVujWlGSAuQ
1mbtLptHp6etTY2KhgMDjzuN/vVzgcliSFw2G1tLRYTSx6r7/+uo4eParDhw9r9+7deuqpp/TWW29xhnPg8XhUV1enzz77TJL00UcfqampiTOcg29+85v69NNPdePGDaXTac5wnmY7N7/fr4GBAd2+fVsXL17UyMiIVq1albfnLdk7Yf3lL3/Rzp07Z26D+dOf/tR6UtH7+9//rpdfflmPPvrozPuX3d3dWrVqlbq6unTlyhXV19ert7dX3/jGN4zXFr8TJ07od7/7nd5++20lEgnOcA7OnDmjnp4e3blzRw0NDdq1a5fu3r3LGc7Bb37zG/35z39WVVWVHnvsMe3YsUNTU1OcYQbd3d3629/+pkQiIbfbrZ/97Gf63ve+N+u5/fa3v9W7776ryspKbd++Xd/97nfztqVkAwwAQCkryZegAQAodQQYAAADBBgAAAMEGAAAAwQYAAADBBgAAAMEGAAAAwQYAAAD/wf3CEQ7NxguzgAAAABJRU5ErkJggg==\n", "text/plain": [ "<Figure size 576x396 with 1 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "plt.style.use(\"seaborn\")\n", "plt.hist(df.寿命, range=(0, 100), edgecolor=\"blue\") # 直方图,范围0至100(默认为最小值到最大值)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas 的功能非常丰富,想更深入地了解请查看官方文档 http://pandas.pydata.org/pandas-docs/stable/\n", "\n", "——编程原来是这样……" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.7", "language": "python", "name": "python3.7" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }