{ "cells": [ { "cell_type": "markdown", "id": "ed1f74e7", "metadata": {}, "source": [ "# The Two-Dimensional Data Structure: DataFrame" ] }, { "cell_type": "markdown", "id": "717b29b8", "metadata": {}, "source": [ "A DataFrame is a two-dimensional labeled data structure whose columns may have different data types. For intuition, think of a DataFrame as an Excel spreadsheet, or as a dictionary of Series objects, one per column." ] }, { "cell_type": "code", "execution_count": 1, "id": "b0182569", "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": 2, "id": "79178b2b", "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "attachments": {}, "cell_type": "markdown", "id": "698481f5", "metadata": {}, "source": [ "## Creating a DataFrame\n", "\n", "Like a Series, a DataFrame can be built from several kinds of data:\n", "- A dictionary whose values are Series objects.\n", "- A dictionary of one-dimensional arrays or lists.\n", "- A list or array of dictionaries.\n", "- A two-dimensional NumPy array." ] }, { "attachments": {}, "cell_type": "markdown", "id": "2e7cb22e", "metadata": {}, "source": [ "### From a dictionary of Series objects" ] }, { "attachments": {}, "cell_type": "markdown", "id": "821beb00", "metadata": {}, "source": [ "A DataFrame can be built from a dictionary whose values are Series objects. The values may also be dictionaries, because Pandas converts each nested dictionary to a Series.\n", "\n", "Suppose we have a dictionary `d` containing two Series objects:" ] }, { "cell_type": "code", "execution_count": 3, "id": "98d20646", "metadata": {}, "outputs": [], "source": [ "s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])" ] }, { "cell_type": "code", "execution_count": 4, "id": "315ddecb", "metadata": {}, "outputs": [], "source": [ "s2 = pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])" ] }, { "cell_type": "code", "execution_count": 5, "id": "1e13fa34", "metadata": {}, "outputs": [], "source": [ "d = {\"one\": s1, \"two\": s2}" ] }, { "attachments": {}, "cell_type": "markdown", "id": "49f79ba0", "metadata": {}, "source": [ "We can build a DataFrame from the dictionary `d`:" ] }, { "cell_type": "code", "execution_count": 6, "id": "79c735da", "metadata": {}, "outputs": [], "source": [ "df = pd.DataFrame(d)" ] }, { "cell_type": "code", "execution_count": 7, "id": "d65d98a8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>one</th><th>two</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>a</th><td>1.0</td><td>1.0</td></tr>\n",
"    <tr><th>b</th><td>2.0</td><td>2.0</td></tr>\n",
"    <tr><th>c</th><td>3.0</td><td>3.0</td></tr>\n",
"    <tr><th>d</th><td>NaN</td><td>4.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " one two\n", "a 1.0 1.0\n", "b 2.0 2.0\n", "c 3.0 3.0\n", "d NaN 4.0" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "attachments": {}, "cell_type": "markdown", "id": "e1f02d1b", "metadata": {}, "source": [ "Unlike a Series, a DataFrame distinguishes rows from columns, so it has both row labels and column labels. By default, the column labels of `df` are the keys of the dictionary passed in; they can be inspected with the `.columns` attribute:" ] }, { "cell_type": "code", "execution_count": 8, "id": "fa4a04fb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['one', 'two'], dtype='object')" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.columns" ] }, { "attachments": {}, "cell_type": "markdown", "id": "ff9acb82", "metadata": {}, "source": [ "The row labels are the union of the labels of the two Series objects. Pandas aligns the two Series on their labels automatically, and a position with no data (here column `one` at label `d`) is filled with `NaN`:" ] }, { "cell_type": "code", "execution_count": 9, "id": "34b63bf8", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['a', 'b', 'c', 'd'], dtype='object')" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.index" ] }, { "cell_type": "markdown", "id": "741871ba", "metadata": {}, "source": [ "The `index` and `columns` parameters can also be specified when creating a DataFrame:" ] }, { "cell_type": "code", "execution_count": 10, "id": "227afaa7", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>one</th><th>two</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>d</th><td>NaN</td><td>4.0</td></tr>\n",
"    <tr><th>b</th><td>2.0</td><td>2.0</td></tr>\n",
"    <tr><th>a</th><td>1.0</td><td>1.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " one two\n", "d NaN 4.0\n", "b 2.0 2.0\n", "a 1.0 1.0" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(d, index=[\"d\", \"b\", \"a\"])" ] }, { "attachments": {}, "cell_type": "markdown", "id": "a50f5bc2", "metadata": {}, "source": [ "Pandas looks up the corresponding values in the supplied data in the order given. Where a value does not exist, it fills in the default `np.nan`:" ] }, { "cell_type": "code", "execution_count": 11, "id": "492f3b05", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>two</th><th>three</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>d</th><td>4.0</td><td>NaN</td></tr>\n",
"    <tr><th>b</th><td>2.0</td><td>NaN</td></tr>\n",
"    <tr><th>a</th><td>1.0</td><td>NaN</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " two three\n", "d 4.0 NaN\n", "b 2.0 NaN\n", "a 1.0 NaN" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(d, index=['d', 'b', 'a'], columns=['two', 'three'])" ] }, { "attachments": {}, "cell_type": "markdown", "id": "2cef3d5d", "metadata": {}, "source": [ "### From a dictionary of one-dimensional arrays" ] }, { "attachments": {}, "cell_type": "markdown", "id": "eb0bb12a", "metadata": {}, "source": [ "A DataFrame can also be built from a dictionary of one-dimensional arrays or lists, which must all have the same length:" ] }, { "cell_type": "code", "execution_count": 12, "id": "a097d435", "metadata": {}, "outputs": [], "source": [ "d = {'one': [1., 2., 3., 4.],\n", "     'two': [4., 3., 2., 1.]}" ] }, { "cell_type": "code", "execution_count": 13, "id": "8d58e2f8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>one</th><th>two</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1.0</td><td>4.0</td></tr>\n",
"    <tr><th>1</th><td>2.0</td><td>3.0</td></tr>\n",
"    <tr><th>2</th><td>3.0</td><td>2.0</td></tr>\n",
"    <tr><th>3</th><td>4.0</td><td>1.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " one two\n", "0 1.0 4.0\n", "1 2.0 3.0\n", "2 3.0 2.0\n", "3 4.0 1.0" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(d)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "0333d58f", "metadata": {}, "source": [ "When an `index` argument is passed, its length must also match the length of the lists:" ] }, { "cell_type": "code", "execution_count": 14, "id": "21661d1e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>one</th><th>two</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>a</th><td>1.0</td><td>4.0</td></tr>\n",
"    <tr><th>b</th><td>2.0</td><td>3.0</td></tr>\n",
"    <tr><th>c</th><td>3.0</td><td>2.0</td></tr>\n",
"    <tr><th>d</th><td>4.0</td><td>1.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " one two\n", "a 1.0 4.0\n", "b 2.0 3.0\n", "c 3.0 2.0\n", "d 4.0 1.0" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(d, index=['a', 'b', 'c', 'd'])" ] }, { "attachments": {}, "cell_type": "markdown", "id": "9b2c149f", "metadata": {}, "source": [ "### From a list of dictionaries" ] }, { "cell_type": "markdown", "id": "3f5efc2a", "metadata": {}, "source": [ "A DataFrame can also be built from an array or list of dictionaries:" ] }, { "cell_type": "code", "execution_count": 15, "id": "57d4e5b2", "metadata": {}, "outputs": [], "source": [ "data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]" ] }, { "cell_type": "code", "execution_count": 16, "id": "fd002d26", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>a</th><th>b</th><th>c</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>2</td><td>NaN</td></tr>\n",
"    <tr><th>1</th><td>5</td><td>10</td><td>20.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " a b c\n", "0 1 2 NaN\n", "1 5 10 20.0" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(data)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "e6448551", "metadata": {}, "source": [ "Unlike with a Series, the dictionary keys here become column labels, and the row labels are determined by the length of the array or list. Each dictionary becomes one row, and a key missing from a dictionary yields `NaN` (here column `c` in row 0)." ] }, { "cell_type": "markdown", "id": "28b0a139", "metadata": {}, "source": [ "### From a two-dimensional array" ] }, { "cell_type": "markdown", "id": "e73d4783", "metadata": {}, "source": [ "A DataFrame can also be built from a two-dimensional NumPy array:" ] }, { "cell_type": "code", "execution_count": 17, "id": "5e70aa1e", "metadata": {}, "outputs": [], "source": [ "a = np.array([[1, 2, 3], [4, 5, 6]])" ] }, { "cell_type": "code", "execution_count": 18, "id": "a6fa5ce0", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>0</th><th>1</th><th>2</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>2</td><td>3</td></tr>\n",
"    <tr><th>1</th><td>4</td><td>5</td><td>6</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " 0 1 2\n", "0 1 2 3\n", "1 4 5 6" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(a)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "6aed9e8c", "metadata": {}, "source": [ "## Working with a DataFrame\n", "\n", "A DataFrame is not a two-dimensional NumPy array, and the two are used quite differently. The examples in this section use the following DataFrame:" ] }, { "cell_type": "code", "execution_count": 19, "id": "9f32f6c3", "metadata": {}, "outputs": [], "source": [ "s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])" ] }, { "cell_type": "code", "execution_count": 20, "id": "239d3b03", "metadata": {}, "outputs": [], "source": [ "s2 = pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])" ] }, { "cell_type": "code", "execution_count": 21, "id": "e8776dc6", "metadata": {}, "outputs": [], "source": [ "d = {\"one\": s1, \"two\": s2}" ] }, { "cell_type": "code", "execution_count": 22, "id": "cdde7d53", "metadata": {}, "outputs": [], "source": [ "df = pd.DataFrame(d)" ] }, { "cell_type": "code", "execution_count": 23, "id": "5e17d0aa", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>one</th><th>two</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>a</th><td>1.0</td><td>1.0</td></tr>\n",
"    <tr><th>b</th><td>2.0</td><td>2.0</td></tr>\n",
"    <tr><th>c</th><td>3.0</td><td>3.0</td></tr>\n",
"    <tr><th>d</th><td>NaN</td><td>4.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " one two\n", "a 1.0 1.0\n", "b 2.0 2.0\n", "c 3.0 3.0\n", "d NaN 4.0" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "attachments": {}, "cell_type": "markdown", "id": "23e62817", "metadata": {}, "source": [ "### Column operations" ] }, { "attachments": {}, "cell_type": "markdown", "id": "3950a0b2", "metadata": {}, "source": [ "A DataFrame can be viewed as a dictionary of Series objects: the `.columns` attribute holds the keys, and each column is the corresponding value:" ] }, { "cell_type": "code", "execution_count": 24, "id": "a3c973ea", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "a 1.0\n", "b 2.0\n", "c 3.0\n", "d NaN\n", "Name: one, dtype: float64" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['one']" ] }, { "attachments": {}, "cell_type": "markdown", "id": "55ef0692", "metadata": {}, "source": [ "New columns can be added just as with a dictionary:" ] }, { "cell_type": "code", "execution_count": 25, "id": "0f2ef75c", "metadata": {}, "outputs": [], "source": [ "df[\"three\"] = df[\"one\"] * df[\"two\"]" ] }, { "cell_type": "code", "execution_count": 26, "id": "05ac44bb", "metadata": {}, "outputs": [], "source": [ "df[\"flag\"] = df[\"one\"] > 2" ] }, { "cell_type": "code", "execution_count": 27, "id": "7ecb4c9f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>one</th><th>two</th><th>three</th><th>flag</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>a</th><td>1.0</td><td>1.0</td><td>1.0</td><td>False</td></tr>\n",
"    <tr><th>b</th><td>2.0</td><td>2.0</td><td>4.0</td><td>False</td></tr>\n",
"    <tr><th>c</th><td>3.0</td><td>3.0</td><td>9.0</td><td>True</td></tr>\n",
"    <tr><th>d</th><td>NaN</td><td>4.0</td><td>NaN</td><td>False</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " one two three flag\n", "a 1.0 1.0 1.0 False\n", "b 2.0 2.0 4.0 False\n", "c 3.0 3.0 9.0 True\n", "d NaN 4.0 NaN False" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "attachments": {}, "cell_type": "markdown", "id": "1f62216a", "metadata": {}, "source": [ "When the new column is a single scalar value, Pandas broadcasts it to every row label:" ] }, { "cell_type": "code", "execution_count": 28, "id": "e156535d", "metadata": {}, "outputs": [], "source": [ "df[\"four\"] = 4" ] }, { "attachments": {}, "cell_type": "markdown", "id": "e220dded", "metadata": {}, "source": [ "Columns can be deleted with the `del` keyword or with the `.pop()` method, which also returns the removed column as a Series:" ] }, { "cell_type": "code", "execution_count": 29, "id": "8e744653", "metadata": {}, "outputs": [], "source": [ "del df[\"two\"]" ] }, { "cell_type": "code", "execution_count": 30, "id": "4770ad42", "metadata": {}, "outputs": [], "source": [ "three = df.pop(\"three\")" ] }, { "cell_type": "code", "execution_count": 31, "id": "d20663a2", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "a 1.0\n", "b 4.0\n", "c 9.0\n", "d NaN\n", "Name: three, dtype: float64" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "three" ] }, { "cell_type": "code", "execution_count": 32, "id": "8713ea5e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>one</th><th>flag</th><th>four</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>a</th><td>1.0</td><td>False</td><td>4</td></tr>\n",
"    <tr><th>b</th><td>2.0</td><td>False</td><td>4</td></tr>\n",
"    <tr><th>c</th><td>3.0</td><td>True</td><td>4</td></tr>\n",
"    <tr><th>d</th><td>NaN</td><td>False</td><td>4</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " one flag four\n", "a 1.0 False 4\n", "b 2.0 False 4\n", "c 3.0 True 4\n", "d NaN False 4" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "attachments": {}, "cell_type": "markdown", "id": "18ec10aa", "metadata": {}, "source": [ "When the new column is a Series whose labels do not exactly match the existing row labels, Pandas keeps only the entries whose labels already exist in the DataFrame, so the row labels of the DataFrame stay unchanged. Labels that appear only in the new Series (here `e`) are dropped, and existing rows with no matching label (here `b` and `c`) are filled with `NaN`:" ] }, { "cell_type": "code", "execution_count": 33, "id": "07b61507", "metadata": {}, "outputs": [], "source": [ "df[\"foo\"] = pd.Series([1, 2, 3], index=[\"a\", \"d\", \"e\"])" ] }, { "cell_type": "code", "execution_count": 34, "id": "e45141a6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>one</th><th>flag</th><th>four</th><th>foo</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>a</th><td>1.0</td><td>False</td><td>4</td><td>1.0</td></tr>\n",
"    <tr><th>b</th><td>2.0</td><td>False</td><td>4</td><td>NaN</td></tr>\n",
"    <tr><th>c</th><td>3.0</td><td>True</td><td>4</td><td>NaN</td></tr>\n",
"    <tr><th>d</th><td>NaN</td><td>False</td><td>4</td><td>2.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " one flag four foo\n", "a 1.0 False 4 1.0\n", "b 2.0 False 4 NaN\n", "c 3.0 True 4 NaN\n", "d NaN False 4 2.0" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "attachments": {}, "cell_type": "markdown", "id": "f53b8dd1", "metadata": {}, "source": [ "By default, a new column is appended at the end of the DataFrame. The `.insert()` method places it at a given position instead:" ] }, { "cell_type": "code", "execution_count": 35, "id": "edad10b3", "metadata": {}, "outputs": [], "source": [ "df.insert(1, \"bar\", df[\"one\"])" ] }, { "cell_type": "code", "execution_count": 36, "id": "5b1967c2", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>one</th><th>bar</th><th>flag</th><th>four</th><th>foo</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>a</th><td>1.0</td><td>1.0</td><td>False</td><td>4</td><td>1.0</td></tr>\n",
"    <tr><th>b</th><td>2.0</td><td>2.0</td><td>False</td><td>4</td><td>NaN</td></tr>\n",
"    <tr><th>c</th><td>3.0</td><td>3.0</td><td>True</td><td>4</td><td>NaN</td></tr>\n",
"    <tr><th>d</th><td>NaN</td><td>NaN</td><td>False</td><td>4</td><td>2.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " one bar flag four foo\n", "a 1.0 1.0 False 4 1.0\n", "b 2.0 2.0 False 4 NaN\n", "c 3.0 3.0 True 4 NaN\n", "d NaN NaN False 4 2.0" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "attachments": {}, "cell_type": "markdown", "id": "2441cfd5", "metadata": {}, "source": [ "### Row operations\n", "\n", "There are two common ways to index the rows of a DataFrame. The `.loc` attribute indexes by row label and returns a Series:" ] }, { "cell_type": "code", "execution_count": 37, "id": "737b173b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "one 2.0\n", "bar 2.0\n", "flag False\n", "four 4\n", "foo NaN\n", "Name: b, dtype: object" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[\"b\"]" ] }, { "attachments": {}, "cell_type": "markdown", "id": "cb199e17", "metadata": {}, "source": [ "The `.iloc` attribute indexes by integer position instead; position 1 selects the second row:" ] }, { "cell_type": "code", "execution_count": 38, "id": "80f5c2ae", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "one 2.0\n", "bar 2.0\n", "flag False\n", "four 4\n", "foo NaN\n", "Name: b, dtype: object" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.iloc[1]" ] }, { "attachments": {}, "cell_type": "markdown", "id": "a41ed5e1", "metadata": {}, "source": [ "### Addition and subtraction\n", "\n", "DataFrames support addition and subtraction. The operands are aligned on both row and column labels before the calculation, and any position present in only one operand becomes `NaN`:" ] }, { "cell_type": "code", "execution_count": 39, "id": "9b6e727a", "metadata": {}, "outputs": [], "source": [ "df1 = pd.DataFrame(np.random.randn(10, 4), columns=['A', 'B', 'C', 'D'])" ] }, { "cell_type": "code", "execution_count": 40, "id": "f2b0db7d", "metadata": {}, "outputs": [], "source": [ "df2 = pd.DataFrame(np.random.randn(7, 3), columns=['A', 'B', 'C'])" ] }, { "cell_type": "code", "execution_count": 41, "id": "ed6e0274", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>A</th><th>B</th><th>C</th><th>D</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>-1.906552</td><td>-2.428495</td><td>1.131278</td><td>NaN</td></tr>\n",
"    <tr><th>1</th><td>-0.955872</td><td>-1.476556</td><td>-1.523796</td><td>NaN</td></tr>\n",
"    <tr><th>2</th><td>0.766210</td><td>-0.162112</td><td>0.190370</td><td>NaN</td></tr>\n",
"    <tr><th>3</th><td>-2.866838</td><td>0.866281</td><td>1.340097</td><td>NaN</td></tr>\n",
"    <tr><th>4</th><td>-2.027247</td><td>0.972097</td><td>-0.807422</td><td>NaN</td></tr>\n",
"    <tr><th>5</th><td>0.841079</td><td>0.101313</td><td>-1.701630</td><td>NaN</td></tr>\n",
"    <tr><th>6</th><td>0.318099</td><td>-0.037061</td><td>-1.878293</td><td>NaN</td></tr>\n",
"    <tr><th>7</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>8</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>9</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " A B C D\n", "0 -1.906552 -2.428495 1.131278 NaN\n", "1 -0.955872 -1.476556 -1.523796 NaN\n", "2 0.766210 -0.162112 0.190370 NaN\n", "3 -2.866838 0.866281 1.340097 NaN\n", "4 -2.027247 0.972097 -0.807422 NaN\n", "5 0.841079 0.101313 -1.701630 NaN\n", "6 0.318099 -0.037061 -1.878293 NaN\n", "7 NaN NaN NaN NaN\n", "8 NaN NaN NaN NaN\n", "9 NaN NaN NaN NaN" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1 + df2" ] }, { "attachments": {}, "cell_type": "markdown", "id": "f0a10763", "metadata": {}, "source": [ "A DataFrame can also be added to or subtracted from a Series. Much like broadcasting in NumPy, Pandas first aligns the labels of the Series with the column labels of the DataFrame, then broadcasts the Series along the row labels, so the operation is applied to every row:" ] }, { "cell_type": "code", "execution_count": 42, "id": "898bd8b9", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>A</th><th>B</th><th>C</th><th>D</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>0.034677</td><td>-1.447889</td><td>0.239673</td><td>0.897156</td></tr>\n",
"    <tr><th>1</th><td>-0.216450</td><td>-0.052522</td><td>0.237849</td><td>0.806303</td></tr>\n",
"    <tr><th>2</th><td>0.260522</td><td>0.590821</td><td>0.231546</td><td>-2.164184</td></tr>\n",
"    <tr><th>3</th><td>-1.264539</td><td>0.947130</td><td>0.601591</td><td>-0.753204</td></tr>\n",
"    <tr><th>4</th><td>-1.113126</td><td>0.063686</td><td>-0.379063</td><td>-0.275933</td></tr>\n",
"    <tr><th>5</th><td>0.596109</td><td>-0.516650</td><td>-1.177866</td><td>0.075800</td></tr>\n",
"    <tr><th>6</th><td>1.386725</td><td>-0.328219</td><td>-1.303265</td><td>-0.790358</td></tr>\n",
"    <tr><th>7</th><td>1.225454</td><td>0.923503</td><td>0.715214</td><td>-0.144048</td></tr>\n",
"    <tr><th>8</th><td>-0.982050</td><td>-0.026315</td><td>1.963732</td><td>0.638793</td></tr>\n",
"    <tr><th>9</th><td>0.715773</td><td>-0.767911</td><td>-0.379927</td><td>-1.533615</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " A B C D\n", "0 0.034677 -1.447889 0.239673 0.897156\n", "1 -0.216450 -0.052522 0.237849 0.806303\n", "2 0.260522 0.590821 0.231546 -2.164184\n", "3 -1.264539 0.947130 0.601591 -0.753204\n", "4 -1.113126 0.063686 -0.379063 -0.275933\n", "5 0.596109 -0.516650 -1.177866 0.075800\n", "6 1.386725 -0.328219 -1.303265 -0.790358\n", "7 1.225454 0.923503 0.715214 -0.144048\n", "8 -0.982050 -0.026315 1.963732 0.638793\n", "9 0.715773 -0.767911 -0.379927 -1.533615" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1" ] }, { "cell_type": "code", "execution_count": 43, "id": "0726b1fc", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>A</th><th>B</th><th>C</th><th>D</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>0.000000</td></tr>\n",
"    <tr><th>1</th><td>-0.251127</td><td>1.395367</td><td>-0.001824</td><td>-0.090853</td></tr>\n",
"    <tr><th>2</th><td>0.225845</td><td>2.038710</td><td>-0.008127</td><td>-3.061340</td></tr>\n",
"    <tr><th>3</th><td>-1.299216</td><td>2.395019</td><td>0.361919</td><td>-1.650360</td></tr>\n",
"    <tr><th>4</th><td>-1.147802</td><td>1.511575</td><td>-0.618736</td><td>-1.173089</td></tr>\n",
"    <tr><th>5</th><td>0.561432</td><td>0.931239</td><td>-1.417538</td><td>-0.821356</td></tr>\n",
"    <tr><th>6</th><td>1.352048</td><td>1.119670</td><td>-1.542938</td><td>-1.687514</td></tr>\n",
"    <tr><th>7</th><td>1.190778</td><td>2.371392</td><td>0.475542</td><td>-1.041204</td></tr>\n",
"    <tr><th>8</th><td>-1.016727</td><td>1.421574</td><td>1.724059</td><td>-0.258363</td></tr>\n",
"    <tr><th>9</th><td>0.681096</td><td>0.679978</td><td>-0.619600</td><td>-2.430771</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " A B C D\n", "0 0.000000 0.000000 0.000000 0.000000\n", "1 -0.251127 1.395367 -0.001824 -0.090853\n", "2 0.225845 2.038710 -0.008127 -3.061340\n", "3 -1.299216 2.395019 0.361919 -1.650360\n", "4 -1.147802 1.511575 -0.618736 -1.173089\n", "5 0.561432 0.931239 -1.417538 -0.821356\n", "6 1.352048 1.119670 -1.542938 -1.687514\n", "7 1.190778 2.371392 0.475542 -1.041204\n", "8 -1.016727 1.421574 1.724059 -0.258363\n", "9 0.681096 0.679978 -0.619600 -2.430771" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1 - df1.iloc[0]" ] }, { "cell_type": "code", "execution_count": null, "id": "0fe7dcf0", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.10" } }, "nbformat": 4, "nbformat_minor": 5 }