{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Programming Basics in Python\n", "\n", "*Alla Tambovtseva, HSE University*\n", "\n", "## Working with tables. Basics of `pandas` dataframes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this and the following lectures we will work with tables. In the social sciences the terms \"database\" and \"table\" are often used as synonyms. Strictly speaking, the two differ substantially: a database is a set of tables linked to one another (under certain conditions you can think of it as an Excel file with several sheets). For simplicity we will treat these terms as equivalent; we will not cover working with \"real\" databases (SQL, PyMongo). We will also use the word dataframe, borrowed from the term data frame, as a synonym for table.\n", "\n", "The `pandas` library makes working with tables convenient and more efficient. Its functionality is quite broad, so let us start with a few basic functions and methods.\n", "\n", "First, let us import the library itself." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we used the following technique: we imported the library and gave it a short alias that is used within this ipynb file. To avoid writing the long prefix `pandas.` before every library function, and to avoid importing all functions from the library at once, we shortened the name to `pd`; from now on Python will understand what we mean. We could have shortened it to `p`, but then we might forget about it and create a variable with the same name, which would cause problems at some point. Besides, `pd` is the conventional alias." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Loading a table from a file and describing the variables\n", "\n", "Now let us load a real dataset from a file. The `pandas` library is quite flexible: it can load data from files in various formats. For now we will stick to the simplest one, a csv file, which stands for *comma-separated values*. By default, the columns in such a file are separated by commas. For example, the following table" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " 0 1 2\n", "0 1 4 9\n", "1 4 8 6" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame([[1, 4, 9], [4, 8, 6]])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "saved in csv format without row and column names, looks like this:" ] }, { "cell_type": "raw", "metadata": {}, "source": [ "1, 4, 9\n", "4, 8, 6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The column separator can also be different, for example a semicolon:" ] }, { "cell_type": "raw", "metadata": {}, "source": [ "1; 4; 9\n", "4; 8; 6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In such cases we additionally have to set the parameter `sep = \";\"` so that `pandas` knows how to separate one column from another. Let us look at two example files, `test1.xlsx` and `test2.csv`." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B C\n", "0 2 3 4\n", "1 5 6 7" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# load test1.xlsx: everything is fine\n", "d1 = pd.read_excel(\"test1.xlsx\")\n", "d1" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B C\n", "0 2 2.5 1.8\n", "1 3 4.2 0.0\n", "2 4 4.3 1.6" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# load test2.csv: everything is fine here too\n", "d2 = pd.read_csv(\"test2.csv\")\n", "d2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let us experiment: open the file `test2.csv` (in a text editor or directly in Jupyter, which can open text files) and change the column separator. Replace the commas with semicolons:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```\n", "A;B;C\n", "2;2.5;1.8\n", "3;4.2;0\n", "4;4.3;1.6\n", "```" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A;B;C\n", "0 2;2.5;1.8\n", "1 3;4.2;0\n", "2 4;4.3;1.6" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# now the load gives a wrong result: everything lands in a single column\n", "pd.read_csv(\"test2.csv\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This happens because the default column separator is a comma. Let us state explicitly that it is now a semicolon:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B C\n", "0 2 2.5 1.8\n", "1 3 4.2 0.0\n", "2 4 4.3 1.6" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# everything is fine\n", "pd.read_csv(\"test2.csv\", sep = \";\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we also change the decimal separator in the fractional numbers, further oddities await us:\n", "\n", "```\n", "A;B;C\n", "2;2,5;1,8\n", "3;4,2;0\n", "4;4,3;1,6\n", "```" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B C\n", "0 2 2,5 1,8\n", "1 3 4,2 0\n", "2 4 4,3 1,6" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# visually nothing has changed\n", "dd = pd.read_csv(\"test2.csv\", sep = \";\")\n", "dd" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "RangeIndex: 3 entries, 0 to 2\n", "Data columns (total 3 columns):\n", "A 3 non-null int64\n", "B 3 non-null object\n", "C 3 non-null object\n", "dtypes: int64(1), object(2)\n", "memory usage: 152.0+ bytes\n" ] } ], "source": [ "dd.info() # columns B and C have type object, not float" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "RangeIndex: 3 entries, 0 to 2\n", "Data columns (total 3 columns):\n", "A 3 non-null int64\n", "B 3 non-null float64\n", "C 3 non-null float64\n", "dtypes: float64(2), int64(1)\n", "memory usage: 152.0 bytes\n" ] } ], "source": [ "# specify the decimal separator\n", "dd = pd.read_csv(\"test2.csv\", sep = \";\", decimal = \",\")\n", "dd.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let us load the data file: we pass the path to it (a local path or a URL) to the `read_csv()` function from the `pandas` library. In addition, we make the first column (index 0) serve as the row names. The rows will then be labelled not with numbers starting from 0 but with names of our choosing; it is only important that the names are all unique, without repeats (`pandas` does not check this for us):" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "df = pd.read_csv(\"scores2.csv\", index_col = 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sometimes this approach is useful. Imagine that all variables in the table except *id* are measured on a quantitative scale, and we plan to apply a statistical method that works exclusively with numeric data. If we simply drop the *id* column, we lose the information that identifies each observation; if we keep it, we have to collect the indicators to which the method is applied into a separate table, because the text stored in the original table gets in the way. If instead we name the rows according to *id*, we kill two birds with one stone: we get rid of the text column and keep the information about the observation (a code, a respondent's name, a country name and so on)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The file `scores2.csv` contains the grades of political science students in a number of courses. The grades are real, taken from the cumulative rating, but the students' names are hidden: student ID numbers are used instead. Let us look at the dataframe:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ024 7 9 8 8 9 8 10 8.0 7 9 \n", "М141БПЛТЛ031 8 10 10 10 10 10 10 9.0 9 10 \n", "М141БПЛТЛ075 9 9 9 10 9 10 9 8.0 9 10 \n", "М141БПЛТЛ017 9 9 8 8 9 9 10 6.0 9 9 \n", "М141БПЛТЛ069 10 10 10 10 10 10 9 8.0 8 10 \n", "М141БПЛТЛ072 10 9 8 10 9 8 9 8.0 8 10 \n", "М141БПЛТЛ020 8 7 7 6 9 10 8 8.0 7 7 \n", "М141БПЛТЛ026 7 10 8 7 10 7 9 8.0 8 8 \n", "М141БПЛТЛ073 7 9 8 8 9 8 9 8.0 8 9 \n", "М141БПЛТЛ078 6 6 9 5 6 10 7 6.0 8 6 \n", "М141БПЛТЛ060 7 8 7 7 9 8 8 5.0 7 5 \n", "М141БПЛТЛ040 6 9 8 6 9 7 8 6.0 9 5 \n", "М141БПЛТЛ065 9 9 8 4 8 8 7 9.0 8 5 \n", "М141БПЛТЛ053 6 7 7 5 9 8 7 8.0 8 6 \n", "М141БПЛТЛ015 6 9 7 6 9 7 9 4.0 7 7 \n", "М141БПЛТЛ021 8 9 8 8 9 8 8 7.0 7 7 \n", "М141БПЛТЛ018 7 7 9 7 9 7 8 6.0 6 7 \n", "М141БПЛТЛ039 9 8 9 8 8 8 6 8.0 7 6 \n", "М141БПЛТЛ036 8 10 7 8 8 6 9 4.0 8 8 \n", "М141БПЛТЛ049 6 7 6 6 8 6 8 4.0 8 5 \n", "06114043 8 8 10 5 8 8 8 10.0 7 7 \n", "М141БПЛТЛ048 8 6 8 6 9 6 4 4.0 6 4 \n", "М141БПЛТЛ034 6 9 7 6 9 6 8 6.0 7 6 \n", "М141БПЛТЛ045 5 8 8 7 8 6 7 6.0 7 7 \n", "М141БПЛТЛ033 5 9 8 7 9 7 9 7.0 7 8 \n", "М141БПЛТЛ083 5 5 6 5 8 7 6 5.0 7 5 \n", "М141БПЛТЛ008 10 8 8 9 8 10 9 8.0 9 10 \n", "М141БПЛТЛ001 6 7 7 4 10 7 7 6.0 8 6 \n", "М141БПЛТЛ038 7 9 6 4 9 6 7 6.0 7 4 \n", "М141БПЛТЛ052 7 7 7 7 8 6 6 6.0 8 6 \n", "М141БПЛТЛ011 7 6 8 6 9 6 6 5.0 6 6 \n", "М141БПЛТЛ004 7 7 6 6 8 6 6 5.0 5 5 \n", "М141БПЛТЛ010 6 6 7 6 9 7 7 6.0 7 5 \n", "М141БПЛТЛ071 6 9 7 7 9 6 8 4.0 6 7 \n", "М141БПЛТЛ035 5 6 7 6 8 5 5 4.0 6 6 \n", "М141БПЛТЛ030 7 6 6 6 7 6 6 4.0 8 5 \n", "М141БПЛТЛ070 5 5 6 4 8 6 5 5.0 6 4 \n", "М141БПЛТЛ051 8 9 8 6 8 7 6 7.0 6 6 \n", "М141БПЛТЛ046 5 7 7 4 7 5 8 5.0 7 5 \n", "М141БПЛТЛ047 5 8 6 4 7 5 9 5.0 6 4 \n", "М141БПЛТЛ063 5 5 6 4 8 4 4 4.0 5 4 \n", "М141БПЛТЛ029 6 8 8 7 9 5 6 7.0 6 5 \n", "М141БПЛТЛ064 7 8 6 7 6 6 8 4.0 6 4 \n", "М141БПЛТЛ076 7 7 8 6 8 6 6 6.0 8 6 \n", "М141БПЛТЛ062 7 7 7 6 9 6 6 5.0 6 5 \n", "М141БПЛТЛ074 5 6 7 4 7 6 5 6.0 6 
6 \n", "130232038 6 7 6 5 8 4 8 4.0 8 4 \n", "М141БПЛТЛ023 7 9 6 8 9 6 9 4.0 7 7 \n", "М141БПЛТЛ054 7 8 6 4 8 6 4 4.0 6 4 \n", "М141БПЛТЛ012 6 6 7 4 10 6 5 4.0 7 5 \n", "М141БПЛТЛ006 6 5 6 5 8 5 5 5.0 6 4 \n", "М141БПЛТЛ055 6 5 6 4 7 7 4 8.0 5 4 \n", "М141БПЛТЛ007 6 7 7 6 7 6 7 4.0 5 5 \n", "М141БПЛТЛ050 8 6 6 6 8 4 5 4.0 5 5 \n", "М141БПЛТЛ066 7 10 7 7 9 5 8 4.0 6 5 \n", "М141БПЛТЛ043 5 5 6 5 8 5 6 5.0 6 4 \n", "М141БПЛТЛ084 6 7 8 4 8 5 5 NaN 8 4 \n", "М141БПЛТЛ005 5 7 5 5 7 4 7 4.0 5 4 \n", "М141БПЛТЛ044 4 5 7 4 6 4 4 5.0 4 4 \n", "13051038 5 4 4 4 9 5 5 5.0 5 4 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ024 9 7.0 8 8.0 6 10 1 \n", "М141БПЛТЛ031 10 9.0 8 8.0 9 10 1 \n", "М141БПЛТЛ075 9 9.0 8 8.0 7 9 1 \n", "М141БПЛТЛ017 9 8.0 8 8.0 8 9 0 \n", "М141БПЛТЛ069 9 7.0 6 5.0 8 10 1 \n", "М141БПЛТЛ072 9 7.0 8 8.0 9 9 0 \n", "М141БПЛТЛ020 9 7.0 8 6.0 8 9 1 \n", "М141БПЛТЛ026 8 8.0 8 7.0 7 8 0 \n", "М141БПЛТЛ073 9 7.0 7 6.0 10 9 1 \n", "М141БПЛТЛ078 9 6.0 8 8.0 6 7 0 \n", "М141БПЛТЛ060 8 5.0 7 8.0 7 9 1 \n", "М141БПЛТЛ040 8 5.0 8 5.0 7 10 0 \n", "М141БПЛТЛ065 10 9.0 8 8.0 6 9 1 \n", "М141БПЛТЛ053 8 7.0 8 6.0 9 9 0 \n", "М141БПЛТЛ015 7 6.0 7 7.0 10 7 0 \n", "М141БПЛТЛ021 6 6.0 8 6.0 7 8 0 \n", "М141БПЛТЛ018 8 7.0 7 7.0 7 8 0 \n", "М141БПЛТЛ039 9 6.0 7 8.0 4 9 1 \n", "М141БПЛТЛ036 7 6.0 7 6.0 7 8 1 \n", "М141БПЛТЛ049 9 6.0 8 5.0 6 8 0 \n", "06114043 9 NaN 7 8.0 7 8 1 \n", "М141БПЛТЛ048 8 4.0 6 7.0 7 8 0 \n", "М141БПЛТЛ034 6 5.0 8 5.0 8 9 0 \n", "М141БПЛТЛ045 8 6.0 8 6.0 5 8 0 \n", "М141БПЛТЛ033 8 7.0 8 5.0 7 8 0 \n", "М141БПЛТЛ083 7 5.0 7 5.0 4 7 0 \n", "М141БПЛТЛ008 9 8.0 5 5.0 10 4 1 \n", "М141БПЛТЛ001 8 4.0 6 6.0 4 8 0 \n", "М141БПЛТЛ038 8 4.0 5 4.0 9 7 1 \n", "М141БПЛТЛ052 7 5.0 8 6.0 5 7 1 \n", "М141БПЛТЛ011 7 6.0 8 6.0 5 8 0 \n", "М141БПЛТЛ004 6 5.0 7 5.0 8 8 0 \n", "М141БПЛТЛ010 8 6.0 8 6.0 5 8 1 \n", "М141БПЛТЛ071 7 6.0 5 NaN 5 7 0 \n", "М141БПЛТЛ035 7 5.0 8 7.0 6 7 0 \n", "М141БПЛТЛ030 5 5.0 8 5.0 7 9 1 \n", "М141БПЛТЛ070 5 6.0 8 5.0 
6 7 0 \n", "М141БПЛТЛ051 6 5.0 4 4.0 5 5 1 \n", "М141БПЛТЛ046 7 5.0 8 4.0 5 7 0 \n", "М141БПЛТЛ047 6 4.0 7 4.0 8 8 0 \n", "М141БПЛТЛ063 5 4.0 7 5.0 8 8 0 \n", "М141БПЛТЛ029 8 5.0 7 4.0 5 7 0 \n", "М141БПЛТЛ064 4 4.0 6 5.0 4 7 0 \n", "М141БПЛТЛ076 8 5.0 7 4.0 4 6 0 \n", "М141БПЛТЛ062 6 4.0 5 5.0 4 6 0 \n", "М141БПЛТЛ074 8 6.0 6 6.0 8 8 1 \n", "130232038 5 5.0 6 4.0 5 6 0 \n", "М141БПЛТЛ023 7 6.0 4 4.0 7 5 1 \n", "М141БПЛТЛ054 8 4.0 4 4.0 4 8 1 \n", "М141БПЛТЛ012 7 4.0 5 4.0 4 8 1 \n", "М141БПЛТЛ006 7 5.0 7 5.0 6 8 0 \n", "М141БПЛТЛ055 6 4.0 6 5.0 4 5 1 \n", "М141БПЛТЛ007 6 5.0 4 5.0 4 7 1 \n", "М141БПЛТЛ050 6 4.0 5 4.0 6 6 0 \n", "М141БПЛТЛ066 6 4.0 6 4.0 5 6 0 \n", "М141БПЛТЛ043 5 4.0 5 NaN 4 6 0 \n", "М141БПЛТЛ084 4 4.0 4 4.0 6 7 1 \n", "М141БПЛТЛ005 5 5.0 4 4.0 4 8 1 \n", "М141БПЛТЛ044 4 4.0 6 NaN 5 5 1 \n", "13051038 4 NaN 7 4.0 4 4 1 " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since our table is not very large, `pandas` displayed it in full. If there were too many rows or columns, `pandas` would show only the first and the last few and put an ellipsis in the middle." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Description of the indicators (variables):**\n", "\n", "* `id`: student ID number;\n", "* `catps`: grade for the course *Categories of Political Science*;\n", "* `mstat`: grade for the course *Mathematics and Statistics*;\n", "* `soc`: grade for the course *Sociology*;\n", "* `econ`: grade for the course *Economics*;\n", "* `eng`: grade for the course *English*;\n", "* `polth`: grade for the course *History of Political Thought*;\n", "* `mstat2`: grade for the course *Mathematics and Statistics (part 2)*;\n", "* `phist`: grade for the course *Political History*;\n", "* `law`: grade for the course *Law*;\n", "* `phil`: grade for the course *Philosophy*;\n", "* `polsoc`: grade for the course *Political Sociology*;\n", "* `ptheo`: grade for the course *Political Theory*;\n", "* `preg`: grade for the course *Political Regional Studies*;\n", "* `compp`: grade for the course *Comparative Politics*;\n", "* `game`: grade for the course *Game Theory*;\n", "* `wpol`: grade for the course *World Politics and International Relations*;\n", "* `male`: sex (1 for male, 0 for female)." 
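, "\n", "The loading options used in this section (`sep`, `decimal`, `index_col`) can be tried without any files on disk. The snippet is a minimal sketch under stated assumptions: the three-row table and the names `csv_text`, `raw` and `demo` are made up for illustration and are unrelated to `scores2.csv`. It also shows why grade columns with missing values, such as `phist`, are printed as 8.0 rather than 8: a numeric column that contains `NaN` is stored as `float64`.\n", "\n", "
```python
import io
import pandas as pd

# Hypothetical three-row file: ';' separates columns, ',' is the decimal mark,
# and one grade is missing (empty field).
csv_text = '''id;grade;score
s01;7;2,5
s02;;4,2
s03;9;0
'''

# Without decimal=',' the score column is read as text, not as numbers.
raw = pd.read_csv(io.StringIO(csv_text), sep=';', index_col=0)

# With decimal=',' the score column is parsed as float64;
# index_col=0 turns the id column into row labels.
demo = pd.read_csv(io.StringIO(csv_text), sep=';', decimal=',', index_col=0)

print(raw.dtypes)
print(demo.dtypes)           # grade is float64 because of the missing value
print(demo.index.is_unique)  # row labels should not repeat
```
"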
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us get summary information about the table:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "Index: 60 entries, М141БПЛТЛ024 to 13051038\n", "Data columns (total 17 columns):\n", "catps 60 non-null int64\n", "mstat 60 non-null int64\n", "soc 60 non-null int64\n", "econ 60 non-null int64\n", "eng 60 non-null int64\n", "polth 60 non-null int64\n", "mstat2 60 non-null int64\n", "phist 59 non-null float64\n", "law 60 non-null int64\n", "phil 60 non-null int64\n", "polsoc 60 non-null int64\n", "ptheo 58 non-null float64\n", "preg 60 non-null int64\n", "compp 57 non-null float64\n", "game 60 non-null int64\n", "wpol 60 non-null int64\n", "male 60 non-null int64\n", "dtypes: float64(3), int64(14)\n", "memory usage: 8.4+ KB\n" ] } ], "source": [ "df.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What information did the `.info()` method give us? First, it told us that `df` is a `DataFrame` object. Second, it printed the number of rows (60 entries) and the first and last row labels (М141БПЛТЛ024 to 13051038). Third, it printed the number of columns (total 17 columns). Finally, it gave information on each column. Let us look at this in more detail.\n", "\n", "The output above shows how many non-empty elements each column contains. Non-null elements are everything except missing values, which are encoded in a special way (`NaN`, from *Not a Number*). Our table has columns that are not completely filled: `phist`, `ptheo` and `compp`.\n", "\n", "Next comes the type of each column: integer `int64` or floating-point `float64` (the three columns with missing values are stored as `float64`). What does the last line mean? It is the amount of memory required to store the table." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Summary statistics can be obtained with the `.describe()` method." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth \\\n", "count 60.000000 60.000000 60.000000 60.000000 60.000000 60.000000 \n", "mean 6.700000 7.466667 7.216667 6.116667 8.350000 6.600000 \n", "std 1.417804 1.578099 1.208608 1.718214 0.971195 1.638519 \n", "min 4.000000 4.000000 4.000000 4.000000 6.000000 4.000000 \n", "25% 6.000000 6.000000 6.000000 5.000000 8.000000 6.000000 \n", "50% 7.000000 7.000000 7.000000 6.000000 8.000000 6.000000 \n", "75% 7.250000 9.000000 8.000000 7.000000 9.000000 8.000000 \n", "max 10.000000 10.000000 10.000000 10.000000 10.000000 10.000000 \n", "\n", " mstat2 phist law phil polsoc ptheo \\\n", "count 60.000000 59.000000 60.000000 60.000000 60.000000 58.000000 \n", "mean 7.033333 5.830508 6.866667 5.966667 7.183333 5.603448 \n", "std 1.707081 1.662492 1.213856 1.850027 1.589069 1.413465 \n", "min 4.000000 4.000000 4.000000 4.000000 4.000000 4.000000 \n", "25% 6.000000 4.000000 6.000000 4.750000 6.000000 4.250000 \n", "50% 7.000000 6.000000 7.000000 5.500000 7.000000 5.000000 \n", "75% 8.000000 7.000000 8.000000 7.000000 8.000000 6.000000 \n", "max 10.000000 10.000000 9.000000 10.000000 10.000000 9.000000 \n", "\n", " preg compp game wpol male \n", "count 60.000000 57.000000 60.000000 60.000000 60.000000 \n", "mean 6.700000 5.631579 6.250000 7.566667 0.450000 \n", "std 1.356716 1.422166 1.781496 1.430499 0.501692 \n", "min 4.000000 4.000000 4.000000 4.000000 0.000000 \n", "25% 6.000000 4.000000 5.000000 7.000000 0.000000 \n", "50% 7.000000 5.000000 6.000000 8.000000 0.000000 \n", "75% 8.000000 7.000000 7.250000 8.250000 1.000000 \n", "max 8.000000 8.000000 10.000000 10.000000 1.000000 " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For quantitative variables this method returns a table with the main descriptive statistics:\n", "\n", "* `count`: the number of non-empty (filled-in) values\n", "* `mean`: the arithmetic mean\n", "* `std`: the standard deviation (a measure of the spread of the data around the mean)\n", "* `min`: the minimum value\n", "* `max`: the maximum value\n", "* `25%`: the lower quartile (the value that 25% of the values do not exceed)\n", "* `50%`: the median (the value that 50% of the values do not exceed)\n", "* `75%`: the upper quartile (the value that 75% of the values do not exceed)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Series: a column of a dataframe" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us look at the structure of the table more closely. Select the first column, `catps`:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "id\n", "М141БПЛТЛ024 7\n", "М141БПЛТЛ031 8\n", "М141БПЛТЛ075 9\n", "М141БПЛТЛ017 9\n", "М141БПЛТЛ069 10\n", "М141БПЛТЛ072 10\n", "М141БПЛТЛ020 8\n", "М141БПЛТЛ026 7\n", "М141БПЛТЛ073 7\n", "М141БПЛТЛ078 6\n", "М141БПЛТЛ060 7\n", "М141БПЛТЛ040 6\n", "М141БПЛТЛ065 9\n", "М141БПЛТЛ053 6\n", "М141БПЛТЛ015 6\n", "М141БПЛТЛ021 8\n", "М141БПЛТЛ018 7\n", "М141БПЛТЛ039 9\n", "М141БПЛТЛ036 8\n", "М141БПЛТЛ049 6\n", "06114043 8\n", "М141БПЛТЛ048 8\n", "М141БПЛТЛ034 6\n", "М141БПЛТЛ045 5\n", "М141БПЛТЛ033 5\n", "М141БПЛТЛ083 5\n", "М141БПЛТЛ008 10\n", "М141БПЛТЛ001 6\n", "М141БПЛТЛ038 7\n", "М141БПЛТЛ052 7\n", "М141БПЛТЛ011 7\n", "М141БПЛТЛ004 7\n", "М141БПЛТЛ010 6\n", "М141БПЛТЛ071 6\n", "М141БПЛТЛ035 5\n", "М141БПЛТЛ030 7\n", "М141БПЛТЛ070 5\n", "М141БПЛТЛ051 8\n", "М141БПЛТЛ046 5\n", "М141БПЛТЛ047 5\n", "М141БПЛТЛ063 5\n", "М141БПЛТЛ029 6\n", "М141БПЛТЛ064 7\n", "М141БПЛТЛ076 7\n", "М141БПЛТЛ062 7\n", "М141БПЛТЛ074 5\n", "130232038 6\n", "М141БПЛТЛ023 7\n", "М141БПЛТЛ054 7\n", "М141БПЛТЛ012 6\n", "М141БПЛТЛ006 6\n", "М141БПЛТЛ055 6\n", "М141БПЛТЛ007 6\n", "М141БПЛТЛ050 8\n", "М141БПЛТЛ066 7\n", "М141БПЛТЛ043 5\n", "М141БПЛТЛ084 6\n", "М141БПЛТЛ005 5\n", "М141БПЛТЛ044 4\n", "13051038 5\n", "Name: catps, dtype: int64" ] 
}, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['catps']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A column of the dataframe `df` has a special type, *Series*. On the surface, a *Series* differs from an ordinary list of values in two ways. First, displaying a column shows not only the elements but also their labels (here the row labels from `id`). Second, the last line shows the column name (`Name: catps`) and its type (`dtype: int64`, integer). The first feature makes a *Series* similar to a dictionary: it consists of *key-value* pairs, that is, *label-value* pairs. The second feature makes it similar to a `numpy` array: the elements normally share one type." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Operations on tables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can display the first or last rows of a table with the methods `.head()` and `.tail()`." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ024 7 9 8 8 9 8 10 8.0 7 9 \n", "М141БПЛТЛ031 8 10 10 10 10 10 10 9.0 9 10 \n", "М141БПЛТЛ075 9 9 9 10 9 10 9 8.0 9 10 \n", "М141БПЛТЛ017 9 9 8 8 9 9 10 6.0 9 9 \n", "М141БПЛТЛ069 10 10 10 10 10 10 9 8.0 8 10 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ024 9 7.0 8 8.0 6 10 1 \n", "М141БПЛТЛ031 10 9.0 8 8.0 9 10 1 \n", "М141БПЛТЛ075 9 9.0 8 8.0 7 9 1 \n", "М141БПЛТЛ017 9 8.0 8 8.0 8 9 0 \n", "М141БПЛТЛ069 9 7.0 6 5.0 8 10 1 " ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ043 5 5 6 5 8 5 6 5.0 6 4 \n", "М141БПЛТЛ084 6 7 8 4 8 5 5 NaN 8 4 \n", "М141БПЛТЛ005 5 7 5 5 7 4 7 4.0 5 4 \n", "М141БПЛТЛ044 4 5 7 4 6 4 4 5.0 4 4 \n", "13051038 5 4 4 4 9 5 5 5.0 5 4 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ043 5 4.0 5 NaN 4 6 0 \n", "М141БПЛТЛ084 4 4.0 4 4.0 6 7 1 \n", "М141БПЛТЛ005 5 5.0 4 4.0 4 8 1 \n", "М141БПЛТЛ044 4 4.0 6 NaN 5 5 1 \n", "13051038 4 NaN 7 4.0 4 4 1 " ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.tail()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Внимание:** это просто первые и последние строки таблицы «как есть». Никакой сортировки не происходит! \n", "\n", "По умолчанию эти методы выводят пять строк, но при желании это легко изменить. Достаточно в скобках указать желаемое число строк." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ024 7 9 8 8 9 8 10 8.0 7 9 \n", "М141БПЛТЛ031 8 10 10 10 10 10 10 9.0 9 10 \n", "М141БПЛТЛ075 9 9 9 10 9 10 9 8.0 9 10 \n", "М141БПЛТЛ017 9 9 8 8 9 9 10 6.0 9 9 \n", "М141БПЛТЛ069 10 10 10 10 10 10 9 8.0 8 10 \n", "М141БПЛТЛ072 10 9 8 10 9 8 9 8.0 8 10 \n", "М141БПЛТЛ020 8 7 7 6 9 10 8 8.0 7 7 \n", "М141БПЛТЛ026 7 10 8 7 10 7 9 8.0 8 8 \n", "М141БПЛТЛ073 7 9 8 8 9 8 9 8.0 8 9 \n", "М141БПЛТЛ078 6 6 9 5 6 10 7 6.0 8 6 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ024 9 7.0 8 8.0 6 10 1 \n", "М141БПЛТЛ031 10 9.0 8 8.0 9 10 1 \n", "М141БПЛТЛ075 9 9.0 8 8.0 7 9 1 \n", "М141БПЛТЛ017 9 8.0 8 8.0 8 9 0 \n", "М141БПЛТЛ069 9 7.0 6 5.0 8 10 1 \n", "М141БПЛТЛ072 9 7.0 8 8.0 9 9 0 \n", "М141БПЛТЛ020 9 7.0 8 6.0 8 9 1 \n", "М141БПЛТЛ026 8 8.0 8 7.0 7 8 0 \n", "М141БПЛТЛ073 9 7.0 7 6.0 10 9 1 \n", "М141БПЛТЛ078 9 6.0 8 8.0 6 7 0 " ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head(10) # первые 10 строк" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Когда таблица большая, увидеть все столбцы разом не получится. Поэтому полезно знать, как получить список названий столбцов." 
] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['catps', 'mstat', 'soc', 'econ', 'eng', 'polth', 'mstat2', 'phist',\n", " 'law', 'phil', 'polsoc', 'ptheo', 'preg', 'compp', 'game', 'wpol',\n", " 'male'],\n", " dtype='object')" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the resulting object is not an ordinary list:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.indexes.base.Index" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(df.columns) # this is a pandas Index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To get a plain list of names, convert it with the familiar `list()`: " ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['catps', 'mstat', 'soc', 'econ', 'eng', 'polth', 'mstat2', 'phist', 'law', 'phil', 'polsoc', 'ptheo', 'preg', 'compp', 'game', 'wpol', 'male']\n" ] } ], "source": [ "c = list(df.columns)\n", "print(c)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The same goes for the rows: " ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['М141БПЛТЛ024', 'М141БПЛТЛ031', 'М141БПЛТЛ075', 'М141БПЛТЛ017',\n", " 'М141БПЛТЛ069', 'М141БПЛТЛ072', 'М141БПЛТЛ020', 'М141БПЛТЛ026',\n", " 'М141БПЛТЛ073', 'М141БПЛТЛ078', 'М141БПЛТЛ060', 'М141БПЛТЛ040',\n", " 'М141БПЛТЛ065', 'М141БПЛТЛ053', 'М141БПЛТЛ015', 'М141БПЛТЛ021',\n", " 'М141БПЛТЛ018', 'М141БПЛТЛ039', 'М141БПЛТЛ036', 'М141БПЛТЛ049',\n", " '06114043', 'М141БПЛТЛ048', 'М141БПЛТЛ034', 'М141БПЛТЛ045',\n", " 'М141БПЛТЛ033', 'М141БПЛТЛ083', 'М141БПЛТЛ008', 'М141БПЛТЛ001',\n", " 
'М141БПЛТЛ038', 'М141БПЛТЛ052', 'М141БПЛТЛ011', 'М141БПЛТЛ004',\n", " 'М141БПЛТЛ010', 'М141БПЛТЛ071', 'М141БПЛТЛ035', 'М141БПЛТЛ030',\n", " 'М141БПЛТЛ070', 'М141БПЛТЛ051', 'М141БПЛТЛ046', 'М141БПЛТЛ047',\n", " 'М141БПЛТЛ063', 'М141БПЛТЛ029', 'М141БПЛТЛ064', 'М141БПЛТЛ076',\n", " 'М141БПЛТЛ062', 'М141БПЛТЛ074', '130232038', 'М141БПЛТЛ023',\n", " 'М141БПЛТЛ054', 'М141БПЛТЛ012', 'М141БПЛТЛ006', 'М141БПЛТЛ055',\n", " 'М141БПЛТЛ007', 'М141БПЛТЛ050', 'М141БПЛТЛ066', 'М141БПЛТЛ043',\n", " 'М141БПЛТЛ084', 'М141БПЛТЛ005', 'М141БПЛТЛ044', '13051038'],\n", " dtype='object', name='id')" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Important:** dataframes are a mutable data structure (yes, just like lists). So when we apply certain methods to a `DataFrame` object, or make changes through another name that refers to it, we change the original dataframe, and we need to be prepared for that. If you do not plan to modify the original data, it makes sense to make a copy and work with that. For example, like this:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "# the copy method\n", "df_new = df.copy()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note:** plain assignment does not create a copy. Code such as `df_new = df` creates a new reference to the dataframe, not a new dataframe. So when `df_new` is modified, `df` changes as well (recall the story about the treachery of lists). \n", "\n", "However, this does not apply to every transformation: many dataframe methods are designed to return a modified copy and leave the original dataframe unchanged. To modify the original dataframe instead, you can add the argument `inplace = True`, which many methods accept." 
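, "\n", "\n", "The difference between a second reference and a real copy can be sketched on a small made-up table. The names `df_demo`, `alias` and `independent` and the toy values are assumptions for illustration only, they are not part of the lecture data.\n", "\n", "
```python
import pandas as pd

# Hypothetical toy data; the column names only mimic the lecture's table.
df_demo = pd.DataFrame({'catps': [7, 8, 9], 'mstat': [9, 10, 9]},
                       index=['s1', 's2', 's3'])

alias = df_demo                # a second name for the same object
independent = df_demo.copy()   # a separate object with its own data

# Adding a column through the alias also changes df_demo.
alias['total'] = alias['catps'] + alias['mstat']

print('total' in df_demo.columns)      # the original sees the new column
print('total' in independent.columns)  # the copy does not

# Many methods return a new object and leave the original untouched.
dropped = df_demo.drop(columns=['total'])
print(list(dropped.columns))
print(list(df_demo.columns))
```
"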
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Selecting columns and rows of a table" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Selecting columns by name**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is often most convenient to select a column by its name. To do so, put the column name in square brackets (and in quotes, since the name is a string):" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "id\n", "М141БПЛТЛ024 9\n", "М141БПЛТЛ031 10\n", "М141БПЛТЛ075 9\n", "М141БПЛТЛ017 9\n", "М141БПЛТЛ069 10\n", "М141БПЛТЛ072 9\n", "М141БПЛТЛ020 7\n", "М141БПЛТЛ026 10\n", "М141БПЛТЛ073 9\n", "М141БПЛТЛ078 6\n", "М141БПЛТЛ060 8\n", "М141БПЛТЛ040 9\n", "М141БПЛТЛ065 9\n", "М141БПЛТЛ053 7\n", "М141БПЛТЛ015 9\n", "М141БПЛТЛ021 9\n", "М141БПЛТЛ018 7\n", "М141БПЛТЛ039 8\n", "М141БПЛТЛ036 10\n", "М141БПЛТЛ049 7\n", "06114043 8\n", "М141БПЛТЛ048 6\n", "М141БПЛТЛ034 9\n", "М141БПЛТЛ045 8\n", "М141БПЛТЛ033 9\n", "М141БПЛТЛ083 5\n", "М141БПЛТЛ008 8\n", "М141БПЛТЛ001 7\n", "М141БПЛТЛ038 9\n", "М141БПЛТЛ052 7\n", "М141БПЛТЛ011 6\n", "М141БПЛТЛ004 7\n", "М141БПЛТЛ010 6\n", "М141БПЛТЛ071 9\n", "М141БПЛТЛ035 6\n", "М141БПЛТЛ030 6\n", "М141БПЛТЛ070 5\n", "М141БПЛТЛ051 9\n", "М141БПЛТЛ046 7\n", "М141БПЛТЛ047 8\n", "М141БПЛТЛ063 5\n", "М141БПЛТЛ029 8\n", "М141БПЛТЛ064 8\n", "М141БПЛТЛ076 7\n", "М141БПЛТЛ062 7\n", "М141БПЛТЛ074 6\n", "130232038 7\n", "М141БПЛТЛ023 9\n", "М141БПЛТЛ054 8\n", "М141БПЛТЛ012 6\n", "М141БПЛТЛ006 5\n", "М141БПЛТЛ055 5\n", "М141БПЛТЛ007 7\n", "М141БПЛТЛ050 6\n", "М141БПЛТЛ066 10\n", "М141БПЛТЛ043 5\n", "М141БПЛТЛ084 7\n", "М141БПЛТЛ005 7\n", "М141БПЛТЛ044 5\n", "13051038 4\n", "Name: mstat, dtype: int64" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['mstat']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A column can also be selected without square brackets, 
simply by writing its name after a dot: " ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "id\n", "М141БПЛТЛ024 9\n", "М141БПЛТЛ031 10\n", "М141БПЛТЛ075 9\n", "М141БПЛТЛ017 9\n", "М141БПЛТЛ069 10\n", "М141БПЛТЛ072 9\n", "М141БПЛТЛ020 7\n", "М141БПЛТЛ026 10\n", "М141БПЛТЛ073 9\n", "М141БПЛТЛ078 6\n", "М141БПЛТЛ060 8\n", "М141БПЛТЛ040 9\n", "М141БПЛТЛ065 9\n", "М141БПЛТЛ053 7\n", "М141БПЛТЛ015 9\n", "М141БПЛТЛ021 9\n", "М141БПЛТЛ018 7\n", "М141БПЛТЛ039 8\n", "М141БПЛТЛ036 10\n", "М141БПЛТЛ049 7\n", "06114043 8\n", "М141БПЛТЛ048 6\n", "М141БПЛТЛ034 9\n", "М141БПЛТЛ045 8\n", "М141БПЛТЛ033 9\n", "М141БПЛТЛ083 5\n", "М141БПЛТЛ008 8\n", "М141БПЛТЛ001 7\n", "М141БПЛТЛ038 9\n", "М141БПЛТЛ052 7\n", "М141БПЛТЛ011 6\n", "М141БПЛТЛ004 7\n", "М141БПЛТЛ010 6\n", "М141БПЛТЛ071 9\n", "М141БПЛТЛ035 6\n", "М141БПЛТЛ030 6\n", "М141БПЛТЛ070 5\n", "М141БПЛТЛ051 9\n", "М141БПЛТЛ046 7\n", "М141БПЛТЛ047 8\n", "М141БПЛТЛ063 5\n", "М141БПЛТЛ029 8\n", "М141БПЛТЛ064 8\n", "М141БПЛТЛ076 7\n", "М141БПЛТЛ062 7\n", "М141БПЛТЛ074 6\n", "130232038 7\n", "М141БПЛТЛ023 9\n", "М141БПЛТЛ054 8\n", "М141БПЛТЛ012 6\n", "М141БПЛТЛ006 5\n", "М141БПЛТЛ055 5\n", "М141БПЛТЛ007 7\n", "М141БПЛТЛ050 6\n", "М141БПЛТЛ066 10\n", "М141БПЛТЛ043 5\n", "М141БПЛТЛ084 7\n", "М141БПЛТЛ005 7\n", "М141БПЛТЛ044 5\n", "13051038 4\n", "Name: mstat, dtype: int64" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.mstat" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, this approach is not universal. If a column name contains characters that are not allowed in Python identifiers (for example, spaces or hyphens), dot access will not work. 
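\n", "\n", "A minimal sketch of where dot access stops working, on a hypothetical table (the column names `game theory` and `count` are made up for illustration). It also shows one further case: a column whose name matches an existing DataFrame method, where dot access returns the method rather than the column.\n", "\n", "
```python
import pandas as pd

# Hypothetical column names chosen to show where attribute access breaks.
df_demo = pd.DataFrame({'mstat': [9, 10],
                        'game theory': [6, 9],
                        'count': [1, 2]})

# A plain identifier works both ways and gives the same Series.
same = df_demo.mstat.equals(df_demo['mstat'])

# A name with a space is not a valid attribute name, so only brackets work.
with_space = df_demo['game theory']

# A name that matches a DataFrame method resolves to the method, not the column.
is_method = callable(df_demo.count)
real_column = df_demo['count']

print(same, is_method)
print(list(with_space), list(real_column))
```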
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Если нам нужно выбрать более одного столбца, то названия столбцов указываются внутри списка – появляются двойные квадратные скобки:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " soc polsoc\n", "id \n", "М141БПЛТЛ024 8 9\n", "М141БПЛТЛ031 10 10\n", "М141БПЛТЛ075 9 9\n", "М141БПЛТЛ017 8 9\n", "М141БПЛТЛ069 10 9\n", "М141БПЛТЛ072 8 9\n", "М141БПЛТЛ020 7 9\n", "М141БПЛТЛ026 8 8\n", "М141БПЛТЛ073 8 9\n", "М141БПЛТЛ078 9 9\n", "М141БПЛТЛ060 7 8\n", "М141БПЛТЛ040 8 8\n", "М141БПЛТЛ065 8 10\n", "М141БПЛТЛ053 7 8\n", "М141БПЛТЛ015 7 7\n", "М141БПЛТЛ021 8 6\n", "М141БПЛТЛ018 9 8\n", "М141БПЛТЛ039 9 9\n", "М141БПЛТЛ036 7 7\n", "М141БПЛТЛ049 6 9\n", "06114043 10 9\n", "М141БПЛТЛ048 8 8\n", "М141БПЛТЛ034 7 6\n", "М141БПЛТЛ045 8 8\n", "М141БПЛТЛ033 8 8\n", "М141БПЛТЛ083 6 7\n", "М141БПЛТЛ008 8 9\n", "М141БПЛТЛ001 7 8\n", "М141БПЛТЛ038 6 8\n", "М141БПЛТЛ052 7 7\n", "М141БПЛТЛ011 8 7\n", "М141БПЛТЛ004 6 6\n", "М141БПЛТЛ010 7 8\n", "М141БПЛТЛ071 7 7\n", "М141БПЛТЛ035 7 7\n", "М141БПЛТЛ030 6 5\n", "М141БПЛТЛ070 6 5\n", "М141БПЛТЛ051 8 6\n", "М141БПЛТЛ046 7 7\n", "М141БПЛТЛ047 6 6\n", "М141БПЛТЛ063 6 5\n", "М141БПЛТЛ029 8 8\n", "М141БПЛТЛ064 6 4\n", "М141БПЛТЛ076 8 8\n", "М141БПЛТЛ062 7 6\n", "М141БПЛТЛ074 7 8\n", "130232038 6 5\n", "М141БПЛТЛ023 6 7\n", "М141БПЛТЛ054 6 8\n", "М141БПЛТЛ012 7 7\n", "М141БПЛТЛ006 6 7\n", "М141БПЛТЛ055 6 6\n", "М141БПЛТЛ007 7 6\n", "М141БПЛТЛ050 6 6\n", "М141БПЛТЛ066 7 6\n", "М141БПЛТЛ043 6 5\n", "М141БПЛТЛ084 8 4\n", "М141БПЛТЛ005 5 5\n", "М141БПЛТЛ044 7 4\n", "13051038 4 4" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[[\"soc\", \"polsoc\"]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Если нам нужно несколько столбцов подряд, начиная с одного названия и заканчивая другим, можно воспользоваться методом `.loc`: " ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " econ eng polth mstat2 phist law\n", "id \n", "М141БПЛТЛ024 8 9 8 10 8.0 7\n", "М141БПЛТЛ031 10 10 10 10 9.0 9\n", "М141БПЛТЛ075 10 9 10 9 8.0 9\n", "М141БПЛТЛ017 8 9 9 10 6.0 9\n", "М141БПЛТЛ069 10 10 10 9 8.0 8\n", "М141БПЛТЛ072 10 9 8 9 8.0 8\n", "М141БПЛТЛ020 6 9 10 8 8.0 7\n", "М141БПЛТЛ026 7 10 7 9 8.0 8\n", "М141БПЛТЛ073 8 9 8 9 8.0 8\n", "М141БПЛТЛ078 5 6 10 7 6.0 8\n", "М141БПЛТЛ060 7 9 8 8 5.0 7\n", "М141БПЛТЛ040 6 9 7 8 6.0 9\n", "М141БПЛТЛ065 4 8 8 7 9.0 8\n", "М141БПЛТЛ053 5 9 8 7 8.0 8\n", "М141БПЛТЛ015 6 9 7 9 4.0 7\n", "М141БПЛТЛ021 8 9 8 8 7.0 7\n", "М141БПЛТЛ018 7 9 7 8 6.0 6\n", "М141БПЛТЛ039 8 8 8 6 8.0 7\n", "М141БПЛТЛ036 8 8 6 9 4.0 8\n", "М141БПЛТЛ049 6 8 6 8 4.0 8\n", "06114043 5 8 8 8 10.0 7\n", "М141БПЛТЛ048 6 9 6 4 4.0 6\n", "М141БПЛТЛ034 6 9 6 8 6.0 7\n", "М141БПЛТЛ045 7 8 6 7 6.0 7\n", "М141БПЛТЛ033 7 9 7 9 7.0 7\n", "М141БПЛТЛ083 5 8 7 6 5.0 7\n", "М141БПЛТЛ008 9 8 10 9 8.0 9\n", "М141БПЛТЛ001 4 10 7 7 6.0 8\n", "М141БПЛТЛ038 4 9 6 7 6.0 7\n", "М141БПЛТЛ052 7 8 6 6 6.0 8\n", "М141БПЛТЛ011 6 9 6 6 5.0 6\n", "М141БПЛТЛ004 6 8 6 6 5.0 5\n", "М141БПЛТЛ010 6 9 7 7 6.0 7\n", "М141БПЛТЛ071 7 9 6 8 4.0 6\n", "М141БПЛТЛ035 6 8 5 5 4.0 6\n", "М141БПЛТЛ030 6 7 6 6 4.0 8\n", "М141БПЛТЛ070 4 8 6 5 5.0 6\n", "М141БПЛТЛ051 6 8 7 6 7.0 6\n", "М141БПЛТЛ046 4 7 5 8 5.0 7\n", "М141БПЛТЛ047 4 7 5 9 5.0 6\n", "М141БПЛТЛ063 4 8 4 4 4.0 5\n", "М141БПЛТЛ029 7 9 5 6 7.0 6\n", "М141БПЛТЛ064 7 6 6 8 4.0 6\n", "М141БПЛТЛ076 6 8 6 6 6.0 8\n", "М141БПЛТЛ062 6 9 6 6 5.0 6\n", "М141БПЛТЛ074 4 7 6 5 6.0 6\n", "130232038 5 8 4 8 4.0 8\n", "М141БПЛТЛ023 8 9 6 9 4.0 7\n", "М141БПЛТЛ054 4 8 6 4 4.0 6\n", "М141БПЛТЛ012 4 10 6 5 4.0 7\n", "М141БПЛТЛ006 5 8 5 5 5.0 6\n", "М141БПЛТЛ055 4 7 7 4 8.0 5\n", "М141БПЛТЛ007 6 7 6 7 4.0 5\n", "М141БПЛТЛ050 6 8 4 5 4.0 5\n", "М141БПЛТЛ066 7 9 5 8 4.0 6\n", "М141БПЛТЛ043 5 8 5 6 5.0 6\n", "М141БПЛТЛ084 4 8 5 5 NaN 8\n", "М141БПЛТЛ005 5 7 4 7 4.0 5\n", "М141БПЛТЛ044 4 6 4 4 5.0 4\n", "13051038 4 9 5 5 5.0 5" ] }, 
"execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[:, 'econ' : 'law']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Откуда в квадратных скобках взялось двоеточие? Дело в том, что метод `.loc` – более универсальный, и позволяет выбирать не только столбцы, но и строки. При этом нужные строки указываются на первом месте, а столбцы – на втором. Когда мы пишем `.loc[:, 1]`, мы сообщаем Python, что нам нужны все строки (`:`) и столбцы, начиная с `Econ` и до `Law` включительно.\n", "\n", "**Внимание:** выбор столбцов по названиям через двоеточие очень напоминает срезы (*slices*) в списках. Но есть важное отличие. В случае текстовых названий, оба конца среза (левый и правый) включаются. Если бы срезы по названиям были бы устроены как срезы по числовым индексам, код выше выдавал бы столбцы с `Econ` и до `Phist`, не включая колонку `Law`, так как в обычных срезах правый конец исключается." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Выбор столбцов по номеру**\n", "\n", "Иногда может возникнуть необходимость выбрать столбец по его порядковому номеру. Например, когда названий столбцов нет как таковых или когда названия слишком длинные, а переименовывать их нежелательно. 
This can be done with the `.iloc` indexer:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "id\n", "М141БПЛТЛ024 9\n", "М141БПЛТЛ031 10\n", "М141БПЛТЛ075 9\n", "М141БПЛТЛ017 9\n", "М141БПЛТЛ069 10\n", "М141БПЛТЛ072 9\n", "М141БПЛТЛ020 7\n", "М141БПЛТЛ026 10\n", "М141БПЛТЛ073 9\n", "М141БПЛТЛ078 6\n", "М141БПЛТЛ060 8\n", "М141БПЛТЛ040 9\n", "М141БПЛТЛ065 9\n", "М141БПЛТЛ053 7\n", "М141БПЛТЛ015 9\n", "М141БПЛТЛ021 9\n", "М141БПЛТЛ018 7\n", "М141БПЛТЛ039 8\n", "М141БПЛТЛ036 10\n", "М141БПЛТЛ049 7\n", "06114043 8\n", "М141БПЛТЛ048 6\n", "М141БПЛТЛ034 9\n", "М141БПЛТЛ045 8\n", "М141БПЛТЛ033 9\n", "М141БПЛТЛ083 5\n", "М141БПЛТЛ008 8\n", "М141БПЛТЛ001 7\n", "М141БПЛТЛ038 9\n", "М141БПЛТЛ052 7\n", "М141БПЛТЛ011 6\n", "М141БПЛТЛ004 7\n", "М141БПЛТЛ010 6\n", "М141БПЛТЛ071 9\n", "М141БПЛТЛ035 6\n", "М141БПЛТЛ030 6\n", "М141БПЛТЛ070 5\n", "М141БПЛТЛ051 9\n", "М141БПЛТЛ046 7\n", "М141БПЛТЛ047 8\n", "М141БПЛТЛ063 5\n", "М141БПЛТЛ029 8\n", "М141БПЛТЛ064 8\n", "М141БПЛТЛ076 7\n", "М141БПЛТЛ062 7\n", "М141БПЛТЛ074 6\n", "130232038 7\n", "М141БПЛТЛ023 9\n", "М141БПЛТЛ054 8\n", "М141БПЛТЛ012 6\n", "М141БПЛТЛ006 5\n", "М141БПЛТЛ055 5\n", "М141БПЛТЛ007 7\n", "М141БПЛТЛ050 6\n", "М141БПЛТЛ066 10\n", "М141БПЛТЛ043 5\n", "М141БПЛТЛ084 7\n", "М141БПЛТЛ005 7\n", "М141БПЛТЛ044 5\n", "13051038 4\n", "Name: mstat, dtype: int64" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.iloc[:, 1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The syntax with `.iloc` is not very different from the syntax with `.loc`. So what is the difference? The `.loc` indexer works with labels (names), while `.iloc` works with integer positions. Hence the `i` prefix in the name (*i* for integer, *loc* for location). 
If we try to pass column names to `.iloc`, Python raises an error:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "cannot do slice indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [mstat] of <class 'str'>", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mdf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0miloc\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'mstat'\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'econ'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py\u001b[0m in \u001b[0;36m__getitem__\u001b[0;34m(self, key)\u001b[0m\n\u001b[1;32m 1470\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mKeyError\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mIndexError\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1471\u001b[0m \u001b[0;32mpass\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1472\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_getitem_tuple\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1473\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1474\u001b[0m \u001b[0;31m# we by definition only have the 0th axis\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py\u001b[0m in 
\u001b[0;36m_getitem_tuple\u001b[0;34m(self, tup)\u001b[0m\n\u001b[1;32m 2027\u001b[0m \u001b[0;32mcontinue\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2028\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2029\u001b[0;31m \u001b[0mretval\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mgetattr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mretval\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mname\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_getitem_axis\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2030\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2031\u001b[0m \u001b[0;31m# if the dim was reduced, then pass a lower-dim the next time\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py\u001b[0m in \u001b[0;36m_getitem_axis\u001b[0;34m(self, key, axis)\u001b[0m\n\u001b[1;32m 2078\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2079\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mslice\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2080\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_get_slice_axis\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2081\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2082\u001b[0m \u001b[0;32mif\u001b[0m 
\u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlist\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py\u001b[0m in \u001b[0;36m_get_slice_axis\u001b[0;34m(self, slice_obj, axis)\u001b[0m\n\u001b[1;32m 2046\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mobj\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcopy\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdeep\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2047\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2048\u001b[0;31m \u001b[0mslice_obj\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_convert_slice_indexer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mslice_obj\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2049\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mslice_obj\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mslice\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2050\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_slice\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mslice_obj\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkind\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'iloc'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py\u001b[0m in \u001b[0;36m_convert_slice_indexer\u001b[0;34m(self, key, axis)\u001b[0m\n\u001b[1;32m 264\u001b[0m \u001b[0;31m# if we are accessing via 
lowered dim, use the last dim\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 265\u001b[0m \u001b[0max\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mobj\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_get_axis\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0maxis\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mndim\u001b[0m \u001b[0;34m-\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 266\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0max\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_convert_slice_indexer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkind\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mname\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 267\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 268\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0m_has_valid_setitem_indexer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mindexer\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py\u001b[0m in \u001b[0;36m_convert_slice_indexer\u001b[0;34m(self, key, kind)\u001b[0m\n\u001b[1;32m 1688\u001b[0m \u001b[0;31m# validate iloc\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1689\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mkind\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;34m'iloc'\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1690\u001b[0;31m return slice(self._validate_indexer('slice', key.start, kind),\n\u001b[0m\u001b[1;32m 1691\u001b[0m 
\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_validate_indexer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'slice'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkey\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mstop\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkind\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1692\u001b[0m self._validate_indexer('slice', key.step, kind))\n", "\u001b[0;32m/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py\u001b[0m in \u001b[0;36m_validate_indexer\u001b[0;34m(self, form, key, kind)\u001b[0m\n\u001b[1;32m 4126\u001b[0m \u001b[0;32mpass\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4127\u001b[0m \u001b[0;32melif\u001b[0m \u001b[0mkind\u001b[0m \u001b[0;32min\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m'iloc'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'getitem'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 4128\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_invalid_indexer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mform\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4129\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mkey\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4130\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py\u001b[0m in \u001b[0;36m_invalid_indexer\u001b[0;34m(self, form, key)\u001b[0m\n\u001b[1;32m 1846\u001b[0m \"indexers [{key}] of {kind}\".format(\n\u001b[1;32m 1847\u001b[0m \u001b[0mform\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mform\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mklass\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0mkey\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1848\u001b[0;31m kind=type(key)))\n\u001b[0m\u001b[1;32m 1849\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1850\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mget_duplicates\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mTypeError\u001b[0m: cannot do slice indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [mstat] of <class 'str'>" ] } ], "source": [ "df.iloc[:, 'mstat': 'econ']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python reports that it cannot take a slice with indexers of string type (`class 'str'`), because integer indices are expected inside the square brackets.\n", "\n", "To select several adjacent columns, use a slice:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " mstat soc\n", "id \n", "М141БПЛТЛ024 9 8\n", "М141БПЛТЛ031 10 10\n", "М141БПЛТЛ075 9 9\n", "М141БПЛТЛ017 9 8\n", "М141БПЛТЛ069 10 10\n", "М141БПЛТЛ072 9 8\n", "М141БПЛТЛ020 7 7\n", "М141БПЛТЛ026 10 8\n", "М141БПЛТЛ073 9 8\n", "М141БПЛТЛ078 6 9\n", "М141БПЛТЛ060 8 7\n", "М141БПЛТЛ040 9 8\n", "М141БПЛТЛ065 9 8\n", "М141БПЛТЛ053 7 7\n", "М141БПЛТЛ015 9 7\n", "М141БПЛТЛ021 9 8\n", "М141БПЛТЛ018 7 9\n", "М141БПЛТЛ039 8 9\n", "М141БПЛТЛ036 10 7\n", "М141БПЛТЛ049 7 6\n", "06114043 8 10\n", "М141БПЛТЛ048 6 8\n", "М141БПЛТЛ034 9 7\n", "М141БПЛТЛ045 8 8\n", "М141БПЛТЛ033 9 8\n", "М141БПЛТЛ083 5 6\n", "М141БПЛТЛ008 8 8\n", "М141БПЛТЛ001 7 7\n", "М141БПЛТЛ038 9 6\n", "М141БПЛТЛ052 7 7\n", "М141БПЛТЛ011 6 8\n", "М141БПЛТЛ004 7 6\n", "М141БПЛТЛ010 6 7\n", "М141БПЛТЛ071 9 7\n", "М141БПЛТЛ035 6 7\n", "М141БПЛТЛ030 6 6\n", "М141БПЛТЛ070 5 6\n", "М141БПЛТЛ051 9 8\n", "М141БПЛТЛ046 7 7\n", "М141БПЛТЛ047 8 6\n", "М141БПЛТЛ063 5 6\n", "М141БПЛТЛ029 8 8\n", "М141БПЛТЛ064 8 6\n", "М141БПЛТЛ076 7 8\n", "М141БПЛТЛ062 7 7\n", "М141БПЛТЛ074 6 7\n", "130232038 7 6\n", "М141БПЛТЛ023 9 6\n", "М141БПЛТЛ054 8 6\n", "М141БПЛТЛ012 6 7\n", "М141БПЛТЛ006 5 6\n", "М141БПЛТЛ055 5 6\n", "М141БПЛТЛ007 7 7\n", "М141БПЛТЛ050 6 6\n", "М141БПЛТЛ066 10 7\n", "М141БПЛТЛ043 5 6\n", "М141БПЛТЛ084 7 8\n", "М141БПЛТЛ005 7 5\n", "М141БПЛТЛ044 5 7\n", "13051038 4 4" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.iloc[:, 1:3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Числовые срезы в `pandas` уже ничем не отличаются от списковых срезов: правый конец среза не включается. В нашем случае мы выбрали только столбцы с индексами 1 и 2." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Выбор строк по названию**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Выбор строки по названию происходит аналогичным образом, только здесь метод `.loc` уже обязателен." 
] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "catps 8.0\n", "mstat 10.0\n", "soc 10.0\n", "econ 10.0\n", "eng 10.0\n", "polth 10.0\n", "mstat2 10.0\n", "phist 9.0\n", "law 9.0\n", "phil 10.0\n", "polsoc 10.0\n", "ptheo 9.0\n", "preg 8.0\n", "compp 8.0\n", "game 9.0\n", "wpol 10.0\n", "male 1.0\n", "Name: М141БПЛТЛ031, dtype: float64" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc['М141БПЛТЛ031'] # row for the student with ID М141БПЛТЛ031" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here there is no need to add a comma and a colon to say that we want one row and all columns. To select several adjacent rows, `.loc` is not needed at all:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ024 7 9 8 8 9 8 10 8.0 7 9 \n", "М141БПЛТЛ031 8 10 10 10 10 10 10 9.0 9 10 \n", "М141БПЛТЛ075 9 9 9 10 9 10 9 8.0 9 10 \n", "М141БПЛТЛ017 9 9 8 8 9 9 10 6.0 9 9 \n", "М141БПЛТЛ069 10 10 10 10 10 10 9 8.0 8 10 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ024 9 7.0 8 8.0 6 10 1 \n", "М141БПЛТЛ031 10 9.0 8 8.0 9 10 1 \n", "М141БПЛТЛ075 9 9.0 8 8.0 7 9 1 \n", "М141БПЛТЛ017 9 8.0 8 8.0 8 9 0 \n", "М141БПЛТЛ069 9 7.0 6 5.0 8 10 1 " ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[\"М141БПЛТЛ024\":'М141БПЛТЛ069']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Как Python понимает, что мы просим вывести именно строки с такими названиями, а не столбцы? Потому что у нас стоят одинарные квадратные скобки, а не двойные, как в случае со столбцами. (Да, в `pandas` много всяких тонкостей, но чтобы хорошо в них разбираться, нужно просто попрактиковаться и привыкнуть).\n", "\n", "Обратите внимание: разницы между двойными и одинарными кавычками нет, строки можно вводить в любых кавычках, как в примере выше." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Selecting rows by position**\n", "\n", "In this case it is enough to give the position in square brackets after `.iloc`:" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "catps 9.0\n", "mstat 9.0\n", "soc 9.0\n", "econ 10.0\n", "eng 9.0\n", "polth 10.0\n", "mstat2 9.0\n", "phist 8.0\n", "law 9.0\n", "phil 10.0\n", "polsoc 9.0\n", "ptheo 9.0\n", "preg 8.0\n", "compp 8.0\n", "game 7.0\n", "wpol 9.0\n", "male 1.0\n", "Name: М141БПЛТЛ075, dtype: float64" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.iloc[2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To select several adjacent rows, use a slice:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ031 8 10 10 10 10 10 10 9.0 9 10 \n", "М141БПЛТЛ075 9 9 9 10 9 10 9 8.0 9 10 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ031 10 9.0 8 8.0 9 10 1 \n", "М141БПЛТЛ075 9 9.0 8 8.0 7 9 1 " ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[1:3] # и без iloc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Если нужно несколько строк не подряд, можно просто перечислить внутри списка в `.iloc`:" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ031 8 10 10 10 10 10 10 9.0 9 10 \n", "М141БПЛТЛ075 9 9 9 10 9 10 9 8.0 9 10 \n", "М141БПЛТЛ072 10 9 8 10 9 8 9 8.0 8 10 \n", "М141БПЛТЛ060 7 8 7 7 9 8 8 5.0 7 5 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ031 10 9.0 8 8.0 9 10 1 \n", "М141БПЛТЛ075 9 9.0 8 8.0 7 9 1 \n", "М141БПЛТЛ072 9 7.0 8 8.0 9 9 0 \n", "М141БПЛТЛ060 8 5.0 7 8.0 7 9 1 " ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.iloc[[1, 2, 5, 10]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Удаление пропущенных значений\n", "\n", "Мы уже видели, что в данном датафрейме есть строки (и столбцы) с пропущенными значениями (`NaN`). Из-за наличия этих таких значений содержащие их столбцы, даже если остальные значения являются целыми, имеют тип `float`. \n", "\n", "Удалим строки с пропущенными значениями из датафрейма совсем:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "df = df.dropna()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Однако, если посмотрим на обновленный датасет, тип `float` никуда не исчез:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ024 7 9 8 8 9 8 10 8.0 7 9 \n", "М141БПЛТЛ031 8 10 10 10 10 10 10 9.0 9 10 \n", "М141БПЛТЛ075 9 9 9 10 9 10 9 8.0 9 10 \n", "М141БПЛТЛ017 9 9 8 8 9 9 10 6.0 9 9 \n", "М141БПЛТЛ069 10 10 10 10 10 10 9 8.0 8 10 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ024 9 7.0 8 8.0 6 10 1 \n", "М141БПЛТЛ031 10 9.0 8 8.0 9 10 1 \n", "М141БПЛТЛ075 9 9.0 8 8.0 7 9 1 \n", "М141БПЛТЛ017 9 8.0 8 8.0 8 9 0 \n", "М141БПЛТЛ069 9 7.0 6 5.0 8 10 1 " ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Применим преобразование типов." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Преобразование типов столбцов" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Просто воспользуемся методом `.astype()`, который преобразует тип столбца в тот, который мы укажем (если это возможно, разумеется):" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame.\n", "Try using .loc[row_indexer,col_indexer] = value instead\n", "\n", "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n", " \"\"\"Entry point for launching an IPython kernel.\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ024 7 9 8 8 9 8 10 8 7 9 \n", "М141БПЛТЛ031 8 10 10 10 10 10 10 9 9 10 \n", "М141БПЛТЛ075 9 9 9 10 9 10 9 8 9 10 \n", "М141БПЛТЛ017 9 9 8 8 9 9 10 6 9 9 \n", "М141БПЛТЛ069 10 10 10 10 10 10 9 8 8 10 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ024 9 7.0 8 8.0 6 10 1 \n", "М141БПЛТЛ031 10 9.0 8 8.0 9 10 1 \n", "М141БПЛТЛ075 9 9.0 8 8.0 7 9 1 \n", "М141БПЛТЛ017 9 8.0 8 8.0 8 9 0 \n", "М141БПЛТЛ069 9 7.0 6 5.0 8 10 1 " ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['phist'] = df['phist'].astype(int)\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Описательные статистики и базовые графики" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "В самом начале мы обсуждали описание базы данных с помощью метода `.describe()`. Помимо этого метода существует много методов, которые выводят отдельные статистики." 
] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "catps 7.0\n", "mstat 7.5\n", "soc 7.0\n", "econ 6.0\n", "eng 8.5\n", "polth 6.0\n", "mstat2 7.0\n", "phist 6.0\n", "law 7.0\n", "phil 6.0\n", "polsoc 8.0\n", "ptheo 5.0\n", "preg 7.0\n", "compp 5.0\n", "game 6.0\n", "wpol 8.0\n", "male 0.0\n", "dtype: float64" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.median() # median (for every column)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Statistics can be requested for individual variables (columns):" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5.833333333333333" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.phist.mean() # arithmetic mean of phist" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or for individual observations (rows):" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6.235294117647059" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[\"М141БПЛТЛ023\"].mean() # the student's average grade across all courses" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's build some plots. The `pandas` library works well together with the plotting library `matplotlib`. Let's import it (this library should have been installed on your computer together with Anaconda)." ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [], "source": [ "import matplotlib" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's build a histogram of the game theory grades."
] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[Figure: histogram of the game grades]
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df[\"game\"].plot.hist() # histogram" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Что показывает этот график? Он показывает, сколько студентов получили те или иные оценки. По гистограмме видно, что больше всего по этому курсу оценок 4 и 7.\n", "\n", "Можно поменять цвет гистограммы:" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAD8CAYAAAB6paOMAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAADuFJREFUeJzt3XusZWV9xvHv44wGhmpROVgFx4GGoIZUwaOx0tIK0ngFNbXF1IZa65jUKtomisYU/2liE+ulaWMd8YI3jOC11iKIVdPEojNAw2U0WEUYQWesbVGhAvrrH3tNHcaR2bPP2euds9/vJznZe61Zs99nzZ7hYd1TVUiS+nWf1gEkSW1ZBJLUOYtAkjpnEUhS5ywCSeqcRSBJnbMIJKlzFoEkdc4ikKTOrW8dYBpHHHFEbdq0qXUMSVpTtm3b9r2qWtrfcmuiCDZt2sTWrVtbx5CkNSXJt6ZZzl1DktQ5i0CSOmcRSFLnLAJJ6pxFIEmdswgkqXMWgSR1ziKQpM5ZBJLUuTVxZfGKJG3GrWozrsbV6u8X+HdMq8YtAknqnEUgSZ2zCCSpcxaBJHXOIpCkzlkEktQ5i0CSOmcRSFLnLAJJ6pxFIEmdswgkqXMWgSR1ziKQpM5ZBJLUubkVQZJ3JdmZ5No95j0oyWVJbhheHziv8SVJ05nnFsF7gKfuNe9c4PKqOg64fJiWJDU0tyKoqi8C399r9pnABcP7C4Bnz2t8SdJ0xj5G8JCquhVgeD1y5PElSXs5aA8WJ9mcZGuSrbt27WodR5IW1thF8N0kDwUYXnf+ogWraktVLVfV8tLS0mgBJak3YxfBJ4Gzh/dnA58YeXxJ0l7mefrohcCXgOOT7EjyIuANwOlJbgBOH6YlSQ2tn9cHV9Xzf8EvnTavMSVJB+6gPVgsSRqHRSBJnbMIJKlzFoEkdc4ikKTOWQSS1DmLQJI6ZxFIUucsAknqnEUgSZ2zCCSpcxaBJHXOIpCkzs3t7qPdS9qNXdVubElrjlsEktQ5i0CSOmcRSFLnLAJJ6pxFIEmdswgkqXMWgSR1ziKQpM5ZBJLUOYtAkjpnEUhS5ywCSeqcRSBJnbMIJKlzFoEkda5JESR5ZZLrklyb5MIkh7TIIUlqUARJjgJeDixX1QnAOuCssXNIkiZa7RpaDxyaZD2wAbilUQ5J6t7oj6qsqm8neSNwE3AHcGlVXbr3ckk2A5sBNm7cOG5IzcbHc46r1Z93j3/WC67FrqEHAmcCxwAPAw5L8oK9l6uqLVW1XFXLS0tLY8eUpG602DX0FOCbVbWrqu4CPgo8qUEOSRJtiuAm4IlJNiQJcBqwvUEOSRINiqCqrgAuBq4ErhkybBk7hyRpYvSDxQBVdR5wXouxJUn35JXFktQ5i0C
SOmcRSFLnLAJJ6pxFIEmdswgkqXMWgSR1ziKQpM5ZBJLUOYtAkjpnEUhS5ywCSercVEWQ5IR5B5EktTHtFsE/JPlykj9NcvhcE0mSRjVVEVTVbwB/ADwc2Jrkg0lOn2sySdIopj5GUFU3AK8DXg38FvC3Sb6a5LnzCidJmr9pjxH8WpI3M3mk5KnAs6rqUcP7N88xnyRpzqZ9QtnfAe8AXltVd+yeWVW3JHndXJJJkkYxbRE8Hbijqn4CkOQ+wCFVdXtVvW9u6SRJczftMYLPAofuMb1hmCdJWuOmLYJDquqHuyeG9xvmE0mSNKZpi+BHSU7aPZHkccAd97K8JGmNmPYYwSuAi5LcMkw/FPj9+USSJI1pqiKoqq8keSRwPBDgq1V111yTSZJGMe0WAcDjgU3D7zkxCVX13rmkkiSNZqoiSPI+4FeBq4GfDLMLsAgkaY2bdotgGXh0VdU8w0iSxjftWUPXAr8yzyCSpDam3SI4Arg+yZeBH++eWVVnzDLocCvr84ETmOxi+uOq+tIsnyVJWplpi+D1qzzuW4FLqup3k9wPL06TpGamPX30C0keARxXVZ9NsgFYN8uASR4AnAL80fDZdwJ3zvJZkqSVm/Y21C8GLgbePsw6Cvj4jGMeC+wC3p3kqiTnJzlsxs+SJK3QtAeLXwqcDNwG//+QmiNnHHM9cBLwtqo6EfgRcO7eCyXZnGRrkq27du2acShJ0v5MWwQ/HnbhAJBkPZODvLPYAeyoqiuG6YuZFMM9VNWWqlququWlpaUZh5Ik7c+0RfCFJK8FDh2eVXwR8I+zDFhV3wFuTnL8MOs04PpZPkuStHLTnjV0LvAi4BrgJcCnmZz+OauXAR8Yzhj6BvDCFXyWJGkFpj1r6KdMHlX5jtUYtKquZnK1siSpsWnvNfRN9nFMoKqOXfVEkqRRHci9hnY7BHge8KDVjyNJGttUB4ur6j/3+Pl2Vb0FOHXO2SRJI5h219Cep3feh8kWwv3nkkiSNKppdw39zR7v7wZuBH5v1dNIkkY37VlDT553EElSG9PuGvrze/v1qnrT6sSRJI3tQM4aejzwyWH6WcAXgZvnEUqSNJ4DeTDNSVX1A4Akrwcuqqo/mVcwSdI4pr3X0Ebu+cyAO4FNq55GkjS6abcI3gd8OcnHmFxh/BzgvXNLJUkazbRnDf1Vkn8GfnOY9cKqump+sSRJY5l21xBMnit8W1W9FdiR5Jg5ZZIkjWjaR1WeB7waeM0w677A++cVSpI0nmm3CJ4DnMHksZJU1S14iwlJWgjTFsGdVVUMt6L2YfOStDimLYIPJ3k7cHiSFwOfZZUeUiNJamvas4beODyr+DbgeOAvq+qyuSaTJI1iv0WQZB3wmap6CuB//CVpwex311BV/QS4Pckvj5BHkjSyaa8s/l/gmiSXMZw5BFBVL59LKknSaKYtgn8afiRJC+ZeiyDJxqq6qaouGCuQJGlc+ztG8PHdb5J8ZM5ZJEkN7K8Issf7Y+cZRJLUxv6KoH7Be0nSgtjfweLHJLmNyZbBocN7humqqgfMNZ0kae7utQiqat1YQSRJbRzI8wgkSQuoWREkWZfkqiSfapVBktR2i+AcYHvD8SVJNCqCJEcDzwDObzG+JOlnpr3FxGp7C/Aq7uUpZ0k2A5sBNm7cOFIsSQe1ZP/LzEMt9tnzo28RJHkmsLOqtt3bclW1paqWq2p5aWlppHSS1J8Wu4ZOBs5IciPwIeDUJO9vkEOSRIMiqKrXVNXRVbUJOAv4XFW9YOwckqQJryOQpM61OlgMQFV9Hvh8ywyS1Du3CCSpcxaBJHXOIpCkzlkEktQ5i0CSOmcRSFLnLAJJ6pxFIEmdswgkqXMWgSR1ziKQpM5ZBJLUOYtAkjrX9O6jkrQmLPgjMt0ikKTOWQSS1DmLQJI6ZxFIUucsAknqnEUgSZ2zCCSpcxaBJHXOIpCkzlkEktQ5i0CSOmcRSFLnLAJJ6pxFIEmdG70Ikjw8yb8k2Z7
kuiTnjJ1BkvQzLZ5HcDfwF1V1ZZL7A9uSXFZV1zfIIkndG32LoKpuraorh/c/ALYDR42dQ5I00fQYQZJNwInAFS1zSFLPmhVBkl8CPgK8oqpu28evb06yNcnWXbt2jR9QkjrRpAiS3JdJCXygqj66r2WqaktVLVfV8tLS0rgBJakjLc4aCvBOYHtVvWns8SVJ99Rii+Bk4A+BU5NcPfw8vUEOSRINTh+tqn8FMva4kqR988piSeqcRSBJnbMIJKlzFoEkdc4ikKTOWQSS1DmLQJI6ZxFIUucsAknqnEUgSZ2zCCSpcxaBJHXOIpCkzlkEktQ5i0CSOmcRSFLnLAJJ6pxFIEmdswgkqXMWgSR1ziKQpM5ZBJLUOYtAkjpnEUhS5ywCSeqcRSBJnbMIJKlzFoEkdc4ikKTONSmCJE9N8rUkX09ybosMkqSJ0YsgyTrg74GnAY8Gnp/k0WPnkCRNtNgieALw9ar6RlXdCXwIOLNBDkkSbYrgKODmPaZ3DPMkSQ2sbzBm9jGvfm6hZDOweZj8YZKvzTjeEcD3Zvy9B5vp1iX7+iM+qKz+d9Junf37dXBajO8lWel6PGKahVoUwQ7g4XtMHw3csvdCVbUF2LLSwZJsrarllX7OwWBR1mVR1gNcl4PVoqzLWOvRYtfQV4DjkhyT5H7AWcAnG+SQJNFgi6Cq7k7yZ8BngHXAu6rqurFzSJImWuwaoqo+DXx6pOFWvHvpILIo67Io6wGuy8FqUdZllPVI1c8dp5UkdcRbTEhS5xa6CJKsS3JVkk+1zrISSW5Mck2Sq5NsbZ1nJZIcnuTiJF9Nsj3Jr7fONIskxw/fx+6f25K8onWuWSR5ZZLrklyb5MIkh7TONKsk5wzrcd1a+z6SvCvJziTX7jHvQUkuS3LD8PrAeYy90EUAnANsbx1ilTy5qh67AKfEvRW4pKoeCTyGNfr9VNXXhu/jscDjgNuBjzWOdcCSHAW8HFiuqhOYnMBxVttUs0lyAvBiJncveAzwzCTHtU11QN4DPHWveecCl1fVccDlw/SqW9giSHI08Azg/NZZNJHkAcApwDsBqurOqvrvtqlWxWnAf1TVt1oHmdF64NAk64EN7OO6njXiUcC/VdXtVXU38AXgOY0zTa2qvgh8f6/ZZwIXDO8vAJ49j7EXtgiAtwCvAn7aOsgqKODSJNuGK67XqmOBXcC7h1125yc5rHWoVXAWcGHrELOoqm8DbwRuAm4F/qeqLm2bambXAqckeXCSDcDTuefFq2vRQ6rqVoDh9ch5DLKQRZDkmcDOqtrWOssqObmqTmJyx9aXJjmldaAZrQdOAt5WVScCP2JOm7pjGS6KPAO4qHWWWQz7nM8EjgEeBhyW5AVtU82mqrYDfw1cBlwC/Dtwd9NQa8RCFgFwMnBGkhuZ3N301CTvbxtpdlV1y/C6k8l+6Ce0TTSzHcCOqrpimL6YSTGsZU8Drqyq77YOMqOnAN+sql1VdRfwUeBJjTPNrKreWVUnVdUpTHaz3NA60wp9N8lDAYbXnfMYZCGLoKpeU1VHV9UmJpvtn6uqNfl/OUkOS3L/3e+B32GyCbzmVNV3gJuTHD/MOg24vmGk1fB81uhuocFNwBOTbEgSJt/JmjyAD5DkyOF1I/Bc1vZ3A5Pb75w9vD8b+MQ8BmlyZbEOyEOAj03+jbIe+GBVXdI20oq8DPjAsEvlG8ALG+eZ2bAf+nTgJa2zzKqqrkhyMXAlk90oV7G2r8r9SJIHA3cBL62q/2odaFpJLgR+GzgiyQ7gPOANwIeTvIhJaT9vLmN7ZbEk9W0hdw1JkqZnEUhS5ywCSeqcRSBJnbMIJKlzFoEkdc4ikKTOWQSS1Ln/A972e/WEttqAAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df[\"game\"].plot.hist(color = \"red\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "А также поменять число столбцов и цвет границ столбцов:" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYwAAAD8CAYAAABkbJM/AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAFQlJREFUeJzt3X+wX3V95/HnS4JVKC1gAmJCiHYZKmUK0lvUZcvyQyiwrFTHtjDdLuuiURa3uu3MFrs74tLZGXe2onXpwKaSBa2mikpl1whE2kqdqUCIIMHAQjHIJVkSxfJDrBh87x/fk93L5XuTz73c7z33Js/HzHe+53zO53zP+8ydySvnc36lqpAkaXde0ncBkqSFwcCQJDUxMCRJTQwMSVITA0OS1MTAkCQ1MTAkSU0MDElSEwNDktRkUd8FzKbFixfXihUr+i5DkhaMO++887tVtaSl7x4VGCtWrGD9+vV9lyFJC0aSh1v7OiQlSWpiYEiSmhgYkqQmBoYkqYmBIUlqMrLASHJ4kr9KsinJvUne27UfnGRdkge674OmWP+Crs8DSS4YVZ2SpDajPMLYAfxeVb0WeANwcZKjgUuAW6rqSOCWbv55khwMXAq8HjgBuHSqYJEkzY2RBUZVba2qDd30U8AmYClwLnBt1+1a4NeGrP6rwLqqeryqvg+sA84cVa2SpN2bk3MYSVYArwNuAw6tqq0wCBXgkCGrLAUemTA/3rVJknoy8ju9k/w08HngfVX1ZJKm1Ya01RS/vxJYCbB8+fKZlsmKZZfz8KNPzXj9heiIpQewefx3+y5D0gIx0sBIsi+DsPhUVX2ha34syWFVtTXJYcC2IauOAydPmF8G/PWwbVTVKmAVwNjY2NBQafHwo09RF43NdPUFKVf6GBVJ7UZ5lVSAq4FNVXX5hEU3ADuveroA+OKQ1W8CzkhyUHey+4yuTZLUk1GewzgR+G3g1CR3dZ+zgQ8Bpyd5ADi9myfJWJKPA1TV48AfAnd0n8u6NklST0Y2JFVVX2P4uQiA04b0Xw+8Y8L8amD1aKqTJE2Xd3pLkpoYGJKkJgaGJKmJgSFJamJgSJKaGBiSpCYGhiSpiYEhSWpiYEiSmhgYkqQmBoYkqYmBIUlqYmBIkpoYGJKkJgaGJKmJgSFJamJgSJKajOyNe0lWA+cA26rqmK7tM8BRXZcDgb+vquOGrLsZeAp4DthRVWOjqlOS1GZkgQFcA1wBfGJnQ1X95s7pJB8GntjF+qdU1XdHVp0kaVpG+U7vW5OsGLYsSYDfAE4d1fYlSbOrr3MYvwI8VlUPTLG8gJuT3Jlk5RzWJUmawiiHpHblfGDNLpafWFVbkhwCrEtyX1XdOqxjFygrAZYvXz77lUqSgB6OMJIsAt4KfGaqPlW1pfveBlwPnLCLvquqaqyqxpYsWTLb5UqSOn0MSb0JuK+qxoctTLJ/kgN2TgNnABvnsD5J0hAjC4wka4C/BY5KMp7kwm7ReUwajkryqiRru9lDga8luRu4HfhSVd04qjolSW1GeZXU+V
O0/6shbVuAs7vph4BjR1WXJGlmvNNbktTEwJAkNTEwJElNDAxJUhMDQ5LUxMCQJDUxMCRJTQwMSVITA0OS1MTAkCQ1MTAkSU0MDElSEwNDktTEwJAkNTEwJElNDAxJUhMDQ5LUZJSvaF2dZFuSjRPaPpjk0SR3dZ+zp1j3zCT3J3kwySWjqlGS1G6URxjXAGcOaf9IVR3XfdZOXphkH+BPgLOAo4Hzkxw9wjolSQ1GFhhVdSvw+AxWPQF4sKoeqqpngT8Hzp3V4iRJ09bHOYz3JPlmN2R10JDlS4FHJsyPd21DJVmZZH2S9du3b5/tWiVJnbkOjCuBnwOOA7YCHx7SJ0PaaqofrKpVVTVWVWNLliyZnSolSS8wp4FRVY9V1XNV9RPgTxkMP002Dhw+YX4ZsGUu6pMkTW1OAyPJYRNm3wJsHNLtDuDIJK9O8lLgPOCGuahPkjS1RaP64SRrgJOBxUnGgUuBk5Mcx2CIaTPwrq7vq4CPV9XZVbUjyXuAm4B9gNVVde+o6pQktRlZYFTV+UOar56i7xbg7Anza4EXXHIrSeqPd3pLkpoYGJKkJgaGJKmJgSFJamJgSJKaGBiSpCYGhiSpiYEhSWpiYEiSmhgYkqQmBoYkqYmBIUlqYmBIkpoYGJKkJgaGJKmJgSFJamJgSJKaNAVGkmOm+8NJVifZlmTjhLb/muS+JN9Mcn2SA6dYd3OSe5LclWT9dLctSZp9rUcYVyW5Pcm/meof+SGuAc6c1LYOOKaqfhH438D7d7H+KVV1XFWNNW5PkjRCTYFRVf8E+C3gcGB9kk8nOX0369wKPD6p7eaq2tHNfh1YNv2SJUl9aD6HUVUPAP8R+H3gnwIf64aX3jrDbf9r4MtTbQ64OcmdSVbu6keSrEyyPsn67du3z7AUSdLutJ7D+MUkHwE2AacC/7yqXttNf2S6G03yH4AdwKem6HJiVR0PnAVcnOSkqX6rqlZV1VhVjS1ZsmS6pUiSGrUeYVwBbACOraqLq2oDQFVtYXDU0SzJBcA5wG9VVQ3r0/0uVbUNuB44YTrbkCTNvkWN/c4GflhVzwEkeQnwsqp6pqo+2bqxJGfSDWlV1TNT9NkfeElVPdVNnwFc1roNSdJotB5hfAV4+YT5/bq2KSVZA/wtcFSS8SQXMjhSOQBY110ye1XX91VJ1narHgp8LcndwO3Al6rqxuY9kiSNROsRxsuq6umdM1X1dJL9drVCVZ0/pPnqKfpuYXAUQ1U9BBzbWJckaY60HmH8IMnxO2eS/BLww9GUJEmaj1qPMN4HXJdkSzd/GPCboylJkjQfNQVGVd2R5OeBo4AA91XVj0damSRpXmk9wgD4ZWBFt87rklBVnxhJVZKkeacpMJJ8Evg54C7gua65AANDkvYSrUcYY8DRU91oJ0na87VeJbUReOUoC5EkzW+tRxiLgW8luR340c7GqnrzSKqSJM07rYHxwVEWIUma/1ovq/1qkiOAI6vqK91d3vuMtjRJ0nzSepXUO4GVwMEMrpZaClwFnDa60qTZt2LZ5Tz86FN9lzFnjlh6AJvHf7fvMrSHaB2SupjBI8Zvg8HLlJIcMrKqpBF5+NGnqIv2nrf+5sr1fZegPUjrVVI/qqpnd84kWcTgPgxJ0l6iNTC+muQPgJd37/K+DvifoytLkjTftAbGJcB24B7gXcBapvmmPUnSwtZ6ldRPgD/tPpKkvVDrVVLfZsg5i6p6zaxXJEmal1qHpMYYPK32l4FfAT4G/NnuVkqyOsm2JBsntB2cZF2SB7rvg6ZY94KuzwNJLmisU5I0Ik2BUVXfm/B5tKo+CpzasOo1wJmT2i4BbqmqI4FbuvnnSXIwcCnwegaX8146VbBIkuZG65DU8RNmX8LgiOOA3a1XVbcmWTGp+Vzg5G76WuCvgd+f1OdXgXVV9Xi3/XUMgmdNS72SpNnXeuPehydM7wA2A78xw20eWlVbAapq6xQ3AC4FHpkwP961vUCSlQzuQmf58u
UzLEmStDutV0mdMupCJsmwMoZ1rKpVwCqAsbExbyaUpBFpHZLa5cNoquryaWzzsSSHdUcXhwHbhvQZ5/8PWwEsYzB0JUnqyXSukrqIwbDQUuDdwNEMzmPs9lzGJDcAO696ugD44pA+NwFnJDmoO9l9RtcmSerJdF6gdHxVPQWQ5IPAdVX1jl2tlGQNgyOFxUnGGVz59CHgs0kuBL4D/HrXdwx4d1W9o6oeT/KHwB3dT1228wS4JKkfrYGxHHh2wvyzwIrdrVRV50+x6AWPRa+q9cA7JsyvBlY31idJGrHWwPgkcHuS6xmcfH4L8ImRVSVJmndar5L6z0m+zOAub4C3V9U3RleWJGm+aT3pDbAf8GRV/TEwnuTVI6pJkjQPNQVGkksZ3I39/q5pXxqeJSVJ2nO0HmG8BXgz8AOAqtrC9C+nlSQtYK2B8WxVFd3d1kn2H11JkqT5qDUwPpvkvwMHJnkn8BV8mZIk7VVar5L6o+5d3k8CRwEfqKp1I61MkjSv7DYwkuwD3FRVbwIMCUnaS+12SKqqngOeSfKzc1CPJGmear3T+x+Ae7oXGf1gZ2NV/c5IqpIkzTutgfGl7iNJ2kvtMjCSLK+q71TVtXNVkCRpftrdOYy/2DmR5PMjrkWSNI/tLjAmvir1NaMsRJI0v+3uHEZNMa09wE/tE5L/1HcZGqG98W98xNID2Dy+y7dKa4Z2FxjHJnmSwZHGy7tpuvmqqp+Z7gaTHAV8ZkLTaxjcCPjRCX1OZvDq1m93TV+oqsumuy3t2o+eK+qisb7LmFO5cn3fJcwp/8aaTbsMjKraZ7Y3WFX3A8fB/7sp8FHg+iFd/6aqzpnt7UuSZmY678MYhdOAv6uqh3uuQ5K0G30HxnnAmimWvTHJ3Um+nOQX5rIoSdIL9RYYSV7K4B0b1w1ZvAE4oqqOBf4bEy7vHfI7K5OsT7J++/btoylWktTrEcZZwIaqemzygqp6sqqe7qbXAvsmWTzsR6pqVVWNVdXYkiVLRluxJO3F+gyM85liOCrJK5Okmz6BQZ3fm8PaJEmTtD5LalYl2Q84HXjXhLZ3A1TVVcDbgIuS7AB+CJzXvfFPktSTXgKjqp4BXjGp7aoJ01cAV8x1XZKkqfUSGJI0Kt7dPjoGhqQ9ine3j07f92FIkhYIA0OS1MTAkCQ1MTAkSU0MDElSEwNDktTEwJAkNTEwJElNDAxJUhMDQ5LUxMCQJDUxMCRJTQwMSVITA0OS1MTAkCQ16S0wkmxOck+Su5K84GHuGfhYkgeTfDPJ8X3UKUka6PsFSqdU1XenWHYWcGT3eT1wZfctSerBfB6SOhf4RA18HTgwyWF9FyVJe6s+A6OAm5PcmWTlkOVLgUcmzI93bZKkHvQ5JHViVW1JcgiwLsl9VXXrhOUZsk5NbujCZiXA8uXLR1OpJKm/I4yq2tJ9bwOuB06Y1GUcOHzC/DJgy5DfWVVVY1U1tmTJklGVK0l7vV4CI8n+SQ7YOQ2cAWyc1O0G4F92V0u9AXiiqrbOcamSpE5fQ1KHAtcn2VnDp6vqxiTvBqiqq4C1wNnAg8AzwNt7qlWSRE+BUVUPAccOab9qwnQBF89lXZKkqc3ny2olSfOIgSFJamJgSJKaGBiSpCYGhiSpiYEhSWpiYEiSmhgYkqQmBoYkqYmBIUlqYmBIkpoYGJKkJgaGJKmJgSFJamJgSJKaGBiSpCYGhiSpyZwHRpLDk/xVkk1J7k3y3iF9Tk7yRJK7us8H5rpOSdLz9fGK1h3A71XVhiQHAHcmWVdV35rU72+q6pwe6pMkDTHnRxhVtbWqNnTTTwGbgKVzXYckaXp6PYeRZAXwOuC2IYvfmOTuJF9O8gtzWpgk6QX6GJICIMlPA58H3ldVT05avAE4oqqeTnI28BfAkVP8zkpgJcDy5ctHWLEk7d16OcJIsi+DsPhUVX1h8vKqerKqnu6m1wL7Jlk87LeqalVVjVXV2JIlS0ZatyTtzfq4SirA1cCmqr
p8ij6v7PqR5AQGdX5v7qqUJE3Wx5DUicBvA/ckuatr+wNgOUBVXQW8DbgoyQ7gh8B5VVU91CpJ6sx5YFTV14Dsps8VwBVzU5EkqYV3ekuSmhgYkqQmBoYkqYmBIUlqYmBIkpoYGJKkJgaGJKmJgSFJamJgSJKaGBiSpCYGhiSpiYEhSWpiYEiSmhgYkqQmBoYkqYmBIUlqYmBIkpr0EhhJzkxyf5IHk1wyZPlPJflMt/y2JCvmvkpJ0kRzHhhJ9gH+BDgLOBo4P8nRk7pdCHy/qv4R8BHgv8xtlZKkyfo4wjgBeLCqHqqqZ4E/B86d1Odc4Npu+nPAaUl2+R5wSdJo9REYS4FHJsyPd21D+1TVDuAJ4BVzUp0kaahFPWxz2JFCzaDPoGOyEljZzT6d5P4Z1rU4V/LdGa473yyGtn3JlSOu5MVp3o/p6GmfR7IvLUawv73tS6vGfZ73+zENi5MPznRfjmjt2EdgjAOHT5hfBmyZos94kkXAzwKPD/uxqloFrHqxRSVZX1VjL/Z35oM9ZV/2lP0A92U+2lP2A+ZuX/oYkroDODLJq5O8FDgPuGFSnxuAC7rptwF/WVVDjzAkSXNjzo8wqmpHkvcANwH7AKur6t4klwHrq+oG4Grgk0keZHBkcd5c1ylJer4+hqSoqrXA2kltH5gw/Q/Ar89xWS96WGse2VP2ZU/ZD3Bf5qM9ZT9gjvYljvRIklr4aBBJUhMDg8Hd50m+keR/9V3Li5Fkc5J7ktyVZH3f9bwYSQ5M8rkk9yXZlOSNfdc0E0mO6v4eOz9PJnlf33XNRJJ/l+TeJBuTrEnysr5rmqkk7+32496F9vdIsjrJtiQbJ7QdnGRdkge674NGsW0DY+C9wKa+i5glp1TVcXvA5YJ/DNxYVT8PHMsC/ftU1f3d3+M44JeAZ4Drey5r2pIsBX4HGKuqYxhcsLIgL0ZJcgzwTgZPnTgWOCfJkf1WNS3XAGdOarsEuKWqjgRu6eZn3V4fGEmWAf8M+HjftWggyc8AJzG4Wo6qeraq/r7fqmbFacDfVdXDfRcyQ4uAl3f3Ru3HC++fWiheC3y9qp7pniTxVeAtPdfUrKpu5YX3pU18nNK1wK+NYtt7fWAAHwX+PfCTvguZBQXcnOTO7g74heo1wHbgf3RDhR9Psn/fRc2C84A1fRcxE1X1KPBHwHeArcATVXVzv1XN2EbgpCSvSLIfcDbPv5l4ITq0qrYCdN+HjGIje3VgJDkH2FZVd/Zdyyw5saqOZ/Ak4IuTnNR3QTO0CDgeuLKqXgf8gBEdYs+V7ibVNwPX9V3LTHRj4ucCrwZeBeyf5F/0W9XMVNUmBk/AXgfcCNwN7Oi1qAVirw4M4ETgzUk2M3hq7qlJ/qzfkmauqrZ039sYjJOf0G9FMzYOjFfVbd385xgEyEJ2FrChqh7ru5AZehPw7araXlU/Br4A/OOea5qxqrq6qo6vqpMYDO880HdNL9JjSQ4D6L63jWIje3VgVNX7q2pZVa1gMFzwl1W1IP/XlGT/JAfsnAbOYHDoveBU1f8BHklyVNd0GvCtHkuaDeezQIejOt8B3pBkv+5VA6exQC9EAEhySPe9HHgrC/tvA89/nNIFwBdHsZFe7vTWSBwKXN+9NmQR8OmqurHfkl6Ufwt8qhvKeQh4e8/1zFg3Tn468K6+a5mpqrotyeeADQyGb77Bwr5T+vNJXgH8GLi4qr7fd0GtkqwBTgYWJxkHLgU+BHw2yYUMwn0kT8rwTm9JUpO9ekhKktTOwJAkNTEwJElNDAxJUhMDQ5LUxMCQJDUxMCRJTQwMSVKT/wsBXe3A4YdyZgAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df[\"game\"].plot.hist(color = \"hotpink\", \n", " bins = 5, \n", " edgecolor = 'navy')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Можно пытаться строить другие графики. Например, построить ящик с усами (свеча)." ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD8CAYAAABn919SAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAACyFJREFUeJzt3V+MpfVdx/H3RwYKW2qL7NAYcZg2rWiFUOvRiCippRDt9p/VC4g1QBrnxmhbL3RNSrCJF9ukifZ2UhWSKiZFmpaSUEhNJTVl4ywt2cWtYutuS4swpEpLIBbarxd7SNYNy8w5zzMzy3ffr2QyM888f75z894nv3OenVQVkqSXvh/Z6QEkSeMw6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmljYzovt3r27lpeXt/OSkvSSd+DAgSeqanGj/bY16MvLy6ytrW3nJSXpJS/J0c3s55KLJDVh0CWpCYMuSU0YdElqwqBLUhMbBj3JXyd5PMmh47b9WJJ7kzw8/Xze1o4pSdrIZu7QbwF+/YRte4HPV9Xrgc9Pv5ck7aANg15V9wHfOWHzu4Bbp1/fCrx75LkkSTOa98GiV1fVowBV9WiSC062Y5IVYAVgaWlpzstJs0myLdfxb/LqVLLlL4pW1WpVTapqsri44ZOr0iiqaqaPi/7kszMfY8x1qpk36I8l+XGA6efHxxtJkjSPeYP+GeD66dfXA58eZxxJ0rw287bF24AvARcneSTJ+4B9wNVJHgaunn4vSdpBG74oWlXXneRHV408iyRpAJ8UlaQmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1MSgoCd5f5JDSR5K8oGxhpIkzW7uoCe5BPg94BeBy4C3J3n9WINJkmYz5A79Z4D7q+rpqnoO+CfgN8cZS5I0qyFBPwRcmeT8JLuAtwE/eeJOSVaSrCVZW19fH3A5SdKLmTvoVXUY+AhwL3A38CDw3Avst1pVk6qaLC4uzj2oJOnFDXpRtKr+qqreVFVXAt8BHh5nLEnSrBaGHJzkgqp6PMkS8B7g8nHGkiTNalDQgX9Icj7wLPD7VfXfI8wkSZrDoKBX1a+ONYgkaRifFJWkJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmhgU9CQfTPJQkkNJbkty9liDSZJmM3fQk/wE8IfApKouAc4Arh1rMEnSbIYuuSwA5yRZAHYB3x4+kiRpHnMHvaq+BXwU+AbwKPBkVd0z1mCSpNkszHtgkvOAd
wGvAf4H+GSS91bVJ07YbwVYAVhaWhowqk5Xl334Hp585tktv87y3ru29PyvPOdMHrz5mi29hk5vcwcdeCvwn1W1DpDkDuCXgf8X9KpaBVYBJpNJDbieTlNPPvMsR/bt2ekxBtvqfzCkIWvo3wB+KcmuJAGuAg6PM5YkaVZD1tD3A7cDDwAHp+daHWkuSdKMhiy5UFU3AzePNIskaQCfFJWkJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktTE3EFPcnGSrxz38d0kHxhzOEnS5i3Me2BV/RvwRoAkZwDfAj410lySpBmNteRyFfC1qjo60vkkSTMaK+jXAreNdC5J0hwGBz3JWcA7gU+e5OcrSdaSrK2vrw+9nCTpJMa4Q/8N4IGqeuyFflhVq1U1qarJ4uLiCJeTJL2QMYJ+HS63SNKOGxT0JLuAq4E7xhlHkjSvud+2CFBVTwPnjzSLJGkAnxSVpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUxKCgJ3lVktuTfDXJ4SSXjzWYJGk2CwOP/xhwd1X9dpKzgF0jzCRJmsPcQU/yo8CVwA0AVfV94PvjjCVJmtWQJZfXAuvA3yT5cpKPJ3n5SHNJkmaUqprvwGQC3A9cUVX7k3wM+G5V3XTCfivACsDS0tLPHz16dODIOt1ceuulOz3CaA5ef3CnR9BLUJIDVTXZaL8ha+iPAI9U1f7p97cDe0/cqapWgVWAyWQy378eOq197/A+juzbs9NjDLa8966dHkHNzb3kUlX/BXwzycXTTVcB/zrKVJKkmQ19l8sfAH87fYfL14Ebh48kSZrHoKBX1VeADdd1JElbzydFJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpCYMuSU0YdElqwqBLUhMGXZKaMOiS1IRBl6QmDLokNbEw5OAkR4DvAT8AnquqyRhDSZJmNyjoU79WVU+McB5J0gAuuUhSE0ODXsA9SQ4kWRljIEnSfIYuuVxRVd9OcgFwb5KvVtV9x+8wDf0KwNLS0sDL6XS1vPeunR5hsFeec+ZOj6DmUlXjnCj5M+CpqvroyfaZTCa1trY2yvWkMS3vvYsj+/bs9BjSC0pyYDNvOpl7ySXJy5O84vmvgWuAQ/OeT5I0zJAll1cDn0ry/Hn+rqruHmUqSdLM5g56VX0duGzEWSRJA/i2RUlqwqBLUhMGXZKaMOiS1IRBl6QmDLokNWHQJakJgy5JTRh0SWrCoEtSEwZdkpow6JLUhEGXpCYMuiQ1YdAlqQmDLklNGHRJasKgS1ITBl2SmjDoktSEQZekJgy6JDVh0CWpCYMuSU0MDnqSM5J8OclnxxhIkjSfMe7Q3w8cHuE8kqQBBgU9yYXAHuDj44wjSZrX0Dv0vwT+GPjhCLNIkgZYmPfAJG8HHq+qA0ne/CL7rQArAEtLS/NeTppJktmP+cjs16mq2Q+StsiQO/QrgHcmOQL8PfCWJJ84caeqWq2qSVVNFhcXB1xO2ryq2pYP6VQyd9Cr6k+r6sKqWgauBf6xqt472mSSpJn4PnRJamLuNfTjVdUXgC+McS5J0ny8Q5ekJgy6JDVh0CWpCYMuSU0YdElqItv5cESSdeDotl1Q2rzdwBM7PYR0EhdV1YZPZm5r0KVTVZK1qprs9BzSEC65SFITBl2SmjDo0jGrOz2ANJRr6JLUhHfoktSEQZekJgy6JDUxyn+fK51qktwE/A7wTY49MHQAeJJjfw7xL
OA/gN+tqqeT3AI8A/w0cBFwI3A9cDmwv6pumJ7zGuDDwMuArwE3VtVT2/dbSS/OO3S1k2QC/Bbwc8B7gOcfGLqjqn6hqi4DDgPvO+6w84C3AB8E7gT+AvhZ4NIkb0yyG/gQ8NaqehOwBvzRdvw+0mZ5h66OfgX4dFU9A5Dkzun2S5L8OfAq4Fzgc8cdc2dVVZKDwGNVdXB67EPAMnAh8Abgn6d/gPos4Evb8LtIm2bQ1VFOsv0W4N1V9WCSG4A3H/ez/51+/uFxXz///QLwA+Deqrpu1EmlEbnkoo6+CLwjydlJzgX2TLe/Ang0yZkcW1+fxf3AFUleB5BkV5KfGm1iaQTeoaudqvqXJJ8BHuTY/+65xrEXRG8C9k+3HeRY4Dd7zvXpXf1tSV423fwh4N9HHF0axCdF1VKSc6vqqSS7gPuAlap6YKfnkraSd+jqajXJG4CzgVuNuU4H3qFLUhO+KCpJTRh0SWrCoEtSEwZdkpow6JLUhEGXpCb+D3CkwUe7e5xxAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df[\"game\"].plot.box() # boxplot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Этот график визуализирует основные описательные статистики переменной и отображает форму её распределения. Нижняя граница яшика – это нижний квартиль, верхняя – верхний квартиль, линяя внутри ящика – медиана. Усы графика могут откладываться по-разному: если в переменной встречаются нетипичные значения (выбросы), то границы усов совпадают с границами типичных значений, если нетипичных значений нет, границы усов соответствуют минимальному и максимальному значению переменной. Подробнее про ящик с усами см. [здесь](https://ru.wikipedia.org/wiki/%D0%AF%D1%89%D0%B8%D0%BA_%D1%81_%D1%83%D1%81%D0%B0%D0%BC%D0%B8)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Фильтрация строк по условиям" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Часто в исследованиях нас не интересует выбор отдельных строк по названию или номеру, мы хотим отбирать строки в таблице согласно некорому условию (условиям). Другими словами, проводить фильтрацию наблюдений. Для этого интересующее нас условие необходимо указать в квадратных скобках. Выберем из датафрейма `df` строки, которые соответствуют студентам с оценкой по экономике выше 6." ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ024 7 9 8 8 9 8 10 8 7 9 \n", "М141БПЛТЛ031 8 10 10 10 10 10 10 9 9 10 \n", "М141БПЛТЛ075 9 9 9 10 9 10 9 8 9 10 \n", "М141БПЛТЛ017 9 9 8 8 9 9 10 6 9 9 \n", "М141БПЛТЛ069 10 10 10 10 10 10 9 8 8 10 \n", "М141БПЛТЛ072 10 9 8 10 9 8 9 8 8 10 \n", "М141БПЛТЛ026 7 10 8 7 10 7 9 8 8 8 \n", "М141БПЛТЛ073 7 9 8 8 9 8 9 8 8 9 \n", "М141БПЛТЛ060 7 8 7 7 9 8 8 5 7 5 \n", "М141БПЛТЛ021 8 9 8 8 9 8 8 7 7 7 \n", "М141БПЛТЛ018 7 7 9 7 9 7 8 6 6 7 \n", "М141БПЛТЛ039 9 8 9 8 8 8 6 8 7 6 \n", "М141БПЛТЛ036 8 10 7 8 8 6 9 4 8 8 \n", "М141БПЛТЛ045 5 8 8 7 8 6 7 6 7 7 \n", "М141БПЛТЛ033 5 9 8 7 9 7 9 7 7 8 \n", "М141БПЛТЛ008 10 8 8 9 8 10 9 8 9 10 \n", "М141БПЛТЛ052 7 7 7 7 8 6 6 6 8 6 \n", "М141БПЛТЛ029 6 8 8 7 9 5 6 7 6 5 \n", "М141БПЛТЛ064 7 8 6 7 6 6 8 4 6 4 \n", "М141БПЛТЛ023 7 9 6 8 9 6 9 4 7 7 \n", "М141БПЛТЛ066 7 10 7 7 9 5 8 4 6 5 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ024 9 7.0 8 8.0 6 10 1 \n", "М141БПЛТЛ031 10 9.0 8 8.0 9 10 1 \n", "М141БПЛТЛ075 9 9.0 8 8.0 7 9 1 \n", "М141БПЛТЛ017 9 8.0 8 8.0 8 9 0 \n", "М141БПЛТЛ069 9 7.0 6 5.0 8 10 1 \n", "М141БПЛТЛ072 9 7.0 8 8.0 9 9 0 \n", "М141БПЛТЛ026 8 8.0 8 7.0 7 8 0 \n", "М141БПЛТЛ073 9 7.0 7 6.0 10 9 1 \n", "М141БПЛТЛ060 8 5.0 7 8.0 7 9 1 \n", "М141БПЛТЛ021 6 6.0 8 6.0 7 8 0 \n", "М141БПЛТЛ018 8 7.0 7 7.0 7 8 0 \n", "М141БПЛТЛ039 9 6.0 7 8.0 4 9 1 \n", "М141БПЛТЛ036 7 6.0 7 6.0 7 8 1 \n", "М141БПЛТЛ045 8 6.0 8 6.0 5 8 0 \n", "М141БПЛТЛ033 8 7.0 8 5.0 7 8 0 \n", "М141БПЛТЛ008 9 8.0 5 5.0 10 4 1 \n", "М141БПЛТЛ052 7 5.0 8 6.0 5 7 1 \n", "М141БПЛТЛ029 8 5.0 7 4.0 5 7 0 \n", "М141БПЛТЛ064 4 4.0 6 5.0 4 7 0 \n", "М141БПЛТЛ023 7 6.0 4 4.0 7 5 1 \n", "М141БПЛТЛ066 6 4.0 6 4.0 5 6 0 " ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[df['econ'] > 6] " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Почему нельзя было написать проще, то есть 
`df[\"Econ\"] > 6`? Давайте напишем, и посмотрим, что получится:" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "id\n", "М141БПЛТЛ024 True\n", "М141БПЛТЛ031 True\n", "М141БПЛТЛ075 True\n", "М141БПЛТЛ017 True\n", "М141БПЛТЛ069 True\n", "М141БПЛТЛ072 True\n", "М141БПЛТЛ020 False\n", "М141БПЛТЛ026 True\n", "М141БПЛТЛ073 True\n", "М141БПЛТЛ078 False\n", "М141БПЛТЛ060 True\n", "М141БПЛТЛ040 False\n", "М141БПЛТЛ065 False\n", "М141БПЛТЛ053 False\n", "М141БПЛТЛ015 False\n", "М141БПЛТЛ021 True\n", "М141БПЛТЛ018 True\n", "М141БПЛТЛ039 True\n", "М141БПЛТЛ036 True\n", "М141БПЛТЛ049 False\n", "М141БПЛТЛ048 False\n", "М141БПЛТЛ034 False\n", "М141БПЛТЛ045 True\n", "М141БПЛТЛ033 True\n", "М141БПЛТЛ083 False\n", "М141БПЛТЛ008 True\n", "М141БПЛТЛ001 False\n", "М141БПЛТЛ038 False\n", "М141БПЛТЛ052 True\n", "М141БПЛТЛ011 False\n", "М141БПЛТЛ004 False\n", "М141БПЛТЛ010 False\n", "М141БПЛТЛ035 False\n", "М141БПЛТЛ030 False\n", "М141БПЛТЛ070 False\n", "М141БПЛТЛ051 False\n", "М141БПЛТЛ046 False\n", "М141БПЛТЛ047 False\n", "М141БПЛТЛ063 False\n", "М141БПЛТЛ029 True\n", "М141БПЛТЛ064 True\n", "М141БПЛТЛ076 False\n", "М141БПЛТЛ062 False\n", "М141БПЛТЛ074 False\n", "130232038 False\n", "М141БПЛТЛ023 True\n", "М141БПЛТЛ054 False\n", "М141БПЛТЛ012 False\n", "М141БПЛТЛ006 False\n", "М141БПЛТЛ055 False\n", "М141БПЛТЛ007 False\n", "М141БПЛТЛ050 False\n", "М141БПЛТЛ066 True\n", "М141БПЛТЛ005 False\n", "Name: econ, dtype: bool" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[\"econ\"] > 6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Что мы увидели? Просто результат проверки условия, набор из `True` и `False`. 
When we put this expression inside the square brackets, pandas selects the rows of `df` for which it evaluates to `True`.\n", "\n", "All the comparison operators work as usual:" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ008 10 8 8 9 8 10 9 8 9 10 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ008 9 8.0 5 5.0 10 4 1 " ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[df[\"econ\"] == 9] # the double equals sign tests for equality" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Conditions can be combined. Let's select students whose economics grade is from 6 up to, but not including, 8." ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ020 8 7 7 6 9 10 8 8 7 7 \n", "М141БПЛТЛ026 7 10 8 7 10 7 9 8 8 8 \n", "М141БПЛТЛ060 7 8 7 7 9 8 8 5 7 5 \n", "М141БПЛТЛ040 6 9 8 6 9 7 8 6 9 5 \n", "М141БПЛТЛ015 6 9 7 6 9 7 9 4 7 7 \n", "М141БПЛТЛ018 7 7 9 7 9 7 8 6 6 7 \n", "М141БПЛТЛ049 6 7 6 6 8 6 8 4 8 5 \n", "М141БПЛТЛ048 8 6 8 6 9 6 4 4 6 4 \n", "М141БПЛТЛ034 6 9 7 6 9 6 8 6 7 6 \n", "М141БПЛТЛ045 5 8 8 7 8 6 7 6 7 7 \n", "М141БПЛТЛ033 5 9 8 7 9 7 9 7 7 8 \n", "М141БПЛТЛ052 7 7 7 7 8 6 6 6 8 6 \n", "М141БПЛТЛ011 7 6 8 6 9 6 6 5 6 6 \n", "М141БПЛТЛ004 7 7 6 6 8 6 6 5 5 5 \n", "М141БПЛТЛ010 6 6 7 6 9 7 7 6 7 5 \n", "М141БПЛТЛ035 5 6 7 6 8 5 5 4 6 6 \n", "М141БПЛТЛ030 7 6 6 6 7 6 6 4 8 5 \n", "М141БПЛТЛ051 8 9 8 6 8 7 6 7 6 6 \n", "М141БПЛТЛ029 6 8 8 7 9 5 6 7 6 5 \n", "М141БПЛТЛ064 7 8 6 7 6 6 8 4 6 4 \n", "М141БПЛТЛ076 7 7 8 6 8 6 6 6 8 6 \n", "М141БПЛТЛ062 7 7 7 6 9 6 6 5 6 5 \n", "М141БПЛТЛ007 6 7 7 6 7 6 7 4 5 5 \n", "М141БПЛТЛ050 8 6 6 6 8 4 5 4 5 5 \n", "М141БПЛТЛ066 7 10 7 7 9 5 8 4 6 5 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ020 9 7.0 8 6.0 8 9 1 \n", "М141БПЛТЛ026 8 8.0 8 7.0 7 8 0 \n", "М141БПЛТЛ060 8 5.0 7 8.0 7 9 1 \n", "М141БПЛТЛ040 8 5.0 8 5.0 7 10 0 \n", "М141БПЛТЛ015 7 6.0 7 7.0 10 7 0 \n", "М141БПЛТЛ018 8 7.0 7 7.0 7 8 0 \n", "М141БПЛТЛ049 9 6.0 8 5.0 6 8 0 \n", "М141БПЛТЛ048 8 4.0 6 7.0 7 8 0 \n", "М141БПЛТЛ034 6 5.0 8 5.0 8 9 0 \n", "М141БПЛТЛ045 8 6.0 8 6.0 5 8 0 \n", "М141БПЛТЛ033 8 7.0 8 5.0 7 8 0 \n", "М141БПЛТЛ052 7 5.0 8 6.0 5 7 1 \n", "М141БПЛТЛ011 7 6.0 8 6.0 5 8 0 \n", "М141БПЛТЛ004 6 5.0 7 5.0 8 8 0 \n", "М141БПЛТЛ010 8 6.0 8 6.0 5 8 1 \n", "М141БПЛТЛ035 7 5.0 8 7.0 6 7 0 \n", "М141БПЛТЛ030 5 5.0 8 5.0 7 9 1 \n", "М141БПЛТЛ051 6 5.0 4 4.0 5 5 1 \n", "М141БПЛТЛ029 8 5.0 7 4.0 5 7 0 \n", "М141БПЛТЛ064 4 4.0 6 5.0 4 7 0 \n", "М141БПЛТЛ076 8 5.0 7 4.0 4 6 0 \n", "М141БПЛТЛ062 6 4.0 5 5.0 4 6 0 \n", "М141БПЛТЛ007 6 5.0 4 5.0 4 7 1 \n", "М141БПЛТЛ050 
6 4.0 5 4.0 6 6 0 \n", "М141БПЛТЛ066 6 4.0 6 4.0 5 6 0 " ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[(df[\"econ\"] >= 6) & (df[\"econ\"] < 8)] " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `&` operator requires both conditions to hold at once. Don't forget the parentheses around each condition. Now let's select students with an English grade above 9 and a law grade below 9:" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ069 10 10 10 10 10 10 9 8 8 10 \n", "М141БПЛТЛ026 7 10 8 7 10 7 9 8 8 8 \n", "М141БПЛТЛ001 6 7 7 4 10 7 7 6 8 6 \n", "М141БПЛТЛ012 6 6 7 4 10 6 5 4 7 5 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ069 9 7.0 6 5.0 8 10 1 \n", "М141БПЛТЛ026 8 8.0 8 7.0 7 8 0 \n", "М141БПЛТЛ001 8 4.0 6 6.0 4 8 0 \n", "М141БПЛТЛ012 7 4.0 5 4.0 4 8 1 " ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[(df[\"eng\"] > 9) & (df[\"law\"] < 9)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When writing compound conditions, pay close attention to the parentheses: `&` and `|` bind more tightly than comparison operators, so a condition with misplaced parentheses can raise an error or silently give a wrong result.\n", "\n", "Now let's select students with a grade below 5 in political history or a grade below 5 in the history of political thought:" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " catps mstat soc econ eng polth mstat2 phist law phil \\\n", "id \n", "М141БПЛТЛ015 6 9 7 6 9 7 9 4 7 7 \n", "М141БПЛТЛ036 8 10 7 8 8 6 9 4 8 8 \n", "М141БПЛТЛ049 6 7 6 6 8 6 8 4 8 5 \n", "М141БПЛТЛ048 8 6 8 6 9 6 4 4 6 4 \n", "М141БПЛТЛ035 5 6 7 6 8 5 5 4 6 6 \n", "М141БПЛТЛ030 7 6 6 6 7 6 6 4 8 5 \n", "М141БПЛТЛ063 5 5 6 4 8 4 4 4 5 4 \n", "М141БПЛТЛ064 7 8 6 7 6 6 8 4 6 4 \n", "130232038 6 7 6 5 8 4 8 4 8 4 \n", "М141БПЛТЛ023 7 9 6 8 9 6 9 4 7 7 \n", "М141БПЛТЛ054 7 8 6 4 8 6 4 4 6 4 \n", "М141БПЛТЛ012 6 6 7 4 10 6 5 4 7 5 \n", "М141БПЛТЛ007 6 7 7 6 7 6 7 4 5 5 \n", "М141БПЛТЛ050 8 6 6 6 8 4 5 4 5 5 \n", "М141БПЛТЛ066 7 10 7 7 9 5 8 4 6 5 \n", "М141БПЛТЛ005 5 7 5 5 7 4 7 4 5 4 \n", "\n", " polsoc ptheo preg compp game wpol male \n", "id \n", "М141БПЛТЛ015 7 6.0 7 7.0 10 7 0 \n", "М141БПЛТЛ036 7 6.0 7 6.0 7 8 1 \n", "М141БПЛТЛ049 9 6.0 8 5.0 6 8 0 \n", "М141БПЛТЛ048 8 4.0 6 7.0 7 8 0 \n", "М141БПЛТЛ035 7 5.0 8 7.0 6 7 0 \n", "М141БПЛТЛ030 5 5.0 8 5.0 7 9 1 \n", "М141БПЛТЛ063 5 4.0 7 5.0 8 8 0 \n", "М141БПЛТЛ064 4 4.0 6 5.0 4 7 0 \n", "130232038 5 5.0 6 4.0 5 6 0 \n", "М141БПЛТЛ023 7 6.0 4 4.0 7 5 1 \n", "М141БПЛТЛ054 8 4.0 4 4.0 4 8 1 \n", "М141БПЛТЛ012 7 4.0 5 4.0 4 8 1 \n", "М141БПЛТЛ007 6 5.0 4 5.0 4 7 1 \n", "М141БПЛТЛ050 6 4.0 5 4.0 6 6 0 \n", "М141БПЛТЛ066 6 4.0 6 4.0 5 6 0 \n", "М141БПЛТЛ005 5 5.0 4 4.0 4 8 1 " ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[(df[\"phist\"] < 5) | (df[\"polth\"] < 5)] # the | operator expresses a logical or " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }
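The row-selection steps above can be recapped in one small, self-contained sketch. The `toy` dataframe and its values are made up for illustration (only the column names echo the notebook's `econ`, `eng`, and `law`), so it runs without the course data file.

```python
import pandas as pd

# Hypothetical toy data: column names mirror the notebook, values are invented.
toy = pd.DataFrame(
    {"econ": [9, 6, 7, 4], "eng": [10, 8, 10, 9], "law": [9, 7, 8, 6]},
    index=["s1", "s2", "s3", "s4"],
)

# A comparison on a column yields a boolean Series (the mask).
mask = toy["econ"] > 6

# Indexing the dataframe with the mask keeps the rows where it is True.
above_six = toy[mask]

# Combine conditions with & (and) and | (or). Each comparison gets its own
# parentheses because & and | bind more tightly than comparison operators.
both = toy[(toy["eng"] > 9) & (toy["law"] < 9)]
either = toy[(toy["econ"] < 5) | (toy["law"] > 8)]

print(mask.tolist())
print(list(above_six.index))
print(list(both.index))
print(list(either.index))
```

Storing the mask in a variable first, as with `mask` here, is optional; it simply makes the two steps (build the condition, then filter) easier to inspect separately.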