{ "cells": [ { "cell_type": "markdown", "id": "c3a2e1ea-f611-490a-9fcc-0bb6f29a08c7", "metadata": {}, "source": [ "# Challenge: Analyzing Student Performance\n", "\n", "**Description:** You are a data analyst at a school, and you need to analyze the performance of students in a mathematics exam. The school has provided you with a dataset containing the scores of 50 students in three subjects: Algebra, Geometry, and Calculus. \n", "\n", "**Your task is to:**\n", "- Load the dataset into a Pandas DataFrame\n", "- Calculate the average score for each subject\n", "- Identify the top 5 students with the highest overall score (average of all three subjects)\n", "- Create a new column to indicate whether each student passed or failed the exam (passing score is 80 or higher)" ] }, { "cell_type": "code", "execution_count": 1, "id": "19baa358-66a8-4756-95fa-a7e959c5f8ef", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Python version 3.11.7 | packaged by Anaconda, Inc. | (main, Dec 15 2023, 18:05:47) [MSC v.1916 64 bit (AMD64)]\n", "Pandas version 2.2.1\n", "Numpy version 1.26.4\n" ] } ], "source": [ "# import libraries\n", "import pandas as pd\n", "import numpy as np\n", "import sys\n", "\n", "print('Python version ' + sys.version)\n", "print('Pandas version ' + pd.__version__)\n", "print('Numpy version ' + np.__version__)" ] }, { "cell_type": "markdown", "id": "27794f9d-d0c2-4fc6-bc87-c7334ec29c56", "metadata": {}, "source": [ "# The Data \n", "\n", "The dataset contains information about the performance of 50 students in a mathematics exam, covering three subjects: Algebra, Geometry, and Calculus. \n", "\n", "### Columns:\n", "- **Student_ID:** A unique identifier for each student (integer). \n", "- **Algebra:** The student's score in Algebra (integer). \n", "- **Geometry:** The student's score in Geometry (integer). \n", "- **Calculus:** The student's score in Calculus (integer). \n", "\n", "```\n", "Student_ID\tAlgebra\tGeometry\tCalculus\n", "1\t85\t90\t78\n", "2\t92\t88\t95\n", "3\t78\t75\t82\n", "4\t95\t92\t89\n", "5\t88\t85\t92\n", "6\t76\t78\t85\n", "7\t89\t91\t96\n", "8\t83\t80\t88\n", "9\t90\t95\t92\n", "10\t81\t83\t86\n", "11\t86\t89\t93\n", "12\t93\t96\t91\n", "13\t84\t82\t89\n", "14\t91\t94\t97\n", "15\t79\t81\t87\n", "16\t87\t86\t90\n", "17\t94\t97\t95\n", "18\t82\t84\t91\n", "19\t96\t98\t99\n", "20\t80\t79\t83\n", "21\t98\t99\t100\n", "22\t75\t77\t81\n", "23\t92\t93\t98\n", "24\t89\t92\t95\n", "25\t77\t76\t80\n", "26\t95\t96\t98\n", "27\t81\t82\t87\n", "28\t90\t91\t94\n", "29\t78\t80\t84\n", "30\t97\t99\t100\n", "31\t84\t86\t90\n", "32\t91\t94\t97\n", "33\t76\t78\t82\n", "34\t88\t90\t93\n", "35\t82\t84\t89\n", "36\t96\t98\t100\n", "37\t85\t87\t92\n", "38\t79\t81\t86\n", "39\t93\t95\t98\n", "40\t80\t82\t85\n", "41\t87\t89\t94\n", "42\t94\t96\t99\n", "43\t83\t85\t91\n", "44\t92\t94\t97\n", "45\t77\t79\t83\n", "46\t90\t92\t95\n", "47\t86\t88\t93\n", "48\t98\t100\t100\n", "49\t81\t83\t88\n", "50\t95\t97\t99\n", "```" ] }, { "cell_type": "code", "execution_count": 3, "id": "9e23e180-4481-469a-be34-8482b0ec77b9", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Student_IDAlgebraGeometryCalculus
01859078
12928895
23787582
34959289
45888592
56767885
67899196
78838088
89909592
910818386
1011868993
1112939691
1213848289
1314919497
1415798187
1516878690
1617949795
1718828491
1819969899
1920807983
20219899100
2122757781
2223929398
2324899295
2425777680
2526959698
2627818287
2728909194
2829788084
29309799100
3031848690
3132919497
3233767882
3334889093
3435828489
35369698100
3637858792
3738798186
3839939598
3940808285
4041878994
4142949699
4243838591
4344929497
4445777983
4546909295
4647868893
474898100100
4849818388
4950959799
\n", "
" ], "text/plain": [ " Student_ID Algebra Geometry Calculus\n", "0 1 85 90 78\n", "1 2 92 88 95\n", "2 3 78 75 82\n", "3 4 95 92 89\n", "4 5 88 85 92\n", "5 6 76 78 85\n", "6 7 89 91 96\n", "7 8 83 80 88\n", "8 9 90 95 92\n", "9 10 81 83 86\n", "10 11 86 89 93\n", "11 12 93 96 91\n", "12 13 84 82 89\n", "13 14 91 94 97\n", "14 15 79 81 87\n", "15 16 87 86 90\n", "16 17 94 97 95\n", "17 18 82 84 91\n", "18 19 96 98 99\n", "19 20 80 79 83\n", "20 21 98 99 100\n", "21 22 75 77 81\n", "22 23 92 93 98\n", "23 24 89 92 95\n", "24 25 77 76 80\n", "25 26 95 96 98\n", "26 27 81 82 87\n", "27 28 90 91 94\n", "28 29 78 80 84\n", "29 30 97 99 100\n", "30 31 84 86 90\n", "31 32 91 94 97\n", "32 33 76 78 82\n", "33 34 88 90 93\n", "34 35 82 84 89\n", "35 36 96 98 100\n", "36 37 85 87 92\n", "37 38 79 81 86\n", "38 39 93 95 98\n", "39 40 80 82 85\n", "40 41 87 89 94\n", "41 42 94 96 99\n", "42 43 83 85 91\n", "43 44 92 94 97\n", "44 45 77 79 83\n", "45 46 90 92 95\n", "46 47 86 88 93\n", "47 48 98 100 100\n", "48 49 81 83 88\n", "49 50 95 97 99" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_clipboard()\n", "df" ] }, { "cell_type": "code", "execution_count": 4, "id": "8c3c6ff1-07b8-4dbc-9c17-3730eac3d59c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 50 entries, 0 to 49\n", "Data columns (total 4 columns):\n", " # Column Non-Null Count Dtype\n", "--- ------ -------------- -----\n", " 0 Student_ID 50 non-null int64\n", " 1 Algebra 50 non-null int64\n", " 2 Geometry 50 non-null int64\n", " 3 Calculus 50 non-null int64\n", "dtypes: int64(4)\n", "memory usage: 1.7 KB\n" ] } ], "source": [ "# make sure data types look good\n", "df.info()" ] }, { "cell_type": "code", "execution_count": 5, "id": "17d27a61-48e8-4722-8fa9-d321e8d9211d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Student_IDAlgebraGeometryCalculusmedian
474898100100100.0
2930979910099.0
2021989910099.0
181996989998.0
3536969810098.0
\n", "
" ], "text/plain": [ " Student_ID Algebra Geometry Calculus median\n", "47 48 98 100 100 100.0\n", "29 30 97 99 100 99.0\n", "20 21 98 99 100 99.0\n", "18 19 96 98 99 98.0\n", "35 36 96 98 100 98.0" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# calculate the average score for each subject\n", "df['median'] = df[['Algebra', 'Geometry', 'Calculus']].median(axis=1) # I usually choose the median over the average\n", "\n", "# identify the top 5 students with the highest overall score (average of all three subjects)\n", "df.sort_values(by='median', ascending=False).head()" ] }, { "cell_type": "code", "execution_count": 6, "id": "4083b038-5989-49e4-be37-6a06c5e2959c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Student_IDAlgebraGeometryCalculusmedianpassFail
0185907885.0pass
1292889592.0pass
2378758278.0fail
3495928992.0pass
4588859288.0pass
5676788578.0fail
6789919691.0pass
7883808883.0pass
8990959292.0pass
91081838683.0pass
101186899389.0pass
111293969193.0pass
121384828984.0pass
131491949794.0pass
141579818781.0pass
151687869087.0pass
161794979595.0pass
171882849184.0pass
181996989998.0pass
192080798380.0pass
2021989910099.0pass
212275778177.0fail
222392939893.0pass
232489929592.0pass
242577768077.0fail
252695969896.0pass
262781828782.0pass
272890919491.0pass
282978808480.0pass
2930979910099.0pass
303184869086.0pass
313291949794.0pass
323376788278.0fail
333488909390.0pass
343582848984.0pass
3536969810098.0pass
363785879287.0pass
373879818681.0pass
383993959895.0pass
394080828582.0pass
404187899489.0pass
414294969996.0pass
424383859185.0pass
434492949794.0pass
444577798379.0fail
454690929592.0pass
464786889388.0pass
474898100100100.0pass
484981838883.0pass
495095979997.0pass
\n", "
" ], "text/plain": [ " Student_ID Algebra Geometry Calculus median passFail\n", "0 1 85 90 78 85.0 pass\n", "1 2 92 88 95 92.0 pass\n", "2 3 78 75 82 78.0 fail\n", "3 4 95 92 89 92.0 pass\n", "4 5 88 85 92 88.0 pass\n", "5 6 76 78 85 78.0 fail\n", "6 7 89 91 96 91.0 pass\n", "7 8 83 80 88 83.0 pass\n", "8 9 90 95 92 92.0 pass\n", "9 10 81 83 86 83.0 pass\n", "10 11 86 89 93 89.0 pass\n", "11 12 93 96 91 93.0 pass\n", "12 13 84 82 89 84.0 pass\n", "13 14 91 94 97 94.0 pass\n", "14 15 79 81 87 81.0 pass\n", "15 16 87 86 90 87.0 pass\n", "16 17 94 97 95 95.0 pass\n", "17 18 82 84 91 84.0 pass\n", "18 19 96 98 99 98.0 pass\n", "19 20 80 79 83 80.0 pass\n", "20 21 98 99 100 99.0 pass\n", "21 22 75 77 81 77.0 fail\n", "22 23 92 93 98 93.0 pass\n", "23 24 89 92 95 92.0 pass\n", "24 25 77 76 80 77.0 fail\n", "25 26 95 96 98 96.0 pass\n", "26 27 81 82 87 82.0 pass\n", "27 28 90 91 94 91.0 pass\n", "28 29 78 80 84 80.0 pass\n", "29 30 97 99 100 99.0 pass\n", "30 31 84 86 90 86.0 pass\n", "31 32 91 94 97 94.0 pass\n", "32 33 76 78 82 78.0 fail\n", "33 34 88 90 93 90.0 pass\n", "34 35 82 84 89 84.0 pass\n", "35 36 96 98 100 98.0 pass\n", "36 37 85 87 92 87.0 pass\n", "37 38 79 81 86 81.0 pass\n", "38 39 93 95 98 95.0 pass\n", "39 40 80 82 85 82.0 pass\n", "40 41 87 89 94 89.0 pass\n", "41 42 94 96 99 96.0 pass\n", "42 43 83 85 91 85.0 pass\n", "43 44 92 94 97 94.0 pass\n", "44 45 77 79 83 79.0 fail\n", "45 46 90 92 95 92.0 pass\n", "46 47 86 88 93 88.0 pass\n", "47 48 98 100 100 100.0 pass\n", "48 49 81 83 88 83.0 pass\n", "49 50 95 97 99 97.0 pass" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# create a new column to indicate whether each student passed or failed the exam (passing score is 80 or higher)\n", "df['passFail'] = df['median'].apply(lambda x: \"pass\" if x >= 80 else \"fail\")\n", "df" ] }, { "cell_type": "markdown", "id": "95e48196-9afd-4a88-8cb4-d54f1a026b16", "metadata": {}, "source": [ "# BONUS" ] }, { "cell_type": "code", "execution_count": 7, "id": "488a97b5-be16-4659-9065-7539c16f2b72", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AlgebraGeometryCalculus
0passpassfail
1passpasspass
2failfailpass
3passpasspass
4passpasspass
\n", "
" ], "text/plain": [ " Algebra Geometry Calculus\n", "0 pass pass fail\n", "1 pass pass pass\n", "2 fail fail pass\n", "3 pass pass pass\n", "4 pass pass pass" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def pass_fail(score):\n", " if score >= 80:\n", " return \"pass\"\n", " else:\n", " return \"fail\"\n", "\n", "# calculate pass fail on all 3 subjects\n", "label = df[['Algebra', 'Geometry', 'Calculus']].map(pass_fail)\n", "label.head()" ] }, { "cell_type": "code", "execution_count": 8, "id": "2b40f664-1d46-4eab-8917-2c4d98070937", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Student_IDAlgebra_xGeometry_xCalculus_xmedianpassFailAlgebra_yGeometry_yCalculus_y
0185907885.0passpasspassfail
1292889592.0passpasspasspass
2378758278.0failfailfailpass
3495928992.0passpasspasspass
4588859288.0passpasspasspass
5676788578.0failfailfailpass
6789919691.0passpasspasspass
7883808883.0passpasspasspass
8990959292.0passpasspasspass
91081838683.0passpasspasspass
101186899389.0passpasspasspass
111293969193.0passpasspasspass
121384828984.0passpasspasspass
131491949794.0passpasspasspass
141579818781.0passfailpasspass
151687869087.0passpasspasspass
161794979595.0passpasspasspass
171882849184.0passpasspasspass
181996989998.0passpasspasspass
192080798380.0passpassfailpass
2021989910099.0passpasspasspass
212275778177.0failfailfailpass
222392939893.0passpasspasspass
232489929592.0passpasspasspass
242577768077.0failfailfailpass
252695969896.0passpasspasspass
262781828782.0passpasspasspass
272890919491.0passpasspasspass
282978808480.0passfailpasspass
2930979910099.0passpasspasspass
303184869086.0passpasspasspass
313291949794.0passpasspasspass
323376788278.0failfailfailpass
333488909390.0passpasspasspass
343582848984.0passpasspasspass
3536969810098.0passpasspasspass
363785879287.0passpasspasspass
373879818681.0passfailpasspass
383993959895.0passpasspasspass
394080828582.0passpasspasspass
404187899489.0passpasspasspass
414294969996.0passpasspasspass
424383859185.0passpasspasspass
434492949794.0passpasspasspass
444577798379.0failfailfailpass
454690929592.0passpasspasspass
464786889388.0passpasspasspass
474898100100100.0passpasspasspass
484981838883.0passpasspasspass
495095979997.0passpasspasspass
\n", "
" ], "text/plain": [ " Student_ID Algebra_x Geometry_x Calculus_x median passFail Algebra_y \\\n", "0 1 85 90 78 85.0 pass pass \n", "1 2 92 88 95 92.0 pass pass \n", "2 3 78 75 82 78.0 fail fail \n", "3 4 95 92 89 92.0 pass pass \n", "4 5 88 85 92 88.0 pass pass \n", "5 6 76 78 85 78.0 fail fail \n", "6 7 89 91 96 91.0 pass pass \n", "7 8 83 80 88 83.0 pass pass \n", "8 9 90 95 92 92.0 pass pass \n", "9 10 81 83 86 83.0 pass pass \n", "10 11 86 89 93 89.0 pass pass \n", "11 12 93 96 91 93.0 pass pass \n", "12 13 84 82 89 84.0 pass pass \n", "13 14 91 94 97 94.0 pass pass \n", "14 15 79 81 87 81.0 pass fail \n", "15 16 87 86 90 87.0 pass pass \n", "16 17 94 97 95 95.0 pass pass \n", "17 18 82 84 91 84.0 pass pass \n", "18 19 96 98 99 98.0 pass pass \n", "19 20 80 79 83 80.0 pass pass \n", "20 21 98 99 100 99.0 pass pass \n", "21 22 75 77 81 77.0 fail fail \n", "22 23 92 93 98 93.0 pass pass \n", "23 24 89 92 95 92.0 pass pass \n", "24 25 77 76 80 77.0 fail fail \n", "25 26 95 96 98 96.0 pass pass \n", "26 27 81 82 87 82.0 pass pass \n", "27 28 90 91 94 91.0 pass pass \n", "28 29 78 80 84 80.0 pass fail \n", "29 30 97 99 100 99.0 pass pass \n", "30 31 84 86 90 86.0 pass pass \n", "31 32 91 94 97 94.0 pass pass \n", "32 33 76 78 82 78.0 fail fail \n", "33 34 88 90 93 90.0 pass pass \n", "34 35 82 84 89 84.0 pass pass \n", "35 36 96 98 100 98.0 pass pass \n", "36 37 85 87 92 87.0 pass pass \n", "37 38 79 81 86 81.0 pass fail \n", "38 39 93 95 98 95.0 pass pass \n", "39 40 80 82 85 82.0 pass pass \n", "40 41 87 89 94 89.0 pass pass \n", "41 42 94 96 99 96.0 pass pass \n", "42 43 83 85 91 85.0 pass pass \n", "43 44 92 94 97 94.0 pass pass \n", "44 45 77 79 83 79.0 fail fail \n", "45 46 90 92 95 92.0 pass pass \n", "46 47 86 88 93 88.0 pass pass \n", "47 48 98 100 100 100.0 pass pass \n", "48 49 81 83 88 83.0 pass pass \n", "49 50 95 97 99 97.0 pass pass \n", "\n", " Geometry_y Calculus_y \n", "0 pass fail \n", "1 pass pass \n", "2 fail pass \n", "3 pass pass \n", "4 pass pass \n", "5 fail pass \n", "6 pass pass \n", "7 pass pass \n", "8 pass pass \n", "9 pass pass \n", "10 pass pass \n", "11 pass pass \n", "12 pass pass \n", "13 pass pass \n", "14 pass pass \n", "15 pass pass \n", "16 pass pass \n", "17 pass pass \n", "18 pass pass \n", "19 fail pass \n", "20 pass pass \n", "21 fail pass \n", "22 pass pass \n", "23 pass pass \n", "24 fail pass \n", "25 pass pass \n", "26 pass pass \n", "27 pass pass \n", "28 pass pass \n", "29 pass pass \n", "30 pass pass \n", "31 pass pass \n", "32 fail pass \n", "33 pass pass \n", "34 pass pass \n", "35 pass pass \n", "36 pass pass \n", "37 pass pass \n", "38 pass pass \n", "39 pass pass \n", "40 pass pass \n", "41 pass pass \n", "42 pass pass \n", "43 pass pass \n", "44 fail pass \n", "45 pass pass \n", "46 pass pass \n", "47 pass pass \n", "48 pass pass \n", "49 pass pass " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# merge the two dataframes\n", "df.merge(right=label,left_index=True, right_index=True)" ] }, { "cell_type": "markdown", "id": "f33370ec-c071-451e-94b7-6e839e95fe30", "metadata": {}, "source": [ "# Summary: \n", "The tutorial demonstrated how to analyze a dataset of student performance in a mathematics exam using Pandas. It covered importing libraries, loading data, checking data types, calculating average scores, identifying top performers, and creating new columns to indicate pass/fail status. \n", "\n", "# Key Takeaways:\n", "- Loading data from clipboard into a Pandas DataFrame\n", "- Checking data types and info using `df.info()` and `df.head()`\n", "- Calculating median/average scores using `df.median()` or `df.mean()`\n", "- Creating new columns using `df['new_column']`\n", "- Sorting data using `df.sort_values()` and selecting top rows using `df.head()`\n", "- Applying conditional logic using `apply()` and lambda functions (e.g., pass/fail status)\n", "- Merging DataFrames using `df.merge()`\n", "- Using map function to apply pass/fail logic to individual subjects" ] }, { "cell_type": "markdown", "id": "503fc581-f17e-43c6-8215-05b70bf32b47", "metadata": {}, "source": [ "

This tutorial was created by HEDARO

" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.7" } }, "nbformat": 4, "nbformat_minor": 5 }