{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" }, "colab": { "name": "titanic-project-test.ipynb", "provenance": [] } }, "cells": [ { "cell_type": "markdown", "metadata": { "_cell_guid": "6a09d4fb-60c5-4f45-b844-8c788a50c543", "_uuid": "8e892e637f005dd61ec7dcb95865e52f3de2a77f", "id": "FrwBQrji42hL" }, "source": [ "# পাইথন দিয়ে টাইটানিক প্রজেক্ট\n", "> *Python has been an important part of Google since the beginning, and remains so as the system grows and evolves. Today dozens of Google engineers use Python, and we're looking for more people with skills in this language.*\n", "\n", "> -- Peter Norvig, Director of search quality at Google, Inc.\n", "\n", "\"আর\" প্রোগ্রামিং এনভায়রনমেন্টএর পাশাপাশি আমরা পুরো এক্সারসাইজটা দেখাবো পাইথনে। এর মানে এই নয় যে আমরা \"আর\"এ করা এক্সারসাইজটা দেখবো না। এই পাইথনে করা এক্সারসাইজ বুঝতে আমাদের দেখে আসতে হবে পুরো টাইটানিক প্রজেক্ট \"আর\" এনভায়রনমেন্টে। সব জায়গায় বেসিক কনসেপ্ট একই। তবে, 'আর' দিয়ে বোঝা যায় ভালো। আগেই বলেছি যারা কম্পিউটার প্রযুক্তিতে পড়েননি অথবা মেশিন লার্নিং একদম ভেতর থেকে হাতেকলমে শিখতে চান, তাদের জন্য 'আর' অন্য লেভেলের জিনিস। \n", "\n", "কম্পিউটার প্রযুক্তিতে পড়েছেন আর পাইথন জানেন বলে 'মেশিন লার্নিং' শিখে যাবেন সেটাও ভ্রান্ত ধারণা। পাইথন একটা ভালো ফ্রেমওয়ার্ক, প্রায় অনেককিছুই করা যায়। তাই বলে মেশিন লার্নিং আর পাইথন পাশাপাশি সমার্থক শব্দ সেটা বলা যাবে না। সেটা সামনে গেলে দেখতে পাবেন। \n", "\n", "আমার কথা বললে বলবো, আমি দুটোই শিখেছি কারণ - দুটো 'দুই' জায়গায় ভালো। মেশিন লার্নিং শেখার শুরুতে 'আর' ভালো, প্রোডাকশন লেভেলে পাইথন ভালো। যেখানে যেটা লাগে। ছোট দূরত্বে রিকশা ভালো, বড় দূরত্বে হয়তোবা মোটর সাইকেল ভালো। আমাদের জানতে হবে কোথায় কি লাগবে? 
This is the age of optimization. You may need to learn more things as the need arises. Being shy about it only hurts, because then nothing gets learned. Even at nearly fifty I still have to learn a great deal. Anyone who stops learning falls behind. " ] }, { "cell_type": "markdown", "metadata": { "id": "_iuwI49I42hQ" }, "source": [ "## 1. Installing Jupyter Notebook \n", "\n", "Those of us who work with Python need a development interface, something like RStudio. For that we will use Jupyter Notebook, which runs in a web-based interface. Jupyter works with most well-known programming environments, Python and R among them. The nice part is that you can share your work, code and output together, almost anywhere, even on GitHub. \n", "\n", "### 1.1 Jupyter installation: Anaconda\n", "\n", "Installing Jupyter on Windows is as easy as it gets: 1. Download Anaconda from the site below. 2. Then click, click, and click. Just keep one thing in mind: the installation path should not contain any folder with a space in its name. For example, \"C:\\Users\\Test\\Anaconda3\" is fine.\n", "\n", "https://www.anaconda.com/download/\n", "\n", "### 1.2 Starting Jupyter Notebook \n", "\n", "Type the following in the Windows Run box or at the command prompt. You do not have to type all of it; the name of the desktop app shows up before you finish. \n", "\n", "jupyter notebook\n", "\n", "Remember: you type each command in an In [number] cell and see its output next to Out [number]. After typing a command, make sure the cell type is set to 'Code' from the 'Cell' menu or the cell-type dropdown, then press the Run button. " ] }, { "cell_type": "markdown", "metadata": { "id": "kj1yMMJn42hR" }, "source": [ "### 1.3 Where can I find this script?\n", "\n", "https://github.com/raqueeb/mltraining/blob/master/Python/titanic-project.ipynb" ] }, { "cell_type": "markdown", "metadata": { "_cell_guid": "4af5e83d-7fd8-4a61-bf26-9583cb6d3476", "_uuid": "65d04d276a8983f62a49261f6e94a02b281dbcc9", "id": "8OFBJXcc42hR" }, "source": [ "## 2. 
Predicting survival on the Titanic\n", "Kaggle competition [Titanic: Machine Learning from Disaster](http://www.kaggle.com/c/titanic)" ] }, { "cell_type": "markdown", "metadata": { "id": "jefehoTQ42hR" }, "source": [ "- Defining the problem statement\n", "- Where do we get the data?\n", "- Exploratory data analysis\n", "- Feature engineering\n", "- Modeling\n", "- Testing" ] }, { "cell_type": "markdown", "metadata": { "id": "xTdZ-RBv42hR" }, "source": [ "### 2.1 Problem statement: what problem are we solving?" ] }, { "cell_type": "markdown", "metadata": { "id": "nBDRuv7142hS" }, "source": [ "- We want to know who survived and who died.\n", "- Using machine learning tools, we want to see what kinds of people were likely to survive.\n", "\n", "*The competition asks us to predict a binary outcome: 0 means the passenger died, 1 means the passenger survived.* " ] }, { "cell_type": "markdown", "metadata": { "_cell_guid": "3f529075-7f9b-40ff-a79a-f3a11a7d8cbe", "_uuid": "64ca0f815766e3e8074b0e04f53947930cb061aa", "id": "TlOGcAhA42hS" }, "source": [ "### 2.3 Where do we get the data?\n", "We will collect the data from the Kaggle site\n", "\n", "- Training dataset https://www.kaggle.com/c/titanic/download/train.csv\n", "- Test dataset https://www.kaggle.com/c/titanic/download/test.csv" ] }, { "cell_type": "markdown", "metadata": { "id": "NBNHtBwR42hS" }, "source": [ "### 2.4 Python crash course \n", "\n", "For those who do not know Python, the R environment is a good fit. For those who do not know it but want to learn, this is a short crash course. I will teach only as much as we need here, but you should go through the R version first. Python's biggest advantage is its support for thousands of external libraries. For machine learning, the Scikit-learn library is excellent. Much like Matlab, NumPy handles array computation, meaning it works with numeric data laid out in rows and columns. These arrays are different from ordinary Python lists. Tabular data is handled by the Pandas library. matplotlib is our plotting and 
visualization tool. \n", "\n", "#### Modules\n", "\n", "1. Not every Python feature is loaded by default at startup, whether it belongs to the language's own standard library or to a downloaded third-party package. To use a feature from a module, you first have to import that module. As mentioned, pandas is a powerful Python data analysis library built on top of numpy. numpy is another Python library that lets us work with arrays of data. \n", "\n", "import pandas\n", "\n", "2. Whatever module we import, we can shorten its name with an alias. Here we import the pandas library and bind it to the short name pd. The alias is only for brevity; the code works without it too. \n", "\n", "import pandas as pd\n", "\n", "3. Once pandas is imported, you can use its short name throughout the document. To access a module's functions after importing it, write the module name followed by a dot (.) and the function name. Next we use the read_csv function from pandas to read a CSV file, whether it is stored on our computer or at a URL. It returns the contents as a dataframe. read_csv loads the whole table and treats the first row, which holds the field names, as the dataframe's header. \n", "\n", "pd.read_csv\n", "\n", "#### 2.4.1 Loading the training and test data with pandas\n", "\n", "From here on we will relate everything back to R. Remember dataframes? 
As mentioned, in Python dataframes are handled by an excellent library called pandas. " ] }, { "cell_type": "code", "metadata": { "_cell_guid": "e58a3f06-4c2a-4b87-90de-f8b09039fd4e", "_uuid": "46f0b12d7bf66712642e9a9b807f5ef398426b83", "collapsed": true, "scrolled": false, "id": "9oXYu_pE42hT" }, "source": [ "import pandas as pd\n", "\n", "# Read the training and test sets straight from the GitHub repository\n", "train = pd.read_csv('https://github.com/raqueeb/mltraining/raw/master/Python/datasets/train.csv')\n", "test = pd.read_csv('https://github.com/raqueeb/mltraining/raw/master/Python/datasets/test.csv')" ], "execution_count": 1, "outputs": [] }, { "cell_type": "markdown", "metadata": { "_cell_guid": "836a454f-17bc-41a2-be69-cd86c6f3b584", "_uuid": "1ed3ad39ead93977b8936d9c96e6f6f806a8f9b3", "id": "SFkQhATl42hT" }, "source": [ "## 3. Exploratory data analysis\n", "Let's start by looking at the head of the dataframe, just the first 5 rows. train.head(5) shows the first few rows of the train dataframe. Here NaN means the value is missing. " ] }, { "cell_type": "code", "metadata": { "_cell_guid": "749a3d70-394c-4d2c-999a-4d0567e39232", "_uuid": "b9fdb3b19d7a8f30cd0bb69ae434e04121ecba93", "scrolled": true, "id": "GvQKkhqB42hT", "outputId": "77588e7a-c4b7-489a-be48-967aefd1b805", "colab": { "base_uri": "https://localhost:8080/", "height": 357 } }, "source": [ "train.head(5)" ], "execution_count": 2, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PassengerIdSurvivedPclassNameSexAgeSibSpParchTicketFareCabinEmbarked
0103Braund, Mr. Owen Harrismale22.010A/5 211717.2500NaNS
1211Cumings, Mrs. John Bradley (Florence Briggs Th...female38.010PC 1759971.2833C85C
2313Heikkinen, Miss. Lainafemale26.000STON/O2. 31012827.9250NaNS
3411Futrelle, Mrs. Jacques Heath (Lily May Peel)female35.01011380353.1000C123S
4503Allen, Mr. William Henrymale35.0003734508.0500NaNS
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass ... Fare Cabin Embarked\n", "0 1 0 3 ... 7.2500 NaN S\n", "1 2 1 1 ... 71.2833 C85 C\n", "2 3 1 3 ... 7.9250 NaN S\n", "3 4 1 1 ... 53.1000 C123 S\n", "4 5 0 3 ... 8.0500 NaN S\n", "\n", "[5 rows x 12 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 2 } ] }, { "cell_type": "markdown", "metadata": { "id": "YYQqeWIg42hU" }, "source": [ "### ৩.১ ডাটা ডিকশনারি \n", "\n", "- ভ্যারিয়েবল\t- মানে কি?\t- ভ্যালু কি হতে পারে\n", "\n", "- survival\t= বেঁচে গিয়েছেন/মারা গিয়েছেন\t1 = বেঁচে গিয়েছেন; 0 = মারা গিয়েছেন\n", "\n", "- pclass\t= টিকেটের ক্লাস বা শ্রেণী\t1st = প্রথম; 2nd = দ্বিতীয়; 3rd = তৃতীয়\n", "\n", "- sex\t= মহিলা না পুরুষ\n", "\n", "- Age\t= বয়স বছরে,\tএখানে অনেক ডাটা মিসিং\n", "\n", "- sibsp\t= উনার ভাইবোন অথবা স্বামী/স্ত্রীর সংখ্যা, ওই টাইটানিক জাহাজে,\tsiblings / spouses সংখ্যায়\n", "\n", "- parch\t= উনার বাবা মা অথবা বাচ্চাদের সংখ্যা,\tparent /children সংখ্যায়\n", "\n", "- ticket\t= টিকেট নাম্বার,\tকেবিন নম্বর ধরে টিকেট নম্বর\n", "\n", "- fare\t= টাইটানিক যাত্রীর ভাড়া\t\n", "\n", "- cabin\t= টাইটানিকের কেবিন নাম্বার\t\n", "\n", "- embarked\t= কোথা থেকে উঠেছেন, বিশেষ করে কোন পোর্ট থেকে\tC = Cherbourg, Q = Queenstown, S = Southampton" ] }, { "cell_type": "markdown", "metadata": { "id": "TTzMB07l42hU" }, "source": [ "আমরা জানতে চাইবো কতোটা সারি আর কলাম আছে আমাদের ডাটাসেটে। আমরা দেখতে পাচ্ছি ৮৯১ কলাম আর ১২ সারি। " ] }, { "cell_type": "code", "metadata": { "_cell_guid": "ed1e7849-d1b6-490d-b86b-9ca71dfafc7d", "_uuid": "5a641beccf0e555dfd7b9a53a17188ea6edef95b", "id": "-MRYuc6Y42hV", "outputId": "a681ea0d-16d0-450f-d8f7-cd0409bd353a", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "train.shape" ], "execution_count": 3, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(891, 12)" ] }, "metadata": { "tags": [] }, "execution_count": 3 } ] }, { "cell_type": "code", "metadata": { "id": "AJL6TEVa42hV", "outputId": "6ccc521a-321f-47b1-9818-3ea9e122d3d6", 
"colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "test.shape" ], "execution_count": 4, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(418, 11)" ] }, "metadata": { "tags": [] }, "execution_count": 4 } ] }, { "cell_type": "code", "metadata": { "_cell_guid": "418b8a69-f2aa-442d-8f45-fa8887190938", "_uuid": "4ee2591110660a4a16b3da7a7530f0945e121b46", "id": "dKi5xwwL42hV", "outputId": "7bf315f3-a83e-4d6c-8526-56d9bda505ec", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "train.info()" ], "execution_count": 5, "outputs": [ { "output_type": "stream", "text": [ "\n", "RangeIndex: 891 entries, 0 to 890\n", "Data columns (total 12 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 PassengerId 891 non-null int64 \n", " 1 Survived 891 non-null int64 \n", " 2 Pclass 891 non-null int64 \n", " 3 Name 891 non-null object \n", " 4 Sex 891 non-null object \n", " 5 Age 714 non-null float64\n", " 6 SibSp 891 non-null int64 \n", " 7 Parch 891 non-null int64 \n", " 8 Ticket 891 non-null object \n", " 9 Fare 891 non-null float64\n", " 10 Cabin 204 non-null object \n", " 11 Embarked 889 non-null object \n", "dtypes: float64(2), int64(5), object(5)\n", "memory usage: 83.7+ KB\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "id": "uloijNr_42hV" }, "source": [ "আপনারা দেখেছেন কি? 
The Age, Cabin, and Embarked variables have missing data. " ] }, { "cell_type": "code", "metadata": { "id": "JiBBXo3q42hW", "outputId": "c0b70b07-64cf-432f-944e-58d189bee478", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "test.info()" ], "execution_count": 6, "outputs": [ { "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "RangeIndex: 418 entries, 0 to 417\n", "Data columns (total 11 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 PassengerId 418 non-null int64 \n", " 1 Pclass 418 non-null int64 \n", " 2 Name 418 non-null object \n", " 3 Sex 418 non-null object \n", " 4 Age 332 non-null float64\n", " 5 SibSp 418 non-null int64 \n", " 6 Parch 418 non-null int64 \n", " 7 Ticket 418 non-null object \n", " 8 Fare 417 non-null float64\n", " 9 Cabin 91 non-null object \n", " 10 Embarked 418 non-null object \n", "dtypes: float64(2), int64(4), object(5)\n", "memory usage: 36.0+ KB\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "_cell_guid": "abc3c4fc-6419-405f-927a-4214d2c73eec", "_uuid": "622d4d4b2ba8f77cc537af97fc343d4cd6de26b2", "id": "0ahYJYUW42hW" }, "source": [ "We can see that the Age variable is missing many values, which is a real problem: only 714 of the 891 rows have a value. Cabin is worse, with values in only 204 rows. To find out how many values are missing (NaN) in each variable of a dataframe, we call the isnull() method and then add up the results with the sum() method. " ] }, { "cell_type": "code", "metadata": { "_cell_guid": "0663e2bb-dc27-4187-94b1-ff4ff78b68bc", "_uuid": "3bf74de7f2483d622e41608f6017f2945639e4df", "id": "lPwBqQWe42hW", "outputId": "5b8dbbe3-9001-4428-a95e-6d8bc3d2bca7", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "train.isnull().sum()" ], "execution_count": 7, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "PassengerId 0\n", "Survived 0\n", "Pclass 0\n", "Name 0\n", "Sex 0\n", "Age 177\n", "SibSp 0\n", "Parch 0\n", 
"Ticket 0\n", "Fare 0\n", "Cabin 687\n", "Embarked 2\n", "dtype: int64" ] }, "metadata": { "tags": [] }, "execution_count": 7 } ] }, { "cell_type": "code", "metadata": { "id": "GcZjfzBJ42hW", "outputId": "30e6dd42-ea36-430a-fce6-9398f4e97e59", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "test.isnull().sum()" ], "execution_count": 8, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "PassengerId 0\n", "Pclass 0\n", "Name 0\n", "Sex 0\n", "Age 86\n", "SibSp 0\n", "Parch 0\n", "Ticket 0\n", "Fare 1\n", "Cabin 327\n", "Embarked 0\n", "dtype: int64" ] }, "metadata": { "tags": [] }, "execution_count": 8 } ] }, { "cell_type": "markdown", "metadata": { "_cell_guid": "176aa52d-fde8-42e6-a3ee-db31f8b0ca49", "_uuid": "b48a9feff6004d783960aa1b32fdfde902d87e21", "id": "RNzaH11q42hX" }, "source": [ "টেস্ট ডাটাফ্রেমে একই কাহিনী। মানে ডাটা মিসিং। এখানে ১৭৭টা সারি মিসিং বয়স ভ্যারিয়েবলে। কেবিনের সাথে কানেক্ট করা যাচ্ছে না ৬৮৭টা ভ্যালু। কোথা থেকে কে উঠেছে সেটাতে নেই ২টা ভ্যালু। আগেই দেখেছেন কেন সমস্যা করে মিসিং ভ্যালু? জানতে হলে আপনাকে পড়তে হবে মেশিন লার্নিং প্রেডিকশন চ্যাপ্টারটা। " ] }, { "cell_type": "markdown", "metadata": { "_cell_guid": "c8553d48-c5e0-4947-bd13-1b38509c850c", "_uuid": "1a28e607e9ed63cefe0f35a4e4d72f2f36299323", "id": "DATgDgWX42hX" }, "source": [ "## ৪. 
Data visualization\n", "\n", "It is true that Python's data visualization libraries keep getting better. matplotlib.pyplot gives us a Matlab-like plotting framework. The %matplotlib inline mode makes plots appear inside the Jupyter notebook itself. seaborn is a statistical graphics library for Python built on matplotlib, and its plots look good. " ] }, { "cell_type": "code", "metadata": { "_cell_guid": "b1d8a6d2-c22d-435c-8c98-973e8f41b138", "_uuid": "26411c710f69b29939c815d5f5ab01d9177df7d0", "collapsed": true, "id": "Z8noMor042hX" }, "source": [ "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "import seaborn as sns\n", "sns.set()" ], "execution_count": 9, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "FZLK41dM42hX" }, "source": [ "### 4.1 Bar charts for the categorical features \n", "- Pclass\n", "- Sex\n", "- SibSp\n", "- Parch\n", "- Embarked\n", "- Cabin\n", "\n", "Here we use charts to look for relationships between survival and the other variables. We define a bar chart function in Python that takes a variable name as its parameter, so we can pass in one variable at a time. For each variable it draws two stacked bars, labelled ['Survived','Dead'], whose segments are the counts of each value of that variable. " ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "q9w6KcMA42hX" }, "source": [ "def bar_chart(feature):\n", "    \"\"\"Draw stacked bars of survivor and non-survivor counts for each value of feature.\"\"\"\n", "    # Count each value of the feature separately for survivors and non-survivors\n", "    survived = train[train['Survived'] == 1][feature].value_counts()\n", "    dead = train[train['Survived'] == 0][feature].value_counts()\n", "    df = pd.DataFrame([survived, dead])\n", "    df.index = ['Survived', 'Dead']\n", "    df.plot(kind='bar', stacked=True, figsize=(10, 5))" ], "execution_count": 10, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "fImr6S7_42hY" }, "source": [ "Now let's pass our variables to the function one at a time, starting with 'Sex'." ] }, { "cell_type": "code", "metadata": { "id": "NNoyNgYa42hY", "outputId": "79c870f5-0268-4291-f762-96635ae5d888", "colab": { 
"base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "bar_chart('Sex')" ], "execution_count": 11, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "To0D_uBn42hY" }, "source": [ "এই চার্ট বলছে পুরুষ থেকে বেশি বেঁচেছেন মহিলারা। পরেরটা 'Pclass'" ] }, { "cell_type": "code", "metadata": { "id": "zVxATJCv42hY", "outputId": "aa3a3a93-774c-4c33-8f45-deffb0bedf82", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "bar_chart('Pclass')" ], "execution_count": 12, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "SbYhZECf42hY" }, "source": [ "এখানে দেখা গেলো প্রথম শ্রেণীর যাত্রীরা বেঁচেছেন বেশি। এদিকে তৃতীয় শ্রেনীর যাত্রীরাও মারা গিয়েছেন অন্য যেকোন শ্রেণী থেকে। " ] }, { "cell_type": "code", "metadata": { "id": "eHtEj2sB42hY", "outputId": "ab3ad5e2-e91a-4985-f26e-8a03ad481812", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "bar_chart('SibSp')" ], "execution_count": 13, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "13r7KhNr42hZ" }, "source": [ "এখানের ব্যাপারটা একটু ভাবিয়েছে আমাদের। ওই ফ্যামিলিগুলোতে যারা দুই জনের বেশি ছিলেন তারা বেঁচেছিলেন বেশি। যারা শুধুমাত্র নিজেরা মানে একা ছিলেন তারা বেঁচেছেন কম। " ] }, { "cell_type": "code", "metadata": { "id": "JQKDMNyy42hZ", "outputId": "aed78953-5362-4555-dc78-b31c4fd5d7cf", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "bar_chart('Parch')" ], "execution_count": 14, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "2owLxVWR42hZ" }, "source": [ "যারা বাবা মা অথবা বাচ্চাদের নিয়ে ছিলেন টাইটানিকে - তারা বেঁচেছেন বেশি। যারা একা ছিলেন তারা অতোটা বাঁচতে পারেননি। " ] }, { "cell_type": "code", "metadata": { "scrolled": true, "id": "7oMmBktK42hZ", "outputId": "ecda4e78-5538-43ff-eee6-ec211d15ceda", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "bar_chart('Embarked')" ], "execution_count": 15, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "MJPzINif42hZ" }, "source": [ "এই চার্ট কি বলে?\n", "\n", "যারা Cherbourg থেকে উঠেছিলেন তারা অন্য জায়গা থেকে ওঠা মানুষদের থেকে বেঁচেছেন বেশি। Queenstown আর Southampton থেকে ওঠা মানুষগুলো বাঁচেননি বেশি। এর মানে হচ্ছে Cherbourg এলাকার মানুষ অবস্থাপন্ন। " ] }, { "cell_type": "markdown", "metadata": { "_cell_guid": "810cd964-24eb-44fb-9e7b-18bbddd4900f", "_uuid": "fd86ccdf2d1248b79c68365444e96e46a50f3f5a", "id": "DRqXwBkJ42ha" }, "source": [ "## ৫. ফিচার ইঞ্জিনিয়ারিং\n", "\n", "এটা নিয়ে একটা বিশাল বড় চ্যাপ্টার লিখেছি আগে। মেশিন লার্নিং প্রেডিকশন নিয়ে। 'ফিচার ইঞ্জিনিয়ারিং' হচ্ছে একটা ডোমেইন নলেজ নিয়ে কিছু ফিচার তৈরী করা যাতে আমাদের মেশিন লার্নিং অ্যালগরিদম চমৎকারভাবে কাজ করে। আমি অনুরোধ করবো সেই চ্যাপ্টারটা আবার দেখে নিতে। কারণ এটা একটা খুবই দরকারি জিনিস। \n", "\n", "আমরা জানি মেশিন তো আমার আপনার মতো ফিচার চেনে না। তার জন্য সেটাকে সংখ্যায় দিলে ভালো কাজ করে। সেরকম কিছু করবো আমরা এখানে। শুরুতেই আমরা ভালোভাবে দেখি আমাদের ডাটাসেটগুলো। " ] }, { "cell_type": "markdown", "metadata": { "id": "unFUR9PG42ha" }, "source": [ "### ৫.১ আচ্ছা, টাইটানিকের কিভাবে ডুবেছিল?\n", "আমরা এখানে রিসার্চ করতে গিয়ে দেখলাম টাইটানিকের পেছনের দিক থেকে ডোবা শুরু করেছিল। আর ওখানেই শুরু হয়েছিল ৩য় শ্রেণী। তারমানে Pclas কিন্তু একটা বড় ক্লাসিফায়ার। একটু পেছনে গেলে দেখবেন কিভাবে নাম এখানে বড় একটা রোল প্লে করেছিল। " ] }, { "cell_type": "markdown", "metadata": { "id": "Oo0WotxR42ha" }, "source": [ "### ৫.২ নাম \n", "প্রথমেই যোগ করে নেই টেস্ট আর ট্রেনিং ডাটাসেট। 'Title' ডাটাসেট তৈরি করি নাম থেকে। " ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "UM4OFkcY42ha" }, "source": [ "train_test_data = [train, test] \n", "\n", "for dataset in train_test_data:\n", " dataset['Title'] = dataset['Name'].str.extract(' ([A-Za-z]+)\\.', expand=False)" ], "execution_count": 16, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "Tj9ijgZU42ha" }, "source": [ "ট্রেইন আর টেস্ট ডাটাসেটে টাইটেলগুলোর 
let's count how often each title appears. We did the same thing earlier in R. " ] }, { "cell_type": "code", "metadata": { "id": "6HNmxJ_A42ha", "outputId": "230a3736-bf22-454f-eab6-d72fa9b62676", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "train['Title'].value_counts()" ], "execution_count": 17, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "Mr 517\n", "Miss 182\n", "Mrs 125\n", "Master 40\n", "Dr 7\n", "Rev 6\n", "Mlle 2\n", "Col 2\n", "Major 2\n", "Mme 1\n", "Jonkheer 1\n", "Don 1\n", "Ms 1\n", "Countess 1\n", "Lady 1\n", "Capt 1\n", "Sir 1\n", "Name: Title, dtype: int64" ] }, "metadata": { "tags": [] }, "execution_count": 17 } ] }, { "cell_type": "code", "metadata": { "id": "gPxwBMG542hb", "outputId": "eeefccbe-28ad-4d9b-9c2a-6e90bd182353", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "test['Title'].value_counts()" ], "execution_count": 18, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "Mr 240\n", "Miss 78\n", "Mrs 72\n", "Master 21\n", "Rev 2\n", "Col 2\n", "Dona 1\n", "Ms 1\n", "Dr 1\n", "Name: Title, dtype: int64" ] }, "metadata": { "tags": [] }, "execution_count": 18 } ] }, { "cell_type": "markdown", "metadata": { "id": "4Q1p7VTq42hb" }, "source": [ "#### Mapping titles to numbers \n", "\n", "As before, we map the remaining, less common titles to 3. \n", "\n", "Mr: 0  \n", "Miss: 1  \n", "Mrs: 2  \n", "Others: 3\n" ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "-9k_pa1S42hb" }, "source": [ "# Common titles get their own code; every other title is grouped under 3\n", "title_mapping = {\"Mr\": 0, \"Miss\": 1, \"Mrs\": 2,\n", "                 \"Master\": 3, \"Dr\": 3, \"Rev\": 3, \"Col\": 3, \"Major\": 3, \"Mlle\": 3, \"Countess\": 3,\n", "                 \"Ms\": 3, \"Lady\": 3, \"Jonkheer\": 3, \"Don\": 3, \"Dona\": 3, \"Mme\": 3, \"Capt\": 3, \"Sir\": 3}\n", "for dataset in train_test_data:\n", "    dataset['Title'] = dataset['Title'].map(title_mapping)" ], "execution_count": 19, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "npE8ya1842hb" 
}, "source": [ "ভালো করে দেখুন Title ভ্যারিয়েবলগুলো। নতুন ম্যাপিং হয়ে গেছে আমাদের দরকার মতো। " ] }, { "cell_type": "code", "metadata": { "id": "51y-asGH42hb", "outputId": "52698f57-c048-432b-80e6-16feed2cdcd7", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "train.head()" ], "execution_count": 20, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PassengerIdSurvivedPclassNameSexAgeSibSpParchTicketFareCabinEmbarkedTitle
0103Braund, Mr. Owen Harrismale22.010A/5 211717.2500NaNS0
1211Cumings, Mrs. John Bradley (Florence Briggs Th...female38.010PC 1759971.2833C85C2
2313Heikkinen, Miss. Lainafemale26.000STON/O2. 31012827.9250NaNS1
3411Futrelle, Mrs. Jacques Heath (Lily May Peel)female35.01011380353.1000C123S2
4503Allen, Mr. William Henrymale35.0003734508.0500NaNS0
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass ... Cabin Embarked Title\n", "0 1 0 3 ... NaN S 0\n", "1 2 1 1 ... C85 C 2\n", "2 3 1 3 ... NaN S 1\n", "3 4 1 1 ... C123 S 2\n", "4 5 0 3 ... NaN S 0\n", "\n", "[5 rows x 13 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 20 } ] }, { "cell_type": "code", "metadata": { "id": "AyDqHhNg42hc", "outputId": "630d3599-4fcf-4c75-f968-8b7be199fba5", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "test.head()" ], "execution_count": 21, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PassengerIdPclassNameSexAgeSibSpParchTicketFareCabinEmbarkedTitle
08923Kelly, Mr. Jamesmale34.5003309117.8292NaNQ0
18933Wilkes, Mrs. James (Ellen Needs)female47.0103632727.0000NaNS2
28942Myles, Mr. Thomas Francismale62.0002402769.6875NaNQ0
38953Wirz, Mr. Albertmale27.0003151548.6625NaNS0
48963Hirvonen, Mrs. Alexander (Helga E Lindqvist)female22.011310129812.2875NaNS2
\n", "
" ], "text/plain": [ " PassengerId Pclass ... Embarked Title\n", "0 892 3 ... Q 0\n", "1 893 3 ... S 2\n", "2 894 2 ... Q 0\n", "3 895 3 ... S 0\n", "4 896 3 ... S 2\n", "\n", "[5 rows x 12 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 21 } ] }, { "cell_type": "code", "metadata": { "id": "FS7YPz6n42hc", "outputId": "d74b0161-a4db-400a-aa04-d8b17a44f474", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "bar_chart('Title')" ], "execution_count": 22, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "7iEfSQ0Z42hc" }, "source": [ "টাইটেল বের করার পর নাম দরকার আছে কি? ফেলে দেই 'Name' ভ্যারিয়েবল। " ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "299Q6ViA42hc" }, "source": [ "# delete unnecessary feature from dataset\n", "train.drop('Name', axis=1, inplace=True)\n", "test.drop('Name', axis=1, inplace=True)" ], "execution_count": 23, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "0x164imt42hc", "outputId": "6a8eac41-7382-4802-cf77-2c5f42685527", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "train.head()" ], "execution_count": 24, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PassengerIdSurvivedPclassSexAgeSibSpParchTicketFareCabinEmbarkedTitle
0103male22.010A/5 211717.2500NaNS0
1211female38.010PC 1759971.2833C85C2
2313female26.000STON/O2. 31012827.9250NaNS1
3411female35.01011380353.1000C123S2
4503male35.0003734508.0500NaNS0
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass Sex ... Fare Cabin Embarked Title\n", "0 1 0 3 male ... 7.2500 NaN S 0\n", "1 2 1 1 female ... 71.2833 C85 C 2\n", "2 3 1 3 female ... 7.9250 NaN S 1\n", "3 4 1 1 female ... 53.1000 C123 S 2\n", "4 5 0 3 male ... 8.0500 NaN S 0\n", "\n", "[5 rows x 12 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 24 } ] }, { "cell_type": "markdown", "metadata": { "id": "tK6hRwaR42hd" }, "source": [ "নাম কিন্তু নেই আর!" ] }, { "cell_type": "code", "metadata": { "id": "tRxQ_STT42hd", "outputId": "2e308504-1caf-4858-f6e9-83cec1e7016a", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "test.head()" ], "execution_count": 25, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n",
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>PassengerId</th><th>Pclass</th><th>Sex</th><th>Age</th><th>SibSp</th><th>Parch</th><th>Ticket</th><th>Fare</th><th>Cabin</th><th>Embarked</th><th>Title</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>892</td><td>3</td><td>male</td><td>34.5</td><td>0</td><td>0</td><td>330911</td><td>7.8292</td><td>NaN</td><td>Q</td><td>0</td></tr>\n",
"    <tr><th>1</th><td>893</td><td>3</td><td>female</td><td>47.0</td><td>1</td><td>0</td><td>363272</td><td>7.0000</td><td>NaN</td><td>S</td><td>2</td></tr>\n",
"    <tr><th>2</th><td>894</td><td>2</td><td>male</td><td>62.0</td><td>0</td><td>0</td><td>240276</td><td>9.6875</td><td>NaN</td><td>Q</td><td>0</td></tr>\n",
"    <tr><th>3</th><td>895</td><td>3</td><td>male</td><td>27.0</td><td>0</td><td>0</td><td>315154</td><td>8.6625</td><td>NaN</td><td>S</td><td>0</td></tr>\n",
"    <tr><th>4</th><td>896</td><td>3</td><td>female</td><td>22.0</td><td>1</td><td>1</td><td>3101298</td><td>12.2875</td><td>NaN</td><td>S</td><td>2</td></tr>\n",
"  </tbody>\n",
"</table>" ], "text/plain": [ " PassengerId Pclass Sex Age ... Fare Cabin Embarked Title\n", "0 892 3 male 34.5 ... 7.8292 NaN Q 0\n", "1 893 3 female 47.0 ... 7.0000 NaN S 2\n", "2 894 2 male 62.0 ... 9.6875 NaN Q 0\n", "3 895 3 male 27.0 ... 8.6625 NaN S 0\n", "4 896 3 female 22.0 ... 12.2875 NaN S 2\n", "\n", "[5 rows x 11 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 25 } ] }, { "cell_type": "markdown", "metadata": { "id": "JjDRMGBN42hd" }, "source": [ "### 5.3 Mapping male and female to numbers\n", "\n", "- male: 0\n", "- female: 1" ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "juiRAjiT42hd" }, "source": [ "sex_mapping = {\"male\": 0, \"female\": 1}\n", "for dataset in train_test_data:\n", "    dataset['Sex'] = dataset['Sex'].map(sex_mapping)" ], "execution_count": 26, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "gbAbxBMR42hd", "outputId": "a58c727e-0c8f-44e6-c036-952822117f6a", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "bar_chart('Sex')" ], "execution_count": 27, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "GfgrKLXE42hd" }, "source": [ "The labels male and female are gone now! They are represented by 0 and 1 instead." ] }, { "cell_type": "markdown", "metadata": { "id": "JspZfJQf42he" }, "source": [ "### 5.4 Age" ] }, { "cell_type": "markdown", "metadata": { "id": "BQVHHlVz42he" }, "source": [ "#### 5.4.1 Our dataset is missing a lot of age values\n", "Here is one way to deal with that.\n", "\n", "We fill each missing value with the \"average\" age for the passenger's title, specifically the median, so that our Random Forest ensemble classifier can work properly. The title groups are \"Mr\": 0, \"Miss\": 1, \"Mrs\": 2 and Others: 3, and each missing age gets the median age of its group." ] }, { "cell_type": "code", "metadata": { "id": "Nrm52MBf42he", "outputId": "6a5c6a96-8a54-40cd-9712-fd09b2115223", "colab": { "base_uri": "https://localhost:8080/", "height": 204 } }, "source": [ "train.head()" ], "execution_count": 28, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n",
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>PassengerId</th><th>Survived</th><th>Pclass</th><th>Sex</th><th>Age</th><th>SibSp</th><th>Parch</th><th>Ticket</th><th>Fare</th><th>Cabin</th><th>Embarked</th><th>Title</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>0</td><td>3</td><td>0</td><td>22.0</td><td>1</td><td>0</td><td>A/5 21171</td><td>7.2500</td><td>NaN</td><td>S</td><td>0</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>1</td><td>1</td><td>1</td><td>38.0</td><td>1</td><td>0</td><td>PC 17599</td><td>71.2833</td><td>C85</td><td>C</td><td>2</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>1</td><td>3</td><td>1</td><td>26.0</td><td>0</td><td>0</td><td>STON/O2. 3101282</td><td>7.9250</td><td>NaN</td><td>S</td><td>1</td></tr>\n",
"    <tr><th>3</th><td>4</td><td>1</td><td>1</td><td>1</td><td>35.0</td><td>1</td><td>0</td><td>113803</td><td>53.1000</td><td>C123</td><td>S</td><td>2</td></tr>\n",
"    <tr><th>4</th><td>5</td><td>0</td><td>3</td><td>0</td><td>35.0</td><td>0</td><td>0</td><td>373450</td><td>8.0500</td><td>NaN</td><td>S</td><td>0</td></tr>\n",
"  </tbody>\n",
"</table>" ], "text/plain": [ " PassengerId Survived Pclass Sex ... Fare Cabin Embarked Title\n", "0 1 0 3 0 ... 7.2500 NaN S 0\n", "1 2 1 1 1 ... 71.2833 C85 C 2\n", "2 3 1 3 1 ... 7.9250 NaN S 1\n", "3 4 1 1 1 ... 53.1000 C123 S 2\n", "4 5 0 3 0 ... 8.0500 NaN S 0\n", "\n", "[5 rows x 12 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 28 } ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "dWPSng0842he" }, "source": [ "# fill missing age with median age for each title (Mr, Mrs, Miss, Others)\n", "# assign the result back: fillna(..., inplace=True) on a selected column may not update the DataFrame in newer pandas\n", "train[\"Age\"] = train[\"Age\"].fillna(train.groupby(\"Title\")[\"Age\"].transform(\"median\"))\n", "test[\"Age\"] = test[\"Age\"].fillna(test.groupby(\"Title\")[\"Age\"].transform(\"median\"))" ], "execution_count": 29, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "fm94tB2642he", "outputId": "9ff5d516-f6f8-4fd0-cb4e-7b7a0c260da2", "colab": { "base_uri": "https://localhost:8080/", "height": 204 } }, "source": [ "train.groupby(\"Title\")[\"Age\"].transform(\"median\")\n", "train.head()" ], "execution_count": 30, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n",
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>PassengerId</th><th>Survived</th><th>Pclass</th><th>Sex</th><th>Age</th><th>SibSp</th><th>Parch</th><th>Ticket</th><th>Fare</th><th>Cabin</th><th>Embarked</th><th>Title</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>0</td><td>3</td><td>0</td><td>22.0</td><td>1</td><td>0</td><td>A/5 21171</td><td>7.2500</td><td>NaN</td><td>S</td><td>0</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>1</td><td>1</td><td>1</td><td>38.0</td><td>1</td><td>0</td><td>PC 17599</td><td>71.2833</td><td>C85</td><td>C</td><td>2</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>1</td><td>3</td><td>1</td><td>26.0</td><td>0</td><td>0</td><td>STON/O2. 3101282</td><td>7.9250</td><td>NaN</td><td>S</td><td>1</td></tr>\n",
"    <tr><th>3</th><td>4</td><td>1</td><td>1</td><td>1</td><td>35.0</td><td>1</td><td>0</td><td>113803</td><td>53.1000</td><td>C123</td><td>S</td><td>2</td></tr>\n",
"    <tr><th>4</th><td>5</td><td>0</td><td>3</td><td>0</td><td>35.0</td><td>0</td><td>0</td><td>373450</td><td>8.0500</td><td>NaN</td><td>S</td><td>0</td></tr>\n",
"  </tbody>\n",
"</table>" ], "text/plain": [ " PassengerId Survived Pclass Sex ... Fare Cabin Embarked Title\n", "0 1 0 3 0 ... 7.2500 NaN S 0\n", "1 2 1 1 1 ... 71.2833 C85 C 2\n", "2 3 1 3 1 ... 7.9250 NaN S 1\n", "3 4 1 1 1 ... 53.1000 C123 S 2\n", "4 5 0 3 0 ... 8.0500 NaN S 0\n", "\n", "[5 rows x 12 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 30 } ] }, { "cell_type": "markdown", "metadata": { "id": "oKmnDtnb42he" }, "source": [ "Notice that no age is missing any more. Each gap has been filled with the median age for the passenger's title.\n", "\n", "Let us draw a chart here. In our plot, the passengers who died are concentrated between roughly ages 16 and 34, with the peak around age 30. Passengers younger or older than that range survived more often. We will draw several charts, because age is an important feature for our classifier. First we set no limit, so the x-axis runs from the start to the end of the age range: `facet.set(xlim=(0, train['Age'].max()))`" ] }, { "cell_type": "code", "metadata": { "id": "BTGSdWJk42hf", "outputId": "3b95e40c-c8d0-4a76-f2e9-c12968a1e866", "colab": { "base_uri": "https://localhost:8080/", "height": 203 } }, "source": [ "facet = sns.FacetGrid(train, hue=\"Survived\",aspect=4)\n", "facet.map(sns.kdeplot,'Age',shade= True)\n", "facet.set(xlim=(0, train['Age'].max()))\n", "facet.add_legend()\n", " \n", "plt.show() " ], "execution_count": 31, "outputs": [ { "output_type": "display_data", "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAA5YAAADMCAYAAAAI9jxyAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nOzdeZhU9Z33/fc5p/au6q6uXqs3Ghpomh3EBQE1gkgUxZigxhgT88RMZjK3c2Um84TJPNEkmsyQO8s948SZ5E6iITGJokZChyAhKLIYEBcW2ZuGBnrfa1/OOc8f1TYQFBpoqOru7+u6+uruqlOnv+VP6tSnfptimqaJEEIIIYQQQghxkdR0FyCEEEIIIYQQYmiTYCmEEEIIIYQQ4pJIsBRCCCGEEEIIcUkkWAohhBBCCCGEuCQSLIUQQgghhBBCXBIJlkIIIYQQQgghLokl3QUMVEdHEMOQnVEyRW6ui66ucLrLEH2kPTKPtElmkfbIPNImmUfaJLNIe2SeggJPukvIaNJjKS6KxaKluwRxGmmPzCNtklmkPTKPtEnmkTbJLNIeYqiRYCmEEEIIIYQQ4pJIsBRCCCGEEEIIcUkkWAohhBBCCCGEuCQSLIUQQgghhBBCXJIhsyqsEEKkW1I3aGgJcqSxh4RuoKCgqgqqAnabRnW5l8JcV7rLFEIIIYS44iRYCiHEOdSd7GFnXTsHG7o51hLA67ZTkp+F1aJimmCYJqYJsYTOC6/VYbNoTB7tY86MMkpzHTjt8jIrhBBCiOFP3vEIIcRfMQyTtw+2seYvx+gJxaku9zK1Ko9F11bgsH34y6ZpmrT3RDna3MuLGw5xojXAomsruGVWOTarLBsvhBBCiOFLgqUQQvSJJXQ27Wzkle3Hcdo1ZlUXMrY0B1VVBvR4RVEo8Dop8Drxel0caehk8+4m1u84wcduGMOcKcVoqkxtF0IIIcTwI8FSCCGA3Uc6WLF2P75sB4uuKae0wH3J5/RlO7hzzmga20NsePsEa7c18OCt1UwYlTsIFQshhBBCZA4JlkKIEa0nFOfXfzrI4ZM9zJ9ZxpiS7EH/GyX5Wdz7kbEcPtnDUy/v4bbrKrj1mgoUZWA9oUIIIYQQmU6CpRBiRDJNk007G3lh4xEmVebymUXV2CyXbx6koiiMK0utGrtqcz31TQE+d3sNdpl7KYQQQohhQCb7CCFGnHA0wZMv7eaVN4/z8RvHcOP00ssaKk+Xk2Xjk/PHEY0neeIXO2jtjlyRvyuEEEIIcTlJsBRCjCjHmgN84+k30VSFT84fR1Ea9p20WlQWXVNBzahcnvjFDg6f6LniNQghhBBCDCYJlkKIEcE0TV575yTf++07zJ5UzPyZZVi09L0EKorCzPEFLLqmgv94cSdHGnvTVosQQgghxKUa0Luq+vp67r33Xm699Vbuvfdejh49etYxuq7zzW9+kwULFnDLLbewcuXKs445cuQI06ZNY/ny5ZdcuBBCDFQ8ofN/V+9l7bYG7ps/jpoMWpV1TEk2t15dwf9ZuZNjzYF0lyOEEEIIcVEGFCwfe+wx7r//fl555RXuv/9+Hn300bOOWb16NQ0NDaxbt47nnnuOJ598khMnTvTfr+s6jz32GAsWLBi86oUQ4jx6QnGW//ptesNxPnXLePKyHeku6SxjS3NYcFUZP3juXY63BtNdjhBCCCHEBTtvsOzo6GDv3r0sXrwYgMWLF7N37146OzvPOG7NmjUsXboUVVXx+XwsWLCAtWvX9t//k5/8hJtuuonKysrBfQZCCPEhTrQFefwXb1KSn8Xt143Casnc0f/jy718ZGYp3/vtO5xsk3AphBBCiKHlvO+ympqaKCoqQtNSKyZqmkZhYSFNTU1nHVdSUtL/u9/vp7m5GYD9+/ezefNmPvvZzw5i6UII8eF21XWw/Nm3mTO5mDmT/UNiz8gJFbncOK2E7/32Xdp7ZLVYIYQQQgwdl30fy0Qiwde
//nX+7d/+rT+cXoy8PPcgViUGQ0GBJ90liNNIe5xSu/kIv1l3gAc+WsOo4uy01eH1XviKs9dPd2EoCk+9/B7/+3/Nw2GX7YYHi/wbyTzSJplH2iSzSHuIoeS871j8fj8tLS3ouo6maei6TmtrK36//6zjGhsbmTp1KnCqB7OtrY2Ghga+8IUvANDb24tpmgSDQR5//PEBF9rREcQwzAt5buIyKijw0NYmC41kCmmPFMM0WfnqYXYcaOO+m8eS47DQ3R1OSy1er+ui//akCi/Hm3v5t2e286WPTR4Sva2ZTv6NZB5pk8wjbZJZpD0yjwT9czvvUNi8vDxqamqora0FoLa2lpqaGnw+3xnHLVq0iJUrV2IYBp2dnaxfv55bb72VkpIStm3bxoYNG9iwYQOf+cxnuOeeey4oVAohxPkkkgY/XvUe79V3cv/8cXjd9nSXdNEUReGWWeW0dIb5/Zaj6S5HCCGEEOK8BrSSxTe+8Q1+9atfceutt/KrX/2Kb37zmwA8/PDD7N69G4AlS5ZQVlbGwoULueeee/jSl75EeXn55atcCCH6hKMJvv/cu/SG4iy9aSzOYTB81KKpLJk7mlffOclbB9rSXY4QQgghxDkppmkOifGlMhQ2s8jwjMwyktujszfK9597F39eFjfPKEVVM2PY6KUMhT1dU0eIl14/wlfvn0lZocw1v1gj+d9IppI2yTzSJplF2iPzyFDYc8vctfeFEOI8TrQGeWLFW1SXe5k/M3NC5WDy52Vx04xS/uPFXYSjyXSXI4QQQgjxgSRYCiGGpH1HO1n+m7eZO6WYa2qKhvUCN5MqfVQUuvnF2v0MkUEmQgghhBhhJFgKIYacN/Y08dTLe7hjdiUTK33nf8AwcNP0Uo42B9i8u+n8BwshhBBCXGESLIUQQ4ZpmtS+cZTnX63j3pvHUlE0cuY6WC0qi2eP4vkNh2nqCKW7HCGEEEKIM0iwFEIMCUnd4Jk/7mfLribuXzCO/Bxnuku64gq8TuZM8fPfL+8hkTTSXY4QQgghRD8JlkKIjBfq206kqTPMffPH4XHZ0l1S2kyryiPLaeX5Vw+luxQhhBBCiH4SLIUQGa2lM8zjv9hBtsvGXXNGY7dq6S4prRRFYeGscnbsb+Pdw+3pLkcIIYQQApBgKYTIYAcauvjOL99ixth8PpJBe1Smm9Nu4bbrRvHMmn0EwvF0lyOEEEIIIcFSCJF5TNPkz28d579e2s1Hr6tg2tj8dJeUccoL3UwYlcuKtQfSXYoQQgghhARLIURmiSV0frJ6L+t3nOD+BeOpLM5Od0kZa85kP8daAmzf15LuUoQQQggxwkmwFEJkjNauME/8YgfBcIL7F4wn12NPd0kZzWpR+ei1Ffxq3UG6g7F0lyOEEEKIEUyCpRAiI+w83M7jK3ZQMyqX266rwGqRl6eB8OdlMbUqj6fX7MM0zXSXI4QQQogRSt65CSHSKp7QefZPB3jmj/tZMmc0M8cXoCiySM+FmD2xiNauCJt3N6W7FCGEEEKMUJZ0FyCEGLmOtwb5n1V7yHHb+cyiahw2eUm6GJqWGhL7/IbDTBzlIy/Hke6ShBBCCDHCSI+lEOKKM0yTtdsb+O6v32bGuHzumD1KQuUlKsx1MXN8AU//UYbECiGEEOLKk2AphLiimjvDfPfZt9m6u4lP3TKeyaPzZOjrILmmpojO3hhb9zSnuxQhhBBCjDDSRSCEuCKSusEf3jjGn3Yc57qJRcwcV4CqSqAcTJqqcOvV5Ty34TCTRvvwumVVXSGEEEJcGdJjKYS47A4e7+bRn23jvaOdPLiwmlnVhRIqL5Min4upY/JY8coBGRIrhBBCiCtGeiyFEJdNZ2+UFzfWsae+k5tnlDK+3CvDXq+A6yYVseKVA+w40MbVEwrTXY4QQgghRgAJlkKIQReJJVnzl2O8+vZJpo3N43O31WC3aukua8SwaCqLrqngV+sOMKHCi8dlS3dJQgghhBjmZCisEGLQ6Ib
Ba++cZNmP3+BYS4AHb61m3tQSCZVpUJKfxYSKXJ7908F0lyKEEEKIEUB6LIUQlyypG7zxXjO/33wUj8vKx+aNodjnSndZI97cKX5+sXY/7x5uZ/rY/HSXI4QQQohhTIKlEOKiJXWDrXuaWb2lHo/LxsKryykvdKe7LNHHalG55epyVqzdz/jPX4fLIS/5QgghhLg85F2GEOKCxeI6m3Y1snZ7AzlZNhZeXSGBMkONKvJQWZzNcxsO8dBtNekuRwghhBDDlARLIcSA9Ybj/HnHCTa8fYKyQjcfvXYUpflZ6S5LnMeN00p4Zu1+9h3romZUbrrLEUIIIcQwNKBgWV9fz7Jly+ju7sbr9bJ8+XIqKyvPOEbXdZ544gk2bdqEoih84QtfYOnSpQC8+OKLPPPMM6iqimEYLF26lAcffHDQn4wQ4vJo6gixbvtxtu9robrCy33zx5GX7Uh3WWKA7DaNBVeV8fM1+3ji89fKYkpCCCGEGHQDCpaPPfYY999/P0uWLGHVqlU8+uijrFix4oxjVq9eTUNDA+vWraO7u5u77rqL2bNnU1ZWxq233srdd9+NoigEg0HuuOMOrrnmGiZMmHBZnpQQ4tKZpsm+Y12s3d5AfWMvU6vyeOi2GtxOa7pLExehqjSH/Q3dvLSxjk8uGJ/ucoQQQggxzJw3WHZ0dLB3716efvppABYvXszjjz9OZ2cnPp+v/7g1a9awdOlSVFXF5/OxYMEC1q5dy+c//3nc7lNzr6LRKIlEQjZJFyJDJZIG2/e18MdtDSSSOjPHFXDLVeVYLUN0dyLTRDESKMkYih5H0WOofd8VPdF3W993I4GiJ8BIovR9YSZRDD11Hkwwjf7zoiiYigaqduq7asGwOLF4PLgSCoZmx7Q4MGwedLsbw+oGNT09hh+ZWcozf9zPNROLqCrJSUsNQgghhBiezhssm5qaKCoqQtNSb4Q0TaOwsJCmpqYzgmVTUxMlJSX9v/v9fpqbm/t///Of/8wPfvADGhoa+Kd/+ieqq6svqNC8PFkYJNMUFHjSXYI4zaW2RyAcZ82Wemo311Poc7Lw2lGMK/em90MgQ4d4CCUeRklEUBIR6P85DPFo6nsigpKIpu5PRlESUUjGUJIxSMZSQc5iw9RsoNkwNSv0fZmqBVRLKhRqFlAsoKqgaZhWC6h2UFRQFEDp+07q5/eDpmGAqYNpoBgJSIRR2prw6AmUZAL0WOo5xEOQiIDVienwYDh9mJ5CDE8hprsAw12AmZWfqu0y8AKL547mmT8e4Mmv3ITVMrKGxMprVuaRNsk80iaZRdpDDCVXbPGe+fPnM3/+fBobG/nSl77EDTfcwJgxYwb8+I6OIIZhXsYKxYUoKPDQ1hZIdxmiz6W0R1t3hLXbG/jLey2MLc3hrrmjKcx1AtDTExnMMsHQUeMBtGgPWqwn9XM8hBoL9P0cRE2EUBIR1GQExUhianYMiwPTYsfUbBiaHaMvHJqqre+2LExnLqa77/b++6yYqjUtPYRut51gMHb2HaaBkoygJiJosQBarAetdx9qbDu2WA9aPEDS6SPhKSORU07cU0rCU4JpG5xFksrzXLgdFn76u9184qaqQTnnUCCvWZlH2iTzSJtkFmmPzCNB/9zOGyz9fj8tLS3ouo6maei6TmtrK36//6zjGhsbmTp1KnB2D+b7SkpKmDJlCq+99toFBUshxOBqbA9Ru/Uou450MHVMHp9dNAGP6xJ7ygwdLdKJJdKBJdyBFm7HEm7HEulEjfeiJsIYVheG1Y1hc2FYnH2h0Ukyq5B4zqjUsFFLavioqVpP6yEcJhQV05qFbs1Cd+Wffb+RxBLpwhJuw9pdj7NxB5ZwG7o9h5hvLHHfWGK5VRiOixvKqigK868qY8UrB7iquoDR/uxLfEJCCCGEEAMIlnl5edTU1FBbW8uSJUuora2lpqbmjGGwAIsWLWLlypUsXLiQ7u5u1q9fz7PPPgtAXV0dVVWpT8Y7OzvZtm0bCxc
uvAxPRwhxPseaA/x+Sz0HT/Qwc1w+n7+9BoftAgcvmCZauB1rsAlroBFrsAlLsBlLpBPdmoXhyEG3ZaPb3SSyS4jmT8CwuTGsztSwUvHhVAvJrAKSWQWnbjMNLOEOrIGTZB3finfvSgyLi2h+NdHCKcR8Yy9o+KzbaeWm6SX8tHYv33jomqE7f1YIIYQQGWNA7ya/8Y1vsGzZMp566imys7NZvnw5AA8//DCPPPIIU6ZMYcmSJezcubM/MH7pS1+ivLwcgOeee44tW7ZgsVgwTZMHHniAuXPnXqanJIT4ICfbQ7y4sY66kz1cPaGQhxfXYBvgHDs1HsTa04Ct+xi27npsvccxVRvJrHx0u5eEq5BI/gR0R25qvqIYXIraHzYjkAr2kU5sPcfIPvQHLOF2Yr5xRIumEC2YiGE7/5z0mlG5HDzRw6rN9SNqSKwQQgghLg/FNM0hMXFR5lhmFhn3n1nO1R7t3RF+t+kIu+o6uLqmkBljC87bQ6UkY9i66nB0HMDefgAt2k0yq5BkViGJrEIS7iJM6+DM+RuuPnSO5WWgJCLYeo6lgn/vceLe0YRLryFSOAk024c+LhhJsOKVA3z5nmnDfkisvGZlHmmTzCNtklmkPTKPzLE8N+laEGKYCkUTvLzpCG/saWH6uHw+f/tE7LYP76HUQm04W3biaHsPa+9Jku4iEp5SghVzSWYVyhDWDGZancTyJxDLnwB6HHtXPe5jr+Hd+zyRwimES64m7qs6qw1lSKwQQgghBosESyGGGcM02bSzkZc2HmFsWQ4P3TaBLMcHz7+zBFtwtuzE2fwOaqyXeO4YogWT6R2z8LJteSEuM81GLL+aWH41ajyEvfMg3r3Po5gmwYp5hEuvwbQ6+w8/NST2CJ+4aWwaCxdCCCHEUCbBUohh5EhjL7985QC6YfCxG8ZQ7HOddYwaD+I6+Sauk9tQE2FiuWMIlc0m4S6WXslhxrBlESmeQaRoOpZgE87W98iuW0u4eAahUfNIuv0oisItV5Xxi1cOMH1sAWPLLm61WSGEEEKMbBIshRgGwtEET6/Zx7uH25k31c+kSh/K6dt0mAb2joO4jm/F0XGQeO4YQuXXk3D7h992HuJsikLSU0LAU4IaD+Foe4/87U+R8PgJVC2E3CpumVXGj1e/x7c+dw1Ou1wahBBCCHFh5N2DEEPcniMd/OKVA5QXuPncR2vOmEepJCJknXiDrGObMC12ovkT6Jz6aUyLPY0Vi3QybFmES68h7L8KR8cBcnf/GsOejb3qFo7kufn1+oP8P7dPTHeZQgghhBhiJFgKMUSFo0l+++eD7Knv5GM3jaPAc2r1Ty3SRdaxjWSd3E48p4JA1S2pBXiEeJ+qES2YSDR/AvbOOnL2r+IBRWPVsYm8fSCPmdVF6a5QCCGEEEOIBEshhqC9Rzv5ae0+Rvs9fGbRBIoKPHR3h7EEmvDUr8fRtpdo3gS6Jn4Cwz68t5EQl0hRieWNI+Ybi63nGLfr2+l99R16lAfIHjfzzCHVQgghhBAfQoKlEEOIbhi8vKme13c28tFrKqjs23tQCbSQu/Ml7B0HiRRNpXPKAzLcVVwYRSHurSSeM4q2/TtxbfwFln1rcVx7D1rxuHRXJ4QQQogMJ8FSiCGiszfKf6/ag2HAgwuryXJa0SJdeA6vxdm2h3DhFLqmfApTs53/ZEJ8GEWhsHoaf3jLy430UPqn/0LNH4X9uvvQckvSXZ0QQgghMpTsLSDEEPDu4Xa++fSblOVn8Ykbx+DW4uTse4nCrf8bxdSJXvsZwiWzJFSKQaGqCtdOKuEPxzz0TPsUiief8O+/TXTLLzGjwXSXJ4QQQogMJMFSiAxmmCYvbazjF3/czx1zKrl2QgHuhi0Ubf4OWrSbzsn3ESq7Dk7b8F6IweBx2ZgxLp9VWxowymfiuOFzGMFOgs8vI7ZnPaahp7tEIYQQQmQQGQorRIaKxJL8ZPV7dPbGeGDheHJDR/FufQlTs9Az/g5
0V366SxTD3KjibNq6o6z9SwN3zh2NbcpCjFHTSex9leTeP2Of82kspbI1iRBCCCGkx1KIjNTaFeaJFTtQgE9e56N83y/x7X6WcPF0esbfKaFSXDEzxuXT2h1h5+E2ANTsQmzX3oOl6jqir/6EyPqnMMLdaa5SCCGEEOkmwVKIDLP3aCdPrHiLyZVe7iqox/+X72NYHHROvo+4rwpk+wdxBWmayuxJxby+s4nWrggAiqKg+cdjv+EhUFVCK/+V2J4/YRpGmqsVQgghRLpIsBQig7y+s5H/WfUeS6c7WNjxLFmNb9I94WOES68BVUaui/TIzrIxfVw+L2+qJ5Y4NbdSsdiwTrgR+3X3kjywmfDvvoHeeiSNlQohhBAjy6OPPsqPfvSjQT/vk08+yVe+8pULeoy8UxUiA5imyarN9Wzd2cAjY4+Sf2Q74dJriOZPlB5KkREqi7Pp6IlSu/Uod99Qdcb/lqqnANt196KffI/I2h9gGTsb+9WfQLHKXqpCCCFGph07dvC9732PQ4cOoWkaY8aM4Wtf+xpTp04d1L/zrW99a1DPdymkx1KINEvqBk+v2cfxfXv4ak4tOeETdE+8h2jBJAmVIqPMGJtPIJRg867Gs+5TFAVL2WTsNzyE0d1M6Pl/IXl8dxqqFEIIIdIrGAzyxS9+kQceeIDt27fz+uuv8/d///fYbBe2LZxpmhhDaJqJBEsh0igaT/KfK9+honkDn7auJeqfSWDsrRi2rHSXJsRZVE3l+inF7Krr4EDDBy/Yo9hc2KbfhnXSfKIbf0Zkw49l70shhBAjSn19PQCLFy9G0zQcDgdz585lwoQJZw0xPXHiBNXV1SSTSQA+/elP88Mf/pD77ruPadOm8dOf/pS77777jPM/88wzfPGLXwRg2bJl/PCHPwTgox/9KK+++mr/cclkkuuuu4733nsPgHfffZf77ruPWbNmceedd7Jt27b+Y48fP84DDzzAjBkzeOihh+jq6rrg5y3BUog0CUYS/PRX67k79FumudvpnngPsbxx6S5LiHNy2CzMneLnle0N/Yv5fBCtcAz2Gx7CNJKEnv8aibptmKZ5BSsVQggh0mP06NFomsZXv/pVNm7cSE9PzwU9ftWqVTz++OO8/fbbfPKTn6S+vp6jR4/237969WruuOOOsx53++23U1tb2//75s2byc3NZdKkSbS0tPA3f/M3/O3f/i3bt2/nq1/9Ko888gidnZ0AfOUrX2HSpEls27aNv/u7v+N3v/vdBT9vCZZCpEFXb4T1v/wZ9+kvYy2bRGDcbdJLKYaM3GwHM8cX8OLGOsLR5Icep1hs2CbejO2qJcS2v0Bk3X9ihC/s4iqEEEIMNW63m1//+tcoisLXv/51Zs+ezRe/+EXa29sH9PiPfexjjBs3DovFgsfjYf78+f2B8ejRoxw5coSbb775rMfdcccdbNiwgUgk9cHv6tWruf3224FUWL3hhhu48cYbUVWVOXPmMHnyZDZu3EhjYyO7d+/mH/7hH7DZbFx99dUfeP7zkWApxBXW1tTM8d88zlXWOoJTlhIrlLmUYuipKPJQUejhd6/XkdTP3ROp5pZgn/sgitVB6IV/JXFoq/ReCiGEGNaqqqr493//d15//XVWr15Na2sr3/nOdwb0WL/ff8bvd9xxB3/4wx8AqK2tZcGCBTidzrMeN2rUKKqqqnj11VeJRCJs2LChv2ezsbGRtWvXMmvWrP6vt956i7a2NlpbW8nOzsblcvWfq6Sk5IKfs6wKK8QV1LL7L5hbn0HzjCcxYQ4o8tmOGLqmjPGxdU8zq7cc5a65o8/5v7OiWbBOuAGteByxt14mcfgvOG54CDUr98oVLIQQQqRBVVUVd999N8899xwTJ04kGo323/dBvZjKX3U4XH/99XR2drJv3z5qa2v5l3/5lw/9W4sXL6a2thbDMBg7diyjRo0CUmF1yZIlPPHEE2c95uTJk/T29hIOh/vDZWNj41l1nI+8qxXiCjC
TcdrW/Yzk1hU0FN2Is2aehEox9CkK104sIhCO86cdDQykE1L1+rHP+TSKw0Poxa9L76UQQohhp66ujp///Oc0NzcD0NTURG1tLdOmTaOmpoY333yTxsZGAoEAP/7xj897PqvVyqJFi/jud79LT08Pc+bM+dBjb7vtNrZs2cJvfvMbFi9e3H/7nXfeyauvvsqmTZvQdZ1YLMa2bdtobm6mtLSUyZMn8+STTxKPx9mxY8cZiwANlLyzFeIy07tO0v38o5yoO8zRiiXkjxqb7pKEGDSapjJnSjENrUG27mke0GMUzYK1ei72WR8n9tbLRF75PxjhD15lVgghhBhq3G43O3fuZOnSpUyfPp177rmH8ePHs2zZMubMmcNtt93GnXfeyd13381HPvKRAZ3zjjvuYOvWrSxatAiL5cMHnRYWFjJ9+nTeeecdbrvttv7b/X4/Tz31FD/+8Y+ZPXs2N954Iz/72c/6tzP5/ve/z86dO7n22mv50Y9+xF133XXBz1sxB/BRcX19PcuWLaO7uxuv18vy5cuprKw84xhd13niiSfYtGkTiqLwhS98gaVLlwLwox/9iDVr1qCqKlarlS9/+cvMmzfvggrt6AhiGPKpdqYoKPDQ1hZIdxkZL35gE5Gtv2FzeAzZY2dSWui5LH/H7bYTDMYuy7nFxRlpbRKJJdnw9glmTypi+riCAT/O1JMkD71B8sQuHNc/gKXq2gseejMQ8pqVeaRNMo+0SWaR9sg8BQWX533ccDGgOZaPPfYY999/P0uWLGHVqlU8+uijrFix4oxjVq9eTUNDA+vWraO7u5u77rqL2bNnU1ZWxtSpU/nc5z6H0+lk//79PPDAA2zevBmHw3FZnpQQ6WYmYkQ3PUP05EFeDs5k/IQq/PnudJclxGXjtFu4cVoJG945idNupbrCO6DHpeZezkMrqiL25oskjmzHMe+zqM7sy1yxEEIIIQbTeYfCdnR0sHfv3v4xuosXL2bv3r39e568b82aNSxduhRVVfH5fCxYsIC1a9cCMG/evP6Vi6qrqzFNk+5uGfYkhie98wShlx4j2N3NL7tmUl0zVkKlGBHcLhvzpvhZ9+ZxDh2/sG1F+leO1ayEV/4ridJRw0EAACAASURBVCNvXqYqhRBCCHE5nDdYNjU1UVRUhKZpAGiaRmFhIU1NTWcdd/qytH6/v3/C6ulefvllKioqKC4uvtTahcg48f0bCa/+N3p8k/jliUqunlRGcZ7sTylGjtxsB/Om+lm7vYEDDRf2AaKiWbDW3IT1qiXE/vJbIn/6EUZUhoEJIYQQQ8EV3W5k+/bt/Md//Ac///nPL/ixeXnS45NpZJz5KUYiRvsff4x+fD/RaZ/g+c1tzJ9Vjr/gyoVKt9t+xf6WGJiR2iZut52PZtlZ+5ej2O0WplcXXtgJvOMwyysJ7HqV6Iv/H/mL/oasCddecl3ympV5pE0yj7RJZpH2EEPJeYOl3++npaUFXdfRNA1d12ltbT1r406/309jYyNTp04Fzu7BfOedd/jnf/5nnnrqKcaMGXPBhcriPZlFJpSfYnQ3EVn3JIrbR8uYJfzu9ePMnlSMx2m5You3jLSFYoaCkd4mNk3hxmkl1G6pJxCMMaUq78JPMmYumreS1ld+hvbOa9jnfhrVcXFvsuQ1K/NIm2QeaZPMIu2ReSTon9t5h8Lm5eVRU1NDbW0tALW1tdTU1ODz+c44btGiRaxcuRLDMOjs7GT9+vXceuutAOzatYsvf/nL/Od//ieTJk26DE9DiPRI1G0jtOoJtPKptBTfwO+2pkJlkc+V7tKESLsct52bppeycWcj7xxsu6hzaL4y7PM+A0D4eZl7KYQQQmSqAW03UldXx7Jly+jt7SU7O5vly5czZswYHn74YR555BGmTJmCrut861vfYsuWLQA8/PDD3HvvvQB8/OMf5+TJkxQVFfWf87vf/S7V1dUDLlR6LDPLSP8UzdQTxN74NcljO7HNvJPGWBYvvX4
kbaFypPeOZSJpk1OC4Tiv72pifHkOH5lRxsXuJqJ3niCx6xXU/FE45n3mglaOHemvWZlI2iTzSJtkFmmPzCM9luc2oGCZCSRYZpaR/GJnBNqIrPsvFJsD69SP0tid4MWNR7huUhHFvvQs1CMhJvNIm5wpHtfZvKeJ7Cwbd1xfidVy3gEzH8jUEyQPbiV5cs8F7Xs5kl+zMpW0SeaRNsks0h6ZR4LluV3RxXuEGOqSx94hsvFnWMdcizb6Khrbw6lQOTF9oVKIocBm07hxWgk7DrTy7J8O8ombqnA7rRd8HkWzYq25Ea14HLEdL5E4tDXVe+m+iDmcQgghxBX20OPraO+ODPp5871Onv76wgEdW19fz7Jly+ju7sbr9bJ8+XIqKysvuQYJlkIMgGnoxLa/QPLQVmwzl6D5ymhsD/Hi63VcW1MkW4oIMQCapnJtTRHvHe3il68c4O4bqijyOS/qXO/ve5ms20boxa9ju+pubJNuRlEuridUCCGEuBLauyN852/nDPp5v/bfWwZ87GOPPcb999/PkiVLWLVqFY8++igrVqy45BrkCizEeRihLsKr/x29+SD2eQ+i+cpo6uupvGZCIf58CZVCDJiiMGm0jylj8nhuwyHePtjGxU7IUFQN67jrsV/3SZL7NxJe9QR6V+Pg1iuEEEIMIx0dHezdu5fFixcDsHjxYvbu3UtnZ+cln1uCpRDnkDyxh/BLj6F6i7Fd/XEUm4vmjjAvbKxjVnUBJfmyv6oQF6OiyMP8q8p4+2AbL286QiyuX/S5VE8+ttmfRCscS/j33ya6/QXMZHwQqxVCCCGGh6amJoqKitA0DQBN0ygsLKSpqemSzy3BUogPYBo60e0vEN3wE6zTbsc67noURaG5I8zK11KhsrRAQqUQl8LjsjF/Zhmg8PM/7qOxPXTR51IUBUvlDBxzP4PRWkfo+a+RPL5r8IoVQgghxDnJHEsh/ooR6iLy5/8GPYF93oMo9tRQVwmVQgw+TVO5qrqAE61OXnitjhnj8pk9uRiLdnGfeypOD7aZd6K31hF9/Wm0gjHY5zwAspKfEEIIgd/vp6WlBV3X0TQNXddpbW3F7/df8rmlx1KI0yRP7CH84mOoOUXYrvmEhEohrpCyQje3Xl3OibYQP1+znxOtwUs6n1ZYhf2Gh8BqJ/TCv9L9l1WYenKQqhVCCCGGpry8PGpqaqitrQWgtraWmpoafD7fJZ9beiyFAEw9mVr19fBWrNNvQ8sf1X9fU/upOZUSKoW4fJwOK3Om+DnRGuTlzfWMK83hphml2G3aRZ1P0axYq+ehlU4ktP914m++guP6T2GpmDrIlQshhBADk+91XtAKrhdy3oH6xje+wbJly3jqqafIzs5m+fLlg1KDYpoXux7fldXREcQwhkSpI8Jw2rTX6Gkhsv4psNiwTVuEYnP139fUHmblxsNcXV2Y0aHS7bYTDMbSXYY4jbTJpYkndHbVddDUEWLuVD9TxuShqspFny8nx0nnwd0k9r2KmluK4/pPoeYUD2LF4kINp+vIcCFtklmkPTJPgUyrOCfpsRQjWuLgFqJv/Brr2OvRKmegKKfeuJ5oDfLSpiMZHyqFGI5sVo1ZEwrp7InwzqF2duxv5aYZpYwpyUG5iHypKApa0VjU/EqSR98m9LtvYa2+AfvMO/qHvAshhBDi4kmwFCOSGQ8T3bQCvbUO+7X3oGYXnnH/seYAqzbXc93EIorz5E2nEOniy3Fy84xSTraH+PNbJ9i2r4WPTC/Dn+86/4M/gKJZsFZdg6V0IolDWwn+9v/FNu12bJMXoFhsg1y9EEIIMXJIsBQjTvLkXqKv/V/U/NHY534aRbOecf+Rxh5qtx7j+snFFOZe3JtXIcQgUhRKC9yU5GVxpKmXl16vw5ftYPbkYkYVeS6uB9PhxjZlIcboq0ge2ETovT9hv/rjWMZej6LKunZCCCHEhZJgKUYMMxkntn0lybptWKfcilY45qxjDh7v4ZXtDcy
b6icvZ+CToIUQl5+iKlSV5jC62MOxlgDrtjdgs2rMnlTM+DIvykXkQdWdh+2qu9A7TxDf9QrxnWuwzfo4lsoZKBdzQiGEEGKEkmApRgS97SiRDf+DmuXDPu+zKLazQ+N79Z28+vZJ5k3148t2pKFKIcRAqJrK6JIcRvuzOdkeYsvuJja8fYLpY/OZWpVHltN6/pP8Fc1Xhjr7kxitdcTefIH4jpewzbpbAqYQQggxQBIsxbBm6glib/+exN4NWCfejKV04gce9+a+Vrbva+HG6SXkuO1XuEohxEXpGyJbWuCmsyfKkaZetu1rYVSRhxnjCy54mGz/Aj+FVRgth4ltl4AphBBCDJQESzFs6c2HiGz8GaozG8fcz6A4z14i2jTh9Z2N7DvWxfyZZbguoqdDCJF+vhwHvhwH06ryONYSYP2O48QTOjWjfEys9JFzAUPbFUVBKx6HWjQ2FTDffIHY9pXYp92GZdzss+ZlCyGEEAN17Mm/Qe9tH/Tzatn5jPpfPz7vccuXL+eVV17h5MmTrF69mvHjxw9aDRIsxbBjxiOpuZRH3sQ68WZUf/UZ24i8zzBMXtl+nMaOIPNnll30JuxCiMxhtWqMLfMytsxLTzBGQ0uAFzfWYX/jKNXlXsaXeynwOgfUk3lGwOw4Rnz/RmLbX8A6eQG2iTejOGQbIiGEEBdG723H/8A3B/28Tb96bEDHzZ8/nwcffJBPfepTg16DBEsxrCQb3iW66ReovnLsNzz0gXMpAZJJg99vPUookuCm6aVYLRIqhRhuctx2prjtTBmTRzhhcOBYJy9urANgbGkO48q8lBe50dRzp0xFUdDyK9HyKzF620jW7yC4649Yx87GOnE+mq/0SjwdIYQQ4pLNmjXrsp1bgqUYFoyeFqJbn8XoOol18kK0gsoPPTYcTfLixjpsVo25U/xomsybEmJYUxQKfS5cNo0ZY/PpCcU52R7i1XdO0BOKU5qfxRh/DqP8HvKyHefszVSzC7BN+yhmZC7Jhp1Eav8dJacI26T5WEbPkmGyQgghRiwJlmJIMxOx1OI8+17FUnUt9ikLUdQP731s747wwsY6Koo8TK70cVEb4Akhhi5FIcdtJ8dtZ2Klj1hcp6UzzLGWANv2NWOaUFHkpqLIQ3mhm1zPBwdNxenBWj0Xy7jZGC2Hie/+E9Gtz2IdNxfrhHloudKLKYQQYmSRYCmGJNM0SdZtI/aX36L6ynDc8FkUx9mL85zuaFOA32+tZ8bYfEYVZ1+hSoUQmcxu06go9lBR7AHTJBBJ0NoV4eDxbrbsbkI3oDQ/i7JCN6X5LopyXVgsp0Y5KKqG5q9G81djBDvRj+8iUrscxZmDZfwcrGOvQ3V50/gMhRBCiCtDgqUYUkzTRD/5HrFtz2PqCazTb0fzlZ33ce8eamPTrmbmTPJTkDvw1SGFECOIouBx2fC4bFSV5gAQjiRo7Ylwsi3E7iMd9ARj+DwOSgtcFPvcFPuc5OU4UFUF1e1DrbkJy4QbMNob0E/uJf7WKrTC0VjHzsYyaoYs+COEEGLYkmAphgy95TCxbc9jBDuwjp/7oau9ni6pG/zpzeM0tASZP7MUt8t2haoVQgwHLqeVSqeVyuLU77pu0BWI0dEbZV9DJ1v2xAhHExR4nRT7XBT7XBTmusjzjcJWUIk5aT56y2ESB7cQ3fosWl4FljFXY6mcierOS++TE0IIMeI88cQTrFu3jvb2dh566CG8Xi9/+MMfBuXcimma5qCc6TLr6AhiGEOi1BGhoMBDW1vgivwtvf0YsR0vYbQdxTJuNlrZFBT1/Avu9ATj/O71IzjtFq6eUIBlGK/86nbbCQZj6S5DnEbaJLNczvZIJHS6AjE6A1F6QnG6gnGC4Tg5bhtFXheFPheFuU4KPRbsgRPoLYfQW+pQPXlo5VOxlE1BKxqLoo2sz3qv5HVEDIy0SWaR9sg8BQXnnnY1EOnex/JyGtBVrL6+nmXLltHd3Y3X62X58uVUVla
ecYyu6zzxxBNs2rQJRVH4whe+wNKlSwHYvHkzP/jBDzh48CCf/vSn+epXvzroT0QML+8PeY2/+weMrpNYRs/COmnBgN941Tf2UvvGMSZUeKku98oiPUKIy8Zq1VLh0efqv03XDXqCcbqCMRrbQ+w72kVXMIpFUynIGU9h3hTKbb3kt7fgOLoCM9SJVjweS/lUtJJq1NxSFEVWrBZCiOEm3eHvchrQu/THHnuM+++/nyVLlrBq1SoeffRRVqxYccYxq1evpqGhgXXr1tHd3c1dd93F7NmzKSsro7y8nG9/+9usXbuWeDx+WZ6IGB5MQyd55E3i79ZiJmJYxlyNdfrt51zp9XSGYbJ1dzPv1rVz/aRimU8phEgLTVPx5Tjw5ThO3WiahKNJuoIxeoJx3u6x0x0qJhDykZ9lMr69l7KubeS8VYvFiKEVVWHxT0gtDpQ/CsUiQ/mFEEJkrvMGy46ODvbu3cvTTz8NwOLFi3n88cfp7OzE5/P1H7dmzRqWLl2Kqqr4fD4WLFjA2rVr+fznP8+oUaMAWL9+vQRL8YGMQDvxA6+T3P96ajXFqmtRC6vOO4fydF29MX6/9SiqArfMKsdpH1nDyoQQGU5RcDmtuJxWSgtO3WzoBr3hBN2hGO8GK+hNJIiFevAd7qDy5Dv4tQ249V70rAKsRVXYS8aiFYxO9WqOsOGzQgghMtd5r0hNTU0UFRWhaakeI03TKCwspKmp6Yxg2dTURElJSf/vfr+f5ubmy1CyGC5MPUHy2Dsk9r6G3n4UrbQG21V3oeYUXdh5TNh5uI3XdzYxqdLHuLIcGfoqhBgyVE3F67Hj9dhPu7WEeEKnJxTnvWCMQDCCGmzDvq+dooNHKbIEcBMk4chD85XhKhmNJa8cNbcMxe2TYbRCCCGuuCHzUWdenizRnmkuZgKzmUwQObqL4L43CB/agdVbhGfMNBw33I1isV7w+QKhOC+8eoju3hiL5475qzdmI4vbPXKfe6aSNsksQ7E9fLmu036rBNMkGE7Q3BulqztIorcd9UQ7WUffosi2GZ8SwEoCPAU4C8twFlZgzS/FmuvHmluM6nRf0EiQy20wFsIQg0vaJLNIe4ih5LzB0u/309LSgq7raJqGruu0trbi9/vPOq6xsZGpU6cCZ/dgXqorvSqsaZqQiGJGelNfiQhmMg7JeN/3GKaeBJS+zjEl1UumKKDZUKx2sNpRLKkv7C4UuxvFkYWiDpk8/6EuZKUyMxYieWIPySNvkjyxBzW7EK1oLNbZn0J15RADYsEEkBjw3zcNeOdwG5t3NVFVksNHppegKozYVThlBdLMI22SWYZbe/g8dnweO5AHVJNM6nSH4hwNxAkFg+i9HWgt3RTYdlJg20aOGsGpB1AUBS07HyW7EDW7CDW7ANWdh+LJR3Xnp65dV4iseJl5pE0yi7RH5pGgf27nTTh5eXnU1NRQW1vLkiVLqK2tpaam5oxhsACLFi1i5cqVLFy4kO7ubtavX8+zzz572Qq/VKZpYkZ6MHpbMXta0HtbMbubMQJtmJEezGgQFAXFnoViy0qFRM0KmqXvy3pqqFHfji0mZupnPQlGApJJTD0BehwzEcWMRyAeAYstdV6HB8WVg+Lyorq8KC4viisHNSsXJSsXxZk9JIczmdEgyeYD6I370Rv3YfS2ouZVoBVW4bjp8yj2rEs6f3NnmLXbGsCEj8woJWcI9kIIIcRgslg08nOc5Oc4gRygFNMwCUQSNAai7A3G6A7ECAWCeCMRykNJirqayLEcw0UELR7ADPeAxYaa5UNx+1Dd+aiefBR3Hqrbh+LOQ3HmDGi7JyGEECPPgPaxrKurY9myZfT29pKdnc3y5csZM2YMDz/8MI888ghTpkxB13W+9a1vsWXLFgAefvhh7r33XgB27NjBP/7jPxIMBjFNE4/Hw7e//W3mzZs34EIvpcfSNJIYXU0YHQ3o7ccw2o+id54ARUmFOJc3dbHMyk0FPYcbxea6LCvwpXp
CY6ke0HgYoiHMWBAzFsKMhzGjIcxYADMSgEQ0FS5d3tMu9D4UV27qZ1dfvWlYKbCgwENray9mqBOj8zh6x4nUf9/O45jBTlRfGWpuKWpeOarXP+BVXc8lEtPZvKuR/Q3dTB3jY7Q/W+ZS9hluvTHDgbRJZpH26GOahKJJugIxugJRuoIxOntjKIpCodfBqFwVv9sgzxbHYYQxogGIBDCjgdTonXg4dV3K8vX1dOalAqg7rz+AYnMNaLit9MZkHmmTzCLtkXmkx/LcBhQsM8GFBEszHkFvrSPZtB+j6QB6+zEUZw5qdmFquE92YepnR2bP2zT1ZCp0vn9RjwYwYyGIBjGjwf7b0Kyp59fX+6k4PChOT9/Q274vmxMs9lQI7f9uBd6/+JunviVjmPEIZiIK8UgqBEcCGKFOjEA7ZrADNdpDoqcNRbOmhlS581CyC1A9BSie/EEJku9LJAx2HGjlzf2tlBe6mTI6D5tt8M4/HMib5swjbZJZpD3O4bRtUDp7U2GzozeGpioU5Topyc/Cn+ei2JeF05oalWJGejGjqakiRIMY7wfPcDcoaip4evJQPaeuC+r714i+USvypjnzSJtkFmmPzCPB8tyGRbA09QR68yGSx3ejn9iD0duC6vWjektRfSWpJdmtjg987FDXPxc0Fuzr7QxBPJwaepuIpu6LR/vmhCZAT6TmiPb9fCpYnsbSN0f0/fmhFhuKzdkXUj0oDg/ZhYUEk6nbLxfdMNl5uJ2te5opyHEyebQPT5bs4/ZB5E1z5pE2ySzSHheor2ezMxCjszdCdzBOR28Uu1Wj2OdKhU2fiyKfC/tpH/T1j8qJ9KS+wr0YkV6I9mKEezBDXanRQu58HPl+Es581Jyi1Ae+OUWpkUNDcArIcCFBJrNIe2QeCZbnNmRXkTF620g2vEuyYRd680FUTz5qfiWWCTeg5hSPmL29FEUBmzMV8DwF53/AILF6XSjd4cty7kTCYHd9B9v3teB2WJk7xY8ve3h+MCCEEBlJUchyWslyWikv7BvdY5oEwgk6AlGaO8LsO9pFZyCK22XD73NSku+mOM9FkdeJJacIPmDrqP4PQ8Pd2IgQb2sheewdzHA3ZqgrNdTWnY+aU4zq9aN5/Sje4tTvzuwr/B9BCCHEhRgy6cs0TfT2YySOvoVe/xZGuButsAq1cDTWiR+5rD1n4soIRhK8daCNnYfbKfA6uXpCEQVeaVchhMgIioIny4Yny0Zlceom0zDpCaXmaTa0BHj3UBs9oThetx1/ngt/XhbFeS4KvE40VTnjw1Cn10XMO+aMP2Em46mQGezECHWRqH8TM9SFEWgHVesLnCVovlJUbwmqrxQly5dRW6gIIcRINWSCZeT33ybZ24FWNBZLzU2ouSUyXGYYME043hpk1+F26hp7qCjKZsFVZbhdMuRVCCEynaIqeD0OvJ5To0p03aA7GKMrEOPwyR62728lGI7j8zhSYTM/i6JcFx7P2R8cKhYbSnYhZBdy+kx60zQhHsYIdGAGO0i2HMY8sh0z0I6ZiKF6i1Fzy9DyylBzy1B9Zak1ByRwCiHEFTNk5lg2b9+AYcuSi0SG8HpddF/CUNjuYJw9RzrYXd+BRVWpLPYwujhbFuW5SDJ/LPNIm2QWaY/0Suo6XYE4Xb1RuoMxOgNxgtE4PreD4jxX6ivXRb7XiUW7sOu8GY9iBNsxA+0YwVTwNHrbwDRSw2nzKlDzytF85anAKSOcPpTM6css0h6ZR+ZYntuQ6bFU3T7MRDzdZYiLZJrQ2Rvl0IluDp3ooSsQo6LIw/UTi8n12GXbECGEGMYsmkaB13nG9Aa7w8rJ5l46AzEOn+jhzf2tBEJxctw2inNTq9AW5jopzHWesUDQX1NsDjRfGfjKzrjdjIVSK5n3tpFs2EnivT9j9LaiODypLbHyK9DyKtB8FSjZhbI/pxBCXKIhEyzF0JNIGJxsD3GkqZfDJ7qJJw1K87OoLvdS6HWianIRF0KIkcp
qUcn3Osk/LWzqukFPME5XMEZDa4BdR9rpCsRw2i0UeJ0U5zopyE3N2cx12znXjBjFnoVmz4L8Uf23maaBGerGDLRh9LaSaDpErLcVMx5KrSbvq0DLH9XXw1nWvzWKEEKI85NgKQZNNK7T1BGioTVIQ3OAtu4IvmwHhV4n19QU4ZOeSSGEEOegaSq+HAe+nFNzNk3DJBhN0B2I0R2M0dAWojsQIxpP4vM4KPA6UmEzx0F+jgO3y/ahlxpFUVHcPnD70PzVp/5GIoYRaMPsbUU/+R6J/a+lejftWai+8tN6N8tRsoukd1MIIT6ABEtxwUwTAuE4x5oDtHSFaWoP09wZJhRLkOdxku91MKEil7lT/GjSKymEEOISKKqCx2XD47JRXnRqflMiqdMbitMdjNPYHuJAQxfdwThJ3cCXnQqZBV4HPo+DvGwHOW4bqvrBiVOx2s8aTmuaZmoLlPd7N5v7ejejfb2beeWp+Zu+MtS8clSHzL0SQoxsEizFh0okUiv7vT8sqaMnSltPhM7eGIoC2Vk2vG47Po+dqpIcPC4ryodctIUQQojBZLVo5OU4ycs5czGeeFynJxynJxSnuSPMoePd9IYThKNJsrNs5Hrs5GU7yM224/M4yHXbPrCXU1GU8/RutpFs2o95cHOqd1OzofpKUX3laHnlqLmlqLmlKFb7lfjPIYQQaSfBcoRK6iahSIJgNEEwnCAQitPbdyHuDSUIhOPEEjoelxW304rLYSXbZaWmIpdst438XBfBkCymJIQQIrPYbBoFNudZ+yDrukEgnCAQSV3zDp/oIRhuJxBJEE/ofR+WpoJnrjvVw5mTZSPbbcNuPbV40If2bkYDmL1tGIE2EnXb+laq7UBxZqdCpq8MLbc0FT69fhSLBE4hxPAiwXIYSSYNwrEk4Wjy1PdogmAkSTASJ/T+79EkiaSB06bhtFv6v1x2jYIcJ6OKPGQ5UredY6LKlX1yQgghxCXQNBWvx47Xc3agSyZ1gpEkoWjqg9WG1iDh4wlCfddPTVXxuKxkZ6XCZk6WrW94rhWPy0aW04rFmQ3ObLSiqv7zmoaBGe5KbYUS6CDRvqV/SxTFmYPqLUHNLUXLLUHNLUkFTlkwSAgxREmwzFCmCfGETiR2KiRGYqmv9wNi6P3b40ki0SSGCQ6bhtOmYbdq2G19X1YNr9tOsc+Vut9uSX36KuFQCCGEwGLR8Hq0DwydmCbRhE44kroeh6IJmjvD1DcFCMcSfddhHbtVI8thIctpxeO04nalvmc5rWQ5SnH5K3E5LNgtGph9gTPYgRHsIHHkTcxQJ0agjf+/vfuPjbMu/AD+fn4/d9cf12tpe2XqRL6YAmZ8XYGYqUgZdoSurS5kOl0iGyNmYQOV6BRluo3EilFMNkWRkJgYTRR3Y6UuEwcJLl/JFjRLs+FMgTnWW7v1113vx/Pz8/3jubt1g9FJgadd36/kcs+vWz63z55nfffzS1INSLVNkOMtQeCMN0OuTUKqboAkc61nIpq7GCzfJ8IHCnYQCPNWKTAWHRQsD7miE4RFy0XBCo4VbQ+KLMGcHhJVBbouw1AVxEwNiRoThq7A1IJrNFVmWCQiIno3SRJMXYWpq0hc5BLhB+GzUP5lr+1hquBgdLKIgu3Bsl0U7eD/fgBBTyFdRcRUUWUmETU/iGidikiTgirJQtTPwHQz0E8dhzx4CMiNQhSnIFUlINc2Qa5NlgJnM+SaRkixBGeqJaLQMVjOgucL5AtBKJwqOMFvMgsOpkrHghDploKiC11VgqBYDoO6Al2VYWgK6msMtGjRyjldUzijKhER0TwgyVJlWMlMXNdD0fFg2cEvkW0neM8WbDiOD8v1YdsSik4VbCcCy70Cqvw/iOlAfbaIRKGIupE0qqVBxEQOETcL1bfgGHF4sQagpglqbTO0RBPMRBJa/ApIMn/cI6L3Hp80b0EIIG+5yObsYGKbgl0Z7D99vKLt+pWup2ape2k5MNbXGLiyIQZTD0K
iqSmcMZWIiGiBU1UFVaqCqsjM1wIAhIDj+rBcD7btw3Y9TLk+xlwftu3B1J5TSAAAEqRJREFU9jwIx4HmZGFMZBAZHYHpv46YyKFGyqNKKmIKEUxKNcjKceS0BGyzDm6kAaKqAUa0GtGIhqihIjlegGM5iBoqomZw7GJLtBARXWhBBkvPF8jmHWSmLEzmbExO2cGyGrkgQOYKDjRVrkxgEzGCLjAxU0NDbSTYN4KwyK6nRERE9J6RJGiaAk1TgEsNoyWOEDjjOPDzGajFSdQXJ9FoT0KxT0HPZWAOZyEgYUqqxqRUg0GpFmfcKgw7VRi2IzhtR6FoOiKGipgZhM3yONKqSHnWeBVVFxyPRTToqgyJPyMRLSiXbbB0XR/jWQvjUxYmshbGMqXtnIVcwUXUUBAzdcRMBREzeBA2JaKl39Kp7IZKRERE85skQdV1QG8A0FA57AMoAigKAcmzoFgZJKwMFokc3OwEZPskFCsDxcrA06Kw9TgKeh3yahxZpRYZvwbjU1U4ORlD0RUolrr1Fi23NJ9EMJa0HETLIbQ8i255cqOqiF5Z1qwqosHUFYZRonlsXgdLIYBM3sbYZBFjWQujkwWczViYyBZRsLzKQ6z8wLoqWYOqqIaYoUJmcCQiIqKFTJIgVBOuagKxRuhVBqamrHPnhQ/ZzkGxs9BK4bPBPgPFykKxMpCdHDyjGp6ZgFtTDy9SDzdaDy+SQFGrQ06KIG/7KFpeMKmR5WJyysbwWL4ymdH02e89X5ybVXdaEK2N6cEsu1H93PFSy6jKn+eI5ox5EywnpiykRyYxOlnEmckCRieLGM9a0DUleOCUHkRXt9SgKlqPmKFxTCMRERHROyXJ8I1q+EY1UN3y5vO+B9nOBkHTzkDNn4E+8SpkK4uElYHsFuGZcbiRBLxIAm60AV59Am4kCJ++XnXekCLH9c8Lm+Xt4fE8Tgx7KNjlWfXPnTc0GTGz1AJqBqGzJnouiMZMDVWRc62mDKNE7515Eyz7/u8EhGOjOqqhNqrjg1dUoTqmQ1O5phMRERHR+05W4Jtx+GYczlud9xwodql1085Cyw7BGD1eavGchOQ7cM06eJG6IGxG61ETScA1E/ASCfh63dvOZSGEqLR8lmfhL4fRiSkrODetZbT80lQFUTMYNxozy2uNThsjap6bvChqqogawdIwUUNlKCV6G/MmWN76v4tgF4thF4OIiIiILoWiwSu1Vr4VybMhW1kodhaylYGWeQPG2VcqXW3hu/DMOLxIXdDdttTN1jPr4Ebq4Bs1lUkWL7bG6IWEELAdH4XS2qLBuNBgjGi+4GAsU4TleLAcH5btBdt2aek4x4MsSZXVACK6UprgUamUI2IolUkfTV0J3o1guTlTC/YNPdhnSKXLzbwJlkRERER0+RCKDi8atFS+lfOD57QWT3sKcrmrrVFTCp+JoMutGYdn1pXe4xDa+VPpSpIEo7Sm+H9dXiHgeqIUPIM1SIN3v7KfyTkYnSzC9nw4rg/b8eG4wTW2e/61lbJowau81nmkFErjtRHA94OgWjpfDrRmOdDqQZg1dS4NQ+FjsCQiIiKiOWem4AnfhWxPBUHTzkIpTkLLDpWOBWEUkgTPqK0EzXOvWvhGLTyjFr4eA6SZWw8lSYKmStBUGVURbXbfTQh4voDteLCnB1C3HEg9KIqMyVywuoHt+nDdUjgtnbccH5bjwnKCfV2VYejnWlIjhnquO2+p22+5a2/wriFS6RKscngZzRqDJRERERHNP7JaGeP5lkrLqQRBMwfZmYJSnAjCp5MrzXg7Bcmz4OtV8PQaeEYNfKOm1BJaA1+vgW9Uw9Or4OtVEIrxrqxhLkkSVEWCqsiIXuSaeDyKiYn8Jf15QohpgbPcldcthU8PRcvD5JQNy/WmdfH1KuNQi5YLSZIQNZRgfOm0MajTJz6qrGcaKY1PLYVWRWa3XrrEYPnaa69hy5YtmJiYQDweR29vLxYvXnzeNZ7nYceOHXjxxRc
hSRLuvfde3HXXXTOeIyIiIiJ615WWU/FUE1604eLX+V4QNJ08ZDsH2S1AKY4FAdTNQ3YKkJ08JCcPCQKeFoOvx+BrMfh6VfBuVMPXosG2FiltR+GrkaA77iW0iM7uq0qVLrXV7+DzQgg4nl8Jm8HLrWwPj+dRHPFg2UFgDQJpMPbUsl3oWtBKGjVVxAy1FELPTYpUbjWNGtp5kyFFDRW6JnP90svEJQXLrVu3Ys2aNeju7saePXvw8MMP4ze/+c151+zduxf/+c9/sH//fkxMTKCnpwef+MQnsGjRorc9R0REREQUGlmBX2qpnJHnQHYLQdh0C5BcC7JbhJIbgebZkNwiZM+C5FrBtmtB8iwIRYNQTfiqCaGY8DUz2FdMCNUIjqsGhGJAKDp8RYeSr4ZeFBCyFnxe1gBZrWwLWQ0C67vUgqqrCnRVQfXFmlAvQohg3Gk5hFql0Fnen8zZ08al+pXAWv6M5/mVLrzmBZMfRQz1vHGkRmkSpPLYVF2TS+/BvqbK0FUZqiIh+Fvxg4XvhQAg3rwdfAGI6fulY6WNC77tO4ntC8eMwXJ0dBRHjx7FU089BQDo7OzE9u3bMTY2hkTi3Bxc/f39uOuuuyDLMhKJBJYvX459+/bhnnvuedtzl8ovTMLLX1p3AHrvFV0NnvWWk4tTCFgfcw/rZG5hfcw9rJO5h3VyaTwAkHRA04FLGGopCQHJdyELB7LvQPZtyL5bejlQ3ALgZCALD7LvQiq9y8JDRATXzDY6CkgQkgxAgpCmbUMCyq8LA2pp/8JohdInzgti5StF+RMCUum9/HcARQAKIBml88IvXXMRdun1X/JKr/fEfY+/V3/yZWHGYJlOp9HU1ARFCQb0KoqCxsZGpNPp84JlOp1GS8u5xXOTySROnz4947lLtbTjjv/qeiIiIiIiInp/cKQtERERERERzcqMwTKZTGJ4eBieFzQqe56HkZERJJPJN103NDRU2U+n02hubp7xHBEREREREc1vMwbL+vp6tLa2oq+vDwDQ19eH1tbW87rBAsCKFSvwhz/8Ab7vY2xsDM899xw6OjpmPEdERERERETzmyTEm0bevsng4CC2bNmCTCaDmpoa9Pb24qqrrsKGDRuwefNmfOxjH4Pnedi2bRsOHjwIANiwYQNWr14NAG97joiIiIiIiOa3SwqWRERERERERBfDyXuIiIiIiIhoVhgsiYiIiIiIaFYYLImIiIiIiGhWGCyJiIiIiIhoVuZ0sHzttdewevVqdHR0YPXq1Xj99dfDLtKC09vbi/b2dnz0ox/F8ePHK8dZN+EYHx/Hhg0b0NHRgZUrV+K+++7D2NgYAOCf//wnurq60NHRgXXr1mF0dDTk0i4cGzduRFdXF3p6erBmzRocO3YMAO+TsO3cufO8ZxfvkfC0t7djxYoV6O7uRnd3N1588UUArJOwWJaFrVu34rOf/SxWrlyJ733vewD4zArLG2+8Ubk3uru70d7ejptuugkA6yQszz//PHp6etDd3Y2uri7s378fAOtjRmIOW7t2rUilUkIIIVKplFi7dm3IJVp4Dh06JIaGhsStt94q/vWvf1WOs27CMT4+Lv7+979X9n/4wx+Kb3/728LzPLF8+XJx6NAhIYQQu3btElu2bAmrmAtOJpOpbP/lL38RPT09QgjeJ2EaGBgQ69evrzy7eI+E68L/Q4QQrJMQbd++XTzyyCPC930hhBBnzpwRQvCZNVfs2LFD/OAHPxBCsE7C4Pu+aGtrqzyzjh07Jm644QbheR7rYwZztsVydHQUR48eRWdnJwCgs7MTR48erbTO0Pujra0NyWTyvGOsm/DE43HcfPPNlf0bbrgBQ0NDGBgYgGEYaGtrAwB84QtfwL59+8Iq5oJTXV1d2Z6amoIkSbxPQmTbNrZt24bvf//7lWO8R+Ye1kk4crkcUqkU7r//fkiSBABoaGjgM2uOsG0be/fuxapVq1gnIZJlGdlsFgCQzWbR2NiI8fFx1sc
M1LALcDHpdBpNTU1QFAUAoCgKGhsbkU6nkUgkQi7dwsa6mRt838fvfvc7tLe3I51Oo6WlpXIukUjA931MTEwgHo+HWMqF46GHHsLBgwchhMCvf/1r3ich+tnPfoauri4sWrSocoz3SPgefPBBCCGwdOlSfP3rX2edhOTkyZOIx+PYuXMnXnrpJcRiMdx///0wTZPPrDngwIEDaGpqwnXXXYeBgQHWSQgkScJjjz2GjRs3IhqNIpfL4Ve/+hX/X78Ec7bFkoje3vbt2xGNRvHlL3857KIQgEceeQQvvPACvva1r+FHP/pR2MVZsP7xj39gYGAAa9asCbsoNM1vf/tbPPPMM3j66achhMC2bdvCLtKC5XkeTp48iWuvvRZ/+tOf8OCDD2LTpk3I5/NhF40APP3001i1alXYxVjQXNfFL3/5S/z85z/H888/j1/84hd44IEHeI9cgjkbLJPJJIaHh+F5HoDgQTgyMvKmbpn0/mPdhK+3txcnTpzAY489BlmWkUwmMTQ0VDk/NjYGWZb5W/8Q9PT04KWXXkJzczPvkxAcOnQIg4ODuO2229De3o7Tp09j/fr1OHHiBO+REJX/3eu6jjVr1uDll1/mcyskyWQSqqpWuvMtWbIEdXV1ME2Tz6yQDQ8P49ChQ1i5ciUA/rwVlmPHjmFkZARLly4FACxduhSRSASGYbA+ZjBng2V9fT1aW1vR19cHAOjr60NrayubmucA1k24fvKTn2BgYAC7du2CrusAgOuvvx7FYhGHDx8GAPz+97/HihUrwizmgpHL5ZBOpyv7Bw4cQG1tLe+TkNx7773429/+hgMHDuDAgQNobm7Gk08+iXvuuYf3SEjy+XxlrJIQAv39/WhtbeVzKySJRAI333wzDh48CCCY5XJ0dBSLFy/mMytku3fvxi233IK6ujoA/HkrLM3NzTh9+jReffVVAMDg4CBGR0fxoQ99iPUxA0kIIcIuxMUMDg5iy5YtyGQyqKmpQW9vL6666qqwi7Wg7NixA/v378fZs2dRV1eHeDyOZ599lnUTkn//+9/o7OzE4sWLYZomAGDRokXYtWsXXn75ZWzduhWWZeHKK6/Eo48+ioaGhpBLfPk7e/YsNm7ciEKhAFmWUVtbi29961u47rrreJ/MAe3t7Xj88cdxzTXX8B4JycmTJ7Fp0yZ4ngff9/GRj3wE3/3ud9HY2Mg6CcnJkyfxne98BxMTE1BVFQ888ABuueUWPrNC1tHRgYceegif/vSnK8dYJ+F45pln8MQTT1QmuNq8eTOWL1/O+pjBnA6WRERERERENPfN2a6wREREREREND8wWBIREREREdGsMFgSERERERHRrDBYEhERERER0awwWBIREREREdGsMFgSERERERHRrDBYEhHRvLd27VrceOONsG077KIQEREtSAyWREQ0r73xxhs4fPgwJEnCX//617CLQ0REtCAxWBIR0byWSqWwZMkSfO5zn0MqlaocHx8fx1e/+lV8/OMfx6pVq/DTn/4UX/ziFyvnBwcHcffdd+Omm25CR0cH+vv7wyg+ERHRZUENuwBERESzsWfPHnzlK1/BkiVLsHr1apw9exYNDQ3Ytm0bIpEIDh48iFOnTmH9+vVoaWkBAOTzeaxbtw6bN2/GE088gePHj+Puu+/GNddcg6uvvjrkb0RERDT/sMWSiIjmrcOHD2NoaAh33HEHrr/+enzgAx9AX18fPM/D/v37sWnTJkQiEVx99dXo6empfO6FF17AlVdeiVWrVkFVVVx77bXo6OjAvn37Qvw2RERE8xdbLImIaN5KpVJYtmwZEokEAKCzsxO7d+/GnXfeCdd1kUwmK9dO3z516hSOHDmCtra2yjHP89DV1fX+FZ6IiOgywmBJRETzUrFYxJ///Gf4vo9ly5YBAGzbRiaTwejoKFRVxenTp/HhD38YAJBOpyufTSaTuPHGG/HUU0+FUnYiIqLLDbvCEhHRvPTcc89BURQ8++yzSKVSSKV
S6O/vR1tbG1KpFG6//Xbs3LkThUIBg4OD2LNnT+Wzn/nMZ/D6668jlUrBcRw4joMjR45gcHAwxG9EREQ0fzFYEhHRvLR79258/vOfR0tLC6644orK60tf+hL27t2Lhx9+GNlsFsuWLcM3v/lN3HnnndB1HQBQVVWFJ598Ev39/fjUpz6FT37yk/jxj3/MdTCJiIjeIUkIIcIuBBER0Xvt0UcfxdmzZ9Hb2xt2UYiIiC47bLEkIqLL0uDgIF555RUIIXDkyBH88Y9/xO233x52sYiIiC5LnLyHiIguS7lcDt/4xjcwMjKC+vp6rFu3DrfddlvYxSIiIrossSssERERERERzQq7whIREREREdGsMFgSERERERHRrDBYEhERERER0awwWBIREREREdGsMFgSERERERHRrDBYEhERERER0az8P82P5gNn7IYCAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "8nSAHBOM42hf" }, "source": [ "আমাদের প্লটের বয়সসীমা কমিয়ে নিয়ে আসি ০ থেকে ২০ এর মধ্যে plt.xlim(0, 20) দিয়ে। এরপর ২০ থেকে ৩০, ৪০, ৫০, ৬০, ৭০ এবং ৮০ পর্যন্ত। আপনার মেশিন, আপনার মনের মাধুরী মিশিয়ে তৈরি করুন একেকটা প্লট। নিচের দিকে দেখলেই বুঝবেন ০ থেকে ৮০ বছর পর্যন্ত বয়স দেয়া আছে নিচের এক্সিসে। \n", "\n", "#### ছবির মানে কী? \n", "\n", "আপনারা ভালো করে লক্ষ্য করলে দেখবেন \"০\" মানে মারা গিয়েছেন ৩০ বছর বয়সের মানুষ বেশি। আবার বেঁচেছেন ২০ থেকে ৩৪ বছর বয়সের মানুষ। নিচের চারটা ছবি দেখলে তাই মনে হয়। ডাটার ঘনত্ব ৩০ থেকে ৩৪ বয়সের দিকে।" ] }, { "cell_type": "code", "metadata": { "id": "f3If3mvi42hf", "outputId": "1bb4ef0c-ac00-45bf-8a1c-0c97c30bc1d3", "colab": { "base_uri": "https://localhost:8080/", "height": 220 } }, "source": [ "facet = sns.FacetGrid(train, hue=\"Survived\",aspect=4)\n", "facet.map(sns.kdeplot,'Age',shade= True)\n", "facet.set(xlim=(0, train['Age'].max()))\n", "facet.add_legend()\n", "plt.xlim(0, 20)" ], "execution_count": 32, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(0.0, 20.0)" ] }, "metadata": { "tags": [] }, "execution_count": 32 }, { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "code", "metadata": { "id": "7U3n4_ld42hg", "outputId": "e69f8969-0692-4ba7-ce86-5863429aef72", "colab": { "base_uri": "https://localhost:8080/", "height": 220 } }, "source": [ "facet = sns.FacetGrid(train, hue=\"Survived\",aspect=4)\n", "facet.map(sns.kdeplot,'Age',shade= True)\n", "facet.set(xlim=(0, train['Age'].max()))\n", "facet.add_legend()\n", "plt.xlim(20, 30)" ], "execution_count": 33, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(20.0, 30.0)" ] }, "metadata": { "tags": [] }, "execution_count": 33 }, { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "code", "metadata": { "id": "3KH1QFLF42hg", "outputId": "5986f2c5-d69b-4c53-b1d0-7117a907be9d", "colab": { "base_uri": "https://localhost:8080/", "height": 220 } }, "source": [ "facet = sns.FacetGrid(train, hue=\"Survived\",aspect=4)\n", "facet.map(sns.kdeplot,'Age',shade= True)\n", "facet.set(xlim=(0, train['Age'].max()))\n", "facet.add_legend()\n", "plt.xlim(30, 40)" ], "execution_count": 34, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(30.0, 40.0)" ] }, "metadata": { "tags": [] }, "execution_count": 34 }, { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "code", "metadata": { "id": "YCX6_BcQ42hg", "outputId": "b4b2004f-1079-4cd5-8935-c58c83edeadc", "colab": { "base_uri": "https://localhost:8080/", "height": 220 } }, "source": [ "facet = sns.FacetGrid(train, hue=\"Survived\",aspect=4)\n", "facet.map(sns.kdeplot,'Age',shade= True)\n", "facet.set(xlim=(0, train['Age'].max()))\n", "facet.add_legend()\n", "plt.xlim(40, 60)" ], "execution_count": 35, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(40.0, 60.0)" ] }, "metadata": { "tags": [] }, "execution_count": 35 }, { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "bAIfF6Ml42hg" }, "source": [ "এখন দেখি ডাটা কি কথা বলে? Cabin এবং Embarked এখনো কিছু ফাঁকা! উপরের ছবি দেখে আমরা বেশ কিছু ধারণা পেয়েছি - বয়স ০, থেকে ২০, ৩০ না করে দেখা গেল তাদের ভ্যারিয়েশন কিছুটা ভিন্ন। সেজন্য আমরা নিয়ে এসেছি আলাদা আলাদা বাক্স পদ্ধতি। অনেকে এটাকে বলেন 'বিনিং' মানে ভিন্ন ভিন্ন 'বিন' মানে ব্যাগ বা বাক্সে ফেলা। আমাদের ক্লাসিফায়ার এটাকে অনেক পছন্দ করবে। " ] }, { "cell_type": "code", "metadata": { "id": "ky3ZnY1q42hh", "outputId": "651f08ad-5086-4c14-de49-0d2cfc9bb497", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "train.info()" ], "execution_count": 36, "outputs": [ { "output_type": "stream", "text": [ "\n", "RangeIndex: 891 entries, 0 to 890\n", "Data columns (total 12 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 PassengerId 891 non-null int64 \n", " 1 Survived 891 non-null int64 \n", " 2 Pclass 891 non-null int64 \n", " 3 Sex 891 non-null int64 \n", " 4 Age 891 non-null float64\n", " 5 SibSp 891 non-null int64 \n", " 6 Parch 891 non-null int64 \n", " 7 Ticket 891 non-null object \n", " 8 Fare 891 non-null float64\n", " 9 Cabin 204 non-null object \n", " 10 Embarked 889 non-null object \n", " 11 Title 891 non-null int64 \n", "dtypes: float64(2), int64(7), object(3)\n", "memory usage: 83.7+ KB\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "id": "qBiUfsEe42hh", "outputId": "286bab3c-640d-4eb9-b602-698be9a7bd16", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "test.info()" ], "execution_count": 37, "outputs": [ { "output_type": "stream", "text": [ "\n", "RangeIndex: 418 entries, 0 to 417\n", "Data columns (total 11 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 PassengerId 418 non-null int64 \n", " 1 Pclass 418 non-null int64 \n", " 2 Sex 418 non-null int64 \n", " 3 Age 418 non-null float64\n", " 4 SibSp 418 
non-null int64 \n", " 5 Parch 418 non-null int64 \n", " 6 Ticket 418 non-null object \n", " 7 Fare 417 non-null float64\n", " 8 Cabin 91 non-null object \n", " 9 Embarked 418 non-null object \n", " 10 Title 418 non-null int64 \n", "dtypes: float64(2), int64(6), object(3)\n", "memory usage: 36.0+ KB\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "id": "lDWiziju42hh" }, "source": [ "#### 5.4.2 Putting ages into separate bins\n", "We are turning a continuous variable, age, into a categorical one. Look at the age ranges closely: 16 and under, over 16 up to 26, over 26 up to 36, over 36 up to 62, and over 62.\n", "\n", "Our feature vector can then be mapped into these 5 groups:  \n", "child: 0  \n", "young: 1  \n", "adult: 2  \n", "mid-age: 3  \n", "senior: 4" ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "BLKDyKXX42hi" }, "source": [ "for dataset in train_test_data:\n", "\n", "    dataset.loc[dataset['Age'] <= 16, 'Age'] = 0\n", "    dataset.loc[(dataset['Age'] > 16) & (dataset['Age'] <= 26), 'Age'] = 1\n", "    dataset.loc[(dataset['Age'] > 26) & (dataset['Age'] <= 36), 'Age'] = 2\n", "    dataset.loc[(dataset['Age'] > 36) & (dataset['Age'] <= 62), 'Age'] = 3\n", "    dataset.loc[dataset['Age'] > 62, 'Age'] = 4" ], "execution_count": 38, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "4XtXKAxs42hl" }, "source": [ "Age is now a number between 0 and 4, which we treat as a categorical value." ] }, { "cell_type": "code", "metadata": { "id": "-DvlRCd142hl", "outputId": "3f61d57f-dae2-4652-c449-568dadcd75cf", "colab": { "base_uri": "https://localhost:8080/", "height": 204 } }, "source": [ "train.head()" ], "execution_count": 39, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PassengerIdSurvivedPclassSexAgeSibSpParchTicketFareCabinEmbarkedTitle
010301.010A/5 211717.2500NaNS0
121113.010PC 1759971.2833C85C2
231311.000STON/O2. 31012827.9250NaNS1
341112.01011380353.1000C123S2
450302.0003734508.0500NaNS0
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass Sex ... Fare Cabin Embarked Title\n", "0 1 0 3 0 ... 7.2500 NaN S 0\n", "1 2 1 1 1 ... 71.2833 C85 C 2\n", "2 3 1 3 1 ... 7.9250 NaN S 1\n", "3 4 1 1 1 ... 53.1000 C123 S 2\n", "4 5 0 3 0 ... 8.0500 NaN S 0\n", "\n", "[5 rows x 12 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 39 } ] }, { "cell_type": "code", "metadata": { "id": "-wghNZEd42hl", "outputId": "c4065465-451f-4d04-a182-c3d381caf5c5", "colab": { "base_uri": "https://localhost:8080/", "height": 361 } }, "source": [ "bar_chart('Age')" ], "execution_count": 40, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "H5G3S5CK42hl" }, "source": [ "### 5.5 Embarked: who boarded at which port" ] }, { "cell_type": "markdown", "metadata": { "id": "O1QdURvg42hm" }, "source": [ "#### 5.5.1 Filling in the missing values\n", "Before that, let's see where most passengers boarded, broken down by 1st, 2nd, and 3rd class." ] }, { "cell_type": "code", "metadata": { "id": "g2f5EsO342hm", "outputId": "fe54ef2f-2553-4a52-ad61-2b93f7cb27e6", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "Pclass1 = train[train['Pclass']==1]['Embarked'].value_counts()\n", "Pclass2 = train[train['Pclass']==2]['Embarked'].value_counts()\n", "Pclass3 = train[train['Pclass']==3]['Embarked'].value_counts()\n", "df = pd.DataFrame([Pclass1, Pclass2, Pclass3])\n", "df.index = ['1st class','2nd class', '3rd class']\n", "df.plot(kind='bar',stacked=True, figsize=(10,5))" ], "execution_count": 41, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "" ] }, "metadata": { "tags": [] }, "execution_count": 41 }, { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "lHzjc0iq42hm" }, "source": [ "In every class, 1st, 2nd, and 3rd, most passengers boarded at Southampton: more than 50%. The number of 1st-class tickets bought at each port also hints at which port's passengers were less well off.\n", "\n", "So what do we do? We fill the missing values with 'S', that is, Southampton." ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "ZA90D1Rr42hm" }, "source": [ "for dataset in train_test_data:\n", "    dataset['Embarked'] = dataset['Embarked'].fillna('S')" ], "execution_count": 42, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "AdXo8L4B42hm", "outputId": "202049d3-6a40-47f8-eb8e-46423b17fba6", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "train.head()" ], "execution_count": 43, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PassengerIdSurvivedPclassSexAgeSibSpParchTicketFareCabinEmbarkedTitle
010301.010A/5 211717.2500NaNS0
121113.010PC 1759971.2833C85C2
231311.000STON/O2. 31012827.9250NaNS1
341112.01011380353.1000C123S2
450302.0003734508.0500NaNS0
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass Sex ... Fare Cabin Embarked Title\n", "0 1 0 3 0 ... 7.2500 NaN S 0\n", "1 2 1 1 1 ... 71.2833 C85 C 2\n", "2 3 1 3 1 ... 7.9250 NaN S 1\n", "3 4 1 1 1 ... 53.1000 C123 S 2\n", "4 5 0 3 0 ... 8.0500 NaN S 0\n", "\n", "[5 rows x 12 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 43 } ] }, { "cell_type": "markdown", "metadata": { "id": "h67edM1142hm" }, "source": [ "To make the computation easier, we replace each port with a number: \"S\": 0, \"C\": 1, \"Q\": 2" ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "7UNzucbq42hn" }, "source": [ "embarked_mapping = {\"S\": 0, \"C\": 1, \"Q\": 2}\n", "for dataset in train_test_data:\n", "    dataset['Embarked'] = dataset['Embarked'].map(embarked_mapping)" ], "execution_count": 44, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "IcD1vsD042hn" }, "source": [ "### 5.6 Fare\n", "\n", "Fill the missing values with the median fare of each passenger class." ] }, { "cell_type": "code", "metadata": { "id": "WqcVqfKL42hn", "outputId": "78f82c57-eb6c-458a-afa1-a7cb2be5b784", "colab": { "base_uri": "https://localhost:8080/", "height": 204 } }, "source": [ "# fill missing Fare with median fare for each Pclass\n", "train[\"Fare\"] = train[\"Fare\"].fillna(train.groupby(\"Pclass\")[\"Fare\"].transform(\"median\"))\n", "test[\"Fare\"] = test[\"Fare\"].fillna(test.groupby(\"Pclass\")[\"Fare\"].transform(\"median\"))\n", "train.head(5)" ], "execution_count": 45, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PassengerIdSurvivedPclassSexAgeSibSpParchTicketFareCabinEmbarkedTitle
010301.010A/5 211717.2500NaN00
121113.010PC 1759971.2833C8512
231311.000STON/O2. 31012827.9250NaN01
341112.01011380353.1000C12302
450302.0003734508.0500NaN00
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass Sex ... Fare Cabin Embarked Title\n", "0 1 0 3 0 ... 7.2500 NaN 0 0\n", "1 2 1 1 1 ... 71.2833 C85 1 2\n", "2 3 1 3 1 ... 7.9250 NaN 0 1\n", "3 4 1 1 1 ... 53.1000 C123 0 2\n", "4 5 0 3 0 ... 8.0500 NaN 0 0\n", "\n", "[5 rows x 12 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 45 } ] }, { "cell_type": "markdown", "metadata": { "id": "H8qiMtJG42hn" }, "source": [ "Let's look at a few plots to see which fare ranges had the most deaths. Passengers holding cheap tickets were clearly less fortunate. Keep the R version of this exercise open alongside this one every time; without it, many details will remain unclear. The plot below shows that most of the deaths fall in the 0 to 100 range, meaning that those who paid a fare between 0 and 100 fared worst." ] }, { "cell_type": "code", "metadata": { "id": "AkE8NpEn42hn", "outputId": "8d998514-9231-44be-b093-bb0643cf1e8d", "colab": { "base_uri": "https://localhost:8080/", "height": 205 } }, "source": [ "facet = sns.FacetGrid(train, hue=\"Survived\",aspect=4)\n", "facet.map(sns.kdeplot,'Fare',shade= True)\n", "facet.set(xlim=(0, train['Fare'].max()))\n", "facet.add_legend()\n", " \n", "plt.show() " ], "execution_count": 46, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "code", "metadata": { "id": "mwlQOXtk42hn", "outputId": "74615bfc-80b7-4194-a685-8e6e7b35267f", "colab": { "base_uri": "https://localhost:8080/", "height": 222 } }, "source": [ "facet = sns.FacetGrid(train, hue=\"Survived\",aspect=4)\n", "facet.map(sns.kdeplot,'Fare',shade= True)\n", "facet.set(xlim=(0, train['Fare'].max()))\n", "facet.add_legend()\n", "plt.xlim(0, 20)" ], "execution_count": 47, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(0.0, 20.0)" ] }, "metadata": { "tags": [] }, "execution_count": 47 }, { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "hZ5eLN-442hn" }, "source": [ "This plot (0 to 20) shows the pattern more clearly: passengers whose tickets cost around 8 were not so fortunate." ] }, { "cell_type": "code", "metadata": { "id": "XzwVHm5J42ho", "outputId": "1567926b-86ee-4387-ab73-38fcf395b161", "colab": { "base_uri": "https://localhost:8080/", "height": 222 } }, "source": [ "facet = sns.FacetGrid(train, hue=\"Survived\",aspect=4)\n", "facet.map(sns.kdeplot,'Fare',shade= True)\n", "facet.set(xlim=(0, train['Fare'].max()))\n", "facet.add_legend()\n", "plt.xlim(0, 30)" ], "execution_count": 48, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(0.0, 30.0)" ] }, "metadata": { "tags": [] }, "execution_count": 48 }, { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "ZSSNF4l942ho" }, "source": [ "Here too, we split the ticket prices into 'bins' and map each bin to a number, as before." ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "zWYDBFrM42ho" }, "source": [ "for dataset in train_test_data:\n", "    dataset.loc[dataset['Fare'] <= 17, 'Fare'] = 0\n", "    dataset.loc[(dataset['Fare'] > 17) & (dataset['Fare'] <= 30), 'Fare'] = 1\n", "    dataset.loc[(dataset['Fare'] > 30) & (dataset['Fare'] <= 100), 'Fare'] = 2\n", "    dataset.loc[dataset['Fare'] > 100, 'Fare'] = 3" ], "execution_count": 49, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "nB_ZQTpz42ho", "outputId": "0c8b056d-45bb-434b-e91c-d53ec770f9da", "colab": { "base_uri": "https://localhost:8080/", "height": 204 } }, "source": [ "train.head()" ], "execution_count": 50, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PassengerIdSurvivedPclassSexAgeSibSpParchTicketFareCabinEmbarkedTitle
010301.010A/5 211710.0NaN00
121113.010PC 175992.0C8512
231311.000STON/O2. 31012820.0NaN01
341112.0101138032.0C12302
450302.0003734500.0NaN00
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass Sex ... Fare Cabin Embarked Title\n", "0 1 0 3 0 ... 0.0 NaN 0 0\n", "1 2 1 1 1 ... 2.0 C85 1 2\n", "2 3 1 3 1 ... 0.0 NaN 0 1\n", "3 4 1 1 1 ... 2.0 C123 0 2\n", "4 5 0 3 0 ... 0.0 NaN 0 0\n", "\n", "[5 rows x 12 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 50 } ] }, { "cell_type": "markdown", "metadata": { "id": "iNTUSal142ho" }, "source": [ "### 5.7 Cabin\n", "\n", "We do not need the full cabin number here, only its first letter, such as the C in C23, which indicates where on the ship the cabin was." ] }, { "cell_type": "code", "metadata": { "id": "Qd-dO5t242ho", "outputId": "33d04228-8906-4ccd-a7c0-5e472198586f", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "train.Cabin.value_counts()" ], "execution_count": 51, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "C23 C25 C27 4\n", "B96 B98 4\n", "G6 4\n", "E101 3\n", "F33 3\n", " ..\n", "A36 1\n", "B41 1\n", "D30 1\n", "C99 1\n", "D46 1\n", "Name: Cabin, Length: 147, dtype: int64" ] }, "metadata": { "tags": [] }, "execution_count": 51 } ] }, { "cell_type": "markdown", "metadata": { "id": "aG6DukEI42hp" }, "source": [ "We keep only the first letter with str[:1], because that letter identifies the cabin's section. It gives a rough idea of which part of the ship a passenger was in." ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "8guLOPEH42hp" }, "source": [ "for dataset in train_test_data:\n", "    dataset['Cabin'] = dataset['Cabin'].str[:1]" ], "execution_count": 52, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "iKk8qBOy42hp", "outputId": "32c77ebf-610c-4e55-ad55-1e3c369ba63d", "colab": { "base_uri": "https://localhost:8080/", "height": 381 } }, "source": [ "Pclass1 = train[train['Pclass']==1]['Cabin'].value_counts()\n", "Pclass2 = train[train['Pclass']==2]['Cabin'].value_counts()\n", "Pclass3 = train[train['Pclass']==3]['Cabin'].value_counts()\n", "df = pd.DataFrame([Pclass1, Pclass2, Pclass3])\n", "df.index = 
['1st class','2nd class', '3rd class']\n", "df.plot(kind='bar',stacked=True, figsize=(10,5))" ], "execution_count": 53, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "" ] }, "metadata": { "tags": [] }, "execution_count": 53 }, { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "SW-60Hcn42hp" }, "source": [ "Did you notice something? 1st class has cabins A, B, and C, but the other classes have none of them. So let's map the cabin letters onto an evenly spaced scale, with a step of 0.4 between consecutive letters. After that, we fill the missing cabin values with the median cabin value of each passenger class." ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "QdWpwnUA42hp" }, "source": [ "cabin_mapping = {\"A\": 0, \"B\": 0.4, \"C\": 0.8, \"D\": 1.2, \"E\": 1.6, \"F\": 2, \"G\": 2.4, \"T\": 2.8}\n", "for dataset in train_test_data:\n", "    dataset['Cabin'] = dataset['Cabin'].map(cabin_mapping)" ], "execution_count": 54, "outputs": [] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "VM3JzdjF42hp" }, "source": [ "# fill missing Cabin with the median mapped cabin value for each Pclass\n", "train[\"Cabin\"] = train[\"Cabin\"].fillna(train.groupby(\"Pclass\")[\"Cabin\"].transform(\"median\"))\n", "test[\"Cabin\"] = test[\"Cabin\"].fillna(test.groupby(\"Pclass\")[\"Cabin\"].transform(\"median\"))" ], "execution_count": 55, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "BeneT-sV42hp" }, "source": [ "### 5.8 Family size\n", "There is a long write-up on this in the R version of this exercise. Take a look at it; the concept is the same." ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "0y4DEGyV42hq" }, "source": [ "train[\"FamilySize\"] = train[\"SibSp\"] + train[\"Parch\"] + 1\n", "test[\"FamilySize\"] = test[\"SibSp\"] + test[\"Parch\"] + 1" ], "execution_count": 56, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "lEGHVNm942hq", "outputId": "912da721-30c5-4a2d-94c7-1797b8a43674", "colab": { "base_uri": "https://localhost:8080/", "height": 220 } }, "source": [ "facet = sns.FacetGrid(train, hue=\"Survived\",aspect=4)\n", "facet.map(sns.kdeplot,'FamilySize',shade= True)\n", "facet.set(xlim=(0, train['FamilySize'].max()))\n", "facet.add_legend()\n", "plt.xlim(0)" ], "execution_count": 57, 
"outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(0.0, 11.0)" ] }, "metadata": { "tags": [] }, "execution_count": 57 }, { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "nBT-y0Cm42hq" }, "source": [ "Once again a mapping, this time for family size. See the R version of this exercise. *The plot above suggests that people travelling alone died more often. In the mapping below, a family size of 1 becomes \"0\", so \"0\" means the passenger was alone on the Titanic.*" ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "mkerCyG542hq" }, "source": [ "family_mapping = {1: 0, 2: 0.4, 3: 0.8, 4: 1.2, 5: 1.6, 6: 2, 7: 2.4, 8: 2.8, 9: 3.2, 10: 3.6, 11: 4}\n", "for dataset in train_test_data:\n", "    dataset['FamilySize'] = dataset['FamilySize'].map(family_mapping)" ], "execution_count": 58, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "JYjzLBPB42hq", "outputId": "a4194e3a-5bf5-4d75-962e-0c1f53dcaea0", "colab": { "base_uri": "https://localhost:8080/", "height": 221 } }, "source": [ "train.head()" ], "execution_count": 59, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PassengerIdSurvivedPclassSexAgeSibSpParchTicketFareCabinEmbarkedTitleFamilySize
010301.010A/5 211710.02.0000.4
121113.010PC 175992.00.8120.4
231311.000STON/O2. 31012820.02.0010.0
341112.0101138032.00.8020.4
450302.0003734500.02.0000.0
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass Sex ... Cabin Embarked Title FamilySize\n", "0 1 0 3 0 ... 2.0 0 0 0.4\n", "1 2 1 1 1 ... 0.8 1 2 0.4\n", "2 3 1 3 1 ... 2.0 0 1 0.0\n", "3 4 1 1 1 ... 0.8 0 2 0.4\n", "4 5 0 3 0 ... 2.0 0 0 0.0\n", "\n", "[5 rows x 13 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 59 } ] }, { "cell_type": "code", "metadata": { "id": "_5ZPg7f-42hq", "outputId": "cfd7971b-234a-497e-a5bf-626c4c4ed7e0", "colab": { "base_uri": "https://localhost:8080/", "height": 221 } }, "source": [ "train.head()" ], "execution_count": 60, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PassengerIdSurvivedPclassSexAgeSibSpParchTicketFareCabinEmbarkedTitleFamilySize
010301.010A/5 211710.02.0000.4
121113.010PC 175992.00.8120.4
231311.000STON/O2. 31012820.02.0010.0
341112.0101138032.00.8020.4
450302.0003734500.02.0000.0
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass Sex ... Cabin Embarked Title FamilySize\n", "0 1 0 3 0 ... 2.0 0 0 0.4\n", "1 2 1 1 1 ... 0.8 1 2 0.4\n", "2 3 1 3 1 ... 2.0 0 1 0.0\n", "3 4 1 1 1 ... 0.8 0 2 0.4\n", "4 5 0 3 0 ... 2.0 0 0 0.0\n", "\n", "[5 rows x 13 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 60 } ] }, { "cell_type": "markdown", "metadata": { "id": "WASoOC_142hr" }, "source": [ "Drop the variables we no longer need, 'Ticket', 'SibSp', and 'Parch', now that feature engineering has given us the new features." ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "hccviqXg42hr" }, "source": [ "features_drop = ['Ticket', 'SibSp', 'Parch']\n", "train = train.drop(features_drop, axis=1)\n", "test = test.drop(features_drop, axis=1)\n", "train = train.drop(['PassengerId'], axis=1)" ], "execution_count": 61, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "N9GLlaDC42hr", "outputId": "e050c29d-4b05-4edb-c236-52ac65d2cce5", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "train_data = train.drop('Survived', axis=1)\n", "target = train['Survived']\n", "\n", "train_data.shape, target.shape" ], "execution_count": 62, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "((891, 8), (891,))" ] }, "metadata": { "tags": [] }, "execution_count": 62 } ] }, { "cell_type": "markdown", "metadata": { "id": "nByV81Eq42hr" }, "source": [ "### All features are now numeric\n", "The reason for doing this is that when we build the model later, we will not have to declare each variable again, as we did in the R version of this exercise." ] }, { "cell_type": "code", "metadata": { "id": "eyXtjLrp42hr", "outputId": "c66e7d94-b51f-4776-93e7-a1cf6184580c", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "train_data.head()" ], "execution_count": 63, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>Pclass</th><th>Sex</th><th>Age</th><th>Fare</th><th>Cabin</th><th>Embarked</th><th>Title</th><th>FamilySize</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>3</td><td>0</td><td>1.0</td><td>0.0</td><td>2.0</td><td>0</td><td>0</td><td>0.4</td></tr>\n",
"    <tr><th>1</th><td>1</td><td>1</td><td>3.0</td><td>2.0</td><td>0.8</td><td>1</td><td>2</td><td>0.4</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>1</td><td>1.0</td><td>0.0</td><td>2.0</td><td>0</td><td>1</td><td>0.0</td></tr>\n",
"    <tr><th>3</th><td>1</td><td>1</td><td>2.0</td><td>2.0</td><td>0.8</td><td>0</td><td>2</td><td>0.4</td></tr>\n",
"    <tr><th>4</th><td>3</td><td>0</td><td>2.0</td><td>0.0</td><td>2.0</td><td>0</td><td>0</td><td>0.0</td></tr>\n",
"  </tbody>\n",
"</table>
" ], "text/plain": [ " Pclass Sex Age Fare Cabin Embarked Title FamilySize\n", "0 3 0 1.0 0.0 2.0 0 0 0.4\n", "1 1 1 3.0 2.0 0.8 1 2 0.4\n", "2 3 1 1.0 0.0 2.0 0 1 0.0\n", "3 1 1 2.0 2.0 0.8 0 2 0.4\n", "4 3 0 2.0 0.0 2.0 0 0 0.0" ] }, "metadata": { "tags": [] }, "execution_count": 63 } ] }, { "cell_type": "markdown", "metadata": { "id": "RfwZijbr42hr" }, "source": [ "## ৬. মেশিন লার্নিং মডেলিং " ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "zLf383Ee42hs" }, "source": [ "# Importing Classifier Modules\n", "from sklearn.tree import DecisionTreeClassifier\n", "from sklearn.ensemble import RandomForestClassifier\n", "\n", "import numpy as np" ], "execution_count": 64, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "xfMCVAAP42hs" }, "source": [ "সবকিছু ঠিক আছে! কোন ডাটা মিসিং নেই। " ] }, { "cell_type": "code", "metadata": { "id": "dCBCb_AX42hs", "outputId": "574f537b-ccbb-41ff-e468-098f0aad0535", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "train.info()" ], "execution_count": 65, "outputs": [ { "output_type": "stream", "text": [ "\n", "RangeIndex: 891 entries, 0 to 890\n", "Data columns (total 9 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 Survived 891 non-null int64 \n", " 1 Pclass 891 non-null int64 \n", " 2 Sex 891 non-null int64 \n", " 3 Age 891 non-null float64\n", " 4 Fare 891 non-null float64\n", " 5 Cabin 891 non-null float64\n", " 6 Embarked 891 non-null int64 \n", " 7 Title 891 non-null int64 \n", " 8 FamilySize 891 non-null float64\n", "dtypes: float64(4), int64(5)\n", "memory usage: 62.8 KB\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "id": "AQBmrviA42hs" }, "source": [ "### ৬.১ ক্রস ভ্যালিডেশন (কে-ফোল্ড = ১০ ভাগ)\n", "আমরা চলে এসেছি প্রায় শেষের দিকে। শেষ করার আগে একটা জিনিস সবসময় চাইবো - বিশেষ করে নিজের মডেলের 'স্ট্যাবিলিটি' যাতে ভালো থাকে। এখন আমরা কাজ করছি ট্রেনিং ডাটা দিয়ে, কিন্তু যদি অন্য নতুন ডাটা (যেটা মডেল দেখেনি) 
দিয়ে মডেলটা খারাপ করে? মানে যে ডাটা সে দেখেনি - ট্রেনিং সেশনে। আর সেকারণে আমরা ডাটাকে দশভাগে ভাগ করে একেক সময় একেক ভাগকে দেখাবো না (মানে, লুকিয়ে রাখবো) মডেলকে। নিজের ডাটার মধ্যে চেক করা, এটা একটা মজার জিনিস। নিজের ডাটাকে ঘুরিয়ে ফিরিয়ে মডেলের ভেতরের 'অ্যাক্যুরেসি' দেখার জন্য এটা একটা চমৎকার জিনিস। চলুন, আগে বের করি cross_val_score, এটা টেস্ট 'ফোল্ডে'র স্কোরটা বের করে আনে। cross_val_score কিন্তু ট্রেনিং এবং টেস্ট দুটোতেই প্রতিটা 'ফোল্ড' ব্যবহার করে।\n", "\n", "'n_splits=10' মানে এখানে ডাটাসেটকে ১০ ভাগে ভাগ করা হয়েছে। " ] }, { "cell_type": "markdown", "metadata": { "id": "HlH6NAzp42hs" }, "source": [ "ক্রস ভ্যালিডেশন: ছবি দেখলে কেমন হয়?\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": { "id": "jWWl-7Na42hs" }, "source": [ "ছবি: ক্রস ভ্যালিডেশন, কিভাবে নিজের ডাটা দিয়ে 'অ্যাক্যুরেসি' জানা যায় " ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "0WKjcX9v42hs" }, "source": [ "from sklearn.model_selection import KFold\n", "from sklearn.model_selection import cross_val_score\n", "k_fold = KFold(n_splits=10, shuffle=True, random_state=0)" ], "execution_count": 66, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "b1yookFy42hs" }, "source": [ "### ৬.২ ডিসিশন ট্রি\n", "আগের 'আর' এর এক্সারসাইজ দেখি। সেখানে 'ডিসিশন ট্রি' নিয়ে অনেক কথা হয়েছে। এখানে ক্লাসিফায়ারের 'clf' এর 'অ্যাক্যুরেসি' বের করার চেষ্টা করেছি আমরা। " ] }, { "cell_type": "code", "metadata": { "id": "e2aLqD2c42ht", "outputId": "9c8ecb5b-69e0-4077-e344-ba1ed6004fd2", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "clf = DecisionTreeClassifier()\n", "scoring = 'accuracy'\n", "score = cross_val_score(clf, train_data, target, cv=k_fold, n_jobs=1, scoring=scoring)\n", "print(score)" ], "execution_count": 67, "outputs": [ { "output_type": "stream", "text": [ "[0.76666667 0.80898876 0.7752809 0.76404494 0.88764045 0.76404494\n", " 0.82022472 0.82022472 0.74157303 0.78651685]\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "id": 
"amg5TOXq42ht", "outputId": "aeca9364-dba9-4bb0-860f-af46ba5253e5", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "# decision tree Score\n", "round(np.mean(score)*100, 2)" ], "execution_count": 68, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "79.35" ] }, "metadata": { "tags": [] }, "execution_count": 68 } ] }, { "cell_type": "markdown", "metadata": { "id": "CgqhEFUk42ht" }, "source": [ "### ৬.৩ র‌্যান্ডম ফরেস্ট" ] }, { "cell_type": "code", "metadata": { "id": "2ORb1sLa42ht", "outputId": "c30d8adc-666f-48b2-b098-3d221be751b3", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "clf = RandomForestClassifier(n_estimators=13)\n", "scoring = 'accuracy'\n", "score = cross_val_score(clf, train_data, target, cv=k_fold, n_jobs=1, scoring=scoring)\n", "print(score)" ], "execution_count": 69, "outputs": [ { "output_type": "stream", "text": [ "[0.81111111 0.85393258 0.83146067 0.83146067 0.85393258 0.80898876\n", " 0.83146067 0.80898876 0.74157303 0.79775281]\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "id": "X9BNm7Z742ht", "outputId": "32487c68-984f-467e-ef25-a74be1f8a31a", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "# Random Forest Score\n", "round(np.mean(score)*100, 2)" ], "execution_count": 70, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "81.71" ] }, "metadata": { "tags": [] }, "execution_count": 70 } ] }, { "cell_type": "markdown", "metadata": { "id": "KZEJdDef42hu" }, "source": [ "## ৭. ক্যাগলে আপলোড \n", "\n", "প্রচুর ক্যাগলে আপলোড করেছেন আগের 'আর' এনভায়রনমেন্টে। এবার দেখবেন কী? 
এখানে আমরা একটা 'submission.csv' তৈরি করবো ক্যাগলে আপলোড করার জন্য। " ] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "hBsGnbPU42hu" }, "source": [ "clf = RandomForestClassifier(n_estimators=13)\n", "clf.fit(train_data, target)\n", "\n", "test_data = test.drop(\"PassengerId\", axis=1).copy()\n", "prediction = clf.predict(test_data)" ], "execution_count": 71, "outputs": [] }, { "cell_type": "code", "metadata": { "collapsed": true, "id": "rKrhA3BI42hu" }, "source": [ "submission = pd.DataFrame({\n", " \"PassengerId\": test[\"PassengerId\"],\n", " \"Survived\": prediction\n", " })\n", "\n", "submission.to_csv('submission.csv', index=False)" ], "execution_count": 72, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "TA5Eqdi_42hu" }, "source": [ "### সাবমিশন ফাইল তৈরি করে ভেতরে দেখা " ] }, { "cell_type": "code", "metadata": { "id": "HH6C5VTC42hu", "outputId": "a3ffe6e1-9947-4394-d87d-41cc068cc39a", "colab": { "base_uri": "https://localhost:8080/", "height": 0 } }, "source": [ "submission = pd.read_csv('submission.csv')\n", "submission.head()" ], "execution_count": 73, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>PassengerId</th><th>Survived</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>892</td><td>0</td></tr>\n",
"    <tr><th>1</th><td>893</td><td>0</td></tr>\n",
"    <tr><th>2</th><td>894</td><td>0</td></tr>\n",
"    <tr><th>3</th><td>895</td><td>0</td></tr>\n",
"    <tr><th>4</th><td>896</td><td>1</td></tr>\n",
"  </tbody>\n",
"</table>
" ], "text/plain": [ " PassengerId Survived\n", "0 892 0\n", "1 893 0\n", "2 894 0\n", "3 895 0\n", "4 896 1" ] }, "metadata": { "tags": [] }, "execution_count": 73 } ] }, { "cell_type": "markdown", "metadata": { "id": "rtyJWDiG42hu" }, "source": [ "## কৃতজ্ঞতা এবং অন্যান্য ব্যবহৃত নোটবুক \n", "\n", "এই নোটবুক তৈরি করা হয়েছে এই নোটবুকগুলোর ইনপুট নিয়ে, মিনসুকের ধারণাটা রেখেছি ইচ্ছে করে:\n", "\n", "- [Mukesh ChapagainTitanic Solution: A Beginner's Guide](https://www.kaggle.com/chapagain/titanic-solution-a-beginner-s-guide?scriptVersionId=1473689)\n", "- [How to score 0.8134 in Titanic Kaggle Challenge](http://ahmedbesbes.com/how-to-score-08134-in-titanic-kaggle-challenge.html)\n", "- [Titanic: factors to survive](https://olegleyz.github.io/titanic_factors.html)\n", "- [Titanic Survivors Dataset and Data Wrangling](http://www.codeastar.com/data-wrangling/)\n", "- [Minsuk-Heo cross validation, submit file generation](https://github.com/minsuk-heo/kaggle-titanic)\n", "- [Demonstrates basic data munging, analysis, and visualization techniques. supervised machine learning techniques](http://agconti.github.io/kaggle-titanic)" ] } ] }