{ "cells": [ { "cell_type": "markdown", "metadata": { "tags": [ "s1", "content", "l1" ] }, "source": [ "# Welcome to the Machine Learning Track!\n", "\n", "This track introduces a broad range of concepts applied in the day-to-day work of a Data Analyst. It follows a learning-by-doing methodology: each concept is accompanied by exercises, and milestone labs let you practice applying the concepts learned so far. The objective of this track is to develop the data analysis skills needed to collect, manipulate and present data for easy consumption by business users.\n", "\n", "## Data Life Cycle and the role of a Data Analyst\n", "\n", "Data and information are key pieces of any communication. By understanding and analyzing data, we gain insights into the business that stakeholders do not explicitly communicate. For example, when a customer purchases a product, they communicate only their intention to purchase it, along with any other information requested. By mining the data, however, we can find specific details about the transaction, such as the date of purchase, the purchase amount, the product details and the total number of orders for the same product placed by other customers on the same day. None of these details are explicitly communicated.\n", "\n", "Data Life Cycle: The life cycle of data starts at a source, which generates the data. Commercial web applications, enterprise systems, sensors, smartphones, servers, people and nearly every other entity capable of communication can be a source of data. Collecting, cleansing and loading the data is widely referred to as ETL (Extract, Transform, Load). Once the data is ready, it is given a structure based on the nature of the data and the associations each data item may have with others. This is done using the Entity-Relationship model, which will be discussed at length in future lessons. 
The data is then queried, that is, selectively extracted and manipulated for consumption by business users. Analysis of data using statistical techniques and algorithms falls under data science, while analysis of data using non-statistical techniques is usually the job of a data analyst.\n", "\n", "Role of a Data Analyst: Although job responsibilities and duties vary from organization to organization, the key functions of a Data Analyst include, but are not limited to, Data Acquisition, Data Cleansing and Transformation, Data Modelling, Data Manipulation and Analysis, and Data Visualization.\n", "\n", "## Exploratory Data Analysis (EDA)\n", "\n", "Among the various types of analysis, Exploratory Data Analysis (EDA) is one of the most common and important analyses performed by a data analyst or data scientist. EDA is the modeling and visualization of data in order to understand the nature and distribution of a data set. Common EDA techniques include displaying a section of the data set, displaying the data type of each attribute, and plotting graphs to show distributions (histograms, pie charts, etc.), frequency and probability distributions (bar plots, violin plots, etc.) and outliers (box plots, etc.).\n", "\n", "## What can I expect out of this course?\n", "\n", "This course provides a glimpse of the real-world functions of a Data Analyst. The content teaches each concept and immediately provides an opportunity to apply it. It introduces, and allows intensive practice of, the concepts that are essential to succeed as a Data Analyst. However, continuous practice is recommended to gain expertise in applying these concepts to the various situations that may be encountered in a professional environment." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "tags": [ "s1", "ce", "l1" ] }, "outputs": [], "source": [ "#Submit and Continue." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "tags": [ "s1", "l1", "hint" ] }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "tags": [ "s1", "l1", "ans" ] }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "tags": [ "s1", "hid", "l1" ] }, "outputs": [], "source": [ "ref_tmp_var = False\n", "\n", "ref_tmp_var = True\n", "\n", "assert ref_tmp_var" ] } ], "metadata": { "executed_sections": [], "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" }, "rf_version": 1 }, "nbformat": 4, "nbformat_minor": 2 }