{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "################################################################\n", "## ImportingData #2.1\n", "## Atul Singh\n", "## www.datagenx.net\n", "################################################################" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Importing Data #2\n", "### #2.1 Importing data from internet" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# import \n", "import numpy as np\n", "import pandas as pd\n", "\n", "from urllib.request import urlretrieve" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " fixed acidity volatile acidity citric acid residual sugar chlorides \\\n", "0 7.4 0.70 0.00 1.9 0.076 \n", "1 7.8 0.88 0.00 2.6 0.098 \n", "2 7.8 0.76 0.04 2.3 0.092 \n", "3 11.2 0.28 0.56 1.9 0.075 \n", "4 7.4 0.70 0.00 1.9 0.076 \n", "\n", " free sulfur dioxide total sulfur dioxide density pH sulphates \\\n", "0 11.0 34.0 0.9978 3.51 0.56 \n", "1 25.0 67.0 0.9968 3.20 0.68 \n", "2 15.0 54.0 0.9970 3.26 0.65 \n", "3 17.0 60.0 0.9980 3.16 0.58 \n", "4 11.0 34.0 0.9978 3.51 0.56 \n", "\n", " alcohol quality \n", "0 9.4 5 \n", "1 9.8 5 \n", "2 9.8 5 \n", "3 9.8 6 \n", "4 9.4 5 \n" ] } ], "source": [ "# Assign url of file: url\n", "#file = \"https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv\"\n", "\n", "# Save file locally\n", "#urlretrieve(file, 'winequality-red.csv')\n", "\n", "# Read file into a DataFrame and print its head\n", "df = pd.read_csv('winequality-red.csv', sep=';')\n", "# we can directly read the file as well\n", "# df = pd.read_csv(file, sep=';')\n", "print(df.head())" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Reading excel from internet" ] }, { 
"cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "dict_keys(['1700', '1900'])\n", " country 1700\n", "0 Afghanistan 34.565000\n", "1 Akrotiri and Dhekelia 34.616667\n", "2 Albania 41.312000\n", "3 Algeria 36.720000\n", "4 American Samoa -14.307000\n" ] } ], "source": [ "# Assign url of file: url\n", "url = 'http://s3.amazonaws.com/assets.datacamp.com/course/importing_data_into_r/latitude.xls'\n", "\n", "# Read in all sheets of Excel file: xl\n", "xl = pd.read_excel(url, sheet_name=None)\n", "\n", "# Print the sheet names to the shell\n", "print(xl.keys())\n", "\n", "# Print the head of the first sheet (using its name, NOT its index)\n", "print(pd.read_excel(url, sheet_name='1700').head())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Performing HTTP request with urllib" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'http.client.HTTPResponse'>\n" ] } ], "source": [ "# Import packages\n", "from urllib.request import Request, urlopen\n", "\n", "# Specify the url\n", "url = \"http://www.datacamp.com/teach/documentation\"\n", "\n", "# This packages the request: request\n", "request = Request(url)\n", "\n", "# Sends the request and catches the response: response\n", "response = urlopen(request)\n", "\n", "# Print the datatype of response\n", "print(type(response))" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Extract the response: html\n", "html = response.read()" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "b'\\n\\n\\t\\n\\n\\t\\t\\n\\t\\t\\n\\t\\t\\n\\n\\n\\t\\t\\n\\n\\n\\t\\t\\n\\t\\t\\n\\t\\t\\n\\n\\t\\t\\n\\n\\t\\t\\n\\t\\t\\n\\t\\t\\n\\t\\tDataCamp - Documentation for 
Teaching\\n\\t\\n\\n\\t\\n\\t\\t\\n\\t\\t
\\n\\t\\t\\t
\\n\\t\\t\\t
\\n\\t\\t\\t\\n\\t\\t\\t\\n
\\n
\\n \\n\\n
\\n
\\n
\\n
\\n\\n
\\n
\\n \\n
\\n
\\n\\n
\\n
\\n
\\n

Welcome

\\n\\n

Everybody can teach on the DataCamp platform. This documentation aims to be a comprehensive guide to creating your own course on DataCamp. We\\'ll go over the structure of a DataCamp course, how to upload and update your content and information on coding up all elements of the different types of DataCamp exercises.

\\n\\n

What is a DataCamp course?

\\n\\n

An example of a DataCamp course is Introduction to R. The course consists of 6 chapters, each of which in turn consists of different exercises. This course is represented by a set of text files, that you can find on GitHub: The course details such as its title and its description, are contained in course.yml. Each chapter is represented by a markdown file: general chapter information, as well as all required elements of the chapter\\'s exercises are coded up in a readable format.

\\n\\n

A DataCamp course

\\n\\n

The course.yml and chapter markdown files together represent an entire DataCamp course. We allow you to \"upload\" this information to DataCamp\\'s databases, so that people around the world can take your course. It does this through GitHub, the Git repository hosting service.

\\n\\n

A GitHub repository containing the course files can be linked to DataCamp. Every time the GitHub repository is updated, DataCamp notices, and updates the corresponding course. There is a course specific overview where you can follow up with your changes to your course files, see which commits caused a particular build, whether these builds passed or failed and why.

\\n\\n

As such, DataCamp bridges the gap between the textual representation of a DataCamp course in the form of files and the actual course on DataCamp. By relying on GitHub version control, reproducibility and collaboration are baked into the process.

\\n\\n\\n\\n\\n
\\n
\\n

Getting Started

\\n\\n
\\n \\n
\\n\\n

1. Create account and link to GitHub

\\n\\n\\n\\n

Visit www.datacamp.com/teach. If you are not yet logged in to DataCamp, click on Get Started. DataCamp will ask you to log in and connect your GitHub account to DataCamp. In order for things to work properly, you have to grant DataCamp several permissions, such as access to your public and private repositories on GitHub.

\\n\\n

2. Create or link repository

\\n\\n

Starting from scratch? Use the \"Create a DataCamp course\" dialog to create a GitHub repository for you with template course files. Next, this repository is automatically linked to DataCamp and built for you. In the dialog, simply mention under which GitHub account or organization you want to create a repository and choose an appropriate title for the repository (this repository title doesn\\'t have to bear the name of the actual DataCamp course title). Also make sure to specify whether you want to create an R or a Python course.

\\n\\n

Already have a GitHub repository with DataCamp course files? Click the \"link existing GitHub repository\" link in the \"Create a DataCamp course\" dialog, and hit the \"Link repository\" button for the repository you want to link to DataCamp.

\\n\\n

3. Start developing your course

\\n\\n

That\\'s it! A GitHub repository has been created for you, and it has been linked to DataCamp. You can check out your course dashboard with all your courses. You can click on a course for an overview about the changes you made. If something went wrong, you get meaningful error messages so that you can make things right again. Every time you make a change to the GitHub repository, the corresponding DataCamp course will be updated accordingly.

\\n\\n

To make changes to the source files in your repository, you can head over to your repository on GitHub, click a file and make and commit the changes there. Alternatively, you can clone the repository to your local system. After making the changes, you can commit and push your changes to the repository. In both cases, DataCamp is notified about changes to the repository and you can check your updated course on DataCamp.

\\n
\\n
\\n

Dashboard

\\n\\n

If you\\'re logged in on DataCamp with an account that is successfully linked to a GitHub account and you visit www.datacamp.com/teach, you get to see the Teach Dashboard. Here, you can create a new DataCamp course, link existing GitHub repositories with course files that represent a DataCamp course and browse through the GitHub repositories that you\\'ve linked to DataCamp.

\\n\\n

Create a DataCamp course dialog

\\n\\n

Starting from scratch? Use the \"Create a DataCamp course\" dialog to create a GitHub repository for you with template course files. Next, this repository is automatically linked to DataCamp and built for you. In the dialog, simply mention under which GitHub account or organization you want to create a repository and choose an appropriate title for the repository (this repository title doesn\\'t have to bear the name of the actual DataCamp course title). Also make sure to specify whether you want to create an R or a Python course.

\\n\\n

Already have a GitHub repository with DataCamp course files? Click the \"link existing GitHub repository\" link in the \"Create a DataCamp course\" dialog, and hit the \"Link repository\" button for the repository you want to link to DataCamp.

\\n\\n

From this point onwards, changes to the repository in question will cause the corresponding DataCamp course to update.

\\n\\n

GitHub repositories linked to DataCamp

\\n\\n

The dashboard also gives an overview of all GitHub repositories that are linked to DataCamp. Details about the last build, such as the time, the commit messages and the build results with possibly additional information are shown. If you click the repository title or the \"More Details\" link, you are redirected to a detailed overview of the repository. For more information about this view, visit the Repository Overview article.

\\n\\n\\n
\\n
\\n

Repository Overview

\\n\\n

You can access the repository overview by clicking the name of the repository on the Dashboard. This overview contains all the information to inspect and manage your repository. The overview pane features a tab for every branch in your GitHub repository. Every branch has its own overview. This is because each branch on GitHub corresponds to a different course on DataCamp. The master branch always corresponds to the \\'production\\' version of your course. By making a separate course for each branch, it becomes easy to work on big updates and to collaborate on a course: you simply create a new branch and make changes on the branch. You can add a new branch via the repository overview by clicking on the + sign at the top of your repository window. Once the new branch is created, you can start making updates to that branch. That way, you can share a \\'development version\\' of your course with a test audience, before merging those changes into the master branch, corresponding to the \\'production\\' version of the course.

\\n\\n

Each \\'branch overview\\' shows you the title, author, course image and author badge of the course. If you click the course title, you are redirected to the course corresponding to the selected branch on DataCamp. Next to this general information, the branch overview contains tabs with the most recent build, the build history and the settings. All of this information - title, images, builds, settings - is branch-specific.

\\n\\n

Build Information

\\n\\n

Whenever you make a change to a particular branch on your GitHub repository, DataCamp notices and starts a build attempt. It fetches all the course files from the branch, parses them, performs validity checks on them, and uploads them to DataCamp\\'s databases, so that they are available on the learning platform.

\\n\\n

Each such build attempt is listed in the Build History tab. The collapsed view shows the state of the attempt (in progress, warning, passed or failed), the time and last contributor to the changes. You can click the \"Show Details\" link to display all commit messages (you can click the commit hashes to inspect them on GitHub) and the build logs that are produced by building the course. If something went wrong during the build, it is here that you can figure out what went wrong and make appropriate changes. Some of the build logs will be clickable and will guide you towards the place where you should correct your markdown files.

\\n\\n

The \"Most Recent Build\" tab is simply an already expanded version of the last build attempt you\\'ve made, with the exact same information.

\\n\\n

Settings

\\n\\n

Under the \"Settings\" tab, you can adapt the settings of a particular repository\\'s branch.

\\n\\n

On every build attempt, all exercises in your chapters can be tested for correctness: it checks whether running the exercise\\'s solution happens without errors and whether the Submission Correctness Test passes, presenting the success message to the student. When you push changes to a branch, by default only the chapters that are updated are tested. This setting can be updated to test all chapters whenever a change is made, or to test no chapters whatsoever. It is suggested to not test chapters when your course is still in early development, and you don\\'t expect the exercises to all work. Later, when your course becomes ready for production, you can switch back to testing the chapters that are updated. Note: Your changes will be updated in the corresponding DataCamp course, even though some exercises might fail.

\\n\\n

The \\'force\\' setting decides whether or not to directly remove exercises that are no longer in your chapter. If force is disabled, and you remove an exercise from a chapter file, the exercise you\\'ve removed will still appear on DataCamp. If you enable force, exercises that are no longer in your chapter file, will also be removed from DataCamp\\'s databases.

\\n\\n

Finally, you can also decide to unlink your repository. If you do this, DataCamp will no longer listen to changes that are made on the repository, and no more builds will be triggered whenever you make updates. After unlinking, you can always link your course again through the Dashboard with the courses you\\'ve made (link existing repository).

\\n\\n
\\n
\\n

Teach Editor

\\n\\nCourses can be edited using the Teach Editor, an editor specifically built to create\\nDataCamp courses. This editor can be reached by clicking the Edit this Course\\nbutton on the course overview Repository Overview page.\\n\\n

Features Overview

\\n\\nFeatures of the editor include:\\n\\n
    \\n
  • Edit: edit your chapter files in a markdown editor. The editor\\n includes syntax highlighting, autocomplete, undo/redo, ...
  • \\n
  • Preview: preview your course. This allows you to take a\\n look at how your changes will look on DataCamp, before actually pushing the content\\n live.
  • \\n
  • Save: save your progress directly to GitHub. Saving in the editor\\n triggers a commit in the GitHub repository.
  • \\n
  • Add datasets: add datasets by clicking on the Datasets\\n button and using the Dataset Manager to upload your file or choose to copy the link to a\\n dataset that has already been uploaded to our servers.
  • \\n
  • And many more...
  • \\n
\\n\\n

GitHub synchronization

\\n\\nWith the Teach Editor, you edit markdown files that are also on GitHub. While using the editor,\\na version of the file is kept on the Teach Editor server. This means that when you accidentally\\nclose your browser window, your changes will not be lost. They\\'re saved to our Teach Editor\\nserver! However they will be thrown away eventually (after a while). So if you\\'re done working,\\nit\\'s always best to save your progress to GitHub.

\\n\\nHolding a temporary version (or cache) of the markdown files on our server has the side effect that these versions can start diverging from the versions of the files that are on GitHub. As an example, suppose you\\'re editing the first chapter file in the Editor, and while you\\'re doing that someone else commits a change on GitHub. The two versions will start diverging, and when you save the progress you made in the Teach Editor, you might overwrite the changes made via the GitHub commit.\\n\\nThe editor has some countermeasures against several bad situations like the one above.\\n\\n
    \\n
  • \\n Situation: there\\'s a commit on GitHub while,\\n
      \\n
    • No one is using the editor for this course
    • \\n
    • Last time editor was used for this course, everything got saved
    • \\n
    \\n Steps: \\n
      \\n
    1. No steps have to be taken, the cache is invalidated by the GitHub push and next time a user uses the editor for this course, the new content will be pulled from GitHub.
    2. \\n
    \\n
  • \\n
  • \\n Situation: there\\'s a commit on GitHub while,\\n
      \\n
    • Someone is using the editor for this course
    • \\n
    • All changes just got saved
    • \\n
    \\n This can happen when someone uses the editor, saves and leaves the editor browser open.
    \\n Steps: \\n
      \\n
    1. While the user is working in the editor, the following popup appears (this can take up to 20 seconds)\\n \"Editor\\n
    2. \\n
    3. User chooses:\\n
        \\n
      • Sync with newest version will pull the latest version from GitHub, this action is advised as the green notification means you have no cached changes that will be lost.
      • \\n
      • Keep this version will let the user keep working with this version, no content will be pulled from GitHub. Next time you save, you could overwrite changes that were made on GitHub. If this is the case, a conflict branch will be created. The previous changes on GitHub will be saved in the conflict branch.
      • \\n
      \\n
    4. \\n
    \\n
  • \\n
  • \\n Situation: there\\'s a commit on GitHub while,\\n
      \\n
    • No one is using the editor for this course
    • \\n
    • Not all changes are saved in the editor
    • \\n
    \\n Steps: \\n
      \\n
    1. The cached changes can not just be invalidated like in the first situation, because this would mean losing content
    2. \\n
    3. The next time when someone opens up the course, the following popup appears\\n \"Editor\\n
    4. \\n
    5. User chooses:\\n
        \\n
      • Sync with newest version will pull the latest version from GitHub, this will cause the user to lose the changes that are currently only made on the temporary files on the Teach Editor server.
      • \\n
      • Keep this version will let the user keep working with this version, no content will be pulled from GitHub. Next time you save, you could overwrite changes that were made on GitHub. If this is the case, a conflict branch will be created. The previous changes on GitHub will be saved in the conflict branch.
      • \\n
      \\n
    6. \\n
    \\n
  • \\n
  • \\n When the last situation happens if someone is using the editor, the red notification won\\'t be shown automatically without a refresh (in contrast to the green notification). It will only show after a refresh.\\n
  • \\n
\\n\\n
\\n
\\n

Course Structure

\\n

A DataCamp course, such as Introduction to R consists of chapters; each chapter in its turn consists of exercises.\\n\\n

If you used DataCamp to create a repository for you, this repository will contain three files:

\\n\\n
    \\n\\t
  • README.md: A readme file with a list of resources and some more explanation to get you started.
  • \\n\\t
  • course.yml: A YAML-formatted file that contains general information about your course.
  • \\n\\t
  • chapter1.Rmd: A chapter file with some exercise templates.
  • \\n
\\n\\n

Course file: course.yml

\\n

All general information concerning the course is contained in the YAML-formatted course.yml file.

\\n
    \\n\\t
  • title: Title of the course.
  • \\n\\t
  • \\n\\t\\tinstructors: A YAML-formatted list of instructors of this course. This should be a list of email addresses of users on DataCamp. The biography and picture of these instructors can be edited through their profile settings page on DataCamp.\\n\\n\\t\\tFor example:\\n
    instructors:\\n  - email@datacamp.com
    \\n\\t
  • \\n\\t
  • \\n\\t\\tcollaborators: Same principle as instructors. Collaborators are people who assisted in creating the course.\\n\\n\\t\\tFor example:\\n
    collaborators:\\n  - email@datacamp.com
    \\n\\t
  • \\n\\t
  • university: Organization, university or company the author is linked to.
  • \\n\\t
  • description: Description of the course.
  • \\n\\t
  • programming_language: Programming language of the course (r or python).
  • \\n\\t
  • difficulty_level: 1, 2 or 3, depending on the difficulty of the course. 1 corresponds to beginner, 2 to intermediate and 3 to advanced.\\n\\t
  • time_needed: The time needed to finish the course. Use lowercase, for example \"4 hours\".\\n
\\n\\n

You can update the course.yml at all times: simply commit your changes to the GitHub repository and the DataCamp course will be updated accordingly.

\\n\\n

Chapter files: chapterX.(R)md

\\n

A course can contain multiple chapters. These chapters should be named chapter1.Rmd, chapter2.Rmd, chapter3.Rmd etc for R courses, and chapter1.md, chapter2.md and chapter3.md for Python courses. The number in the file name determines the order in which the chapters appear in the course.

\\n\\n

The chapter file consists of two parts: the chapter-specific information, and the exercises.

\\n\\n
Chapter-specific information
\\n

Similar to the course.yml file, chapter files start with a YAML header containing information about the chapter:

\\n\\n
    \\n\\t
  • title: Title of the chapter.
  • \\n\\t
  • description: Description of the chapter.
  • \\n\\t
  • free_preview: If it\\'s a paying course, whether this chapter is a free preview (typically the first chapter of a premium course is a free preview).
  • \\n\\t
  • \\n\\t\\tattachments: Generic way to include attachments. Typically used to include slides for the chapter, through slides_link.\\n\\n\\t\\tFor example:\\n
    attachments:\\n  slides_link: http://link.to.slides/
    \\n\\t
  • \\n
\\n\\n
Exercises
\\nAfter the YAML header inside the chapterX.(R)md files, you can add different exercises, one after the other. There are three types of exercises:\\n
    \\n\\t
  • VideoExercise: A tutorial video, explaining concepts and code that will be used in the exercises that follow. VideoExercises are typically only part of premium courses that require a subscription.\\n\\t
  • NormalExercise: An interactive exercise, where the student is expected to submit code based on the assignment and instructions provided. The student\\xe2\\x80\\x99s submission is compared to the ideal solution with DataCamp\\xe2\\x80\\x99s autograder and appropriate feedback is generated.\\n\\t
  • MultipleChoiceExercise: Provides the student with a set of possible answers. Based on the exercise information and the instruction, the student is expected to select one (and only one) answer. During a multiple choice exercise, the user can still experiment in the R console.
  • \\n
\\n\\n

For more details on how these exercises can be coded, check the Code Exercises article.

\\n\\n
Making changes
\\n\\n

You can update the different chapterX.(R)md files in your repositories at all times: simply commit your changes to the GitHub repository and the corresponding DataCamp course will be updated accordingly. Want to remove a chapter? Simply remove the chapter file from the repository. Want to change the order of the chapters? Simply rename the chapters to adapt the numbering, and the chapters will be reordered in the corresponding DataCamp course.

\\n\\n\\n
\\n
\\n

Code Exercises

\\n\\n

If you\\'ve created a course repository through www.datacamp.com/teach, your chapter file will already contain some DataCamp exercises to start from. There are three types of exercises: a NormalExercise, a MultipleChoiceExercise and a VideoExercise. These different types of exercises have some elements in common; other elements are type-specific.

\\n\\n
Submission Correctness Tests
\\n\\n

Both NormalExercises and MultipleChoiceExercises require you to write so-called Submission Correctness Tests, to assess whether the student gave the correct answer. Writing good SCTs is a challenge; that\\'s why programming language-specific packages have been written to make this process as easy as possible. For R, there\\'s the testwhat package (GitHub wiki). For Python, there\\'s the pythonwhat package (GitHub wiki).\\n\\n

Making Changes
\\n\\n

Changes you make are automatically reflected in the corresponding chapter on DataCamp once you commit and push the changes to the GitHub repository. Because DataCamp needs a way to know which exercises in the chapter files correspond to which exercises in DataCamp\\'s databases, so-called exercise keys are used. Whenever you add a new exercise to a chapter, DataCamp will add a key to the chapter file on your behalf. From then on, this key acts as a unique identifier of that exercise. As soon as there is a key, you can safely change the title and type of the exercise, without risking to lose any progress that students might have made.

\\n\\n

Exercise Examples

\\n\\n
NormalExercise
\\n\\n

The NormalExercise, which is the traditional interactive exercise, is a chapter component where the student is expected to write code. Additional information regarding the exercise or the data set that is used is stated in the assignment part; the actual tasks the student has to solve through code are outlined in the instructions. Behind the scenes, the workspace is prepared for the student\\xe2\\x80\\x99s actions using the pre-exercise code. The student starts to code in the editor that is initialized with the sample code. When the student is not able to solve the exercise, he or she can refer to the hint or, ultimately, can peek at the solution. Every time the student hits the \\xe2\\x80\\x98Submit Code\\xe2\\x80\\x99 button, his or her code is checked for correctness using the submission correctness testing code. This SCT generates a feedback message that depends on the input of the student.

\\n\\n

The example code below pre-sets a variable y with the value 3, before the exercise even starts. The sample code acts as a \\xe2\\x80\\x98fill-in form\\xe2\\x80\\x99 where students can write code. Notice from the exercise header that you can specify two skills, 1 (R) and 3 (Data Handling) in this case. To learn more about skills and xp, check out the Gamification article.

\\n\\n
\\n--- type:NormalExercise lang:r xp:100 skills:1,3\\n## Interactive Exercise Title\\n\\nThis basic exercise will challenge you to assign a variable in R.\\n\\n*** =instructions\\n- Assign `5` to the variable `x` in the editor on the right.\\n\\n*** =hint\\nUse `<-` for assignment.\\n\\n*** =pre_exercise_code\\n```{r}\\ny <- 3\\n```\\n\\n*** =sample_code\\n```{r}\\n# Assign 5 to the variable x\\n```\\n\\n*** =solution\\n```{r}\\n# Assign 5 to the variable x\\nx <- 5\\n```\\n\\n*** =sct\\n```{r}\\ntest_error()\\ntest_object(\"x\",\\n            undefined_msg = \"Make sure to define `x`!\",\\n            incorrect_msg = \"Have you correctly assigned 5 to `x`!\")\\nsuccess_msg(\"Good job! Head over to the next exercise\")\\n```\\n
\\n\\n

If you need entire data sets in your exercise, it doesn\\'t make sense to include them as text in the pre-exercise code. Moreover, if you\\'re using the same dataset across different exercises, this would imply extreme code duplication. What you can do instead is save your dataset as a flat file and store it in the `datasets` folder to make it available online. Read through the Upload Assets article to learn more.

\\n\\n
Multiple Choice Exercise
\\n\\n

Multiple choice exercises offer the student a number of options from which he/she has to choose. The assignment introduces the exercise and asks a question. The instructions consist of an unnumbered list of options, which are compiled to radio buttons in the DataCamp environment. As in interactive exercises, the pre-exercise code and the submission correctness tests have to be specified. Unlike interactive exercises, there is neither sample nor solution code for this type of exercise.

\\n\\n
\\n--- type:MultipleChoiceExercise lang:r xp:50 skills:3\\n## Multiple Choice Exercise Title\\n\\nThis is the assignment.\\nHow much is `2 + 2`?\\n\\n*** =instructions\\n- 2\\n- 3\\n- 4\\n\\n*** =hint\\nHere\\'s a hint: think about your time in primary school.\\n\\n*** =pre_exercise_code\\n```{r}\\n# no pec\\n```\\n\\n*** =sct\\n```{r}\\nmsg1 = \"Try again! Think about your time in school.\"\\nmsg2 = \"Summing two even numbers always leads to another even number. Try again.\"\\nmsg3 = \"Well done. Proceed to the next exercise\"\\ntest_mc(correct = 3, feedback_msgs = c(msg1,msg2,msg3))\\n```\\n
\\n\\n
VideoExercise
\\n\\n

Video exercises feature an instructional video. The finished video shows the instructor explaining concepts with slides that are rendered in the background. The video files are uploaded to Vimeo and the embed link is added to the video exercise. Inside a chapter file (e.g. chapter1.md) a video lecture is added as follows:

\\n\\n
\\n--- type:VideoExercise lang:r aspect_ratio:62.5 xp:50 skills:1\\n## Video Exercise Title\\n\\n*** =video_link\\n//player.vimeo.com/video/108225030\\n
\\n\\n

This code will generate a video exercise that is titled \\xe2\\x80\\x9cVideo Exercise Title\\xe2\\x80\\x9d. It refers to the Vimeo video with id 108225030 and has an aspect ratio of 62.5 percent, which corresponds to a 1728x1080 video; the default ratio is 16:9. Notice the exercise identifier (--- type:VideoExercise). This video, if watched, leads to 50 experience points for the R skill (id = 1). To learn more about skills and xp, check out the Gamification article.

\\n\\n\\n

Python or R?

\\n\\n

On DataCamp, you can teach both R and Python. The different parts of the exercises are all the same, with the only difference that you include Python code in the code blocks instead of R code.

\\n\\n

To code a python exercise, you need to do three things:

\\n\\n
    \\n
  • In the course.yml file, add programming_language: python to the YAML.
  • \\n
  • In the exercise header (--- type:...) set lang:python instead of lang:r.
  • \\n
  • The language specification inside curly brackets for all code blocks ({r}) should be changed to {python}.
  • \\n
\\n\\n

As a comparison of R and Python exercise, check the two short snippets below. Typically, the skills will be different too (Python has skill id 2).

\\n\\n

Example of R exercise:

\\n\\n
\\n--- type:NormalExercise lang:r xp:100 skills:1\\n\\n... _assignment and instructions omitted_\\n\\n*** =pre_exercise_code\\n```{r}\\n# this is the pre_exercise_code\\nlibrary(dplyr)\\n```\\n\\n... _rest of exercise omitted_\\n
\\n\\n

Example of Python exercise:

\\n\\n
\\n--- type:PythonExercise lang:python xp:100 skills:2\\n\\n... _assignment and instructions omitted_\\n\\n*** =pre_exercise_code\\n```{python}\\n# this is the pre_exercise_code\\nimport numpy as np\\n```\\n\\n... _rest of exercise omitted_\\n
\\n\\n\\n

Summary of exercise elements

\\n\\n
Property (Video / Normal / Multiple)
type: exercise type.
key: exercise key (don\\'t change or set this manually)
lang: Programming language of the exercise.
title: Title of the exercise
assignment: Text that clarifies or introduces the exercises.
instructions: The actual question or task for the student. In the case of a multiple choice exercise, the instructions list the options from which the student can choose.
hint: Hint that can help the student to correctly solve the exercise.
pre_exercise_code: Code that is run in the background when the exercise is loaded. It typically loads the necessary libraries and defines variables to allow the student to start solving the exercise right away.
sample_code: Initial code that is available to the user at the onset of the exercise, to better guide the student to a solution or to limit typing efforts.
solution: Correct solution to an interactive exercise, showing how the exercise should ideally be solved. Other approaches that also lead to a correct result should be accepted as well.
sct: Code that is executed to test the correctness of the student's submission.
video_link: Link to the video lecture on Vimeo.
aspect_ratio: The aspect ratio of a video lecture, as a percentage, calculated as height / width * 100.
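As a quick check of the aspect_ratio formula (height / width * 100), here is a minimal Python sketch. The function name `aspect_ratio_percent` is made up for illustration; only the formula and the 1728x1080 example come from this documentation.

```python
def aspect_ratio_percent(width, height):
    """Return the aspect ratio as a percentage: height / width * 100."""
    return height / width * 100


# The 1728x1080 video from the video exercise example
print(aspect_ratio_percent(1728, 1080))  # 62.5

# A 16:9 video
print(aspect_ratio_percent(16, 9))  # 56.25
```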
\\n
\\n\\n\\n
\\n
\\n

Upload Assets

\\n\\n

A DataCamp course has a course badge, and your exercises will typically use datasets. DataCamp allows you to add these assets to your course through GitHub as well.

\\n\\n

Images

\\n\\n

When you create a course with the \"Create a DataCamp Course\" dialog, DataCamp will create the course files and an img folder with a template course shield and author image. The course shield is named shield_image.png, the author image is called author_image.png. You can replace these files with your own images; just make sure to use the same names and keep the files under 1 MB. As soon as you push the changes to GitHub, the images of your course will be updated.

\\n\\n

Datasets

\\n\\n

To use a dataset in your exercises, create a datasets folder on the root level of the repository. Every file in the `datasets` folder with a typical dataset extension will be uploaded whenever you push the changes to GitHub. The build logs that you can find in the repository overview will inform you about the upload of these datasets to DataCamp's Amazon S3 servers, and provide a link. You can then use this link in the pre_exercise_code chunk of your exercises to initialize the workspace for the student.\\n\\n

Example for loading in an RData file in an R exercise:

\\n\\n
\\n*** =pre_exercise_code\\n```{r}\\nload(url(\"http://assets.datacamp.com/production/course_123/my_file.RData\"))\\n```\\n
\\n\\n

Example pre exercise code for loading in a CSV file in an R exercise:

\\n\\n
\\n*** =pre_exercise_code\\n```{r}\\nmovies <- read.csv(\"http://assets.datacamp.com/production/course_123/movies.csv\", stringsAsFactors = FALSE)\\n```\\n
\\n\\n

Example pre exercise code for loading in a CSV file in a Python exercise (with Pandas):

\\n\\n
\\n*** =pre_exercise_code\\n```{python}\\nimport pandas as pd\\ndf = pd.read_csv(\\'http://assets.datacamp.com/production/course_234/gapminder.csv\\', index_col = 0)\\n```\\n
\\n\\n\\n

To limit the resource usage of the exercises, you can upload datasets up to 10 MB in size. In addition, to guard against misuse, only files with extensions csv, xlsx, RData, Rds, Rda, hdf5, dta, p, sas7bdat, mat, sqlite3 and db3 are supported. If there are other file extensions that you want to upload, please contact support.
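Before pushing, you could check a file against these two limits locally. The sketch below is not DataCamp's actual upload check: the helper name `can_upload` is hypothetical, and it assumes that the size limit means 10 * 1024 * 1024 bytes and that extensions are matched case-sensitively, exactly as listed.

```python
import os

# Extensions listed in the documentation
ALLOWED_EXTENSIONS = {
    "csv", "xlsx", "RData", "Rds", "Rda", "hdf5",
    "dta", "p", "sas7bdat", "mat", "sqlite3", "db3",
}

# Assumption: the 10 MB limit is 10 * 1024 * 1024 bytes
MAX_SIZE_BYTES = 10 * 1024 * 1024


def can_upload(filename, size_bytes):
    """Return True if the file has a supported extension and size."""
    extension = os.path.splitext(filename)[1].lstrip(".")
    return extension in ALLOWED_EXTENSIONS and size_bytes <= MAX_SIZE_BYTES


print(can_upload("movies.csv", 250000))
print(can_upload("notes.txt", 250000))
```

For a file on disk, `os.path.getsize(path)` gives the value to pass as `size_bytes`.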

\\n\\n

If you want to make self-defined functions available in your R exercise, don't use an RData file to store these functions. Because of the way our submission checking system works (the student and solution code are executed in separate environments), storing functions that were assigned in the global R environment in an RData file, to use them later, causes issues. Instead, write the functions in a script that you store in your datasets folder and use eval(parse(\"link_to_script_url\")) in your pre exercise code to make the functions available.

\\n\\n

Slides and Videos

\\n\\n

Premium DataCamp courses feature videos and corresponding slides. To upload slides, you can work in the same way as for datasets. This time though, make sure you upload PDF files to a folder slides in the repository\\'s root directory. Again, the build logs will generate a link that you can use to specify the slides_link in the chapter header, as detailed in the Course Structure article of this documentation.

\\n\\n

Because videos are typically very big files, they should not be added to your GitHub repository. Contact support if there are videos that you want to add to your course, and we will help you.

\\n\\n
\\n
\\n

Gamification

\\n\\n

To enable DataCamp's gamification features, you have to specify one or more skills and a number of experience points (XP) for each exercise. A MultipleChoiceExercise or VideoExercise typically gets 50 XP; a NormalExercise normally gets 100 XP. The skill that you attribute to your exercise depends on its contents. Have a look at the header of this interactive R exercise:\\n\\n

\\n--- type:NormalExercise lang:r xp:100 skills:1,4\\n## Scatter plot\\n\\n... _rest of exercise omitted_\\n\\n
\\n\\n

This interactive exercise earns the student 100 xp for the R skill (1) and the visualization skill (4) after successfully completing the exercise. The student can check the earned XP on his or her personal profile.

\\n\\n

If the student asks for the hint, 30 percent of the exercise\\'s XP will be deducted; if the student asks for the solution, all XP will be deducted. The student can come back after 24 hours to try again and earn full XP again. As soon as you finish an exercise, DataCamp\\'s databases will consider it completed, irrespective of the xp that you earn. It\\'s hence perfectly possible to finish an entire course and earn a certificate, but to earn no experience points.
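The deduction rules above can be written out as a small Python sketch. The function name `earned_xp` is hypothetical, and rounding the 30 percent deduction down to a whole number is an assumption; the documentation does not say how fractional XP is handled.

```python
def earned_xp(base_xp, used_hint=False, used_solution=False):
    """Return the XP left after hint and solution deductions."""
    # Asking for the solution deducts all XP
    if used_solution:
        return 0
    # Asking for the hint deducts 30 percent (assumption: rounded down)
    if used_hint:
        return base_xp - base_xp * 30 // 100
    return base_xp


print(earned_xp(100))                      # 100
print(earned_xp(100, used_hint=True))      # 70
print(earned_xp(100, used_solution=True))  # 0
```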

\\n\\n

At the moment, the following skills, with their corresponding skill ids, are available on DataCamp:

\\n\\n
Skill: skill id
R Programming: 1
Python Programming: 2
Data Handling: 3
Data Manipulation: 4
Reporting: 5
Machine Learning: 6
Statistical Modeling: 7
Importing & Cleaning Data: 8
\\n
\\n
\\n
\\n

General Style Guide

\\n\\n

Writing Style

\\n\\n
    \\n
  • Use American English for public courses.
  • \\n
  • Correct punctuation should be present in: Assignment, Hint, Instructions, Submission Correctness Tests.
  • \\n
  • Use the “you”-form to give chapters a personal touch. Limit the we-ing.
  • \\n
  • Do a double check on spelling mistakes.
  • \\n
\\n\\n

Assignment, Instructions, Hint

\\n\\n
    \\n
  • Code is placed in backticks, to give code formatting: my_var, my_fun().
  • \\n
  • Package names are written in code style, using backticks.
  • \\n
  • Functions and methods are written inside backticks, and with trailing (): my_fun().
  • \\n
  • Blocks of code are formatted using three backticks:
  • \\n
\\n\\n
# Good\\n```\\nprint(\"This is a complex sum:\")\\nprint(3 + 5)\\n```\\n\\n# Bad\\n`\\nprint(\"This is a complex sum:\")\\nprint(3 + 5)\\n`\\n
\\n\\n
    \\n
  • Emphasize words with italics. Use bold to introduce new concepts.
  • \\n
  • Place mathematical expressions between $ signs. It will be compiled as LaTeX output.
  • \\n
\\n\\n
# Good\\n$ a_1 = b_1 + c_1 $\\n\\n# Bad\\na_1 = b_1 + c_1\\n
\\n\\n

Commenting guidelines

\\n\\n
    \\n
  • Keep comments short and sweet, ideally 60 characters or less. They don\\'t need to repeat an entire instruction and shouldn\\'t introduce any new information. Think of them as placeholders in the editor, which generally map 1-1 to instructions.
  • \\n
  • To refer to variables or functions inside comments, don't use backticks or quotes:
  • \\n
\\n\\n
# Good way to refer to a variable x or the function my_fun()\\n\\n# Bad way to refer to a variable `x` or the function \\'my_fun()\\'\\n
\\n\\n

Submission Correctness Test

\\n\\n
    \\n
  • You can use markdown syntax inside the R strings of the SCT feedback, for function names, variable names, package names etc. testwhat converts the strings to HTML so that they are correctly rendered.
  • \\n
\\n\\n
# Good\\ntest_object(\"x\",incorrect_msg = \"Have you specified the variable `x` correctly?\")\\nsuccess_msg(\"Well done. `x` is a variable.\")\\nsuccess_msg(\"Well done. `mean()` is a function.\")\\n\\n# Bad\\ntest_object(\"x\",incorrect_msg = \"Have you specified the variable x correctly?\")\\nsuccess_msg(\"Well done. \\'x\\' is a variable.\")\\nsuccess_msg(\"Well done. \\\\\"mean()\\\\\" is a function.\")\\n
\\n\\n
\\n
\\n

R Style Guide

\\n

The R style guide is largely based on the style guide put together by Hadley Wickham. It shows small differences to better match educational needs, and introduces some new elements that are specifically relevant for content in the DataCamp format.

\\n
Assignment
\\n
Use <- for assignment\\n
x <- 3
\\n
\\n
x = 3
\\n
\\n
String Quotes
\\n
Pick either single or double quotes and stick with it\\n
x <- c(\"a\", \"b\")
\\n
\\n
x <- c(\"a\", \\'b\\')
\\n
Choose alteration over escaping\\n
print(\\'She said: \"nice\"\\')
\\n
\\n
print(\"She said: \\\\\"nice\\\\\"\")
\\n
\\n
Whitespace
\\n
\\n

Always surround infix operators with a single space on either side

\\n
Assignment\\n
y <- 2
\\n
\\n
y<-2
\\n
Comparisons\\n
y <= 2
\\n
\\n
y<=2
\\n
Boolean\\n
(x > 2) || (y < 5)
\\n
\\n
(x > 2)||(y < 5)
\\n
Arithmetic\\n
y <- x + 1
\\n
\\n
y <- x+1
\\n
\\n

Special cases

\\n
Always place a space after a comma, never before\\n
mean(c(1, 2, 3))
\\n
\\n
mean(c(1,2,3))
\\n
:, :: and ::: do not need spaces\\n
base::mean(1:3)
\\n
\\n
base :: mean(1 : 3)
\\n
Place a space before left parentheses, except in a function call\\n
if (debug) do(x)
\\n
\\n
if(debug) do (x)
\\n
No spaces around code in parentheses or square brackets (unless there's a comma)\\n
mtcars[5, ]
\\n
\\n
mtcars[5,]
\\n
\\n
Curly braces
\\n
Curly braces don\\'t mix in with code.\\n
if (grade > 10) {\\n  message(\\'passed\\')\\n}
\\n
\\n
if (grade > 10)\\n{ message(\\'passed\\') }
\\n
It\\'s ok to leave very short statements on the same line\\n
if (y < 0) message(\"negative\")
\\n
\\n
\\n
Indentation
\\n
Use 2 spaces inside curly brackets\\n
if (var == 3) {\\n  print(\"nice\")\\n}
\\n
\\n
if (var == 3) {\\nprint(\"nice\")\\n}
\\n
Keep aligned with opening delimiter\\n
foo <- fun(one, two,\\n           three, four)
\\n
\\n
foo <- fun(one, two,\\n    three, four)
\\n
\\n
Naming Styles and Conventions
\\n
Variable and function names are lowercase\\n
datacamp
\\n
\\n
DataCamp
\\n
Use an underscore to separate words within a name\\n
is_valid
\\n
\\n
isValid\\nis.valid
\\n
Variable names are nouns, function names are verbs\\n
my_val <- 5\\ncheck_validity(my_val)
\\n
\\n
check_validity <- 5\\nmy_val(check_validity)
\\n
\\n
Function calls
\\n
The first argument shall not be called by its name, typically other arguments should be named\\n
points(point,\\n       cex = .5,\\n       col = \"red\")
\\n
\\n
points(x = point,\\n       cex = .5,\\n       col = \"red\")
\\n
\\n
Comments
\\n
Use a single hashtag\\n
# This is a good comment
\\n
\\n
## Avoid multiple hashtags
\\n
Use one space after hashtag\\n
# This is a good comment
\\n
\\n
#Avoid this
\\n
Start every sentence with a capital letter\\n
# This is nice
\\n
\\n
# this is nasty
\\n
Single sentences don't require a period. Multiple sentences do.\\n
# Sentence\\ndo_something()\\n\\n# Sentence1. Sentence2.\\ndo_stuff()
\\n
\\n
# Sentence.\\ndo_something()\\n\\n# Sentence1. Sentence2\\ndo_stuff()
\\n
\\n\\n\\n
\\n
\\n

Python Style Guide

\\n

The Python style guide you see below is a concise and slightly altered version of PEP 8, the widely accepted style guide for Python code. Some guidelines are added to improve the educational value and clarity of the code on DataCamp. If you can't find an answer to your styling questions in this document, please refer to the PEP 8 webpage.
\\n
Indentation
\\n
\\n

Spaces are preferred over tabs and max line length is 79 characters\\n

Use 4 spaces\\n
if var == 3:\\n    print(\"nice\")
\\n
\\n
if var == 3:\\n  print(\"nice\")
\\n
Keep aligned with opening delimiter\\n
foo = func(one, two,\\n           three, four)
\\n
\\n
foo = func(one, two,\\n    three, four)
\\n
Add more indentation to distinguish\\n
if (condition1\\n        and condition2):\\n    do_something()
\\n
\\n
if (condition1\\n    and condition2):\\n    do_something()
\\n
Close under the first character of the starting line\\n
my_list = [\\n    1, 2, 3,\\n    4, 5, 6\\n]
\\n
\\n
my_list = [\\n    1, 2, 3,\\n    4, 5, 6\\n    ]
\\n
\\n
Imports
\\n
\\n

Put imports on top of the file

\\n
Imports should be on separate lines\\n
import os\\nimport sys
\\n
\\n
import os, sys
\\n
Avoid wildcard imports\\n \\n
from pythonwhat import *
\\n
It\\'s best to use import only for packages and modules\\n
import math\\nprint(math.pi)
\\n
\\n
from math import pi\\nprint(pi)
\\n
\\n
String Quotes
\\n
Pick either single or double quotes and stick with it\\n
x = [\"a\", \"b\"]
\\n
\\n
x = [\"a\", \\'b\\']
\\n
Choose alteration over escaping\\n
print(\\'She said: \"nice\"\\')
\\n
\\n
print(\"She said: \\\\\"nice\\\\\"\")
\\n
\\n
Whitespace
\\n
\\n

Always surround binary operators with a single space on either side

\\n
Assignment\\n
y = 2
\\n
\\n
y=2
\\n
Comparisons\\n
y <= 2
\\n
\\n
y<=2
\\n
Boolean\\n
(x > 2) and (y < 5)
\\n
\\n
(x > 2)and(y < 5)
\\n
Arithmetic\\n
y = x + 1
\\n
\\n
y = x+1
\\n
\\n

Special cases

\\n
No space before comma, semicolon or colon\\n
def func(one, two):\\n    print(one + two);
\\n
\\n
def func(one , two) :\\n      print(one + two) ;
No space around equal sign for default parameter\\n
def func(one=1, two=3):\\n    print(one + two)
\\n
\\n
def func(one = 1, two = 3):\\n      print(one + two)
\\n
Naming Styles and Conventions
\\n
Variable and function names are lowercase\\n
datacamp
\\n
\\n
DataCamp
\\n
Use an underscore to separate words within a name\\n
is_valid\\ndef check_validity():\\n    return(True)
\\n
\\n
isValid\\ndef checkValidity():\\n    return(True)
\\n
Variable names are nouns, function names are verbs\\n
my_val = 5\\ncheck_validity(my_val)
\\n
\\n
check_validity = 5\\nmy_val(check_validity)
\\n
\\n
Comments
\\n
Use a single hashtag\\n
# This is a good comment
\\n
\\n
## Avoid multiple hashtags
\\n
Use one space after hashtag\\n
# This is a good comment
\\n
\\n
#Avoid this
\\n
Start every sentence with a capital letter\\n
# This is nice
\\n
\\n
# this is nasty
\\n
Single sentences don't require a period. Multiple sentences do.\\n
# Sentence\\ndo_something()\\n\\n# Sentence1. Sentence2.\\ndo_stuff()
\\n
\\n
# Sentence.\\ndo_something()\\n\\n# Sentence1. Sentence2\\ndo_stuff()
\\n
\\n\\n\\n
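To see several of the Python guidelines above working together, here is a short sketch. The function `compute_area` and its values are invented for illustration; the snippet only demonstrates the layout rules (imports on top, 4 spaces of indentation, no spaces around the default parameter's equal sign, lowercase names with underscores, and single-hashtag comments).

```python
import math


def compute_area(radius=1):
    # Return the area of a circle with the given radius
    return math.pi * radius ** 2


# Variable names are nouns, function names are verbs
circle_area = compute_area(2)
print(circle_area)
```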

\\n
\\n

Frequently Asked Questions

\\n\\n

This FAQ section will grow over time, as questions come in.

\\n\\n\\n
DataCamp commits to my repository on my behalf, why?
\\n\\n

Before you can create a course, you have to log in to DataCamp, and link your account to GitHub. From that moment on, DataCamp is able to make changes to the repositories you\\'ve linked on your behalf. For example, when you create a new course with the \"Create a DataCamp Course\" dialog, we will create a repository for you, and pre-fill it with course and chapter files with several commits. That way, the build process for your course is automatically triggered and a DataCamp course results.

\\n\\n

DataCamp also makes commits on your behalf to set 'exercise keys'. These keys link the exercises in your GitHub repository to the exercises in DataCamp's databases.

\\n\\n
Is GitHub necessary to teach on DataCamp? Can we use our own tools?
\\n\\n

We currently depend heavily on GitHub, but we're exploring other possibilities for private and proprietary content. More specifically, we're looking at GitLab to host our own premium courses. During development, we may also explore ways to allow other GitLab instances to connect to DataCamp.

\\n\\n
How are courses hosted? Is it possible to have courses only available to a select group of people?
\\n\\n

Courses are always hosted on datacamp.com. There are different states for a course: development and published. Development courses are only available to those who have the link to the course. Published courses are available through a human-readable slug (www.datacamp.com/courses/my-awesome-course), but not findable by Google. Currently, we do not have a system in place to grant access to a particular number of people; it\\'s just the people with the link that have access.

\\n\\n
I added a dataset to my datasets folder on git, why isn\\'t it uploading?
\\n\\n

Datasets or slides have a maximum size of 10 Megabytes. Bigger datasets won\\'t be uploaded and thus will not generate a link. The build logs will tell you your dataset is too big. We need this security built in because we don\\'t want anyone to upload massive datasets onto our servers.

\\n\\n
I need a package but it\\'s not installed on your servers, what can I do?
\\n\\n

At the moment, a lot of packages that can be used for Data Science are installed on our servers. There's no easy way to install packages yourself. However, if you come across a package you think should really be installed, you can mail content-engineering@datacamp.com. Please note that there is no guarantee that we will install the package.

\\n\\n\\n
\\n
\\n

Related Projects

\\n\\nBesides making teaching on DataCamp as easy as possible, there are many other projects that we are working on. Some of these are vital to making DataCamp courses work properly; others are side projects to support the Data Science community.\\n\\n
    \\n\\t
  • RDocumentation: Website that groups documentation for all R packages on CRAN, GitHub and Bioconductor. The interface makes it easy to browse packages, see package dependencies and see the popularity of packages over time.
  • \\n\\t
  • testwhat: R package that contains a wide range of functions to write Submission Correctness Tests (SCTs) for interactive R exercises on DataCamp. Make sure to check out the wiki pages for more details about when, how and why to use the different functions. More details about the role of SCTs in a DataCamp exercise can be found at the Code Exercises article.
  • \\n\\t
  • pythonwhat: Python package that contains a wide range of functions to write Submission Correctness Tests (SCTs) for interactive Python exercises on DataCamp. Make sure to check out the wiki pages for more details about when, how and why to use the different functions. More details about the role of SCTs in a DataCamp exercise can be found at the Code Exercises article.
  • \\n\\t
  • DataCamp Light: JavaScript library to embed interactive R and Python playgrounds in your own HTML pages. Similar to how JSFiddle works, but for R and Python.
  • \\n\\t
  • Tutorial: R package that acts as a wrapper around the famous knitr R package that enables you to easily add DataCamp Light powered code chunks in your reports, blog posts and vignettes.
  • \\n
\\n\\n

These are all open-source projects; contact us if you want to contribute!

\\n\\n
\\n
\\n
\\n\\n
\\n
\\n\\n\\t\\t\\t
\\n
\\n \\n \\n
\\n \\n \\n \\n \\n
\\n
\\n
\\n\\n\\t\\t
\\n\\t\\t\\n\\t\\n\\n'\n" ] } ], "source": [ "# printing the response\n", "print(html)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Be polite and close the response!\n", "response.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using Requests pkg to extract the data" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "\n", "Guido's Personal Home Page\n", "\n", "\n", "\n", "\n", "

\n", "\n", "Guido van Rossum - Personal Home Page

\n", "\n", "

\"Gawky and proud of it.\"\n", "\n", "\n", "

Who\n", "I Am

\n", "\n", "

Read\n", "my \"King's\n", "Day Speech\" for some inspiration.\n", "\n", "

I am the author of the Python\n", "programming language. See also my resume\n", "and my publications list, a brief bio, assorted writings, presentations and interviews (all about Python), some\n", "pictures of me,\n", "my new blog, and\n", "my old\n", "blog on Artima.com. I am\n", "@gvanrossum on Twitter. I\n", "also have\n", "a G+\n", "profile.\n", "\n", "

In January 2013 I joined\n", "Dropbox. I work on various Dropbox\n", "products and have 50% for my Python work, no strings attached.\n", "Previously, I have worked for Google, Elemental Security, Zope\n", "Corporation, BeOpen.com, CNRI, CWI, and SARA. (See\n", "my resume.) I created Python while at CWI.\n", "\n", "

How to Reach Me

\n", "\n", "

You can send email for me to guido (at) python.org.\n", "I read everything sent there, but if you ask\n", "me a question about using Python, it's likely that I won't have time\n", "to answer it, and will instead refer you to\n", "help (at) python.org,\n", "comp.lang.python or\n", "StackOverflow. If you need to\n", "talk to me on the phone or send me something by snail mail, send me an\n", "email and I'll gladly email you instructions on how to reach me.\n", "\n", "

My Name

\n", "\n", "

My name often poses difficulties for Americans.\n", "\n", "

Pronunciation: in Dutch, the \"G\" in Guido is a hard G,\n", "pronounced roughly like the \"ch\" in Scottish \"loch\". (Listen to the\n", "sound clip.) However, if you're\n", "American, you may also pronounce it as the Italian \"Guido\". I'm not\n", "too worried about the associations with mob assassins that some people\n", "have. :-)\n", "\n", "

Spelling: my last name is two words, and I'd like keep it\n", "that way, the spelling on some of my credit cards notwithstanding.\n", "Dutch spelling rules dictate that when used in combination with my\n", "first name, \"van\" is not capitalized: \"Guido van Rossum\". But when my\n", "last name is used alone to refer to me, it is capitalized, for\n", "example: \"As usual, Van Rossum was right.\"\n", "\n", "

Alphabetization: in America, I show up in the alphabet under\n", "\"V\". But in Europe, I show up under \"R\". And some of my friends put\n", "me under \"G\" in their address book...\n", "\n", "\n", "

More Hyperlinks

\n", "\n", "\n", "\n", "

The Audio File Formats FAQ

\n", "\n", "

I was the original creator and maintainer of the Audio File Formats\n", "FAQ. It is now maintained by Chris Bagwell\n", "at http://www.cnpbagwell.com/audio-faq. And here is a link to\n", "SOX, to which I contributed\n", "some early code.\n", "\n", "\n", "\n", "


\n", "\n", "\"On the Internet, nobody knows you're\n", "a dog.\"\n", "\n", "
\n", "\n", "\n", "\n" ] } ], "source": [ "import requests\n", "\n", "# Specify the url: url\n", "url = \"https://www.python.org/~guido/\"\n", "\n", "# Package the request, send the request and catch the response: r\n", "r = requests.get(url)\n", "\n", "# Extract the response: text\n", "text = r.text\n", "\n", "# Print the html\n", "print(text)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Web Scraping with the Python Package BeautifulSoup" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", " \n", " \n", " Guido's Personal Home Page\n", " \n", " \n", " \n", "

\n", " \n", " \n", " \n", " Guido van Rossum - Personal Home Page\n", "

\n", "

\n", " \n", " \n", " \"Gawky and proud of it.\"\n", " \n", " \n", "

\n", "

\n", " \n", " Who\n", "I Am\n", " \n", "

\n", "

\n", " Read\n", "my\n", " \n", " \"King's\n", "Day Speech\"\n", " \n", " for some inspiration.\n", "

\n", "

\n", " I am the author of the\n", " \n", " Python\n", " \n", " programming language. See also my\n", " \n", " resume\n", " \n", " and my\n", " \n", " publications list\n", " \n", " , a\n", " \n", " brief bio\n", " \n", " , assorted\n", " \n", " writings\n", " \n", " ,\n", " \n", " presentations\n", " \n", " and\n", " \n", " interviews\n", " \n", " (all about Python), some\n", " \n", " pictures of me\n", " \n", " ,\n", " \n", " my new blog\n", " \n", " , and\n", "my\n", " \n", " old\n", "blog\n", " \n", " on Artima.com. I am\n", " \n", " @gvanrossum\n", " \n", " on Twitter. I\n", "also have\n", "a\n", " \n", " G+\n", "profile\n", " \n", " .\n", "

\n", "

\n", " In January 2013 I joined\n", " \n", " Dropbox\n", " \n", " . I work on various Dropbox\n", "products and have 50% for my Python work, no strings attached.\n", "Previously, I have worked for Google, Elemental Security, Zope\n", "Corporation, BeOpen.com, CNRI, CWI, and SARA. (See\n", "my\n", " \n", " resume\n", " \n", " .) I created Python while at CWI.\n", "

\n", "

\n", " How to Reach Me\n", "

\n", "

\n", " You can send email for me to guido (at) python.org.\n", "I read everything sent there, but if you ask\n", "me a question about using Python, it's likely that I won't have time\n", "to answer it, and will instead refer you to\n", "help (at) python.org,\n", " \n", " comp.lang.python\n", " \n", " or\n", " \n", " StackOverflow\n", " \n", " . If you need to\n", "talk to me on the phone or send me something by snail mail, send me an\n", "email and I'll gladly email you instructions on how to reach me.\n", "

\n", "

\n", " My Name\n", "

\n", "

\n", " My name often poses difficulties for Americans.\n", "

\n", "

\n", " \n", " Pronunciation:\n", " \n", " in Dutch, the \"G\" in Guido is a hard G,\n", "pronounced roughly like the \"ch\" in Scottish \"loch\". (Listen to the\n", " \n", " sound clip\n", " \n", " .) However, if you're\n", "American, you may also pronounce it as the Italian \"Guido\". I'm not\n", "too worried about the associations with mob assassins that some people\n", "have. :-)\n", "

\n", "

\n", " \n", " Spelling:\n", " \n", " my last name is two words, and I'd like keep it\n", "that way, the spelling on some of my credit cards notwithstanding.\n", "Dutch spelling rules dictate that when used in combination with my\n", "first name, \"van\" is not capitalized: \"Guido van Rossum\". But when my\n", "last name is used alone to refer to me, it is capitalized, for\n", "example: \"As usual, Van Rossum was right.\"\n", "

\n", "

\n", " \n", " Alphabetization:\n", " \n", " in America, I show up in the alphabet under\n", "\"V\". But in Europe, I show up under \"R\". And some of my friends put\n", "me under \"G\" in their address book...\n", "

\n", "

\n", " More Hyperlinks\n", "

\n", " \n", "

\n", " The Audio File Formats FAQ\n", "

\n", "

\n", " I was the original creator and maintainer of the Audio File Formats\n", "FAQ. It is now maintained by Chris Bagwell\n", "at\n", " \n", " http://www.cnpbagwell.com/audio-faq\n", " \n", " . And here is a link to\n", " \n", " SOX\n", " \n", " , to which I contributed\n", "some early code.\n", "

\n", "
\n", " \n", " \"On the Internet, nobody knows you're\n", "a dog.\"\n", " \n", "
\n", " \n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "C:\\tools\\Anaconda3\\lib\\site-packages\\bs4\\__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system (\"lxml\"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.\n", "\n", "The code that caused this warning is on line 184 of the file C:\\tools\\Anaconda3\\lib\\runpy.py. To get rid of this warning, change code that looks like this:\n", "\n", " BeautifulSoup([your markup])\n", "\n", "to this:\n", "\n", " BeautifulSoup([your markup], \"lxml\")\n", "\n", " markup_type=markup_type))\n" ] } ], "source": [ "# Import packages\n", "import requests\n", "from bs4 import BeautifulSoup\n", "\n", "# Specify url: url\n", "url = 'https://www.python.org/~guido/'\n", "\n", "# Package the request, send the request and catch the response: r\n", "r = requests.get(url)\n", "\n", "# Extracts the response as html: html_doc\n", "html_doc = r.text\n", "\n", "# Create a BeautifulSoup object from the HTML: soup\n", "soup = BeautifulSoup(html_doc)\n", "\n", "# Prettify the BeautifulSoup object: pretty_soup\n", "pretty_soup = soup.prettify()\n", "\n", "# Print the response\n", "print(pretty_soup)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Guido's Personal Home Page\n" ] } ], "source": [ "# Get the title of Guido's webpage: guido_title\n", "guido_title = soup.title\n", "\n", "# Print the title of Guido's webpage to the shell\n", "print(guido_title)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "Guido's Personal Home Page\n", "\n", "\n", "\n", "\n", "Guido van Rossum - Personal Home Page\n", "\"Gawky 
and proud of it.\"\n", "Who\n", "I Am\n", "Read\n", "my \"King's\n", "Day Speech\" for some inspiration.\n", "\n", "I am the author of the Python\n", "programming language. See also my resume\n", "and my publications list, a brief bio, assorted writings, presentations and interviews (all about Python), some\n", "pictures of me,\n", "my new blog, and\n", "my old\n", "blog on Artima.com. I am\n", "@gvanrossum on Twitter. I\n", "also have\n", "a G+\n", "profile.\n", "\n", "In January 2013 I joined\n", "Dropbox. I work on various Dropbox\n", "products and have 50% for my Python work, no strings attached.\n", "Previously, I have worked for Google, Elemental Security, Zope\n", "Corporation, BeOpen.com, CNRI, CWI, and SARA. (See\n", "my resume.) I created Python while at CWI.\n", "\n", "How to Reach Me\n", "You can send email for me to guido (at) python.org.\n", "I read everything sent there, but if you ask\n", "me a question about using Python, it's likely that I won't have time\n", "to answer it, and will instead refer you to\n", "help (at) python.org,\n", "comp.lang.python or\n", "StackOverflow. If you need to\n", "talk to me on the phone or send me something by snail mail, send me an\n", "email and I'll gladly email you instructions on how to reach me.\n", "\n", "My Name\n", "My name often poses difficulties for Americans.\n", "\n", "Pronunciation: in Dutch, the \"G\" in Guido is a hard G,\n", "pronounced roughly like the \"ch\" in Scottish \"loch\". (Listen to the\n", "sound clip.) However, if you're\n", "American, you may also pronounce it as the Italian \"Guido\". I'm not\n", "too worried about the associations with mob assassins that some people\n", "have. :-)\n", "\n", "Spelling: my last name is two words, and I'd like keep it\n", "that way, the spelling on some of my credit cards notwithstanding.\n", "Dutch spelling rules dictate that when used in combination with my\n", "first name, \"van\" is not capitalized: \"Guido van Rossum\". 
But when my\n", "last name is used alone to refer to me, it is capitalized, for\n", "example: \"As usual, Van Rossum was right.\"\n", "\n", "Alphabetization: in America, I show up in the alphabet under\n", "\"V\". But in Europe, I show up under \"R\". And some of my friends put\n", "me under \"G\" in their address book...\n", "\n", "\n", "More Hyperlinks\n", "\n", "Here's a collection of essays relating to Python\n", "that I've written, including the foreword I wrote for Mark Lutz' book\n", "\"Programming Python\".\n", "I own the official \n", "Python license.\n", "\n", "The Audio File Formats FAQ\n", "I was the original creator and maintainer of the Audio File Formats\n", "FAQ. It is now maintained by Chris Bagwell\n", "at http://www.cnpbagwell.com/audio-faq. And here is a link to\n", "SOX, to which I contributed\n", "some early code.\n", "\n", "\n", "\n", "\n", "\"On the Internet, nobody knows you're\n", "a dog.\"\n", "\n", "\n", "\n" ] } ], "source": [ "# Get Guido's text: guido_text\n", "guido_text = soup.get_text()\n", "\n", "# Print Guido's text to the shell\n", "print(guido_text)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "pics.html\n", "http://www.washingtonpost.com/wp-srv/business/longterm/microsoft/stories/1998/raymond120398.htm\n", "http://metalab.unc.edu/Dave/Dr-Fun/df200004/df20000406.jpg\n", "http://neopythonic.blogspot.com/2016/04/kings-day-speech.html\n", "http://www.python.org\n", "Resume.html\n", "Publications.html\n", "bio.html\n", "http://legacy.python.org/doc/essays/\n", "http://legacy.python.org/doc/essays/ppt/\n", "interviews.html\n", "pics.html\n", "http://neopythonic.blogspot.com\n", "http://www.artima.com/weblogs/index.jsp?blogger=12088\n", "https://twitter.com/gvanrossum\n", "https://plus.google.com/u/0/115212051037621986145/posts\n", "http://www.dropbox.com\n", "Resume.html\n", 
"http://groups.google.com/groups?q=comp.lang.python\n", "http://stackoverflow.com\n", "guido.au\n", "http://legacy.python.org/doc/essays/\n", "images/license.jpg\n", "http://www.cnpbagwell.com/audio-faq\n", "http://sox.sourceforge.net/\n", "images/internetdog.gif\n" ] } ], "source": [ "# Find all 'a' tags (which define hyperlinks): a_tags\n", "a_tags = soup.find_all('a')\n", "\n", "# Print the URLs to the shell\n", "for link in a_tags:\n", " print(link.get('href'))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "############################################################\n", "## Atul Singh | www.datagenx.net | lnked.in/atulsingh\n", "############################################################" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }