{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Code Sample for YouTube-Data-API Software Package\n", "## Megan Brown\n", "[Slides](http://bit.ly/yt-slides) | [YouTube-Data-API Package](http://bit.ly/youtube-data-api) | [NB-Viewer](http://bit.ly/demo-notebook)\n", "\n", "To install the packages used in this demo:\n", "`pip install -r requirements.txt`\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# These are the packages we will use in the demonstration\n", "import os\n", "import json\n", "import pandas as pd\n", "import datetime\n", "from youtube_api import YoutubeDataApi\n", "\n", "# Read the API key from the YT_KEY environment variable\n", "key = os.environ.get('YT_KEY')\n", "yt = YoutubeDataApi(key)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Format JSON items and print them for readability\n", "def dump(doc):\n", "    # json cannot serialize datetime objects, so convert them to ISO 8601 strings\n", "    def default_handler(o):\n", "        if isinstance(o, datetime.datetime):\n", "            return o.isoformat()\n", "\n", "    print(json.dumps(doc, sort_keys=True, indent=4, default=default_handler))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A workflow sample\n", "\n", "You may start with a channel name like 'LastWeekTonight' or 'TheNewYorkTimes'. However, data about a channel must be requested using the channel ID, not the channel name."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Channel IDs\n", "The channel ID can be looked up from a username by running `yt.get_channel_id_from_user(USERNAME)`." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'UC3XTzVzaHQEd30rQbuvCtTQ'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# get the channel ID for the TV show Last Week Tonight\n", "channel_id = yt.get_channel_id_from_user('LastWeekTonight')\n", "channel_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From this channel ID, we can get a wide variety of data, including (but not limited to):\n", "* creation date\n", "* country\n", "* description\n", "* keywords\n", "* playlist ID (for uploads)\n", "* playlist ID (for likes)\n", "* number of subscribers (`subscription_count`)\n", "* channel name/title\n", "* channel topic IDs\n", "* number of videos\n", "* total number of views" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " \"account_creation_date\": \"2014-03-18T17:41:39\",\n", " \"channel_id\": \"UC3XTzVzaHQEd30rQbuvCtTQ\",\n", " \"collection_date\": \"2019-04-12T07:12:21.012418\",\n", " \"country\": null,\n", " \"description\": \"Breaking news on a weekly basis. 
Sundays at 11PM - only on HBO.\\nSubscribe to the Last Week Tonight channel for the latest videos from John Oliver and the LWT team.\",\n", " \"keywords\": null,\n", " \"playlist_id_likes\": \"LL3XTzVzaHQEd30rQbuvCtTQ\",\n", " \"playlist_id_uploads\": \"UU3XTzVzaHQEd30rQbuvCtTQ\",\n", " \"subscription_count\": \"6926188\",\n", " \"title\": \"LastWeekTonight\",\n", " \"topic_ids\": \"https://en.wikipedia.org/wiki/Entertainment|https://en.wikipedia.org/wiki/Humour|https://en.wikipedia.org/wiki/Television_program\",\n", " \"video_count\": \"267\",\n", " \"view_count\": \"1936918852\"\n", "}\n" ] } ], "source": [ "# get channel metadata for the channel ID we pulled for Last Week Tonight\n", "channel_meta = yt.get_channel_metadata(channel_id)\n", "dump(channel_meta)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In addition to channel metadata, we can pull relational data for a channel, such as the channels it subscribes to or features." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[\n", " {\n", " \"collection_date\": \"2019-04-12T07:12:22.720954\",\n", " \"subscription_channel_id\": \"UCWPQB43yGKEum3eW0P9N_nQ\",\n", " \"subscription_kind\": \"youtube#channel\",\n", " \"subscription_publish_date\": \"2014-03-20T19:05:54\",\n", " \"subscription_title\": \"HBOBoxing\"\n", " },\n", " {\n", " \"collection_date\": \"2019-04-12T07:12:22.721991\",\n", " \"subscription_channel_id\": \"UCy6kyFxaMqGtpE3pQTflK8A\",\n", " \"subscription_kind\": \"youtube#channel\",\n", " \"subscription_publish_date\": \"2014-12-11T18:55:41\",\n", " \"subscription_title\": \"Real Time with Bill Maher\"\n", " }\n", "]\n" ] } ], "source": [ "# Get the channels that Last Week Tonight subscribes to\n", "subscriptions = yt.get_subscriptions(channel_id)\n", "dump(subscriptions[:2])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Getting uploads by a user\n", "YouTube is constructed such that 
the video uploads by a user are stored in a playlist whose ID is derived from the user's channel ID (here, the `UC` prefix becomes `UU`). We can use this detail to generate the uploads playlist ID for a given user and collect all the videos they have posted." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'UU3XTzVzaHQEd30rQbuvCtTQ'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Import some utility functions from the youtube_api package\n", "from youtube_api import youtube_api_utils as utils\n", "\n", "# get the playlist ID for Last Week Tonight's uploads\n", "playlist_id = utils.get_upload_playlist_id(channel_id)\n", "playlist_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using this, we can now pull the uploads for this `playlist_id`\n", "The function `yt.get_videos_from_playlist_id(PLAYLIST_ID)` returns the videos in a playlist, in this case the uploads. Each entry includes the video ID, the channel ID, the publish date, and the collection date." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n", "<thead>\n",
"<tr><th></th><th>publish_date</th><th>video_id</th><th>channel_id</th><th>collection_date</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>2019-04-08 06:30:00</td><td>jCC8fPQOaxU</td><td>UC3XTzVzaHQEd30rQbuvCtTQ</td><td>2019-04-12 07:12:28.690187</td></tr>\n",
"<tr><th>1</th><td>2019-04-01 06:30:01</td><td>m8UQ4O7UiDs</td><td>UC3XTzVzaHQEd30rQbuvCtTQ</td><td>2019-04-12 07:12:28.691222</td></tr>\n",
"<tr><th>2</th><td>2019-03-18 06:30:01</td><td>Yq7Eh6JTKIg</td><td>UC3XTzVzaHQEd30rQbuvCtTQ</td><td>2019-04-12 07:12:28.691222</td></tr>\n",
"<tr><th>3</th><td>2019-03-11 06:30:00</td><td>FO0iG_P0P6M</td><td>UC3XTzVzaHQEd30rQbuvCtTQ</td><td>2019-04-12 07:12:28.691222</td></tr>\n",
"<tr><th>4</th><td>2019-03-04 07:30:01</td><td>_h1ooyyFkF0</td><td>UC3XTzVzaHQEd30rQbuvCtTQ</td><td>2019-04-12 07:12:28.691222</td></tr>\n",
"</tbody>\n", "</table>
" ], "text/plain": [ " publish_date video_id channel_id \\\n", "0 2019-04-08 06:30:00 jCC8fPQOaxU UC3XTzVzaHQEd30rQbuvCtTQ \n", "1 2019-04-01 06:30:01 m8UQ4O7UiDs UC3XTzVzaHQEd30rQbuvCtTQ \n", "2 2019-03-18 06:30:01 Yq7Eh6JTKIg UC3XTzVzaHQEd30rQbuvCtTQ \n", "3 2019-03-11 06:30:00 FO0iG_P0P6M UC3XTzVzaHQEd30rQbuvCtTQ \n", "4 2019-03-04 07:30:01 _h1ooyyFkF0 UC3XTzVzaHQEd30rQbuvCtTQ \n", "\n", " collection_date \n", "0 2019-04-12 07:12:28.690187 \n", "1 2019-04-12 07:12:28.691222 \n", "2 2019-04-12 07:12:28.691222 \n", "3 2019-04-12 07:12:28.691222 \n", "4 2019-04-12 07:12:28.691222 " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# get the videos posted by Last Week Tonight and display the 5 most recent\n", "videos = yt.get_videos_from_playlist_id(playlist_id)\n", "df = pd.DataFrame(videos[:5])\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Getting video metadata\n", "Using the video IDs from the uploads playlist, we can pull more detailed information about each video with the function `yt.get_video_metadata(VIDEO_ID)`. 
This function can handle a single video ID or a list of video IDs.\n", "\n", "For the video IDs passed, the package gets:\n", "* video title\n", "* video thumbnail\n", "* channel ID\n", "* channel title\n", "* video category\n", "* number of comments\n", "* video description\n", "* number of likes, dislikes, and views\n", "* publish date\n", "* video tags" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[\n", " {\n", " \"channel_id\": \"UC3XTzVzaHQEd30rQbuvCtTQ\",\n", " \"channel_title\": \"LastWeekTonight\",\n", " \"collection_date\": \"2019-04-12T07:12:36.841405\",\n", " \"video_category\": \"24\",\n", " \"video_comment_count\": \"12526\",\n", " \"video_description\": \"Mobile homes may seem like an affordable housing option, but large investment companies are making them less and less so.\\n\\nConnect with Last Week Tonight online... \\n\\nSubscribe to the Last Week Tonight YouTube channel for more almost news as it almost happens: www.youtube.com/lastweektonight \\n\\nFind Last Week Tonight on Facebook like your mom would: www.facebook.com/lastweektonight \\n\\nFollow us on Twitter for news about jokes and jokes about news: www.twitter.com/lastweektonight \\n\\nVisit our official site for all that other stuff at once: www.hbo.com/lastweektonight\",\n", " \"video_dislike_count\": \"3104\",\n", " \"video_id\": \"jCC8fPQOaxU\",\n", " \"video_like_count\": \"105534\",\n", " \"video_publish_date\": \"2019-04-08T06:30:00\",\n", " \"video_tags\": \"\",\n", " \"video_thumbnail\": \"https://i.ytimg.com/vi/jCC8fPQOaxU/hqdefault.jpg\",\n", " \"video_title\": \"Mobile Homes: Last Week Tonight with John Oliver (HBO)\",\n", " \"video_view_count\": \"4927325\"\n", " }\n", "]\n" ] } ], "source": [ "# Get the video metadata for Last Week Tonight's videos\n", "video_meta = yt.get_video_metadata(df.video_id.tolist())\n", "dump(video_meta[:1])" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "### We can also pull search results!\n", "This is the functional equivalent of going onto YouTube and typing a phrase into the search bar and seeing the results" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[\n", " {\n", " \"channel_id\": \"UC3XTzVzaHQEd30rQbuvCtTQ\",\n", " \"channel_title\": \"LastWeekTonight\",\n", " \"collection_date\": \"2019-04-12T07:12:39.248427\",\n", " \"video_category\": null,\n", " \"video_description\": \"Mobile homes may seem like an affordable housing option, but large investment companies are making them less and less so. Connect with Last Week Tonight ...\",\n", " \"video_id\": \"jCC8fPQOaxU\",\n", " \"video_publish_date\": \"2019-04-08T06:30:00\",\n", " \"video_thumbnail\": \"https://i.ytimg.com/vi/jCC8fPQOaxU/hqdefault.jpg\",\n", " \"video_title\": \"Mobile Homes: Last Week Tonight with John Oliver (HBO)\"\n", " },\n", " {\n", " \"channel_id\": \"UC3XTzVzaHQEd30rQbuvCtTQ\",\n", " \"channel_title\": \"LastWeekTonight\",\n", " \"collection_date\": \"2019-04-12T07:12:39.249427\",\n", " \"video_category\": null,\n", " \"video_description\": \"John Oliver discusses how the WWE takes care of its wrestlers \\u2014 and how it doesn't. Connect with Last Week Tonight online... Subscribe to the Last Week ...\",\n", " \"video_id\": \"m8UQ4O7UiDs\",\n", " \"video_publish_date\": \"2019-04-01T06:30:01\",\n", " \"video_thumbnail\": \"https://i.ytimg.com/vi/m8UQ4O7UiDs/hqdefault.jpg\",\n", " \"video_title\": \"WWE: Last Week Tonight with John Oliver (HBO)\"\n", " }\n", "]\n" ] } ], "source": [ "# pull search results for 'John Oliver'\n", "searches = yt.search('john oliver', max_results=5)\n", "dump(searches[:2])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Lastly, we can pull recommended videos!\n", "These are the videos listed to the side or below YouTube videos while they play. 
They are YouTube's best guess as to what you may like based on the video you are currently watching. You can get the recommended videos for a video by running `yt.get_recommended_videos(VIDEO_ID)`." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[\n", " {\n", " \"channel_id\": \"UC3XTzVzaHQEd30rQbuvCtTQ\",\n", " \"channel_title\": \"LastWeekTonight\",\n", " \"collection_date\": \"2019-04-12T07:12:41.364671\",\n", " \"video_category\": null,\n", " \"video_description\": \"Mobile homes may seem like an affordable housing option, but large investment companies are making them less and less so.\\n\\nConnect with Last Week Tonight online... \\n\\nSubscribe to the Last Week Tonight YouTube channel for more almost news as it almost happens: www.youtube.com/lastweektonight \\n\\nFind Last Week Tonight on Facebook like your mom would: www.facebook.com/lastweektonight \\n\\nFollow us on Twitter for news about jokes and jokes about news: www.twitter.com/lastweektonight \\n\\nVisit our official site for all that other stuff at once: www.hbo.com/lastweektonight\",\n", " \"video_id\": \"jCC8fPQOaxU\",\n", " \"video_publish_date\": \"2019-04-08T00:45:08\",\n", " \"video_thumbnail\": \"https://i.ytimg.com/vi/jCC8fPQOaxU/hqdefault.jpg\",\n", " \"video_title\": \"Mobile Homes: Last Week Tonight with John Oliver (HBO)\"\n", " },\n", " {\n", " \"channel_id\": \"UC3XTzVzaHQEd30rQbuvCtTQ\",\n", " \"channel_title\": \"LastWeekTonight\",\n", " \"collection_date\": \"2019-04-12T07:12:41.364671\",\n", " \"video_category\": null,\n", " \"video_description\": \"Ivanka Trump and Jared Kushner hold an incredible amount of political power. 
That's troubling considering their incredibly small amount of political experience.\\n\\nConnect with Last Week Tonight online...\\nSubscribe to the Last Week Tonight YouTube channel for more almost news as it almost happens: www.youtube.com/user/LastWeekTonight\\n\\nFind Last Week Tonight on Facebook like your mom would:\\nhttp://Facebook.com/LastWeekTonight\\n\\nFollow us on Twitter for news about jokes and jokes about news:\\nhttp://Twitter.com/LastWeekTonight\\n\\nVisit our official site for all that other stuff at once:\\nhttp://www.hbo.com/lastweektonight\",\n", " \"video_id\": \"wD8AwgO0AQI\",\n", " \"video_publish_date\": \"2017-04-24T02:20:55\",\n", " \"video_thumbnail\": \"https://i.ytimg.com/vi/wD8AwgO0AQI/hqdefault.jpg\",\n", " \"video_title\": \"Ivanka & Jared: Last Week Tonight with John Oliver (HBO)\"\n", " }\n", "]\n" ] } ], "source": [ "# get recommended videos for Last Week Tonight's video on the WWE\n", "recommendations = yt.get_recommended_videos('m8UQ4O7UiDs')\n", "dump(recommendations[:2])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" } }, "nbformat": 4, "nbformat_minor": 2 }