{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "\n", "***\n", "***\n", "# Installing and Using GraphLab\n", "***\n", "***\n", "\n", "Wang Chengjun (王成军)\n", "\n", "wangchengjun@nju.edu.cn\n", "\n", "Computational Communication website: http://computational-communication.com" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Problem\n", "\n", "GraphLab Create can only be installed on older versions of Anaconda. Forcing the installation also breaks the kernels of Anaconda's Jupyter Notebook. This rules out running GraphLab in Anaconda.\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Register for Academic Use of GraphLab Create\n", "\n", "https://turi.com/download/academic.html" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Check your email\n", "\n", "https://turi.com/download/install-graphlab-create.html?email=wangchengjun%40nju.edu.cn&key=4972-65DF-8E02-816C-AB15-021C-EC1B-0367&utm_medium=email&utm_source=transactional&utm_campaign=beta_registration_confirmation" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Renew Academic License for GraphLab Create\n", "\n", "https://turi.com/download/renew.html" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### License Renewal Confirmation\n", "Your academic license for GraphLab Create has been renewed. Please restart GraphLab Create while connected to the internet.\n", "\n", "\n", "Email: wangchengjun@nju.edu.cn\n", "\n", "Expiration Date: 03-13-2019" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# Python 2.7.x\n", "\n", "Installing GraphLab Create requires a Python 2.7.x environment, pip version >= 7, and Anaconda2 v4.0.0 (64-bit). IPython Notebook is recommended." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "# To install a different version of Python without overwriting the current version\n", "\n", "https://conda.io/docs/user-guide/tasks/manage-python.html\n", "\n", "Create a new environment and install the second Python version into it. To create the new environment for Python 2.7, in your Terminal window or an Anaconda Prompt, run:\n", "> conda create -n py27 python=2.7 anaconda\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "## Activate the new environment\n", "\n", "- On Linux/macOS, run: `source activate py27`\n", "- On Windows, run: `activate py27`\n", "\n", "To leave the environment, run `source deactivate py27`.\n", "You can also run `activate root` to switch back to the root environment.\n", "\n", "3. [Verify that the new environment is your current environment.](https://conda.io/docs/user-guide/tasks/manage-environments.html#determine-current-env)\n", "4. To verify that the current environment uses the new Python version, in your Terminal window or an Anaconda Prompt, run: `python --version`" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2019-03-04T03:47:23.334875Z", "start_time": "2019-03-04T03:47:23.211768Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Python 2.7.14 :: Anaconda, Inc.\r\n" ] } ], "source": [ "! 
python --version" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Install your licensed copy of GraphLab Create\n", "pip install --upgrade --no-cache-dir https://get.graphlab.com/GraphLab-Create/2.1/your registered email address here/your product key here/GraphLab-Create-License.tar.gz" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "# Open a terminal and run:\n", "\n", "pip install --upgrade --no-cache-dir https://get.graphlab.com/GraphLab-Create/2.1/wangchengjun@nju.edu.cn/4972-65DF-8E02-816C-AB15-021C-EC1B-0367/GraphLab-Create-License.tar.gz\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# Error\n", "Could not find a version that satisfies the requirement graphlab-create>=2.1 (from GraphLab-Create-License==2.1) (from versions: )\n", "No matching distribution found for graphlab-create>=2.1 (from GraphLab-Create-License==2.1)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "\n", "# Usage\n", "\n", "https://turi.com/learn/userguide/\n", "\n", "GraphLab Create is a Python package that allows programmers to perform end-to-end large-scale data analysis and data product development.\n", "\n", "- Data ingestion and cleaning with SFrames. SFrame is an efficient disk-based tabular data structure that is not limited by RAM. This lets you scale your analysis and data processing to handle terabytes of data, even on your laptop.\n", "\n", "- Data exploration and visualization with GraphLab Canvas. GraphLab Canvas is a browser-based interactive GUI that allows you to explore tabular data, summary plots and statistics.\n", "\n", "- Network analysis with SGraph. SGraph is a disk-based graph data structure that stores vertices and edges in SFrames.\n", "\n", "- Predictive model development with machine learning toolkits. 
GraphLab Create includes several toolkits for quick prototyping with fast, scalable algorithms.\n", "\n", "- Production automation with data pipelines. Data pipelines allow you to assemble reusable code tasks into jobs and automatically run them on common execution environments (e.g. Amazon Web Services, Hadoop)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2019-03-04T03:50:55.741270Z", "start_time": "2019-03-04T03:50:47.024232Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "This non-commercial license of GraphLab Create for academic use is assigned to wangchengjun@nju.edu.cn and will expire on March 14, 2019.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[INFO] graphlab.cython.cy_server: GraphLab Create v2.1 started. Logging: /tmp/graphlab_server_1551671450.log\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "SGraph({'num_edges': 1, 'num_vertices': 3})\n" ] } ], "source": [ "from graphlab import SGraph, Vertex, Edge\n", "g = SGraph()\n", "verts = [Vertex(0, attr={'breed': 'labrador'}),\n", " Vertex(1, attr={'breed': 'labrador'}),\n", " Vertex(2, attr={'breed': 'vizsla'})]\n", "g = g.add_vertices(verts)\n", "g = g.add_edges(Edge(1, 2))\n", "print(g)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2019-03-04T03:51:30.249064Z", "start_time": "2019-03-04T03:51:30.184980Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "SGraph({'num_edges': 1, 'num_vertices': 3})\n" ] } ], "source": [ "from graphlab import SGraph, Vertex, Edge\n", "g = SGraph()\n", "verts = [Vertex(0, attr={'breed': 'labrador'}),\n", " Vertex(1, attr={'breed': 'labrador'}),\n", " Vertex(2, attr={'breed': 'vizsla'})]\n", "g = g.add_vertices(verts)\n", "g = g.add_edges(Edge(1, 2))\n", "print g" ] }, { "cell_type": "code", "execution_count": 4, 
"metadata": { "ExecuteTime": { "end_time": "2019-03-04T03:51:39.507312Z", "start_time": "2019-03-04T03:51:39.304216Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Canvas is accessible via web browser at the URL: http://localhost:62302/index.html\n", "Opening Canvas in default web browser.\n" ] } ], "source": [ "g.show()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2019-03-04T03:51:59.037393Z", "start_time": "2019-03-04T03:51:58.929048Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
Finished parsing file /Users/datalab/github/bigdata/data/bond_edges.csv
" ], "text/plain": [ "Finished parsing file /Users/datalab/github/bigdata/data/bond_edges.csv" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Parsing completed. Parsed 20 lines in 0.022952 secs.
" ], "text/plain": [ "Parsing completed. Parsed 20 lines in 0.022952 secs." ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "------------------------------------------------------\n", "Inferred types from first 100 line(s) of file as \n", "column_type_hints=[str,str,str]\n", "If parsing fails due to incorrect types, you can correct\n", "the inferred type list above and pass it to read_csv in\n", "the column_type_hints argument\n", "------------------------------------------------------\n" ] }, { "data": { "text/html": [ "
Finished parsing file /Users/datalab/github/bigdata/data/bond_edges.csv
" ], "text/plain": [ "Finished parsing file /Users/datalab/github/bigdata/data/bond_edges.csv" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Parsing completed. Parsed 20 lines in 0.009524 secs.
" ], "text/plain": [ "Parsing completed. Parsed 20 lines in 0.009524 secs." ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "SGraph({'num_edges': 20, 'num_vertices': 10})\n" ] } ], "source": [ "from graphlab import SFrame,SGraph\n", "edge_data = SFrame.read_csv('../data/bond_edges.csv')\n", " #'https://static.turi.com/datasets/bond/bond_edges.csv')\n", "\n", "g = SGraph()\n", "g = g.add_edges(edge_data, src_field='src', dst_field='dst')\n", "print(g)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2017-05-13T09:40:42.615740", "start_time": "2017-05-13T09:40:41.014215" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
Downloading https://static.turi.com/datasets/bond/bond_vertices.csv to /var/tmp/graphlab-chengjun/1639/be40440c-c4eb-45f1-ae71-501011acc59f.csv
" ], "text/plain": [ "Downloading https://static.turi.com/datasets/bond/bond_vertices.csv to /var/tmp/graphlab-chengjun/1639/be40440c-c4eb-45f1-ae71-501011acc59f.csv" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Finished parsing file https://static.turi.com/datasets/bond/bond_vertices.csv
" ], "text/plain": [ "Finished parsing file https://static.turi.com/datasets/bond/bond_vertices.csv" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Parsing completed. Parsed 10 lines in 0.010129 secs.
" ], "text/plain": [ "Parsing completed. Parsed 10 lines in 0.010129 secs." ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "------------------------------------------------------\n", "Inferred types from first line of file as \n", "column_type_hints=[str,str,int,int]\n", "If parsing fails due to incorrect types, you can correct\n", "the inferred type list above and pass it to read_csv in\n", "the column_type_hints argument\n", "------------------------------------------------------\n" ] }, { "data": { "text/html": [ "
Finished parsing file https://static.turi.com/datasets/bond/bond_vertices.csv
" ], "text/plain": [ "Finished parsing file https://static.turi.com/datasets/bond/bond_vertices.csv" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Parsing completed. Parsed 10 lines in 0.0082 secs.
" ], "text/plain": [ "Parsing completed. Parsed 10 lines in 0.0082 secs." ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "vertex_data = SFrame.read_csv('https://static.turi.com/datasets/bond/bond_vertices.csv')\n", "\n", "g = SGraph(vertices=vertex_data, edges=edge_data, vid_field='name',\n", "           src_field='src', dst_field='dst')" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2019-03-04T03:52:11.073515Z", "start_time": "2019-03-04T03:52:11.069355Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Canvas is updated and available in a tab in the default browser.\n" ] } ], "source": [ "g.show(vlabel='id', highlight=['James Bond', 'Moneypenny'],\n", "       arrows=True)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Further reading\n", "- https://turi.com/learn/userguide/\n" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python [conda env:anaconda]", "language": "python", "name": "conda-env-anaconda-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.4" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 0, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": false, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "48px", "left": "1351px", "top": 
"42.6667px", "width": "168px" }, "toc_section_display": false, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 1 }