{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Get Notebooks from github.com and Other Sources\n", "by [openthings@163.com](http://my.oschina.net/u/2306127/blog?catalog=3420733), 2016-04. \n", "\n", "### A general-purpose tool for updating and maintaining notebooks.\n", "* The original URL list is stored in the text file git_list.txt.\n", "* git_list.txt is converted to git_list.md for use in GitBook.\n", "* git_list.txt is converted to git_list.ipynb for use in Jupyter." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from pprint import pprint" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Read the URL list into a string variable.\n", "#### Note: to keep the output short, only the first 300 characters are shown." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "# Get Notebook from github.com and other source:\n", "\n", "# Pandas tutorial for new user.\n", "https://bitbucket.org/hrojas/learn-pandas.git\n", "\n", "# echo Pandas Cookbook.\n", "https://github.com/jvns/pandas-cookbook.git\n", "\n", "# Some Files for Finance Analysis.\n", "clone https://github.com/wy36101299/ipynb-file.git\n", "\n", "# Practical dat\n", "\n", "......\n" ] } ], "source": [ "with open(\"git_list.txt\") as f:\n", "    url_str = f.read()\n", "print(url_str[0:300] + \"\\n\\n......\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Split the string into names and URLs." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total: 26\n", "[{'ugit': 'https://bitbucket.org/hrojas/learn-pandas.git',\n", " 'uname': 'Pandas tutorial for new user.'},\n", " {'ugit': 'https://github.com/jvns/pandas-cookbook.git',\n", " 'uname': 'echo Pandas Cookbook.'},\n", " {'ugit': 'clone https://github.com/wy36101299/ipynb-file.git',\n", " 'uname': 'Some Files for Finance Analysis.'}]\n" ] } ], "source": [ "# Each entry is a '#'-prefixed name line followed by a URL line.\n", "url_line = url_str.split(\"#\")\n", "\n", "url_list = []\n", "for url in url_line:\n", "    url2 = 
url.strip().split(\"\\n\")\n", "    if len(url2)>1:\n", "        uname = url2[0]\n", "        ugit = url2[1]\n", "        url_dict = {\"uname\":uname,\"ugit\":ugit}\n", "        url_list.append(url_dict)\n", " \n", "print(\"Total:\",len(url_list))\n", "pprint(url_list[0:3])\n", " #print(uname,\"\\n\",ugit)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Save to a Markdown file." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Wrote url list to file: git_list.md\n" ] } ], "source": [ "with open(\"git_list.md\",\"w\") as flist:\n", "    flist.write(\n", "\"\"\"\n", "## IPython Notebook Tutorial and Skills open source...\n", "##### by [openthings@163.com](http://my.oschina.net/u/2306127/blog?catalog=3420733), 2016-04. \n", "\"\"\"\n", "    )\n", "    for d in url_list:\n", "        flist.write(\"##### \" + d[\"uname\"] + \"\\n\")\n", "        flist.write(\"[\" + d[\"ugit\"] + \"]\" + \"(\" + d[\"ugit\"] + \")\\n\")\n", "# The file is closed automatically when the with block exits.\n", "print(\"Wrote url list to file: git_list.md\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fetch the git repositories to the local machine. If a repository already exists locally, run git pull; otherwise run git clone.\n", "**Uses IPython's `!` shell escape to run the shell commands.**" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", " 1 :\t Pandas tutorial for new user. \n", "==>>\t https://bitbucket.org/hrojas/learn-pandas.git\n", "\t Existed, git pull: learn-pandas\n", "Already up-to-date.\n", "\n", " 2 :\t echo Pandas Cookbook. \n", "==>>\t https://github.com/jvns/pandas-cookbook.git\n", "\t Existed, git pull: pandas-cookbook\n", "Already up-to-date.\n", "\n", " 3 :\t Some Files for Finance Analysis. 
\n", "==>>\t clone https://github.com/wy36101299/ipynb-file.git\n", "\t Existed, git pull: ipynb-file\n", "Already up-to-date.\n", "\n", " 4 :\t Practical data analysis with Python \n", "==>>\t https://leanpub.com/analyticshandbook\n", "Git clone ......\n", "Cloning into 'analyticshandbook'...\n", "fatal: repository 'https://leanpub.com/analyticshandbook/' not found\n", "\n", " 5 :\t Mining-the-Social-Web-2nd-Edition \n", "==>>\t https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition.git\n", "\t Existed, git pull: Mining-the-Social-Web-2nd-Edition\n", "Already up-to-date.\n", "\n", " 6 :\t Biolab \n", "==>>\t https://github.com/biolab/ipynb.git\n", "\t Existed, git pull: ipynb\n", "Already up-to-date.\n", "\n", " 7 :\t Build a flask server for Jupyter. \n", "==>>\t https://github.com/yhilpisch/ipynb-docker.git\n", "\t Existed, git pull: ipynb-docker\n", "Already up-to-date.\n", "\n", " 8 :\t IPython notebooks used in Georgia Tech's CSE 6040: Computing for Data Analysis \n", "==>>\t https://github.com/rvuduc/cse6040-ipynbs.git\n", "\t Existed, git pull: cse6040-ipynbs\n", "Already up-to-date.\n", "\n", " 9 :\t Jupyter Notebook Tools for Sphinx \n", "==>>\t https://github.com/spatialaudio/nbsphinx.git\n", "\t Existed, git pull: nbsphinx\n", "Already up-to-date.\n", "\n", " 10 :\t A collection of Notebooks for using IPython effectively \n", "==>>\t https://github.com/odewahn/ipynb-examples.git\n", "\t Existed, git pull: ipynb-examples\n", "Already up-to-date.\n", "\n", " 11 :\t aka \"Bayesian Methods for Hackers\" \n", "==>>\t https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers.git\n", "\t Existed, git pull: Probabilistic-Programming-and-Bayesian-Methods-for-Hackers\n", "Already up-to-date.\n", "\n", " 12 :\t /bokeh-tutorial-ipynb \n", "==>>\t https://github.com/chdoig/bokeh-tutorial-ipynb.git\n", "Git clone ......\n", "Cloning into 'bokeh-tutorial-ipynb'...\n", "remote: Counting objects: 253, done.\n", "remote: 
Compressing objects: 100% (90/90), done.\n", "remote: Total 253 (delta 161), reused 253 (delta 161), pack-reused 0\n", "Receiving objects: 100% (253/253), 49.82 MiB | 599.00 KiB/s, done.\n", "Resolving deltas: 100% (161/161), done.\n", "Checking connectivity... done.\n", "\n", " 13 :\t collection all kinds of ipynb \n", "==>>\t https://github.com/OpenBookProjects/ipynb.git\n", "\t Existed, git pull: ipynb\n", "Already up-to-date.\n", "\n", " 14 :\t Just a shared git repo for Social Graphs & Interaction \n", "==>>\t https://github.com/timmevandermeer/ipynb.git\n", "\t Existed, git pull: ipynb\n", "Already up-to-date.\n", "\n", " 15 :\t Neural Networks Training Jupyter Notebooks \n", "==>>\t https://github.com/tmeits/ipynb.git\n", "\t Existed, git pull: ipynb\n", "Already up-to-date.\n", "\n", " 16 :\t tensorflow-ipynb \n", "==>>\t https://github.com/fujun-liu/tensorflow-ipynb.git\n", "\t Existed, git pull: tensorflow-ipynb\n", "Already up-to-date.\n", "\n", " 17 :\t https://github.com/charlesjhlee/ml_ipynb.git \n", "==>>\t https://github.com/charlesjhlee/ml_ipynb.git\n", "\t Existed, git pull: ml_ipynb\n", "Already up-to-date.\n", "\n", " 18 :\t Ipython Notebook Visuals \n", "==>>\t https://github.com/bangadennis/ipynb_visuals.git\n", "\t Existed, git pull: ipynb_visuals\n", "Already up-to-date.\n", "\n", " 19 :\t Canvas Widget for IPython Notebook https://github.com/Who8MyLunch/ipynb_widget_canvas \n", "==>>\t https://github.com/Who8MyLunch/ipynb_widget_canvas.git\n", "\t Existed, git pull: ipynb_widget_canvas\n", "Already up-to-date.\n", "\n", " 20 :\t Copy of Udacity DL course ipynb files, and maybe some other stuff \n", "==>>\t https://github.com/damienstanton/tensorflownotes.git\n", "\t Existed, git pull: tensorflownotes\n", "Already up-to-date.\n", "\n", " 21 :\t This is an ipynb in which we use simple logistic regression from Spark MLlib to train a sarcasm detector on comments from reddit. 
\n", "==>>\t https://github.com/FranekJemiolo/SarcasmDetector.git\n", "\t Existed, git pull: SarcasmDetector\n", "Already up-to-date.\n", "\n", " 22 :\t IPython project \n", "==>>\t https://github.com/ipython/ipython.git\n", "Git clone ......\n", "Cloning into 'ipython'...\n", "remote: Counting objects: 154141, done.\n", "remote: Compressing objects: 100% (37/37), done.\n", "error: RPC failed; curl 56 GnuTLS recv error (-9): A TLS packet with unexpected length was received.\n", "fatal: The remote end hung up unexpectedly\n", "fatal: early EOF\n", "fatal: index-pack failed\n", "\n", " 23 :\t Topik project \n", "==>>\t https://github.com/ContinuumIO/topik.git\n", "\t Existed, git pull: topik\n", "Already up-to-date.\n", "\n", " 24 :\t scientific-python-lectures \n", "==>>\t https://github.com/ContinuumIO/scientific-python-lectures.git\n", "\t Existed, git pull: scientific-python-lectures\n", "Already up-to-date.\n", "\n", " 25 :\t Continuum work from XDATA January 2016 Hackathon with U. S. Census Bureau \n", "==>>\t https://github.com/ContinuumIO/xdata-2016-census.git\n", "\t Existed, git pull: xdata-2016-census\n", "Already up-to-date.\n", "\n", " 26 :\t Analysis on Each Image \n", "==>>\t https://github.com/ContinuumIO/image-analyzer.git\n", "\t Existed, git pull: image-analyzer\n", "Already up-to-date.\n", "Finished.\n" ] } ], "source": [ "import os\n", "import os.path\n", "\n", "index = 0\n", "for d in url_list:\n", "    index += 1\n", "    print(\"\\n\",index,\":\\t\",d[\"uname\"],\"\\n==>>\\t\",d[\"ugit\"])\n", "\n", "    # Derive the local directory name from the last path component of the URL.\n", "    git_path = os.path.split(d[\"ugit\"])\n", "    git_name = git_path[1].split(\".\")[0]\n", "    #print(git_name)\n", " \n", "    if os.path.exists(git_name):\n", "        print(\"\\t Existed, git pull:\",git_name,\" ...\")\n", "        ! cd $git_name && git pull\n", "    else:\n", "        print(\"Git clone ......\")\n", "        ucmd = \"git clone \" + d[\"ugit\"]\n", "        #print(ucmd)\n", "        ! 
$ucmd\n", "print(\"Finished.\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.1" } }, "nbformat": 4, "nbformat_minor": 0 }