{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Get Notebook from github.com and other source.\n",
"by [openthings@163.com](http://my.oschina.net/u/2306127/blog?catalog=3420733), 2016-04. \n",
"\n",
"### 通用的Notebook更新维护的工具。\n",
"* 原始URL列表保存在文本文件git_list.txt中。\n",
"* git_list.txt转为git_list.md,在GitBook中使用。\n",
"* git_list.txt转为git_list.ipynb,在Jupyter中使用。"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from pprint import *"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### URL地址列表读入字符串变量中。\n",
"#### 注意,为了避免太长,只显示了前面指定个数的字符。"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"# Get Notebook from github.com and other source:\n",
"\n",
"# Pandas tutorial for new user.\n",
"https://bitbucket.org/hrojas/learn-pandas.git\n",
"\n",
"# echo Pandas Cookbook.\n",
"https://github.com/jvns/pandas-cookbook.git\n",
"\n",
"# Some Files for Finance Analysis.\n",
"clone https://github.com/wy36101299/ipynb-file.git\n",
"\n",
"# Practical dat\n",
"\n",
"......\n"
]
}
],
"source": [
"url_str = open(\"git_list.txt\").read()\n",
"print(url_str[0:300] + \"\\n\\n......\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 分解字符串到名称和url。"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Total: 26\n",
"[{'ugit': 'https://bitbucket.org/hrojas/learn-pandas.git',\n",
" 'uname': 'Pandas tutorial for new user.'},\n",
" {'ugit': 'https://github.com/jvns/pandas-cookbook.git',\n",
" 'uname': 'echo Pandas Cookbook.'},\n",
" {'ugit': 'clone https://github.com/wy36101299/ipynb-file.git',\n",
" 'uname': 'Some Files for Finance Analysis.'}]\n"
]
}
],
"source": [
"url_line = url_str.split(\"#\")\n",
"\n",
"url_list = []\n",
"for url in url_line:\n",
" url2 = url.strip().split(\"\\n\")\n",
" if len(url2)>1:\n",
" uname = url2[0]\n",
" ugit = url2[1]\n",
" url_dict = {\"uname\":uname,\"ugit\":ugit}\n",
" url_list.append(url_dict)\n",
" \n",
"print(\"Total:\",len(url_list))\n",
"pprint(url_list[0:3])\n",
" #print(uname,\"\\n\",ugit)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 保存到Markdown文件。"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writed url list to file: url_list.md\n"
]
}
],
"source": [
"flist = open(\"git_list.md\",\"w+\") \n",
"flist.write(\n",
"\"\"\"\n",
"## IPython Notebook Tutorial and Skills open source...\n",
"##### by [openthings@163.com](http://my.oschina.net/u/2306127/blog?catalog=3420733), 2016-04. \n",
"\"\"\"\n",
") \n",
"for d in url_list:\n",
" flist.write(\"##### \" + d[\"uname\"] + \"\\n\")\n",
" flist.write(\"[\" + d[\"ugit\"] + \"]\" + \"(\" + d[\"ugit\"] + \")\\n\")\n",
"flist.close()\n",
"print(\"Writed url list to file: url_list.md\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 抓取git库中文件到本地。如果已经存在,则git pull,否则git clone.\n",
"** 使用了IPython的!魔法操作符来执行shell操作。**"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
" 1 :\t Pandas tutorial for new user. \n",
"==>>\t https://bitbucket.org/hrojas/learn-pandas.git\n",
"\t Existed, git pull: learn-pandas\n",
"Already up-to-date.\n",
"\n",
" 2 :\t echo Pandas Cookbook. \n",
"==>>\t https://github.com/jvns/pandas-cookbook.git\n",
"\t Existed, git pull: pandas-cookbook\n",
"Already up-to-date.\n",
"\n",
" 3 :\t Some Files for Finance Analysis. \n",
"==>>\t clone https://github.com/wy36101299/ipynb-file.git\n",
"\t Existed, git pull: ipynb-file\n",
"Already up-to-date.\n",
"\n",
" 4 :\t Practical data analysis with Python \n",
"==>>\t https://leanpub.com/analyticshandbook\n",
"Git clone ......\n",
"正克隆到 'analyticshandbook'...\n",
"fatal: repository 'https://leanpub.com/analyticshandbook/' not found\n",
"\n",
" 5 :\t Mining-the-Social-Web-2nd-Edition \n",
"==>>\t https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition.git\n",
"\t Existed, git pull: Mining-the-Social-Web-2nd-Edition\n",
"Already up-to-date.\n",
"\n",
" 6 :\t Biolab \n",
"==>>\t https://github.com/biolab/ipynb.git\n",
"\t Existed, git pull: ipynb\n",
"Already up-to-date.\n",
"\n",
" 7 :\t Build a flask server for Jupyter. \n",
"==>>\t https://github.com/yhilpisch/ipynb-docker.git\n",
"\t Existed, git pull: ipynb-docker\n",
"Already up-to-date.\n",
"\n",
" 8 :\t IPython notebooks used in Georgia Tech's CSE 6040: Computing for Data Analysis \n",
"==>>\t https://github.com/rvuduc/cse6040-ipynbs.git\n",
"\t Existed, git pull: cse6040-ipynbs\n",
"Already up-to-date.\n",
"\n",
" 9 :\t Jupyter Notebook Tools for Sphinx \n",
"==>>\t https://github.com/spatialaudio/nbsphinx.git\n",
"\t Existed, git pull: nbsphinx\n",
"Already up-to-date.\n",
"\n",
" 10 :\t A collection of Notebooks for using IPython effectively \n",
"==>>\t https://github.com/odewahn/ipynb-examples.git\n",
"\t Existed, git pull: ipynb-examples\n",
"Already up-to-date.\n",
"\n",
" 11 :\t aka \"Bayesian Methods for Hackers\" \n",
"==>>\t https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers.git\n",
"\t Existed, git pull: Probabilistic-Programming-and-Bayesian-Methods-for-Hackers\n",
"Already up-to-date.\n",
"\n",
" 12 :\t /bokeh-tutorial-ipynb \n",
"==>>\t https://github.com/chdoig/bokeh-tutorial-ipynb.git\n",
"Git clone ......\n",
"正克隆到 'bokeh-tutorial-ipynb'...\n",
"remote: Counting objects: 253, done.\u001b[K\n",
"remote: Compressing objects: 100% (90/90), done.\u001b[K\n",
"remote: Total 253 (delta 161), reused 253 (delta 161), pack-reused 0\u001b[K\n",
"接收对象中: 100% (253/253), 49.82 MiB | 599.00 KiB/s, 完成.\n",
"处理 delta 中: 100% (161/161), 完成.\n",
"检查连接... 完成。\n",
"\n",
" 13 :\t collection all kinds of ipynb \n",
"==>>\t https://github.com/OpenBookProjects/ipynb.git\n",
"\t Existed, git pull: ipynb\n",
"Already up-to-date.\n",
"\n",
" 14 :\t Just a shared git repo for Social Graphs & Interaction \n",
"==>>\t https://github.com/timmevandermeer/ipynb.git\n",
"\t Existed, git pull: ipynb\n",
"Already up-to-date.\n",
"\n",
" 15 :\t Neural Networks Training Jupyter Notebooks \n",
"==>>\t https://github.com/tmeits/ipynb.git\n",
"\t Existed, git pull: ipynb\n",
"Already up-to-date.\n",
"\n",
" 16 :\t tensorflow-ipynb \n",
"==>>\t https://github.com/fujun-liu/tensorflow-ipynb.git\n",
"\t Existed, git pull: tensorflow-ipynb\n",
"Already up-to-date.\n",
"\n",
" 17 :\t https://github.com/charlesjhlee/ml_ipynb.git \n",
"==>>\t https://github.com/charlesjhlee/ml_ipynb.git\n",
"\t Existed, git pull: ml_ipynb\n",
"Already up-to-date.\n",
"\n",
" 18 :\t Ipython Notebook Visuals \n",
"==>>\t https://github.com/bangadennis/ipynb_visuals.git\n",
"\t Existed, git pull: ipynb_visuals\n",
"Already up-to-date.\n",
"\n",
" 19 :\t Canvas Widget for IPython Notebook https://github.com/Who8MyLunch/ipynb_widget_canvas \n",
"==>>\t https://github.com/Who8MyLunch/ipynb_widget_canvas.git\n",
"\t Existed, git pull: ipynb_widget_canvas\n",
"Already up-to-date.\n",
"\n",
" 20 :\t Copy of Udacity DL course ipynb files, and maybe some other stuff \n",
"==>>\t https://github.com/damienstanton/tensorflownotes.git\n",
"\t Existed, git pull: tensorflownotes\n",
"Already up-to-date.\n",
"\n",
" 21 :\t This is an ipynb in which we use simple logistic regression from Spark MLlib to train a sarcasm detector on comments from reddit. \n",
"==>>\t https://github.com/FranekJemiolo/SarcasmDetector.git\n",
"\t Existed, git pull: SarcasmDetector\n",
"Already up-to-date.\n",
"\n",
" 22 :\t IPython project \n",
"==>>\t https://github.com/ipython/ipython.git\n",
"Git clone ......\n",
"正克隆到 'ipython'...\n",
"remote: Counting objects: 154141, done.\u001b[K\n",
"remote: Compressing objects: 100% (37/37), done.\u001b[K\n",
"error: RPC failed; curl 56 GnuTLS recv error (-9): A TLS packet with unexpected length was received.\n",
"fatal: The remote end hung up unexpectedly\n",
"fatal: 过早的文件结束符(EOF)\n",
"fatal: index-pack failed\n",
"\n",
" 23 :\t Topik project \n",
"==>>\t https://github.com/ContinuumIO/topik.git\n",
"\t Existed, git pull: topik\n",
"Already up-to-date.\n",
"\n",
" 24 :\t scientific-python-lectures \n",
"==>>\t https://github.com/ContinuumIO/scientific-python-lectures.git\n",
"\t Existed, git pull: scientific-python-lectures\n",
"Already up-to-date.\n",
"\n",
" 25 :\t Continuum work from XDATA January 2016 Hackathon with U. S. Census Bureau \n",
"==>>\t https://github.com/ContinuumIO/xdata-2016-census.git\n",
"\t Existed, git pull: xdata-2016-census\n",
"Already up-to-date.\n",
"\n",
" 26 :\t Analysis on Each Image \n",
"==>>\t https://github.com/ContinuumIO/image-analyzer.git\n",
"\t Existed, git pull: image-analyzer\n",
"Already up-to-date.\n",
"Finished.\n"
]
}
],
"source": [
"import os\n",
"import os.path\n",
"\n",
"index = 0\n",
"for d in url_list:\n",
" index += 1\n",
" print(\"\\n\",index,\":\\t\",d[\"uname\"],\"\\n==>>\\t\",d[\"ugit\"])\n",
"\n",
" git_path = os.path.split(d[\"ugit\"])\n",
" git_name = git_path[1].split(\".\")[0]\n",
" #print(git_name)\n",
" \n",
" if os.path.exists(git_name):\n",
" print(\"\\t Existed, git pull:\",git_name,\" ...\")\n",
" ! cd $git_name && git pull\n",
" else:\n",
" print(\"Git clone ......\")\n",
" ucmd = \"git clone \" + d[\"ugit\"]\n",
" #print(ucmd)\n",
" ! $ucmd\n",
"print(\"Finished.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.1"
}
},
"nbformat": 4,
"nbformat_minor": 0
}