{ "metadata": { "name": "", "signature": "sha256:6325d03b4911b2fe2381e18c976fc024715dd50631e9257256b40b228669291e" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Part 1: Interpreting and Visualizing Railway Route Map Data as a Graph\n", "\n", "by [Keiichiro Ono](http://keiono.github.io/)\n", "\n", "----\n", "\n", "All of the base data in this notebook comes from the free edition of the datasets published by [\u99c5\u30c7\u30fc\u30bf.jp](http://www.ekidata.jp/).\n", "\n", "## Introduction\n", "This notebook records the data processing steps used to build a Cytoscape session file that maps freely available railway data. (Efficiency was ignored, so please bear with it.)\n", "\n", "For details, see [this article](http://qiita.com/keiono/items/29286f49b15a5b13c987).\n", "\n", "\n", "### Goal\n", "Create a blank map, represented as a graph, that can be used in Cytoscape.\n", "\n", "### Prerequisites\n", "No mathematical background is needed to follow this sample; basic knowledge of Python and Pandas is enough.\n", 
"\n", "### \u30a2\u30c3\u30d7\u30c7\u30fc\u30c8\u60c5\u5831\n", "* 8/15/2014: \u73fe\u5728\u3053\u306e\u30ce\u30fc\u30c8\u306f\u30a2\u30c3\u30d7\u30c7\u30fc\u30c8\u4e2d\u3067\u3059\u3002\u968f\u6642\u4fee\u6b63\u3092\u52a0\u3048\u3066\u3044\u304d\u307e\u3059\u3002\n", "\n", "----\n", "## \u5b9f\u969b\u306e\u30ef\u30fc\u30af\u30d5\u30ed\u30fc\n", "\n", "### 0. \u5fc5\u8981\u306a\u30e9\u30a4\u30d6\u30e9\u30ea\u306e\u8aad\u307f\u8fbc\u307f" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import pandas as pd" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1. \u30c7\u30fc\u30bf\u306e\u8aad\u307f\u8fbc\u307f\n", "\u307e\u305a\u3001\u5168\u3066\u306e\u30c7\u30fc\u30bf\u3092Pandas\u306eDataFrame\u306b\u8aad\u307f\u8fbc\u307f\u307e\u3059\u3002\u3053\u308c\u306f\uff08\u73fe\u5728\u306e\u57fa\u6e96\u3067\u306f\uff09\u3068\u3066\u3082\u5c0f\u3055\u306a\u30c7\u30fc\u30bf\u30bb\u30c3\u30c8\u306a\u306e\u3067\u3001\u666e\u901a\u306e\u30e9\u30c3\u30d7\u30c8\u30c3\u30d7\u3067\u3082\u5168\u304f\u554f\u984c\u3042\u308a\u307e\u305b\u3093\u3002\u3059\u3079\u3066\u306e\u4f5c\u696d\u3092\u30e1\u30e2\u30ea\u4e0a\u3067\u884c\u3044\u307e\u3059\u3002" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# \u99c5\u306b\u95a2\u3059\u308b\u30c7\u30fc\u30bf\n", "station_df = pd.read_csv('station20140303free.csv')\n", "\n", "# \u8def\u7dda\u306b\u95a2\u3059\u308b\u30c7\u30fc\u30bf\n", "connection_df = pd.read_csv('line20140303free.csv')\n", "\n", "# \u99c5\u3068\u99c5\u306e\u63a5\u7d9a\u30c7\u30fc\u30bf\u3001\u3064\u307e\u308a\u30b0\u30e9\u30d5\n", "graph_df = pd.read_csv('join20140303.csv')\n", "\n", "# \u9244\u9053\u4f1a\u793e\u306e\u30c7\u30fc\u30bf\n", "company_df = pd.read_csv('company20130120.csv')\n", "\n", "# \u770c\u540d\u3068\u770cID\u306e\u30c6\u30fc\u30d6\u30eb\n", "pref_df = pd.read_csv('pref.csv')\n", "\n", "# 
List of line names (used later)\n", "line_names = connection_df[['line_cd', 'line_name']]\n", "line_names.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ " line_cd line_name\n", "0 1001 \u4e2d\u592e\u65b0\u5e79\u7dda\n", "1 1002 \u6771\u6d77\u9053\u65b0\u5e79\u7dda\n", "2 1003 \u5c71\u967d\u65b0\u5e79\u7dda\n", "3 1004 \u6771\u5317\u65b0\u5e79\u7dda\n", "4 1005 \u4e0a\u8d8a\u65b0\u5e79\u7dda\n", "\n", "[5 rows x 2 columns]" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2. \u30c7\u30fc\u30bf\u306e\u6383\u9664\n", "\u305d\u306e\u307e\u307e\u3067\u3082\u4f7f\u3048\u308b\u306e\u3067\u3059\u304c\u3001\u898b\u6804\u3048\u306e\u826f\u3044\u7d50\u679c\u3092\u5f97\u308b\u305f\u3081\u306b\u3001\u5fc5\u8981\u306e\u306a\u3044\u3082\u306e\u3092\u524a\u9664\u3084\u3001\u65b0\u305f\u306a\u30c7\u30fc\u30bf\u306e\u751f\u6210\u3092\u884c\u3044\u307e\u3059\u3002\n", "\n", "#### \u5fc5\u8981\u306e\u306a\u3044\u30ab\u30e9\u30e0\u306e\u524a\u9664\n", "\u6709\u6599\u7248\u306e\u307f\u306b\u63d0\u4f9b\u3055\u308c\u3066\u3044\u308b\u30ab\u30e9\u30e0\u306f\u524a\u9664\u3057\u307e\u3059\u3002" ] }, { "cell_type": "code", "collapsed": false, "input": [ "station_df.drop(['station_name_k', 'station_name_r', 'e_sort'], axis=1, inplace=True)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### \u7def\u5ea6\u3068\u7d4c\u5ea6\u304b\u3089Cytoscape\u306e\u5ea7\u6a19\u7cfb\u306b\u5909\u63db\u3059\u308b\n", "\u5ea7\u6a19\u7cfb\u306e\u5909\u63db\u3068\u8a00\u3063\u3066\u3082\u3001\u7279\u306b\u8907\u96d1\u306a\u8a08\u7b97\u306f\u5fc5\u8981\u3042\u308a\u307e\u305b\u3093\u3002y\u8ef8\u306e\u4e0b\u304c\u6b63\u306e\u65b9\u5411\u306b\u306a\u3063\u3066\u3044\u308b\u306e\u3067\u3001\u305d\u308c\u3092\u4fee\u6b63\u3057\u3001\u3042\u3068\u306f\u6b63\u3057\u304f\u8868\u793a\u3055\u308c\u308b\u3088\u3046\u306b\u30b9\u30b1\u30fc\u30ea\u30f3\u30b0\u3057\u307e\u3059\u3002" ] }, { "cell_type": "code", "collapsed": false, "input": [ "SCALE_FACTOR = 10000\n", "\n", 
"station_df['x'] = station_df['lon'] * SCALE_FACTOR\n", "station_df['y'] = station_df['lat'] * (-SCALE_FACTOR)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### \u770c\u540d\u306e\u30c7\u30fc\u30bf\u30d5\u30ec\u30fc\u30e0\u3068\u99c5\u60c5\u5831\u306e\u7d71\u5408\n", "\u30b7\u30f3\u30d7\u30eb\u306a\u30de\u30fc\u30b8\u3092\u884c\u3044\u3001\u770cID\u3092\u770c\u540d\u306b\u5909\u63db\u3057\u307e\u3059" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# \u4eba\u9593\u306b\u3068\u3063\u3066\u8aad\u307f\u3084\u3059\u3044\u3088\u3046\u306b\u3001\u770c\u306eID\u3092\u770c\u540d\u306b\u5909\u63db\u3059\u308b\n", "merged_stations = pd.merge(pref_df, station_df, on = 'pref_cd')\n", "\n", "# \u8def\u7dda\u540d\u3082\u4eba\u304c\u8aad\u3081\u308b\u3082\u306e\u306b\u3059\u308b\n", "merged_stations = pd.merge(merged_stations, line_names, on='line_cd')\n", "\n", "# \u5fc5\u8981\u306e\u7121\u304f\u306a\u3063\u305f\u30ab\u30e9\u30e0\u3092\u9664\u53bb\n", "merged_stations.drop(['pref_cd', 'line_cd'], axis=1, inplace=True)\n", "\n", "merged_stations.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ " pref_name station_cd station_g_cd station_name post \\\n", "0 \u5317\u6d77\u9053 1110101 1110101 \u51fd\u9928 040-0063 \n", "1 \u5317\u6d77\u9053 1110102 1110102 \u4e94\u7a1c\u90ed 041-0813 \n", "2 \u5317\u6d77\u9053 1110103 1110103 \u6854\u6897 041-1210 \n", "3 \u5317\u6d77\u9053 1110104 1110104 \u5927\u4e2d\u5c71 041-1121 \n", "4 \u5317\u6d77\u9053 1110105 1110105 \u4e03\u98ef 041-1111 \n", "\n", " add lon lat open_ymd close_ymd e_status \\\n", "0 \u5317\u6d77\u9053\u51fd\u9928\u5e02\u82e5\u677e\u753a\uff11\uff12-\uff11\uff13 140.726413 41.773709 1902-12-10 NaN 0 \n", "1 \u51fd\u9928\u5e02\u4e80\u7530\u672c\u753a 140.733539 41.803557 NaN NaN 0 \n", "2 \u5317\u6d77\u9053\u51fd\u9928\u5e02\u6854\u6897\uff13\u4e01\u76ee\uff14\uff11-\uff13\uff16 140.722952 41.846457 1902-12-10 NaN 0 \n", "3 \u4e80\u7530\u90e1\u4e03\u98ef\u753a\u5927\u5b57\u5927\u4e2d\u5c71 140.713580 41.864641 NaN NaN 0 \n", "4 \u4e80\u7530\u90e1\u4e03\u98ef\u753a\u5b57\u672c\u753a 140.688556 41.886971 NaN NaN 0 \n", "\n", " x y line_name \n", "0 1407264.13 -417737.09 JR\u51fd\u9928\u672c\u7dda(\u51fd\u9928\uff5e\u9577\u4e07\u90e8) \n", "1 1407335.39 -418035.57 JR\u51fd\u9928\u672c\u7dda(\u51fd\u9928\uff5e\u9577\u4e07\u90e8) \n", "2 1407229.52 -418464.57 JR\u51fd\u9928\u672c\u7dda(\u51fd\u9928\uff5e\u9577\u4e07\u90e8) \n", "3 1407135.80 -418646.41 JR\u51fd\u9928\u672c\u7dda(\u51fd\u9928\uff5e\u9577\u4e07\u90e8) \n", "4 1406885.56 -418869.71 JR\u51fd\u9928\u672c\u7dda(\u51fd\u9928\uff5e\u9577\u4e07\u90e8) \n", "\n", "[5 rows x 14 columns]" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### \u8def\u7dda\u30c7\u30fc\u30bf\u306e\u7c21\u5358\u306a\u52a0\u5de5\n", "\u8def\u7dda\u60c5\u5831\u306e\u30c6\u30fc\u30d6\u30eb\u306b\u3082\u7c21\u5358\u306a\u52a0\u5de5\u3092\u65bd\u3057\u307e\u3059\u3002\n", "\n", "* \u5fc5\u8981\u306e\u306a\u3044\u30ab\u30e9\u30e0\u306e\u9664\u53bb" ] }, { 
"cell_type": "code", "collapsed": false, "input": [ "# \u5fc5\u8981\u306a\u3044\u30ab\u30e9\u30e0\u306e\u524a\u9664\n", "connection_df.drop(['line_color_c', 'line_color_t', 'line_type', 'e_sort'], axis=1, inplace=True)\n", "connection_df.head(5)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ " line_cd company_cd line_name line_name_k line_name_h lon \\\n", "0 1001 3 \u4e2d\u592e\u65b0\u5e79\u7dda \u30c1\u30e5\u30a6\u30aa\u30a6\u30b7\u30f3\u30ab\u30f3\u30bb\u30f3 \u4e2d\u592e\u65b0\u5e79\u7dda 137.493896 \n", "1 1002 3 \u6771\u6d77\u9053\u65b0\u5e79\u7dda \u30c8\u30a6\u30ab\u30a4\u30c9\u30a6\u30b7\u30f3\u30ab\u30f3\u30bb\u30f3 \u6771\u6d77\u9053\u65b0\u5e79\u7dda 137.721489 \n", "2 1003 4 \u5c71\u967d\u65b0\u5e79\u7dda \u30b5\u30f3\u30e8\u30a6\u30b7\u30f3\u30ab\u30f3\u30bb\u30f3 \u5c71\u967d\u65b0\u5e79\u7dda 133.147896 \n", "3 1004 2 \u6771\u5317\u65b0\u5e79\u7dda \u30c8\u30a6\u30db\u30af\u30b7\u30f3\u30ab\u30f3\u30bb\u30f3 \u6771\u5317\u65b0\u5e79\u7dda 140.763192 \n", "4 1005 2 \u4e0a\u8d8a\u65b0\u5e79\u7dda \u30b8\u30e7\u30a6\u30a8\u30c4\u30b7\u30f3\u30ab\u30f3\u30bb\u30f3 \u4e0a\u8d8a\u65b0\u5e79\u7dda 139.121488 \n", "\n", " lat zoom e_status \n", "0 35.411438 8 1 \n", "1 35.144122 7 0 \n", "2 34.419338 7 0 \n", "3 38.274267 7 0 \n", "4 36.798565 8 0 \n", "\n", "[5 rows x 9 columns]" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### \u4e8b\u696d\u8005\u60c5\u5831\u3068\u8def\u7dda\u30c7\u30fc\u30bf\u306e\u30de\u30fc\u30b8\n", "\u3053\u306e\u30de\u30fc\u30b8\u306b\u3088\u308a\u30c6\u30fc\u30d6\u30eb\u304c\u5197\u9577\u306b\u306a\u308a\u307e\u3059\u304c\u3001Cytoscape\u4e0a\u3067\u4fbf\u5229\u306a\u306e\u3067\u5b9f\u884c\u3057\u3066\u65b0\u3057\u3044\u30c6\u30fc\u30d6\u30eb\u306b\u3057\u307e\u3059\u3002" ] }, { "cell_type": "code", "collapsed": false, "input": [ "connection_final_df= pd.merge(connection_df, company_df, on='company_cd')\n", "connection_final_df.drop(['e_sort'], axis=1, inplace=True)\n", "connection_final_df.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ " line_cd company_cd line_name line_name_k line_name_h \\\n", "0 1001 3 \u4e2d\u592e\u65b0\u5e79\u7dda \u30c1\u30e5\u30a6\u30aa\u30a6\u30b7\u30f3\u30ab\u30f3\u30bb\u30f3 \u4e2d\u592e\u65b0\u5e79\u7dda \n", "1 1002 3 \u6771\u6d77\u9053\u65b0\u5e79\u7dda \u30c8\u30a6\u30ab\u30a4\u30c9\u30a6\u30b7\u30f3\u30ab\u30f3\u30bb\u30f3 \u6771\u6d77\u9053\u65b0\u5e79\u7dda \n", "2 11402 3 JR\u8eab\u5ef6\u7dda \u30df\u30ce\u30d6\u30bb\u30f3 JR\u8eab\u5ef6\u7dda \n", "3 11411 3 JR\u4e2d\u592e\u672c\u7dda(\u540d\u53e4\u5c4b\uff5e\u5869\u5c3b) \u30c1\u30e5\u30a6\u30aa\u30a6\u30db\u30f3\u30bb\u30f3 JR\u4e2d\u592e\u672c\u7dda(\u540d\u53e4\u5c4b\uff5e\u5869\u5c3b) \n", "4 11413 3 JR\u98ef\u7530\u7dda(\u8c4a\u6a4b\uff5e\u5929\u7adc\u5ce1) \u30a4\u30a4\u30c0\u30bb\u30f3 JR\u98ef\u7530\u7dda(\u8c4a\u6a4b\uff5e\u5929\u7adc\u5ce1) \n", "\n", " lon lat zoom e_status_x rr_cd company_name company_name_k \\\n", "0 137.493896 35.411438 8 1 11 JR\u6771\u6d77 \u30b8\u30a7\u30a4\u30a2\u30fc\u30eb\u30c8\u30a6\u30ab\u30a4 \n", "1 137.721489 35.144122 7 0 11 JR\u6771\u6d77 \u30b8\u30a7\u30a4\u30a2\u30fc\u30eb\u30c8\u30a6\u30ab\u30a4 \n", "2 138.532397 35.392163 10 0 11 JR\u6771\u6d77 \u30b8\u30a7\u30a4\u30a2\u30fc\u30eb\u30c8\u30a6\u30ab\u30a4 \n", "3 137.468492 35.662471 9 0 11 JR\u6771\u6d77 \u30b8\u30a7\u30a4\u30a2\u30fc\u30eb\u30c8\u30a6\u30ab\u30a4 \n", "4 137.668949 35.125648 10 0 11 JR\u6771\u6d77 \u30b8\u30a7\u30a4\u30a2\u30fc\u30eb\u30c8\u30a6\u30ab\u30a4 \n", "\n", " company_name_h company_name_r company_url company_type \\\n", "0 \u6771\u6d77\u65c5\u5ba2\u9244\u9053\u682a\u5f0f\u4f1a\u793e JR\u6771\u6d77 http://jr-central.co.jp/ 1 \n", "1 \u6771\u6d77\u65c5\u5ba2\u9244\u9053\u682a\u5f0f\u4f1a\u793e JR\u6771\u6d77 http://jr-central.co.jp/ 1 \n", "2 \u6771\u6d77\u65c5\u5ba2\u9244\u9053\u682a\u5f0f\u4f1a\u793e JR\u6771\u6d77 http://jr-central.co.jp/ 1 \n", "3 \u6771\u6d77\u65c5\u5ba2\u9244\u9053\u682a\u5f0f\u4f1a\u793e 
JR\u6771\u6d77 http://jr-central.co.jp/ 1 \n", "4 \u6771\u6d77\u65c5\u5ba2\u9244\u9053\u682a\u5f0f\u4f1a\u793e JR\u6771\u6d77 http://jr-central.co.jp/ 1 \n", "\n", " e_status_y \n", "0 0 \n", "1 0 \n", "2 0 \n", "3 0 \n", "4 0 \n", "\n", "[5 rows x 17 columns]" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Merging the station connection data with the line information\n", "This connection data contains entries for which there is no station information. We remove them with a small extra step." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Merge the graph with the line data\n", "graph_new_df = pd.merge(connection_final_df, graph_df, on='line_cd')\n", "\n", "# Collect all unique station IDs from the station DataFrame\n", "all_stations = merged_stations['station_cd'].unique()\n", "\n", "# Extract the unique station IDs that appear in the graph data\n", "st1 = graph_new_df['station_cd1']\n", "st2 = graph_new_df['station_cd2']\n", "stations_in_graph = pd.concat([st1, st2]).unique()\n", "\n", "# Function that checks whether both endpoints of an edge have station data\n", "def has_loc(station_id1, station_id2, all_stations):\n", "    return station_id1 in all_stations and station_id2 in all_stations\n", "\n", "# Apply the function to every row through a lambda\n", "graph_new_df['has_station_data'] = graph_new_df.apply(lambda row: has_loc(row['station_cd1'], row['station_cd2'], 
all_stations), axis=1)\n", "\n", "# Remove edges that lack station data\n", "graph_final_df = graph_new_df[graph_new_df['has_station_data']]\n", "graph_final_df.head(3)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ " line_cd company_cd line_name line_name_k line_name_h lon \\\n", "16 11402 3 JR\u8eab\u5ef6\u7dda \u30df\u30ce\u30d6\u30bb\u30f3 JR\u8eab\u5ef6\u7dda 138.532397 \n", "17 11402 3 JR\u8eab\u5ef6\u7dda \u30df\u30ce\u30d6\u30bb\u30f3 JR\u8eab\u5ef6\u7dda 138.532397 \n", "18 11402 3 JR\u8eab\u5ef6\u7dda \u30df\u30ce\u30d6\u30bb\u30f3 JR\u8eab\u5ef6\u7dda 138.532397 \n", "\n", " lat zoom e_status_x rr_cd company_name company_name_k \\\n", "16 35.392163 10 0 11 JR\u6771\u6d77 \u30b8\u30a7\u30a4\u30a2\u30fc\u30eb\u30c8\u30a6\u30ab\u30a4 \n", "17 35.392163 10 0 11 JR\u6771\u6d77 \u30b8\u30a7\u30a4\u30a2\u30fc\u30eb\u30c8\u30a6\u30ab\u30a4 \n", "18 35.392163 10 0 11 JR\u6771\u6d77 \u30b8\u30a7\u30a4\u30a2\u30fc\u30eb\u30c8\u30a6\u30ab\u30a4 \n", "\n", " company_name_h company_name_r company_url company_type \\\n", "16 \u6771\u6d77\u65c5\u5ba2\u9244\u9053\u682a\u5f0f\u4f1a\u793e JR\u6771\u6d77 http://jr-central.co.jp/ 1 \n", "17 \u6771\u6d77\u65c5\u5ba2\u9244\u9053\u682a\u5f0f\u4f1a\u793e JR\u6771\u6d77 http://jr-central.co.jp/ 1 \n", "18 \u6771\u6d77\u65c5\u5ba2\u9244\u9053\u682a\u5f0f\u4f1a\u793e JR\u6771\u6d77 http://jr-central.co.jp/ 1 \n", "\n", " e_status_y station_cd1 station_cd2 has_station_data \n", "16 0 1140201 1140202 True \n", "17 0 1140202 1140203 True \n", "18 0 1140203 1140204 True \n", "\n", "[3 rows x 20 columns]" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.\u30c7\u30fc\u30bf\u306e\u66f8\u304d\u51fa\u3057\n", "\u4eca\u56de\u306f\u57fa\u790e\u3092\u5b66\u3076\u305f\u3081\u306b\u3001\u30d5\u30a1\u30a4\u30eb\u30d9\u30fc\u30b9\u3067\u30c7\u30fc\u30bf\u3092\u3084\u308a\u3068\u308a\u3057\u307e\u3059\u3002\u4ee5\u4e0b\u306eCSV\u30d5\u30a1\u30a4\u30eb\u306fCytoscape\u306b\u5bb9\u6613\u306b\u8aad\u307f\u8fbc\u3081\u307e\u3059\u3002" ] }, { "cell_type": "code", "collapsed": false, "input": [ "graph_final_df.to_csv('graph_disconnected.csv')\n", 
"merged_stations.to_csv('stations.csv')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4. Cytoscape\u3067\u306e\u8aad\u307f\u8fbc\u307f\u3068\u53ef\u8996\u5316\n", "\u30c6\u30ad\u30b9\u30c8\u30c6\u30fc\u30d6\u30eb\u306b\u306a\u308c\u3070\u3001Cytoscape\u306b\u8aad\u307f\u8fbc\u3080\u3053\u3068\u306f\u7c21\u5358\u3067\u3059\u3002\u8a73\u7d30\u306f\u8a18\u4e8b\u306e\u65b9\u306b\u3002\n", "\n", "#### \u65e5\u672c\u306e\u9244\u9053\u30b7\u30b9\u30c6\u30e0\uff08\u5730\u7406\u60c5\u5831\u306b\u3088\u308b\u30ce\u30fc\u30c9\u306e\u914d\u7f6e\uff09\n", "\n", "![](http://cl.ly/Wy4M/japan_railways.png)\n", "\n", "#### \u6771\u4eac\u30a8\u30ea\u30a2\u306e\u30ba\u30fc\u30e0\n", "\n", "![](http://cl.ly/X0Dr/tokyo.png)\n", "\n", "\n", "* [Cytoscape 3.1.1\u3067\u4f5c\u6210\u3057\u305f\u30bb\u30c3\u30b7\u30e7\u30f3\u30d5\u30a1\u30a4\u30eb](http://cl.ly/X0D7/japan_railways_final.cys)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "\n", "# Part 2: \u30b0\u30e9\u30d5\u306e\u52a0\u5de5\u3068\u516c\u5171\u30c7\u30fc\u30bf\u30bb\u30c3\u30c8\u3068\u306e\u30de\u30fc\u30b8\n", "\n", "## 1. 
Creating cliques\n", "Each station group in the graph (stations that share a `station_g_cd`) is turned into a clique, which connects the railway lines across the whole country." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Extract the groups (inefficient and therefore slow, but simple)\n", "groups = [merged_stations[merged_stations['station_g_cd'] == g]\n", "          for g in merged_stations['station_g_cd'].unique()]\n", "\n", "# Utility functions for building the clique edges\n", "def add_cl(df, edges):\n", "    group_members = df['station_cd']\n", "    processed = set()\n", "    for station_id in group_members:\n", "        add_edge(station_id, group_members, edges, processed)\n", "\n", "def add_edge(current_station, stations, edges, processed):\n", "    for station in stations:\n", "        if station != current_station and station not in processed:\n", "            edges.append([station, 0, current_station])\n", "    processed.add(current_station)\n", "\n", "# Container for the new edges\n", "group_edges = []\n", "for df in groups:\n", "    if len(df) != 1:\n", "        add_cl(df, group_edges)\n", "\n", "# Finally, convert to a DataFrame\n", "cliques_df = pd.DataFrame(group_edges, columns=['station_cd1', 'line_cd', 'station_cd2'])\n", "\n", "# Write just the cliques out as a SIF-format table\n", "cliques_df.to_csv('cliques.sif', sep=' ', index=False)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 
Visualizing the result to check that it is correct\n", "The file written here uses a format that is standard in Cytoscape, so it can be loaded as is. Visualized, it looks like this:\n", "\n", "![](http://cl.ly/X3m2/cliques.png)\n", "\n", "\n", "It looks fine.\n", "\n", "\n", "### Merging into the existing graph\n", "Now let's merge this with the railway route map we already have." ] }, { "cell_type": "code", "collapsed": false, "input": [ "merged_graph = pd.concat([graph_final_df, cliques_df])\n", "merged_graph.to_csv('graph_connected.csv')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Checking the connections\n", "Did the connection work? We load the result into Cytoscape again to find out.\n", "\n", "#### The original route map (disconnected)\n", "\n", "![](http://cl.ly/X3dp/disconnected.png)\n", "\n", "\n", "When the geographic mapping is removed and a layout is applied, the individual lines naturally appear disconnected from one another. Zooming in:\n", "\n", "\n", "![](http://cl.ly/X3hR/disconnected_zoom.png)\n", "\n", "\n", "\n", "#### 
New data (connected using the group information)\n", "\n", "##### Graph of the railway system across Japan\n", "\n", "\n", "![](http://cl.ly/X3gA/connected_view.png)\n", "\n", "\n", "##### The Tokyo Metro subgraph again\n", "\n", "\n", "![](http://cl.ly/X3kp/metro2.png)\n", "\n", "\n", "This time the graph is laid out independently of the stations' geographic positions. Edges for actual railway lines and edges linking the stations within a group are drawn with solid and dashed lines, respectively." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cleaning and mapping public data\n", "\n", "While looking for passenger (boarding and alighting) data, I found something that looks suitable, so let's try it.\n", "\n", "The source is here: http://nlftp.mlit.go.jp/ksj/gml/datalist/KsjTmplt-S12.html\n", "\n", "\n", "The detailed specification is here:\n", "\n", "* http://nlftp.mlit.go.jp/ksj/gml/product_spec/KS-PS-S12-v2_0.pdf\n", "\n", "### Caution!\n", "This data is actually distributed in a broken state, although it can be repaired easily, like this:\n", "\n", "\n", "```\n", "cat S12-13.xml | sed -e \"s/ksj:ailroad/ksj:railroad/\" > fixed.xml\n", 
"```\n", "\n", "## XML\u30d5\u30a1\u30a4\u30eb\u3092\u51e6\u7406\u3059\u308b\n", "\u5fc5\u8981\u306a\u90e8\u5206\u3092\u62fe\u3044\u3060\u3057\u3066DataFrame\u306b\u3057\u307e\u3059\u3002\n", "\n", "(\u66ab\u5b9a\u7248\u3067\u3059\u3002\u5f8c\u307b\u3069\u6383\u9664\u3057\u307e\u3059\uff09\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import xml.etree.ElementTree as ET\n", "\n", "tree = ET.parse('fixed.xml')\n", "root = tree.getroot()\n", "PREFIX = '{http://nlftp.mlit.go.jp/ksj/schemas/ksj-app}'\n", "\n", "passenger_array = []\n", "entries = root.findall('./' + PREFIX + 'TheNumberofTheStationPassengersGettingonandoff')\n", "\n", "# \u30d5\u30a3\u30fc\u30eb\u30c9\u540d\u306e\u62bd\u51fa\n", "column_names = []\n", "for col in entries[0]:\n", " column_names.append(col.tag.split('}')[1])\n", "\n", "# \u30c7\u30fc\u30bf\u3092\u62bd\u51fa\u3059\u308b\n", "for data in entries:\n", " row=[]\n", " for entry in data:\n", " if(type(entry.text) is unicode):\n", " row.append((entry.text).encode('utf-8'))\n", " else:\n", " row.append(entry.text)\n", " passenger_array.append(row)\n", "\n", "# \u30c7\u30fc\u30bf\u30d5\u30ec\u30fc\u30e0\u3078\n", "passenger_df = pd.DataFrame(passenger_array, columns=column_names)\n", "\n", "# \u30b9\u30da\u30c3\u30af\u30b7\u30fc\u30c8\u304b\u3089\u4eba\u529b\u3067\u4eba\u306e\u8aad\u3081\u308b\u30c6\u30fc\u30d6\u30eb\u306b\u3059\u308b\uff08\u30c0\u30e1\u30c0\u30e1\u306a\u3084\u308a\u65b9\u3067\u3059\u306d...\uff09\n", "railroad_division = [\n", " ['\u666e\u901a\u9244\u9053 JR', '11'], \n", " ['\u666e\u901a\u9244\u9053', '12'], \n", " ['\u92fc\u7d22\u9244\u9053', '13'], \n", " ['\u61f8\u5782\u5f0f\u9244\u9053', '14'],\n", " ['\u8de8\u5ea7\u5f0f\u9244\u9053', '15'], \n", " ['\u6848\u5185\u8ecc\u6761\u5f0f\u9244\u9053','16'], \n", " ['\u7121\u8ecc\u6761\u9244\u9053', '17'], \n", " ['\u8ecc\u9053', '21'], \n", " ['\u61f8\u5782\u5f0f\u30e2\u30ce\u30ec\u30fc\u30eb', '22'], \n", " ['\u8de8\u5ea7\u5f0f\u30e2\u30ce\u30ec\u30fc\u30eb', 
'23'], \n", " ['\u6848\u5185\u8ecc\u6761\u5f0f', '24'], \n", " ['\u6d6e\u4e0a\u5f0f', '25']\n", "]\n", "\n", "railroad_company_classification = [\n", " ['JR\u65b0\u5e79\u7dda', '1'], \n", " ['JR\u5728\u6765\u7dda', '2'], \n", " ['\u516c\u55b6\u9244\u9053', '3'], \n", " ['\u6c11\u55b6\u9244\u9053', '4'], \n", " ['\u7b2c\u4e09\u30bb\u30af\u30bf\u30fc', '5'] \n", "]\n", "\n", "rd_df = pd.DataFrame(railroad_division, columns=['rail_type', 'railroadDivision'])\n", "company_type_df = pd.DataFrame(railroad_company_classification, columns=['company_type', 'railroadCompanyClassification'])\n", "\n", "passenger_df.to_csv('passenger_original.csv')\n", "passenger_df.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 12, "text": [ " station stationName administrationCompany routeName railroadDivision \\\n", "0 None \u4e8c\u6708\u7530 \u4e5d\u5dde\u65c5\u5ba2\u9244\u9053 \u6307\u5bbf\u6795\u5d0e\u7dda 11 \n", "1 None \u53e4\u5cf6 \u6c96\u7e04\u90fd\u5e02\u30e2\u30ce\u30ec\u30fc\u30eb \u6c96\u7e04\u90fd\u5e02\u30e2\u30ce\u30ec\u30fc\u30eb\u7dda 23 \n", "2 None \u304a\u53f0\u5834\u6d77\u6d5c\u516c\u5712 \u3086\u308a\u304b\u3082\u3081 \u6771\u4eac\u81e8\u6d77\u65b0\u4ea4\u901a\u81e8\u6d77\u7dda 24 \n", "3 None \u8239\u306e\u79d1\u5b66\u9928 \u3086\u308a\u304b\u3082\u3081 \u6771\u4eac\u81e8\u6d77\u65b0\u4ea4\u901a\u81e8\u6d77\u7dda 24 \n", "4 None \u30c6\u30ec\u30b3\u30e0\u30bb\u30f3\u30bf\u30fc \u3086\u308a\u304b\u3082\u3081 \u6771\u4eac\u81e8\u6d77\u65b0\u4ea4\u901a\u81e8\u6d77\u7dda 24 \n", "\n", " railroadCompanyClassification duplicate2011 dataEorN2011 remarks2011 \\\n", "0 2 1 3 None \n", "1 5 1 1 None \n", "2 5 1 1 None \n", "3 5 1 1 None \n", "4 5 1 1 None \n", "\n", " passengers2011 duplicate2012 dataEorN2012 remarks2012 passengers2012 \n", "0 0 1 3 None 0 \n", "1 3907 1 1 None 3980 \n", "2 14612 1 1 None 16130 \n", "3 3767 1 1 None 3235 \n", "4 12112 1 1 None 12775 \n", "\n", "[5 rows x 14 columns]" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that the passenger counts and the other fields are now in a proper table. Incidentally, these passenger counts are reportedly daily averages.\n", "\n", "## Examining the data\n", 
"The data is now in a form that is easy for a machine to read, but before using it further we need to decide how to use it. Let's look at the structure of the data.\n", "\n", "\n", "![](http://cl.ly/X4LZ/passengers_schema.png)\n", "\n", "(From [the Ministry of Land, Infrastructure, Transport and Tourism site](http://nlftp.mlit.go.jp/ksj/gml/product_spec/KS-PS-S12-v2_0.pdf))\n", "\n", "The item to note here is the __duplicate code__. Apparently figures are not always available per station: in some records they are merged into another station's statistics, and there are missing values as well. Looking further, when a station is merged into another one, the target station is not identified by an ID in the remarks column but described in free text written for humans. Concretely, it looks like this:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "remarks = pd.Series(passenger_df['remarks2011'].unique())\n", "remarks[:10]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 13, "text": [ "0 None\n", "1 
\u6771\u65e5\u672c\u65c5\u5ba2\u9244\u9053\u3001\u82b1\u8f2a\u7dda\u3092\u542b\u3080\n", "2 \u9577\u91ce\u96fb\u9244\u3001\u5c4b\u4ee3\u7dda\u3092\u542b\u3080\n", "3 \u6771\u65e5\u672c\u65c5\u5ba2\u9244\u9053\u3001\u5c0f\u6d77\u7dda\u3092\u542b\u3080\n", "4 \u6771\u6b66\u9244\u9053\u3001\u6850\u751f\u7dda\u3092\u542b\u3080\n", "5 \u5929\u7adc\u6d5c\u540d\u6e56\u9244\u9053\u3001\u5929\u7adc\u6d5c\u540d\u6e56\u7dda\u3092\u542b\u3080\n", "6 \u6771\u4eac\u90fd\u30011\u53f7\u7dda\u6d45\u8349\u7dda\u3092\u542b\u3080\n", "7 \u829d\u5c71\u9244\u9053\u3001\u829d\u5c71\u9244\u9053\u7dda\u3092\u542b\u3080\n", "8 \u65b0\u4eac\u6210\u96fb\u9244\u3001\u65b0\u4eac\u6210\u7dda\u3092\u542b\u3080\n", "9 \u5317\u7dcf\u9244\u9053\u3001\u5317\u7dcf\u7dda\u3092\u542b\u3080\n", "dtype: object" ] } ], "prompt_number": 13 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each remark says, roughly, that the figure includes a given operator's line. With free text like this, working out programmatically what has been merged is very tedious. On the other hand, simply mapping the whole current DataFrame by station name would leave the visualization at the mercy of how the figures were merged.\n", "\n", "#### Creating an excerpted data set\n", 
"Since a simple mapping is unlikely to work, the plan is to create an excerpt and see what it looks like. In other words, we use only the records for which accurate data is easy to obtain. The work is a simple filter: __keep only the records whose dataEorN columns indicate that data exists__. The code below additionally requires the duplicate columns to be '1' and the remarks to be empty." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Keep only rows with data present in both years, duplicate code '1', and no remarks\n", "passenger_df['is_complete'] = passenger_df.apply(\n", "    lambda row: row['dataEorN2011'] == '1' and row['dataEorN2012'] == '1' and row['duplicate2011'] == '1' and row['duplicate2012'] == '1' and row['remarks2011'] is None and row['remarks2012'] is None, axis=1)\n", "\n", "# Filter, then remove the columns that are no longer needed\n", "# (drop returns a new DataFrame, which avoids SettingWithCopyWarning)\n", "passenger_filtered = passenger_df[passenger_df['is_complete']].drop(['is_complete', 'station', 'duplicate2011', 'duplicate2012', 'remarks2011', 'remarks2012', 'dataEorN2011', 'dataEorN2012'], axis=1)\n", "passenger_filtered.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 14, "text": [ " stationName administrationCompany routeName railroadDivision \\\n", "1 \u53e4\u5cf6 \u6c96\u7e04\u90fd\u5e02\u30e2\u30ce\u30ec\u30fc\u30eb \u6c96\u7e04\u90fd\u5e02\u30e2\u30ce\u30ec\u30fc\u30eb\u7dda 23 \n", "2 \u304a\u53f0\u5834\u6d77\u6d5c\u516c\u5712 \u3086\u308a\u304b\u3082\u3081 \u6771\u4eac\u81e8\u6d77\u65b0\u4ea4\u901a\u81e8\u6d77\u7dda 24 \n", "3 \u8239\u306e\u79d1\u5b66\u9928 \u3086\u308a\u304b\u3082\u3081 \u6771\u4eac\u81e8\u6d77\u65b0\u4ea4\u901a\u81e8\u6d77\u7dda 24 \n", "4 \u30c6\u30ec\u30b3\u30e0\u30bb\u30f3\u30bf\u30fc \u3086\u308a\u304b\u3082\u3081 \u6771\u4eac\u81e8\u6d77\u65b0\u4ea4\u901a\u81e8\u6d77\u7dda 24 \n", "5 \u6c50\u7559 \u3086\u308a\u304b\u3082\u3081 \u6771\u4eac\u81e8\u6d77\u65b0\u4ea4\u901a\u81e8\u6d77\u7dda 24 \n", "\n", " railroadCompanyClassification passengers2011 passengers2012 \n", "1 5 3907 3980 \n", "2 5 14612 16130 \n", "3 5 3767 3235 \n", "4 5 12112 12775 \n", "5 5 6841 7617 \n", "\n", "[5 rows x 7 columns]" ] } ], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Further fine-tuning\n", "\n", "#### Converting codes into human-readable strings\n", "The line type and similar fields are stored as numeric codes, so we convert them into something humans can read." ] }, { "cell_type": "code", "collapsed": false, "input": [ "temp_df = pd.merge(passenger_filtered, rd_df, on='railroadDivision')\n", "passenger_final_df = pd.merge(temp_df, company_type_df, on='railroadCompanyClassification')\n", "\n", "# Remove the code columns that are no longer needed\n", "passenger_final_df.drop(['railroadDivision', 'railroadCompanyClassification'], axis=1, inplace=True)\n", "\n", 
"passenger_final_df.to_csv('passenger_final.csv')\n", "passenger_final_df[4000:4010]" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 15, "text": [ " stationName administrationCompany routeName passengers2011 \\\n", "4000 \u5357\u6804 \u8c4a\u6a4b\u9244\u9053 \u6e25\u7f8e\u7dda 3435 \n", "4001 \u8001\u6d25 \u8c4a\u6a4b\u9244\u9053 \u6e25\u7f8e\u7dda 747 \n", "4002 \u611b\u77e5\u5927\u5b66\u524d \u8c4a\u6a4b\u9244\u9053 \u6e25\u7f8e\u7dda 6391 \n", "4003 \u4e09\u6cb3\u7530\u539f \u8c4a\u6a4b\u9244\u9053 \u6e25\u7f8e\u7dda 2921 \n", "4004 \u5927\u6e05\u6c34 \u8c4a\u6a4b\u9244\u9053 \u6e25\u7f8e\u7dda 2940 \n", "4005 \u9ad8\u5e2b \u8c4a\u6a4b\u9244\u9053 \u6e25\u7f8e\u7dda 2607 \n", "4006 \u82a6\u539f \u8c4a\u6a4b\u9244\u9053 \u6e25\u7f8e\u7dda 580 \n", "4007 \u65b0\u8c4a\u6a4b \u8c4a\u6a4b\u9244\u9053 \u6e25\u7f8e\u7dda 18298 \n", "4008 \u8c4a\u5cf6 \u8c4a\u6a4b\u9244\u9053 \u6e25\u7f8e\u7dda 471 \n", "4009 \u56db\u5341\u4e07 \u5317\u9678\u9244\u9053 \u77f3\u5ddd\u7dda 222 \n", "\n", " passengers2012 rail_type company_type \n", "4000 3422 \u666e\u901a\u9244\u9053 \u6c11\u55b6\u9244\u9053 \n", "4001 729 \u666e\u901a\u9244\u9053 \u6c11\u55b6\u9244\u9053 \n", "4002 4281 \u666e\u901a\u9244\u9053 \u6c11\u55b6\u9244\u9053 \n", "4003 3003 \u666e\u901a\u9244\u9053 \u6c11\u55b6\u9244\u9053 \n", "4004 3004 \u666e\u901a\u9244\u9053 \u6c11\u55b6\u9244\u9053 \n", "4005 2687 \u666e\u901a\u9244\u9053 \u6c11\u55b6\u9244\u9053 \n", "4006 621 \u666e\u901a\u9244\u9053 \u6c11\u55b6\u9244\u9053 \n", "4007 16482 \u666e\u901a\u9244\u9053 \u6c11\u55b6\u9244\u9053 \n", "4008 453 \u666e\u901a\u9244\u9053 \u6c11\u55b6\u9244\u9053 \n", "4009 225 \u666e\u901a\u9244\u9053 \u6c11\u55b6\u9244\u9053 \n", "\n", "[10 rows x 7 columns]" ] } ], "prompt_number": 15 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Examining the result\n", 
"The data has become quite readable for humans. However, it contains some slightly odd values." ] }, { "cell_type": "code", "collapsed": false, "input": [ "passenger_final_df[passenger_final_df['stationName'] == '\u5357\u963f\u8607\u767d\u5ddd\u6c34\u6e90']" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 16, "text": [ " stationName administrationCompany routeName passengers2011 passengers2012 \\\n", "942 \u5357\u963f\u8607\u767d\u5ddd\u6c34\u6e90 \u5357\u963f\u8607\u9244\u9053 \u9ad8\u68ee\u7dda 1 24 \n", "\n", " rail_type company_type \n", "942 \u666e\u901a\u9244\u9053 \u7b2c\u4e09\u30bb\u30af\u30bf\u30fc \n", "\n", "[1 rows x 7 columns]" ] } ], "prompt_number": 16 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Can a station really have one passenger per day? Let's look at [information about the station](http://ja.wikipedia.org/wiki/%E5%8D%97%E9%98%BF%E8%98%87%E7%99%BD%E5%B7%9D%E6%B0%B4%E6%BA%90%E9%A7%85).\n", "\n", "![](http://upload.wikimedia.org/wikipedia/commons/e/e4/Minamiaso_Shirakawasuigen_Station01.jpg)\n", "\n", "> Minamiaso-Shirakawasuigen Station (\u5357\u963f\u8607\u767d\u5ddd\u6c34\u6e90\u99c5)\n", "> * June 2011 (Heisei 23): a new station between Aso-Shirakawa and Miharashidai, the nearest to the Shirakawa Suigen spring, is announced.\n", "> * March 17, 2012 (Heisei 24): opened.\n", "> * July 22, 2012 (Heisei 24): station building completed[3].\n", "> * Remarks:\tunstaffed station\n", "\n", "\n", 
"It is a mystery why there is data from before the station opened, but the number itself seems plausible (statistics from test runs?). This time I happened to spot such a value, but in practice finding this kind of thing by eye is difficult. That is exactly where visualization becomes useful.\n", "\n", "\n", "#### Why bother converting to inefficient human-readable data?\n", "This brings no benefit to the machine. For visualization, however, information encoded as numbers is awkward to use directly as labels. When the data set is small, it is worth favoring practicality over efficiency: doing this conversion in advance avoids complicated operations at the visualization stage.\n", "\n", "\n", "## The actual mapping\n", 
"All I wanted was a simple table of station names and passenger counts, but it turned into quite a detour. From here, let's actually visualize the data with [Cytoscape](http://www.cytoscape.org/).\n", "\n", "\n", "### Mapping with Tokyo Metro as the example\n", "Here I want to get a visual picture of [Tokyo Metro](http://www.tokyometro.jp/index.html) passenger counts on the route graph. At this point, however, the line names cannot be matched, because one data set uses the name __\u6771\u4eac\u30e1\u30c8\u30ed__ (Tokyo Metro) while the other uses __\u6771\u4eac\u5730\u4e0b\u9244__ (Tokyo Chikatetsu). So the steps are: extract, convert, and merge." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# .copy() so that adding columns below does not write into a slice of passenger_final_df\n", "tokyo_metro = passenger_final_df[passenger_final_df['administrationCompany'] == '\u6771\u4eac\u5730\u4e0b\u9244'].copy()\n", "\n", "def create_line_name(company, line):\n", "    # Rename the operator from \u6771\u4eac\u5730\u4e0b\u9244 to \u6771\u4eac\u30e1\u30c8\u30ed\n", "    prefix = company.replace('\u5730\u4e0b\u9244', '\u30e1\u30c8\u30ed')\n", "    # Take the segment between the first and second '\u7dda'\n", "    # (assumes route names of the form '<number>\u53f7\u7dda<name>\u7dda')\n", "    suffix = line.split('\u7dda')[1]\n", "    return prefix + suffix + '\u7dda'\n", "\n", "tokyo_metro['line_name'] = tokyo_metro.apply(lambda row: create_line_name(row['administrationCompany'], row['routeName']), axis=1)\n", "tokyo_metro['station_name'] = tokyo_metro['stationName']\n", "\n", "merged = pd.merge(merged_stations, tokyo_metro, 
on=['station_name', 'line_name'])" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Changing the data types\n", "\n", "I forgot that values which must be treated as numbers had been read in as strings. Let's convert them." ] }, { "cell_type": "code", "collapsed": false, "input": [ "merged['passengers2012'] = merged['passengers2012'].astype(int)\n", "merged['passengers2011'] = merged['passengers2011'].astype(int)\n", "\n", "# Show the top 10 stations by passenger count to check that the values are plausible\n", "# (sort_index(by=...) is the older pandas API; newer versions use sort_values('passengers2012', ascending=False))\n", "sorted_df = merged.sort_index(by='passengers2012', ascending=False)\n", "sorted_df[['station_name', 'line_name', 'passengers2012']][:10]" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 18, "text": [ " station_name line_name passengers2012\n", "57 \u6c60\u888b \u6771\u4eac\u30e1\u30c8\u30ed\u4e38\u30ce\u5185\u7dda 483952\n", "64 \u5927\u624b\u753a \u6771\u4eac\u30e1\u30c8\u30ed\u4e38\u30ce\u5185\u7dda 277336\n", "27 \u897f\u8239\u6a4b \u6771\u4eac\u30e1\u30c8\u30ed\u6771\u897f\u7dda 274785\n", "90 \u9280\u5ea7 \u6771\u4eac\u30e1\u30c8\u30ed\u65e5\u6bd4\u8c37\u7dda 245548\n", "56 \u6e0b\u8c37 \u6771\u4eac\u30e1\u30c8\u30ed\u9280\u5ea7\u7dda 226644\n", "50 \u65b0\u6a4b \u6771\u4eac\u30e1\u30c8\u30ed\u9280\u5ea7\u7dda 223335\n", "69 \u65b0\u5bbf \u6771\u4eac\u30e1\u30c8\u30ed\u4e38\u30ce\u5185\u7dda 220154\n", "82 \u4e0a\u91ce \u6771\u4eac\u30e1\u30c8\u30ed\u65e5\u6bd4\u8c37\u7dda 212509\n", "29 \u9ad8\u7530\u99ac\u5834 \u6771\u4eac\u30e1\u30c8\u30ed\u6771\u897f\u7dda 186629\n", "32 \u98ef\u7530\u6a4b \u6771\u4eac\u30e1\u30c8\u30ed\u6771\u897f\u7dda 169830\n", "\n", "[10 rows x 3 columns]" ] } ], "prompt_number": 18 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This roughly matches what one would expect. Of course, this extract contains only per-station data, so if we summed multiple lines at the huge stations, Shinjuku and the like would naturally come out on top. For individual subway stations, though, this looks about right." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Write out the result\n", "merged.to_csv('metro_data_table.csv')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "### Scraping: extracting per-line color data by processing a web page\n", "While we are at it, let's do one more thing with this data. Each line has its own color, known as its line color, familiar from the route maps posted in stations. Unfortunately I could not find this in a machine-friendly form, but Wikipedia has it.\n", "\n", "* [List of Japanese railway line colors](http://ja.wikipedia.org/wiki/%E6%97%A5%E6%9C%AC%E3%81%AE%E9%89%84%E9%81%93%E3%83%A9%E3%82%A4%E3%83%B3%E3%82%AB%E3%83%A9%E3%83%BC%E4%B8%80%E8%A6%A7)\n", "\n", "Let's process this into a form that Cytoscape can use. This kind of work is called _scraping_. It is an unpleasant chore best avoided where possible, but until the [machine-understandable Web](http://ja.wikipedia.org/wiki/%E3%82%BB%E3%83%9E%E3%83%B3%E3%83%86%E3%82%A3%E3%83%83%E3%82%AF%E3%83%BB%E3%82%A6%E3%82%A7%E3%83%96) matures, we have to accept it as unavoidable overhead.\n", "\n", 
"The following is rough processing that is not fully accurate, but it picks up most of the information. It maps line names to RGB color strings." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from lxml.html import parse\n", "from urllib2 import urlopen  # Python 2; on Python 3 use urllib.request.urlopen\n", "\n", "page = parse(urlopen('http://ja.wikipedia.org/wiki/%E6%97%A5%E6%9C%AC%E3%81%AE%E9%89%84%E9%81%93%E3%83%A9%E3%82%A4%E3%83%B3%E3%82%AB%E3%83%A9%E3%83%BC%E4%B8%80%E8%A6%A7'))\n", "doc = page.getroot()\n", "table_rows = doc.findall('.//tr')\n", "\n", "color_list = []\n", "for row in table_rows:\n", "    line_name = None\n", "    line_color = None\n", "\n", "    row_data = row.findall('.//td')\n", "    for val in row_data:\n", "        style = val.get('style')\n", "        for child in val:\n", "            # Compare tag names with ==, not identity\n", "            if child.tag == 'a':\n", "                line_name = child.get('title')\n", "        if style is not None:\n", "            line_color = style\n", "    if line_name is not None and line_color is not None:\n", "        # Keep only the first style declaration and strip the property name\n", "        new_color = line_color.split(';')[0].replace('background:', '')\n", "        color_list.append([line_name.encode('utf-8'), new_color.encode('utf-8')])\n", "\n", "# Convert to a DataFrame\n", "color_df = pd.DataFrame(color_list, columns=['line_name', 'line_color'])\n", "\n", "# Write out as CSV\n", "color_df.to_csv('line_colors.csv')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 20 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Working in Cytoscape\n", "Now we load the finished data into Cytoscape again. See the article for details.\n", "\n", "![](http://cl.ly/X4Ih/tokyo_metro.png)\n", "\n", "![](http://cl.ly/X4Q9/tokyo_metro_geo.png)\n" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "## Summary\n", "As shown here, real-world data preprocessing does involve a lot of tedious, unglamorous work. Most of these problems would disappear if providers distributed simple, accurate, machine-readable data...\n", "\n", "Complaining gets us nowhere, so one option might be to put together proposals for publishing data in a form that independent developers and private companies can use easily, and take them to public agencies. Concrete feedback from people who actually use the data, like the readers of this article, should also make things easier for the publishers. If you are interested in this topic, please consider joining the group below.\n", "\n", "* [Data Visualization Japan](https://www.facebook.com/groups/datavizjapan/) (a Facebook group)" ] } ], "metadata": {} } ] }