{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# json 模块:处理 JSON 数据" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[JSON (JavaScript Object Notation)](http://json.org) 是一种轻量级的数据交换格式,易于人阅读和编写,同时也易于机器解析和生成。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## JSON 基础" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`JSON` 的基础结构有两种:键值对 (`name/value pairs`) 和数组 (`array`)。\n", "\n", "`JSON` 具有以下形式:\n", "\n", "- `object` - 对象,用花括号表示,形式为(数据是无序的):\n", " - `{ pair_1, pair_2, ..., pair_n }`\n", "- `pair` - 键值对,形式为:\n", " - `string : value`\n", "- `array` - 数组,用中括号表示,形式为(数据是有序的):\n", " - `[value_1, value_2, ..., value_n ]`\n", "- `value` - 值,可以是\n", " - `string` 字符串\n", " - `number` 数字\n", " - `object` 对象\n", " - `array` 数组\n", " - `true / false / null` 特殊值\n", "- `string` 字符串\n", "\n", "例子:\n", "\n", "```json\n", "{\n", " \"name\": \"echo\",\n", " \"age\": 24,\n", " \"coding skills\": [\"python\", \"matlab\", \"java\", \"c\", \"c++\", \"ruby\", \"scala\"],\n", " \"ages for school\": { \n", " \"primary school\": 6,\n", " \"middle school\": 9,\n", " \"high school\": 15,\n", " \"university\": 18\n", " },\n", " \"hobby\": [\"sports\", \"reading\"],\n", " \"married\": false\n", "}\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## JSON 与 Python 的转换" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "假设我们已经将上面这个 `JSON` 对象写入了一个字符串:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import json\n", "from pprint import pprint\n", "\n", "info_string = \"\"\"\n", "{\n", " \"name\": \"echo\",\n", " \"age\": 24,\n", " \"coding skills\": [\"python\", \"matlab\", \"java\", \"c\", \"c++\", \"ruby\", \"scala\"],\n", " \"ages for school\": { \n", " \"primary school\": 6,\n", " \"middle school\": 9,\n", " \"high school\": 15,\n", " \"university\": 18\n", " },\n", " \"hobby\": [\"sports\", \"reading\"],\n", " \"married\": false\n", "}\n", "\"\"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "我们可以用 `json.loads()` (load string) 方法从字符串中读取 `JSON` 数据:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{u'age': 24,\n", " u'ages for school': {u'high school': 15,\n", " u'middle school': 9,\n", " u'primary school': 6,\n", " u'university': 18},\n", " u'coding skills': [u'python',\n", " u'matlab',\n", " u'java',\n", " u'c',\n", " u'c++',\n", " u'ruby',\n", " u'scala'],\n", " u'hobby': [u'sports', u'reading'],\n", " u'married': False,\n", " u'name': u'echo'}\n" ] } ], "source": [ "info = json.loads(info_string)\n", "\n", "pprint(info)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "此时,我们将原来的 `JSON` 数据变成了一个 `Python` 对象,在我们的例子中这个对象是个字典(也可能是别的类型,比如列表):" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "dict" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(info)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "可以使用 `json.dumps()` 将一个 `Python` 对象变成 `JSON` 对象:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\"name\": \"echo\", \"age\": 24, \"married\": false, \"ages for school\": {\"middle school\": 9, \"university\": 18, \"high school\": 15, \"primary school\": 6}, \"coding skills\": [\"python\", \"matlab\", \"java\", \"c\", \"c++\", \"ruby\", \"scala\"], \"hobby\": [\"sports\", \"reading\"]}\n" ] } ], "source": [ "info_json = json.dumps(info)\n", "\n", "print info_json" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "从中我们可以看到,生成的 `JSON` 字符串中,数组的元素顺序是不变的(始终是 `[\"python\", \"matlab\", \"java\", \"c\", \"c++\", \"ruby\", \"scala\"]`),而对象的元素顺序是不确定的。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 生成和读取 JSON 文件" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "与 `pickle` 类似,我们可以直接从文件中读取 `JSON` 数据,也可以将对象保存为 `JSON` 格式。\n", "\n", "- `json.dump(obj, file)` 将对象保存为 JSON 格式的文件\n", "- `json.load(file)` 从 JSON 文件中读取数据" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "with open(\"info.json\", \"w\") as f:\n", " json.dump(info, f)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "可以查看 `info.json` 的内容:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\"name\": \"echo\", \"age\": 24, \"married\": false, \"ages for school\": {\"middle school\": 9, \"university\": 18, \"high school\": 15, \"primary school\": 6}, \"coding skills\": [\"python\", \"matlab\", \"java\", \"c\", \"c++\", \"ruby\", \"scala\"], \"hobby\": [\"sports\", \"reading\"]}\n" ] } ], "source": [ "with open(\"info.json\") as f:\n", " print f.read()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "从文件中读取数据:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{u'age': 24,\n", " u'ages for school': {u'high school': 15,\n", " u'middle school': 9,\n", " u'primary school': 6,\n", " u'university': 18},\n", " u'coding skills': [u'python',\n", " u'matlab',\n", " u'java',\n", " u'c',\n", " u'c++',\n", " u'ruby',\n", " u'scala'],\n", " u'hobby': [u'sports', u'reading'],\n", " u'married': False,\n", " u'name': u'echo'}\n" ] } ], "source": [ "with open(\"info.json\") as f:\n", " info_from_file = json.load(f)\n", " \n", "pprint(info_from_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "删除生成的文件:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import os\n", "os.remove(\"info.json\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 0 }