{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# The collections module: more data structures"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import collections"
   ]
  },
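  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Counter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`Counter(seq)` counts how many times each element occurs in a sequence.\n",
    "\n",
    "For example, we can count the words in a piece of text and the number of times each one appears:"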
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Counter({'two': 2, 'one': 2, 'from': 1, 'i': 1, 'tree': 1, 'three': 1, 'china': 1, 'come': 1})\n"
     ]
    }
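   ],
   "source": [
    "from string import punctuation\n",
    "\n",
    "sentence = \"One, two, three, one, two, tree, I come from China.\"\n",
    "\n",
    "# In Python 2, str.translate(None, chars) deletes every character in chars;\n",
    "# here it strips punctuation before lowercasing and splitting on whitespace.\n",
    "words = sentence.translate(None, punctuation).lower().split()\n",
    "words_count = collections.Counter(words)\n",
    "\n",
    "print words_count"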
   ]
  },
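  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a small follow-up sketch (not part of the example above), a `Counter` can also report its most frequent elements through `most_common(n)`, and looking up an element that never occurred returns `0` instead of raising `KeyError`. The word list below is made up for illustration:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import collections\n",
    "\n",
    "# Hypothetical word list, used only for illustration.\n",
    "words = ['one', 'two', 'three', 'one', 'two', 'one']\n",
    "counts = collections.Counter(words)\n",
    "\n",
    "# The two most frequent words, as (word, count) pairs.\n",
    "print(counts.most_common(2))  # [('one', 3), ('two', 2)]\n",
    "\n",
    "# A word that never occurred has a count of 0.\n",
    "print(counts['four'])  # 0"
   ]
  },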
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Double-ended queue"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A double-ended queue (`deque`) supports adding and removing elements at both the head and the tail:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])\n",
      "9 8 7 6 5 4 3 2 1 0\n",
      "deque([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])\n",
      "9 8 7 6 5 4 3 2 1 0\n"
     ]
    }
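   ],
   "source": [
    "dq = collections.deque()\n",
    "\n",
    "# Push 0..9 onto the right end.\n",
    "for i in xrange(10):\n",
    "    dq.append(i)\n",
    "\n",
    "print dq\n",
    "\n",
    "# Pop from the right end: elements come back in reverse order.\n",
    "for i in xrange(10):\n",
    "    print dq.pop(),\n",
    "\n",
    "print\n",
    "\n",
    "# Push 0..9 onto the left end, so the deque ends up reversed.\n",
    "for i in xrange(10):\n",
    "    dq.appendleft(i)\n",
    "\n",
    "print dq\n",
    "\n",
    "# Pop from the left end.\n",
    "for i in xrange(10):\n",
    "    print dq.popleft(),"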
   ]
  },
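  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compared with a list, a deque is faster for operations at the head: `list.insert(0, x)` has to shift every existing element, while `deque.appendleft(x)` does not:"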
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "100 loops, best of 3: 598 ns per loop\n",
      "100 loops, best of 3: 291 ns per loop\n"
     ]
    }
   ],
   "source": [
    "lst = []\n",
    "dq = collections.deque()\n",
    "\n",
    "# Insert at the head of each container; both keep growing across timing loops.\n",
    "%timeit -n100 lst.insert(0, 10)\n",
    "%timeit -n100 dq.appendleft(10)"
   ]
  },
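  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The timing above uses containers that start out empty. A rough sketch of the same comparison on containers that already hold many elements, using the standard `timeit` module. The size of 100000 is an arbitrary choice for illustration, and the timings depend on the machine:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import timeit\n",
    "\n",
    "# Assumed size, chosen only to make the difference visible.\n",
    "setup = 'import collections; lst = list(range(100000)); dq = collections.deque(lst)'\n",
    "\n",
    "# Time 100 insertions at the head of each container.\n",
    "list_time = timeit.timeit('lst.insert(0, 10)', setup=setup, number=100)\n",
    "deque_time = timeit.timeit('dq.appendleft(10)', setup=setup, number=100)\n",
    "\n",
    "print('list.insert(0, x):   %.6f s' % list_time)\n",
    "print('deque.appendleft(x): %.6f s' % deque_time)"
   ]
  },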
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Ordered dictionary"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An `OrderedDict` keeps its keys in the order in which they were inserted:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Regular Dict:\n",
      "A 1\n",
      "C 3\n",
      "B 2\n",
      "Ordered Dict:\n",
      "A 1\n",
      "B 2\n",
      "C 3\n"
     ]
    }
   ],
   "source": [
    "items = (\n",
    "    ('A', 1),\n",
    "    ('B', 2),\n",
    "    ('C', 3)\n",
    ")\n",
    "\n",
    "regular_dict = dict(items)\n",
    "ordered_dict = collections.OrderedDict(items)\n",
    "\n",
    "print 'Regular Dict:'\n",
    "for k, v in regular_dict.items():\n",
    "    print k, v\n",
    "\n",
    "print 'Ordered Dict:'\n",
    "for k, v in ordered_dict.items():\n",
    "    print k, v"
   ]
  },
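  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the ordering rule concrete, here is a minimal sketch with made-up keys inserted out of alphabetical order. Iteration follows insertion order rather than sorted order, and re-assigning an existing key does not move it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import collections\n",
    "\n",
    "# Hypothetical keys, inserted out of alphabetical order on purpose.\n",
    "od = collections.OrderedDict()\n",
    "od['B'] = 2\n",
    "od['A'] = 1\n",
    "od['C'] = 3\n",
    "\n",
    "# Keys come back in insertion order, not alphabetical order.\n",
    "print(list(od.keys()))  # ['B', 'A', 'C']\n",
    "\n",
    "# Re-assigning an existing key keeps its original position.\n",
    "od['B'] = 20\n",
    "print(list(od.items()))  # [('B', 20), ('A', 1), ('C', 3)]"
   ]
  },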
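  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Dictionary with default values"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For a built-in Python dictionary `d`, `d[key]` raises a `KeyError` when `key` does not exist. A `defaultdict` instead supplies a default value for such keys: we only need to pass a factory, such as a type, when creating it, and looking up a missing `key` returns (and stores) the value that factory produces:"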
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[]\n",
      "0\n",
      "0.0\n"
     ]
    }
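   ],
   "source": [
    "# list() produces an empty list for a missing key.\n",
    "dd = collections.defaultdict(list)\n",
    "\n",
    "print dd[\"foo\"]\n",
    "\n",
    "# int() produces 0 for a missing key.\n",
    "dd = collections.defaultdict(int)\n",
    "\n",
    "print dd[\"foo\"]\n",
    "\n",
    "# float() produces 0.0 for a missing key.\n",
    "dd = collections.defaultdict(float)\n",
    "\n",
    "print dd[\"foo\"]"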
   ]
  }
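  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One common use of this behaviour is grouping items without first checking whether a key exists. A minimal sketch, assuming made-up `(category, item)` pairs:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import collections\n",
    "\n",
    "# Hypothetical (category, item) pairs, used only for illustration.\n",
    "pairs = [('fruit', 'apple'), ('veg', 'carrot'), ('fruit', 'pear')]\n",
    "\n",
    "groups = collections.defaultdict(list)\n",
    "for category, item in pairs:\n",
    "    # A missing category is created as an empty list on first access.\n",
    "    groups[category].append(item)\n",
    "\n",
    "print(groups['fruit'])  # ['apple', 'pear']\n",
    "print(groups['veg'])    # ['carrot']"
   ]
  }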
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}