{ "cells": [ { "cell_type": "code", "execution_count": 184, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from lda2vec import preprocess, Corpus\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "%matplotlib inline\n", "\n", "# seaborn is optional; importing it only restyles the matplotlib plots\n", "try:\n", "    import seaborn\n", "except ImportError:\n", "    pass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You must be using a very recent version of pyLDAvis to use the lda2vec outputs. \n", "As of this writing, any version after Jan 6 2016, or at or after commit `14e7b5f60d8360eb84969ff08a1b77b365a5878e`, should work.\n", "You can do this quickly by installing it directly from master like so:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# pip install git+https://github.com/bmabey/pyLDAvis.git@master#egg=pyLDAvis" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import pyLDAvis\n", "pyLDAvis.enable_notebook()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reading in the saved model topics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After running the `lda2vec_run.py` script in the `examples/twenty_newsgroups/lda2vec` directory, a `topics.pyldavis.npz` file will be created that contains the topic-to-word probabilities and frequencies. What's left is to visualize and label each topic from its prevalent words."
] }, { "cell_type": "code", "execution_count": 157, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# .npz is a binary archive, so open it in binary mode\n", "npz = np.load(open('topics.pyldavis.npz', 'rb'))\n", "dat = {k: v for (k, v) in npz.iteritems()}\n", "dat['vocab'] = dat['vocab'].tolist()\n", "# dat['term_frequency'] = dat['term_frequency'] * 1.0 / dat['term_frequency'].sum()" ] }, { "cell_type": "code", "execution_count": 189, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Topic 0 x11r5 xv window xterm server motif font xlib // sunos\n", "Topic 1 jesus son father matthew sin mary g'd disciples christ sins\n", "Topic 2 s1 nsa s2 clipper chip administration q escrow private sector serial number encryption technology\n", "Topic 3 leafs games playoffs hockey game players pens yankees bike phillies\n", "Topic 4 van - 0 pp en 1 njd standings 02 6\n", "Topic 5 out_of_vocabulary out_of_vocabulary anonymity hiv homicide adl ripem bullock encryption technology eff\n", "Topic 6 hiv magi prof erzurum venus van 2.5 million ankara satellite launched\n", "Topic 7 nsa escrow clipper chip encryption government phones warrant vat decrypt wiretap\n", "Topic 8 mac controller shipping disk printer mb ethernet enable os/2 port\n", "Topic 9 leafs cooper weaver karabagh myers agdam phillies flyers playoffs fired\n", "Topic 10 obfuscated = ciphertext jesus gentiles matthew judas { x int\n", "Topic 11 jesus ra bobby faith god homosexuality bible sin msg islam\n", "Topic 12 jesus sin scripture matthew christ islam god sins prophet faith\n", "Topic 13 mac i thanks monitor apple upgrade card connect using windows\n", "Topic 14 i quadra monitor my apple duo hard drive mac mouse thanks\n", "Topic 15 { shipping } + mac mb os/2 $ 3.5 manuals\n", "Topic 16 playoffs morris yankees leafs // pitching players } team wins\n", "Topic 17 :> taxes guns flame .. 
clinton kids jobs hey drugs\n", "Topic 18 revolver tires pitching saturn ball trigger car ice team engine\n", "Topic 19 stephanopoulos leafs mamma karabagh mr. koresh apartment fired myers sumgait\n" ] } ], "source": [ "top_n = 10\n", "topic_to_topwords = {}\n", "for j, topic_to_word in enumerate(dat['topic_term_dists']):\n", " top = np.argsort(topic_to_word)[::-1][:top_n]\n", " msg = 'Topic %i ' % j\n", " top_words = [dat['vocab'][i].strip()[:35] for i in top]\n", " msg += ' '.join(top_words)\n", " print msg\n", " topic_to_topwords[j] = top_words" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualize topics" ] }, { "cell_type": "code", "execution_count": 187, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import warnings\n", "warnings.filterwarnings('ignore')\n", "prepared_data = pyLDAvis.prepare(dat['topic_term_dists'], dat['doc_topic_dists'], \n", " dat['doc_lengths'] * 1.0, dat['vocab'], dat['term_frequency'] * 1.0, mds='tsne')" ] }, { "cell_type": "code", "execution_count": 188, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "
\n", "" ], "text/plain": [ "" ] }, "execution_count": 188, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pyLDAvis.display(prepared_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 'True' topics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The 20 newsgroups dataset is interesting because users effectively classify the topics by posting to a particular newsgroup. This lets us qualitatively check our unsupervised topics against the 'true' labels. For example, the four topics we highlighted above are intuitively close to comp.graphics, sci.med, talk.politics.misc, and sci.space." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " comp.graphics\n", " comp.os.ms-windows.misc\n", " comp.sys.ibm.pc.hardware\n", " comp.sys.mac.hardware\n", " comp.windows.x \n", " rec.autos\n", " rec.motorcycles\n", " rec.sport.baseball\n", " rec.sport.hockey \n", " sci.crypt\n", " sci.electronics\n", " sci.med\n", " sci.space\n", " misc.forsale \n", " talk.politics.misc\n", " talk.politics.guns\n", " talk.politics.mideast \n", " talk.religion.misc\n", " alt.atheism\n", " soc.religion.christian" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Individual document topics" ] }, { "cell_type": "code", "execution_count": 248, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from sklearn.datasets import fetch_20newsgroups\n", "remove=('headers', 'footers', 'quotes')\n", "texts = fetch_20newsgroups(subset='train', remove=remove).data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### First Example" ] }, { "cell_type": "code", "execution_count": 249, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A fair number of brave souls who upgraded their SI clock oscillator have\n", "shared their experiences for this poll. Please send a brief message detailing\n", "your experiences with the procedure. 
Top speed attained, CPU rated speed,\n", "add on cards and adapters, heat sinks, hour of usage per day, floppy disk\n", "functionality with 800 and 1.4 m floppies are especially requested.\n", "\n", "I will be summarizing in the next two days, so please add to the network\n", "knowledge base if you have done the clock upgrade and haven't answered this\n", "poll. Thanks.\n" ] } ], "source": [ "print texts[1]" ] }, { "cell_type": "code", "execution_count": 250, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "03% in topic 00 which has top words x11r5, xv, window, xterm, server, motif, font, xlib, //, sunos\n", "03% in topic 03 which has top words leafs, games, playoffs, hockey, game, players, pens, yankees, bike, phillies\n", "22% in topic 08 which has top words mac, controller, shipping, disk, printer, mb, ethernet, enable, os/2, port\n", "41% in topic 13 which has top words mac, i, thanks, monitor, apple, upgrade, card, connect, using, windows\n", "21% in topic 14 which has top words i, quadra, monitor, my, apple, duo, hard drive, mac, mouse, thanks\n", "04% in topic 15 which has top words {, shipping, }, +, mac, mb, os/2, $, 3.5, manuals\n", "03% in topic 18 which has top words revolver, tires, pitching, saturn, ball, trigger, car, ice, team, engine\n" ] } ], "source": [ "msg = \"{weight:02d}% in topic {topic_id:02d} which has top words {text:s}\"\n", "for topic_id, weight in enumerate(dat['doc_topic_dists'][1]):\n", " if weight > 0.01:\n", " text = ', '.join(topic_to_topwords[topic_id])\n", " print msg.format(topic_id=topic_id, weight=int(weight * 100.0), text=text)" ] }, { "cell_type": "code", "execution_count": 251, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 251, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAfAAAAFXCAYAAABdtRywAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAHK1JREFUeJzt3XFsG+X9x/HPxYlQSUKKic9EHg2qVzQGGei3/DGhKoi4\nWaJFLWXxSFftt6meVmlSVqg2kMpIGaRrJVCjIjH+CFREZKVR6a+sUAvR1dJq9Y8iFYklIAooVWWW\nrJcEh65tKsHi+/0R4WHSzHZiJ3nI+/WXz36eJ8/d187n7myfLdd1XQEAAKOULPYEAABA/ghwAAAM\nRIADAGAgAhwAAAMR4AAAGIgABwDAQDkFeDweV0tLi5qbm9XT0zNru4GBAd1xxx06fvx4+r7GxkZt\n2LBBGzduVDgcnv+MAQCASrM1SKVS6urqUm9vr2zbVjgcVigUUjAYnNFu7969Wrt2bcb9lmWpr69P\nVVVVhZ05AADLWNYj8IGBAdXW1ioQCKisrEytra2KxWIz2vX19am5uVlerzfjftd1lUqlCjdjAACQ\nPcAdx1FNTU162e/3a3R0dEabEydOaPPmzTP6W5alSCSitrY2HTp0qABTBgAAWU+h52L37t165JFH\n0stfvTrrwYMHZdu2ksmktmzZotWrV6u+vr4QfxYAgGUra4D7/X6NjIyklx3HkW3bGW3ee+89bd++\nXa7ramJiQvF4XKWlpQqFQum2Xq9XTU1NGhwczBrgruvKsqy5rA8AAMtC1gCvq6tTIpHQ8PCwfD6f\notGouru7M9p89T3xHTt26L777lMoFNLVq1eVSqVUXl6uyclJnTp1Sh0dHVknZVmWxsYuzWF1sNh8\nvkpqZzDqZy5qZzafrzLvPlkD3OPxqLOzU5FIRK7rKhwOKxgMqr+/X5Zlqb29fda+4+Pj6ujokGVZ\nmpqa0vr162d8Sh0AAOTPWqo/J8qepJk4CjAb9TMXtTPbXI7AuRIbAAAGIsABADAQAQ4AgIEIcAAA\nDESAAwBgIAIcAAADEeAAABiIAAcAwEAEOAAABiLAAQAwEAEOAICBCHAAAAxEgAMAYCACHAAAAxHg\nAAAYqHSxJwAAS83U1JTOnz9XkLFuvXW1PB5PQcYCvooAB4CvOX/+nB565nVdX2XPa5zJi6N69pEN\nCgbXFGhmwH8Q4ABwDddX2aq4MbDY0wBmxXvgAAAYiAAHAMBABDgAAAYiwAEAMBABDgCAgQhwAAAM\nRIADAGCgnAI8Ho+rpaVFzc3N6unpmbXdwMCA7rjjDh0/fjzvvgAAIHdZAzyVSqmrq0v79+/XsWPH\nFI1GNTQ0dM12e/fu1dq1a/PuCwAA8pM1wAcGBlRbW6tAIKCysjK1trYqFovNaNfX16fm5mZ5vd68\n+wIAgPxkDXDHcVRTU5Ne9vv9Gh0dndHmxIkT2rx5c959AQBA/gryIbbdu3frkUceKcRQAAAgB1l/\nzMTv92tkZCS97DiObDvzF3ree+89bd++Xa7ramJiQvF4XB6PJ6e+s/H5KnNdBywx1M5s1E+amKgo\n2Fheb8WCbVNqt7xkDfC6ujolEgkNDw/L5/MpGo2qu7s7o81X39fesWOH7rvvPoVCIU1NTWXtO5ux\nsUt5rgqWAp+vktoZjPpNSyYvF3Sshdim1M5sc9n5yhrgHo9HnZ2dikQicl1X4XBYwWBQ/f39sixL\n7e3tefcFAADzY7mu6y72JK6FPUkzcRRgNuo3bWjoY+3oOT3v3wO/PDGsPVt/oGBwTYFmNjtqZ7a5\nHIFzJTYAAAxEgAMAYCACHAAAAxHgAAAYiAAHAMBABDgAAAYiwAEAMBABDgCAgQhwAAAMRIADAGAg\nAhwAAAMR4AAAGIgABwDAQAQ4AAAGIsABADAQAQ4AgIEIcAAADESAAwBgIAIcAAADEeAAABiIAAcA\nwEAEOAAABiLAAQAwEAEOAICBCHAAAAxUmkujeDyu3bt3y3Vdt
bW1aevWrRmPx2IxPfvssyopKVFp\naal27Nih73//+5KkxsZGVVRUpB87fPhw4dcCAIBlJmuAp1IpdXV1qbe3V7ZtKxwOKxQKKRgMptvc\nc889CoVCkqQPP/xQDz/8sN58801JkmVZ6uvrU1VVVZFWAQCA5SfrKfSBgQHV1tYqEAiorKxMra2t\nisViGW1WrFiRvj05OamSkv8M67quUqlUAacMAACyHoE7jqOampr0st/v1+Dg4Ix2J06c0N69e5VM\nJtXT05O+37IsRSIRlZSUqL29XQ8++GCBpg4AwPKV03vguVi3bp3WrVunM2fOaN++fXrppZckSQcP\nHpRt20omk9qyZYtWr16t+vr6Qv1ZAACWpawB7vf7NTIykl52HEe2bc/avr6+Xp988ok+++wzrVy5\nMt3W6/WqqalJg4ODOQW4z1eZy/yxBFE7s1E/aWKiomBjeb0VC7ZNqd3ykjXA6+rqlEgkNDw8LJ/P\np2g0qu7u7ow2iURCq1atkiS9//77+uKLL7Ry5UpdvXpVqVRK5eXlmpyc1KlTp9TR0ZHTxMbGLs1h\ndbDYfL5Kamcw6jctmbxc0LEWYptSO7PNZecra4B7PB51dnYqEonIdV2Fw2EFg0H19/fLsiy1t7fr\nrbfe0tGjR1VWVqbrrrtO+/btkySNj4+ro6NDlmVpampK69ev19q1a/NfMwAAkMFyXddd7ElcC3uS\nZuIowGzUb9rQ0Mfa0XNaFTcG5jXO5Ylh7dn6AwWDawo0s9lRO7PN5QicK7EBAGAgAhwAAAMR4AAA\nGIgABwDAQAQ4AAAGIsABADAQAQ4AgIEIcAAADESAAwBgIAIcAAADEeAAABiIAAcAwEAEOAAABiLA\nAQAwEAEOAICBCHAAAAxEgAMAYCACHAAAAxHgAAAYiAAHAMBABDgAAAYiwAEAMBABDgCAgQhwAAAM\nRIADAGCgnAI8Ho+rpaVFzc3N6unpmfF4LBbThg0btHHjRoXDYb3zzjs59wUAAPkrzdYglUqpq6tL\nvb29sm1b4XBYoVBIwWAw3eaee+5RKBSSJH344Yd6+OGH9eabb+bUFwAA5C/rEfjAwIBqa2sVCARU\nVlam1tZWxWKxjDYrVqxI356cnFRJSUnOfQEAQP6yHoE7jqOampr0st/v1+Dg4Ix2J06c0N69e5VM\nJtOnynPtCwAA8lOwD7GtW7dOb775pv70pz9p3759hRoWAABcQ9YjcL/fr5GRkfSy4ziybXvW9vX1\n9frkk0/02Wef5d33q3y+ypzaYemhdmajftLEREXBxvJ6KxZsm1K75SVrgNfV1SmRSGh4eFg+n0/R\naFTd3d0ZbRKJhFatWiVJev/99/XFF19o5cqVOfWdzdjYpTmsDhabz1dJ7QxG/aYlk5cLOtZCbFNq\nZ7a57HxlDXCPx6POzk5FIhG5rqtwOKxgMKj+/n5ZlqX29na99dZbOnr0qMrKynTdddelT6HP1hcA\nAMyP5bquu9iTuBb2JM3EUYDZqN+0oaGPtaPntCpuDMxrnMsTw9qz9QcKBtcUaGazo3Zmm8sROFdi\nAwDAQAQ4AAAGIsABADAQAQ4AgIEIcAAADESAAwBgIAIcAAADEeAAABiIAAcAwEAEOAAABiLAAQAw\nEAEOAICBCHAAAAxEgAMAYCACHAAAAxHgAAAYiAAHAMBABDgAAAYiwAEAMBABDgCAgQhwAAAMRIAD\nAGAgAhwAAAMR4AAAGIgABwDAQAQ4AAAGKs2lUTwe1+7du+W6rtra2rR169aMx9944w298MILkqTy\n8nI98cQT+s53viNJamxsVEVFhUpKSlRaWqrDhw8XeBUAAFh+sgZ4KpVSV1eXent7Zdu2wuGwQqGQ\ngsFgus0tt9yiAwcOqLKyUvF4XDt37tShQ4ckSZZlqa+vT1VVVcVbCwAAlpmsp9AHBgZUW1urQCCg\nsrIytba2KhaLZbS5++67V
VlZmb7tOE76Mdd1lUqlCjxtAACWt6wB7jiOampq0st+v1+jo6Oztn/1\n1VfV0NCQXrYsS5FIRG1tbemjcgAAMD85vQeeq9OnT+vIkSN65ZVX0vcdPHhQtm0rmUxqy5YtWr16\nterr67OO5fNVFnJqWEDUzmzUT5qYqCjYWF5vxYJtU2q3vGQNcL/fr5GRkfSy4ziybXtGu7Nnz2rn\nzp168cUXM97v/rKt1+tVU1OTBgcHcwrwsbFLOa0Alhafr5LaGYz6TUsmLxd0rIXYptTObHPZ+cp6\nCr2urk6JRELDw8P6/PPPFY1GFQqFMtqMjIxo27Ztevrpp7Vq1ar0/VevXtWVK1ckSZOTkzp16pTW\nrFmT9yQBAECmrEfgHo9HnZ2dikQicl1X4XBYwWBQ/f39sixL7e3tev7553Xx4kU9+eSTcl03/XWx\n8fFxdXR0yLIsTU1Naf369Vq7du1CrBcAAN9oluu67mJP4lo4FWQmTuOZjfpNGxr6WDt6TqvixsC8\nxrk8Maw9W3+gYLD4Zx6pndmKcgodAAAsPQQ4AAAGIsABADAQAQ4AgIEIcAAADESAAwBgIAIcAAAD\nEeAAABiIAAcAwEAEOAAABiLAAQAwUEF/DxzAtKmpKZ0/f64gY91662p5PJ6CjAXgm4MAB4rg/Plz\neuiZ13V9lT2vcSYvjurZRzYsyI9hADALAQ4UyfVV9rx/zQoAZsN74AAAGIgABwDAQAQ4AAAGIsAB\nADAQAQ4AgIEIcAAADMTXyABggXCBHxQSAQ4AC4QL/KCQCHAAWEBc4AeFwnvgAAAYiAAHAMBAOQV4\nPB5XS0uLmpub1dPTM+PxN954Qxs2bNCGDRv005/+VGfPns25LwAAyF/WAE+lUurq6tL+/ft17Ngx\nRaNRDQ0NZbS55ZZbdODAAb3++uv69a9/rZ07d+bcFwAA5C9rgA8MDKi2tlaBQEBlZWVqbW1VLBbL\naHP33XersrIyfdtxnJz7AgCA/GUNcMdxVFNTk172+/0aHR2dtf2rr76qhoaGOfUFAAC5KejXyE6f\nPq0jR47olVdemfdYPl9lAWaExUDtpImJioKN5fVWLOg2pX7Fq1+xnxfUbnnJGuB+v18jIyPpZcdx\nZNszL0Jw9uxZ7dy5Uy+++KKqqqry6nstY2OXcmqHpcXnq6R2kpLJywUda6G2KfWbVqz6FfN5Qe3M\nNpedr6yn0Ovq6pRIJDQ8PKzPP/9c0WhUoVAoo83IyIi2bdump59+WqtWrcqrLwAAyF/WI3CPx6PO\nzk5FIhG5rqtwOKxgMKj+/n5ZlqX29nY9//zzunjxop588km5rqvS0lIdPnx41r4AAGB+cnoPvKGh\nIf3BtC9t2rQpfXvXrl3atWtXzn0BAMD8cCU2AAAMRIADAGAgAhwAAAMR4AAAGIgABwDAQAQ4AAAG\nIsABADAQAQ4AgIEIcAAADESAAwBgIAIcAAADEeAAABiIAAcAwEAEOAAABiLAAQAwEAEOAICBCHAA\nAAxEgAMAYCACHAAAAxHgAAAYiAAHAMBABDgAAAYiwAEAMBABDgCAgQhwAAAMlFOAx+NxtbS0qLm5\nWT09PTMeP3funDZt2qS6ujq99NJLGY81NjZqw4YN2rhxo8LhcGFmDQDAMlearUEqlVJXV5d6e3tl\n27bC4bBCoZCCwWC6zcqVK/X444/rxIkTM/pblqW+vj5VVVUVduYAACxjWY/ABwYGVFtbq0AgoLKy\nMrW2tioWi2W08Xq9uvPOO1VaOnN/wHVdpVKpws0YAABkD3DHcVRTU5Ne9vv9Gh0dzfkPWJalSCSi\ntrY2HTp0aG6zBAAAGbKeQp+vgwcPyrZtJZNJbdmyRatXr1Z9fX2x/ywAAN9oWQPc7/drZGQkvew4\njmzbzvkPfNnW6/WqqalJg4ODOQW4z1eZ89/A0kLtpImJioKN5fVWLOg2pX7Fq1+xnxfUbnn
JGuB1\ndXVKJBIaHh6Wz+dTNBpVd3f3rO1d103fvnr1qlKplMrLyzU5OalTp06po6Mjp4mNjV3KqR2WFp+v\nktpJSiYvF3Sshdqm1G9asepXzOcFtTPbXHa+sga4x+NRZ2enIpGIXNdVOBxWMBhUf3+/LMtSe3u7\nxsfH1dbWpitXrqikpEQvv/yyotGoksmkOjo6ZFmWpqamtH79eq1du3ZOKwcAAP4jp/fAGxoa1NDQ\nkHHfpk2b0rerq6t18uTJGf3Ky8t19OjReU4RAAB8HVdiAwDAQAQ4AAAGIsABADAQAQ4AgIEIcAAA\nDESAAwBgIAIcAAADEeAAABiIAAcAwEAEOAAABiLAAQAwEAEOAICBCHAAAAxEgAMAYCACHAAAAxHg\nAAAYiAAHAMBABDgAAAYiwAEAMBABDgCAgQhwAAAMRIADAGAgAhwAAAMR4AAAGIgABwDAQDkFeDwe\nV0tLi5qbm9XT0zPj8XPnzmnTpk2qq6vTSy+9lFdfAACQv6wBnkql1NXVpf379+vYsWOKRqMaGhrK\naLNy5Uo9/vjj+uUvf5l3XwAAkL+sAT4wMKDa2loFAgGVlZWptbVVsVgso43X69Wdd96p0tLSvPsC\nAID8ZQ1wx3FUU1OTXvb7/RodHc1p8Pn0BQAAs+NDbAAAGKg0WwO/36+RkZH0suM4sm07p8Hn09fn\nq8ypHZYeaidNTFQUbCyvt2JBtyn1K179iv28oHbLS9YAr6urUyKR0PDwsHw+n6LRqLq7u2dt77ru\nnPt+1djYpZzaYWnx+SqpnaRk8nJBx1qobUr9phWrfsV8XlA7s81l5ytrgHs8HnV2dioSich1XYXD\nYQWDQfX398uyLLW3t2t8fFxtbW26cuWKSkpK9PLLLysajaq8vPyafQEAwPxkDXBJamhoUENDQ8Z9\nmzZtSt+urq7WyZMnc+4LAADmhw+xAQBgIAIcAAADEeAAABiIAAcAwEAEOAAABiLAAQAwEAEOAICB\nCHAAAAxEgAMAYCACHAAAAxHgAAAYiAAHAMBABDgAAAYiwAEAMBABDgCAgQhwAAAMRIADAGAgAhwA\nAAMR4AAAGIgABwDAQAQ4AAAGIsABADAQAQ4AgIEIcAAADESAAwBgoNJcGsXjce3evVuu66qtrU1b\nt26d0WbXrl2Kx+NasWKF9uzZo+9+97uSpMbGRlVUVKikpESlpaU6fPhwYdcAAIBlKGuAp1IpdXV1\nqbe3V7ZtKxwOKxQKKRgMptucPHlSiURCx48f19///nf94Q9/0KFDhyRJlmWpr69PVVVVxVsLAACW\nmayn0AcGBlRbW6tAIKCysjK1trYqFotltInFYtq4caMk6a677tKlS5c0Pj4uSXJdV6lUqghTBwBg\n+coa4I7jqKamJr3s9/s1Ojqa0WZ0dFQ333xzRhvHcSRNH4FHIhG1tbWlj8oBAMD85PQe+HwcPHhQ\ntm0rmUxqy5YtWr16terr64v9ZwEA+EbLGuB+v18jIyPpZcdxZNt2RhvbtnXhwoX08oULF+T3+9OP\nSZLX61VTU5MGBwdzCnCfrzK3NcCSQ+2kiYmKgo3l9VYs6DalfsWrX7GfF9Ruecka4HV1dUokEhoe\nHpbP51M0GlV3d3dGm1AopAMHDuhHP/qR3n33Xd1www2qrq7W1atXlUqlVF5ersnJSZ06dUodHR05\nTWxs7NLc1giLyuerpHaSksnLBR1robYp9ZtWrPoV83lB7cw2l52vrAHu8XjU2dmpSCQi13UVDocV\nDAbV398vy7LU3t6ue++9VydPnlRTU1P6a2SSND4+ro6ODlmWpampKa1fv15r167Nf80AALOamprS\nRx99VJAdhFtvXS2Px1OAWaHYcnoPvKGhQQ0NDRn3bdq0KWN5586dM/rdcsstOnr06DymBwDI5vz5\nc3romdd1fZWdvfF/MXlxVM8+skHB4JoCzQzFVPQPsQE
Aiu/6KlsVNwYWexpYQFxKFQAAAxHgAAAY\niAAHAMBABDgAAAYiwAEAMNCS/BR6Ib7PyHcZAQDfZEsywP93xyvz+j7jN+W7jFNTUzp//ty8x2Fn\nBsByUaj/m9LS/9+5JAOc7zNOK8TFGb4pOzMAkIvldFGbJRng+A92ZgAgP8vl/yYfYgMAwEAEOAAA\nBiLAAQAwEAEOAICBCHAAAAxEgAMAYCACHAAAAxHgAAAYiAAHAMBABDgAAAYiwAEAMBABDgCAgQhw\nAAAMRIADAGAgAhwAAAPlFODxeFwtLS1qbm5WT0/PNdvs2rVLP/zhD3X//ffrgw8+yKsvAADIT9YA\nT6VS6urq0v79+3Xs2DFFo1ENDQ1ltDl58qQSiYSOHz+up556Sk888UTOfQEAQP6yBvjAwIBqa2sV\nCARUVlam1tZWxWKxjDaxWEwbN26UJN111126dOmSxsfHc+oLAADylzXAHcdRTU1Netnv92t0dDSj\nzejoqG6++eb08s033yzHcXLqCwAA8ldajEFd151X/8mL8wv52foPDX08r3G/FAyuWZBxJfO2xUcf\nfaRk8nLBxy3mNi7W2POt3WxjFHNbmFa/pfzam20M08aVzKufidtiLrIGuN/v18jISHrZcRzZtp3R\nxrZtXbhwIb184cIF+f1+ffHFF1n7Xsvb//eHXOaeN5/vf4wb9+3/M23OlUUatzjzLdbYJtZuemyz\n6mfaa8+0cb86vinjmrot5iLrKfS6ujolEgkNDw/r888/VzQaVSgUymgTCoX0l7/8RZL07rvv6oYb\nblB1dXVOfQEAQP6yHoF7PB51dnYqEonIdV2Fw2EFg0H19/fLsiy1t7fr3nvv1cmTJ9XU1KQVK1Zo\nz549/7UvAACYH8ud7xvWAABgwXElNgAADESAAwBgIAIcAAADFeV74HMVj8e1e/duua6rtrY2bd26\ndbGnhDw0NjaqoqJCJSUlKi0t1eHDhxd7SvgvHnvsMf3tb3/TTTfdpDfeeEOSdPHiRW3fvl3Dw8P6\n1re+pX379qmysjhfLcPcXat2zz33nA4dOqSbbrpJkrR9+3Y1NDQs5jRxDRcuXNCjjz6qTz/9VCUl\nJfrJT36in//853N67S2ZD7GlUik1Nzert7dXtm0rHA6ru7ubT60bJBQK6ciRI6qqqlrsqSAHZ86c\nUXl5uR599NF0CDzzzDNauXKlfvWrX6mnp0f/+te/9Lvf/W6RZ4qvu1btnnvuOZWXl2vLli2LPDv8\nN2NjYxofH9ftt9+uK1eu6Mc//rGef/55HTlyJO/X3pI5hc51083nuq5SqdRiTwM5qq+v1w033JBx\nXywW0wMPPCBJeuCBB3TixInFmBqyuFbtpPlfBRPF5/P5dPvtt0uSysvLFQwG5TjOnF57SybAuW66\n+SzLUiQSUVtbmw4dOrTY08EcJJNJVVdXS5r+R5NMJhd5RsjHn//8Z91///36/e9/r0uXLi32dJDF\nP/7xD509e1Z33XWXPv3007xfe0smwGG+gwcP6rXXXtMLL7ygAwcO6MyZM4s9JcyTZVmLPQXkaPPm\nzYrFYjp69Kiqq6vTF9TC0nTlyhVt27ZNjz32mMrLy2e81nJ57S2ZAM/lmutY2r6sl9frVVNTkwYH\nBxd5RsjXTTfdpPHxcUnT79V5vd5FnhFy5fV60//0H3zwQV5/S9i///1vbdu2Tffff7/WrVsnaW6v\nvSUT4Fw33WxXr17VlStXJEmTk5M6deqU1qyZ/6/toLi+/p5pY2Ojjhw5Ikl67bXXeA0uYV+v3djY\nWPr2X//6V912220LPSXk6LHHHtO3v/1t/eIXv0jfN5fX3pL5FLo0/TWyP/7xj+nrpvM1MnN88skn\n6ujokGVZmpqa0vr166nfEvfb3/5Wb7/9tj777DNVV1frN7/5jdatW6eHHnpI//znPxUIBLRv375r\nflgKi+tatXv77bf
1wQcfqKSkRIFAQE899VT6PVUsHe+8845+9rOf6bbbbpNlWbIsS9u3b9f3vvc9\nPfzww3m99pZUgAMAgNwsmVPoAAAgdwQ4AAAGIsABADAQAQ4AgIEIcAAADESAAwBgIAIcAAADEeAA\nABjo/wEO7N57ApxvzwAAAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.bar(np.arange(20), dat['doc_topic_dists'][1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Second Example" ] }, { "cell_type": "code", "execution_count": 255, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "I have been following this thread on talk.religion,\n", "soc.religion.christian.bible-study and here with interest. I am amazed at\n", "the different non-biblical argument those who oppose the Sabbath present. \n", "\n", "One question comes to mind, especially since my last one was not answered\n", "from Scripture. Maybe clh may wish to provide the first response.\n", "\n", "There is a lot of talk about the Sabbath of the TC being ceremonial. \n", "Answer this:\n", "\n", "Since the TC commandments is one law with ten parts on what biblical\n", "basis have you decided that only the Sabbath portion is ceremonial?\n", "OR You say that the seventh-day is the Sabbath but not applicable to\n", "Gentile Christians. Does that mean the Sabbath commandment has been\n", "annulled? 
References please.\n", "\n", "If God did not intend His requirements on the Jews to be applicable to\n", "Gentile Christians why did He make it plain that the Gentiles were now\n", "grafted into the commonwealth of Israel?\n", "\n", "Darius\n" ] } ], "source": [ "print texts[51]" ] }, { "cell_type": "code", "execution_count": 259, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "14% in topic 01 which has top words jesus, son, father, matthew, sin, mary, g'd, disciples, christ, sins\n", "14% in topic 02 which has top words s1, nsa, s2, clipper chip, administration, q, escrow, private sector, serial number, encryption technology\n", "09% in topic 07 which has top words nsa, escrow, clipper chip, encryption, government, phones, warrant, vat, decrypt, wiretap\n", "11% in topic 10 which has top words obfuscated, =, ciphertext, jesus, gentiles, matthew, judas, {, x, int\n", "20% in topic 11 which has top words jesus, ra, bobby, faith, god, homosexuality, bible, sin, msg, islam\n", "17% in topic 12 which has top words jesus, sin, scripture, matthew, christ, islam, god, sins, prophet, faith\n", "05% in topic 17 which has top words :>, taxes, guns, flame, .., clinton, kids, jobs, hey, drugs\n", "05% in topic 19 which has top words stephanopoulos, leafs, mamma, karabagh, mr., koresh, apartment, fired, myers, sumgait\n" ] } ], "source": [ "msg = \"{weight:02d}% in topic {topic_id:02d} which has top words {text:s}\"\n", "for topic_id, weight in enumerate(dat['doc_topic_dists'][51]):\n", " if weight > 0.01:\n", " text = ', '.join(topic_to_topwords[topic_id])\n", " print msg.format(topic_id=topic_id, weight=int(weight * 100.0), text=text)" ] }, { "cell_type": "code", "execution_count": 260, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 260, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAfAAAAFXCAYAAABdtRywAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAFjZJREFUeJzt3V9MW/f9//HXwXDRACF1sB3ktUR4qdapLL3IRS8iKkEZ\naCgkHd5g07QpnhZpEksWba30zQpdS5dKjYRaLcokuqhR/wWlEW3++KarpYFykUqdlJGLRp1AiBUW\nG+qMpRApmn1+F1W9H4XUBuyYd3g+rrD5nA8f5+TkiX3sE8d1XVcAAMCUkmIvAAAArBwBBwDAIAIO\nAIBBBBwAAIMIOAAABhFwAAAMKs1l0MjIiI4ePSrXddXR0aEDBw4s+v6FCxf06quvSpLKy8v17LPP\n6lvf+pYkqbGxURUVFSopKVFpaanOnj2b54cAAMDG42T7HHg6nVZLS4tOnTolv9+vcDis/v5+hUKh\nzJgrV64oFAqpsrJSIyMjOn78uM6cOSNJampq0tDQkKqqqgr7SAAA2ECyvoQ+Ojqq2tpaBYNBlZWV\nqa2tTbFYbNGYRx99VJWVlZmv4/F45nuu6yqdTud52QAAbGxZAx6Px1VTU5O5HQgElEgk7jj+nXfe\nUUNDQ+a24ziKRCLq6OjIPCsHAABrk9M58FxdvnxZQ0NDevvttzP3nT59Wn6/X8lkUvv371ddXZ12\n7dqVzx8LAMCGk/UZeCAQ0PT0dOZ2PB6X3+9fMu7atWvq7e3Vn/70p0Xnu78c6/V61dzcrKtXr2Zd\nFJdnBwDg62V9Bl5fX6/JyUlNTU3J5/MpGo2qv79/0Zjp6WkdPHhQL730kh588MHM/bdu3VI6nVZ5\nebkWFhZ06dIldXd3Z12U4ziambm5ioeDYvP5Ktl3hrH/7GLf2ebzVa54m6wB93g86unpUSQSkeu6\nCofDCoVCGhwclOM46uzs1IkTJzQ3N6fnnntOrutmPi42Ozur7u5uOY6jVCqlPXv2aPfu3at6cAAA\n4H+yfoysWPhN0iaeBdjG/rOLfWfbap6BcyU2AAAMIuAAABhEwAEAMIiAAwBgEAEHAMAgAg4AgEEE\nHAAAgwg4AAAGEXAAAAwi4AAAGETAAQAwiIADAGAQAQcAwCACDgCAQQQcAACDCDgAAAYRcAAADCLg\nAAAYRMABADCIgAMAYBABBwDAIAIOAIBBBBwAAIMIOAAABhFwAAAMIuAAABhEwAEAMIiAAwBgEAEH\nAMAgAg4AgEEEHAAAgwg4AAAGEXAAAAwi4AAAGETAAQAwiIADAGAQAQcAwCACDgCAQQQcAACDCDgA\nAAYRcAAADCLgAAAYRMABADCIgAMAYBABBwDAIAIOAIBBBBwAAIMIOAAABpUWewEA7m2pVEoTE+N5\nmWv79jp5PJ68zAVYR8ABFNTExLgOHTuvTVX+Nc2zMJfQK0+1KxTakaeVAbYRcAAFt6nKr4r7g8Ve\nBnBP4Rw4AAAGEXAAAAzKKeAjIyNqbW1VS0uLBgYGlnz/woULam9vV3t7u370ox/p2rVrOW8LAABW\nLmvA0+m0+vr6dPLkSV28eFHRaFRjY2OLxjzwwAN66623dP78ef3yl79Ub29vztsCAICVyxrw0dFR\n1dbWKhgMqqysTG1tbYrFYovGPProo6qsrMx8HY/Hc94WAACsXNaAx+Nx1dTUZG4HAgElEok7jn/n\nnXfU0NCwqm0BAEBu8voxssuXL2toaEhvv/32mufy+SrzsCIUA/vOtnzvvxs3KvI2l9dbwd+vr8Gf\nzcaSNeCBQEDT09OZ2/F4XH7/0gsyXLt2Tb29vfrzn/+sqqqqFW27nJmZmzmNw/ri81Wy7wwrxP5L\nJj/P61z8/Voex55tq/nlK+tL6PX19ZqcnNTU1JRu376taDSqpqamRWOmp6d18OBBvfTSS3rwwQdX\ntC0AAFi5rM/APR6Penp6FIlE5LquwuGwQqGQBgcH5TiOOjs7d
eLECc3Nzem5556T67oqLS3V2bNn\n77gtAABYG8d1XbfYi1gOLwXZxMt4thVi/42N/UP/N3B5zZdS/fzGlF488BjXQr8Djj3bCvISOgAA\nWH8IOAAABhFwAAAMIuAAABhEwAEAMIiAAwBgEAEHAMAgAg4AgEEEHAAAgwg4AAAGEXAAAAwi4AAA\nGETAAQAwiIADAGAQAQcAwCACDgCAQQQcAACDCDgAAAYRcAAADCot9gIAYDVSqZQmJsbzMtf27XXy\neDx5mQu4Wwg4AJMmJsZ16Nh5baryr2mehbmEXnmqXaHQjjytDLg7CDgAszZV+VVxf7DYywCKgnPg\nAAAYRMABADCIgAMAYBABBwDAIAIOAIBBBBwAAIMIOAAABhFwAAAMIuAAABhEwAEAMIiAAwBgEAEH\nAMAgAg4AgEEEHAAAgwg4AAAGEXAAAAwi4AAAGETAAQAwiIADAGAQAQcAwCACDgCAQQQcAACDCDgA\nAAYRcAAADCLgAAAYRMABADCIgAMAYBABBwDAIAIOAIBBBBwAAINyCvjIyIhaW1vV0tKigYGBJd8f\nHx9XV1eX6uvr9dprry36XmNjo9rb27Vv3z6Fw+H8rBoAgA2uNNuAdDqtvr4+nTp1Sn6/X+FwWE1N\nTQqFQpkxW7Zs0TPPPKMPPvhgyfaO4+iNN95QVVVVflcOAMAGlvUZ+OjoqGpraxUMBlVWVqa2tjbF\nYrFFY7xerx555BGVli79fcB1XaXT6fytGAAAZA94PB5XTU1N5nYgEFAikcj5BziOo0gkoo6ODp05\nc2Z1qwQAAItkfQl9rU6fPi2/369kMqn9+/errq5Ou3btKvSPBQDgnpY14IFAQNPT05nb8Xhcfr8/\n5x/w5Viv16vm5mZdvXo1p4D7fJU5/4xcpFIpjY2N5WWuUCgkj8eTl7nuRfned7i78r3/btyoyNtc\nXm9FZn2Fmteye+ExIHdZA15fX6/JyUlNTU3J5/MpGo2qv7//juNd1818fevWLaXTaZWXl2thYUGX\nLl1Sd3d3TgubmbmZ07hcjY39Q4eOndemqtx/+VjOwlxCrzzVrlBoR55Wdm/x+Srzvu9w9xRi/yWT\nn+d1ri/XV6h5reLYs201v3xlDbjH41FPT48ikYhc11U4HFYoFNLg4KAcx1FnZ6dmZ2fV0dGh+fl5\nlZSU6PXXX1c0GlUymVR3d7ccx1EqldKePXu0e/fuVT24fNhU5VfF/cGi/XwAAPIlp3PgDQ0Namho\nWHRfV1dX5uvq6moNDw8v2a68vFznzp1b4xIBAMBXcSU2AAAMIuAAABhEwAEAMIiAAwBgEAEHAMCg\ngl+J7V6XSqU0MTGel7m2b6/jAjEAgJwQ8DWamBjnAjEAgLuOgOcBF4gBANxtnAMHAMAgAg4AgEEE\nHAAAgwg4AAAGEXAAAAwi4AAAGETAAQAwiIADAGAQAQcAwCACDgCAQQQcAACDCDgAAAYRcAAADCLg\nAAAYRMABADCIgAMAYBABBwDAIAIOAIBBBBwAAIMIOAAABhFwAAAMIuAAABhEwAEAMIiAAwBgEAEH\nAMAgAg4AgEEEHAAAgwg4AAAGEXAAAAwqLfYCAKwPqVRKn3zyiZLJz9c81/btdfJ4PHlYFYA7IeAA\nJEkTE+M6dOy8NlX51zTPwlxCrzzVrlBoR55WBmA5BBxAxqYqvyruDxZ7GQBywDlwAAAMIuAAABhE\nwAEAMIiAAwBgEAEHAMAgAg4AgEEEHAAAgwg4AAAGEXAAAAwi4AAAGETAAQAwiIADAGBQTgEfGRlR\na2urWlpaNDAwsOT74+Pj6urqUn19vV577bUVbQsAAFYua8DT6bT6+vp08uRJXbx4UdFoVGNjY4vG\nbNmyRc8884x+/vOfr3hbAACwclkDPjo6qtraWgWDQZWVlamtrU2xWGzRGK/Xq0ceeUSlpaUr3hYA\nAKxc1oDH43HV1NRkbgcCA
SUSiZwmX8u2AADgzngTGwAABpVmGxAIBDQ9PZ25HY/H5ff7c5p8Ldv6\nfJU5jcvVjRsVeZvL663IrK9Q81p2LzyGjcjaMcKxt9S98BiQu6wBr6+v1+TkpKampuTz+RSNRtXf\n33/H8a7rrnrb/9/MzM2cxuUqmfw8r3N9ub5CzWuVz1dp/jFsVNaOEY69xTj2bFvNL19ZA+7xeNTT\n06NIJCLXdRUOhxUKhTQ4OCjHcdTZ2anZ2Vl1dHRofn5eJSUlev311xWNRlVeXr7stgAAYG2yBlyS\nGhoa1NDQsOi+rq6uzNfV1dUaHh7OeVsAALA2vIkNAACDCDgAAAYRcAAADMrpHDhwr0qlUpqYGM/L\nXNu318nj8eRlLgDIhoBjQ5uYGNehY+e1qSq36xPcycJcQq881a5QaEeeVgYAX4+AY8PbVOVXxf3B\nYi8DAFaEc+AAABhEwAEAMIiAAwBgEAEHAMAgAg4AgEEEHAAAgwg4AAAGEXAAAAwi4AAAGETAAQAw\niIADAGAQAQcAwCACDgCAQQQcAACDCDgAAAYRcAAADCLgAAAYRMABADCIgAMAYBABBwDAIAIOAIBB\nBBwAAIMIOAAABhFwAAAMIuAAABhEwAEAMIiAAwBgEAEHAMAgAg4AgEEEHAAAgwg4AAAGEXAAAAwi\n4AAAGETAAQAwiIADAGAQAQcAwCACDgCAQQQcAACDCDgAAAYRcAAADCot9gIAABtPKpXSxMR4Xuba\nvr1OHo8nL3NZQsABAHfdxMS4Dh07r01V/jXNszCX0CtPtSsU2pGnldlBwAEARbGpyq+K+4PFXoZZ\nnAMHAMAgnoEDAJCD9XbenoADAJCD9XbenoADAJCj9XTePqeAj4yM6OjRo3JdVx0dHTpw4MCSMS+8\n8IJGRkZ033336cUXX9S3v/1tSVJjY6MqKipUUlKi0tJSnT17Nr+PAACADShrwNPptPr6+nTq1Cn5\n/X6Fw2E1NTUpFAplxgwPD2tyclLvv/++/v73v+v3v/+9zpw5I0lyHEdvvPGGqqqqCvcoAADYYLK+\nC310dFS1tbUKBoMqKytTW1ubYrHYojGxWEz79u2TJO3cuVM3b97U7OysJMl1XaXT6QIsHQCAjStr\nwOPxuGpqajK3A4GAEonEojGJRELbtm1bNCYej0v64hl4JBJRR0dH5lk5AABYm4K/ie306dPy+/1K\nJpPav3+/6urqtGvXrqzb+XyVeV3HjRsVeZvL663IrK9Q81pm6TGw//7H2jHCvlvK0mOwuP/W25qz\nBjwQCGh6ejpzOx6Py+9f/BZ6v9+v69evZ25fv35dgUAg870vFutVc3Ozrl69mlPAZ2Zu5vYIcpRM\nfp7Xub5cX6HmtcrnqzT1GNh//2PtGGHfLcaxV/jHXsg1rybmWV9Cr6+v1+TkpKampnT79m1Fo1E1\nNTUtGtPU1KT33ntPknTlyhVt3rxZ1dXVunXrlubn5yVJCwsLunTpknbs2HjXqwUAIN+yPgP3eDzq\n6elRJBKR67oKh8MKhUIaHByU4zjq7OzU448/ruHhYTU3N2c+RiZJs7Oz6u7uluM4SqVS2rNnj3bv\n3l3wBwUAwL0up3PgDQ0NamhoWHRfV1fXotu9vb1LtnvggQd07ty5NSwPAAAsh//MBAAAgwg4AAAG\nEXAAAAwi4AAAGETAAQAwiIADAGAQAQcAwCACDgCAQQQcAACDCDgAAAYRcAAADCLgAAAYRMABADCI\ngAMAYBABBwDAIAIOAIBBBBwAAIMIOAAABhFwAAAMIuAAABhEwAEAMIiAAwBgEAEHAMAgAg4AgEEE\nHAAAgwg4AAAGEXAAAAwi4AAAGETAAQAwiIADAGAQAQcAwCACDgCAQQQcAACDCDgAAAYRcAAADCLg\nAAAYRMABADCIgAMAYBABBwDAIAIOAIBBBBwAAIMIOAAABhFwAAAMIuAAABhEwAEAMIiAAwB
gEAEH\nAMAgAg4AgEEEHAAAgwg4AAAGEXAAAAwi4AAAGJRTwEdGRtTa2qqWlhYNDAwsO+aFF17Qd7/7Xe3d\nu1cff/zxirYFAAArkzXg6XRafX19OnnypC5evKhoNKqxsbFFY4aHhzU5Oan3339fzz//vJ599tmc\ntwUAACuXNeCjo6Oqra1VMBhUWVmZ2traFIvFFo2JxWLat2+fJGnnzp26efOmZmdnc9oWAACsXNaA\nx+Nx1dTUZG4HAgElEolFYxKJhLZt25a5vW3bNsXj8Zy2BQAAK1daiEld113T9p988omSyc/XNEco\ntGPJfQtza//lYbk5CjWvJI2N/WPNcy/3Z5GPeZebOx/7brl5C7VeqXD7r1BrtvhnYW1eyd7+K9Sx\nJxVuzdaOvTv9vJXKxxyS5LhZanvlyhX98Y9/1MmTJyUp80a0AwcOZMb09vbqscce0/e+9z1JUmtr\nq9588019+umnWbcFAAArl/Ul9Pr6ek1OTmpqakq3b99WNBpVU1PTojFNTU167733JH0R/M2bN6u6\nujqnbQEAwMplfQnd4/Gop6dHkUhErusqHA4rFAppcHBQjuOos7NTjz/+uIaHh9Xc3Kz77rtPL774\n4tduCwAA1ibrS+gAAGD94UpsAAAYRMABADCIgAMAYFBBPge+WiMjIzp69Khc11VHRwcfNzOmsbFR\nFRUVKikpUWlpqc6ePVvsJeFrHDlyRH/961+1detWXbhwQZI0Nzenw4cPa2pqSt/4xjf08ssvq7Ky\nssgrxVctt++OHz+uM2fOaOvWrZKkw4cPq6GhoZjLxDKuX7+up59+Wp999plKSkr0gx/8QD/96U9X\ndeytmzexpdNptbS06NSpU/L7/QqHw+rv7+dd64Y0NTVpaGhIVVVVxV4KcvDRRx+pvLxcTz/9dCYC\nx44d05YtW/SLX/xCAwMD+s9//qPf/va3RV4pvmq5fXf8+HGVl5dr//79RV4dvs7MzIxmZ2f18MMP\na35+Xt///vd14sQJDQ0NrfjYWzcvoXPddPtc11U6nS72MpCjXbt2afPmzYvui8VievLJJyVJTz75\npD744INiLA1ZLLfvpLVfBROF5/P59PDDD0uSysvLFQqFFI/HV3XsrZuAc910+xzHUSQSUUdHh86c\nOVPs5WAVksmkqqurJX3xD00ymSzyirASb775pvbu3avf/e53unnzZrGXgyw+/fRTXbt2TTt37tRn\nn3224mNv3QQc9p0+fVrvvvuuXn31Vb311lv66KOPir0krJHjOMVeAnL04x//WLFYTOfOnVN1dXXm\nglpYn+bn53Xw4EEdOXJE5eXlS461XI69dRPwQCCg6enpzO14PC6/31/EFWGlvtxfXq9Xzc3Nunr1\napFXhJXaunWrZmdnJX1xrs7r9RZ5RciV1+vN/KP/wx/+kONvHfvvf/+rgwcPau/evXriiSckre7Y\nWzcB57rptt26dUvz8/OSpIWFBV26dEk7diz9n3ywvnz1nGljY6OGhoYkSe+++y7H4Dr21X03MzOT\n+fovf/mLHnroobu9JOToyJEj+uY3v6mf/exnmftWc+ytm3ehS198jOwPf/hD5rrpfIzMjn/+85/q\n7u6W4zhKpVLas2cP+2+d+81vfqMPP/xQ//73v1VdXa1f/epXeuKJJ3To0CH961//UjAY1Msvv7zs\nm6VQXMvtuw8//FAff/yxSkpKFAwG9fzzz2fOqWL9+Nvf/qaf/OQneuihh+Q4jhzH0eHDh/Wd73xH\nv/71r1d07K2rgAMAgNysm5fQAQBA7gg4AAAGEXAAAAwi4AAAGETAAQAwiIADAGAQAQcAwCACDgCA\nQf8PS7wCnGXTEUYAAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], 
"source": [ "plt.bar(np.arange(20), dat['doc_topic_dists'][51])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.11" } }, "nbformat": 4, "nbformat_minor": 0 }