{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Comparison of two LDA models & visualizing the difference" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## In this notebook, I show how to compare a model with itself and with another model, and why this is useful." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## First, clean up the 20 newsgroups dataset. We will use it to fit LDA." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from string import punctuation\n", "from nltk import RegexpTokenizer\n", "from nltk.stem.porter import PorterStemmer\n", "from nltk.corpus import stopwords\n", "from sklearn.datasets import fetch_20newsgroups\n", "\n", "\n", "newsgroups = fetch_20newsgroups()\n", "eng_stopwords = set(stopwords.words('english'))\n", "\n", "tokenizer = RegexpTokenizer(r'\\s+', gaps=True)\n", "stemmer = PorterStemmer()\n", "translate_tab = {ord(p): u\" \" for p in punctuation}\n", "\n", "def text2tokens(raw_text):\n", "    \"\"\"\n", "    Convert raw text to a list of stemmed tokens\n", "    \"\"\"\n", "    clean_text = raw_text.lower().translate(translate_tab)\n", "    tokens = [token.strip() for token in tokenizer.tokenize(clean_text)]\n", "    tokens = [token for token in tokens if token not in eng_stopwords]\n", "    stemmed_tokens = [stemmer.stem(token) for token in tokens]\n", "\n", "    return [token for token in stemmed_tokens if len(token) > 2]  # skip short tokens\n", "\n", "dataset = [text2tokens(txt) for txt in newsgroups['data']]  # convert each document to a list of tokens" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from gensim.corpora import Dictionary\n", "dictionary = Dictionary(documents=dataset, prune_at=None)\n", "dictionary.filter_extremes(no_below=5, no_above=0.3, keep_n=None)  # use Dictionary to remove irrelevant tokens\n", "dictionary.compactify()\n", "\n", "d2b_dataset = [dictionary.doc2bow(doc) for doc in dataset] # 
convert each list of tokens to a bag-of-words representation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Second, fit two LDA models." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 4min 17s, sys: 22.2 s, total: 4min 39s\n", "Wall time: 5min 13s\n" ] } ], "source": [ "%%time\n", "\n", "from gensim.models import LdaMulticore\n", "num_topics = 15\n", "\n", "lda_fst = LdaMulticore(\n", "    corpus=d2b_dataset, num_topics=num_topics, id2word=dictionary,\n", "    workers=4, eval_every=None, passes=10, batch=True\n", ")\n", "\n", "lda_snd = LdaMulticore(\n", "    corpus=d2b_dataset, num_topics=num_topics, id2word=dictionary,\n", "    workers=4, eval_every=None, passes=20, batch=True\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Now for the cases with visualisation, yay!" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/vnd.plotly.v1+html": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import plotly.offline as py\n", "import plotly.graph_objs as go\n", "\n", "py.init_notebook_mode()\n", "\n", "def plot_difference(mdiff, title=\"\", annotation=None):\n", "    \"\"\"\n", "    Helper function to plot the difference between models\n", "    \"\"\"\n", "    annotation_html = None\n", "    if annotation is not None:\n", "        annotation_html = [\n", "            [\n", "                \"+++ {}<br>--- {}\".format(\", \".join(int_tokens), \", \".join(diff_tokens))\n", "                for (int_tokens, diff_tokens) in row\n", "            ]\n", "            for row in annotation\n", "        ]\n", "\n", "    data = go.Heatmap(z=mdiff, colorscale='RdBu', text=annotation_html)\n", "    layout = go.Layout(width=950, height=950, title=title, xaxis=dict(title=\"topic\"), yaxis=dict(title=\"topic\"))\n", "    py.iplot(dict(data=[data], layout=layout))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In gensim, you can visualise the difference between topics with a distance matrix and annotations. For this purpose, use the `diff` method of LdaModel.\n", "\n", "This method returns a matrix of distances, `mdiff`, and a matrix of annotations, `annotation`. Read the docstring for more details.\n", "\n", "Cell `mdiff[i][j]` holds the distance between topic_i from the first model and topic_j from the second model.\n", "\n", "Cell `annotation[i][j]` holds [tokens from the intersection, tokens from the difference] for topic_i from the first model and topic_j from the second model." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "LdaMulticore.diff?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Case 1: How topics within ONE model correlate with each other." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Short description:\n", "- x-axis - topic;\n", "- y-axis - topic;\n", "- almost red cell - strongly decorrelated topics;\n", "- almost blue cell - strongly correlated topics.\n", "\n", "In an ideal world, different topics would be decorrelated from each other. 
In this case, our matrix would look like this:\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "data": [ { "colorscale": "RdBu", "text": null, "type": "heatmap", "z": [ [ 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ], [ 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ], [ 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ], [ 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ], [ 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ], [ 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1 ], [ 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1 ], [ 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1 ], [ 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1 ], [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1 ], [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1 ], [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1 ], [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1 ], [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1 ], [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0 ] ] } ], "layout": { "height": 950, "title": "Topic difference (one model) in ideal world", "width": 950, "xaxis": { "title": "topic" }, "yaxis": { "title": "topic" } } }, "text/html": [ "" ], "text/vnd.plotly.v1+html": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import numpy as np\n", "\n", "mdiff = np.ones((num_topics, num_topics))\n", "np.fill_diagonal(mdiff, 0.)\n", "\n", "plot_difference(mdiff, title=\"Topic difference (one model) in ideal world\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Unfortunately, real models are rarely this clean, and the matrix looks different." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Short description (annotations):\n", "- +++ make, world, well - words from the intersection of topics;\n", "- --- money, day, still - words from the symmetric difference of topics." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "data": [ { "colorscale": "RdBu", "text": [ [ "+++ color, henri, engin, window, ftp, includ, version, system, andrew, inform
--- ", "+++ new, distribut, system, file, also, good
--- could, includ, wire, ibm, want, cmu, scsi, driver, right, carri", "+++ also, good
--- think, jew, could, believ, includ, ibm, christian, want, life, cmu", "+++
--- chz, includ, ibm, cmu, scsi, driver, 7ez, 6um, 6ei, look", "+++ scsi, work, new, system, repli, comput, driver, also, look, good
--- think, play, includ, ibm, better, want, cmu, pittsburgh, right, file", "+++ data, work, distribut, new, system, inform, comput, chip, also, bit
--- color, think, henri, window, could, engin, escrow, technolog, ftp, includ", "+++ work, distribut, thank, new, mail, system, repli, anyon, pleas, look
--- think, could, includ, jim, ibm, want, cmu, scsi, driver, amanda", "+++ packag, work, new, system, comput, also, ibm, look, repli
--- georg, think, could, believ, jim, includ, presid, jake, want, cmu", "+++ color, window, includ, version, inform, system, program, comput, mit, also
--- henri, engin, ftp, contact, andrew, set, anyon, server, ibm, user", "+++ data, engin, distribut, new, system, list, anyon, also, pleas, look
--- think, could, includ, ibm, cmu, scsi, driver, center, david, file", "+++ good, help, work, thank, new, system, comput, anyon, imag, also
--- color, think, henri, window, could, engin, ftp, includ, version, andrew", "+++ new, also, support, work, good
--- think, could, believ, includ, ibm, christian, want, cmu, scsi, driver", "+++ work, distribut, new, includ, system, inform, comput, look, also, program
--- think, could, satellit, technolog, ibm, want, cmu, scsi, driver, center", "+++ scsi, work, help, distribut, new, thank, mail, system, repli, comput
--- color, think, simm, henri, price, could, window, engin, ftp, includ", "+++ also, look, new, toronto, good
--- think, play, could, includ, ibm, better, cmu, scsi, driver, right" ], [ "+++ new, distribut, system, file, also, good
--- could, includ, wire, ibm, cmu, scsi, want, driver, right, carri", "+++ keith, ground, could, system, power, law, wire, arm, institut, want
--- ", "+++ want, say, kill, could, time, may, state, point, peopl, right
--- keith, religion, ground, think, said, jew, live, mean, believ, system", "+++
--- chz, could, wire, want, 7ez, right, 6um, carri, 6ei, outlet", "+++ want, say, new, time, system, power, peopl, need, right, make
--- keith, sinc, ground, think, play, could, law, wire, speed, pitt", "+++ want, say, distribut, could, new, time, may, number, system, peopl
--- keith, ground, think, escrow, technolog, inform, power, wire, arm, institut", "+++ want, say, distribut, could, new, may, state, system, point, peopl
--- keith, win, ground, men, think, columbia, jim, power, law, anyon", "+++ want, say, could, new, time, system, point, peopl, right, question
--- keith, georg, ground, think, said, mean, believ, jim, presid, power", "+++ want, time, may, system, need, file, make, also
--- could, includ, wire, mous, right, carri, outlet, look, machin, even", "+++ distribut, new, could, may, system, point, peopl, make, year, time
--- keith, request, ground, think, engin, mission, power, gov, law, anyon", "+++ want, say, could, new, time, may, system, peopl, need, make
--- keith, ground, think, power, jpeg, buy, law, anyon, lot, wire", "+++ want, say, could, new, time, may, state, power, peopl, law
--- keith, shall, ground, file, think, mean, believ, system, free, wire", "+++ want, say, distribut, could, new, time, system, peopl, make, year
--- keith, ground, think, satellit, technolog, includ, inform, power, law, design", "+++ want, distribut, could, usa, new, state, time, system, power, need
--- keith, ground, think, simm, price, law, set, anyon, wire, find", "+++ say, new, could, time, may, point, right, make, well, even
--- keith, win, ground, think, play, system, power, basebal, law, wire" ], [ "+++ also, good
--- think, jew, could, includ, believ, ibm, christian, cmu, scsi, want", "+++ want, say, kill, could, time, may, state, point, peopl, right
--- keith, religion, ground, think, said, jew, live, mean, believ, system", "+++ religion, think, jew, said, live, could, mean, believ, claim, christian
--- ", "+++ israel
--- chz, think, jew, could, believ, christian, want, life, 7ez, right", "+++ tri, want, think, say, first, time, peopl, see, take, right
--- religion, sinc, said, jew, play, could, live, mean, believ, system", "+++ want, think, thing, could, say, time, may, state, peopl, right
--- religion, said, jew, live, mean, escrow, technolog, believ, system, inform", "+++ tri, want, think, say, could, may, state, point, peopl, come
--- win, religion, men, said, jew, live, mean, columbia, believ, jim", "+++ think, said, mean, could, believ, want, thing, come, take, right
--- georg, religion, jew, live, jim, presid, system, tax, jake, ibm", "+++ tri, want, time, may, call, make, also
--- think, jew, could, includ, believ, christian, life, mous, right, look", "+++ tri, think, thing, could, first, time, may, point, peopl, world
--- request, religion, said, engin, live, jew, mean, believ, system, mission", "+++ want, think, thing, could, say, first, time, may, peopl, world
--- religion, said, jew, live, mean, believ, system, jpeg, buy, anyon", "+++ think, mean, could, believ, christian, want, thing, take, right, way
--- shall, religion, said, live, jew, power, law, free, claim, life", "+++ want, think, thing, could, say, first, time, peopl, take, call
--- religion, said, jew, live, satellit, technolog, includ, mean, believ, system", "+++ want, think, thing, could, time, state, make, also, good
--- jew, believ, christian, life, scsi, pin, right, look, exist, even", "+++ think, say, could, first, time, may, point, see, take, right
--- win, religion, said, jew, play, live, mean, believ, basebal, better" ], [ "+++
--- chz, includ, ibm, cmu, scsi, driver, 7ez, 6um, 6ei, look", "+++
--- chz, could, wire, want, right, carri, 7ez, 6um, outlet, 6ei", "+++ israel
--- chz, think, jew, could, believ, christian, want, life, right, 7ez", "+++ fij, chz, r8f, b8g, 147, nuy, 2tct, wm4u, 9f9, b4q
--- ", "+++
--- chz, think, play, better, want, scsi, pittsburgh, driver, right, 7ez", "+++
--- chz, think, could, technolog, want, right, 7ez, 6um, 6ei, b8e", "+++
--- chz, think, could, jim, want, 7ez, 6um, amanda, 6ei, david", "+++
--- georg, chz, think, could, believ, jim, presid, jake, ibm, want", "+++
--- chz, includ, want, mous, 7ez, 6um, 6ei, look, b8e, machin", "+++
--- chz, think, could, 7ez, 6um, center, 6ei, david, look, b8e", "+++
--- chz, think, could, want, 7ez, 6um, 6ei, look, motorcycl, b8e", "+++
--- chz, think, could, believ, christian, want, right, 7ez, 6um, 6ei", "+++
--- chz, think, could, satellit, technolog, includ, want, 7ez, 6um, center", "+++
--- chz, think, could, want, scsi, pin, 7ez, 6um, 6ei, look", "+++
--- chz, think, play, could, better, right, 7ez, 6um, 6ei, look" ], [ "+++ scsi, work, new, system, repli, comput, driver, also, look, good
--- think, play, includ, ibm, better, cmu, want, pittsburgh, right, file", "+++ want, say, new, time, system, power, peopl, need, right, make
--- keith, ground, sinc, think, could, play, law, wire, find, speed", "+++ tri, want, think, say, first, time, peopl, see, take, right
--- religion, sinc, said, jew, could, live, mean, believ, play, system", "+++
--- chz, think, play, better, want, scsi, pittsburgh, driver, 7ez, right", "+++ sinc, think, play, system, power, speed, pitt, better, run, back
--- ", "+++ want, think, work, say, new, time, system, peopl, comput, need
--- sinc, could, play, escrow, technolog, inform, power, law, speed, pitt", "+++ tri, want, think, work, say, new, system, repli, peopl, need
--- win, sinc, men, could, play, columbia, jim, power, anyon, speed", "+++ tri, want, think, work, say, new, first, time, system, peopl
--- georg, sinc, said, mean, could, play, believ, jim, presid, power", "+++ tri, want, read, work, time, system, problem, comput, need, make
--- color, sinc, think, window, play, includ, version, inform, power, set", "+++ car, tri, read, think, new, first, time, system, peopl, see
--- request, sinc, engin, could, play, mission, power, gov, anyon, speed", "+++ good, want, think, work, say, new, first, time, system, peopl
--- sinc, could, play, power, jpeg, buy, anyon, lot, speed, pitt", "+++ want, think, work, say, new, time, power, peopl, see, take
--- shall, sinc, mean, could, play, believ, system, law, free, speed", "+++ want, think, work, say, new, first, time, system, problem, comput
--- sinc, could, satellit, technolog, includ, play, inform, power, design, msg", "+++ car, want, scsi, think, work, new, control, time, system, power
--- sinc, simm, price, could, play, set, anyon, speed, pitt, better", "+++ good, think, say, play, new, first, time, see, take, right
--- win, sinc, could, system, power, basebal, speed, pitt, defens, realli" ], [ "+++ data, work, distribut, new, system, inform, comput, chip, also, bit
--- think, could, technolog, includ, ibm, cmu, scsi, want, driver, right", "+++ want, say, distribut, could, new, time, may, mani, system, peopl
--- keith, ground, think, escrow, technolog, power, inform, wire, find, arm", "+++ want, think, thing, could, say, time, may, state, peopl, right
--- religion, said, jew, live, mean, believ, escrow, technolog, system, inform", "+++
--- chz, think, could, technolog, want, 7ez, right, 6um, 6ei, b8e", "+++ want, think, work, say, new, time, system, peopl, comput, need
--- sinc, play, could, escrow, technolog, power, inform, law, speed, pitt", "+++ think, could, escrow, technolog, system, inform, law, data, want, thing
--- ", "+++ want, think, work, say, distribut, could, new, may, state, system
--- win, men, columbia, escrow, jim, technolog, inform, law, anyon, data", "+++ want, think, work, thing, could, say, new, time, system, peopl
--- georg, said, mean, escrow, believ, jim, technolog, presid, inform, tax", "+++ want, work, time, may, system, inform, comput, need, make, also
--- think, could, technolog, includ, mous, right, look, clipper, machin, file", "+++ data, think, thing, distribut, could, two, new, may, system, peopl
--- request, engin, escrow, technolog, mission, inform, gov, law, anyon, jpl", "+++ want, think, work, thing, could, say, new, time, may, system
--- escrow, technolog, inform, jpeg, buy, law, anyon, lot, alaska, realli", "+++ want, think, work, thing, could, say, new, time, may, govern
--- shall, mean, escrow, believ, technolog, system, power, inform, free, christian", "+++ want, think, work, thing, could, distribut, technolog, two, say, gener
--- satellit, includ, escrow, law, design, msg, program, data, launch, number", "+++ want, think, work, thing, distribut, could, two, new, state, time
--- simm, price, escrow, technolog, power, inform, law, set, anyon, data", "+++ think, say, could, new, two, time, may, right, make, even
--- win, play, escrow, technolog, system, inform, basebal, law, better, run" ], [ "+++ work, distribut, thank, new, mail, system, repli, anyon, pleas, look
--- think, could, includ, jim, ibm, cmu, scsi, want, driver, amanda", "+++ want, say, distribut, could, new, may, state, system, point, peopl
--- keith, win, ground, men, think, columbia, jim, power, law, wire", "+++ tri, want, think, say, could, may, state, point, peopl, world
--- religion, win, men, said, jew, live, mean, believ, columbia, jim", "+++
--- chz, think, could, jim, want, 7ez, 6um, amanda, 6ei, david", "+++ tri, want, think, work, say, new, system, repli, peopl, need
--- win, sinc, men, play, could, columbia, jim, power, anyon, speed", "+++ want, think, work, say, distribut, could, new, may, state, system
--- win, men, escrow, technolog, columbia, jim, inform, law, anyon, data", "+++ win, men, think, could, columbia, jim, system, anyon, want, john
--- ", "+++ tri, want, think, work, say, could, new, jim, system, point
--- georg, win, men, said, mean, columbia, believ, presid, tax, jake", "+++ tri, want, work, thank, name, may, mail, system, need, make
--- color, win, men, think, window, could, columbia, includ, jim, version", "+++ tri, think, org, distribut, could, new, may, state, system, point
--- request, win, men, engin, columbia, jim, mission, gov, jpl, data", "+++ good, want, think, work, say, could, thank, new, may, system
--- win, men, columbia, jim, jpeg, buy, lot, alaska, realli, bike", "+++ want, think, work, say, could, new, may, state, peopl, take
--- shall, win, men, mean, columbia, believ, jim, system, power, law", "+++ want, think, work, say, distribut, could, new, system, peopl, take
--- win, men, satellit, technolog, includ, columbia, jim, inform, design, msg", "+++ want, think, work, distribut, could, new, thank, state, mail, system
--- win, men, simm, price, columbia, jim, power, set, find, board", "+++ win, think, say, could, new, may, point, take, make, time
--- men, play, columbia, jim, system, basebal, anyon, better, run, back" ], [ "+++ packag, work, new, system, comput, also, ibm, look, repli
--- georg, think, could, includ, believ, jim, presid, jake, cmu, scsi", "+++ want, say, could, new, time, system, point, peopl, right, question
--- keith, georg, ground, think, said, mean, believ, jim, power, presid", "+++ think, said, mean, could, believ, want, thing, come, take, right
--- religion, georg, jew, live, jim, presid, system, tax, jake, ibm", "+++
--- georg, chz, think, could, believ, jim, presid, jake, ibm, want", "+++ tri, want, think, work, say, new, first, time, system, peopl
--- georg, sinc, said, mean, play, could, believ, jim, power, presid", "+++ want, think, work, thing, could, say, new, time, system, peopl
--- georg, said, mean, escrow, technolog, believ, jim, inform, presid, tax", "+++ tri, want, think, work, say, could, new, jim, system, point
--- win, georg, men, said, mean, columbia, believ, presid, tax, jake", "+++ georg, think, said, mean, could, believ, jim, presid, system, tax
--- ", "+++ tri, want, work, time, system, comput, make, also, look
--- georg, think, could, includ, believ, jim, presid, jake, ibm, mous", "+++ tri, think, thing, could, new, first, system, look, point, peopl
--- request, georg, said, engin, mean, believ, jim, mission, presid, gov", "+++ want, think, work, thing, could, say, first, new, time, system
--- georg, said, mean, believ, jim, presid, tax, jpeg, buy, jake", "+++ want, think, work, thing, mean, could, say, believ, new, time
--- shall, georg, said, jim, presid, power, system, tax, law, jake", "+++ want, think, work, thing, could, say, new, first, time, system
--- georg, said, mean, satellit, technolog, includ, believ, jim, inform, presid", "+++ want, think, work, thing, could, new, time, system, comput, make
--- georg, believ, jim, presid, jake, ibm, scsi, pin, right, talk", "+++ think, say, could, new, first, time, point, take, right, make
--- win, georg, said, mean, play, believ, jim, presid, system, basebal" ], [ "+++ color, window, includ, version, inform, system, program, comput, display, also
--- henri, engin, ftp, andrew, contact, set, anyon, server, ibm, user", "+++ want, time, may, system, need, file, make, also
--- could, includ, wire, mous, right, carri, outlet, look, machin, even", "+++ tri, want, time, may, call, make, also
--- think, jew, could, believ, includ, christian, life, mous, right, look", "+++
--- chz, includ, want, mous, 7ez, 6um, 6ei, look, b8e, machin", "+++ tri, want, read, work, time, system, problem, comput, need, make
--- color, sinc, think, window, play, includ, version, power, inform, set", "+++ want, work, time, may, system, inform, comput, need, make, also
--- color, file, chang, think, window, could, escrow, technolog, includ, version", "+++ tri, want, work, thank, name, may, mail, system, need, make
--- color, win, men, think, chang, window, could, columbia, includ, jim", "+++ tri, want, work, time, system, comput, make, also, look
--- georg, think, could, believ, jim, includ, presid, jake, ibm, mous", "+++ color, window, includ, version, inform, system, set, server, program, run
--- ", "+++ tri, read, may, system, look, list, make, time, pleas, also
--- think, could, includ, want, mous, center, david, machin, file, much", "+++ want, help, work, thank, time, may, system, comput, need, call
--- color, think, window, could, includ, version, inform, jpeg, buy, set", "+++ want, work, may, make, time, also
--- think, could, believ, includ, christian, mous, right, look, machin, case", "+++ want, work, includ, time, system, inform, comput, look, problem, call
--- color, think, window, could, satellit, technolog, version, design, set, msg", "+++ want, help, work, thank, time, mail, system, problem, comput, need
--- color, chang, think, simm, window, price, could, includ, version, power", "+++ may, look, make, time, also, run
--- think, play, could, includ, better, want, mous, right, start, machin" ], [ "+++ data, engin, distribut, new, system, list, anyon, also, pleas, look
--- color, request, henri, think, window, could, ftp, includ, earth, version", "+++ distribut, new, could, time, may, system, point, peopl, make, year
--- think, wire, want, right, carri, center, outlet, david, look, file", "+++ tri, think, thing, could, first, time, may, point, peopl, world
--- religion, request, said, jew, live, mean, believ, engin, system, mission", "+++
--- chz, think, could, 7ez, 6um, center, 6ei, david, look, b8e", "+++ car, tri, read, think, new, first, time, system, peopl, see
--- request, sinc, engin, play, could, power, mission, gov, anyon, speed", "+++ data, think, thing, distribut, could, two, new, may, time, system
--- request, engin, escrow, technolog, inform, mission, gov, law, anyon, jpl", "+++ tri, think, org, distribut, could, new, may, state, system, point
--- win, request, men, engin, columbia, jim, mission, gov, jpl, data", "+++ tri, think, thing, could, new, first, time, system, point, peopl
--- georg, request, said, mean, engin, believ, jim, presid, mission, tax", "+++ tri, read, time, may, system, list, make, also, pleas, look
--- think, could, includ, want, mous, center, david, machin, file, much", "+++ request, think, engin, could, earth, system, mission, gov, anyon, jpl
--- ", "+++ think, thing, could, first, new, time, may, system, peopl, world
--- request, engin, mission, gov, jpeg, buy, lot, alaska, realli, jpl", "+++ think, thing, could, new, time, may, peopl, see, much, make
--- shall, request, mean, engin, believ, system, power, mission, gov, law", "+++ think, thing, distribut, could, two, new, first, 1993, system, space
--- request, engin, satellit, technolog, includ, inform, mission, gov, design, msg", "+++ car, think, thing, distribut, could, two, new, state, time, system
--- request, simm, engin, price, power, mission, gov, set, jpl, data", "+++ think, could, new, two, first, may, time, point, see, much
--- win, request, engin, play, system, mission, basebal, gov, anyon, better" ], [ "+++ help, work, thank, new, system, repli, comput, anyon, imag, also
--- color, henri, think, window, engin, ftp, could, includ, version, andrew", "+++ want, say, could, new, time, may, system, peopl, need, make
--- keith, ground, think, power, law, jpeg, buy, wire, anyon, find", "+++ want, think, thing, could, say, first, time, may, peopl, world
--- religion, said, jew, live, mean, believ, system, jpeg, buy, anyon", "+++
--- chz, think, could, want, 7ez, 6um, 6ei, look, b8e, motorcycl", "+++ want, think, work, say, new, first, time, system, repli, comput
--- sinc, play, could, power, jpeg, buy, anyon, lot, speed, pitt", "+++ want, think, work, thing, could, say, new, time, may, system
--- escrow, technolog, inform, law, jpeg, buy, anyon, lot, alaska, realli", "+++ want, think, work, say, could, thank, new, may, system, repli
--- win, men, columbia, jim, jpeg, buy, lot, alaska, realli, bike", "+++ want, think, work, thing, could, say, new, first, time, system
--- georg, said, mean, believ, jim, presid, tax, jake, jpeg, buy", "+++ want, help, work, thank, time, may, system, comput, need, call
--- color, think, window, could, includ, version, inform, jpeg, buy, set", "+++ think, thing, could, new, first, may, system, look, world, peopl
--- request, engin, mission, gov, jpeg, buy, lot, alaska, jpl, data", "+++ think, could, system, jpeg, buy, anyon, lot, realli, alaska, want
--- ", "+++ want, think, work, thing, could, say, new, time, may, case
--- shall, mean, believ, system, power, law, jpeg, free, buy, anyon", "+++ want, think, work, thing, could, say, new, first, time, system
--- satellit, technolog, includ, inform, jpeg, design, buy, msg, anyon, lot", "+++ want, think, work, thing, could, help, new, thank, time, system
--- simm, price, power, jpeg, buy, set, lot, alaska, realli, board", "+++ think, say, could, new, first, time, may, see, much, make
--- win, play, system, basebal, jpeg, buy, anyon, lot, better, run" ], [ "+++ new, also, support, work, good
--- think, could, includ, believ, ibm, christian, cmu, scsi, want, driver", "+++ want, say, could, new, time, may, state, power, peopl, law
--- keith, shall, ground, think, mean, believ, system, free, wire, find", "+++ think, mean, could, believ, christian, want, thing, take, right, way
--- religion, shall, said, jew, live, power, law, free, claim, life", "+++
--- chz, think, could, believ, christian, want, 7ez, right, 6um, 6ei", "+++ want, think, work, say, new, time, power, peopl, see, take
--- shall, sinc, mean, play, could, believ, system, law, free, speed", "+++ want, think, work, thing, could, say, new, time, may, govern
--- shall, mean, escrow, technolog, believ, system, inform, power, free, christian", "+++ want, think, work, say, could, new, may, state, peopl, take
--- believ, jim, christian, right, amanda, david, look, case, much, public", "+++ want, think, work, thing, mean, could, say, believ, new, time
--- georg, shall, said, jim, presid, system, power, tax, jake, law", "+++ want, work, may, make, time, also
--- think, could, includ, believ, christian, mous, right, look, machin, case", "+++ think, thing, could, new, time, may, peopl, see, much, make
--- request, shall, engin, mean, believ, system, mission, power, gov, law", "+++ want, think, work, thing, could, say, new, time, may, case
--- shall, mean, believ, system, power, jpeg, buy, law, free, anyon", "+++ shall, think, mean, could, believ, power, law, free, christian, want
--- ", "+++ want, think, work, thing, could, say, new, time, peopl, take
--- shall, mean, satellit, technolog, includ, believ, system, inform, power, law", "+++ want, think, work, thing, could, new, time, state, power, make
--- believ, christian, scsi, pin, right, look, case, much, public, even", "+++ think, say, could, new, time, may, see, take, right, much
--- win, shall, mean, play, believ, power, basebal, law, free, better" ], [ "+++ work, distribut, new, includ, system, inform, comput, look, also, program
--- think, could, satellit, technolog, ibm, cmu, scsi, want, driver, center", "+++ want, say, distribut, could, new, time, system, peopl, make, year
--- think, satellit, technolog, includ, wire, right, carri, center, outlet, look", "+++ want, think, thing, could, say, first, time, peopl, take, call
--- religion, said, jew, live, mean, believ, satellit, technolog, includ, system", "+++
--- chz, think, could, satellit, technolog, includ, want, 7ez, 6um, center", "+++ want, think, work, say, new, first, time, system, problem, comput
--- sinc, play, could, satellit, technolog, includ, power, inform, design, msg", "+++ want, think, work, thing, could, distribut, technolog, two, say, gener
--- satellit, escrow, includ, law, design, msg, program, data, launch, book", "+++ want, think, work, say, distribut, could, new, system, peopl, take
--- win, men, satellit, columbia, technolog, jim, includ, inform, design, anyon", "+++ want, think, work, thing, could, say, new, first, time, system
--- georg, said, mean, satellit, technolog, believ, jim, includ, presid, inform", "+++ want, work, includ, time, system, inform, comput, look, problem, call
--- color, think, window, could, satellit, technolog, version, design, set, msg", "+++ think, thing, distribut, could, two, new, first, 1993, system, look
--- request, engin, satellit, technolog, includ, mission, inform, gov, design, anyon", "+++ want, think, work, thing, could, say, first, new, time, system
--- satellit, technolog, includ, inform, jpeg, buy, design, anyon, lot, msg", "+++ want, think, work, thing, could, say, new, time, peopl, take
--- shall, mean, satellit, technolog, believ, includ, system, power, inform, law", "+++ think, could, satellit, technolog, includ, system, inform, design, msg, program
--- ", "+++ want, think, work, thing, distribut, could, two, new, time, system
--- simm, price, satellit, technolog, includ, power, inform, design, set, anyon", "+++ think, say, could, new, two, first, time, take, much, make
--- win, play, satellit, technolog, includ, system, inform, basebal, design, msg" ], [ "+++ scsi, work, help, distribut, thank, new, mail, system, repli, comput
--- color, henri, think, window, engin, ftp, simm, includ, price, could", "+++ want, distribut, could, sale, new, time, state, system, power, need
--- keith, ground, think, simm, price, law, set, wire, anyon, find", "+++ want, think, thing, could, time, state, make, also, good
--- jew, believ, christian, life, scsi, pin, right, look, exist, even", "+++
--- chz, think, could, want, scsi, pin, 7ez, 6um, 6ei, look", "+++ car, want, scsi, think, work, new, control, time, system, power
--- sinc, simm, play, price, could, set, anyon, speed, pitt, better", "+++ want, think, work, thing, distribut, could, two, new, time, state
--- simm, price, escrow, technolog, inform, power, law, set, anyon, data", "+++ want, think, work, distribut, could, thank, new, state, mail, system
--- win, men, simm, price, columbia, jim, power, set, board, scsi", "+++ want, think, work, thing, could, new, time, system, comput, make
--- georg, simm, said, mean, price, believ, jim, presid, power, tax", "+++ want, help, work, thank, time, mail, system, problem, comput, need
--- color, think, simm, window, price, could, includ, version, inform, power", "+++ car, think, thing, distribut, could, two, new, state, system, look
--- request, simm, engin, price, mission, power, gov, set, jpl, data", "+++ good, want, think, work, thing, could, help, thank, new, time
--- simm, price, power, jpeg, buy, set, lot, alaska, realli, board", "+++ want, think, work, thing, could, new, time, state, power, make
--- shall, simm, mean, rom, price, believ, system, law, free, set", "+++ want, think, work, thing, distribut, could, two, new, time, system
--- simm, satellit, technolog, includ, price, inform, power, design, set, msg", "+++ simm, think, could, price, system, power, set, anyon, want, board
--- ", "+++ think, could, new, two, time, make, also, look, good
--- play, better, want, scsi, pin, right, start, let, team, much" ], [ "+++ also, look, new, toronto, good
--- think, play, could, includ, ibm, better, cmu, scsi, driver, right", "+++ say, could, new, time, may, point, right, make, well, even
--- keith, win, ground, think, play, system, power, basebal, law, wire", "+++ think, say, could, first, time, may, point, see, take, right
--- religion, win, said, jew, live, mean, believ, play, basebal, better", "+++
--- chz, think, play, could, better, 7ez, right, 6um, 6ei, look", "+++ good, think, say, play, new, first, time, see, take, right
--- win, sinc, could, system, power, basebal, speed, pitt, realli, defens", "+++ think, say, could, new, two, time, may, right, make, even
--- win, play, escrow, technolog, system, inform, basebal, law, better, run", "+++ win, think, say, could, new, may, point, take, make, time
--- play, jim, better, want, right, amanda, david, start, let, team", "+++ think, say, could, new, first, time, point, take, right, make
--- georg, win, said, mean, play, believ, jim, presid, system, basebal", "+++ may, look, make, time, also, run
--- think, play, could, includ, better, want, mous, right, machin, start", "+++ think, could, new, two, time, first, may, point, see, much
--- request, win, engin, play, system, mission, basebal, gov, anyon, better", "+++ think, say, could, first, new, time, may, see, much, make
--- win, play, system, basebal, jpeg, buy, anyon, lot, better, alaska", "+++ think, say, could, new, time, may, see, take, right, much
--- shall, win, mean, play, believ, power, basebal, law, free, better", "+++ think, say, could, new, two, first, time, take, much, make
--- win, satellit, technolog, includ, play, system, inform, basebal, design, msg", "+++ think, could, new, two, time, make, also, look, good
--- play, better, want, scsi, pin, right, start, let, team, much", "+++ win, think, could, play, basebal, better, realli, run, defens, back
--- " ] ], "type": "heatmap", "z": [ [ 0, 0.9361702127659575, 0.9795918367346939, 1, 0.8888888888888888, 0.8764044943820225, 0.8764044943820225, 0.9010989010989011, 0.717948717948718, 0.8764044943820225, 0.8505747126436781, 0.9473684210526316, 0.8764044943820225, 0.7951807228915663, 0.9473684210526316 ], [ 0.9361702127659575, 0, 0.7804878048780488, 1, 0.8095238095238095, 0.75, 0.7951807228915663, 0.8235294117647058, 0.9130434782608696, 0.8636363636363636, 0.8095238095238095, 0.7654320987654322, 0.8636363636363636, 0.8235294117647058, 0.8372093023255813 ], [ 0.9795918367346939, 0.7804878048780488, 0, 0.98989898989899, 0.8095238095238095, 0.8095238095238095, 0.8235294117647058, 0.7012987012987013, 0.9247311827956989, 0.8235294117647058, 0.7654320987654322, 0.6111111111111112, 0.8372093023255813, 0.9010989010989011, 0.7804878048780488 ], [ 1, 1, 0.98989898989899, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ], [ 0.8888888888888888, 0.8095238095238095, 0.8095238095238095, 1, 0, 0.8235294117647058, 0.8235294117647058, 0.7654320987654322, 0.8505747126436781, 0.8095238095238095, 0.75, 0.7804878048780488, 0.7804878048780488, 0.7341772151898734, 0.75 ], [ 0.8764044943820225, 0.75, 0.8095238095238095, 1, 0.8235294117647058, 0, 0.7654320987654322, 0.8235294117647058, 0.8764044943820225, 0.8372093023255813, 0.7804878048780488, 0.7341772151898734, 0.7654320987654322, 0.8095238095238095, 0.8372093023255813 ], [ 0.8764044943820225, 0.7951807228915663, 0.8235294117647058, 1, 0.8235294117647058, 0.7654320987654322, 0, 0.7654320987654322, 0.8505747126436781, 0.7341772151898734, 0.75, 0.8372093023255813, 0.8372093023255813, 0.7654320987654322, 0.8636363636363636 ], [ 0.9010989010989011, 0.8235294117647058, 0.7012987012987013, 1, 0.7654320987654322, 0.8235294117647058, 0.7654320987654322, 0, 0.9010989010989011, 0.7951807228915663, 0.7654320987654322, 0.75, 0.7951807228915663, 0.8505747126436781, 0.8235294117647058 ], [ 0.717948717948718, 0.9130434782608696, 0.9247311827956989, 1, 
0.8505747126436781, 0.8764044943820225, 0.8505747126436781, 0.9010989010989011, 0, 0.8888888888888888, 0.8372093023255813, 0.9361702127659575, 0.8505747126436781, 0.8235294117647058, 0.9361702127659575 ], [ 0.8764044943820225, 0.8636363636363636, 0.8235294117647058, 1, 0.8095238095238095, 0.8372093023255813, 0.7341772151898734, 0.7951807228915663, 0.8888888888888888, 0, 0.7654320987654322, 0.8235294117647058, 0.7341772151898734, 0.8095238095238095, 0.8372093023255813 ], [ 0.8505747126436781, 0.8095238095238095, 0.7654320987654322, 1, 0.75, 0.7804878048780488, 0.75, 0.7654320987654322, 0.8372093023255813, 0.7654320987654322, 0, 0.75, 0.7654320987654322, 0.75, 0.7804878048780488 ], [ 0.9473684210526316, 0.7654320987654322, 0.6111111111111112, 1, 0.7804878048780488, 0.7341772151898734, 0.8372093023255813, 0.75, 0.9361702127659575, 0.8235294117647058, 0.75, 0, 0.8235294117647058, 0.8636363636363636, 0.7804878048780488 ], [ 0.8764044943820225, 0.8636363636363636, 0.8372093023255813, 1, 0.7804878048780488, 0.7654320987654322, 0.8372093023255813, 0.7951807228915663, 0.8505747126436781, 0.7341772151898734, 0.7654320987654322, 0.8235294117647058, 0, 0.8095238095238095, 0.8372093023255813 ], [ 0.7951807228915663, 0.8235294117647058, 0.9010989010989011, 1, 0.7341772151898734, 0.8095238095238095, 0.7654320987654322, 0.8505747126436781, 0.8235294117647058, 0.8095238095238095, 0.75, 0.8636363636363636, 0.8095238095238095, 0, 0.9010989010989011 ], [ 0.9473684210526316, 0.8372093023255813, 0.7804878048780488, 1, 0.75, 0.8372093023255813, 0.8636363636363636, 0.8235294117647058, 0.9361702127659575, 0.8372093023255813, 0.7804878048780488, 0.7804878048780488, 0.8372093023255813, 0.9010989010989011, 0 ] ] } ], "layout": { "height": 950, "title": "Topic difference (one model) [jaccard distance]", "width": 950, "xaxis": { "title": "topic" }, "yaxis": { "title": "topic" } } }, "text/html": [ "
" ], "text/vnd.plotly.v1+html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mdiff, annotation = lda_fst.diff(lda_fst, distance='jaccard', num_words=50)\n", "plot_difference(mdiff, title=\"Topic difference (one model) [jaccard distance]\", annotation=annotation)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you compare a model with itself, you want to see as many red elements as possible (except diagonal). With this picture, you can look at the not very red elements and understand which topics in the model are very similar and why (you can read annotation if you move your pointer to cell).\n", "\n", "\n", "Jaccard is stable and robust distance function, but this function not enough sensitive for some purposes. Let's try to use Hellinger distance now." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "data": [ { "colorscale": "RdBu", "text": [ [ "+++ color, henri, engin, window, ftp, includ, version, system, andrew, inform
--- ", "+++ new, distribut, system, file, also, good
--- could, includ, wire, ibm, want, cmu, scsi, driver, right, carri", "+++ also, good
--- think, jew, could, believ, includ, ibm, christian, want, life, cmu", "+++
--- chz, includ, ibm, cmu, scsi, driver, 7ez, 6um, 6ei, look", "+++ scsi, work, new, system, repli, comput, driver, also, look, good
--- think, play, includ, ibm, better, want, cmu, pittsburgh, right, file", "+++ data, work, distribut, new, system, inform, comput, chip, also, bit
--- color, think, henri, window, could, engin, escrow, technolog, ftp, includ", "+++ work, distribut, thank, new, mail, system, repli, anyon, pleas, look
--- think, could, includ, jim, ibm, want, cmu, scsi, driver, amanda", "+++ packag, work, new, system, comput, also, ibm, look, repli
--- georg, think, could, believ, jim, includ, presid, jake, want, cmu", "+++ color, window, includ, version, inform, system, program, comput, mit, also
--- henri, engin, ftp, contact, andrew, set, anyon, server, ibm, user", "+++ data, engin, distribut, new, system, list, anyon, also, pleas, look
--- think, could, includ, ibm, cmu, scsi, driver, center, david, file", "+++ good, help, work, thank, new, system, comput, anyon, imag, also
--- color, think, henri, window, could, engin, ftp, includ, version, andrew", "+++ new, also, support, work, good
--- think, could, believ, includ, ibm, christian, want, cmu, scsi, driver", "+++ work, distribut, new, includ, system, inform, comput, look, also, program
--- think, could, satellit, technolog, ibm, want, cmu, scsi, driver, center", "+++ scsi, work, help, distribut, new, thank, mail, system, repli, comput
--- color, think, simm, henri, price, could, window, engin, ftp, includ", "+++ also, look, new, toronto, good
--- think, play, could, includ, ibm, better, cmu, scsi, driver, right" ], [ "+++ new, distribut, system, file, also, good
--- could, includ, wire, ibm, cmu, scsi, want, driver, right, carri", "+++ keith, ground, could, system, power, law, wire, arm, institut, want
--- ", "+++ want, say, kill, could, time, may, state, point, peopl, right
--- keith, religion, ground, think, said, jew, live, mean, believ, system", "+++
--- chz, could, wire, want, 7ez, right, 6um, carri, 6ei, outlet", "+++ want, say, new, time, system, power, peopl, need, right, make
--- keith, sinc, ground, think, play, could, law, wire, speed, pitt", "+++ want, say, distribut, could, new, time, may, number, system, peopl
--- keith, ground, think, escrow, technolog, inform, power, wire, arm, institut", "+++ want, say, distribut, could, new, may, state, system, point, peopl
--- keith, win, ground, men, think, columbia, jim, power, law, anyon", "+++ want, say, could, new, time, system, point, peopl, right, question
--- keith, georg, ground, think, said, mean, believ, jim, presid, power", "+++ want, time, may, system, need, file, make, also
--- could, includ, wire, mous, right, carri, outlet, look, machin, even", "+++ distribut, new, could, may, system, point, peopl, make, year, time
--- keith, request, ground, think, engin, mission, power, gov, law, anyon", "+++ want, say, could, new, time, may, system, peopl, need, make
--- keith, ground, think, power, jpeg, buy, law, anyon, lot, wire", "+++ want, say, could, new, time, may, state, power, peopl, law
--- keith, shall, ground, file, think, mean, believ, system, free, wire", "+++ want, say, distribut, could, new, time, system, peopl, make, year
--- keith, ground, think, satellit, technolog, includ, inform, power, law, design", "+++ want, distribut, could, usa, new, state, time, system, power, need
--- keith, ground, think, simm, price, law, set, anyon, wire, find", "+++ say, new, could, time, may, point, right, make, well, even
--- keith, win, ground, think, play, system, power, basebal, law, wire" ], [ "+++ also, good
--- think, jew, could, includ, believ, ibm, christian, cmu, scsi, want", "+++ want, say, kill, could, time, may, state, point, peopl, right
--- keith, religion, ground, think, said, jew, live, mean, believ, system", "+++ religion, think, jew, said, live, could, mean, believ, claim, christian
--- ", "+++ israel
--- chz, think, jew, could, believ, christian, want, life, 7ez, right", "+++ tri, want, think, say, first, time, peopl, see, take, right
--- religion, sinc, said, jew, play, could, live, mean, believ, system", "+++ want, think, thing, could, say, time, may, state, peopl, right
--- religion, said, jew, live, mean, escrow, technolog, believ, system, inform", "+++ tri, want, think, say, could, may, state, point, peopl, come
--- win, religion, men, said, jew, live, mean, columbia, believ, jim", "+++ think, said, mean, could, believ, want, thing, come, take, right
--- georg, religion, jew, live, jim, presid, system, tax, jake, ibm", "+++ tri, want, time, may, call, make, also
--- think, jew, could, includ, believ, christian, life, mous, right, look", "+++ tri, think, thing, could, first, time, may, point, peopl, world
--- request, religion, said, engin, live, jew, mean, believ, system, mission", "+++ want, think, thing, could, say, first, time, may, peopl, world
--- religion, said, jew, live, mean, believ, system, jpeg, buy, anyon", "+++ think, mean, could, believ, christian, want, thing, take, right, way
--- shall, religion, said, live, jew, power, law, free, claim, life", "+++ want, think, thing, could, say, first, time, peopl, take, call
--- religion, said, jew, live, satellit, technolog, includ, mean, believ, system", "+++ want, think, thing, could, time, state, make, also, good
--- jew, believ, christian, life, scsi, pin, right, look, exist, even", "+++ think, say, could, first, time, may, point, see, take, right
--- win, religion, said, jew, play, live, mean, believ, basebal, better" ], [ "+++
--- chz, includ, ibm, cmu, scsi, driver, 7ez, 6um, 6ei, look", "+++
--- chz, could, wire, want, right, carri, 7ez, 6um, outlet, 6ei", "+++ israel
--- chz, think, jew, could, believ, christian, want, life, right, 7ez", "+++ fij, chz, r8f, b8g, 147, nuy, 2tct, wm4u, 9f9, b4q
--- ", "+++
--- chz, think, play, better, want, scsi, pittsburgh, driver, right, 7ez", "+++
--- chz, think, could, technolog, want, right, 7ez, 6um, 6ei, b8e", "+++
--- chz, think, could, jim, want, 7ez, 6um, amanda, 6ei, david", "+++
--- georg, chz, think, could, believ, jim, presid, jake, ibm, want", "+++
--- chz, includ, want, mous, 7ez, 6um, 6ei, look, b8e, machin", "+++
--- chz, think, could, 7ez, 6um, center, 6ei, david, look, b8e", "+++
--- chz, think, could, want, 7ez, 6um, 6ei, look, motorcycl, b8e", "+++
--- chz, think, could, believ, christian, want, right, 7ez, 6um, 6ei", "+++
--- chz, think, could, satellit, technolog, includ, want, 7ez, 6um, center", "+++
--- chz, think, could, want, scsi, pin, 7ez, 6um, 6ei, look", "+++
--- chz, think, play, could, better, right, 7ez, 6um, 6ei, look" ], [ "+++ scsi, work, new, system, repli, comput, driver, also, look, good
--- think, play, includ, ibm, better, cmu, want, pittsburgh, right, file", "+++ want, say, new, time, system, power, peopl, need, right, make
--- keith, ground, sinc, think, could, play, law, wire, find, speed", "+++ tri, want, think, say, first, time, peopl, see, take, right
--- religion, sinc, said, jew, could, live, mean, believ, play, system", "+++
--- chz, think, play, better, want, scsi, pittsburgh, driver, 7ez, right", "+++ sinc, think, play, system, power, speed, pitt, better, run, back
--- ", "+++ want, think, work, say, new, time, system, peopl, comput, need
--- sinc, could, play, escrow, technolog, inform, power, law, speed, pitt", "+++ tri, want, think, work, say, new, system, repli, peopl, need
--- win, sinc, men, could, play, columbia, jim, power, anyon, speed", "+++ tri, want, think, work, say, new, first, time, system, peopl
--- georg, sinc, said, mean, could, play, believ, jim, presid, power", "+++ tri, want, read, work, time, system, problem, comput, need, make
--- color, sinc, think, window, play, includ, version, inform, power, set", "+++ car, tri, read, think, new, first, time, system, peopl, see
--- request, sinc, engin, could, play, mission, power, gov, anyon, speed", "+++ good, want, think, work, say, new, first, time, system, peopl
--- sinc, could, play, power, jpeg, buy, anyon, lot, speed, pitt", "+++ want, think, work, say, new, time, power, peopl, see, take
--- shall, sinc, mean, could, play, believ, system, law, free, speed", "+++ want, think, work, say, new, first, time, system, problem, comput
--- sinc, could, satellit, technolog, includ, play, inform, power, design, msg", "+++ car, want, scsi, think, work, new, control, time, system, power
--- sinc, simm, price, could, play, set, anyon, speed, pitt, better", "+++ good, think, say, play, new, first, time, see, take, right
--- win, sinc, could, system, power, basebal, speed, pitt, defens, realli" ], [ "+++ data, work, distribut, new, system, inform, comput, chip, also, bit
--- think, could, technolog, includ, ibm, cmu, scsi, want, driver, right", "+++ want, say, distribut, could, new, time, may, mani, system, peopl
--- keith, ground, think, escrow, technolog, power, inform, wire, find, arm", "+++ want, think, thing, could, say, time, may, state, peopl, right
--- religion, said, jew, live, mean, believ, escrow, technolog, system, inform", "+++
--- chz, think, could, technolog, want, 7ez, right, 6um, 6ei, b8e", "+++ want, think, work, say, new, time, system, peopl, comput, need
--- sinc, play, could, escrow, technolog, power, inform, law, speed, pitt", "+++ think, could, escrow, technolog, system, inform, law, data, want, thing
--- ", "+++ want, think, work, say, distribut, could, new, may, state, system
--- win, men, columbia, escrow, jim, technolog, inform, law, anyon, data", "+++ want, think, work, thing, could, say, new, time, system, peopl
--- georg, said, mean, escrow, believ, jim, technolog, presid, inform, tax", "+++ want, work, time, may, system, inform, comput, need, make, also
--- think, could, technolog, includ, mous, right, look, clipper, machin, file", "+++ data, think, thing, distribut, could, two, new, may, system, peopl
--- request, engin, escrow, technolog, mission, inform, gov, law, anyon, jpl", "+++ want, think, work, thing, could, say, new, time, may, system
--- escrow, technolog, inform, jpeg, buy, law, anyon, lot, alaska, realli", "+++ want, think, work, thing, could, say, new, time, may, govern
--- shall, mean, escrow, believ, technolog, system, power, inform, free, christian", "+++ want, think, work, thing, could, distribut, technolog, two, say, gener
--- satellit, includ, escrow, law, design, msg, program, data, launch, number", "+++ want, think, work, thing, distribut, could, two, new, state, time
--- simm, price, escrow, technolog, power, inform, law, set, anyon, data", "+++ think, say, could, new, two, time, may, right, make, even
--- win, play, escrow, technolog, system, inform, basebal, law, better, run" ], [ "+++ work, distribut, thank, new, mail, system, repli, anyon, pleas, look
--- think, could, includ, jim, ibm, cmu, scsi, want, driver, amanda", "+++ want, say, distribut, could, new, may, state, system, point, peopl
--- keith, win, ground, men, think, columbia, jim, power, law, wire", "+++ tri, want, think, say, could, may, state, point, peopl, world
--- religion, win, men, said, jew, live, mean, believ, columbia, jim", "+++
--- chz, think, could, jim, want, 7ez, 6um, amanda, 6ei, david", "+++ tri, want, think, work, say, new, system, repli, peopl, need
--- win, sinc, men, play, could, columbia, jim, power, anyon, speed", "+++ want, think, work, say, distribut, could, new, may, state, system
--- win, men, escrow, technolog, columbia, jim, inform, law, anyon, data", "+++ win, men, think, could, columbia, jim, system, anyon, want, john
--- ", "+++ tri, want, think, work, say, could, new, jim, system, point
--- georg, win, men, said, mean, columbia, believ, presid, tax, jake", "+++ tri, want, work, thank, name, may, mail, system, need, make
--- color, win, men, think, window, could, columbia, includ, jim, version", "+++ tri, think, org, distribut, could, new, may, state, system, point
--- request, win, men, engin, columbia, jim, mission, gov, jpl, data", "+++ good, want, think, work, say, could, thank, new, may, system
--- win, men, columbia, jim, jpeg, buy, lot, alaska, realli, bike", "+++ want, think, work, say, could, new, may, state, peopl, take
--- shall, win, men, mean, columbia, believ, jim, system, power, law", "+++ want, think, work, say, distribut, could, new, system, peopl, take
--- win, men, satellit, technolog, includ, columbia, jim, inform, design, msg", "+++ want, think, work, distribut, could, new, thank, state, mail, system
--- win, men, simm, price, columbia, jim, power, set, find, board", "+++ win, think, say, could, new, may, point, take, make, time
--- men, play, columbia, jim, system, basebal, anyon, better, run, back" ], [ "+++ packag, work, new, system, comput, also, ibm, look, repli
--- georg, think, could, includ, believ, jim, presid, jake, cmu, scsi", "+++ want, say, could, new, time, system, point, peopl, right, question
--- keith, georg, ground, think, said, mean, believ, jim, power, presid", "+++ think, said, mean, could, believ, want, thing, come, take, right
--- religion, georg, jew, live, jim, presid, system, tax, jake, ibm", "+++
--- georg, chz, think, could, believ, jim, presid, jake, ibm, want", "+++ tri, want, think, work, say, new, first, time, system, peopl
--- georg, sinc, said, mean, play, could, believ, jim, power, presid", "+++ want, think, work, thing, could, say, new, time, system, peopl
--- georg, said, mean, escrow, technolog, believ, jim, inform, presid, tax", "+++ tri, want, think, work, say, could, new, jim, system, point
--- win, georg, men, said, mean, columbia, believ, presid, tax, jake", "+++ georg, think, said, mean, could, believ, jim, presid, system, tax
--- ", "+++ tri, want, work, time, system, comput, make, also, look
--- georg, think, could, includ, believ, jim, presid, jake, ibm, mous", "+++ tri, think, thing, could, new, first, system, look, point, peopl
--- request, georg, said, engin, mean, believ, jim, mission, presid, gov", "+++ want, think, work, thing, could, say, first, new, time, system
--- georg, said, mean, believ, jim, presid, tax, jpeg, buy, jake", "+++ want, think, work, thing, mean, could, say, believ, new, time
--- shall, georg, said, jim, presid, power, system, tax, law, jake", "+++ want, think, work, thing, could, say, new, first, time, system
--- georg, said, mean, satellit, technolog, includ, believ, jim, inform, presid", "+++ want, think, work, thing, could, new, time, system, comput, make
--- georg, believ, jim, presid, jake, ibm, scsi, pin, right, talk", "+++ think, say, could, new, first, time, point, take, right, make
--- win, georg, said, mean, play, believ, jim, presid, system, basebal" ], [ "+++ color, window, includ, version, inform, system, program, comput, display, also
--- henri, engin, ftp, andrew, contact, set, anyon, server, ibm, user", "+++ want, time, may, system, need, file, make, also
--- could, includ, wire, mous, right, carri, outlet, look, machin, even", "+++ tri, want, time, may, call, make, also
--- think, jew, could, believ, includ, christian, life, mous, right, look", "+++
--- chz, includ, want, mous, 7ez, 6um, 6ei, look, b8e, machin", "+++ tri, want, read, work, time, system, problem, comput, need, make
--- color, sinc, think, window, play, includ, version, power, inform, set", "+++ want, work, time, may, system, inform, comput, need, make, also
--- color, file, chang, think, window, could, escrow, technolog, includ, version", "+++ tri, want, work, thank, name, may, mail, system, need, make
--- color, win, men, think, chang, window, could, columbia, includ, jim", "+++ tri, want, work, time, system, comput, make, also, look
--- georg, think, could, believ, jim, includ, presid, jake, ibm, mous", "+++ color, window, includ, version, inform, system, set, server, program, run
--- ", "+++ tri, read, may, system, look, list, make, time, pleas, also
--- think, could, includ, want, mous, center, david, machin, file, much", "+++ want, help, work, thank, time, may, system, comput, need, call
--- color, think, window, could, includ, version, inform, jpeg, buy, set", "+++ want, work, may, make, time, also
--- think, could, believ, includ, christian, mous, right, look, machin, case", "+++ want, work, includ, time, system, inform, comput, look, problem, call
--- color, think, window, could, satellit, technolog, version, design, set, msg", "+++ want, help, work, thank, time, mail, system, problem, comput, need
--- color, chang, think, simm, window, price, could, includ, version, power", "+++ may, look, make, time, also, run
--- think, play, could, includ, better, want, mous, right, start, machin" ], [ "+++ data, engin, distribut, new, system, list, anyon, also, pleas, look
--- color, request, henri, think, window, could, ftp, includ, earth, version", "+++ distribut, new, could, time, may, system, point, peopl, make, year
--- think, wire, want, right, carri, center, outlet, david, look, file", "+++ tri, think, thing, could, first, time, may, point, peopl, world
--- religion, request, said, jew, live, mean, believ, engin, system, mission", "+++
--- chz, think, could, 7ez, 6um, center, 6ei, david, look, b8e", "+++ car, tri, read, think, new, first, time, system, peopl, see
--- request, sinc, engin, play, could, power, mission, gov, anyon, speed", "+++ data, think, thing, distribut, could, two, new, may, time, system
--- request, engin, escrow, technolog, inform, mission, gov, law, anyon, jpl", "+++ tri, think, org, distribut, could, new, may, state, system, point
--- win, request, men, engin, columbia, jim, mission, gov, jpl, data", "+++ tri, think, thing, could, new, first, time, system, point, peopl
--- georg, request, said, mean, engin, believ, jim, presid, mission, tax", "+++ tri, read, time, may, system, list, make, also, pleas, look
--- think, could, includ, want, mous, center, david, machin, file, much", "+++ request, think, engin, could, earth, system, mission, gov, anyon, jpl
--- ", "+++ think, thing, could, first, new, time, may, system, peopl, world
--- request, engin, mission, gov, jpeg, buy, lot, alaska, realli, jpl", "+++ think, thing, could, new, time, may, peopl, see, much, make
--- shall, request, mean, engin, believ, system, power, mission, gov, law", "+++ think, thing, distribut, could, two, new, first, 1993, system, space
--- request, engin, satellit, technolog, includ, inform, mission, gov, design, msg", "+++ car, think, thing, distribut, could, two, new, state, time, system
--- request, simm, engin, price, power, mission, gov, set, jpl, data", "+++ think, could, new, two, first, may, time, point, see, much
--- win, request, engin, play, system, mission, basebal, gov, anyon, better" ], [ "+++ help, work, thank, new, system, repli, comput, anyon, imag, also
--- color, henri, think, window, engin, ftp, could, includ, version, andrew", "+++ want, say, could, new, time, may, system, peopl, need, make
--- keith, ground, think, power, law, jpeg, buy, wire, anyon, find", "+++ want, think, thing, could, say, first, time, may, peopl, world
--- religion, said, jew, live, mean, believ, system, jpeg, buy, anyon", "+++
--- chz, think, could, want, 7ez, 6um, 6ei, look, b8e, motorcycl", "+++ want, think, work, say, new, first, time, system, repli, comput
--- sinc, play, could, power, jpeg, buy, anyon, lot, speed, pitt", "+++ want, think, work, thing, could, say, new, time, may, system
--- escrow, technolog, inform, law, jpeg, buy, anyon, lot, alaska, realli", "+++ want, think, work, say, could, thank, new, may, system, repli
--- win, men, columbia, jim, jpeg, buy, lot, alaska, realli, bike", "+++ want, think, work, thing, could, say, new, first, time, system
--- georg, said, mean, believ, jim, presid, tax, jake, jpeg, buy", "+++ want, help, work, thank, time, may, system, comput, need, call
--- color, think, window, could, includ, version, inform, jpeg, buy, set", "+++ think, thing, could, new, first, may, system, look, world, peopl
--- request, engin, mission, gov, jpeg, buy, lot, alaska, jpl, data", "+++ think, could, system, jpeg, buy, anyon, lot, realli, alaska, want
--- ", "+++ want, think, work, thing, could, say, new, time, may, case
--- shall, mean, believ, system, power, law, jpeg, free, buy, anyon", "+++ want, think, work, thing, could, say, new, first, time, system
--- satellit, technolog, includ, inform, jpeg, design, buy, msg, anyon, lot", "+++ want, think, work, thing, could, help, new, thank, time, system
--- simm, price, power, jpeg, buy, set, lot, alaska, realli, board", "+++ think, say, could, new, first, time, may, see, much, make
--- win, play, system, basebal, jpeg, buy, anyon, lot, better, run" ], [ "+++ new, also, support, work, good
--- think, could, includ, believ, ibm, christian, cmu, scsi, want, driver", "+++ want, say, could, new, time, may, state, power, peopl, law
--- keith, shall, ground, think, mean, believ, system, free, wire, find", "+++ think, mean, could, believ, christian, want, thing, take, right, way
--- religion, shall, said, jew, live, power, law, free, claim, life", "+++
--- chz, think, could, believ, christian, want, 7ez, right, 6um, 6ei", "+++ want, think, work, say, new, time, power, peopl, see, take
--- shall, sinc, mean, play, could, believ, system, law, free, speed", "+++ want, think, work, thing, could, say, new, time, may, govern
--- shall, mean, escrow, technolog, believ, system, inform, power, free, christian", "+++ want, think, work, say, could, new, may, state, peopl, take
--- believ, jim, christian, right, amanda, david, look, case, much, public", "+++ want, think, work, thing, mean, could, say, believ, new, time
--- georg, shall, said, jim, presid, system, power, tax, jake, law", "+++ want, work, may, make, time, also
--- think, could, includ, believ, christian, mous, right, look, machin, case", "+++ think, thing, could, new, time, may, peopl, see, much, make
--- request, shall, engin, mean, believ, system, mission, power, gov, law", "+++ want, think, work, thing, could, say, new, time, may, case
--- shall, mean, believ, system, power, jpeg, buy, law, free, anyon", "+++ shall, think, mean, could, believ, power, law, free, christian, want
--- ", "+++ want, think, work, thing, could, say, new, time, peopl, take
--- shall, mean, satellit, technolog, includ, believ, system, inform, power, law", "+++ want, think, work, thing, could, new, time, state, power, make
--- believ, christian, scsi, pin, right, look, case, much, public, even", "+++ think, say, could, new, time, may, see, take, right, much
--- win, shall, mean, play, believ, power, basebal, law, free, better" ], [ "+++ work, distribut, new, includ, system, inform, comput, look, also, program
--- think, could, satellit, technolog, ibm, cmu, scsi, want, driver, center", "+++ want, say, distribut, could, new, time, system, peopl, make, year
--- think, satellit, technolog, includ, wire, right, carri, center, outlet, look", "+++ want, think, thing, could, say, first, time, peopl, take, call
--- religion, said, jew, live, mean, believ, satellit, technolog, includ, system", "+++
--- chz, think, could, satellit, technolog, includ, want, 7ez, 6um, center", "+++ want, think, work, say, new, first, time, system, problem, comput
--- sinc, play, could, satellit, technolog, includ, power, inform, design, msg", "+++ want, think, work, thing, could, distribut, technolog, two, say, gener
--- satellit, escrow, includ, law, design, msg, program, data, launch, book", "+++ want, think, work, say, distribut, could, new, system, peopl, take
--- win, men, satellit, columbia, technolog, jim, includ, inform, design, anyon", "+++ want, think, work, thing, could, say, new, first, time, system
--- georg, said, mean, satellit, technolog, believ, jim, includ, presid, inform", "+++ want, work, includ, time, system, inform, comput, look, problem, call
--- color, think, window, could, satellit, technolog, version, design, set, msg", "+++ think, thing, distribut, could, two, new, first, 1993, system, look
--- request, engin, satellit, technolog, includ, mission, inform, gov, design, anyon", "+++ want, think, work, thing, could, say, first, new, time, system
--- satellit, technolog, includ, inform, jpeg, buy, design, anyon, lot, msg", "+++ want, think, work, thing, could, say, new, time, peopl, take
--- shall, mean, satellit, technolog, believ, includ, system, power, inform, law", "+++ think, could, satellit, technolog, includ, system, inform, design, msg, program
--- ", "+++ want, think, work, thing, distribut, could, two, new, time, system
--- simm, price, satellit, technolog, includ, power, inform, design, set, anyon", "+++ think, say, could, new, two, first, time, take, much, make
--- win, play, satellit, technolog, includ, system, inform, basebal, design, msg" ], [ "+++ scsi, work, help, distribut, thank, new, mail, system, repli, comput
--- color, henri, think, window, engin, ftp, simm, includ, price, could", "+++ want, distribut, could, sale, new, time, state, system, power, need
--- keith, ground, think, simm, price, law, set, wire, anyon, find", "+++ want, think, thing, could, time, state, make, also, good
--- jew, believ, christian, life, scsi, pin, right, look, exist, even", "+++
--- chz, think, could, want, scsi, pin, 7ez, 6um, 6ei, look", "+++ car, want, scsi, think, work, new, control, time, system, power
--- sinc, simm, play, price, could, set, anyon, speed, pitt, better", "+++ want, think, work, thing, distribut, could, two, new, time, state
--- simm, price, escrow, technolog, inform, power, law, set, anyon, data", "+++ want, think, work, distribut, could, thank, new, state, mail, system
--- win, men, simm, price, columbia, jim, power, set, board, scsi", "+++ want, think, work, thing, could, new, time, system, comput, make
--- georg, simm, said, mean, price, believ, jim, presid, power, tax", "+++ want, help, work, thank, time, mail, system, problem, comput, need
--- color, think, simm, window, price, could, includ, version, inform, power", "+++ car, think, thing, distribut, could, two, new, state, system, look
--- request, simm, engin, price, mission, power, gov, set, jpl, data", "+++ good, want, think, work, thing, could, help, thank, new, time
--- simm, price, power, jpeg, buy, set, lot, alaska, realli, board", "+++ want, think, work, thing, could, new, time, state, power, make
--- shall, simm, mean, rom, price, believ, system, law, free, set", "+++ want, think, work, thing, distribut, could, two, new, time, system
--- simm, satellit, technolog, includ, price, inform, power, design, set, msg", "+++ simm, think, could, price, system, power, set, anyon, want, board
--- ", "+++ think, could, new, two, time, make, also, look, good
--- play, better, want, scsi, pin, right, start, let, team, much" ], [ "+++ also, look, new, toronto, good
--- think, play, could, includ, ibm, better, cmu, scsi, driver, right", "+++ say, could, new, time, may, point, right, make, well, even
--- keith, win, ground, think, play, system, power, basebal, law, wire", "+++ think, say, could, first, time, may, point, see, take, right
--- religion, win, said, jew, live, mean, believ, play, basebal, better", "+++
--- chz, think, play, could, better, 7ez, right, 6um, 6ei, look", "+++ good, think, say, play, new, first, time, see, take, right
--- win, sinc, could, system, power, basebal, speed, pitt, realli, defens", "+++ think, say, could, new, two, time, may, right, make, even
--- win, play, escrow, technolog, system, inform, basebal, law, better, run", "+++ win, think, say, could, new, may, point, take, make, time
--- play, jim, better, want, right, amanda, david, start, let, team", "+++ think, say, could, new, first, time, point, take, right, make
--- georg, win, said, mean, play, believ, jim, presid, system, basebal", "+++ may, look, make, time, also, run
--- think, play, could, includ, better, want, mous, right, machin, start", "+++ think, could, new, two, time, first, may, point, see, much
--- request, win, engin, play, system, mission, basebal, gov, anyon, better", "+++ think, say, could, first, new, time, may, see, much, make
--- win, play, system, basebal, jpeg, buy, anyon, lot, better, alaska", "+++ think, say, could, new, time, may, see, take, right, much
--- shall, win, mean, play, believ, power, basebal, law, free, better", "+++ think, say, could, new, two, first, time, take, much, make
--- win, satellit, technolog, includ, play, system, inform, basebal, design, msg", "+++ think, could, new, two, time, make, also, look, good
--- play, better, want, scsi, pin, right, start, let, team, much", "+++ win, think, could, play, basebal, better, realli, run, defens, back
--- " ] ], "type": "heatmap", "z": [ [ 0, 0.6941375044357858, 0.7780005266674749, 0.97189773660304, 0.6494274681021718, 0.6923305372448271, 0.6587748986641923, 0.7172631293260368, 0.5768317051345507, 0.6614222396857877, 0.6348409260412068, 0.7521714148945046, 0.6538877795284743, 0.5899958412723074, 0.7572056357845106 ], [ 0.6941375044357858, 0, 0.664263027346316, 0.9743962370456708, 0.695584265654122, 0.6099902276618888, 0.6725182228398355, 0.6588083981443085, 0.7133721680105588, 0.6666040337585714, 0.6449328433151109, 0.6567507102369133, 0.6557904103701298, 0.6741004728982773, 0.7199592373619665 ], [ 0.7780005266674749, 0.664263027346316, 0, 0.9819638161841043, 0.7312870864008548, 0.6457109042650385, 0.6822011682985184, 0.6136278502100356, 0.7825286902851545, 0.7328428009145692, 0.6647093080369028, 0.5129009239352483, 0.686079850132694, 0.7444034441155176, 0.728017079465759 ], [ 0.97189773660304, 0.9743962370456708, 0.9819638161841043, 0, 0.9906998240780212, 0.9899010268346052, 0.9697970873667073, 0.9805184646715074, 0.997913221347332, 0.9410072398135072, 0.970472913577555, 0.9933128861436723, 0.9651353008216144, 0.988087255000911, 1 ], [ 0.6494274681021718, 0.695584265654122, 0.7312870864008548, 0.9906998240780212, 0, 0.6984165588399608, 0.6880183605332709, 0.6923925747139658, 0.6984072702557435, 0.6881820190614852, 0.6415246046146913, 0.734723733770132, 0.6737033387865671, 0.6113346174101822, 0.6735639764844695 ], [ 0.6923305372448271, 0.6099902276618888, 0.6457109042650385, 0.9899010268346052, 0.6984165588399608, 0, 0.630037509201517, 0.6280273998970393, 0.6864119107262651, 0.6828635157854915, 0.6379453741858915, 0.5848923912785794, 0.6384155109280928, 0.6771215052178688, 0.7247485433842347 ], [ 0.6587748986641923, 0.6725182228398355, 0.6822011682985184, 0.9697970873667073, 0.6880183605332709, 0.630037509201517, 0, 0.6541134738070555, 0.6764020597532797, 0.6616669384674461, 0.639317205129716, 0.6647270363214451, 0.6611877349003192, 0.6504504959646574, 
0.697199512599333 ], [ 0.7172631293260368, 0.6588083981443085, 0.6136278502100356, 0.9805184646715074, 0.6923925747139658, 0.6280273998970393, 0.6541134738070555, 0, 0.7386737061919797, 0.6942335664995479, 0.6402472659650361, 0.6048809578448723, 0.6540944355144231, 0.6780932789958286, 0.6929417356032255 ], [ 0.5768317051345507, 0.7133721680105588, 0.7825286902851545, 0.997913221347332, 0.6984072702557435, 0.6864119107262651, 0.6764020597532797, 0.7386737061919797, 0, 0.7202384890474284, 0.6682838375381028, 0.7506248648750038, 0.6988713007454896, 0.6481835386304212, 0.7820951609366245 ], [ 0.6614222396857877, 0.6666040337585714, 0.7328428009145692, 0.9410072398135072, 0.6881820190614852, 0.6828635157854915, 0.6616669384674461, 0.6942335664995479, 0.7202384890474284, 0, 0.6270580649653639, 0.7195497392802706, 0.5903638618131142, 0.6728876127289707, 0.6943688980681291 ], [ 0.6348409260412068, 0.6449328433151109, 0.6647093080369028, 0.970472913577555, 0.6415246046146913, 0.6379453741858915, 0.639317205129716, 0.6402472659650361, 0.6682838375381028, 0.6270580649653639, 0, 0.6549552527820861, 0.5937127598305801, 0.6262404431076025, 0.6887609196728117 ], [ 0.7521714148945046, 0.6567507102369133, 0.5129009239352483, 0.9933128861436723, 0.734723733770132, 0.5848923912785794, 0.6647270363214451, 0.6048809578448723, 0.7506248648750038, 0.7195497392802706, 0.6549552527820861, 0, 0.6689843410868274, 0.7307020730686198, 0.7273455968736988 ], [ 0.6538877795284743, 0.6557904103701298, 0.686079850132694, 0.9651353008216144, 0.6737033387865671, 0.6384155109280928, 0.6611877349003192, 0.6540944355144231, 0.6988713007454896, 0.5903638618131142, 0.5937127598305801, 0.6689843410868274, 0, 0.6592200513160686, 0.6857630808748971 ], [ 0.5899958412723074, 0.6741004728982773, 0.7444034441155176, 0.988087255000911, 0.6113346174101822, 0.6771215052178688, 0.6504504959646574, 0.6780932789958286, 0.6481835386304212, 0.6728876127289707, 0.6262404431076025, 0.7307020730686198, 0.6592200513160686, 
0, 0.719246957334677 ], [ 0.7572056357845106, 0.7199592373619665, 0.728017079465759, 1, 0.6735639764844695, 0.7247485433842347, 0.697199512599333, 0.6929417356032255, 0.7820951609366245, 0.6943688980681291, 0.6887609196728117, 0.7273455968736988, 0.6857630808748971, 0.719246957334677, 0 ] ] } ], "layout": { "height": 950, "title": "Topic difference (one model)[hellinger distance]", "width": 950, "xaxis": { "title": "topic" }, "yaxis": { "title": "topic" } } }, "text/html": [ "
" ], "text/vnd.plotly.v1+html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mdiff, annotation = lda_fst.diff(lda_fst, distance='hellinger', num_words=50)\n", "plot_difference(mdiff, title=\"Topic difference (one model)[hellinger distance]\", annotation=annotation)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You see that everything has become worse, but remember that everything depends on the task.\n", "\n", "You need to choose the function with which your personal point of view about topics similarity and your task (from my experience, Jaccard is fine)." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Case 2: How topics from DIFFERENT models correlate with each other." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sometimes, we want to look at the patterns between two different models and compare them. \n", "\n", "You can do this by constructing a matrix with the difference." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "data": [ { "colorscale": "RdBu", "text": [ [ "+++ toronto, new
--- think, play, includ, ibm, cmu, scsi, pittsburgh, driver, 6um, 6ei", "+++ also, look, new, good
--- think, play, could, includ, ibm, better, still, cmu, scsi, driver", "+++ color, window, ftp, includ, version, system, inform, program, data, packag
--- henri, engin, contact, unix, type, andrew, set, anyon, server, ibm", "+++ work, engin, distribut, new, system, repli, chip, anyon, also, look
--- color, chang, think, henri, could, window, ftp, includ, version, andrew", "+++ also, system, good
--- think, could, includ, ibm, want, cmu, scsi, evid, driver, right", "+++ work, also, chip, system, new
--- think, jew, could, includ, presid, ibm, arab, want, cmu, scsi", "+++ also
--- think, could, believ, includ, ibm, christian, follow, want, cmu, scsi", "+++ help, work, distribut, new, anyon, also, ibm, look, bit, good
--- think, could, austin, includ, nec, wire, want, cmu, scsi, driver", "+++ also, work, repli, good
--- think, could, believ, includ, ibm, christian, apr, want, life, cmu", "+++ good, also, distribut
--- think, could, includ, ibm, cmu, scsi, driver, right, carri, look", "+++ data, cmu, engin, new, includ, system, inform, andrew, comput, avail
--- color, henri, window, satellit, technolog, ftp, mission, contact, gov, version", "+++ distribut, ftp, new, includ, mail, system, inform, comput, list, pub
--- section, request, color, henri, window, engin, display, version, andrew, contact", "+++ henri, work, anyon, also, program, toronto, good
--- think, could, technolog, includ, ibm, want, cmu, scsi, driver, look", "+++ good, softwar, work, distribut, new, thank, includ, info, mail, system
--- color, think, simm, henri, price, could, window, engin, ftp, version", "+++ data, scsi, work, window, help, thank, new, system, comput, driver
--- color, motherboard, henri, instal, engin, ftp, includ, interfac, version, devic" ], [ "+++ year, new
--- think, play, could, wire, want, pittsburgh, right, carri, 6um, 6ei", "+++ say, could, new, time, may, point, power, peopl, right, make
--- keith, win, ground, think, play, columbia, system, basebal, law, wire", "+++ distribut, system, need, file, time, also
--- could, includ, type, wire, want, driver, right, carri, outlet, look", "+++ want, say, distribut, could, new, time, may, state, system, point
--- keith, ground, chang, think, engin, power, buy, design, law, anyon", "+++ keith, want, say, could, time, system, point, peopl, right, question
--- natur, sinc, ground, think, mean, power, law, wire, opinion, arm", "+++ want, say, could, new, time, system, peopl, need, law, right
--- keith, ground, file, think, jew, said, polit, presid, power, tax", "+++ want, say, kill, could, time, may, point, peopl, law, right
--- keith, religion, ground, think, said, live, mean, believ, children, system", "+++ ground, want, circuit, distribut, could, new, time, may, power, need
--- keith, think, austin, current, system, nec, gov, law, bnr, msg", "+++ want, say, could, time, may, point, peopl, need, question, make
--- true, keith, ground, think, mean, exampl, believ, system, power, law", "+++ could, law, arm, distribut, gun, right, bill, carri, handgun, also
--- keith, ground, think, mean, crimin, system, power, wire, find, institut", "+++ new, may, system, year, time, number, also
--- could, satellit, technolog, includ, wire, cmu, want, pittsburgh, right, carri", "+++ new, distribut, may, system, file, number
--- could, includ, wire, rule, want, right, carri, outlet, public, even", "+++ want, say, could, peopl, 000, make, year, also, usa, good
--- think, technolog, wire, right, carri, outlet, michael, cost, file, much", "+++ want, distribut, could, sale, new, system, need, make, also, usa
--- keith, ground, think, simm, price, includ, power, buy, law, anyon", "+++ need, also, control, system, new
--- instal, could, wire, want, scsi, pin, mous, driver, right, carri" ], [ "+++ see, think, year
--- jew, play, could, believ, christian, want, life, pittsburgh, right, 6um", "+++ think, say, could, first, time, may, point, peopl, see, take
--- win, religion, said, jew, play, live, columbia, mean, believ, power", "+++ tri, call, also, time
--- think, jew, could, includ, believ, type, christian, want, life, driver", "+++ want, think, thing, could, say, time, may, state, point, see
--- religion, chang, said, engin, live, jew, mean, believ, system, buy", "+++ think, mean, could, want, thing, human, take, right, way, also
--- keith, natur, sinc, religion, said, live, jew, believ, system, opinion", "+++ think, jew, said, could, want, right, way, also, call, question
--- religion, live, mean, polit, believ, presid, system, tax, law, propos", "+++ religion, think, said, mean, live, could, believ, christian, want, thing
--- jew, children, law, claim, follow, life, two, human, muslim, death", "+++ want, think, thing, could, time, may, take, question, make, well
--- religion, ground, said, jew, live, mean, austin, current, believ, power", "+++ think, mean, could, believ, christian, claim, want, life, thing, come
--- true, religion, said, live, exampl, jew, realli, apr, atheist, kill", "+++ tri, think, thing, mean, could, say, state, seem, peopl, right
--- religion, said, live, jew, believ, crimin, law, arm, christian, defens", "+++ first, may, world, year, time, also
--- think, jew, could, satellit, technolog, includ, believ, christian, cmu, want", "+++ may
--- think, jew, could, includ, believ, rule, christian, want, life, right", "+++ want, think, thing, could, say, first, peopl, come, see, make
--- religion, henri, said, jew, live, mean, technolog, believ, gov, prism", "+++ want, think, could, world, call, make, also, good
--- jew, includ, believ, ibm, christian, life, right, look, origin, exist", "+++ tri, also
--- think, jew, instal, could, believ, christian, want, scsi, life, pin" ], [ "+++ okz, 75u, rlk, bhj, 2tm, 34u, 7ey, qax, giz, 145
--- win, fij, chz, think, b8g, final, play, r8f, 147, nuy", "+++
--- chz, think, play, could, better, still, right, 7ez, 6um, 6ei", "+++
--- chz, includ, type, driver, 7ez, 6um, 6ei, look, b8e, 2tm", "+++
--- chz, think, could, better, want, still, right, 7ez, 6um, 6ei", "+++
--- chz, think, could, want, evid, right, 7ez, 6um, 6ei, b8e", "+++ israel
--- chz, think, jew, could, presid, arab, want, right, 7ez, 6um", "+++
--- chz, think, could, believ, christian, follow, want, right, 7ez, 6um", "+++
--- chz, think, could, austin, nec, wire, ibm, want, 7ez, 6um", "+++
--- chz, think, could, believ, christian, apr, want, life, evid, 7ez", "+++
--- chz, think, could, right, carri, 7ez, 6um, 6ei, b8e, 2tm", "+++
--- chz, satellit, technolog, includ, cmu, pittsburgh, 7ez, 6um, center, 6ei", "+++
--- chz, includ, rule, 7ez, 6um, 6ei, b8e, 2tm, file, public", "+++
--- chz, think, could, technolog, want, 7ez, 6um, 6ei, b8e, michael", "+++
--- chz, think, could, includ, ibm, want, 7ez, 6um, 6ei, look", "+++
--- chz, instal, scsi, pin, mous, driver, 7ez, 6um, 6ei, cach" ], [ "+++ think, new, play, pittsburgh, see, year
--- better, want, scsi, driver, right, 6um, 6ei, look, 2tm, leaf", "+++ think, play, power, better, run, back, take, right, also, look
--- win, sinc, could, columbia, system, basebal, speed, pitt, defens, still", "+++ tri, work, time, system, problem, comput, driver, need, also, look
--- think, play, includ, type, better, want, scsi, pittsburgh, right, file", "+++ think, system, speed, better, back, want, take, right, also, look
--- sinc, chang, engin, could, play, power, buy, design, anyon, pitt", "+++ tri, sinc, want, think, read, say, time, system, peopl, problem
--- keith, natur, mean, could, play, power, speed, pitt, opinion, better", "+++ tri, want, think, work, say, new, time, system, peopl, need
--- sinc, jew, said, could, play, polit, presid, power, tax, law", "+++ want, think, say, first, time, peopl, see, take, right, make
--- religion, sinc, said, live, could, mean, play, believ, children, system", "+++ good, want, think, work, new, time, power, problem, need, take
--- ground, sinc, could, play, austin, current, system, nec, gov, bnr", "+++ tri, want, think, work, read, say, time, repli, peopl, need
--- true, sinc, mean, could, exampl, play, believ, system, power, speed", "+++ tri, think, read, say, time, peopl, problem, need, right, make
--- sinc, mean, could, play, crimin, system, power, law, speed, pitt", "+++ bank, new, first, pittsburgh, geb, system, comput, gordon, pitt, time
--- think, play, satellit, technolog, includ, better, cmu, want, scsi, driver", "+++ new, read, system, comput
--- think, play, includ, rule, better, want, scsi, pittsburgh, driver, right", "+++ want, think, work, say, first, peopl, see, much, make, year
--- could, play, technolog, better, scsi, pittsburgh, driver, right, look, michael", "+++ good, want, think, work, new, system, comput, need, make, also
--- sinc, simm, price, could, play, includ, power, buy, anyon, ship", "+++ tri, scsi, work, new, system, problem, comput, driver, need, drive
--- motherboard, sinc, think, instal, window, play, interfac, power, devic, set" ], [ "+++ think, new
--- play, could, technolog, want, pittsburgh, right, 6um, 6ei, clipper, 2tm", "+++ think, say, could, new, two, time, may, peopl, right, make
--- win, play, columbia, escrow, technolog, system, power, inform, basebal, law", "+++ data, work, distribut, time, system, inform, comput, need, also, bit
--- think, could, technolog, includ, type, want, driver, right, look, clipper", "+++ think, could, system, want, thing, distribut, two, right, way, also
--- chang, engin, escrow, technolog, inform, buy, design, law, anyon, speed", "+++ want, think, thing, could, say, time, system, peopl, right, make
--- keith, natur, sinc, mean, escrow, technolog, inform, law, opinion, realli", "+++ think, could, system, law, want, two, right, way, also, clipper
--- jew, said, escrow, polit, technolog, presid, inform, tax, propos, war", "+++ want, think, thing, could, say, two, time, may, peopl, law
--- religion, said, live, mean, escrow, believ, children, technolog, system, inform", "+++ want, think, work, thing, distribut, could, two, new, may, time
--- ground, austin, current, escrow, technolog, system, power, inform, nec, gov", "+++ want, think, work, thing, could, say, time, may, peopl, need
--- true, mean, exampl, escrow, believ, technolog, system, inform, law, realli", "+++ think, thing, distribut, could, say, state, govern, peopl, gun, need
--- mean, escrow, technolog, crimin, system, inform, arm, defens, data, want", "+++ data, new, technolog, may, gener, system, inform, comput, time, number
--- think, could, satellit, includ, cmu, want, pittsburgh, right, center, clipper", "+++ secur, distribut, new, number, may, gener, system, inform, comput, anonym
--- section, request, think, could, ftp, escrow, includ, technolog, law, buf", "+++ want, think, work, thing, could, say, technolog, peopl, make, way
--- right, clipper, michael, cost, chip, much, public, even, spencer, secur", "+++ want, think, work, distribut, could, new, system, comput, need, make
--- simm, price, escrow, includ, technolog, inform, buy, law, anyon, ship", "+++ data, work, new, two, system, comput, need, also, bit
--- think, instal, could, technolog, want, scsi, pin, mous, driver, right" ], [ "+++ win, think, new
--- play, could, jim, want, pittsburgh, 6um, amanda, 6ei, david, look", "+++ win, think, say, could, new, columbia, may, point, peopl, take
--- men, play, jim, system, power, basebal, anyon, better, run, back", "+++ tri, work, distribut, thank, mail, system, need, time, pleas, look
--- think, could, includ, jim, type, want, driver, amanda, david, file", "+++ think, could, system, anyon, want, john, distribut, take, magnu, look
--- win, chang, men, engin, columbia, jim, buy, design, speed, better", "+++ tri, want, think, say, could, system, point, peopl, take, make
--- keith, natur, sinc, win, men, mean, columbia, jim, anyon, opinion", "+++ tri, want, think, work, say, could, new, system, peopl, need
--- win, men, jew, said, columbia, polit, jim, presid, tax, law", "+++ want, think, say, could, may, point, peopl, world, come, take
--- believ, jim, christian, follow, right, amanda, david, look, even, repli", "+++ want, think, work, distribut, could, new, may, need, take, make
--- win, ground, men, austin, current, columbia, jim, system, power, nec", "+++ tri, want, think, work, say, could, may, point, repli, come
--- true, win, men, mean, exampl, columbia, believ, jim, system, anyon", "+++ tri, think, say, distribut, could, state, peopl, need, make, time
--- win, men, mean, columbia, jim, crimin, system, law, anyon, find", "+++ new, may, system, world, news, time, number, repli
--- think, could, satellit, technolog, includ, jim, cmu, want, pittsburgh, amanda", "+++ new, name, distribut, may, mail, system, key, number
--- think, could, includ, jim, rule, want, amanda, david, look, file", "+++ want, think, work, say, could, peopl, come, make, anyon, access
--- win, men, henri, technolog, columbia, jim, system, gov, prism, lot", "+++ good, want, think, work, distribut, could, new, thank, mail, system
--- win, men, simm, price, columbia, includ, jim, buy, ship, upgrad", "+++ tri, work, new, thank, system, need, pleas
--- think, instal, could, jim, want, scsi, pin, mous, driver, amanda" ], [ "+++ year, think, new
--- georg, play, could, believ, jim, presid, jake, ibm, want, pittsburgh", "+++ think, say, could, new, first, time, point, peopl, take, right
--- win, georg, said, mean, play, columbia, believ, jim, presid, power", "+++ packag, tri, work, time, system, comput, also, look
--- georg, think, could, includ, believ, jim, presid, type, jake, ibm", "+++ want, think, work, thing, could, say, new, time, system, point
--- georg, chang, said, engin, mean, believ, jim, presid, tax, buy", "+++ tri, want, think, thing, mean, could, say, time, system, point
--- keith, natur, sinc, georg, said, believ, jim, presid, tax, jake", "+++ tri, want, think, said, work, could, say, clinton, new, time
--- georg, jew, mean, polit, believ, jim, jake, law, propos, war", "+++ want, think, said, mean, could, thing, say, believ, first, time
--- religion, georg, live, children, jim, presid, system, tax, law, jake", "+++ want, think, work, thing, could, new, time, take, question, make
--- georg, ground, said, mean, austin, current, believ, jim, presid, power", "+++ tri, want, think, work, thing, mean, could, say, believ, someth
--- true, georg, said, exampl, jim, presid, system, tax, jake, ibm", "+++ tri, think, thing, mean, could, say, peopl, right, make, well
--- georg, said, believ, jim, crimin, presid, system, tax, law, jake", "+++ new, first, system, comput, news, year, time, also, repli
--- georg, think, could, satellit, technolog, includ, believ, jim, presid, jake", "+++ new, system, comput
--- georg, think, could, includ, believ, jim, presid, jake, rule, ibm", "+++ want, think, work, thing, could, say, first, peopl, come, make
--- georg, henri, said, mean, technolog, believ, jim, presid, system, gov", "+++ want, think, work, could, new, system, comput, make, also, ibm
--- georg, includ, believ, jim, presid, jake, right, origin, talk, made", "+++ tri, work, new, system, comput, also
--- georg, think, instal, could, believ, jim, presid, jake, ibm, want" ], [ "+++
--- think, play, includ, want, pittsburgh, mous, 6um, 6ei, look, machin", "+++ may, look, make, time, also, run
--- think, play, could, includ, better, still, want, mous, right, start", "+++ color, window, includ, version, inform, system, set, server, program, user
--- ftp, unix, type, data, packag, want, distribut, mous, driver, machin", "+++ want, work, time, may, system, problem, need, make, also, look
--- color, think, engin, could, window, includ, version, inform, buy, design", "+++ tri, want, read, time, system, problem, make, also
--- think, could, includ, evid, mous, right, look, machin, jon, case", "+++ tri, want, work, time, system, need, call, make, also
--- think, jew, could, includ, presid, arab, mous, right, look, clipper", "+++ want, may, call, make, time, also
--- think, could, believ, includ, christian, follow, mous, right, look, machin", "+++ want, help, work, time, may, problem, need, run, make, also
--- color, ground, think, window, could, austin, current, includ, system, power", "+++ tri, want, read, work, time, may, need, make, also
--- think, could, believ, includ, christian, apr, life, evid, mous, look", "+++ tri, read, problem, need, make, time, also
--- think, could, includ, want, mous, right, carri, look, machin, keep", "+++ includ, may, system, inform, comput, avail, also, time, program
--- satellit, technolog, cmu, want, pittsburgh, mous, center, look, machin, file", "+++ read, name, code, includ, output, may, system, inform, comput, entri
--- section, request, color, window, ftp, version, pleas, buf, build, set", "+++ make, also, want, program, work
--- think, could, technolog, includ, mous, look, machin, michael, cost, file", "+++ want, softwar, work, thank, includ, mail, system, comput, need, call
--- color, chang, think, simm, window, price, could, version, inform, buy", "+++ bit, tri, help, work, window, thank, mous, system, comput, problem
--- color, motherboard, instal, includ, version, interfac, inform, devic, port, speed" ], [ "+++ see, think, year, new
--- play, could, pittsburgh, 6um, center, 6ei, david, look, 2tm, leaf", "+++ think, could, new, two, first, may, time, point, peopl, see
--- win, request, engin, play, columbia, system, power, mission, basebal, gov", "+++ data, tri, sun, distribut, time, system, also, pleas, look
--- think, could, includ, type, driver, center, david, file, much, probe", "+++ think, engin, could, system, anyon, thing, distribut, two, also, look
--- request, chang, mission, gov, buy, design, speed, better, realli, back", "+++ tri, read, think, thing, could, time, system, point, peopl, see
--- keith, natur, sinc, request, mean, engin, mission, gov, anyon, opinion", "+++ tri, think, could, new, two, time, system, peopl, make, year
--- jew, presid, arab, want, right, center, david, look, clipper, chip", "+++ think, thing, could, first, two, time, may, point, peopl, world
--- religion, request, said, live, mean, engin, believ, children, system, mission", "+++ think, thing, distribut, could, two, new, may, time, gov, much
--- request, ground, engin, austin, current, system, power, mission, nec, bnr", "+++ tri, read, think, thing, could, time, may, point, repli, peopl
--- true, request, mean, exampl, engin, believ, system, mission, gov, anyon", "+++ tri, think, read, thing, distribut, could, time, peopl, make, year
--- right, carri, center, david, look, keep, case, much, public, probe", "+++ engin, system, mission, gov, jpl, data, time, world, news, center
--- request, think, could, satellit, technolog, includ, inform, andrew, anyon, pitt", "+++ request, read, new, distribut, may, 1993, system, list
--- think, could, includ, rule, center, david, look, file, much, public", "+++ moon, think, thing, could, first, space, peopl, gov, see, much
--- request, henri, engin, technolog, earth, system, mission, prism, lot, net", "+++ think, distribut, could, new, system, world, make, anyon, also, pleas
--- includ, ibm, want, center, david, origin, much, probe, need, game", "+++ data, tri, new, two, system, also, pleas
--- think, instal, could, scsi, pin, mous, driver, center, david, cach" ], [ "+++ see, think, year, new
--- play, could, want, pittsburgh, 6um, 6ei, look, motorcycl, 2tm, case", "+++ think, say, could, new, first, time, may, peopl, see, much
--- win, play, columbia, system, power, basebal, jpeg, buy, anyon, lot", "+++ help, work, thank, time, system, comput, need, call, imag, also
--- color, think, window, could, ftp, includ, version, inform, unix, type", "+++ think, could, system, buy, anyon, realli, want, thing, way, also
--- chang, engin, design, jpeg, lot, speed, better, back, alaska, still", "+++ want, think, thing, could, say, time, case, system, peopl, see
--- keith, natur, sinc, mean, jpeg, buy, anyon, lot, opinion, alaska", "+++ want, think, work, say, could, new, time, system, peopl, need
--- jew, said, polit, presid, tax, law, jpeg, buy, anyon, lot", "+++ want, think, thing, could, say, first, time, may, peopl, world
--- religion, said, live, mean, believ, children, system, law, jpeg, buy", "+++ want, bike, think, thing, could, work, help, new, may, time
--- ground, austin, current, system, power, nec, gov, jpeg, buy, bnr", "+++ want, think, work, thing, could, say, time, may, repli, peopl
--- true, mean, exampl, believ, system, jpeg, buy, anyon, lot, christian", "+++ think, thing, could, say, time, case, peopl, need, make, well
--- mean, crimin, system, law, jpeg, buy, anyon, lot, arm, alaska", "+++ new, first, may, system, world, comput, year, time, also, repli
--- think, could, satellit, technolog, includ, cmu, want, pittsburgh, center, look", "+++ new, may, system, comput
--- think, could, includ, rule, want, look, motorcycl, case, file, much", "+++ want, think, work, thing, could, say, first, year, peopl, see
--- henri, technolog, system, gov, prism, jpeg, buy, net, program, pat", "+++ good, want, think, cleveland, work, could, new, thank, system, comput
--- simm, price, includ, jpeg, lot, ship, upgrad, ibm, run, alaska", "+++ help, work, thank, new, system, comput, need, also, pleas
--- think, instal, could, want, scsi, pin, mous, driver, cach, motorcycl" ], [ "+++ see, think, year, new
--- play, could, believ, christian, want, pittsburgh, right, 6um, 6ei, 2tm", "+++ think, say, could, new, time, may, power, peopl, see, take
--- win, shall, mean, play, columbia, believ, basebal, law, free, better", "+++ time, also, support, work
--- think, could, includ, believ, type, christian, want, driver, right, look", "+++ want, think, work, thing, could, say, new, time, may, state
--- shall, chang, engin, mean, believ, system, power, buy, design, law", "+++ think, mean, could, want, thing, take, right, way, also, reason
--- keith, natur, sinc, shall, believ, system, power, law, free, opinion", "+++ think, could, law, want, right, nation, way, also, work, question
--- shall, homosexu, jew, said, mean, polit, believ, presid, system, power", "+++ think, mean, could, believ, law, christian, want, thing, take, right
--- religion, shall, said, live, children, power, free, follow, kill, two", "+++ want, think, work, thing, could, new, time, may, power, take
--- shall, ground, mean, austin, current, believ, nec, gov, law, bnr", "+++ think, mean, could, believ, christian, want, thing, way, also, reason
--- true, shall, exampl, power, law, free, realli, claim, apr, life", "+++ person, think, thing, mean, could, say, state, case, govern, peopl
--- shall, believ, crimin, power, free, arm, christian, defens, want, distribut", "+++ new, may, nation, year, time, also
--- think, could, satellit, technolog, includ, believ, christian, cmu, want, pittsburgh", "+++ part, may, public, new
--- think, could, includ, believ, rule, christian, want, right, case, file", "+++ want, think, work, thing, could, say, peopl, see, much, make
--- shall, henri, mean, technolog, believ, power, gov, prism, law, free", "+++ want, think, work, could, new, make, also, good
--- includ, believ, ibm, christian, right, look, origin, case, much, public", "+++ also, support, work, new
--- think, instal, could, believ, christian, want, scsi, pin, mous, driver" ], [ "+++ year, think, new
--- play, could, satellit, technolog, includ, want, pittsburgh, 6um, center, 6ei", "+++ think, say, could, new, two, first, time, peopl, take, much
--- win, play, satellit, columbia, technolog, includ, system, power, inform, basebal", "+++ work, distribut, includ, time, system, inform, comput, look, problem, call
--- think, could, satellit, technolog, type, want, driver, center, michael, cost", "+++ want, think, work, thing, could, distribut, two, say, new, time
--- chang, engin, satellit, technolog, includ, inform, buy, anyon, msg, speed", "+++ want, think, thing, could, say, time, system, peopl, problem, take
--- keith, natur, sinc, mean, satellit, technolog, includ, inform, design, msg", "+++ want, think, work, say, could, new, two, time, system, peopl
--- jew, said, satellit, technolog, polit, includ, presid, inform, tax, law", "+++ want, think, thing, could, say, two, first, time, peopl, take
--- satellit, technolog, believ, includ, christian, follow, right, center, look, michael", "+++ food, want, think, work, thing, distribut, could, two, new, time
--- ground, satellit, austin, current, technolog, includ, system, power, inform, nec", "+++ want, think, work, thing, could, say, time, scienc, peopl, much
--- satellit, technolog, believ, includ, christian, apr, life, evid, center, look", "+++ think, thing, distribut, could, say, time, peopl, problem, make, year
--- satellit, technolog, includ, want, right, carri, center, look, michael, keep", "+++ satellit, technolog, includ, system, inform, program, launch, gener, comput, center
--- think, engin, could, mission, andrew, gov, design, msg, pitt, jpl", "+++ distribut, new, includ, gener, system, inform, comput, 1993, program
--- think, could, satellit, technolog, rule, want, center, look, michael, cost", "+++ want, think, work, thing, could, michael, technolog, say, first, space
--- henri, satellit, includ, system, inform, gov, prism, design, anyon, lot", "+++ interest, want, think, work, distribut, could, new, includ, system, comput
--- simm, price, satellit, technolog, inform, buy, design, anyon, msg, ship", "+++ work, new, two, system, problem, comput, also
--- think, instal, could, satellit, technolog, includ, want, scsi, pin, mous" ], [ "+++ think, new
--- play, could, want, scsi, pin, pittsburgh, 6um, 6ei, look, 2tm", "+++ think, could, new, two, time, power, make, also, look, good
--- play, better, still, want, scsi, pin, right, start, team, much", "+++ help, work, distribut, thank, time, mail, system, problem, comput, need
--- color, think, simm, window, price, ftp, could, includ, version, power", "+++ car, want, think, work, thing, could, distribut, two, new, time
--- chang, simm, engin, price, power, buy, design, set, speed, better", "+++ want, think, thing, could, time, system, problem, make, also, good
--- scsi, evid, pin, right, look, jon, case, even, caltech, repli", "+++ want, think, work, could, new, two, time, system, need, make
--- simm, jew, said, rom, price, polit, presid, power, tax, law", "+++ want, think, thing, could, two, time, make, also
--- believ, christian, follow, scsi, pin, right, look, even, repli, help", "+++ want, think, work, thing, could, distribut, two, monitor, help, new
--- ground, simm, price, austin, current, system, nec, gov, bnr, set", "+++ want, think, work, thing, could, time, repli, need, make, also
--- true, simm, mean, exampl, price, believ, system, power, set, anyon", "+++ think, thing, distribut, could, time, state, problem, need, make, also
--- simm, mean, price, crimin, system, power, law, set, anyon, arm", "+++ new, system, comput, time, also, repli
--- think, could, satellit, technolog, includ, cmu, want, scsi, pin, pittsburgh", "+++ new, comput, mail, system, distribut
--- think, could, includ, rule, want, scsi, pin, look, file, public", "+++ want, think, work, thing, could, make, anyon, also, usa, good
--- technolog, scsi, pin, look, michael, cost, much, repli, spencer, help", "+++ simm, think, could, price, system, anyon, want, distribut, comput, mac
--- includ, power, buy, set, ship, upgrad, ibm, run, board, scsi", "+++ system, set, board, scsi, pin, two, comput, disk, hard, mac
--- motherboard, think, simm, instal, window, price, could, usa, interfac, power" ], [ "+++ win, think, fan, play, new, divis, season, hockey, nhl, team
--- final, could, basebal, pick, red, devil, better, run, back, defens", "+++ win, think, could, play, basebal, better, run, back, defens, two
--- columbia, power, realli, still, divis, cramer, way, leagu, playoff, mark", "+++ time, also, look, run
--- think, play, could, includ, type, better, driver, right, start, let", "+++ think, could, better, realli, back, two, take, right, way, also
--- win, chang, engin, play, system, basebal, buy, design, anyon, speed", "+++ think, say, could, time, point, see, take, right, make, well
--- keith, natur, sinc, win, mean, play, system, basebal, opinion, better", "+++ think, say, could, new, two, time, right, make, well, even
--- win, jew, said, play, polit, presid, system, basebal, tax, law", "+++ think, say, could, first, two, time, may, point, see, take
--- religion, win, said, live, mean, play, believ, children, basebal, law", "+++ good, think, could, new, two, time, may, got, take, much
--- win, ground, play, austin, current, power, nec, gov, basebal, bnr", "+++ think, say, could, time, may, point, see, much, make, even
--- true, win, mean, exampl, play, believ, basebal, better, christian, claim", "+++ defens, think, say, could, time, right, make, well, year, also
--- play, better, carri, look, start, let, keep, case, team, much", "+++ new, first, may, year, time, also
--- think, play, satellit, technolog, includ, could, better, cmu, pittsburgh, right", "+++ may, new
--- think, play, could, includ, rule, better, right, look, start, let", "+++ think, say, could, first, see, much, make, way, year, also
--- play, technolog, better, want, right, look, start, let, michael, cost", "+++ think, could, new, game, make, best, also, look, run, good
--- win, simm, price, play, includ, system, basebal, buy, anyon, ship", "+++ two, also, run, new
--- think, instal, play, could, better, scsi, pin, mous, driver, right" ] ], "type": "heatmap", "z": [ [ 0.9795918367346939, 0.9583333333333334, 0.5915492957746479, 0.8636363636363636, 0.9690721649484536, 0.9473684210526316, 0.98989898989899, 0.8888888888888888, 0.9583333333333334, 0.9690721649484536, 0.8505747126436781, 0.8235294117647058, 0.9247311827956989, 0.75, 0.8095238095238095 ], [ 0.9795918367346939, 0.8095238095238095, 0.9361702127659575, 0.7654320987654322, 0.7951807228915663, 0.7654320987654322, 0.8095238095238095, 0.8095238095238095, 0.8235294117647058, 0.6666666666666667, 0.9247311827956989, 0.9361702127659575, 0.8888888888888888, 0.8764044943820225, 0.9473684210526316 ], [ 0.9690721649484536, 0.7951807228915663, 0.9583333333333334, 0.7654320987654322, 0.6666666666666667, 0.717948717948718, 0.46153846153846156, 0.8505747126436781, 0.5714285714285714, 0.7951807228915663, 0.9361702127659575, 0.98989898989899, 0.8372093023255813, 0.9130434782608696, 0.9795918367346939 ], [ 0.7654320987654322, 1, 1, 1, 1, 0.98989898989899, 1, 1, 1, 1, 1, 1, 1, 1, 1 ], [ 0.9361702127659575, 0.6842105263157895, 0.8764044943820225, 0.6486486486486487, 0.7804878048780488, 0.8095238095238095, 0.8372093023255813, 0.7804878048780488, 0.7951807228915663, 0.8095238095238095, 0.8505747126436781, 0.9583333333333334, 0.8636363636363636, 0.8505747126436781, 0.7951807228915663 ], [ 0.9795918367346939, 0.8372093023255813, 0.8888888888888888, 0.6666666666666667, 0.8235294117647058, 0.6301369863013699, 0.8095238095238095, 0.8095238095238095, 0.8095238095238095, 0.7654320987654322, 0.8764044943820225, 0.8095238095238095, 0.8505747126436781, 0.8505747126436781, 0.9010989010989011 ], [ 0.9690721649484536, 0.8095238095238095, 0.8888888888888888, 0.6666666666666667, 0.8505747126436781, 0.8372093023255813, 0.8505747126436781, 0.8235294117647058, 0.8235294117647058, 0.8636363636363636, 0.9130434782608696, 0.9130434782608696, 0.8636363636363636, 0.7804878048780488, 0.9247311827956989 ], [ 
0.9690721649484536, 0.8235294117647058, 0.9130434782608696, 0.7654320987654322, 0.7654320987654322, 0.7341772151898734, 0.75, 0.8235294117647058, 0.75, 0.8372093023255813, 0.9010989010989011, 0.9690721649484536, 0.8505747126436781, 0.8636363636363636, 0.9361702127659575 ], [ 1, 0.9361702127659575, 0.3870967741935484, 0.8636363636363636, 0.9130434782608696, 0.9010989010989011, 0.9361702127659575, 0.8636363636363636, 0.9010989010989011, 0.9247311827956989, 0.9010989010989011, 0.7951807228915663, 0.9473684210526316, 0.8235294117647058, 0.8095238095238095 ], [ 0.9583333333333334, 0.8235294117647058, 0.9010989010989011, 0.717948717948718, 0.8505747126436781, 0.8636363636363636, 0.8235294117647058, 0.7951807228915663, 0.8235294117647058, 0.8636363636363636, 0.717948717948718, 0.9130434782608696, 0.7951807228915663, 0.8636363636363636, 0.9247311827956989 ], [ 0.9583333333333334, 0.8095238095238095, 0.8636363636363636, 0.6842105263157895, 0.7951807228915663, 0.7951807228915663, 0.7804878048780488, 0.7654320987654322, 0.75, 0.8235294117647058, 0.8888888888888888, 0.9583333333333334, 0.7654320987654322, 0.7341772151898734, 0.9010989010989011 ], [ 0.9583333333333334, 0.7804878048780488, 0.9583333333333334, 0.7341772151898734, 0.7012987012987013, 0.6842105263157895, 0.6111111111111112, 0.7951807228915663, 0.6842105263157895, 0.7341772151898734, 0.9361702127659575, 0.9583333333333334, 0.8235294117647058, 0.9130434782608696, 0.9583333333333334 ], [ 0.9690721649484536, 0.8235294117647058, 0.8636363636363636, 0.75, 0.8372093023255813, 0.8372093023255813, 0.8372093023255813, 0.75, 0.8505747126436781, 0.8636363636363636, 0.6842105263157895, 0.9010989010989011, 0.75, 0.8235294117647058, 0.9247311827956989 ], [ 0.9795918367346939, 0.8888888888888888, 0.8372093023255813, 0.7341772151898734, 0.8888888888888888, 0.8636363636363636, 0.9130434782608696, 0.7654320987654322, 0.8764044943820225, 0.8505747126436781, 0.9361702127659575, 0.9473684210526316, 0.8888888888888888, 
0.5915492957746479, 0.6666666666666667 ], [ 0.7951807228915663, 0.360655737704918, 0.9583333333333334, 0.7012987012987013, 0.8095238095238095, 0.8372093023255813, 0.7804878048780488, 0.7951807228915663, 0.8095238095238095, 0.8636363636363636, 0.9361702127659575, 0.9795918367346939, 0.8636363636363636, 0.8888888888888888, 0.9583333333333334 ] ] } ], "layout": { "height": 950, "title": "Topic difference (two models)[jaccard distance]", "width": 950, "xaxis": { "title": "topic" }, "yaxis": { "title": "topic" } } }, "text/html": [ "
" ], "text/vnd.plotly.v1+html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mdiff, annotation = lda_fst.diff(lda_snd, distance='jaccard', num_words=50)\n", "plot_difference(mdiff, title=\"Topic difference (two models)[jaccard distance]\", annotation=annotation)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looking at this matrix, you can find similar and different topics (and relevant tokens which describe the intersection and difference)." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.3" } }, "nbformat": 4, "nbformat_minor": 2 }