{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Visualizing Topic clusters\n",
"\n",
"In this notebook, we will learn how to visualize topic clusters using dendrogram. Dendrogram is a tree-structured graph which can be used to visualize the result of a hierarchical clustering calculation. Hierarchical clustering puts individual data points into similarity groups, without prior knowledge of groups. We can use it to explore the topic models and see how the topics are connected to each other in a sequence of successive fusions or divisions that occur in the clustering process."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Using TensorFlow backend.\n"
]
},
{
"data": {
"text/html": [
""
],
"text/vnd.plotly.v1+html": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from gensim.models.ldamodel import LdaModel\n",
"from gensim.corpora import Dictionary\n",
"from gensim.parsing.preprocessing import remove_stopwords, strip_punctuation\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"import re\n",
"\n",
"import plotly.offline as py\n",
"import plotly.graph_objs as go\n",
"\n",
"py.init_notebook_mode()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Train Model\n",
"\n",
"We'll use the [fake news dataset](https://www.kaggle.com/mrisdal/fake-news) from kaggle for this notebook. First step is to preprocess the data and train our topic model using LDA. You can refer to this [notebook](https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/lda_training_tips.ipynb) also for tips and suggestions of pre-processing the text data, and how to train LDA model for getting good results."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"df_fake = pd.read_csv('fake.csv')\n",
"df_fake[['title', 'text', 'language']].head()\n",
"df_fake = df_fake.loc[(pd.notnull(df_fake.text)) & (df_fake.language == 'english')]\n",
"\n",
"# remove stopwords and punctuations\n",
"def preprocess(row):\n",
" return strip_punctuation(remove_stopwords(row.lower()))\n",
" \n",
"df_fake['text'] = df_fake['text'].apply(preprocess)\n",
"\n",
"# Convert data to required input format by LDA\n",
"texts = []\n",
"for line in df_fake.text:\n",
" lowered = line.lower()\n",
" words = re.findall(r'\\w+', lowered, flags=re.UNICODE|re.LOCALE)\n",
" texts.append(words)\n",
"# Create a dictionary representation of the documents.\n",
"dictionary = Dictionary(texts)\n",
"\n",
"# Filter out words that occur less than 2 documents, or more than 30% of the documents.\n",
"dictionary.filter_extremes(no_below=2, no_above=0.4)\n",
"# Bag-of-words representation of the documents.\n",
"corpus_fake = [dictionary.doc2bow(text) for text in texts]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"lda_fake = LdaModel(corpus=corpus_fake, id2word=dictionary, num_topics=35, passes=30, chunksize=1500, iterations=200, alpha='auto')\n",
"lda_fake.save('lda_35')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"lda_fake = LdaModel.load('lda_35')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Basic Dendrogram\n",
"\n",
"Firstly, a distance matrix is calculated to store distance between every topic pair. These distances are then used ascendingly to cluster the topics together whose process is depicted by the dendrogram."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# This input cell contains the modified code from Plotly[1]. \n",
"# It can be removed after PR (https://github.com/plotly/plotly.py/pull/807) gets merged.\n",
"\n",
"# [1] https://github.com/plotly/plotly.py/blob/master/plotly/figure_factory/_dendrogram.py\n",
"\n",
"from collections import OrderedDict\n",
"\n",
"from plotly import exceptions, optional_imports\n",
"from plotly.graph_objs import graph_objs\n",
"\n",
"# Optional imports, may be None for users that only use our core functionality.\n",
"np = optional_imports.get_module('numpy')\n",
"scp = optional_imports.get_module('scipy')\n",
"sch = optional_imports.get_module('scipy.cluster.hierarchy')\n",
"scs = optional_imports.get_module('scipy.spatial')\n",
"\n",
"\n",
"def create_dendrogram(X, orientation=\"bottom\", labels=None,\n",
" colorscale=None, distfun=None,\n",
" linkagefun=lambda x: sch.linkage(x, 'single'),\n",
" annotation=None):\n",
" \"\"\"\n",
" BETA function that returns a dendrogram Plotly figure object.\n",
"\n",
" :param (ndarray) X: Matrix of observations as array of arrays\n",
" :param (str) orientation: 'top', 'right', 'bottom', or 'left'\n",
" :param (list) labels: List of axis category labels(observation labels)\n",
" :param (list) colorscale: Optional colorscale for dendrogram tree\n",
" :param (function) distfun: Function to compute the pairwise distance from\n",
" the observations\n",
" :param (function) linkagefun: Function to compute the linkage matrix from\n",
" the pairwise distances\n",
"\n",
" clusters\n",
"\n",
" Example 1: Simple bottom oriented dendrogram\n",
" ```\n",
" import plotly.plotly as py\n",
" from plotly.figure_factory import create_dendrogram\n",
"\n",
" import numpy as np\n",
"\n",
" X = np.random.rand(10,10)\n",
" dendro = create_dendrogram(X)\n",
" plot_url = py.plot(dendro, filename='simple-dendrogram')\n",
"\n",
" ```\n",
"\n",
" Example 2: Dendrogram to put on the left of the heatmap\n",
" ```\n",
" import plotly.plotly as py\n",
" from plotly.figure_factory import create_dendrogram\n",
"\n",
" import numpy as np\n",
"\n",
" X = np.random.rand(5,5)\n",
" names = ['Jack', 'Oxana', 'John', 'Chelsea', 'Mark']\n",
" dendro = create_dendrogram(X, orientation='right', labels=names)\n",
" dendro['layout'].update({'width':700, 'height':500})\n",
"\n",
" py.iplot(dendro, filename='vertical-dendrogram')\n",
" ```\n",
"\n",
" Example 3: Dendrogram with Pandas\n",
" ```\n",
" import plotly.plotly as py\n",
" from plotly.figure_factory import create_dendrogram\n",
"\n",
" import numpy as np\n",
" import pandas as pd\n",
"\n",
" Index= ['A','B','C','D','E','F','G','H','I','J']\n",
" df = pd.DataFrame(abs(np.random.randn(10, 10)), index=Index)\n",
" fig = create_dendrogram(df, labels=Index)\n",
" url = py.plot(fig, filename='pandas-dendrogram')\n",
" ```\n",
" \"\"\"\n",
" if not scp or not scs or not sch:\n",
" raise ImportError(\"FigureFactory.create_dendrogram requires scipy, \\\n",
" scipy.spatial and scipy.hierarchy\")\n",
"\n",
" s = X.shape\n",
" if len(s) != 2:\n",
" exceptions.PlotlyError(\"X should be 2-dimensional array.\")\n",
"\n",
" if distfun is None:\n",
" distfun = scs.distance.pdist\n",
"\n",
" dendrogram = _Dendrogram(X, orientation, labels, colorscale,\n",
" distfun=distfun, linkagefun=linkagefun,\n",
" annotation=annotation)\n",
"\n",
" return {'layout': dendrogram.layout,\n",
" 'data': dendrogram.data}\n",
"\n",
"\n",
"class _Dendrogram(object):\n",
" \"\"\"Refer to FigureFactory.create_dendrogram() for docstring.\"\"\"\n",
"\n",
" def __init__(self, X, orientation='bottom', labels=None, colorscale=None,\n",
" width=\"100%\", height=\"100%\", xaxis='xaxis', yaxis='yaxis',\n",
" distfun=None,\n",
" linkagefun=lambda x: sch.linkage(x, 'single'),\n",
" annotation=None):\n",
" self.orientation = orientation\n",
" self.labels = labels\n",
" self.xaxis = xaxis\n",
" self.yaxis = yaxis\n",
" self.data = []\n",
" self.leaves = []\n",
" self.sign = {self.xaxis: 1, self.yaxis: 1}\n",
" self.layout = {self.xaxis: {}, self.yaxis: {}}\n",
"\n",
" if self.orientation in ['left', 'bottom']:\n",
" self.sign[self.xaxis] = 1\n",
" else:\n",
" self.sign[self.xaxis] = -1\n",
"\n",
" if self.orientation in ['right', 'bottom']:\n",
" self.sign[self.yaxis] = 1\n",
" else:\n",
" self.sign[self.yaxis] = -1\n",
"\n",
" if distfun is None:\n",
" distfun = scs.distance.pdist\n",
"\n",
" (dd_traces, xvals, yvals,\n",
" ordered_labels, leaves) = self.get_dendrogram_traces(X, colorscale, distfun, linkagefun, annotation)\n",
"\n",
" self.labels = ordered_labels\n",
" self.leaves = leaves\n",
" yvals_flat = yvals.flatten()\n",
" xvals_flat = xvals.flatten()\n",
"\n",
" self.zero_vals = []\n",
"\n",
" for i in range(len(yvals_flat)):\n",
" if yvals_flat[i] == 0.0 and xvals_flat[i] not in self.zero_vals:\n",
" self.zero_vals.append(xvals_flat[i])\n",
"\n",
" self.zero_vals.sort()\n",
"\n",
" self.layout = self.set_figure_layout(width, height)\n",
" self.data = graph_objs.Data(dd_traces)\n",
"\n",
" def get_color_dict(self, colorscale):\n",
" \"\"\"\n",
" Returns colorscale used for dendrogram tree clusters.\n",
"\n",
" :param (list) colorscale: Colors to use for the plot in rgb format.\n",
" :rtype (dict): A dict of default colors mapped to the user colorscale.\n",
"\n",
" \"\"\"\n",
"\n",
" # These are the color codes returned for dendrograms\n",
" # We're replacing them with nicer colors\n",
" d = {'r': 'red',\n",
" 'g': 'green',\n",
" 'b': 'blue',\n",
" 'c': 'cyan',\n",
" 'm': 'magenta',\n",
" 'y': 'yellow',\n",
" 'k': 'black',\n",
" 'w': 'white'}\n",
" default_colors = OrderedDict(sorted(d.items(), key=lambda t: t[0]))\n",
"\n",
" if colorscale is None:\n",
" colorscale = [\n",
" 'rgb(0,116,217)', # blue\n",
" 'rgb(35,205,205)', # cyan\n",
" 'rgb(61,153,112)', # green\n",
" 'rgb(40,35,35)', # black\n",
" 'rgb(133,20,75)', # magenta\n",
" 'rgb(255,65,54)', # red\n",
" 'rgb(255,255,255)', # white\n",
" 'rgb(255,220,0)'] # yellow\n",
"\n",
" for i in range(len(default_colors.keys())):\n",
" k = list(default_colors.keys())[i] # PY3 won't index keys\n",
" if i < len(colorscale):\n",
" default_colors[k] = colorscale[i]\n",
"\n",
" return default_colors\n",
"\n",
" def set_axis_layout(self, axis_key):\n",
" \"\"\"\n",
" Sets and returns default axis object for dendrogram figure.\n",
"\n",
" :param (str) axis_key: E.g., 'xaxis', 'xaxis1', 'yaxis', yaxis1', etc.\n",
" :rtype (dict): An axis_key dictionary with set parameters.\n",
"\n",
" \"\"\"\n",
" axis_defaults = {\n",
" 'type': 'linear',\n",
" 'ticks': 'outside',\n",
" 'mirror': 'allticks',\n",
" 'rangemode': 'tozero',\n",
" 'showticklabels': True,\n",
" 'zeroline': False,\n",
" 'showgrid': False,\n",
" 'showline': True,\n",
" }\n",
"\n",
" if len(self.labels) != 0:\n",
" axis_key_labels = self.xaxis\n",
" if self.orientation in ['left', 'right']:\n",
" axis_key_labels = self.yaxis\n",
" if axis_key_labels not in self.layout:\n",
" self.layout[axis_key_labels] = {}\n",
" self.layout[axis_key_labels]['tickvals'] = \\\n",
" [zv*self.sign[axis_key] for zv in self.zero_vals]\n",
" self.layout[axis_key_labels]['ticktext'] = self.labels\n",
" self.layout[axis_key_labels]['tickmode'] = 'array'\n",
"\n",
" self.layout[axis_key].update(axis_defaults)\n",
"\n",
" return self.layout[axis_key]\n",
"\n",
" def set_figure_layout(self, width, height):\n",
" \"\"\"\n",
" Sets and returns default layout object for dendrogram figure.\n",
"\n",
" \"\"\"\n",
" self.layout.update({\n",
" 'showlegend': False,\n",
" 'autosize': False,\n",
" 'hovermode': 'closest',\n",
" 'width': width,\n",
" 'height': height\n",
" })\n",
"\n",
" self.set_axis_layout(self.xaxis)\n",
" self.set_axis_layout(self.yaxis)\n",
"\n",
" return self.layout\n",
"\n",
" def get_dendrogram_traces(self, X, colorscale, distfun, linkagefun, annotation):\n",
" \"\"\"\n",
" Calculates all the elements needed for plotting a dendrogram.\n",
"\n",
" :param (ndarray) X: Matrix of observations as array of arrays\n",
" :param (list) colorscale: Color scale for dendrogram tree clusters\n",
" :param (function) distfun: Function to compute the pairwise distance\n",
" from the observations\n",
" :param (function) linkagefun: Function to compute the linkage matrix\n",
" from the pairwise distances\n",
" :rtype (tuple): Contains all the traces in the following order:\n",
" (a) trace_list: List of Plotly trace objects for dendrogram tree\n",
" (b) icoord: All X points of the dendrogram tree as array of arrays\n",
" with length 4\n",
" (c) dcoord: All Y points of the dendrogram tree as array of arrays\n",
" with length 4\n",
" (d) ordered_labels: leaf labels in the order they are going to\n",
" appear on the plot\n",
" (e) P['leaves']: left-to-right traversal of the leaves\n",
"\n",
" \"\"\"\n",
" d = distfun(X)\n",
" Z = linkagefun(d)\n",
" P = sch.dendrogram(Z, orientation=self.orientation,\n",
" labels=self.labels, no_plot=True)\n",
"\n",
" icoord = scp.array(P['icoord'])\n",
" dcoord = scp.array(P['dcoord'])\n",
" ordered_labels = scp.array(P['ivl'])\n",
" color_list = scp.array(P['color_list'])\n",
" colors = self.get_color_dict(colorscale)\n",
"\n",
" trace_list = []\n",
"\n",
" for i in range(len(icoord)):\n",
" # xs and ys are arrays of 4 points that make up the '∩' shapes\n",
" # of the dendrogram tree\n",
" if self.orientation in ['top', 'bottom']:\n",
" xs = icoord[i]\n",
" else:\n",
" xs = dcoord[i]\n",
"\n",
" if self.orientation in ['top', 'bottom']:\n",
" ys = dcoord[i]\n",
" else:\n",
" ys = icoord[i]\n",
" color_key = color_list[i]\n",
" text_annotation = None\n",
" if annotation:\n",
" text_annotation = annotation[i]\n",
" trace = graph_objs.Scatter(\n",
" x=np.multiply(self.sign[self.xaxis], xs),\n",
" y=np.multiply(self.sign[self.yaxis], ys),\n",
" mode='lines',\n",
" marker=graph_objs.Marker(color=colors[color_key]),\n",
" text=text_annotation,\n",
" hoverinfo='text'\n",
" )\n",
"\n",
" try:\n",
" x_index = int(self.xaxis[-1])\n",
" except ValueError:\n",
" x_index = ''\n",
"\n",
" try:\n",
" y_index = int(self.yaxis[-1])\n",
" except ValueError:\n",
" y_index = ''\n",
"\n",
" trace['xaxis'] = 'x' + x_index\n",
" trace['yaxis'] = 'y' + y_index\n",
" trace_list.append(trace)\n",
"\n",
" return trace_list, icoord, dcoord, ordered_labels, P['leaves']"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true,
"scrolled": false
},
"outputs": [],
"source": [
"from gensim.matutils import jensen_shannon\n",
"from scipy import spatial as scs\n",
"from scipy.cluster import hierarchy as sch\n",
"from scipy.spatial.distance import pdist, squareform\n",
"\n",
"\n",
"# get topic distributions\n",
"topic_dist = lda_fake.state.get_lambda()\n",
"\n",
"# get topic terms\n",
"num_words = 300\n",
"topic_terms = [{w for (w, _) in lda_fake.show_topic(topic, topn=num_words)} for topic in range(topic_dist.shape[0])]\n",
"\n",
"# no. of terms to display in annotation\n",
"n_ann_terms = 10\n",
"\n",
"# use Jensen-Shannon distance metric in dendrogram\n",
"def js_dist(X):\n",
" return pdist(X, lambda u, v: jensen_shannon(u, v))\n",
"\n",
"# calculate text annotations\n",
"def text_annotation(topic_dist, topic_terms, n_ann_terms):\n",
" # get dendrogram hierarchy data\n",
" linkagefun = lambda x: sch.linkage(x, 'single')\n",
" d = js_dist(topic_dist)\n",
" Z = linkagefun(d)\n",
" P = sch.dendrogram(Z, orientation=\"bottom\", no_plot=True)\n",
"\n",
" # store topic no.(leaves) corresponding to the x-ticks in dendrogram\n",
" x_ticks = np.arange(5, len(P['leaves']) * 10 + 5, 10)\n",
" x_topic = dict(zip(P['leaves'], x_ticks))\n",
"\n",
" # store {topic no.:topic terms}\n",
" topic_vals = dict()\n",
" for key, val in x_topic.items():\n",
" topic_vals[val] = (topic_terms[key], topic_terms[key])\n",
"\n",
" text_annotations = []\n",
" # loop through every trace (scatter plot) in dendrogram\n",
" for trace in P['icoord']:\n",
" fst_topic = topic_vals[trace[0]]\n",
" scnd_topic = topic_vals[trace[2]]\n",
" \n",
" # annotation for two ends of current trace\n",
" pos_tokens_t1 = list(fst_topic[0])[:min(len(fst_topic[0]), n_ann_terms)]\n",
" neg_tokens_t1 = list(fst_topic[1])[:min(len(fst_topic[1]), n_ann_terms)]\n",
"\n",
" pos_tokens_t4 = list(scnd_topic[0])[:min(len(scnd_topic[0]), n_ann_terms)]\n",
" neg_tokens_t4 = list(scnd_topic[1])[:min(len(scnd_topic[1]), n_ann_terms)]\n",
"\n",
" t1 = \"
\".join((\": \".join((\"+++\", str(pos_tokens_t1))), \": \".join((\"---\", str(neg_tokens_t1)))))\n",
" t2 = t3 = ()\n",
" t4 = \"
\".join((\": \".join((\"+++\", str(pos_tokens_t4))), \": \".join((\"---\", str(neg_tokens_t4)))))\n",
"\n",
" # show topic terms in leaves\n",
" if trace[0] in x_ticks:\n",
" t1 = str(list(topic_vals[trace[0]][0])[:n_ann_terms])\n",
" if trace[2] in x_ticks:\n",
" t4 = str(list(topic_vals[trace[2]][0])[:n_ann_terms])\n",
"\n",
" text_annotations.append([t1, t2, t3, t4])\n",
"\n",
" # calculate intersecting/diff for upper level\n",
" intersecting = fst_topic[0] & scnd_topic[0]\n",
" different = fst_topic[0].symmetric_difference(scnd_topic[0])\n",
"\n",
" center = (trace[0] + trace[2]) / 2\n",
" topic_vals[center] = (intersecting, different)\n",
"\n",
" # remove trace value after it is annotated\n",
" topic_vals.pop(trace[0], None)\n",
" topic_vals.pop(trace[2], None) \n",
" \n",
" return text_annotations"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"application/vnd.plotly.v1+json": {
"data": [
{
"hoverinfo": "text",
"marker": {
"color": "rgb(61,153,112)"
},
"mode": "lines",
"text": [
"['they', 'attorney', 'robert', 'refuge', 'project', 'obama', 'same', 'following', 'year', 'news']",
[],
[],
"['70', 'they', 'currently', 'german', 'course', 'past', 'following', 'year', 'race', 'news']"
],
"type": "scatter",
"x": [
155,
155,
165,
165
],
"xaxis": "x",
"y": [
0,
0.30135198617159836,
0.30135198617159836,
0
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(255,65,54)"
},
"mode": "lines",
"text": [
"['they', 'collusion', 'now', 'mails', 'obama', 'knows', 'year', 'following', 'news', 'don']",
[],
[],
"['attorney', 'appear', 'rules', 'now', 'obama', 'following', 'year', 'news', 'don', 'abedin']"
],
"type": "scatter",
"x": [
205,
205,
215,
215
],
"xaxis": "x",
"y": [
0,
0.20433085451924773,
0.20433085451924773,
0
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(255,65,54)"
},
"mode": "lines",
"text": [
"['war', 'course', 'past', 'pentagon', 'russians', 'obama', 'year', 'despite', 'news', 'tensions']",
[],
[],
"['war', 'they', 'german', 'course', 'enemy', 'past', 'exceptional', 'eastern', 'pentagon', 'now']"
],
"type": "scatter",
"x": [
245,
245,
255,
255
],
"xaxis": "x",
"y": [
0,
0.21336308183100594,
0.21336308183100594,
0
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(255,65,54)"
},
"mode": "lines",
"text": [
"['war', 'they', 'course', 'past', 'now', 'speak', 'obama', 'worse', 'year', 'don']",
[],
[],
"['they', 'voice', 'turn', 'thank', 'course', 'fathers', 'past', 'baseball', 'now', 'speak']"
],
"type": "scatter",
"x": [
265,
265,
275,
275
],
"xaxis": "x",
"y": [
0,
0.21747072170909557,
0.21747072170909557,
0
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(255,65,54)"
},
"mode": "lines",
"text": [
"+++: ['war', 'course', 'past', 'pentagon', 'obama', 'year', 'news', 'real', 'recent', 'allies']
---: ['they', 'enemy', 'russians', 'deliveries', 'culture', 'following', 'gaddafi', 'away', 'european', 'ministry']",
[],
[],
"+++: ['they', 'course', 'past', 'now', 'speak', 'year', 'don', 'news', 'real', 'lot']
---: ['voice', 'turn', 'thank', 'fathers', 'war', 'baseball', 'obama', 'worse', 'culture', 'book']"
],
"type": "scatter",
"x": [
250,
250,
270,
270
],
"xaxis": "x",
"y": [
0.21336308183100594,
0.21804251459209004,
0.21804251459209004,
0.21747072170909557
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(255,65,54)"
},
"mode": "lines",
"text": [
"['they', 'beings', 'turn', 'war', 'course', 'balance', 'past', 'knowledge', 'needed', 'now']",
[],
[],
"+++: ['course', 'past', 'us', 'year', 'news', 'real', 'nation', 'fact', 'media', 'day']
---: ['they', 'war', 'pentagon', 'now', 'speak', 'obama', 'don', 'recent', 'lot', 'allies']"
],
"type": "scatter",
"x": [
235,
235,
260,
260
],
"xaxis": "x",
"y": [
0,
0.23375056044071596,
0.23375056044071596,
0.21804251459209004
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(255,65,54)"
},
"mode": "lines",
"text": [
"['they', 'elect', 'tape', 'now', 'obama', 'following', 'year', 'race', 'news', 'don']",
[],
[],
"+++: ['course', 'past', 'us', 'year', 'real', 'fact', 'day', 'power', 'want', 'good']
---: ['they', 'beings', 'turn', 'war', 'balance', 'knowledge', 'now', 'obama', 'possibilities', 'news']"
],
"type": "scatter",
"x": [
225,
225,
247.5,
247.5
],
"xaxis": "x",
"y": [
0,
0.26407808018236467,
0.26407808018236467,
0.23375056044071596
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(255,65,54)"
},
"mode": "lines",
"text": [
"+++: ['now', 'obama', 'following', 'year', 'news', 'don', 'abedin', 'fact', 'media', 'day']
---: ['they', 'rules', 'mails', 'contact', 'recent', 'facts', '11', 'previous', 'coming', 'having']",
[],
[],
"+++: ['year', 'real', 'fact', 'day', 'power', 'want', 'good', 'long', 'history', 'america']
---: ['they', 'course', 'past', 'elect', 'tape', 'now', 'obama', 'following', 'race', 'news']"
],
"type": "scatter",
"x": [
210,
210,
236.25,
236.25
],
"xaxis": "x",
"y": [
0.20433085451924773,
0.2683711916292657,
0.2683711916292657,
0.26407808018236467
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(255,65,54)"
},
"mode": "lines",
"text": [
"['war', 'bashar', 'republic', 'past', 'pentagon', 'eastern', 'areas', 'obama', 'ankara', 'year']",
[],
[],
"+++: ['going', 'the', 'know', 'long', 'right', 'years', 'year', 'that', 'think', 'state']
---: ['now', 'obama', 'following', 'news', 'don', 'real', 'abedin', 'media', 'com', 'scandal']"
],
"type": "scatter",
"x": [
195,
195,
223.125,
223.125
],
"xaxis": "x",
"y": [
0,
0.2686458161812807,
0.2686458161812807,
0.2683711916292657
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(255,65,54)"
},
"mode": "lines",
"text": [
"['war', 'uk', 'currently', 'youth', 'past', 'eastern', 'areas', 'khamenei', 'project', 'year']",
[],
[],
"+++: ['the', 'years', 'long', 'year', 'that', 'think', 'state', 'we', 'way', 'and']
---: ['war', 'past', 'pentagon', 'obama', 'ankara', 'news', 'saddam', '11', 'media', 'ministry']"
],
"type": "scatter",
"x": [
185,
185,
209.0625,
209.0625
],
"xaxis": "x",
"y": [
0,
0.3066587417203187,
0.3066587417203187,
0.2686458161812807
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(255,65,54)"
},
"mode": "lines",
"text": [
"['currently', 'balance', 'provided', 'now', 'project', 'april', 'areas', 'year', 'following', 'don']",
[],
[],
"['they', 'past', 'gaap', 'rules', 'estate', 'obama', 'year', 'poor', 'despite', 'lance']"
],
"type": "scatter",
"x": [
285,
285,
295,
295
],
"xaxis": "x",
"y": [
0,
0.3129272952316177,
0.3129272952316177,
0
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(255,65,54)"
},
"mode": "lines",
"text": [
"+++: ['the', 'years', 'long', 'year', 'state', 'we', 'way', 'fact']
---: ['war', 'currently', 'medicines', 'youth', 'past', 'eastern', 'areas', 'khamenei', 'project', 'targets']",
[],
[],
"+++: ['year', 'real', 'fact', 'end', 'long', 'come', 'public', '000', 'way', 'immigration']
---: ['they', 'balance', 'past', 'rules', 'estate', 'obama', 'april', 'following', 'poor', 'recent']"
],
"type": "scatter",
"x": [
197.03125,
197.03125,
290,
290
],
"xaxis": "x",
"y": [
0.3066587417203187,
0.3132561331406368,
0.3132561331406368,
0.3129272952316177
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['war', 'republic', 'german', 'diplomacy', 'course', 'eastern', 'project', 'year', 'despite', 'culture']",
[],
[],
"+++: ['the', 'years', 'long', 'year', 'state', 'we', 'way', 'fact']
---: ['real', 'president', 'power', 'end', 'major', 'business', 'think', 'come', 'public', '000']"
],
"type": "scatter",
"x": [
175,
175,
243.515625,
243.515625
],
"xaxis": "x",
"y": [
0,
0.3314790813616175,
0.3314790813616175,
0.3132561331406368
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"+++: ['they', 'following', 'year', 'news', 'don', 'recent', 'local', 'media', 'day', 'com']
---: ['course', 'past', 'robert', 'refuge', 'obama', 'race', 'nov', '12', 'fact', 'away']",
[],
[],
"+++: ['the', 'years', 'long', 'year', 'state', 'we', 'way', 'fact']
---: ['war', 'republic', 'german', 'diplomacy', 'course', 'eastern', 'project', 'despite', 'culture', 'theresa']"
],
"type": "scatter",
"x": [
160,
160,
209.2578125,
209.2578125
],
"xaxis": "x",
"y": [
0.30135198617159836,
0.33197925073991885,
0.33197925073991885,
0.3314790813616175
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['dogs', 'they', 'youtube', 'zuckerberg', 'halloween', 'now', 'project', 'year', 'standing', 'news']",
[],
[],
"+++: ['state', 'the', 'we', 'year', 'way']
---: ['they', 'following', 'news', 'don', 'recent', 'local', 'fact', 'media', 'day', 'com']"
],
"type": "scatter",
"x": [
145,
145,
184.62890625,
184.62890625
],
"xaxis": "x",
"y": [
0,
0.332106851071544,
0.332106851071544,
0.33197925073991885
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['they', 'eye', 'junk', 'limited', 'patient', 'year', 'don', 'contact', 'lot', 'previous']",
[],
[],
"+++: ['state', 'the', 'we', 'year', 'way']
---: ['dogs', 'they', 'youtube', 'zuckerberg', 'halloween', 'now', 'project', 'standing', 'news', 'music']"
],
"type": "scatter",
"x": [
135,
135,
164.814453125,
164.814453125
],
"xaxis": "x",
"y": [
0,
0.34173581106417317,
0.34173581106417317,
0.332106851071544
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['cis', 'span', 'block', 'concentrations', 'relief', 'year', 'news', 'strong', 'fact', 'media']",
[],
[],
"['pregnant', 'balance', 'nutrient', 'areas', 'application', 'year', 'don', 'news', 'logs', 'day']"
],
"type": "scatter",
"x": [
305,
305,
315,
315
],
"xaxis": "x",
"y": [
0,
0.3422402090650555,
0.3422402090650555,
0
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"+++: ['the', 'we', 'year', 'way']
---: ['they', 'eye', 'junk', 'limited', 'patient', 'don', 'contact', 'lot', 'previous', 'day']",
[],
[],
"+++: ['year', 'news', 'day', 'com', 'skin', 'companies', 'risk', '000', 'way', 'levels']
---: ['pregnant', 'cis', 'span', 'balance', 'block', 'concentrations', 'fact', 'media', 'simple', 'having']"
],
"type": "scatter",
"x": [
149.9072265625,
149.9072265625,
310,
310
],
"xaxis": "x",
"y": [
0.34173581106417317,
0.34382594704496805,
0.34382594704496805,
0.3422402090650555
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['digital', 'span', 'transactions', 'past', 'block', 'provided', 'year', 'rate', 'news', 'loans']",
[],
[],
"+++: ['the', 'year', 'way']
---: ['natural', '20', 'news', 'worldtruth', 'day', 'com', 'skin', 'especially', 'called', 'research']"
],
"type": "scatter",
"x": [
125,
125,
229.95361328125,
229.95361328125
],
"xaxis": "x",
"y": [
0,
0.3494192755733978,
0.3494192755733978,
0.34382594704496805
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['they', 'loves', 'youtube', 'bedroom', 'now', 'year', 'news', 'don', 'truck', 'club']",
[],
[],
"['they', 'youtube', 'julian', 'teacher', 'now', 'year', 'book', 'don', 'consensual', 'real']"
],
"type": "scatter",
"x": [
335,
335,
345,
345
],
"xaxis": "x",
"y": [
0,
0.33530166746511025,
0.33530166746511025,
0
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['digital', 'pregnant', 'course', 'alex', 'now', 'knows', 'year', 'book', 'don', 'news']",
[],
[],
"+++: ['they', 'youtube', 'now', 'year', 'don', 'away', 'media', 'day', 'com', 'share']
---: ['loves', 'book', 'consensual', 'pathological', 'news', 'horrific', '12', '11', 'door', 'porn']"
],
"type": "scatter",
"x": [
325,
325,
340,
340
],
"xaxis": "x",
"y": [
0,
0.35129158808551997,
0.35129158808551997,
0.33530166746511025
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"+++: ['the', 'year']
---: ['digital', 'span', 'transactions', 'past', 'block', 'provided', 'rate', 'news', 'loans', 'recent']",
[],
[],
"+++: ['now', 'year', 'up', 'don', 'life', 'got', 'there', 'away', 'media', 'want']
---: ['they', 'digital', 'pregnant', 'youtube', 'course', 'alex', 'book', 'news', '11', 'fact']"
],
"type": "scatter",
"x": [
177.476806640625,
177.476806640625,
332.5,
332.5
],
"xaxis": "x",
"y": [
0.3494192755733978,
0.3513828465304287,
0.3513828465304287,
0.35129158808551997
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['beings', 'course', 'past', 'strings', 'station', 'now', 'following', 'year', 'exciting', 'index']",
[],
[],
"+++: ['the', 'year']
---: ['now', 'up', 'don', 'life', 'got', 'there', 'away', 'media', 'want', 'com']"
],
"type": "scatter",
"x": [
115,
115,
254.9884033203125,
254.9884033203125
],
"xaxis": "x",
"y": [
0,
0.36207531185782427,
0.36207531185782427,
0.3513828465304287
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['war', 'beings', 'knowledge', 'robert', 'accepted', 'integrity', 'echo', '1972', 'cube', 'culture']",
[],
[],
"+++: ['the', 'year']
---: ['beings', 'course', 'past', 'strings', 'station', 'now', 'following', 'exciting', 'index', 'spotted']"
],
"type": "scatter",
"x": [
105,
105,
184.99420166015625,
184.99420166015625
],
"xaxis": "x",
"y": [
0,
0.3721623243723292,
0.3721623243723292,
0.36207531185782427
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['ordinary', 'turn', 'course', 'rules', 'now', 'obama', 'same', 'murderer', 'race', 'news']",
[],
[],
"+++: ['the', 'year']
---: ['war', 'beings', 'knowledge', 'robert', 'accepted', 'integrity', 'echo', '1972', 'cube', 'culture']"
],
"type": "scatter",
"x": [
95,
95,
144.99710083007812,
144.99710083007812
],
"xaxis": "x",
"y": [
0,
0.37728501615310084,
0.37728501615310084,
0.3721623243723292
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['war', 'they', 'eti', 'jihad', 'obama', 'year', 'recep', 'news', 'race', 'previous']",
[],
[],
"+++: ['the', 'year']
---: ['ordinary', 'turn', 'course', 'rules', 'now', 'obama', 'same', 'following', 'murderer', 'race']"
],
"type": "scatter",
"x": [
85,
85,
119.99855041503906,
119.99855041503906
],
"xaxis": "x",
"y": [
0,
0.38032474705740754,
0.38032474705740754,
0.37728501615310084
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['44', 'canterbury', '47', 'uk', 'buildings', 'alex', 'toxicology', 'menstrual', 'poetic', 'following']",
[],
[],
"+++: ['the', 'year']
---: ['war', 'they', 'eti', 'jihad', 'obama', 'recep', 'news', 'race', 'previous', 'fact']"
],
"type": "scatter",
"x": [
75,
75,
102.49927520751953,
102.49927520751953
],
"xaxis": "x",
"y": [
0,
0.3839725093065638,
0.3839725093065638,
0.38032474705740754
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['moldova', 'jewelry', 'kejriwal', 'bulgaria', 'eastern', 'travelers', 'highlights', 'year', 'activity', 'news']",
[],
[],
"+++: ['the', 'year']
---: ['44', 'canterbury', 'sean', '18', 'buildings', 'alex', 'toxicology', 'menstrual', 'poetic', 'following']"
],
"type": "scatter",
"x": [
65,
65,
88.74963760375977,
88.74963760375977
],
"xaxis": "x",
"y": [
0,
0.39811995565364255,
0.39811995565364255,
0.3839725093065638
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['willie', 'baxter', 'lobbying', 'past', 'fix', 'provided', 'epidemic', 'patient', 'obama', 'worse']",
[],
[],
"+++: ['the', 'year']
---: ['moldova', 'jewelry', 'bulgaria', 'kejriwal', 'eastern', 'travelers', 'highlights', 'activity', 'news', 'sevastopol']"
],
"type": "scatter",
"x": [
55,
55,
76.87481880187988,
76.87481880187988
],
"xaxis": "x",
"y": [
0,
0.4050397350268663,
0.4050397350268663,
0.39811995565364255
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['war', 'slat', 'evacuation', 'realizing', 'silsby', 'discharged', 'now', 'emitting', 'april', 'year']",
[],
[],
"+++: ['the', 'year']
---: ['willie', 'baxter', 'lobbying', 'past', 'fix', 'provided', 'epidemic', 'patient', 'obama', 'worse']"
],
"type": "scatter",
"x": [
45,
45,
65.93740940093994,
65.93740940093994
],
"xaxis": "x",
"y": [
0,
0.4066990714457525,
0.4066990714457525,
0.4050397350268663
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['uk', 'course', 'humic', 'block', 'eaten', 'year', 'despite', 'news', 'don', 'strong']",
[],
[],
"+++: ['the', 'year']
---: ['war', 'slat', 'evacuation', 'realizing', 'silsby', 'discharged', 'now', 'emitting', 'april', '04']"
],
"type": "scatter",
"x": [
35,
35,
55.46870470046997,
55.46870470046997
],
"xaxis": "x",
"y": [
0,
0.40990818502935716,
0.40990818502935716,
0.4066990714457525
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['they', 'singh', 'millennia', 'slater', 'cookies', 'baxter', 'lord', 'diamond', 'now', 'wine']",
[],
[],
"+++: ['the', 'year']
---: ['fulvic', 'course', 'humic', 'block', 'eaten', 'despite', 'news', 'don', 'strong', 'proposition']"
],
"type": "scatter",
"x": [
25,
25,
45.234352350234985,
45.234352350234985
],
"xaxis": "x",
"y": [
0,
0.4228019635629176,
0.4228019635629176,
0.40990818502935716
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['pyramids', 'emitted', 'block', 'departments', 'pinging', 'standing', 'news', 'coppola', 'electromagnetic', 'fibonacci']",
[],
[],
"+++: ['the', 'year']
---: ['they', 'singh', 'millennia', 'slater', 'cookies', 'baxter', 'lord', 'diamond', 'now', 'wine']"
],
"type": "scatter",
"x": [
15,
15,
35.11717617511749,
35.11717617511749
],
"xaxis": "x",
"y": [
0,
0.4379139957373382,
0.4379139957373382,
0.4228019635629176
],
"yaxis": "y"
},
{
"hoverinfo": "text",
"marker": {
"color": "rgb(0,116,217)"
},
"mode": "lines",
"text": [
"['war', 'jpg', 'baghdadi', 'quds', '2l', 'eastern', 'buildings', 'areas', 'april', 'hashd']",
[],
[],
"+++: ['the']
---: ['pyramids', 'emitted', 'block', 'departments', 'pinging', 'year', 'standing', 'news', 'coppola', 'electromagnetic']"
],
"type": "scatter",
"x": [
5,
5,
25.058588087558746,
25.058588087558746
],
"xaxis": "x",
"y": [
0,
0.4614169441220397,
0.4614169441220397,
0.4379139957373382
],
"yaxis": "y"
}
],
"layout": {
"autosize": false,
"height": 600,
"hovermode": "closest",
"showlegend": false,
"width": 1000,
"xaxis": {
"mirror": "allticks",
"rangemode": "tozero",
"showgrid": false,
"showline": true,
"showticklabels": true,
"tickmode": "array",
"ticks": "outside",
"ticktext": [
19,
16,
2,
10,
6,
23,
20,
31,
15,
14,
28,
13,
7,
9,
33,
12,
25,
34,
5,
32,
4,
30,
11,
35,
18,
24,
17,
29,
3,
22,
21,
26,
27,
1,
8
],
"tickvals": [
5,
15,
25,
35,
45,
55,
65,
75,
85,
95,
105,
115,
125,
135,
145,
155,
165,
175,
185,
195,
205,
215,
225,
235,
245,
255,
265,
275,
285,
295,
305,
315,
325,
335,
345
],
"type": "linear",
"zeroline": false
},
"yaxis": {
"mirror": "allticks",
"rangemode": "tozero",
"showgrid": false,
"showline": true,
"showticklabels": true,
"ticks": "outside",
"type": "linear",
"zeroline": false
}
}
},
"text/html": [
"