{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Demonstration of the topic coherence pipeline in Gensim" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will be using the `u_mass` and `c_v` coherence for two different LDA models: a \"good\" and a \"bad\" LDA model. The good LDA model will be trained over 50 iterations and the bad one for 1 iteration. Hence in theory, the good LDA model will be able come up with better or more human-understandable topics. Therefore the coherence measure output for the good LDA model should be more (better) than that for the bad LDA model. This is because, simply, the good LDA model usually comes up with better topics that are more human interpretable." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from __future__ import print_function\n", "\n", "import os\n", "import logging\n", "import json\n", "import warnings\n", "\n", "try:\n", " raise ImportError\n", " import pyLDAvis.gensim\n", " CAN_VISUALIZE = True\n", " pyLDAvis.enable_notebook()\n", " from IPython.display import display\n", "except ImportError:\n", " ValueError(\"SKIP: please install pyLDAvis\")\n", " CAN_VISUALIZE = False\n", "\n", "import numpy as np\n", "\n", "from gensim.models import CoherenceModel, LdaModel, HdpModel\n", "from gensim.models.wrappers import LdaVowpalWabbit, LdaMallet\n", "from gensim.corpora import Dictionary\n", "\n", "warnings.filterwarnings('ignore') # To ignore all warnings that arise here to enhance clarity" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set up corpus" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As stated in table 2 from [this](http://www.cs.bham.ac.uk/~pxt/IDA/lsa_ind.pdf) paper, this corpus essentially has two classes of documents. First five are about human-computer interaction and the other four are about graphs. We will be setting up two LDA models. One with 50 iterations of training and the other with just 1. Hence the one with 50 iterations (\"better\" model) should be able to capture this underlying pattern of the corpus better than the \"bad\" LDA model. Therefore, in theory, our topic coherence for the good LDA model should be greater than the one for the bad LDA model." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "texts = [['human', 'interface', 'computer'],\n", " ['survey', 'user', 'computer', 'system', 'response', 'time'],\n", " ['eps', 'user', 'interface', 'system'],\n", " ['system', 'human', 'system', 'eps'],\n", " ['user', 'response', 'time'],\n", " ['trees'],\n", " ['graph', 'trees'],\n", " ['graph', 'minors', 'trees'],\n", " ['graph', 'minors', 'survey']]" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "dictionary = Dictionary(texts)\n", "corpus = [dictionary.doc2bow(text) for text in texts]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set up two topic models" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll be setting up two different LDA Topic models. A good one and bad one. To build a \"good\" topic model, we'll simply train it using more iterations than the bad one. Therefore the `u_mass` coherence should in theory be better for the good model than the bad one since it would be producing more \"human-interpretable\" topics." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "goodLdaModel = LdaModel(corpus=corpus, id2word=dictionary, iterations=50, num_topics=2)\n", "badLdaModel = LdaModel(corpus=corpus, id2word=dictionary, iterations=1, num_topics=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using U_Mass Coherence" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "goodcm = CoherenceModel(model=goodLdaModel, corpus=corpus, dictionary=dictionary, coherence='u_mass')\n", "badcm = CoherenceModel(model=badLdaModel, corpus=corpus, dictionary=dictionary, coherence='u_mass')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### View the pipeline parameters for one coherence model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Following are the pipeline parameters for `u_mass` coherence. By pipeline parameters, we mean the functions being used to calculate segmentation, probability estimation, confirmation measure and aggregation as shown in figure 1 in [this](http://svn.aksw.org/papers/2015/WSDM_Topic_Evaluation/public.pdf) paper." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Coherence_Measure(seg=, prob=, conf=, aggr=)\n" ] } ], "source": [ "print(goodcm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Interpreting the topics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we will see below using LDA visualization, the better model comes up with two topics composed of the following words:\n", "1. goodLdaModel:\n", " - __Topic 1__: More weightage assigned to words such as \"system\", \"user\", \"eps\", \"interface\" etc which captures the first set of documents.\n", " - __Topic 2__: More weightage assigned to words such as \"graph\", \"trees\", \"survey\" which captures the topic in the second set of documents.\n", "2. badLdaModel:\n", " - __Topic 1__: More weightage assigned to words such as \"system\", \"user\", \"trees\", \"graph\" which doesn't make the topic clear enough.\n", " - __Topic 2__: More weightage assigned to words such as \"system\", \"trees\", \"graph\", \"user\" which is similar to the first topic. Hence both topics are not human-interpretable.\n", "\n", "Therefore, the topic coherence for the goodLdaModel should be greater for this than the badLdaModel since the topics it comes up with are more human-interpretable. We will see this using `u_mass` and `c_v` topic coherence measures." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualize topic models" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "if CAN_VISUALIZE:\n", " prepared = pyLDAvis.gensim.prepare(goodLdaModel, corpus, dictionary)\n", " display(pyLDAvis.display(prepared))" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "if CAN_VISUALIZE:\n", " prepared = pyLDAvis.gensim.prepare(badLdaModel, corpus, dictionary)\n", " display(pyLDAvis.display(prepared))" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-14.695344054692296\n", "-14.722989402972397\n" ] } ], "source": [ "print(goodcm.get_coherence())\n", "print(badcm.get_coherence())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using C_V coherence" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "goodcm = CoherenceModel(model=goodLdaModel, texts=texts, dictionary=dictionary, coherence='c_v')\n", "badcm = CoherenceModel(model=badLdaModel, texts=texts, dictionary=dictionary, coherence='c_v')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pipeline parameters for C_V coherence" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Coherence_Measure(seg=, prob=, conf=, aggr=)\n" ] } ], "source": [ "print(goodcm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Print coherence values" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.3838413553737203\n", "0.3838413553737203\n" ] } ], "source": [ "print(goodcm.get_coherence())\n", "print(badcm.get_coherence())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Support for wrappers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This API supports gensim's _ldavowpalwabbit_ and _ldamallet_ wrappers as input parameter to `model`." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "# Replace with path to your Vowpal Wabbit installation\n", "vw_path = '/usr/local/bin/vw'\n", "\n", "# Replace with path to your Mallet installation\n", "home = os.path.expanduser('~')\n", "mallet_path = os.path.join(home, 'mallet-2.0.8', 'bin', 'mallet')" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "ename": "FileNotFoundError", "evalue": "[Errno 2] No such file or directory: '/usr/local/bin/vw': '/usr/local/bin/vw'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mFileNotFoundError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmodel1\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mLdaVowpalWabbit\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mvw_path\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcorpus\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcorpus\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnum_topics\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mid2word\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdictionary\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mpasses\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m50\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mmodel2\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mLdaVowpalWabbit\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mvw_path\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcorpus\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcorpus\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnum_topics\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mid2word\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdictionary\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mpasses\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m~/git/gensim/gensim/models/wrappers/ldavowpalwabbit.py\u001b[0m in \u001b[0;36m__init__\u001b[0;34m(self, vw_path, corpus, num_topics, id2word, chunksize, passes, alpha, eta, decay, offset, gamma_threshold, random_seed, cleanup_files, tmp_prefix)\u001b[0m\n\u001b[1;32m 214\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 215\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mcorpus\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 216\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrain\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcorpus\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 217\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 218\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mtrain\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcorpus\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m~/git/gensim/gensim/models/wrappers/ldavowpalwabbit.py\u001b[0m in \u001b[0;36mtrain\u001b[0;34m(self, corpus)\u001b[0m\n\u001b[1;32m 235\u001b[0m \u001b[0mcmd\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_get_vw_train_command\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcorpus_size\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 236\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 237\u001b[0;31m \u001b[0m_run_vw_command\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcmd\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 238\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 239\u001b[0m \u001b[0;31m# ensure that future updates of this model use correct offset\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m~/git/gensim/gensim/models/wrappers/ldavowpalwabbit.py\u001b[0m in \u001b[0;36m_run_vw_command\u001b[0;34m(cmd)\u001b[0m\n\u001b[1;32m 849\u001b[0m \u001b[0mlogger\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0minfo\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Running Vowpal Wabbit command: %s\"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m' '\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mjoin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcmd\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 850\u001b[0m proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,\n\u001b[0;32m--> 851\u001b[0;31m stderr=subprocess.STDOUT)\n\u001b[0m\u001b[1;32m 852\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mproc\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcommunicate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdecode\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'utf-8'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 853\u001b[0m \u001b[0mlogger\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdebug\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Vowpal Wabbit output: %s\"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0moutput\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/usr/lib/python3.7/subprocess.py\u001b[0m in \u001b[0;36m__init__\u001b[0;34m(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)\u001b[0m\n\u001b[1;32m 773\u001b[0m \u001b[0mc2pread\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mc2pwrite\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 774\u001b[0m \u001b[0merrread\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0merrwrite\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 775\u001b[0;31m restore_signals, start_new_session)\n\u001b[0m\u001b[1;32m 776\u001b[0m \u001b[0;32mexcept\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 777\u001b[0m \u001b[0;31m# Cleanup if the child failed starting.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/usr/lib/python3.7/subprocess.py\u001b[0m in \u001b[0;36m_execute_child\u001b[0;34m(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)\u001b[0m\n\u001b[1;32m 1520\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0merrno_num\u001b[0m \u001b[0;34m==\u001b[0m 
\u001b[0merrno\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mENOENT\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1521\u001b[0m \u001b[0merr_msg\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0;34m': '\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mrepr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0merr_filename\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1522\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mchild_exception_type\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0merrno_num\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0merr_msg\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0merr_filename\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1523\u001b[0m \u001b[0;32mraise\u001b[0m \u001b[0mchild_exception_type\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0merr_msg\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1524\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: '/usr/local/bin/vw': '/usr/local/bin/vw'" ] } ], "source": [ "model1 = LdaVowpalWabbit(vw_path, corpus=corpus, num_topics=2, id2word=dictionary, passes=50)\n", "model2 = LdaVowpalWabbit(vw_path, corpus=corpus, num_topics=2, id2word=dictionary, passes=1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cm1 = CoherenceModel(model=model1, corpus=corpus, coherence='u_mass')\n", "cm2 = CoherenceModel(model=model2, corpus=corpus, coherence='u_mass')\n", "print(cm1.get_coherence())\n", "print(cm2.get_coherence())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model1 = LdaMallet(mallet_path, corpus=corpus, num_topics=2, id2word=dictionary, iterations=50)\n", "model2 = LdaMallet(mallet_path, corpus=corpus, num_topics=2, id2word=dictionary, iterations=1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cm1 = CoherenceModel(model=model1, texts=texts, coherence='c_v')\n", "cm2 = CoherenceModel(model=model2, texts=texts, coherence='c_v')\n", "print(cm1.get_coherence())\n", "print(cm2.get_coherence())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Support for other topic models\n", "The gensim topic coherence pipeline can be used with other topic models too; only the tokenized `topics` need to be supplied to the pipeline, e.g. with the gensim HDP model:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hm = HdpModel(corpus=corpus, id2word=dictionary)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# To get the topic words from the model\n", "topics = []\n", "for topic_id, topic in hm.show_topics(num_topics=10, formatted=False):\n", "    topic = [word for word, _ in topic]\n", "    topics.append(topic)\n", "topics[:2]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Initialize CoherenceModel using the `topics` parameter\n", "cm = CoherenceModel(topics=topics, corpus=corpus, dictionary=dictionary, coherence='u_mass')\n", "cm.get_coherence()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Conclusion" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we can see, the `u_mass` and `c_v` coherence for the good LDA model is higher (better) than that for the bad LDA model. This is because the good LDA model generally comes up with topics that are more human-interpretable, whereas the badLdaModel fails to separate the two underlying topics and comes up with topics that are not clear to a human. The `u_mass` and `c_v` topic coherences capture this by assigning a number to the interpretability of the topics, as we can see above. Hence these coherence measures can be used to compare different topic models based on their human-interpretability." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.1" } }, "nbformat": 4, "nbformat_minor": 1 }