{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## HW 5 Markov and RNN\n", "\n", "Based on examples and notes by [Allison Parrish](https://github.com/aparrish/rwet/blob/master/neural-network-text-prediction-with-textgenrnn.ipynb).\n", "The source text is the Wikipedia entry for truth.\n", "\n", "### To adjust how probable the output is, play with **temperature=**\n", "* temperature = 0.2 (default)\n", "* temperature = 0.8 (improbable)\n", "* temperature > 1 (WTF)\n", "\n", "### To set the number of outputs, pass n\n", "* generate(10)\n", "\n", "### Epochs to train\n", "* num_epochs: more is better (e.g. 20), especially for shorter texts\n", "\n", "### Passing a Prefix\n", "* prefix= makes the generated text start with that word\n", "\n", "### Returning as list\n", "* By default, textgen.generate() returns None, which is not very useful since it isn't a string or a list. Passing **return_as_list=True** makes textgen.generate() return the values as a list. \n", "\n", "\n", "*note: textgenrnn is less effective when training on/predicting longer sequences (> 200 characters). 
Likewise textgenrnn is less effective when training on/predicting texts with very disparate grammatical styles.*" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Solving environment: done\n", "\n", "## Package Plan ##\n", "\n", " environment location: /Users/SebastianRev1/anaconda3\n", "\n", " added / updated specs: \n", " - keras\n", "\n", "\n", "The following packages will be downloaded:\n", "\n", " package | build\n", " ---------------------------|-----------------\n", " numpy-1.12.1 | py36h8871d66_1 3.6 MB\n", " ca-certificates-2018.03.07 | 0 124 KB\n", " keras-2.1.5 | py36_0 490 KB\n", " certifi-2018.1.18 | py36_0 143 KB\n", " protobuf-3.5.1 | py36h0a44026_0 589 KB\n", " tensorflow-1.1.0 | np112py36_0 23.6 MB\n", " openssl-1.0.2o | h26aff7b_0 3.4 MB\n", " libprotobuf-3.5.1 | h2cd40f5_0 4.0 MB\n", " numba-0.37.0 |np112py36hb493f12_0 2.3 MB\n", " ------------------------------------------------------------\n", " Total: 38.3 MB\n", "\n", "The following NEW packages will be INSTALLED:\n", "\n", " keras: 2.1.5-py36_0 \n", " libprotobuf: 3.5.1-h2cd40f5_0 \n", " protobuf: 3.5.1-py36h0a44026_0 \n", " tensorflow: 1.1.0-np112py36_0 \n", "\n", "The following packages will be UPDATED:\n", "\n", " ca-certificates: 2017.08.26-ha1e5d58_0 --> 2018.03.07-0 \n", " certifi: 2017.7.27.1-py36hd973bb6_0 --> 2018.1.18-py36_0 \n", " numba: 0.37.0-np114py36h210bcc1_0 --> 0.37.0-np112py36hb493f12_0\n", " openssl: 1.0.2l-h57f3a61_2 --> 1.0.2o-h26aff7b_0 \n", "\n", "The following packages will be DOWNGRADED:\n", "\n", " numpy: 1.14.2-py36ha9ae307_1 --> 1.12.1-py36h8871d66_1 \n", "\n", "\n", "Downloading and Extracting Packages\n", "numpy 1.12.1: ########################################################## | 100% \n", "ca-certificates 2018.03.07: ############################################ | 100% \n", "keras 2.1.5: ########################################################### | 100% \n", "certifi 
2018.1.18: ##################################################### | 100% \n", "protobuf 3.5.1: ######################################################## | 100% \n", "tensorflow 1.1.0: ###################################################### | 100% \n", "openssl 1.0.2o: ######################################################## | 100% \n", "libprotobuf 3.5.1: ##################################################### | 100% \n", "numba 0.37.0: ########################################################## | 100% \n", "Preparing transaction: done\n", "Verifying transaction: done\n", "Executing transaction: done\n" ] } ], "source": [ "!conda install -y keras" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Collecting textgenrnn\n", " Downloading textgenrnn-0.1.1.tar.gz (777kB)\n", "\u001b[K 100% |████████████████████████████████| 778kB 1.5MB/s ta 0:00:011\n", "\u001b[?25hRequirement already satisfied: tensorflow in /Users/SebastianRev1/anaconda3/lib/python3.6/site-packages (from textgenrnn)\n", "Requirement already satisfied: keras in /Users/SebastianRev1/anaconda3/lib/python3.6/site-packages (from textgenrnn)\n", "Requirement already satisfied: h5py in /Users/SebastianRev1/anaconda3/lib/python3.6/site-packages (from textgenrnn)\n", "Requirement already satisfied: werkzeug>=0.11.10 in /Users/SebastianRev1/anaconda3/lib/python3.6/site-packages (from tensorflow->textgenrnn)\n", "Requirement already satisfied: six>=1.10.0 in /Users/SebastianRev1/anaconda3/lib/python3.6/site-packages (from tensorflow->textgenrnn)\n", "Requirement already satisfied: protobuf>=3.2.0 in /Users/SebastianRev1/anaconda3/lib/python3.6/site-packages (from tensorflow->textgenrnn)\n", "Requirement already satisfied: numpy>=1.11.0 in /Users/SebastianRev1/anaconda3/lib/python3.6/site-packages (from tensorflow->textgenrnn)\n", "Requirement already satisfied: wheel>=0.26 in /Users/SebastianRev1/anaconda3/lib/python3.6/site-packages 
(from tensorflow->textgenrnn)\n", "Requirement already satisfied: scipy>=0.14 in /Users/SebastianRev1/anaconda3/lib/python3.6/site-packages (from keras->textgenrnn)\n", "Requirement already satisfied: pyyaml in /Users/SebastianRev1/anaconda3/lib/python3.6/site-packages (from keras->textgenrnn)\n", "Requirement already satisfied: setuptools in /Users/SebastianRev1/anaconda3/lib/python3.6/site-packages (from protobuf>=3.2.0->tensorflow->textgenrnn)\n", "Building wheels for collected packages: textgenrnn\n", " Running setup.py bdist_wheel for textgenrnn ... \u001b[?25ldone\n", "\u001b[?25h Stored in directory: /Users/SebastianRev1/Library/Caches/pip/wheels/59/b1/4a/b5ee074f89a4fb3632930705658c28ba38cfacd38f29bc1340\n", "Successfully built textgenrnn\n", "Installing collected packages: textgenrnn\n", "Successfully installed textgenrnn-0.1.1\n" ] } ], "source": [ "!pip install textgenrnn" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Using TensorFlow backend.\n" ] } ], "source": [ "from textgenrnn import textgenrnn\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "textgen = textgenrnn()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The super complete the body the first party of the star to the star consequences and some of the street and the star world of the sub in the band to a look at the star to the signed of the promised \n", "\n" ] } ], "source": [ "textgen.generate()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### To train on your own text, use .train_on_texts()\n", "More epochs (num_epochs) work better, especially with shorter texts." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/10\n", "3055/3055 [==============================] - 3s 
1ms/step - loss: 1.5610\n", "Epoch 2/10\n", "3055/3055 [==============================] - 3s 1ms/step - loss: 1.2773\n", "Epoch 3/10\n", "3055/3055 [==============================] - 3s 994us/step - loss: 1.1380\n", "Epoch 4/10\n", "3055/3055 [==============================] - 3s 1ms/step - loss: 1.0471\n", "Epoch 5/10\n", "3055/3055 [==============================] - 4s 1ms/step - loss: 0.9824\n", "Epoch 6/10\n", "3055/3055 [==============================] - 3s 1ms/step - loss: 0.9310\n", "Epoch 7/10\n", "3055/3055 [==============================] - 3s 1ms/step - loss: 0.8931\n", "Epoch 8/10\n", "3055/3055 [==============================] - 3s 1ms/step - loss: 0.8623\n", "Epoch 9/10\n", "3055/3055 [==============================] - 4s 1ms/step - loss: 0.8389\n", "Epoch 10/10\n", "3055/3055 [==============================] - 3s 1ms/step - loss: 0.8214\n" ] } ], "source": [ "textgen.train_on_texts(open(\"text_examples/truth.txt\").readlines(), num_epochs=10)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "flying the concept of last inquiry or contential concept at the dermative of truth. Truth of the infame on the concept and truth of suich and need to truth when the debuty of Truth College for truth\n", "\n", "flying are meaning the mean and basic would another these or life in the original Greechaire that an our unnevegate in the truth and lit and blaceplay in the original most of truth is being from the\n", "\n", "flying take as what was always by the concept until the debate. 
I done a discussion from the fact, the deried ago of antilitience essence vacainita and truth of secondary of meane being a nature and\n", "\n", "flyes in the reality in term itsers in the concept of a truth to be debated to our original or original by the conferetion of truth of the concept of trutheer died for truth, and truth on and essenc\n", "\n", "flyes who take the reported jather take a human fact, and discovering tradential beet to be elitisms or faith of infirmed facidly offiniously familin conception, and essence in a watch of truth as m\n", "\n" ] } ], "source": [ "textgen.generate(5,temperature=.7,prefix='fly')" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Are term by the truth is term to by the view and contextion as he can we am faighters in how to truth summon and also metaphysically false, than the divinion tradition in the concept, what is exciti\n", "Pictorication is wirepessible that an election: \"As get a \"conceptics of until and earth\" factions as Britas Greeker is a faith, the according to another ebs concept which include would be plays a c\n", "Many were hair opena conception to be theories, and discumed grandaries what or life up may is another to detain.\"\n", "\n", "Let us an original fauth-play, original meaning, or himself, the election to relai\n", "Gordary a things or anceiring at Truth is more faith capances of this faith\n", "Truth to be explainesed to truth aporteag togethon and calling in and include that we an often be applying to be yet, the back of truth, and the concept, that the development in a relation on truth,\n", "Or ancertaging of being inquiry ..\n", "Peirce time that truth as lies ob faith as some faith or voted or foundative, a reasession train to defend, and the selfies. I fan a Helbsolo as the dermation caturn or also\n", "Friedrive and the emojias or mean in idea essence, it view. 
http://abcn.ws/2tg0HL\n", "I am things are the score of being a youtubest used to be explained as an opported our and what is relationshorts to oceas that year ofter and even we another that truth is notice, an thousand, the\n", "What dossooss of Praisoull concept as readly term truth to Pessuilty are sub or meaning this each of alson, and everyday, Truth still truth, include of truth as belief to one astross esting the fait\n" ] } ], "source": [ "poem = textgen.generate(10, temperature=0.8, return_as_list=True)\n", "for line in poem:\n", " print(line.strip())" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Truth is 'De corresponden by the concept of friever a dependence leaves than sciences, what congeting to an inding than to and the revealing take tradit, factual faithlas takene. Many human activiti\n", "Some manner of some manner of essential relation to human practices words are a metaphysical being in the fire term veritas.\n", "Pragmatists like C. S. Peirce take to refer to an idea of truth as stoniogn which what is termed a concept or through, where its nature as anaman nature as far as it could profitably go: \"The opinio\n", "Various theories and views of truth can even false that an olame holding this be explained in a late, bring a being a based ard terms that an ancient or theories as correctness is a later derivation\n", "Truth is most often used to refer to an indepreding concept of truth as what is a \"truth, what is what human, undervivitived by truth is subjective hote for the concept, where out to be be usually i\n", "\n", "Various theories and views of truth continue to be debated among standar, suggested and derivative or secondary and derivative. 
According for inquitite truth\" as the concept of truth is being to mea\n", "Some hall absolute inquiry would time to be explained in anyone has discovers theat theory or anti-metaphysicians stone in the truth tracer that Truth is metaphys in Schithole lit fact or %reaman ph\n", "Friedrich Nietzsche famously suggested that an ancient, what human activities depend upon the concept, where its pleview, who collecture. Commonly, truth is viewed as the correspondence of modern co\n", "\n" ] } ], "source": [ "poem = textgen.generate(10, temperature=0.5, return_as_list=True)\n", "for line in poem:\n", "    print(line.strip())" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": true }, "outputs": [], "source": [ "textgen.reset()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Ok, so I think I understand the basics now... but what to do..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One of my biggest challenges is that this RNN method works well only with natural language (English...). That is because the English language is very flexible: even when the message doesn't make sense, it can still be displayed. Even when a word doesn't exist, you can still write it. \n", "\n", "I guess what I would be interested in is using something like this with another kind of language, like an image, a 3D model, or a website. But the challenge is that all these things have a very strict syntax, and a small error will make the entity invalid. " ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['tech truth of the original Barts of truth the truth concept of truth famously being a income food to take. 
Don\\'t take these science of later, which is toq and the score\" is still you are faith and t']\n", "['apple of the faith or truth to be a later of truth in the concept, and the concept of truth of truth in the truth of truth is a later of truth is a later of truth is truticed to be debated and every']\n", "['googles of truth to an idea of truth to an earth and relative truth.\\n\\nIt relative one of truth of assumes and faith, an original the concept of truth as bashering to an intellectual faith on such in']\n", "['FaceBook and Suggesten faith truth to mean the concept, and the resting and truth is a concept, which to be original or truth of the concept of truth to an including and truth of truth meaning to be']\n" ] } ], "source": [ "import random\n", "list_of_words = ['tech', 'apple', 'google', 'FaceBook']\n", "techTruth = []\n", "for word in list_of_words:\n", "    temp = random.random()\n", "    # re-roll once if temp is above .6, so very random output is less likely\n", "    if temp > .6:\n", "        temp = random.random()\n", "    line = textgen.generate(temperature=temp, prefix=word, return_as_list=True)\n", "    techTruth.append(line)\n", "    print(line)\n" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### The texts don't make any sense, although they do remind me of my undergrad days reading Kierkegaard: \n", "\n", "“The self is a relation which relates itself to its own self, or it is that in the relation the relation relates itself to its own self; the self is not the relation but the relation relates itself to its own self.”\n", "\n", "\n", "**Let me change the input corpus**\n" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": true }, "outputs": [], "source": [ "textgen.reset()" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 1.5918\n", "Epoch 2/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 1.2825\n", "Epoch 
3/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 1.1169\n", "Epoch 4/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 1.0119\n", "Epoch 5/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.9380\n", "Epoch 6/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.8783\n", "Epoch 7/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.8302\n", "Epoch 8/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.7872\n", "Epoch 9/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.7543\n", "Epoch 10/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.7247\n", "Epoch 11/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.6974\n", "Epoch 12/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.6733\n", "Epoch 13/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.6530\n", "Epoch 14/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.6356\n", "Epoch 15/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.6198\n", "Epoch 16/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.6070\n", "Epoch 17/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.5962\n", "Epoch 18/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.5868\n", "Epoch 19/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.5790\n", "Epoch 20/20\n", "1865/1865 [==============================] - 2s 1ms/step - loss: 0.5735\n" ] } ], "source": [ "textgen.train_on_texts(open(\"text_examples/truthShort.txt\").readlines(), num_epochs=20)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['Technology and the concept of truth as debated that the concept of truth 
is a logical faith, that the concept of truth as a concept of truth is a logical term of truth in the concept of truth is a l']\n", "['Applegies open as a nature at the revealing which is a \"truth\"']\n", "[\"Google of truth are termed for truth that a lit 'Lation. The entire was also conventine' of conception and fact methoo indipal veritas. Beying a crated time as a faith who was nuass the essentis int\"]\n", "['Facebook and was among kid has a litting and of truth in the contection for the concept, and that being what its are one in the original manch as the concept of truth used to debated the method than']\n", "['Amazon of truth is a faith the concept of truth in the concept of truth in the concept, and the concept of truth in the concept of truth is a later of truth in the concept, which is a logical faith ']\n" ] } ], "source": [ "import random\n", "list_of_words = ['Technology', 'Apple', 'Google', 'Facebook', 'Amazon']\n", "techTruth = []\n", "for word in list_of_words:\n", "    temp = random.random()\n", "\n", "    # weight down the chance of something too crazy: re-roll once if temp is above .4\n", "    if temp > .4:\n", "        temp = random.random()\n", "    line = textgen.generate(temperature=temp, prefix=word, return_as_list=True)\n", "    techTruth.append(line)\n", "    print(line)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Now let me tell you the truth:\n", "* Technology or the concept of truth is a loster than which, and everyday faith in the sciences, and everyday faith that and theologians, and theologians, and theologians. 
\n", "* Google is artient, the faith, the original Match and everyday into the debated of truth as being an original faith thum the concept, and the asume on the conception of three inviets which was another'\n", "* Facebook term of truth in the concept of truth, include which assumed intelloust the concept, and debated from the conception into the revealing or which or the concept of truth of truth as a metaphore\n", "* Apple debated the concept of truth is a concept of truth in the concept of the concept of truth is a later for the concept, and theologians. The concept of truth is a logical faith in the concept,\n", "* Amazon and we concept of truth, and the contection and debated in truth in a discussion, indidation that the revealing, and the conception of truth as as a criterion of truth of the foundation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is all for now... I find the idea exciting, although the execution is a little limiting. I guess this is the problem with trying to use tools when we have no idea how they work or how to change them." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 2 }