In [19]:
import keras
keras.__version__

'2.0.8'

# Text generation with LSTM

This notebook contains the code samples found in Chapter 8, Section 1 of [Deep Learning with Python](https://www.manning.com/books/deep-learning-with-python?a_aid=keras&a_bid=76564dff). Note that the original text features far more content, in particular further explanations and figures: in this notebook, you will only find source code and related comments.

----

[...]

## Implementing character-level LSTM text generation


Let's put these ideas into practice in a Keras implementation. The first thing we need is a lot of text data that we can use to learn a 
language model. You could use any sufficiently large text file or set of text files -- Wikipedia, The Lord of the Rings, etc. In this 
example we will use some of the writings of Nietzsche, the late-19th-century German philosopher (translated to English). The language model 
we will learn will thus be specifically a model of Nietzsche's writing style and topics of choice, rather than a more generic model of the 
English language.

## Preparing the data

Let's start by downloading the corpus and converting it to lowercase:

In [21]:
import keras
import numpy as np

path = keras.utils.get_file(
 'nietzsche.txt',
 origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
text = open(path).read().lower()
print('Corpus length:', len(text))

Corpus length: 600893



Next, we will extract partially-overlapping sequences of length `maxlen`, one-hot encode them and pack them in a 3D NumPy array `x` of 
shape `(sequences, maxlen, unique_characters)`. Simultaneously, we prepare an array `y` containing the corresponding targets: the one-hot 
encoded characters that come right after each extracted sequence.

In [22]:
# Length of extracted character sequences
maxlen = 60

# We sample a new sequence every `step` characters
step = 3

# This holds our extracted sequences
sentences = []

# This holds the targets (the follow-up characters)
next_chars = []

for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('Number of sequences:', len(sentences))

# List of unique characters in the corpus
chars = sorted(list(set(text)))
print('Unique characters:', len(chars))
# Dictionary mapping unique characters to their index in `chars`
char_indices = dict((char, chars.index(char)) for char in chars)

# Next, one-hot encode the characters into binary arrays.
print('Vectorization...')
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Number of sequences: 200278
Unique characters: 57
Vectorization...
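
To make the windowing and one-hot encoding concrete, here is a minimal sketch of the same procedure on a toy string. The `toy_*` names and the small `maxlen`/`step` values are illustrative assumptions, not part of the original notebook. The number of windows is `ceil((len(text) - maxlen) / step)`, which for the real corpus gives `ceil((600893 - 60) / 3) = 200278`, matching the output above.

```python
import numpy as np

# Toy corpus and window settings (illustrative values, not the notebook's)
toy_text = "hello world"
toy_maxlen = 4
toy_step = 3

# Extract overlapping windows and the character that follows each one
toy_sentences = []
toy_next_chars = []
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    toy_sentences.append(toy_text[i: i + toy_maxlen])
    toy_next_chars.append(toy_text[i + toy_maxlen])

# Build the character vocabulary and its index lookup
toy_chars = sorted(set(toy_text))
toy_char_indices = {char: index for index, char in enumerate(toy_chars)}

# One-hot encode inputs (3D) and targets (2D)
toy_x = np.zeros((len(toy_sentences), toy_maxlen, len(toy_chars)), dtype=bool)
toy_y = np.zeros((len(toy_sentences), len(toy_chars)), dtype=bool)
for i, sentence in enumerate(toy_sentences):
    for t, char in enumerate(sentence):
        toy_x[i, t, toy_char_indices[char]] = 1
    toy_y[i, toy_char_indices[toy_next_chars[i]]] = 1

print(toy_sentences)             # the extracted windows
print(toy_next_chars)            # the target character for each window
print(toy_x.shape, toy_y.shape)  # (sequences, maxlen, unique_characters), (sequences, unique_characters)
```

Each timestep of `toy_x` and each row of `toy_y` contains exactly one `True` entry, which is what "one-hot" means here.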


## Building the network

Our network is a single `LSTM` layer followed by a `Dense` classifier and softmax over all possible characters. But let us note that 
recurrent neural networks are not the only way to do sequence data generation; 1D convnets also have proven extremely successful at it in 
recent times.

In [23]:
from keras import layers

model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))

Since our targets are one-hot encoded, we will use `categorical_crossentropy` as the loss to train the model:

In [24]:
optimizer = keras.optimizers.RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
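
As a rough illustration of what this loss measures, the sketch below computes categorical crossentropy by hand in NumPy for a single one-hot target. The helper name `categorical_crossentropy_np` and the example probabilities are assumptions made for illustration; this is the textbook formula, not Keras's internal implementation.

```python
import numpy as np

def categorical_crossentropy_np(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0), then take -sum(target * log(prediction))
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# One-hot target: the true next character is class 1
y_true = np.array([0.0, 1.0, 0.0])

# Two hypothetical softmax outputs
confident = np.array([0.05, 0.90, 0.05])
unsure = np.array([0.40, 0.20, 0.40])

loss_confident = categorical_crossentropy_np(y_true, confident)
loss_unsure = categorical_crossentropy_np(y_true, unsure)
print(loss_confident, loss_unsure)
```

With a one-hot target, the sum reduces to the negative log of the probability assigned to the correct character, so the loss shrinks as the model puts more mass on the right answer.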

## Training the language model and sampling from it


Given a trained model and a seed text snippet, we generate new text by repeatedly:

* 1) Drawing from the model a probability distribution over the next character given the text available so far
* 2) Reweighting the distribution to a certain "temperature"
* 3) Sampling the next character at random according to the reweighted distribution
* 4) Adding the new character at the end of the available text

This is the code we use to reweight the original probability distribution coming out of the model, 
and draw a character index from it (the "sampling function"):

In [25]:
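def sample(preds, temperature=1.0):
    # Reweight the distribution: divide log-probabilities by the temperature
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    # Renormalize so that the reweighted values sum to 1
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    # Draw one sample and return the index of the chosen character
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)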
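
To see what the reweighting step does on its own, the following sketch applies the same log/divide/exponentiate/normalize arithmetic to a small hand-picked distribution, without the random draw. The `reweight` helper and the `original` probabilities are illustrative assumptions.

```python
import numpy as np

def reweight(preds, temperature=1.0):
    # Same arithmetic as the sampling function, minus the multinomial draw
    preds = np.asarray(preds).astype('float64')
    logits = np.log(preds) / temperature
    exp_preds = np.exp(logits)
    return exp_preds / np.sum(exp_preds)

# A hypothetical next-character distribution over three characters
original = [0.5, 0.3, 0.2]

for temperature in [0.2, 0.5, 1.0, 1.2]:
    print(temperature, np.round(reweight(original, temperature), 3))
```

Temperatures below 1 concentrate probability on the most likely character, a temperature of exactly 1 leaves the distribution unchanged, and temperatures above 1 flatten it toward uniform.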


Finally, this is the loop where we repeatedly train and generate text. We start generating text using a range of different temperatures 
after every epoch. This allows us to see how the generated text evolves as the model starts converging, as well as the impact of 
temperature in the sampling strategy.

In [26]:
import random
import sys

for epoch in range(1, 60):
    print('epoch', epoch)
    # Fit the model for 1 epoch on the available training data
    model.fit(x, y,
              batch_size=128,
              epochs=1)

    # Select a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    print('--- Generating with seed: "' + generated_text + '"')

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)

        # We generate 400 characters
        for i in range(400):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            generated_text += next_char
            generated_text = generated_text[1:]

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

epoch 1
Epoch 1/1
--- Generating with seed: "h they inspire." or, as la
rochefoucauld says: "if you think"
------ temperature: 0.2
h they inspire." or, as la
rochefoucauld says: "if you think in the sense of the say the same of the antimated and present in the all the has a such and opent and the say and and the fan and the sense of the into the sense of the say the words and the present the sense of the present present of the present in the man is the man in the sense of the say the sense of the say and the say and the say it is the such and the sense of the ast the sense of the say 
------ temperature: 0.5
t is the such and the sense of the ast the sense of the say the instand of the way and it is the man for the some songully the sain it is opperience of all the sensity of the same the intendition of the man, in the most with the same philosophicism of the feelient of internations of a present and and colleng it is the sense the greath to the highers of the antolity as nature and th

redical pleniscion ap no revereiblines, tho lacquiring that fegais oracus--is preyer. the pery measime, as firnom and rack. -purss
love to they like relight of
reoning
cage of signtories, the timu to
coursite; that libenes afverbtersal; all catured, ehhic: when all tumple, heartted a inhting in
away love
the puten
party al mistray. i jesess. own can clatorify
seloperati", wh
epoch 5
Epoch 1/1
--- Generating with seed: "ion (werthschätzung)--the
dislocation, distortion and the ap"
------ temperature: 0.2
ion (werthschätzung)--the
dislocation, distortion and the appearation of his sensition and conscience of the distrusting the far the sensition of the individually the suffering the sense of the presentiments of the sense of the suffering and suffering the stronger of the suffering and the consequently the sense of the subject of the sense of the moral the sense of the desire the sense of the
self--and the sensition of the suffering the sensition of the sen
------ temperature: 0.5
-and t

calmualing own he interpretic thingsnly, there your new dothrible for rights at which and
germansness of
eternal, meanss, pruded from warthor. - a continceion," but a suppose, european allowu
------ temperature: 1.2
om warthor. - a continceion," but a suppose, european allowubleness! to give smotifits and dorming be
had
charm, thenloces science great too "scinccengenness from courseituss.ogus, out of estimately-pokeno myselveed chulked ain also it to
reloch: even thinds spisprequapal art. congojedt, and vocture.) an erdorled ftich must when their freeusedsed and counter in-part the most"-alfalxible to
fulgesing" outtebitio
sequien
supinuntsism," and man is,
are a
perh
epoch 9
Epoch 1/1
--- Generating with seed: "one wished to do away
altogether with the "seeming world"--w"
------ temperature: 0.2
one wished to do away
altogether with the "seeming world"--who has to the problem and consequently been the sense of the sense of the state of the personal still the problem of the contempt an


he souring and light and danger of the strong and the feeling to the end in one of the end and present mankind, the the have always by the same something and will, the contraltenen and deligious stranges and the the present a plentroly,
fundamental success, and soul and place of its suff
------ temperature: 1.0
entroly,
fundamental success, and soul and place of its sufftice is he onemy, by them quibhardable on the laboriwing remorrided look confunted pricite rung, who delightantter, it has hence aspaction and
stricited valus every thing from which the world and one should day be letent of being acticled. conversant around art up swear. ranking to which wishes them grow simply injuring the fraginged the sensibility; there are man taken of anazied, hisell wisund o
------ temperature: 1.2
sensibility; there are man taken of anazied, hisell wisund or a
redie ivale of ho-proceby longer, ultile. the its, inating in it? theught upon rempily spirits"--may lents
datigualy read
noticish in succ

=well-wishing.=--among the small, but infinitely plentures of the great and world in the same the experience of the same a men the sense of the same histle as a still and the has as a said and suffering and superiority of the problem of a conscience of the end and man is a power of the then are and into the same an incarate of the same the suffering the same a more the conduct to the same a long and more the profound the profound the same all the fo
------ temperature: 0.5
 long and more the profound the profound the same all the former prevolute of the
same the liberation in the end of the science that the conduct as a must be honor of the ancient and the consequently. the instinct to the sense of his knowledge to a long to the desire of facts of the error, and such a fact to be as
"nature of the more fragnous, one are former of the age the suxis for the belief of a great the engeshed men the perto body in a last the foreto
------ temperature: 1.0
a great the engeshed men the perto bo

deassing; but willih finally. that. confessates for of heart. the eyes, always this gives, present o
epoch 19
Epoch 1/1
--- Generating with seed: " believers are too noisy and obtrusive; he guards against
th"
------ temperature: 0.2
 believers are too noisy and obtrusive; he guards against
the sense of the strength and the singer of the sense of the sense and as a more the profoundly so the still the sense of the sense of the soul of the same the intellectual strength and his own soul of his own own contemposent and the has a still and the soul of the sense of the sense of the significancing of the sense of the fact that the belief of the instinct and the sense of the soul of the sens
------ temperature: 0.5
belief of the instinct and the sense of the soul of the sense of a man is the happy, the spirit with the most soul of his striting for the fact that with the considerate himself in the primord that it is we have not delights and also found to his experience of the most man, who wit

the type, man, thbee-iduan-talvents of thesering these didane
at shlokeity concerning consemials: whereto petch ashertain itself asoperike is much , does the newsnecre mub)ighual unexercatlist:

the promise innocent littuate. other every senses again--thaty he present medehter mekon "for timpa, inturiyed
"natted, discondicity
of
maptest tyem. no then, diw, very habmlionacn") for the false
epoch 23
Epoch 1/1
--- Generating with seed: "men, not great enough, nor hard enough,
to be entitled as ar"
------ temperature: 0.2
men, not great enough, nor hard enough,
to be entitled as are sense of the constant and precisely to be believed and the subjection of the same the superficial interest and subtle profoundly be intellect and to be the sense of the superficial the superficial person and stronger and subjections of the feeling of the the whole and conscience of the superficial and the profounder and subjection of the superficial profoundly and the subject the states and sens
------ temperat

the attercuate an high all manuilation in which as a romantidly with "stateless"! it is its balting upon the
done that it ommand way without with virtue has very adorained among. it is, against willed
------ temperature: 1.2
 with virtue has very adorained among. it is, against willed itself, skers in perhaps volution, in trages races, characters been a inesliding over, as
or deterioration. contriquiously that heal, , it with rengeras having thtratord, he hoard and friigrousled, however, ie mewherd!
 he in, lohe unevery humanity. a which
that my only vengeutaries"hway and
winded reality--"l recolve is inacresit in? a view.
the "our
wledigated he: this plach rement. fasting
epoch 27
Epoch 1/1
--- Generating with seed: " democratic mingling of classes and
races--it is only the ni"
------ temperature: 0.2
 democratic mingling of classes and
races--it is only the ninamed and such a morality of the most conscience of the power of the work of the such and every power of the presentiment for 

re the sense of the same the individuals and the stronger the consequently the wantong and the spirit man that the noble to the problem, and and faith--when they are his commence and we feel the free trough and precisely something which is that the even that is is that the far that the end that it is the greatest the conceive and it has as distinctions, that is some great opposing for the greatest and the soul, and and words of faith, which is not in the f
------ temperature: 1.0
 and the soul, and and words of faith, which is not in the factors as it rementions. the feelor
(silability
well in which its profounds these wordes. bas-qualities who in
romindly to
which
how much to author in necessary
care of human ease evil that which the edriney probable and taughten its
falsehood for
these proprord cast turn modern ideas
to have them pursorate in"s, with the striving
at them is
the "such more
will no dain, healty which much as worth 
------ temperature: 1.2
 is
the "such more
will no dai

tween knowledge and capacity is perhaps greater, and also more and the desire and the same time of the same talred and the sense of the desires, and the same time of the same the state of the contradiction of the same the sense of the contradiction of the sense of the sense of the same all the concerning to the sense of the reason, and the sense of the deation of the same tall and and be say of the same the sense of the same antithours and saint and the se
------ temperature: 0.5
e same the sense of the same antithours and saint and the sense that is the same desires, that is the values" of the contrary interest and to any origin of the hoper and absolutely be all carry and in such an antithours in the same sought itself is solition of the one who does not be deepen the contrary self-senteness of the most indescrust, and the saugh and and the same task in the general every the security, which have the same njudicism of the lower th
------ temperature: 1.0
 the security, which have the 

contractly had habstlok noe be systemates! they havely, away to seevts: they
languaunant"
of a causanc
epoch 38
Epoch 1/1
--- Generating with seed: "n, then nearly every manifestation of so called immoral
egoi"
------ temperature: 0.2
n, then nearly every manifestation of so called immoral
egoistic and also an excessions of the sense of the propositic the stone of the single and stronger that the stronger the stronger the stone of the stronger the stronger all the stone of the stronger the stronger the stronger the stronger the fact that the stronger the spirit of the most plato that the sense of the stronger the stronger the world is also far as a more sensation of the fact that the wo
------ temperature: 0.5
orld is also far as a more sensation of the fact that the world of the will to himself for the power of the present case of the unward and philosophers is literated to the sense of the greek for the super. he were the fact of the live the world and long the store in the charm of 

hesitate the principle as "sabene. and something his recognized that to behing, and convention of his well the dreams foverstoo
"folteous. to example
at once not a higher they
thus diver imming shack of every fay of blonerstoved in ourselves customs and i of mediocre, siffers sol
a generating, how her in with pleasure of this spirited no,
ye present knowledge or "cracung invo-dmjeeles, error faith how to bad histood of
natures
of nugisce, however, a
epoch 42
Epoch 1/1
--- Generating with seed: "erhaps not be the
exception, but the rule?--perhaps genius i"
------ temperature: 0.2
erhaps not be the
exception, but the rule?--perhaps genius in the conscience and the same the acts of the same the superior the same the conscience of the surerism, the strength and almost and single and all the same the strange and such a man and stronger themselves the more still and the same the same the superior the same the strange and desire and the propers of the fact themselves the properficulation of t

. how musticked rationificarian of this interest he attence of rangency for its occasion condeds to round
in and metnige to his bodiely wherever lead within a make therebned coars interpr
pointation of only that fundamentally founder-possible the
belief in
enlo
: meyso
cipimation now
------ temperature: 1.2
y founder-possible the
belief in
enlo
: meyso
cipimation now--how caused, and
and look is attain of the herd.--decearness philosochodiblity maims (froppolute gurp into the hange, of great darty
brighters philosophic obely
prjugated
of tedumasing. they
as one of its falsepues in mastering upon human primitive is evil) thriok is albitry
write leving a
recogninne, by that some desire
applurent generalicy in
an are life religion of horthest, however s. 
suit
epoch 46
Epoch 1/1
--- Generating with seed: "man would
not have the genius for adornment, if she had not "
------ temperature: 0.2
man would
not have the genius for adornment, if she had not the strength the strength the state and 

his greek proposition of the commander, in the experience). the acts of the greatest been the should and life and could eatiness the supererable and suffer the sense of the entire own delight and morality are always a
------ temperature: 1.0
he sense of the entire own delight and morality are always again-createm mrou: what does changes in the diskeps, which willification is in the inner an other
said appeared but refinement with it" whiched the higher uncertaid hitherto is advan
demorar trough or "selighter
about only true, so
great his trainants, it are far the futurd which one's relacle appeared by the lards), the mediocrer, hence the individuls, assumes refore, the greater hif-act forti wi
------ temperature: 1.2
the individuls, assumes refore, the greater hif-act forti with, a loums
of the
veryeing,
artably enemy willitude;
it is withor
the
coach he is, perhaps, this our freehh, than who has digness regards a grolite: every
friend and colledence., feventrians).


boing to rage be 

mediocrity, which labours instinctively for the present the present and power and an absolute and place is a power and an age of the most personal man and power the greatest states of the fact that the propetured the strength to an inverself when the strength to the soul of the same art of the propetured and all the strength to an antithed and personal the soul of the most propettion of the strength to an antithed to every antithesity of
------ temperature: 0.5
ttion of the strength to an antithed to every antithesity of the most decided and in the inventhed himself as he seems to respection which has not taken and of the incentines of the present to be sure as out of the heavicates and person of the will to do neverthes the staterament the truth and darier, and art of the very one the disposition of the belief in the that still many which is also still their soul and belief of dangerous than with a man which els
------ temperature: 1.0
their soul and belief of dangerous than with a ma

furthes--he cannecsive as they ce wherd henced obsabise
of complimationar your strong is som
epoch 57
Epoch 1/1
--- Generating with seed: "matter, partly conventional and arbitrarily manifested in re"
------ temperature: 0.2
matter, partly conventional and arbitrarily manifested in reläwivégre the gratherere 
the will of historioeh? the fathetyre, while of
the morts vely to com anding h" mor ever of pheleeongs, anticion theers "the fathmon or faul f"wiynart " are some bumant angible of who the -yeas beinxger taing. the naitl zation ugntitiory on that her crastle ys ävele to
which has herével which wher ensult awa the thoss on a detunns and and cate of the wich of the preashe w
------ temperature: 0.5
thoss on a detunns and and cate of the wich of the preashe who e"diytne of of cons-politv. its "idoe conses[t, frylos thing ou whish of historocgos hin(g;
andism, " a godre of abselftes, andius freash fore fveptions, anting on her sort is dist vilyë". it is sentemnewnt wijd out "-the mort 


As you can see, a low temperature results in extremely repetitive and predictable text, but where local structure is highly realistic: in 
particular, nearly all words (a word being a local pattern of characters) are real English words. With moderately higher temperatures, the 
generated text becomes more interesting, surprising, even creative; it may sometimes invent completely new words that sound somewhat 
plausible (such as "eterned" or "troveration"). With the highest temperatures, the local structure starts breaking down and most words look 
like semi-random strings of characters. In this specific setup, 0.5 arguably gives the most interesting text. Always experiment 
with multiple sampling strategies! A clever balance between learned structure and randomness is what makes generation interesting.
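
In the loop above, `generated_text` carries over from one temperature to the next, so each temperature continues from where the previous one stopped. To compare temperatures from an identical seed, the generation step can be wrapped in a helper. The sketch below is one possible way to do that: `generate_text` and `uniform_predict` are hypothetical names, and `uniform_predict` is only a stand-in for a trained model so that the example runs on its own.

```python
import numpy as np

def sample(preds, temperature=1.0):
    # Same reweighting and draw as the notebook's sampling function
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)

def generate_text(predict_fn, seed_text, chars, n_chars, temperature):
    # Hypothetical helper: generate `n_chars` characters starting from `seed_text`
    char_indices = {char: index for index, char in enumerate(chars)}
    window = seed_text
    output = []
    for _ in range(n_chars):
        # One-hot encode the current window
        sampled = np.zeros((1, len(window), len(chars)))
        for t, char in enumerate(window):
            sampled[0, t, char_indices[char]] = 1.
        preds = predict_fn(sampled)[0]
        next_char = chars[sample(preds, temperature)]
        output.append(next_char)
        # Slide the window forward by one character
        window = window[1:] + next_char
    return ''.join(output)

def uniform_predict(batch):
    # Stand-in for a trained model: every character is equally likely
    n_classes = batch.shape[-1]
    return np.full((batch.shape[0], n_classes), 1.0 / n_classes)

demo_chars = sorted(set("abc "))
demo_seed = "abc "
for temperature in [0.2, 0.5, 1.0, 1.2]:
    continuation = generate_text(uniform_predict, demo_seed, demo_chars, 20, temperature)
    print(temperature, repr(continuation))
```

With the trained network, `predict_fn` could be something like `lambda batch: model.predict(batch, verbose=0)`, with `chars` and a `maxlen`-character seed taken from the corpus.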

Note that by training a bigger model, longer, on more data, you can achieve generated samples that will look much more coherent and 
realistic than ours. But of course, don't expect to ever generate any meaningful text, other than by random chance: all we are doing is 
sampling data from a statistical model of which characters come after which characters. Language is a communication channel, and there is 
a distinction between what communications are about, and the statistical structure of the messages in which communications are encoded. To 
evidence this distinction, here is a thought experiment: what if human language did a better job at compressing communications, much like 
our computers do with most of our digital communications? Then language would be no less meaningful, yet it would lack any intrinsic 
statistical structure, thus making it impossible to learn a language model like we just did.
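
A crude way to get a feel for this thought experiment is to compare the byte-level entropy of a short English passage with that of its compressed form. The sketch below uses `zlib` from the standard library; the `byte_entropy` helper and the sample passage are assumptions for illustration, and single-byte frequency entropy on such a small sample is only a rough proxy for "statistical structure".

```python
import math
import zlib
from collections import Counter

def byte_entropy(data):
    # Empirical entropy in bits per byte, based on single-byte frequencies
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

passage = (
    "Language is a communication channel, and there is a distinction "
    "between what communications are about and the statistical structure "
    "of the messages in which communications are encoded. A character-level "
    "model only learns which characters tend to follow which characters. "
    "If messages were already compressed, that regularity would mostly "
    "disappear, even though the meaning they carry would be unchanged."
).encode("utf-8")

compressed = zlib.compress(passage, 9)

raw_entropy = byte_entropy(passage)
compressed_entropy = byte_entropy(compressed)
print("raw:       ", len(passage), "bytes,", round(raw_entropy, 2), "bits/byte")
print("compressed:", len(compressed), "bytes,", round(compressed_entropy, 2), "bits/byte")
```

The compressed bytes carry the same message but look much closer to uniform noise, leaving far less regularity for a next-character model to exploit.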


## Takeaways

* We can generate discrete sequence data by training a model to predict the next token(s) given previous tokens.
* In the case of text, such a model is called a "language model" and could be based on either words or characters.
* Sampling the next token requires a balance between adhering to what the model judges likely, and introducing randomness.
* One way to handle this is the notion of _softmax temperature_. Always experiment with different temperatures to find the "right" one.