Creating the model

30.04.2020


Next we need to create the model. As in the guide, we use Keras for Python.

The whole program is in create_model.py. It closely resembles the code given in the guide, so look there for details. The creation program does not need to be told the $n$ value, but several other values can be adjusted. You might want to play around with them.

Creating

It took close to 40 minutes to create the model on my fairly average machine. The creation is a series of epochs, and we have set it to run 100. The output for the last epochs (90 to 100) is

Epoch 90/100
26309/26309 [==============================] - 21s 792us/step - loss: 2.6520 - accuracy: 0.3694
Epoch 91/100
26309/26309 [==============================] - 21s 794us/step - loss: 2.6309 - accuracy: 0.3760
Epoch 92/100
26309/26309 [==============================] - 21s 794us/step - loss: 2.6209 - accuracy: 0.3775
Epoch 93/100
26309/26309 [==============================] - 21s 795us/step - loss: 2.5996 - accuracy: 0.3800
Epoch 94/100
26309/26309 [==============================] - 21s 793us/step - loss: 2.5841 - accuracy: 0.3847
Epoch 95/100
26309/26309 [==============================] - 21s 794us/step - loss: 2.5657 - accuracy: 0.3872
Epoch 96/100
26309/26309 [==============================] - 21s 794us/step - loss: 2.5495 - accuracy: 0.3897
Epoch 97/100
26309/26309 [==============================] - 21s 794us/step - loss: 2.5381 - accuracy: 0.3926
Epoch 98/100
26309/26309 [==============================] - 21s 794us/step - loss: 2.5218 - accuracy: 0.3965
Epoch 99/100
26309/26309 [==============================] - 21s 794us/step - loss: 2.5059 - accuracy: 0.4011
Epoch 100/100
26309/26309 [==============================] - 21s 793us/step - loss: 2.4899 - accuracy: 0.4029

As can be seen, the training accuracy has landed on about 40%. We might need to test with an even lower $n$ value, maybe $n = 30$, to reach 50% accuracy. Since poetry writing isn't an exact science, I'll leave it here for now. The accuracy was still rising by a few tenths of a percentage point per epoch at the end, so adding more epochs would probably raise it further.

When the model has been created, it is saved to a file with the .h5 extension. We can load this file and generate output quickly compared to the time it took to create the model.
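To put rough numbers on the trend in the output above, here is a minimal Python sketch that parses a few of the epoch lines and computes the average accuracy gain per epoch and the pure epoch time for the full run. The names LOG, PATTERN and parse_log are made up for this illustration and are not part of create_model.py.

```python
import re

# Excerpt of the training output above (epochs 90, 95 and 100).
LOG = """\
Epoch 90/100
26309/26309 [==============================] - 21s 792us/step - loss: 2.6520 - accuracy: 0.3694
Epoch 95/100
26309/26309 [==============================] - 21s 794us/step - loss: 2.5657 - accuracy: 0.3872
Epoch 100/100
26309/26309 [==============================] - 21s 793us/step - loss: 2.4899 - accuracy: 0.4029
"""

# Matches one epoch header plus its progress line; \s+ accepts either a
# newline or a space between the two.
PATTERN = re.compile(
    r"Epoch (\d+)/(\d+)\s+\d+/\d+ \[=+\] - (\d+)s \S+ - "
    r"loss: ([\d.]+) - accuracy: ([\d.]+)"
)


def parse_log(text):
    """Return one (epoch, total_epochs, seconds, loss, accuracy) tuple per epoch."""
    return [
        (int(epoch), int(total), int(seconds), float(loss), float(acc))
        for epoch, total, seconds, loss, acc in PATTERN.findall(text)
    ]


rows = parse_log(LOG)
first, last = rows[0], rows[-1]

# Average accuracy gain per epoch over the parsed range.
gain_per_epoch = (last[4] - first[4]) / (last[0] - first[0])

# Rough training time: seconds per epoch times the total number of epochs.
estimated_minutes = last[2] * last[1] / 60

print(f"gain per epoch: {gain_per_epoch:.5f}")
print(f"estimated training time: {estimated_minutes:.0f} minutes")
```

The gain comes out at about a third of a percentage point per epoch, and the epoch time alone adds up to 35 minutes, which fits the roughly 40 minutes the whole run took. Gains usually flatten as training goes on, so this is a description of the last ten epochs and not a prediction.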

Memory error

When trying to create a model from a large text on Linux, you may be met with

MemoryError: Unable to allocate array with shape (264908, 29599) and data type float32

As far as I can tell, this happens because Linux by default refuses a single allocation that is obviously larger than the memory it could ever provide, and the array here needs about 29 GiB. The solution is to open a root shell by typing sudo -i and then allow overcommit with the command

echo 1 > /proc/sys/vm/overcommit_memory

This should fix it. The setting only lasts until the next reboot.
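To see why the allocation is so large, and to check which overcommit mode is currently active, a small Python sketch can help. The shape and data type are taken from the error message above. The helper names array_size_bytes and read_overcommit_mode are made up for this illustration. The array presumably has one row per training sequence and one column per vocabulary word, but that is my assumption.

```python
from pathlib import Path

# Shape and data type taken from the MemoryError message above.
SHAPE = (264908, 29599)
BYTES_PER_FLOAT32 = 4


def array_size_bytes(shape, itemsize):
    """Bytes needed for a dense array with the given shape and item size."""
    total = itemsize
    for dim in shape:
        total *= dim
    return total


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the current overcommit mode (0, 1 or 2), or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Not on Linux, or the file could not be read or parsed.
        return None


size = array_size_bytes(SHAPE, BYTES_PER_FLOAT32)
print(f"array needs {size / 2**30:.1f} GiB")
print(f"overcommit mode: {read_overcommit_mode()}")
```

The array needs a little over 29 GiB, far more than an average machine has. Mode 0 is the default heuristic, 1 always allows overcommit (what the command above sets), and 2 disables it. Note that mode 1 only lets the allocation go through; if the program really fills all of that memory, it can still be killed for running out.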