Language Modeling for the Lord of the Rings trilogy corpus

Character-level language models built with LSTM layers and trained on the Lord of the Rings trilogy corpus.

Two methods were explored:
1. sparse_categorical_crossentropy model: the input sentences are vectorized with the Keras TextVectorization layer, yielding an input shape of (MAX_SEQ_LEN,), where MAX_SEQ_LEN is the maximum length of the input sequences (Tx). Each position in a sequence holds the integer index assigned to its character, and N_UNIQUE_CHARS is the number of unique characters found in the corpus.
2. categorical_crossentropy (one-hot encoding) model: the input sentences are vectorized into one-hot encoded arrays, yielding an input shape of (MAX_SEQ_LEN, N_UNIQUE_CHARS), where MAX_SEQ_LEN is the maximum length of the input sequences (Tx). Each position in a sequence is a one-hot vector of length N_UNIQUE_CHARS, where N_UNIQUE_CHARS is the number of unique characters found in the corpus.
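
The difference between the two encodings can be illustrated with a minimal sketch in plain Python. This is not the code from the notebooks: the toy corpus, the helper names (encode_int, encode_one_hot, sparse_cce, cce) and the simple sorted-character vocabulary are assumptions made for illustration, and the vocabulary lookup only stands in for what TextVectorization does. The last part shows why the two losses are interchangeable: for the same target character they give the same value, and only the target format differs.

```python
import math

# Toy corpus standing in for the trilogy text (assumption for illustration).
corpus = "one ring to rule them all"

# Simple character vocabulary; a stand-in for a fitted TextVectorization layer.
chars = sorted(set(corpus))
char_to_idx = {c: i for i, c in enumerate(chars)}
N_UNIQUE_CHARS = len(chars)
MAX_SEQ_LEN = 10  # hypothetical fixed window length (Tx)


def encode_int(text):
    """Method 1: one integer index per character, shape (MAX_SEQ_LEN,)."""
    return [char_to_idx[c] for c in text]


def encode_one_hot(text):
    """Method 2: one one-hot row per character, shape (MAX_SEQ_LEN, N_UNIQUE_CHARS)."""
    return [
        [1.0 if j == idx else 0.0 for j in range(N_UNIQUE_CHARS)]
        for idx in encode_int(text)
    ]


def sparse_cce(target_idx, probs):
    """Sparse categorical cross-entropy for a single integer target."""
    return -math.log(probs[target_idx])


def cce(target_one_hot, probs):
    """Categorical cross-entropy for a single one-hot target."""
    return -sum(t * math.log(p) for t, p in zip(target_one_hot, probs))


sample = corpus[:MAX_SEQ_LEN]
x_int = encode_int(sample)          # MAX_SEQ_LEN integers
x_one_hot = encode_one_hot(sample)  # MAX_SEQ_LEN rows of N_UNIQUE_CHARS floats

# A made-up uniform prediction over the vocabulary, scored against one target.
probs = [1.0 / N_UNIQUE_CHARS] * N_UNIQUE_CHARS
loss_sparse = sparse_cce(x_int[0], probs)
loss_one_hot = cce(x_one_hot[0], probs)

print(len(x_int), len(x_one_hot), len(x_one_hot[0]))
print(math.isclose(loss_sparse, loss_one_hot))
```

The integer encoding stores one number per character, while the one-hot encoding stores N_UNIQUE_CHARS numbers per character, so the first is the more memory-efficient representation of the same information.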
