Top-Level Project Plan:

  1. Data Collection: “Hamlet” dataset
  2. Data Preprocessing:
    1. Tokenize the text
    2. Convert the tokens into integer sequences
    3. Pad the sequences to a common length
    4. Split the padded sequences into training and testing sets
  3. Model Building:
    1. 1 embedding layer
    2. 2 LSTM layers
    3. 1 dense output layer (softmax activation function)
  4. Model Training:
    1. Early stopping
  5. Model Evaluation
  6. Deployment
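
The preprocessing steps in item 2 can be sketched in plain Python. This is only an illustration under assumptions: the helper names (`tokenize`, `build_word_index`, `make_ngram_sequences`, `pad_pre`, `split_xy`, `train_test_split`), the prefix (n-gram) sequence scheme, the left-padding choice, and the 80/20 split are not taken from the plan. A real project would more likely use a library tokenizer and padding utility.

```python
import random

def tokenize(text):
    """Lowercase the text, replace punctuation with spaces, split into words."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return cleaned.split()

def build_word_index(tokens):
    """Map each distinct word to an integer id; id 0 is reserved for padding."""
    word_index = {}
    for tok in tokens:
        if tok not in word_index:
            word_index[tok] = len(word_index) + 1
    return word_index

def make_ngram_sequences(lines, word_index):
    """For every line, emit each prefix of length >= 2 as one sequence."""
    sequences = []
    for line in lines:
        ids = [word_index[t] for t in tokenize(line) if t in word_index]
        for end in range(2, len(ids) + 1):
            sequences.append(ids[:end])
    return sequences

def pad_pre(sequences, max_len):
    """Left-pad with zeros; keep only the last max_len ids of longer sequences."""
    return [[0] * (max_len - len(s)) + s[-max_len:] for s in sequences]

def split_xy(padded):
    """Use the last id of each sequence as the label, the rest as the input."""
    return [row[:-1] for row in padded], [row[-1] for row in padded]

def train_test_split(X, y, test_fraction=0.2, seed=42):
    """Shuffle indices reproducibly and cut them into train and test parts."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_fraction))
    train, test = idx[:cut], idx[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])

# Toy input standing in for the "Hamlet" text (assumption for illustration).
lines = ["To be, or not to be", "that is the question"]
word_index = build_word_index(tokenize(" ".join(lines)))
sequences = make_ngram_sequences(lines, word_index)
max_len = max(len(s) for s in sequences)
padded = pad_pre(sequences, max_len)
X, y = split_xy(padded)
X_train, y_train, X_test, y_test = train_test_split(X, y)
```

Left-padding keeps the most recent words next to the label position, which is a common choice for next-word prediction; right-padding would also work if the model is set up to mask the padding id.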

Streamlit Web App:

  1. Load LSTM model
  2. Load tokenizer
  3. Define a function to predict the next word
  4. Streamlit component
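
A minimal sketch of the next-word prediction function from step 3 follows. It is written against a generic callable so that it runs without a trained model. The names `predict_next_word`, `predict_probs`, and `fake_predict_probs` are hypothetical. In the actual app, `predict_probs` would presumably wrap the loaded LSTM model's prediction call, and `word_index` would come from the loaded tokenizer.

```python
def predict_next_word(predict_probs, word_index, text, max_sequence_len):
    """Return the most likely next word for `text`, or None if the id is unknown.

    predict_probs: callable taking a padded id list and returning one
                   probability per vocabulary id (index 0 = padding).
    Assumes max_sequence_len >= 2.
    """
    # Encode the input; words missing from the vocabulary are dropped.
    token_ids = [word_index[w] for w in text.lower().split() if w in word_index]

    # The model input is one shorter than the training sequences,
    # because the last id of each training sequence was used as the label.
    input_len = max_sequence_len - 1
    token_ids = token_ids[-input_len:]
    padded = [0] * (input_len - len(token_ids)) + token_ids

    # Pick the id with the highest probability (argmax).
    probs = predict_probs(padded)
    best_id = max(range(len(probs)), key=probs.__getitem__)

    # Reverse lookup from id back to word.
    index_word = {i: w for w, i in word_index.items()}
    return index_word.get(best_id)


# Hypothetical stand-in for the trained model, for demonstration only:
# it puts all probability on the id that follows the last input id.
def fake_predict_probs(padded_ids):
    probs = [0.0] * 4
    probs[(padded_ids[-1] % 3) + 1] = 1.0
    return probs

word_index = {"to": 1, "be": 2, "or": 3}
next_word = predict_next_word(fake_predict_probs, word_index, "To be", 4)
```

In the Streamlit app, the text passed to this function would typically come from an input widget such as `st.text_input`, with the result shown through `st.write` after a button press.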
