Python 🐍 LSTM in Keras Tensorflow
In this exercise, we develop a model of the dynamic temperature response of the TCLab and compare the LSTM model prediction to a second-order linear differential equation solution. We use 4 hours of dynamic data from a TCLab (14400 data points at a 1 second sample rate) for training and generate new data (840 data points at a 1 second sample rate, or 14 minutes) for validation (see sample validation data).
We use the measured temperature and heater values to predict the next temperature value with an LSTM model and validate the model with a new data set in both predictive and forecast modes. The predictive mode predicts one step ahead, while the forecast mode does not use temperature measurements to generate the predictions.
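The difference between the two modes can be sketched with any one-step model. The example below is a minimal illustration, not the LSTM from this exercise: `one_step` is a hypothetical stand-in (a first-order linear update with made-up coefficients `a` and `b`), and the data are synthetic. It assumes the forecast starts from a single measured initial temperature and then feeds its own predictions back in.

```python
import numpy as np

def one_step(T_prev, Q_prev, a=0.9, b=0.1):
    """Hypothetical stand-in for a trained one-step model.
    The coefficients a and b are illustrative, not fitted TCLab values."""
    return a * T_prev + b * Q_prev

def predict_mode(T_meas, Q):
    """One step ahead: each prediction starts from the measured temperature."""
    T_pred = np.empty(len(T_meas))
    T_pred[0] = T_meas[0]
    for k in range(1, len(T_meas)):
        T_pred[k] = one_step(T_meas[k - 1], Q[k - 1])
    return T_pred

def forecast_mode(T0, Q):
    """Forecast: only the initial temperature is measured (assumption);
    every later prediction is fed back in as the next input."""
    T_fore = np.empty(len(Q))
    T_fore[0] = T0
    for k in range(1, len(Q)):
        T_fore[k] = one_step(T_fore[k - 1], Q[k - 1])
    return T_fore

# Synthetic data generated from the same stand-in model plus sensor noise
rng = np.random.default_rng(0)
Q = np.where(np.arange(60) < 30, 50.0, 0.0)   # heater on for 30 s, then off
T_true = forecast_mode(20.0, Q)
T_meas = T_true + rng.normal(0.0, 0.2, size=T_true.shape)

T_pred = predict_mode(T_meas, Q)       # corrected by a measurement every step
T_fore = forecast_mode(T_meas[0], Q)   # errors can accumulate over the horizon
```

Because the forecast never sees a new measurement, any model error is carried forward, which is why forecast mode is usually the harder validation test.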
LSTM (Long Short-Term Memory) networks are a special type of RNN (Recurrent Neural Network) structured to remember long-term dependencies in time-series data and to predict based on them. An LSTM repeating module has four interacting components.
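In the standard LSTM formulation, the four components are commonly described as the forget gate, the input gate, the candidate cell update, and the output gate. The NumPy sketch below shows one time step of such a cell to make the data flow concrete. The weight layout (stacked in the order forget, input, candidate, output), the sizes, and the random weights are assumptions for illustration; Keras stores its weights in its own internal order.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One time step of a standard LSTM cell.
    W: (4*n_h, n_x), U: (4*n_h, n_h), b: (4*n_h,), stacked in the
    illustrative order [forget, input, candidate, output]."""
    n_h = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    f = sigmoid(z[0 * n_h:1 * n_h])    # forget gate: how much old cell state to keep
    i = sigmoid(z[1 * n_h:2 * n_h])    # input gate: how much new information to admit
    g = np.tanh(z[2 * n_h:3 * n_h])    # candidate values for the cell state
    o = sigmoid(z[3 * n_h:4 * n_h])    # output gate: how much cell state to expose
    c = f * c_prev + i * g             # updated cell state (long-term memory)
    h = o * np.tanh(c)                 # updated hidden state (cell output)
    return h, c

# Run the cell over a short window of 2 inputs (e.g. temperature and heater)
rng = np.random.default_rng(0)
n_x, n_h = 2, 3
W = rng.normal(0.0, 0.5, (4 * n_h, n_x))
U = rng.normal(0.0, 0.5, (4 * n_h, n_h))
b = np.zeros(4 * n_h)
h, c = np.zeros(n_h), np.zeros(n_h)
window = rng.random((5, n_x))          # 5 time steps of placeholder data
for x in window:
    h, c = lstm_step(x, h, c, W, U, b)
```

The cell state `c` is what carries information across many time steps, while `h` is the value passed to the next layer.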
The LSTM is trained (parameters adjusted) with an input window of prior data by minimizing the difference between the predicted and the next measured value. Sequential methods predict just one next value based on the window of prior data. When contextual data is available both before and after the desired prediction point, a Convolutional Neural Network (CNN) may improve performance with fewer resources to train and deploy the network.
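A minimal Keras sketch of this training setup follows. It is not the network used in this exercise: the window length, layer size, epoch count, and the random placeholder data are all assumptions chosen so the example runs quickly. It shows only the array shapes Keras expects and the squared-error loss between the prediction and the next value.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense

window, n_features = 10, 2      # assumed: 10 prior samples of (temperature, heater)

# Random placeholder data in the shape Keras expects:
# X is (samples, window, features), y is (samples, 1)
rng = np.random.default_rng(0)
X = rng.random((64, window, n_features)).astype("float32")
y = rng.random((64, 1)).astype("float32")

model = Sequential([
    Input(shape=(window, n_features)),
    LSTM(8),                    # layer size chosen for illustration only
    Dense(1),                   # single next-value prediction
])

# Training adjusts the weights to minimize the mean squared difference
# between the predicted value and the next (here: placeholder) value
model.compile(optimizer="adam", loss="mean_squared_error")
history = model.fit(X, y, epochs=1, batch_size=16, verbose=0)

y_hat = model.predict(X[:4], verbose=0)   # one prediction per input window
```

With real data, `X` would hold windows of scaled measurements and `y` the scaled temperature that follows each window.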
Data preparation for LSTM networks involves consolidation, cleansing, separating the input window and output, scaling, and data division for training and validation.
Consolidation – the process of combining disparate data (Excel spreadsheet, PDF report, database, cloud storage) into a single repository.
Data Cleansing – bad data should be removed, including outliers, missing entries, readings from failed sensors, and other missing or corrupted information.
Inputs and Outputs – data is separated into inputs (prior time-series window) and outputs (predicted next value). The inputs are fed into a series of functions to produce the output prediction. The squared difference between the predicted output and the measured output is a typical loss (objective) function for fitting.
Scaling – scaling all data (inputs and outputs) to a range of 0-1 can improve the training process.
Training and Validation – data is divided into training (e.g. 80%) and validation (e.g. 20%) sets so that the model fit can be evaluated independently of the training. Cross-validation divides the training data into multiple sets that are fit separately, and the consistency of the parameters is then compared across the resulting models.
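The scaling, input/output separation, and data division steps can be sketched with NumPy alone. The example below uses synthetic placeholder columns rather than TCLab measurements, and the helper names `minmax_fit` and `make_windows`, the window length of 10, and the chronological 80/20 split are assumptions for illustration. It also fits the scaling on the training portion only, which is one common choice; scaling all data together is the other.

```python
import numpy as np

def minmax_fit(data):
    """Return per-column minimum and range for scaling to 0-1."""
    lo = data.min(axis=0)
    span = data.max(axis=0) - lo
    span[span == 0] = 1.0          # avoid division by zero for constant columns
    return lo, span

def make_windows(scaled, window, target_col=0):
    """Split an (n, features) array into input windows and next-value outputs."""
    X, y = [], []
    for k in range(window, len(scaled)):
        X.append(scaled[k - window:k, :])   # prior time-series window
        y.append(scaled[k, target_col])     # next value to predict
    return np.array(X), np.array(y)

# Synthetic placeholder columns [temperature, heater]; not TCLab measurements
n = 200
t = np.arange(n)
data = np.column_stack([20.0 + 30.0 * (1 - np.exp(-t / 50.0)),
                        np.where(t < 100, 70.0, 30.0)])

# Chronological split: first 80% for training, last 20% for validation
n_train = int(0.8 * n)
lo, span = minmax_fit(data[:n_train])       # scaling fitted on training data only
scaled = (data - lo) / span

window = 10
X_train, y_train = make_windows(scaled[:n_train], window)
X_val, y_val = make_windows(scaled[n_train:], window)

# Undo the scaling to recover temperatures in the original units
T_val = y_val * span[0] + lo[0]
```

Here `X_train` has shape (samples, window, features), which is the layout an LSTM layer expects. Because the scaling is fitted on the training portion, validation values can fall slightly outside 0-1 when the validation data leave the training range.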