Single and multi-step temperature time series forecasting for Vilnius using LSTM deep learning model | by Eligijus Bujokas | Dec, 2020
The data that we have with all the feature engineering is this:
The function f that we want to approximate is:
The goal is to use past values to predict the future. The data is a time series, that is, a sequence. For sequence modeling, we will use the TensorFlow implementation of a recurrent neural network with an LSTM layer.
The input to an LSTM network is a 3D array:
(samples, timesteps, features)
samples: the total number of sequences constructed for training.
timesteps: the length of each sequence, i.e. the number of past time steps (lags) in one sample.
features: the number of features observed at each time step.
Before modeling, the data must be converted from its 2D format (rows, features) into a 3D array. The following function does that:
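A minimal sketch of such a helper, written so that it reproduces the shapes described in this article. It assumes the input is a 2D NumPy array of shape (rows, features) and that the target sits in column 0; the `target_index` parameter and the toy `demo` array are illustrative assumptions, not taken from the original code.

```python
import numpy as np


def create_X_Y(ts, lag=1, n_ahead=1, target_index=0):
    """Slice a 2D array (rows, features) into windows for an LSTM."""
    ts = np.asarray(ts)
    if ts.ndim == 1:
        # Treat a single series as one feature column.
        ts = ts.reshape(-1, 1)

    X, Y = [], []
    # Number of windows: len(ts) - lag - n_ahead, as in the 10-row example.
    for i in range(len(ts) - lag - n_ahead):
        # Inputs: `lag` consecutive rows with all features.
        X.append(ts[i:i + lag])
        # Targets: the next `n_ahead` values of the target column.
        Y.append(ts[i + lag:i + lag + n_ahead, target_index])
    return np.array(X), np.array(Y)


# Toy data (assumption): 10 rows, 7 features.
demo = np.arange(70, dtype=float).reshape(10, 7)
X_demo, Y_demo = create_X_Y(demo, lag=3, n_ahead=1)
print(X_demo.shape, Y_demo.shape)  # (6, 3, 7) (6, 1)
```

Each sample in `X_demo` holds 3 consecutive rows, and the matching row of `Y_demo` holds the target value that immediately follows them.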
For example, suppose the whole dataset consists of just its first 10 rows, we use the 3 past hours as features, and we want to forecast 1 step ahead:
ts = d[['temp', 'day_cos', 'day_sin', 'month_sin', 'month_cos', 'pressure', 'wind_speed']].head(10).values
X, Y = create_X_Y(ts, lag=3, n_ahead=1)
As we can see, the X array has 6 samples, 3 timesteps and 7 features. In other words, we have 6 observations, each with 3 rows of data and 7 columns. There are 6 observations because the first 3 rows are used only as lagged X data, and because we are forecasting 1 step ahead, the last observation is lost as well: 10 - 3 - 1 = 6.
The first value pairs of X and Y are presented in the above picture.
The hyperparameter list for the final model:
# Number of lags (hours back) to use for models
lag = 48
# Steps ahead to forecast
n_ahead = 1
# Share of obs in testing
test_share = 0.1
# Epochs for training
epochs = 20
# Batch size
batch_size = 512
# Learning rate
lr = 0.001
# Number of neurons in LSTM layer
n_layer = 10
# The features used in the modeling
features_final = ['temp', 'day_cos', 'day_sin', 'month_sin', 'month_cos', 'pressure', 'wind_speed']
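To illustrate how a share such as test_share could be applied to windowed data, here is a hedged sketch of a chronological split. It assumes the test set is simply the last 10% of the samples, with no shuffling; the helper name `chronological_split` and the toy arrays are hypothetical and not part of the original code.

```python
import numpy as np


def chronological_split(X, Y, test_share=0.1):
    """Keep the earliest samples for training and the latest for testing."""
    n_test = int(len(X) * test_share)
    split = len(X) - n_test
    return X[:split], Y[:split], X[split:], Y[split:]


# Toy windowed data (assumption): 100 samples, 48 lags, 7 features.
X_all = np.zeros((100, 48, 7))
Y_all = np.zeros((100, 1))
X_train, Y_train, X_test, Y_test = chronological_split(X_all, Y_all, test_share=0.1)
print(X_train.shape, X_test.shape)  # (90, 48, 7) (10, 48, 7)
```

Splitting by position rather than at random keeps every test window later in time than every training window, which is the usual choice for forecasting problems.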
The model class:
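As a rough sketch of what a class built around these hyperparameters might look like: one LSTM layer with n_layer units followed by a dense layer with n_ahead outputs. The class name `LSTMForecaster`, the Adam optimizer and the mean squared error loss are assumptions made for illustration, not details confirmed by the original code.

```python
import numpy as np
import tensorflow as tf


class LSTMForecaster:
    """Hypothetical wrapper: a single LSTM layer plus a dense output layer."""

    def __init__(self, n_lag, n_features, n_outputs, n_layer=10, lr=0.001):
        self.model = tf.keras.Sequential([
            # Input shape is (timesteps, features); the samples axis is implicit.
            tf.keras.Input(shape=(n_lag, n_features)),
            tf.keras.layers.LSTM(n_layer),
            # One output unit per forecast step.
            tf.keras.layers.Dense(n_outputs),
        ])
        # Optimizer and loss are assumptions, chosen as common defaults.
        self.model.compile(
            optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
            loss="mse",
        )

    def train(self, X, Y, epochs=20, batch_size=512):
        return self.model.fit(X, Y, epochs=epochs, batch_size=batch_size, verbose=0)

    def predict(self, X):
        return self.model.predict(X, verbose=0)


# Untrained model, used here only to check the input and output shapes.
forecaster = LSTMForecaster(n_lag=48, n_features=7, n_outputs=1)
preds = forecaster.predict(np.zeros((2, 48, 7), dtype="float32"))
print(preds.shape)  # (2, 1)
```

With lag = 48 and 7 features, each input sample has shape (48, 7), and the network returns one forecast value per sample when n_ahead = 1.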