AQI Forecasting — CNN-LSTM Hybrid
Hamza Zeewaqar · 2025
A hybrid CNN-LSTM neural network trained to forecast PM2.5 air quality concentrations 24 hours ahead — combining convolutional feature extraction with recurrent temporal modelling in a modular PyTorch pipeline.
Tech Stack
| Layer | Tools |
|---|---|
| Language | Python |
| Deep Learning | PyTorch — custom CNN + LSTM module implementations |
| Feature Engineering | Lag features · rolling window averages |
| Evaluation | RMSE · R² score |
| Experiment Tracking | experiments_log.md — per-run metrics and observations |
Problem & Approach
Air quality forecasting is a sequence prediction problem: PM2.5 readings are autocorrelated and driven by both short-lived local patterns and longer-range atmospheric trends. An LSTM on its own can track long-range dependencies but is less suited to extracting local features. A CNN on its own extracts local patterns well but carries no state beyond its receptive field.
The hybrid CNN-LSTM combines the two: convolutional layers extract local temporal features from the input sequence, and the resulting feature maps are fed into LSTM layers that model the long-range dependencies.
Model Architecture
| Component | Detail |
|---|---|
| Input | Lagged PM2.5 readings + rolling averages (engineered features) |
| CNN block | 1D convolutions — extract local temporal patterns from the input window |
| LSTM block | Recurrent layers — model sequential dependencies across time steps |
| Output | Single-step 24-hour PM2.5 forecast |
| Baseline | Linear Regression — performance lower bound for comparison |
| Loss | RMSE (Root Mean Squared Error) |
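
The RMSE loss and R² metric named in the tables can each be written in a few lines of PyTorch. This is a minimal sketch rather than the project's own code: the names `rmse_loss` and `r2_score`, the `eps` term, and the toy values are assumptions added for illustration.

```python
import torch


def rmse_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Root mean squared error; eps keeps the sqrt gradient finite at zero error."""
    return torch.sqrt(torch.mean((pred - target) ** 2) + eps)


def r2_score(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = torch.sum((target - pred) ** 2)
    ss_tot = torch.sum((target - target.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Toy PM2.5 values made up for the example, not project data.
target = torch.tensor([10.0, 20.0, 30.0, 40.0])
pred = torch.tensor([12.0, 18.0, 33.0, 37.0])

print(f"RMSE: {rmse_loss(pred, target).item():.3f}")
print(f"R2:   {r2_score(pred, target).item():.3f}")
```

RMSE is expressed in the same units as the PM2.5 readings, while R² is unitless: a value of 0 corresponds to always predicting the mean of the targets.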
Model Architecture — Core Class
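The forward pass permutes the input into the channel-first layout that 1D convolution expects, applies the convolution, permutes back, and hands the result to the LSTM, which models dependencies across time steps. Only the LSTM output at the final time step is passed through dropout and the linear head.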
import torch
import torch.nn as nn


class HybridCNNLSTM(nn.Module):
    def __init__(self, input_size, hidden_size=64, output_size=1, dropout=0.2):
        super().__init__()
        # Pointwise (kernel_size=1) convolution across the feature channels
        self.cnn = nn.Conv1d(in_channels=input_size, out_channels=hidden_size, kernel_size=1)
        self.lstm = nn.LSTM(input_size=hidden_size, hidden_size=hidden_size, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x: (Batch, Seq, Features)
        x_cnn = x.permute(0, 2, 1)               # (Batch, Features, Seq)
        out_cnn = torch.relu(self.cnn(x_cnn))    # CNN feature extraction
        out_cnn = out_cnn.permute(0, 2, 1)       # (Batch, Seq, Features)
        out_lstm, _ = self.lstm(out_cnn)         # LSTM temporal modelling
        last_step = self.dropout(out_lstm[:, -1, :])  # keep the final time step only
        return self.fc(last_step)                # (Batch, output_size)
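
The two `permute` calls are needed because `nn.Conv1d` expects channels before the sequence axis, while an `nn.LSTM` with `batch_first=True` expects the sequence axis before the features. The shape trace below illustrates this on a random tensor. The batch size, window length and feature count are arbitrary values chosen for the example. As an aside, it also compares `kernel_size=1`, which mixes features within each time step but does not look at neighbouring steps, with a wider padded kernel that does.

```python
import torch
import torch.nn as nn

batch, seq_len, n_features, hidden = 8, 24, 5, 64  # arbitrary example sizes

x = torch.randn(batch, seq_len, n_features)   # (Batch, Seq, Features)
x_cnn = x.permute(0, 2, 1)                    # (Batch, Features, Seq) for Conv1d

# kernel_size=1: pointwise convolution, each output step sees one input step.
pointwise = nn.Conv1d(n_features, hidden, kernel_size=1)
# kernel_size=3 with padding=1: each output step also sees its two neighbours,
# and the sequence length is preserved.
local = nn.Conv1d(n_features, hidden, kernel_size=3, padding=1)

out_pointwise = pointwise(x_cnn)
out_local = local(x_cnn)

# Back to (Batch, Seq, Channels) before the LSTM.
lstm = nn.LSTM(input_size=hidden, hidden_size=hidden, batch_first=True)
out_lstm, _ = lstm(out_local.permute(0, 2, 1))

print(out_pointwise.shape)  # torch.Size([8, 64, 24])
print(out_local.shape)      # torch.Size([8, 64, 24])
print(out_lstm.shape)       # torch.Size([8, 24, 64])
```

Whether a wider kernel actually helps on this dataset is an empirical question that the sketch does not answer.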
Feature Engineering
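The 24-hour lookahead target is created with shift(-24), which pairs each row with the PM2.5 reading 24 rows later (24 hours ahead on hourly data). The current reading, a 24-hour lag, and temporal context (hour, weekday, month) are added as model inputs.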
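def engineer_features(df):
    # Expects a DataFrame with a DatetimeIndex and a 'pm2_5' column
    df = df.copy()  # avoid mutating the caller's DataFrame
    df['target_pm25_next_day'] = df['pm2_5'].shift(-24)  # future target
    df['pm2_5_current'] = df['pm2_5']
    df['pm2_5_lag24'] = df['pm2_5'].shift(24)  # same hour yesterday
    df['hour'] = df.index.hour
    df['day_of_week'] = df.index.dayofweek
    df['month'] = df.index.month
    # Drop the raw column, then the rows left incomplete by the two shifts
    return df.drop(columns=['pm2_5']).dropna()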
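
Rolling-average features are listed in the tech stack but are not part of the function shown here. One way they could be added is sketched below, assuming an hourly datetime index and the same `pm2_5` column. The window sizes and the `add_rolling_features` name are hypothetical.

```python
import numpy as np
import pandas as pd


def add_rolling_features(df: pd.DataFrame, windows=(6, 24)) -> pd.DataFrame:
    """Add trailing rolling means of pm2_5; each window ends at the current row."""
    out = df.copy()  # leave the caller's frame untouched
    for w in windows:
        # rolling() is trailing by default, so no future readings leak in.
        out[f"pm2_5_roll{w}"] = out["pm2_5"].rolling(window=w).mean()
    return out


# Synthetic hourly series, for illustration only.
idx = pd.date_range("2025-01-01", periods=72, freq=pd.Timedelta(hours=1))
demo = pd.DataFrame({"pm2_5": np.arange(72, dtype=float)}, index=idx)

features = add_rolling_features(demo)
print(features.tail(3))
```

The first `w - 1` rows of each rolling column are NaN, so a later `dropna()` removes them along with the rows left incomplete by the shifts.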
Pipeline
- 01 Install dependencies via pip (requirements.txt)
- 02 Run data_preprocessing.py: clean raw data, engineer lag and rolling-average features
- 03 Train baseline_model.py: linear regression reference
- 04 Train hybrid_model.py: CNN-LSTM architecture, log RMSE and R² per epoch
- 05 Compare model performance and review outputs in results/
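
The model's `forward` expects input shaped `(Batch, Seq, Features)`, so somewhere between preprocessing and training the feature table has to be cut into fixed-length windows. The sketch below shows one way to do that, with a chronological train/validation split. The function name `make_windows`, the 24-step window and the 80/20 split are assumptions for illustration, not details taken from the project's scripts.

```python
import numpy as np


def make_windows(features: np.ndarray, targets: np.ndarray, seq_len: int = 24):
    """Stack overlapping windows.

    X[i] holds rows i .. i + seq_len - 1, and y[i] is the target
    attached to the last row of that window.
    """
    n = len(features) - seq_len + 1
    X = np.stack([features[i:i + seq_len] for i in range(n)])
    y = targets[seq_len - 1:]
    return X, y


# Synthetic data: 100 hourly rows, 5 features (illustrative values only).
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 5)).astype(np.float32)
targets = rng.normal(size=100).astype(np.float32)

X, y = make_windows(features, targets, seq_len=24)

# Chronological split: validation windows come strictly after training windows,
# so later rows are never shuffled into the training set.
split = int(0.8 * len(X))
X_train, X_val = X[:split], X[split:]
y_train, y_val = y[:split], y[split:]

print(X_train.shape, y_train.shape)
print(X_val.shape, y_val.shape)
```

Adjacent windows overlap, so windows on either side of the split share some input rows. Leaving a gap of `seq_len` rows between the two sets is a stricter option. The arrays can be converted with `torch.from_numpy` before being passed to the model.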
Results
