TY - GEN
T1 - Integrating Temporal Analysis with Hybrid Machine Learning and Deep Learning Models for Enhanced Air Quality Prediction
AU - Omer, Muhammad
AU - Ali, Sardar Jaffar
AU - Raza, Syed M.
AU - Le, Duc Tai
AU - Choo, Hyunseung
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Air pollution remains a critical issue, adversely affecting public health and the environment. In this study, we utilize the Air Quality index dataset from Kaggle to analyze temporal and seasonal variations of key pollutants, specifically Carbon Monoxide (CO), Nitrogen Oxides (NOx), and Benzene (C6H6). Building upon this analysis, we predict Absolute Humidity (AH), a vital meteorological factor influencing pollutant dispersion, using Machine Learning (ML) and Deep Learning (DL) techniques. Three ML techniques, Linear Regression (LR), Random Forest (RF), and Support Vector Regression (SVR), and three DL techniques, Artificial Neural Network (ANN), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM), are employed for predicting AH. The results indicate that while the RF model achieved the lowest Mean Absolute Error (MAE) among ML methods (0.02), the CNN model, despite having an MAE of 0.04, demonstrated statistical superiority through paired t-tests and Wilcoxon signed-rank tests (p < 0.005), outperforming RF (p < 0.03). These findings highlight the statistical significance of DL methods, specifically CNN, over ML methods in predicting AH.
AB - Air pollution remains a critical issue, adversely affecting public health and the environment. In this study, we utilize the Air Quality index dataset from Kaggle to analyze temporal and seasonal variations of key pollutants, specifically Carbon Monoxide (CO), Nitrogen Oxides (NOx), and Benzene (C6H6). Building upon this analysis, we predict Absolute Humidity (AH), a vital meteorological factor influencing pollutant dispersion, using Machine Learning (ML) and Deep Learning (DL) techniques. Three ML techniques, Linear Regression (LR), Random Forest (RF), and Support Vector Regression (SVR), and three DL techniques, Artificial Neural Network (ANN), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM), are employed for predicting AH. The results indicate that while the RF model achieved the lowest Mean Absolute Error (MAE) among ML methods (0.02), the CNN model, despite having an MAE of 0.04, demonstrated statistical superiority through paired t-tests and Wilcoxon signed-rank tests (p < 0.005), outperforming RF (p < 0.03). These findings highlight the statistical significance of DL methods, specifically CNN, over ML methods in predicting AH.
KW - Absolute Humidity
KW - Benzene
KW - Carbon Monoxide
KW - Nitrogen Oxide
KW - paired T-tests
KW - Wilcoxon signed-rank tests
UR - https://www.scopus.com/pages/publications/85218148494
U2 - 10.1109/IMCOM64595.2025.10857575
DO - 10.1109/IMCOM64595.2025.10857575
M3 - Conference contribution
AN - SCOPUS:85218148494
T3 - Proceedings of the 2025 19th International Conference on Ubiquitous Information Management and Communication, IMCOM 2025
BT - Proceedings of the 2025 19th International Conference on Ubiquitous Information Management and Communication, IMCOM 2025
A2 - Lee, Sukhan
A2 - Choo, Hyunseung
A2 - Ismail, Roslan
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 19th International Conference on Ubiquitous Information Management and Communication, IMCOM 2025
Y2 - 3 January 2025 through 5 January 2025
ER -