Materials and Methods

Computing Infrastructure
- Operating System: Windows 10 (64-bit)
- Processor: Intel Core i3, 2.10 GHz
- RAM: 8 GB
- Python version: 3.9
- Deep Learning Framework: TensorFlow 2.13

Dataset
- Dataset Source: OpenWeatherMap Air Pollution API
- URL: https://api.openweathermap.org/data/2.5/air_pollution/history?lat=31.5497&lon=74.3436&start=1672531200&end=1704067200&appid=4ccd6f333eec5f5a5982f255f550782a
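- Period: 2020 to 2023 (the request URL above is a single example query; its `start` and `end` Unix timestamps cover 2023-01-01 to 2024-01-01, 00:00 UTC)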
- Location: Lahore, Pakistan
- File Used: `Air_Quality_Data_with_Numerical_Smog_Levels_PM2.5_2020-2023.csv`
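
As an illustration of how a request to the history endpoint listed above can be assembled, the following minimal sketch converts calendar dates to the Unix timestamps that the `start` and `end` parameters expect. The helper names `to_unix` and `build_history_url` and the placeholder `YOUR_API_KEY` are assumptions made for this example and are not part of the original pipeline.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Endpoint taken from the dataset URL listed above.
BASE_URL = "https://api.openweathermap.org/data/2.5/air_pollution/history"


def to_unix(year: int, month: int, day: int) -> int:
    """Return the Unix timestamp (seconds) for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def build_history_url(lat: float, lon: float, start: int, end: int, api_key: str) -> str:
    """Assemble a history query URL (hypothetical helper, for illustration only)."""
    params = {"lat": lat, "lon": lon, "start": start, "end": end, "appid": api_key}
    return f"{BASE_URL}?{urlencode(params)}"


# Coordinates for Lahore as given in the URL above; the key is a placeholder.
url = build_history_url(
    31.5497, 74.3436, to_unix(2023, 1, 1), to_unix(2024, 1, 1), "YOUR_API_KEY"
)
print(url)
```

Keeping the API key out of the source file, for example by reading it from an environment variable, avoids publishing it together with the methods description.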

Evaluation Method
- Data Split: 80% Training, 20% Testing using `train_test_split` from Scikit-learn
- Validation: 10% of the training data is held out as a validation set for the deep learning models (a single hold-out split, not k-fold cross-validation)
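
The split described above can be sketched as follows. This is a minimal example on synthetic placeholder data; the feature count, the four label values, and `random_state=42` are assumptions, since the source does not state them.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic placeholder data standing in for the air-quality features and labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))      # 8 features is an assumption
y = rng.integers(0, 4, size=1000)   # 4 smog levels is an assumption

# 80% training, 20% testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hold out 10% of the training portion for validation (deep learning models only).
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.1, random_state=42
)

print(len(X_fit), len(X_val), len(X_test))
```

With 1,000 samples this gives 720 fitting, 80 validation, and 200 test samples, that is, 72%, 8%, and 20% of the full dataset. If the validation data is instead taken through the Keras `validation_split=0.1` argument of `model.fit`, note that Keras uses the last 10% of the samples passed in, selected before any shuffling.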

Assessment Metrics
- Accuracy Score: Used for all models to evaluate the overall classification performance.
- Confusion Matrix: Generated for each model to visualize class-wise performance.
- Justification:
  - Accuracy is a suitable summary metric for multi-class classification when the classes are roughly balanced.
  - Confusion matrices provide detailed insight into misclassifications among smog levels.
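
To make the two metrics concrete, the sketch below computes them with scikit-learn on a small set of made-up labels. The label values are illustrative only and are not results from this study. The class counts are printed as well, because the justification for accuracy depends on the classes being balanced.

```python
from collections import Counter

from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up true and predicted smog levels for eight samples (illustration only).
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 2, 0, 1]

# Class distribution of the true labels, to check how balanced they are.
print(Counter(y_true))

# Fraction of samples whose predicted level equals the true level.
acc = accuracy_score(y_true, y_pred)

# Rows are true levels, columns are predicted levels.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])

print(acc)
print(cm)

# Accuracy is the sum of the diagonal divided by the total count.
acc_from_cm = cm.trace() / cm.sum()
```

Here six of eight predictions are correct, so the accuracy is 0.75, and the off-diagonal entries show which levels were confused with each other (for example, one sample of level 2 predicted as level 0).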

Models Applied
- Classical ML: SVM, Decision Tree, Random Forest, KNN
- Deep Learning: CNN, DNN, LSTM (Keras/TensorFlow)
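
A minimal sketch of how the four classical models could be trained and compared with scikit-learn is given below. It uses synthetic data from `make_classification` and library default hyperparameters; both are assumptions, since the source does not list the settings used. The deep learning models are not sketched because their layer configurations are not given.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class data standing in for the smog-level dataset.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Default hyperparameters throughout; the values used in the study are not stated.
models = {
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
}

# Fit each model on the training split and score it on the test split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

SVM and KNN are distance-based and usually benefit from feature scaling (for example with `StandardScaler`), which is omitted here for brevity.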

Model Deployment
The trained models have been deployed using Streamlit and are accessible online at:
https://smog-pred.streamlit.app
This interface allows real-time smog level classification for uploaded datasets.