## Overview

The script `ren-energy-paper-code.py` demonstrates how to:

- Install the necessary Python packages.
- Load data for training and evaluation.
- Plot training accuracy and loss to visualize the performance of machine learning models over time.

## Requirements

To run this script, you will need:

- Python 3.x
- Jupyter Notebook or any Python IDE (e.g., PyCharm, VSCode)
- An internet connection for package installation

## Installation

1. Clone the repository (if applicable):

   ```bash
   git clone https://github.com/ping543f/ren-energy.git
   cd ren-energy
   ```

2. Install the required packages. The script installs them automatically, but you can also install them manually:

   ```bash
   pip install plot_keras_history numpy matplotlib
   ```

## Running the Script

1. Open the script in a Python IDE or Jupyter Notebook.
2. Run it:
   - In Jupyter Notebook, run each cell sequentially.
   - In a Python IDE, simply run the script.

The script will:

- Load sample data for training accuracy and loss.
- Plot the training accuracy and loss using Matplotlib.
- Display the plots so you can analyze the model's performance over epochs.

## Description and Explanation

### 1. Importing Necessary Libraries

- **NumPy (`np`)**: Used for numerical operations, particularly with arrays and matrices, which are fundamental in data processing and machine learning.
- **Pandas (`pd`)**: A data manipulation library used for reading and processing datasets (such as `.csv` files).
- **TensorFlow (`tf`) and Keras**: Popular libraries for building and training machine learning models, especially neural networks.
- **Matplotlib (`plt`)**: A plotting library for visualizing data and model performance.
- **Scikit-learn (`sklearn`)**: Provides tools for preprocessing data and evaluating model performance through various metrics.
- **Google Colab libraries (`files` and `drive`)**: Used to interact with Google Colab, a cloud-based platform for running Python code.

### 2. Mounting Google Drive

Mounting Google Drive allows the code to access files stored there directly, which is useful for working with datasets kept in the cloud (a one-line sketch appears after Section 5).

### 3. Loading and Plotting Data

- **Loading data**: Reads a CSV file containing hourly energy demand data into a DataFrame (`df`), indexed by date and time (`Datetime`).
- **Plotting the data**: Visualizes the raw energy demand data over time to reveal trends and patterns.

### 4. Data Normalization

- **Normalization**: Scales the data to the range (0, 1) using `MinMaxScaler` to prepare it for machine learning models. This ensures that all inputs are on the same scale, improving model performance and convergence speed.
- **Visualizing normalized data**: Shows the normalized data, which now has all values between 0 and 1. A sketch of the loading and normalization steps follows Section 5.

### 5. Data Preparation for the LSTM Model

- **Data preparation**: A helper function prepares the data for the LSTM model.
- **Splitting data into sequences**: Splits the time series into input sequences (`X_train`) and targets (`y_train`) for training. The `seq_len` parameter defines how many past time steps are used to predict the next one.
- **Reshaping data**: The data is reshaped to fit the input requirements of the LSTM model, as shown in the sketch below.
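If you run the script in Google Colab, the Drive mount from step 2 is a single call. The mount point below is Colab's standard default:

```python
# Mount Google Drive in Colab so datasets stored there are readable as files.
from google.colab import drive

drive.mount("/content/drive")
```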
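To make steps 3-4 concrete, here is a minimal sketch of the loading and normalization pipeline. The file name (`energy_demand.csv`) and column name (`Demand`) are illustrative placeholders, not the script's actual identifiers:

```python
# Minimal sketch of steps 3-4: load hourly demand data and normalize it.
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler

# Load the CSV, parse the Datetime column, and use it as the index.
df = pd.read_csv("energy_demand.csv", parse_dates=["Datetime"],
                 index_col="Datetime")

# Plot the raw demand series to inspect trends and seasonality.
df["Demand"].plot(figsize=(12, 4), title="Hourly energy demand")
plt.show()

# Scale all values into the (0, 1) range for the LSTM.
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(df[["Demand"]])

# Visualize the normalized series; all values now lie between 0 and 1.
plt.plot(df.index, scaled)
plt.title("Normalized energy demand")
plt.show()
```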
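And a sketch of the sequence-splitting step described in Section 5, continuing from the `scaled` array above. The function name `make_sequences`, the `seq_len` value, and the 80/20 split ratio are illustrative:

```python
# Minimal sketch of step 5: split the scaled series into overlapping
# windows of `seq_len` past steps (inputs) and the following step (target).
import numpy as np

def make_sequences(data, seq_len):
    X, y = [], []
    for i in range(len(data) - seq_len):
        X.append(data[i:i + seq_len])  # seq_len past time steps
        y.append(data[i + seq_len])    # the next step to predict
    return np.array(X), np.array(y)

seq_len = 24  # illustrative: one day of hourly readings
X, y = make_sequences(scaled, seq_len)

# Reshape to the (samples, time steps, features) layout LSTM layers expect.
X = X.reshape((X.shape[0], seq_len, 1))

# Hold out the final 20% of sequences for testing (ratio illustrative).
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
```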
### 6. Building and Training the LSTM Model

**LSTM model architecture:**

- **Sequential model**: A linear stack of layers where each layer has exactly one input tensor and one output tensor.
- **LSTM layers**: Long Short-Term Memory (LSTM) layers with 200 units each, used to capture temporal dependencies in the data. The first LSTM layer returns sequences (`return_sequences=True`) so the next LSTM layer can receive its full output.
- **Dropout layers**: Prevent overfitting by randomly setting a fraction of input units to 0 at each update during training.
- **Dense layer**: A fully connected layer that produces the output.
- **Model summary and visualization**: Prints a summary of the model architecture and visualizes it using `plot_model`.

### 7. Compiling and Training the Model

- **Compiling the model**: Specifies the optimizer (`adam`) and the error measures (`mean_squared_error` as the loss, `mean_absolute_error` as an additional metric) used to train and monitor the model.
- **Training the model**: Trains the model on the training data (`X_train`, `y_train`) for 10 epochs with a batch size of 3000, validating against the test data (`X_test`, `y_test`).
- **Saving the model**: Saves the trained model to a file for future use.

### 8. Evaluating Model Performance

- **Making predictions**: Uses the trained model to predict energy demand on the test data.
- **Calculating performance metrics**: Evaluates the model's performance using:
  - **R² score**: Indicates how well the predictions match the actual data.
  - **Mean Absolute Error (MAE)**: Measures the average magnitude of the prediction errors.
  - **Mean Squared Error (MSE)**: Like MAE, but squares the errors before averaging, penalizing larger errors more heavily.

### 9. Plotting Predictions

- **Plotting predictions**: Visualizes the model's predictions against the actual data to assess accuracy visually. This shows how well the model performs and whether it captures the underlying trends.

### 10. Using the Prophet and SVR Models

The code also applies two other models, Prophet and SVR (Support Vector Regression), to energy demand prediction:

- **Prophet model**: A time series forecasting model that captures trend and seasonality effects in the data.
- **SVR model**: A regression method that maps the input data into a high-dimensional feature space to improve prediction accuracy. Different kernels (RBF, linear, polynomial) are tested to see which fits the data best.

Minimal sketches of steps 6-10 follow.
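First, a sketch of the architecture and training loop from steps 6-7, reusing the arrays from the preparation sketch above. The dropout rate and the saved file name are illustrative; consult the script for the exact values:

```python
# Minimal sketch of steps 6-7: two stacked 200-unit LSTM layers with
# dropout, trained with adam on MSE for 10 epochs at batch size 3000.
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from plot_keras_history import plot_history

model = Sequential([
    # First LSTM returns full sequences so the second LSTM can consume them.
    LSTM(200, return_sequences=True, input_shape=(seq_len, 1)),
    Dropout(0.2),                  # dropout rate is illustrative
    LSTM(200),
    Dropout(0.2),
    Dense(1),                      # fully connected output layer
])
model.summary()

model.compile(optimizer="adam", loss="mean_squared_error",
              metrics=["mean_absolute_error"])

history = model.fit(X_train, y_train, epochs=10, batch_size=3000,
                    validation_data=(X_test, y_test))

# Plot the training/validation curves with plot_keras_history.
plot_history(history)
plt.show()

model.save("lstm_energy_model.h5")  # file name is illustrative
```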
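Next, a sketch of the evaluation and plotting in steps 8-9, using the scikit-learn metrics named above:

```python
# Minimal sketch of steps 8-9: score the model on the test set and
# overlay predictions on the actual (normalized) demand values.
import matplotlib.pyplot as plt
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

y_pred = model.predict(X_test)

print("R^2:", r2_score(y_test, y_pred))
print("MAE:", mean_absolute_error(y_test, y_pred))
print("MSE:", mean_squared_error(y_test, y_pred))

plt.figure(figsize=(12, 4))
plt.plot(y_test, label="Actual")
plt.plot(y_pred, label="Predicted")
plt.legend()
plt.title("LSTM predictions vs. actual demand")
plt.show()
```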
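For the Prophet model in step 10, a minimal sketch is below. Prophet requires a two-column DataFrame (`ds` for timestamps, `y` for values); the renamed columns (`Datetime`, `Demand`) and the 48-hour horizon are illustrative, and older installations import from `fbprophet` rather than `prophet`:

```python
# Minimal sketch of the Prophet forecast in step 10.
from prophet import Prophet  # older versions: from fbprophet import Prophet

# Prophet expects columns `ds` (timestamps) and `y` (values).
prophet_df = df.reset_index().rename(columns={"Datetime": "ds",
                                              "Demand": "y"})

m = Prophet()  # models trend plus daily/weekly/yearly seasonality
m.fit(prophet_df)

future = m.make_future_dataframe(periods=48, freq="h")  # 48 hours ahead
forecast = m.predict(future)
m.plot(forecast)
```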
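Finally, a sketch of the SVR kernel comparison from step 10. How the script feeds the series to SVR is an assumption here: this version reuses the sequence windows from the preparation sketch and flattens them, since SVR takes 2-D input:

```python
# Minimal sketch of the SVR comparison in step 10: one model per kernel,
# scored with R^2 on the held-out windows.
from sklearn.svm import SVR
from sklearn.metrics import r2_score

# Flatten each (seq_len, 1) window into a plain feature vector.
X_tr = X_train.reshape(len(X_train), -1)
X_te = X_test.reshape(len(X_test), -1)

for kernel in ["rbf", "linear", "poly"]:
    svr = SVR(kernel=kernel)
    svr.fit(X_tr, y_train.ravel())
    score = r2_score(y_test, svr.predict(X_te))
    print(f"SVR ({kernel}) R^2: {score:.3f}")
```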