**Title**: Hybrid Machine Learning Framework for Remaining Useful Life Prediction of Lithium-Ion Batteries

**Dataset**: NASA Battery Data Set (Saha & Goebel, 2007). Available at: https://www.nasa.gov/content/prognostics-center-of-excellence-data-set-repository

Data Description:

A set of four Li-ion batteries (#5, #6, #7 and #18) were run through three different operational profiles (charge, discharge and impedance) at room temperature. Charging was carried out in constant current (CC) mode at 1.5 A until the battery voltage reached 4.2 V, and then continued in constant voltage (CV) mode until the charge current dropped to 20 mA. Discharge was carried out at a constant current (CC) level of 2 A until the battery voltage fell to 2.7 V, 2.5 V, 2.2 V and 2.5 V for batteries 5, 6, 7 and 18 respectively. Impedance measurement was carried out through an electrochemical impedance spectroscopy (EIS) frequency sweep from 0.1 Hz to 5 kHz.

Repeated charge and discharge cycles result in accelerated aging of the batteries, while impedance measurements provide insight into the internal battery parameters that change as aging progresses. The experiments were stopped when the batteries reached the end-of-life (EOL) criterion, which was a 30% fade in rated capacity (from 2 Ahr to 1.4 Ahr). This dataset can be used for the prediction of both remaining charge (for a given discharge cycle) and remaining useful life (RUL).

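
The EOL criterion above (30% fade from the 2 Ahr rated capacity, i.e. 1.4 Ahr) is enough to derive a per-cycle RUL label from a capacity series. The following is a minimal sketch, not the labeling code used in this repository: the function names are hypothetical, and it assumes that EOL is the first discharge cycle whose capacity is at or below the threshold and that RUL is counted in discharge cycles.

```python
from typing import List, Optional, Sequence

RATED_CAPACITY_AH = 2.0   # rated capacity stated in the data description
EOL_FADE = 0.30           # 30% capacity fade defines end of life
EOL_CAPACITY_AH = RATED_CAPACITY_AH * (1.0 - EOL_FADE)  # about 1.4 Ahr


def first_eol_cycle(capacities: Sequence[float],
                    threshold: float = EOL_CAPACITY_AH) -> Optional[int]:
    """Index of the first cycle at or below the EOL threshold, or None."""
    for index, capacity in enumerate(capacities):
        if capacity <= threshold:
            return index
    return None


def rul_labels(capacities: Sequence[float],
               threshold: float = EOL_CAPACITY_AH) -> List[Optional[int]]:
    """Cycles remaining until EOL for each cycle (assumed definition).

    Returns None for every cycle if the series never reaches EOL.
    Cycles at or after EOL get a label of 0.
    """
    eol = first_eol_cycle(capacities, threshold)
    if eol is None:
        return [None] * len(capacities)
    return [max(eol - index, 0) for index in range(len(capacities))]
```

Measured capacity can recover briefly between cycles, so a series may cross the threshold more than once; this sketch uses the first crossing, which is one possible convention among several.
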
Files:

- `B0005.mat`: Data for Battery #5
- `B0006.mat`: Data for Battery #6
- `B0007.mat`: Data for Battery #7
- `B0018.mat`: Data for Battery #18

Data Structure:

- `cycle`: top level structure array containing the charge, discharge and impedance operations
  - `type`: operation type, can be charge, discharge or impedance
  - `ambient_temperature`: ambient temperature (degree C)
  - `time`: the date and time of the start of the cycle, in MATLAB date vector format
  - `data`: data structure containing the measurements
    - for charge the fields are:
      - `Voltage_measured`: Battery terminal voltage (Volts)
      - `Current_measured`: Battery output current (Amps)
      - `Temperature_measured`: Battery temperature (degree C)
      - `Current_charge`: Current measured at charger (Amps)
      - `Voltage_charge`: Voltage measured at charger (Volts)
      - `Time`: Time vector for the cycle (secs)
    - for discharge the fields are:
      - `Voltage_measured`: Battery terminal voltage (Volts)
      - `Current_measured`: Battery output current (Amps)
      - `Temperature_measured`: Battery temperature (degree C)
      - `Current_charge`: Current measured at load (Amps)
      - `Voltage_charge`: Voltage measured at load (Volts)
      - `Time`: Time vector for the cycle (secs)
      - `Capacity`: Battery capacity (Ahr) for discharge till 2.7V
    - for impedance the fields are:
      - `Sense_current`: Current in sense branch (Amps)
      - `Battery_current`: Current in battery branch (Amps)
      - `Current_ratio`: Ratio of the above currents
      - `Battery_impedance`: Battery impedance (Ohms) computed from raw data
      - `Rectified_impedance`: Calibrated and smoothed battery impedance (Ohms)
      - `Re`: Estimated electrolyte resistance (Ohms)
      - `Rct`: Estimated charge transfer resistance (Ohms)

**Methodology**: Details are explained in the paper. The pipeline follows preprocessing → feature selection → LSTM → XGBoost → evaluation.

**Code Description**: This repository includes Python code and scripts used for data preprocessing, feature selection, model training (RF, LSTM, XGBoost), statistical evaluation, and generating the figures in the manuscript.

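
As an illustration of the preprocessing stage, here is a minimal sketch of pulling the per-cycle discharge capacity out of the `cycle` structure described above. It is not the code in `preprocess.py`. It assumes the `.mat` file has already been converted into a plain list of Python dicts with the documented field names (`type`, `data`, `Capacity`); that conversion step (for instance after loading with `scipy.io.loadmat`, which returns nested arrays rather than dicts) is not shown, and the function name and toy values are hypothetical.

```python
from typing import Any, Dict, List


def discharge_capacities(cycles: List[Dict[str, Any]]) -> List[float]:
    """Collect Capacity (Ahr) from discharge cycles, in recorded order.

    Charge and impedance cycles are skipped, as are discharge cycles
    that carry no Capacity value.
    """
    capacities: List[float] = []
    for cycle in cycles:
        if cycle.get("type") != "discharge":
            continue
        capacity = cycle.get("data", {}).get("Capacity")
        if capacity is None:
            continue
        capacities.append(float(capacity))
    return capacities


# Toy input mirroring the documented layout (values are made up).
example_cycles = [
    {"type": "charge", "ambient_temperature": 24, "data": {"Time": [0.0, 2.5]}},
    {"type": "discharge", "ambient_temperature": 24, "data": {"Capacity": 1.86}},
    {"type": "impedance", "ambient_temperature": 24, "data": {"Re": 0.05}},
    {"type": "discharge", "ambient_temperature": 24, "data": {"Capacity": 1.85}},
]
example_capacities = discharge_capacities(example_cycles)
```

The resulting list is the kind of capacity-versus-cycle series from which degradation features and RUL labels are typically derived.
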
**Code Structure**:

- `preprocess.py`: Handles data cleaning, normalization, and labeling
- `feature_selection.py`: Implements RFE with Random Forest and Pearson Correlation
- `lstm_model.py`: Builds and trains LSTM for temporal features
- `xgboost_model.py`: Final RUL prediction using stacked ensemble
- `stat_eval.py`: Includes Wilcoxon test, Bland-Altman analysis, confidence intervals, etc.

**Usage Instructions**:

```bash
pip install -r requirements.txt
python preprocess.py
python feature_selection.py
python lstm_model.py
python xgboost_model.py
python stat_eval.py
```

**Requirements**:

- Python 3.9+
- TensorFlow, Keras, Scikit-learn, XGBoost, Pandas, NumPy, Matplotlib
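
For readers unfamiliar with the Bland-Altman analysis listed under `stat_eval.py`, the sketch below shows the standard calculation: the mean difference between predicted and actual RUL (the bias) and the 95% limits of agreement at bias ± 1.96 standard deviations of the differences. It uses only the Python standard library, the function name is hypothetical, and it is not the implementation in `stat_eval.py`.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman(predicted: Sequence[float],
                 actual: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower_limit, upper_limit) for paired values.

    bias is the mean of (predicted - actual); the limits of agreement
    are bias -/+ 1.96 times the sample standard deviation of the
    differences.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(predicted) < 2:
        raise ValueError("at least two pairs are required")
    diffs = [p - a for p, a in zip(predicted, actual)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width
```

A bias near zero with narrow limits indicates that predictions agree closely with the reference RUL values across the evaluated cycles.
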