# Quantum-inspired Seagull optimised Deep Belief Network approach for cardiovascular disease prediction

## Overview

This project predicts the presence of heart disease using several machine learning algorithms. The data comes from four sources: the Cleveland, Hungarian, Switzerland, and Long Beach VA datasets. These datasets are well known in cardiovascular research and have been used extensively to develop predictive models for heart disease.

## Dataset

The project uses the following datasets:

- **Cleveland Data**: `cleveland.data` and `processed.cleveland.data`
- **Hungarian Data**: `hungarian.data` and `processed.hungarian.data`
- **Switzerland Data**: `switzerland.data` and `processed.switzerland.data`
- **Long Beach VA Data**: `long-beach-va.data` and `processed.va.data`

These datasets contain attributes such as age, sex, chest pain type, resting blood pressure, cholesterol level, fasting blood sugar, resting ECG results, maximum heart rate achieved, exercise-induced angina, and more.

## Preprocessing

The raw data was preprocessed as follows:

- Removal of duplicate entries.
- Calculation of Body Mass Index (BMI).
- Filtering out unrealistic blood pressure values.
- Splitting the data into training (70%) and test (30%) sets.

## Models Used

The following machine learning models were used to predict heart disease:

- **Logistic Regression**
- **Random Forest Classifier**
- **Gradient Boosting Classifier**
- **Support Vector Machines (SVM)**
- **K-Nearest Neighbors (KNN)**
- **Neural Networks (using Keras)**
- **XGBoost**
- **LightGBM**

Additionally, hyperparameter tuning was performed using `hyperopt`.

## How to Run the Code

1. Clone the repository or download the code.
2. Install the required dependencies with pip:
   ```
   pip install -r requirements.txt
   ```
3. Run the implementation script:
   ```
   python Implementation_code.py
   ```

## Dependencies

The project requires the following Python libraries:

- `numpy`
- `pandas`
- `matplotlib`
- `scikit-learn` (imported as `sklearn`)
- `keras`
- `xgboost`
- `lightgbm`
- `hyperopt`

## Results

The performance of each model was evaluated using cross-validation. Ensemble models such as Random Forest and Gradient Boosting outperformed the individual classifiers.

## Conclusion

This project demonstrates the effectiveness of machine learning models in predicting heart disease. With proper preprocessing and model tuning, predictive performance can be improved significantly.

## Acknowledgments

The datasets used in this project are provided by the UCI Machine Learning Repository and have been cited in numerous academic papers on heart disease prediction.

## Contact

For any questions or issues, please contact [Your Name] at [Your Email].
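
## Appendix: Preprocessing Sketch

The four steps listed under Preprocessing can be sketched roughly as follows. This is a minimal illustration, not the code in `Implementation_code.py`. The column names `trestbps` and `num` follow the UCI attribute names, while `height_cm`, `weight_kg`, the 50 to 250 mm Hg plausibility range, the helper names `preprocess` and `split_70_30`, and the toy records are all assumptions made for the example.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def preprocess(df, bp_col="trestbps", bp_range=(50, 250)):
    """Deduplicate rows, add BMI when possible, and drop implausible blood pressures."""
    out = df.drop_duplicates().copy()
    # BMI = weight in kg / (height in m)^2. The height/weight columns are hypothetical.
    if {"height_cm", "weight_kg"}.issubset(out.columns):
        out["bmi"] = out["weight_kg"] / (out["height_cm"] / 100.0) ** 2
    low, high = bp_range  # assumed plausibility bounds in mm Hg
    out = out[out[bp_col].between(low, high)]
    return out.reset_index(drop=True)


def split_70_30(df, target="num", seed=42):
    """Return X_train, X_test, y_train, y_test using a 70/30 split."""
    X = df.drop(columns=[target])
    y = df[target]
    return train_test_split(X, y, test_size=0.3, random_state=seed)


# Toy records for illustration only; these are not rows from the UCI files.
demo = pd.DataFrame({
    "age":       [63, 63, 41, 56, 57, 44, 52, 54, 48, 49, 64, 58],
    "trestbps":  [145, 145, 130, 120, 0, 120, 172, 140, 130, 130, 110, 150],
    "height_cm": [170, 170, 165, 180, 175, 160, 172, 168, 158, 177, 169, 174],
    "weight_kg": [70, 70, 60, 85, 77, 55, 80, 72, 50, 90, 68, 82],
    "num":       [1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1],
})

clean = preprocess(demo)
X_train, X_test, y_train, y_test = split_70_30(clean)
print(len(demo), len(clean), len(X_train), len(X_test))
```

On the toy frame, one duplicated row and one record with a blood pressure of 0 are removed, so 12 rows become 10, which then split into 7 training and 3 test rows. Fixing `random_state` keeps the split reproducible between runs.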