Title: Nocturnal sleep sound classification with multi-spectrogram feature fusion and an attention-based stacked hybrid ConvBiLSTM-ViT architecture
Author: Ensar Arif Sağbaş
Date: July 2025

Description:
This repository contains all source code and evaluation scripts for the study titled "Nocturnal sleep sound classification with multi-spectrogram feature fusion and an attention-based stacked hybrid ConvBiLSTM-ViT architecture", submitted to PeerJ Computer Science.

The project proposes a hybrid deep learning framework that combines CNN, BiLSTM, Attention, and Vision Transformer (ViT) models to classify sleep-related environmental sounds. Each audio file is first converted into three spectrogram representations (Mel, MFCC, and CQT), and each representation is passed through its own model branch, with SpecAugment applied for robustness. Ensemble strategies (average, weighted, and stacking) then combine the branch outputs to improve performance. All evaluations use 10-fold stratified cross-validation.

Dataset:
- Source: Akbal & Tuncer (2021), "FusedTSNet: An automated nocturnal sleep sound classification method based on a fused textural and statistical feature generation network", Applied Acoustics, Vol. 171, Article 107559
- Link: https://doi.org/10.1016/j.apacoust.2020.107559
- Format: 700 .wav files (48 kHz, 4–8 seconds)
- Classes: 0: Cough, 1: Laugh, 2: Scream, 3: Sneeze, 4: Snore, 5: Sniffle, 6: Farting

Directory Structure:
- SLEEP/nocturnal_wav/ → Original .wav files
- SLEEP/mel/, SLEEP/mfcc/, SLEEP/cqt/ → Spectrogram images
- SLEEP/results_ensemble_multibranch/ → All outputs (.npy, .png, .txt)

Files:
1. ensemble_from_wav_multibranch_regularization.py – Full pipeline: preprocessing, training, evaluation
2. save_branch_predictions.py – Saves softmax predictions per branch
3. evaluate_branches_10fold.py – Evaluates each model across 10 folds
4. weighted_ensemble_evaluation.py – Performs weighted ensemble classification
5. stacking_logreg.py – Applies logistic regression stacking
6. stacking_xgboost.py – Applies XGBoost stacking
7. confidence_distribution.py – Plots confidence histograms
8. pca&tsne.py – Visualizes softmax feature space
9. requirements.txt – Lists required dependencies

Dependencies:
Install all required Python packages using:
    pip install -r requirements.txt

Execution Instructions:
To run the complete experiment, execute the scripts in the following order. The last filename is quoted because an unquoted & is interpreted by most shells as a background operator.
    python 1-ensemble_from_wav_multibranch_regularization.py
    python 2-save_branch_predictions.py
    python 3-evaluate_branches_10fold.py
    python 4-weighted_ensemble_evaluation.py
    python 5-stacking_logreg.py
    python 6-stacking_xgboost.py
    python 7-confidence_distribution.py
    python "8-pca&tsne.py"

Outputs:
All evaluation results (.npy arrays, classification reports, ROC curves, confusion matrices) are stored under:
    SLEEP/results_ensemble_multibranch/

Contact:
For replication issues or dataset access inquiries, please contact: arifsagbas@mu.edu.tr

Citation:
If you use this repository, please cite:
Sağbaş E.A., Nocturnal sleep sound classification with multi-spectrogram feature fusion and an attention-based stacked hybrid ConvBiLSTM-ViT architecture, PeerJ Computer Science, Under Review.
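
Example (average and weighted ensembling):
The Description mentions average and weighted ensembles over the per-branch softmax predictions. The sketch below illustrates one common way to do this (soft voting) and is not taken from the repository scripts. The function names, the random stand-in predictions, and the weights 0.5/0.3/0.2 are all hypothetical; the real scripts may load saved .npy predictions and choose weights differently.

```python
import numpy as np

def average_ensemble(branch_probs):
    """Unweighted soft voting: mean of the per-branch softmax outputs."""
    return np.mean(np.stack(branch_probs, axis=0), axis=0)

def weighted_ensemble(branch_probs, weights):
    """Weighted soft voting; weights are normalised so they sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    stacked = np.stack(branch_probs, axis=0)   # (branches, samples, classes)
    return np.tensordot(w, stacked, axes=1)    # (samples, classes)

# Stand-in softmax outputs for the Mel, MFCC, and CQT branches:
# 5 samples x 7 classes, each row summing to 1 (hypothetical data).
rng = np.random.default_rng(0)
mel, mfcc, cqt = (rng.dirichlet(np.ones(7), size=5) for _ in range(3))

avg = average_ensemble([mel, mfcc, cqt])
wavg = weighted_ensemble([mel, mfcc, cqt], weights=[0.5, 0.3, 0.2])
pred = wavg.argmax(axis=1)  # predicted class index (0..6) per sample
```

Because the weights are normalised, each fused row remains a probability distribution, and equal weights reduce to the plain average.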
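
Example (stratified 10-fold assignment):
The evaluations are described as 10-fold stratified cross-validation. The sketch below shows the idea of a stratified split using NumPy only: every fold receives roughly the same share of each class. It is an illustration, not the repository's code. The helper name stratified_folds, the seed, and the label vector (100 files per class) are assumptions; the README states only 700 files and 7 classes, not the per-class counts.

```python
import numpy as np

def stratified_folds(labels, n_splits=10, seed=42):
    """Return a fold index (0..n_splits-1) for every sample,
    distributing each class evenly across the folds."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(labels), dtype=int)
    for cls in np.unique(labels):
        # Shuffle this class's sample indices, then deal them out round-robin.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        fold_of[idx] = np.arange(len(idx)) % n_splits
    return fold_of

# Assumed balanced labels: 7 classes x 100 files (hypothetical).
labels = np.repeat(np.arange(7), 100)
fold_of = stratified_folds(labels)

for k in range(10):
    test_idx = np.flatnonzero(fold_of == k)    # held-out samples for fold k
    train_idx = np.flatnonzero(fold_of != k)   # remaining samples for training
    # A model would be trained on train_idx and evaluated on test_idx here.
```

scikit-learn's StratifiedKFold provides a comparable split; whether the repository uses it is not stated in this README.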