README for BTSCM: A Bimodal Time Series Classification Model
==============================================================

📘 Project Title:
Parametric Art Creation Platform Design Based on Visual Delivery and Multimedia Data Fusion

🔬 Summary:
This project introduces BTSCM (Bimodal Time Series Classification Model), a multimodal deep learning framework that combines I3D-based video feature extraction with MFCC-based audio feature processing. It is designed to support video classification and intelligent content tagging on art creation platforms.

==============================================================

📁 Project Structure:
- `train_btscm.py`: Main training script for the BTSCM network
- `evaluate.py`: Script for testing on the supported datasets
- `model/`: FCN, LSTM, and DQN modules
- `data/`: Preprocessed audio-visual features and segment definitions
- `configs/`: YAML configuration files for hyperparameters and environment control

==============================================================

🖥 Computing Infrastructure:
- OS: Ubuntu 20.04 LTS
- CPU: Intel Xeon Silver 4214R (24 cores)
- GPU: NVIDIA Tesla V100 (32 GB HBM2)
- RAM: 128 GB
- Software: Python 3.8, PyTorch 1.13, CUDA 11.6

🔧 Key Dependencies:
- numpy
- pandas
- torch >= 1.13
- torchaudio
- librosa
- scikit-learn
- matplotlib

To install:
conda create -n btscm_env python=3.8
conda activate btscm_env
pip install -r requirements.txt

==============================================================

📊 Datasets Used:
- UCF101
- Kinetics-400
- Sports-1M
- ActivityNet
- Self-collected art video dataset (not public)

Data format:
- Video input: MP4 or AVI, 25 FPS
- Audio input: 16 kHz mono WAV
- Labels: action class, video-level annotations

==============================================================

🔍 Data Preprocessing Steps:
1. Videos are resized to 224x224 and segmented into 64-frame clips.
2. Audio is downsampled to 16 kHz and converted to MFCCs (40 filters).
3. Z-score normalization is applied across all features.
4. Audio and video streams are time-aligned based on timestamps.

==============================================================

🚀 Training:
To train the model on UCF101:
python train_btscm.py --dataset UCF101 --config configs/btscm_ucf.yaml

To evaluate:
python evaluate.py --model checkpoints/btscm_final.pth --dataset UCF101
==============================================================
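📎 Preprocessing Sketch:
The preprocessing steps above can be sketched in a few lines of numpy. This is a minimal illustration on synthetic data, not the repository's actual pipeline: the helper names `segment_clips` and `zscore` are assumptions, and in practice step 2 would use `librosa.feature.mfcc(y=audio, sr=16000, n_mfcc=40)` rather than random stand-in features.

```python
import numpy as np

def segment_clips(frames, clip_len=64):
    """Split a (T, H, W, C) frame array into non-overlapping clips of
    clip_len frames each; trailing frames that do not fill a clip are dropped."""
    n_clips = frames.shape[0] // clip_len
    return frames[:n_clips * clip_len].reshape(n_clips, clip_len, *frames.shape[1:])

def zscore(features, eps=1e-8):
    """Z-score normalize a (T, D) feature matrix per feature dimension."""
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)

# Synthetic stand-ins: 150 video frames at 224x224x3, and a (T, 40) MFCC matrix.
frames = np.zeros((150, 224, 224, 3), dtype=np.float32)
clips = segment_clips(frames)   # -> shape (2, 64, 224, 224, 3); 22 leftover frames dropped
mfcc = np.random.rand(300, 40)
mfcc_norm = zscore(mfcc)        # each of the 40 coefficients now has mean ~0, std ~1
```

Dropping the incomplete trailing clip keeps every training sample a fixed 64 frames; padding the last clip instead would be an equally valid choice.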
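📎 Fusion Sketch:
The bimodal design described in the summary (I3D video features fused with MFCC audio features) can be illustrated with a toy late-fusion classifier in PyTorch. This is a hedged sketch, not the actual BTSCM architecture: the class name `BimodalClassifier`, the 1024-d pooled I3D embedding, the mean-pooling of the MFCC sequence, and the hidden size are all assumptions for illustration (101 classes matches UCF101).

```python
import torch
import torch.nn as nn

class BimodalClassifier(nn.Module):
    """Toy late-fusion head: concatenate a per-clip video embedding with a
    time-pooled audio embedding, then classify with a small MLP."""
    def __init__(self, video_dim=1024, audio_dim=40, hidden=256, num_classes=101):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(video_dim + audio_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, video_feat, audio_feat):
        # video_feat: (B, video_dim), e.g. a pooled I3D clip embedding
        # audio_feat: (B, T, audio_dim), an MFCC sequence; mean-pool over time
        audio_pooled = audio_feat.mean(dim=1)
        return self.fuse(torch.cat([video_feat, audio_pooled], dim=1))

model = BimodalClassifier()
logits = model(torch.randn(8, 1024), torch.randn(8, 300, 40))
print(logits.shape)  # torch.Size([8, 101])
```

Mean-pooling the audio stream is the simplest alignment-free fusion; a sequence model (e.g. the LSTM module under `model/`) over the MFCC frames would preserve temporal structure at higher cost.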