# README: Design of a Consumer Behavior Prediction Model Integrating Reinforcement Learning and Time Series Analysis in Online E-Commerce Reviews

## 📌 Title
**QL-ANN-HMM: A Hybrid Consumer Behavior Prediction Model Based on Reinforcement Learning and Time Series Analysis**

## 📄 Description
This repository contains the implementation of **QL-ANN-HMM**, a hybrid model for predicting consumer behavior from e-commerce reviews. The model combines an **improved Q-learning** algorithm, a **genetic-algorithm-optimized artificial neural network (GA-ANN)**, and a **hidden Markov model (HMM)**. The aim is to improve prediction accuracy by handling the nonlinearity, noise, and stochastic patterns in multivariate time series derived from platforms such as Amazon and Flipkart.

## 📂 Dataset Information
- **Amazon Reviews 2023**: Over 570 million reviews covering 48 million products across 33 categories.
- **Flipkart Reviews**: E-commerce consumer reviews with metadata (ratings, timestamps, helpfulness).
- **Preprocessing** includes:
  - Tokenization, normalization, emoji/stopword removal.
  - Min-Max scaling of numerical features.
  - Sliding window time series segmentation (30-day window, 7-day stride).
  - Outlier removal with Z-score filtering.
  - BERT-based sentiment embedding extraction.
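
The numeric steps listed above can be sketched in plain Python. This is a minimal illustration, not the code in `preprocess.py`: the function names, the Z-score threshold of 3.0, and the order in which the steps are applied are assumptions. Only the 30-day window and 7-day stride come from the list above.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def zscore_filter(values, threshold=3.0):
    """Keep points whose absolute Z-score is within the threshold (assumed value)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= threshold]

def sliding_windows(series, window=30, stride=7):
    """Cut a daily series into windows of `window` days, advancing `stride` days each time."""
    return [series[i:i + window] for i in range(0, len(series) - window + 1, stride)]

# Hypothetical daily average ratings for 60 days.
daily_ratings = [4.0 + 0.1 * (d % 5) for d in range(60)]
windows = sliding_windows(min_max_scale(daily_ratings))
print(len(windows), len(windows[0]))
```

With 60 days of data, a 30-day window, and a 7-day stride, the windows start on days 0, 7, 14, 21, and 28, so five overlapping segments are produced.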

## 💻 Code Information
- **QL Module**: Implements improved Q-learning with probabilistic action selection.
- **ANN Module**: GA-optimized feed-forward ANN with 8- and 60-node configurations.
- **HMM Module**: Sequence modeling via Gaussian mixture emission probabilities.
- **Integration**: All components fused via a multi-stage pipeline for training and prediction.
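
To make "probabilistic action selection" concrete, here is a hedged sketch of tabular Q-learning with softmax (Boltzmann) sampling. It shows the standard one-step update, not the improved multi-step variant in this repository; the temperature, learning rate, discount factor, and the toy two-state environment are all assumptions chosen for illustration.

```python
import math
import random

def softmax_probs(q_values, temperature=1.0):
    """Boltzmann distribution over actions: a higher Q-value gets a higher probability."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]

def select_action(q_values, temperature=1.0, rng=random):
    """Sample an action index in proportion to its softmax probability."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]

def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step update toward reward + gamma * max over next-state Q-values."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])

# Toy environment (hypothetical): 2 states, 2 actions, action 1 always pays 1.0.
q_table = [[0.0, 0.0], [0.0, 0.0]]
rng = random.Random(0)
for _ in range(500):
    state = rng.randrange(2)
    action = select_action(q_table[state], temperature=0.5, rng=rng)
    reward = 1.0 if action == 1 else 0.0
    next_state = rng.randrange(2)
    q_update(q_table, state, action, reward, next_state)
print(q_table)
```

Sampling from the softmax keeps some exploration while still favoring high-value actions; lowering the temperature moves the behavior toward a purely greedy choice.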

## 🛠️ Usage Instructions

1. **Environment Setup**
   ```bash
   pip install -r requirements.txt
   ```

2. **Preprocessing**
   ```bash
   python preprocess.py --dataset amazon_reviews.csv --output structured_data.pkl
   ```

3. **Train Models**
   ```bash
   python train_qlearning.py
   python train_ann.py
   python train_hmm.py
   ```

4. **Run Prediction**
   ```bash
   python predict.py --model QL-ANN-HMM --input structured_data.pkl
   ```

5. **Evaluate Performance**
   ```bash
   python evaluate.py --metrics MAE MAPE NMSE
   ```
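
For reference, the three metrics named in the evaluation step can be computed as follows. This is a sketch under stated assumptions rather than the contents of `evaluate.py`: MAPE is reported in percent, and NMSE is taken as the squared error divided by the squared deviation of the true values from their mean, which is one common convention among several.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; y_true must not contain zeros."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def nmse(y_true, y_pred):
    """Squared error normalized by the spread of y_true around its mean (assumed convention)."""
    mean_true = sum(y_true) / len(y_true)
    num = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    den = sum((t - mean_true) ** 2 for t in y_true)
    return num / den

# Hypothetical targets and predictions.
y_true = [2.0, 4.0, 6.0]
y_pred = [2.0, 5.0, 4.0]
print(mae(y_true, y_pred), mape(y_true, y_pred), nmse(y_true, y_pred))
```

Under this NMSE convention, a value of 1.0 corresponds to a model that does no better than always predicting the mean of the true values.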

## 🧠 Methodology
The predictive framework involves the following:

- **Reinforcement Learning Component**:
  - Probabilistic multi-step Q-learning with a greedy policy.
  - Accelerated convergence and stability improvements.
- **ANN Component**:
  - GA-tuned topology and weight optimization.
  - Predicts short-term consumer actions from multivariate features.
- **HMM Component**:
  - Captures hidden behavioral states.
  - Outputs sequential probability distributions using Gaussian mixtures.
- **Model Fusion**:
  - Historical data → ANN → Reinforcement-guided selection → HMM → Final output.
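
As an illustration of the HMM component described above, the following sketch runs the scaled forward algorithm for an HMM whose emissions are one-dimensional Gaussian mixtures. It is written in plain Python for readability and is not the `hmmlearn`-based code in this repository; the two hidden states, all parameter values, and the observation sequence are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmm_pdf(x, components):
    """Mixture density; `components` is a list of (weight, mean, variance) tuples."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)

def forward_log_likelihood(obs, start_prob, trans, emissions):
    """Scaled forward pass; returns log P(obs) and the filtered state posterior at the last step."""
    n = len(start_prob)
    alpha = [start_prob[i] * gmm_pdf(obs[0], emissions[i]) for i in range(n)]
    scale = sum(alpha)
    alpha = [a / scale for a in alpha]
    log_lik = math.log(scale)
    for x in obs[1:]:
        # Propagate through the transition matrix, then weight by the emission density.
        alpha = [
            gmm_pdf(x, emissions[j]) * sum(alpha[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        scale = sum(alpha)
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik, alpha

# Hypothetical two-state model: state 0 emits near 0, state 1 emits near 4 or 6.
start_prob = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emissions = [
    [(1.0, 0.0, 1.0)],
    [(0.5, 4.0, 1.0), (0.5, 6.0, 1.0)],
]
log_lik, posterior = forward_log_likelihood([0.1, -0.3, 5.2, 5.8], start_prob, trans, emissions)
print(log_lik, posterior)
```

Rescaling `alpha` at every step avoids numerical underflow on long sequences, and the final `alpha` gives the probability of each hidden state given all observations so far.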

## 📦 Requirements
- Python 3.10+
- PyTorch 2.0
- NumPy 1.25
- Pandas 2.1
- scikit-learn 1.3
- hmmlearn
- deap
- seaborn, matplotlib

## 📚 Citations
If you use this code or dataset in your research, please cite:

> Lin, Z., Huang, Y., Yang, J., Cui, C., Lian, Y., Zhang, H., & Al-Turjman, F. (2025). *Design of a Consumer Behavior Prediction Model Integrating Reinforcement Learning and Time Series Analysis in Online E-Commerce Reviews*. [Manuscript]

## 📜 License & Contribution Guidelines
- **License**: MIT License (or specify if otherwise)
- **Contributions**: Fork the repo → create a feature branch → submit a pull request with clear documentation.