## Controllable Chinese Landscape Art Creation using Adversarially Regularized Autoencoder

This repository provides a reference implementation of an adversarially regularized autoencoder (ARAE) for controllable generation of Chinese landscape paintings. The model learns a latent space with an encoder–decoder autoencoder and uses adversarial training to align generated images with real images. Guided generation is supported by blending a reference image with generated images, weighted by a coefficient `alpha`.


### Repository structure
```
python code/
  ├─ chinese_landscape_paintings/
  │  └─ train/               # dataset directory (images + metadata.jsonl)
  ├─ README.md               
  ├─ requirements.txt
  ├─ preprocess.py           # dataset preprocessing pipeline
  ├─ train.py                # training
  ├─ sample.py               # guided generation (alpha blending)
  └─ src/
     ├─ data/
     │  └─ dataset.py
     ├─ models/
     │  ├─ encoder.py
     │  ├─ decoder.py
     │  ├─ generator.py
     │  └─ discriminator.py
     └─ utils/
        ├─ losses.py
        └─ checkpoint.py
```

### Requirements and Installation
1) Create a Python 3.9+ environment.
2) Install requirements:
```bash
pip install -r requirements.txt
```

### Dataset
The dataset (images plus `metadata.jsonl`) lives in `python code/chinese_landscape_paintings/train`.

Run preprocessing, which resizes each image so that its short side is 512 px, takes either a center crop or sliding-window crops, and optionally removes text with inpainting:
```bash
python preprocess.py \
  --input_dir "python code/chinese_landscape_paintings/train" \
  --output_dir "python code/chinese_landscape_paintings/train_preprocessed" \
  --min_short_side 512 --crop_size 512 --stride 256 --remove_text false
```
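
To make the `--crop_size 512 --stride 256` options concrete, here is a minimal sketch of how sliding-window offsets along one image axis could be computed after the short side is resized to 512. The helper names `resize_short_side` and `window_origins` are hypothetical, and appending a final edge-aligned window is an assumption; `preprocess.py` may handle leftover pixels differently.

```python
def resize_short_side(width: int, height: int, target: int = 512) -> tuple[int, int]:
    """Return the (width, height) that makes the shorter side equal to `target`."""
    scale = target / min(width, height)
    # max() guards against rounding the short side down to target - 1.
    return max(target, round(width * scale)), max(target, round(height * scale))


def window_origins(length: int, crop: int = 512, stride: int = 256) -> list[int]:
    """Return start offsets of `crop`-wide windows along an axis of `length` pixels."""
    if length < crop:
        raise ValueError("axis is shorter than the crop size")
    origins = list(range(0, length - crop + 1, stride))
    # Assumption: add one last window flush with the far edge so no pixels are skipped.
    if origins[-1] != length - crop:
        origins.append(length - crop)
    return origins


if __name__ == "__main__":
    w, h = resize_short_side(2000, 800)  # a wide, handscroll-like image
    print((w, h), window_origins(w))
```

With a stride of half the crop size, neighbouring windows overlap by 50%, so a wide painting contributes several partially overlapping 512×512 crops.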

### Instructions 

Training:
```bash
python train.py \
  --data_dir "python code/chinese_landscape_paintings/train" \
  --save_dir "python code/outputs" \
  --image_size 512 \
  --latent_dim 256 \
  --batch_size 4 \
  --lr 2e-4 \
  --critic_iters 5 \
  --gp_lambda 10.0 \
  --max_steps 10000
```
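
The `--critic_iters` and `--gp_lambda` flags match the usual WGAN-GP setup. As a sketch of what they control, assuming `train.py` follows the standard formulation (the loss code itself is not shown in this README): the critic $D$ is updated `critic_iters` times per generator update (5 in the command above), and `gp_lambda` is the weight $\lambda$ on the gradient penalty (10.0 above). The symbols $x$ (real image), $\tilde{x}$ (generated image), and $\hat{x}$ (random interpolate between the two) are notation introduced here for illustration.

```latex
\begin{aligned}
\hat{x} &= \epsilon\, x + (1-\epsilon)\,\tilde{x}, \qquad \epsilon \sim U[0,1] \\
L_D &= \mathbb{E}\big[D(\tilde{x})\big] - \mathbb{E}\big[D(x)\big]
      + \lambda\, \mathbb{E}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big]
\end{aligned}
```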

Guided sampling (alpha blending):
```bash
python sample.py \
  --checkpoint "python code/outputs/latest.pt" \
  --reference "python code/chinese_landscape_paintings/train/1_res.jpg" \
  --alpha 0.75 \
  --num_samples 4 \
  --out_dir "python code/samples"
```
- `alpha = 1.0`: the output closely follows the structure of the reference image
- `alpha = 0.0`: the output is driven purely by noise, giving more diverse styles
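
The two extremes above are consistent with a convex combination of the reference and a generated image. A minimal sketch, assuming `sample.py` blends in pixel space as `alpha * reference + (1 - alpha) * generated` (the function name `blend` is hypothetical, and the actual script may blend differently):

```python
def blend(reference, generated, alpha: float):
    """Convex combination of a reference and a generated image.

    Uses only `*` and `+`, so it works on floats as well as on NumPy arrays
    or PyTorch tensors of matching shape.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * reference + (1.0 - alpha) * generated


if __name__ == "__main__":
    # Single pixel values in the Tanh range [-1, 1], for illustration.
    print(blend(1.0, -1.0, 0.75))  # 0.5
```

Because both inputs lie in [-1, 1] and `alpha` lies in [0, 1], the blended values stay in the same range, so no clamping is needed before converting back to an image.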

### Method summary
- Autoencoder: a convolutional encoder that downsamples to a global-average-pooled latent vector, and a deconvolutional decoder that upsamples back to image space with a Tanh output
- Generator: FC → reshape to 512×4×4 → a series of Deconv2D(5×5) + BN + ReLU blocks → Tanh; maps noise to images
- Discriminator: FC layers operating on flattened images (786,432 → 256 → 128 → 1, where 786,432 = 3 × 512 × 512) with a Sigmoid output
- Objective: L1 reconstruction loss plus a Wasserstein distance on images with a gradient penalty
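
The layer sizes above can be sanity-checked with a little arithmetic. The sketch below assumes RGB inputs at `--image_size 512`, and assumes each generator deconvolution uses stride 2, padding 2, and output padding 1 so that it exactly doubles the resolution. These hyperparameters are not stated above, and the helper names are hypothetical.

```python
def deconv_out(size: int, kernel: int = 5, stride: int = 2,
               padding: int = 2, output_padding: int = 1) -> int:
    """Output size of a transposed convolution along one axis (dilation 1)."""
    return (size - 1) * stride - 2 * padding + (kernel - 1) + output_padding + 1


def doubling_layers(start: int = 4, target: int = 512) -> int:
    """Count the stride-2 deconv layers needed to grow `start` to `target`."""
    size, layers = start, 0
    while size < target:
        size = deconv_out(size)
        layers += 1
    if size != target:
        raise ValueError("target is not reachable by repeated doubling")
    return layers


flat_inputs = 3 * 512 * 512            # flattened RGB image fed to the discriminator
first_fc_weights = flat_inputs * 256   # weights in the 786,432 -> 256 layer (no bias)

if __name__ == "__main__":
    print(flat_inputs, first_fc_weights, doubling_layers())
```

Under these assumptions, seven doubling layers take the 4×4 feature map to 512×512, and the first discriminator layer alone holds about 201 million weights.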


### License and Citation
This code is provided for research and educational purposes.

- Dataset: https://ieeexplore.ieee.org/document/10219843





