# AnomalyGPT
🌐 Project Page • 🤗 Online Demo • 📃 Paper • 🤖 Model • 📹 Video
Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Ming Tang, Jinqiao Wang

****

## Catalogue:

* 1. Introduction
* 2. Running AnomalyGPT Demo
  * 2.1 Environment Installation
  * 2.2 Prepare ImageBind Checkpoint
  * 2.3 Prepare Vicuna Checkpoint
  * 2.4 Prepare Delta Weights of AnomalyGPT
  * 2.5 Deploying Demo
* 3. Train Your Own AnomalyGPT
  * 3.1 Data Preparation
  * 3.2 Training Configurations
  * 3.3 Training AnomalyGPT
* 4. Examples
* License
* Citation
* Acknowledgments

****

### 1. Introduction: [Back to Top]
We leverage a pre-trained image encoder and a Large Language Model (LLM) to align IAD images with their corresponding textual descriptions via simulated anomaly data. We employ a lightweight, visual-textual feature-matching-based image decoder to obtain localization results, and design a prompt learner that provides fine-grained semantics to the LLM and fine-tunes the LVLM using prompt embeddings. Our method can also detect anomalies in previously unseen items when only a few normal samples are provided.
****
### 2. Running AnomalyGPT Demo [Back to Top]
#### 2.1 Environment Installation
Clone the repository locally:
```bash
git clone https://github.com/CASIA-IVA-Lab/AnomalyGPT.git
```
Install the required packages:
```bash
pip install -r requirements.txt
```
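If you prefer to keep the dependencies isolated, you can install them into a fresh environment first. The sketch below is one possible sequence; the environment name and Python version are illustrative assumptions, not requirements of the repository:
```bash
# Optional: create and activate an isolated environment before installing
# (environment name and Python version are illustrative assumptions)
conda create -n anomalygpt python=3.10 -y
conda activate anomalygpt
cd AnomalyGPT
pip install -r requirements.txt
```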
#### 2.2 Prepare ImageBind Checkpoint:
You can download the pre-trained ImageBind model using [this link](https://dl.fbaipublicfiles.com/imagebind/imagebind_huge.pth). After downloading, put the downloaded file (`imagebind_huge.pth`) in the [[./pretrained_ckpt/imagebind_ckpt/]](./pretrained_ckpt/imagebind_ckpt/) directory.
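For reference, the checkpoint can also be fetched from the command line (a minimal sketch assuming `wget` is available and the repository root is the working directory):
```bash
# Download imagebind_huge.pth into the expected checkpoint directory
mkdir -p ./pretrained_ckpt/imagebind_ckpt/
wget -P ./pretrained_ckpt/imagebind_ckpt/ \
    https://dl.fbaipublicfiles.com/imagebind/imagebind_huge.pth
```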
#### 2.3 Prepare Vicuna Checkpoint:
To prepare the pre-trained Vicuna model, please follow the instructions provided [[here]](./pretrained_ckpt#1-prepare-vicuna-checkpoint).
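Because the PandaGPT delta weights below are built on Vicuna version 0, the base Vicuna checkpoint is typically reconstructed by applying the LMSYS delta weights to a LLaMA checkpoint with FastChat. The sketch below follows FastChat's documented `apply_delta` procedure; the local paths are assumptions, and the linked instructions remain the authoritative reference:
```bash
# Reconstruct Vicuna-7B (v0) by applying the LMSYS delta weights to a local
# LLaMA-7B checkpoint in Hugging Face format (paths are illustrative assumptions)
pip install fschat
python -m fastchat.model.apply_delta \
    --base-model-path ./llama-7b-hf \
    --target-model-path ./pretrained_ckpt/vicuna_ckpt/7b_v0 \
    --delta-path lmsys/vicuna-7b-delta-v0
```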
#### 2.4 Prepare Delta Weights of AnomalyGPT:
We use the pre-trained parameters from [PandaGPT](https://github.com/yxuansu/PandaGPT) to initialize our model. You can get the weights of PandaGPT trained with different strategies from the table below. In our experiments and online demo, we use Vicuna-7B and `openllmplayground/pandagpt_7b_max_len_1024` due to limited computational resources. Better results are expected when switching to Vicuna-13B.
| **Base Language Model** | **Maximum Sequence Length** | **Huggingface Delta Weights Address** |
| :---------------------: | :-------------------------: | :----------------------------------------------------------: |
| Vicuna-7B (version 0) | 512 | [openllmplayground/pandagpt_7b_max_len_512](https://huggingface.co/openllmplayground/pandagpt_7b_max_len_512) |
| Vicuna-7B (version 0) | 1024 | [openllmplayground/pandagpt_7b_max_len_1024](https://huggingface.co/openllmplayground/pandagpt_7b_max_len_1024) |
| Vicuna-13B (version 0) | 256 | [openllmplayground/pandagpt_13b_max_len_256](https://huggingface.co/openllmplayground/pandagpt_13b_max_len_256) |
| Vicuna-13B (version 0) | 400 | [openllmplayground/pandagpt_13b_max_len_400](https://huggingface.co/openllmplayground/pandagpt_13b_max_len_400) |
Please put the downloaded 7B/13B delta weights file (pytorch_model.pt) in the [./pretrained_ckpt/pandagpt_ckpt/7b/](./pretrained_ckpt/pandagpt_ckpt/7b/) or [./pretrained_ckpt/pandagpt_ckpt/13b/](./pretrained_ckpt/pandagpt_ckpt/13b/) directory.
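As an example, the 7B delta weights can be fetched with the Hugging Face CLI (a sketch assuming a recent `huggingface_hub` release that provides the `download` command):
```bash
# Download the PandaGPT 7B delta weights into the expected directory
mkdir -p ./pretrained_ckpt/pandagpt_ckpt/7b/
huggingface-cli download openllmplayground/pandagpt_7b_max_len_1024 pytorch_model.pt \
    --local-dir ./pretrained_ckpt/pandagpt_ckpt/7b/
```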
After that, you can download AnomalyGPT weights from the table below.
| Setup and Datasets | Weights Address |
| :---------------------------------------------------------: | :-------------------------------: |
| Unsupervised on MVTec-AD | [AnomalyGPT/train_mvtec](https://huggingface.co/FantasticGNU/AnomalyGPT/blob/main/train_mvtec/pytorch_model.pt) |
| Unsupervised on VisA | [AnomalyGPT/train_visa](https://huggingface.co/FantasticGNU/AnomalyGPT/blob/main/train_visa/pytorch_model.pt) |
| Supervised on MVTec-AD, VisA, MVTec-LOCO-AD and CrackForest | [AnomalyGPT/train_supervised](https://huggingface.co/FantasticGNU/AnomalyGPT/blob/main/train_supervised/pytorch_model.pt) |
After downloading, put the AnomalyGPT weights in the [./code/ckpt/](./code/ckpt/) directory.
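These files can likewise be fetched directly; a sketch using the supervised checkpoint (the `resolve` URL below mirrors the `blob` link in the table):
```bash
# Download the supervised AnomalyGPT checkpoint into ./code/ckpt/
mkdir -p ./code/ckpt/
wget -O ./code/ckpt/pytorch_model.pt \
    https://huggingface.co/FantasticGNU/AnomalyGPT/resolve/main/train_supervised/pytorch_model.pt
```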
In our [online demo](https://huggingface.co/spaces/FantasticGNU/AnomalyGPT), we use the supervised setting as our default model for a better user experience. You can also try the other weights locally.
#### 2.5 Deploying Demo
Upon completion of the previous steps, you can run the demo locally as follows:
```bash
cd ./code/
python web_demo.py
```
****
### 3. Train Your Own AnomalyGPT [Back to Top]
**Prerequisites:** Before training the model, make sure the environment is properly installed and the checkpoints of ImageBind, Vicuna, and PandaGPT are downloaded.
#### 3.1 Data Preparation:
You can download the MVTec-AD dataset from [[this link]](https://www.mvtec.com/company/research/datasets/mvtec-ad/downloads) and the VisA dataset from [[this link]](https://github.com/amazon-science/spot-diff). You can also download the pre-training data of PandaGPT from [[here]](https://huggingface.co/datasets/openllmplayground/pandagpt_visual_instruction_dataset/tree/main). After downloading, put the data in the [[./data]](./data/) directory.
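As an illustration, once the MVTec-AD archive has been downloaded it can be unpacked into the expected location. The archive filename below is an assumption; use the name of the file you actually downloaded:
```bash
# Unpack MVTec-AD into ./data/mvtec_anomaly_detection
# (archive filename is an assumption)
mkdir -p ./data/mvtec_anomaly_detection
tar -xf mvtec_anomaly_detection.tar.xz -C ./data/mvtec_anomaly_detection
```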
The directory of [[./data]](./data/) should look like:
```
data
|---pandagpt4_visual_instruction_data.json
|---images
|-----|-- ...
|---mvtec_anomaly_detection
|-----|-- bottle
|-----|-----|----- ground_truth
|-----|-----|----- test
|-----|-----|----- train
|-----|-- capsule
|-----|-- ...
|----VisA
|-----|-- split_csv
|-----|-----|--- 1cls.csv
|-----|-----|--- ...
|-----|-- candle
|-----|-----|--- Data
|-----|-----|-----|----- Images
|-----|-----|-----|--------|------ Anomaly
|-----|-----|-----|--------|------ Normal
|-----|-----|-----|----- Masks
|-----|-----|-----|--------|------ Anomaly
|-----|-----|--- image_anno.csv
|-----|-- capsules
|-----|-----|----- ...
```
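A quick way to confirm that the layout matches the tree above (purely a convenience check, not required by the training scripts):
```bash
# Verify the expected top-level entries exist under ./data
ls -d data/pandagpt4_visual_instruction_data.json data/images \
      data/mvtec_anomaly_detection data/VisA
```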
#### 3.2 Training Configurations
The table below shows the training hyperparameters used in our experiments. The hyperparameters were selected based on the constraints of our computational resources, i.e., 2 × RTX 3090 GPUs.
| **Base Language Model** | **Epoch Number** | **Batch Size** | **Learning Rate** | **Maximum Length** |
| :---------------------: | :--------------: | :------------: | :---------------: | :----------------: |
| Vicuna-7B | 50 | 16 | 1e-3 | 1024 |
#### 3.3 Training AnomalyGPT
To train AnomalyGPT on the MVTec-AD dataset, please run the following commands:
```bash
cd ./code
bash ./scripts/train_mvtec.sh
```
The key arguments of the training script are as follows:
* `--data_path`: The path to the JSON file `pandagpt4_visual_instruction_data.json`.
* `--image_root_path`: The root path of the PandaGPT training images.
* `--imagebind_ckpt_path`: The path of the ImageBind checkpoint.
* `--vicuna_ckpt_path`: The directory that contains the pre-trained Vicuna checkpoints.
* `--max_tgt_len`: The maximum sequence length of training instances.
* `--save_path`: The directory where the trained delta weights are saved. This directory will be created automatically.
* `--log_path`: The directory where the logs are saved. This directory will be created automatically.
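Putting these together, the training script essentially wraps a launcher call of the following shape. This is an illustrative sketch only: the entry-point file name, the launcher invocation, and the paths are assumptions, and `./scripts/train_mvtec.sh` remains the supported way to start training:
```bash
# Hypothetical expanded invocation showing how the key arguments fit together
# (script name, launcher, and paths are assumptions)
deepspeed --num_gpus=2 train_mvtec.py \
    --data_path ./data/pandagpt4_visual_instruction_data.json \
    --image_root_path ./data/images/ \
    --imagebind_ckpt_path ./pretrained_ckpt/imagebind_ckpt/ \
    --vicuna_ckpt_path ./pretrained_ckpt/vicuna_ckpt/7b_v0/ \
    --max_tgt_len 1024 \
    --save_path ./ckpt/train_mvtec/ \
    --log_path ./ckpt/train_mvtec/log/
```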
Note that the number of epochs can be set via the `epochs` field in the [./code/config/openllama_peft.yaml](./code/config/openllama_peft.yaml) file, and the learning rate can be set in [./code/dsconfig/openllama_peft_stage_1.json](./code/dsconfig/openllama_peft_stage_1.json).
****
### 4. Examples
