# STF
**Repository Path**: xinci/STF
## Basic Information
- **Project Name**: STF
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-01-02
- **Last Updated**: 2025-01-02
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# Learn How to See: Collaborative Embodied Learning for Object Detection and Camera Adjusting
> Lingdong Shen, Chunlei Huo, Nuo Xu, Chaowei Han, Zichen Wang
## 1. Installation
This implementation is based on [Decision Transformer](https://sites.google.com/berkeley.edu/decision-transformer), [FCOS](https://github.com/tianzhi0549/FCOS), [gym](https://github.com/openai/gym), and [keras-rl](https://github.com/keras-rl/keras-rl).
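As a quick sanity check after installing the dependencies, the imports below should succeed. This is a minimal sketch; the repo does not pin exact versions, and the import names are assumptions based on the frameworks listed above:
```python
# Minimal sanity check that the main dependencies are importable.
# Exact version pins are not specified by the repo.
import torch  # PyTorch backbone for FCOS and the Decision Transformer
import gym    # environment interface for the camera simulators
import rl     # keras-rl, used by the DQN camera-control agents

print("torch", torch.__version__)
print("gym", gym.__version__)
```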
## 2. Dataset
Download the two image datasets used to create the environments: [SA](https://www.dropbox.com/s/jwusmkq90t0cq5f/SA.zip?dl=0) and [VP](https://www.dropbox.com/s/4jmdbpy0lbnyddn/VP.zip?dl=0).
Replay-buffer transition data is stored as trajectories under `./data/transition/`.
We have also extracted offline features for all scene images, which you can download if you need them [Features]. Because the shared file is too large for our Dropbox quota, we will host it on another file-sharing service later.
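To inspect one of these trajectory files, here is a minimal sketch; the `.pkl` extension and the record layout are assumptions, not documented by the repo:
```python
import pickle
from pathlib import Path

# Load one replay-buffer trajectory for inspection. The ".pkl" extension
# and the record layout are assumptions about the repo's data format.
transition_dir = Path("./data/transition")
first_file = next(transition_dir.glob("*.pkl"))
with open(first_file, "rb") as f:
    trajectory = pickle.load(f)

# Offline-RL transitions typically hold (state, action, reward, done) records;
# print the first entry to confirm the actual layout.
print(type(trajectory), len(trajectory))
print(trajectory[0])
```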
## 3. Training agent
### Object Detection Module
```
python -m torch.distributed.launch \
    --nproc_per_node=8 \
    --master_port=$((RANDOM + 10000)) \
    tools/train_net.py \
    --config-file configs/fcos/fcos_imprv_R_50_FPN_1x.yaml \
    DATALOADER.NUM_WORKERS 2 \
    OUTPUT_DIR training_dir/fcos_imprv_R_50_FPN_1x
```
### Camera Control Module
For the simulated airport (SA):
```
python tools/train_dqn_vs.py \
    --drl-weights training_dir/ddqn_plane \
    --double
```
For the virtual park (VP):
```
python tools/train_dqn_vsb.py \
    --drl-weights training_dir/ddqn_car \
    --double
```
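The `--double` flag enables double DQN, which decouples action selection from action evaluation to reduce Q-value overestimation. Below is a minimal keras-rl sketch of such an agent, assuming keras-rl2 with tf.keras; the network shape, feature size, and action count are assumptions, not the repo's actual configuration:
```python
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.memory import SequentialMemory
from rl.policy import EpsGreedyQPolicy

nb_actions = 9  # hypothetical number of discrete camera actions (pan/tilt/zoom)

# Simple Q-network over precomputed scene features (feature size assumed).
model = Sequential([
    Flatten(input_shape=(1, 2048)),
    Dense(256, activation="relu"),
    Dense(nb_actions, activation="linear"),
])

agent = DQNAgent(
    model=model,
    nb_actions=nb_actions,
    memory=SequentialMemory(limit=50000, window_length=1),
    policy=EpsGreedyQPolicy(),
    enable_double_dqn=True,  # what the --double flag presumably toggles
)
agent.compile(Adam(learning_rate=1e-4), metrics=["mae"])
# agent.fit(env, nb_steps=100000) would then train against a gym environment.
```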
### Training STF agent
Change `--game` for different datasets:
```
python run_dt_eod.py --seed [seed] --context_length 6 --epochs 5 --model_type 'reward_conditioned' --game 'SA' --batch_size 128 --data_dir_prefix [DIRECTORY_NAME]
```
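Here `--context_length 6` means the reward-conditioned Decision Transformer attends to the last six timesteps of (return-to-go, state, action) tokens. An illustrative sketch of how such a window is sliced; the names are assumptions, not the repo's actual data pipeline:
```python
K = 6  # corresponds to --context_length

def make_context(returns_to_go, states, actions, t):
    """Slice the last K timesteps ending at step t (shorter near the start)."""
    start = max(0, t - K + 1)
    return {
        "rtg":     returns_to_go[start:t + 1],
        "states":  states[start:t + 1],
        "actions": actions[start:t + 1],
    }

# Example: at step 10 of an episode, the model conditions on steps 5..10.
ctx = make_context(list(range(20)), list(range(20)), list(range(20)), t=10)
print(ctx["rtg"])  # [5, 6, 7, 8, 9, 10]
```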
## 4. Inference
### Steps
1. Generate the search file for the inference process.
2. Parse the search file into a JSON file for inference (a sketch of this step follows).
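A hypothetical sketch of step 2; the repo's actual conversion is done by `tools/get_json.py`, and the filenames and record schema below are assumptions:
```python
import json
import pickle

# Load the search file written by the camera-control module (name assumed).
with open("ddqn_plane_search.pkl", "rb") as f:
    search_records = pickle.load(f)

# Dump the records to JSON for the detector; default=str handles any
# non-JSON-serializable fields such as numpy types.
with open("search_steps.json", "w") as f:
    json.dump(search_records, f, indent=2, default=str)
```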
### Camera Control Module
For the simulated airport (SA):
```
python tools/test_dqn_vs.py \
    --drl-weights training_dir/ddqn_plane/dqn_weights_final.h5f \
    --pickle-dir ddqn_plane_search \
    --double
```
For the virtual park (VP):
```
python tools/test_dqn_vsb.py \
    --drl-weights training_dir/ddqn_car/dqn_weights_final.h5f \
    --pickle-dir ddqn_car_search \
    --double
```
### STF Control Module
Change `--game` for different datasets:
```
python run_dt_eod.py --test --model_path [MODEL_PATH] --seed [seed] --context_length 6 --epochs 5 --model_type 'reward_conditioned' --game 'SA' --batch_size 128 --data_dir_prefix [DIRECTORY_NAME]
```
Obtain the ground truth for each test step according to the corresponding camera parameters. Modify the category and the pickle-file name in `get_json.py`; a hypothetical example of that edit follows.
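The variable names below are hypothetical; check `tools/get_json.py` for the actual identifiers:
```python
# Hypothetical edit inside tools/get_json.py; the actual variable names
# in the script may differ.
CATEGORY = "plane"                        # object category for the SA dataset
PICKLE_FILE = "ddqn_plane_search.pkl"     # search file produced at inference
```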
Then run:
```
python tools/get_json.py
```
### Object Detection Module
Using the ground truth for each step generated by `get_json.py`, run the following command for testing:
```
python tools/test_net.py \
    --config-file configs/fcos/fcos_imprv_R_50_FPN_1x.yaml \
    MODEL.WEIGHT FCOS_imprv_R_50_FPN_1x.pth \
    TEST.IMS_PER_BATCH 4
```
## 5. Citation
```
@inproceedings{conf/aaai/ShenHXHW24,
  author    = {Lingdong Shen and Chunlei Huo and Nuo Xu and Chaowei Han and Zichen Wang},
  title     = {Learn How to See: Collaborative Embodied Learning for Object Detection and Camera Adjusting},
  booktitle = {AAAI},
  year      = {2024},
}
```