# Learn How to See: Collaborative Embodied Learning for Object Detection and Camera Adjusting

> Lingdong Shen, Chunlei Huo, Nuo Xu, Chaowei Han, Zichen Wang

## 1. Installation

This implementation is based on [Decision Transformer](https://sites.google.com/berkeley.edu/decision-transformer), [FCOS](https://github.com/tianzhi0549/FCOS), [gym](https://github.com/openai/gym), and [keras-rl](https://github.com/keras-rl/keras-rl).

## 2. Dataset

Download the two image datasets used to create the environment: [SA](https://www.dropbox.com/s/jwusmkq90t0cq5f/SA.zip?dl=0) and [VP](https://www.dropbox.com/s/4jmdbpy0lbnyddn/VP.zip?dl=0).

Replay buffer transition data is stored as trajectories under `./data/transition/`.

We have also extracted the offline features of all scene images [Features]. Because the shared file is too large (our Dropbox is out of space), we will host it on another file-sharing service later.
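To sanity-check the downloaded trajectories, a minimal inspection sketch follows. It assumes each file under `./data/transition/` is a Python pickle with a `.pkl` extension; both the extension and the per-file layout shown here are assumptions, not a confirmed format.

```python
import glob
import pickle

# Minimal sketch for inspecting the replay-buffer trajectories.
# Assumption: each file under ./data/transition/ is a Python pickle;
# the actual field names and on-disk layout may differ.
for path in sorted(glob.glob("./data/transition/*.pkl")):
    with open(path, "rb") as f:
        traj = pickle.load(f)
    if isinstance(traj, dict):
        # e.g. {"observations": ..., "actions": ..., "rewards": ...}
        print(path, {k: len(v) for k, v in traj.items()})
    else:
        print(path, type(traj), len(traj))
```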
## 3. Training agent

### Object Detection Module

```
python -m torch.distributed.launch \
    --nproc_per_node=8 \
    --master_port=$((RANDOM + 10000)) \
    tools/train_net.py \
    --config-file configs/fcos/fcos_imprv_R_50_FPN_1x.yaml \
    DATALOADER.NUM_WORKERS 2 \
    OUTPUT_DIR training_dir/fcos_imprv_R_50_FPN_1x
```

### Camera Control Module

For the simulated airport (SA):

```
python tools/train_dqn_vs.py \
    --drl-weights training_dir/ddqn_plane \
    --double
```

For the virtual park (VP):

```
python tools/train_dqn_vsb.py \
    --drl-weights training_dir/ddqn_car \
    --double
```
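The `--double` flag trains with Double DQN. Since the implementation builds on keras-rl, the agent setup plausibly resembles the sketch below; the network architecture, observation shape, action count, and hyperparameters are illustrative assumptions, not the repository's actual values. keras-rl's `enable_double_dqn=True` is the switch that implements the Double DQN target.

```python
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.memory import SequentialMemory
from rl.policy import EpsGreedyQPolicy

# Illustrative only: the observation shape and number of camera
# actions are assumptions, not the actual SA/VP environment spec.
obs_shape, nb_actions = (1, 64), 9

model = Sequential([
    Flatten(input_shape=obs_shape),
    Dense(256, activation="relu"),
    Dense(256, activation="relu"),
    Dense(nb_actions, activation="linear"),
])

# enable_double_dqn=True: actions are selected with the online network
# but evaluated with the target network, which mitigates the Q-value
# overestimation of vanilla DQN.
agent = DQNAgent(
    model=model,
    nb_actions=nb_actions,
    memory=SequentialMemory(limit=50000, window_length=1),
    policy=EpsGreedyQPolicy(eps=0.1),
    enable_double_dqn=True,
    target_model_update=1e-2,
)
agent.compile(Adam(lr=1e-3), metrics=["mae"])
# agent.fit(env, nb_steps=100000)  # env would be the gym camera environment
```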
### Training STF agent

Change `--game` to switch between datasets:

```
python run_dt_eod.py --seed [seed] --context_length 6 --epochs 5 \
    --model_type 'reward_conditioned' --game 'SA' --batch_size 128 \
    --data_dir_prefix [DIRECTORY_NAME]
```
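With `--model_type 'reward_conditioned'`, training follows the Decision Transformer recipe this code builds on: the model is conditioned on returns-to-go, the undiscounted sum of future rewards from each step onward. A minimal sketch of that computation (the helper name `returns_to_go` is ours, for illustration):

```python
from typing import List

def returns_to_go(rewards: List[float]) -> List[float]:
    """Suffix sums of a reward sequence: rtg[t] = sum(rewards[t:])."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running += rewards[t]
        rtg[t] = running
    return rtg

# Example: detection rewards collected along one camera trajectory.
print(returns_to_go([0.2, 0.5, 0.1]))  # [0.8, 0.6, 0.1]
```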
## 4. Inference

### Steps

1. Get the search file for the inference process.
2. Interpret the search file to generate a JSON file for inference.

### Camera Control Module

For the simulated airport (SA):

```
python tools/test_dqn_vs.py \
    --drl-weights training_dir/ddqn_plane/dqn_weights_final.h5f \
    --pickle-dir ddqn_plane_search \
    --double
```

For the virtual park (VP):

```
python tools/test_dqn_vsb.py \
    --drl-weights training_dir/ddqn_car/dqn_weights_final.h5f \
    --pickle-dir ddqn_car_search \
    --double
```

### STF Control Module

Change `--game` to switch between datasets:

```
python run_dt_eod.py --test --model_path [MODEL_PATH] --seed [seed] \
    --context_length 6 --epochs 5 --model_type 'reward_conditioned' \
    --game 'SA' --batch_size 128 --data_dir_prefix [DIRECTORY_NAME]
```

Obtain the ground truth of each step for testing according to the corresponding camera parameters. Modify the category and the name of the pickle file in `get_json.py`, then run:

```
python tools/get_json.py
```

### Object Detection Module

For the ground truth of each step generated by `get_json.py`, test with the following command:

```
python tools/test_net.py \
    --config-file configs/fcos/fcos_imprv_R_50_FPN_1x.yaml \
    MODEL.WEIGHT FCOS_imprv_R_50_FPN_1x.pth \
    TEST.IMS_PER_BATCH 4
```

## 5. Citation

```
@inproceedings{conf/aaai/ShenHXHW24,
  author    = {Lingdong Shen and Chunlei Huo and Nuo Xu and Chaowei Han and Zichen Wang},
  title     = {Learn How to See: Collaborative Embodied Learning for Object Detection and Camera Adjusting},
  booktitle = {AAAI},
  year      = {2024},
}
```