# UAVDetectionTrackingBenchmark

This repository contains the code, configuration files and dataset statistics used for the paper **Unmanned Aerial Vehicle Visual Detection and Tracking using Deep Neural Networks: A Performance Benchmark**, submitted to IROS 2021.

The repository is organized as follows:

- **datasets** (dir): Contains the COCO annotation files used for each dataset.
- **detection** (dir): Contains the configuration files for detection, the log files and some scripts used for setting up the detection datasets.

### Prerequisites

- PyTorch 1.7.1
- OpenCV 4.5
- MMCV 1.2.4
- MMDet 2.8.0
- MMTrack 0.5.1

### Installation

Detection and tracking were carried out using the OpenMMLab frameworks. This section summarizes how to set up the framework for each task.

#### Detection

1. Install [MMDetection](https://github.com/open-mmlab/mmdetection) following the [Getting Started](https://github.com/open-mmlab/mmdetection/blob/master/docs/get_started.md) guide.
2. Create a directory under the `configs` folder (e.g., `configs/uavbenchmark`) and copy the [config files](detection/configs) into it.
3. Create a `data` directory in the root folder of the project and create the dataset folders (symbolic links also work):
   - anti-uav
   - anti-uav/images
   - drone-vs-bird
   - drone-vs-bird/images
   - mav-vid
   - mav-vid/images
4. Copy the [annotation files](datasets) into the corresponding dataset folder.
5. Copy all the images for each dataset into its `images` folder (see the dataset details below).
6. Create a `checkpoints` folder under the root of the project and download the weight files (see below) into it.
7. Run the test script, passing a config file and its corresponding checkpoint: `python tools/test.py <config_file> <checkpoint_file>`

### Datasets

Three datasets were used in our benchmark. An example of each dataset is shown next, with (a) MAV-VID, (b) Drone-vs-Bird, (c) Anti-UAV Visual and (d) Anti-UAV Infrared.

![Dataset examples](/images/dataset_example.png)

#### Multirotor Aerial Vehicle VID (MAV-VID)

This dataset consists of videos of a single UAV recorded in different settings: captured from other drones, from ground-based surveillance cameras and from handheld mobile devices. It can be downloaded from its [Kaggle site](https://www.kaggle.com/alejodosr/multirotor-aerial-vehicle-vid-mavvid-dataset).

The dataset is composed of images with YOLO annotations, divided into two directories: train and val. To use it in this benchmark kit, create the COCO annotation files for each data partition using [convert_mav_vid_to_coco.py](detection/utils/convert_mav_vid_to_coco.py), rename them to *train.json* and *val.json*, and move them to the `data/mav-vid` directory created in the installation steps. Then, copy all images of both partitions to the `data/mav-vid/images` directory.
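The conversion itself amounts to rescaling YOLO's normalized, centre-based boxes into COCO's absolute, corner-based ones. The following is a rough sketch of that mapping only, not the repository's actual `convert_mav_vid_to_coco.py` script:

```python
# Illustrative only: maps one YOLO-format box (normalized cx, cy, w, h)
# to a COCO-format bbox [x_min, y_min, width, height] in absolute pixels.
# The repository's convert_mav_vid_to_coco.py is the authoritative version.

def yolo_to_coco_bbox(cx, cy, w, h, img_w, img_h):
    """Convert a normalized YOLO centre/size box to an absolute COCO corner/size box."""
    abs_w = w * img_w
    abs_h = h * img_h
    x_min = cx * img_w - abs_w / 2
    y_min = cy * img_h - abs_h / 2
    return [x_min, y_min, abs_w, abs_h]

# Example: a centred box covering half the image in each dimension
print(yolo_to_coco_bbox(0.5, 0.5, 0.5, 0.5, img_w=1920, img_h=1080))
# -> [480.0, 270.0, 960.0, 540.0]
```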
#### Drone-vs-Bird

This dataset comes from the Drone-vs-Bird Detection Challenge of the [International Workshop on Small-Drone Surveillance, Detection and Counteraction Techniques](https://wosdetc2020.wordpress.com/drone-vs-bird-detection-challenge/) at IEEE AVSS 2020, whose main goal is to reduce the high false positive rates that vision-based methods usually suffer. The dataset comprises videos of UAVs captured at long distances and often surrounded by small objects, such as birds. The videos can be downloaded upon request, and the annotations are available via the challenge's [GitHub site](https://github.com/wosdetc/challenge).

The annotations follow a custom format in which a .txt file is given for each video: each annotation file has one line per video frame, listing the frame and the bounding boxes of the objects it contains. To use this dataset in this benchmark, first convert the videos to images with [video_to_images.py](detection/utils/video_to_images.py) (a generic frame-extraction sketch is shown after the Anti-UAV subsection) and then create the COCO annotations with the [convert_drone_vs_bird_to_coco.py](detection/utils/convert_drone_vs_bird_to_coco.py) script. Just as with the MAV-VID dataset, copy the images to the `data/drone-vs-bird/images` directory and the annotations to `data/drone-vs-bird`.

#### Anti-UAV

This multi-modal dataset comprises fully-annotated but unaligned RGB and IR videos. It is intended to provide a real-world benchmark for evaluating object tracking algorithms in the context of UAVs, and contains recordings of 6 UAV models flying under different lighting and background conditions. The dataset can be downloaded from its [website](https://anti-uav.github.io/dataset/).

This dataset is also distributed as videos with custom annotations. Once downloaded and extracted, the videos are organized in folders containing the RGB and IR versions, with their corresponding JSON annotations. To convert this dataset to images and COCO annotations, use the [convert_anti_uav_to_coco.py](detection/utils/convert_anti_uav_to_coco.py) script, then copy the generated annotations to `data/anti-uav` and the images to `data/anti-uav/images`. The images folder will contain the images for both modalities, and full (both modalities), RGB-only and IR-only annotation files will be generated.
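Both Drone-vs-Bird and Anti-UAV ship as videos, so frames must be extracted before the COCO conversion. The repository's scripts handle this end to end; the sketch below only illustrates the frame-extraction step using OpenCV directly, and all paths are placeholders rather than the repository's interface:

```python
# Minimal frame-extraction sketch using OpenCV (cv2). Paths are placeholders;
# the repository's video_to_images.py script is the authoritative tool.
import os
import cv2

def extract_frames(video_path, out_dir):
    """Dump every frame of video_path as zero-padded JPEGs into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of stream (or unreadable frame)
            break
        cv2.imwrite(os.path.join(out_dir, f"{idx:06d}.jpg"), frame)
        idx += 1
    cap.release()
    return idx

# e.g. extract_frames("videos/clip_001.mp4", "data/drone-vs-bird/images")
```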
#### Dataset Statistics

Dataset sizes and average object sizes:

| Dataset | Size | Average Object Size |
|---------|------|---------------------|
| **MAV-VID** | *Training*: 53 videos (29,500 images)<br>*Validation*: 11 videos (10,732 images) | 136 x 77 pxs (0.66% of image size) |
| **Drone-vs-Bird** | *Training*: 61 videos (85,904 images)<br>*Validation*: 16 videos (18,856 images) | 34 x 23 pxs (0.10% of image size) |
| **Anti-UAV** | *Training*: 60 videos (149,478 images)<br>*Validation*: 140 videos (37,016 images) | *RGB*: 125 x 59 pxs (0.40% of image size)<br>*IR*: 52 x 29 pxs (0.50% of image size) |

Location, size and image composition statistics:

![Dataset statistics](/images/dataset_statistics.png)

### Detection Results

Four detection architectures were used for our analysis: [Faster RCNN](https://arxiv.org/abs/1506.01497), [SSD512](https://arxiv.org/abs/1512.02325), [YOLOv3](https://arxiv.org/abs/1804.02767) and [DETR](https://arxiv.org/abs/2005.12872). For the implementation details, refer to our paper.
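The table below uses the standard COCO detection metrics: AP is averaged over IoU thresholds 0.5:0.95, AP0.5 and AP0.75 are APs at fixed IoU thresholds, and the S/M/L suffixes break AP and AR down by small, medium and large objects. If you re-run a model, these numbers come from a standard COCO evaluation; a minimal sketch with placeholder file names, assuming pycocotools is installed and the detections were saved in COCO results format:

```python
# Standard COCO evaluation sketch using pycocotools; the annotation and
# results paths are placeholders for whatever your test run produced.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("data/mav-vid/val.json")               # ground-truth annotations
coco_dt = coco_gt.loadRes("results/detections.json")  # detections in COCO format

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP, AP0.5, AP0.75, APS/M/L, AR, ARS/M/L
```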
The results are as follows:

| Dataset | Model | Log | Weights | AP | AP0.5 | AP0.75 | APS | APM | APL | AR | ARS | ARM | ARL |
|---------|-------|-----|---------|-----|-------|--------|-----|-----|-----|-----|-----|-----|-----|
| MAV-VID | Faster RCNN | log | weights | 0.592 | 0.978 | 0.672 | 0.154 | 0.541 | 0.656 | 0.659 | 0.369 | 0.621 | 0.721 |
| | SSD512 | log | weights | 0.535 | 0.967 | 0.536 | 0.083 | 0.499 | 0.587 | 0.612 | 0.377 | 0.578 | 0.666 |
| | YOLOv3 | log | weights | 0.537 | 0.963 | 0.542 | 0.066 | 0.471 | 0.636 | 0.612 | 0.208 | 0.559 | 0.696 |
| | DETR | log | weights | 0.545 | 0.971 | 0.560 | 0.044 | 0.490 | 0.612 | 0.692 | 0.346 | 0.661 | 0.742 |
| Drone-vs-Bird | Faster RCNN | log | weights | 0.283 | 0.632 | 0.197 | 0.218 | 0.473 | 0.506 | 0.356 | 0.298 | 0.546 | 0.512 |
| | SSD512 | log | weights | | 0.629 | 0.134 | 0.199 | 0.422 | 0.052 | 0.379 | 0.327 | 0.549 | 0.556 |
| | YOLOv3 | log | weights | 0.210 | 0.546 | 0.105 | 0.158 | 0.395 | 0.356 | 0.302 | 0.238 | 0.512 | 0.637 |
| | DETR | log | weights | 0.251 | 0.667 | 0.123 | 0.190 | 0.444 | 0.533 | 0.473 | 0.425 | 0.631 | 0.550 |
| Anti-UAV-Full | Faster RCNN | log | weights | 0.612 | 0.974 | 0.701 | 0.517 | 0.619 | 0.737 | 0.666 | 0.601 | 0.670 | 0.778 |
| | SSD512 | log | weights | 0.613 | 0.982 | 0.697 | 0.527 | 0.619 | 0.712 | 0.678 | 0.616 | 0.682 | 0.780 |
| | YOLOv3 | log | weights | 0.604 | 0.977 | 0.676 | 0.529 | 0.619 | 0.708 | 0.667 | 0.618 | 0.668 | 0.760 |
| | DETR | log | weights | 0.586 | 0.977 | 0.648 | 0.509 | 0.589 | 0.692 | 0.649 | 0.598 | 0.649 | 0.752 |
| Anti-UAV-RGB | Faster RCNN | log | weights | 0.642 | 0.982 | 0.770 | 0.134 | 0.615 | 0.718 | 0.694 | 0.135 | 0.677 | 0.760 |
| | SSD512 | log | weights | 0.627 | 0.979 | 0.747 | 0.124 | 0.593 | 0.718 | 0.703 | 0.156 | 0.682 | 0.785 |
| | YOLOv3 | log | weights | 0.617 | 0.986 | 0.717 | 0.143 | 0.595 | 0.702 | 0.684 | 0.181 | 0.664 | 0.758 |
| | DETR | log | weights | 0.628 | 0.978 | 0.740 | 0.129 | 0.590 | 0.734 | 0.700 | 0.144 | 0.675 | 0.794 |
| Anti-UAV-IR | Faster RCNN | log | weights | 0.581 | 0.977 | 0.641 | 0.523 | 0.623 | - | 0.636 | 0.602 | 0.663 | - |
| | SSD512 | log | weights | 0.590 | 0.975 | 0.639 | 0.518 | 0.636 | - | 0.649 | 0.609 | 0.681 | - |
| | YOLOv3 | log | weights | 0.591 | 0.976 | 0.643 | 0.533 | 0.638 | - | 0.651 | 0.620 | 0.675 | - |
| | DETR | log | weights | 0.599 | 0.980 | 0.655 | 0.525 | 0.642 | - | 0.671 | 0.633 | 0.701 | - |
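To run one of the trained models on your own images, the weight files above can be loaded with MMDetection's high-level inference API. The config and checkpoint names below are placeholders for whichever model you downloaded, not files shipped under those exact names:

```python
# Single-image inference sketch using MMDetection's high-level API (mmdet 2.x).
# Config/checkpoint paths are placeholders for the model you downloaded.
from mmdet.apis import init_detector, inference_detector

config = "configs/uavbenchmark/faster_rcnn_mav_vid.py"  # placeholder name
checkpoint = "checkpoints/faster_rcnn_mav_vid.pth"      # placeholder name

model = init_detector(config, checkpoint, device="cuda:0")
result = inference_detector(model, "demo/uav_frame.jpg")  # per-class box arrays
print(result[0][:5])  # first five detections for class 0 (the UAV class)
```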
### Citation

```
@article{uavbenchmark,
    title={Unmanned Aerial Vehicle Visual Detection and Tracking using Deep Neural Networks: A Performance Benchmark},
    author={Isaac-Medina, Brian K. S. and Poyser, Matt and Organisciak, Daniel and Willcocks, Chris G. and Breckon, Toby P. and Shum, Hubert P. H.},
    journal={arXiv},
    year={2021}
}
```