Logo Github SeaDronesSee MaCVi                   Logo

MaCVi '23

1st Workshop on Maritime Computer Vision (MaCVi)

SeaDronesSee v2 - Object Detection


Download the SeaDronesSee Object Detection v2 dataset on the dataset page. Upload a json-prediction on the test set on the upload page.


The SeaDronesSee benchmark was recently published in WACV 2022. The goal of this benchmark is to advance computer vision algorithms in Search and Rescue missions. It is aimed at detecting humans, boats and other objects in open water. See the Explore page for some annotation examples. The task of object detection in maritime search and rescue is far from solved. For example, the best performing model of the SeaDronesSee object detection track currently only achieves 36% mAP. Compare that to the common COCO benchmark with the best performer achieving over 60% mAP. For the challenge of the workshop, we are going to make a few changes from the original SeaDronesSee object detection benchmark. In addition to the already publicly available data, we collected further training data, which is included at the start of the challenge. In particular, we extend the object detection track of SeaDronesSee by almost 9K newly captured images depicting the sea surface from the viewpoint of a UAV. The task is to detect the classes swimmer, boat, jetski, buoy, life saving appliance (life jacket/lifebelt). The ground truth bounding boxes are available and the evaluation protocol is based on the standard mean average precision. Owing to the application scenario, we offer an additional sub-track where only class-agnostic models are evaluated which resembles the use-case of detecting anything that is not water. Furthermore, we require participants to upload information on their used hardware and runtime.


Create an object detector that, given an image, outputs bounding boxes for the classes of interest. The bounding boxes are rectangular axis-aligned boxes.
There are two subtracks:
1. Standard Object Detection: In the first subtrack, you should localize and classify all annotated classes as given below.
2. Binary Object Detection: In the second subtrack, you should localize and classify objects in a binary fashion: water and non-water.

Your submission to the first subtrack will automatically be evaluated on the second subtrack by merging all your predicted classes to a single class "non-water". However, you are encouraged to specifically train and submit a model on the second subtrack since you might improve the accuracy for the (possibly simpler) binary prediction task.


There are 14,227 RGB images for the task of object detection (training: 8,930; validation: 1,547; testing: 3,750). The images are captured from various altitudes and viewing angles ranging from 5 to 260 meters and 0 to 90° degrees while providing the respective meta information for altitude, viewing angle and other meta data for almost all frames, which are of 3840x2160, 5456x3632 and 1229x934 resolution. Each image is annotated with labels for the classes

swimmer, boat, jetski, buoy, life saving appliance (life jacket/lifebelt)

Additionally, there is an ignore class. This region contains difficult to label or ambiguous objects. We blackened out these regions in the images already.
We provide meta data for most of the images. However, there is some number of images not containing any meta labels. Although the bounding box annotations for the test set are withheld, the meta data labels for the test set are available here.

Evaluation Metrics

We evaluate your predictions on the commonly used AP, AP50, AP75, AR1 and AR10. We use the commonly used COCO evaluation protocol. You can see the od.py protocol on our Github. For the first subtrack, we average the AP results over all classes. For the second subtrack, we only have a single class.

The determining metric for winning will be AP. In case of a draw, AP50 counts.

Furthermore, we require every participant to submit information on the speed of their method measured in frames per second walltime. Please also indicate the hardware that you used. Lastly, you should indicate which data sets (also for pretraining) you used during training and whether you used meta data.


In order to participate, you can perform the following steps:
  1. Download the dataset SeaDronesSee Object Detection v2 (not the old one!) on the dataset page.
  2. Visualize the bounding boxes using these Python scripts or Google colabs. You should adapt the paths for that.
  3. Train an object detector of your choice on the dataset. Optimally, you train a model for each of the stated subtracks individually. We provide you with sample object detection training pipelines on Git.
  4. Create a json-file in COCO style that are the predictions on the test set (see also the dataset page for more details). Find a sample submission file for the standard object detection here. For the binary track, download a sample file here.
  5. Upload your json-file along with all the required information in the form here. You need to register first. Make sur you choose the proper track (Object Detection v2 or Binary Object Detection v2).
  6. If you provided a valid file and sufficient information, you should see your method on the leaderboard. Now, you can reiterate. You may upload at most three times per day (independent of the challenge track).

Terms and Conditions

In case of any questions regarding the challenge modalities or uploads, please direct them to benjamin.kiefer@uni-tuebingen.de.

Sentient Vision Systems Logo