Download HOOT

Download Using HOOT Toolkit

The full HOOT dataset in its original quality (>3TB) and its HD version (566GB) can be downloaded using the toolkit. The dataset includes extracted frames for each video, as well as the annotation files (in JSON format).

Scroll down for additional information on the dataset/annotation format and licensing…

HOOT Dataset Structure


Folder Structure


The HOOT dataset has the following folder structure:

  • The dataset is organized by object class and video name.
  • The video frames have been extracted in the .png format and named using 0-indexing.
  • Basic video-level metadata (e.g. target and motion attributes) can be found in the meta.info file under each video folder.
  • Video annotations are given in the anno.json file in the video folder. More information on the annotation format can be found below.
  • Videos in the training split are listed in the train.txt file.
  • Videos in the test split are listed in the test.txt file.
  • License information for the dataset can be found in the root HOOT folder, in the license.txt file.
      HOOT/
      ├── apple/
      │   ├── 001/
      │   │   ├── 000000.png
      │   │   ├── 000001.png
      │   │   ├── ...
      │   │   ├── 000949.png
      │   │   ├── meta.info
      │   │   ├── anno.json
      │   ├── 002/
      │   ├── ...
      │   └── 020/
      ├── .../
      ├── zebra/
      ├── test.txt
      ├── train.txt
      └── license.txt
      
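Given the layout above, enumerating videos and reading the split files is straightforward. The following is a minimal sketch (the helper names `list_videos` and `load_split` are our own, not part of the toolkit):

```python
from pathlib import Path

def list_videos(root):
    """Yield (class_name, video_name, video_dir) for every video in a
    HOOT-style tree: root/<class>/<video>/ with frames and anno.json inside."""
    root = Path(root)
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for video_dir in sorted(p for p in class_dir.iterdir() if p.is_dir()):
            yield class_dir.name, video_dir.name, video_dir

def load_split(root, split):
    """Read train.txt or test.txt; assumes one class-video_name key per line."""
    return (Path(root) / f"{split}.txt").read_text().split()
```

Each yielded `video_dir` then contains the 0-indexed `.png` frames plus `meta.info` and `anno.json`.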

Annotation Format


HOOT annotations are in the JSON format and include the following:

  • Video key in the class-video_name format (e.g. apple-001).
  • Percentage of frames that have occlusion, given as frame_occlusion_level.
  • Median and mean occlusion levels of the target across frames (median_target_occlusion_level and mean_target_occlusion_level), computed using the mask IoU with the ground-truth target box.
  • A list of frame annotations (loaded as Python dictionaries); each entry includes:
    • A frame_id key with the frame index as its value.
    • rot_bb and aa_bb keys for the rotated and axis-aligned bounding boxes of the object. Each bounding box is given as [(x1,y1),(x2,y2),(x3,y3),(x4,y4)], where every coordinate is a float.
    • The masks dictionary contains the occlusion masks for the frame. If a specific occlusion mask does not appear in the frame, its value is []. Otherwise, the mask is given in the RLE format popularized by the COCO annotations; the toolkit requires pycocotools and provides examples for reading the annotations. If there is an occlusion in the frame, any of the following mask types may be annotated:
      • all: the mask for all occluders computed by taking the union of all masks.
      • s: the mask for all solid occluders combined.
      • sp: the mask for all sparse occluders combined.
      • st: the mask for all semi-transparent occluders combined.
      • t: the mask for all transparent occluders combined.
    • An attributes dictionary with the frame-level occlusion tags for absent, full_occlusion, cut_by_frame, partial_obj_occlusion and similar_occluder.
      {
        "video_key": "apple-001",
        "frame_occlusion_level": 0.6715,
        "median_target_occlusion_level": 0.6080,
        "mean_target_occlusion_level": 0.6487,
        "frames":
        [
          ...
          {
            "frame_id": 20,
            "aa_bb": [[x1,y1],[x2,y2],[x3,y3],[x4,y4]],
            "rot_bb": [[x1,y1],[x2,y2],[x3,y3],[x4,y4]],
            "masks":
            {
              "all": "RLE encoded mask",
              "s": "RLE encoded mask",
              "sp": [],
              "st": [],
              "t": "RLE encoded mask"
            },
            "attributes":
            {
              "absent": false,
              "full_occlusion": false,
              "cut_by_frame": true,
              "partial_obj_occlusion": true,
              "similar_occluder": false
            }
          },
          ...
        ]
      }
      
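A minimal sketch for reading an anno.json and filtering frames by their occlusion tags follows (key names are taken from the example above; the helper names are our own, and decoding the RLE masks themselves would additionally require pycocotools, as the toolkit does):

```python
import json

def load_annotation(path):
    """Load a HOOT anno.json and return (video_key, list of frame dicts)."""
    with open(path) as f:
        anno = json.load(f)
    return anno["video_key"], anno["frames"]

def visible_frames(frames):
    """Keep frames where the target is present and not fully occluded,
    based on the per-frame attribute flags."""
    return [
        fr for fr in frames
        if not fr["attributes"]["absent"]
        and not fr["attributes"]["full_occlusion"]
    ]
```

For each kept frame, `frame["aa_bb"]` gives the axis-aligned box, and `frame["masks"]["all"]` is [] when no occluder mask exists for the frame, otherwise a COCO-style RLE that `pycocotools.mask.decode` can turn into a binary mask.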

License Information


Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

For inquiries regarding a commercial license to this work please contact the USC Stevens Center for Innovation at licensing@stevens.usc.edu and reference case number 2022-179.