JHU Visual Perception Datasets

Announcement

1/23/2017 Textured meshes of the hand tools used in JHUScene-50 and JHUSEQ-25 are available!

1/17/2017 Download packages of JHUSEQ-25 are available!

5/13/2016 Download packages of JHUScene-50 are available!

8/1/2015 Download packages of JHUIT-50 and LN-66 are available!

Introduction

The JHU Visual Perception Datasets (JHU-VP) contain benchmarks for object recognition, detection and pose estimation using RGB-D data. In particular, these datasets are designed to evaluate perception algorithms in typical robotics environments. The current release includes a SLAM-based scene understanding dataset JHUSEQ-25 [1], a pose estimation dataset JHUScene-50 [2], an object classification benchmark JHUIT-50 [3] and a scene dataset LN-66 [4]. A PrimeSense Carmine 1.09 (short-range) depth sensor was used to capture the RGB-D data. The calibrated camera parameters can be downloaded from here. More data will be coming very soon!

For each RGB-D frame, we provide a point cloud in pcd format with point type pcl::PointXYZRGBA, a corresponding depth map, an RGB image and a binary object mask. Please refer to the PCL library documentation for more information about the pcd file format.
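The header of a pcd file is plain text, so it can be inspected without PCL. The following is a minimal Python sketch, assuming the files use the standard PCD v0.7 text header and that the rgba field follows PCL's usual packing (alpha in the highest byte, then red, green and blue). The helper names `read_pcd_header` and `unpack_rgba` are hypothetical, and the sample header, including its 640x480 size, is made up for illustration rather than taken from the dataset.

```python
import io


def read_pcd_header(stream):
    """Read the text header of a PCD file from a binary stream.

    Returns a dict mapping upper-case header keys to lists of string values.
    The stream is left positioned at the first byte of the point data.
    """
    header = {}
    while True:
        raw = stream.readline()
        if not raw:
            raise ValueError("reached end of file before the DATA line")
        line = raw.decode("ascii", errors="replace").strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, *values = line.split()
        header[key.upper()] = values
        if key.upper() == "DATA":
            return header


def unpack_rgba(packed):
    """Split a packed 32-bit colour into (r, g, b, a).

    Assumes the layout a<<24 | r<<16 | g<<8 | b.
    """
    return (
        (packed >> 16) & 0xFF,
        (packed >> 8) & 0xFF,
        packed & 0xFF,
        (packed >> 24) & 0xFF,
    )


# Hypothetical header, similar to what PCL writes for pcl::PointXYZRGBA.
sample = (
    b"# .PCD v0.7 - Point Cloud Data file format\n"
    b"VERSION 0.7\n"
    b"FIELDS x y z rgba\n"
    b"SIZE 4 4 4 4\n"
    b"TYPE F F F U\n"
    b"COUNT 1 1 1 1\n"
    b"WIDTH 640\n"
    b"HEIGHT 480\n"
    b"VIEWPOINT 0 0 0 1 0 0 0\n"
    b"POINTS 307200\n"
    b"DATA binary\n"
)

header = read_pcd_header(io.BytesIO(sample))
print(header["FIELDS"], header["WIDTH"], header["HEIGHT"], header["DATA"])
```

To check a downloaded file, open it in binary mode (`open(path, "rb")`) and pass the file object to `read_pcd_header`. The point data that follows the header should still be read with PCL or another pcd-aware library.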

Textured meshes of the hand tools used in JHUScene-50 and JHUSEQ-25 can be downloaded separately from here.

JHUSEQ-25:

  • 25 RGB-D video sequences of office scenes captured at 20 Hz to 30 Hz.
  • More than 400 frames per video.
  • 6-DoF pose ground truth is provided for 10 hand tools in each video.
  • Best suited to testing SLAM-based recognition methods.

JHUScene-50:

  • This dataset contains 50 indoor scenes, 5000 testing frames and 22520 labeled poses for 10 common hand tools used in mechanical operations. More details can be found in [2].
  • We provide a simple C++ program, based on the PCL library, for interacting with the dataset.
  • For each object, a CAD model and 900 real partial views with ground-truth segmentation are provided.
  • Textured CAD models are coming soon.

JHUIT-50:

  • This dataset contains 50 industrial objects and hand tools used in mechanical operations (shown in the figure on the right). More details can be found in [3].
  • The training and testing data are captured from non-overlapping viewpoints. This benchmark is well suited to testing the robustness of an algorithm to 3D rotation.
  • The prefix of each file is formatted as “{object name}_{seq id}_{frame id}”, where “seq id” ranges from 0 to 6. For each object, files with “seq id” < 4 are training data and the remaining files are test data.
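The naming rule above can be turned into a train/test split with a few lines of Python. This is a minimal sketch, not the authors' tooling: the function names `parse_prefix` and `split_of` and the object name `wrench_a` are hypothetical, and the caller is assumed to have already stripped whatever suffix or extension follows the prefix.

```python
TRAIN_SEQ_LIMIT = 4  # "seq id" < 4 is training data, per the rule above
MAX_SEQ_ID = 6       # "seq id" ranges from 0 to 6


def parse_prefix(prefix):
    """Split "{object name}_{seq id}_{frame id}" into its three parts.

    Splitting from the right keeps object names that themselves
    contain underscores intact.
    """
    object_name, seq_id, frame_id = prefix.rsplit("_", 2)
    return object_name, int(seq_id), int(frame_id)


def split_of(prefix):
    """Return "train" or "test" for a file prefix."""
    _, seq_id, _ = parse_prefix(prefix)
    if not 0 <= seq_id <= MAX_SEQ_ID:
        raise ValueError(f"seq id {seq_id} is outside 0..{MAX_SEQ_ID}")
    return "train" if seq_id < TRAIN_SEQ_LIMIT else "test"


# Hypothetical prefixes for illustration only.
for prefix in ["wrench_a_3_17", "wrench_a_4_0"]:
    print(prefix, parse_prefix(prefix), split_of(prefix))
```

Splitting from the right is a deliberate choice: only the last two underscore-separated tokens are treated as numbers, so the object name may contain underscores of its own.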

LN-66:

  • This scene dataset contains 614 testing scene examples with various complex configurations of the “link” and “node” objects. Both objects can also be found in JHUIT-50. More details can be found in [4]. The figure on the right shows the “link” and “node” objects and an example scene.
  • The downloaded package includes raw scene point clouds formatted as “ln_{frame id}.pcd” in folder LN_66_scene, object pose ground truth formatted as “{object name}_gt_{frame id}.csv” in folder LN_66_gt, and object data in folder training_objects.
  • We provide object mesh files in two formats, .obj and .vtk. The object meshes currently contain no color information.
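The folder layout described above makes it straightforward to pair each scene point cloud with its ground-truth files. The sketch below only builds paths and does not read the csv contents, whose layout is not described here. The function names are hypothetical, and using `link` and `node` as the “{object name}” tokens is an assumption that should be checked against the downloaded package.

```python
from pathlib import Path


def scene_path(root, frame_id):
    """Path of the raw scene point cloud for a frame."""
    return Path(root) / "LN_66_scene" / f"ln_{frame_id}.pcd"


def gt_path(root, object_name, frame_id):
    """Path of the pose ground-truth file for one object in a frame."""
    return Path(root) / "LN_66_gt" / f"{object_name}_gt_{frame_id}.csv"


def frame_id_of(scene_file):
    """Extract the frame id from a scene file name such as ln_12.pcd.

    The id is kept as a string so that any zero padding is preserved.
    """
    stem = Path(scene_file).stem
    if not stem.startswith("ln_"):
        raise ValueError(f"not an LN-66 scene file: {scene_file}")
    return stem[len("ln_"):]


def gt_files_for(root, scene_file, object_names=("link", "node")):
    """List the ground-truth paths expected for one scene file.

    The default object names are an assumption, not taken from the package.
    """
    frame_id = frame_id_of(scene_file)
    return [gt_path(root, name, frame_id) for name in object_names]


# Hypothetical root folder and frame id for illustration only.
for path in gt_files_for("LN_66", scene_path("LN_66", "12")):
    print(path.as_posix())
```

A ground-truth file may be missing for a frame in which an object does not appear, so callers would likely want to test `path.exists()` before opening each result.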

Downloads

** This dataset is released for the purpose of academic research, not for commercial usage. **

Please complete the following form to access the datasets and utilities. Select every package you wish to download by checking its box.

After you submit the form, the download links will appear below the submit button, and the password will be sent to you by email within 24 hours.

** Note: the download link expires in 24 hours. **


Form fields: Name, Email, Institution

Packages:

  • JHUSEQ-25
  • JHUScene-50
  • JHUIT-50 Object Dataset
  • LN-66 Scene Dataset

References

If you use any of the datasets above, or refer to their results, please cite the corresponding papers:

  1. Chi Li, Han Xiao, Keisuke Tateno, Federico Tombari, Nassir Navab and Gregory D. Hager. Incremental Scene Understanding on Dense SLAM. In International Conference on Intelligent Robots and Systems (IROS), 2016.
  2. Chi Li, Jonathan Bohren, Eric Carlson and Gregory D. Hager. Hierarchical Semantic Parsing for Object Pose Estimation in Densely Cluttered Scenes. In International Conference on Robotics and Automation (ICRA), 2016.
  3. Chi Li, Austin Reiter and Gregory D. Hager. Beyond Spatial Pooling: Fine-Grained Representation Learning in Multiple Domains. In Computer Vision and Pattern Recognition (CVPR), 2015.
  4. Chi Li, Jonathan Bohren and Gregory D. Hager. Bridging the Robot Perception Gap With Mid-Level Vision. In International Symposium on Robotics Research (ISRR), 2015.

Acknowledgements

Credit goes to Chi Li, Jonathan Bohren, Felix Jonathan, Hanyue Liang, Han Xiao, Yuanwei Zhao, Bo Liu and Eric Carlson for collecting data and developing the data-collection software.

Special acknowledgement to Gregory D. Hager and Austin Reiter for their support and suggestions in building this dataset.

This dataset is funded by the National Science Foundation under Grant No. NRI-1227277.

Feedback

Please let us know your questions, comments and suggestions via chi_li [at] jhu [dot] edu. We appreciate your feedback!
