In this ongoing work, we explore the segmentation and tracking of (rigid) objects into submaps. The underlying algorithms depend on instance-level semantic segmentation networks on the one hand, and on geometric and photometric tracking (also used to identify moving objects) together with volumetric mapping on the other. Works include Fusion++ (Dyson Robotics Lab at Imperial College) and MID-Fusion (SRL at Imperial College).
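As a rough illustration of how per-frame instance segmentations might be tied to existing object submaps, the sketch below greedily matches detected masks to masks rendered from the current object models. The use of intersection over union (IoU), the threshold of 0.5, and all names (`mask_iou`, `associate_masks`, the object ids) are assumptions made for this example, not the published Fusion++ or MID-Fusion implementation.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = len(mask_a | mask_b)
    return len(mask_a & mask_b) / union if union else 0.0


def associate_masks(detected, rendered, iou_threshold=0.5):
    """Greedily match detected masks to rendered object masks.

    detected: list of pixel sets from the segmentation network (assumed input).
    rendered: dict mapping object id -> pixel set rendered from its submap.
    Returns (matches, unmatched): matches maps detection index -> object id,
    unmatched lists detection indices that would spawn new object submaps.
    """
    matches = {}
    unmatched = []
    taken = set()
    for det_idx, det_mask in enumerate(detected):
        best_id, best_iou = None, iou_threshold
        for obj_id, obj_mask in rendered.items():
            if obj_id in taken:
                continue  # each object is matched to at most one detection
            iou = mask_iou(det_mask, obj_mask)
            if iou >= best_iou:
                best_id, best_iou = obj_id, iou
        if best_id is None:
            unmatched.append(det_idx)
        else:
            matches[det_idx] = best_id
            taken.add(best_id)
    return matches, unmatched


# Hypothetical toy frame: one detection overlaps the "mug", one is new.
rendered = {
    "mug": {(0, 0), (0, 1), (1, 0), (1, 1)},
    "chair": {(5, 5), (5, 6)},
}
detected = [{(0, 0), (0, 1), (1, 0)}, {(9, 9)}]
matches, unmatched = associate_masks(detected, rendered)
```

A greedy match keeps the sketch short; a real system would likely also use the geometric and photometric tracking residuals mentioned above when deciding whether an object is moving.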
2022
Conference and Workshop Papers
Learning to Complete Object Shapes for Object-level Mapping in Dynamic Scenes, In International Conference on Intelligent Robots and Systems (IROS), 2022.
2019
Conference and Workshop Papers
MID-Fusion: Octree-based Object-Level Multi-Instance Dynamic SLAM, In International Conference on Robotics and Automation (ICRA), 2019.
2018
Conference and Workshop Papers
Fusion++: Volumetric Object-Level SLAM, In International Conference on 3D Vision (3DV), 2018.
In SemanticFusion, we take a real-time-capable dense RGB-D SLAM system, ElasticFusion, and add a semantic layer to it. In parallel to the localisation and mapping process, a CNN takes the same inputs (colour image and depth image) and outputs semantic segmentation predictions. We aggregate this semantic information in the map by means of Bayesian fusion. The work is significant for two reasons. First, such a real-time semantic mapping framework will play a core enabling role for future robots to perform more abstract reasoning, i.e. bridging the gap with AI, also in relation to intuitive user interaction. Second, we could show experimentally that using the map as a means of semantic data association across many frames in fact boosts the accuracy of 2D semantic segmentation compared to single-view predictions.
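The Bayesian fusion step can be sketched as a recursive update of a per-element class distribution: each map element stores a probability vector over classes, and every new CNN prediction for the pixel that projects onto it is multiplied in and renormalised. This is a minimal sketch under the assumption of independent per-frame predictions; the function name, the three example classes, and the numbers are hypothetical and not taken from the SemanticFusion code.

```python
def fuse_label_probabilities(prior, likelihood, eps=1e-12):
    """One recursive Bayesian update of a map element's class distribution."""
    if len(prior) != len(likelihood):
        raise ValueError("prior and likelihood must cover the same classes")
    # Element-wise product of the stored belief and the new prediction.
    unnormalised = [p * q for p, q in zip(prior, likelihood)]
    total = sum(unnormalised)
    if total < eps:
        # Degenerate case: the distributions share no support; keep the prior.
        return list(prior)
    return [u / total for u in unnormalised]


# Hypothetical three-class example: wall, chair, table.
num_classes = 3
belief = [1.0 / num_classes] * num_classes  # uniform prior for a new element

# Assumed per-frame CNN softmax outputs for the same map element.
frame_predictions = [
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.1, 0.7, 0.2],
]
for prediction in frame_predictions:
    belief = fuse_label_probabilities(belief, prediction)

# Index of the most probable class after fusing all frames.
best_class = max(range(num_classes), key=lambda k: belief[k])
```

In this toy run the first frame alone favours "wall", while the fused belief favours "chair", which illustrates how aggregation over many views can correct a single-view prediction.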
Deep learning approaches are inherently data-hungry. We are therefore working on a number of datasets in which the imagery is synthetically generated through realistic rendering. Furthermore, we can use the datasets to evaluate SLAM algorithms (pose and structure), since ground-truth trajectories and maps are available, along with complementary sensing modalities such as IMUs.
See Software & Datasets for downloads.
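To illustrate how ground-truth trajectories can be used for pose evaluation, the sketch below computes a root-mean-square absolute trajectory error over positions. It assumes that the estimated and ground-truth trajectories are already time-associated and expressed in the same frame; the alignment step is omitted, and the function name and sample values are hypothetical.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of position errors between two pre-aligned trajectories.

    Both arguments are equally long sequences of (x, y, z) positions,
    assumed to be time-associated and expressed in the same frame.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy trajectories in metres.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
estimated = [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0), (2.0, 0.0, 0.2)]
error = ate_rmse(estimated, ground_truth)
```

With synthetic data the ground truth is exact by construction, so the error reflects the SLAM estimate alone rather than noise in a reference measurement system.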