Semantic, Object-level and Dynamic SLAM

Multi-Object and Object-level Dynamic Mapping

In this ongoing work, we are exploring the segmentation of (rigid) objects into submaps and the tracking of those objects. The underlying algorithms depend on instance-level semantic segmentation networks on the one hand, and on geometric and photometric tracking (which also serves to identify moving objects) together with volumetric mapping on the other. Works include Fusion++ (Dyson Robotics Lab at Imperial College) and MID-Fusion (SRL at Imperial College).

Current collaborators:

Binbin Xu (SRL at Imperial College)
Prof. Andrew Davison (Imperial College)

Former collaborators:

Dr Ronnie Clark (Imperial College)
Dr Michael Bloesch (previously Dyson Robotics Lab, now DeepMind)
Dr Sajad Saeedi (previously Dyson Robotics Lab)

Related Publications

2022

Conference and Workshop Papers
Learning to Complete Object Shapes for Object-level Mapping in Dynamic Scenes (B Xu, A Davison and S Leutenegger), In International Conference on Intelligent Robots and Systems (IROS), 2022.

2019
Conference and Workshop Papers
MID-Fusion: Octree-based Object-level Multi-instance Dynamic SLAM (B Xu, W Li, D Tzoumanikas, M Bloesch, A Davison and S Leutenegger), In 2019 International Conference on Robotics and Automation (ICRA), 2019.

2018
Conference and Workshop Papers
Fusion++: Volumetric Object-level SLAM (J McCormac, R Clark, M Bloesch, A Davison and S Leutenegger), In 2018 International Conference on 3D Vision (3DV), 2018.

SemanticFusion (Dyson Robotics Lab at Imperial College)

In SemanticFusion, we take a real-time capable dense RGB-D SLAM system, ElasticFusion, and add a semantic layer to it. In parallel with the localisation and mapping process, a CNN takes the same inputs (colour image and depth image) and outputs semantic segmentation predictions. We aggregate this semantic information in the map by means of Bayesian fusion. The work is significant for two reasons. First, such a real-time semantic mapping framework will play a core enabling role for future robots that perform more abstract reasoning, bridging the gap with AI, and that support intuitive user interaction. Second, we showed experimentally that using the map for semantic data association across many frames boosts the accuracy of 2D semantic segmentation compared to single-view predictions.
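As a minimal sketch of the Bayesian fusion step, assume each map element stores a discrete probability distribution over classes and that each frame's CNN prediction is treated as an independent likelihood. The recursive update then multiplies the two and renormalises. The function name `fuse_label_distribution` and the three-class example values are hypothetical and are not taken from the SemanticFusion implementation.

```python
def fuse_label_distribution(prior, likelihood):
    """Recursive Bayesian update of one map element's class distribution.

    prior      -- current per-class probabilities stored in the map
    likelihood -- per-class probabilities predicted for this element
                  in the current frame (assumed independent of the prior)
    """
    if len(prior) != len(likelihood):
        raise ValueError("prior and likelihood must cover the same classes")
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalised)
    if total == 0.0:
        # Fully contradictory evidence: keep the existing belief.
        return list(prior)
    return [u / total for u in unnormalised]


# Hypothetical example: three classes, uniform initial belief,
# and three noisy single-view predictions for the same map element.
num_classes = 3
belief = [1.0 / num_classes] * num_classes
frames = [
    [0.5, 0.3, 0.2],
    [0.6, 0.3, 0.1],
    [0.4, 0.4, 0.2],
]
for prediction in frames:
    belief = fuse_label_distribution(belief, prediction)

print(belief)
```

In this toy example the fused belief in the first class ends up higher than in any individual view, which illustrates why aggregating predictions over many frames can improve on single-view segmentation.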

Former collaborators:

Dr Ronnie Clark (Imperial College)
Prof. Andrew Davison (Imperial College)
John McCormac (previously Dyson Robotics Lab at Imperial College)
Dr Ankur Handa (previously Dyson Robotics Lab at Imperial College)

Related Publication

2017

Conference and Workshop Papers
SemanticFusion: Dense 3D Semantic Mapping with Convolutional Neural Networks (J McCormac, A Handa, A Davison and S Leutenegger), In 2017 IEEE International Conference on Robotics and Automation (ICRA), 2017.

Datasets

Deep learning approaches are naturally data hungry. We are therefore working on a number of datasets in which imagery is synthetically generated through realistic rendering. Furthermore, the datasets can be used to evaluate SLAM algorithms (pose and structure), since ground truth trajectories and maps are available, along with complementary sensing modalities such as IMUs.

SceneNet RGB-D
InteriorNet

See Software & Datasets for downloads.
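
To illustrate how ground truth trajectories support pose evaluation, here is a minimal sketch of one common metric, the root-mean-square translational error between estimated and ground truth positions. It assumes the two trajectories are already time-associated and expressed in the same coordinate frame, so no alignment step is included. The function name `translation_rmse` and the sample positions are hypothetical and do not come from the datasets listed above.

```python
import math


def translation_rmse(estimated, ground_truth):
    """RMSE of the position error over time-associated 3D positions.

    Assumes both trajectories share a coordinate frame and that
    estimated[i] corresponds to ground_truth[i].
    """
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical positions in metres.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
estimated = [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0), (2.0, 0.0, 0.2)]

rmse = translation_rmse(estimated, ground_truth)
print(f"translation RMSE: {rmse:.4f} m")
```

A full evaluation would typically also align the two trajectories and assess map structure, both of which are left out of this sketch.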