We produced software that includes methods for semantic segmentation and instance segmentation on both images and point clouds. In addition, it includes two novel methods for efficient continual learning in semantic segmentation and depth estimation.
The prototype implementation of the instance segmentation tool is built on top of the YOLACT method. This 2D segmentation Convolutional Neural Network (CNN) generates prototype masks using a projection head on the ResNet-101 backbone. The prototypes are then linearly combined into a mask for each object, an inexpensive operation that gives the real-time speed required for online data processing. The second tool addresses the same task on point clouds produced by the depth estimation or MVS tools. It is the first end-to-end trainable CNN that constructs a segmentation tree. Tree traversal and a splitting module provide object proposals, which are then pruned and refined.
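The linear combination of prototypes in the YOLACT-based tool can be illustrated with a minimal NumPy sketch. It assumes that the network has already produced `k` prototype masks and one coefficient vector per detected object; the function name `assemble_masks`, the array shapes, and the 0.5 threshold are illustrative choices, not taken from our implementation. The cropping of each mask to its predicted bounding box is omitted.

```python
import numpy as np


def assemble_masks(prototypes, coefficients, threshold=0.5):
    """Combine k prototype masks into one binary mask per object.

    prototypes:   (H, W, k) array, shared by all objects in the image.
    coefficients: (n, k) array, one coefficient vector per detected object.
    Returns an (n, H, W) boolean array.
    """
    # Linear combination of the prototypes: one weighted sum per object.
    logits = np.einsum("hwk,nk->nhw", prototypes, coefficients)
    # A sigmoid turns the combined response into a per-pixel probability.
    probabilities = 1.0 / (1.0 + np.exp(-logits))
    return probabilities > threshold


# Illustrative random inputs: 4 prototypes of size 8x8 and 3 detected objects.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(8, 8, 4))
coefficients = rng.normal(size=(3, 4))
masks = assemble_masks(prototypes, coefficients)
```

Because the per-object work reduces to a single matrix product followed by a sigmoid, the cost of adding further objects is small, which is consistent with the real-time behaviour described above.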
The spatio-temporal segmentation tool is a high-dimensional convolutional neural network for 4D spatio-temporal data. We used the published example on 3D data at a fixed time. The algorithm is built on the Minkowski Engine, an auto-differentiation library for sparse tensors and sparse convolution. Sparse tensors outperform classical dense CNNs and give state-of-the-art (SoTA) results in both quality and runtime.
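To show what a sparse tensor stores, the following sketch converts a point cloud into a list of occupied voxel coordinates with one averaged feature vector per voxel. It is written in plain NumPy and does not use the Minkowski Engine API; the function name `sparse_voxelize` and the choice of averaging the features are assumptions made for illustration only.

```python
import numpy as np


def sparse_voxelize(points, features, voxel_size):
    """Quantize points into occupied voxels and average features per voxel.

    points:   (N, D) float array of coordinates (D = 3, or 4 with time).
    features: (N, F) float array of per-point features.
    Returns (coords, voxel_features): (M, D) integer voxel coordinates and
    (M, F) averaged features, where M is the number of occupied voxels.
    """
    # Integer voxel index of every point.
    coords = np.floor(points / voxel_size).astype(np.int64)
    # Keep each occupied voxel once; `inverse` maps points to their voxel row.
    unique_coords, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    # Sum the features of all points in a voxel, then divide by the count.
    sums = np.zeros((len(unique_coords), features.shape[1]))
    np.add.at(sums, inverse, features)
    counts = np.bincount(inverse, minlength=len(unique_coords))
    return unique_coords, sums / counts[:, None]


# Three points, two of which fall into the same unit voxel.
points = np.array([[0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [1.5, 0.0, 0.0]])
features = np.array([[1.0], [3.0], [5.0]])
coords, voxel_features = sparse_voxelize(points, features, voxel_size=1.0)
```

Only occupied voxels are stored, so memory and computation scale with the number of points rather than with the volume of the scene, which is the property that sparse convolution exploits.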
For incremental learning in semantic segmentation, we propose a new framework, Uncertainty-aware Contrastive Distillation (UCD), published in the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) in 2022. The model introduces a novel contrastive distillation loss that accounts for semantic associations among pixels of the new and old models. It adopts an uncertainty estimation strategy for the contrastive learning framework that leverages the joint probability of a pixel pair belonging to the same class and weights the strength of the distillation signal accordingly. Additionally, the model has been shown to be beneficial on top of most traditional semantic segmentation frameworks on a number of publicly available benchmarks 5.
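The idea of weighting the distillation signal by the probability that two pixels share a class can be sketched as follows. This is a simplified illustration and not the published UCD loss: it assumes that the joint probability is the dot product of the two pixels' class distributions, and it uses a softmax-based contrastive term between new-model and old-model features. All function names, the temperature value, and the toy inputs are hypothetical.

```python
import numpy as np


def same_class_probability(probs):
    """Probability that each pixel pair shares a class.

    probs: (N, C) per-pixel class distributions. Entry (i, j) of the result
    is sum_c probs[i, c] * probs[j, c] (assumes independent predictions).
    """
    return probs @ probs.T


def weighted_contrastive_distillation(new_feats, old_feats, probs, temperature=0.1):
    """Contrastive distillation term with soft, uncertainty-based positives.

    new_feats, old_feats: (N, d) pixel embeddings from the new and old model.
    probs: (N, C) class distributions used to weight the pixel pairs.
    """
    # Cosine similarity between every new-model and old-model pixel embedding.
    new_n = new_feats / np.linalg.norm(new_feats, axis=1, keepdims=True)
    old_n = old_feats / np.linalg.norm(old_feats, axis=1, keepdims=True)
    sim = new_n @ old_n.T / temperature
    # Numerically stable log-softmax over the old-model pixels.
    sim = sim - sim.max(axis=1, keepdims=True)
    log_softmax = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # Pairs that are confidently in the same class pull together more strongly.
    weights = same_class_probability(probs)
    per_anchor = -(weights * log_softmax).sum(axis=1) / weights.sum(axis=1)
    return per_anchor.mean()


# Toy example: three pixels, two classes, confident (one-hot) predictions.
probs = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
old_feats = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
aligned = weighted_contrastive_distillation(old_feats, old_feats, probs)
swapped = weighted_contrastive_distillation(old_feats[::-1], old_feats, probs)
```

In this toy setting the loss is lower when the new model keeps same-class pixels close to their old-model embeddings (`aligned`) than when the embeddings are swapped between classes (`swapped`). With less confident predictions the pair weights shrink, so uncertain pairs contribute a weaker signal.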
Details and links to the module are available in D2.5 on our results page.