List of SPRING public deliverables (also available here as a bundle)
Work Package 1: Experimental Validation
Work Package 2: Environment Mapping, Self-localisation and Simulation
D2.3 – Audio-visual data simulator
Work Package 3: Robust Audio-visual Perception of Humans
Work Package 4: Multi-Modal Human Behaviour Understanding
Work Package 5: Multi-User Spoken Conversations with Robots
Work Package 6: Learning Robot Behaviour
Work Package 7: Robot Customization and Software Integration
Work Package 8: Dissemination, Communication and Exploitation
Work Package 9: Project Management
Journal & Conference Papers
108 published papers as of May 2023 — list also available as a pdf here
- The hybrid Cramér-Rao lower bound for simultaneous self-localization and room geometry estimation | Maya Veisman; Yair Noam; Sharon Gannot | EURASIP Journal on Advances in Signal Processing | 10.1186/s13634-020-00702-6 | https://doaj.org/article/d9d188dab150478081bb0ae2d66a2a8e
- Dynamically localizing multiple speakers based on the time-frequency domain | Hodaya Hammer; Shlomo E. Chazan; Jacob Goldberger; Sharon Gannot | EURASIP Journal on Audio, Speech, and Music Processing | 10.1186/s13636-021-00203-w | http://link.springer.com/content/pdf/10.1186/s13636-021-00203-w.pdf
- Semi-Supervised Source Localization in Reverberant Environments With Deep Generative Modeling | Michael J. Bianco; Sharon Gannot; Efren Fernandez-Grande; Peter Gerstoft | IEEE Access | 10.1109/access.2021.3087697 | https://ieeexplore.ieee.org/document/9449880/
- An online algorithm for echo cancellation, dereverberation and noise reduction based on a Kalman-EM Method | Nili Cohen; Gershon Hazan; Boaz Schwartz; Sharon Gannot | EURASIP Journal on Audio, Speech, and Music Processing | 10.1186/s13636-021-00219-2 | https://link.springer.com/content/pdf/10.1186/s13636-021-00219-2.pdf
- Audio source separation by activity probability detection with maximum correlation and simplex geometry | Bracha Laufer-Goldshtein; Ronen Talmon; Sharon Gannot | EURASIP Journal on Audio, Speech, and Music Processing | 10.1186/s13636-021-00195-7 | http://link.springer.com/content/pdf/10.1186/s13636-021-00195-7.pdf
- Semi-Supervised Source Localization with Deep Generative Modeling | Michael J. Bianco; Sharon Gannot; Peter Gerstoft | 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing (MLSP) | 10.1109/mlsp49062.2020.9231825 | http://xplorestaging.ieee.org/ielx7/9217888/9231523/09231825.pdf?arnumber=9231825
- Simultaneous Tracking and Separation of Multiple Sources Using Factor Graph Model | Koby Weisberg; Bracha Laufer-Goldshtein; Sharon Gannot | IEEE/ACM Transactions on Audio, Speech, and Language Processing | 10.1109/taslp.2020.3028650 | http://xplorestaging.ieee.org/ielx7/6570655/8938144/09212589.pdf?arnumber=9212589
- Speech enhancement with mixture-of-deep-experts with clean clustering pre-training | Chazan, Shlomo E.; Goldberger, Jacob; Gannot, Sharon | ICASSP | http://arxiv.org/abs/2102.06034
- Misalignment Recognition in Acoustic Sensor Networks Using a Semi-Supervised Source Estimation Method and Markov Random Fields | Miller, Gabriel F; Brendel, Andreas; Kellermann, Walter; Gannot, Sharon | ICASSP | 10.1109/icassp39728.2021.9413765 | http://xplorestaging.ieee.org/ielx7/9413349/9413350/09413765.pdf?arnumber=9413765
- A Bayesian Hierarchical Model for Blind Audio Source Separation | Y. Laufer and S. Gannot | 2020 28th European Signal Processing Conference (EUSIPCO) | 10.23919/eusipco47968.2020.9287348 | https://sharongannot.group/wp-content/uploads/2021/04/20200701044448_404125_1434.pdf
- Decoupled Direction-of-Arrival Estimations Using Relative Harmonic Coefficients | Y. Hu, T. D. Abhayapala, P. N. Samarasinghe and S. Gannot | 2020 28th European Signal Processing Conference (EUSIPCO) | 10.23919/eusipco47968.2020.9287611 | https://sharongannot.group/wp-content/uploads/2021/04/Decoupled_Direction-of-Arrival_Estimations_Using_Relative_Harmonic_Coefficients.pdf
- Multiple Speaker Localization using Mixture of Gaussian Model with Manifold-based Centroids | A. Bross, B. Laufer-Goldshtein and S. Gannot | 2020 28th European Signal Processing Conference (EUSIPCO) | 10.23919/eusipco47968.2020.9287796 | https://sharongannot.group/wp-content/uploads/2021/04/20200623014046_441602_1497.pdf
- Forward-backward recursive expectation-maximization for concurrent speaker tracking | Dorfan, Y., Schwartz, B. & Gannot, S | EURASIP Journal on Audio, Speech, and Music Processing | 10.1186/s13636-020-00189-x | https://rdcu.be/ch29Q
- Blind Audio Source Separation Using Two Expectation-Maximization Algorithms | A. Eisenberg, B. Schwartz and S. Gannot | 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing (MLSP) | 10.1109/mlsp49062.2020.9231931 | https://sharongannot.group/wp-content/uploads/2021/04/PID6534275.pdf
- A Bayesian Hierarchical Mixture of Gaussian Model for Multi-Speaker DOA Estimation and Separation | Y. Laufer and S. Gannot | 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing (MLSP) | 10.1109/mlsp49062.2020.9231852 | https://sharongannot.group/wp-content/uploads/2021/04/20200701044448_404125_1434.pdf
- Estimation of acoustic echoes using expectation-maximization methods | Saqib, U., Gannot, S. & Jensen, J. | EURASIP Journal on Audio, Speech, and Music Processing | 10.1186/s13636-020-00179-z | https://rdcu.be/b6O0C
- Semi-Supervised Multiple Source Localization Using Relative Harmonic Coefficients Under Noisy and Reverberant Environments | Y. Hu, P. N. Samarasinghe, S. Gannot and T. D. Abhayapala | IEEE/ACM Transactions on Audio, Speech, and Language Processing | 10.1109/taslp.2020.3037521 | https://sharongannot.group/wp-content/uploads/2021/04/2020020110.pdf
- Robust Relative Transfer Function Identification on Manifolds for Speech Enhancement | A. Sofer, T. Kounovský, J. Čmejla, Z. Koldovský and S. Gannot | 2021 29th European Signal Processing Conference (EUSIPCO) | 10.23919/eusipco54536.2021.9616175 | https://sharongannot.group/wp-content/uploads/2021/04/20210527034712_513055_6333.pdf
- Online Blind Audio Source Separation using Recursive Expectation-Maximization | Aviad Eisenberg, Boaz Schwartz and Sharon Gannot | INTERSPEECH 2021 | 10.21437/interspeech.2021-662 | https://www.isca-speech.org/archive/interspeech_2021/eisenberg21_interspeech.html
- Scene-Agnostic Multi-Microphone Speech Dereverberation | Yochai Yemini, Ethan Fetaya, Haggai Maron, Sharon Gannot | INTERSPEECH 2021 | 10.21437/interspeech.2021-889 | https://www.isca-speech.org/archive/interspeech_2021/yemini21_interspeech.html
- A recursive expectation-maximization algorithm for speaker tracking and separation | Schwartz, O., Gannot, S. | EURASIP Journal on Audio, Speech, and Music Processing | 10.1186/s13636-021-00228-1 | https://rdcu.be/cC16X
- Combining Visual and Social Dialogue for Human-Robot Interaction | Nancie Gunson; Daniel Hernandez Garcia; Jose L. Part; Yanchao Yu; Weronika Sieińska; Christian Dondrup; Oliver Lemon | ICMI ’21: Proceedings of the 2021 International Conference on Multimodal Interaction | 10.1145/3462244.3481303 | https://dl.acm.org/doi/pdf/10.1145/3462244.3481303
- Towards Visual Dialogue for Human-Robot Interaction | Jose L. Part; Daniel Hernández García; Yanchao Yu; Nancie Gunson; Christian Dondrup; Oliver Lemon | HRI ’21 Companion: Companion of the 2021 ACM/IEEE International Conference on Human-Robot Interaction | 10.1145/3434074.3447278 | https://dl.acm.org/doi/pdf/10.1145/3434074.3447278
- Explainable Representations of the Social State: A Model for Social Human-Robot Interactions | García, Daniel Hernández; Yu, Yanchao; Sieińska, Weronika; Part, Jose L.; Gunson, Nancie; Lemon, Oliver; Dondrup, Christian | AAAI FSS-20 AI-HRI | http://arxiv.org/abs/2010.04570
- A Conversational AI System for Tackling Misinformation | Nancie Gunson; Weronika Sieińska; Yanchao Yu; Daniel Hernandez Garcia; Jose L. Part; Christian Dondrup; Oliver Lemon | GoodIT ’21: Proceedings of the Conference on Information Technology for Social Good | 10.1145/3462203.3475874 | https://dl.acm.org/doi/pdf/10.1145/3462203.3475874
- Minimal Rolling Shutter Absolute Pose with Unknown Focal Length and Radial Distortion | Kukelova, Zuzana; Albl, Cenek; Sugimoto, Akihiro; Schindler, Konrad; Pajdla, Tomas | European Conference on Computer Vision | 10.5281/zenodo.4335228 | https://dx.doi.org/10.5281/zenodo.4335228
- SocialInteractionGAN: Multi-person Interaction Sequence Generation | Airale, Louis; Vaufreydaz, Dominique; Alameda-Pineda, Xavier | Transactions on Automatic Control | https://hal.inria.fr/hal-03163467
- Evaluation and Design Guidelines for Combining Open-Domain Social Conversation with Task-Based Dialogue in Intelligent Buildings | Nancie Gunson; Weronika Sieinska; Christopher Walsh; Christian Dondrup; Oliver Lemon | IVA ’20: Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents | 10.1145/3383652.3423889 | https://dl.acm.org/doi/pdf/10.1145/3383652.3423889
- Making Affine Correspondences Work in Camera Geometry Computation | Barath, Daniel; Polic, Michal; Förstner, Wolfgang; Sattler, Torsten; Pajdla, Tomas; Kukelova, Zuzana | European Conference on Computer Vision | 10.1007/978-3-030-58621-8_42 | https://dx.doi.org/10.5281/zenodo.4333706
- Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation | Yahui Liu; Sangineto, Enver; Yajing Chen; Linchao Bao; Haoxian Zhang; Sebe, Nicu; Lepri, Bruno; Wang, Wei; De Nadai, Marco | IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 10.5281/zenodo.5014014 | http://arxiv.org/abs/2106.09016
- From two rolling shutters to one global shutter | Albl, Cenek; Kukelova, Zuzana; Larsson, Viktor; Pajdla, Tomas; Schindler, Konrad | CVPR | http://arxiv.org/abs/2006.01964
- Neighborhood Contrastive Learning for Novel Class Discovery | Zhong, Zhun; Fini, Enrico; Subhankar Roy; Zhiming Luo; Ricci, Elisa; Sebe, Nicu | IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 10.5281/zenodo.5014108 | http://arxiv.org/abs/2106.10731
- A Benchmark of Dynamical Variational Autoencoders applied to Speech Spectrogram Modeling | Bie, Xiaoyu; Girin, Laurent; Leglaive, Simon; Hueber, Thomas; Alameda-Pineda, Xavier | Interspeech 2021 – 22nd Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic. pp.1-5 | 10.21437/interspeech.2021-256 | https://hal.inria.fr/hal-03295657
- Learning Visual Voice Activity Detection with an Automatically Annotated Dataset | Guy, Sylvain; Lathuilière, Stéphane; Mesejo, Pablo; Horaud, Radu | ICPR 2020 – 25th International Conference on Pattern Recognition, Jan 2021, Milano / Virtual, Italy | 10.1109/icpr48806.2021.9412884 | https://hal.inria.fr/hal-02882229
- Variational Inference and Learning of Piecewise linear Dynamical Systems | Xavier Alameda-Pineda; Vincent Drouard; Radu Horaud | IEEE Transactions on Neural Networks and Learning Systems | 10.1109/tnnls.2021.3054407 | https://hal.archives-ouvertes.fr/hal-02745527
- Conversational Agents for Intelligent Buildings | Weronika Sieinska, Nancie Gunson, Christopher Walsh, Christian Dondrup, and Oliver Lemon | Proceedings of the SIGdial 2020 Conference | https://aclanthology.org/2020.sigdial-1.5.pdf
- Uncertainty Based Camera Model Selection | Michal Polic, Stanislav Steidl, Cenek Albl, Zuzana Kukelova, Tomas Pajdla | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | https://openaccess.thecvf.com/content_CVPR_2020/html/Polic_Uncertainty_Based_Camera_Model_Selection_CVPR_2020_paper.html
- ARI: the Social Assistive Robot and Companion | Sara Cooper; Alessandro Di Fava; Carlos Vivas; Luca Marchionni; Francesco Ferro | 2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) | 10.1109/ro-man47096.2020.9223470 | https://www.researchgate.net/publication/347266798_ARI_the_Social_Assistive_Robot_and_Companion
- Motion Segmentation with Pairwise Matches and Unknown Number of Motions | Federica Arrigoni, Luca Magri, and Tomas Pajdla | 2020 25th International Conference on Pattern Recognition (ICPR) | 10.1109/icpr48806.2021.9413142 | https://ailb-web.ing.unimore.it/icpr/media/slides/11199.pdf
- On the Usage of the Trifocal Tensor in Motion Segmentation | Federica Arrigoni, Luca Magri, Tomas Pajdla | ECCV’20 Online | https://www.ecva.net/papers/eccv_2020/papers_ECCV/html/3533_ECCV_2020_paper.php
- ODANet: Online Deep Appearance Network for Identity-Consistent Multi-person Tracking | Guillaume Delorme, Yutong Ban, Guillaume Sarrazin, Xavier Alameda-Pineda | ICPR 2021: Pattern Recognition. ICPR International Workshops and Challenges | 10.1007/978-3-030-68780-9_60 | https://hal.inria.fr/hal-03188744/document
- Robot control and navigation: ARI’s autonomous system | Francesco Ferro, Federico Nardi, Sara Cooper, Luca Marchionni | 29th IEEE International Conference on Robot & Human Interactive Communication ROMAN-2020 | https://www.techrxiv.org/articles/preprint/Robot_control_and_navigation_ARI_s_autonomous_system/14350571
- Dynamical Variational Autoencoders: A Comprehensive Review | Laurent Girin, Simon Leglaive, Xiaoyu Bie, Julien Diard, Thomas Hueber, Xavier Alameda-Pineda | Foundations and Trends in Machine Learning | 10.1561/2200000089 | https://hal.inria.fr/hal-02926215
- Mixture of Inference Networks for VAE-based Audio-visual Speech Enhancement | M. Sadeghi and X. Alameda-Pineda | IEEE Transactions on Signal Processing | 10.1109/tsp.2021.3066038 | https://hal.inria.fr/hal-02926172/
- Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification | Yuyang Zhao, Zhun Zhong, Fengxiang Yang, Zhiming Luo, Yaojin Lin, Shaozi Li, Nicu Sebe | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | https://openaccess.thecvf.com/content/CVPR2021/html/Zhao_Learning_to_Generalize_Unseen_Domains_via_Memory-based_Multi-Source_Meta-Learning_for_CVPR_2021_paper.html
- MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization | Jiahui Huang, He Wang, Tolga Birdal, Minhyuk Sung, Federica Arrigoni, Shi-Min Hu, Leonidas J. Guibas | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | https://openaccess.thecvf.com/content/CVPR2021/html/Huang_MultiBodySync_Multi-Body_Segmentation_and_Motion_Estimation_via_3D_Scan_Synchronization_CVPR_2021_paper.html
- OpenMix: Reviving Known Knowledge for Discovering Novel Visual Categories in an Open World | Zhun Zhong, Linchao Zhu, Zhiming Luo, Shaozi Li, Yi Yang, Nicu Sebe | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | https://openaccess.thecvf.com/content/CVPR2021/html/Zhong_OpenMix_Reviving_Known_Knowledge_for_Discovering_Novel_Visual_Categories_in_CVPR_2021_paper.html
- Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation | Subhankar Roy, Evgeny Krivosheev, Zhun Zhong, Nicu Sebe, Elisa Ricci | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | https://openaccess.thecvf.com/content/CVPR2021/html/Roy_Curriculum_Graph_Co-Teaching_for_Multi-Target_Domain_Adaptation_CVPR_2021_paper.html
- Viewing Graph Solvability via Cycle Consistency | Federica Arrigoni, Andrea Fusiello, Elisa Ricci, Tomas Pajdla | Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) | https://openaccess.thecvf.com/content/ICCV2021/html/Arrigoni_Viewing_Graph_Solvability_via_Cycle_Consistency_ICCV_2021_paper.html
- Synchronization of Group-Labelled Multi-Graphs | Andrea Porfiri Dal Cin, Luca Magri, Federica Arrigoni, Andrea Fusiello, Giacomo Boracchi | Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) | https://openaccess.thecvf.com/content/ICCV2021/html/Dal_Cin_Synchronization_of_Group-Labelled_Multi-Graphs_ICCV_2021_paper.html
- A Unified Objective for Novel Class Discovery | Enrico Fini, Enver Sangineto, Stéphane Lathuilière, Zhun Zhong, Moin Nabi, Elisa Ricci | Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) | https://openaccess.thecvf.com/content/ICCV2021/html/Fini_A_Unified_Objective_for_Novel_Class_Discovery_ICCV_2021_paper.html
- Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling? | Yue Song, Nicu Sebe, Wei Wang | Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) | https://openaccess.thecvf.com/content/ICCV2021/html/Song_Why_Approximate_Matrix_Square_Root_Outperforms_Accurate_SVD_in_Global_ICCV_2021_paper.html
- Transformer-Based Attention Networks for Continuous Pixel-Wise Prediction | Guanglei Yang, Hao Tang, Mingli Ding, Nicu Sebe, Elisa Ricci | Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) | https://openaccess.thecvf.com/content/ICCV2021/html/Yang_Transformer-Based_Attention_Networks_for_Continuous_Pixel-Wise_Prediction_ICCV_2021_paper.html
- Intrinsic-Extrinsic Preserved GANs for Unsupervised 3D Pose Transfer | Haoyu Chen, Hao Tang, Henglin Shi, Wei Peng, Nicu Sebe, Guoying Zhao | Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) | https://openaccess.thecvf.com/content/ICCV2021/html/Chen_Intrinsic-Extrinsic_Preserved_GANs_for_Unsupervised_3D_Pose_Transfer_ICCV_2021_paper.html
- Intérêt de la robotique sociale et d’assistance auprès des sujets âgés (Interest in social and assistive robotics for elderly subjects) | Maribel Pino, Sébastien Dacunha, Étienne Berger, Anna Goncalves, Anne-Sophie Rigaud | Actualités Pharmaceutiques | 10.1016/j.actpha.2021.10.010 | https://www.sciencedirect.com/science/article/pii/S0515370021004109
- Multimodal Across Domains Gaze Target Detection | Francesco Tonini; Cigdem Beyan; Elisa Ricci | ICMI ’22: Proceedings of the 2022 International Conference on Multimodal Interaction | 10.48550/arxiv.2208.10822 | https://doi.org/10.1145/3536221.3556624
- Multi-Person Extreme Motion Prediction | Wen Guo; Xiaoyu Bie; Xavier Alameda-Pineda; Francesc Moreno-Noguer | IEEE/CVF Conference on Computer Vision and Pattern Recognition | 10.48550/arxiv.2105.08825 | https://hal.inria.fr/hal-03295672
- Uncertainty-aware Contrastive Distillation for Incremental Semantic Segmentation | Guanglei Yang; Enrico Fini; Dan Xu; Paolo Rota; Mingli Ding; Moin Nabi; Xavier Alameda-Pineda; Elisa Ricci | IEEE Transactions on Pattern Analysis and Machine Intelligence | 10.48550/arxiv.2203.14098 | https://doi.org/10.48550/arxiv.2203.14098
- Multi-frame Motion Segmentation by Combining Two-Frame Results | Federica Arrigoni; Elisa Ricci; Tomas Pajdla | IEEE Transactions on Image Processing | 10.1007/s11263-021-01544-x | https://doi.org/10.1007/s11263-021-01544-x
- SocialInteractionGAN: Multi-person Interaction Sequence Generation | Louis Airale; Dominique Vaufreydaz; Xavier Alameda-Pineda | IEEE Transactions on Affective Computing | 10.48550/arxiv.2103.05916 | https://doi.org/10.48550/arxiv.2103.05916
- Learning and controlling the source-filter representation of speech with a variational autoencoder | Samir Sadok; Simon Leglaive; Laurent Girin; Xavier Alameda-Pineda; Renaud Séguier | Speech Communication | 10.48550/arxiv.2204.07075 | http://arxiv.org/abs/2204.07075
- It’s Good to Chat? | Gunson, Nancie; Sieińska, Weronika; Walsh, Christopher; Dondrup, Christian; Lemon, Oliver | isbn: 9781450375863 | https://doi.org/10.1145/3383652.3423889
- The Impact of Removing Head Movements on Audio-visual Speech Enhancement | Zhiqi Kang; Mostafa Sadeghi; Radu Horaud; Xavier Alameda-Pineda; Jacob Donley; Anurag Kumar | 10.48550/arxiv.2202.00538 | http://arxiv.org/abs/2202.00538
- Expression-preserving face frontalization improves visually assisted speech processing | Zhiqi Kang; Mostafa Sadeghi; Radu Horaud; Xavier Alameda-Pineda | International Journal of Computer Vision | 10.48550/arxiv.2204.02810 | http://arxiv.org/abs/2204.02810
- Transformer-Based Attention Networks for Continuous Pixel-Wise Prediction | Guanglei Yang; Hao Tang; Mingli Ding; Nicu Sebe; Elisa Ricci | 2021 IEEE/CVF International Conference on Computer Vision (ICCV) | 10.1109/iccv48922.2021.01596 | http://hdl.handle.net/11572/326202
- Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation | Subhankar Roy; Evgeny Krivosheev; Zhun Zhong; Nicu Sebe; Elisa Ricci | CVPR 2021 | 10.5281/zenodo.5014029 | http://arxiv.org/pdf/2104.00808
- Continual Attentive Fusion for Incremental Learning in Semantic Segmentation | Guanglei Yang; Enrico Fini; Dan Xu; Paolo Rota; Mingli Ding; Tang Hao; Xavier Alameda-Pineda; Elisa Ricci | IEEE Transactions on Multimedia | 10.48550/arxiv.2202.00432 | https://hal.inria.fr/hal-03626393
- TransCenter: Transformers With Dense Representations for Multiple-Object Tracking | Yihong Xu; Yutong Ban; Guillaume Delorme; Chuang Gan; Daniela Rus; Xavier Alameda-Pineda | IEEE Transactions on Pattern Analysis and Machine Intelligence | 10.1109/tpami.2022.3225078 | http://arxiv.org/abs/2103.15145
- Novel Class Discovery in Semantic Segmentation | Yuyang Zhao; Zhun Zhong; Nicu Sebe; Gim Hee Lee | CVPR 2022 | 10.5281/zenodo.7100314 | https://hdl.handle.net/11572/361269
- Self-Supervised Models are Continual Learners | Fini, Enrico; da Costa, Victor G. Turrisi; Alameda-Pineda, Xavier; Ricci, Elisa; Alahari, Karteek; Mairal, Julien | CVPR 2022 – IEEE/CVF Conference on Computer Vision and Pattern Recognition | 10.48550/arxiv.2112.04215 | https://doi.org/10.1109/cvpr52688.2022.00940
- OpenMix: Reviving Known Knowledge for Discovering Novel Visual Categories in an Open World | Zhun Zhong; Linchao Zhu; Zhiming Luo; Shaozi Li; Yi Yang; Nicu Sebe | CVPR 2021 | 10.1109/cvpr46437.2021.00934 | https://doi.org/10.48550/arxiv.2004.05551
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition | Alessandro Conti; Paolo Rota; Yiming Wang; Elisa Ricci | BMVC 2022 | 10.5281/zenodo.7296310 | https://zenodo.org/record/7296310
- Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling? | Yue Song; Nicu Sebe; Wei Wang | 2021 IEEE/CVF International Conference on Computer Vision (ICCV) | 10.48550/arxiv.2105.02498 | https://doi.org/10.48550/arxiv.2105.02498
- Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification | Yuyang Zhao; Zhun Zhong; Fengxiang Yang; Zhiming Luo; Yaojin Lin; Shaozi Li; Nicu Sebe | IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021 | 10.5281/zenodo.5014450 | https://doi.org/10.5281/zenodo.5014450
- Intrinsic-Extrinsic Preserved GANs for Unsupervised 3D Pose Transfer | Haoyu Chen; Hao Tang; Henglin Shi; Wei Peng; Nicu Sebe; Guoying Zhao | 2021 IEEE/CVF International Conference on Computer Vision (ICCV) | 10.48550/arxiv.2108.07520 | https://doi.org/10.1109/iccv48922.2021.00851
- Globally Optimal Solution to Inverse Kinematics of 7DOF Serial Manipulator | Pavel Trutman; Mohab Safey El Din; Didier Henrion; Tomas Pajdla | IEEE Robotics and Automation Letters | 10.48550/arxiv.2007.12550 | https://hal.science/hal-02905816
- Training-Based Multiple Source Tracking Using Manifold-Learning and Recursive Expectation-Maximization | Avital Bross; Sharon Gannot | IEEE/ACM Transactions on Audio, Speech, and Language Processing | 10.1109/taslp.2023.3245414 | https://sharongannot.group/publications/all/
- Variational Structured Attention Networks for Deep Visual Representation Learning | Guanglei Yang; Paolo Rota; Xavier Alameda-Pineda; Dan Xu; Mingli Ding; Elisa Ricci | IEEE Transactions on Image Processing | 10.1109/tip.2021.3137647 | https://arxiv.org/abs/2103.03510
- Fast Differentiable Matrix Square Root | Yue Song, Nicu Sebe, Wei Wang | ICLR 2022 | 10.48550/arxiv.2201.08663 | https://arxiv.org/abs/2201.08663
- Multi-frame Motion Segmentation by Combining Two-Frame Results | Federica Arrigoni, Elisa Ricci & Tomas Pajdla | International Journal of Computer Vision | https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwj8nNDvvt39AhXuUaQEHQJjAyoQFnoECAwQAQ&url=https%3A%2F%2Firis.unitn.it%2Fretrieve%2Fe3835199-48a5-72ef-e053-3705fe0ad821%2F22_IJCV.pdf&usg=AOvVaw1DT-63ctQtmfIxei7cT7pJ
- Single microphone speaker extraction using unified time-frequency Siamese-Unet | Aviad Eisenberg; Sharon Gannot; Shlomo E. Chazan | 2022 30th European Signal Processing Conference (EUSIPCO) | 10.23919/eusipco55093.2022.9909545 | https://arxiv.org/abs/2203.02941
- Novel Class Discovery in Semantic Segmentation | Yuyang Zhao, Zhun Zhong, Nicu Sebe, Gim Hee Lee | CVF 2021 | 10.48550/arxiv.2112.01900 | https://arxiv.org/abs/2112.01900
- Optimizing Elimination Templates by Greedy Parameter Search | Evgeniy Martyushev, Jana Vrablikova, Tomas Pajdla | CVF 2022 | 10.48550/arxiv.2203.14901 | https://arxiv.org/abs/2203.14901
- The impact of removing head movements on audio-visual speech enhancement | Zhiqi Kang, Mostafa Sadeghi, Radu Horaud, Xavier Alameda-Pineda, Jacob Donley, Anurag Kumar | ICASSP 2022 | 10.1109/icassp43922.2022.9746401 | https://arxiv.org/abs/2202.00538
- Multimodal Emotion Recognition with Modality-Pairwise Unsupervised Contrastive Loss | Riccardo Franceschini, Enrico Fini, Cigdem Beyan, Alessandro Conti, Federica Arrigoni, Elisa Ricci | ICPR 2022 | 10.1109/icpr56361.2022.9956589 | https://arxiv.org/abs/2207.11482
- Unsupervised Domain Adaptation for Video Transformers in Action Recognition | Victor G. Turrisi da Costa, Giacomo Zara, Paolo Rota, Thiago Oliveira-Santos, Nicu Sebe, Vittorio Murino, Elisa Ricci | ICPR 2022 | 10.48550/arxiv.2207.12842 | https://arxiv.org/abs/2207.12842
- Developing a Social Conversational Robot for the Hospital waiting room | Nancie Gunson, Daniel Hernandez Garcia, Weronika Sieinska, Christian Dondrup, Oliver Lemon | RO-MAN 2022 | 10.1109/ro-man53752.2022.9900827 | https://researchportal.hw.ac.uk/en/publications/developing-a-social-conversational-robot-for-the-hospital-waiting
- A Visually-Aware Conversational Robot Receptionist | Nancie Gunson, Daniel Hernandez Garcia, Weronika Sieińska, Angus Addlesee, Christian Dondrup, Oliver Lemon, Jose L. Part, Yanchao Yu | SIGdial 2022 | https://aclanthology.org/2022.sigdial-1.61/
- Open-source Natural Language Processing on the PAL Robotics ARI Social Robot | Séverin Lemaignan, Sara Cooper, Raquel Ros, Lorenzo Ferrini, Antonio Andriella, Aina Irisarri | HRI ’23: Companion of the 2023 ACM/IEEE International Conference on Human-Robot Interaction | https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwi85JT7yt39AhXhU6QEHRgNAZcQFnoECAwQAQ&url=https%3A%2F%2Facademia.skadge.org%2Fpublis%2Flemaignan2023opensource.pdf&usg=AOvVaw1JlOT_UiVpvF4W06zQD03S
- RankFeat: Rank-1 Feature Removal for Out-of-distribution Detection | Yue Song, Nicu Sebe, Wei Wang | NeurIPS 2022 | 10.48550/arxiv.2209.08590 | https://arxiv.org/abs/2209.08590
- Orthogonal SVD Covariance Conditioning and Latent Disentanglement | Yue Song, Nicu Sebe, Wei Wang | IEEE Transactions on Pattern Analysis and Machine Intelligence | 10.1109/tpami.2022.3228979 | https://arxiv.org/abs/2212.05599
- Uncertainty-Guided Source-Free Domain Adaptation | Subhankar Roy, Martin Trapp, Andrea Pilzer, Juho Kannala, Nicu Sebe, Elisa Ricci, Arno Solin | ECCV 2022 | 10.48550/arxiv.2208.07591 | https://arxiv.org/abs/2208.07591
- Class-incremental Novel Class Discovery | Subhankar Roy, Mingxuan Liu, Zhun Zhong, Nicu Sebe, Elisa Ricci | ECCV 2022 | 10.48550/arxiv.2207.08605 | https://arxiv.org/abs/2207.08605
- 3D-Aware Semantic-Guided Generative Model for Human Synthesis | Jichao Zhang, Enver Sangineto, Hao Tang, Aliaksandr Siarohin, Zhun Zhong, Nicu Sebe, Wei Wang | ECCV 2022 | 10.48550/arxiv.2112.01422 | https://arxiv.org/abs/2112.01422
- Unsupervised High-Resolution Portrait Gaze Correction and Animation | Jichao Zhang, Jingjing Chen, Hao Tang, Enver Sangineto, Peng Wu, Yan Yan, Nicu Sebe, Wei Wang | IEEE Transactions on Image Processing | 10.1109/tip.2022.3191852 | https://arxiv.org/abs/2207.00256
- On The Importance of Acoustic Reflections in Beamforming | Oren Shmaryahu; Sharon Gannot | IWAENC 2022 | 10.1109/iwaenc53105.2022.9914749 | https://sharongannot.group/publications/all/
- D-InLoc++: Indoor Localization in Dynamic Environments | Martina Dubenova, Anna Zderadickova, Ondrej Kafka, Tomas Pajdla, Michal Polic | Lecture Notes in Computer Science book series (LNCS, volume 13485) | 10.1007/978-3-031-16788-1_16 | https://arxiv.org/abs/2209.10185
- Unsupervised Speech Enhancement using Dynamical Variational Autoencoders | Xiaoyu Bie; Simon Leglaive; Xavier Alameda-Pineda; Laurent Girin | IEEE/ACM Transactions on Audio, Speech and Language Processing | 10.1109/taslp.2022.3207349 | http://arxiv.org/abs/2106.12271
- Semi-supervised learning made simple with self-supervised clustering | Fini, Enrico; Astolfi, Pietro; Alahari, Karteek; Alameda-Pineda, Xavier; Mairal, Julien; Nabi, Moin; Ricci, Elisa | 10.48550/arxiv.2306.07483 | https://inria.hal.science/hal-04073630
- Galois/monodromy groups for decomposing minimal problems in 3D reconstruction | Duff, Timothy; Korotynskiy, Viktor; Pajdla, Tomas; Regan, Margaret H. | SIAM Journal on Applied Algebra and Geometry | 10.1137/21m142287 | http://arxiv.org/abs/2105.04460
- From two rolling shutters to one global shutter | Albl, Cenek; Kukelova, Zuzana; Larsson, Viktor; Pajdla, Tomas; Schindler, Konrad | CVF 2020 | 10.48550/arxiv.2006.01964 | http://arxiv.org/abs/2006.01964
- Variational Meta Reinforcement Learning for Social Robotics | Ballou, Anand; Alameda-Pineda, Xavier; Reinke, Chris | Applied Intelligence | 10.48550/arxiv.2206.03211 | https://hal.inria.fr/hal-03908505
- Successor Feature Representations | Reinke, Chris; Alameda-Pineda, Xavier | Transactions on Machine Learning Research | 10.48550/arxiv.2110.15701 | https://inria.hal.science/hal-03426870
- ROS4HRI: Standardising an Interface for Human-Robot Interaction | Raquel Ros, Séverin Lemaignan, Lorenzo Ferrini, Antonio Andriella, Aina Irisarri | HRI 2023 Workshop | https://academia.skadge.org/publis/ros2023ros4hri.pdf
- Unifying Bottom-Up and Top-Down Attention Models for Social Robots | S. Lemaignan; S. Cooper; R. Ros; L. Ferrini; A. Andriella; A. Irisarri | HRI 2023 | 10.1145/3568294.3580041 | https://dl.acm.org/doi/abs/10.1145/3568294.3580041
- Speech Modeling with a Hierarchical Transformer Dynamical VAE | Xiaoyu Lin; Xiaoyu Bie; Simon Leglaive; Laurent Girin; Xavier Alameda-Pineda | 10.48550/arxiv.2303.09404 | https://doi.org/10.1109/icassp49357.2023.10096751
- A Multimodal Dynamical Variational Autoencoder for Audiovisual Speech Representation Learning | Sadok, Samir; Leglaive, Simon; Girin, Laurent; Alameda-Pineda, Xavier; Séguier, Renaud | 10.48550/arxiv.2305.03582 | https://inria.hal.science/hal-04132316
Data & software
All of our currently released modules and code are available here: https://gitlab.inria.fr/spring
Check Deliverables if you are looking for a specific module and are interested in reusing our results.
Workshops & Seminars
2020 online seminars:
- SPRING Technical Seminar #1: Audio Visual Machine Perception for Human-Robot Interaction
- SPRING Technical Seminar #2: Multi-Microphone Speech Enhancement
- SPRING Technical Seminar #3: Conversational AI for Human-Robot Interaction: an introduction
- SPRING Technical Seminar #4: Synchronisation and Cycle-Consistency in Computer Vision
HRI workshop, 9-10 March 2022, organised by PAL Robotics: https://www.hrimeetup.eu/
Workshop “AI and robotics applied to sustainable care” at HRI 2023.
PAL Robotics described its connection with SPRING and how to develop socially assistive robots capable of multi-person interaction and extended dialogue.
UPCOMING: SoRAIM Winter School, 19-23 February 2024, Grenoble, France — details on dedicated page