Real Virtual Humans (Talk)
With the explosive growth of available training data, 3D human pose and shape estimation is on the verge of a transition to a data-centric paradigm. To leverage data at scale, we need flexible models trainable from heterogeneous data sources. To this end, our latest work, Neural Localizer Fields, seamlessly unifies different human pose and shape-related tasks and datasets through the ability, both at training and test time, to query any point of the human volume and obtain its estimated 3D location from a single RGB image. We achieve this by learning a continuous neural field of body point localizer functions, each of which is a differently parameterized 3D heatmap-based convolutional point localizer. This way, we can naturally exploit differently annotated data sources, including parametric meshes, 2D/3D skeletons and dense pose, without having to explicitly convert between them, and thereby train large-scale 3D human mesh and skeleton estimation models that outperform the state of the art by a considerable margin on several public benchmarks, including 3DPW, EMDB and SSP-3D.
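To make the core idea concrete, the following is a minimal illustrative sketch, not the authors' implementation: a small PyTorch-style module in which a neural field maps a canonical body query point to the weights of a per-point 1x1x1 localizer, which is applied to a shared 3D feature volume to produce a heatmap that is read out with a soft-argmax. All module names, feature sizes and the backbone stand-in are assumptions made purely for illustration.

    # Illustrative sketch only (not the authors' code): a "localizer field" that maps a
    # canonical query point to the weights of a small heatmap-based point localizer.
    # Backbone, sizes and the soft-argmax readout are simplifying assumptions.
    import torch
    import torch.nn as nn

    class LocalizerField(nn.Module):
        def __init__(self, feat_channels=256, hidden=512, depth=8):
            super().__init__()
            self.feat_channels, self.depth = feat_channels, depth
            # Hypothetical backbone stand-in: image -> 3D feature volume (C, D, H, W)
            self.backbone = nn.Conv2d(3, feat_channels * depth, kernel_size=3, padding=1)
            # Neural field: canonical body point (x, y, z) -> localizer weights + bias
            self.field = nn.Sequential(
                nn.Linear(3, hidden), nn.GELU(),
                nn.Linear(hidden, feat_channels + 1))

        def forward(self, image, query_points):
            # image: (B, 3, H, W); query_points: (B, Q, 3) canonical coordinates
            B, _, H, W = image.shape
            feats = self.backbone(image)                              # (B, C*D, H, W)
            vol = feats.view(B, self.feat_channels, self.depth, H, W) # (B, C, D, H, W)

            params = self.field(query_points)                         # (B, Q, C+1)
            w, b = params[..., :-1], params[..., -1:]                  # per-query weights/bias

            # Apply each query's 1x1x1 localizer to the shared volume -> 3D heatmap logits
            logits = torch.einsum('bqc,bcdhw->bqdhw', w, vol) + b[..., None, None]
            heat = logits.flatten(2).softmax(-1).reshape(logits.shape)

            # Soft-argmax: expected (x, y, z) location under the normalized heatmap
            D = self.depth
            zz, yy, xx = torch.meshgrid(
                torch.linspace(0, 1, D), torch.linspace(0, 1, H), torch.linspace(0, 1, W),
                indexing='ij')
            grid = torch.stack([xx, yy, zz], dim=-1).to(image)         # (D, H, W, 3)
            return torch.einsum('bqdhw,dhwk->bqk', heat, grid)         # (B, Q, 3)

    # Usage sketch: any set of body points can be queried for the same image.
    # model = LocalizerField()
    # coords = model(torch.randn(2, 3, 64, 64), torch.rand(2, 100, 3))

Because the query point is a continuous input rather than a fixed output channel, differently annotated supervision (mesh vertices, skeleton joints, dense surface points) can in principle be used to train the same model, which is the flexibility the abstract describes.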
Biography: István Sárándi is a postdoctoral researcher in the Real Virtual Humans group led by Gerard Pons-Moll at the University of Tübingen. He received his BSc in Computer Engineering from the Budapest University of Technology and Economics, and his MSc and PhD in Computer Science from RWTH Aachen University, where he was advised by Bastian Leibe. His research focuses on computer vision for 3D human pose and body shape estimation, especially targeting real-time robotics applications. His previous works have won 3D pose estimation challenges at ECCV 2018 and ECCV 2020.