2019
Towards Geometric Understanding of Motion
The motion of the world is inherently dependent on the spatial structure of the world and its geometry. Therefore, classical optical flow methods try to model this geometry to solve for the motion. However, recent deep learning methods take a completely different approach. They try to predict optical flow by learning from labelled data. Although deep networks have shown state-of-the-art performance on classification problems in computer vision, they have not been as effective in solving optical flow. The key reason is that deep learning methods do not explicitly model the structure of the world in a neural network, and instead expect the network to learn about the structure from data. We hypothesize that it is difficult for a network to learn about motion without any constraint on the structure of the world. Therefore, we explore several approaches to explicitly model the geometry of the world and its spatial structure in deep neural networks.
The spatial structure in images can be captured by representing it at multiple scales. To represent multiple scales of images in deep neural nets, we introduce a Spatial Pyramid Network (SPyNet). Such a network can leverage global information for estimating large motions and local information for estimating small motions. We show that SPyNet significantly improves over previous optical flow networks while also being the smallest and fastest neural network for motion estimation. SPyNet achieves a 97% reduction in model parameters over previous methods and is more accurate.
The spatial structure of the world extends to people and their motion. Humans have a very well-defined structure, and this information is useful in estimating optical flow for humans. To leverage this information, we create a synthetic dataset for human optical flow using a statistical human body model and motion capture sequences. We use this dataset to train deep networks and see significant improvement in the ability of the networks to estimate human optical flow.
The structure and geometry of the world affect the motion. Therefore, learning about the structure of the scene together with the motion can benefit both problems. To facilitate this, we introduce Competitive Collaboration, where several neural networks are constrained by geometry and can jointly learn about structure and motion in the scene without any labels. We show that jointly learning single view depth prediction, camera motion, optical flow and motion segmentation using Competitive Collaboration achieves state-of-the-art results among unsupervised approaches.
Our findings provide support for our hypothesis that explicit constraints on structure and geometry of the world lead to better methods for motion estimation.
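The multi-scale idea in the abstract above can be illustrated with a generic coarse-to-fine scheme. The following is a minimal sketch, not the thesis implementation: the function names, the 2x2 average pooling, the nearest-neighbour flow upsampling, and the `estimate_residual` callback (a stand-in for whatever per-level estimator is used) are all assumptions made for illustration. Image sides are assumed to be divisible by 2 at every level.

```python
def downsample(img):
    """Halve resolution by averaging each 2x2 block (img is a list of rows)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def build_pyramid(img, levels):
    """Return the image at `levels` scales, ordered coarsest to finest."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid[::-1]


def upsample_flow(flow):
    """Double the resolution of a flow field and double its vectors,
    since a 1-pixel motion at a coarse level is 2 pixels one level finer."""
    out = []
    for row in flow:
        wide = [(2 * u, 2 * v) for (u, v) in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def coarse_to_fine(img1, img2, levels, estimate_residual):
    """Estimate flow at the coarsest level, then refine it level by level.

    `estimate_residual(a, b, flow)` is a hypothetical callback that returns
    a per-pixel (du, dv) correction given two images and the current flow.
    """
    p1, p2 = build_pyramid(img1, levels), build_pyramid(img2, levels)
    flow = [[(0.0, 0.0)] * len(p1[0][0]) for _ in p1[0]]
    for k, (a, b) in enumerate(zip(p1, p2)):
        if k > 0:
            flow = upsample_flow(flow)
        residual = estimate_residual(a, b, flow)
        flow = [[(u + du, v + dv) for (u, v), (du, dv) in zip(fr, rr)]
                for fr, rr in zip(flow, residual)]
    return flow
```

In this sketch, large motions are picked up at the coarse levels, where they span only a few pixels, and each finer level only has to add a small correction.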
ProtoGAN: Towards Few Shot Learning for Action Recognition
Dwivedi, S. K., Gupta, V., Mitra, R., Ahmed, S., Jain, A.
Proc. International Conference on Computer Vision (ICCV) Workshops, October 2019 (manual)
2018
Model-based Optical Flow: Layers, Learning, and Geometry
Combining Data-Driven 2D and 3D Human Appearance Models
2017
Human Shape Estimation using Statistical Body Models
Learning Inference Models for Computer Vision
Decentralized Simultaneous Multi-target Exploration using a Connected Network of Multiple Robots
Nestmeyer, T., Robuffo Giordano, P., Bülthoff, H. H., Franchi, A.
In pages: 989-1011, Autonomous Robots, 2017 (incollection)
Capturing Hand-Object Interaction and Reconstruction of Manipulated Objects
2016
Non-parametric Models for Structured Data and Applications to Human Bodies and Natural Scenes
2015
Proceedings of the 37th German Conference on Pattern Recognition
Gall, J., Gehler, P., Leibe, B.
Springer, German Conference on Pattern Recognition, October 2015 (proceedings)
Shape Models of the Human Body for Distributed Inference
From Scans to Models: Registration of 3D Human Shapes Exploiting Texture Information
Long Range Motion Estimation and Applications
2014
Advanced Structured Prediction
Nowozin, S., Gehler, P. V., Jancsary, J., Lampert, C. H.
Advanced Structured Prediction, pages: 432, Neural Information Processing Series, MIT Press, November 2014 (book)
Modeling the Human Body in 3D: Data Registration and Human Shape Representation
Model transport: towards scalable transfer learning on manifolds - supplemental material
Freifeld, O., Hauberg, S., Black, M. J.
(9), April 2014 (techreport)
RoCKIn@Work in a Nutshell
Ahmad, A., Amigoni, F., Awaad, I., Berghofer, J., Bischoff, R., Bonarini, A., Dwiputra, R., Fontana, G., Hegger, F., Hochgeschwender, N., Iocchi, L., Kraetzschmar, G., Lima, P., Matteucci, M., Nardi, D., Schiaffonati, V., Schneider, S.
(FP7-ICT-601012 Revision 1.2), RoCKIn - Robot Competitions Kick Innovation in Cognitive Systems and Robotics, March 2014 (techreport)
RoCKIn@Home in a Nutshell
Ahmad, A., Amigoni, F., Awaad, I., Berghofer, J., Bischoff, R., Bonarini, A., Dwiputra, R., Fontana, G., Hegger, F., Hochgeschwender, N., Iocchi, L., Kraetzschmar, G., Lima, P., Matteucci, M., Nardi, D., Schneider, S.
(FP7-ICT-601012 Revision 0.8), RoCKIn - Robot Competitions Kick Innovation in Cognitive Systems and Robotics, March 2014 (techreport)
Human Pose Estimation from Video and Inertial Sensors
Simulated Annealing
2013
Puppet Flow
D2.1.4 RoCKIn@Work - Innovation in Mobile Industrial Manipulation Competition Design, Rule Book, and Scenario Construction
Ahmad, A., Awaad, I., Amigoni, F., Berghofer, J., Bischoff, R., Bonarini, A., Dwiputra, R., Hegger, F., Hochgeschwender, N., Iocchi, L., Kraetzschmar, G., Lima, P., Matteucci, M., Nardi, D., Schneider, S.
(FP7-ICT-601012 Revision 0.7), RoCKIn - Robot Competitions Kick Innovation in Cognitive Systems and Robotics, September 2013 (techreport)
D2.1.1 RoCKIn@Home - A Competition for Domestic Service Robots Competition Design, Rule Book, and Scenario Construction
Ahmad, A., Awaad, I., Amigoni, F., Berghofer, J., Bischoff, R., Bonarini, A., Dwiputra, R., Hegger, F., Hochgeschwender, N., Iocchi, L., Kraetzschmar, G., Lima, P., Matteucci, M., Nardi, D., Schneider, S.
(FP7-ICT-601012 Revision 0.7), RoCKIn - Robot Competitions Kick Innovation in Cognitive Systems and Robotics, September 2013 (techreport)
Statistics on Manifolds with Applications to Modeling Shape Deformations
D1.1 Specification of General Features of Scenarios and Robots for Benchmarking Through Competitions
Ahmad, A., Awaad, I., Amigoni, F., Berghofer, J., Bischoff, R., Bonarini, A., Dwiputra, R., Fontana, G., Hegger, F., Hochgeschwender, N., Iocchi, L., Kraetzschmar, G., Lima, P., Matteucci, M., Nardi, D., Schiaffonati, V., Schneider, S.
(FP7-ICT-601012 Revision 1.0), RoCKIn - Robot Competitions Kick Innovation in Cognitive Systems and Robotics, July 2013 (techreport)
SocRob-MSL 2013 Team Description Paper for Middle Sized League
Messias, J., Ahmad, A., Reis, J., Serafim, M., Lima, P.
17th Annual RoboCup International Symposium 2013, July 2013 (techreport)
Probabilistic Models for 3D Urban Scene Understanding from Movable Platforms
A Quantitative Analysis of Current Practices in Optical Flow Estimation and the Principles Behind Them
Sun, D., Roth, S., Black, M. J.
(CS-10-03), Brown University, Department of Computer Science, January 2013 (techreport)
Modeling Shapes with Higher-Order Graphs: Theory and Applications
Wang, C., Zeng, Y., Samaras, D., Paragios, N.
In Shape Perception in Human and Computer Vision: An Interdisciplinary Perspective, (Editors: Zygmunt Pizlo and Sven Dickinson), Springer, 2013 (incollection)
Class-Specific Hough Forests for Object Detection
Gall, J., Lempitsky, V.
In Decision Forests for Computer Vision and Medical Image Analysis, pages: 143-157, 11, (Editors: Criminisi, A. and Shotton, J.), Springer, 2013 (incollection)
Image Gradient Based Level Set Methods in 2D and 3D
Xianhua Xie, Si Yong Yeo, Majid Mirmehdi, Igor Sazonov, Perumal Nithiarasu
In Deformation Models: Tracking, Animation and Applications, pages: 101-120, 0, (Editors: Manuel González Hidalgo and Arnau Mir Torres and Javier Varona Gómez), Springer, 2013 (inbook)
2012
Virtual Human Bodies with Clothing and Hair: From Images to Animation
Coregistration: Supplemental Material
Lie Bodies: A Manifold Representation of 3D Human Shape. Supplemental Material
MPI-Sintel Optical Flow Benchmark: Supplemental Material
From Pixels to Layers: Joint Motion Estimation and Segmentation
Exploiting pedestrian interaction via global optimization and social behaviors
Leal-Taixé, L., Pons-Moll, G., Rosenhahn, B.
In Theoretic Foundations of Computer Vision: Outdoor and Large-Scale Real-World Scene Analysis, Springer, April 2012 (incollection)
HUMIM Software for Articulated Tracking
Soren Hauberg, Kim S. Pedersen
(01/2012), Department of Computer Science, University of Copenhagen, January 2012 (techreport)
A geometric framework for statistics on trees
Aasa Feragen, Mads Nielsen, Soren Hauberg, Pechin Lo, Marleen de Bruijne, Francois Lauze
(11/02), Department of Computer Science, University of Copenhagen, January 2012 (techreport)
Data-driven Manifolds for Outdoor Motion Capture
Pons-Moll, G., Leal-Taixé, L., Gall, J., Rosenhahn, B.
In Outdoor and Large-Scale Real-World Scene Analysis, 7474, pages: 305-328, LNCS, (Editors: Dellaert, Frank and Frahm, Jan-Michael and Pollefeys, Marc and Rosenhahn, Bodo and Leal-Taixé, Laura), Springer, 2012 (incollection)
Scan-Based Flow Modelling in Human Upper Airways
Perumal Nithiarasu, Igor Sazonov, Si Yong Yeo
In Patient-Specific Modeling in Tomorrow’s Medicine, pages: 241-280, 0, (Editors: Amit Gefen), Springer, 2012 (inbook)
An Introduction to Random Forests for Multi-class Object Detection
Gall, J., Razavi, N., van Gool, L.
In Outdoor and Large-Scale Real-World Scene Analysis, 7474, pages: 243-263, LNCS, (Editors: Dellaert, Frank and Frahm, Jan-Michael and Pollefeys, Marc and Rosenhahn, Bodo and Leal-Taixé, Laura), Springer, 2012 (incollection)
Home 3D body scans from noisy image and range data
Weiss, A., Hirshberg, D., Black, M. J.
In Consumer Depth Cameras for Computer Vision: Research Topics and Applications, pages: 99-118, 6, (Editors: Andrea Fossati and Juergen Gall and Helmut Grabner and Xiaofeng Ren and Kurt Konolige), Springer-Verlag, 2012 (incollection)
Consumer Depth Cameras for Computer Vision - Research Topics and Applications
Fossati, A., Gall, J., Grabner, H., Ren, X., Konolige, K.
Advances in Computer Vision and Pattern Recognition, Springer, 2012 (book)
2011
ISocRob-MSL 2011 Team Description Paper for Middle Sized League
Messias, J., Ahmad, A., Reis, J., Sousa, J., Lima, P.
15th Annual RoboCup International Symposium 2011, July 2011 (techreport)
Steerable random fields for image restoration and inpainting
Roth, S., Black, M. J.
In Markov Random Fields for Vision and Image Processing, pages: 377-387, (Editors: Blake, A. and Kohli, P. and Rother, C.), MIT Press, 2011 (incollection)
Benchmark datasets for pose estimation and tracking
Andriluka, M., Sigal, L., Black, M. J.
In Visual Analysis of Humans: Looking at People, pages: 253-274, (Editors: Moeslund and Hilton and Krüger and Sigal), Springer-Verlag, London, 2011 (incollection)
Fields of experts
Roth, S., Black, M. J.
In Markov Random Fields for Vision and Image Processing, pages: 297-310, (Editors: Blake, A. and Kohli, P. and Rother, C.), MIT Press, 2011 (incollection)