I am interested in developing algorithms for 3D visual perception. This includes 3D reconstruction of the objects in a scene and visual perception in a moving, dynamic world. While such processing is effortless for the human visual system, even sophisticated computer vision algorithms come nowhere close in performance and are often unable to run online.
I am working on the AirCap project, whose goal is to develop a 3D shape and motion capture system for outdoor scenarios using multiple UAVs. Each UAV is equipped with an RGB camera and onboard computation capabilities to process the camera input. I am interested in developing shape and pose estimation algorithms that can execute on each UAV's computation unit with minimal inter-UAV communication. The algorithms should take advantage of the multi-view RGB input and provide feedback to each UAV's flight controller for optimal formation planning. I also work at the software workshop, where I am implementing the derivative computation of the SMPL body model in OpenCL to harness the power of parallel GPU computing; a toy sketch of the underlying idea follows.
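To make that concrete, here is a minimal toy sketch, in Python/NumPy rather than OpenCL, and not the actual workshop code: a simplified linear-blend-skinning function maps per-joint pose parameters to vertex positions, and its Jacobian is approximated by finite differences. The GPU version evaluates analytic derivatives instead, with the per-vertex work mapped to parallel kernel invocations; all names, shapes, and sizes below are illustrative assumptions.

```python
# Toy sketch of differentiating a skinned body model (not SMPL itself):
# a simplified LBS function and a finite-difference Jacobian of
# vertices with respect to pose. Illustrative only.
import numpy as np

def rodrigues(axis_angle):
    """Rotation matrix from an axis-angle vector (Rodrigues' formula)."""
    theta = np.linalg.norm(axis_angle)
    if theta < 1e-8:
        return np.eye(3)
    k = axis_angle / theta
    K = np.array([[0, -k[2], k[1]],
                  [k[2], 0, -k[0]],
                  [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def skinned_vertices(pose, template, weights):
    """Toy LBS: rotate the template by per-joint rotations, blend by weights.
    pose: (J, 3) axis-angle per joint; template: (V, 3); weights: (V, J)."""
    rots = np.stack([rodrigues(p) for p in pose])         # (J, 3, 3)
    per_joint = np.einsum('jab,vb->vja', rots, template)  # (V, J, 3)
    return np.einsum('vj,vja->va', weights, per_joint)    # (V, 3)

def jacobian_fd(pose, template, weights, eps=1e-6):
    """Central-difference Jacobian d(vertices)/d(pose), shape (V*3, J*3).
    The GPU implementation computes analytic derivatives in parallel instead."""
    flat = pose.ravel()
    cols = []
    for i in range(flat.size):
        dp = np.zeros_like(flat); dp[i] = eps
        hi = skinned_vertices((flat + dp).reshape(pose.shape), template, weights)
        lo = skinned_vertices((flat - dp).reshape(pose.shape), template, weights)
        cols.append((hi - lo).ravel() / (2 * eps))
    return np.stack(cols, axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    V, J = 5, 2                                   # tiny mesh: 5 vertices, 2 joints
    template = rng.normal(size=(V, 3))
    weights = rng.dirichlet(np.ones(J), size=V)   # skinning weights, rows sum to 1
    pose = 0.1 * rng.normal(size=(J, 3))
    print(jacobian_fd(pose, template, weights).shape)  # (15, 6)
```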
I completed my Master's in Neural Information Processing at the International Max Planck Research School for Cognitive and Systems Neuroscience, University of Tuebingen. Before that, I worked at Samsung R&D Institute, Bangalore, India, for two years. I did my Bachelor's in Electronics and Communication Engineering at IIT (BHU) Varanasi, India.
Our goal is markerless, unconstrained human and animal motion capture outdoors. To that end, we are developing a flying mocap system using a team of micro aerial vehicles (MAVs) with only on-board, monocular RGB cameras. To realize such an outdoor motion capture system, we need to address research challenges...
IEEE Robotics and Automation Letters, 5(4):6678-6685, IEEE, October 2020. Also accepted and presented at the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). (article)
In this letter, we introduce a deep reinforcement learning (DRL) based multi-robot formation controller for the task of autonomous aerial human motion capture (MoCap). We focus on vision-based MoCap, where the objective is to estimate the trajectory of the body pose and shape of a single moving person using multiple micro aerial vehicles. State-of-the-art solutions to this problem are based on classical control methods, which depend on hand-crafted system and observation models. Such models are difficult to derive and to generalize across different systems. Moreover, their non-linearities and non-convexities lead to sub-optimal controls. In our work, we formulate this problem as a sequential decision-making task in order to achieve the vision-based motion capture objectives, and solve it using a deep neural network-based RL method. We leverage proximal policy optimization (PPO) to train a stochastic decentralized control policy for formation control. The neural network is trained in a parallelized setup in synthetic environments. We performed extensive simulation experiments to validate our approach. Finally, real-robot experiments demonstrate that our policies generalize to real-world conditions.
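For readers unfamiliar with PPO, the core training signal it optimizes is a clipped surrogate objective. The sketch below is generic PPO, not the paper's implementation; the variable names and the toy data are assumptions for illustration.

```python
# Minimal sketch of PPO's clipped surrogate loss (generic PPO, not the
# paper's code). Inputs are per-sample log-probabilities of the taken
# actions under the new and old policies, plus advantage estimates.
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    ratio = np.exp(logp_new - logp_old)                    # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # Pessimistic bound: elementwise minimum, averaged and negated so that
    # minimizing this loss maximizes the clipped surrogate objective.
    return -np.mean(np.minimum(unclipped, clipped))

# Tiny usage example with random placeholder data:
rng = np.random.default_rng(0)
print(ppo_clip_loss(rng.normal(size=64), rng.normal(size=64), rng.normal(size=64)))
```

The clipping is what makes the update conservative: it removes the incentive to move the new policy's action probabilities more than a factor of 1±eps away from the old policy's in a single update.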
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019), Workshop on Aerial Swarms, November 2019 (misc)
This paper presents an overview of the grassroots project Aerial Outdoor Motion Capture (AirCap) running at the Max Planck Institute for Intelligent Systems. AirCap's goal is to achieve markerless, unconstrained human motion capture (mocap) in unknown and unstructured outdoor environments. To that end, we have developed an autonomous flying motion capture system using a team of micro aerial vehicles (MAVs) with only on-board, monocular RGB cameras. We have conducted several real-robot experiments involving up to three MAVs autonomously tracking and following a person in several challenging scenarios, using our approach of active cooperative perception developed in AirCap. Using the images captured by these robots during the experiments, we have demonstrated successful offline body pose and shape estimation with sufficiently high accuracy. Overall, we have demonstrated the first fully autonomous flying motion capture system involving multiple robots for outdoor scenarios.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages 823-832, IEEE, October 2019 (conference)
Capturing human motion in natural scenarios means moving motion capture out of the lab and into the wild. Typical approaches rely on fixed, calibrated cameras and reflective markers on the body, significantly limiting the motions that can be captured. To make motion capture truly unconstrained, we describe the first fully autonomous outdoor capture system based on flying vehicles. We use multiple micro aerial vehicles (MAVs), each equipped with a monocular RGB camera, an IMU, and a GPS receiver module. These detect the person, optimize their positions, and localize themselves approximately. We then develop a markerless motion capture method that is suitable for this challenging scenario: a distant subject, viewed from above, with approximately calibrated and moving cameras. We combine multiple state-of-the-art 2D joint detectors with a 3D human body model and a powerful prior on human pose. We jointly optimize for 3D body pose and camera pose to robustly fit the 2D measurements. To our knowledge, this is the first successful demonstration of outdoor, full-body, markerless motion capture from autonomous flying vehicles.
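To give a flavor of the kind of joint objective this involves, here is a toy sketch, not the paper's implementation (which fits a full 3D body model with a learned pose prior): a set of 3D points standing in for body joints and per-camera translations are fit jointly to noisy 2D detections under a simple pinhole model, with a quadratic prior on the 3D pose. All functions, shapes, and weights below are illustrative assumptions.

```python
# Toy sketch of jointly fitting 3D "joints" and camera parameters to 2D
# detections with a pose prior (illustrative, not the paper's method).
import numpy as np
from scipy.optimize import least_squares

def project(X, t, f=500.0):
    """Pinhole projection of world points X (N,3) from a camera at translation t.
    Toy extrinsics: translation only, no rotation."""
    Xc = X + t
    return f * Xc[:, :2] / Xc[:, 2:3]

def residuals(params, det2d, mean3d, w_prior=0.1):
    """Stacked reprojection residuals over all cameras, plus a quadratic
    prior pulling the 3D points toward a mean pose."""
    N, C = mean3d.shape[0], det2d.shape[0]
    X = params[:3 * N].reshape(N, 3)
    ts = params[3 * N:].reshape(C, 3)
    r = [(project(X, ts[c]) - det2d[c]).ravel() for c in range(C)]
    r.append(w_prior * (X - mean3d).ravel())
    return np.concatenate(r)

rng = np.random.default_rng(1)
N, C = 12, 3                                     # 12 "joints", 3 cameras
X_true = rng.normal(size=(N, 3)) + [0, 0, 10]    # points in front of the cameras
ts_true = rng.normal(scale=0.5, size=(C, 3))
det2d = np.stack([project(X_true, t) + rng.normal(scale=1.0, size=(N, 2))
                  for t in ts_true])             # noisy 2D detections
x0 = np.concatenate([(X_true + 0.3 * rng.normal(size=X_true.shape)).ravel(),
                     np.zeros(3 * C)])           # perturbed initialization
# For simplicity the prior mean is X_true here; a real system would use a
# learned prior on human pose instead.
sol = least_squares(residuals, x0, args=(det2d, X_true))
print("final cost:", sol.cost)
```

The prior term is what keeps the fit robust when individual 2D detections are wrong or missing, which is the same role the paper's pose prior plays for the full body model.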
Our goal is to understand the principles of Perception, Action and Learning in autonomous systems that successfully interact with complex environments, and to use this understanding to design future systems.