We view optical flow as the projection of the 3D motion field onto the image plane. Until recently, optical flow algorithms were designed by hand and incorporated various heuristics. Deep learning methods provide an opportunity to move away from hand-crafted models but have several limitations. The key limitation is that they require large amounts of training data, yet no sensor provides ground truth optical flow for real image sequences.
To deal with large image motions in a compact network, we developed the Spatial Pyramid Network (SpyNet) [ ], which computes optical flow by combining a classical coarse-to-fine flow approach with deep learning. At each level of a spatial pyramid, the deep network computes an update to the current flow estimate. SpyNet is 96% smaller than FlowNet, is very fast, and can be trained end-to-end, making it easy to incorporate into other networks for tasks like action recognition [ ].
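The coarse-to-fine scheme described above can be illustrated with a minimal sketch. This is not the SpyNet implementation: the per-level network is replaced by a hypothetical stand-in, `residual_update`, that searches for the best integer shift in {-1, 0, 1}², and the flow is simplified to a single global translation `(u, v)` rather than a dense field. All function names and the test image are assumptions made for illustration.

```python
def downsample(img):
    """Halve the resolution by averaging 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def warp(img, u, v):
    """Sample img at (x + u, y + v), clamping at the border (integer shifts only)."""
    h, w = len(img), len(img[0])
    return [[img[min(max(y + v, 0), h - 1)][min(max(x + u, 0), w - 1)]
             for x in range(w)] for y in range(h)]


def residual_update(img1, img2_warped):
    """Stand-in for the per-level network (assumption): pick the shift in
    {-1, 0, 1}^2 that minimises the sum of squared differences."""
    best = None
    for dv in (-1, 0, 1):
        for du in (-1, 0, 1):
            cand = warp(img2_warped, du, dv)
            err = sum((a - b) ** 2
                      for r1, r2 in zip(img1, cand) for a, b in zip(r1, r2))
            if best is None or err < best[0]:
                best = (err, du, dv)
    return best[1], best[2]


def coarse_to_fine_flow(img1, img2, levels=3):
    """Estimate a global translation such that img1(x, y) ~ img2(x + u, y + v)."""
    pyr1, pyr2 = [img1], [img2]
    for _ in range(levels - 1):
        pyr1.append(downsample(pyr1[-1]))
        pyr2.append(downsample(pyr2[-1]))
    u, v = 0, 0
    for lvl in reversed(range(levels)):      # coarsest level first
        u, v = 2 * u, 2 * v                  # upsample the current estimate
        warped = warp(pyr2[lvl], u, v)       # compensate for the motion found so far
        du, dv = residual_update(pyr1[lvl], warped)
        u, v = u + du, v + dv                # apply the small residual update
    return u, v


def square_image(size, x0, y0, side):
    """Hypothetical test image: a bright square on a dark background."""
    return [[1.0 if x0 <= x < x0 + side and y0 <= y < y0 + side else 0.0
             for x in range(size)] for y in range(size)]


if __name__ == "__main__":
    img1 = square_image(16, 4, 4, 4)
    img2 = square_image(16, 8, 4, 4)         # same square, moved 4 pixels right
    print(coarse_to_fine_flow(img1, img2, levels=3))  # (4, 0)
    print(coarse_to_fine_flow(img1, img2, levels=1))  # (1, 0)
```

With three levels, the per-level search never has to cover more than one pixel, yet the 4-pixel motion is recovered; with a single level the same search saturates at one pixel. This is the property that lets each per-level model stay small.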
Synthetic data, used for training most deep flow methods, is currently far from realistic. Consequently, we train our IPFlow [ ] method on a temporal frame interpolation task using a movie database, which encourages it to learn about image motion in complex scenes. We show that this network can then be easily fine-tuned to compute flow using a small amount of ground truth data.
We go further to address the problem of unsupervised learning. To make this feasible, we build in known geometric information about optical flow in rigid scenes. We introduce the Competitive Collaboration framework [ ] and use it to train four different networks that estimate monocular depth, camera pose, optical flow and non-rigid motion segmentation. All of these models compete and collaborate to explain the image sequence. This produces the most accurate unsupervised flow results to date.
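To make the geometric constraint concrete: in a rigid (static) part of the scene, the flow at a pixel is fully determined by its depth and the camera motion. The following is a minimal sketch under a pinhole camera model, not the formulation used in the cited work. The function name `rigid_flow`, the intrinsics `f`, `cx`, `cy`, and the convention that a 3D point `P` in the first camera frame maps to `R P + t` in the second are all assumptions made for illustration.

```python
def rigid_flow(x, y, depth, f, cx, cy, R, t):
    """Flow induced at pixel (x, y) by camera motion (R, t) in a static scene.

    Assumed convention: a point P in the first camera frame has coordinates
    R @ P + t in the second camera frame. R is a 3x3 nested list, t has length 3.
    """
    # Back-project the pixel to a 3D point using its depth.
    X = (x - cx) * depth / f
    Y = (y - cy) * depth / f
    Z = depth
    # Move the point into the second camera frame.
    X2 = R[0][0] * X + R[0][1] * Y + R[0][2] * Z + t[0]
    Y2 = R[1][0] * X + R[1][1] * Y + R[1][2] * Z + t[1]
    Z2 = R[2][0] * X + R[2][1] * Y + R[2][2] * Z + t[2]
    # Project back to the image and take the pixel displacement.
    x2 = f * X2 / Z2 + cx
    y2 = f * Y2 / Z2 + cy
    return x2 - x, y2 - y


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # A sideways shift of 0.1 units, seen at two different depths.
    print(rigid_flow(64.0, 48.0, 2.0, 100.0, 64.0, 48.0, identity, [0.1, 0.0, 0.0]))
    print(rigid_flow(64.0, 48.0, 4.0, 100.0, 64.0, 48.0, identity, [0.1, 0.0, 0.0]))
```

For a pure sideways translation the horizontal flow is `f * t[0] / depth`, so nearer points move more in the image. Pixels whose observed flow disagrees with this prediction are candidates for independently moving regions, which is where a separate flow estimate and a motion segmentation become necessary.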
Additionally, occlusion boundaries give important information about scene structure and we have worked on learning to detect these [ ]. Furthermore, we model occlusions and multiple frames in a video sequence for unsupervised learning of optical flow [ ].
Ranjan, A., Jampani, V., Balles, L., Kim, K., Sun, D., Wulff, J., Black, M. J.
In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)