Monday, March 30, 2015, 4pm
Andrew Fitzgibbon, Microsoft Research, Cambridge.
Learning about Shape
Abstract: Vision is naturally concerned with shape. If we could recover a stable and compact representation of object shape from images, we would hope it might aid with numerous vision tasks. Just the silhouette of an object is a strong cue to its identity, and the silhouette is generated by its 3D shape. In computer vision, many representations have been explored: collections of points, “simple” shapes like ellipsoids or polyhedra, algebraic surfaces and other implicit surfaces, generalized cylinders and ribbons, and piecewise (rational) polynomial representations like NURBS and subdivision surfaces. Many of these can be embedded more or less straightforwardly into probabilistic shape spaces, and recovery (a.k.a. “learning”) of one such space is the goal of the experimental part of this talk.
When recovering shape from measurements, there is at first sight a natural hierarchy of stability: straight lines can represent very little but may be robustly recovered from data, then come conic sections, splines with fixed knots, and general piecewise representations. I will show, however, that one can pass almost immediately to piecewise representations without loss of robustness. In particular, I shall show how a popular representation in computer graphics—subdivision curves and surfaces—may readily be fit to a variety of image data using the technique for ellipse fitting introduced by Gander, Golub, and Strebel in 1994. I show how we can address the previously-difficult problem of recovering 3D shape from multiple silhouettes, and the considerably harder problem which arises when the silhouettes are not from the same object instance, but from members of an object class, for example 30 images of different dolphins each in different poses. This requires that we simultaneously learn the shape space and the projections from each instance into its image. This simultaneous optimization is reminiscent of the bundle adjustment problem in computer vision, and indeed our most recent application, to tracking the human hand, makes good use of the Ceres Solver.
Joint work with Tom Cashman and others.
Bio: Andrew Fitzgibbon is a principal researcher in the computer vision group at Microsoft Research Cambridge. He is best known for his work on 3D vision, having been a core contributor to the Emmy-award-winning 3D camera tracker “boujou” (www.boujou.com) and to the human body tracking of Kinect for Xbox 360, but his interests are broad, spanning computer vision, graphics, machine learning, and even a little neuroscience. He has published numerous highly-cited papers, and received many awards for his work, including 9 conference prizes for best paper or runner-up, the Silver medal of the Royal Academy of Engineering, and the British Computer Society’s Roger Needham award. He is a fellow of the Royal Academy of Engineering, the British Computer Society, and the International Association for Pattern Recognition. He has been program chair of CVPR and ECCV. Before joining Microsoft in 2005, he was a Royal Society University Research Fellow at Oxford University, having previously studied at Edinburgh University, Heriot-Watt University, and University College, Cork.