Oct 20 [CORE, joint with Statistics]

John Duchi, *Departments of Statistics and Electrical Engineering, Stanford University*


# Hongzhou Lin; A Generic Quasi-Newton Algorithm for Faster Gradient-Based Optimization

**June 6, 2017, 4pm**

PDL C-401

**Hongzhou Lin,** *Inria Grenoble*

In this talk, we propose a generic approach to accelerating gradient-based optimization algorithms using quasi-Newton principles. The proposed scheme, called QuickeNing, can be applied to incremental first-order methods such as stochastic variance-reduced gradient (SVRG) or incremental surrogate optimization (MISO). It is also compatible with composite objectives, meaning it can return exactly sparse solutions when the objective involves a sparsity-inducing regularization. QuickeNing relies on limited-memory BFGS rules, making it appropriate for solving high-dimensional optimization problems. Moreover, it enjoys a worst-case linear convergence rate for strongly convex problems. We present experimental results in which QuickeNing gives significant improvements over competing methods for solving large-scale, high-dimensional machine learning problems.
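The limited-memory BFGS rules mentioned in the abstract can be illustrated with the classical two-loop recursion, which applies an implicit inverse-Hessian approximation to a gradient using only a short history of iterate/gradient differences. This is a generic sketch of the standard L-BFGS building block, not the authors' implementation; the function name and memory layout are our own:

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Two-loop recursion: apply the implicit L-BFGS inverse-Hessian
    approximation to `grad`, using stored pairs s_k = x_{k+1} - x_k and
    y_k = g_{k+1} - g_k (most recent pair last in each list)."""
    q = grad.copy()
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: newest pair to oldest.
    for s, y, rho in reversed(list(zip(s_list, y_list, rhos))):
        a = rho * np.dot(s, q)
        alphas.append(a)
        q -= a * y
    # Initial Hessian scaling gamma = <s, y> / <y, y> from the newest pair.
    if s_list:
        gamma = np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = gamma * q
    # Second loop: oldest pair to newest.
    for (s, y, rho), a in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        beta = rho * np.dot(y, r)
        r += (a - beta) * s
    return r  # approximates H^{-1} grad
```

With empty memory this reduces to the plain gradient; on a one-dimensional quadratic with curvature $a$, a single stored pair already recovers the exact Newton step $g/a$.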

# Kellie MacPhee; Gauge and perspective duality

**May 30, 2017, 4pm**

PDL C-401

**Kellie MacPhee,** *Department of Mathematics, University of Washington*

*Abstract:* Fenchel-Young duality is widely used in convex optimization and relies on the conjugacy operation for convex functions; however, alternative notions of duality relying on parallel operations exist as well. In particular, gauge and perspective duality are defined via the polarity operation on gauge functions. We present a perturbation argument for deriving gauge duality, thus placing it on equal footing with Fenchel-Young duality. This approach also yields explicit optimality conditions (analogous to KKT conditions), and a simple primal-from-dual recovery method based on existing algorithms. Numerical results confirm the usefulness of this approach in certain contexts (e.g. optimization over PLQ functions).
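For readers unfamiliar with the polarity operation mentioned above: for a gauge $\kappa$ (a nonnegative, convex, positively homogeneous function with $\kappa(0)=0$), the polar gauge is

$$
\kappa^{\circ}(y) \;=\; \inf\bigl\{\, \mu > 0 \;:\; \langle x, y\rangle \le \mu\,\kappa(x)\ \ \text{for all } x \,\bigr\},
$$

and one standard gauge-dual pairing from the gauge-optimization literature (notation illustrative; $\rho$ is a gauge measuring data misfit) reads

$$
\min_{x}\ \kappa(x)\ \ \text{s.t.}\ \ \rho(b - Ax) \le \sigma
\qquad\longleftrightarrow\qquad
\min_{y}\ \kappa^{\circ}(A^{\mathsf T} y)\ \ \text{s.t.}\ \ \langle b, y\rangle - \sigma\,\rho^{\circ}(y) \ge 1.
$$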

# Peng Zheng; What’s the shape of your penalty?

**May 9, 2017, 4pm**

PDL C-401

**Peng Zheng,** *Department of Applied Mathematics, University of Washington*

*Abstract:* The performance of machine learning approaches is strongly influenced by the choice of misfit penalty and the correct setting of penalty parameters, such as the threshold of the Huber function. These parameters are typically chosen using expert knowledge, cross-validation, or black-box optimization, all of which are time-consuming for large-scale applications.

We present a data-driven approach that simultaneously solves inference problems and learns error structure and penalty parameters. We discuss theoretical properties of these joint problems, and present algorithms for their solution. We show numerical examples from the piecewise linear-quadratic (PLQ) family of penalties.
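As a concrete instance of a threshold-dependent penalty, here is the Huber misfit referenced in the abstract; this minimal sketch is our own, with `tau` playing the role of the threshold parameter that the talk proposes to learn from data rather than fix by hand:

```python
import numpy as np

def huber(r, tau):
    """Huber misfit: quadratic for small residuals (|r| <= tau),
    linear for large ones, so outliers are penalized less harshly
    than under least squares."""
    a = np.abs(r)
    return np.where(a <= tau, 0.5 * a**2, tau * a - 0.5 * tau**2)
```

For example, a residual of 3 under `tau = 1` costs 2.5 rather than the least-squares 4.5, which is what gives the penalty its robustness; choosing `tau` well is exactly the tuning problem the abstract describes.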

# Scott Roy; An Optimal First-order Method Based on Optimal Quadratic Averaging

**May 1, 2017, 4pm**

PDL C-401

**Scott Roy,** *Department of Mathematics, University of Washington*

*Abstract:* In a recent paper, Bubeck, Lee, and Singh introduced a new first-order method for minimizing smooth, strongly convex functions. Their geometric descent algorithm, largely inspired by the ellipsoid method, enjoys the optimal linear rate of convergence. We show that the same iterate sequence is generated by a scheme that in each iteration computes an optimal average of quadratic lower models of the function. Indeed, the minimum of the averaged quadratic approaches the true minimum at an optimal rate. This intuitive viewpoint reveals clear connections to the original fast-gradient methods and cutting-plane ideas, and leads to limited-memory extensions with improved performance.

Joint work with Dmitriy Drusvyatskiy and Maryam Fazel.
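The averaging step can be made concrete: a convex combination of two quadratic lower models with common curvature $\gamma$, $Q_i(x) = v_i + \tfrac{\gamma}{2}\|x - c_i\|^2$, is again a quadratic of the same form, and one can choose the weight to maximize its minimum value, i.e. to obtain the best lower bound. The sketch below (our own, with illustrative names) computes that optimal average:

```python
import numpy as np

def average_quadratics(vA, cA, vB, cB, gamma):
    """Optimally average Q_A and Q_B, where Q_i(x) = v_i + (gamma/2)||x - c_i||^2.
    The combination with weight lam has center lam*cA + (1-lam)*cB and
    minimum value lam*vA + (1-lam)*vB + (gamma/2)*lam*(1-lam)*||cA - cB||^2;
    we pick lam in [0, 1] to maximize that minimum value."""
    d2 = float(np.dot(cA - cB, cA - cB))
    if d2 == 0.0:
        lam = 1.0 if vA >= vB else 0.0   # same center: keep the better bound
    else:
        # v(lam) is concave quadratic in lam; clip its maximizer to [0, 1].
        lam = float(np.clip(0.5 + (vA - vB) / (gamma * d2), 0.0, 1.0))
    c = lam * cA + (1.0 - lam) * cB
    v = lam * vA + (1.0 - lam) * vB + 0.5 * gamma * lam * (1.0 - lam) * d2
    return v, c
```

Note that the cross term $\tfrac{\gamma}{2}\lambda(1-\lambda)\|c_A - c_B\|^2$ is nonnegative, so the averaged bound can strictly improve on both inputs; this is the mechanism by which the minimum of the averaged quadratic tracks the true minimum.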

# Andrew Pryhuber; A QCQP Approach for Triangulation

**Apr 25, 2017, 4pm**

PDL C-401

**Andrew Pryhuber,** *Department of Mathematics, University of Washington*

*Abstract:* Reconstruction of a 3D world point from $n\geq 2$ noisy 2D images is referred to as the triangulation problem and is fundamental in multi-view geometry. We show how this problem can be formulated as a quadratically constrained quadratic program (QCQP) and discuss an algorithm to construct candidate solutions. We also present a polynomial-time test, motivated by the underlying geometry of the triangulation problem, to confirm optimality of such a solution. Based on work by Chris Aholt, Sameer Agarwal, and Rekha Thomas.
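Schematically, the QCQP in question can be written as follows (our notation; $\hat u_i$ are the observed image points and $F_{ij}$ the fundamental matrices relating views $i$ and $j$):

$$
\min_{u_1,\dots,u_n}\ \sum_{i=1}^{n} \|u_i - \hat u_i\|^2
\quad \text{s.t.} \quad
\tilde u_i^{\mathsf T} F_{ij}\, \tilde u_j = 0 \ \ \text{for all pairs } (i,j),
$$

where $\tilde u_i = (u_i, 1)$ in homogeneous coordinates. The objective is convex quadratic, while the epipolar constraints are (nonconvex) quadratic in the unknowns, which is exactly what makes the problem a QCQP.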

# Zaid Harchaoui; Catalyst, Generic Acceleration Scheme for Gradient-based Optimization

CORE Series

**Tuesday, April 18, 2017**

**EEB 125, 4:00-5:00 PM**

**Zaid Harchaoui,** *University of Washington*

**TITLE: Catalyst, Generic Acceleration Scheme for Gradient-based Optimization**

ABSTRACT: We introduce a generic scheme called Catalyst for accelerating first-order optimization methods in the sense of Nesterov, which builds upon a new analysis of the accelerated proximal point algorithm. The proposed approach minimizes a convex objective by approximately solving a sequence of well-chosen auxiliary problems, leading to faster convergence. This strategy applies to a large class of algorithms, including gradient descent, block coordinate descent, SAG, SAGA, SDCA, SVRG, Finito/MISO, and their proximal variants. For all of these methods, we provide acceleration and explicit support for non-strongly convex objectives. Furthermore, the approach extends to possibly nonconvex optimization problems without sacrificing the rate of convergence to stationary points. We present experimental results showing that the Catalyst acceleration scheme is effective in practice, especially for ill-conditioned problems, where we measure significant improvements.
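A heavily simplified sketch of the accelerated proximal point idea behind Catalyst, with plain gradient descent standing in for the inner solver: each outer step only approximately minimizes the auxiliary objective $f(x) + \tfrac{\kappa}{2}\|x - y_k\|^2$, then extrapolates the iterates. The step size and momentum schedule below are illustrative placeholders, not the paper's:

```python
import numpy as np

def catalyst_sketch(f_grad, x0, kappa, n_outer=50, n_inner=100, lr=0.1):
    """Illustrative outer loop of an accelerated proximal point scheme.
    f_grad: gradient oracle of the (smooth convex) objective f.
    Each outer iteration takes a few inner gradient steps on the
    auxiliary objective h_k(x) = f(x) + (kappa/2)||x - y_k||^2,
    then extrapolates in the Nesterov style."""
    x_prev = x = np.asarray(x0, dtype=float)
    y = x.copy()
    for k in range(n_outer):
        # Inner loop: approximate prox-point step on h_k.
        z = x.copy()
        for _ in range(n_inner):
            z = z - lr * (f_grad(z) + kappa * (z - y))
        x_prev, x = x, z
        beta = k / (k + 3.0)          # illustrative momentum weight
        y = x + beta * (x - x_prev)   # extrapolation step
    return x
```

The point of the construction is that the auxiliary problems are better conditioned than $f$ itself (the added $\tfrac{\kappa}{2}\|x - y_k\|^2$ term contributes strong convexity), so cheap inner solvers make fast progress on them.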

BIO: Zaid Harchaoui is currently a Provost’s Initiative in Data-driven Discovery Assistant Professor in the Department of Statistics and a Data Science Fellow in the eScience Institute at the University of Washington. He completed his Ph.D. at ParisTech (now in Univ. Paris-Saclay), working with Eric Moulines, Stephane Canu and Francis Bach. Before joining the University of Washington, he was a visiting assistant professor at the Courant Institute of Mathematical Sciences at New York University (2015 – 2016). Prior to this, he was a permanent researcher on the LEAR team of Inria (2010 – 2015). He was a postdoctoral fellow in the Robotics Institute of Carnegie Mellon University in 2009.

He received the Inria award for scientific excellence and the NIPS reviewer award. He gave a tutorial on “Frank-Wolfe, greedy algorithms, and friends” at ICML’14, on “Large-scale visual recognition” at CVPR’13, and on “Machine learning for computer vision” at MLSS Kyoto 2015. He recently co-organized the “Future of AI” symposium at New York University, the workshop on “Optimization for Machine Learning” at NIPS’14, and the “Optimization and statistical learning” workshops in 2013 and 2015 at the Ecole de Physique des Houches (France). He served/will serve as Area Chair for ICML 2015, ICML 2016, NIPS 2016, and ICLR 2016. He is currently an associate editor of IEEE Signal Processing Letters.