Leger, F., & Aubin, P.-C. (2024). Extending convexity and gradient descent: a framework for general costs based on alternating minimization. In Journees SMAI MODE (pp. 15–15).
E105-04 - Forschungsbereich Variationsrechnung, Dynamische Systeme und Operations Research
-
Published in:
Journees SMAI MODE
-
Date (published):
27-Mar-2024
-
Event name:
Journées SMAI MODE (2024)
fr
Event date:
27-Mar-2024 - 29-Mar-2024
-
Event place:
Lyon, France
-
Number of Pages:
1
-
Keywords:
continuous optimization; Gradient Descent; General Cost
en
Abstract:
In this talk we go beyond the quadratic cost in optimization, by replacing it with a
general cost function and using a majorize-minimize framework. We unveiled in [1] a new class of gradient-type optimization methods that extends vanilla gradient descent, mirror descent, Riemannian gradient descent, and natural gradient descent, while keeping the same proof ideas and (sub)linear rates of convergence. Our approach involves constructing a surrogate for the
objective function f in a systematic manner, based on a chosen cost function c. This surrogate ϕ(x, y) = c(x, y)+fc
(y) is then minimized using an alternating minimization scheme. We obtain
novel rates of convergence for this general algorithm based on Csiszar and Tusnady’s “five-points
inequality” [2].
Specifying the results to cost functions c, using optimal transport theory we establish convergence rates based on generalized notions of L-smoothness and convexity, namely c-concavity and the c-cross-convexity that we introduce. We provide local versions of these two notions when the cost satisfies a condition known as nonnegative cross-curvature. In particular our framework
encompasses the concept of relative smoothness introduced in 2017 in [3] for Bregman divergences, and we provide the first global rates for natural gradient descent and Newton’s method under a new assumption distinct from self-concordance.
en
Project title:
Unilateralität und Asymmetrie in der Variationsanalyse: P 36344N (FWF - Österr. Wissenschaftsfonds)
-
Additional information:
Références
[1] F. Léger and P.-C. Aubin-Frankowski. Gradient descent with a general cost.
arxiv.org/abs/2305.04917, 2023.
[2] I. Csiszár and G. Tusnády. Information Geometry and Alternating Minimization Procedures. Statistics and Decisions, 205–237, 1984.
[3] H. H. Bauschke, J. Bolte and M. Teboulle. A descent lemma beyond Lipschitz gradient
continuity: first-order methods revisited and applications. Mathematics of Operations Research, 42(2):330–348, 2017