Calculus and Optimization

Much of machine learning reduces to finding parameters that minimize a loss function. This chapter covers the mathematics of that minimization in four sections: multivariable calculus, convex optimization, non-convex optimization, and constrained optimization. Multivariable calculus supplies the basic tools, and the convex case is well understood; most of the practical difficulty lies in the non-convex and constrained settings, so each of these receives its own section.

Multivariable Calculus: Partial derivatives, gradient, directional derivative, Taylor expansion
Convex Optimization: Convex sets and functions, convergence guarantees, strong convexity
Non-Convex Optimization: Saddle points, loss landscape geometry, learning rate schedules
Constrained Optimization: Lagrange multipliers, KKT conditions, duality