June 9, 2023
Report
Scalable Second Order Optimization for Machine Learning
Abstract
Many machine learning (ML) training tasks are essentially optimization processes that would at first glance appear eminently parallelizable and scalable. However, effective acceleration of these tasks with scalable parallel hardware has proven to be elusive. While standard methods for machine learning, e.g., stochastic gradient descent (SGD) for deep neural networks (DNNs), tend to be resource-efficient, they appear to be fundamentally sequential in nature.
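To make the sequential nature concrete, consider the standard SGD recurrence, written here in conventional notation (the symbols below are illustrative and not drawn from the abstract itself): parameters $w_t$, learning rate $\eta_t$, and a stochastic gradient $\nabla f(w_t; x_t)$ computed on a minibatch $x_t$:

$$w_{t+1} = w_t - \eta_t \, \nabla f(w_t; x_t).$$

Each iterate depends on the one produced by the previous step, so the chain of updates must be computed in order: parallel hardware can accelerate a single gradient evaluation, but not the recurrence across steps.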