June 9, 2023
Report
Scalable Second Order Optimization for Machine Learning
Abstract
Many machine learning (ML) training tasks are essentially optimization processes that would at first glance appear eminently parallelizable and scalable. However, effective acceleration of these tasks with scalable parallel hardware has proven to be elusive. While standard methods for machine learning, e.g., stochastic gradient descent (SGD) for deep neural networks (DNNs), tend to be resource-efficient, they appear to be fundamentally sequential in nature.
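To make the sequential nature concrete, consider the standard SGD recurrence, written here in conventional notation (the symbols below are illustrative and not drawn from the abstract itself): parameters $w_t$, learning rate $\eta_t$, and a stochastic gradient $\nabla f(w_t; x_t)$ computed on a minibatch $x_t$:

$$w_{t+1} = w_t - \eta_t \, \nabla f(w_t; x_t).$$

Each iterate depends on the one produced by the previous step, so the chain of updates must be computed in order: parallel hardware can accelerate a single gradient evaluation, but not the recurrence across steps.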