Numerical Analysis

To understand the algorithms behind ML clearly, it is important to have a solid grounding in numerical analysis.

First, a good summary of gradient descent methods is given in this paper:

An overview of gradient descent optimization algorithms – arXiv

Then, Michel Bierlaire of EPFL in Switzerland wrote a good book on optimization and runs a YouTube channel that gives a very good introduction to optimization methods for ML.

Michel Bierlaire EPFL web page

YouTube channel: Michel_Bierlaire