Depth is key to the success of modern neural networks. Mathematically, however, it is not obvious why, or when, deeper is better. This talk will survey several lines of work, spanning approximation theory, probability, statistical physics, and optimization, that seek to answer these questions.