A function space view of overparameterized neural networks – Rebecca Willett, University of Chicago
It has been observed that vastly overparameterized neural networks with the capacity to fit virtually any labels
nevertheless generalize well when trained on real data. One possible explanation of this
phenomenon is that complexity control is being achieved by implicitly or explicitly
controlling the magnitude of the weights of the network. This raises the question: What
functions are well-approximated by neural networks whose weights are bounded in norm?
In this talk, I will give some partial answers to this question. In particular, I will give a
precise characterization of the space of functions realizable as a two-layer (i.e., one
hidden layer) neural network with ReLU activations having an unbounded number of units,
but where the Euclidean norm of the weights in the network remains bounded.
Surprisingly, this characterization is naturally posed in terms of the Radon transform as
used in computational imaging, and I will show how tools from Radon transform analysis
yield novel insights about learning with two- and three-layer ReLU networks. This is joint
work with Greg Ongie, Daniel Soudry, and Nati Srebro.
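
To make the object of study concrete, here is a minimal NumPy sketch of a two-layer ReLU network and the squared Euclidean norm of its weights. It is an illustration only, not the speaker's code: the function names (`two_layer_relu`, `weight_norm_sq`, `balance`), the random parameters, and the choice to leave the biases out of the norm are all assumptions made for this example. The sketch also shows why the norm is tied to the function rather than to one particular parameterization: because ReLU is positively homogeneous, each hidden unit can be rescaled without changing the network's output, and the norm is smallest when each unit's inner and outer weights are balanced.

```python
import numpy as np

def two_layer_relu(x, W, b, a, c=0.0):
    """Evaluate f(x) = sum_i a[i] * relu(W[i] @ x + b[i]) + c for a batch x of shape (n, d)."""
    hidden = np.maximum(x @ W.T + b, 0.0)
    return hidden @ a + c

def weight_norm_sq(W, a):
    """Squared Euclidean norm of the weights (biases excluded; an assumption of this sketch)."""
    return float(np.sum(W ** 2) + np.sum(a ** 2))

def balance(W, b, a):
    """Rescale each unit so that |a[i]| == ||W[i]||, leaving the function unchanged.

    Assumes every row of W and every entry of a is nonzero.
    """
    w_norms = np.linalg.norm(W, axis=1)
    scale = np.sqrt(np.abs(a) / w_norms)
    # relu(s * z) = s * relu(z) for s > 0, so scaling (W[i], b[i]) by s
    # and a[i] by 1/s leaves each unit's contribution unchanged.
    return W * scale[:, None], b * scale, a / scale

# Hypothetical example: 50 hidden units, 3 input dimensions.
rng = np.random.default_rng(0)
W = rng.normal(size=(50, 3))
b = rng.normal(size=50)
a = rng.normal(size=50)
x = rng.normal(size=(10, 3))

W_bal, b_bal, a_bal = balance(W, b, a)
same_function = np.allclose(two_layer_relu(x, W, b, a), two_layer_relu(x, W_bal, b_bal, a_bal))
norm_before = weight_norm_sq(W, a)
norm_after = weight_norm_sq(W_bal, a_bal)
```

After balancing, the squared norm equals twice the sum over units of |a[i]| times ||W[i]||, which by the AM-GM inequality is never larger than the original squared norm. Adding more units does not by itself increase this quantity, which is one way to see how a network can have an unbounded number of units while its weight norm stays bounded.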
—
Recent years have witnessed an increased cross-fertilisation between the fields of statistics and computer science. In the era of Big Data, statisticians are increasingly facing the question of guaranteeing prescribed levels of inferential accuracy within a given time budget. On the other hand, computer scientists are progressively modelling data as noisy measurements coming from an underlying population, exploiting the statistical regularities of the data to save on computation.
This cross-fertilisation has led to the development and understanding of many of the algorithmic paradigms that underpin modern machine learning, including gradient descent methods and generalisation guarantees, implicit regularisation strategies, and high-dimensional statistical models and algorithms.
About the event
This event will bring together experts to talk about advances at the intersection of statistics and computer science in machine learning. This two-day conference will focus on the underlying theory and the links with applications, and will feature 12 talks by leading international researchers.
The intended audience is faculty, postdoctoral researchers and Ph.D. students from the UK/EU, in order to introduce them to this area of research and to the Turing.