Chief Scientist at MosaicML

I currently work as Chief Scientist at MosaicML, a startup dedicated to reducing the cost of training neural networks by changing the training algorithm itself. I lead the research team, which empirically studies the learning dynamics of practical neural networks, develops interventions that change the training algorithm to improve efficiency, and combines these speedup methods into recipes.

Our current recipes have reduced the cost and time required to train ResNet-50 on ImageNet by 7x, DeepLabv3 on ADE-20K by 5x, BERT pre-training by 2x, and GPT language modeling by 2x. You can access these recipes through our Composer library for PyTorch. For the latest data, see the MosaicML Explorer. For details on our speedup methods, see our documentation.

Incoming Assistant Professor of Computer Science at Harvard

In the fall of 2023, I will be joining the faculty at Harvard as a member of the Computer Science Department. I will be teaching courses on machine learning and deep learning, and I will continue my empirical research on the properties of practical deep neural networks. If you are interested in working with me, you should apply to the PhD program in computer science at Harvard.

Education

I am completing my PhD at MIT, where I empirically study the behavior of practical neural networks with Prof. Michael Carbin. During my PhD, I investigated, through my lottery ticket hypothesis, the properties of sparse neural networks that allow them to train effectively. I previously earned my BSE and MSE at Princeton.

Technology Policy

I spend a portion of my time working on technology policy. In this capacity, I work closely with lawyers, journalists, and policymakers on topics related to AI. I currently work with the OECD to implement the AI Principles that we developed in 2019. I previously served as the inaugural Staff Technologist at the Center on Privacy and Technology at Georgetown Law, where I contributed to a landmark report on police use of face recognition (The Perpetual Lineup) and co-developed a course on Computer Programming for Lawyers with Prof. Paul Ohm.

You can find my full academic CV here.