Tristan Thrush

I'm a Computer Science PhD student at Stanford in the NLP group and AI lab, supervised by Tatsunori Hashimoto and Christopher Potts. Before that, I was a research engineer at Hugging Face. Before that, I was a research associate at Facebook AI Research, supervised by Douwe Kiela and then Adina Williams. And before that, I was a research associate at MIT Brain and Cognitive Sciences, supervised by Roger Levy.

I received my MEng in computer science with a concentration in artificial intelligence under Patrick Winston at the MIT Computer Science and Artificial Intelligence Lab. I also received my BS at MIT, in computer science with minors in linguistics and math. While I was an undergrad, I did research with the Perception Systems Group at NASA's Jet Propulsion Laboratory.

I'm interested in AI. Specifically: natural language processing, computer vision, high-dimensional statistics, and data-centric AI methods. I have done several large-scale projects with a focus on the data side.

Selected Papers

This one uses higher-order gradients to generate LLM training data for anything we want; it can even generate training data that encodes a QR code in model weights.
This one has some fun math for pretraining data selection.
This one poses a still-open challenge for word-order understanding in vision-language models.
There is simply nothing cooler than robots in space.