Friday Lunch: Assistant Professor Sharon Li


Memorial Union, 800 Langdon St. (room information shared upon registration)
12:00 pm – 1:00 pm

Steering Large Language Models by Human Preferences

Large language models (LLMs) trained on massive datasets exhibit remarkable abilities, but these models can also inadvertently generate misinformation and harmful outputs. This concern underscores the urgent challenge of language model alignment: ensuring that these models’ behaviors agree with human preferences and safety considerations. In recent years, a spectrum of alignment strategies has emerged, with prominent methods showcasing the effectiveness of reinforcement learning from human feedback (RLHF). RLHF has gained widespread adoption among state-of-the-art models, including OpenAI’s GPT-4, Anthropic’s Claude, Google’s Bard, and Meta’s Llama 2-Chat. A pivotal component within RLHF is proximal policy optimization (PPO), which employs an external reward model that mirrors human preferences to guide its optimization. Despite its promise, RLHF suffers from unstable and resource-intensive training. Furthermore, the need to repeat PPO training whenever the reward model changes hinders rapid customization to evolving datasets and emerging needs. In this Friday Lunch talk, Assistant Professor Sharon Li will discuss alternative paradigms for achieving alignment without RL training.
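As background for the abstract above, the PPO stage of RLHF is commonly written as KL-regularized reward maximization; the sketch below is the standard formulation from the literature, not a description of the speaker’s own method:

\max_{\pi_\theta}\; \mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi_\theta(\cdot \mid x)}\big[\, r_\phi(x, y) \,\big] \;-\; \beta\, \mathrm{KL}\big(\pi_\theta(\cdot \mid x) \,\big\|\, \pi_{\mathrm{ref}}(\cdot \mid x)\big)

Here \pi_\theta is the policy being aligned, r_\phi is the learned reward model encoding human preferences, \pi_{\mathrm{ref}} is the frozen reference model, and \beta sets the strength of the KL penalty. Because r_\phi sits inside the objective, swapping in a new reward model means re-running PPO, which is the customization bottleneck the abstract notes. One well-known RL-free alternative from the literature, offered here only as an illustration of the genre and not necessarily what the talk covers, is Direct Preference Optimization (DPO), which fits the policy directly to preference pairs (y_w preferred over y_l):

\mathcal{L}_{\mathrm{DPO}} = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}\left[ \log \sigma\!\left( \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)} \right) \right]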

Please note: A catered lunch will be provided at this Friday Lunch event. Seats are limited and available on a first-come, first-served basis. To register, please send an email to rsvp@humanities.wisc.edu with your name, title, and affiliation.


Sharon Yixuan Li is an Assistant Professor in the Department of Computer Sciences at the University of Wisconsin-Madison. She received her Ph.D. from Cornell University in 2017, advised by John E. Hopcroft, and was subsequently a postdoctoral scholar in the Computer Science Department at Stanford University. Her research focuses on the algorithmic and theoretical foundations of learning in open worlds. She is the recipient of the AFOSR Young Investigator Program (YIP) award, the NSF CAREER award, the MIT Technology Review TR-35 Award, recognition on the Forbes 30 Under 30 list in Science, and multiple faculty research awards from Google, Meta, and Amazon. Her work has received a NeurIPS Outstanding Paper Award and an ICLR Outstanding Paper Award Honorable Mention in 2022.