February 20, 2026

Validating Generative AI-Based Social Sciences

Friday, Feb. 20
9 a.m. - 5 p.m.

Frances Searle Building, Center for Human-Computer Interaction + Design (Room 1-122), 2240 Campus Drive


Generative AI is reshaping how we study human behavior, but are we validating what we simulate? As large language models (LLMs) increasingly generate social and behavioral data, we'll explore what kinds of tests, methods, and systems are needed to ensure that these simulations are credible, replicable, and truly useful.

With preprints and products emerging faster than the research community can connect, this event offers a timely space for interdisciplinary dialogue. Our participants include statisticians, economists, and computer scientists, in addition to social scientists and computational social scientists, to tackle questions at the heart of generative social science. This event will spark collaboration across disconnected teams and help shape the future of ethical, rigorous generative AI research.

This symposium is organized by the Center for HCI+D and supported by the McCormick School of Engineering and the Northwestern Cognitive Science Program.

Organizers

Guest Speakers

Serina Chang

Assistant Professor in Electrical Engineering and Computer Sciences and Computational Precision Health, UC Berkeley
View Serina Chang's Profile
Eli Ben-Michael

Assistant Professor in the Department of Statistics & Data Science and the Heinz College of Information Systems and Public Policy, Carnegie Mellon University
View Eli Ben-Michael's Profile

Schedule

Time Activity Speakers
9 - 9:30 a.m. Opening / Welcome
  • Christopher Schuh, Dean of McCormick School of Engineering
  • Darren Gergle, Co-Director of the Center for Human-Computer Interaction + Design
  • Jessica Hullman, Ginni Rometty Professor of Computer Science and Faculty Fellow at the Institute for Policy Research at Northwestern University
9:30 - 10:30 a.m. Lightning Talks and Q&A
  • Why Human Interaction Matters for AI Evaluation—and How User Simulators Can Help - Serina Chang
  • The Mixed Subjects Design: Treating Large Language Models as Potentially Informative Observations - David Broska
  • Valid Survey Simulations with Limited Human Data - Kristina Gligoric
10:30 - 10:45 a.m. Break
10:45 a.m. - 12:15 p.m. Lightning Talks and Q&A
  • How Much Can AI Help in Randomized Experiments in the Social Sciences? - Eli Ben-Michael
  • Large language models that replace human participants can harmfully misportray and flatten identity groups - Angelina Wang
  • When Can We Trust Experiments on Digital Twins? A Potential Outcomes Framework for Causal Inference with LLM Simulations - Patryk Perkowski
  • Causal Inference in Experiments with Mixed-Subjects Designs - Austin van Loon
12:15 - 1:30 p.m. Lunch
1:30 - 3 p.m. Lightning Talks and Q&A
  • Semantic Structure in LLM Internal Representations - Austin Kozlowski
  • Detailed Self-Reports Can Reduce Bias in LLM-Simulated Survey Respondents - Jonne Kamphorst
  • Synthetic Consumers in Practice: Methods, Validation, and Applied Challenges - Michael Spadafore
  • Social Science and the Bitter Lesson: Thoughts on Valid Social Science when Machines Do Research - John Horton
3 - 3:30 p.m. Group Discussion and Breakout Topic Refinement
3:30 - 4:30 p.m. Breakout Group Discussions
4:30 - 5 p.m. Report-Out and Wrap-Up