The A-B Test Deception: Divergent Delivery, Response Heterogeneity, and Erroneous Inferences in Online Advertising Field Experiments

Seminars - Department Seminar Series
Speakers
ERIC SCHWARTZ, University of Michigan
13:00 - 14:30
Hybrid mode (Zoom - Meeting room E4-SR03, Via Roentgen, 1, 4th floor)

Abstract

Advertisers and researchers use tools provided by advertising platforms to conduct randomized experiments for testing user responses to creative elements in online ads. Internally valid comparisons between ads require the mix of experimental users exposed to each ad to be similar across all ads. But that internal validity is threatened when platforms’ targeting algorithms deliver each ad to its own optimized mix of users, which diverges across ads. We extend the potential outcomes model of causal inference to treat random assignment of ads and the user exposure states for each ad as two separate decisions. We then demonstrate how targeting ads to users leads advertisers to incorrectly infer which ad performs better, based on aggregate test results. Through analysis and simulation, we characterize how bias in the aggregate estimate of the difference between two ads’ lifts is driven by the interplay between heterogeneous responses to different ads and how platforms deliver ads to divergent subsets of users. We also identify conditions for an undetectable “Simpson’s reversal,” in which all unobserved types of users may prefer ad A over ad B, but the advertiser mistakenly infers from aggregate experimental results that users prefer ad B over ad A. 
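The reversal described above can be illustrated with a small numerical sketch. It is not the authors' model: it assumes two unobserved user types ("high" and "low"), made-up response rates, and made-up delivery mixes, and for simplicity it compares raw response rates rather than lifts against a control group. Every name and number in it is hypothetical.

```python
# Hypothetical response rates by unobserved user type.
# Within each type, ad A outperforms ad B.
RATES = {
    "A": {"high": 0.10, "low": 0.02},
    "B": {"high": 0.08, "low": 0.01},
}

# Hypothetical share of each ad's exposed users who are "high" types.
# Balanced: both ads reach the same mix of users.
# Divergent: the platform's targeting sends ad B mostly to "high" types.
BALANCED_MIX = {"A": 0.5, "B": 0.5}
DIVERGENT_MIX = {"A": 0.2, "B": 0.8}


def aggregate_rate(ad, mix):
    """Response rate of one ad averaged over the users it actually reached."""
    share_high = mix[ad]
    rates = RATES[ad]
    return share_high * rates["high"] + (1 - share_high) * rates["low"]


def aggregate_gap(mix):
    """Aggregate A-minus-B difference, as the advertiser would observe it."""
    return aggregate_rate("A", mix) - aggregate_rate("B", mix)


# Per-type A-minus-B differences (unobservable to the advertiser).
type_gaps = {t: RATES["A"][t] - RATES["B"][t] for t in ("high", "low")}
balanced_gap = aggregate_gap(BALANCED_MIX)
divergent_gap = aggregate_gap(DIVERGENT_MIX)

for user_type, gap in type_gaps.items():
    print(f"type {user_type}: A - B = {gap:+.3f}")
print(f"aggregate, balanced delivery:  A - B = {balanced_gap:+.3f}")
print(f"aggregate, divergent delivery: A - B = {divergent_gap:+.3f}")
```

Under these assumed numbers, both user types favor ad A, and the aggregate comparison also favors A when delivery is balanced (+0.015). With divergent delivery the aggregate difference flips sign (-0.030), so an advertiser who sees only aggregate results would pick ad B.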

Keywords: Targeted online advertising, experimental design, A-B testing, measuring advertising effectiveness, causal inference, Simpson’s paradox, social media