Researchers from Northwestern University and Facebook in March published new research in the INFORMS journal Marketing Science that sheds light on whether common approaches for online advertising measurement are as reliable and accurate as the “gold standard” of large-scale, randomized experiments.
The study, “A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook,” is authored by Brett Gordon of Northwestern University; Florian Zettelmeyer of Northwestern University and the National Bureau of Economic Research; and Neha Bhargava and Dan Chapsky of Facebook.
“Our findings suggest that commonly used observational approaches that rely on data usually available to advertisers often fail to accurately measure the true effect of advertising,” said Brett Gordon.
Observational approaches encompass a broad class of statistical models that rely on the data “as they are,” that is, data generated without explicit manipulation through a randomized experiment.
“We found a significant difference in the ad effectiveness obtained from randomized control trials and those observational methods that are frequently used by advertisers to evaluate their campaigns,” added Zettelmeyer. “Generally, the current and more common methods overestimate ad effectiveness relative to what we found in our randomized tests. Though in some cases, they significantly underestimate effectiveness.”
Measuring the effectiveness of advertising remains an important problem for many firms. A key question is whether an advertising campaign produced incremental outcomes: did more consumers purchase because they saw an ad, or would many of those consumers have purchased even in the absence of the ad? Obtaining an accurate measure of incremental outcomes (“conversions”) helps an advertiser calculate the return on investment (ROI) of the campaign.
“Digital platforms that carry advertising, such as Facebook, have created comprehensive means to assess ad effectiveness, using granular data that link ad exposures, clicks, page visits, online purchases and even offline purchases,” said Gordon. “Still, even with these data, measuring the causal effect of advertising requires the proper experimentation platform.”
The study authors used data from 15 U.S. advertising experiments at Facebook comprising 500 million user-experiment observations and 1.6 billion ad impressions.
Facebook’s “conversion lift” experimentation platform provides advertisers with the ability to run randomized controlled experiments to measure the causal effect of an ad campaign on consumer outcomes.
These experiments randomly allocate users to a control group, who are never exposed to the ad, and to a test group, who are eligible to see the ad. Comparing outcomes between the groups provides the causal effect of the ad because randomization ensures the two groups are, on average, equivalent except for advertising exposures in the test group. The experimental results from each ad campaign served as a baseline with which to evaluate common observational methods.
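The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the study's method or Facebook's implementation: the `GroupOutcome` class, the `intent_to_treat_lift` function, and the user and conversion counts are all hypothetical. Because the test group is merely eligible to see the ad, the difference computed here is an effect of eligibility, not of actual exposure.

```python
from dataclasses import dataclass


@dataclass
class GroupOutcome:
    """Hypothetical summary of one experimental group."""
    users: int
    conversions: int

    @property
    def rate(self) -> float:
        # Share of users in the group who converted.
        return self.conversions / self.users


def intent_to_treat_lift(test: GroupOutcome, control: GroupOutcome):
    """Return (absolute, relative) difference in conversion rates.

    Randomization makes the groups comparable on average, so the
    difference can be attributed to eligibility for the ad.
    """
    absolute = test.rate - control.rate
    relative = absolute / control.rate
    return absolute, relative


# Made-up numbers, for illustration only.
test = GroupOutcome(users=100_000, conversions=1_200)
control = GroupOutcome(users=100_000, conversions=1_000)
abs_lift, rel_lift = intent_to_treat_lift(test, control)
print(f"absolute lift: {abs_lift:.4f}, relative lift: {rel_lift:.1%}")
```

With these invented counts, the control group's conversion rate stands in for what would have happened without the ad, and only the difference between the two rates counts as incremental.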
Observational methods compare outcomes between users who were exposed to the ad and users who were not. These two groups tend to differ systematically in many ways, such as age and gender. Some of these differences are observable because the advertiser (or its advertising platform) often has access to data on these and other characteristics: in addition to knowing the gender and age of an online user, it is possible to observe the type of device being used, the location of the user, how long it has been since the user last visited, and so on. However, the tricky part is that the exposed and unexposed groups may also differ in ways that are very difficult to measure, such as the users’ underlying affinity for the brand. To say that the ad “caused” an effect requires the researcher to account for both observed and unobserved differences between the two groups. Observational methods use data on the observed characteristics of users in an attempt to adjust for both the observable and unobservable differences.
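A small simulation can show why an unobserved difference such as brand affinity distorts a simple exposed-versus-unexposed comparison. It is a sketch under invented assumptions and does not reproduce the study's data or methods: the `simulate` function, the probabilities, and the assumed true effect of one percentage point are all hypothetical. As a simplification, everyone assigned to the ad in the randomized case actually sees it.

```python
import random

TRUE_EFFECT = 0.01  # assumed increase in purchase probability from the ad


def simulate(n_users: int, randomized: bool, seed: int = 0) -> float:
    """Return the exposed-minus-unexposed difference in purchase rates."""
    rng = random.Random(seed)
    exposed = [0, 0]    # [users, purchases]
    unexposed = [0, 0]  # [users, purchases]
    for _ in range(n_users):
        # Unobserved brand affinity, between 0 and 1.
        affinity = rng.random()
        # Without randomization, high-affinity users are more likely to be exposed.
        p_exposed = 0.5 if randomized else 0.2 + 0.6 * affinity
        saw_ad = rng.random() < p_exposed
        # Affinity also raises the chance of purchase, with or without the ad.
        p_buy = 0.02 + 0.08 * affinity + (TRUE_EFFECT if saw_ad else 0.0)
        bought = rng.random() < p_buy
        group = exposed if saw_ad else unexposed
        group[0] += 1
        group[1] += int(bought)
    return exposed[1] / exposed[0] - unexposed[1] / unexposed[0]


naive_estimate = simulate(200_000, randomized=False)
experimental_estimate = simulate(200_000, randomized=True)
print(f"observational comparison: {naive_estimate:.4f}")
print(f"randomized comparison:    {experimental_estimate:.4f}")
```

In this setup the observational comparison overstates the ad's effect, because exposed users would have purchased more often even without the ad. The randomized comparison stays close to the assumed true effect.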
“We set out to determine whether, as commonly believed, current observational methods using comprehensive individual-level data are ‘good enough’ for ad measurement,” said Zettelmeyer. “What we found was that even fairly comprehensive data prove inadequate to yield reliable estimates of advertising effects.”
“In principle, we believe that using large-scale randomized controlled trials to evaluate advertising effectiveness should be the preferred method for advertisers whenever possible.”