“Super Tuesday” in the Democratic primary sees a field of candidates with a wide range of advertising budgets compete for votes across 16 US states on a single day. It affords us the chance to ask: did the people who spent the most money on advertising get the most votes?
Total ad spend by 4th of March 2020, as reported by CNN
The number of delegates and votes, as reported by Wikipedia by the end of 4th of March - “Super Tuesday”.
I could have made this neater by pulling and merging these two data sets automatically, but instead I just copied and pasted them into a single spreadsheet.
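For anyone who would rather not copy and paste, here is a minimal sketch of what an automatic merge could look like. It is not what was done for this post. The two small data frames are hypothetical stand-ins for the CNN and Wikipedia sources, with a few values copied from the table printed further down, and it assumes both sources spell candidate names identically.

```r
# Hypothetical stand-in for the ad spend source (values in $ million)
spend <- data.frame(
  candidates = c("Michael Bloomberg", "Donald Trump", "Joe Biden"),
  spend = c(560, 60, 16)
)

# Hypothetical stand-in for the results source
results <- data.frame(
  candidates = c("Michael Bloomberg", "Joe Biden"),
  delegates = c(64, 497),
  votes = c(1573388, 4365361)
)

# all.x = TRUE keeps every row of `spend`; candidates with no
# matching result (here, Trump) get NA for delegates and votes
merged <- merge(spend, results, by = "candidates", all.x = TRUE)
print(merged)
```

Base R's `merge()` is used so the sketch runs without extra packages; `dplyr::left_join()` would do the same job in the tidyverse style used in the rest of this post.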
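library(tidyverse) # dplyr, tidyr and ggplot2 are all used below
library(ggrepel) # provides geom_text_repel()
df <- read.csv('supertuesday.csv') # load data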
#tidying, changing type
df$delegates <- as.numeric(as.character(df$delegates1546))
df$votes <- as.numeric(as.character(df$votes1546))
#tidying, remove all except variables we need
df <- df %>% select (-c(delegates0820,votes0820,delegates1546,votes1546))
print(df) #we can show all data, since this is a small data set
## candidates spend delegates votes
## 1 Michael Bloomberg 560.0 64 1573388
## 2 Tom Steyer 210.0 0 194646
## 3 Donald Trump 60.0 NA NA
## 4 Bernie Sanders 55.0 395 3442670
## 5 Pete Buttigieg 36.0 26 594202
## 6 Elizabeth Warren 27.0 47 3442670
## 7 Amy Klobuchar 17.0 7 369504
## 8 Joe Biden 16.0 497 4365361
## 9 Tulsi Gabbard 5.5 2 92772
p <- ggplot(data = df, mapping = aes(x = spend, y=delegates,label = candidates))
p + geom_point(color='red',size=3) +
labs(x = "Total ad spend ($million)", y = "delegates won by SuperTuesday") +
geom_text_repel() +
xlim(0,750)
#save output
ggsave('delegates.png')
It looks safe to conclude there is no strong relationship between ad spend and delegates won.
p <- ggplot(data = df, mapping = aes(x = spend, y=votes, label=candidates))
p + geom_point(color='blue',size=3) +
labs(x = "Total ad spend ($million)", y = "votes won by SuperTuesday") +
geom_text_repel() +
scale_y_continuous(labels = scales::comma)
#save output
ggsave('votes.png')
Again, it looks safe to conclude there is no strong relationship, this time between ad spend and votes.
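The eyeball judgement from the two scatter plots can be backed up with a rough number. The sketch below is an addition, not part of the original analysis: it computes Spearman rank correlations, chosen here on the assumption that a rank-based measure is less dominated by Bloomberg's outlying spend than a Pearson correlation would be. The values are copied from the printed table above, leaving out Trump, who has no primary results.

```r
# Values copied from the printed table (Trump's NA row omitted)
spend <- c(560, 210, 55, 36, 27, 17, 16, 5.5)            # $ million
delegates <- c(64, 0, 395, 26, 47, 7, 497, 2)
votes <- c(1573388, 194646, 3442670, 594202,
           3442670, 369504, 4365361, 92772)

# Spearman correlates the ranks rather than the raw values
rho_delegates <- cor(spend, delegates, method = "spearman")
rho_votes <- cor(spend, votes, method = "spearman")

print(round(c(delegates = rho_delegates, votes = rho_votes), 3))
```

With only eight candidates this is a descriptive check rather than a formal test, but both coefficients come out close to zero, which fits what the plots show.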
Let’s work out how much each candidate paid per vote and per delegate. Note that this makes the very dubious assumption that with 0 ad spend they would get 0 votes or delegates.
df <- df %>% mutate(per_delegate=(spend*1000000)/delegates,per_vote=(spend*1000000)/votes) %>% drop_na()
print(df)
## candidates spend delegates votes per_delegate per_vote
## 1 Michael Bloomberg 560.0 64 1573388 8750000.00 355.919837
## 2 Tom Steyer 210.0 0 194646 Inf 1078.881662
## 4 Bernie Sanders 55.0 395 3442670 139240.51 15.975972
## 5 Pete Buttigieg 36.0 26 594202 1384615.38 60.585457
## 6 Elizabeth Warren 27.0 47 3442670 574468.09 7.842750
## 7 Amy Klobuchar 17.0 7 369504 2428571.43 46.007621
## 8 Joe Biden 16.0 497 4365361 32193.16 3.665218
## 9 Tulsi Gabbard 5.5 2 92772 2750000.00 59.285129
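Note the `Inf` in Steyer's row: he won 0 delegates, so the division has a zero denominator, and `drop_na()` only removes rows containing `NA`, not `Inf`. If you wanted to treat that case as missing instead, one possible approach (a sketch, not what was done above) is to guard the division. The two-row `df_demo` data frame is a hypothetical cut-down copy of the table.

```r
# Hypothetical two-row excerpt of the table above
df_demo <- data.frame(
  candidates = c("Tom Steyer", "Joe Biden"),
  spend = c(210, 16),            # $ million
  delegates = c(0, 497)
)

# Only divide where at least one delegate was won; otherwise record NA
df_demo$per_delegate <- ifelse(
  df_demo$delegates > 0,
  df_demo$spend * 1e6 / df_demo$delegates,
  NA
)
print(df_demo)
```

Whether `NA` or `Inf` is the better label is a judgement call: `Inf` is arguably an honest description of spending $210 million for no delegates, but `NA` is easier to handle in later summaries and plots.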
p <- ggplot(data=df,
mapping=aes(x = per_vote,y=reorder(candidates,per_vote)))
p + geom_point(size=3) +
labs(x= "Price per vote ($)",y="") +
scale_x_continuous(labels = scales::comma)
ggsave('pervote.png')
Enjoy discussion of this analysis around the Twitter thread here
Caveat #1: Initial graphing done while results were still coming in, so the actual figures changed (but the general pattern held).
Caveat #2: Apparently Bloomberg’s ads were terrible so better ads might have had an effect.
Caveat #3: Also, primary voters are almost by definition high engagement, so this doesn’t test the case of influencing decisions from low-engagement voters.
Caveat #4: Maybe we should think of Bloomberg’s spend as a hedge against the collapse of the Biden campaign
Caveat #5: Maybe Bloomberg was buying something other than votes (e.g. press)
Caveat #6: Maybe it was effective, since he overtook Biden in the polls at one point (from a lower base)
Caveat #7: Bloomberg’s campaign stalled after a disappointing debate performance: “there’s only so much advertising can do with a bad product”.
Normally I’d do this with Python, but I was inspired by Kieran Healy’s book to use R. It’s a great book:
Healy, K. (2018). Data visualization: a practical introduction. Princeton University Press.
This page was useful in publishing to GitHub: “How to publish project online?” by Cathy Gao, August 7, 2019. Thanks Cathy!
Repo for this analysis (the markdown file which generated this page, data and plot files) is here: github.com/tomstafford/supertues