“Super Tuesday” of the Democratic primary sees a field of candidates with a wide range of advertising budgets compete for votes across 16 US states on a single day. It affords us the chance to ask: did the people who spent the most money on advertising get the most votes?
Total ad spend by 4th of March 2020, as reported by CNN
The number of delegates and votes, as reported by Wikipedia by the end of 4th of March - “Super Tuesday”.
I could have made this neater by pulling and merging these two data sets automatically, but instead I just copied and pasted them into a single spreadsheet.
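For what it's worth, the automatic version of that merge might look something like the sketch below. The two small data frames and their column names (`candidate`, `spend`, `delegates`, `votes`) are stand-ins assumed for illustration, not the actual CNN or Wikipedia tables, and it uses base R's `merge()` so that it runs without extra packages.

```r
# Hypothetical stand-ins for the two sources; the real tables would be
# scraped or downloaded, and their column names are assumptions here.
ad_spend <- data.frame(
  candidate = c("Michael Bloomberg", "Joe Biden", "Donald Trump"),
  spend     = c(560, 16, 60)               # $million
)
results <- data.frame(
  candidate = c("Joe Biden", "Michael Bloomberg"),
  delegates = c(497, 64),
  votes     = c(4365361, 1573388)
)

# all.x = TRUE keeps every candidate with ad spend, filling NA where there
# is no primary result (as happens for Donald Trump)
merged <- merge(ad_spend, results, by = "candidate", all.x = TRUE)
print(merged)
```

Keeping the unmatched rows (rather than dropping them) makes it obvious when a name is spelled differently in the two sources.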
library(tidyverse) # dplyr, tidyr and ggplot2 are all used below
library(ggrepel)   # geom_text_repel() for non-overlapping labels
df <- read.csv('supertuesday.csv') # load data
#tidying, changing type
df$delegates <- as.numeric(as.character(df$delegates1546))
df$votes  <- as.numeric(as.character(df$votes1546))
#tidying, remove all except variables we need
df <- df %>% select(-c(delegates0820,votes0820,delegates1546,votes1546))
print(df) #we can show all data, since this is a small data set
##          candidates spend delegates   votes
## 1 Michael Bloomberg 560.0        64 1573388
## 2        Tom Steyer 210.0         0  194646
## 3      Donald Trump  60.0        NA      NA
## 4    Bernie Sanders  55.0       395 3442670
## 5    Pete Buttigieg  36.0        26  594202
## 6  Elizabeth Warren  27.0        47 3442670
## 7     Amy Klobuchar  17.0         7  369504
## 8         Joe Biden  16.0       497 4365361
## 9     Tulsi Gabbard   5.5         2   92772
p <- ggplot(data = df, mapping = aes(x = spend, y=delegates,label = candidates))
p + geom_point(color='red',size=3) +
  labs(x = "Total ad spend ($million)", y = "delegates won by SuperTuesday") + 
  geom_text_repel() +
  xlim(0,750)
#save output
ggsave('delegates.png')
It looks safe to conclude there is no strong relationship.
p <- ggplot(data = df, mapping = aes(x = spend, y=votes, label=candidates))
p + geom_point(color='blue',size=3) +
  labs(x = "Total ad spend ($million)", y = "votes won by SuperTuesday") +
  geom_text_repel() +
  scale_y_continuous(labels = scales::comma)
#save output
ggsave('votes.png')
Again, it is safe to conclude there is no strong relationship.
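One way to put a number on “no strong relationship” is a correlation coefficient. The sketch below retypes the figures from the printed table above into a fresh data frame (called `st` here, a name chosen for this example) and computes Pearson and Spearman correlations between spend and each outcome. With only eight candidates, treat the result as a rough summary rather than a test.

```r
# Figures copied from the printed table above (Donald Trump omitted: no results)
st <- data.frame(
  candidates = c("Michael Bloomberg", "Tom Steyer", "Bernie Sanders",
                 "Pete Buttigieg", "Elizabeth Warren", "Amy Klobuchar",
                 "Joe Biden", "Tulsi Gabbard"),
  spend      = c(560, 210, 55, 36, 27, 17, 16, 5.5),   # $million
  delegates  = c(64, 0, 395, 26, 47, 7, 497, 2),
  votes      = c(1573388, 194646, 3442670, 594202, 3442670, 369504,
                 4365361, 92772)
)

# Pearson measures linear association; Spearman uses ranks, so it is less
# dominated by Bloomberg's outlying spend
cors <- data.frame(
  outcome  = c("delegates", "votes"),
  pearson  = c(cor(st$spend, st$delegates), cor(st$spend, st$votes)),
  spearman = c(cor(st$spend, st$delegates, method = "spearman"),
               cor(st$spend, st$votes, method = "spearman"))
)
print(cors)
```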
Let’s work out how much each candidate paid per vote and per delegate. (Note that this makes the very dubious assumption that with 0 ad spend they would get 0 votes or delegates.)
df <- df %>%
  mutate(per_delegate=(spend*1000000)/delegates, per_vote=(spend*1000000)/votes) %>%
  drop_na() # drops rows with missing values (here, Donald Trump)
print(df)
##          candidates spend delegates   votes per_delegate    per_vote
## 1 Michael Bloomberg 560.0        64 1573388   8750000.00  355.919837
## 2        Tom Steyer 210.0         0  194646          Inf 1078.881662
## 4    Bernie Sanders  55.0       395 3442670    139240.51   15.975972
## 5    Pete Buttigieg  36.0        26  594202   1384615.38   60.585457
## 6  Elizabeth Warren  27.0        47 3442670    574468.09    7.842750
## 7     Amy Klobuchar  17.0         7  369504   2428571.43   46.007621
## 8         Joe Biden  16.0       497 4365361     32193.16    3.665218
## 9     Tulsi Gabbard   5.5         2   92772   2750000.00   59.285129
p <- ggplot(data=df,
            mapping=aes(x = per_vote,y=reorder(candidates,per_vote)))
p + geom_point(size=3) +
  labs(x= "Price per vote ($)",y="") +
  scale_x_continuous(labels = scales::comma)
ggsave('pervote.png')
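The same plot for price per delegate would need one extra step, because Tom Steyer won 0 delegates and his `per_delegate` value is `Inf`. A minimal sketch, assuming a small hand-typed data frame (`pd`, a name made up for this example) with three rows taken from the table above, is to keep only the finite values before plotting:

```r
# Three rows retyped from the table above; `pd` is a name chosen for this sketch
pd <- data.frame(
  candidates = c("Michael Bloomberg", "Tom Steyer", "Joe Biden"),
  spend      = c(560, 210, 16),            # $million
  delegates  = c(64, 0, 497)
)
pd$per_delegate <- (pd$spend * 1000000) / pd$delegates   # Steyer: 210e6 / 0 = Inf

# Keep only candidates with a finite price per delegate
pd_finite <- pd[is.finite(pd$per_delegate), ]
print(pd_finite)
```

`pd_finite` could then be passed to `ggplot()` in the same way as `df`, with `per_delegate` on the x axis.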
Enjoy discussion of this analysis around the Twitter thread here.
Caveat #1: Initial graphing done while results were still coming in, so the actual figures changed (but the general pattern held).
Caveat #2: Apparently Bloomberg’s ads were terrible, so better ads might have had an effect.
Caveat #3: Also, primary voters are almost by definition high-engagement, so this doesn’t test the case of influencing the decisions of low-engagement voters.
Caveat #4: Maybe we should think of Bloomberg’s spend as a hedge against the collapse of the Biden campaign.
Caveat #5: Maybe Bloomberg was buying something other than votes (e.g. press).
Caveat #6: Maybe it was effective, since he overtook Biden in the polls at one point (from a lower base).
Caveat #7: Bloomberg’s campaign stalled after a disappointing debate performance: “there’s only so much advertising can do with a bad product”.
Normally I’d do this with Python, but I was inspired by Kieran Healy’s book to use R. It’s a great book:
Healy, K. (2018). Data visualization: a practical introduction. Princeton University Press.
This page was useful in publishing to GitHub: “How to publish project online?” by Cathy Gao, August 7, 2019. Thanks Cathy!
Repo for this analysis (the markdown file which generated this page, data and plot files) is here: github.com/tomstafford/supertues