Bang for your buck? Candidate ad spend in the Democratic primary correlated with votes and delegates

   

Research Question

“Super Tuesday” in the Democratic primary sees a field of candidates with a wide range of advertising budgets compete for votes across 16 US states on a single day. It affords us the chance to ask: did the candidates who spent the most money on advertising get the most votes?

   

Data Origins

Total ad spend by 4th of March 2020, as reported by CNN

The number of delegates and votes won on “Super Tuesday”, as reported by Wikipedia by the end of 4th of March.

I could have made this neater by pulling and merging these two data sets automatically, but instead I just copied and pasted them into a single spreadsheet.

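library(tidyverse) # dplyr, tidyr and ggplot2 are all used below
library(ggrepel)   # geom_text_repel() for the plot labels
df <- read.csv('supertuesday.csv') # load data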
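The automatic merge mentioned above is not what was done here, but a minimal sketch of it might look like the following. The two small data frames are hypothetical stand-ins for the CNN and Wikipedia tables, filled with a few values from the table printed in the next section; `spend_tbl`, `results_tbl` and `merged` are illustrative names, not ones from the original analysis.

```r
# Hypothetical stand-in for the CNN ad spend table ($ million)
spend_tbl <- data.frame(
  candidates = c("Michael Bloomberg", "Donald Trump", "Joe Biden"),
  spend = c(560, 60, 16)
)

# Hypothetical stand-in for the Wikipedia results table
results_tbl <- data.frame(
  candidates = c("Michael Bloomberg", "Joe Biden"),
  delegates = c(64, 497),
  votes = c(1573388, 4365361)
)

# all.x = TRUE keeps every candidate with a spend figure;
# candidates with no primary results get NA for delegates and votes
merged <- merge(spend_tbl, results_tbl, by = "candidates", all.x = TRUE)
print(merged)
```

Matching on candidate name only works if both sources spell the names identically, which is one reason copying and pasting a table this small is a reasonable shortcut.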

   

Data Preparation

# tidying: convert the count columns to numeric
df$delegates <- as.numeric(as.character(df$delegates1546))
df$votes <- as.numeric(as.character(df$votes1546))

# tidying: remove all except the variables we need
df <- df %>% select(-c(delegates0820, votes0820, delegates1546, votes1546))

print(df) #we can show all data, since this is a small data set
##          candidates spend delegates   votes
## 1 Michael Bloomberg 560.0        64 1573388
## 2        Tom Steyer 210.0         0  194646
## 3      Donald Trump  60.0        NA      NA
## 4    Bernie Sanders  55.0       395 3442670
## 5    Pete Buttigieg  36.0        26  594202
## 6  Elizabeth Warren  27.0        47 3442670
## 7     Amy Klobuchar  17.0         7  369504
## 8         Joe Biden  16.0       497 4365361
## 9     Tulsi Gabbard   5.5         2   92772

   

Visualisation 1: Spend v Delegates

p <- ggplot(data = df, mapping = aes(x = spend, y = delegates, label = candidates))
p + geom_point(color = 'red', size = 3) +
  labs(x = "Total ad spend ($million)", y = "delegates won by SuperTuesday") +
  geom_text_repel() +
  xlim(0, 750) # extend the x axis to 750

#save output
ggsave('delegates.png')    

It looks safe to conclude there is no strong relationship between ad spend and delegates won.

   

Visualisation 2: Spend v Votes

p <- ggplot(data = df, mapping = aes(x = spend, y=votes, label=candidates))
p + geom_point(color='blue',size=3) +
  labs(x = "Total ad spend ($million)", y = "votes won by SuperTuesday") +
  geom_text_repel() +
  scale_y_continuous(labels = scales::comma)

#save output
ggsave('votes.png')  

Again, it looks safe to conclude there is no strong relationship between ad spend and votes.
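The two plots are judged by eye, so as a rough numerical check one could compute a rank correlation for each. This is a sketch rather than part of the original analysis: the vectors are typed in from the table printed in the Data Preparation section, and the choice of Spearman's coefficient is an assumption, made because Bloomberg's spend is such an outlier.

```r
# Figures copied from the table printed in the Data Preparation section
spend     <- c(560, 210, 60, 55, 36, 27, 17, 16, 5.5)  # $ million
delegates <- c(64, 0, NA, 395, 26, 47, 7, 497, 2)
votes     <- c(1573388, 194646, NA, 3442670, 594202,
               3442670, 369504, 4365361, 92772)

# Spearman's rank correlation is less dominated by one extreme spender
# than Pearson's; use = "complete.obs" drops the row with missing values
rho_delegates <- cor(spend, delegates, method = "spearman", use = "complete.obs")
rho_votes     <- cor(spend, votes, method = "spearman", use = "complete.obs")

print(c(delegates = rho_delegates, votes = rho_votes))
```

With only eight candidates any coefficient is very noisy, so this is best read as a companion to the plots and not as a formal test.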

   

Visualisation 3: Price for votes

Let’s work out how much each candidate paid per vote and per delegate (note: this makes the very dubious assumption that with 0 ad spend they would get 0 votes/delegates).

df <- df %>% mutate(per_delegate=(spend*1000000)/delegates,per_vote=(spend*1000000)/votes) %>% drop_na()

print(df)
##          candidates spend delegates   votes per_delegate    per_vote
## 1 Michael Bloomberg 560.0        64 1573388   8750000.00  355.919837
## 2        Tom Steyer 210.0         0  194646          Inf 1078.881662
## 4    Bernie Sanders  55.0       395 3442670    139240.51   15.975972
## 5    Pete Buttigieg  36.0        26  594202   1384615.38   60.585457
## 6  Elizabeth Warren  27.0        47 3442670    574468.09    7.842750
## 7     Amy Klobuchar  17.0         7  369504   2428571.43   46.007621
## 8         Joe Biden  16.0       497 4365361     32193.16    3.665218
## 9     Tulsi Gabbard   5.5         2   92772   2750000.00   59.285129
p <- ggplot(data=df,
            mapping=aes(x = per_vote,y=reorder(candidates,per_vote)))

p + geom_point(size=3) +
  labs(x= "Price per vote ($)",y="") +
  scale_x_continuous(labels = scales::comma)

ggsave('pervote.png')
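The price calculation is simple division, and spelling it out for two candidates shows why only the per-vote figure is plotted. A small sketch: `cost_per` is a hypothetical helper introduced here for illustration, not a function from the original analysis, and the numbers are taken from the table above.

```r
# Hypothetical helper: cost per unit won, with spend given in $ million
cost_per <- function(spend_million, count) {
  (spend_million * 1e6) / count
}

cost_per(560, 1573388)  # Bloomberg, dollars per vote
cost_per(210, 0)        # Steyer, dollars per delegate: Inf, since he won none

# Inf has no meaningful position on a price axis, so one option for a
# per-delegate plot is to keep only the finite values first
per_delegate <- cost_per(c(560, 210, 16), c(64, 0, 497))
finite_only <- per_delegate[is.finite(per_delegate)]
print(finite_only)
```

Note that `drop_na()` in the code above removes rows containing `NA` (here, Donald Trump) but leaves the `Inf` value in place, which is why Steyer still appears in the printed table.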

   

Summary, Discussion

Enjoy the discussion of this analysis in the Twitter thread here.

Caveat #1: Initial graphing done while results were still coming in, so the actual figures changed (but the general pattern held).

Caveat #2: Apparently Bloomberg’s ads were terrible, so better ads might have had an effect.

Caveat #3: Primary voters are almost by definition high-engagement, so this doesn’t test the case of influencing decisions by low-engagement voters.

Caveat #4: Maybe we should think of Bloomberg’s spend as a hedge against the collapse of the Biden campaign

Caveat #5: Maybe Bloomberg was buying something other than votes (e.g. press)

Caveat #6: Maybe it was effective, since he overtook Biden in the polls at one point (from a lower base)

Caveat #7: Bloomberg’s campaign stalled after a disappointing debate performance: “there’s only so much advertising can do with a bad product”.

   

Colophon

Normally I’d do this with Python, but I was inspired by Kieran Healy’s book to use R. It’s a great book.

Repo for this analysis (the markdown file which generated this page, data and plot files) is here: github.com/tomstafford/supertues