Interview: Dr. Joshua Tucker on Bots and Regulating Social Media
At DisinfoLab, we track the activity of bot accounts on social media and seek to understand how narratives promoted by these accounts develop over time. In July, DisinfoLab sat down with Professor Joshua Tucker to discuss the relationship between social media and politics, and how bot accounts can play a role in shaping political narratives.
Joshua Tucker is a Professor of Politics at New York University, where he serves as the co-director of The Center for Social Media and Politics, an academic research institute that publishes policy-relevant research on the impact of social media in politics. Additionally, Professor Tucker is a co-author and editor of The Monkey Cage, a politics and policy blog at the Washington Post that promotes the work of political scientists within the national political conversation.
His current research provides critical context to the conversation surrounding democratic values in the digital political environment. Professor Tucker’s work spans a variety of areas, including network diversity, partisan echo chambers, online hate speech and protests, and Russian bots and trolls.
The following interview has been edited for clarity and length.
DL: Discussions of disinformation have taken center stage over the last few years and “bot” is a term that comes up very frequently. What exactly are bots, and what role do they play in politics on social media?
JT: Bots are fully automated accounts whose online content is produced by algorithms. The algorithm itself generates the content, as opposed to a human being sitting at a terminal and typing. We contrast that with a “troll,” which refers to an actual human being producing the content. Normally when we use the term troll, we’re referring to someone who has a false identity: they are pretending to be someone they're not. And then there's a third term, “cyborg,” which gets much less attention in the media, but probably explains a lot of these accounts. A cyborg would be an account where some of the content is produced by an algorithm and some of the content is produced by a human.
This brings us to a good point: bots themselves are neither inherently good nor bad. It depends on your research question, and it depends on the use to which they're being put.
DL: One scenario which prompted discourse over the roles of bots and trolls in online disinformation was the 2016 presidential election when the Russian Internet Research Agency (IRA) was found to have used various online tactics to advance Russian interests online. In your research, you concluded that IRA tools were unlikely to have had a significant impact on the election outcome. Could you explain why?
JT: In the lab, we looked at survey data that was linked to the Twitter handles of people who took those surveys. We could go out and collect all of the tweets by all the people they followed and get a sense of, first of all, how concentrated that data was. A uniform distribution would mean every single person on Twitter is equally likely to see a tweet by a Russian troll. But exposure to tweets was highly concentrated. Most people have no exposure. Other people have quite a bit of exposure.
Who are those people? Republican men. In this study, we went and collected information on how much people were exposed to other potential sources of politics, including tweets from politicians and tweets from the media. The number of tweets that people saw from trolls is minuscule compared to the number of tweets they see from these other sources of information.
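The concentration Tucker describes can be illustrated with a short sketch. This uses synthetic numbers, not the lab's actual data: it simulates a skewed exposure distribution in which most users see zero troll tweets and a small minority see many, then asks what share of all exposures fell on the most-exposed 1% of users.

```python
# Illustrative only: synthetic exposure counts, not DisinfoLab's data.
import random

random.seed(0)

users = 10_000
exposures = [0] * users
for i in range(users):
    # Assume only ~2% of users are exposed to troll tweets at all;
    # those who are see anywhere from 1 to 500 of them.
    if random.random() < 0.02:
        exposures[i] = random.randint(1, 500)

# Share of total exposures accounted for by the top 1% of users.
exposures.sort(reverse=True)
top_1pct = exposures[: users // 100]
share = sum(top_1pct) / sum(exposures)

print(f"Share of all exposures seen by the top 1% of users: {share:.0%}")
```

Under these assumed parameters the top 1% of users absorb the large majority of all exposures, in contrast to a uniform distribution where the top 1% would see exactly 1% of them.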
Our postulation here is that if you have a situation where A) for most people, it's really hard to change their vote, B) people are being bombarded with tons of information over the course of the campaign, C) most people aren't seeing anything from these trolls, since exposure is concentrated among small numbers of people who were likely to vote for Trump anyway, and D) even within social media, it's being dwarfed by all sorts of other information, it seems like a stretch to argue that the tweets people saw from Russian trolls were the thing that pushed them over the edge.
That doesn't mean it shouldn't be considered a national security threat to have a foreign government trying to interfere with the election environment in our country. That doesn't mean that this wasn't a probe to see how we would respond when other countries tried to interfere with the context of an election. The question of whether this is something that the government should view as a national security threat is very different from whether these tweets had a causal effect on political behavior.
We spent four years talking about these social media campaigns. Many people contended that it's not possible that the American public elected Donald Trump on their own––it must've been that they were manipulated by the Russians to do it. That mindset puts into people's heads the idea that it's very easy to manipulate and change the outcome of an election. Then 2020 comes along and you get the Big Lie from Donald Trump that the election was fraudulent, despite zero evidence to support that claim. Does it matter that for four years people heard, “The Russians were able to change the outcome of the election”? Why not nefarious Democratic operatives in Arizona and Georgia? We argue in the research that we've done that the biggest impact of this Russian influence attempt may not be the direct effect of having changed votes in the 2016 election, but the indirect effect of having the United States bombarded, for the four years before the 2020 election, with stories about how easy it was for a foreign power to manipulate and change the outcome of an election.
DL: We're curious about what you think about the strategies that social media companies have implemented to combat disinformation. One example is warning labels. Instagram began labeling posts that mentioned COVID-19 with a warning saying to visit the CDC website for more information. Following claims of the “Big Lie,” there were numerous labels verifying the security of the voting system. Do you think these warning labels have been effective?
JT: I can’t say whether they were effective, but I can tell you a little bit about some of the research we've done in this regard. We looked at what happened to all of Trump's tweets that got warning labels put on them. What we found was that when they implemented a hard label––where they literally blocked the tweet and said you couldn't forward it––that was incredibly effective. Those tweets did not spread on Twitter. When they put the soft warning and warned people that it might contain incorrect information, those tweets continued to spread.
You have to think about who gets to make the decision about what content is removed. Is it the government? If it's the government, is that every government in the world? Do we want Putin telling Twitter what tweets have to be taken down in Russia? If it's the platforms, how do we ensure the platforms are acting in a fair manner? It’s a very complicated question because even if you want to appear non-partisan, it's really tough to do that. Because if one side is producing more disinformation, then being non-biased is going to mean putting warning labels on more content from one side than the other side.
DL: This question relates to your book, Social Media and Democracy. You state, “This book represents a clarion call for making social media data available for research with results concomitantly released in the public domain.” We find that this call to action is especially relevant in light of Facebook's recent dispute with the White House over vaccine disinformation. In response to statements by the Biden administration that Facebook was spreading harmful vaccine disinformation, Facebook released a report indicating that, “since the beginning of the pandemic, [we] have removed over 18 million instances of COVID-19 misinformation,” but the company has not released other critical statistics, like how many pieces of disinformation were actually uploaded to the site and how many people viewed them. Are social media companies disincentivized from combating disinformation?
JT: I don't think social media companies are disincentivized from addressing disinformation. I think if social media companies had a way to address disinformation that dealt with the problems I just mentioned to you, they'd be happy to do it. Disinformation is a huge problem for them––it’s negative publicity. But logistically, it's incredibly tricky. You simply can't hire enough people to look at all the disinformation that's on Facebook.
Then you have the meta-level problems about who should be making these decisions. The platforms are caught in a bit of a bind: if they go too far, they're accused of seizing power and controlling speech. If they don't go far enough, they're accused of letting hate flow and of letting disinformation around COVID flow.
DL: Why do you think Facebook is reluctant to share this kind of data?
JT: It’s complex. There are three essential questions. One is the legal regulatory framework. These companies are all terrified of being sued; they're terrified of being broken up by the government. But we could change the legal regulatory framework to compel social media companies to share data with scientific researchers who promise to put their findings into the public domain. That is a legal problem.
There's a second problem, which is the business model. What do social media companies think their users expect of them in terms of data stewardship? Would you use WhatsApp if you thought that upon you sending your message on WhatsApp, it would be on a billboard in Times Square within three minutes? Probably not. They say we should be able to study the data, but you have to figure out where that line is drawn. Do they think it would be harmful to their business models to make data available for researchers?
The third question is an ethical question. What is the right thing to do in this regard? A lot of people think about the ethics of this in terms of privacy. That's an easy way out because we all agree privacy is good, but if you take that to the fullest extent, you could just tell Facebook to delete all the data after seven days. But at that point we have no way of assessing Facebook's impact on, say, the 2016 election or the 2020 election or anything, nor would we be able to harness that data to see if we can figure out ways to make it so that LGBT youth are less likely to commit suicide, for example. There are incredible things that you can do with this kind of data.
So on one end, you can ensure full privacy for everybody, but then there's no benefit to society from the collection of this data. And indeed, the data that is collected is only used in the short term to increase the profit of wealthy companies, and in the longer term, it's used to do whatever Facebook or Google wants to do with it. On the other end, we could ensure that this data is also used to advance the public interest. You have to trade off privacy against how much the public knows about the impacts of the platform on society. Or it can be a tradeoff between privacy and how much science we can advance with the use of this data. We’re arguing that the policy discussion should be reconceptualized.
I think a world where we know nothing about what is happening on these platforms, and the company knows everything––that's bad. That's bad for public policy, it’s bad for advancing scientific knowledge, it's bad for accountability. Having your data not be private and running risks that your data is going to be lost, that's bad too. So we have to acknowledge this tradeoff and think creatively.
We are grateful to Professor Tucker for sharing his research with the DisinfoLab team. You can keep up with The Monkey Cage’s latest publications on the Washington Post website and learn more about NYU’s Center for Social Media and Politics on its website. This discussion is the first installment in a series of expert interviews DisinfoLab will conduct this year.