In the first installment of our Activist Shanghai! series, we pointed out that public scrutiny has begun to focus on campaign finance -- who is giving money and who is taking money to run election campaigns. But not much attention has yet been paid to WHAT this campaign money is actually being spent on. More and more often, political campaigns are spending their funds on user profile data so they can reach young voters where they live: online.
As I was preparing to write this article, news of "Datagate" broke: Bernie Sanders' presidential campaign was accused of stealing voter data compiled by the Hillary Clinton campaign through a shared software system that is run by the Democratic party. The Sanders camp countered this accusation by stating that they had tried to force the Democratic National Committee to fix the software "glitch" that allowed this to happen by bringing it to the DNC's attention earlier in the year, with no success. All of this comes on the heels of the Sanders campaign announcing that it has now officially surpassed President Obama's record 2.2 million individual grassroots donations.
This story is the best possible illustration of how important Big Data has become to politics. Immediately following the Datagate breach, the Democratic party -- which has been repeatedly accused of tipping the scales in favor of Clinton -- cut off Sanders' access to the voter database entirely. The Sanders campaign responded by filing a lawsuit against the DNC and calling for a detailed independent investigation. The DNC quickly relented and restored database access, although Sanders still called for a probe and suggested that Clinton's team may also have used the glitch to access his data.
Actually, this is a tale of one system shared by two candidates. Upon hearing this story, the first question that might go through a technologist's head is "Why were they sharing a system at all?" The glitch was supposedly in the "firewall" between the two campaigns' voter data; in other words, rules within the software created nothing more than a virtual boundary between the two campaigns' data. This is a very weak design if the intention was to keep each campaign's data separate. Even the n00biest n00b could tell you that the most secure design in this situation would have been two separate installations of the same software/database running on two entirely separate machines, so there would be no possibility of accidental cross-contamination in case of bugs or glitches (and yes, there are always bugs and glitches). The cost of two separate software installations versus one would have been minimal.
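To make the difference between the two designs concrete, here is a minimal sketch in Python. It is not how NGP VAN's software works (those internals are not public in this article); the table layout, the function names, and the `enforce_filter` switch that simulates the glitch are all hypothetical. In the shared design the only thing separating the campaigns is a filter in the query; in the isolated design each campaign's rows live in a different database altogether.

```python
import sqlite3


def make_shared_db():
    """Shared design: one database, separation is only a WHERE clause."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE voters (campaign TEXT, name TEXT)")
    db.executemany(
        "INSERT INTO voters VALUES (?, ?)",
        [("A", "voter-1"), ("A", "voter-2"), ("B", "voter-3")],
    )
    return db


def shared_query(db, campaign, enforce_filter=True):
    """Return voter names. The 'firewall' is the filter, and a bug can skip it."""
    if enforce_filter:
        rows = db.execute(
            "SELECT name FROM voters WHERE campaign = ? ORDER BY name",
            (campaign,),
        )
    else:
        # Simulated glitch (hypothetical): the filter is dropped, so every
        # campaign's rows come back to whoever is asking.
        rows = db.execute("SELECT name FROM voters ORDER BY name")
    return [row[0] for row in rows]


def make_isolated_dbs():
    """Isolated design: one separate database per campaign."""
    data = {"A": ["voter-1", "voter-2"], "B": ["voter-3"]}
    dbs = {}
    for campaign, names in data.items():
        db = sqlite3.connect(":memory:")  # each connection is its own database
        db.execute("CREATE TABLE voters (name TEXT)")
        db.executemany("INSERT INTO voters VALUES (?)", [(n,) for n in names])
        dbs[campaign] = db
    return dbs


def isolated_query(dbs, campaign):
    """Even an unfiltered query only sees this campaign's own rows."""
    rows = dbs[campaign].execute("SELECT name FROM voters ORDER BY name")
    return [row[0] for row in rows]
```

In the shared version, one missing filter exposes campaign B's voter to campaign A. In the isolated version there is no filter to forget, because the other campaign's rows are simply not in the database being queried.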
Soon after the story broke it was revealed that the software vendor at the center of this scandal -- NGP VAN -- is run by former Clinton insiders. CEO Stuart Trevelyan worked on Bill Clinton's 1992 campaign and original founder Nathaniel Pearlman served as Hillary's 2008 campaign CTO. While working for Clinton, Pearlman managed the controversial email server at the center of the other Clinton server scandal. As TheOutsiderNews.com points out, "The potential conflicts of interest in providing rival campaigns with analytical tools and data analysis is mind boggling and the data breach at the center of the controversy appears to represent a fundamental design flaw in the NGP VAN voter database system."
The urgency and vehemence of all of the parties involved in Datagate underscore how important data has become in our election cycles in just the last few years. Barack Obama's presidential campaigns of 2008 and 2012 blazed the trail with new techniques for leveraging voter data and social media, techniques that are now commonplace. In 2013, The New York Times pulled back the curtain:
"The [Obama] campaign recruited the best young minds in the booming fields of analytics and behavioral science and placed them in a room they called “the cave” for up to 16 hours a day over the course of roughly 16 months. After the election, when the technology wizards finally came out, they had not only helped produce a victory that defied a couple of historical predictors; they also developed a host of highly effective marketing techniques that were either entirely new or had never been tried".
The techniques pioneered by the Obama campaign in 2008 were the first to leverage ad microtargeting for political purposes on a large scale. Obama's techniques were so effective that online marketers now school people in using them:
"The key is to analyze consumer behavior in detail. Moving from batch ‘n blast campaigns and mass marketing to micro targeted campaigns pays... start with analyzing web browsing behavior and email engagement for example... Use Facebook Custom Audience to target advertisements specifically to a group of named individuals... The Obama campaign sifted through self-described supporters’ Facebook pages in search of friends who might be on the campaign’s list of the most persuadable voters. Then the campaign would ask the self-identified supporters to bring their undecided friends along."
-- "How to Use Obama's Secrets in Marketing" [AgileOne.com]
A huge admirer of Barack Obama's 2008 campaign is this season's Republican candidate Ted Cruz. In an article entitled "Don't Blame Ted Cruz For Facebook's Sins", Kevin Drum of Mother Jones explains how Ted Cruz hired the firm Cambridge Analytica to compile "psychographic profiles" of millennial voters via Facebook. The company reportedly paid users $1 each to turn over their Facebook profiles. Even more potent, the company was able to leverage that information to harvest data about those users' friends as well, ballooning the size of its dataset.
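The "ballooning" effect of friend harvesting is easy to see in a small sketch. This is an illustration only, written in Python with made-up user names and friend lists; it does not reflect any real Facebook API or Cambridge Analytica's actual process.

```python
def harvest_profiles(consenting_users, friend_lists):
    """Collect the consenting users plus everyone on their friend lists.

    friend_lists is a hypothetical mapping of user -> list of friends.
    """
    profiles = set(consenting_users)
    for user in consenting_users:
        # Friends are swept in even though they never agreed to anything.
        profiles.update(friend_lists.get(user, []))
    return profiles


# Two paid participants, five friends between them (one friend in common).
friend_lists = {
    "user-1": ["friend-1", "friend-2", "friend-3"],
    "user-2": ["friend-3", "friend-4"],
}
dataset = harvest_profiles(["user-1", "user-2"], friend_lists)
print(len(dataset))  # 6 profiles collected from 2 consenting users
```

With realistic friend counts in the hundreds per user, the same loop turns a modest number of paid participants into a very large dataset.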
In this brave new landscape, the companies that compile the most voter data simply become The Kingmakers. This is the business model of most of the services we use every day "for free" on the internet. We ask Facebook to share our photos and keep track of our friends, we ask Google what time it is in Singapore or Amsterdam, and in return they monetize our habits, queries -- and most importantly -- our relationships to other human beings and the world around us.
Online ad networks are very efficient instruments of the modern surveillance economy. Ad networks can derive an incredible amount of data about you simply based on your IP address (the numbers that identify your machine's network location to other machines). Because many different sites you visit are often using the same advertising vendors, these vendors can compile data about what you visit and when by tracking your activity across vast portions of the internet. Even supposedly free tools like Google Analytics are tracking users extremely effectively (GA is used on 60% of the top 10,000 websites on the internet). If you are keenly interested in the details of how advertising networks gather data about you while you are just browsing around the websites you enjoy or are obligated to use for work, you might check out this article by Sam Snelling.
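As a rough sketch of the cross-site tracking described above, consider a single advertising vendor whose code is embedded on several unrelated websites. The class below is hypothetical and heavily simplified, and it is written in Python for readability. The `visitor_id` stands in for whatever identifier the vendor uses (an IP address here, though real networks typically combine several signals), and the site names are invented.

```python
from collections import defaultdict


class AdVendor:
    """A hypothetical ad vendor embedded on many different websites."""

    def __init__(self):
        # visitor_id -> list of (timestamp, site) pairs
        self.history = defaultdict(list)

    def record_ad_request(self, visitor_id, site, timestamp):
        """Called whenever any participating site loads one of the vendor's ads."""
        self.history[visitor_id].append((timestamp, site))

    def profile(self, visitor_id):
        """Return everything the vendor has seen for one visitor, in time order."""
        return sorted(self.history.get(visitor_id, []))


vendor = AdVendor()
# Three unrelated sites all load ads from the same vendor.
vendor.record_ad_request("203.0.113.7", "news.example", "2016-01-05T08:00")
vendor.record_ad_request("203.0.113.7", "health.example", "2016-01-05T12:30")
vendor.record_ad_request("203.0.113.7", "jobs.example", "2016-01-05T21:15")

for timestamp, site in vendor.profile("203.0.113.7"):
    print(timestamp, site)
```

None of the three sites knows what the visitor did elsewhere, but the vendor sees all three visits and can stitch them into one timeline.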
In an article entitled "Hi, I'm from the games industry. Governments please stop us.", game designer Cliff Harris cries out for help:
"Lets think about this for a minute. A company hires people to stalk its customers and befriend them so they can build up a psychological profile of each customer to allow them to extract more money. This is not market research, this is not game design. This is psychological warfare. Lines have been crossed so much we cannot even see them behind us with binoculars. We need to reign this stuff in. Its not just psychological warfare, but warfare where you, the customer, are woefully outgunned, and losing. Some people are losing catastrophically."
The more we talk about Big Money in elections, the more we need to talk about Big Data in elections. As a society, we urgently need to have discussions such as:
How should campaigns be allowed to gather information about voters?
If controlling voter data is so important, who chooses the software vendors for political campaigns and what standards do they have to meet? There are almost no laws around this right now, as illustrated by the Datagate fiasco.
Can campaigns turn around and sell the data they have collected? Should voters who give their contact info and other data to political campaigns they are interested in be "monetized" without knowledge or consent? Can voter data be passed along to other parties without consent?
How can we keep Big Data and Social Media from crushing dissent? Do secret partnerships between law enforcement and Silicon Valley social media companies constitute a form of censorship? (More discussion here)
How do legislators with little knowledge or understanding of technology design laws to protect the public?
How do we separate the Surveillance Economy from the Surveillance State?
What happens when the Surveillance State falls into the hands of a belligerent incumbent?
How will emerging technologies such as Machine Learning and Artificial Intelligence intersect with democratic political systems?
In the immortal words of the cautionary tale Zero Wing:
We get signal.
Main screen turn on.
How are you gentlemen?
All your base are belong to us.
You are on the way to destruction.
What you say?
You have no chance to survive make your time. Ha ha ha...
All your base your base, base, base
All your base are belong to us.
[UPDATE 2/29/2016] Since writing this blog post, the Electronic Frontier Foundation has published an excellent, detailed breakdown of how information is gathered about voters and what you can do to protect yourself.
[UPDATE 5/16/2016] The Belgian Police have published a warning advising Facebook users not to use the new Facebook "reactions" feature if they want to protect their privacy. Link to an article about this is here.
Join us for our third installment in the Activist Shanghai series, where we see what campaigns do with all of that Big Data: "Astroturf Gone Wild"