Detailing the Need for Transparency in Data Practices


The big data revolution took the technology and software industry by storm
and dramatically reshaped the operations landscape for many digital companies. The transformation did not occur overnight, but the media has recently begun to pick up on the importance of data privacy and has highlighted many data practices that companies would have preferred to keep under wraps. These companies have faced little pushback for data misuse under the mask of providing users with free services in exchange for their data; the motivation for collecting this data, however, has always been profit driven. While the average user may have been blissfully ignorant of the negligence and lack of oversight with which these corporations operated, recent mainstream events have led people to question their online privacy and the sanctity of their online interactions. While corporate profits are important to the national economy, the issue of data leaves major political elections and the veracity of journalism hanging in the balance. Lobbyists, lawmakers, and representatives have begun to debate the future of data usage and the balance between economic growth, constituents' privacy, and the influence of data on the sociotechnical spheres. Although these are important first steps, transparency around data monetization is also essential to making progress on this issue.

One of the most prominent data-driven scandals in recent history was created by Cambridge Analytica's unorthodox use of Facebook user data to generate targeted content for the 2016 American presidential election. Even before a whistleblower leaked information on the attempted manipulation of the election in March of 2018, articles on Cambridge Analytica's presence in American politics had been written as early as 2015; Republican senator and presidential candidate Ted Cruz had hired the same firm to perform voter research. Though Cambridge Analytica had been contributing to political campaigns across the world for years, the massive rise in skepticism did not manifest until 2018, showing how delayed the response to well-known data misuses was. The primary cause of the uproar in 2018 was the revelation that the firm had gathered data from nonconsenting users. Through unassuming online 'quizzes', users were coerced into granting profile access to faceless, nameless collection groups. Not only were participants' profiles accessible, but so were the profiles of their friends. By scraping information about the friends of consenting users, Cambridge Analytica was able to build an enormous database of voters across the country and use it to create targeted content to sway votes. The scandal served as a proof of concept for Facebook as a political influencer: Cambridge Analytica earned millions from campaigns across the country, and Facebook earns upwards of $400 million from political campaigns annually. Facebook had broken customer trust even earlier, in January of 2012, when a research team conducted a weeklong study on users' moods by manipulating nearly 700,000 people's news feeds and measuring changes in the emotions of the content they posted after interacting with the manipulated feed. While the experiment may technically have been legal, its ethical boundaries are far from normal.
These advertising and data experiments are run to gauge potential increases in profit and revenue from a shareholder standpoint, but they do not account for the impacts they can have on longstanding social institutions.

Facebook is not alone in its transgressions of customer trust and privacy. Many other prominent companies have financially thrived on the data access their products grant them. Throughout all of these incidents, the companies tried to exhibit improvements in their data policies when, in truth, not much actually changed for most users. While Facebook claimed to be restricting third-party access to private user data, a 2018 New York Times report stated that many prominent companies, including Spotify and Netflix, were given passes into the system through hidden contracts, allowing them to continue accessing private messages. The same article claimed that American companies were projected to spend nearly $20 billion on attempts to collect and analyze consumers' personal data. The high dollar value on data and the private contracts with other large companies draw a clear picture of the motivation behind lax data policies. Facebook has made perhaps two open and publicly beneficial changes to its system in recent history. In 2015, it shut down its Friends API, the same tool that Cambridge Analytica had used to build its database of voters who may not have consented to their data being analyzed by that particular third party. While the action was necessary, the impacts had already been felt on a global scale. The second positive step Facebook took was actually an indication of its prior desire to keep customers from knowing the extent of the data being collected by the social network. When GDPR regulations came into force in early 2018, Facebook rolled out a new data transparency platform that enabled users to download all of their data in JavaScript Object Notation (JSON). JSON is one of the most portable data formats and lets developers port data from practically any service that generates JSON output.
Prior to the switch to JSON, the data downloads excluded attachments sent over the platform and were stored in an esoteric HTML format. Touted as user readable, the HTML format could be opened by users but was very clearly designed to block developers from porting Facebook data to other platforms or analyzing the data on behalf of users, monopolizing all analytics through Facebook's API.
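To illustrate why a JSON export matters for portability, consider a minimal sketch in Python. The file layout and field names below are assumptions chosen for illustration, not Facebook's actual export schema; the point is only that plain JSON can be loaded and analyzed with a few lines of standard-library code, no platform API required.

```python
import json
from collections import Counter

# Hypothetical message export; the structure and field names here are
# illustrative assumptions, not the real Facebook export format.
sample_export = """
{
  "messages": [
    {"sender": "alice", "timestamp_ms": 1525132800000, "content": "hi"},
    {"sender": "bob",   "timestamp_ms": 1525132860000, "content": "hello"},
    {"sender": "alice", "timestamp_ms": 1525132920000, "content": "how are you?"}
  ]
}
"""

# Parse the JSON text into ordinary Python dictionaries and lists.
export = json.loads(sample_export)

# Because the data is plain JSON, any third-party tool can analyze it,
# e.g. tallying messages per sender:
per_sender = Counter(msg["sender"] for msg in export["messages"])
print(per_sender)  # Counter({'alice': 2, 'bob': 1})
```

An equivalent analysis against the old HTML download would have required scraping a proprietary page layout, which is precisely the lock-in the JSON format removed.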

It is clear that companies have been profiting from user data for an extended period of time, and were collecting information before the data race even began. It would also be ignorant to state that the collection and processing of big data has not provided any tangible benefits to the world. Major issues tend to arise when the incentives of marginal profit are blurred with the risks of private data falling into the wrong hands, as the consequences of such situations have proven to be disastrous. The balance lies in the transparency of data monetization strategies. When companies engage in monetization techniques that require the collection of user data, users should have the right to be informed of where their data is going and the ability to electively opt out. The world of data has only just started to turn, but early intervention against negative actions can lead to a symbiotic relationship between profit and privacy.

Shivam Parikh Software Engineer at Flexport. UC Berkeley Class of 2020 - Computer Science, Data Science, Environmental Economics and Policy. Avid photographer, engineer, and community member.
