When tech company Strava was founded in 2009 it had a simple mission – to help cyclists and runners keep track of their activities.
It didn't take long, however, for an entirely different group of customers to take an interest. San Francisco-based Strava has received so many requests for its user data that it now runs a secondary business selling anonymized cycling data to municipalities and other groups looking to better understand how and why local cyclists choose the routes they do. The Oregon Department of Transportation, for example, paid $20,000 last year for a data set that included information about 400,000 trips made by 35,000 Oregon cyclists in 2013.
A simple activity-tracker app, it appears, has collected enough data from its users to help decipher one of the great modern-day urban planning mysteries.
For decades, the best source of systematic information on how and why people do the things they do often came from painstaking studies conducted by government bodies and academic researchers. But increasingly, the newest, biggest and most interesting troves of data on the subject of human behaviour now belong not to universities or public institutions, but to a slew of technology and social media startups – many of which didn't exist just a few years ago. With billions of users and access to immense computational power, companies such as Facebook Inc. and Google Inc. are now able to conduct in the blink of an eye behavioural experiments that are in many cases simply not feasible for traditional researchers. And even tiny startups are quickly collecting user information of such scale and scope that many of the world's leading research institutions can't hope to compete.
That shift has profound commercial and social implications. Not only does this behavioural data have immense financial value, there exists no set of rules or ethical guidelines dictating what sort of experiments tech companies can and can't run on their user bases. That leaves each company, essentially, to make the rules up as it goes along.
"In a decade, or maybe five years, the way things are going, there will be well-established practices," says Michael Horvath, co-founder of Strava. "They may not be laws, but if you don't abide by them, you will not be considered an A-quality company.
"But right now, that doesn't exist."
The massive quantity of data in the hands of tech companies has been well known for years. But in recent months, the issue of how companies test and manipulate that data has come under scrutiny. In June, it was revealed that in 2012 Facebook undertook a sprawling behavioural experiment on almost 700,000 of its users. Without the users' specific consent or knowledge, the social network subtly altered the content of the users' news feeds to show more positive or negative content. The purpose of the experiment was to determine whether content viewed on Facebook can influence the mood of the person viewing it.
A month later, the dating web site OkCupid revealed that it too has been running experiments on its users. In one such experiment, designed to test the power of suggestion, the site took pairs of individuals that OkCupid's own algorithms suggested would not get along, and instead told the users that they were good matches for one another.
In both cases, the revelations prompted a vocal response from users, many of whom saw the experiments as a form of manipulation to which, had they been given the opportunity, they would not have consented.
But for the tech companies, the incentive to continue running such experiments is significant. Not only do most tech companies constantly tweak user experiences in an effort to create a more compelling product; for sites such as Facebook, the ability to predict and influence a user's state of mind can have a direct effect on the success or failure of the ads Facebook shows – and from which the company derives most of its revenue.
Indeed, many researchers agree that no amount of public criticism is likely to stem the tide of experimentation by tech companies on their users.
"What's going to happen is Facebook is not going to stop these experiments, they're just not going to tell anybody about it," says Jean-Baptiste Michel, a French-Mauritian researcher specializing in large data sets. "That's a net loss for science."
Mr. Michel's own research shows the potential benefits of establishing a more collaborative effort between academic researchers and the new gatekeepers of mass behavioural data. Using the massive store of literature available through the Google Books project, Mr. Michel developed ways of measuring, for example, how long it takes for certain events in history to fade from the public discourse.
But as it stands, the vast majority of user testing data is simply off-limits to academic researchers.
"I think it sets up potentially a very imbalanced research landscape," says Elizabeth Buchanan, director of the Center for Applied Ethics at the University of Wisconsin-Stout. She points out that traditional academic researchers are bound by a number of restrictions and must gain the consent of test subjects and the approval of ethics review boards before completing behavioural studies.
"Anyone who didn't think about the research implications of these services is being somewhat naive," says Dr. Buchanan.
"Everyone is a data subject at this point in time."