It is not easy to remember all the things you've said in life. There never seem to be enough neurons to go around, and besides, people say altogether too many things. But there's always been comfort in the fact that, if a lifetime of statements has floated out of our heads and into the ether, the ether has forgotten about them too.

Unless you're on Twitter.

The mushrooming social network – which people use as a group chat room, a forum for ideas and a personal notepad – hides users' histories of statements from them. Now, it's working on better access to your history – just not for you.

Twitter has licensed third-party companies to mine its giant archive of tweets. The first among them, DataSift, specializes in filtering and packaging huge swaths of data that market-research companies can analyze.

Now, there's nothing unethical about Twitter selling access to an archive of statements that users freely made in public: Twitter is playing by the terms it laid down for users when they signed up. But there's a catch. Unlike Facebook, whose Timeline lets users see everything they've posted, Twitter lets utterances vanish down the memory hole in a matter of weeks. The company is giving big corporate spenders access to writing that users created but can't even see themselves. And while that might hew to the letter of its contract with users, it represents a grimly ironic breach of faith.

In fairness to Twitter, providing easy-to-use access to billions of tweets isn't the simplest task. Casual users can quickly find themselves sitting on a pile of 1,000 or 2,000 tweets. More dedicated users can end up with archives of tens of thousands of tweets. In March, the company said that users as a whole were pumping out one billion tweets a week.

Twitter archives every one of them. Every public tweet ever made can be individually accessed if, and only if, you know its address. But this is hidden treasure without a map – each tweet is buried under an essentially random, unguessable URL. Various search websites, and Google itself, index some of these tweets, but they are frequently selective or incomplete. The company has also given the U.S. Library of Congress access to its archives, and even the Library is reported to be wrestling with how to manage it all, and is planning to limit access to accredited researchers.

Twitter's vast, real-time flow of data is enormously valuable – and Twitter, of course, needs revenue. The company says it can't keep up with the demand, and so it licensed DataSift and another firm, Gnip, to use its data for the sole purpose of analytics. (Twitter says that protected or private messages will not be publicized. This is a good thing, since any release of private conversations would lead to the spontaneous termination of millions of careers, let loose an armada of libel suits, and tie up the human-rights tribunals for several years.)

DataSift is just part of a chain: The data it sorts and packages can then be bought by analytics firms like NetBase, which distill it into consumer insights for brands like Nike. By applying filters and language analysis, companies can get an instant picture of how consumers feel about products, be it a global computer launch or a neighbourhood frozen-yogurt franchise opening.

DataSift's CEO, Rob Bailey, told me that this kind of real-time scanning could alert companies to breaking news before the media reports it, giving them a head start before the markets react. The company also offers Twitter data as a source of political research and insight.

The "historics" feature – which opens up a vault of archival data going back to 2010 – will launch within the next month. This kind of access isn't cheap. Subscriptions can run thousands of dollars a month, though pay-as-you-go options are available.

Here's the irony: Twitter's users walked into this line of publishing on the premise that the data would be public. Now that they have pumped the service full of data, that data has effectively been made private, accessible only to those with a corporate analytics budget.

Unequal access to information creates an imbalance of power. This is especially important to those who posted publicly with the expectation that they'd be able to see, control and prune their postings later on. Remember that in many parts of the world, political research isn't just policy-testing and mud-slinging; it's a matter of life and limb for oppositions, activists and dissidents. A Twitter feed can paint a very detailed portrait of someone's life, their activities and associations, even if no individual tweet is particularly revealing. Now, Twitter users have two options: Submit their histories for corporate or political analysis, or delete them and lose everything.

By locking users out of their own data, Twitter has managed a rare feat: making Facebook look good. Beyond Timeline, Facebook provides a "Download Your Information" tool, which will give you a copy of everything you've fed into the system. A similar tool from Twitter would be a great start. Otherwise, it will have achieved a worst-of-both-worlds scenario: Tweets are kept private from their authors, but made public to those who can pay. A fair deal? Forget it.