The rise of social networking demonstrates why the ability to process multiple signals at lightning speed is important. "When we crawled documents, we had minutes if not hours to analyze them," Mr. Singhal says of the early days of search. "Suddenly, we moved into a world where we had two seconds."
The biggest problem is determining relevance. Services such as Twitter and Facebook present fleeting information, ranging from status updates to 140-character tweets. With hundreds of millions of people compulsively using such services, sorting out the noise becomes especially difficult.
To start solving that problem, Google turned to an old staple: PageRank.
Just as Mr. Page's algorithm determined the value of web pages by the number of pages linking to them, Google values Twitter users by the number of followers they have. The thinking is the same: People would not flock to a source of information if it weren't any good.
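The analogy can be sketched in a few lines of code. This is a toy illustration of the PageRank idea applied to a follow graph, not Google's actual system; the graph, damping factor, and iteration count are all invented for the example.

```python
# Toy PageRank over a "who follows whom" graph: an account's score comes
# from the scores of the accounts following it, just as a page's rank
# comes from the pages linking to it.

def follower_rank(follows, damping=0.85, iterations=50):
    """follows maps each user to the set of users they follow."""
    users = set(follows) | {u for tgts in follows.values() for u in tgts}
    rank = {u: 1.0 / len(users) for u in users}
    for _ in range(iterations):
        new_rank = {u: (1 - damping) / len(users) for u in users}
        for user, targets in follows.items():
            if targets:
                share = damping * rank[user] / len(targets)
                for t in targets:
                    new_rank[t] += share
        # Users who follow no one spread their weight evenly (dangling nodes).
        dangling = sum(rank[u] for u in users if not follows.get(u))
        for u in users:
            new_rank[u] += damping * dangling / len(users)
        rank = new_rank
    return rank

# Hypothetical graph: everyone follows "reporter", so "reporter" ranks highest.
graph = {"alice": {"reporter"}, "bob": {"reporter"}, "carol": {"reporter", "alice"}}
scores = follower_rank(graph)
```

The point of the sketch is the recursion: being followed by well-followed accounts counts for more than being followed by obscure ones, exactly as with links between pages.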
But there's a glaring flaw in that system. Sites such as Twitter are most useful for their immediacy in breaking-news situations - what if the person who just happens to be at the dock when a plane crashes into the Hudson River has only two or three followers?
Google handles these situations by folding yet more signals into the algorithm - signals that, in some cases, didn't even exist a few years ago. The first is the re-tweet rate, or how quickly other users reproduce a piece of information - the faster something spreads, the more important it must be.
Google combines the re-tweet rate with information about the user's geography to create a sort of heat map. If a plane goes down in the Hudson River, and a piece of information originating from the Manhattan waterfront begins spreading at lightning speed, odds are it's something first-hand and worth noting.
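One way to picture how those two signals might combine is a score that rises with spreading speed and falls with distance from the event. Everything here - the formula, the coordinates, the numbers - is an invented sketch; the article says only that the signals are combined, not how.

```python
import math

def freshness_score(retweets_per_minute, source, event):
    """source and event are (lat, lon) pairs; nearer + faster-spreading scores higher."""
    # Rough great-circle distance in km (equirectangular approximation).
    lat1, lon1 = map(math.radians, source)
    lat2, lon2 = map(math.radians, event)
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)
    y = lat2 - lat1
    distance_km = 6371 * math.hypot(x, y)
    # Velocity boosts the score; distance from the event decays it.
    return retweets_per_minute / (1.0 + distance_km)

# A fast-spreading tweet from the Manhattan waterfront outscores a
# slow-spreading one from Los Angeles about the same event.
near_fast = freshness_score(120, (40.77, -74.00), (40.77, -74.00))
far_slow = freshness_score(5, (34.05, -118.24), (40.77, -74.00))
```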
The importance of knowing a user's location helps to explain why Google has rolled out an aggressive mobile phone strategy, including an operating system for smart phones and even its own handset to compete with the BlackBerrys and iPhones of the world. While many assumed the company was simply trying to cash in on one of the fastest-growing retail sectors in the world, there was a second reason: Smart phones have the ability to pinpoint where a user is.
Imagine a user in downtown Vancouver searching for "farmers' market" - only now, Google knows exactly where that user is. The farmers' markets closest to the user in Vancouver show up much higher in the results page, and the search is much more useful for the user.
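A toy re-ranking sketch shows why location matters for a query like that one. The result list, coordinates, relevance scores, and distance weighting are all hypothetical; only the idea of blending relevance with proximity comes from the article.

```python
def rerank(results, user_location, distance_weight=0.1):
    """results: list of (name, relevance, (lat, lon)). Nearer results rise."""
    ulat, ulon = user_location
    def score(item):
        name, relevance, (lat, lon) = item
        # Crude flat-earth distance in degrees - fine at city scale for a demo.
        dist = ((lat - ulat) ** 2 + (lon - ulon) ** 2) ** 0.5
        return relevance - distance_weight * dist
    return sorted(results, key=score, reverse=True)

downtown_vancouver = (49.282, -123.120)
markets = [
    ("Seattle farmers' market", 0.9, (47.61, -122.33)),
    ("Vancouver farmers' market", 0.8, (49.28, -123.10)),
]
ranked = rerank(markets, downtown_vancouver)
```

With no location signal, the Seattle page's slightly higher relevance would put it first; once proximity enters the score, the Vancouver market wins for the Vancouver searcher.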
The ability to collect geographic information provides Google with perhaps the single most significant improvement to search quality since Mr. Singhal rewrote the core engine some 10 years ago.
And it's one more step toward Google's ultimate goal - something Mr. Singhal describes as the holy grail of search.
Searches Gone Bad
There's a song by British electronica act Aphex Twin with a video that features a ridiculously long limousine.
If that's all you knew, how would you search for information about the song? You might simply enter what you know: "Music video with the long stretch limousine."
Sure enough, there it is: The second search result is the Wikipedia page for the Aphex Twin song, Windowlicker.
This is the goal that Google's search engineers are focusing much of their energy on - the ability to search for something a user can't name.
The challenge is always changing: If you line up all the queries Google gets in a day and ignore duplicates, about one-third have never been seen before. To figure out what users are trying to find when they enter these never-before-seen queries, Google relies on "losses," or searches gone bad. Signs of a loss include a user who keeps coming back to Google with slightly modified versions of the same request, or someone who has to scroll through 10 pages of results before finding what they're looking for.
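The first of those signals - a user repeatedly rephrasing the same request - can be sketched with simple string similarity. The threshold and the rule of "two or more near-duplicate retries" are assumptions for illustration; the article doesn't say how Google actually measures a loss.

```python
from difflib import SequenceMatcher

def is_loss(session_queries, similarity_threshold=0.6):
    """True if consecutive queries in a session look like rephrasings of one request."""
    reformulations = 0
    for prev, cur in zip(session_queries, session_queries[1:]):
        if SequenceMatcher(None, prev, cur).ratio() >= similarity_threshold:
            reformulations += 1
    # Several near-duplicate retries in a row suggest the search failed.
    return reformulations >= 2

# Hypothetical sessions: one user circling a query they can't quite name,
# another issuing two unrelated, presumably successful searches.
struggling = ["music video limo", "music video long limo", "music video stretch limo"]
succeeding = ["weather vancouver", "aphex twin windowlicker"]
```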