Google engineers have spent 1,000 man-years trying to understand the apple.
Intuitively, it's simple: An apple is the round thing hanging from the tree. But it's also the company that made the iPod, and the Beatles' record label, and the thing your neighbourhood farmers' market sells every Sunday.
When someone simply searches for "apple," how do you tell what it is they're looking for?
"When I built the first search system, it exploded in my face," says Amit Singhal, the man responsible for the code at the heart of Google's core search engine. "People were searching for apples and we were giving them the computer company."
The story of exactly what happens in the microseconds between when a user clicks "Search" and when Google presents its results is the tale of one of the most fiendishly difficult puzzles of the digital age: How do you sort the Internet?
But telling apples from Apples is more than just an intellectual challenge. It's also the key to Google's $160-billion (U.S.) empire. While the company has expanded into everything from smart phones to social networking, the heart of its business remains the provision of quick, accurate answers to people's search queries. Beside those answers, more often than not, Google will display ads related to the queries - ads from which Google derives the majority of its $24-billion-a-year revenue.
Earlier this month, the company announced a complete overhaul to the way it indexes the Web. Called "Caffeine," Google's new algorithm collects information about hundreds of thousands of web pages every second. The resulting database takes up a massive 100 million gigabytes of space.
There is an excellent reason why Google is willing to go to such lengths to improve its search engine: There is no loyalty in search. If a better search engine were to come along, users would flock to it, and Google's dominance would disappear.
For the first time since Google began eviscerating the competition in the search market a decade ago, a new crop of sites is threatening its position. Millions of users are turning to sites ranging from Twitter to Facebook, Amazon to eBay, to look for breaking news, restaurant recommendations or shoes for sale. Apple is starting to leverage its massive mobile applications ecosystem to run an advertising business aimed squarely at taking revenue away from Google's search advertising. Suddenly, Google's list of potential competitors has exploded.
The most high-profile of those competitors is Microsoft, which launched its Bing search engine last year. Google still dominates the search business - it controls about two-thirds of the key U.S. market - but Bing's initial growth, which saw it grab about 11 per cent of the market, caught Google off guard. To maintain its dominance, Google needs to remain a step ahead of its rivals when it comes to search technology.
A rare series of interviews with Google's top search staff reveals how difficult a job that is. From dealing with synonyms to understanding context to monitoring real-time data, Google's chief engineers have to grapple with the challenge of how to monitor every bit of information in the world.
What they have their sights trained on is the holy grail of search: Figuring out what you're looking for, even if you don't know what you're looking for. Meeting that challenge will cement Google's No. 1 position among search engines. Failing to meet it will open the door to rivals that want a piece of one of the planet's most lucrative businesses.
'No quick fixes'
Udi Manber looks satisfied - Google has just learned Maltese.
The Israeli-born former computer science professor is Google's vice-president in charge of core search functions. That includes the company's efforts to translate all the world's information - something it already does passably to proficiently in about 70 languages. The addition of Maltese to that arsenal affects only a tiny fraction of Google's users, but it's emblematic of the company's determination to leave no part of the global data stream unmapped.