Google ran about 6,000 search-related experiments last year, and made about 500 tweaks to its search engine. The tweaks ranged from fundamental changes to the way the engine determines the quality of a website, to improvements in the Hebrew spell-checking software. With each change came countless hours of debugging and testing.
"For a piece of software like a word processor, you can say this works or this doesn't work," says Scott Huffman, who runs Google's search quality testing division.
"In search, almost anything will hurt some queries and help some queries. Almost anything gets some wins."
At the heart of all these changes is one simple rule: no Band-Aid solutions.
"Nobody is allowed to change one query," Mr. Manber says. "They must change the algorithm."
That means if, for some reason, a search for "Microsoft" returns the General Motors website as the first result, engineers can't just go fix that one problem - they must find out what went wrong in the core search code, and fix that. Otherwise, the list of Band-Aid solutions would quickly become unwieldy thanks to the billions of English web pages out there - not to mention their Maltese translations.
The "no quick fixes" rule is in many ways a direct rejection of how Internet search began. More than a decade ago, when websites such as Yahoo ruled the search landscape, some search engines were entirely human-driven. Employees would scour the Web for high-quality information and index it manually. With just thousands of good websites online, the process didn't seem all that ludicrous.
Indeed, when Mr. Singhal began his graduate work some 20 years ago, the biggest problem researchers had was too little information. In order to test their search algorithms, Mr. Singhal and his colleagues would rely on bundles of documents on CD. At a time when researchers were testing their search algorithms on blocks of a few thousand documents, few people anticipated a world where the search area would consist of trillions of pieces of information.
Google's co-founders, Larry Page and Sergey Brin, did foresee the future - but only sort of.
Their original Google search engine, developed in the late 1990s, depended on "signals," which are basically clues as to how relevant a web page is to a search query. Clues can come in many forms, such as how often previous users searching for the same thing flocked to the same page, or whether the words in the query show up in the web page's title.
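To make the idea of signals concrete, here is a minimal sketch in Python. It assumes each signal is a small function that returns a number between 0 and 1, and that a page's score is a weighted sum of those numbers. The function names, the weights and the shape of the page data are all hypothetical, chosen for illustration; none of this is Google's actual code.

```python
from typing import Callable, Dict, List, Tuple

Page = Dict[str, object]
Signal = Callable[[str, Page], float]


def title_match(query: str, page: Page) -> float:
    """Fraction of query words that appear in the page title."""
    words = query.lower().split()
    title_words = str(page["title"]).lower().split()
    if not words:
        return 0.0
    return sum(word in title_words for word in words) / len(words)


def past_clicks(query: str, page: Page) -> float:
    """How often earlier searchers for this query chose this page, capped at 1.0."""
    clicks = page["clicks"].get(query.lower(), 0)
    return min(clicks / 100.0, 1.0)


# Hypothetical weights. The list can grow as new signals are discovered.
SIGNALS: List[Tuple[Signal, float]] = [(title_match, 0.6), (past_clicks, 0.4)]


def score(query: str, page: Page, signals=SIGNALS) -> float:
    """Combine every signal into a single relevance score."""
    return sum(weight * signal(query, page) for signal, weight in signals)


page = {"title": "Microsoft Official Home Page", "clicks": {"microsoft": 50}}
print(score("Microsoft", page))
```

Because the signals live in a list, adding another clue means appending one more function and weight rather than rewriting the scoring routine.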
Perhaps the most famous signal in the original algorithm is PageRank. Named after Mr. Page, the algorithm calculated the quality of a web page by measuring the number of sites linking to that page. Google refers to PageRank as a sort of search democracy, with each link a vote.
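The "each link a vote" idea can be sketched in a few lines of Python. This is a simplified illustration of the description above, not the published PageRank algorithm, which also weighs each vote by the rank of the page casting it. The function name and the sample links are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def count_link_votes(links: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count how many distinct sites link to each page.

    `links` is an iterable of (source, target) pairs. Repeated links from
    the same source count once, and a page cannot vote for itself.
    """
    voters = defaultdict(set)
    for source, target in links:
        if source != target:
            voters[target].add(source)
    return {page: len(sources) for page, sources in voters.items()}


links = [
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("a.example", "c.example"),  # duplicate vote, ignored
    ("c.example", "a.example"),
    ("c.example", "c.example"),  # self-link, ignored
]
print(count_link_votes(links))
```

In this sample, c.example collects two votes and a.example one, so c.example would rank higher on this signal alone.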
The original system worked. It worked so well, in fact, that Google overtook rivals such as Yahoo for the title of Internet search king. But the algorithm had a fundamental flaw that was in some ways similar to the flaw in Yahoo's human-driven approach: It didn't envision how big and complex the Internet would get.
Mr. Singhal's rewrite of Google's core engine code, which replaced the founders' code in 2001, didn't do away with the concept of signals; it simply assumed that there were more signals out there that Google had yet to figure out. In effect, Google's engine now had the ability to make use of new signals that might come along in the future.
More than anything, it was this change that allowed Google to dominate search, as the Web changed from a few million static websites to billions upon billions of pages, blogs, videos and tweets.