Craig Nevill-Manning: how Google is coded

[Image courtesy of Geoff Oliver Bugbee]

Dr. Craig Nevill-Manning joined Google in 2000 and is responsible for developing new high-precision search techniques for use in the Google search engine. He has also done research in the field of computational biology.

Today the subject of his talk is Google World.

Following a brief introduction regarding the business support innovators can get in Kentucky, Nevill-Manning takes the stage. He announces that he will cover Five Things Google has learned:

1. Think broadly. "Computer science is not necessarily programming." Asking for five volunteers, he hands them papers with a number of orange dots on them. The volunteers organize themselves in several ways so that the dots sum to several numbers, or are arranged in patterns. As he calls out numbers, the volunteers hide or reveal their papers. As he counts up to 32, the sheets are hidden or revealed in a distinctive pattern, which he describes. He says he's illustrating the binary numbering system. Dismissing the group, he says that he will be doing the same thing with middle school kids later today, "which will undoubtedly be much easier." The joke draws a good laugh from the audience.
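For readers who missed the demonstration, here is a rough sketch of the card trick in Python. The dot counts (16, 8, 4, 2, 1) are my assumption about what was on the sheets, and the function names are made up for illustration; this is not anything shown on stage.

```python
# Five cards, as in the demonstration. The dot counts are assumed to be
# powers of two, which is what makes the hide/reveal pattern work.
CARDS = [16, 8, 4, 2, 1]


def cards_showing(n):
    """Return, for each card, True if it is revealed when displaying n."""
    if not 0 <= n <= sum(CARDS):
        raise ValueError("five cards can only show 0 through 31")
    shown = []
    for dots in CARDS:
        if n >= dots:
            # This card fits into what is left, so reveal it.
            shown.append(True)
            n -= dots
        else:
            shown.append(False)
    return shown


def as_bits(n):
    """Write the revealed/hidden pattern as a string of 1s and 0s."""
    return "".join("1" if s else "0" for s in cards_showing(n))


for n in range(6):
    print(n, as_bits(n))
```

With these assumed dot counts, five cards cover 0 through 31; showing 32 would take a sixth card.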

2. Enable Others - He proceeds to write a program using Google Maps. He notes that the company published an API for the maps when it noticed the interest in them. Finishing his code, he ties it to Google Maps.

By the way, he also explains that by clicking "I'm feeling lucky," Google will take the searcher directly to the top result in Google. I didn't know that.

People in Chicago have used the ability to tie data to the maps to create crime data maps, he explains. Housing maps can also be done in the same manner.

3. Use Deep Technology - Nevill-Manning describes how Google can offer spelling correction in context. It takes the massive amount of data Google has and applies an algorithm to find the correct spelling.
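The talk did not go into the algorithm itself, so the following is only a toy sketch of what "spelling correction in context" can mean: candidates one edit away from the typed word are ranked by how often they follow the previous word in some text. The tiny corpus, the function names, and the scoring rule are all my own assumptions, not Google's method.

```python
from collections import Counter

# Tiny made-up corpus standing in for the "massive amount of data".
CORPUS = (
    "the weather is fine today . "
    "whether or not it rains we will go . "
    "i do not know whether he will come . "
    "the weather report said rain ."
).split()

UNIGRAMS = Counter(CORPUS)                  # how often each word appears
BIGRAMS = Counter(zip(CORPUS, CORPUS[1:]))  # how often word b follows word a
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """All strings one delete, swap, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + swaps + replaces + inserts)


def correct(prev_word, word):
    """Pick the known candidate seen most often after prev_word."""
    candidates = {w for w in edits1(word) | {word} if w in UNIGRAMS}
    if not candidates:
        return word  # nothing known nearby; leave the word alone
    # Context (bigram count) decides first; overall frequency breaks ties.
    return max(candidates, key=lambda w: (BIGRAMS[(prev_word, w)], UNIGRAMS[w]))


print(correct("the", "wether"))
print(correct("know", "wether"))
```

The same misspelling gets a different correction depending on the word before it, which is the point of doing it in context.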

4. Building for Scale - How can Google build an infrastructure to harness wide area computing? He cites a cute saying he encountered at Stanford, "forget quality, go for numbers." The goal is not to provide the perfect result every time, because that would sacrifice the quantity of results. Google goes for "straightforward algorithms."

Instead of using vast mainframes, the company builds its strategy around PCs. In a vast computing environment, the processing power can be taken advantage of while the software falls in behind.

  • Basic principle: replicate everything.
  • Have multiple channels to different machines.
  • Single failures therefore don't hurt. Graceful degradation is the goal.
  • Replication is needed for scalability. He revealed that a great number of drives at Google are held in place by Velcro to make replacing them easy. Cooling is also a big challenge.
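To make the replication idea concrete, here is a minimal sketch of a lookup that survives single failures. The `Replica` class, the `replicated_lookup` function, and the sample data are hypothetical; the talk described the principle, not this code.

```python
import random


class Replica:
    """A stand-in for one cheap machine holding a full copy of the data."""

    def __init__(self, name, data, alive=True):
        self.name = name
        self.data = data
        self.alive = alive

    def lookup(self, key):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.data.get(key)


def replicated_lookup(replicas, key, rng=random):
    """Try replicas in random order; fail only if every copy is down."""
    order = list(replicas)
    rng.shuffle(order)  # spread requests across the machines
    for replica in order:
        try:
            return replica.lookup(key)
        except ConnectionError:
            continue  # a single failure doesn't hurt; try the next copy
    raise RuntimeError("all replicas are down")


# Three machines with the same (made-up) data; two of them have failed.
data = {"kentucky": "page-17"}
replicas = [Replica(f"pc{i}", dict(data)) for i in range(3)]
replicas[0].alive = False
replicas[1].alive = False
print(replicated_lookup(replicas, "kentucky"))
```

As long as one copy is reachable, the caller still gets an answer, which is roughly what graceful degradation means here.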

He said that it's now a tradition to have a fan in each data center, regardless of how big, to remind the company of its roots.

5. Detect Trends - He shows query patterns that dip in the middle of the day in Spain, for example. He quickly finishes from there.

Taking questions, he confirms that Google is getting more interested in life sciences information retrieval. For example, how chemicals are coded online may provide opportunities for the company. It's obviously very different from how text is entered.

Nevill-Manning says the next big thing in computing is computers that can communicate verbally with humans. He believes that in the next several years, actually typing in queries will seem very primitive.

On local search, he believes that finding different kinds of information (maps for example) is important. Mobility is also terribly important. Information must be available on mobile devices.

He also describes how Google provides search results so fast - web crawling, caching that information and making it available later.
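He did not show code for this, but the general idea of doing the work ahead of time can be sketched with a small inverted index: pages are processed once, and a query becomes a quick lookup. The page names, their text, and the function names below are invented for illustration, and an inverted index is my assumption about what the stored information looks like.

```python
from collections import defaultdict

# Hypothetical pages a crawler might already have fetched.
CRAWLED = {
    "page1": "google maps api",
    "page2": "crime maps in chicago",
    "page3": "housing maps",
}


def build_index(pages):
    """Done ahead of time: map each word to the pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Done at query time: intersect the precomputed sets of pages."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


INDEX = build_index(CRAWLED)
print(sorted(search(INDEX, "chicago maps")))
```

The slow part (reading every page) happens before anyone searches, so answering a query never touches the pages themselves.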

That's all from here.