Google Fellow: Jeff Dean at Univ. of Washington

An interesting colloquium (video recorded) from the University of Washington, in which Jeff Dean, a Google Fellow known for his work at Google, talks about Google's existing hardware and software infrastructure, several of the problems the company faces, and ways of solving them.

From:
http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details....

Abstract
In this talk I'll give some background of Google's existing hardware and software infrastructure. I'll then discuss what works well and what does not, and I'll highlight some areas where we see interesting unsolved research problems. The problems span a wide range of topics, including processor design, distributed systems, machine learning, information retrieval, text processing and many other areas. This talk is meant to cover a sampling of interesting problems/areas, not a comprehensive treatise.


Jeff is known for his contributions to:

"   *  Google's initial advertising system.

    * The design and implementation of four generations of our crawling, indexing, and query serving systems, covering two and three orders of magnitude growth in number of documents searched, number of queries handled per second, and frequency of updates to the system.

    * The initial development of Google's AdSense for Content product (involving both the production serving system design and implementation as well as work on developing and improving the quality of ad selection based on the contents of pages).

    * Some of the initial production serving system work for the Google News product, working with Krishna Bharat to move the prototype system he put together into a deployed system.

    * Some aspects of our search ranking algorithms, notably improved handling for dealing with off-page signals such as anchortext.

    * The first generation of our automated job scheduling system for managing a cluster of machines.

    * Prototyping infrastructure for rapid development and experimentation with new ranking algorithms.

    * MapReduce, a system for simplifying the development of large-scale data processing applications.
 A paper about MapReduce appeared in OSDI'04.

    * BigTable, a large-scale semi-structured storage system used underneath a number of Google products.
 A paper about BigTable appeared in OSDI'06.

    * Some of the production system design aspects for Google's statistical machine translation system.
 In particular, I designed a system for distributed high-speed access to very large language models (too large to fit in memory on a single machine).

    * Some internal tools to make it easy to rapidly search our internal source code repository. "
