
Designs, Lessons and Advice from Building Large Distributed Systems, from the King of Search: Google
Google’s Jeff Dean was one of the keynote speakers at an ACM workshop on large-scale computing systems, where he discussed some of the technical details of the company’s mighty infrastructure, which is spread across dozens of data centers around the world. His presentation gives some insight into what is going on at Google, and into how they have found innovative solutions in their never-ending quest for speed and better bandwidth usage. Their figures impressed me a lot!
You will learn about some of their in-house technologies, namely:
- Google File System (GFS): a scalable distributed file system for large distributed data-intensive applications.
- MapReduce: a software framework introduced by Google to support distributed computing on large data sets on clusters of computers [Wikipedia]. See the Hadoop project for a free, open source Java MapReduce implementation.
- BigTable: a compressed, high-performance, proprietary database system built on Google File System (GFS). See the Hadoop HBase project for something similar.
- Spanner: their new project, a storage and computation system designed to span all of their data centers.
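To make the MapReduce idea in the list above a bit more concrete, here is a minimal single-machine sketch in Python of the classic word-count example. It is not Google's or Hadoop's API: the names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are made up for illustration, and a real framework would run the map and reduce steps in parallel across many machines and handle failures for you.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group emitted values by key (a real framework does this between phases)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of strings (hypothetical driver)."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "the fox"]
    print(word_count(docs))
```

The point of the split is that each `map_phase` call only looks at one document and each `reduce_phase` call only looks at one key, so both steps can be spread over as many machines as you have.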
Read this great document online now: http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf (if it disappears, ask me for a copy)