| A treasury from Google Labs: a new programming model |
| Wednesday, 29 December 2004 18:55 |
MapReduce: Simplified Data Processing on Large Clusters

"MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key."

Statistics for the MapReduce jobs run in August 2004. As usual, an impressive amount of data:

Number of jobs: 29,423
Average job completion time: 634 secs
Machine days used: 79,186 days
Input data read: 3,288 TB
Intermediate data produced: 758 TB
Output data written: 193 TB

...

"It has been used across a wide range of domains within Google, including: large-scale machine learning problems, clustering problems for the Google News and Froogle products, ... MapReduce has been so successful because it makes it possible to write a simple program and run it efficiently on a thousand machines in the course of half an hour, greatly speeding up the development and prototyping cycle. Furthermore, it allows programmers who have no experience with distributed and/or parallel systems to exploit large amounts of resources easily."

More in the PDF.
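To make the quoted definition concrete, here is a minimal single-process sketch of the model in Python, using word counting as the example. It is not Google's implementation: the names `map_word_count`, `reduce_word_count` and `run_mapreduce` are hypothetical, and the "cluster" is just an in-memory dictionary that groups intermediate values by key. The real system distributes these same two user functions across many machines.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_word_count(doc_name: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map: take one (key, value) pair and emit intermediate (key, value) pairs."""
    for word in text.lower().split():
        yield word, 1


def reduce_word_count(word: str, counts: List[int]) -> int:
    """Reduce: merge all intermediate values that share the same key."""
    return sum(counts)


def run_mapreduce(
    inputs: Iterable[Tuple[str, str]],
    map_fn: Callable[[str, str], Iterable[Tuple[str, int]]],
    reduce_fn: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Hypothetical in-memory driver (assumption: everything fits on one machine)."""
    # Group the intermediate values by their intermediate key.
    intermediate: Dict[str, List[int]] = defaultdict(list)
    for key, value in inputs:
        for ikey, ivalue in map_fn(key, value):
            intermediate[ikey].append(ivalue)
    # Call the reduce function once per distinct intermediate key.
    return {ikey: reduce_fn(ikey, values) for ikey, values in intermediate.items()}


docs = [
    ("a.txt", "the quick brown fox"),
    ("b.txt", "the lazy dog the end"),
]
counts = run_mapreduce(docs, map_word_count, reduce_word_count)
print(counts["the"])  # 3
```

The user writes only the two small functions at the top; grouping by key and scheduling the calls is the job of the framework, which is what lets programmers without distributed-systems experience use a large cluster.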
| Last Updated on Wednesday, 29 December 2004 19:52 |
