
What Are Percolator, Dremel and Pregel, and How Does Google Use Them?



Percolator = Incremental indexing. Dremel = Like MySQL but for huge databases. Pregel = Solution to graph problems

I doubt this will help you with your SEO and rankings, but hey, it is always fun hearing an engineer talk through some really technical details about how certain things are used at Google.

Can you provide some insight into how Google uses Percolator, Dremel and Pregel? Blind Five Year Old, SF, CA

Matt starts off by letting everyone know that these are completely different tools. He then goes into some examples of how these programs are integrated into Google.
Percolator

Matt states that Percolator, or Caffeine as it was known to the world outside of Google, is built on top of Bigtable. It is the system Google used to move its indexing from batch to incremental processing. To explain what Percolator did, Matt describes the way Google used to index content: in batches. In his metaphor, he compares this to catching a train. Imagine everyone standing in line for a train that only passes once a day. People fill up the train, and everyone else in the line has to wait for the next one. Percolator took Google’s old batch indexing and made it incremental, so that instead of a thousand people waiting to leave on the daily train, people can leave as they arrive, say, by getting into a taxi.
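
To make the train versus taxi contrast concrete, here is a minimal Python sketch of a toy inverted index built both ways. It is only an illustration of batch versus incremental updates, not Percolator's actual design; the names `build_index_batch`, `IncrementalIndex` and `upsert` are made up for this example.

```python
from collections import defaultdict


def build_index_batch(documents):
    """Rebuild the whole inverted index from scratch (the 'daily train')."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


class IncrementalIndex:
    """Update the index one document at a time (the 'taxi').

    Hypothetical toy class for illustration only.
    """

    def __init__(self):
        self.index = defaultdict(set)
        self.doc_words = {}  # remembers which words each document contributed

    def upsert(self, doc_id, text):
        new_words = set(text.lower().split())
        old_words = self.doc_words.get(doc_id, set())
        # Remove postings for words the document no longer contains.
        for word in old_words - new_words:
            self.index[word].discard(doc_id)
            if not self.index[word]:
                del self.index[word]
        # Add postings only for words that are new to this document.
        for word in new_words - old_words:
            self.index[word].add(doc_id)
        self.doc_words[doc_id] = new_words


docs = {"a": "fresh news today", "b": "old news archive"}

inc = IncrementalIndex()
for doc_id, text in docs.items():
    inc.upsert(doc_id, text)

# One page changes: the incremental index touches only that page's words.
inc.upsert("a", "fresh news tonight")

# The batch approach has to reprocess every document to reach the same state.
docs["a"] = "fresh news tonight"
batch = build_index_batch(docs)
```

Both approaches end up with the same index. The difference is that the incremental version does work proportional to the one page that changed, while the batch version reprocesses everything.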
Dremel

Matt explains that Dremel lets you run fast data analysis over very large databases and explore them interactively. He compares Dremel to MySQL: both are databases you can query, but Dremel works at a much larger scale. Cutts does say that there are some architectural differences, but on a practical level Dremel lets people at Google take databases the size of the web and run very fast queries over them.
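
As a rough idea of how a query can stay fast over a very large dataset, the Python sketch below splits a table into shards, stores each shard column by column, computes a partial answer per shard in parallel, and then merges the partial answers. This is a simplified illustration of the general idea, not Dremel's real architecture; the shard data and the function names `partial_sum` and `clicks_by_country` are invented for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards, stored column by column so a query only
# reads the columns it needs.
shards = [
    {"country": ["US", "DE", "US"], "clicks": [3, 5, 2]},
    {"country": ["DE", "FR"], "clicks": [1, 4]},
    {"country": ["US", "FR", "FR"], "clicks": [7, 1, 1]},
]


def partial_sum(shard):
    """Aggregate a single shard: sum of clicks per country."""
    totals = Counter()
    for country, clicks in zip(shard["country"], shard["clicks"]):
        totals[country] += clicks
    return totals


def clicks_by_country(all_shards):
    """Run the per-shard aggregation in parallel, then merge the results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_sum, all_shards))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return dict(total)


result = clicks_by_country(shards)
```

In SQL terms this is roughly `SELECT country, SUM(clicks) ... GROUP BY country`, which fits the MySQL comparison: the question looks familiar, but the work is spread across many machines instead of one.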
Pregel

Matt describes Pregel as a system for dealing with graph problems. It allows people at Google to work on large graph problems, such as calculating the reputation of your links or the connections between people, and it does these computations very fast.
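
For a feel of what "calculating the reputation of your links" can look like as a graph computation, here is a small Python sketch of a PageRank-style calculation. It runs in rounds (Pregel calls them supersteps) in which every page passes a share of its score along its outgoing links. This is a single-machine toy under simplifying assumptions (every page has at least one outgoing link, and the number of rounds is fixed); the function name `pagerank_supersteps` and the sample graph are made up for illustration.

```python
def pagerank_supersteps(graph, supersteps=50, damping=0.85):
    """Toy PageRank computed in rounds, one 'superstep' per iteration.

    Assumes every vertex appears as a key in `graph` and has at least
    one outgoing link (no dangling pages).
    """
    n = len(graph)
    rank = {vertex: 1.0 / n for vertex in graph}
    for _ in range(supersteps):
        # Message passing: each vertex splits its rank across its out-links.
        inbox = {vertex: 0.0 for vertex in graph}
        for vertex, out_links in graph.items():
            share = rank[vertex] / len(out_links)
            for target in out_links:
                inbox[target] += share
        # Each vertex computes its new rank from the messages it received.
        rank = {
            vertex: (1 - damping) / n + damping * inbox[vertex]
            for vertex in graph
        }
    return rank


# Hypothetical three-page link graph: a links to b and c, b to c, c to a.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank_supersteps(links)
```

Page `c` ends up with the highest score because both other pages link to it, and `b` with the lowest because it only receives half of `a`'s share. In a system like Pregel the same per-vertex logic would run spread across many machines rather than in one loop.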
You can read more about each of these from Research at Google here:
  1. Percolator: Large-scale Incremental Processing Using Distributed Transactions and Notifications
  2. Dremel: Interactive Analysis of Web-Scale Datasets
  3. Pregel: Large-scale graph computing at Google

Matt Cutts from Google On Percolator, Dremel and Pregel - Infrastructure calculating PageRank and nodes of link graph!
