- Percolator: Large-scale Incremental Processing Using Distributed Transactions and Notifications
- Dremel: Interactive Analysis of Web-Scale Datasets
- Pregel: Large-scale graph computing at Google

What Are Percolator, Dremel and Pregel, and How Does Google Use Them?
Percolator = incremental indexing. Dremel = like MySQL, but for huge databases. Pregel = a solution to graph problems.
I doubt this will help you with your SEO and rankings, but hey, it is always fun hearing an engineer talk through some really technical details of how certain things are used at Google.
Can you provide some insight into how Google uses Percolator, Dremel and Pregel? Blind Five Year Old, SF, CA
Matt starts off by letting everyone know that these are completely different tools. He then goes into some examples of how these programs are integrated into Google.
Percolator
Matt states that Percolator, which the world outside of Google knew through the Caffeine indexing update, is built on top of Bigtable. Percolator was the overall system that Google used to keep the whole process running well. To explain what Percolator did, Matt describes the way Google used to index content: in batches. In his metaphor, this is like catching a train. Imagine everyone lined up for a train that only passes once a day. People fill up the train, and everyone else in line has to wait for the next one. Percolator turned Google's old batch indexing into incremental indexing, so that instead of a thousand people waiting to leave on the daily train, people can leave as they arrive by, let's say, getting into a taxi.
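To make the train-versus-taxi contrast concrete, here is a minimal Python sketch. It is a toy in-memory word index, not Percolator's actual design, and the class names `BatchIndexer` and `IncrementalIndexer` are made up for illustration.

```python
from collections import defaultdict


class BatchIndexer:
    """Toy 'daily train': documents wait in a queue until the next batch run."""

    def __init__(self):
        self.pending = []              # documents waiting for the next run
        self.index = defaultdict(set)  # word -> set of document ids

    def submit(self, doc_id, text):
        self.pending.append((doc_id, text))

    def run_batch(self):
        # Everything queued since the last run becomes searchable at once.
        for doc_id, text in self.pending:
            for word in text.lower().split():
                self.index[word].add(doc_id)
        self.pending.clear()

    def search(self, word):
        return sorted(self.index.get(word.lower(), set()))


class IncrementalIndexer:
    """Toy 'taxi': each document is indexed as soon as it arrives."""

    def __init__(self):
        self.index = defaultdict(set)

    def submit(self, doc_id, text):
        for word in text.lower().split():
            self.index[word].add(doc_id)

    def search(self, word):
        return sorted(self.index.get(word.lower(), set()))


batch = BatchIndexer()
incremental = IncrementalIndexer()
for indexer in (batch, incremental):
    indexer.submit("page-1", "fresh news about trains")

before_batch = batch.search("news")      # page is still waiting for the train
right_away = incremental.search("news")  # page is already searchable
batch.run_batch()
after_batch = batch.search("news")       # searchable only after the batch run
```

In the batch version the new page cannot be found until `run_batch()` is called, while the incremental version makes it searchable the moment it is submitted.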
Dremel
Matt explains that Dremel lets you do very fast data analysis and extraction over very large databases, and lets you play and interact with the data. Matt compares Dremel to MySQL in that both are databases you can query, except that Dremel works at a huge scale. Cutts does say that there are some architectural differences, but on a practical level Dremel lets people at Google take databases the size of the web and run very fast queries over them.
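As a rough illustration of the kind of ad hoc, interactive query being described, here is a small Python sketch. It uses the standard library's `sqlite3` purely as a stand-in, since Dremel itself is not something you can run locally, and the `pages` table and its columns are invented for the example.

```python
import sqlite3

# In-memory database standing in for a (vastly larger) dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT, language TEXT, word_count INTEGER)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [
        ("https://example.com/a", "en", 500),
        ("https://example.com/b", "en", 1500),
        ("https://example.org/c", "de", 800),
    ],
)

# An ad hoc aggregate: how many pages per language, and their average length?
rows = conn.execute(
    "SELECT language, COUNT(*), AVG(word_count) FROM pages "
    "GROUP BY language ORDER BY language"
).fetchall()
conn.close()
```

The point of the comparison is the workflow: you type a question as a query and get an answer back quickly enough to ask a follow-up, just over far more data than a single MySQL server would hold.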
Pregel
Matt describes Pregel as a system for dealing with graph problems. It allows people at Google to solve large graph problems, such as calculating the reputation of your links or the connections between people, and it does these computations very fast.
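A link-reputation calculation of this kind can be sketched in Python as a series of rounds in which every page sends a share of its score along its outgoing links. This is a single-machine toy in the spirit of a vertex-centric graph system, not Pregel's real API; the function name `pagerank_supersteps` and the three-page graph are hypothetical, and the sketch assumes every page has at least one outgoing link.

```python
def pagerank_supersteps(graph, supersteps=30, damping=0.85):
    """Toy PageRank. `graph` maps each page to the list of pages it links to.

    Assumes every page has at least one outgoing link and that every
    link target is itself a key of `graph`.
    """
    n = len(graph)
    rank = {page: 1.0 / n for page in graph}
    for _ in range(supersteps):
        # Message phase: each page splits its rank evenly across its links.
        inbox = {page: [] for page in graph}
        for page, out_links in graph.items():
            for target in out_links:
                inbox[target].append(rank[page] / len(out_links))
        # Compute phase: each page updates its rank from the messages it got.
        rank = {
            page: (1 - damping) / n + damping * sum(inbox[page])
            for page in graph
        }
    return rank


# Hypothetical link graph: a links to b and c, b links to c, c links to a.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank_supersteps(links)
```

Here page `c` ends up with a higher score than page `b`, because it receives links from both `a` and `b` while `b` only receives half of `a`'s vote. In a real distributed system the pages would be spread over many machines, with the messages travelling between them.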
You can read more about each of these in the Research at Google papers listed above.
Matt Cutts from Google On Percolator, Dremel and Pregel - Infrastructure calculating PageRank and nodes of link graph!
Search Engine Methods and Strategies over the Last Decade for Indexing Web Pages
In short, on-page and off-page optimization techniques and strategies ultimately aim to improve a website's results in search engines and to meet the goals of the organization.
The term Search Engine Optimization (SEO) came into wide use over the last decade, as the internet reached millions of people and smart devices became affordable. The name explains itself: it means improving, or optimizing, a website for search engines such as Google, Yahoo and Bing. SEO depends entirely on the keywords that users type into the search engine (what we call organic search). Users enter words in the search box according to what they need, and the search engine returns the relevant results from its database. Websites that contain words related to the search (in internet marketing, these are the keywords) are listed one by one in the results. The number of websites on the World Wide Web keeps growing, so search engines have to keep improving their methods to index them all. It is also impossible to put every website on the first page of results, so the order in which sites appear matters. Therefore, to present correct information in an ordered way, major search engines like Google set criteria that determine how websites appear in the results, as follows:
(A) On Page Optimization:
It covers website content and structure, and includes the following factors:
- Content Quality: Every web page has content, but search engines give preference to content that is well defined and easy for the user to understand.
- Content Quantity: Quantity does not always matter, but good content that carries sufficient information will naturally have enough of it.
- Keywords: Keywords are the major tool for getting a web page onto the first page of a search engine. Many SEO companies emphasize keywords to improve a website's rank.
- Sitemap: A site can have so many web pages that it is difficult to analyze them one by one. A better way is to list all of them on a single page, so that the search engine gets the information it needs.
- Google Webmaster Tools (GWT): Submitting a website to many different directories is difficult. To ease this process, Google Webmaster Tools helps get the site indexed in the major search engine, Google.
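The sitemap item in the list above can be sketched as a short Python script that writes out an XML sitemap in the sitemaps.org format. This is only a minimal sketch: the helper name `build_sitemap` and the example.com URLs are placeholders, and a real sitemap would usually be saved as a file on the site and may carry extra fields.

```python
import xml.etree.ElementTree as ET


def build_sitemap(urls):
    """Return a minimal XML sitemap listing the given page URLs."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for page in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page  # one <loc> entry per page
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs for illustration only.
sitemap_xml = build_sitemap(
    ["https://example.com/", "https://example.com/about"]
)
```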
(B) Off Page Optimization:
Off-page optimization is work done for the website but not on it: it happens on other websites, from which we get links (often called backlinks). The factors are as follows:
- Article Writing: In SEO, we write about a particular product, using the right keywords in content that is easy for users to read, and submit it to article websites.
- Press Release: When a company launches a new product, its details are written up and submitted to press release websites.
- Forum Posting: Here we share and gain information on products.
- Content Scaling: Submitting documents in PDF or other formats to websites like Slideshare.
Tags: Web indexing, Search engine indexing, How Google Search Works, Crawling & Indexing, Indexing Web Sites on the Internet
