
What's The Advantage Of Google Cache For... Google?



I understand how to use Google cache, how it relates to SEO, and how it can be helpful to everyone on the interwebz. However, I really don't get it: what's the advantage of:

 

1) caching every single page (I mean, each one must be at least 1 KB, which would easily accumulate to hundreds of TBs);

2) acting as a proxy (which, again, would 'consume' extra bandwidth on Google's servers)?
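A rough back-of-envelope on point 1, with figures that are purely assumed here just to illustrate the scale:

```python
# Assumed, illustrative numbers only -- not actual Google figures.
pages = 50_000_000_000            # say ~50 billion indexed pages
avg_kb_per_page = 100             # say ~100 KB of HTML per page
total_tb = pages * avg_kb_per_page / 1_000_000_000   # KB -> TB
print(f"~{total_tb:,.0f} TB of raw HTML")            # ~5,000 TB, i.e. petabytes
```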

 

What is Google getting out of it? I'm just curious.

If you're talking about the 'Cached' link Google has next to search results, allowing a searcher to pull up a copy of a page as of the last crawl, it's mostly a convenience feature for users. I seem to recall reading an article a long time ago about how they offered it mainly because they were storing a copy of the page anyway, so why not let users see it.

 

Offering the cached copy is good not only for end users but possibly for site owners as well. From an end user's perspective, it means they can still get the information they want even if the original site is down or slow. Users being able to get what they want means they are more likely to keep using Google, and more users using Google means more ad revenue for Google. The investment Google has to make to offer the "cached service" is relatively minimal. They already have to download and store the page anyway as a result of their indexing process, so all it really costs them is storage space and a small amount of development time to implement the feature. Storage space is fairly inexpensive compared to other operating costs, and the development time to add a "Cached" link is minimal and probably a one-time investment.

 

For site owners, having the cached content available on Google means someone searching for something can still get to your content even if your site is currently down or running slowly. Sure, you may not get that single hit, but if the content is good, a user may note your site and visit it again later when you're back up and running.

 

 


I understand your point about favoring the users so they continue using Google; that makes sense. However, what do you mean by "They already have to download and store the page anyway"? They do? Why?


 

As part of the indexing process they do for the web search. If I'm remembering what I read correctly, Google's crawler bots only download and save the pages; they don't do any of the indexing work, such as extracting keywords and ranking pages. All of that is handled by a separate process. So the process goes a little something like this:

 

1) Crawler bot downloads a page from the internet.

2) Crawler bot saves the downloaded file to disk somewhere.

3) Crawler bot adds that file to the indexer's queue.

4) Crawler bot repeats the process from step 1.

 

So there are several crawlers all doing nothing but downloading files from the internet and saving them somewhere to be processed later by the indexer. The indexer works off a queue-type system where it goes through each file in the queue, extracts all the keywords, ranks the page using an algorithm, finds links to follow, etc. Once the indexer is done processing a file, Google basically has a choice between deleting it or keeping it. Deleting it would free up space, but by keeping it they can offer the cache service as an extra feature for relatively little investment on their part.
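To make that split concrete, here's a minimal Python sketch of the idea. The queue, the in-memory "store", and the keyword extraction are toy stand-ins I'm assuming purely for illustration, not how Google actually implements any of this:

```python
import hashlib
import queue
import urllib.request

to_index = queue.Queue()   # files waiting for the indexer
page_store = {}            # stands in for "saves the downloaded file to disk somewhere"

def crawl(url):
    # Steps 1-3: download the page, save the raw copy, hand it to the indexer.
    html = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")
    key = hashlib.sha1(url.encode()).hexdigest()
    page_store[key] = html           # this kept copy is what a "Cached" link could serve
    to_index.put((url, key))

def run_indexer():
    # Separate pass: pull saved pages off the queue and do the "indexing" work.
    while not to_index.empty():
        url, key = to_index.get()
        words = set(page_store[key].lower().split())   # crude keyword extraction
        print(f"{url}: {len(words)} distinct tokens indexed; raw copy kept as the cache")

crawl("https://example.com/")
run_indexer()
```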

 

Google cache has saved my ass a couple of times when one of our servers blew up and there was no backup of a few old static websites. I simply scraped the cache files from Google and had the sites back up and running really quickly.
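If anyone wants to do the same kind of recovery, here's a rough Python sketch of the idea. It assumes the webcache.googleusercontent.com URL pattern Google has used for cached copies, and the lost-page URLs are made up; Google may throttle or block automated requests, so treat this as illustrative rather than a supported API:

```python
import urllib.parse
import urllib.request

def fetch_cached_copy(page_url):
    # Ask Google for its cached copy of page_url (historical webcache URL pattern).
    cache_url = ("https://webcache.googleusercontent.com/search?q=cache:"
                 + urllib.parse.quote(page_url, safe=""))
    req = urllib.request.Request(cache_url, headers={"User-Agent": "Mozilla/5.0"})
    return urllib.request.urlopen(req).read().decode("utf-8", errors="replace")

# Hypothetical lost pages; save each recovered copy to disk so the site can be rebuilt.
for lost_page in ["http://example.com/about.html", "http://example.com/contact.html"]:
    html = fetch_cached_copy(lost_page)
    with open(lost_page.rsplit("/", 1)[-1], "w", encoding="utf-8") as f:
        f.write(html)
```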
