What's The Advantage Of Google Cache For... Google?

Shizuka · November 17, 2012

I understand how to use Google cache, how it relates to SEO and how it can be helpful to everyone on the interwebz. However, I really don't get what the advantage is of:

1) caching every single page (I mean, each one must be at least 1 KB, which would easily add up to hundreds of TB);

2) acting as a proxy (which, again, would 'consume' extra bandwidth on Google's servers);

What is Google getting out of it? I'm just curious.
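To put rough numbers on point 1, here is a back-of-the-envelope sketch. The 1 KB page size is the guess from above, the 100 TB target stands in for "hundreds of TB", and decimal units (1 KB = 1000 bytes) are an assumption; none of these are real figures from Google.

```python
# Back-of-the-envelope storage estimate. All inputs are hypothetical guesses.
KB = 1000          # decimal units assumed: 1 KB = 1000 bytes
TB = 1000 ** 4     # 1 TB = 10^12 bytes


def total_storage_tb(page_count, avg_page_bytes):
    """Total storage in TB for page_count pages of avg_page_bytes each."""
    return page_count * avg_page_bytes / TB


def pages_for(target_tb, avg_page_bytes):
    """How many pages of avg_page_bytes fit into target_tb terabytes."""
    return target_tb * TB // avg_page_bytes


# Number of 1 KB pages it takes to reach 100 TB.
pages_needed = pages_for(100, 1 * KB)
print(pages_needed)                             # 100000000000 (10^11 pages)
print(total_storage_tb(pages_needed, 1 * KB))   # 100.0
```

So at 1 KB per page it would take on the order of a hundred billion pages to reach 100 TB; larger pages get there proportionally faster.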

Adam · November 18, 2012

Google's "cache" is probably just a nice name for the latest version of a web page they have on disk.

kicken · November 18, 2012

If you're talking about the 'Cached' link Google has next to search results, allowing a searcher to pull up a copy of a page as of the last crawl, it's mostly a convenience feature for users. I seem to recall reading an article a long time ago about how they offered it mainly because they were storing a copy of the page anyway, so why not let users see it.

Offering the cached copy is good not only for end users but possibly for site owners as well. From an end-user perspective, it means they can still get the information they want even if the original site is down or slow. Users being able to get what they want means they are more likely to continue using Google. More users using Google means more ad revenue for Google. The investment Google has to make to offer the "cached service" is relatively minimal. They already have to download and store the page anyway as a result of their indexing process, so all it really costs them is storage space and a small amount of development time to implement the feature. Storage space is fairly inexpensive compared to other operating costs, and the development time to add a "Cached" link is probably only a one-time investment.

For site owners, having the cached content available on Google means someone searching for something could still get to your content even if your site is currently down or running slow. Sure, you may not get that single hit, but if the content is good a user may note your site and visit it again later when you're back up and running.

Shizuka · November 18, 2012

I understand your point about favoring the users so they continue using Google; that makes sense. However, what do you mean by "They already have to download and store the page anyway"? They do? Why?

kicken · November 18, 2012

however what do you mean by "They already have to download and store the page anyway", they do? Why?

As part of the indexing process they do for the web search. If I am remembering properly what I had read, Google's crawler bots only download and save the pages; they do not do any of the indexing work, such as extracting keywords and ranking pages. All of that is handled by a separate process. So the process goes a little something like this:

1) Crawler bot downloads a page from the internet.

2) Crawler bot saves the downloaded file to disk somewhere.

3) Crawler bot adds that file to the indexer's queue.

4) Crawler bot repeats the process from step 1.

So there are several crawlers all doing nothing but downloading files from the internet and saving them somewhere to be processed later by the indexer. The indexer works off a queue-type system where it goes through each file in the queue, extracts all the keywords, ranks the page using an algorithm, finds links to follow, etc. Once the indexer is done processing a file, they basically have a choice between deleting it or keeping it. Deleting it would free up space, but by keeping it they can offer that cache service as an extra feature for relatively little investment on their part.
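The steps above can be sketched roughly like this. It is only a toy illustration of the crawl / queue / index split as I described it, not Google's actual design: `FAKE_WEB`, `page_store`, `index_queue`, `keyword_index` and the function names are all made up for the example, the "web" is an in-memory dict instead of real HTTP requests, and ranking is left out entirely.

```python
import re
from collections import deque

# Hypothetical stand-in for the internet: URL -> HTML.
# A real crawler would fetch these over HTTP.
FAKE_WEB = {
    "http://example.com/": (
        "<html><body>Hello cache world "
        "<a href='http://example.com/about'>about</a></body></html>"
    ),
    "http://example.com/about": "<html><body>About the cache</body></html>",
}

page_store = {}        # URL -> page as of the last crawl (the "disk")
index_queue = deque()  # URLs waiting to be processed by the indexer
keyword_index = {}     # keyword -> set of URLs containing it


def crawl(urls):
    """Crawler: download each page, save it, queue it for the indexer."""
    for url in urls:
        html = FAKE_WEB.get(url)      # step 1: download
        if html is None:
            continue                  # page not reachable, skip it
        page_store[url] = html        # step 2: save to "disk"
        index_queue.append(url)       # step 3: hand off to the indexer
        # step 4: the loop moves on to the next URL


def run_indexer():
    """Indexer: drain the queue, extract keywords and links to follow."""
    discovered = []
    while index_queue:
        url = index_queue.popleft()
        html = page_store[url]
        discovered.extend(re.findall(r"href='([^']+)'", html))
        text = re.sub(r"<[^>]+>", " ", html)   # crude tag stripping
        for word in re.findall(r"[a-z]+", text.lower()):
            keyword_index.setdefault(word, set()).add(url)
        # The saved page is kept rather than deleted after indexing,
        # which is what makes a "Cached" link possible.
    return discovered


def cached_copy(url):
    """Return the page as of the last crawl, or None if never crawled."""
    return page_store.get(url)


crawl(["http://example.com/"])
links = run_indexer()   # links found on the first page
crawl(links)
run_indexer()
print(sorted(keyword_index["cache"]))
print(cached_copy("http://example.com/about"))
```

The point is that `cached_copy` needs no extra downloading at all: it just reads what the crawler already saved for the indexer.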

Shizuka · November 19, 2012

Whoa, that is... pretty unintuitive, but I guess that makes sense after all!

Thanks for the clarification.

JonnoTheDev · November 27, 2012

Google cache has saved my ass a couple of times when one of our servers blew up and there was no backup of a few old static websites. I simply scraped the cache files from Google and had the sites back up and running really quickly.
