Lots of people know that when you look at a cached page in Google, we show the date when Googlebot last visited that page. For example, the cached page for my blog might look like this:
You can see the red oval where I’ve circled the time that Googlebot last fetched my blog’s home page. Google was the first major search engine to start showing the crawl date, and I think at this point every major search engine shows the crawl date on cached pages, except for one. *cough* *cough*, sorry, I’ve been under the weather today.
Yesterday, Vanessa Fox did a great post about how we’re changing the crawl dates that you see at the top of Google cached pages to make them more accurate. (By the way, bonus points if you can spot the leetspeak in Vanessa’s post.) Google uses an HTTP request header called “If-Modified-Since” to use less bandwidth when crawling the web (which is a good thing for site owners). In essence, if a page hasn’t changed since the last time we fetched it, the server can answer with a short 304 (“Not Modified”) status code and there’s no need to send the whole page again. Until now, though, when that check told us a page was unchanged, we didn’t update the crawl date in the cache. We’re changing that now, so the date reflects the last time we checked the page.
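If you’d like to see what an If-Modified-Since request looks like from the crawler’s side, here’s a minimal Python sketch using only the standard library. To be clear, this is not Googlebot’s code; the function names `conditional_headers` and `fetch_if_modified` are made up for illustration, and a real crawler does a lot more (robots.txt, politeness, retries).

```python
from datetime import datetime, timezone
from email.utils import format_datetime
import urllib.error
import urllib.request


def conditional_headers(last_fetched: datetime) -> dict:
    """Build the header that asks: 'has this page changed since last_fetched?'

    last_fetched should be a timezone-aware datetime (hypothetical helper,
    not part of any crawler's real API).
    """
    # HTTP dates are always written in GMT, e.g. "Mon, 01 Jan 2007 00:00:00 GMT".
    as_utc = last_fetched.astimezone(timezone.utc)
    return {"If-Modified-Since": format_datetime(as_utc, usegmt=True)}


def fetch_if_modified(url: str, last_fetched: datetime):
    """Return (status, body). body is None for any non-2xx answer, including 304."""
    request = urllib.request.Request(url, headers=conditional_headers(last_fetched))
    try:
        with urllib.request.urlopen(request) as response:
            # 200: the page changed (or the server ignores the header),
            # so the full body comes back.
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # urllib reports 304 (and 404, etc.) as HTTPError; a 304 carries
        # no page body, which is where the bandwidth saving comes from.
        return err.code, None
```

Calling `fetch_if_modified(url, last_fetched)` against a server that honors the header should give you a 304 with no body when nothing has changed, and a 200 with the full page otherwise.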
In case that sounds complex, I made a video last night that uses candy to illustrate the point. Here you go:
Matt talks about how Googlebot crawls the web, and what “crawl date” is shown on cached pages. In this video:
– Red candy is a 404 page
– Purple candy is a 200 (OK) page
– Green candy is a 304 status code (page has not changed)
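For people who prefer code to candy, here’s a rough Python sketch of how the three status codes could feed into the crawl date shown on a cached page. The `CacheEntry` class and `record_crawl` function are hypothetical names I’m using for illustration, not how Google’s cache actually works. In particular, leaving the entry alone on a 404 is purely an assumption of this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CacheEntry:
    body: Optional[bytes]           # the stored copy of the page
    crawl_date: Optional[datetime]  # the date shown on the cached page


def record_crawl(entry: CacheEntry, status: int, checked_at: datetime,
                 body: Optional[bytes] = None) -> CacheEntry:
    """Return the cache entry after one crawl attempt (illustrative only)."""
    if status == 200:
        # Purple candy: a fresh copy came back, so store it with the new date.
        return CacheEntry(body=body, crawl_date=checked_at)
    if status == 304:
        # Green candy: the page is unchanged, so keep the stored copy but
        # move the crawl date forward to when we checked.
        return CacheEntry(body=entry.body, crawl_date=checked_at)
    # Red candy (404) and anything else: this sketch assumes the entry is
    # left untouched; the post doesn't say what really happens here.
    return entry
```

The 304 branch is the part that’s changing: the stored copy stays the same, but the crawl date now moves forward to the time of the check.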
If you’re someone who prefers to learn visually, I hope the video helps.