<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Crawl caching proxy</title>
	<atom:link href="http://www.mattcutts.com/blog/crawl-caching-proxy/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mattcutts.com/blog/crawl-caching-proxy/</link>
	<description>neat fun stuff</description>
	<lastBuildDate>Fri, 06 Nov 2009 18:35:07 -0800</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Bambarbia</title>
		<link>http://www.mattcutts.com/blog/crawl-caching-proxy/#comment-400112</link>
		<dc:creator>Bambarbia</dc:creator>
		<pubDate>Mon, 05 Oct 2009 17:18:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=282#comment-400112</guid>
		<description>Hi Matt,

What is the current status of this? I noticed that for each incoming user request, Mediapartners-Google crawls a page 2 times - just because I have 2 AdSense units on a page!!!

Such an obvious performance bottleneck... I can use HTTP caching to help Mediapartners - but I&#039;ll have to disable gzipped output (problems with Apache HTTPD)

Additionally, I have some pages restricted from Googlebot, but enabled for Mediapartners-Google.</description>
		<content:encoded><![CDATA[<p>Hi Matt,</p>
<p>What is the current status of this? I noticed that for each incoming user request, Mediapartners-Google crawls a page 2 times &#8211; just because I have 2 AdSense units on a page!!!</p>
<p>Such an obvious performance bottleneck&#8230; I can use HTTP caching to help Mediapartners &#8211; but I&#8217;ll have to disable gzipped output (problems with Apache HTTPD)</p>
<p>Additionally, I have some pages restricted from Googlebot, but enabled for Mediapartners-Google.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Used limousines</title>
		<link>http://www.mattcutts.com/blog/crawl-caching-proxy/#comment-349157</link>
		<dc:creator>Used limousines</dc:creator>
		<pubDate>Thu, 18 Jun 2009 04:13:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=282#comment-349157</guid>
		<description>Matt’s diagram pretty much matches what we’ve been seeing, although I still think there might be some issues regarding robots.txt, but we’re still collecting some data on that, so I’m going to wait until that’s done before commenting on it.

Thanks
David Janes</description>
		<content:encoded><![CDATA[<p>Matt’s diagram pretty much matches what we’ve been seeing, although I still think there might be some issues regarding robots.txt, but we’re still collecting some data on that, so I’m going to wait until that’s done before commenting on it.</p>
<p>Thanks<br />
David Janes</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Electric Golf Trolleys</title>
		<link>http://www.mattcutts.com/blog/crawl-caching-proxy/#comment-277687</link>
		<dc:creator>Electric Golf Trolleys</dc:creator>
		<pubDate>Mon, 16 Mar 2009 11:09:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=282#comment-277687</guid>
		<description>As a qualified Microsoft MCSE (NT4 and 2000 server series), I know that the original web server called Proxy Server, and especially the later editions called ISA 200# (Internet Security and Acceleration) Server, cache many pages for your LAN.  So I guess the server models used by ISPs are doing the same.  Like your I.E. local PC application - how many times do you use F5 to refresh?</description>
		<content:encoded><![CDATA[<p>As a qualified Microsoft MCSE (NT4 and 2000 server series), I know that the original web server called Proxy Server, and especially the later editions called ISA 200# (Internet Security and Acceleration) Server, cache many pages for your LAN.  So I guess the server models used by ISPs are doing the same.  Like your I.E. local PC application &#8211; how many times do you use F5 to refresh?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Svi Poslovi</title>
		<link>http://www.mattcutts.com/blog/crawl-caching-proxy/#comment-128006</link>
		<dc:creator>Svi Poslovi</dc:creator>
		<pubDate>Fri, 30 May 2008 07:49:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=282#comment-128006</guid>
		<description>I switched to VPS and now my job listings aggregator works like a charm. Well, there was a shift from cPanel to Plesk, as well, but I don&#039;t think cPanel was directly responsible. It was probably some flooding protection...</description>
		<content:encoded><![CDATA[<p>I switched to VPS and now my job listings aggregator works like a charm. Well, there was a shift from cPanel to Plesk, as well, but I don&#8217;t think cPanel was directly responsible. It was probably some flooding protection&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Svi poslovi</title>
		<link>http://www.mattcutts.com/blog/crawl-caching-proxy/#comment-125574</link>
		<dc:creator>Svi poslovi</dc:creator>
		<pubDate>Mon, 07 Apr 2008 18:43:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=282#comment-125574</guid>
		<description>Interesting post. I still do have problems with Googlebot crawling my pages after I turn on optimization (gzip compression) for:
text/html
text/plain
text/xml
text/css
application/javascript

You wouldn&#039;t believe it. Nothing is crawled. I&#039;ll look into it and let you know if I find out what the problem is. If anyone here knows about it, yell it out! :-)</description>
		<content:encoded><![CDATA[<p>Interesting post. I still do have problems with Googlebot crawling my pages after I turn on optimization (gzip compression) for:<br />
text/html<br />
text/plain<br />
text/xml<br />
text/css<br />
application/javascript</p>
<p>You wouldn&#8217;t believe it. Nothing is crawled. I&#8217;ll look into it and let you know if I find out what the problem is. If anyone here knows about it, yell it out! <img src='http://www.mattcutts.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Horne</title>
		<link>http://www.mattcutts.com/blog/crawl-caching-proxy/#comment-121286</link>
		<dc:creator>David Horne</dc:creator>
		<pubDate>Thu, 24 Jan 2008 20:10:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=282#comment-121286</guid>
		<description>Also, just as a side note: remember that it is extremely important to set up a proper robots.txt for your proxy. If you don&#039;t, your proxy will crawl the web and Google will index other people&#039;s websites under your domain.  This seems great, but it&#039;s pretty unethical if you ask me, and the problems it causes outweigh the benefits of fresh content.

Here is the proper format for PHProxy 5

User-agent: *
Disallow: /index.php?

Just put that in your robots.txt file and upload it to the root of your server.

There are other ones for cgi proxy and glype.

If you&#039;re worried you can&#039;t do well in the search engines without stealing content, check out my site at http://www.proxybolt.com. I guarantee you I am doing very well in the SEs and have a ton of visitors without having to steal.</description>
		<content:encoded><![CDATA[<p>Also, just as a side note: remember that it is extremely important to set up a proper robots.txt for your proxy. If you don&#8217;t, your proxy will crawl the web and Google will index other people&#8217;s websites under your domain.  This seems great, but it&#8217;s pretty unethical if you ask me, and the problems it causes outweigh the benefits of fresh content.</p>
<p>Here is the proper format for PHProxy 5</p>
<p>User-agent: *<br />
Disallow: /index.php?</p>
<p>Just put that in your robots.txt file and upload it to the root of your server.</p>
<p>There are other ones for cgi proxy and glype.</p>
<p>If you&#8217;re worried you can&#8217;t do well in the search engines without stealing content, check out my site at <a href="http://www.proxybolt.com" rel="nofollow">http://www.proxybolt.com</a>. I guarantee you I am doing very well in the SEs and have a ton of visitors without having to steal.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bishan</title>
		<link>http://www.mattcutts.com/blog/crawl-caching-proxy/#comment-114152</link>
		<dc:creator>Bishan</dc:creator>
		<pubDate>Wed, 10 Oct 2007 05:13:15 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=282#comment-114152</guid>
		<description>Hi Matt,

I have a site_map.htm page on my site which contains links to all of the pages on my site. Google has indexed the page successfully, but when I look at the cached page it is months out of date. Google states that the page is a snapshot from a crawl on December 27th 2005, but this cannot be the case as the version shown is months old, i.e. I have updated the site map with several new pages but they aren&#039;t appearing on the cached version.

Any ideas on what the problem is? 

Thanks
Bishan</description>
		<content:encoded><![CDATA[<p>Hi Matt,</p>
<p>I have a site_map.htm page on my site which contains links to all of the pages on my site. Google has indexed the page successfully, but when I look at the cached page it is months out of date. Google states that the page is a snapshot from a crawl on December 27th 2005, but this cannot be the case as the version shown is months old, i.e. I have updated the site map with several new pages but they aren&#8217;t appearing on the cached version.</p>
<p>Any ideas on what the problem is? </p>
<p>Thanks<br />
Bishan</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Yuriy</title>
		<link>http://www.mattcutts.com/blog/crawl-caching-proxy/#comment-113526</link>
		<dc:creator>Yuriy</dc:creator>
		<pubDate>Wed, 26 Sep 2007 23:08:07 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=282#comment-113526</guid>
		<description>At last I found this information. Thanks a lot. It is part of my diploma now! :-) Any progress in the &quot;Bigdaddy&quot; field? There is not much information on the Internet about this. Thanks again.</description>
		<content:encoded><![CDATA[<p>At last I found this information. Thanks a lot. It is part of my diploma now! <img src='http://www.mattcutts.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  Any progress in the &#8220;Bigdaddy&#8221; field? There is not much information on the Internet about this. Thanks again.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Prem Kasera</title>
		<link>http://www.mattcutts.com/blog/crawl-caching-proxy/#comment-112454</link>
		<dc:creator>Prem Kasera</dc:creator>
		<pubDate>Mon, 10 Sep 2007 12:46:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=282#comment-112454</guid>
		<description>Hi Matt
I purchased an expired domain a few weeks ago and put up a web page that shows &quot;Under Maintenance&quot;; it has been live for over a month. The URL is http://www.scryypy.com. Whenever I check the Google cached version, it shows me the cache of the website http://designerpad.org/ (http://209.85.135.104/search?sourceid=navclient&amp;ie=UTF-8&amp;rlz=1T4SKPB_enIN235IN236&amp;q=cache:http%3A%2F%2Fwww.scryypy.com%2F)

Can you help me understand why this is and how I can resolve this problem?

Thanks</description>
		<content:encoded><![CDATA[<p>Hi Matt<br />
I purchased an expired domain a few weeks ago and put up a web page that shows &#8220;Under Maintenance&#8221;; it has been live for over a month. The URL is <a href="http://www.scryypy.com" rel="nofollow">http://www.scryypy.com</a>. Whenever I check the Google cached version, it shows me the cache of the website <a href="http://designerpad.org/" rel="nofollow">http://designerpad.org/</a> (<a href="http://209.85.135.104/search?sourceid=navclient&amp;ie=UTF-8&amp;rlz=1T4SKPB_enIN235IN236&amp;q=cache:http%3A%2F%2Fwww.scryypy.com%2F" rel="nofollow">http://209.85.135.104/search?sourceid=navclient&amp;ie=UTF-8&amp;rlz=1T4SKPB_enIN235IN236&amp;q=cache:http%3A%2F%2Fwww.scryypy.com%2F</a>)</p>
<p>Can you help me understand why this is and how I can resolve this problem?</p>
<p>Thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: SEO Expert Dubai</title>
		<link>http://www.mattcutts.com/blog/crawl-caching-proxy/#comment-111765</link>
		<dc:creator>SEO Expert Dubai</dc:creator>
		<pubDate>Tue, 28 Aug 2007 11:23:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=282#comment-111765</guid>
		<description>Hi Matt,

Thank you for clearing up this issue, which was really hard to understand from other webmasters&#039; points of view. Anyhow, it was clear and detailed information.

Thanks again.</description>
		<content:encoded><![CDATA[<p>Hi Matt,</p>
<p>Thank you for clearing up this issue, which was really hard to understand from other webmasters&#8217; points of view. Anyhow, it was clear and detailed information.</p>
<p>Thanks again.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
