<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: noindex test</title>
	<atom:link href="http://www.mattcutts.com/blog/noindex-test/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mattcutts.com/blog/noindex-test/</link>
	<description>neat fun stuff</description>
	<lastBuildDate>Fri, 06 Nov 2009 18:35:07 -0800</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: AjiNIMC</title>
		<link>http://www.mattcutts.com/blog/noindex-test/#comment-240711</link>
		<dc:creator>AjiNIMC</dc:creator>
		<pubDate>Mon, 09 Feb 2009 08:26:48 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=1279#comment-240711</guid>
		<description>Hi Matt,

As Bill mentioned
&gt;&gt; Matt, please see if your associates would enhance the URL removal tool to remove https pages or permit http or https to be used as the preferred domain.

Can we have a preferred-domain option for http and https in the webmaster console, or even a new parameter in robots.txt? It is sometimes very difficult to serve two robots.txt files (see seomoz.org/ugc/solving-duplicate-content-issues-with-http-and-https).

It would help a lot of people. I am looking for a reply, so please do consider answering; I would also appreciate being told when I have been answered :) (though I will keep following up under this blog post).

Big thanks for all your hard work Matt.

Regards,
Aji</description>
		<content:encoded><![CDATA[<p>Hi Matt,</p>
<p>As Bill mentioned<br />
&gt;&gt; Matt, please see if your associates would enhance the URL removal tool to remove https pages or permit http or https to be used as the preferred domain.</p>
<p>Can we have a preferred-domain option for http and https in the webmaster console, or even a new parameter in robots.txt? It is sometimes very difficult to serve two robots.txt files (see seomoz.org/ugc/solving-duplicate-content-issues-with-http-and-https).</p>
<p>It would help a lot of people. I am looking for a reply, so please do consider answering; I would also appreciate being told when I have been answered <img src='http://www.mattcutts.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  (though I will keep following up under this blog post).</p>
<p>Big thanks for all your hard work Matt.</p>
<p>Regards,<br />
Aji</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: YellowSEO</title>
		<link>http://www.mattcutts.com/blog/noindex-test/#comment-159602</link>
		<dc:creator>YellowSEO</dc:creator>
		<pubDate>Tue, 18 Nov 2008 20:03:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=1279#comment-159602</guid>
		<description>Couldn&#039;t help but notice the interesting results, seeing it indexed by all of Google, Ask, Live and Yahoo. It means that, as a best practice, a webmaster must double-check that robots.txt is not blocking the page if going to use the nofollow meta tag. But it still seems that just using robots.txt is the best method.</description>
		<content:encoded><![CDATA[<p>Couldn&#8217;t help but notice the interesting results, seeing it indexed by all of Google, Ask, Live and Yahoo. It means that, as a best practice, a webmaster must double-check that robots.txt is not blocking the page if going to use the nofollow meta tag. But it still seems that just using robots.txt is the best method.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bill</title>
		<link>http://www.mattcutts.com/blog/noindex-test/#comment-152557</link>
		<dc:creator>Bill</dc:creator>
		<pubDate>Thu, 13 Nov 2008 05:49:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=1279#comment-152557</guid>
		<description>Matt, please see if your associates would enhance the URL removal tool to remove https pages or permit http or https to be used as the preferred domain.

A client has a site, running on the worst CMS I have come across.  No root access is provided and it is hosted on a Windows server. :(  So I have no ability to add meta tags, can&#039;t serve separate robots.txt files, and can&#039;t even use 301&#039;s.  What a nightmare and the company providing this horrible service is not compelled to help.

If we had the ability to remove just the https pages, or set the preferred domain to http or https, this could quickly solve some problems and remove duplicate content from the index.  As it stands now, the homepage is indexed with https, http, http://www, and previously had http://www.domain/default.asp.  The client wonders why their homepage no longer has its first page listings in Google Search...

Have a good day Matt. :)</description>
		<content:encoded><![CDATA[<p>Matt, please see if your associates would enhance the URL removal tool to remove https pages or permit http or https to be used as the preferred domain.</p>
<p>A client has a site, running on the worst CMS I have come across.  No root access is provided and it is hosted on a Windows server. <img src='http://www.mattcutts.com/blog/wp-includes/images/smilies/icon_sad.gif' alt=':(' class='wp-smiley' />   So I have no ability to add meta tags, can&#8217;t serve separate robots.txt files, and can&#8217;t even use 301&#8217;s.  What a nightmare and the company providing this horrible service is not compelled to help.</p>
<p>If we had the ability to remove just the https pages, or set the preferred domain to http or https, this could quickly solve some problems and remove duplicate content from the index.  As it stands now, the homepage is indexed with https, http, <a href="http://www" rel="nofollow">http://www</a>, and previously had <a href="http://www.domain/default.asp" rel="nofollow">http://www.domain/default.asp</a>.  The client wonders why their homepage no longer has its first page listings in Google Search&#8230;</p>
<p>Have a good day Matt. <img src='http://www.mattcutts.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave (Original)</title>
		<link>http://www.mattcutts.com/blog/noindex-test/#comment-139078</link>
		<dc:creator>Dave (Original)</dc:creator>
		<pubDate>Fri, 31 Oct 2008 00:28:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=1279#comment-139078</guid>
		<description>&lt;blockquote&gt;I just want to check on how Yahoo/Live/Ask treat pages with noindex meta tags.&lt;/blockquote&gt;Yahoo at least does the same as Google: they ignore the tag, flex their muscle and index it. I guess might is right?</description>
		<content:encoded><![CDATA[<blockquote><p>I just want to check on how Yahoo/Live/Ask treat pages with noindex meta tags.</p></blockquote>
<p>Yahoo at least does the same as Google: they ignore the tag, flex their muscle and index it. I guess might is right?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: youfoundjake</title>
		<link>http://www.mattcutts.com/blog/noindex-test/#comment-138933</link>
		<dc:creator>youfoundjake</dc:creator>
		<pubDate>Thu, 30 Oct 2008 21:03:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=1279#comment-138933</guid>
		<description>So Matt, I guess the question is &quot;How can Google display similar pages to a page that has been blocked by robots.txt?&quot;</description>
		<content:encoded><![CDATA[<p>So Matt, I guess the question is &#8220;How can Google display similar pages to a page that has been blocked by robots.txt?&#8221;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tom</title>
		<link>http://www.mattcutts.com/blog/noindex-test/#comment-138880</link>
		<dc:creator>Tom</dc:creator>
		<pubDate>Thu, 30 Oct 2008 20:14:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=1279#comment-138880</guid>
		<description>The page isn&#039;t spidered because it&#039;s blocked, but it is added to the index (with a best guess at the page title) due to links. Take delicious.com as an example; the same thing is happening there: a meta noindex on the home page, but all files blocked by robots.txt. So if you search Google for &quot;delicious&quot;, the new home page shows up with no snippet or meta description because the page wasn&#039;t crawled. There is a page title associated with it, not because it actually is the page title, but because it&#039;s the domain name and it is used in anchor text (Google will attribute a page title to any page that doesn&#039;t have one, and it&#039;s often the name of the root).

If you search for &quot;delicious&quot; on MSN, the page doesn&#039;t come up, so it appears that MSN is accessing the file (despite what robots.txt says) and finding the noindex meta.

Yahoo! gives a page title and a description, which actually isn&#039;t in the meta description or found anywhere on the page. It&#039;s pulled from the Yahoo! directory listing of the site. Yahoo! tends to assign Y directory data to pages that don&#039;t have it (and oftentimes even when they do). So it would appear that Y also follows the robots.txt directive. Of course delicious.com is a Yahoo! property, so you could draw a different conclusion, but this is my thought on the subject.</description>
		<content:encoded><![CDATA[<p>The page isn&#8217;t spidered because it&#8217;s blocked, but it is added to the index (with a best guess at the page title) due to links. Take delicious.com as an example; the same thing is happening there: a meta noindex on the home page, but all files blocked by robots.txt. So if you search Google for &#8220;delicious&#8221;, the new home page shows up with no snippet or meta description because the page wasn&#8217;t crawled. There is a page title associated with it, not because it actually is the page title, but because it&#8217;s the domain name and it is used in anchor text (Google will attribute a page title to any page that doesn&#8217;t have one, and it&#8217;s often the name of the root).</p>
<p>If you search for &#8220;delicious&#8221; on MSN, the page doesn&#8217;t come up, so it appears that MSN is accessing the file (despite what robots.txt says) and finding the noindex meta.</p>
<p>Yahoo! gives a page title and a description, which actually isn&#8217;t in the meta description or found anywhere on the page. It&#8217;s pulled from the Yahoo! directory listing of the site. Yahoo! tends to assign Y directory data to pages that don&#8217;t have it (and oftentimes even when they do). So it would appear that Y also follows the robots.txt directive. Of course delicious.com is a Yahoo! property, so you could draw a different conclusion, but this is my thought on the subject.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: guy</title>
		<link>http://www.mattcutts.com/blog/noindex-test/#comment-138062</link>
		<dc:creator>guy</dc:creator>
		<pubDate>Thu, 30 Oct 2008 04:07:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=1279#comment-138062</guid>
		<description>lol, page is indexed, so noindex won&#039;t work :D</description>
		<content:encoded><![CDATA[<p>lol, page is indexed, so noindex won&#8217;t work <img src='http://www.mattcutts.com/blog/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave (Original)</title>
		<link>http://www.mattcutts.com/blog/noindex-test/#comment-138015</link>
		<dc:creator>Dave (Original)</dc:creator>
		<pubDate>Thu, 30 Oct 2008 03:00:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=1279#comment-138015</guid>
		<description>&lt;blockquote&gt;Dave, but the noindex shouldn’t be spidered since it’s blocked by robots.txt.&lt;/blockquote&gt;I completely agree and have mentioned this to Matt, about robots.txt and noindex, in his post ASKING for opinions on the subject.

Unfortunately Google is a GIANT that often gets its own way, even when it&#039;s morally wrong.</description>
		<content:encoded><![CDATA[<blockquote><p>Dave, but the noindex shouldn’t be spidered since it’s blocked by robots.txt.</p></blockquote>
<p>I completely agree and have mentioned this to Matt, about robots.txt and noindex, in his post ASKING for opinions on the subject.</p>
<p>Unfortunately Google is a GIANT that often gets its own way, even when it&#8217;s morally wrong.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter (IMC)</title>
		<link>http://www.mattcutts.com/blog/noindex-test/#comment-137741</link>
		<dc:creator>Peter (IMC)</dc:creator>
		<pubDate>Wed, 29 Oct 2008 20:25:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=1279#comment-137741</guid>
		<description>huh? No spam protection anymore?</description>
		<content:encoded><![CDATA[<p>huh? No spam protection anymore?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter (IMC)</title>
		<link>http://www.mattcutts.com/blog/noindex-test/#comment-137738</link>
		<dc:creator>Peter (IMC)</dc:creator>
		<pubDate>Wed, 29 Oct 2008 20:24:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.mattcutts.com/blog/?p=1279#comment-137738</guid>
		<description>So Matt, how are the others treating the noindex? G, Y and M haven&#039;t indexed it.

What else can you tell us about it?</description>
		<content:encoded><![CDATA[<p>So Matt, how are the others treating the noindex? G, Y and M haven&#8217;t indexed it.</p>
<p>What else can you tell us about it?</p>
]]></content:encoded>
	</item>
</channel>
</rss>
