Indexing timeline

Heh. I wrote this hugely long post, so I pulled a Googler aside and asked “Dan, what do you think of this post?” And after a few helpful comments he said something like, “And, um, you may want to include a paragraph of understandable English at the top.” πŸ™‚

Fair enough. Some people don’t want to read the whole mind-numbingly long post while their eyes glaze over. For those people, my short summary would be two-fold. First, I believe the crawl/index team certainly has enough machines to do its job, and we definitely aren’t dropping documents because we’re “out of space.” The second point is that we continue to listen to webmaster feedback to improve our search. We’ve addressed the issues that we’ve seen, but we continue to read through the feedback to look for other ways that we could improve.

People have been asking for more details on “pages dropping from the index” so I thought I’d write down a brain dump of everything I knew about, to have it all in one place. Bear in mind that this is my best recollection, so I’m not claiming that it’s perfect.

Bigdaddy: Done by March

– In December, the crawl/index team was ready to debut Bigdaddy, a software upgrade of our crawling and parts of our indexing.
– In early January, I hunkered down and wrote tutorials about url canonicalization, interpreting the inurl: operator, and 302 redirects. Then I told people about a data center where Bigdaddy was live and asked for feedback.
– February was pretty quiet as Bigdaddy rolled out to more data centers.
– In March, some people on WebmasterWorld started complaining that they saw none of their pages indexed in Bigdaddy data centers, and were more likely to see supplemental results.
– On March 13th, GoogleGuy gave a way for WMW folks to give example sites.
– After looking at the example sites, I could identify the issue in a few minutes. The sites that fit the “no pages in Bigdaddy” criteria were sites where our algorithms had very low trust in the inlinks or the outlinks of that site. Examples that might cause that include excessive reciprocal links, linking to spammy neighborhoods on the web, or link buying/selling. The Bigdaddy update is independent of our supplemental results, so when Bigdaddy didn’t select pages from a site, that would expose more supplemental results for that site.
– I worked with the crawl/index team to tune thresholds so that we would crawl more pages from those sorts of sites (a toy sketch of the threshold idea follows this timeline).
– By March 22nd, I posted an update to let people know that we were crawling more pages from those sorts of sites. Over time, we continued to boost the indexing even more for those sites.
– By March 29th, Bigdaddy was fully deployed and the old system was turned off. Bigdaddy has powered our crawling ever since.
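
For the curious, here’s a toy sketch of what that kind of threshold tuning could look like, in Python. To be clear, this is a cartoon of the idea and nothing more – it is not our actual code, and the function, names, and numbers are all made up for illustration.

def pages_to_crawl(link_trust, site_pages, threshold=0.5):
    # link_trust in [0, 1]: sites below the threshold get only a
    # token crawl; sites above it get a trust-scaled allotment.
    if link_trust < threshold:
        return min(site_pages, 10)
    return int(site_pages * link_trust)

# Lowering the threshold admits more pages from low-trust sites:
print(pages_to_crawl(0.4, 20000, threshold=0.5))  # 10
print(pages_to_crawl(0.4, 20000, threshold=0.3))  # 8000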

Considering the amount of code that changed, I consider Bigdaddy pretty successful in that I only saw two complaints. The first was the one I mentioned above, where we didn’t index pages from sites with less-trusted links, and we responded and started indexing more pages from those sites pretty quickly. The other complaint I heard was that pages crawled by AdSense started showing up in our web index. The fact that Bigdaddy provided a crawl caching proxy was a deliberate improvement in crawling, and I was happy to describe it in PowerPoint-y detail on the blog and at WMW Boston.

Okay, that’s Bigdaddy. It’s more comprehensive, and it’s been visible since December and 100% live since March. So why the recent hubbub? Well, now that Bigdaddy is done, we’ve turned our focus to refreshing our supplemental results. I’ll give my best recollection of that timeline too. Around the same time, there was speculation that our machines were full. From my personal perspective in the quality group, we certainly have enough machines to crawl/index/serve web results; in fact, Bigdaddy is more comprehensive than our previous system. Seems like a good time to throw in a link to my disclaimer right here to remind people that this is my personal take.

Refreshing supplemental results

Okay, moving right along. As I mentioned before, once Bigdaddy was fully deployed, we started working on refreshing our supplemental results. Here’s my timeline:
– In early April, we started showing some refreshed supplemental results to users.
– On April 13th, someone started a thread on WMW to ask about having fewer pages indexed.
– On April 24th, GoogleGuy gave a way for people to provide specifics (WebmasterWorld, like many webmaster forums, doesn’t allow people to post specific site names.)
– I looked through the feedback and didn’t see any major trends. Over the next week, I gave examples to the crawl/index team. They didn’t see any major trend either. The sitemaps team investigated until they were satisfied that it had nothing to do with sitemaps either.
– The team refreshing our supplemental results checked out feedback, and on May 5th they discovered that a “site:” query didn’t return supplemental results. I think they had a fix out for that the same day. Later, they noticed that a difference in the parser meant that site: queries didn’t work with hyphenated domains (a hypothetical sketch of that class of bug follows this timeline). I believe they got a quick fix out soon afterwards, with a full fix for site: queries on hyphenated domains in supplemental results expected this week.
– GoogleGuy stopped back by WMW on May 8th to give more info about site: and get any more info that people wanted to provide.
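
For the curious, here’s a hypothetical sketch (in Python) of the class of parser bug described above. This is not our actual query parser – it just shows how a tokenizer that treats a hyphen as a word separator can break site: queries on hyphenated domains.

def parse_site_query_buggy(query):
    # Splitting on hyphens turns "site:my-domain.com" into
    # "my" + "domain.com", so the site: restrict never matches
    # the real hostname.
    token = query.split()[0]
    host = token[len("site:"):]
    return host.replace("-", " ").split()[0]

def parse_site_query_fixed(query):
    # Hyphens are legal in hostnames, so keep them intact.
    token = query.split()[0]
    return token[len("site:"):]

print(parse_site_query_buggy("site:my-domain.com"))  # "my"
print(parse_site_query_fixed("site:my-domain.com"))  # "my-domain.com"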

Reading current feedback

Those are the issues that I’ve heard of with supplemental results, and those have been resolved. Now, what about folks who are still asking about fewer pages being reported from their site? As if this post isn’t long enough already, I’ll run through some of the emails and give potential reasons that I’ve seen:

– The first site is a .tv domain about real estate in a foreign country. On May 3rd, the site owner said that they have about 20K properties listed but that they had dropped to 300 pages. When I checked, a site: query showed 31,200 pages indexed, and the example url they mentioned is in the index. I’m going to assume this domain is doing fine now.

– Okay, let’s check one from May 11th. The owner sent only a url, with no text or explanation at all, but let’s tackle it. This is also a real estate site, this time about an Eastern European country. I see 387 pages indexed currently. Aha, checking out the bottom of the page, I see this:
[Image: screenshot of poor-quality links at the bottom of the page]
Linking to a free ringtones site, an SEO contest, and an Omega-3 fish oil site? I think I’ve found your problem. I’d think about the quality of your links if you’d prefer to have more pages crawled. As these indexing changes have rolled out, we’ve been improving how we handle reciprocal link exchanges and link buying/selling.

– Moving right along, here’s one from May 4th. It’s another real estate site. The owner says that they used to have 10K pages indexed and now they have 80. I checked out the site. Aha:
[Image: another screenshot of poor-quality links]
This time, I’m seeing links to mortgage sites, credit card sites, and exercise equipment. I think this is covered by the same guidance as above: if you were getting crawled more before and you’re trading a bunch of reciprocal links, don’t be surprised if the new crawler has different crawl priorities and doesn’t crawl as much.

– Someone sent in a health care directory domain. It seems like a fine site, and it’s not linking to anything junky. But it only has six links to the entire domain. With that few links, I can believe that out toward the edge of the crawl, we would index fewer pages. Hold on, digging deeper. Aha: the owner said that they wanted to kill the www version of their pages, so they used the url removal tool on their own site. I’m seeing that you removed 16 of your most important directories from Oct. 10, 2005 to April 8, 2006. I covered this topic in January 2006:

Q: If I want to get rid of domain.com but keep www.domain.com, should I use the url removal tool to remove domain.com?
A: No, definitely don’t do this. If you remove one of the www vs. non-www hostnames, it can end up removing your whole domain for six months. Definitely don’t do this. If you did use the url removal tool to remove your entire domain when you actually only wanted to remove the www or non-www version of your domain, do a reinclusion request and mention that you removed your entire domain by accident using the url removal tool and that you’d like it reincluded.
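
The cleaner way to consolidate www vs. non-www is a permanent redirect on the server, not the url removal tool. As a minimal sketch (the hostname is hypothetical, and in practice you’d set this up in your web server’s config rather than in Python), the idea is to answer every request on the unwanted hostname with a 301 pointing at the version you want to keep:

from http.server import BaseHTTPRequestHandler, HTTPServer

CANONICAL_HOST = "www.example.com"  # the version you want to keep

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # A 301 tells crawlers the move is permanent, so the kept
        # hostname inherits the old urls instead of losing them.
        self.send_response(301)
        self.send_header("Location", "http://%s%s" % (CANONICAL_HOST, self.path))
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 80), RedirectHandler).serve_forever()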

You didn’t remove your entire domain, but you removed all the important subdirectories. That self-removal just lapsed a few weeks ago. That said, your site also has very few links pointing to you. A few more relevant links would help us know to crawl more pages from your site. Okay, let’s read another.

– Somebody wrote about a “favorites” site that sells T-shirts. The site had about 100 pages, and now Google is showing about five pages. Looking at the site, the first problem I see is that only 1-2 domains have any links at all to you. The person said that every page has original content, but every link that I clicked was an affiliate link that went to the site that actually sold the T-shirts. And the snippet of text that I happened to grab was also taken from the site that actually sold the T-shirts. The site has a blog, which I’d normally recommend as a good way to get links, but every link on the blog is just an affiliate link. The first several posts didn’t even have any text, and when I found an entry that did, it was copied from somewhere else. So I don’t think that the drop in indexed pages for this domain necessarily points to an issue on Google’s side. The question I’d be asking is why anyone would choose your “favorites” site instead of going directly to the site that sells the T-shirts.

Closing thoughts

Okay, I’ve got to wrap up (longest. post. evar). But I wanted to give people a feel for the sort of feedback that we’ve been getting over the last few days. In general, several domains I’ve checked have more pages reported these days (and overall, Bigdaddy is more comprehensive than our previous index). Some folks that were doing a lot of reciprocal links might see less crawling. If your site has very few links, to the point that you’d be on the fringe of the crawl, then it’s relatively normal that changes in the crawl may change how much of your site we crawl. And if you’ve got an affiliate site, it makes sense to think about the amount of value-add that your site provides; you want to provide a reason why users would prefer your site.

In March, I was able to read feedback and identify an issue to fix in 4-5 minutes. With the most recent feedback, we did find a couple of ways that we could make site: more accurate, but despite having several teams (quality, crawl/index, sitemaps) read the remaining feedback, we’re seeing more of a grab-bag of feedback than any burning issues. Just to be clear, I’m not saying that we won’t find other ways to improve. Adam has been reading and replying to the emails and collecting domains to dig into, for example. But I wanted to give folks an update on what we were seeing with the most recent feedback.

928 Responses to Indexing timeline

  1. Damn Ringtone People!!!

  2. Hi Matt

    Thanks for the much needed detailed update.

    I do hope that you, GG, and later Adam (when he feels ready) will post more of the same, and more often than you are doing now.

    IMO, it’s not enough for Google to tell us that they are listening. We need them to talk to us too, i.e. communicate 😀

    Once again, thanks Matt. I know you must also be busy preparing for the vacation.

  3. Wow, looks like someone is going to have a short interview today πŸ˜› Thanks for the update Matt.

  4. Yawn !!!

    After the past 12 months of Google messing about and still no better results … I’ve completely learned how to live without you.

    Best wishes, You’re gonna need it

  5. Every time someone asks a novice question in google groups while at the same time saying that google s-u-c-k-s I will refer them to this post.

    Is adam bot or human? πŸ™‚

    Thanks Matt.

  6. Thank you Matt for the update. I really appreciate you finally using some real estate sites as examples. Since this is an indexing issue I thought I would bring it up.

    After checking the logs today I noticed this coming from Google pertaining to our site.

    http://www.google.it/search?hl=it&q=fistingglessons&btnG=Cerca+con+Google&meta=

    LOL, now as you can see, the #2 site is a real estate site listed for this search term. The page showing for this search is a property description page. As you can tell from the site’s description, it has nothing to do with this subject matter. Would you mind checking with the index team to see why this would be indexed for such a phrase?

    On a side note it would be nice to see more examples of real estate sites used in the future. Thanks again for the update.

  7. Great post Matt. That really clears up a few things about how Bigdaddy works. Still seems like it is responding very slowly and I find that large companies are getting ahead of smaller sites for local terms even though they are not located in the same country. But that’s mostly because of my own business gripes πŸ˜‰

    Keep up the great posting.

  8. Great post Matt, thanks for putting in the effort to explain what’s been going on.

    I have a quick question – how long is it taking these days for Google to index new pages? I added a forum to my site a couple of months ago, and while it doesn’t have many deep links from external domains, it is linked to pretty well from within my site and is in my submitted sitemap. Google seems to be crawling it quite enthusiastically. However, none of it’s showing up in the index with a site: search despite the intensive crawling and waiting about a month. Does this mean that Google doesn’t think my forum is worth indexing? πŸ™

  9. Yeah, blame this disaster on webmasters, Google can’t index the web properly and it is the fault of webmasters working bad links?

    Funny that those that are running the biggest links scams on the net are ranking great Matt?

    Explain that one, will ya ???

    Where are the indexed pages Matt, do they just disappear, do you have an answer for all of us or are we all using linking scams?

  10. Thanks everybody. I’m glad that I sat down and got all this down. Yup Mike, I figured if I could get this post out before I talked to Danny, then we could just sit around and shoot the breeze. πŸ™‚

    Danny: So, how’s life?
    Matt: Not bad. How are you doing?
    Danny: Pretty good, pretty good. πŸ™‚ So how ’bout those Reds?
    Matt: The communists??
    Danny: No, the Cincinnati Reds!
    Matt: There’s communists in Cincinnati!?!?!

  11. Sina, it’s by design that Bigdaddy crawls somewhat more than we index. If you index everything that you crawl, you never know what you might be missing by crawling a little more, for example. I see at least one indexed post from your forum, so the fact that we’ve been visiting those pages is a good indicator that we’re aware of them, and they may be incorporated into the index in the future.

  12. Great post Matt! Good job. Nice to hear some more detailed feedback.

    Hey, can you answer this for me? Finally we have been seeing some improvement in the indexing of our site. I have seen other webmasters mention the same occurrence: indexing down to about level-3 pages and that is it. Although deeper pages are being crawled (level 4+), they just don’t want to stick very long in the index. Linking a bit higher can get them to stick (turning them into level 3 and 2), but that’s just impossible to do with a lot of content. Is this something that will correct itself in time? We have PLENTY of links at all levels, so I don’t see this as a huge problem. Pretty much looking for reassurance to sit tight.

  13. I read the two real estate site examples hoping one was mine, but neither applied to me. My real estate site only has outbound links to home builders, so I doubt this should qualify as spam.

    It still seems to me that you are blaming this on penalties, which I’m fine with, but why would you crawl my site thoroughly on a weekly basis, then never put the results in the index? This has been happening for 2 months now.

  14. Hello Matt

    Thanks for the information.
    “Bigdaddy: Done by March” – is it really true? If so, I do not understand why there are still different search results between
    http://66.249.93.104/ and http://64.233.179.104/
    Please could you give us more details? It’s confusing.
    Where is Bigdaddy, really?

    Thanks for your reply.

  15. Thanks for a very informative post. Just one quick question though: is there ever a time when link exchanges are considered legitimate? Maybe even an example of such a case? It’s easy to tell the irrelevant link exchanges, but there have to be some instances where, say, a … real estate agent exchanges links w/ a … local moving company.

    Can you comment on this?

  16. HA!!!

    To celebrate this new information I deleted an old directory that was hanging off my most valued website. It made an awful shriek as I removed the database. In the coming weeks there will be a few autoemails asking “where is my link”??? and I will reply, “you will not drain my power anymore, die die!!!

    (ok enough of this Matt Cutts fellah for today, I got work to do, how about you?)

    πŸ™‚

  17. Hi Matt,

    Thanks for the post. Problem is… none of your explanations seem to fit my site. I’m trying to maintain a straight ship in a dirty segment. My links have been accumulated by forming relationships with related sites (thus I’m building links a bit slower than straight link exchange would allow). My content is most certainly provided to educate the visitor. My affiliate linkage is quite low. But yet my pages seem to continue dropping and supplementals are increasing.

    Thanks for reading this,
    jim

  18. Matt, thank you for the explanation about Bigdaddy. But I have checked my websites for the points you just wrote down, and I can’t find any of them for my site.

    I have quite a lot of backlinks. I don’t link to crappy sites, and still my indexed page count moves like a wave.

    On Monday I can have 800,000 pages indexed, on Tuesday 350,000, then back to 600,000, down to 400,000. The difference is way too big. And we had over a million records.

    I also submitted a reinclusion request, but we never heard back or saw any changes. My domain name is techzine.nl; I have www, forum, babes, msn and pricecheck.techzine.nl in use.

    We did have some problems in the past. I e-mailed Google about it a couple of times but never got an answer.

    We changed the domain name of the website from tweakzone.nl to techzine.nl (October 2005). We forwarded it with a 302 (stupid); I found that out later and changed it to a 301 (permanent) redirect. Now I am still trying to get the whole tweakzone.nl domain out of Google and get techzine.nl indexed correctly. We asked many many webmasters to update their links and that worked. Our HTML code is by the book. But still we are not being indexed as we were. I’m running out of ideas and options to fix this. Can you explain to me what I am doing wrong? I have been reading SEO sites, webmasterworld.com, and Google guidelines for months now and I can’t figure out what I’m doing wrong…

    Kind Regards,

    Coen

  19. Strange how you ignored comments before, and now you have decided to respond.

    Unfortunately, the serps have become absolute trash, so the changes have failed, and I see more spam sites doing well than before.

  20. Thank you for the timeline.

    I find it rather frustrating to follow how your timeline basically outlines how everything is working just as it should, and then watch pages display as regular one day, supplemental the next, a week later regular, and then back to supplemental. Searchable as a regular listing, completely unsearchable as a supplemental.

    Good to hear you guys have plenty of machines with plenty of room. Perhaps someone should inform the CEO.

    I look forward to you finding other ways to improve.

    Dave

  21. Please, please, please delete all of the old supplemental results! I think if you took a poll, you would find very few webmasters (or end users) who actually value any of those old junk pages (many of which do not even exist anymore).

    I have even used the URL removal tool in the past – but those old pages just keep coming back!

  22. I don’t think Mr. Cutts meant that the mortgage sites, credit card sites, and exercise equipment sites were junk; most likely that they were unrelated.

    Now, I don’t think it’s fair to penalize a site for linking to an “unrelated” site, since many webmasters link to their other websites etc. Links being devalued because they’re coming from an unrelated page would be more fair.

    And what’s the deal with reciprocals? Although I rarely do them (time related), I don’t think it’s unfair. A vote is a vote right? Even if two people vote for each other. As long as it’s not automotive I don’t see why it would be a problem…

    What about the impact of getting a bunch of unrelated inbound links to your site? Imagine if someone used a linking scheme to point hundreds, or thousands, of links at your domain? All those links from “unrelated” or “junk” sites would surely put a hurting on you. Not fair.

  23. I agree that reciprocal link directories should be removed as they are link farms, so Google is doing the right thing there!

    Some reciprocal linking is natural though, and sites should only be affected if reciprocals make up a high percentage of their total links.

  24. [quote]Google should NEVER * NEVER * even entertain the idea of deciding what Products or Services are β€œJUNK”

    This is a recipe for disaster, and extremely arrogant.

    What gives any search engines the right to decide that someone’s business category is β€œJUNK”. This would be analogous to Yahoo Directory or DMOZ devaluing certain TYPES of products or services.
    [/quote]

    It ain’t that often you’ll see me stick up for Google, but Mr SEW, you are VERY wrong.

    Google can do what the hell they like with their search engine, cos it is THEIRS.
    If they want to devalue links in their algorithm, that’s their prerogative, cos the algo is THEIRS
    If they want to say certain business models are junk in their search engine then that is their right, cos the search engine is THEIRS

    You have exactly the same right. On YOUR web properties you can say and do what you want. If you want to link out via affiliate URLs you can as the web site is YOURS.
    If you want to buy or sell links, you can as the web site is YOURS.

    When all is said and done when you own something it is up to you what you do with it. Google is no different with whatever it decides to stick anywhere on its domains than you or I am with mine.

    Personally I think Google makes lots of mistakes. I also believe many webmasters do too, myself included, but they are our mistakes to make the way we see fit at the time.

    I’m happy with what I do and I am sure Google are happy with what they do. Personally I am going to carry on trying to beat Matt and his team at Google and I am pretty sure he and his team will carry on trying to beat me.

    He wins some, I win some but therein lies the nature of the web. On his site he can do what he wants. On my site I can do what I want. I suggest you, Mr SEW do the same πŸ™‚

  25. Damn … great summary Matt … the “other Matt” must be saying “gulp” trying to follow that act while you are gone. And yea, what are you going to talk about in a couple of hours on the radio show?

    BTW, here’s an oddball corner case that I would classify as a bug – one of your favorite subjects – redirects! πŸ˜‰

    So URL1 ranked well for keyphrase1. The SERPs show a title, some text, and a URL. A (legit) 302 (temporary) redirect was set up to URL2. After a few days, the SERPs for keyphrase1 showed URL2 but were still using the title tag from URL1. The “other text” is pulled from URL2. Looking at the cache, it is all URL2. This persisted for several days – it looked pretty darn funny in the SERPs, since the URL2 title tag had nothing to do with keyphrase1.

    I think (?) the correct behavior would be that if you are going to show a URL in the SERPs, you should show the title/text associated with that page … but in this case, some part of the indexing machine got confused by the redirects and the title1 piece got left in even though URL2 was displayed.

    Email me if you want more info, but you should easily be able to set up a test case based on that description. BTW, Yahoo has a similar bug in the SERPs (I forget how MSN handled it), so it’s not just the big “G” struggling with redirects.

  26. I had some clerical errors in my post above (automotive should be automated :), wish I could edit it… sorry.

  27. Hi Matt, great information as always. I have a question about this:

    if you were getting crawled more before and you’re trading a bunch of reciprocal links, don’t be surprised if the new crawler has different crawl priorities and doesn’t crawl as much.

    How might this impact the typical blog with a lengthy blogroll? Many people have blogs with lengthy blogrolls… and many of those sites in my blogroll end up linking back without it really being arranged as a reciprocal exchange.

    From what you are saying, it sounds like having a blogroll/recommended reading list isn’t a good idea.

  28. Doesn’t matter…..they don’t care about results. Bad results means more money for Adwords:)

    Microsoft will squash Google like it did Netscape. When Vista comes out….Google will fall.

  29. Matt. For me, that was the best post that you’ve ever posted here – by a very long way.

    I’m one of the people who has sites that are suffering right now. One of them is the site that we spoke about last year. It had a clean bill of health from you, and nothing has changed since then, and yet its pages are being dropped daily. Right now it’s down from a realistic 18k-20k pages to 9,350, but only around 500 of them are fully indexed – the rest are URL-only partials. Yesterday it had 11,700, but only ~600 of them were actually listed, and some of those were partials.

    From your post, I would say that the site fits the description of not having many trusted IBLs. Would that be correct? Reminder – http://www.holidays.org.uk

    To be honest, if it is correct, then I dislike it a lot. It would mean that it isn’t sufficient to have a decent and useful site any more to be fully indexed by Google, if the site has quite a lot of pages. It would mean that we have to run around getting unnatural IBLs just to be fully represented in the index, and unnatural IBLs are one thing that Google doesn’t want.

  30. Chris, I talked about this a couple comments above:
    http://www.mattcutts.com/blog/indexing-timeline/#comment-27002
    With Bigdaddy, it’s expected behavior that we’ll crawl some more pages than we index. That’s done so that we can improve our crawling and indexing over time, and it doesn’t mean that we don’t like your site.

    arubicus, typically the depth of the directory doesn’t make any difference for us; PageRank is a much larger factor. So without knowing your site, I’d look at trying to make sure that your site is using your PageRank well. A tree structure with a certain fanout at each level is usually a good way of doing it.
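
    If it helps, here’s the back-of-the-envelope arithmetic behind the tree advice (the numbers are illustrative only): with a fanout of f links per page, roughly f**d pages sit within d clicks of the root.

    fanout, pages = 100, 1000000
    depth, reached = 0, 1
    while reached < pages:
        # each extra click multiplies the reachable pages by the fanout
        reached *= fanout
        depth += 1
    print(depth)  # 3 -- a million pages, each within three clicks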

    Ronald R, I’ve got a finite amount of time. πŸ™‚ I spent a large chunk of Saturday writing this up, but I don’t have time to respond to every comment. I wish I did. But improving quality is an ongoing process; if you see spam, I’d encourage you to do a spam report so we can check it out.

    CrankyDave, the supplemental results are typically refreshed less often than the main results. If your page is showing up as supplemental one day and then as a regular result the next, the most likely explanation is that your page is near the crawl fringe. When it’s in the main results, we’ll show that url. If we didn’t crawl the url to show in the main results, then you’ll often see an earlier version that we crawled in the supplemental results. Hope that helps explain things. BTW, CrankyDave, your site seems like an example of one of those sites that might have been crawled more before because of link exchanges. I picked five at random and they were all just traded links. Google is less likely to give those links as much weight now. That’s the simple explanation for why we don’t crawl you as deeply, in my opinion.

    Brian M, I’ve passed that sentiment on. I believe that folks here intend to refresh all of the supplemental results over the summer months, although I’m not 100% sure.

  31. How about a tool so that we know who we should be linking to or not?

    I see spammers in the google index. Maybe they should get penalized down to a PR of 3 for linking to a bad neighborhood! LOL. Just kidding.

    I guess you just may as well nofollow every external link just in case.

  32. Yes, a good example of this is our link-backs here: I linked to this blog entry from my forums, and my link here goes back to the forum!

    Is this what Google is going to take out or are you looking for a high concentration of reciprocal links Matt?

  33. The problem with this post is that most of us would have identified the spam examples that you listed, and yet most of us still don’t understand what has been happening to our sites – in our case, going from 20,000 pages indexed to fewer than 100.

    You had indicated that there were only a “double-digit number” of emails sent to the bostonpub address and that someone was going through them over a week ago already. Today, you also stated that someone was still going through them. We did send an email and we still have not received a reply. Based on the most recent thread on WMW, it looks like we are not the only ones.

    Real answers would help.

    Many small businesses are suffering from these massive de-listings. It is not a light subject for us. From our point of view, Bigdaddy has not been “pretty successful”, and general replies are a bit short on comfort at this point.

  34. Nice post Matt. Very informative and not at all too long.

    Shoemoney – Was that one of your ringtones sites?

  35. “arubicus, typically the depth of the directory doesn’t make any difference for us; PageRank is a much larger factor. So without knowing your site, I’d look at trying to make sure that your site is using your PageRank well. A tree structure with a certain fanout at each level is usually a good way of doing it.”

    Thanks MATT!

    I think PR is the factor, but nothing is trickling down from the home page. (Backlinks for the homepage reported by Google are completely ?????)

    We keep the most logical structure you could possibly have: a pyramid structure drilling down to the articles, with articles linking to related articles. Googlebot just does not like level 4+. If PR is a factor (I thought it now updates continuously), I am not sure why it does not filter down (besides, I have no clue if it actually does, since what is shown on the toolbar may not be accurate).

  36. Jason Duke, I did another pass to mark all SEW links as spam. Gotta muck around and delete SEW from my user database. πŸ™‚

    Anthony Cea, I gave a quick example above. Someone was complaining about their pages being supplemental, but that’s the effect, not the cause. The right question is “Why aren’t as many of my pages showing in Google’s main results?” I picked five links to the domain at random and they were all reciprocal links. My guess is that’s the cause. I mentioned that example because CrankyDave still has an open road ahead of him; he just needs to concentrate more on quality links instead of things like reciprocal links if he wants to get more pages indexed. (Again, in my opinion. I was just doing a quick/dirty check.)

    Valentine, I made the links I showed an image so no one would feel the need to go digging into actual sites. πŸ™‚

  37. What if our problem isn’t crawling so much as seeing those pages indexed at all? I have checked the supplemental index and haven’t seen them there either, but I have seen Googlebot crawling the pages.

    P.S. Is there an email address I should send to asking about this, and if so, where?

  38. OK Matt, so what you are saying is that we should produce great content and hope we get linked to because of the value of the page!

    But when is Google going to get real about schemes to game the engine so that natural links that are earned are rewarded?

  39. Matt… I have previously reported spam, and not in my sector. But nothing happens, so in the end I just gave up.

    I’m wondering how you gain relevant links, in some sectors, without reciprocating, or paying? Do you believe that rivals would give you a free one way link, lol?

  40. @Matt:

    Some days I really wonder why you even post to your blog at all, lol. It seems that for every 1 legitimate query there are 10 others holding you personally accountable/responsible for their serp/penalty/crappy result.

    I mean really… if the amount of Q&A here was T&A, a team of plastic surgeons couldn’t wipe the grin off your face 🙂

    anyway …. “my site is getting crappy results and no traffic …” its your fault and Google sucks… LOL not really… but I want to get in on the fun too !

  41. Dear Matt, thank you for explaining Google’s view of link exchanges to us.

    We dropped low-quality link exchanges months ago and now carry on only with high-quality links. We’ve added tons of new and unique stuff to our site, but the crawler does not crawl much, and the site is rated low. One year ago it was on top of many competitive searches.

    Is it possible to overcome this bad backlink reputation? It’s almost impossible to get rid of low-quality links once they are there. Do you have any advice for sites like ours?

  42. I have a small site that offers a free downloadable tool. So I registered a sitemap and waited… some months. Still not indexed. Every day the bot visits, picks up the sitemap, then the index page, then the download exe (which is about 3.5M). Any idea why the bot should try to spider exe files?

    I needed a slightly different version of the tool for a specific audience, so I registered a new domain and copied the site with minor changes. I did not register a sitemap because I wasn’t particularly bothered whether it was indexed or not. The new site was indexed in a week or so, and now has a PR of 4. The original, near-identical site: still not indexed.

    The original site has been in Yahoo and MSN for months….

  43. I don’t blame Google for dumping on webmasters that try to game the engine with manufactured links, purchased links, traded links, links from reciprocal link farm directories, and so on. This is good in the long term if they can index the web properly while taking these things into consideration!

  44. Better late than never πŸ™‚ Thanks Matt, you put my mind to rest on a lot of issues

  45. I cannot wait to forward this to my mortgage lender, who asked me just the other day,
    “You work in SEO – any idea why I’ve lost so many of my pages in Google?”
    Your explanation sounds so much nicer and more official than… “It could be because your website has a bunch of crap in it, and on it, and connected to it.”

    BTW- “It could be because your website has a bunch of crap in it, and on it, and connected to it” is an accurate analysis for many of the mortgage and realtor sites who do not rank well on Google right now.

  46. Personally I don’t care about where my site ranks. I believe rankings would happen naturally if you serve your visitors well.

    What many of us DO care about is having equal treatment as any other website owner, large or small, as well as equal opportunity. Spammers should not be there while legit sites that should be there are not being indexed for some reason. I believe that it is healthy to get a bit of feedback from and give feedback to Google so that such equal opportunities can exist.

  47. Matt, thanks for the information… but it doesn’t help me at the moment! My most important pages just aren’t getting indexed, though they are getting crawled. We have a really useful website with thousands of members, but it seems that only Google thinks it’s not good enough! Any advice would be greatly appreciated.

  48. Anthony Cea, you’ve got some people who were relying on reciprocal linking or link buying complaining specifically that they’re not crawled as much. So as far as “when is Google going to get real about schemes to game the engine so that natural links that are earned are rewarded,” I think that we’re continually making progress on judging which links are higher-quality.

    Ronald R, we’ve been checking spam reports more closely lately. You ask “I’m wondering how you gain relevant links, in some sectors, without reciprocating, or paying? Do you believe that rivals would give you a free one way link, lol?” My answer is that trying to force your way up to the top of search engines is in many ways not working in the most efficient way. To the degree that search engines reflect reputation on the web, the best way to gather links is to offer services or information that attract visitors and links on your own. Things like blogs are a great way to attract links because you’re offering a look behind the curtain of whatever your subject is, for example.

    Mike B, I’ve talked to the sitemaps folks a lot. Having a sitemap for your site should *never* hurt your domain. On the other hand, don’t expect that just listing a sitemap is enough to get a domain crawled. If no one ever links to your site, that makes Googlebot less likely to crawl your pages.

    That’s a very concise way to say it, Bob Rains, although a lot of the variation I see also comes when someone’s domain is hardly linked at all. The fringe of the crawl is where you’re likely to see the most variation, while a site like cnn.com with tons of links/PageRank is much less likely to go uncrawled.

    It’s funny, because most people understand that on a SERP there are 10 results, and if one webmaster is unhappy because they dropped out of the top 10, then some other webmaster is happy that they have joined the top 10. In the same way, we have a finite amount of crawling that we can do as well. Bigdaddy is deeper, but we still have to make choices about whether to crawl more from site A or site B.

    Well said, arubicus. Adam recently sent me 5-6 sites that he thinks we could do a better job of crawling, for example. So I wanted to give people an update of how things looked right now, but we’ll keep looking for ways to improve crawling and indexing and ranking.

  49. Hi All

    Anybody wish to say hello to our new friend Adam_Lasnik of Google Search Quality team πŸ˜€

  50. >Linking to a free ringtones site, an SEO contest, and an Omega 3 fish oil site? I think I’ve found your problem. I’d think about the quality of your links if you’d prefer to have more pages crawled.

    So is the conclusion that sites that are deemed “low quality” will also have “light crawling” correct?

  51. Thanks for the feedback Matt, I really appreciate it. Made me feel thoroughly warm and fuzzy inside :). Seriously, it’s really great to have people at Google who directly talk to webmasters and demystify things that can seem a bit unusual to outsiders. Keep up the great work!

  52. graywolf, it’s true that if you had N backlinks and some fraction of those are considered lower quality, we’d crawl your site less than if all N were fantastic. Hope that makes sense. Light crawling can also mean “we just didn’t see many links to your domain” as well though.

    Glad I could answer questions, Sina. It’s nice that I didn’t have any meetings this afternoon, so I could just hang and answer questions. Then I’ve got Danny in a half-hour or so. But that’s okay too. Maybe for some of the questions, I can just be like “Ah yes, Sina and I talked about this in paragraph 542. It helps us to crawl some more pages than we index so that we can see which pages might help us improve our crawl coverage in the future.” πŸ™‚

  53. Boy matt you have to have a vacation after all of these posts you are doing.

    “improve crawling and indexing and ranking.”

    I personally expect things to move from an SEO standpoint to more of a QUALITY standpoint, in which businesses and sites compete more on the QUALITY level rather than on the SEO level. I believe now (after what you mentioned) this is where you want us webmasters to compete (and probably always have). This push for quality will make this a WIN WIN WIN game for all of us.

  54. Yup, exactly, arubicus. There’s SEO and there’s QUALITY, and there’s also finding the hook or angle that captivates a visitor and gets word-of-mouth or return visits. First I’d work on QUALITY. Then there’s factual SEO – things like: are all of my pages reachable with a text browser from a root page without going through exotic stuff? Or having a site map on your site. After your site is crawlable, then I’d work on the HOOK that makes your site interesting/useful.
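
    If you want to sanity-check that reachability yourself, a minimal breadth-first link walk from your root page does the job. Here’s a rough sketch in Python – it assumes plain <a href> links on a single hypothetical host, and it’s not an official tool of any kind:

    import urllib.request, urllib.parse
    from html.parser import HTMLParser

    class LinkParser(HTMLParser):
        # Collects the href of every plain <a> tag on a page.
        def __init__(self):
            super().__init__()
            self.links = []
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def reachable_from(root, limit=500):
        host = urllib.parse.urlparse(root).netloc
        seen, queue = {root}, [root]
        while queue and len(seen) < limit:
            url = queue.pop(0)
            try:
                html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
            except Exception:
                continue  # skip pages that fail to fetch
            parser = LinkParser()
            parser.feed(html)
            for href in parser.links:
                absolute = urllib.parse.urljoin(url, href).split("#")[0]
                if urllib.parse.urlparse(absolute).netloc == host and absolute not in seen:
                    seen.add(absolute)
                    queue.append(absolute)
        return seen

    # Any page missing from this set has no plain-link path from the root:
    # print(len(reachable_from("http://www.example.com/")))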

  55. Yep, clear enough, and what I suspected, thanks.

  56. Matt, I have to agree with Joe and Anthony in that spanking webmasters for reciprocal links is often unfair. And I don’t have an intelligent suggestion on how to spot reciprocal link breeding facilities vs. honest, natural reciprocal links…at least not anything that can’t be instantly and easily “gamed”.

    My industry might be a good example to use to look at reciprocal linking, actually (it’s weddings & honeymoons). In this market, there are certainly a large number of blind link-exchangers out there, adding no value to the end user with their hydroponically engineered reciprocal link spaghetti. But on the other hand, a site like mine (honeymoon travel) might have pages that list a small number of recommended related businesses (e.g. half a dozen wedding coordinator companies in Hawaii…an online jeweler for rings…an association of wedding officiants…etc.). We list other wedding-related companies on our site with whom we’ve done business (and been happy with)…and naturally, many of them also recommend us on their sites. We each are happy to recommend other companies in our general industry whom we believe do a great job for our customers and yet don’t compete with us.

    Now, without thinking about algorithms, should this kind of link be very important in determining good sites to return to users?

    And what should one think about two companies where one thinks the other is great and links to them….but the feeling ISN’T mutual?

    So there’s my argument for SEs being VERY careful when it comes to designing algorithms to discredit or punish for reciprocal links. Yes, I realize that massive reciprocal linking campaigns are evil and manipulative, but there may be some baby parts being thrown out with this bathwater.

  57. Matt:

    I haven’t experienced the pages-dropping problem webmasters are attributing to Bigdaddy, but I have seen some behavior I would like to understand.

    Through the middle of April, our SERPs showed our homepage with the product page indented as the next item. It looked really great. Over the last month, the deep-linked pages no longer show up for some high-volume keywords, only the homepage.

    I won’t list the keywords in a blog, but if you want to look into it, I would be glad to provide a list. Alternatively, look at our sitemap page and you can see it for the 3rd, 4th, 6th and 7th terms listed (terms 1 and 2 are our brand name).

    Am I alone in seeing this or does it represent a trend?

    Thanks

  58. Matt,

    Something’s been eating at me…

    If link exchanges are frowned upon and buying links is a no-no, how is a new site supposed to ever be able to successfully enter a competitive space? It seems the only people who would be able to compete are very old sites (not necessarily the best) and people who maintain a zillion domains for interlinking purposes. Google seems to be placing an unfair barrier to entry UNLESS spammy tactics are employed.

    -Jim

  59. Circling back to folks who just had comments approved. Joe Hayes, it’s not that reciprocal links are automatically bad. It’s more that many reciprocal links exist for the wrong reasons. Here’s an email that I just got:

    Dear Site Owner,
    I am looking for quality link exchange partners for several of my
    sites. I have browsed http://www.mattcutts.com and it seems like a link exchange
    between our sites will benefit us both.
    If you are interested in doing a link exchange between our sites, I
    would be glad to hear any offer you might have.

    In general I will give back a link from a page with the same PR rating
    of the page I will be given.

    If you own any other sites for which you are willing to trade links,
    please let me know.

    I’ll be glad to hear anything you have to offer.

    Kind Regards
    Loki

    I’d recommend people spend less time on trying to gather links that way or via some automated network, and more on making a great site with a creative angle or two that makes the site stand out from the crowd.

  60. Matt, everyone knows that Google has a Supplemental Index, but no one outside of Google knows exactly what it is and what its purpose is.

    Even if you cannot give us the details, will you please share a working definition that SEOs can point to as the most reliable description?

  61. Okay, I gotta go do a pass at email before meeting up with Danny. Talk to everyone later.. πŸ™‚

  62. Michael Martinez, personally I’d think of it as a fallback way that we can return results for specific queries where we might not have as many results in the main index. Okay, now I really am going to go. πŸ™‚

  63. >>>>The sites that fit β€œno pages in Bigdaddy” criteria were sites where our algorithms had very low trust in the ***inlinks*** or the outlinks of that site.

    Nice. We can destroy our competition by making spammy sites and then linking to the competition!!! SWEET!!!!

    Maybe Google should update ‘There’s almost nothing a competitor can do to harm your ranking or have your site removed from our index.’

    at

    http://www.google.com/support/webmasters/bin/answer.py?answer=34449&topic=8524

    Now it’s easy to harm the competition’s ranking!!!!

  64. Thanks again for the feedback!

  65. Great Update Matt!!!

    It looks like I had put it together pretty well in my explanation of why people were disappearing from Google, which can be found at http://www.ahfx.net/weblog/80 . I just needed to build on the devaluation of reciprocal links.

    The only remaining question is whether it is the reciprocal link that is bad (we had already discussed that reciprocal links were losing value back in November), or the “unrelated” outgoing/incoming link that is bad. My bet is on the lack of quality of the inbound/outbound links. It seems the “tighter” the content, links, and tags are, the better the page does. Although I also agree that reciprocal links should be devalued.

  66. Matt,

    I’ve seen it mentioned that duplicate content can potentially hurt a site. On one of my sites I’ve had people write FAQs, etc., and am now wondering how much of what was written might not be original content. Can you, or anyone else, point me in a direction of being able to check for duplicate content, other than just plugging sentences into Google? How divergent does content need to be to be considered original?

  67. I’m sitting here watching Danny. πŸ™‚

  68. Matt,

    Example: a website contains a “link exchange” button within its navigation. When you look closer, the websites forming the link exchange are real companies, but the majority of links are unrelated, e.g. car hire, wood art gifts, labels. Would I be correct in assuming that the non-related links carry no weight and that the domain is scoring only from the related “link exchanges”? Note: I say link exchanges and cringe, as I’ve usually been against this; however, having just read your latest note I feel encouraged to build a link exchange page and provide reciprocal links to associated quality websites. Have I got the wrong end of the stick here? Thanks in advance for your time.

  69. Is adam bot or human?

    Clearly, I’m a bot.

    Aaron Pratt, what is your a/s/l?

    Matt Cutts, c/t/c?

    I am a magic 8-ball. Type !future to read your future.

    Okay, goofy stuff aside, this sort of a statement was long overdue. I can’t speak for anyone else, but I was ripping my hair out for the longest time watching people bitch, moan, and complain because their spamtastic sites weren’t getting indexed or that they were dropping. Tough **** for those people. Let ’em build something worth visiting.

    The only problem is that now the idiots will come up with some random and illogical explanation that “linking to other websites and forming alliances isn’t a bad thing, and Matt should be listening to me because I’ve created some 3-page keyword stuffed piece of crap and think I’m an expert.”

    Anyone else wanna bet that SEW says something stupid in response? πŸ™‚

    I just have one very stupid question:

    Things like blogs are a great way to attract links because you’re offering a look behind the curtain of whatever your subject is, for example.

    Doesn’t this also lead to the possibility of increased blogspam, with people reading this comment and going off to create BSLogs (TM) full of meaningless drivel about something loosely related to the topic at hand and/or cross-posting to other blogs related to topics (more so the former concern)?

    Personally, I’d rather not see blogs like yours and Aaron Pratt’s and Jaan Kanellis’ blog get dragged down into the mud because a few dumbasses ruin the concept.

  70. Matt: ‘I’d recommend people spend less time on trying to gather links that way or via some automated network, and more on making a great site with a creative angle or two that makes the site stand out from the crowd.’

    The thing is, just writing great content isn’t enough. I’m not saying my content is the greatest ever in the whole world, but it’s pretty good. If people can’t find your site, along with all its great content, they will never link to it. I don’t know what the answer is; I can see how some reciprocal links are bad, and how buying links is a problem for SEs, etc. But it is extremely difficult to get links to a site with just good content. Unless maybe you know lots of people who can give you links, etc. For shy people like myself it’s tough – I just don’t know enough people, and because of the shyness I haven’t participated in any online communities like I should have (I’m working on that though). It seems that getting traffic from SEs is kind of like a popularity contest – it’s like high school all over again – I could be real nice and real smart, but too shy to be popular, so my site is just ignored by SEs.

    Oh well, sorry to whine. I’m trying to write high quality blogs to attract links. (Doesn’t seem to be working too well yet though. )

  71. Nice. We can destroy our competition by making spammy sites and then linking to the competition!!! SWEET!!!!

    That’s not what he said. He said the spammy IBLs would not help. He didn’t say they’d hurt. They basically have no effect at all.

    The worst thing you’ll do is give that person no increase in traffic. The best thing you’ll do is give them a bunch of direct traffic from your spamlinks.

  72. “They basically have no effect at all.”

    The only thing I see happening is when your site used to rely on the effects of such links in the SERPs; now that the effects are gone, you may see decreased rankings and spidering (even fewer indexed pages) and lower PR.

  73. Matt,

    That was your best post so far on this site!

    The reason I liked it so much was that you gave many examples.

    Please keep the examples coming. That’s where we learn the most!

    Dave

  74. Matt. What you’ve described really sucks, and not only from a webmaster’s point of view, but also from a Google user’s point of view. I know that you are the spam man, so it’s not your fault, but the whole thing is just plain crazy.

    What you described means that a website with quite a lot of good, useful pages won’t be fully indexed unless the site has enough IBLs – and not just any IBLs; certain types mustn’t dominate. What kind of search engine is that? FWIW, I don’t mind the death of reciprocals (I’ve never got involved in it anyway), but it’s crazy for a search engine to require a certain number of IBLs for a site with a lot of pages to be fully indexed.

    For one thing, as a user I want a search engine to show me all the relevant pages that it knows about, and I don’t want good pages left out just because the sites they belong to didn’t have enough IBLs. I want good service from a search engine, and depriving me of good relevant pages is a very bad service.

    For another thing, as a webmaster, if my pages are good, index them, dammit. What on earth do IBLs have to do with it? Doesn’t Google want to show good pages to its users? If you don’t want to rank them very highly, don’t rank them very highly, but there is no reason in the world to leave them out of the index, and deprive Google’s users of the possibility of seeing them. It’s just crazy, and makes no sense at all.

    No, I’m not talking about the site I mentioned earlier in the thread. Forget that site – there’s nothing wrong with it, but let it go out of the index. I’m talking about Google users who are being *intentionally* deprived by Google, and the owners of perfectly good websites who are being shafted because their sites just don’t happen to have enough IBLs to satisfy Google.

    The other nonsense is the outbound links that you mentioned. What the hell has it got to do with a search engine what links a website owner puts on his/her pages? If people want to put affiliate links on a page, it’s entirely their own business. And if they want to link to off-topic sites, it’s entirely their own business. And if they want to sell real estate on their sites, it’s entirely their own business. It has nothing whatsoever to do with search engines, so why are they penalised by not having all of their pages indexed? Why are Google’s users *intentionally* deprived of good and useful information, just because a site’s pages contain things that have nothing to do with search engines?

    From what you described in your post, Google has consigned many perfectly good sites to the scrap heap, just because they didn’t have enough IBLs, or because the sites had some perfectly valid links in them. And they’ve intentionally deprived their users of a lot of perfectly good results for the same stupid reasons.

    I’d recommend people spend less time on trying to gather links that way or via some automated network, and more on making a great site with a creative angle or two that makes the site stand out from the crowd.

    Yeah right. Just what Google has always said – concentrate on making a great site for visitors. And if the site doesn’t have enough IBLs to satisfy Google??? What a load of ….

    Frankly, the whole thing stinks, and it stinks big time! I’m just not going to run around getting unnatural links to satisfy a bloody search engine, as you suggested to a couple of your examples. Why should anyone need to do that? My attitude to it is “stuff it”, and stuff Google!

  75. Great post from PhilC. I agree with his statement that IBLs should not determine whether a site’s pages are indexed; Google should not be guilty of selective indexing of the web, as Microsoft calls it.

    To be a world class search engine you have to index pages to serve relevant results. Microsoft is indexing pages on the web much better than Google, and so is Yahoo at this point in time; thus their results are much better and more relevant than Google SERPs.

  76. PhilC said it perfectly.

    And what really sucks is this is KILLING small businesses that just want clients to be able to find information on them. What do they know about inbound linking or reciprocal linking? They just want to be found for [product anytown, usa]

    I have a one-off Italian pizza place that just wants people searching for catering to be able to find him in the area. He’s in Google Local, but some people don’t even look at that, or depending on the query it doesn’t come up. He links with all his other local buddies: a clown, a hotel for catering, an iron worker who did his little cafe fence. Now this seems to be discouraged. They just want to share business, not join some big link scheme.

    If I type my small town’s name into Google now, the top 20 hits are all gigantic spam sites that contain the equivalent of a Wikipedia article.

  77. And what is wrong with affiliate links? How else do some sites make money?

  78. Thank you for addressing my concerns directly Matt. I do appreciate it.

    I must say that I’m really disappointed.

    I’m really disappointed that related sites with good and logical reasons to exchange links can no longer do so without harming themselves.

    I’m really disappointed that if an authority site links to me, I cannot link back to the authoritative information they provide without damaging the crawling of my site and theirs.

    This is not a matter of “not counting” something. This is a matter of blindly punishing sites, and most importantly, searchers.

    No, Google has not moved forward. They’ve taken several steps back.

    Dave

  79. So, how does this relate to the indented index page event that people have been seeing? It’s not host crowding.

    Example: a search for “My Company Name” would normally bring up the site’s index page as the listing. Now, it brings up another page from the site with the index page indented under it.

    Penalty, fluke, ??

  80. “They just want to share business, not join this big link scheme.”

    The way I see it is that there is NOTHING wrong with trading links. Just don’t expect higher rankings and faster indexing because of them. If you rely on reciprocal links and junk scraper/directory links, and don’t have much in the way of other quality links, you may see some adverse effects, because those links aren’t counting for much anymore. Go out and promote, sure, but be smart about who you cross-promote with; just don’t expect your ranking to go up because of it.

  82. Thanks for clearing everything up, Matt.
    Enjoy your new man-boobs on your plastic surgery vacation.

    love,
    tmoney πŸ˜‰

  83. Matt,

    First, I appreciate you maintaining this blog and responding to some of the comments.

    I realize you can’t analyze every site, but from what I’ve seen at WebmasterWorld, the sites you have picked are not very representative of the sites which are having problems with the supplemental index and not being crawled. The sites you have picked are obvious offenders, but sites such as my own and many others have none of these issues. To us, it seems that building a site to the best of one’s ability isn’t good enough; unless you can play the Google game, you’re out of luck. For instance, the inbound link issue: there are only a couple of active fansites related to mine (most are no longer updated, and my site is only a few months old). Therefore, I am stuck with a couple of inbound links unless I try to contrive inbound links, which I have no desire to do. Of course, the related sites also naturally link back to me – I’m related to them too, after all! Now that’s bad? It’s quite a Catch-22.

    I think one should hesitate to imply that all the websites with supplemental problems “deserve it” because they’re all doing something so terribly wrong that they are no longer recognized by the index. There are many sites which do not fit into this penalty schema that have lost pages – too many to blow off as aberrations in an otherwise successful change.

    I care because my site, the last time I checked, had seven pages out of over 600 that are non-supplemental, and it is jumping wildly in the Google rankings daily for main keywords, varying from 35 to 75 on any given day. Meanwhile, it varies between #6 and #8 on other search engines.

    But frankly I am more concerned with the fact that so many pages with good content are being ignored. If I were #105 for my keywords but could look at site:[my site] and see that my pages are indexed, I would be OK with that. At least they’re there, and people who are looking for content unique to my site can find it. However, now, according to Google, only 7 pages on my site are searchable for the average Google user – only seven pages of my site exist in Googleland. I can put exact phrases from supplementally indexed pages in the search engine and get no results returned. With almost nothing indexed, I feel like all my honest efforts are worthless to Google for some mysterious reason.

    Yes, it’s your search engine and you may do what you like. However, I’m sure you understand that a search engine that throws out good content is not doing its job. Hopefully, you will not shrug off the numerous legitimate concerns just because you were able to find some egregious offenders in the vast array of e-mails you received.

  84. Matt,

    Thanks for confirming my theory. I – and a few others – have been saying all along that the Dropped Pages bug is being caused by a faulty or out-of-date backlink index.

    You just confirmed it. Do you honestly think that all of the people making a noise at the moment are naughty people with some irrelevant outbound links, or “not enough inbound links”? Isn’t it far more likely that Google just isn’t finding or indexing the backlinks properly since Bigdaddy?

    Are you looking on Yahoo or MSN for backlinks before you go generalising about sites not having enough? Because that’s Bigdaddy’s problem: many, many high-quality backlinks are just not registering as backlinks anymore. It’s a bug. You must have a very low opinion of an awful lot of people to just dismiss us all as whining idiots who didn’t know you need a few backlinks. Take a look at Yahoo’s backlinks for the affected sites before you condemn them all to the garbage.

    How long is it going to take you guys to notice your backlink bug? It probably doesn’t help that you keep deleting any comments that mention it.

  85. Matt: β€˜I’d recommend people spend less time on trying to gather links that way or via some automated network, and more on making a great site with a creative angle or two that makes the site stand out from the crowd.’

    Like one other poster, I would recommend that if Google wants to get a handle on reciprocal link farms, it look at real estate sites. I have pointed this out before, and I have been guilty of it myself, but there are huge link farms operating with high Google rankings that are nothing but link farms.

    Multiple site creations on the same subject, directory creations, scraper sites: all created to manipulate Google further and to benefit the present link farm group even more in Google.

    A good example of this was some research I performed last week on our #1 competitor in Google. Out of 1,000 links, this site had 40% of them coming from 5 IPs. Yet Google has rewarded this type of linking scheme with top rankings.

    Based on my own personal experience, Google has rewarded reciprocal link farms and continues to do so. Judging by these sites, if a link farm is created and is themed, Bigdaddy rewards these unnatural link schemes.

    You have groups and some SEO companies that are able to point thousands of links at their clients’ sites, or create a closed-off network of themed reciprocal link exchanges that are not natural according to Google’s definition. As I am sure you understand, Matt, these systems are only meant to manipulate Google’s SERPs.

    On the flip side of this coin is the fact that new sites trying to compete with these sites must follow the example set by Google’s rewarding of these practices with high rankings. As long as Google rewards even a few sites with these types of practices, new sites that may offer more to the online user will forever face an uphill battle for business in Google.

  86. So, no affiliate links? Or how many is OK? Because, you know, why not just kill the affiliate business model altogether?

    Let’s have a look at some examples: amazon.com – currently nothing but a site promoting other sites’ merchandise, though they have their own transaction processing capability and sell some books and whathaveyou on the side (177 million pages indexed by Google). Any site providing syndicated news? Nothing but a “duplicate content” aggregator. Every coupon site on the web (type “coupons” into Google; all those sites are there) is nothing original, just a bunch of affiliate links (mostly cloaked). Are you going to not index any of those? I say let the users decide which ones they like most. Bookmarking rate, maybe? I don’t know. Things like that. Backlinks? Well, if you delisted all the sites that originally linked to some site, there would be no backlinks left, I guess. You know, all the small sites that decided to give each other a boost.

  87. Great post, PhilC. It’s nice to see somebody who is pro-business. Google wants to corner the market on search but has stifled small businesses’ ability to make money. Bigdaddy seems to favor only their “fat cat friends”.

    Google: Our goal is to index the entire world’s information, but alas we’ve found it more lucrative to censor.

  88. I have a question about sites missing from the index, and I wasn’t sure where else to get a reply, so I hope you don’t mind me asking here.

    Last fall I had five sites completely banned from Google for having “outgoing links to pharmacy sites”. I removed all outgoing links from all the sites, and filed reinclusion requests. One site, a PR 7, was immediately back in the index and continues to show up on page one of the search results. The other four sites have never reappeared at all, despite the fact I made the same modifications to them.

    The Google reinclusion people wrote to me in March about my missing four websites, saying, “Please be assured that your site is not currently banned or penalized by Google.” When I wrote back and asked why my sites were missing completely (grey bar, and the domain not in the index at all), I was told the matter would be investigated by the engineers. That was three months ago, and my sites are still invisible. They’ve been gone from Google for 8+ months now, after being in the index previously for over two years.

    Have my sites been “sandboxed” or something, prior to reinclusion? They were only a PR 5 or 6, so did the PR 7 site get some sort of priority? I really would like my sites back in your index, and I’m at a loss as to how to achieve that when your own engineering team claims my sites aren’t banned at all.

  89. Matt, it seems that Google picking on reciprocal links just makes it more attractive to buy expired domains.
    You always avoid talking about this type of webspam, yet it’s doing more to upset the balance of good SERPs than any other type of spam.
    You also mention that blogs are a great way to develop one-way links.
    That also plays into the spammers’ hands. Expired blogs still work a treat, and that profile I gave you many weeks ago is still live and active: http://www.blogger.com/profile/17839170
    So much for your inside man at blogger taking care of it.

  90. Matt, thank you for the update. While I appreciate the information, it does little to change my philosophy that it is almost impossible for a small site (25-100 pages) playing by the rules in a competitive market to rank in Google.

    It is sad to come to the realization that the only sites Google feels provide any value to the web are the large multi-nationals or sites with 10k+ pages and thousands of incoming links. How relevant will Google’s results be if webmasters abandon efforts to rank in your index and focus their efforts on the other engines?

  91. So Matt,

    Are you partly responsible for this debacle then? Even if you didn’t have a backlink bug (which clearly you do), your logic is fatally flawed. The inevitable end result of requiring more and more inbound links before you will even deign to index a site is spam. Spammers do this stuff full-time. They spend no time on content, and no time on value-added functionality.

    The more ludicrous hoops you make sites jump through to qualify for the index, the more you pave the way for huge companies or spammers. The in-betweens get sidelined.

    Incidentally, why does a site need a gazillion artificially bartered inbound links before it is worthy? No one at Google seriously believes that inbound links are still a measure of relevance, do they? Have you read your own posts? They talk non-stop about how to go about acquiring the right kind of links.

    You’ve all lost the plot. You’ll delete this message without even bothering to pause and consider whether or not I’m right.

  92. Ah, finally. Maybe now we can finally kill off the link exchange program cottage industry. A few particular countries are not going to be happy about this! πŸ˜‰

    Hey Matt, when is Google going to implement the long awaited SERPs Randomizer? I mean, we’ve talked about it in the past and it would be great to see those first 30 SERPs rotating randomly. Do that and watch the life expectancy of a search engine marketer drop by a few years. πŸ˜‰

  93. Matt,

    I know Google is not giving us webmasters a full picture with the link: command. I did the link: command on Yahoo and MSN, and I noticed some scraper sites copied my content and added some links to a few of my websites. I have a feeling Google is looking at these links as questionable. I am in the process of emailing these scraper sites’ webmasters and getting the links removed, because I did not request to put them there and they violated copyright by taking our content.

    Since Google crawls better than MSN and Yahoo, will there be a way in the future for us webmasters to see these links? Honestly, right now if a competitor wants to silently tank a website’s rankings in Google, all they need to do is drop a bunch of bad links on it. Without Google giving us webmasters the ability to see the links, we may never even know it happened.

  94. Hi Matt. I appreciate what you have explained here. I suffered through supplemental pages earlier than many others, and at this time I am happy to report that nearly all of my pages have returned when doing a “site:” type search.

    Unfortunately my Google traffic has not recovered yet. At one point it dropped down to about 2% and has recently risen to around 5%. This is not good, as it used to run closer to 75-80%. Have surfers changed search engines? I don’t think so, as the total numbers from other engines haven’t varied a whole lot.

    Earlier I did a search for a page on my site and it was found on the 4th page. That’s fine for that page, but the sites that came up ahead of it were not even related to the subject and only mentioned the words that I had searched for in passing. I expected to see well-known sites in the very same niche appear in that search, but none did. It looked like crap was floating to the surface instead. It looked like relevance had gone out the window, and that cannot be good for Google’s business.

  95. Great post Matt, thanks for sharing all the insight. Congrats on getting more help recently, I hope that this frees you up to make more posts like this.

  96. Hello, I’m really new to dealing with Google and I really appreciate finding some feedback from you guys. Great!

    I have a new site that has about 3,750 pages. The total indexed pages are constantly hopping from 30 to 340. It would be great if I could get them all indexed. lol

    But I’m completely lost as to what I am supposed to do to get all my pages indexed. I really don’t want to be going around the net trying to get links to my site when we are being told it’s better to create good content instead. But hang on: how will my great content get indexed if I have no links? You’re also saying we need links to get indexed, but not just any links; they must be “good” links. I’m lost again! lol What I mean is that for someone with little experience, reading that they need links makes it really hard to judge what good links are and to find places to get them. This again seems to mean that established sites with big SEO budgets are always going to be ahead, regardless of their content.

    I think PhilC made a really good point above too. I have some unusual specialist information on my site that isn’t indexed. There are currently no results for related search terms for this information. Now where is the benefit for people in these pages not being indexed, just because there are not enough links pointing to them?

    What if you have one large site with 60k inbound links that has a page of information about a subject, and it’s the only page returned for a search term? Then you have a small site with no links that hasn’t been indexed but has a similar page that’s 100 times better content-wise. Why not index it and show it second in the results? Surely that’s better for everyone?

    Lastly, if the dropping of people’s indexed pages is because of site trust issues, then why is my own new site’s index count going up and down like a yo-yo? Newly indexed pages, then hardly any pages, and then newly indexed again. Is Google having trouble making up its mind whether my site is trusted or not?

  97. Hi, Matt!
    I was wondering if you guys changed something in the algo in the last few days…
    A few hours ago, my site dropped from position 3 to nothing, although it’s a good site. The Sitemaps account doesn’t show any spam warning, but Google has started to delist the pages…
    Can you have a look? I’m a total mess now…

    Thank you,
    Chris

  98. Are you kidding Chris?

    Did you read Matt’s post? Your site is a piece of junk not worthy of Google’s index. It’s true. Matt has personally checked. And every site that has been de-indexed that he has looked at has not had enough inbound links, or else has had outbound links that are just completely off the wall. Imagine a real estate site having the gall to link to some other kind of site. What a joke. You’d better get busy and go after links. It’s links, links, links from now on. It’s official; Matt says so. You are junk if you don’t have links. Google loves blogs, you know. You shouldn’t really be allowed to have a website nowadays unless you are willing to link yourself silly on your own blog. It’s the future, you know. And it’s great. Matt says so.

  99. Yeah, for a porn site that is some great spam work indeed man!

    Google is taking porn sites out of the index; if you have been reading the news, there are lawsuits flying around about them being in the index!

  100. Oh, my, goodness! It just so happens that at about the same time my remaining indexed pages disappeared, I had just added a reciprocal link to my site!!! Ugh!

    Soo… now that I’ve removed all links from my minute template-based website and added rel="nofollow" to the three remaining links, should I expect to see a change in indexed pages on the next crawl? Or am I banned for a year or something?

    By the way, thanks for the update. I’ve been stalking your blog for over a month waiting for something like this post.

    Heh… and I’ve only ever had two internet customers… (but they were recent, which is why I was inspired to get my site indexed πŸ˜‰ )

  101. Anthony Cea Said,
    May 16, 2006 @ 7:24 pm

    Yeah, for a porn site that is some great spam work indeed man!

    Google is taking porn sites out of the index; if you have been reading the news, there are lawsuits flying around about them being in the index!

    Yeah, the funny thing about that ranking is that my site is real estate, not porn. It only shows a flaw in Google’s algo and ranking system. I kind of liked Midwestnet’s comment on DP:

    “Fisting lessons with your new house, anyone? πŸ™‚” The page ranked for that term is a property detail page of a listing in Las Vegas. First I thought OK, maybe this page was hijacked, but it hasn’t been. Then I thought OK, did someone get access to the site to change title tags and meta descriptions? It wasn’t that.

    Checking that page, I found no backlinks to it with that anchor text, so this only leads me to believe that somehow someone at Google knocked a cup of coffee over their computer πŸ˜‰ and caused all this mess… LOL

  102. Zoe C,

    Shame on you. You added a reciprocal link! Why? It’s a simple fact that natural links just materialise out of thin air if you are any good. How? Because people find you, think you’re great, and link to you. How do they find you? Why, on a search engine of course… ummm… wait a minute… Oh my god. The system is flawed! Hey Google, you’re a bunch of idiots.

    I guarantee history will not look kindly on this particular period for Google.

  103. Hi Matt,

    I wanted to ask a couple of questions. In the next few days I am going to launch a new site that will offer a certain service to bloggers and webmasters. Basically it will offer a script for free. I am going to ask the people using it on their blogs and websites to link back to my site, which can attract all kinds of backlinks, because the script can be used on any kind of site. If some sites from bad neighborhoods (according to Google) use this script and link back to me, will this penalize my site?

    The other thing I would like to ask is: on my blog I have a niche affiliate store related to my blog’s theme as a way of monetizing it. Will this lower the overall TrustRank of my domain? For example, can this cause a decrease in the rate at which my blog is crawled, or cause my site to lose its current rankings for certain keywords?
    If that’s the case, I think it would be very unfair; it would be like MSN penalising sites that have AdSense code on them.

    Thank you,
    Dimitris

  104. Matt, thanks for your great post.

    One question relating to sites that send traffic in exchange for linkbacks. Say 20,000 sites link to a page, and in turn that page sends traffic to each of those sites. Here’s the twist: the page rotates links in and out periodically, so that on any given day it only displays 200 links. I consider all 20,000 incoming links manufactured links, but technically, at any given moment most of those links are not reciprocated. Will Google be dealing with this type of linking scheme anytime in the future?

    “What do you think of that? Hmm? I said ‘What do you think of that?’ Don’t answer. You don’t have to answer everything.” πŸ™‚

  105. Hi, Matt!

    This is a very valuable post indeed! It has given good insight into the quality parameters which Google considers when indexing pages.

    A better web can be made by openly sharing problems and comments. I feel that there is a need for some forum where volunteers and enthusiasts can contribute and anonymously share their real-world experience with black-hat and other unethical SEO practices followed by many sites. This would help to continuously improve Google’s filters, and a better web could be made.

    Thanks & Regards,
    Ajay

  106. Hi Matt,
    Thanks for the post! I pretty much expected everything you said. After all, Google is going to keep trying to improve itself, so in the long run only the quality sites are going to last. Anything that tries to game the SE with backlinks or whatever will eventually get kicked out!

    Anyway, on my site I have a link to my “web stat counter” at the bottom. Will that be considered a bad link to have at the bottom?

    I have other bad links too… but I want to know specifically about the web stat counter link. Is it a bad link to have?

    Thanks

  107. I have a small, noncommercial, ad-free site (with good-quality content). You could say I’m not so much a webmaster as just some guy with a website. There are a lot of people like me, who seem to be being left behind by the new Google with its infatuation with giant business enterprise.

    From my perspective, both Yahoo and MSN do a far better job than G at returning results from my site when they are pertinent to specific search queries. At some point – early February as I recall – I noticed that traffic to my website had virtually stopped. I then found I had dropped out of the Google index. After a little research I decided I was being penalized for duplicate content (which probably occurred when I moved the site to a new domain). I filed a reinclusion request and at least got my site indexed, although at its previous host – defunct for almost a year – it was still showing better results than the same site at its current location, last time I checked.

    Right now I feel I’m doing about all I can, which is to improve and expand my content and hope someone notices. Maybe Google will some day start to return better results from my site so that traffic will pick up again, but it’s kind of out of my hands.

    All of which is a long preamble to a comment about how organizations fail. I’ve looked at this issue a little, and typically there is some fatal flaw that seems insignificant at first but gradually becomes magnified and turns out to be their undoing. (I suppose that insight was the genius of Greek tragedy.) Anyway, it’s looking to me like Google’s fatal flaw is paranoia. By obsessing about people scamming its SERPs, it has started dropping valuable content. It expends too much of its energy in a kind of perpetual chess game with black hats, who are simply playing whatever system G devises, and so it has turned into a mirror of their tactics. The escalating back-and-forth is like an endless succession of reflections in funhouse mirrors. Meanwhile, its competitors, perhaps just by doing nothing, are now returning more useful results.

    Or maybe I’m wrong, and the ship will correct its course. I hope so.

  108. I am with PhilC. This whole thing is ridiculous now.

  109. After 4 years online, my large content-based adult website dropped like a rock today in Google. I lost maybe 80% of my traffic in a single day, and after talking with my competition it seems they’re all doing fine. Not sure what to make of this so far; we haven’t made any major changes lately, and we don’t have any duplicate content.

    We have set up XML link trading in the past few months to help our customers find similar articles about the sites we list, adding many top-quality IBLs. Hundreds of exactly-relevant links from a site ranked 575 in Alexa, for example.

    We have never used any spammy techniques (to our knowledge!) or anything black hat. I’d say we’ve followed all the rules to a T since 2002. We have never wanted to risk our good relationship with Google. What can I do? It hurts to see cloaked sites and sites with no content out-ranking our high-PR, old and established pages with relevant, useful content.

    It seems to me that Google is having big problems with .biz, .info and .us sites lately, too. My 2 cents.

  110. * clap clap clap clap clap *

    That was brilliant, Phil. You’ve managed to come up with the most emotionally compelling arguments on this blog any of us will ever see. Like many of the things you have written in the past, it is truly a work of art. It’s passionate, it’s inspired, it’s emotionally charged, I laughed, I cried, I felt stirrings from the very cockles of my heart…no wait, that was a gas bubble. Sorry about that. My bad. Really.

    If they were even remotely sensible, then they would have been great arguments. The problem is that you’re making the same fundamental mistake that most others make when they try to convince others (especially guys who have stroke, such as Matt): they don’t argue from any point of view other than their own. We’re all guilty of that, though. You do it, I do it, Aaron Pratt does it, Wayne does it, we all do. We can’t help that. It’s human nature.

    (Side note: for those I mentioned here, it wasn’t an attempt to single anyone out. I was merely mentioning names as examples. So please don’t take it personally; I’m not trying to attack or insult anyone).

    But the whole point of what Matt was trying to say here is something I think most dedicated SEO-types tend to miss, and that’s “worry about the site first as far as a resource for people goes, and THEN start SEO after.”

    When webmasters start linking to ringtones or MP3s or Viagra or pet urine control from unrelated sites, that doesn’t do a thing to help the end user. It either sends the user on a wild goose chase or turns the user off.

    When webmasters start receiving those links, they’re getting trash traffic at best. I’d rather have 10 visitors from a relevant search query than 10,000 from some trash-traffic link farm scheme (assuming it was even that good).

    For another thing, as a webmaster, if my pages are good, index them, dammit. What on earth do IBLs have to do with it? Doesn’t Google want to show good pages to its users? If you don’t want to rank them very highly, don’t rank them very highly, but there is no reason in the world to leave them out of the index, and deprive Google’s users of the possibility of seeing them. It’s just crazy, and makes no sense at all.

    First off, who are you, I, or any of the rest of us to judge whether our own sites are good enough to be listed and indexed? All Google is doing by using quality IBLs as a sign of quality is extending the concept of human referral and word-of-mouth. If it’s good enough for a human to link to it organically, it’s good enough for Google to list it. How else are they supposed to figure out what to rank and what not to rank? People would complain if the Toolbar were used; on-the-page content can be manipulated very easily; and any other form of monitoring would be met with some heavy-duty scrutiny at best.

    Where are these pages that are so perfect that Google is doing a disservice to the web and that don’t have hyperlinks to them from any other web destinations, anyway?

    Second, if Google is going to list pages so that users can find them, they’re going to need to list pages in such a way as to provide users with easy access to them. In 99.999% of cases, the SEO-types call this “ranking highly in search engines”. So you want to be listed somewhere in Google SERPs for your content so that users can easily find it, yet you’re okay with it not ranking highly. Does anyone else see where a guy like Matt might have a bit of a problem with that?

    As far as OBLs go, this is an area where webmasters should take some responsibility and show some moral judgement (and, to be fair, most webmasters are pretty good that way.) We have a certain moral obligation to those who may visit our sites to guide them via the hyperlink structure in a manner that will give our users the best possible experience. How does irrelevant OBL linking do that? How does providing a link to otherwise useless content help the user?

    For those of you still not convinced that building a good website, putting up content and drawing visitors the natural way works, there is at least one website that has done a terrific job of doing exactly that.

    The owner doesn’t obsess constantly about where his site’s positioned in any engine.

    The owner has never linked to a spammy site without using the nofollow attribute, and has ensured that the spammy link was relevant to the site’s theme on the rare occasions that he has done so.

    The owner has never bothered to participate in link schemes, exchange reciprocal links, or do any of that stuff.

    The owner has quietly built up his content, and in the process has attracted a large, loyal and active userbase, which if I’m not mistaken is what we’re all supposed to be doing when we build websites.

    It’s not a perfect site…none are (including my own). It could be improved, and I’m sure the owner would say the same thing. But at least the owner is focusing his/her efforts on his/her site.

    And each and every one of you reading this has visited the owner’s website. In fact, you’re on it…right now.

    Just something to think about the next time someone offers you a wonderful reciprocal Viagra link, or maybe buying some text links from a broker.

  111. Matt et al,

    Thank you for this insight. You’ve put the pieces of the puzzle together, and I appreciate it.

    What I am getting from this is that links to a site that were previously considered a positive vote are no longer considered that, so some of your pages that were in the index because of that vote may now disappear. Now of course, if those pages had links on them, the sites that received those links may now disappear too. Thus the gradual deindexing of sites.

    This effort has been put in place to combat un-natural linking schemes such as link farms, directories, and paid listings.

    Now, since people like me cannot afford a Super Bowl ad to get the name out, and I’m not a seasoned SEO with 2,000 sites under my control to “naturally” gain links, my pages will go unknown to Google’s users. Unless of course they get fed up with the same old sites at the top of the SERPs and go to the other search engines that index fresh sites.

    All of this effort appears to have been made to discourage un-natural links; however, I believe it will only increase them. Why, you ask? Because I know my site was killed by the new filters; perhaps I didn’t have enough “quality links”. However, if I search for some very specific terms, 3 of the top 10 results are simply made-for-AdSense mini-directory sites that have 10 links on them, some scraped content, etc. If I check their links, it is truly a bunch of junk. So the only conclusion I can draw from this is that junk links still work; it just takes a whole lot more of them than before. Until the day when all of those sites are gone, that can be the only conclusion.

    Now to address the paradox. To get indexed you need natural links; to have natural links you need webmasters to view your site; to get them to see your site you need to be in the index… yada yada. BUT webmasters have just been told not to link to lower-ranking sites, and if they do, to use the nofollow tag. Why not simply show these low-linked pages on page 800 of the SERPs and track whether they are found? In my line of work (engineering) I frequently search very deep into the SERPs to get to sites written by real people in the field, and not the corporate presences that rule the industry and the first 100 pages. In other words, as a page is found, include it; the more action it receives, the more it moves up. This could be tracked with signals such as watching the back button (a vote for not finding what you wanted), etc.
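
    To sketch what I mean in code (purely my own illustration, with a made-up scoring rule and made-up numbers; nothing Google has said it does):

        # Sketch of the proposal above: park low-linked pages deep in the
        # results, then nudge each page's score based on searcher behaviour.
        # A quick hit of the back button counts as a vote against the page.

        def update_score(score, clicked, bounced, rate=0.1):
            if clicked and not bounced:
                return score + rate * (1.0 - score)  # engaged visit: move toward 1
            if clicked and bounced:
                return score - rate * score          # back-button bounce: move toward 0
            return score                             # never clicked: no evidence yet

        score = 0.5  # a new, link-poor page starts neutral, on page 800 of the SERPs
        for clicked, bounced in [(True, False), (True, False), (True, True)]:
            score = update_score(score, clicked, bounced)
        print(score)  # roughly 0.54 after two engaged visits and one bounce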

    Just my 2 cents. And I’m off to spend the night finding a few thousand sites that want to link to me, to get my pages listed again.

    ~John

    PS: If you don’t delete this, I added my URI this time, as Yahoo doesn’t seem to care about the nofollow thingy.

  112. I realise that “adult” results aren’t exactly your “forte”, but how can you explain Google’s seemingly deliberate action of making adult search terms give irrelevant results?

    This practice has a string of problems with it…

    Here’s the main problem I see. Do a search for an adult term like “porn clips”.

    The top 10 results are somewhat relevant, but the other 90 are filled with domains like this:

    http://www.thechurchillscholarships.com/analporn.htm
    http://www.lewisandclarkeducationcenter.com/farmsex.htm
    http://www.nyotter.org/porn.html
    http://www.argenbiosoft.com/amateurporn.htm
    http://www.universityplazahotel.com/porn.html
    http://www.plannedparenthoodcouncil.org/amature.htm

    Notice a trend? All are recently expired, non-adult listings. The main pages are usually direct copies of the previous site pulled from the web archive, then each domain is filled with easily identifiable doorway/cloaked pages.

    Not only does this give a bad impression of the adult industry in general, but the other problem is that most of these sites contain trojans/viruses/child porn/bestiality that then alter surfers’ browsers.

    The only reason I could see for Google allowing this practice is that they make more from AdWords this way, and they realise adult webmasters don’t have a voice as loud as the mainstream (even though adult is a huge part of Google’s revenue).

    I also notice a trend of AdSense sites jumping up in popularity when the sites don’t even have relevant content, just ads for AdWords/AdSense.

    Now I notice my “adult” website, which is very relevant and is several years old and established with several hundred relevant backlinks, is close to position #300, while the vast majority of sites above me are either freshly created/expired domains with no content, or guestbook/forum spam on mainstream sites that the owners can’t fix without losing their entire website.

    Why the foolishness?

    P.S. It’s really irritating when you write out a big long post and the “security code is invalid”, so it makes you hit the back button and all your post is erased… grr.

  113. Matt,
    Your article reminded me of my startup days.
    Having written a very detailed business plan (about 100 pages long), I was told that although it was very heavy to hold, what most venture capital analysts I met with would read is the executive summary on the first page, and that I should focus on it. Funny how an indexing article can get me to remember those days πŸ™‚

  114. Matt, I think you guys still have a lot of work to do. I know one of the real estate sites you mentioned in your main post, and it’s got all its pages back, but they bought ALL their links, and their content is dire! Then a site like mine, with mainly natural links and good relevant content, gets stuffed. Seems there’s still a long way to go…

  115. Hi Matt

    Thanx for the post. In the last 48 hours I’ve seen a lot of change in the SERPs. The question still stands, though: how the hell does one promote a new site if we’re not allowed to trade links with similar sites? OK, so it’s not “not allowed”, but it won’t help ranking. I assume it will however help with indexing, so all in all not a bad thing?

    What I don’t understand is being penalised for linking to unrelated sites. For instance, I’m really proud of the city I live in, so I run a blog about it. I also link to many city-related sites, and while they are all in various different niches, they are still in the city, so it’s kinda tourism-related. Is that a bad thing? After all, I am giving the user useful info about where to find what in the city.

    Actually it doesn’t really matter where that site ranks, although I’m trying to get a better understanding of how things work…

  116. I agree with Justin; Google has to re-think its strategy on link valuation.

  117. Dave (Original)

    Matt, my site is the best out there on my chosen topic. Despite this, there are many sites above mine in the SERPs for my chosen targeted phrase. Please fix this so the whole World can see my site at #1 when searching. Until you do, the Google SERPs are crap!

    Oh, can I also have some more PageRank? My paid links just don’t seem to work like they used to.

  118. I really think a lot of you need to understand that the days of gaming the SEs with links are coming to an end. Links have nothing to do with Google’s problems with indexing the web; they could index pages if they had the storage space and dump the pages with bad links to the bottom of the SERPs. The problem is the lack of indexed pages at the moment!

    http://blogs.zdnet.com/web2explorer/?p=173

    The above link was left on one of our forums and is common knowledge!


  119. Hi Matt!

    Great post, and great answers.

    My question is whether Bigdaddy, and the effects thereof, are equally significant in all languages.

    I am seeing a lot of link exchange, link spam and other deceptive techniques earning top positions in the index for certain non-English languages.

  120. PhilC, we try very hard to find ways to rank mom/pop sites well. As I mentioned, Bigdaddy is more comprehensive (by far, in my opinion) than the previous crawl/index. A site that is crawled less because its reciprocal links are counted for less is a different type of situation than many mom/pop sites, for example.

    Halfdeck, I’m happy if it helped clear things up.

    John, that’s your choice if you decide to chase thousands of links in one night. I just don’t think that’s the best way. BTW, just because Yahoo reports nofollow links in the Site Explorer, I wouldn’t assume that those links are counting for Yahoo!Rank (or whatever you want to call it πŸ™‚ ).

    Justin, of the three real estate sites that I mentioned, two are unhappy because they’re not crawled as much as before.

    Dave, nice one. I made it several sentences in before I got the (dry) humor. πŸ™‚

  121. And I gotta get some sleep now…

  122. Hi Matt,

    I am the owner of the health care directory domain you used as an example above. Thanks for having a look at the site, your comments are helpful and much appreciated.

    I would like to clarify something on how and why I used the removal tool as I don’t think that was described properly.

    I had pages that were indexed under both www and non-www in a directory such as:

    http://www.domain.com/directory/

    Those pages were mired in the supplemental index and indexed under both www and non-www. (At first, my server did not have a redirect from non-www to www, but I have since put one in place. That is likely why they were indexed under both.)
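
    For anyone wanting to do the same, the redirect I put in place was along these lines. This is just a sketch for Apache’s mod_rewrite with a placeholder domain, so adjust it for your own setup:

        # .htaccess: 301-redirect any non-www request to the www hostname
        RewriteEngine On
        RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
        RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

    Any equivalent server-side 301 does the job; the point is that only one hostname ever answers.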

    I removed those pages (/directory/) from my server and used the removal tool to let Google know they were gone. I re-built those pages at:

    http://www.domain.com/new-directory/

    I used the removal tool because I wanted to start fresh and didn’t want to get penalized for having the same pages under two directories. I did not use the removal tool in the hope that just the non-www versions of the pages would be removed from the site. I used the removal tool to let Google know those pages were gone forever (six months, in Google’s eyes).

    Since the above may be a little confusing, I am going to summarize one more time for clarity. I removed pages from my server (that were indexed under both www and non-www) and then used the removal tool to let Google know they were gone. I rebuilt those pages under a new directory to start over and hopefully get them indexed correctly.

    I very much agree that the site could use some links. Thanks for your time and help.

  123. Nice day, Matt,

    hope you got some sleep.

    Did you notice that thousands of small businesses are out of business now? Google was a search engine where small business could compete against big business. Those days are over, because the balance has now tipped toward big business. That is a pity.

    In a comment you said you and the team are going to observe the spam reports more closely?! IMO spam reports don’t work. I find well-ranking sites with more than 12,000 pages using JavaScript redirects, and well-ranking duplicate content across 3 or more domains. Nothing has happened to them. When does your fight against that begin?

    greets, Martin

  124. Google has lost its edge: or more accurately, the crawl fringe. And as a result, it is officially broken in my view.

    I have been using Google for about six years now, and Altavista before that. The key advantage Google had over Altavista in the early days was that its ordering was better. In the very early days Altavista had a lot more results than Google, but that changed fairly quickly. At any rate, Altavista always had the results, but you had to dig deep. In Google, the results were just ordered “right.” Indeed, if your search phrase was particularly detailed or otherwise unique, you could often click on “I’m Feeling Lucky,” a button unique to Google, and jump straight to the page you needed.

    I have been following a change in Google’s results for the last four or five months which has gotten steadily worse: Google is returning results which are not “dead on” anymore. That is, it seems to be using PageRank not just as an ordering tool, but as a pruning tool as well. Now, it is true that PageRank has always been used in this way, but the aggressiveness of this is now too extreme for my purposes.

    Matt, you have effectively said in your entry and in the comments that PageRank is now being used to eliminate pages from the search results completely. I think this is what has broken Google, because PageRank is the foundation of the algorithm that used to make Google work.

    A page which does not appear in the index has an effective PageRank of 0. Any pages linked only from this page have PageRank 0 also. In this way we find that this feeds a recursive loop: as a page disappears, it takes pages with it, those pages take pages, and so on. Yet these pages have keywords on them; they often have unique variations of them. Google used to be able to find these, even to “whack” them. Now it simply cannot. It has lost its power.
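
    To make that recursion concrete, here is a toy sketch (my own illustration over a hypothetical mini-web; this is not Google’s code) of how dropping one page from the index takes its descendants with it:

        # Toy link graph: 'hub' is the only path from the crawl roots
        # to 'deep1' and 'deep2'.

        def reachable(graph, roots):
            """Pages reachable from the root set by following outlinks."""
            seen, stack = set(roots), list(roots)
            while stack:
                for target in graph.get(stack.pop(), []):
                    if target not in seen:
                        seen.add(target)
                        stack.append(target)
            return seen

        graph = {
            "home": ["hub", "about"],
            "hub": ["deep1", "deep2"],
            "deep1": ["deep2"],
        }
        before = reachable(graph, {"home"})
        graph.pop("hub")           # 'hub' is dropped from the index...
        graph["home"] = ["about"]  # ...so its inlink no longer carries weight
        after = reachable(graph, {"home"})
        print(sorted(before - after))  # ['deep1', 'deep2', 'hub']: the cascade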

    Now this wouldn’t be so bad (what point is a page without incoming links, after all?) except that this isn’t the only change Google has made. Google now has a manual switch which zeroes the PageRank of sites it deems to be “unfairly gaming the system.” It also has a scheme which lowers the PageRank of pages in “bad neighbourhoods” or using known “black hat” SEO techniques; this is often dubbed TrustRank, but we have no indication from Google that it is separated from PageRank in the Google architecture. Additionally, it now appears that Google can detect duplication in results, which also seems to feed into PageRank in an unspecified way.

    Matt, you have said before, in an entry on canonicalization, that everyone should 301 from site.domain to http://www.site.domain (one or the other), but there are likely to be millions of websites which cannot do this, or just won’t out of ignorance or laziness. Are these pages actually worth less than the others? Do they deserve to fall into PageRank 0 hell?

    Surely you can see that what was already a nasty problem now has the potential to snowball. And this is what appears to be happening. The low-ranking pages of the web, made by small people who don’t go out and get lots of links, have been caught in the SEO/Google crossfire. These small people had relevant pages for detailed search queries, not the so-called “competitive phrases” Google staff actively monitor. Now these phrases generate generic “authority” crud, really nasty black-hat spam, or worse. The Googlewhack has become a “no results” and “I’m Feeling Lucky” has been set to an instant trip to the Wikipedia world. Google is horribly broken as a result.

    I fear, Matt, that if what you say is true, all my fellow techies can forget typing some bizarre error text into Google and hitting a three year old web discussion on some portal where someone else had the same problem. You’re just gonna hit the boring table of manufacturer error codes… or maybe nothing at all. It’ll be back to Altavista for me, I expect.

  125. You didn’t sense the tone of sarcasm in my voice when I typed that!?! I didn’t spend the night chasing links; actually I just wrote some articles on a subject I know something about. This interwebby stuff is too volatile for me right now. Someone will find them interesting, and natural links have to come… well… naturally. I think generating natural web traffic is like pushing a boulder over a mountain: it takes a long time on the way up, but on the way down you can’t keep up.

  126. Matt, let me see if I can summarize this correctly:

    Bigdaddy attacked crap backlinks, and therefore if you have fewer backlinks you don’t get deep-indexed until your site earns it with quality links or site age.

    Everyone who has either bought some links, traded for some links, or sold unrelated links on their site will suffer. If not, then the quality sites that do link to you lost some of their PR power because they lost backlinks, and therefore you lost reputation points from them. It is hitting so hard now because of the chain reaction from the death of crap backlinks, affecting either you or a site (or sites) somewhat connected to you.

    My view on the affiliate links is that if your site has nothing more than affiliate external links and duplicate product content, then you are no more valuable than any other site with the same, and therefore it goes back to backlinks and indexing. The only way for a site like that to rank is to out-PR the other crap, and then you are still in an uphill battle, since your site does not provide anything more.

    Simply put, G knows what portion of your site is affiliate crap and what is quality original content, i.e. quality vs. crap link and duplicate content ratios.
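
    In code-cartoon form, the theory might look something like this (my own invented formula and numbers, purely to illustrate the summary above; nothing official):

        # Cartoon of the theory: crawl depth scales with the share of
        # inbound links that are trusted, so junk-heavy link profiles
        # earn only a shallow crawl.

        def crawl_budget(quality_links, junk_links, base=10, bonus=500):
            trust = quality_links / (quality_links + junk_links + 1)
            return int(base + bonus * trust)

        print(crawl_budget(quality_links=2, junk_links=400))   # 12: barely crawled
        print(crawl_budget(quality_links=80, junk_links=20))   # 406: deep-indexed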

    How am I doing?

  127. Great Post Matt – you really do deserve your holiday now –
    but

    can we just clear up the reciprocal link question

    is it OK to have RELEVANT reciprocal links – and could it even be beneficial?

    My directory-type site has many outgoing links to relevant sites and articles for which I’ve never requested reciprocal linking – but I was just about to run a campaign asking most of them to link back to my RELEVANT pages – would this be OK and not harm my position or ranking?

    cheers

  128. YES! That’s my exact question, summed up, Weary.

    I’m fairly sure that my massive drop yesterday was due to the inclusion of XML link trading with my competitors. My review on SiteXYZ links to their review on XYZ. I thought this was valuable, relevant information for my users, and valuable IBLs for my site.

    They don’t look so hot now! Oh gosh.

    Anyways thanks Weary, looong day. I really hope this gets fixed, and I’d hold off on the relevant reciprocal linking!

  129. The web will be transformed to take the shape of our current world.
    Those who sell something have to be the Walmarts and the Amazons.
    Of course it’s to their merit that they are so big.
    It’s just that the net used to be something that was equal for everybody, and it’s now transforming so that the little ones don’t have a chance.
    As a little one, I don’t have a chance, not even with CPC now. I have to do tricks like Shoe to get something.
    I think it’s all over, folks.

  130. Matt,

    Thanks for your great post. However, something really concerns me: in the above example of outbound linking you state that the “Real Estate Site” has dubious linking, by linking to a “mortgage site”??

    Are you serious about this? Or is it a mistake? This has very serious implications.

    Surely if I am looking to buy a house, then I am also extremely likely to be looking for a mortgage, and that link is actually very relevant to the browser. I am actually hard-pushed to think of anything that could be more relevant.

    Could you please expand on this? If it is not a mistake, then along the same lines I would expect:

    – Holiday sites will get penalised for linking to car hire sites
    – Wedding sites will get penalised for linking to honeymoon sites
    – Finance sites will get penalised for linking to credit card sites

    In essence, if the above holds true, a site will get penalised for linking to anything that is not exactly the same theme as the site it links from.

    If you looked at numerous property sites, I would guess you will find hundreds of adverts that have been paid for and hand-picked by mortgage companies, as they know that they are very likely to get the perfect customer from that site. It would seem that Google is therefore going against the best human knowledge.

    All I can see is that if the above is not a mistake, then it is asking for the destruction of the web, as everyone will be so paranoid that they may be linking to a site that is not exactly the same as their own that they pull all their links.

  131. Hey Matt. Since you ignore my mails and posts, I thought you might like a visit outside the blog before you head for the hills for your hols.

    http://gooogle-search.blogspot.com/

  132. How is it possible that some important sites can put dirty links in their footer and not be dropped from the index? E.g.: http://www.pixmania.com/fr/fr/home.html

    And then: how can a directory be indexed, given that it links out to many different sites?

  133. These kinds of posts are really good. They give lots of webmasters the feeling that Google isn’t evil at all, just that they need to polish their websites.
    Keep up the good work…

  134. I hope, and my opinion is, that Bigdaddy is a step in the right direction. The future target is very easy for everyone to understand: give good sites the first places in the SERPs. There is only one thing that can make this happen, and all the Bigdaddy measures against link exchange and the like are the beginning. Sites that have good content get free links; that’s all. But it’s an enormous project to build a search engine that will give you really good SERPs. And all these problems that Google is fighting against are self-inflicted by Google. So of course I can understand people who are angry, because they spent much time on being ranked highly and now Google’s algo is changing.
    I built my site with real good content and I hope the future will be good.
    So keep on trying, Matt (and give new sites with good content a chance, and not so many filters πŸ˜‰ )

  135. Hello Matt!

    I read your whole post and I have very strange feelings about Bigdaddy and your reciprocal link penalization.

    Why? It’s simple. Say I build a dog breeder site. I want to get some surfers, so I’ll ask my friend to add a link to my site on his. He will also ask for a link on my site, so he will probably get some fresh surfers from me.

    Then I’ll want to have my site in some dog directory, so I will ask them to put in a link to my site, and they will ask for a link to their site for sure.

    But Bigdaddy is telling me that if I want my site to be indexed and have good results, I must not add links to other sites, but plead for links to mine…

    This is a really stupid algorithm; a kid would create a better one.

  136. This emphasis on IBLs is nuts.

    Just for a favour (no money changes hands) I run a Chinese takeaway site for a friend. He makes a good product and serves a quite specific geographical area.

    Why in hell’s name should I have to run around getting “high quality” links to his site when that isn’t the way anyone would seek to access it?

    And what is a “high quality” link to a Chinese restaurant? The local Chamber of Trade? As it happens, it has all the appearance of a spam site, with hundreds of links to unrelated businesses that happen to be members. Is a link from such a site “untrusted”?

  137. Hi Matt

    Damn – I am late to this post – I hope you revisit it.

    Matt, you say that crawl depth etc. is largely based on PR.

    PR at the moment, though, has been acting very strange. Some sites that lost PR regained it in the last PR update; however, depth of crawl still looks like it may be based on an older level. (E.g. a PR5 site not getting crawled – it was previously PR0, due to a ban, a canonical issue, an error, I don’t know – is still getting crawled like a PR0, i.e. hardly at all. πŸ™ )

    Now – as you know, some pages/sites didn’t have their PR updated at the last change (about 4-5 weeks ago?).

    Soooooo – what’s the score with PR at the moment? I would assume that an update will be coming soon that updates the PR of the sites which did not change at the last changeover?

    These sites which regained PR after a long absence but saw no ranking changes – does this point to perhaps increased crawling in the future when PR is updated across all sites/pages?

    PS: I did not get a reply from the Boston email address thang – sniff πŸ™

  138. PhilC, we try very hard to find ways to rank mom/pop sites well.

    Maybe you do try very hard to do that, Matt, but it’s just not working. The new criteria for crawling and indexing that you explained in this thread is so bad that’s it’s hard to actually believe. To base whether or not a perfectly good site gets all of its pages in the index on how many links it has pointing to it (and the type of links), and what types of link it has on its pages is sheer lunacy. I asked before – doesn’t Google want to index good pages any more? Doesn’t Google want to give full choices to its users any more? Or is Google happy in the knowledge that there are so many pages in the index that there will always be some relevant pages for the user’s results, even if they deprive them of plenty of good ones?

    Most people wouldn’t mind at all if Google identifies and drops certain types of links (reciprocals, etc.) that they don’t want to count for anything. If you don’t like certain links, cut off their juice – treat them as nofollow – drop the links from the index – but there is no sense or justification whatsoever in dropping a decent site’s pages from the index, and virtually killing it off because of them. It’s clear that Google can now programmatically recognise some of the links it doesn’t like, because you say that’s why some sites are being treated badly, so drop the links – remove the links from index – but don’t refuse to index the site’s pages because of them. It’s a sh..ty thing to do to sites, and it’s a sh..ty thing to do to your users – the very users that Google claims to think so highly of, but are now being short-changed.

    Most people would support getting rid of spam links, but to treat sites that just don't happen to have attracted enough natural links as second class and on the fringes is plain stupid. Nobody would support that – especially Google's users, if they knew.

    Google now wants us to go out and acquire unnatural links for our sites if we want them to be treated fairly. Whatever happened to “don't do anything just because search engines exist”? What an embarrassing about-face! As I said in the previous post, I am not going to run around getting unnatural links just for Google. I've never gone in for it before, apart from submitting to a very few directories, and I'm not going to start now. You can stuff that stupid idea!

    The site I mentioned earlier had a clean bill of health from you personally, and nothing has changed since then. Four days ago it had 17,200 pages in the index, and on subsequent days it had 14,200, 11,700 and, yesterday, 9,350. It started at an unrealistic ~60,000. I'm past caring about the site now. It's a decent and useful resource, but who cares if your valued users ever see it? Google knows best about what its users want to see, so it has stopped showing them most of that site's pages – right? They'll love you for it! The site has only one reciprocal link, which is on a very specific and relevant page in the site – and it's staying there. The site has never had any link building done on it, and because of that, Google is dumping it and depriving its users of a useful resource. Nice one, Google! If only your users knew how well you look after them.

    That's just an example of what a great many sites are *unfairly* suffering because of the sheer stupidity of the new crawling and indexing regime. Nobody gains by it – including Google's users, who are being intentionally short-changed. Actually, that's not true. Those who gain are those who link-build. The filthy linking rich get richer, and ordinary sites are consigned to poverty. Is that what Google wants? You want the poor to turn to crime? That's what you will drive them to. The whole bloody thing stinks!

    Matt. My posts are not aimed at you – they are aimed at Google. I’m sorry if you take any of it personally – it’s not intended.

  139. One last point…

    All that this will achieve is that the link-poor will start unnatural link-building, and in ways that will deceive the current programming. Google will have caused it – not the site owners. This sledgehammer treatment of innocent sites, just because they haven’t naturally attracted enough IBLs for you, is madness.

  140. At least now we know that the indexed pages filter is based on external linkage.

    Thanks.

  141. I agree it's important to filter out low-value sites (although what “low-value” means is debatable). Unfortunately, the techniques used to promote such sites are the same ones legitimate sites use.

    As a webmaster with a limited budget trying to get a new site going, or drive more traffic to an existing site, what are the options? No reciprocals, can't buy links, can't sell links, can't compete with the big boys (Walmart, Target, Amazon, Overstock, eBay…) in PPC… what's a webmaster to do?

    Soon top Google results will be primarily big companies with big name recognition. Of course such sites get thousands of backlinks. How could they not? But what about poor JoesSunglassStand? Sorry Joe, McDonalds is hiring. Or there's PPC, if you have the cash to go against the aforementioned companies (not likely).

    I don’t think you can determine a web site’s subjective value with an objective algorithm. And now the small webmaster’s site doesn’t even show in the results because he doesn’t have a few hundred natural backlinks, or he sold a link for $10 to a Ringtone or Credit Card site.

    Despite all Google's efforts, I can still easily find sites using black hat techniques (such as cloaking) that appear high in Google results. Here's one I've reported a half dozen times:

    term: comforter sets
    linensource.com – Offers Down Comforter Sets
    linensource.com – Your one-stop source for all your down comforter set needs. The Linen Source offers a wide variety of down comforter sets.
    http://www.linensource.com/down_comforter_set.asp

    The asp page is a cloaked page that redirects the user to the main site.

    I applaud Google's efforts to bring order to chaos, but I can't help thinking that they are doing it in a manner that is more and more exclusionary to the small website owner.

    It seems to be a fact of societal evolution that democracies eventually ‘evolve’ into republics, where the power and wealth ultimately end up residing in the hands of an elite few, rather than being equitably spread through the population.

    The irrefutable guiding principle of our undeniably monetized society is inescapable: when it comes to the search engine positions of ecommerce sites, it's not about who is most deserving, it's about who has the most money. Mega sites with name recognition and multi-million-dollar traditional media marketing budgets are taking over the SERPs, and it's only going to get worse.

    If you want to make a site about the mating habits of New England barn owls, or any other esoteric research topic, you can do great in Google. But if you want to run an online business that relies on the sales of products or services, you’re in for a tough time.

  142. I don't doubt you are trying hard; I, like PhilC, believe you've simply got it wrong. Very badly wrong.

    Deindexing part of a site or refusing to index deeper parts of it for any reason defies logic. You either index it or you don’t. How you rank it among the other pages is another matter.

    Big Daddy may be far more comprehensive, but the results are not if you choose to deindex pages, or refuse to index them based on the types of links and not the links themselves.

    Dave

  143. Matt, are you sure BD is over, and does Yahoo link to a bad neighborhood? 😉

    site:www.yahoo.com – those were supposed to be 400k.

  144. Thanks for the post, Matt!
    I translated some of the most significant extracts into French: -http://www.malaiac.net/seoenologie/91-bigdaddy-liens-sortants.html (hope the – is enough to not make a link)

  145. Once again PhilC has put my concerns in a more coherent way than I could. As I stated above I’m new to this and struggling to work out what I’m supposed to do.

    This is an example, as I see it, of non-indexing in action: try the following term in the UK (google.co.uk, and select UK)

    Beta Tools 920A 1/2 Socket Set (a completely possible search)

    As you can see, the search returns two things: firstly my XML sitemap, which is pretty useless to anyone searching for the above item; secondly something completely irrelevant.

    Now wouldn’t it be better if this page was indexed and then returned?
    http://www.shacktools.com/beta-tools-920a-22-piece-12-socket-set-p-5415.html

    This is just one example, and probably not the best, of how this is affecting my site. I have thousands more like the above. As you can probably guess, my XML sitemap page is very busy, but people can't find what they're looking for from it and then exit the site.

    The thing that worries me is that this page has no competition, so it doesn't matter where it ranks just so long as it's indexed. So to get this page indexed I need to go around adding links to other sites? This seems such a completely unnatural thing to do.

  146. PhilC has a very good point.

    I know of a site that can't rank above page 3 for anything. I naturally thought it had some kind of penalty: prior to its plummet it did fine on all manner of queries, then kaboom – overnight I found myself in search engine purgatory.

    To cut a long story short – I sent a letter to the good people at Google asking if it had a penalty, only to be told that I should look at getting a few more quality links.

    In general, webmasters can improve the rank of their sites by increasing
    the number of high-quality sites that link to their pages. You can learn
    more about how Google ranks pages at…

    So yes, effectively I have a penalty. I like to call it the lack-of-IBL-I-didn't-go-out-and-aggressively-pursue-lots-of-links-penalty-cos-I-always-thought-it-would-bite-me-on-the-ass-oh-but-how-wrong-I-was-I-wish-I-had-penalty! 🙂

  147. I would just like to pick up on the affiliate issue – what gives Google the right to determine that affiliate sites are bad? The internet is about choice, and these affiliate schemes work, giving people a living!

    Does Google not feel any responsibility for the thousands of people who will lose their income?

    With so much unemployment these days, the internet and affiliate schemes offer – or did offer – a way for people to earn money, set up businesses and provide surfers with choice, even if they do end up buying from the same place in the end.

  148. I completely agree with John. Google is about to destroy the original linking spirit of the WWW. Matt reflects the whole paradox in his posting:

    a) Since Bigdaddy, a high quality site xy.com is considered less relevant due to the fact that its inbound links might have been paid for.

    b) On the other hand: Google’s webmaster guidelines and also Matt himself keep on recommending webmasters to get “quality relevant inbound links” for their sites to gain more relevance for Google.

    c) Since Bigdaddy, another site ab.com is also considered less relevant, because it has outbound links to sites that might cover different topics than the site itself (see the real estate example in Matt's posting). But why does this happen? Because Google and Matt recommended that other webmasters get quality links.

    Matt: Do you really think that quality sites will ever link to other quality sites for free again, as they used to in the old days of the WWW? As it used to be one of the major ideas of the WWW? If you link to another website, you have to be afraid of being punished for it. So why link to other sites except for money? And on the other hand: how will you ever get “natural” links to your site again?

  149. [Quote from Matt] it’s true that if you had N backlinks and some fraction of those are considered lower quality, we’d crawl your site less than if all N were fantastic.[/Quote]
    I have seen sites linked to by scraper sites whose only content is Adsense ads and scraped Google search results. Does that mean that my site would be penalised by the actions of a third party spam site, over which I have no control?

  150. Matt:

    Would you please clarify your comments regarding reciprocal linking and discuss RELEVANCY and the RATE at which a site obtains reciprocal links?

    The 2003 Google patent says ‘obtain links with editorial discretion’… reciprocal linking is tough to avoid when site A won't link to site B unless site B links back to site A. And as you have noted, paid links are not always the best course of action, so where is the line drawn? Free advertising is not very prevalent in this world. Paid or bartered (reciprocal) are the current options.

    Most sites (especially hobby, niche, small business) will not provide a link without a link back. That’s the nature of the web, you scratch my back, I’ll scratch yours. If sites didn’t link to each other, the web wouldn’t be a web.

    Responsible reciprocal linking should be done for the end user and to generate qualified traffic from like-minded sites. When done correctly, relevant and useful links offer content to the end user and provide additional resources and “trains of thought” to continue the learning process on a subject.

    Relevant exit links add value to a site, again by providing the user with another “knowledge gateway” to pass through, leading to more information or related information on a subject. This is the essence of the web. And many site operators won't provide a link unless they can get one back.

    If Google is giving less value to sites that engage in HIGH VOLUME IRRELEVANT linking, I applaud this move, as it sends the right message to website operators to keep linking relevant and for the end user.

    If Google is penalizing sites for engaging in ANY TYPE of reciprocal linking, that smacks all of the small businesses who have engaged in this practice correctly and ethically since the beginning of the Internet, pre-Google.

    Can we get some clarification on reciprocal linking please?

  151. My goodness. It appears “many” in this thread either did not read Matt's first post in its entirety, or are only reading the parts they want to read.

    First: in NO way is Google only looking at “inbound” links. In no way is Google only looking at reciprocal links. In no way is Google only looking at links in general with regard to how often a crawl takes place, or how many pages of a site are indexed, or not.

    Many of you are not thinking about the much bigger picture. I also know that “some” firms out there, including firms who do SEO/design for clients, have not experienced ‘any’ of the problems found in this thread. Matter of fact, all positions have actually gone up.

    One thing in Matt's post went something like this: for every page that dropped out of first-page SERPs, another page took its place with a happy camper. He also stated the following, which is why I say many of you did not read it in its entirety.

    [quote]After looking at the example sites, I could tell the issue in a few minutes. The sites that fit “no pages in Bigdaddy” criteria were sites where our algorithms had very low trust in the inlinks or the outlinks of that site. Examples that might cause that include excessive reciprocal links, linking to spammy neighborhoods on the web, or link buying/selling.[/quote]

    Sorry for the ‘speech’, but it's sometimes tough reading stuff and not responding. 🙂

  152. Excellent post, Matt.

    As an observer it is fascinating to see how Google seems to be trying to balance or reconcile what appears to be a long-standing corporate culture of secrecy with the increasing need to share more knowledge, information and insight with the world at large. I think that your role in that process is not to be underestimated.

    My technical prowess in this field is virtually nil, and I am thankful that my site has not suffered a loss of pages in the index and is still maintaining a good ranking on my keywords. I am, however, very puzzled by the major differences in inbound links as listed by Google vs Yahoo and MSN. What especially struck me today when I checked was that, as well as listing about six times as many, Yahoo seemed to be giving greater prominence to what I would view as better quality links. In particular, the links from my CafePress store are prominent in Google, whereas links from relatively obscure (but probably much more credible) sources get more prominence in Yahoo results.

    The other observation I would make is that I have always thought that Google was not as adept at searching images as at searching whole pages. For whatever reason, although able to consistently maintain a top 3 ranking for the keywords ‘stained glass’, I have totally failed to get images even into the top 50 for the same keywords. I'm now reading (today) that perhaps a dash instead of a space might help, but I do also believe that the image search mechanism is something of an Achilles heel for Google. Just MO.

    Keep up the excellent work.

  153. Thanks for the great post Matt.

    Matt: “graywolf, it’s true that if you had N backlinks and some fraction of those are considered lower quality, we’d crawl your site less than if all N were fantastic.”

    One of my sites is a mom n pop jewelry store which has some really unique content. The link building process is going slowly because there’s only so much I can do, however, I find that a lot of backlinks are from scraper/junk sites which are totally beyond my control. Does that mean that my site will get crawled less because of these junk sites linking to me?

    Google Sitemaps lets the webmaster be proactive in helping G with crawling, so why not do this with backlinks too? It would be really cool if G also provided a link removal tool, where you could specify domain pattern matches to discount certain links from being counted. I'm sure you'd also love to see the aggregate info. You could also tie it in to the spam report function… ok, enough rambling.. need coffee…
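
    Something along these lines, purely as a sketch of the idea – the function, the patterns and the tool itself are hypothetical, just to show what “domain pattern matches” could mean in practice:

    <?php
    // Hypothetical illustration of the proposed tool: a webmaster-supplied
    // list of domain patterns; any backlink whose host matches a pattern
    // would be discounted rather than counted against the site.
    $discountPatterns = array('*.scraper-example.com', 'junk-directory-example.net');

    function is_discounted($backlinkUrl, $patterns) {
        $host = parse_url($backlinkUrl, PHP_URL_HOST);
        foreach ($patterns as $pattern) {
            if (fnmatch($pattern, $host)) {
                return true; // ignore this link when judging the site
            }
        }
        return false;
    }

    var_dump(is_discounted('http://www.scraper-example.com/page.html', $discountPatterns)); // bool(true)
    ?>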

    Doug, you are wrong about people not having read Matt's whole post. It's true that smaller parts are being focussed on, but that doesn't mean that we haven't read it all. A smaller part:

    Someone sent in a health care directory domain. It seems like a fine site, and it's not linking to anything junky. But it only has six links to the entire domain. With that few links, I can believe that out toward the edge of the crawl, we would index fewer pages.

    and more about the same site…

    A few more relevant links would help us know to crawl more pages from your site.

    Google knows that the site exists, and they know that there are more “fine” pages that they haven’t indexed. They don’t need to be told to index more so that the engine is more comprehensive. They should try to make the index as comprehensive as possible.

    Matt’s best guess is that it is a low priority crawl/index site, and that they are intentionally leaving some of the site’s pages out of the index, just because it hasn’t attracted enough natural IBLs. That’s no way to run a decent search engine. It is grossly unfair to link-poor sites, and it short-changes its users.

    Now if you can think of a good reason why some of that site's pages should be left out of the index, just because it has only attracted 6 natural IBLs, then tell us. We're all ears.

    You are whitehat incarnate. Do you think that webmasters should have to do unnatural link-building, just so that a search engine will treat their sites the same as other sites? Do you think it's a good idea for Google to tell webmasters that their sites can't be fully indexed unless they make the effort to do things that they (and you) have always talked against – doing things solely because search engines exist?

    A general purpose search engine should try to index all of the Web’s decent content as far as they are able. It should never come down to leaving stuff out just because it hasn’t had enough votes. If it’s there, and if it’s useful, index the bloody stuff.

  155. I guess webmasters still don't understand what a link farm is, because they keep asking if reciprocal linking is great!

    Some websites run reciprocal linking pages – you have seen them: “cut and paste this code into your pages and we'll give you a listing” – and many directories do the same thing, letting you gain a listing in their link farm database!

    These networks are simple for the SEs to bust. Webmasters must figure out that all this exchanging of links, and the whole “get a million links from anywhere and rank high” game, is old and worn out!

    I can see this is hard to accept because many have conducted “SEO” this way for years and don’t know any other way!

  156. My first post was long enough, so I didn't address it, but will now, since Doug brought it up:

    “The sites that fit ‘no pages in Bigdaddy’ criteria were sites where our algorithms had very low trust in the inlinks or the outlinks of that site”

    Inlinks?

    So now my position in organic results, or the number of pages Google chooses to index from my site, can be affected by the sites that link to me?

    I hope I’m interpreting that wrong….

  157. Hi Phil. For the record, my post was not aimed at you. I actually hadn't fully read your last post until now. I've just seen many in here who really don't get the overall picture of what Google is trying to say.

    First off: the overall structure/architecture of a site has lots to do with ‘crawling’ in general, and, btw, has lots to do with how Google views your site as a whole, quality-wise. Quality for PageRank – “INTERNAL” Google PR – and quality of the other sites in your network of links in and out. AND: quality of the programming involved with the site. It's the overall picture. Robots are not getting dumber – they are getting smarter.

    I think many simply believe that someone can go into an existing site, change some code here and there… and presto, the site is doing well. I also think many still believe this stuff is mainly about links coming in and going out.

    Nothing could be further from the actual real world.

    [quote]You are whitehat incarnate. Do you think that webmasters should have to do unnatural link-building, just so that a search engine will treat their sites the same as other sites? Do you think it's a good idea for Google to tell webmasters that their sites can't be fully indexed unless they make the effort to do things that they (and you) have always talked against – doing things solely because search engines exist?[/quote]
    I can't speak for anyone else, but my firm stopped looking for reciprocal links about 2 1/2 years ago. Matter of fact, we deleted all the link-only pages that clients had. We don't ever plan on “pursuing” links in any way, shape, or form. It seems to work just fine. And no, some are in competitive markets as well.

    IMO, the best-built websites, the ones that will do well into the future, are those built “strictly” for their visitors. Period. That's the philosophy we have had for a long time now. If built that way, the robots will like the sites as well. At least they have up to this point in time.

  158. Matt, it’s an old topic but I have a question related to it.

    Is it possible that if I put AdSense on, say, page1.html, and it links to page2.html, that could cause Google to fetch and cache page2.html? Even though there are no links coming to page1.html or page2.html from anywhere else on the web, and it hasn't been submitted to Google?

    If so, this is a potential problem.

    Example: I put up a domain a while ago that just had 4 words on it… while I developed the site under a subdirectory.

    Anyway, one of these subdirectory pages had AdSense on it while under development (testing placement etc.) and happened to have its links pointing to the main URL.

    What I noticed shortly after was that the main URL was now cached and in Google's index (while the page with the AdSense wasn't).

    I'm trying to think of another reason, but the main URL had no in-links at the time from anywhere (at least none showing in MSN, Google or Yahoo, and no visitors from anywhere).

    Now the site is live, and I can't get Googlebot to revisit it. It's been a month since its first uninvited visit, and since it saw just a “coming soon” type message, I can't blame it for not coming back.

    Could this be an unintended consequence of the adsense caching thing?

  159. From Google Sitemaps

    “Some of your pages are partially indexed”

    Explanation from Google Sitemaps

    “We are always working to increase the number of fully indexed pages in our index. While we cannot guarantee that pages in our search results will always be fully indexed, crawler-friendly pages have a greater chance of being fully indexed. Our crawlers are best able to find a site when many other high-quality sites link to it.”

    So what I need to do is create links or I won't be indexed, and create my pages for crawlers…….. Zzzz

  160. Hi Matt,

    I read your comments in a different way from most of the negative posters.

    It seems to me that Google have realised that they cannot index every page on the web every day and simply have to prioritize.

    Therefore sites with lots of good themed inbound and outbound links are “prioritized” in the crawl cycle in a similar way to how they are “prioritized” in the SERPs.

    Sites with a high percentage of non-related reciprocals or spammy links are not given the priority treatment and therefore get crawled “less often” (rather than not at all).

    The result being that internal pages of these sites, especially the deeper pages of very large sites of this type, may drop out of the index from time to time resulting in poor SERPs for some pages of these sites.

    It may seem unfair to some, but if you think about it, a themed natural link (or a reciprocal in some themed cases) can be taken as a vote for the site by the WWW, and as such it makes the site worth crawling more than a site without many (or any) votes.

    The question is: should Google concentrate on delivering high quality “popular” sites in the top ten, where most of the public click, or concentrate on indexing every page on every site, even if a page is unlikely to be returned in the top 50 results?

    My take is the first option with the top ten results including the top ten most popular sites calculated by:

    Natural inbound and outbound themed Links (votes)
    Click Through Rate & time spent at site (popularity)
    Clear Relevant Title & Meta Tags (keyword friendly)
    Original useful updated content (quality)
    Some themed Reciprocal Links (community)

    If your site passes the above tests you really should be OK.

    Good luck all.

  161. Matt,

    Snow Crash by Neal Stephenson is one of my favourite cyber-punk books; you could also try Mr Nice by Howard Marks.

    Enjoy your vacation!

  162. >>Linking to a free ringtones site, an SEO contest, and an Omega 3 fish oil site? I think I’ve found your problem. I’d think about the quality of your links if you’d prefer to have more pages crawled.

  163. Don’t know what happened to my message – here it is again.

    >>Linking to a free ringtones site, an SEO contest, and an Omega 3 fish oil site? I think I’ve found your problem. I’d think about the quality of your links if you’d prefer to have more pages crawled.

  164. In regard to this whole revelation about crawling priorities and all that jazz, I have one question that should clear up a lot of things for all of us webmasters and SEO folks.

    Let's say we have a site that is under a year old. It has well written and informative original content, clean coding, no shady stuff going on whatsoever. To promote a healthy link exchange, the webmaster/SEO installs a link exchange directory which is accessible from all pages of the site. Now, is having a link exchange directory that contains many different business categories (no porn, pills, warez, casinos and hopefully no “made for AdSense” sites) a bad thing?

    This is important because many, MANY sites have this kind of setup. And with the number of free dynamic scripts out there that enable and automate the link exchange process, are they now considered tools of damnation in Google's eyes? Please give a good example of what is good in this scenario and what is bad about it.

    I think a lot of us are at the breaking point with Google, and that can only spell trouble in the long run for everybody.

  165. Hi Matt, I have a decent sized forum on my site with about 221,000 posts in 8,000 threads.

    I recently moved my forum from domain.com/forum to forums.domain.com, and I now have only 6 pages listed in Google. I am guessing this will change, but I have lost some of my domain.com listings that were unrelated to the /forum directory.

    My concern is, I want to use some kind of redirect to send visitors to the appropriate link – what should I use? At the moment I am using a PHP redirect in /forum/index.php and /forum/showthread.php to redirect to the appropriate link.
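
    For what it's worth, here is a minimal sketch of the kind of redirect that seems safest: a permanent (301) redirect issued by the old scripts. The “t” thread parameter is an assumption on my part (a vBulletin-style convention, not something stated above); adjust it to whatever the forum software actually uses:

    <?php
    // Placed in the old /forum/showthread.php: issue a permanent (301)
    // redirect so both visitors and crawlers learn the new location.
    $threadId = isset($_GET['t']) ? (int) $_GET['t'] : 0;
    $target = 'http://forums.example.com/showthread.php'
            . ($threadId ? '?t=' . $threadId : '');
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: ' . $target);
    exit;
    ?>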

    And also, could this move and redirect be affecting my other results on the top level pages?

    Thanks for your help!

  166. um, so that is a great post… but like many others I am seeing strange things happening with one of my sites, where a 301 seems to be producing all sorts of strange results. Our site travelscotland.co.uk now seems to be registering on Google as http://www.scotland.net and strange variants of that domain such as http://www.facts.scotland.net, even though these have all been correctly 301ed to travelscotland as they are supposed to be. I thought Big Daddy had cured this. I now notice that this problem seems to have resulted in the site not being indexed much any more – caches are all from April, whereas we used to be always up to date in the old days. Is this another artifact of the Big Daddy changes?

  167. Hello Matt, this morning I sent a mail to bostonpubcon2006 at gmail; maybe resolving this issue will help cut down the noise in your comments.

  168. Hi Matt, Hi JohnScott. Hi everybody.
    I'm the webmaster, the person who did this: http://www.mattcutts.com/images/poor-quality-links.png
    Matt, first, thanks for not mentioning the URL, my name or email – it's a really, really stupid step I took to bring my client's site back into Google's index.

    In short:
    the website is about 6 months old, with around 30 uniques a day. I was building “natural” links for 2-3 months – a few PR5 and PR6 sites, but mostly PR1-3.

    Right now it has 520 pages indexed – Matt, you can check them 😉
    And again 30 uniques a day 😉

    Two months back the website had (almost) every page indexed – around 1000.
    There were NO LINKS AT THE BOTTOM. Not a single link! Only internal links.

    Then the pages went down to one – three on the best days – for a month.
    My clients were not satisfied and asked me to fix it. So what did I do?
    Pyramid linking – 3 URLs in the index. 5 DP co-ops. Reciprocals from directories – the clients were not going to pay me for this, so I just spent exactly one hour setting up pyramid linking, co-ops, and stupid directory listings.

    JohnScott, I AM REALLY REALLY SORRY for this link (v7 contest)! It IS A RECIPROCAL LINK FROM A DIRECTORY! I absolutely did not look at the anchor text. SORRY AGAIN 🙁

    When I brought back the “new website”, with these links at the bottom, I got nothing at first. Then Google started bringing back my webpages – around 10 to 30 per day. One day I saw they were back to 100 or something, and then I sent the mail to Matt Cutts. The next day the index was good – 400 or something – and from that day (Matt pointed out the date) I was getting +10 pages a day. With these links from the image. An hour ago I removed every single outgoing link; I left only the internal ones.

    My hands are shaking.
    I won't be able to sleep…

    Matt, I may not have written everything above in the right order – I don't myself think this is the reason for NOT being indexed in Google.
    The situation is not the way you described it. The reason is something else; I know my website, I know what I did with it. I can prove that these links are not affecting my dropping and rising page counts…
    I will not post anything else here if Matt doesn't want me to.

    JohnScott, you are a great person – sorry again for mentioning your contest in this bad topic 🙁

    /sorry for my broken English – I'm from an Eastern European country 😉 /

  169. Matt,
    Topic Specific links
    I’m a believer in topic specific links. Is this post saying that off-topic links – whether inbound or outbound – will incur a penalty?

  170. Isn't relevance more important than whether there is a wrong link on your site?

    Sorry, this is bullshit. I only want to find what I'm looking for. A wrong link on the page (which could still be relevant to the human reader) doesn't interest me.

  171. IMO, the best-built websites, the ones that will do well into the future, are those built “strictly” for their visitors. Period. That's the philosophy we have had for a long time now. If built that way, the robots will like the sites as well. At least they have up to this point in time.

    Can we get an Amen and a Hallelujah for this?

    TES-ti-fyyyyyyyyyyy, mah brutha!

    Come on, everyone, throw your hands up for the GOOD Word.

  172. If I have to look at one more Amazon listing at #1 and #2 after being bumped to #3, I’m going to run screaming into the night.
    Just because it’s dAmnazon doesn’t mean EVERY page of their site is more relevant than everything else on the planet…

  173. Hi Doug. I didn't take your post as being aimed at me personally 🙂

    Unlike you, I've never done reciprocals, so none of my sites, or any site that I've ever worked on, will ever suffer because of that method. I haven't even been talking about any sites that I've had anything to do with, although I used one as an example, because it specifically has a clean bill of health, and it's frustrating watching it die – presumably because I didn't do any link-building, so it doesn't have enough IBLs.

    This isn’t about a site “doing good”, Doug, or about rankings (you mentioned that in your previous post). This is about a site being treated fairly – just because it’s there. If a site contains useful pages and resources, then it should be fully indexed, regardless of how many votes it’s managed to get. That’s what search engines are supposed to do. That’s what a search engine’s users expect it to do for them – they (we) expect a good search engine to give us the opportunity to find as many useful pages and resources as it can. They (we) do not expect the engine to intentionally leave out useful pages and resources.

    Doug. Can you come up with a good reason why that example site (the one I mentioned in my previous post) should not have all of its pages indexed?

  174. Thank you for the post; it was the most information I’ve seen on the situation thus far.

    Unfortunately, it was also rather disheartening as regards inbound links. Some sites just do not naturally generate all that many valid inbound links – and I'm not talking about small “mom & pop” sites, either. My two largest sites are B2B catalog shopping sites, not really small, as each has in excess of 2000 products online. Both have been in operation for years, and both were hit very hard in the past few months by pages dropping out of the index. I would stake my year's salary that there aren't any spam issues with either site (and both did QUITE well up until BD started rolling out) – but, at least by Google's lights, there just aren't many links out there to either site; the only places that WOULD link to them are spammy B2B directory sites, for the most part. They are undoubtedly in a lot of people's Favorites folders, because they have very high rates of return customers, but they will never collect any quantity of relevant natural links. I mean, think about it. Would walmart.com need a bunch of backlinks in order to rank highly, and why would anyone put a link to walmart.com on their site in the first place?

    So now I have to explain all this to my clients, who aren’t going to understand anything except that they’re pretty much out of Google for the foreseeable future, and there’s nothing legitimate we can do to get back in.

    Like I said, disheartening.

  175. Matt,

    Sounds like Google is now actually penalizing for poor quality inbound links. Does that mean that a malicious competitor could link our site to FFAs, link farms and other bad neighborhoods and actually hurt our rankings or get our crawl cycles reduced?

    Also sounds like Google doesn’t like affiliate links. But if AdSense is okay then why are affiliate links bad? Seems that coupon sites, shopping comparison sites and reviews are in jeopardy of being dinged now.

    I have a PR7 homepage (www.shopping-bargains.com) but no rankings, and my homepage isn't even in the index (I just checked, and we have only 10 pages there – it was 8 yesterday). We used to have thousands of pages indexed. Something seems odd – we have no reciprocal links, don't sell links, don't buy links in bulk (we occasionally buy a banner or link for marketing reasons in a newsletter or blog, etc.). We have original content and have been online for 7 years. The index is fluctuating wildly for us, though.

  176. I too agree with PhilC!

    And, all I know is that as one who has tried to actually use Google to buy some rare plants, the results are way bad!

    Trust me that when you do a search for an item to buy and all you can find are the paid for listings and stupid content sites, the value sure ain’t what you were looking for!

    I have even put “buy” and “for sale” in the searches and all I get are stupid content sites. Some great value that is! I want more choices than the stupid paid listings. Thank you just the same!

    All I have found are stupid sites describing and telling the history of the plants. What I wanted was the small plant nursery in New York that sells the plant. I was ready to buy! That is why I typed “buy” in the queries. Too bad and so sad that they didn’t attract enough natural links! No results for me!

    Google results suck for shoppers of rare plants, at least!

  177. Matt,

    About indexing: why is the site: command for some sites still showing more pages than the site actually has? I know of sites that have maybe 12,000 pages, and Google is happily claiming they have 35,000 pages indexed.

  178. [quote]Sounds like Google is now actually penalizing for poor quality inbound links. Does that mean that a malicious competitor could link our site to FFAs, link farms and other bad neighborhoods and actually hurt our rankings or get our crawl cycles reduced?[/quote]

    Mike, IMHO, Google is not actually penalizing a site for poor quality IBLs. What they do is discount those links in a much more drastic way than before. What is more, they now regard reciprocal links as very poor quality links.

    As you have fewer IBLs, your PR drops and your site is less crawled/indexed than before.

  179. I think you guys might be missing one thing here. I don't think what happened is a permanent thing. Your pages have dropped out because of crap links not counting, or your quality links' own crap links not counting. It is all about reputation. If your pages are gone now, it is because you lost your reputation. This doesn't mean your pages will not get indexed. It just means your pages are pushed back to a waiting, almost sandbox-like state, where it will take time for them to be indexed again. Quality natural links help just like they always did. Now it is just harder to fake.

    Maybe Matt can tell us if that is true… Will the pages eventually get indexed even if they don't have the reputation from other quality sites? Is it a matter of time and age if there are no links?

  180. Adam Senour… what do you think about webmasters who submit their sites to directories?

    These types of links aren't exactly natural, and often aren't relevant. Are these people trying to manipulate the search engines?

  181. Alex Duffield

    OK, Matt, I just don't get it: how on earth does a site do this sort of linking

    http://hyak.com/links/links_computers_internet.htm

    And not get penalized.

    They have thousands of links like this, totally irrelevant to the content of the site.

  182. It is not a penalty… it is a matter of what counts and what doesn't, PERIOD.

    What feels like a penalty is just the fact that you lost reputation because your links, or your links' own links, are no longer valid. Your competition cannot hurt you by building bad links in your site's name. That might actually help, with whatever small amount of traffic you get from it. Other than that, it will not hurt; it will just not help.

  183. Matt,

    Thank you for all the information that you furnished to us and the examples that you showed. It was very welcome feedback.

    The information that you furnished surely helped everyone understand what is happening and why it is happening.

    Personally I hope that Google continues to present, high in the SERPs, quality sites that assist a visitor with helpful information. People have to remember that Google sets guidelines, and those who don't follow the guidelines will suffer. We have to follow the “rules” in order to win the high SERPs.

    Again thanks for your post.

    Now enjoy your vacation and the time with your family.

  184. People have to remember that Google sets guidelines, and those who don't follow the guidelines will suffer. We have to follow the “rules” in order to win the high SERPs.

    Exactly what “rules” did the Health Care directory site break? (that’s one of the examples that Matt gave)

  185. Hi Matt, thanks for the post, I agree it was good to hear some real estate examples used.

    I have a question regarding overuse of reciprocal links possibly causing a lack of crawling. I run a network of real estate sites, one site for each country/region we offer (9 in total). Each site links to each of the other sites, for obvious reasons.

    My question is: would this interlinking of 9 different sites to one another, from every page on each site, be regarded as spammy by the Google bots? Would this have a negative effect on my sites?

    If anyone else can offer any advice I’d be very grateful. Thanks.

  186. PhilC, again: do you feel that the pages, even in time, will not get indexed without quality links? If a site has something indexed, don't you think the rest will get indexed in time? I think so, but how long is the real question.

  187. Matt, could link selling affect a site’s ranking in the SERPs, directly or indirectly, despite the site’s not “losing” any pages in Google’s index?
    I'm seeing a client that sells some (on topic) text links on some of his pages suffer in the SERPs since a couple of days ago, yet that client hasn't lost any pages whatsoever in the index – the only thing I'm seeing is that all pages that previously ranked well are now not ranking well (yet still can be found)… Could there nevertheless be any connection?

  188. Mike, it might have something to do with the many, many links you have from spam pages like this one: http://www.creativehomemaking.com/articles/112603g.shtml

    and the fact that all of these spam pages participate in the same “web rings” with the links on the bottom.

  189. Hi Matt,
    For the sites you described above as having poor links: if the outbound links had rel=nofollow, would that improve the number of pages indexed?
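
    For anyone unfamiliar with the markup, this is roughly what a nofollowed outbound link looks like – a minimal sketch using a made-up PHP helper, not anything Google provides:

    <?php
    // Hypothetical helper: wrap an outbound URL in an anchor tag carrying
    // rel="nofollow", which tells crawlers the link is not an endorsement.
    function nofollow_link($url, $text) {
        $safeUrl  = htmlspecialchars($url, ENT_QUOTES);
        $safeText = htmlspecialchars($text, ENT_QUOTES);
        return '<a href="' . $safeUrl . '" rel="nofollow">' . $safeText . '</a>';
    }

    echo nofollow_link('http://example.com/', 'an outbound link');
    ?>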

    thanks

  190. Phil, my site is a mom and pop site. I don't have a lot of quality incoming links (I have a ton of links from scraper sites), and all my pages are indexed. Like you, I have never done a link exchange and don't plan to. So the idea that you have to go out and build links in an unnatural way is not the case for every site.

    Most of my pages can be reached within 3 clicks, a few within 4 clicks, from the home page and/or the common navigation that is on every page. I don't use Google's Sitemaps. I do have a pretty good HTML site map, so I'm wondering if navigation may be part of the problem for some sites losing pages or not getting them included in the index.

    I see a big difference in the number of pages when checking with the API and actually checking at Google.

    I do have a problem that has surfaced in the last couple of weeks: Google is having problems with 301 redirects again – at least for me.

    site:domain.com 500 plus pages
    site:www.domain.com plus pages
    site:www.domain.com -www.domain.com 300 plus pages, which are supplemental, so maybe they are on their way out again

    Up until a few weeks ago, site:www.domain.com -www.domain.com showed 0 pages. According to the API, domain.com is showing PR again.
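
    As an aside, the usual belt-and-braces fix for the www/non-www split is a single permanent redirect to the preferred hostname. A minimal PHP sketch, assuming www is the preferred version and example.com stands in for the real domain:

    <?php
    // Redirect bare-domain requests to the www hostname with a 301 so
    // only one version of each URL gets crawled and indexed.
    if (isset($_SERVER['HTTP_HOST']) && $_SERVER['HTTP_HOST'] === 'example.com') {
        header('HTTP/1.1 301 Moved Permanently');
        header('Location: http://www.example.com' . $_SERVER['REQUEST_URI']);
        exit;
    }
    ?>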

  191. hmm, so things are becoming more and more difficult each day, and I am now of the view that people will need to understand the real importance of good content, updated frequently, and having good links only.
    No short cuts anymore 😐

  192. Connie, is your site ranking for smaller terms only? I bet your niche is not a very competitive one and you are ranking for obscure terms rather than large general terms.

    I too am associated with a niche site that does no link building and has not been affected by Big Daddy, but this site is ranking for small stuff.

  193. [quote]Can you come up with a good reason why that example site (the one I mentioned in my previous post) should not have all of its pages indexed?[/quote]

    The one in frames? The one using javascript links and no hrefs? The one that looks like an MFA site? Is that the one?

  194. It's worth remembering that not every site can appear at the top – it seems obvious, but Google has to place results in some kind of order, and I have no quarrel with newer sites not being featured as fully as mature sites; if they develop, their turn in the sun will come.

    If the result is a cleaner, more spam-free search result, then I doubt many users will be complaining. And I suspect Google has not forgotten the needs and preferences of searchers, as we consider our sites, our client sites and the spam sites that get in the way.

    This has been a very useful thread – but has it really contained any surprises? I don't think so – Google has long warned of the bad practices mentioned above; some people just never believed it would put its money where its mouth is. Kudos to Matt and the teams for significant progress on reciprocal and paid-for links.

  195. I've spent a lot of time today simply looking through the SERPs and reading posts here, there and everywhere.

    Yesterday we saw a massive change in the SERPs, and now my target market results are full of nothing but spam, cloaked pages and general rubbish.

    Now, I run a number of affiliate-based sites, but I build my own pages with my own content, and as of yesterday I basically don't get found. Has Google scanned my site, realised that my visitors are joining an affiliate program, and therefore penalised me for that?

    I can appreciate penalising sites that buy a domain name, basically copy their affiliate program's text etc., and throw the site up with nothing original to offer at all; but just because I promote affiliate dating, does that mean that the 2 years of work that have gone into the site are simply ignored?

    Matt, I'd be interested in any feedback, and the URL is available if you get the time to respond.

  196. Connie: it may be that a very large number of sites haven't suffered – yet. But I can't see that what's happening is to do with navigation. For one thing, Matt used examples where he said that, if they get more IBLs, the new system will know to index more pages (as if they don't already know). For another, the many sites that are having their pages dropped had their pages indexed – so why drop them if they are already in the index?

    We have always known that PR affects the frequency and depth of crawling, so crawling was never equal among sites. But now they have added links to the criteria, and if a site doesn't do well on the links score (e.g. not enough IBLs that can be trusted), it just doesn't get crawled as it deserves, and its owner is forced either to accept it, or to get spammy and do some link building.

    I was tempted to suggest that it may still be down to just PR, because IBLs bestow PR, but the site that I've used as an example currently shows PR5 on the homepage, which has always been enough for decent crawling. Even so, the toolbar PR is always out of date, or they may have simply moved the scale up a bit. But Matt said that the crawling/indexing system is new, so I'm sure that it isn't still just PR, and that IBLs, and maybe OBLs, are significant factors.

    The example site that Matt gave – the one that I used in a post – is a directory, and, as a directory, it probably needs to be drilled down into – good pages that are plenty of steps away from the homepage. If the number of steps could have been the cause, then I'm sure that Matt would have said so, instead of simply saying get more IBLs and we'll crawl more of the site's pages.

  197. Hi Matt,

    some URLs lose pages in the index, but this site is still growing:
    site:69.41.173.145 –> Do you think your duplicate content algo is OK?

    Greetings from germany,
    Tobias

  198. Hi Phil, I don't know. If you post the site in question, maybe “ihelpyou” with it. 😀

    You know, there is no way that a general answer to an unseen website with problems is a good thing. I'll put it this way: I really doubt your problem has anything to do with “links” in or out. The entire backend code and HTML output might need to be redone.

  199. Adam Senour… what do you think about webmasters who submit their sites to directories?

    These types of links aren't exactly natural, and often aren't relevant. Are these people trying to manipulate the search engines?

    In and of itself, I don't have a problem with the concept. The problem lies in the quality of the directory, whether it offers free one-way inclusion to sites that deserve it, and how many such directories a webmaster submits to.

    Submitting to a directory, particularly to one with a captcha tool or similar device, is not automatic, nor is the approval (depending on the quality of the directory again). I don’t really see it as “unnatural” either, since the basic premise of these sites is to act as informational portals. Submitting to a directory provides them with the content that they need to build their own site, and gives the webmaster a link for traffic generation purposes (notice how I didn’t use the phrase SEO purposes).

    A good example would be Human Edited Directory (yeah, I’m a mod there, so I’m slightly biased…although I’d say the same thing if I wasn’t). You don’t get onto that directory unless you damn well earned it, although you can submit for free.

    What’s “unnatural” to some about submitting to directories is that you have to go to some stranger’s site, find a relevant category, and ask for a link in that category.

    So no, I don’t have a problem with it, as long as the directory has some quality standards in place and the link provided is a relevant category backlink.

    To borrow from something Phil has stated in the past, it’s quite often not the concept, but how people choose to abuse it.

  200. Hi Matt,

    I have a quick question for you regarding the supplemental index. When I search for a manufacturer part, QUA41965, in Google, it starts returning supplemental results on the first page (4 out of ten). Each additional page is primarily from the supplemental index. Yet when you drill down to page four, you can still find results that are not in the supplemental index. Shouldn't pages not in the supplemental index be returned before those that are? It seems to go against what is considered supplemental, IMO.

    Thanks.

  201. No more free traffic

    hehehe

    Buy some AdWords, folks. Free traffic is now only free for the multinationals, spammers and those with special deals with Google.

    All the others buy some AdWords please…contribute to the great cause…

  202. Is SPAM any attempt to deceive the SEs to artificially increase rankings?
    What if I have a nice W3C-validated site with some 16,000 clothing products from various vendors, nicely categorized, with an updated datafeed and some coupons – no spamming techniques, no hidden text, no cloaking, nada, zip, zero.
    That, by that definition of SPAM… is not SPAM.
    So why am I reduced to one indexed homepage in Google?
    Isn't the ability to search the same class of products from various vendors at the same time, and compare prices, service enough for the mighty Google?

  203. Hi Matt,

    Great post, and thanks for following up on the comments – makes for a happy community 🙂

    Like most people here, I have a number of otherwise good sites with 5 or so rubbish footer links on each page. I was never happy about putting them there, but I only did it because it works, and there is no point writing original content if nobody will read it.

    Are you saying that these links are now completely worthless? I’m all for “best practice” and following the guidelines, but I’m reluctant to stop this kind of linking if it still works for other people.

    Don’t get me wrong, I’m very keen to see the death of unrelated footer links, “resource” directories and begging emails – but if Google still rewards these practices with good rankings then people will continue to use them.

    Thanks for the responses so far.

    Harvey.

  204. Eternal Optimist

    In reference to supplementals, I am sorry, but the issue has not been dealt with here: doing a site: check on a number of both small and large sites, it is difficult to find one that does not have a supplemental issue.

    This must mean that Google has an imbalance in the settings of the algos, OR that it deems virtually no sites worthy of the merit of a clean bill of health.

    I can check the same sites, which aren’t even mine, on a daily basis, and they delve deeper and deeper into supplementary hell.

    Why would webmasters have coined the phrase ‘supplementary hell’ if there was no issue?

    Thanks for trying to appease webmasters Matt, and we don’t blame you. It’s just that we feel it’s about time things improved.

  205. One thing I want to be clear about is that Bigdaddy isn't especially intended to do differently on spam; it's just an infrastructure upgrade to our crawling, and as we get better at judging link quality, our crawl changes as a natural consequence of that.

    The other thing is that I certainly don't want to imply that everyone who is still seeing fewer pages crawled was somehow getting spam or lower-quality links. I just wrote up the five cases that I analyzed in more depth. With a large change in our crawling infrastructure, it is to be expected that some sites will see more or less crawling.

    In fact, I just got out of an hour-long joint meeting with crawl/index. Jim, we talked about your site, the one where you said “I'm trying to maintain a straight ship in a dirty segment.” There are absolutely no penalties at all on your site; it's just a matter of PageRank in your case. You've got good links right now, and several hundred pages crawled, but a few more good links like the ones you've got now would help some more.

  206. what does this command do? site:www.domain.com -www.domain.com

  207. I have to admit, from reading the post and the ClickZ column, that what is happening in practice is that sites with less money and marketing spin behind them are regarded as less important, and are therefore pretty much to be ignored. It's no longer about document relevancy so much as site popularity.

    Perhaps one day Google will go a step further, and simply take the top 1000 sites according to Alexa, and return only results from them? 😉

  208. Let me also describe a little bit of the interaction between the main results and the supplemental results. Think of the supplemental results as a large set of results that are there in case we don’t find enough main results. That means that if you get fewer documents crawled/indexed in our main results, you’ll often see more supplemental results.

    So I wouldn’t think of “having supplemental pages” as a cause of anything. It’s much more of an effect. For example, if you don’t have as much PageRank relative to other sites, you may see fewer pages crawled/indexed in our main results; that would often be visible by having more supplemental results listed.

  209. That's a post, Cutts! hehe. As you've stated that poor choices in outbound links can cause crawling/indexing to be negatively affected, I'm wondering if the opposite can be said of linking to high quality (trusted), relevant sites? What say you, Inigo?

    BTW, great show yesterday. Hope you can do similar more often!

  210. shorty, much appreciated. I wanted to get the timeline out of my brain and talk about what I was seeing before I headed out for some time off.

    Alex Duffield, in my experience those links aren’t making much/any difference with Google.

    Peter, without knowing the site I couldn’t be sure. It’s possible that we’ve indexed the site with www and without www, or there might be some session IDs or other parameters that are redundant.

    “Sounds like Google is now actually penalizing for poor quality inbound links.” Mike, that isn’t what’s happening in the examples that I mentioned. It’s just that those links aren’t helping the site.

    David Burdon, no, off-topic links wouldn’t cause a penalty by themselves. Now if the off-topic links are spammy, that could cause a problem. But if a hardware company links to a software package, that’s often a good link even though some people might think of the link as off-topic.

    nsusa, WordPress seems to have problems with the greater-than sign.

    Peter Harrison, thanks for the book recommendation! I love early Neal Stephenson (less so his historical fiction).

  211. "The other thing is that I certainly don't want to imply that everyone who is still seeing fewer pages crawled was somehow getting spammy or lower-quality links. I just wrote up the five cases that I analyzed in more depth. With a large change in our crawling infrastructure, it is to be expected that some sites will see more or less crawling."

    Kind of enlightening that this MAY not be the case for us and others. It still hurts having the whole site (well, the better part of our sites) "avoided" in the index and not knowing why this is happening after 5 years of business.

    The sad thing is that the only thing we really have to go on is sharing experiences, and that isn't getting many of us very far; we just know there is some sort of problem and we can't find a correction.

  212. Matt, what about sites that have some pages indexed, with no links to the site or very few? Will the entire site ever get indexed? Is it a matter of time, or do you have to get more links to get pages indexed deeper?

  213. nuevojefe, thanks! It felt pretty business-like and on-topic. After the mike turned off, I took Danny up to an office and we just chatted for a couple more hours. It's amazing to me just how much fun some of the top people in search are. 🙂

    To go to your other question: I wouldn't be thinking in terms of "if I link to Yahoo/Google/ODP/whatever, I'll get some cred because those sites are good." If it's natural and good to link to a particular site because it would help your users, I'd do it. But I wouldn't expect to get a lot of benefit from linking to a bunch of high-PageRank sites.

    Peter Harrison, I'm going to go buy some books right now; you've inspired me. 🙂

  214. But I still don't see an explanation for pages that have been crawled, and are over a month old, not showing up in the regular or supplemental index.

  215. What a Maroon

    Matt, why is it bad for a real estate site to link to a mortgage site? They seem to go hand in hand. Obviously I couldn't follow the link to check whether it was just a scum-sucking scraper site, but your statement seemed to overgeneralize. If your bot does the same, then Google has a problem.

    I am also perplexed by the reciprocal linking issue. Is it now always a bad thing? Is the relevancy of the topic a compensating factor? While it may be gamed, it has also become a powerful networking tool for many. In my personal services sector, where referred business is an integral part of the business model, I have received referred business from people I met via reciprocal links amounting to close to $50k in income in the last 10 days alone. How is this bad?

    I am also confused by the apparent contradiction regarding links, PR, crawling, and indexing. It sounds like a chicken-and-egg scenario. We're told not to buy links or reciprocate, but if that advice is followed, then Google won't crawl or index the site, so how is anyone to find it and be so overwhelmed as to be compelled to graciously link to it?

    OMG, did I just agree with Jill and PhilC on the same issue in the same sentence?

  216. Let me also describe a little bit of the interaction between the main results and the supplemental results. Think of the supplemental results as a large set of results that are there in case we don't find enough main results.

    This has also been part of the problem, Matt. The supplemental results have been unsearchable. They are not being returned when there aren't enough main results.

    Dave

  217. What's Google's definition of 'Find web pages from the site domain.com'? If you click those links, you sometimes only get the index page. Even supplemental results should show up when you do that search.

  218. Wow Matt

    You've got some cojones, coming out and saying the CEO was wrong about the machines being full??

    Everything else is old-hat SEO that amounts to the "Webmaster Quality Guidelines" being followed.

    Obviously you could have saved some carpal tunnel by just telling people what I and others have been saying: reciprocal links have zero value other than to hurt you. Follow Google's webmaster guidelines and you'll be fine.

    Clint

  219. Wayne

    Thank you Matt for the update. I really appreciate you finally using some real estate sites as examples. Since this is an indexing issue I thought I would bring it up.

    After checking the logs today I noticed this coming from Google pertaining to our site.

    http://www.google.it/search?hl=it&q=fistingglessons&btnG=Cerca+con+Google&meta=

    LOL, now as you can see, the #2 site is a real estate site listed for this search term. The page showing for this search is a property description page. As you can tell from the site's description, it has nothing to do with this subject matter. Would you mind checking with the index team to see why this would be indexed for such a phrase?

    Matt, could you please check on this for me with the index team? I am sorry, but I am getting a lot of traffic from this according to my logs, and we shouldn't be ranking for fistinglessons with a home listing details page. The house listing has been removed from that page, but it's the kind of traffic I do not wish to have.

    If you decide to dig around in our site, your thoughts on whether we are abiding by what Google likes to see would be nice. (I know I asked for it, so whatever I get I won't hold against you 🙂) We want to stay 100% Google compliant, but as I have said before, we are small fish in a big pond, so we make mistakes like everyone else.

  220. Saying you can't do reciprocal linking is just sheer idiocy. How does Google expect you to get backlinks?

  221. Guys, can you please stop asking silly questions…

    The message is crystal clear… use AdWords…

    😉

  222. Hi Phil, don't know. If you post your site in question, maybe "ihelpyou" with it.

    You know there's no way a general answer to an unseen website with problems is a good thing. I'll put it this way: I really doubt your problem with your site has anything to do with "links" in or out. The entire backend code and HTML code output might need to be redone.

    I didn't ask about my site, Doug. I asked you if you could come up with a good reason why the health care directory site that Matt used as an example shouldn't have all of its pages indexed. Perhaps you should have read *all* of Matt's post 😉 There isn't a good reason. Matt's best judgement is that it's a shortage of IBLs. Simple as that. The site had had its pages indexed, but with the new BD crawling/indexing its pages have been dropped simply because it doesn't have enough IBLs. It makes sort of sense at all.

  223. That last sentence should have read…

    It makes NO sort of sense at all.

  224. Matt: is there any way to tell Google "index this page, but serve the permalink in the SERP"? This is a problem for my blog… when entries are being served off the main page (http://dossy.org/), searches for keywords in those entries return the main page URL in the Google SERP. However, it seems the index is updated less frequently than entries drop off my main page, so while Google's SERP brief text shows the relevant content, causing a user to click through on the result, the page they end up on no longer has the content. Eventually, it seems, Google's crawler figures things out and the SERP links to the permalink for the entry… but I'm sure this behavior is frustrating some users.

    I tried adding the meta header "noindex" to my main page to prevent it from showing up in SERPs, but then for searches for "dossy", where the main page SHOULD be #1, it no longer had the #1 spot, which was very annoying. So I've removed the meta "noindex" from the page and am waiting for Google to crawl my blog again.
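    (For reference, the tag I tried is just the standard robots meta tag in the page's <head>, something like this:

        <meta name="robots" content="noindex">

    though I'm going from memory on the exact markup I used.)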

    Any advice? Thanks!

  225. Matt,

    You have no idea what your two-sentence comment has done to lift the spirits of 2 down-and-out guys in Boston… thanks!

    jim

  226. It appears that everyone is getting on board with links as the problem.

    So I did some checks on one of my sites. When I do a link:www.mydomaininquestion.com on Google, I get "Results 1 – 1 of about 45 linking to http://www.mydomaininquestion.com. (1.25 seconds)" in the bar, with only a page from my actual site shown.

    However, if I do the same thing on Yahoo, I get "Results 1 – 10 of about 671 for link:http://www.mydomaininquestion.com".

    So for some reason, 44 of the links that Google knows I have are hidden from view but show up in the count, and are perhaps hidden from the indexing algorithm. This seems like something very specific that could be checked out on your end.

    Thanks.

    Thanks.

  227. Is your site losing pages from the index, John? I just did a link: check on my site (the one I mentioned earlier), and it's the same. It will only list 16 of about 629 links. For my site, I'd put it down to the constant changes in the index, as pages are being dropped wholesale on a daily basis. I'm thinking that the index might be a bit confused concerning my site right now.

  228. PhilC,

    I know Matt doesn't want this to turn into a discussion board, so feel free to delete this post, but to answer your question: yes, we were at a high of 17,000 pages in March; two weeks ago, 500; Saturday, 140; today, 39. I'm not going to check anymore after today, because I know what's next: NO INFORMATION FOR THAT SITE.

  229. Some thoughts:

    1) The explanation that Google isn't fully indexing sites, based on the lack of quality/quantity of incoming links and the lack of quality of outgoing links, sounds like a policy change. Has Google always lacked a commitment to building the most comprehensive index it can, and this is just the first time those sentiments have been voiced? Or is this something new, perhaps in response to a storage crisis and Google's inability to keep up with the growth of the web?

    2) How finely tuned is this improved link quality filtering? Does it simply look at the percentage of IBLs that are reciprocal and apply a filter after a certain threshold, or does it attempt to determine the relevancy of those reciprocals before placing a value on them? When evaluating the relevancy of both inbound and outbound links, is this just a quick semantic analysis that would miss the fact that the ironworkers' endorsement of a pizza joint is certainly a good link? How much good content is Google willing to hurt while trying to prevent having its results manipulated?

    3) What I see here is good times ahead for link-building SEOs. Panicky phone call from the owner of a marketing site seeing its thousands of pages dropping from the index… Calm explanation of Google's new approach (show them your blog here, Matt)… Tell the client what will be involved in building a link network on multiple domains across Class C's with relevant content that Google will perceive as quality IBLs… You could always just spend that money on AdWords and trust that Google won't bill you for click fraud (show them the lawsuit pages)…

    I hope Google has something better than this coming along quickly, because the future doesn't look pretty.

  230. Hi Matt C – reworded, jeesh

    Many sites have an odd issue; what is the Bigdaddy cause here?

    Sites have done well for a very long time; they have some good natural incoming links from major sources like science magazines, and even links from large established online portals.

    Since April 26 or so, almost all our pages went from page 1 to page 4 across the board. Is this a penalty? Why such a fast drop on all positions on all terms? This is happening to many sites.

    Not sure what to make of this; we did submit a re-inclusion request, but I have seen no change. Why such a huge dump on a site?

    Thank you Matt

  231. Matt, I get the feeling that you are being very harsh with the new filters… I just dropped from 3M results to below 300k with my site. That does not seem right, since I am one of the few places where you can download MP3 files that you can buy on CD elsewhere. I think G found the duplicate descriptions and is trying to filter them now. That is not good. Mine are downloads, with the same artist descriptions as the tangible goods… And I am gaining incoming links like a maniac, by signing up over 10 digital merchants a day… I get incoming links from every corner of the net, but I do hope that incoming links can't hurt you? Maybe the 500+ scrapers who live off my RSS feeds are causing it?

  232. Dave (Original)

    PhilC and others

    Have you ever considered that, with soooooo many web pages out there, Google (at this time at least) HAS to limit its crawling and indexing?

    Let's face it, they are (and have been for years) doing a better job than the other BIG 2.

    When I search the SERPs to BUY, I often see mom & pop pages ABOVE those of the BIG merchants.

  233. Well, it just went critical for me after 10 years on the Internet.
    Google has eliminated so much of our site, rankings, and traffic that we likely won't survive. What's going on is just way too harsh. We play as much by the rules as we can. We only have about 75 links, but apparently Google is annihilating our small site. I wish I could take a vacation, but I'll have to worry about putting food on the table.

  234. Phil, I was responding to you thinking you meant the site you were watching.

    Okay… the health care directory?

    Without looking, I'd say it's more about the sites that directory is listing than anything else. Does it require a link back? Is it all paid? Does it exist strictly for AdSense? Does it have a real purpose for users on the internet? Being in the market it is in, I'd have to have those questions answered and see the site. I'll bet big bucks it's because the quality of the sites listed isn't good. A major search engine has to start drawing the line somewhere. No one could continue to simply index page after page of low-quality websites, especially directories.

  235. You say that like it's a bad thing :)

  236. Dave (Original)

    How many people ACTUALLY search for a directory anyway? Answer: not many. They are so low in demand that Google shifted its DMOZ clone off its main page years ago. Check your log stats; even DMOZ sends next to nothing.

    Besides, why on earth would/should a SE list a directory page?? It would be a link to more links! There is no longer any need for this 2-step approach, as SEs are so much more advanced than when directories WERE popular.

  237. I didn't ask about my site, Doug. I asked you if you could come up with a good reason why the health care directory site that Matt used as an example shouldn't have all of its pages indexed. Perhaps you should have read *all* of Matt's post. There isn't a good reason. Matt's best judgement is that it's a shortage of IBLs. Simple as that. The site had had its pages indexed, but with the new BD crawling/indexing its pages have been dropped simply because it doesn't have enough IBLs. It makes sort of sense at all.

    Actually… that's not what was said at all. That's what you chose to read. The key sentence is actually this one:

    Hold on, digging deeper. Aha, the owner said that they wanted to kill the www version of their pages, so they used the url removal tool on their own site.

    It was dropped because the site owner made a mistake. Not a spammy mistake, and certainly an honest one, but still a mistake.

    That's not BigDaddy.
    That's not Google crawling/not crawling/indexing/not indexing.
    That's not Matt pretending to be God and striking down upon some site that apparently doesn't deserve it.

    That's a webmaster relaying a message, unintentional as it was, to Google asking for a removal.

    So there's a perfectly good reason for Google to remove it… they were asked to do so.

  238. Matt, thanks for this update. I have to say, this confirms what I've been increasingly suspecting about the vast majority of those WebmasterWorld posters who have been complaining about these specific issues, and it fits exactly with what I saw over the year or so I spent doing site checks on another search forum.

    Especially amusing was the guy who had 10k pages indexed and then dropped to 80. I've read him; he comes off as if he's lily white, and there's that typical spam garbage.

    This isn't your problem, but I think WMW's policy of not allowing any reference to the site in question is starting to seriously damage the viability of their search forums, especially their Google forums. As you found, and as I've suspected, quick checks showed the weaknesses easily. That's exactly what I found over a year of doing site checks too; that's why I stopped, it got boring and predictable.

    But I'm still very glad to read the updates on this. I've been following that supplemental nonsense for a while; I pretty much ignored the Bigdaddy indexing stuff because it was pretty clear what was happening even without being able to look at the sites in question.

    I don't envy you your job at all, having to dig through this stuff all the time.

    Too many comments, didn't read them; no need, your post was pretty concise.

  239. Here is a question about quality "earned" links versus recip or poor-quality links. I am a web developer, and I create amazing websites that are linked to from all across the web because people are talking about the design, or the functionality, or for other legitimate reasons.

    Now, I have a credit-for-work link as an image (my company's "designed by" logo) on each site. Thus, in essence, each of these sites is a backlink for me. By itself, is this a good link?

    Now here is the other thing: I also like to place my designs in my online portfolio for prospects to view (I'm showing off my work), and this usually includes a "visit this website" type link so they can see what the site looks like in real time and what the client is doing with it. Have I not, in essence, created a reciprocal relationship here? Will these links be discounted in some way, or were they poor-quality links to begin with?

    This relationship seems very natural, whether link popularity or PageRank existed or not: companies would put their credit-for-work logo on a site in hopes that others who appreciate the design would see the designer's insignia and hopefully hire that company. And of course we artists always want to show off our work.

    So what's the deal? I know all recips aren't bad, but where is the line?

  240. Addition to the above, and I failed to mention it: I am not complaining.

    I rank #4 out of 153,000,000 in Google for my most important targeted term, which is quite competitive; like I said, I'm not complaining. However, after reading this, it almost made me want to remove the links in my portfolio to my designs, or put a nofollow on them or something. And what about my discussion forum? I run a vBulletin forum on my site that has thousands of members, all of whom are also web designers or clients. They are there to post and chat and learn about web design, show off their latest projects, etc. Now, they have signatures so people can view their work, and they are HTML links. Thousands of them also link back to my site with varied anchor text, usually along the lines of "proud member of", that sort of thing. How does this play out with reciprocal links in this situation?

    I am not going to change my portfolio, of course, because the way I have my portfolio set up makes sense to me, and it is not about link pop for my clients; it's about showing off my work. But my forum is another matter. We try to keep it as clean as possible and have great mods that kill spam right away, so I am still confident in the quality of my members' signatures. But should I let posts like this scare me? Is there a benefit to my users from having the links in their sigs, or is it a detriment to my site?

    I mean, no matter what, the members would still post if I took sigs away; they are there for the education and a sense of community. But what bugs me is that I feel like I have to do something special, in fear of losing my Google rankings, that "if search engines didn't exist" I wouldn't do. Signatures come stock in vBulletin, and members like to get creative with them and use them; they have fun with it. Do I need to alter this natural element to appease the great G gods?

  241. Before you try to dig through all of the comments, note that I've just written an executive summary of Matt's comments on reciprocal links at http://www.ahfx.net/weblog/83

  242. Doug Heil >>> No one could continue to simply index page after page of low-quality websites, especially directories.

    I don't think that's a good idea. Shouldn't it be more like: index, but don't rank, if something is low quality? After all, a SE can go wrong in what it thinks is low quality, but I as a user would prefer to occasionally go through the first 100 pages in search of that one thing I am looking for. I would prefer that it's in there somewhere, rather than ignored altogether. Even the directories: why punish when you can't be 100% sure?

    Search engines were initially meant to index everything they could, but rank as they judge. Unless of course you have other issues, like overload and unmanageable data…

  243. h2, I completely understand the policy on WMW. You can't go into specifics without it quickly becoming unmanageable. The other thing is that those were the five cases that I dug into. But Adam found several domains that we're digging into more, for example, and I asked someone to dig into your domain, John. But John, bear in mind that we only show a subsample of the links that we know about.

    Dossy Shiobara, I see that on my blog sometimes too. It's natural, because if we see the same article/text in two places, we pick the more reputable page (which is the root page of your blog, in this case). I wouldn't use a noindex tag; you might consider putting fewer articles on your root page though. That would more quickly put the right text onto the individual pages.

    Jack Mitchell, you said "Saying you can't do reciprocal linking is just sheer idiocy. How does Google expect you to get backlinks?" I'm not saying not to do reciprocal links. I only said that in the cases that I checked out, some of the sites were probably being crawled less because those reciprocal links weren't counting as much. As far as how to get backlinks, things like offering tools (robots.txt checkers), information (newsletters, blogs), services, or interesting hooks (e.g. seobuzzbox doing interviews) can really jumpstart links. Building up a reputation with a community helps (doing forums on your own site or participating in other forums can help). As far as hooks, I'd study things like digg, slashdot, reddit, techmeme, tailrank to get an idea of what captures people's attention. For example, contests and controversy attract links, but can be overused. That would be my quick take.

    And now my cat insists that I spend some quality time with her before going to bed.

  244. Dave (Original)

    Matt, you have the patience of a saint.

    It matters not what you write; many here will only put on their selective reading glasses anyway.

    The funny thing is, most of what you write is just plain old common sense.

  245. Matt, it sure looks like you've got your hands full here with all these posts. I tried to read all of them, but there were just too many. It's funny how people (or should I call them concerned searchers) view Google's efforts to provide quality results in the SERPs.

    Even though I have my own battles as an SEO and SEM marketer, I have to abide by the rules and make sure my client sites are ready, not only for Google, but for other SE spiders as well.

    In my mind, Google is doing its utmost to keep on providing accurate results based on the search terms. Why would you destroy the very "kingdom" you yourselves have built? Surely you'll want to maintain your position as the #1 search engine worldwide?

    Anyway, great article Matt. It does indeed explain a lot. Thanks.

  246. Matt,

    Say that I did recip links in the past and now I decide to remove all the links. How long does it take for Google to notice the change and adjust my ranking (crawling priority) accordingly?

    Steve

  247. Hi,

    A question regarding cache: most of the cached pages from my site date from last February. Since then, I have changed URLs. The old ones are redirected to the new ones, but as the Google bots don't like my site anymore, I have lost 6000 indexed pages, and no new pages have been indexed since the Bigdaddy update. So I have 2 questions:
    – why are my cached pages so old (most of them were up to date the day before the Bigdaddy update)?
    – how do I make Google love my site again? 😉
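    (In case it matters: the redirects I mean are ordinary permanent ones, for example the usual Apache one-liner, with made-up paths here just to illustrate:

        Redirect 301 /old-page.html http://www.example.com/new-page.html

    so the bots should be seeing proper 301s.)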

  248. Phew, glad you cleared that up about reciprocals; otherwise I'd be deader in the water.

    I deleted an old http sitemap and added my www sitemap, as nothing was getting indexed. Still nothing is getting indexed. Did I drown myself by deleting the old map?
    Do I need to submit a resubmittal form?

    My site only has two links in Google, and they're both from the same place! I know I need more, but is it the links or the sitemap keeping it from getting indexed?

  249. Dave (Original). Coincidentally, I posted this in my forum just a few minutes before I read your question.

    The more I think about it, the dumping of pages doesn't make any sense, and I'm wondering if they really are short of space in spite of what Matt said. He said that they have enough machines to do it all, but to do what: run a pruned index, or expand the index?

    I can imagine a meeting where they discussed whether or not to keep on adding machines and capacity, or to be more selective about what they have in the index.

    Doug Heil. The point is that Matt's assessment of the health care directory site (and he examined it) is that it needs some more IBLs for Google to crawl and index more of its pages. It isn't just any directory site that we can generalise and guess about; it's one that Matt examined, and that was his assessment.

    You said that "no one could continue to simply index page after page of low-quality websites, especially directories." I don't disagree with that, but it would depend on the definition of "low quality". Matt said that the health care directory looks like "a fine site". It doesn't sound low quality to me.

    Dave (Original). I'm sure you are mistaken about the usefulness of directories. Niche directories can be very useful, and some people really do use directories, so for some people, they are useful. Either way, they are not low-quality sites by definition.

  250. I don't know what's going on now, but if I do site:www_mydomain_com

    it says Results 1-2 of 68; it was about 320. Even though it says 68, it's only showing 2 pages: the homepage index and an attached forum index.

    Now that isn't usual, is it?

  251. One of those is Supplemental too

  252. Caios

    This:
    link:www.shacktools.com 🙁

    Maybe because of this:

    link:www.shacktools.com 🙁

  253. Is Matt Cuttsa Matt Cutts – or someone pretending to be Matt?

  254. That's true, but why index over 300 pages and then remove them? The Marketleap link checker shows about 300 links through MSN. Also, site: says 68 pages but only shows 2?

    I know I need to build links to get my site indexed, but I personally would rather not spend all my time doing that. And with fewer sites out there being indexed, even if I did have more links out there, they would surely be less likely to be seen by Google.

  255. Spam Reporter

    Matt:

    Over the past few months, I've submitted many spam reports (basically whenever you ask for them here) on a competitor's site that is using hidden text, yet that site is still in the index. This site has been doing this for at least the past three years (that's how long I've been in competition).

    When Google finds spam on an internal page, is just that page removed from the index, or is the entire site penalized?

    It's just frustrating. I keep submitting the reports, and yet the site still remains. Feel free to e-mail me at the address provided and I will supply more info if you're interested in details.

  256. When I go to a library to do research, I do NOT care how many people have read or checked out the book that I am looking for. I only care that the book is relevant to my research.

    When I am doing research on the Internet, I do NOT care how many people have linked to that site. I only care that the site contains the material that I am researching.

    TOO MUCH emphasis has been placed on links coming to a site. TOO MANY sites with excellent and exclusive content are being left out in the cold because they have no incoming links.

    The Internet is about INFORMATION. By putting all their trust in the incoming links, Google has made the Internet all about POPULARITY.

    That is NOT what a search engine should be concerned about.

  257. This is my first post on your blog, Matt, and thanks for giving us an opportunity to spell out our views.
    I'm talking about affiliate sites.
    Individuals who do not have the capacity to go for something big have no option but to continue affiliate marketing, for the simple reason of earning a few bucks.
    Mostly optimised for less common keywords, these sites provide products/services for a small group of people.
    Google policy suggests webmasters ask "whether you would do that if there were no search engines". Needless to say, it makes no sense for a T-shirt affiliate site to give the history, origin, and other such unnecessary details about T-shirts just to satisfy search engine bots.
    It is really difficult to change the descriptions of products like T-shirts and sundry other items, though that can be done using scripts to change words like "this is" to "we have" and so try to befool the crawlers.
    Probably it is possible to add valuable content for sites that offer domain name registration / web hosting services through affiliate links.
    And who wants to drive hard-earned traffic to a different site, with the fear that the traffic may never come back, while being well aware of the fact that had it been a direct sale, the publisher could have earned much more? It is nothing but compulsion.
    As for the value that such sites add to the internet, an analogy may explain it: why do we visit a small street-side shop when everything is available in a big shopping mart that undoubtedly provides the best comfort? This is very basic human nature, and really difficult to explain.
    And finally, it is rather easy to befool a crawler(!) but not a bona fide customer. Just because there are some links on a website, a customer is never going to buy anything from that site unless and until he gets something meaningful.
    So this is my small request: let the visitors decide what they want to do, whether to buy from the principal site or go via an affiliate site.
    Regarding this issue, your earlier stance seems to have made more sense, where affiliate sites would come up in the SERPs for rather uncommon keyphrases and principal sites would enjoy the traffic for more commonly used keyphrases.
    Thanks again.

  258. Is Matt Cuttsa Matt Cutts – or someone pretending to be Matt?

    I saw that too. I think he went Italian.

    Heyyyyyyyyyyyyy CUTTSAMATTAYEW, huh?

  259. Hi Matt,

    Great post and great information!

    Ok, some say there is no such thing as a "sandbox", but there is a holding cell. I have been working on a site for nearly a year and a half now, and I still can't get the site to rank, even for the most unused/stupid keyword. I do see tons of supplemental pages in the results, and your explanation seems to fit the bill. But I still don't understand why it won't come out of the holding cell. Was there something new in the update for new domains that keeps 'em in the cell longer, other than getting quality links and everything else I know?

    You can email me for more details if you like.

    Thanks,

    Beth

  260. Hi Stephen

    "Is Matt Cuttsa Matt Cutts – or someone pretending to be Matt?"

    I'm sure it was Matt. It's his style, 100%.

  261. Yes, it looks like his style – but also not his style, in a way.

    E.g., it looks like it has been made to fit his style, but some things don't add up (he has two cats, for a start 😉)

  262. Caios

    Let me put it like this:

    Either you play the "Backlinks Game"… or you don't play at all 😀

  263. Stephen

    "E.g., it looks like it has been made to fit his style, but some things don't add up (he has two cats, for a start 😉)"

    I know. But it seems that it was Emmy who insisted that Matt spend some quality time with her.

    While the J.D. guy might have had more important things to do than to waste his time on Matt 😀

    You know women always need more attention than men 😉

  264. Harith

    You might be right; J.D. is probably still just running around the house chasing laser pointers, his tail, etc.

    Matt

    Any chance of an update on the PR situation?

    As has been noticed at WMW and other places, the last PR update (early April) only seemed to affect some sites, and no ranking changes were noticed as a result of it. (OK, this might be hard/impossible to notice anyway, but from the outside it just looked like the last PR update was purely cosmetic.)

    Other pages kept the old PR, which was probably last updated around February…

  265. Unfortunately, it still sounds like sites that wouldn't naturally collect large numbers of links aren't ever going to be spidered/indexed completely.

    I don't care about ranking at this point; if I can get the pages into the index in the first place, I can *make* them rank. I just can't get them back in.

  266. A software site has a "these people use The Widget" page linking to their customers' sites.

    We're thinking of doing that, and probably will. Our site will probably drop in PR, and maybe even disappear from the Google listings altogether, because most of those links will be to sites not relevant to ours, since those who are interested in our product would not also naturally be interested in the products of the sites we link to. We get sales through other advertising and word of mouth, so SERPs really don't matter as much to us as they might to others. Our concern is for our customers.

    Question: If we are penalized, will that also penalize our customers?

  267. Alex Duffield

    [quote]
    Matt Cutts Said,
    May 17, 2006 @ 1:49 pm

    Alex Duffield, in my experience those links aren't making much/any difference with Google….
    [/quote]

    Matt, I am sure you know better than I do, but the fact remains that the site I pointed out comes up number 1 for many searches (rafting BC) and in the top 5 for just (river rafting).

    I manage the site for one of their competitors, and have kept an eye on these guys for many years. Before they started participating in this sort of link scheme, they did not receive this sort of ranking.

    Their site does not include nearly as much user-valuable content as any of the others in the top 5.

    My main concern here is that my clients think they should (need to) also participate in this sort of linking scheme in order to compete. I insist that good content and a well-designed site, combined with regular updates and good (honest) linking, is the better approach. I have pointed out that Google's guidelines clearly state that "linking schemes designed to improve PR" are against the rules, and I tell them that in the long run they will get burned, but I fear I am slowly losing the battle against the fact that it does work.

    All I am looking for is some ammunition to convince my clients against this course of action.

  268. Netmeg, with a major change in crawling/indexing, I expected to see some people say "I'm not crawled as much." Somehow the people who are crawled more never write in to mention it. 😉 But we take the feedback and read through it. I've been talking to someone in crawl/index about the future of the crawl, for example. We keep looking for ways to make the crawl better.

    Stephen, I haven't asked around about PR lately. Yes, the one cat is much younger. He can keep himself busy with a piece of string for an hour. It's the other cat that often demands attention. 🙂

    Spam Reporter, a lot of the time we'll give a relatively short penalty (e.g. 30 days) for the first instance of hidden text. You might submit again, because sometimes we'll decide to take stronger action.

  269. Hi Matt

    Thanks. Sometimes I wish I was a cat – less stress.

    I have sent another email to the Boston address, and a follow-up, as it looks like someone had a look last night but there was no reply. OK, perhaps I should be more patient.

    I don't know if you are looking into the site deeper, just ignoring my site, or what? It seems to have regained PR at the last change, but it's still suffering from a penalty; I have given more details of perhaps why in the email.

    Cheers

    Stephen

  270. Hi Matt,

    A gold mine of stuff, great.

    However, one question. I understand the principle of relevant OBLs and improving the visitor's experience, but here is a quote:

    "Moving right along, here's one from May 4th. It's another real estate site. The owner says that they used to have 10K pages indexed and now they have 80. I checked out the site. Aha:

    This time, I'm seeing links to mortgages sites,"

    I can't see how linking to a mortgage site from a property site would not be deemed a relevant link, or would not improve the visitor's experience.
    I would not be able to buy a property without a mortgage, and my guess is that this applies to most people.

    Is there no slack for cross-subject linking?

    I have a "Breakdown Recovery" site. I have a lot of information regarding cars, motoring, and driving holidays. It is not directly related to your car breaking down, but it is related?

    Thanks

    Mark

  271. Being someone who consults for many companies, I see a need for Google to keep everyone from spinning their wheels and wasting time. I speak for ALL website owners, and even the folks at Google dealing with all the questions.

    I had to turn down a Google paper-publications advertisement, as I was not sure whether doing more advertising was good or bad; I feel like anything I do could cause a penalty. So in the end I turned off my Google AdWords account 100% and will not use the Google publications for advertising anymore; it is all going to the other large players. (Why would anyone at Google want this?) I just cannot figure things out with Google. (Investors must love this part.) Thus Google does not make an extra few K now, at least. My other clients: all off! Now we are at about -15k a month for Google. (And this was my professional answer: do not take chances.)

    Please (if possible) find a way to let folks know when there is a penalty. It makes such sense; everyone would save time, and all would win.

    The way it stands, it appears search engines do not want to tread in these waters. But be professional and let people know: get a great lawyer, write a disclaimer, and save everyone hours of time.

    Is it a dream?

  272. I built that t-shirt site Matt said wasn't interesting to my visitors. Well, my bookmarking rate is 15-20% monthly, so the users DO find it interesting. I just put together stuff I liked, and users didn't have to go around looking for this stuff for days; it's just a fashion magazine. I was hoping that Google would be in the business of "indexing", not editorialising… The affiliate links are nobody's business but mine; they're legal. Some of the content has been provided by business partners and syndicated on the site; this is also legal.

    Matt, I sent the specifics of my website traffic to the original email address if you need proof of what IS and ISN'T interesting to the people who search for this info.

  273. Matt,

    There appear to be two datacenters that show a completely different set of results from the others. I believe these datacenters are the original BD DCs, but I'm not sure. Would you expect these to spread? My real question is: at what point in the timeline would you expect some stability and consistency across all datacenters?

    Thanks Matt.

    Chris

  274. MATT, can a site get fully indexed in time without having to get links? I know you can get indexed faster with them, but I want to know: if a site never gets links (though it does have a page indexed), will it ever get indexed, or will it never see its pages (the entire site) in Google until it does get links?

  275. Hey Matt, I didn't hear back from you regarding the adult listings mentioned above.

    I think this topic needs some investigation. I see a continuing trend of freshly expired NON-adult domains getting insane rank for adult SERPs, while older established adult sites are pushed further and further down the list.

    The talk amongst the "adult" SEO community is that the only way to get good Google rank for adult these days is by buying or getting your links on NON-adult mainstream pages.

    This practice makes everyone look bad, and by continuing to operate the way it does, Google is really harming the non-adult community.

    The top 100 listings for most adult terms are filled with SCHOOL and EDUCATIONAL domains that recently expired, banking on the fact that many other schools will still have links up to the expired domain.

    So now all we have done is make the SERPs irrelevant, and shown a lot of porn to kids and unsuspecting people, all for some Google rank.

    If that wasn't bad enough, the rest of the results are guestbook spam of adult links on mainstream sites. If Google didn't reward these spammers, they wouldn't attack innocent sites with automated software just to add their adult links.

    So by continuing to allow these sorts of methods, Google is creating a problem where one didn't exist.

  276. "Netmeg, with a major change in crawling/indexing, I expected to see some people say 'I'm not crawled as much.' Somehow the people who are crawled more never write in to mention it."

    Ironically enough, two or three years ago we had to contact Google to throttle back the crawling on one of the sites that so concerns me, because it was being hit way too hard at the time. Oh, for days gone by…

  277. Matt,

    You seem to have referenced the fact that sites might be penalized without being banned, and I know the question has come up a couple of times, but I've never seen a clear-cut answer on this. Is there such a thing as "penalizing" (a drop in position, but still listed), and if so, is doing some of the stuff you discuss here, such as reciprocal linking, a possible cause? I'm not talking about linking to spammy sites, as you have been clear on that, but what about recips in and of themselves?

    For instance, I was told recently that I should submit one of my sites for a particular award. I'm pretty sure that receiving that award means being listed on the list of award winners with a link to my site. Are you saying that if they link to me, great, but if the award logo hotlinks back to them (and thus becomes reciprocal), not only would the link from them become worthless (well, aside from the ego boost I know I'm going to get if I win 🙂), but it might actually hurt me?

    I somehow doubt that's what you're saying, but it certainly isn't a clear-cut issue.

    Thanks.

    -Michael

  278. Hey Matt, how about mentioning something to the Sitemaps folks about adding a feature to get rid of dead/404 URLs from a site? Google seems to take forever to rid itself of 404 pages; it could be a great asset for Google and webmasters if there were a functional way to remove URLs via the sitemap system. Kind of a dumptheseurls.xml anti-sitemap deal.
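    Something like this, maybe (a completely made-up format, just to sketch the idea, since no such file exists today):

        <?xml version="1.0" encoding="UTF-8"?>
        <!-- hypothetical "anti-sitemap" listing URLs to drop from the index -->
        <urlremovalset>
          <url>
            <loc>http://www.example.com/deleted-page.html</loc>
          </url>
        </urlremovalset>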

    Cheers,

    John

  279. It seems Google has forgotten a basic fact: webmasters are the net, and Google is a bridge between users and webmasters. Users opted for Google because it gave them the most in-depth results and the most choice when searching for a certain term; after finding a site through Google, users decided for themselves which site to use or bookmark. Now it seems Google is trying to choose for them what they should and shouldn't see. As somebody mentioned, this is editing content, not indexing content.

    I have a question. Let's assume someone had a website with a lot of original and unique information, yet at the same time they were involved in heavy and excessive link exchange to generate traffic from other sites (like the old days of the net: exchange links for traffic). Will you curtail that site's valuable content from millions of users because a dumb crawler saw lots of links?

    The sad fact is that thousands of webmasters have lost thousands of pages, and millions of people have lost tons of information, because a bunch of spammers decided to manipulate Google. While manipulating search results could be, and is, a serious problem for Google, the way Bigdaddy has been designed to solve it is not proper.

    Google's mission was supposed to be to organize the world's information; however, I believe the mission has evolved into "editing the world's information according to a blind algorithm, because we got blinded by spam!"

  280. Hi Matt,

    With over 200 comments, it is time-consuming to read through each one, so I apologize in advance if this question has been asked.

    With regard to web design companies, it is standard practice to insert "Designed by Company Name" etc. along the footer of our clients' pages. No surprise there.

    Now, usually these links appear site-wide. What impact do you find these will have with the recent updates? Is there a better process that Google prefers for our clients to credit our work?

  281. Michael VanDeMar, yes, a site can be penalized without being outright banned. Typically the reason for that would be algorithmic. I wouldn't worry about being listed on a page of sites winning prizes though, unless it's the Golden Viagra Mesothelioma Web Awards. 🙂

    Netmeg, I'd like to see us provide some ways to throttle crawling up or down, or at least give preference hints.

    Adultwatcher, don't take a non-reply as my not reading it. I did pass all of those on to ask how some new things do on stuff like that. There's a part of our pipeline that I'd like to shorten, for example.

    Relevancy, I wouldn't count on getting a large site fully indexed without any links at all. We do look at things like our /addurl.html form and use that, so it's possible that a smaller site could do it without links.

    dude, I didn't mean to cast stones at that site. Someone who gets to the site can certainly buy a T-shirt from different brands. But at least some of your links are from stuff like an "RSS Link Exchange", and those links just aren't helping you as much.

    Bruce, that's your call, of course. Advertising with Google wouldn't affect your site either way though (help or hurt). I think we've talked about your site; 1-2 of the pages on your site, plus the "sponsored by" link, plus the "Search engine optimization" message on that page, would be where I'd start.

    Stephen, Adam is going through all the emails. He's writing back to the ones that he can, but he can't write back to every single one; I need him to do other stuff too (e.g. keep an eye out for other feedback across the web, learning more spam detective skills, etc.).

  282. I don't think I have ever seen so many comments on one Cutts blog post. This will make comment #263 (sorry, Matt, if this comment violated your guidelines on comments), but just wondering: did this thread break a record? What's the highest number of comments a Cutts blog post has seen?

  283. Matt, can you explain why sections of our website are now showing supplemental results for almost every single one of our home listing detail pages? Each of these listings is unique and required to be on the site if we want to make our visitors happy.

    The system houses about 25k listings, all with unique information. I guess I don't understand why these pages should be placed into the supplemental results. Site: (mygorealty.net)

    If there is a problem on our side, we want to correct it. If it is a problem with Google, it might be nice to know what that problem is, so your crawl/index team can correct it.

  284. Matt:

    I've read your post with interest.

    About four or five days ago, I noticed that Google had dropped all but four of my pages on my new site. Now, four or five days later, it has dropped them all besides the index page.

    Shocking and unexpected and, for me, inexplicable.

    My site is almost 100 percent original content, and even though I do feature an occasional affiliate link, it is certainly more content-oriented than affiliate-oriented.

    So… after reading your comment policy, I'm not sure how to phrase my question so that it can be of general interest, but here goes, and I hope it flies…

    If a site has almost 100 percent original content, will a few affiliate links cause Google to stop indexing it?

    Thanks, Neva

  285. Matt, link exchanges with relevant sites are, for the small guys, one of the only ways to get traffic. Please just discount the exchanges rather than penalizing for them; simply don't factor them into your results. If a site can't get to the top of the results, exchanging links is one of the few means of getting some visitors. Automating the procedure kinda makes sense. Cheers.

    PS: I approved every link on the exchange with the goal of getting not random visitors but targeted visitors. Granted, I didn't get too many as a result of that one, so I am not doing it anymore.

  286. Eternal Optimist

    Matt, firstly, thanks for your further comments on supplementals 🙂

    There has been a considerable amount of 'statement of fact' in the forums, although it is probably rather more speculative. Does Google take into account things such as the age of a site, the time visitors spend on pages, the percentage of visitors who add it to their favourites, the number of years a domain is registered for with the domain provider, etc., or are these all academic to ranking and indexing?

    By the way, you mentioned that supplementals may be caused/affected by higher-ranking pages being indexed by priority, but I notice that there are some very high-ranking websites with many supplementals 🙂

    Thanks 🙂

  287. Matt,

    Thanks. No, it's not a Grande Cialis Hair Loss Award, but it isn't strictly related either. Loosely they tie in, but it might take a second to see the relationship.

    As for the penalties that you mentioned… would an email from Google indicating that you were not penalized or banned cover those? Or might you be anyway? And would that class of penalty be something that might get mentioned in the Sitemaps Penalty Awareness program…?

    Also, I'd like to know the answer to Joel's question too… is this a record comment count for the blog?

    Thanks. 🙂

    -Michael

  288. Matt,

    Sorry for double posting; I forgot this one. This is a long post to read with the comments, and I think that if you start from the top without reloading and then go to comment, the security code might time out. It didn't happen this time, but it has in the past. You should really make it a forward-only process on missed codes, retaining what has been typed on the next page, to keep people from having to retype comments. 🙂

    -Michael

  289. So is there a way to know if we're linking to what Google "thinks of as a bad neighborhood"? Also, I am interested in what someone said about sites such as coupon sites. Obviously these sites don't have original content. Will linking to other coupon sites still help you? Also, on a mall site, how can any link not be related?

  290. Matt,
    I quote:
    Yup, exactly, arubicus. There's SEO and there's QUALITY, and there's also finding the hook or angle that captivates a visitor and gets word-of-mouth or return visits. First I'd work on QUALITY. Then there's factual SEO. Things like: are all of my pages reachable with a text browser from a root page without going through exotic stuff. Or having a site map on your site. After your site is crawlable, then I'd work on the HOOK that makes your site interesting/useful.

    I find some real advice here: word-of-mouth = be popular and get links from related sites. Factual SEO = get the tech right (and clean). Work on the HOOK = try to be interesting in your own way. (We don't care about this.)

    One thing bothers me between the lines: "return visits". Please tell me that you are not tracking and using return visits as part of your algorithms.

    /chris

    "Now, usually these links appear site-wide. What impact do you find these will have with the recent updates? Is there a better process that Google prefers for our clients to credit our work?"

    I may be about the only person in the world who feels the way I do about this issue (and those who know me have heard me say this before), but I'm still gonna say it.

    Unless the work was non-commissioned (which is highly unlikely), putting a hyperlink on a client's website is tacky and unprofessional, and deserves no real credit. It's like watching a Ford commercial and seeing the logo of the ad agency that designed it in the lower right-hand corner.

    Personally, I'd like to see no credit whatsoever given to these links. They do no benefit to the customer and go against the whole organic link concept. If there were ever an "unnatural link", that would be it.

  292. Google has been my preferred search engine for many years; however, in the last year or so the results seem to be getting worse and more irrelevant, while the other big search engines' results are improving. I fear that because Google has become the number one search engine, it has made itself the number one target for web authors seeking financial gain through PPC, very much like Microsoft Windows became the number one target for hackers.

    It's quite frustrating that unique, specific content isn't enough to get ranked on Google; it seems that you have to get links regardless of their quality or relevance.

    Personally, I don't do reciprocal links. If a site wants to link to me, then great; if I think a site will be of interest to my visitors, then I will provide a (nofollow) link.

    I hope that Google will sort this Internet search mess out, the sooner the better. My suggestion is to penalize directories, duplicated content, automated sites that have thousands of pages, etc.

  293. Very nice to see you take the time out to communicate stuff like this, Matt. Definitely worth the time to stop by and read… Thanks!

  294. Matt, isn't the point of your addurl and Sitemaps programs to help get sites indexed? If that is the case, what is the point of them if it takes links to get indexed?

  295. — So Google will see the site but not index it with addurl and Sitemaps? I know Sitemaps does more, but its original point was to help pages get indexed. Now it doesn't help with that.

  296. Google Sitemaps is a great idea and the perfect mechanism for Google to communicate with webmasters.

    It should have no effect on rankings, but merely act as a mechanism to inform Google of the structure of your website and any new pages that may need crawling.
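    For example, a bare-bones sitemap file is just an XML list of URLs, something along these lines (from memory, so check the Sitemaps documentation for the exact schema):

        <?xml version="1.0" encoding="UTF-8"?>
        <urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
          <url>
            <loc>http://www.example.com/</loc>
            <lastmod>2006-05-16</lastmod>
          </url>
        </urlset>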

  297. I understand that if there are 2 pages that talk about something, and one page has tons of links and the other doesn't, the first site should rank higher. But if there were a site that had tons of dedicated pages about that term and no links (maybe it's new, or a mom-and-pop), it should at least be indexed and judged on its relevance and merit.

  298. Supplemental Challenged

    Matt, you would do yourself and everyone else a good service by not allowing a lot of the above confusion about "reciprocal links" to go unanswered. Just say "there is nothing wrong with Wikipedia linking to DMOZ and DMOZ linking to Wikipedia."

    You can end the FUD once and for all, and put a lot of link brokers and "three-way" spammers out of business, just by saying it's not reciprocation that is the problem, but spam and deception and trying to pretend a site is more important than it is.

  299. Matt,
    I have another question for you. I will repost my first one here along with my second question, so they are consolidated.

    #1 There appears to be two datacenters that show a completely set of results from the others. I believe these datacenters are the original BD Dc’s, not sure. Would you expect these to spread? My real question is at what point in the timeline would you expect some stability and consistency across all datacenters?

    #2 I am really frustrated with the quality of serps in a few instance, where I am trying to do research. Today, I was doing a little medical research and was trying to find information about a schedule 2 narcotic. Specifically, I was searching – difference between oxycodone and hydrocodone – (without quotes). I got page after page after page of junk scraper/directory style sites with links to other sites like this one: getcreatis.com/oxycodone.html

    Many of these URLs in the Google index immediately redirect to the affiliate page. Total junk. I am not trying to sound overly critical, but I wanted to point this out to you. It is very difficult to conduct any type of scientific research, especially medical research, when these spammy, worthless affiliate sites with page after page of spammy links or AdSense are ranking so well. Many of the pages have zero PageRank, so I find it amazing they are ranking so well. Actually, I see a lot of PR0 pages ranking well these days.

    Thanks again for your help.
    Chris

  300. Google PageRank on average seems to be a good indicator of quality pages, but it seems to have little relevance to ranking on Google at present, though I can understand why Google is holding back.

    I'd suggest including it as a filter when searching Google, or at least as a filter on Google Toolbar search.

    My only criticism of Google PageRank is that it's very slow to update for new pages. (Why don't they link it to Google Sitemaps?)

  301. Hello Matt,

    Seems like the recurring theme is that recip links are now bad. This is hard to fathom. Isn't this the nature of the web?
    Especially in my sector, which is fishing charters, guides, trips, etc., where you have so many less-than-professional websites that will never rank that high.
    Now when they come to me asking for a link trade, I have to deny them for fear of suppressing my ranks?
    These link trades are the lifeblood of the Internet for charter Captains who barely eke out a living, and now I am going to tell them "Sorry Charlie," no links, because Google doesn't like it.
    OK, I have good links and don't actually need a link back. So since you are freely spouting out great info and insight, can you take it a step further and give us the heads-up on linking one-way to sites like I mention?
    I want to continue being "friendly" to the charter Captains and guides who are struggling to survive, so will sites that only link out to relevant resources have a "drain" or negative effect on their respective websites?
    I certainly understand that a fishing site linking to a credit card site is bad – that would be an obvious sign of link laziness or someone trying to manipulate the system – but a fishing information site trading a link with a fishing charter site should be considered what makes the web go round, no matter how many times this is done.
    Anyway, have a nice evening, and I wish I could have caught this post when you first wrote it. Thank You – Joe

  302. Hi Matt,
    For the sites you described above as having poor links: if the outbound links had rel=nofollow, would that improve the number of pages indexed?

    1) Also, I had a directory with 15,000 pages listed. I have been hit hard and now only have 600 pages listed (do you not like directories?). The site contains very few outbound links.

    2) Also, a directory would principally have links coming in from all the categories it lists, so which category could be taken as relevant for a directory?
    thanks

  303. Dave (Original)

    PhilC, I was actually thinking more along the lines of there not being enough hours in a day/week/month/year to index ALL the pages out there. It would appear that Google NEEDS to make a choice in many cases, and what Matt describes would fit.

    I'm not trying to say ALL directories are of no use, just that the number of people (in the scheme of things) they are useful to is low.

  304. Hi Matt,
    Here is an idea for Google to solve its reciprocal linking problem. Why not only give value to a certain number of defined reciprocal links? If Google promised us that only 100 reciprocal links from our site would count, it would seem to solve a large part of the problem. To make things simple, these could all be put on one page with a specified name. Of course a site could have as many outgoing one-way links as it wanted, and as many reciprocal ones, but only 100 would count in terms of the search engine. This would only apply to reciprocal links – all other links would be as they are now. A number of similar schemes could be thought of – what about 25 per year? We would be a lot more careful about reciprocal linking if we knew we only had a certain number, and that those links in a sense defined our site, as well as the one-way links the site was able to attract.

  305. YES!!
    [quote]I may be about the only person in the world who feels the way I feel about this issue (and those who know me have heard me say this before), but I’m still gonna say it.

    Unless the work was non-commissioned (which is highly unlikely), putting a hyperlink on a client's website is tacky and unprofessional, and deserves no real credit. It's like watching a Ford commercial and seeing the logo of the ad agency who designed it in the lower right-hand corner.

    Personally, I'd like to see no credit whatsoever given to these links. It does no benefit to the customer and goes against the whole organic link concept. If there were ever an "unnatural link", that would be it. [/quote]
    No Adam; you are "not" the only one who feels "exactly" like that. I find it "extremely" unprofessional for a design firm OR SEO firm OR both to stick their links in the footer of client websites. It's so bad. It's very amateurish, and it not only looks bad for the site the links are on, but looks bad for that designer/SEO as well.

    Not only all the above, but that particular client is "unknowingly" linking to an SEO or designer without full and clear knowledge of what linking can actually mean in the long run. The firm they link to could get caught spamming, or be deemed a 'bad neighborhood' firm, which would indirectly affect the poor client who is linking.

    We all hear all the time in this industry about SEO firms/designers practicing "full disclosure" with clients. What does that mean exactly? Does it mean that as long as the SEO asks the client if they can stick a link in the footer, then it's perfectly fine? This goes for any technique the SEO claims they do for clients and then tries to explain away as "full disclosure".

    What this industry does not get is that there is NO way the average Joe client understands all the ramifications of "anything" their site is doing, whether done by the SEO or by the client. It should be up to OUR industry to educate that client, and to blame ourselves for the bad SEOs/designers in this industry. We shouldn't be giving free passes to firms who show unprofessionalism day in and day out. But you know what?… we sure do hand those free passes out very freely.

    Getting back to the links in footers… can you imagine seeing a link in the footer on Google.com that says:

    “Designed by Church of Heil” LOL

    or

    A link on Sony that says:

    “Consulting by Doug”

    with a link to Doug’s website?

    Why do firms feel the need to jeopardize client websites in this way, and to advertise in such a cheeky and unprofessional way? I'll never know.

  306. Spam Reporter

    Matt:

    Just submitted another SPAM report with your name and my name (above) in the message box. Please have a looksee and take action!

    I find it "extremely" unprofessional for a design firm OR SEO firm OR both to stick their links in the footer of client websites. […]

    DUUUUUUUUUDE!

    Do you have any idea how long I have been waiting to see someone actually get this? Just to find one person who truly understands the ramifications of these links and the potential negative ramifications of such?

    This post is truly a thing of beauty. Other designers/developers/SEOs, read and take heed. SEO reasons aside, all the marketing stuff Doug mentioned here is reason enough not to do this.

    Hey Matt, would there be any possibility of a future blog post, or at least a comment, on this? It's something a large percentage of your readers would be interested in (including the three who posted about it). I wouldn't go postal on you or claim you're an asshole or anything like that if you didn't, but we (and I say we because there are at least two of us who asked) would love to hear your take. Thanks in advance… and if not, thanks for posting stuff like this that leads to the tangential thoughts that others have.

  308. It seems that quite a few people who are complaining about having lost indexed pages are saying it is due to reciprocal links. Well, I can say that after some research we finally found out why our site was not Googlebot-friendly. We fixed what we thought was our issue and went from 3,500 indexed pages to 80,000 indexed pages in short order. Well, now I am down to about 600 and have watched the number go down over the last few days. I was enjoying the traffic while it lasted. The kicker is that we do not have any reciprocal links at all. I have some one-way inbound links I have been working on obtaining, but no reciprocal links. So I wonder: if you do not have enough inbound links, does that hurt as well and cause you to lose indexed pages?

  309. In my eyes Google IS BROKEN – it's been ruined. Forget about SEO.

    It's unreliable; the results are far from the easy-to-search, find-literally-any-indexed-page Google that existed two years ago.

    I can no longer find anything that I seek on Google. If I do, it's 20 pages deep. Your algo and your focus on combating spammers have made quality results, and the basis of what Google started as, take a back seat.

    PLEASE, for the sake of a decent search engine: enough is enough with excluding results and deciphering whose backlinks are valid or not. There are many quality backlinks, in my eyes, that don't even get counted by you guys, or get very little credit – like a user recommending a solid site they found useful in a forum by providing a link to it. That, to me, is one of the best and most valuable ways to determine if a site is worthy.

    Regardless, I could go on for an hour.

    Google is broken. I cannot find anything I search for, and the hard work put into a few quality sites that have been around for years is being negated and de-indexed, page by page, day by day.

  310. Matt–

    As much as I often disagree with his analysis, Phil C. is essentially correct.

    Google is dividing the web into “haves” and “have nots.”

    It is no longer enough to build a decent, spam-free, original content site. Now you need to attract links from major players or your content is not worthy of the index.

    Shame.

    It seems to me that you guys are trying your best to stunt or at least ignore the natural growth of the web via your new selective indexing policy.

    Might it have something to do with a capacity problem? Let’s ask the boss: “We have a huge machine crisis – those machines are full”.

    Matt we all know how hard you work, but you’re beginning to sound a little bit like “Baghdad Bob.”

    Be well.

  311. I find it "extremely" unprofessional for a design firm OR SEO firm OR both to stick their links in the footer of client websites. It's so bad. It's very amateurish, and it not only looks bad for the site the links are on, but looks bad for that designer/SEO as well.

    Why stop at web design/SEO? Let's remove logos from all products – from cars, tins, clothing, computers, etc. After all, logos are unnecessary – why the need to advertise the company who made the product (just as web designers advertise that they made the page the person is reading)? What's the difference between developing a car and developing a website, in that respect? Why is it OK (in your mind) to have your logo on a car you created, but not a link on a website you created?

    I would respect your opinion if you actually stated WHY it is unprofessional. I think a discreet link at the bottom of a page is fine – it's actually doing the reader a service: they may LIKE the way the website is laid out/designed and want to know who made it. Sure, you could just put the raw text of the web design company's address at the bottom of the page, but it's hardly friendly to force users to copy and paste links into their browser rather than simply click.

  312. Dave (Original)

    Why do soooo many base the success/failure of Google on their site(s) position in the SERPs? (that’s rhetorical guys)

    I wouldn’t mind betting that Google has indexed more pages than ever since Big Daddy.

    “Google is broken”, “the SERPS are crap” yada yada yada all boils down to “My site isn’t ranking like I want”.

  313. Matt Cutts: re WMW and looking at sites: imagine talking about paintings without being able to look at them, by policy. Then you get art critics having big arguments about some piece of garbage, without even realizing that the painting is garbage. I think once you take an absolute position like this, year in and year out, it begins to erode the overall quality. At least that's what I'm seeing.

    The one positive of that other search forum I did was that I finally got to see the garbage sites people had been complaining about not ranking. At least 95, probably 99, out of 100 were total junk: spam, tricks, keyword stuffing, link spamming.

    It's a question of creativity, thinking outside the box. There are lots of ways to do it. Brett has often said that he wants his stuff to be reference quality, thus no specific examples that will be different and changed in the future. But that pretends that anyone is ever going to go in and read Google threads from a year or two ago, which, let's get real, is ridiculous; only the most hardcore of SEOs are going to sit reading old search threads. Only a tiny fragment of the world's population would ever think of doing that, and an even smaller fragment would actually do it.

    Anyway, it doesn't matter; the quality drop is what's readily apparent. Things move on, and blogs are getting more interesting than webmaster forums – at least blogs like this one. Brett's decision to not allow blog linking [with the exception of yours, I guess] is going to continue the quality drop, since more and more authoritative sources are writing in blogs. More and more I'm getting my primary information from developer-type blogs.

    It doesn't matter in the larger picture, but this particular thread/posting was really revealing to me in terms of how low the quality on the WMW Google forums is getting. Much of what you said was fairly obvious to anyone who'd followed the Jagger update – no real surprises, except to the spammers who continue to complain about getting caught by the algo.

    Re the footer links: I've been guilty of that, more out of ignorance and laziness than anything else. I started pulling them off sites I'd done a year or two ago, and I'm happy I did. I agree with the poster who said how amateur that is – it is. And it's a cheap trick.

    Personally, I'm tired of cheap tricks. I'm happy to just let the cards fall where they will: if search engines like my sites, fine; if they don't, fine; if people like them, fine; if they don't, that's fine too. Life is too short to worry about how stuff ranks every week or month.

  314. Ranking doesn’t bother me so much. It’s the fact that pages aren’t getting indexed that bothers me.

  315. Matt, nice to hear you guys got to cut loose a bit afterwards.

    As far as the linking goes, yeah, that's understandable. I guess what I meant was that if crawl depth is being reduced based on low-quality inbound links and spammy/off-topic outbounds, someone less informed could infer that they should just drop outbound links altogether to avoid some of the reduction. That obviously wouldn't be a good thing for users, so I guess I was just prying to see whether, in relation to crawl depth, G might now also be taking on-topic and quality analysis into consideration for purposes other than reducing it.

  316. There are sites out there with millions of automatically generated pages designed to manipulate Google's index.

    There is a well-known site which is showing 162 million pages; sites like these should have a penalty applied to all but their root-level pages.

    It's no wonder the results are in a mess.

  318. Replying to Joe's comments:

    Reciprocal links are bad because they are open to so much abuse.

    I'm in the same sector as you, charter boat fishing.

    The point is, there is no harm in your fishing info site having a reciprocal link with the fishing charter boat sites – just include nofollow in the link.

    The link is there for your visitors to follow, not to get either site a higher ranking.
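
    For instance (example.com is just a placeholder here), a nofollowed link is an ordinary anchor with one extra attribute – visitors can still click it, but it passes no ranking credit:

        <a href="http://www.example.com/" rel="nofollow">Example Bay Charters</a>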

  319. Dave (Original)

    IMO, most directories are totally useless, and exist for purposes other than providing a useful resource for people. I have a very negative attitude about them because of what they are: they are there because search engines exist. It's just that the directory Matt used as an example isn't like most directories – not according to Matt, anyway. It sounds like a useful resource that is being unfairly treated by Google, AND Google is intentionally depriving their users of much of that resource. I see no sense in it at all.

    Doug (Heil)

    Use the HTML blockquote tag to quote. Forum-type codes don’t work in this blog πŸ™‚

    Robert G. Medford

    As much as I often disagree with his analysis, Phil C. …

    Perhaps you have read my other analyses closely enough, Robert πŸ˜‰

    Jack Mitchel said:

    Ranking doesn’t bother me so much. It’s the fact that pages aren’t getting indexed that bothers me.

    For me, that's the crux of this. I said it earlier, and I'll say it again – let the rankings fall where they may, but index decent pages, just because they are there! That's what a search engine is supposed to do. That's what its users expect it to do. Allow your users the opportunity to find decent pages – just because they are there.

    I'm seriously wondering if Google really is short of space, as Eric Schmidt (the CEO) said. Matt said that they have enough machines to run it all, including the index – but to run what, exactly? A pruned index? I can imagine a decision being made as to whether to keep adding new machines and new capacity, or to start being a bit selective about what to index. Perhaps Google really is short of indexing space after all.

    Whatever the reason for the new crawl/index function, it is grossly unfair to websites, and it intentionally deprives Google’s users of the opportunity to find decent pages and resources. It’s not what people expect from a good search engine. By all means dump the spam, but don’t do it at such a cost to your users and to good websites.

  320. That should have read…

    Perhaps you haven’t read my other analyses closely enough, Robert.

  321. Hi Matt, thanks for the update, but it does raise a number of concerns. As mentioned by PhilC and Robert Medford above, I do wonder if you are not in danger of dividing the web into the link 'haves' and 'have nots'.

    The web is a very big place and there are very many, very diverse users and publishers – some are very well skilled in the web and code etc., and many others are not (myself included). This is what makes the web the interesting place that it is – you can find real gems of information that you really rate but which may be of little interest to the majority of surfers. There is a site out there somewhere about growing pineapples and other exotic fruit in your living room – great, just a few pages with a real rough-and-ready look to it – but who is going to link to a site like that? With your new BD policies, sites like that will disappear and we'll be left with thousands of bland, corporate clone sites that are SEO'ed to the hilt and are as dull as ditchwater!

    Try looking at some Asian sites, especially Japanese ones, to see rampant creativity – little robots and clowns and racing cars etc., and not an SEO to be seen anywhere!

    Back to the main point, which is that within the web community there are those who are well connected and savvy about links etc., and there are very many more who are not. So some publishers start off with a huge advantage regarding linking strategies and others are always at a disadvantage. If the rate of indexing is to be determined by the number of quality IBLs, the former will always be at an advantage. The unconnected will suffer a double disadvantage: they won't have the benefit of the extra traffic that links provide, and they won't get indexed – therefore they will just fade away.

    The sort of quality links G is looking for are presumably .gov and .edu links and large corporate sites, all of which give a natural advantage to a website if you are well connected and can get a link. Likewise, folks in the SEO and SEM community know their way around and can easily get links. But what about the small enthusiast webmaster, the small business or hotel, and small community sites? How are they going to get quality links to their sites? They have to rely solely on reciprocal links with similar sites. From what I can gather from this blog, these changes will wipe out all of these sites. But why – they are the lifeblood of the web – they are what keeps it going. G will stifle the web's diversity if you are not careful.

    We'll end up with a web full of optimised, cloned, corporate brochure sites and thousands of blogs talking about the web in the good old days!

    Anyway, that's enough of that. Have a great break – we expect to see some nice pics when you get back. Oh, and don't take any electronic devices with you – camera excepted!

    regards dave

  322. I think this is why I like MSN. They seem to rank pages based on what the page is about, rather than on the spammy linking techniques that seem to work in some other engines.

    My site (yes, talking about my site), for example – there are only two websites on the subject in the whole of the WWW. MSN recognises that my site is relevant to the topic, whereas Google doesn't see it as relevant at all; in fact, Google decided to drop the pages it once indexed. Now I read that this could possibly be because Google doesn't think I have enough high-profile IBLs?

    I'm not trying to knock Google, because I like Google in general, but I know that when people search for things relating to my site, and would be glad to find it, they won't, because Google dropped the pages and doesn't rank them.

    Just another thought to throw in – how can people naturally link to a site they can't find?

  323. All right Matt, this is a preventive intervention! I'm not asking to go into Google purgatory, just having some fun, because sometimes you have to laugh to keep from crying.

    I was reading an SEO forum where a discussion came up regarding link bait and you. Well, my gears started turning (the engineer in me), and I threw up a quick blog post with my attempt at graphic arts.

    I won't spam your blog with the link, but if you are interested, it's one click away from my URL.

    John

  324. I totally agree with DavidW and the rest of the folks who wrote about dividing the net into haves and have-nots. At the moment it all boils down to whether you play the backlink game or not. But even if you want to play that game, for some non-commercial sites with good content that's just not feasible. If you're in a niche like ours, with an enthusiasts' website about Audi S and RS models, it's quite hard to get decent links, and it gets even harder if you're in a niche and your website's language is not English. Where should we get that many high-PR links from, to get a deep Googlebot crawl into our discussion board topics? English websites usually don't link to us or blog about us. Of course we're using Sitemaps, but that obviously doesn't help as long as good IBLs are missing. It's a lot easier to play the backlink game in the English market because it's so huge.

    Cheers,
    Jan

  325. I totally agree with Nicky.

    I've been checking whois data on websites that have lost their indexed pages, and on those that are mostly intact or at least show a load of old pages.

    So far, anything less than 6 months old has been dropped, and anything older is still there or has a load of old supplementals showing.

    What does this mean? You don't get indexed until 6 months after registering your domain?

  326. Matt, just a comment for your consideration…

    If, as you say, there's no server crisis or problem storing data at Google, then how do you actually see the new crawling method benefiting users in terms of relevancy?

    Certainly there’s a lot of “spam” content that wants to be indexed, but there’s also lots of new “good” content that wants to be indexed, too.

    It used to be the case that you could get listed in directories, get link exchanges, or buy a couple of ads to help the spiders know you were there.

    Seems to be the case that Google is intent on killing these methods.

    In which case, how on earth is a new and useful site supposed to get useful links?

    The suggestion seems to be that a site must be exemplary to get the links and indexing, but surely you are aware how difficult it is for newer sites with good content to be exemplary?

    Not ranking sites for some types of links was one issue – it’s understandable – but not even indexing sites to any degree on those grounds isn’t going to be helpful for anyone.

    It used to be the case that Google wanted to index the entire web – to access the content that was normally difficult for search engines to find – and crowed about the huge size of its index.

    But now that index is backfilled with supplementary junk that very commonly comprises nothing more than long-dead URLs and 404s. And this type of content is preferable to new content?

    I have to say, the situation does sound more like a server problem, with the indexing issue simply being Google's immediate response to it. In which case, I can only hope this is true, and that normality will return, because otherwise you will simply continue to provide less and less relevancy in your results.

    2c.

  327. The core of my disappointment is that the Internet is no longer a level playing field. 'Back in the day', the Internet provided an unprecedented business opportunity for anyone with a little gumption who was willing to put in the time and effort. By way of perseverance and elbow grease, and minimal capital (depending on how much I did myself), I could build a site that could compete with the 'big boys'.

    That's no longer the case. Developing an online business now is like trying to open a hardware store next door to Home Depot. Site age, backlinks, link age, link churn, the degrading of purchased and reciprocal links, and other filtering factors have more and more influence on position, while actual content seems to matter less.

    Google isn't Walmart. Google is the Department of Transportation, and all the roads it's building lead more and more to Walmart, and less and less to Mom and Pop's Tool Emporium.

    The Internet started as a democracy, with everyone equal. In almost any eCommerce or service segment, however, it has evolved into a monarchy, with stores like Amazon, eBay, Walmart, Target, etc. ruling as kings while the serfs fight over the scraps and try to eke out a living.

  328. Hi Matt,
    I read with interest the bit about URLs with hyphens and the issue there had been.

    I checked my sites and, sure enough, those with hyphens seemed to have been hit hard with pages removed – one site down to one page only.

    You suggested there was a quick fix, but as of now there has been no difference in my sites.
    Do you think the main fix will make a difference?
    Or are you suggesting that where my sites are now is their normal state, and that this issue has now been completed?

    Also, is there a time difference for the UK as opposed to the USA?

  329. lol, did someone say "Mom and Pop's Tool Emporium"?

  330. Some thoughts and comments in random order.

    1. It seems most of the spam sites we talk about contain AdSense adverts; if Google manually approved URLs first, I'm sure that would help cut down on spam.

    2. People have mentioned 3-way links; what are worse are 2.5-way links, which a lot of small site owners get tricked into:

    Site A links to Site B
    Site C links back to A
    and Site C is just some spam directory or DMOZ clone that is owned or run by Site B.

    3. But I wonder if all of this is an 80/20 or 95/30 chop.
    At a guess, 80% of people look at 20% of the sites,
    or 95% of people look at 30% of sites.

    4. I thought expired domains were automatically dropped from the index and not reindexed anyway?

    5. Blogs seem to be the new forums. Years ago everyone had forums but dropped them, because if no one ever visits the site then no one ever posts. How many forums have you seen with 2 posts and 5 registered members? Now blogs are similar: all these sites have blogs with just a few entries which aren't of any real interest,
    e.g. This is my new exciting blog
    or Updated the widgets page yesterday
    or This site had 25 hits yesterday
    So what! How is that valuable content for visitors?

    6. I also agree with ‘Not pwned Adam’ about footer links

    7. Extra content, every widget site now has extra pages like
    The history of widgets
    Taking care of your widgets
    Widget traditions
    News articles about widgets

    Not because they provide a valuable service to the visitor but just because they want to rank better

    8. A few people have mentioned ecommerce and using AdWords to provide traffic; as people have already said, isn't AdWords just paid-for links?
    Also, AdWords isn't cheap enough if you are selling low-ticket items.

  331. This thread is unique in two ways:

    (1) I believe that it’s the biggest thread ever in this blog.

    (2) I don't think that anybody has agreed with Google about this issue. Probably most of this blog's regular contributors back Google to the hilt, but I don't recall anyone doing so on this issue. Even Doug Heil backed off and didn't come up with a good reason why the health care directory shouldn't have all of its pages indexed, and he's Mr. Whitehat.

    Doesn't this tell Google something, Matt? The overwhelming opinion from all sides is that Google is doing this very, very wrongly. Nobody is talking about rankings, and nobody is talking about spam. Everyone, including the hard-line Googlites and spam haters, is talking about Google being very unfair to ordinary websites, and to Google's own users.

    There’s a *very* big message here for Google.

    If Jill and I are in agreement (somebody mentioned that – I don’t know personally), then Google really should take notice πŸ˜‰

    Nicky's site is one of only two in the world. It's an information site – but it's a goner.

    I repeat – there is a *very* big message here for Google.

  332. It's kind of like playing darts after having been spun around a few times while blindfolded. I might hit the board at some point, but I'll have no idea how or why I did it, and the odds are against it. And I'll never be able to replicate it.

  333. What webdango said.

    The problem nowadays is that webmasters have become OBSESSED with SEO. And why? Because of the way the major search engines work. They're obsessed with rankings. They will forfeit good content and come up with gibberish that just happens to have the right keyword density for certain keywords. I am one of those unfortunate souls plying a trade in e-commerce. The top SERPs in my sector are also an embarrassment to my sector – extremely low on content, but tweaked to the max in terms of SEO. I did a search for "web design UK" on a particular search engine, and the top result was a web design outfit that hadn't updated their site for 2 years. Even more, they claimed to have won an award – when I clicked on the 'award link', it was a scam site giving out 'awards' to anyone who joined their affiliate program. But their site was keyword-stuffed to the brim, so they get to be at the top of the SERPs. What a total joke. Where's the quality?

    We need to forget about the major search engines. Seriously. I'm concentrating all my efforts on local trade – that means getting out in my car and meeting people. I'm placing adverts in certain magazines. I'm doing some telesales. I tell you what – it's working. It's slow and hard work, but it's the way to get things done in 2006 if you haven't got a keyword-stuffed website with 500 bought backlinks from high-PR sites.

    What the internet needs is for Google to get smaller, and for many other search engines to enjoy their share of search traffic. Web standards should replace SEO. SEO isn't adding any value to the web – it's making people write text for robots, not humans. Reward clean HTML (robots can do that). Reward rich content (employ a human). Ban spam quickly.

    p.s. I changed my default home page from Google to MSN.

  334. (And btw, to whoever it was up there who mentioned putting a "designed by" credit on a client site – I actually agree with you, and stopped doing this years ago, even though I'd gotten at least two projects directly from such a link. Somehow it just doesn't look right or professional anymore. So there's three of us.)

  335. [blockquote]Google is broken. I cannot find anything I search for, and the hard work put into a few quality sites that have been around for years is being negated and de-indexed, page by page, day by day.[/blockquote]
    That is your opinion, based on the opinion of a website owner/webmaster. The thing is, your opinion should not be the major issue for a major search engine. The "users" of that engine, who do real-world searches looking for products or services, are the players that engines actually put their priorities on. If those "real" people doing the searching find they don't like Google anymore, they will seek out a search engine that serves their needs.

    This is all basic stuff… survival of the fittest. If people seek out other engines for their research on products and services, then that other engine will rise to the top, right? So far, I'm not seeing that at all. The great majority of searches are done at Google. Website owners can claim "bad Google" all they wish, but it's the real searchers who have the most control. … common sense.

    This thread is still pretty much all about the "linking" thing in many minds. Even after Matt stated in another post that this is "not" all about linking, members still insist on thinking that it is.

    Remember that "each" website is different from every other website. Each site has problems that the owner really does not understand or know about. It's very true that links could be a problem for "this" site, but it's also true that something other than links could be the culprit. Unless or until you have someone from Google specifically "look" at your individual site, or someone more knowledgeable takes a look, there is absolutely no way you can read anything into your individual site's problems from a very blanket, general statement about "links".

    Further, it's certainly not in the best interest of an engine like Google to state specifically in a blog how a website should be built or how it can avoid penalties, etc. That's like giving our "secrets" to the people in the world who want to kill us. It simply makes zero sense for an engine to state exactly how its algo works at a given point in time.

    Google is doing a great job of reaching out to you all. The vast amount of info they are giving these days should really be appreciated. But don't think for one second that a general statement 'must' pertain to your individual site as well, as that couldn't be further from the truth.

  336. Hi guys,

    I'm a bit worried about the simplistic concept of "relevant" or "related" content that MC uses when he talks about linking and reciprocal linking.

    I'll explain what I mean with an example: we are a hotel reservation website, and we deal with hotels in various destinations around the world.

    Our "related resources" are the ones that would be _USEFUL_ to a traveller.

    As the traveller will book the hotel with us, the rest of the resources are "complementary" resources, not competitive resources.

    Examples of what we link to, and what our travellers want us to link to (as these are useful things to know if you have already booked, or are about to book, a hotel):

    – Car rentals
    – Airport transfer services
    – Bicycle Rentals
    – Art Galleries
    – Cinemas
    – Museums
    – Theaters
    – Bars
    – Food Festivals
    – Restaurants
    – Casinos (yes, if you book a hotel in Las Vegas, you want to know the best casinos if you don't have one inside your hotel)
    – Clubs and Discos
    – Festivals & events
    – Nightclubs

    I also have another 195 categories of resources that we regularly link to in order to build a good service for our hotel bookers.

    As you see, these are all hotel- and travel-related resources, which makes our websites heavily visited and one-way-linked, simply because this is useful info for a traveller who wants to book a hotel and know more about the area.

    NOW: I'm worried about what MC says in his blog, and about the use and definition that the whole SEO world has given to "relevant/related" content.

    It should be natural that a website will link to COMPLEMENTARY resources, not COMPETITORS. Therefore, the keywords to be inspected in our outgoing links are 100% different from what we sell.

    Therefore, I'm deeply worried about the concept of "related" that Google will apply, or is applying, in evaluating what type of links you have on your pages.

    MC says:

    "another real estate site……I checked out the site. Aha, Poor quality links…mortgages sites…."

    Now: is MC aware that mortgage sites are natural, relevant and pertinent links if you are a real estate agent, as you might want to give related services to your visitors, telling them how to find the money to buy what the agent sells?

    Or does MC search for related content in the simplistic terms of "real estate words are good, anything else is bad"? I mean: is Google even thinking about the fact that a real estate site cannot link to a competitor, but will be more likely to link to complementary services?

    In short: do Google and MC want us (a hotel reservation service) to link to Hotels.com because it would be "relevant" (complete nonsense, as they are our competitors), or is Google "mapping" the related (complementary) services for every industry?

    I doubt that Google has a map of every complementary service for any given industry; therefore, I'm afraid that "related" for MC means "same topic, same industry… competitors, essentially".

    Would MC want Expedia to link to Orbitz, in order to evaluate Expedia's link as relevant?

    Or would MC and Google rate Hotels.com better (or at least not worse) for linking to Avis or Budget?

    Thanks

    I don't think that anybody has agreed with Google about this issue. Probably most of this blog's regular contributors back Google to the hilt, but I don't recall anyone doing so on this issue. Even Doug Heil backed off and didn't come up with a good reason why the health care directory shouldn't have all of its pages indexed, and he's Mr. Whitehat.

    Just because an opinion wasn’t openly voiced doesn’t mean that people don’t agree, Phil. However, what tends to happen is that the voices of discontent drown out the silent majority.

    Since you apparently think no one agrees with Google on the issue: while I don't see an issue as such, I can see exactly what Matt is saying. And it's quite simple:

    Look inside before you look outside.

    No site is perfect. That includes your site, that includes my site, that includes Matt’s site, that even includes Google. No matter what, there is always room for improvement.

    In every case Matt has discussed here, the webmaster made a mistake. It isn't always an on-the-page factor, and in the case of the health care site it's a factor that only Google would be privy to, but there has been a factor each time.

    BigDaddy has served to expose a larger number of errors than were exposed in the past – things that were previously tolerated or "forgiven" but aren't any longer. It's not perfect. It's got a ways to go (particularly with scraper sites, but that's a big mother of an issue). But it's certainly on the right track.

    Doug Heil gets this, h2 definitely gets this, Dave (Original) gets this, I get this, and I'm sure there are others, but there are just too damn many comments at this point. If I ignored yours and you do get it, it's an unintentional and honest mistake and I apologize in advance.

    As far as the health care site goes, I’m not sure if you read the comment I made on it, or for that matter the answer Matt gave because it was fairly buried. So I guess benefit of the doubt applies here.

    And now, the answer Matt gave (side note/suggestion for Matt: the next time you want to show why something's wrong, either bold it or put it in its own paragraph, so that it can't get missed):

    Aha, the owner said that they wanted to kill the www version of their pages, so they used the url removal tool on their own site.

    The health care site used the URL removal tool on their own site, in effect asking Google to delist it.

    What else is Google supposed to do at that point? "Oh, they didn't put in both the www and non-www versions, so they probably only want the non-www version. They don't have a 301 redirect to accomplish this, but let's just go ahead and guess at what they mean anyway. Then we'll have a nice fat canonical issue to deal with from webmasters who wonder why their non-www is listed and not the www version."

    I don’t think there’s an issue at all, other than what people need to do to fix their own stuff.
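
    (For anyone wondering what that missing piece looks like: a minimal sketch of the usual hostname fix, assuming an Apache server and example.com as a placeholder domain, is a 301 in .htaccess that folds one version into the other.)

        # send www.example.com permanently to example.com
        RewriteEngine On
        RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
        RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]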

  338. There are still many sites with hidden text coming up high in the Google rankings. Also, there are still a TON of dead links out there. As always, thanks for the update!

  339. So is it OK to use reciprocal links if they go to on-topic, real, non-spam sites – lawyers linking to lawyers, dentists linking to dentists? I know of a dentist's site where all of his links come from paid directory listings or his reciprocal link program. His site has a directory that he uses to exchange links with other dentists. And this dentist ranks for the terms he wants to.

  340. Doug Heil – "That is your opinion, based on the opinion of a website owner/webmaster. The thing is, your opinion should not be the major issue for a major search engine. The 'users' of that engine, who do real-world searches looking for products or services, are the players that engines actually put their priorities on. If those 'real' people doing the searching find they don't like Google anymore, they will seek out a search engine that serves their needs."

    Ummm, many of us ARE regular searchers, also looking for services and products. Many of us are experts in our industry, in that we know which sites are complete junk and which sites "should" be listed for the average Joe. Also, if every site one day were to exclude Google from indexing, you would see just how important us webmasters are – even more important than the average Joe surfer, actually. If that happened, Google would have no product. No matter how much they tweaked the algo for the average surfer, it would do no good, because the base product (us) would no longer exist. In business, the PRODUCT always comes first, with the "user" in mind. Service of the product comes second.

    Remember that Google was built by the PR of us webmasters. We helped start it, even before the media PR took off. Only afterwards did the average Joe surfer follow along. The same CAN and WILL happen in reverse. I could walk up to 20 people who know I do business online and say "Google is no longer the thing – MSN is now the popular engine," and people would actually listen. "He does business online and knows what he is talking about."

  341. Thanks for the update.

    Referring to the comments about irrelevant links and link networks, it appears G is trying harder to give priority to relevance. I always did see G as the SE with the most weight on relevance. Naive maybe, but I'm convinced relevance will win the search market.

  342. Like someone above said, at this point I don't care as much about where I rank as about just being indexed. Looking at what Google has of my site: if I do a site: search, I find about 100 pages in the "main" index (a bit under 10% of the site, I guess) and the rest are supplemental. Yet when I do a search quoting a sentence from a supplemental page – no results found. I thought that when the "main" index was exhausted, results were supposed to be pulled from the supplementals? Have I missed that somewhere along the way? From what I've tested, the supplementals don't seem to be used in general searches at all, EVEN when the main results come up empty. It looks like most of the site is indexed, but the supplementals are just not being offered up.
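
    To be concrete, the two checks I keep running look like this (example.com standing in for my domain):

        site:example.com                                    <- pages show up, marked Supplemental Result
        "an exact sentence copied from one of those pages"  <- no results found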

    Sure, I probably don't have tons of quality links coming in. I can do a link: search and see only a couple (one from another of my sites). But if I search for "mydomain.com" then I see tons (and most of what I've looked at really are links, not just the text). Of course, they may not be "worthy enough" links to pull my site out of supplemental banishment, but amazingly some of those sites seem a LOT easier to find in the results, which makes me think they have higher PageRank.

    My biggest concern is the uselessness of Google's "site search". Since text from the supplementals is not searched, I've had to put in another site search so that things can actually be found.

    As for the lower crawl priority for sites that don't have quality incoming links: I'm getting crawled every day. Just out of curiosity I checked a couple of pages against the site: search, and indeed many of the supplemental pages are getting crawled. It just doesn't quite add up. From what you're saying, here's what I would expect: with lower-quality inbound links, pages are more likely to go supplemental and get crawled not every day but once in a blue moon, while pages that are "higher quality" thanks to their backlinks get checked more often. Then, from the search end, a search term pulls up results in the main index, with supplementals offered as an additional resource; and if a term pulls up NO results in the main index, the supplementals are used to give an answer. This doesn't seem to be what's happening. I'm getting almost continuous crawling, even of supplementals; they're not updated in the index (old caches in the site: search for supplementals); and they don't show up as even being there when a quoted-text search is done.

    I've already documented some of my other frustrations with placement in the results in other venues – how my MAIN computer service site always seems to be the dead last result, even for the quoted text of the title of the page (which is quite long and includes my name). It's almost as though I'm being penalized for something else, although I don't know what it could be other than the lack of "quality inbound links". What's ironic is that most of the sites that come in above me in that search are actually linking to my page… go figure…

    But to bring it full circle: what is most concerning is not "where I rank" but the brokenness of "site search", and the fact that Google is now useless to me for finding things I've written on my own site. What's truly ironic is that there is VERY current and full coverage in the blogsearch.google.com area, but I don't think there's a "site search" box I could put on my site for the blogsearch.

    Are supplemental results supposed to be offered up if no results are found in the main index?
    Is there a penalty against my main site?

    Thanks for any feedback.

  343. Google is just one website. The Internet consists of billions of pages. Google has (surely?) reached its zenith in market share. Businesses that are turned off by the way Google works will find alternative methods. This is the way business works. Google can continue to ignore the complaints all it likes. It will see its search traffic go down, and advertisers will use other PPC networks. You can be the main player and choose not to listen to your customers. Eventually you lose business. Eventually competitors find new strength in your complacency.

  344. Adam.

    I don't recall anyone agreeing with Google on this issue – that's what I said. If people chose to be silent, I can't help that, but such silence is rare in this blog.

    I asked Doug to come up with a good reason why the health care site shouldn’t have all of its pages indexed, and he didn’t come up with anything. I’ll ask you to do the same – please come up with any good reason why the health care site shouldn’t have all of its pages indexed. We’re all ears.

    I didn’t forget that the owner had made a mistake with the delisting thing, but that wasn’t Matt’s answer to the problem (the delisting had lapsed some weeks ago). Matt’s answer was:

    That said, your site also has very few links pointing to you. A few more relevant links would help us know to crawl more pages from your site.

    Now why would Google need help to know to crawl more of the site’s pages? Google already knows that the pages are there – they have URLs. And why would more IBLs help them to know?

    What difference does it make if a site has only one IBL or a thousand IBLs? Does having only one IBL make it a bad site that people would rather not see? If it does, why have ANY of its pages in the index?

    These aren’t rhetorical questions, Adam. I’d like answers to them please.

  345. Remember that Google was built by the PR of us webmasters. We helped start it, even before the media PR took off. Only afterwards did the average Joe surfer follow along. The same CAN and WILL happen in reverse. I could walk up to 20 people who know I do business online and say "Google is no longer the thing – MSN is now the popular engine," and people would actually listen. "He does business online and knows what he is talking about."

    While I do believe in the awesome power of GWOM (Geek Word of Mouth), it was far from the only factor.

    What about ISPs that use Google for their default search?
    What about the Netscape tie-in that existed for years?
    What about AOL (I don't count AOL as an ISP, because anything that destroys a TCP/IP stack is not a real ISP)?

    And once something like that is entrenched into users’ behaviour, it’s very difficult to remove it.

    I could probably do the exact same thing you just said… tell 20 people to use MSN and get them to do it. Hell, I could walk into one office alone and have it done in about 30 seconds. But it wouldn't accomplish a damn thing, because those 20 people would tell no one else. So I've told 20 people to switch, out of the billion that presently use Google. If all the webmasters who hate Google right now did the same thing, you might get a million people to switch. A drop in the bucket.

    “Us webmasters” don’t all share the same point of view anyway. I don’t have a problem with Google SERPs for the most part and finding what I want, assuming I’m doing a completely objective search (about the only thing it seems to give me problems with are used car parts). Doug obviously doesn’t. And even if we did, we’re a small portion of the community. You’re a drop in the bucket, I’m a drop in the bucket, Doug’s a drop in the bucket. As he quite rightly pointed out, it’s the overall user perception that matters, not what a few egocentric people who didn’t get their way and want to bitch in Matt’s blog think.

  346. Google faces an insurmountable problem. Making EVERYONE happy.

    What I'm starting to realize is that we aren't going to rank #1 for every single keyword we desire. It simply wouldn't be fair, because there are sites that are more relevant for certain words than ours might be. It's frustrating, yes, but it's life. I'm actually starting to respect the sites that are better, or more relevant, than mine and rank higher for certain keywords. I say, OK, this site is solid and it deserves to be here. I still want to beat it, though, so I may optimize and enrich my site to compete. Complaining won't solve anything…

    However, the annoying aspect of all this is the garbage sites interspersed among the results. That is what pisses everyone off, including me. You know these sites don't belong above yours, or among the results, period! And it feels so obvious, yet no one seems to be doing anything about it. The problem is that there are billions of sites and keywords, so what seems like a simple fix is multiplied by a billion or so, making it a severe task.

    So in closing: instead of worrying about the garbage sites, worry about the sites that are better than yours and that rank better for your desired keywords. That should be the focus. The junk will always be there. The web is simply a microcosm of life, and life contains junk.

    Word.

  347. The heart of the issue:

    An 'objective' algorithm making a 'subjective' decision. The only person who can make a true assessment of the 'value' of my site, or any other site, is the user. Except the user will never get to my site, because a computer formula has decided it isn't 'worthy'.

    Doug said "This is all basic stuff… survival of the fittest."

    No, it's not. Like most things in our society, it has devolved to what it always devolves to: he who has the money makes the rules – although in this case, it's he who has enough money to brand through traditional media who gets the ranking.

  348. Phil…normally I would tell someone who asked me a question in the manner that you did to stick it straight up his ass for being so damned arrogant. (Seriously, dude, you need to let up on that. You come across as being very elitist sometimes.)

    But I’m not going to do that in this case.

    Here’s how the scenario, as I see it, would have played out:

    1) Health care site gets listed in Google, has a series of 6 IBLs.
    2) Site inadvertently asks to get delisted. (Webmaster error.)
    3) Time lapses on the delisting request and again, webmaster is unaware.
    4) Webmaster presumably doesn’t file a reinclusion or resubmission request (I don’t know for sure…Matt would have to fill in this blank)

    At this point, the site is for all practical intents and purposes a “new site” again. How does Google know whether or not it’s worthy? There would have been no fresh IBLs to suggest that other webmasters find the content worthy enough of a backlink, and the webmaster had not indicated to Google that he/she wants to be back in, other than by submitting what apparently was a manual request via email.

    Matt may have offered an opinion, but it’s just that…an opinion. It’s subjective. So if Matt likes the site, it should be in there? What happens if he hates one? Can you imagine the ramifications and bitching that would go on if Google started filtering results on personal whim?

    Now…at this point, two things are known:

    1) There are very few IBLs, and presumably none since the reinclusion request was made.
    2) The webmaster had submitted a delisting request and had done nothing to indicate that he/she was still interested in being part of the index.

    How would anyone or anything evaluating the scenario objectively be expected to know whether or not a site should be included in the index given those parameters? It was a site with very little external credibility and even less recent credibility that had already said it wanted out (accidental or otherwise).

    Google doesn’t have anything else to evaluate on, other than the inbound links in this case.

    It can’t go based on the spidering of the page alone, because that may not reveal all of the information related to that site.

    It can’t go by Matt’s personal opinion of the site, because it’s one person’s opinion, very subjective, and prone to human error.

    So the only thing it can go by is a series of links provided by other webmasters. In this particular case, the recent backlinks would be more beneficial since it would establish that the health care site is being viewed in a positive light by those who link to it.

  349. “What about AOL (I don’t count AOL as an ISP because anything that destroys a TCP/IP stack is not a real ISP)?”

    AOL came around AFTER they became popular. Even then, most people (even when Google was powering Yahoo) thought that AOL search was AOL’s own and that Yahoo search was Yahoo’s. How many “average surfers” knew that LookSmart was feeding MSN for years before MSN developed their own?

    “because those 20 people would tell no one else.”

    How do you know for sure? You state it as fact, but it is just an opinion.

    “‘Us webmasters’ don’t all share the same point of view anyway”

    Nobody shares the same point of view due to unique life experience.

    “I don’t have a problem with Google SERPs for the most part and finding what I want”

    The key phrase is ‘for the most part’. The problem is, if sites in an industry are not fully indexed, what information, points of view, and products are you missing that COULD be there? That is why we are complaining, NOT because a site dropped from #4 to #20 or whatever; with full indexing there is at least fair and equal opportunity.

    “As he quite rightly pointed out, it’s the overall user perception that matters, not what a few egocentric people who didn’t get their way and want to bitch in Matt’s blog think.”

    And where does that overall perception come from? Sometimes a few egocentric people have changed the world, incited revolutions, etc.

    “And once something like that is entrenched into users’ behaviour, it’s very difficult to remove it.”

    The thing is… it really is not all that difficult to change it. A behaviour can be changed in an instant. A moment of decision. All there needs to be is something to interrupt the pattern, plus a viable alternative, with that alternative being reinforced. That is all! The difficult part is just being able to do it.

  350. I am new to SEO and trying to learn, and as I read in the post, a lot of the time there are “fake”, software-made websites with bad content, built only for AdSense.

    Anyway, I have a question: what about new websites? Are they indexed, given that they have no PR and no links?

    So people will not find my 3-month-old site when they type a search into Google.

    And what about the future of article submissions?

  351. Adam.

    I’m sorry if the way I write isn’t very good, but I’m just trying to write logically and put points across as clearly as I can. I don’t intend to talk down to anyone, but asking questions in such a way that answers are almost a requirement is intentional – to make points 😉

    The things that we know about the health care site are:

    (1) Some of it was delisted by request, but not all of it, so it doesn’t have to start from scratch.

    (2) It has six IBLs that it’s reasonable to assume are not new. That’s reasonable because it was fully indexed before the delisting mistake (at least one IBL was necessary), and since then some of its pages have stayed in the index.

    (3) Matt, who is the expert at finding wrong things with sites, didn’t see anything wrong with the site – and he looked – so the problem isn’t internal.

    (4) Matt said clearly that, if the site got some more IBLs, then it would get more pages indexed (my paraphrase).

    I haven’t suggested that all of the site’s pages should be in the index at this moment in time – we know it takes time for that to happen. What I’ve asked for is any good reason why all of that site’s pages should not be indexed.

    The reason I’ve asked is because Matt’s comment about the site made it clear that, with more IBLs, more of its pages would be indexed, and by inference, that if the site doesn’t get more IBLs, then it’s not likely to have all of its pages indexed. Specifically, he said “With that few links, I can believe that out toward the edge of the crawl, we would index fewer pages.” I want to know why.
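
    As I understand that phrase, “the edge of the crawl” is just the tail of a priority queue: pages are fetched best-first until some budget runs out, and whatever is left at the edge never gets fetched. Here is a toy sketch, in Python, of my own mental model (nothing official, and every number in it is invented):

        import heapq

        # Toy crawl frontier: pages are fetched best-first; a per-cycle budget
        # means the lowest-priority pages at the "edge" are never fetched.
        def crawl(frontier, budget):
            heap = [(-priority, url) for url, priority in frontier.items()]
            heapq.heapify(heap)
            fetched = []
            while heap and len(fetched) < budget:
                _, url = heapq.heappop(heap)
                fetched.append(url)
            return fetched  # anything still on the heap stayed unindexed

        pages = {"/": 9.0, "/about": 3.0, "/deep/page87": 0.2}
        print(crawl(pages, budget=2))  # ['/', '/about'] - the deep page got cut off

    On that picture, more IBLs presumably push a site’s pages further from the edge. Which brings me to my objection.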

    My big objection is that IBLs have nothing whatsoever to do with the quality of a site, and should have nothing to do with how many of a site’s pages are indexed. That’s the reason I’ve been using that particular site as an example. If you can come up with a reason why all of its pages shouldn’t be indexed, then please tell me. Or if you can come up with a good reason why the number of IBLs *should* be used to determine how many of a site’s pages should be indexed, and how many to leave out, please tell me, because I can see no sense in it.

  352. I totally agree with DavidW and the rest of the folks who wrote about dividing the net into haves and have-nots. At the moment it all boils down to whether you play the backlink game or not. But even if you want to play that game, for some non-commercial sites with good content it’s just not feasible.

    If you’re in a niche like our enthusiasts’ website for Audi S and RS models, it’s quite hard to get decent links, and it gets even harder if you’re in a niche and your website’s language is NOT English. Where should we get that many high-PR links from to get a deep Googlebot crawl into our discussion board topics? English websites usually don’t link to us or blog about us. Of course we’re using Sitemaps, but that obviously doesn’t help as long as good IBLs are missing. It’s a lot easier if you play the backlink game in the English market because it’s so huge.

    Looks like non-English niche websites with good content are going to be the losers?

    Cheers,
    Jan

  353. So isn’t there a tremendous amount of ambiguity in who exactly determines the “quality of links” you speak about? Will the site that I am describing be penalized somehow (fewer pages crawled because of the digital camera links and incorporation links)?

    Also – the site is big and ranks in the #1-5 spots for high traffic keywords in the webmaster space in Google. How is this consistent with your post?

    Thank you for giving us an outlet to read about Google and respond!


  354. Incidentally, Adam, Google did become popular through the buzz that web-type people caused. The tie-ups that you mentioned came afterwards.

  355. “I asked Doug to come up with a good reason why the health care site shouldn’t have all of its pages indexed, and he didn’t come up with anything. I’ll ask you to do the same – please come up with any good reason why the health care site shouldn’t have all of its pages indexed. We’re all ears.”

    Show me the exact site in question, and I might be able to give an “exact” answer of some kind. Phil, you know better than most that it’s impossible to give a general answer that will fit ‘most’ other sites as well. I thought I gave you some possibilities in a prior post? How many more answers do you want me to give? I know they’re not the answers you were looking for… like “bad Google”… but they’re the only answers I can give without actually viewing the site, right? Please read my prior post where I gave some answers on that health care directory.

    The Adam that doesn’t belong to Matt; … who the heck are you? It seems you read my/our stuff as you are “spot on” with my way of thinking. LOL Great post!

    arubicus… read “the other Matt’s” post again and again. “We” are a very small minority compared to internet users as a whole. I sometimes think that this “little” community of webmasters/owners/SEOs actually thinks we are some huge majority of the internet, and that Google and other major engines should bow down to us because of it. Believe me; we are very small.

    I agree with you that “we” indeed are regular users/searchers as well, but we are small. You can’t only look at certain groups who may be unhappy with google serps right now, but you have to look at the big picture.

    It IS only survival of the fittest. The major search engine with the most users who do searches daily is the engine that gets the majority of searches. It ‘is’ common sense stuff, right? Just because a few owners are unhappy does not mean the “majority” are unhappy as well. IF AND WHEN the majority are unhappy with Google, then that majority will move to find another search engine, right?

    That’s called…. survival of the fittest, and has zero to do with how much money anyone has. It does have to do with common sense.

    Keep this in mind as well: do you all realize “how tough” it would be to give an answer to a question if you were an employee of a large company, and that answer had to appease a whole bunch of others besides the site it was given to? My goodness; what a tough job it would be. What’s the result of Matt giving answers to questions in here?

    The result is that many more questions pop up because of that particular answer he gave. Why is this? Because “each” site has its own little set of many hidden problems that are impossible to know about unless the site is manually reviewed and diagnosed. Speculation about problems is just that… speculation.

  356. “Incidentally, Adam, Google did become popular through the buzz that web-type people caused. The tie-ups that you mentioned came afterwards.”

    Yep. Us webmasters helped get them started. Usually this is the way it works on the internet; the word-of-mouth buzz is much stronger in the online world. Look at MSN and ASK running commercials for their searches. Not much impact was made. Now if MSN, Yahoo, or ASK creates better search results, then the buzz will be on them, starting through the webmaster community. This buzz sets a foundation of confirmation that they do indeed have better search results, which in turn grounds a basis for a perception change for the “average Joe” (it interrupts patterns, challenges current habits and associations, and sets the basis for decision/change). Since many webmasters and online business owners know their industries, our word tends to hold more weight and resonate longer.

    Major search players, AOL and the like, may switch to whoever provides their users with the best experience. This would be a major pattern interrupt for the “ignorant average Joe” and sets up patterns of change for them. Of course, the more they use the new search, the more new habits form, and the new brand gets associated instead.

    This is the SAME route that Google took!

  357. Matt,

    Kudos to the team for their job on Big Daddy.

    Dropped a set of my sites from over 100,000 site: listed pages to around 50. Average PR = 6. But you still manage to suck up about 400K hits & 1Gig of bandwidth each month.

    WTG. Use my resources and my money to collect and analyze my sites without even allowing an accurate site index.
    NP though, as the sites are very popular and successful with the other SEs.

    I truly hope G attains remarkable growth. Just as the anti-Rockefellers did some 100 years ago.

    “(1) Some of it was delisted by request, but not all of it, so it doesn’t have to start from scratch.”

    It likely isn’t. It just isn’t going to have other stuff indexed.

    “(2) It has six IBLs that it’s reasonable to assume are not new. That’s reasonable because it was fully indexed before the delisting mistake (at least one IBL was necessary), and since then some of its pages have stayed in the index.”

    Agreed.

    “(3) Matt, who is the expert at finding wrong things with sites, didn’t see anything wrong with the site – and he looked – so the problem isn’t internal.”

    No offense to Matt, but again, he’s human…he can overlook something. The initial comment appeared to be a surface-glance kind of thing. It’s probably accurate…but possibly inaccurate.

    “(4) Matt said clearly that, if the site got some more IBLs, then it would get more pages indexed (my paraphrase).”

    That would make sense.

    “My big objection is that IBLs have nothing whatsoever to do with the quality of a site, and should have nothing to do with how many of a site’s pages are indexed.”

    If you look at IBLs individually, that would be an accurate statement. 1 or 2 backlinks, for the most part, shouldn’t make a difference either way.

    But would it not be reasonable to assume that a site with 10,000 backlinks, from various sources and varying degrees of credibility, is more useful to the population as a whole than a site with 1? I think it would. I might not like the site with the 10,000 backlinks, but the majority of other people would. And the variety of sources indicates a prevailing opinion as opposed to that of just one person.

    If the site continues to gain backlinks and credibility, then why shouldn’t it be indexed more often and more deeply? It shows at least one of two things:

    1) The webmaster is sufficiently proud of his/her work to be able to promote it.
    2) Other sites find it to be a valuable resource.

    Now, I don’t know how many backlinks a site should get in order to be indexed fully. That’s an arbitrary number that could be debated until we all turn blue in the face. But a webmaster who’s so concerned about getting indexed, never mind ranked, should have a lot more than 6 links to his/her domain. Links bring in traffic, directly or otherwise… why wouldn’t any webmaster try to get as many as possible?
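
    Just to make that concrete, here’s a toy sketch in Python of how an indexing-depth cap keyed to distinct inbound links *might* work. The formula and every number in it are entirely my own invention, not anything Google has published:

        # Hypothetical illustration only: an indexing-depth cap that grows with
        # the number of distinct referring domains a site has earned.
        def max_pages_to_index(distinct_linking_domains):
            base = 50             # every clean site gets some minimal allotment
            per_link_bonus = 200  # each independent "vote" buys deeper indexing
            cap = 1000000         # a ceiling so no one site swamps the index
            return min(base + per_link_bonus * distinct_linking_domains, cap)

        print(max_pages_to_index(6))      # 1250 - a 6-IBL site gets a shallow crawl
        print(max_pages_to_index(10000))  # 1000000 - a heavily linked site hits the cap

    Under made-up numbers like those, a site with 6 IBLs gets roughly a thousand pages indexed while a site with 10,000 IBLs gets the lot, which is exactly the behaviour people are describing.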

    Here’s a scenario involving a small number of IBLs:

    http://www.google.com/search?q=%22216.89.218.233%22&hl=en&lr=&rls=GGLG,GGLG:2006-19,GGLG:en&pwst=1&filter=0

    For those who don’t know what the IP is, that’s my server’s testing IP address (or redirector depending on what I want to do with it). There are backlinks there…a small number.

    But is any of that content worthy? By my own admission, no. It’s all testing stuff.

    How would a search engine be able to determine something like that? It’s got a link. Google knows it’s there. There’s nothing blocking robots from indexing it.

    Think of the crap, and the potential for manipulation, that my scenario allows for webmasters who decide they want to be sneaky.
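
    (For what it’s worth, if I ever wanted to keep that testing content out myself, a two-line robots.txt at the root of the test host should do it:

        User-agent: *
        Disallow: /

    But nothing forces a webmaster to do that, which is the point: the engine has to make the call on its own.)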

    Let’s take the scenario above and put a slight twist on it (just for hypothetical sake).

    Domain A is bought and used by Company A.
    Company A puts up a site, gets 6 IBLs.
    Company A submits a delisting request.
    Company A lets Domain A expire.
    Company B snatches Domain A for purposes of building a new site.

    Now… would Company B be entitled to the credit for the backlinks that Company A went out and got beforehand? No. They didn’t do any of the work… all they did was snatch a domain name. It’s still a new site, and it hasn’t established anything since the delisting request.

    How many times does that scenario play out? Quite a bit. And that’s not all that different than what we’re seeing here.

    And finally, the biggest problem that not one person who has whined about the IBL issue has yet answered:

    What is a more effective measure to determine whether or not a site provides a valuable resource to users and deserves to be indexed fully, without potentially diluting the existing results?

  359. “What is a more effective measure to determine whether or not a site provides a valuable resource to users and deserves to be indexed fully, without potentially diluting the existing results?”

    This is the question of all questions! Even Google hasn’t figured out the answer. Something to ponder, really. It seems to me it is becoming time to stop playing the manipulation game and the lazy “I created it and I think it is good, so it should rank” whining game, and to start playing the “quality site/business that has something unique to gain the interest of the internet community” game.

  360. The comments on the wasted efforts and strategies are indeed an example of the strength of the human spirit. Like wanting to commit suicide, but just not being able to get it done. The truth is that the art of Google is a myth. They do anything they want, because they have the ball and bat in this game. Webmasters are playing T-ball thinking they are playing for the Yankees. Am I the only one that has noticed that they are a monopoly? That is supposed to be illegal in the US. No wonder they are going to China; that form of government is more along the lines of what Google would want.

  361. Matt,

    I understand you believe no one out there is experiencing any more problems related to Big Daddy. However, is anyone at Google looking at the results being returned now? It seems like Google is partying like it’s 2001. I can find hardly any sign that modern websites are being returned. Is the new ranking priority that a site must be old and not redesigned in 5 years? Is Def Leppard at the top of the charts again? Is Google Search going to be renamed Google Retro? In case you haven’t noticed, things are getting ugly.

    -Jim

  362. Hi Jim, I honestly don’t see what you are seeing as far as serps go.

    http://www.google.com/search?sourceid=navclient-ff&ie=UTF-8&rls=GGIC,GGIC:2005-09,GGIC:en&q=children's+gift+baskets

    has not changed at all…

    http://www.google.com/search?hl=en&lr=&rls=GGLC%2CGGLC%3A1969-53%2CGGLC%3Aen&q=ethical+seo&btnG=Search

    Has not changed at all.

    I could do searches on many, many phrases, and I don’t see any SERP changes that look bad. For everyone who watches over or owns a site that dropped out of the results, I’m sure there is more than one other whose site has stayed the same or has gotten better.

    This thread is a tiny minority of people.

    And just because a website is not shown in its entirety when doing a site: search on the domain does not mean a whole bunch. Has anyone thought about the idea that Google is making lots of changes right now, and really doesn’t want to “spill the beans” until things are all done?

    My forums are showing only a couple hundred pages now, but that hasn’t stopped all the referrals we get from Google daily.

    I simply refuse to get all up in arms about something unless I know things are set the way they will be for awhile. All of this whining, etc. does no good. Making a drastic change “right now” does no good either. Trying to decipher whatever Matt says about something, and then “making” that answer pertain exactly to your individual site’s situation, is certainly doing no one any good either.

  363. “It seems to me it is becoming time to stop playing the manipulation game and the lazy ‘I created it and I think it is good, so it should rank’ whining game, and to start playing the ‘quality site/business that has something unique to gain the interest of the internet community’ game.”

    Now THAT I’ll drink to. 🙂

  364. “Is Def Leppard at the top of the charts again?”

    Some of us LIKE Def Leppard. Damn new music sucks ass now. 80s glam hair rock and acid-washed jeans forever!

    And a little Mecca Lecca Hi Mecca Hiney Ho (for those truly on that higher plane of consciousness and understand that very obscure 80s reference.)

  365. Good post, but there are still too many problems with Google. In the Las Vegas travel market, the top 10 results have been the same for over 2 years for the keyword “Grand Canyon Tours”; for many other search terms, the results have been stagnant.

    We are led to believe these older sites have been grandfathered in.

  366. Why are people still talking about the influence links should have on getting a site indexed? This has nothing, or should have nothing, to do with getting a site indexed. It should affect rankings, not whether and how much of a site gets indexed. As for links showing the popularity of a site, we all know this is not even remotely true. It is very easy for a site to get tons of links, and this should show Google that links are a horrible thing to base rankings and/or indexing on.

  367. Matt, you mention a problem with hyphenated domains that you say you think is solved.

    Could this be affecting pages with URLs like this?

    http://www.domain.com/nice-web-site-in-my-head.html

    If it could be, that could explain quite a lot of the pages I’ve lost from the index.

  368. Dave (Original)

    RE: “Now why would Google need help to know to crawl more of the site’s pages?”

    Perhaps they want quality over never-ending quantity. In other words, if they DID index and list all those ‘other’ pages, the pages would never rank anyway.

    PhilC, I think you are HUGELY mistaken in assuming silence means agreement/disagreement. The VAST majority of the people who come here ARE biased and generally only look at an EXTREMELY minute part of the whole picture. Of these, only those with PROBLEMS generally post.

  369. Caios,

    Vanessa Fox just addressed this very issue on their blog.

    http://sitemaps.blogspot.com/2006/05/issues-with-site-operator-query.html

    And further information can be found in the newsgroup in the response from Google Employee here: http://groups.google.com/group/google-sitemaps/browse_thread/thread/0fc2ae32ef28da7e/961cf2c0421208fc#961cf2c0421208fc

    Where they say, “One issue is with sites with punctuation, which definitely affects you. We’ll keep you posted as we get this resolved.”

  370. Doug. I already explained which site I was asking about – more than once. It’s the health care site example that Matt gave. There is enough there to answer the question based on Matt’s information alone. But I don’t think you want to answer, do you, so forget it.

    Adam. You are talking from a web-type person’s point of view, but I am talking in general terms. The vast majority of people who have sites on the Web wouldn’t have a clue about getting some “buzz” going to gain links, as Matt suggested – and not only in this thread. They just want to put their sites online so that people can find them. It’s not a search engine’s job to decide whether or not a site has value for anyone, or for how many people it might have value. It’s their job to index what they can, and dump spam as and when they find it in their index.

    It’s not a search engine’s job to determine which sites provide the most valuable resource. And even if it were, there is absolutely nothing to suggest that a site with 10,000 IBLs is any more valuable than a site that has 0 (zero) IBLs. For example, which site is more valuable to me right now? The one where I can order a pizza, or Google? I’m starving and I really want a pizza, so the local pizza site is far more valuable than Google is right now.

    You see, a site can have great value to one person, and a site can have great value to a very large number of people. They are both equally valuable because the value of the pizza site to me is as great as the value of the other site is to you, and to him, and to her, etc. The degree of value is equal.

    Can you say that the pizza site should not be in a search engine’s index, just because there aren’t many of us who are likely to look for it? Of course not. Can you say that all of its pages shouldn’t be indexed because there aren’t many of us who value it? No.

    You haven’t noticed the answers that you say nobody has answered yet? I’ll try to make it clearer…

    If Google is short of space, and they need to limit the number of pages in the index, then fair enough – let the most popular sites have the bigger shares, because they are wanted by more people, and IBLs could give some indication of that. But if there is no shortage of space, then limiting the number of pages that a site can have in the index is simply wrong. It’s editorial, and it’s not what Google’s users want or expect from them. It’s not a search engine’s job to be editorial.

    It would be acceptable to take IBLs into account if there was a reason to do it, such as a shortage of space. But there isn’t a reason to do it that we know of, so IBLs aren’t needed as an effective measure to determine whether or not a site provides a valuable resource.

    You can argue as much as you like, but you still can’t come up with a valid reason why any decent, perfectly clean, website should not be fully indexed, given that there is plenty of space in the index.

  371. Dave. I never referred to people who were silent, and I didn’t assume anything about them. I referred only to people who posted in this thread.

    “Perhaps they want quality over never-ending quantity. In other words, if they DID index and list all those ‘other’ pages, the pages would never rank anyway.”

    Perhaps they do want quality over quantity, but if they do:

    (a) IBLs are entirely the wrong metric to use for measuring quality.

    (b) If pages are allowed in the index based on IBLs (and PageRank), and some sites only get some of their pages in, then NONE of the site’s pages should be in, because the low IBLs and PageRank score mean low quality. Of course, it doesn’t, so quality isn’t the issue here.

    (c) It’s not a search engine’s job to decide what their users want to see and what they don’t want to see.

    I don’t think for a moment that Google is doing this in an attempt to index quality. There aren’t any programmes yet that are remotely capable of doing that. I’ve no doubt that it’s to do with spam.

    As for your last point, there is an enormously long search-term tail, and pages will rank highly in it.

  372. That should have read…

    I’ve no doubt that it’s to do with spam, or they really are short of space.

  373. “It’s not a search engine’s job to determine which sites provide the most valuable resource.”

    So it’s the actual website’s job to tell Google that it is the most relevant resource, right?

    I see.

  374. Dave (Original)

    RE: “You can argue as much as you like, but you still can’t come up with a valid reason why any decent, perfectly clean, website should not be fully indexed, given that there is plenty of space in the index.”

    Oh, there is always a reason; we just don’t know for a *fact* what it is. However, Google do know for a fact why. In fact, Matt has stated part of the likely reason (see his disclaimer). You just don’t agree, that’s all. However, I feel VERY safe in saying that Google are in a better position than yourself to determine what and how much they index.

    RE: “I referred only to people who posted in this thread”

    Exactly! These people all have their own barrows to push and are extremely biased. Most post with PROBLEMS, not PRAISE. However, Google (I’m sure) is more wily than simply giving the squeakiest wheel the most oil.

    RE: “It’s not a search engine’s job to determine which sites provide the most valuable resource.”

    But the SE (in Google’s case) probably isn’t determining that as a whole. Other sites are, and Google’s searchers are, by the search terms they use and the sites they visit.

  375. I would like Matt to look at rent.com. Since he reviewed several real estate sites, why not include a big one owned by eBay?

    Look at the bottom of the page and look to where it says “Other eBay companies: eBay | Kijiji | Shopping.com | Epinions”

    How are some of these related to real estate? These are just as bad as ring tones.

  376. The Adam that doesn’t belong to Matt; … who the heck are you? It seems you read my/our stuff as you are “spot on” with my way of thinking. LOL Great post!

    You know that guy who can generally fit into a conversation with most, if not all groups of people, comes in like a bat out of hell, raises some issues, makes people think, and then disappears as if he never existed in the first place?

    That’s me. 🙂

    Okay, seriously…I’m a webmaster of about 7 years (holy Christ, that’s a long time), and to be totally honest there was a time when I would have said a lot of the things that others were saying…why isn’t my site ranking, what’s wrong with Google, etc.

    I then came to the realization that no matter what I think about my own work, it’s going to be biased. So it’s not up to me to decide…it’s up to others to decide. And if they decide against what I’m doing, then I should listen and try to improve where and when I can rather than sit there and throw a hissy fit about it.

    Unfortunately, as with many ideas and concepts that I have, it’s probably a few years too early. 🙂

    I’m what you’d also call a gun for hire…I work for a select few clients now (used to have a lot more, but it didn’t work for me as a business model), and come and go pretty much as I please.

    There are people here (Aaron Pratt for one) who might be able to tell you a bit more about me, since I’m not all that comfortable talking about myself.

    Anyway, that’s my story. What’s yours?

    Phil: I’ll get to your unique brand of inciteful ranting in the morning. I actually did type out a post in response, but it took me too long and the captcha tool kicked in and it got erased, so to hell with it; I’m not doing it again until then.

    I do, however, have one question for you to chew on in the meantime. You say that IBLs are the wrong “metric” (side note: does anyone else hate this word and find it to be a corporate buzzword, or is that just me? Just wondering.) What would be a better way to do it, and why?

  377. Matt,

    How hard of a mod would it be to have the number of unique commenters along with the number of comments? I think this entry may indeed beat both. 🙂

    Also, you should maybe add a compulsory spell-checking step for people posting. Just a suggestion. They’d be able to post anyway, but at least force them to view the mistakes first…

    -Michael

    PS Got the invalid security code thing again; not everyone is going to think to select all and copy just before posting. 🙂

  378. Dave (Original),

    I’m guessing that your background is non-technical? It’s just that you seem to have enormous and unfounded faith in Google’s algorithms. Anyone who knows the first thing about the limitations of any algorithmic approach just wouldn’t say things like “Google know better than you do!”. You know you are comparing an algorithm’s fraction-of-a-second, rule-based perception to that of a person who has spent years developing their site, right? You know that these algorithms don’t actually “understand” anything, right? The truth is, you are just assuming that all of those complaining are lying. A judgement you are clearly not in a position to make.

  379. What is a more effective measure to determine whether or not a site provides a valuable resource to users and deserves to be indexed fully, without potentially diluting the existing results?

    How can a page that ranks 1000+ in the SERPs possibly dilute anything?

    No one can link to a page that they cannot find, and they are certainly not going to find a page if Google refuses to index it. A site should never need more than a single link just to have its pages indexed. Actually, no links should be necessary to have an entire site indexed; telling Google directly with a sitemap submission should be enough.
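
    For anyone who hasn’t used one, a bare-bones sitemap file is tiny. Something like this single-URL example (the exact XML namespace depends on the Sitemaps schema version you target; the URL here is just a placeholder):

        <?xml version="1.0" encoding="UTF-8"?>
        <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
          <url>
            <loc>http://www.example.com/widgets.html</loc>
          </url>
        </urlset>

    Once Google has that, it knows the page exists. Whether it chooses to index the page is, apparently, another matter entirely.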

    This is not about SPAM nor is it about rankings. It’s about being deemed worthy enough to have your pages indexed based upon nothing more than links.

    This leaves it up to the webmaster or site owner to manufacture enough links deemed worthy, simply to get their content indexed. How is this a good thing?

    In the end, it will be the searcher who decides. A searcher looking to buy tomato plants in New Jersey, or looking to find a carpet cleaner in Wyoming, is likely to be just as disappointed in Google as the merchants whose pages they refuse to index.

    Dave

  380. I wrote:

    It’s not a search engine’s job to determine which sites provide the most valuable resource.

    Doug Heil replied:

    So it’s the actual website’s job to tell Google that it is the most relevant resource, right?

    I see.

    You forgot your glasses, Doug. Value and relevancy are not the same things – not by any stretch of the imagination. But since you asked: it’s a search engine’s job to determine relevancy to a search query – it isn’t a search engine’s job to determine the value of a website.

    Dave (Original).

    You are correct that Google is in a much better position than me to determine what and how much they index. They are the *only* people who can make that determination. BUT, they are not in a better position than anyone else to decide what should and should not be indexed. Everyone can have opinions about that. The only difference is that Google are able to go with their opinions, but it doesn’t necessarily mean that their decisions are the best ones, or even the right ones.

    Look. I made an accurate assessment about the posts in this thread. It made no attempt to include opinion that hadn’t been expressed here. You know as well as I do that there are a great many hard-line Google supporters who post in this blog, and they had not expressed support for Google about the new crawl/index criteria, up to the point when I made that assessment. Alright? Please stick to the topic, and forget that sideline. The assessment was correct at the time of writing it, which was way down the thread. And even now, there are only a couple of people who appear to be supporting Google’s new crawl/index criteria, and I don’t think I’ve seen any outright statements of support from them – yet.

    “Phil: I’ll get to your unique brand of inciteful ranting in the morning. I actually did type out a post in response, but it took me too long and the captcha tool kicked in and it got erased, so to hell with it; I’m not doing it again until then.”

    (I always type my posts in a text editor, so I never have a problem with the captcha.)

    Inciteful ranting? Inciteful debating, perhaps, but I left the ranting near the top of the thread 😉

    The only reason that we’re going on and on is because you haven’t yet given me a valid reason why a perfectly good, clean website should not have all of its pages indexed, regardless of how many good, clean, on-topic IBLs it has pointing to it. I say there is no valid reason; you disagree with me, but you haven’t stated a valid reason. Actually, I’m the only one who has offered a reason – a shortage of space – but Matt said they are OK on space.

    I’d like anybody to give me a reason, not just you, but you are the one who continues to debate with me.

    I do, however, have one question for you to chew on in the meantime. You say that IBLs are the wrong “metric” (side note: does anyone else hate this word and find it to be a corporate buzzword, or is that just me? Just wondering.) What would be a better way to do it, and why?

    I’m not over-keen on the word “metric” myself, but it’s what people use these days.

    To answer your question: I said that IBLs are the wrong things to consider for determining a site’s value. My answer is what I said before – it isn’t a search engine’s job to determine the value of a site. It’s an engine’s job to index sites (except spam stuff) and determine relevancy to a search query. It is users who determine value for themselves. So no way of measuring value is needed.

  381. Damn! I wish there was a way to edit these posts, or even just to preview them.

    Everything in the last post, after the first 2 paragraphs in response to Dave, is a response to The Adam That Doesn’t Belong To Matt.

  382. I want to try and clarify the value of a site and its pages, to help avoid us going off on the wrong thing.

    It’s easy to think that a site that gets thousands of visitors a day is a much more valuable resource than a site that gets 4 or 5 visitors a week. And in one sense it is – the popular site is more valuable to the world than the less popular one.

    But search engines don’t deal with the world – they deal with individuals – single people sitting in front of their computers. They present results to individuals, and not to the masses. For an individual, a site that gets few visitors is just as valuable as a site that gets millions of visitors. As an individual, the pizza site that I mentioned is just as valuable as Amazon, for instance. In fact, the pizza site is a much more valuable resource than Amazon, because I never use Amazon.

    The value of a site and its pages is down to each individual user, and search engines cannot measure that. So can we get away from the idea of a site’s value, because it’s fair to say that all sites have value to someone. Also, Google haven’t said that they attempt to determine a site’s value, and it’s just a red herring in this thread.

  383. “This is not about SPAM nor is it about rankings. It’s about being deemed worthy enough to have your pages indexed based upon nothing more than links.

    This leaves it up to the webmaster or siteowner, to manufacture enough links deemed worthy enough simply to get their content indexed. How is this a good thing?”

    Based on nothing more than links? Wow; I am very surprised by many comments in this thread. This is not ‘based’ on anything at all but …. common sense.

    I will tell you this; … if you “manufacture” incoming links, you do run a big risk, as it should be.

    PhilC wrote:
    “But since you asked, it’s a search engine’s job to determine relevancy to a search query – it isn’t a search engine’s job to determine the value of a website.”

    Agreed. The search engine searchers determine “value” by their individual preference of leaving that site, or staying on it and maybe even buying something. You are stating the obvious.

    PhilC wrote:
    “BUT, they are not in a better position than anyone else to decide what should and should not be indexed. Everyone can have opinions about that. The only difference is that Google are able to go with their opinions, but it doesn’t necessarily mean that their decisions are the best ones, or even the right ones.”

    Who would be in a better position to determine what sites/pages should be on “your” website, Phil?

    Well sure, it certainly is Google’s opinion about which pages or sites show up in SERPs, or even in their index. After all, it is “their” index and they can do whatever they wish with it. This is not a right or wrong thing at all; it’s simply stating the obvious. I choose to ban, edit, and delete members in my own forums. So do you. Would you rather have an outside party determine who or what or when or how your own website should be run? I don’t think so.

    The Google users… real people who are trying to find info or buy a product or service… are the people who determine which search engine is most popular. If and when Google loses market share on “search”, it will only be because all the people of the internet found a better place to search.

    None of this has anything to do with all our individual sites or our clients’ websites. We all want to get indexed by Google with good positions on the phrases we are targeting. But guess what? We want this to happen in a “free” environment.

    This thread is nothing but webmasters/owners/SEOs, etc. who think that their sites deserve to be listed and ranked. It’s human nature to think and act that way, but is it “real life”? And should it be the highest of priorities that Google takes the health care directory and ranks it according to how “you” want it to?

    People in here are focusing on a few comments Matt made about a few individual websites. Do you really believe that Matt would do the required research “in detail” to determine “exactly” why that health care site is not doing well? That would be suicide for Google.

    If you ran a real large search engine that handed out free referrals to others, would you want to do things manually or automatically? If auto, why would you give every search engine spammer on the planet access to “your” exact algos and reasons for doing something at any given time?

    I keep going back to this:…. Common Sense Stuff.

    I know darn well that if my firm builds a website for its visitors, it automatically does well in all the SEs. That’s saying your people actually know how to build websites in a good way… that good way just happens to be the good way of SEs as well. The key is in knowing what that “good” way is.

    Again; … common sense stuff.

  384. “Damn! I wish there was a way to edit these posts, or even just to preview them.”

    Agreed on the preview, with a recap of my earlier suggestion of forced spell checking on that preview, and I’ll throw in that the security code should only be on the initial data entry, if you do add that. 🙂

    -Michael

  385. Doug Heil.

    I will tell you this; … if you “manufacture” incoming links, you do run a big risk, as it should be.

    Doug, I think you should read Matt’s original post again. This discussion isn’t about things like that. You keep trying to take it off on general stuff, but we’re specifically discussing the new BD crawl/index criteria.

    Agreed. The search engine searchers determine “value” by their individual preference of leaving that site, or staying on it and maybe even buying something. You are stating the obvious.

    Yes I know. It was a response to your error.

    “BUT, they are not in a better position than anyone else to decide what should and should not be indexed. Everyone can have opinions about that. The only difference is that Google are able to go with their opinions, but it doesn’t necessarily mean that their decisions are the best ones, or even the right ones.”

    Who would be in a better position to determine what sites/pages should be on “your” website, Phil?

    I would, Doug, but my sites are not search engines. The function of a general purpose search engine is to show users all the resources that it can for a given query. If a search engine intentionally doesn’t show useful resources, then it is being editorial, and is not a proper search engine. Google’s users don’t expect Google to be editorial, except when it comes to spam.

    After all, it is “their” index and they can do whatever they wish with it

    Yes they can, but not if they want to continue as a top-class general purpose search engine. Their users don’t expect to be intentionally deprived of some resources, just because Google feels like it. They expect Google to do the best they can for them, and being editorial is not what Google’s users expect.

    None of this has anything to do with all our individual sites or our client’s websites. We all want to get indexed by Google with good positions on our phrases we are targeting.

    This thread is nothing but webmasters/owners/seo’s, etc who think that their sites deserve to be listed and ranked

    Doug, leave rankings out of it. You’re the only one who keeps bringing them in, but this discussion has nothing to do with rankings, and it’s best not to get sidetracked.

    Yes, Matt used only a few examples, but what he said about those examples is extremely significant – he didn’t need to repeat it for 20 or 30 examples – just a few were sufficient. What he said is that, with the new BD crawl/index function, a perfectly good site cannot have all of its pages indexed until it has enough decent IBLs. He said other things as well, but that’s the one that caused the most outrage.

    You use the phrase “common sense” a lot in your posts, Doug, but you don’t even try to discuss the issue. You talk only in general terms, which isn’t helpful at all.

    Now if you, or anyone else, can come up with a valid reason (other than a shortage of space) why a perfectly good, clean website, should *not* have all of its pages indexed, just because it’s there, then please do. Please try to address that question. I’ve asked it several times in this thread, and nobody has yet given a valid reason, other than things like, Google can do what they want to do, which is no answer at all.

    As long as that situation exists, Google is being grossly unfair to a great many good clean sites, and also to their own users, who don’t expect them to intentionally deprive them of good clean resources in the results.

  386. PhilC wrote:
    “You use the phrase ‘common sense’ a lot in your posts, Doug, but you don’t even try to discuss the issue. You talk only in general terms, which isn’t helpful at all.”

    If you honestly think that Google also doesn’t talk in “general” terms, then I can’t help you. Do you really believe Google is going to research every problem each webmaster has with their website? And do you really believe it would be in the best interest of “any” major search engine to talk and discuss things “other than” in a general way?

    Why would Google tell you or I exactly how things work?

    Come on Phil, you are a smart man.

    I talk in general terms with most everything. Unless someone has hired me to do a specific thing with their specific website, how can one talk in any other terms, when we all know darn well that “each” website has its very own special problems that cannot be solved by general answers?

    I will guarantee you that Google’s crawl patterns, etc. have “much” more to do with many other things than simply the number of quality IBLs the site has.

    And yet again;… Common sense is what I used to make that statement.

    It’s you all that want to focus on IBLs and whatnot, not me. 🙂

  387. “If you honestly think that Google also doesn’t talk in general terms, then I can’t help you.”

    I know you can’t help, Doug, but that’s another matter.

    Doug. Nobody asked you for help, and I’m not aware of anybody wanting your help. You were, however, asked to answer a specific question several times, and each time you declined to answer it – because you are unable to answer it honestly, without appearing to disagree with Google, which is something that you can’t bring yourself to do.

    On the other hand, I’ve tried to help you because you obviously needed some help. I told you how to quote here, but you couldn’t figure it out. I’ll tell you again – use the HTML blockquote tag. It’s very easy.

    “It’s you all that want to focus on IBLs and whatnot, not me.”

    We know that. The rest of us are discussing what Matt said about their new BD crawl/index – y’know – the topic of this thread, but you don’t enter into the discussion. You keep trying to take it off into generalisations, but good discussions should remain focussed. It’s common sense.

    The discussion in this thread is about something very specific. If you don’t want to discuss it, why bother posting at all?

    “I will guarantee you that Google’s crawl patterns, etc. have much more to do with many other things than simply the number of quality IBLs the site has.”

    Nobody has suggested anything different. For instance, PageRank has always determined the frequency of crawl, and still plays a big part. The TYPES of IBLs and OBLs also play a part. You see, you don’t have to guarantee anything, Doug. It’s all right here in this thread – in Matt’s original post, and in some of his later posts – if you care to read it. Your guarantees aren’t needed.
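
    To illustrate what “plays a part” might mean, here’s a toy sketch in Python of a crawl-priority score built from several signals at once. The signals come straight from this thread; the weights, the formula, and the site data are pure guesswork on my part, not Google’s code:

        # Speculative sketch: crawl priority as a blend of signals, of which
        # inbound links are only one. None of these weights are Google's.
        def crawl_priority(pagerank, quality_ibls, spammy_obls):
            score = 10.0 * pagerank      # PageRank has always driven crawl frequency
            score += 0.5 * quality_ibls  # the types/quality of IBLs play a part
            score -= 2.0 * spammy_obls   # untrusted OBLs count against a site
            return max(score, 0.0)

        # Sites would then be crawled best-first until the cycle's budget runs out.
        sites = {"siteA": (6.2, 40, 0), "siteB": (2.1, 6, 12)}
        ordered = sorted(sites, key=lambda s: crawl_priority(*sites[s]), reverse=True)
        print(ordered)  # ['siteA', 'siteB']

    The point stands either way: IBLs are one input among several, and nobody here has claimed otherwise.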

    The Adam That Doesn’t Belong To Matt
    If you think that I’m talking down to Doug, it’s because I am. We have a small history, and this is nothing compared to what’s gone on before 😉

  388. Hi Matt,
    Was wondering if it would be possible to have a section on Google for SEOs to dig into our sites according to the Google index? I already use Sitemaps, BTW.

    Just would like some additional tools.

    Can we have a tool to find out where our sites lie in the index for a given keyword/keyphrase? And perhaps a way to plot how we are doing over time? E.g., my site ranks #110 for the term “blue widget underwear”, so I add a link or two from a respectable, relevant link partner, or write a few articles for reprint and have those links indexed, wait a while for the reindexing to occur, and see if that improves things… OR rewrite/add/shuffle around content on my home page to see if that makes a difference… wash, rinse, repeat.
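
    In the meantime, here’s roughly how I track this by hand. A minimal sketch in Python, assuming you already have the ordered result URLs for a query from a permitted source (an API, or a saved results page); the function names are my own:

        import csv
        import datetime
        from urllib.parse import urlparse

        def rank_of(domain, result_urls):
            # 1-based position of the first result hosted on the given domain.
            for pos, url in enumerate(result_urls, start=1):
                if urlparse(url).netloc.endswith(domain):
                    return pos
            return None  # not found in the results we have

        def log_rank(domain, keyword, result_urls, path="ranks.csv"):
            # Append today's position so movement can be plotted over time.
            with open(path, "a", newline="") as f:
                csv.writer(f).writerow([datetime.date.today().isoformat(),
                                        keyword, rank_of(domain, result_urls)])

    Run it after each change you make (new link, rewritten home page), wait for a reindex, and the CSV gives you the plot you’re after.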

  389. Why are you making this personal, Phil?

    I’ve been “extremely” nice to you.

    Again, you are focusing on IBLs like they are the main thing that leads to more crawling. You could not be further from the truth. I know exactly what Matt said and didn’t say. It’s what he did NOT say that I’m telling you that you had better look into.

    I’ve not gotten off the topic of this thread at all. You have, though, by sticking in snide comments that have no place. You don’t want to get into a personal debate with me, believe me. Let’s keep this NON-personal please. I’ve even agreed with you against that JW character. Don’t bite the people who actually stick up for you from time to time.

  390. In Matt’s original post, all I see mentioned is IBLs and affiliate links. So how can you say the issue isn’t about IBLs??!!

    And don’t get me started about affiliate links. A link is a link. It shouldn’t matter whether it’s an affiliate link or not. All coupon sites are nothing but affiliate links. Google would be suicidal to try and eliminate these kinds of sites.

  391. PhilC Said,
    I want to try and clarify the value of a site and its pages, to help avoid us going off on the wrong thing.

    It’s easy to think that a site that gets thousands of visitors a day is a much more valuable resource than a site that gets 4 or 5 visitors a week. And in one sense it is – the popular site is more valuable to the world than the less popular one.

    But search engines don’t deal with the world – they deal with individuals – single people sitting in front of their computers. They present results to individuals, and not to the masses. For an individual, a site that gets few visitors is just as valuable as a site that gets millions of visitors. As an individual, the pizza site that I mentioned is just as valuable as Amazon, for instance. In fact, the pizza site is a much more valuable resource than Amazon, because I never use Amazon.

    The value of a site and its pages is down to each individual user, and search engines cannot measure that. So can we get away from the idea of a site’s value, because it’s fair to say that all sites have value to someone. Also, Google haven’t said that they attempt to determine a site’s value, and it’s just a red herring in this thread.

    Completely agree with PhilC.

    Relevant, quality information is most often only found on the small sites that concentrate on writing good content as a resource for all to read, rather than simply as an attempt to generate income from someone else.

    It’s a real pity that Amazon and similar sites simply frustrate and waste the time of the average person who is searching for something and anything.

  392. The problem for a small, specific-topic site is that it’s near impossible to get relevant inbound links. The only option is to submit to directories, but we all hate them, as they will inevitably rank higher for the link to the small, specific-topic site.

  393. Doug.

    Sorry, but every time you avoid answering that very simple question, and every time you overuse the phrase “common sense”, and when you say things like “I can’t help you”, as though anyone asked you to help them, I assume that you are just trying to interfere rather than trying to debate sensibly. If I’m wrong, I apologise.

    Again, you are focusing on IBLs like they are the main thing that leads to more crawling.

    I am focussing on IBLs because they are now evaluated as part of the crawl/index function, and I’m focussing on the total unfairness of that, because (a) they can give no indication as to whether or not a site should be fully indexed, and (b) pages are being dropped from the index wholesale – partly because of them. It seems very reasonable to focus on that part of the new crawl/index function.

    When Matt says that a perfectly good site needs more IBLs so that Google will index more of its pages, then I consider the new evaluation of IBLs to be well worth focussing on, because I consider it to be very wrong in two important ways.

    I am not trying to discuss any wider than that. I haven’t even started on the “types” of OBLs, like Jack Mitchell just mentioned. The way that they are evaluated for the crawl/index function is also very bad – they have nothing to do with whether or not a site should be fully indexed (spam excepted all round).

    You sided with me against Jill??? In what way? 🙂

  394. I missed this bit:

    You don’t want to get into a personal debate with me, believe me

    Really? The only debating that I’ve ever seen you do was in your forum, where quickly resorting to flames (on your side) was the order of the day, as expected. I can’t debate against flames, because it’s just stupid, so you would win every time. But we don’t do that in my forum, so if you’d like to debate there, you are more than welcome to come along and voice your opinions on any topic 🙂

  395. Jack Mitchell Said,

    And don’t get me started about affiliate links. A link is a link. It shouldn’t matter whether it’s an affiliate link or not. All coupon sites are nothing but affiliate links. Google would be suicidal to try and eliminate these kinds of sites.

    Google, please please please wipe out the affiliate links; they are just parasitic sites that simply waste my time!

    Links should be categorized: those that are relevant to the topic subject should have priority, whilst those that are just trying to freeload should be penalized.

  396. PhilC and Doug, you both obviously know what you’re talking about, but please stop throwing stones at each other and discuss the topic in question, specifically IBLs.

  397. Another question: what is the penalty, in terms of time, for a new site?

    PS I actually agree there should be a penalty, but I think it should be proportional to the number of pages on the site.

  398. No thanks.

    Phil Wrote:
    “When Matt says that a perfectly good site needs more IBLs so that Google will index more of its pages, then I consider the new evaluation of IBLs to be well worth focussing on, because I consider it to be very wrong in two important ways.”

    You are assuming Matt did the long and hard process of reviewing and researching that site with a fine-tooth comb, right? You really can’t assume that, and you cannot assume Google is going to tell you all about a specific website. To me, that doesn’t make any sense.

    You say it’s a perfectly “good site”. That’s great, and I hope it is, but that doesn’t mean that the “only” thing keeping the site down is a lack of quality incoming links.

    Phil wrote:
    “I am focussing on IBLs because they are now evaluated as part of the crawl/index function, and I’m focussing on the total unfairness of that, because (a) they can give no indication as to whether or not a site should be fully indexed, and (b) pages are being dropped from the index wholesale – partly because of them. It seems very reasonable to focus on that part of the new crawl/index function.”

    I’ve thought for a long time that quality incoming links that were β€œnatural” in nature were evaluated as β€œone” part of the index/crawl function, so it’s no surprise to me that Google has stated as much. It’s just like the many, many other things/parts of the index/crawl function. There are boatloads of parts for this in the algo. The weights given to each part change all the time. I’m trying to get you to look at the bigger picture of things and not strictly focus on a statement Matt made. He makes lots of them all the time, but he certainly cannot make statements of fact in regards to Google, Inc. He makes β€œgeneral” type statements so as to help the most sites. But those β€œmost” sites should not believe that β€œone size fits all” in regards to their individual problems, and that includes crawling patterns.

    That’s all I am saying.

    Phil; I dislike those who portray themselves as whitehat, but are β€œnot” whitehat, more than I dislike those who are blackhat, but state as such. When I say β€œdislike”, I mean that β€œbusiness” wise. I actually personally like many blackhats; I just dislike their business ways tremendously. I have much more respect for a blackhat who knows it and states it, than for a whitehat who really is NO whitehat at all. In other words, fence-sitters get my goat in a big way. You have more respect from me if you firmly take a stand on things.

  399. Hey Matt,

    Is this an endorsement of directories? “Submit your site to relevant directories such as the Open Directory Project and Yahoo!, as well as to other industry-specific expert sites. ” as seen on http://www.google.com/support/webmasters/bin/answer.py?answer=35769

    I would imagine ‘relevant’ and ‘industry-specific’ are the key words here. Seems like I have heard a lot of discussion on getting links without paying or trading for them, so this appears to be an official stance on what types of links may help a new site get crawled/indexed.

    Thanks,

    John

  400. Since we’re not being personal, at your request ….. oh! I see that we are still being personal:

    You have more respect from me if you firmly take a stand on things.

    Doug, I’ve seen you in action more than once, and believe me, I have no desire for your respect. I don’t respect your views, and I don’t respect your actions, and I don’t respect you, so why would I want any of your respect? Don’t flatter yourself, Doug.

    I don’t know what brought that on, but I wear my views on my sleeve and in my site for all to see. I have certain views about spam. I’ve never changed them and, when it’s useful to a topic, I state them. Perhaps you think that I’m pretending to hold pure whitehat views because I say something along the lines of, it’s a search engine’s job to rid itself of spam, but I’m not. It really *is* a search engine’s job to do that.

    If you’re confused, I’ll state it clearly. Use whitehat seo as much as you can, and only turn to blackhat if and when whitehat won’t work, but if you use blackhat, you must accept the risks that are associated with it. Never ever use blackhat on behalf of a client without the client’s full knowledge of the risks involved, and his/her agreement to take those risks. Happy now?

    It’s good not being personal, innit? πŸ™‚ Now where were we?

    You are assuming Matt did the long and hard process of reviewing and researching that site with a fine tooth comb, right?

    With the tools that Matt has, it’s not a long and hard process – he does it live at conferences. I’m assuming that he took a pretty good look at the sites, and I’m assuming that he knows what he’s talking about. When he said that the health care site needs more IBLs for Google to know to index more of its pages, and when he said that he is not surprised that a site with only six IBLs wouldn’t get all of its pages indexed, I assume that he knows what he is talking about. And, if it’s all the same to you, Doug, I’d much rather assume that Matt knows what he’s talking about than that you do, and I’d rather take notice of him, and not you.

  401. Okay, now that I have a few minutes before I have to put steaks on a grill (mmmmm…steaks πŸ˜› ~~~~~~~~), I’ve decided to try and encapsulate my thoughts.

    Phil, I gave you a perfectly valid reason, and a series of possible scenarios that bear on whether or not a site should be included solely on the basis of its existence.

    There are many sites out there that do not want to be indexed for a variety of reasons:

    1) Intranets.
    2) Confidential information.
    3) Integrity of information.
    4) Under construction/incomplete/testing areas.

    It would take very little effort for a competitor to submit a site that falls under any of the four categories. Under your scenario, these sites could still be indexed with few or no IBLs, and that could prove to be very damaging to Google, to the end user who may stumble upon these sites, and especially to the site owners themselves.

    If a site is indexed, the possibility exists that it could rank for something. Even if it’s an 8-9 word phrase, it could rank for something.

    Yes, the owners of these sites could put in a robots.txt file (or use the meta tag for that matter), but let’s face it, most webmasters do not necessarily possess the knowledge of the robots protocol. A good many do, but most don’t. Even so, a lot of those that do don’t care.
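
    (For reference, the opt-out itself is only a couple of lines. A sitewide block in robots.txt is just:

        User-agent: *
        Disallow: /

    and the per-page equivalent is a meta tag in the page’s head:

        <meta name="robots" content="noindex">

    Both are standard robots-protocol directives – which only reinforces the point: the mechanism is trivial, but most site owners never add either.)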

    Before anyone goes all moral and claims it’s a webmaster’s job to care, you’re probably right. But that doesn’t help Google in this case to determine whether a site wants to be there or not.

    What would happen if all of those sites and pages in the 4 examples were indexed? Is the content β€œof value”? No, because the average end user is not going to gain any benefit from visiting those sites in their present incarnation.

    A greater number of IBLs and a greater collective quality measure of those IBLs indicate that a website is β€œlive” and ready to be seen by its targeted end users.

    The problem with the health care site is that there were a series of IBLs generated (a small one at that), a partial delisting, and then nothing after that. How would Google know whether or not this site were “live” again? It could have had those parts taken out, and then added back in with new features…the parts could have been removed for spamming reasons…there are other possibilities.

    And how does Google even know that the reinclusion request even came from people involved with that site? That’s a big assumption in and of itself. There would be nothing stopping a competitor from submitting that site, getting it indexed, having it found under some obscure search term, and thus pissing off an end user. Matt doesn’t even know that… that request could have come from a competitor just to research a potential threat to his/her business (much stupider stuff than this has happened).

    The point is that IBLs provide, at the present time, the best measure of how interested a webmaster is in promoting his/her site and the benefit to the end user. The greater the number and the collective quality of the IBLs, the more likely it is that the site provides some user benefit and that its ownership is even interested in having the site in there in the first place, as opposed to the other possible scenarios which I outlined above.

  402. If you think that I’m talking down to Doug, it’s because I am. We have a small history, and this is nothing compared to what’s gone on before.

    I’ll tell you what…don’t tell me about it, and I’ll never ask because I really don’t want to know.

    Deal? Deal.

  403. The Adam That Doesn’t Belong To Matt

    I wasn’t going to explain about Doug and me to you. I just didn’t want you to think that it’s normal for me to talk down to people.

    The example possibilities of sites that don’t want to be indexed are fine, Adam, but oddities like that are not what we are talking about.

    About the health care site…

    And how does Google even know that the reinclusion request even came from people involved with that site? That’s a big assumption in and of itself.

    There wasn’t a reinclusion request was there? What Matt said is that the exclusion period had expired some weeks earlier. It’s automatic. When a page is requested to be taken out, it is taken out for 6 months, and then it comes back again. It doesn’t need a reinclusion request. I don’t know if it’s put back in immediately, or if the URL is placed in the list to crawl in its turn, but it comes back automatically. Matt didn’t suggest that the pages would actually come back with the new crawl/index function. He suggested the opposite.

    The problem with the health care site is that there were a series of IBLs generated (a small one at that), a partial delisting, and then nothing after that. How would Google know whether or not this site were β€œlive” again? It could have had those parts taken out, and then added back in with new features… the parts could have been removed for spamming reasons… there are other possibilities.

    Yes, of course, but that doesn’t make any difference. About the site, Matt said, β€œWith that few links, I can believe that out toward the edge of the crawl, we would index fewer pages.” and β€œyour site also has very few links pointing to you. A few more relevant links would help us know to crawl more pages from your site”. It’s clear. In Matt’s view, six IBLs is not likely to be enough for all of the site’s pages to be indexed. Those other possibilities don’t come into it.

    The point is that IBLs provide, at the present time, the best measure of how interested a webmaster is in promoting his/her site and the benefit to the end user.

    Yes it could (not does) provide some sort of indication as to how much a webmaster is interested in promoting the site, but that should never even be a consideration. Surely you are not suggesting that sites that are promoted should be treated better than those that are not? I don’t believe you mean that. If you do, we might as well stop right now.

    Assuming that you don’t, I want to ask the same question again, but with modifications…

    Assuming that the site owner wants the site to be fully indexed, and assuming that nothing odd has happened, and assuming that the site is perfectly clean all round, and offers some value to some people, but has only one IBL, do you think that there is any valid reason why the site should not be fully indexed (other than Google being short of space)? If you think there is valid reason, what is it?

    The site I am trying to describe is just a normal average site that offers something to some people, and has had no odd stuff happen to it, and hasn’t engaged in any promotion, so there’s no spam around it – just yer average site.

    The answer I am looking for, Adam, is “No, I can’t think of a valid reason for such a site not to be fully indexed” or “Yes, I can think of a reason, and here it is….”.

    Reasons such as, maybe the site doesn’t want to be indexed, and maybe some spam has gone on in the past, etc. are avoiding the question. I’m sure it’s obvious that I’m only asking about yer normal, regular, unspoiled, unspammed, etc. website – just yer average website.

  404. An average website is like leprechauns, fairies, and France. It doesn’t exist.

    And I’m not avoiding the question at all. I’m pointing out that possibilities exist whereby a site isn’t meant to be indexed fully (or at all). Those possibilities, as ridiculous as some of them may be, do exist and have to be considered.

    It’s similar in a sense to the warning on a curling iron: β€œdo not insert into an orifice.” The policy behind that warning isn’t in place for the vast majority of people, who would be smart enough never to attempt such a stupid thing… it’s in place for the few stupid people who would.

    Assuming that the site owner wants the site to be fully indexed, and assuming that nothing odd has happened, and assuming that the site is perfectly clean all round, and offers some value to some people, but has only one IBL, do you think that there is any valid reason why the site should not be fully indexed (other than Google being short of space)? If you think there is valid reason, what is it?

    Assuming all of that is true, nothing. But how does one prove the validity of the assumption? Accepting an assumption is very dangerous at the best of times.

    The reason I brought up other possibilities isn’t to avoid the question…it’s to point out that the scenario you describe isn’t as black and white as you are making it out to be. There are too many other possibilities that exist, and far too much room for blackhat manipulation and/or other error, for your scenario to play itself out, and one cannot make the assumption that you’re making based on that.

    Those same examples and oddities affect the health care site as well.

    Who knows who asked Matt to review that site? Do you know it was the owner of the site? Do I? Does Matt?
    Does the site in question have those delisted areas in place any more?
    Is the site in question actively promoting itself in non-SE ways?

    And perhaps the biggest question (one that any of us, including me, should have asked a long time ago):

    If the site had made a delisting request that had lapsed, and the time period is six months, why did it take so long for this to even become an issue for the site in the first place? Seems to me that a webmaster who would have taken a real active interest in his/her site would be aware of that situation and have reacted one hell of a lot more quickly than he/she apparently did (and even here, I’m assuming it’s the webmaster that pointed it out.)

    In other words, my questions, and the possibilities that I raise from them, comprise the reason why the site shouldn’t be indexed fully…yet. There are too many outside scenarios and too many variables that need to be dealt with first.

  405. Assuming all of that is true, nothing.

    Thank you. I finally got an answer, and one with which I agree. Although there may be oddities with some sites, generally speaking, there is no valid reason, other than a shortage of space, to not index the full content of a perfectly ordinary website. Imo, most sites fall in the category of ‘perfectly ordinary’, but I accept that you don’t think so, Adam.

    If the Google programming determines that the site has spam links about it, or spam content in it, then I’m fine with the site not being indexed. What I am not fine with is the quantity and/or quality of IBLs playing any part in whether or not a site is fully indexed. And I am not fine with the types of OBLs playing any part in it. It is not a search engine’s business whether or not a site contains affiliate OBLs, paid OBLs (advertisements), link exchanges, off-topic OBLs, or anything like that, with the exception of when it is blatantly obvious that they are specifically there for ranking purposes, such as in a known linking scheme. Affiliate OBLs are never for rankings, so why Matt would criticise a link to a mortgage site is beyond my comprehension – especially since it is in a real estate site! It just isn’t a search engine’s business.

    I don’t mind search engines devaluing the links that they don’t want to count, including the mortgage one on that site, but I have very strong objections to penalising sites because of them, and limiting the number of pages a site can have in the index is nothing other than a penalty for the site, and it short-changes the engine’s users.

    About the health care site:
    There may be other things involved with why it isn’t fully indexed yet. We don’t know. I only used it because, according to Matt, it seems like β€œa fine site”, which to me means that it’s a perfectly ordinary site without anything negative about it, and yet, in Matt’s judgement, it is unlikely to be fully indexed until it acquires some more IBLs. That’s what I find wrong.

  406. Just wanted to say thanks for a good post. I should read your blog more often.

    No need to approve this – ’twas just a quick note, no meat innit πŸ™‚

  407. Sheesh Phil, you didn’t figure out what I was responding to, even though you asked me what I meant by the JW comment? My goodness; now I’m not so sure about you, as you thought my comment was for you? LOL

    phil wrote:
    “I don’t know what brought that on, but I wear my views on my sleeve and in my site for all to see. I have certain views about spam. I’ve never changed them and, when it’s useful to a topic, I state them. Perhaps you think that I’m pretending to hold pure whitehat views because I say something along the lines of, it’s a search engine’s job to rid itself of spam, but I’m not. It really *is* a search engine’s job to do that.”

    I know YOU WEAR THEM on your sleeve… my gawd, take a chill pill, Phil. I was talking about HER… sheesh.

    The rest of your comments are just plain silly stuff. I really don’t give a hoot what you think of me Phil. You have proved your worth in this thread and by what you write.

  408. Dave (Original)

    RE: “I’m guessing that your background is non-technical?”

    Wrong guess.

    RE: “It’s just that you seem to have enormous and unfounded faith in Google’s algorithms.”

    Of course I do and so do most of the other people on the Planet. This is why G has been MILES ahead of the rest for many years.

    RE: “Anyone who knows the first thing about the limitations of any algorithmic approach just wouldn’t say things like β€œGoogle know better than you do!”.”

    Of course Google know better than β€œyou” or me. You must be living in a fool’s paradise to say otherwise. Do you have a non-algo approach?

    RE: “You know you are comparing an algorithm’s fraction of a second, rule-based perception to that of a person who has spent years developing their site, right? You know that these algorithms don’t actually β€œunderstand” anything, right?”

    LOL! You think Google should perhaps employ the whole of India to do manual checks! I would think most (I did think it was all until now) here know full well it must be an algo and not manual human intervention. You do know that algos ONLY do what humans tell them, don’t you?

    RE: “The truth is, you are just assuming that all of those complaining are lying. A judgement you are clearly not in a position to make.”

    You guess a lot, don’t you? I have never said anyone was lying. However, I will say you are lying by saying β€œThe truth is…”

    Now, you typed a lot, but I cannot see a point in any of what you have written. Why not back up PhilC and others and stop focusing on the person(s) disagreeing?

  409. philc, I’m going to cast my vote: if I never read another word you write, it will be a good day. I don’t want to see your stupid comments and bickering; grow up, please, and stop wasting Matt’s blog space on your pointless bickering and childish whining.

    To be clear, I have no idea of who you are, and I don’t care; your words are all I have to see and judge you by. If I were you, I’d give some serious thought to what you type before hitting the submit button – it’s not interesting, it’s not worth reading, and it’s not worth the electrical energy it requires to transmit those bytes.

    Again, I have no idea of who you are, and I don’t care.

  410. CrankyDave – I agree πŸ™‚

    If pages aren’t indexed, how can they pass links, and how can the pages that should have IBLs find themselves in the results if they rely on those pages?

    Tomato growers in Wisconsin will have to rely on AdWords.
    Users will also go elsewhere.

    Matt – are you saying that indexing is working well – guaranteed, no disclaimer links, no effect on the IBLs? It doesn’t look that way to a lot of us out here.

  411. Hello Matt,

    You mentioned that a fix had been identified for Google for hyphenated domains. Can you confirm whether you have taken into consideration more than one hyphen being used? A legitimate example of this might be shoe-store.com as opposed to shoe--store.com (two hyphens), or even shoe---store.com (three hyphens). I believe that this is a valid and important question because in the past it appears that Google has recognised shoe-store.com in the listings but has ignored shoe--store.com. In view of the fact that there are so many retailers of shoes online, it is quite reasonable for someone to want to use double or triple hyphens simply because that text ideally suits their business.

    Has this been taken into consideration in the fix you described above?

    Thanks,

    Nick

  412. Matt, I thought that I would add this comment to the question I asked in my last posting just a few minutes ago.

    In reviewing my posting after it appeared on your site, interestingly enough I see that all the hyphens I used in between the words β€œshoe” and β€œstore” have been reduced to a single hyphen. This is very relevant to my questions, because will Google and Googlebot then make exactly the same mistake? Will they see shoe(1 hyphen)store.com, shoe(2 hyphens)store.com and shoe(3 hyphens)store.com as exactly the same site? Or will they recognise that they are 3 distinct sites and index and spider them as 3 distinct sites – assuming of course that each of the 3 sites has its own unique content?

    After seeing what happened on my last post I am now even more keen to see your response! Thanks!

  413. Dave (Original)

    RE: “BUT, they are not in a better position that anyone else to decide what should and should not be indexed”

    I think they most definitely are.

    RE: “but it doesn’t necessarily mean that their decisions are the best ones, or even the right ones.

    No, of course not. However, the chances of β€œthem” being correct are so much higher than those of a bunch of webmasters/SEOs.

    RE: “You know as well as I do that there a great many hard-line Google supporters who post in this blog, and they had not expressed support for Google about the new crawl/index criteria, up to the point when I made that assessment. Alright?”

    So silence makes you right in your mind? You denied that earlier. I also stated earlier that Matt’s blog (and most SEO forums) are ONLY full of problems, rarely praise. Why should this one be any different?

    RE: “Please stick to the topic, and forget that sideline. The assessment was correct at the time of writing it, which was way down the thread. And even now, there are only a couple of people who appear to be supporting Google’s new crawl/index criterai, and I don’t think I’ve seen any outright statements of support from them – yet”

    I thought I was; at worst I was responding to your off-topic writings. Phil, you SURELY must understand that these forums, blogs and whatever are extremely biased and negative on top of all else. I fully support a LOT of things I never comment on. Have you ever heard the terms β€œsilent majority” or β€œvocal minority”?

    RE: “I say there is no valid reason, you disgaree with me, but you haven’t stated a valid reason”

    The “reason” (or at least likely reason) has been posted by Matt. Why ignore it? More links he said and if it were MY site it would be more links I would get. That would be a better use of my time than complaining about what I cannot change. I’ll run my Website and let Google run their SE.

    RE: “They present results to individuals, and not to the masses”

    I disagree there. They present the same results to masses who search via the same term. Personalized search is different, but that’s not the issue, is it?

    Phil. Do you think BD was used to index more or less pages? I say more and that is a good thing IMO. If it’s less, then there are “reasons” as there are for everything. We just don’t know what they are. But Google do!

    RE: β€œI’ll state it clearly. Use whitehat seo as much as you can, and only turn to blackhat if and when whitehat won’t work”

    Then you ARE a blackhat.

    RE: “when he said that he is not surprised that a site with only six IBLs wouldn’t get all of its pages indexed, I assume that he knows what he is talking about”

    Then why not also assume Google know better than you on what, how, why, when etc they index? Or is Matt the only one at Google that “knows what he is talking about”?

    Phil, I might be wrong here, or confusing you with another.., but haven’t you argued in the past that Google SHOULD NOT have carte blanche to index all pages out there and make money from them?

    RE: “do you think that there is any valid reason why the site should not be fully indexed (other than Google being short of space)?

    I don’t recall Matt stating the health care directory site would NOT be fully indexed. He said it would “help”. Keep in mind that Matt also said “self-removal just lapsed a few weeks ago”. Perhaps he means it would happen sooner with more links?

    BTW. Do a quick count on the number of times you have used the word β€œassume”. You know what they say about β€œass-u-me”, don’t you πŸ™‚

  414. BTW. Do a quick count on the number of times you have used the word β€œassume”. You know what they say about β€œass-u-me”, don’t you

    That really depends, Dave… when I say that to my girlfriend, it takes on a different meaning. πŸ˜‰

    HOP HOP SMACKY SMACKY HOP HOP SMACKY SMACKY! πŸ™‚

    (Someone’s gotta put some comic relief into this before everyone wants to hang themselves.)

  415. I don’t normally pick apart one specific section of a post, since it generally removes that section from the greater context of the post, but I think this particular section has a context in and of itself…so here goes:

    And I am not fine with the types of OBLs playing any part in it. It is not a search engine’s business whether or not a site contains affiliate OBLs, paid OBLs (advertisements), link exchanges, off-topic OBLs, or anything like that, with the exception of when it is blatantly obvious that they are specifically there for ranking purposes, such as in a known linking scheme. Affiliate OBLs are never for rankings, so why Matt would criticise a link to a mortgage site is beyond my comprehension – especially since it is in a real estate site! It just isn’t a search engine’s business.

    With the possible exception of off-topic OBLs, the answer is pretty obvious and simple. I suspect you know what it is, so I’m not actually directing my answer to you as such…I’m directing it to anyone who might not have considered the other side of this.

    (For those wondering why I’m debating things like this with Phil, that’s your reason…it’s not really for Phil, who I know is a smart guy that way. But it’s for the other people who may follow a message without considering some of the other angles behind it.)

    The problem with the links mentioned above, with the possible exception of off-topic OBLs (depending on circumstance), is that the links aren’t there for purely organic reasons. Whether the interests are fiduciary (affiliate links/ads), SERP/traffic increase (link exchange) or whatever the reason is, these links are biased links and aren’t purely organic. It’s not really a “vote for” a site in its purest form.

    So it is a search engine’s business, since this does have impact.

    I’d come up with more, but it’s 3:30 in the morning and sooner or later I should go to bed.

  416. Adam said:

    The problem with the links mentioned above, with the possible exception of off-topic OBLs (depending on circumstance), is that the links aren’t there for purely organic reasons. Whether the interests are fiduciary (affiliate links/ads), SERP/traffic increase (link exchange) or whatever the reason is, these links are biased links and aren’t purely organic. It’s not really a β€œvote for” a site in its purest form.

    I wouldn’t agree with that. If I am showing affiliate links then I am endorsing that link. I won’t put up a link for, say, a merchant that runs a shady business. When I do link exchanges I arrange them in related categories. Since my main site is a mall site, everything is going to be useful to somebody, depending on the category they are interested in. That is why I set my links up by category instead of just one generic link page like many sites seem to have.

    Firstly, thanks for being there.

    Secondly:

    I have submitted the above URL to Sitemaps on Google.

    It flags two pages as inappropriately named – lloydsbank……very long name…htm
    and weight,loss.htm, a typing error originally, as I am an intermittent moron – increasing with age.

    How do I change these to meaningful names without people getting 404s off the old ones? And if I leave them extant, presumably they will still mess up Google/Google Sitemaps?

    I’ve not phrased this very well, but I’m sure you understand. As in: I have two that are wrong… I can create two that are right, but what about the original two? If I remove them using the removal tool, people will lose their weight loss link – somewhat important in this day and age. I should stress the whole site is free, and it lost a lot of impetus due to Russian porn guys putting links in my guestbook (as was) until I spotted it = 15 months in the Google doldrums so far, from first pages originally.
    So there are a lot of self-help pages, as I was an alcoholic/chain smoker/painkiller addict – now all fortunately history for me.
    I need to get the site back in favour to help people, and I think these two pages may be one of the last stumbling blocks – so I would value your help.
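
    (I’m guessing the answer is a 301 redirect from each old name to its new one, so that the old URLs send visitors – and presumably the bots – straight to the new pages instead of 404ing. If the site is on an Apache server that allows .htaccess, I believe one line per page would do it; the new file name below is just made up for illustration:

        Redirect 301 /weight,loss.htm http://www.stiffsteiffs.pwp.blueyonder.co.uk/weight-loss.htm

    But I’d value confirmation that that’s the right approach.)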

    malc pugh – rubery – england
    http://www.stiffsteiffs.pwp.blueyonder.co.uk

  418. Doug

    No I didn’t figure that out at all. I’ve just re-read what you wrote, and it does read as though it referred to me – there were no clues that I could see. You say it was about JW, and that amazes the hell out of me, as you can imagine. Oh well. I’m curious, but I won’t ask – it’s not my business – and you wouldn’t tell me, anyway πŸ˜‰

    h2

    Then I suggest that you skip my posts. It’s easy enough to do. Just look at the name and, if it’s mine, skip to the next post. Then I won’t waste any of your time. Easy huh πŸ˜‰

    Dave (original)

    Please get off the silence thing, and read what I wrote. I only referred to the opinions that were written in this thread at the time, and the assessment was correct. It’s all up there in glorious black and white, plus a smattering of colour. Keeping on about it is so unnecessary, and so inaccurate. Here’s a reminder of what I actually wrote about this thread at that time:

    I don’t think that anybody has agreed with Google about this issue. Probably most of this blog’s regular contributors back Google to the hilt, but I don’t recall anyone doing it about this issue.

    You see? It was all about what had been written in the thread at that time. Up to that point in the thread, the following comments had been made by different people:-

    Great post from PhilC
    PhilC said it perfectly.
    I think PhilC made a really good point above too
    I am with PhilC
    Like PhilC, I believe you’ve simply got it wrong
    Once again PhilC has put my concerns in a more coherent way than I could.
    PhilC has a very good point.
    I too agree with PhilC!
    OMG, did I just agree with Jill and PhilC on the same issue in the same sentence?

    There was a lot of agreement, and many more people had expressed dissatisfaction with Google’s new crawl/index function, whilst hardly a word had been written in agreement with it. It’s true that 3 people (you, Doug and Adam) have debated in favour of it since then, but that doesn’t make the assessment at that time wrong. Alright?

    As for the rest of your post, I’ll agree to disagree. I don’t have the inclination to go through it sentence by sentence, as you have done. I’ll just reply to one of your questions, though…

    Then you ARE a blackhat.

    My views and practices are what I described. I do whitehat as far as it is possible, which is almost all of the time, but if it can’t work, then I am happy to do blackhat. I never do blackhat for a client without the client’s full knowledge, understanding and agreement – never. You should read more – then you wouldn’t need to ask.

  419. Dave (Original)

    No offense, but you need to work on your basic comprehension skills. It’s just not possible to debate anything with you because you don’t seem capable of understanding the basics of what is being argued about. For that, you need to be able to read someone’s comment, comprehend what it is saying, and respond accordingly.

    The tangents you go off on defy any logic. Lord knows what you’ll make of this one…

  420. Adam

    Your points about OBLs are valid, but it’s not black and white.

    Links on the Web worked fine before Google came along. Websites linked to other websites because it was good for their visitors. People bought ad space (text and banners) on websites for the traffic, etc. etc. Links are what the Web is about – links *are* the Web.

    Then Google came along and largely based their rankings on link text (alt text for images), and as Google became more popular, people started to manipulate the links for ranking purposes. It couldn’t be any other way. The effect was that Google largely destroyed the natural linking of the Web. Because of Google, people are now wary of linking to other sites in case they are bad neighborhoods, or may become bad neighborhoods in the future. People exchange links for rankings, and not as much for their users or traffic. People don’t want to link to a site unless the site links back, AND from a page of equal value (PageRank). The natural linking of the Web has largely been destroyed by Google and the other engines that copied Google’s links-based rankings. In that respect, Google has been very bad for the Web.
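
    (To be concrete about what β€œlink text” means here: for a text link it is the anchor text, and for an image link it is the alt text. A rough illustration, with placeholder URLs:

        <a href="http://www.example.com/">blue widgets</a>
        <a href="http://www.example.com/"><img src="logo.gif" alt="blue widgets"></a>

    In both cases, β€œblue widgets” is the phrase that a links-based engine associates with the target page – which is exactly why anchor text became worth manipulating.)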

    It’s true that many, many links are there just for ranking purposes, and it is a links-based search engine’s task to identify and nullify them within its system. I have no objections to that, even though Google brought it upon themselves.

    What I won’t accept is Google telling webmasters that there is a β€œright” way and a wrong way to put paid links (ads) on their pages, and giving people the impression that doing it the wrong way (the natural way) could attract some sort of penalty, as happened last year. It is not Google’s business to tell webmasters things like that. It is sheer arrogance to assume that paid links are there solely to boost the rankings. The same applies to other types of links.

    But Google does have a problem. They caused the link manipulations, and it has affected their results, so they’d like to identify and nullify the effect of ranking-type links. I don’t object to that. What I do object to is penalising sites on the blanket assumption that certain types of links are there just for ranking purposes. I don’t mind it if Google simply discounts certain types of links for rankings and PageRank, but I do mind if a site is penalised because of natural links.

    Intentionally leaving some or all of a site’s pages out of the index because of assumed link manipulation is morally wrong, imo. Matt didn’t say that sites are penalised for it, but he did imply that sites won’t have all their pages indexed unless they score well enough in OBLs and IBLs, among other things. He also said that such links are not hindering, but they aren’t helping. Imo, intentionally leaving pages out of the index, for whatever reason, is a penalty.

    That’s why I say that it isn’t a search engine’s business what types of links a site has in its pages. I don’t mean that an engine should count all links for everything to do with rankings, but I totally disagree with actively penalising sites because of them – unless it is blatantly obvious that they are spam. An off-topic link cannot be assumed to be there for rankings. An affiliate link is never there for rankings. Link exchanges cannot be assumed to be for rankings, whether on or off topic. By all means discount them if you don’t trust them, but don’t tell people what types of links they should and should not have in their pages, and don’t actively penalise sites unless it is certain that the links are there to boost rankings.

    As far as we can tell, what the new crawl/index function does is actively penalise sites, partly on the basis of links.

  421. Hi Matt,

    Our site lost the Google Directory PR bar in July 2005. On the Google Toolbar we have PR7; the Google Directory PR seems to be PR0. The PR of all of our subpages is also PR0 – only our homepage has a toolbar PR of 7.
    Could this circumstance correlate with inbound links?

    Thanks in advance,

    Greetings from Germany,
    Markus

  422. Eternal Optimist

    Matt,

    What a great shame a few people are hijacking your blog. The content of your post has been driven into the background by personal and sometimes egoistical comments from a few self-opinionated people, who should know better than to use your extremely informative blog for their own benefit.

    Over 400 posts so far, and I am sure that many of us pass by most of them, so as to keep within the framework of the topics you chose to discuss. I think you are owed huge apologies from certain posters on this thread. πŸ™‚

  423. Over 400 posts, eh? It just shows how interesting Matt’s original post was, doesn’t it?

    I think if you read the whole 400, you’ll find that there are precious few posts that are off-topic. But if discussion and debate about the original posts aren’t allowed in these comments, some people really do owe Matt an apology. Let me see now – the post above this one is off-topic, isn’t it? Would you care to be the first? πŸ˜‰

  424. I guess I should have clarified something before, and I can understand the confusion. My bad on this one…but I’m not backtracking on my stance. Just clarifying it a bit.

    What I actually intended to convey was that the OBLs mentioned in examples above were in general terms. Jack Mitchell, you probably wouldn’t link to something shady (I don’t know you, but I’ll at least give you the benefit of the doubt). But unfortunately, whether you wouldn’t do so or whether I wouldn’t do so or whether Dave wouldn’t do so really doesn’t matter worth a damn because we’re all individuals. There are a significant number of people that would do so, and that’s part of the reason I made the statement.

    Mind you, that’s the secondary reason. The primary reason is that, in general terms, the links mentioned earlier aren’t purely organic links. The question that needs to be asked when it comes to these types of links is “would I still link to this site if there wasn’t an income/traffic opportunity associated with the link?”

    The answer is, in most cases, no. I’m sure there are some cases where someone would link to a site regardless and more power to them for getting money for it. But the much greater majority of those links are there because they’re affiliate links…they may be topically relevant, but there is a bias associated with them.

    That’s important because it affects the end user. For example, let’s take Bob. Bob wants to go buy some Lawrence Sanders books.

    Bob visits Ted’s site.
    Ted has a “bookstore” of sorts listing Lawrence Sanders books from Amazon and Barnes and Noble (all affiliate links).

    Are the links of relevance to Bob?
    On the surface, yes.

    But, when we look at it much deeper, does Bob derive the maximum benefit from Ted’s site, and is he being presented information in a fair and unbiased manner?
    No. Most people would have said “how much does Chapters offer the Lawrence Sanders books for? What about Indigo? What about XXX bookstore?” And so on and so on.

    In other words, Bob got information that he could possibly use, but it’s biased information and may not be the best source of it.
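
    (For anyone who has never looked at one: an affiliate link is an ordinary hyperlink with a tracking ID attached so that the referral can be credited. The store URL and ID below are invented for illustration:

        <a href="http://www.example-bookstore.com/sanders?affid=ted123">Lawrence Sanders books</a>

    Nothing about it looks different to Bob in his browser; the bias is invisible unless he inspects the URL.)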

    And before anyone goes off on this tangent, I can see what some of you are saying. “If the content is relevant to Bob, so what? SEs shouldn’t penalize based on this.”

    But…consider two groups of people in there before you make that statement.

    1) Bob, the average web user. He may not even know what an affiliate hyperlink is. Most people don’t. Is Bob going to be necessarily aware that the information he is provided contains a very real potential for bias? No, he probably isn’t.

    2) The small business. The one who may not be able to afford an affiliate program or have the human resources required to maintain one. Since that represents the vast majority of businesses, that’s a big problem.

    So yeah, discounting those links makes a buttload of sense. The information is potentially biased, the links in general terms aren’t purely organic, and the vast majority of users and companies get affected in a negative way.

    As far as paid hyperlinks not being there to influence rankings, I agree…there are legitimate advertisers that are after traffic from their advertisements. All they care about is that people are visiting their site from the ads they put out.

    And that in itself is an even more logical reason to use the nofollow attribute on a paid hyperlink. Knowing that certain stats programs (e.g. Live Stats) have a tendency to misreport bot traffic as user traffic, would it not make more sense to head that behaviour off at the pass and ensure that the traffic generated isn’t from bots (at least the ones that adhere to the nofollow directive) and that it is from actual people?
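
    (For reference, marking a paid link that way is a one-attribute change – the advertiser URL here is invented for illustration:

        <a href="http://www.example-advertiser.com/" rel="nofollow">Example Advertiser</a>

    Engines that honour the attribute won’t count the link for ranking purposes and, per the argument above, their crawlers won’t follow it, so the clicks the advertiser records are likelier to come from actual people.)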

  425. Matt,

    What a great shame a few people are hijacking your blog. The content of your post has been driven into the background by personal and sometimes egoistical comments from a few self-opinionated people, who should know better than to use your extremely informative blog for their own benefit.

    Over 400 posts so far, and I am sure that many of us pass by most of them, so as to keep within the framework of the topics you chose to discuss. I think you are owed huge apologies from certain posters on this thread.

    I’m just wondering if you can be a little more specific as to who you are referring to and why. I don’t really see this so far (although at this point, recordset pagination would be a REAL good thing.)

  426. What I’m getting at about paid links (ads), Adam, is that it isn’t a search engine’s place to tell webmasters that there’s a right and wrong way of putting paid links on their pages, as happened last year. The problem that engines have is internal, and should be sorted out internally.

    But that’s a bit of a digression on my part, because I have a β€œthing” about search engines trying to *change* what webmasters do with their sites, instead of working with the Web as it is – as it always was – and resolving their problems internally.

    I don’t disagree that engines can treat links in any way they want. I’ve said a number of times in this thread that I’m not against them discounting or ignoring any links that they prefer not to count, and affiliate links are certainly not within the scope of what links-based engines want to count for rankings. So I don’t think that we have any disagreement on that score.

    What I am dead against is penalising sites just because they contain links that search engines don’t want to count, and intentionally omitting some or all of their pages on account of them is well out of order, imo.

    It may be that nothing has changed on that score, and that we simply hadn’t noticed that pages were intentionally omitted. We knew about less frequent crawls for sites with less PageRank, but I don’t think anyone noticed if pages were being intentionally left out, and they may have been. If they were, then the new crawl/index function merely improves the identifying of non-countable links, and as Matt said, they just aren’t helping any more.

    If Google was doing it before, then my view is still the same – they should not penalise sites on the strength of links. Discount them if you like, treat them as nofollows if you like, but don’t intentionally omit pages unless you are short of space, or unless the links are definitely spam.

    I’m sure there are some cases where someone would link to a site regardless

    Coincidentally, I have a site that is less than a year old. I’ve no idea how many pages it has, but it’s probably in 5 figures. It’s a decent and useful resource, with no affiliate stuff, and no paid anything in it, and I intentionally built it to ignore all linking ‘strategies’, including link exchanges (it specifically says that it doesn’t do link exchanges). Linkswise, it’s just a plain ‘organic’ site. It links out to hundreds of totally relevant sites, and the only IBLs that I gave it were several from 2 of my own sites – one on-topic site, and one off-topic site. Those IBLs were just to get it noticed by the engines. It was doing fine until this fiasco. Plenty of pages were indexed, and plenty of people were using it, because it’s useful. But no more. It’s dead now. It has 13 pages indexed normally, and 407 pages in Supplemental, and all of the pages have useful content.

    Where’s the sense in that? Apart from the 2 off-topic starter IBLs, the site is so organic that you could eat it for lunch. I haven’t complained about it, although I’m obviously disappointed, but where’s the sense in it?

    Judging by Matt’s post, I would guess that the 2 IBLs from my off-topic site are no longer counting, but the few from the on-topic site should still count, although they have very low PRs. So there’s a useful site that people were using, that is now dead because of what? Presumably it doesn’t have enough trustable IBLs. Is that a good reason?

    Btw. Am I the only one who sometimes has to refresh the thread because the captcha code is unclear? I’ve had to do it several times in the last few days.

  427. So how would you suggest affiliates make money then? If Google discounts all affiliate links then people with affiliate sites will be starving. Take a look at the price comparison sites that just use an affiliate link but compare prices from the different merchants. Does this offer value? In my eyes it does. As do coupon sites. The big problem here is to try and cover all affiliate sites and say they don’t offer any value, which of course is not true. Yes, there are some that don’t, but you can’t punish all the good ones that deliver value just because of the bad apples out there. In my mind, it’s a lot like link directories. On a side note, it needs to be said that an affiliate is doing the same thing, basically, as AdSense – i.e. advertising a good or service. Somehow I cannot see Google penalizing anyone for AdSense, so how are affiliate links any different?

  428. It seems to me that they just implemented a second β€œsandbox” (I know, but you may call it any way you like :-)). Not only do they not list – or should I say β€œmention” – sites in competitive industries for up to 18 months (or even more), they now also prevent every β€œsandboxed” site owner from getting traffic by creating more pages (read: content).

    I never liked the idea of being dependent on a single search engine for revenue, or on any third party for that matter, but I am strongly considering whether webmastering makes any sense at all anymore.

    I feel an article coming… πŸ˜‰

    Star

  429. Well, I’m getting increasingly despondent with Google. I have a website that is full of useful content and not remotely spammy; I only link to relevant and useful sites, and my content is updated and added to constantly, but the more my site grows, the more Google seems to hate me πŸ™

    Now Google has dropped all but 48 pages from my site and I don’t appear anywhere in the search results for my main keywords. There are sites in the second page of results that bear no relevance to the keyword search and are spammy.

    It seems Google’s big updates are penalizing many honest sites and rewarding the spammers.

  430. I feel an article coming

    LOL!!! πŸ™‚

    Jack. Imo, it’s ok for Google to discount affiliate links, so that they don’t help anything, as long as they don’t dump a page because of them (except pages that are nothing but affiliate links, of course).

    Matt said a couple of times that certain links weren’t harming, and that they just aren’t helping any more. That’s fair enough, but what I don’t understand is why point out OBLs, because they never helped the page or site, anyway – at least we didn’t think they helped. If they helped, the dumped site I just described should be stinking rich with Google muscle πŸ™‚

    I can only think that the reason for pointing out the OBLs is because they are now scoring negatively for crawl/index purposes, and that, imo, is just plain wrong.

  431. Sorry Phil; as I just can’t resist this comment: πŸ™‚

    Phil wrote:
    β€œI never do blackhat for a client without the client’s full knowledge, understanding and agreement – never.”

    The thing is, you are assuming you are giving that client β€œfull knowledge”. How do you know that? I don’t know about you or anyone else, but the people who phone me from sites that got penalized by a firm out there β€œall” said the firm gave them full disclosure. They simply had NO IDEA of the consequences of being caught for search engine spam, and were not happy about it at all.

    Isn’t it the job of the firms in our industry to β€œeducate” that client that they β€œnever” have to do blackhat stuff? Shouldn’t we be showing them how it’s done? I guess if a firm is an SEO β€œonly” and does not or cannot redesign sites so as to do things that are within the SE guidelines, then that firm has to resort to blackhat stuff. I think that’s bogus. I also think it’s just not right to assume β€œevery” person on the internet happens to fully understand everything involved with the risks, and then understands exactly what it means to get a ban or penalty.

    This is the biggest beef I have with blackhats. They always fall back on this thing called β€œfull disclosure”, when the easy thing is to educate the client and then proceed to fix the existing site without spam.

    YES. This post might be off-topic. That’s tough beans. I could not let his comment go without a rebuttal to it.

    I still think the majority in this thread just don’t see the big picture, and are simply taking what Matt says word for word and applying it to their sites. I would never be that short-sighted.

  432. Wotchit Doug, or you’ll have Eternal Optimist on yer back πŸ˜‰

    Yes, I missed a couple of phrases out. Here is my first description from higher up the thread. You’ll notice the differences.

    Never ever use blackhat on behalf of a client without the client’s full knowledge of the risks involved, and his/her agreement to take those risks.

    If a client wants me to do something that I know is non-compliant, I tell them that it can attract a penalty, and we discuss what the penalties could be. I.e. they know that it could result in their site being dropped from the engines. It isn’t possible to disclose more fully than that.

    It will probably surprise you that, when I come across a new client who already has spam on their site, such as hidden text, I tell them to remove it, and I tell them why. It does happen.

    Isn’t it the job of the firms in our industry to educate that client that they never have to do blackhat stuff?

    No it isn’t. It is our job to tell them the truth.

    I’m not going to debate it here because, as you rightly pointed out, it is way off-topic. I said before that you are welcome to discuss/debate at my place. Flaming isn’t allowed there, so you may feel a bit restricted, but there are plenty of whitehat views, so you wouldn’t need to feel alone.

  433. Doug

    I still think the majority in this thread just don’t see the big picture, and are simply taking what Matt says word for word and applying it to their sites. I would never be that short-sighted.

    I’m sure you’ll understand when I say that I’d much rather take Matt’s word for what Google does systemwise, than yours, or any other outsider’s.

  434. Hey Matt, thanks for the response. I didn’t mean to insinuate you hadn’t read my post – sorry about that.

    In case you need a refresher, I’m the guy who was asking questions regarding the influx of recently expired non-adult domains being filled with spam doorway pages and showing up high in the SERPs across the board for the most searched β€œadult” terms.

    I have a follow-up question… and a comment.

    #1: I notice the example (offending) domains I posted vanished from the SERPs shortly after I reported them. I have reported this chain to Google before, and in Google discussions, and no action has ever taken place, so I thank you for whatever you did that seems to have improved the SERPs (somewhat). I also noticed that all the ones I didn’t specifically mention were NOT removed. So, I really hate to do it, but since I never seem to have any luck with Google support or Google Groups taking action, I’ll drop some more links below and hope whatever voodoo took place happens again and they are removed. (If not, I don’t mean to be pushy – I just hope you can help.)

    #2: OK, this may sound funny/silly/simplistic, but here goes. Why doesn’t Google have a team of people that manually reviews the previous day’s top 100 search terms for spam pages? I don’t mean every page – the top 100 results, let’s say – and I don’t mean removing any site that’s questionable, just the obvious ones. I timed myself: I searched for 50 of the most searched terms on Google, reviewed the top 100 results for each, and it took me approx 3 hours to sort them into β€œobvious spam pages” and β€œsafe”. So it doesn’t seem that hard that you couldn’t have a team do that. If you extrapolated my experiment and had a team of 50 people, you could manually review 250,000 results every 3 hours, and you would have cleaned up the most likely searched terms. (Now of course this would make the results way too relevant and nobody would click on the ads *cough* *cough*.) πŸ™‚

  435. Here are those examples:

    charterschoolleadershipcouncil.org/main.html
    gtpmanagement.com/hot/hardcore.html
    1800hithome.com/pornmovies.htm
    helsingborgsguiden.com/mpegs.htm
    sitel.org/hot/anime-sex.html
    nfgslim.com/amateurporn.htm
    bradseglarforbundet.com/indianporn.htm
    dogori.com/amateurporn.htm

  436. Hey Phil, can’t you move this babble off Matt’s blog and put it somewhere else? I don’t want to read it, I don’t want to scroll through your self-aggrandizing garbage, and I’m sure I’m not the only person who feels this way.

    Since you have so much to say, and since you are (clearly, in your own mind) right in everything you say, why don’t you blog your thoughts somewhere else? I don’t want to wade through this junk to follow this stuff; it’s a waste of my time.

    Or can’t you get any readers on your own blog, so you have to come here? That’s probably it, is my guess.

  437. Dave (Original)

    RE: “Lord knows what you’ll make of this one…”

    Same as your other one – not much. You only seem capable of focusing on the person & not the topic.

  438. h2. I’m sorry that you don’t like debates, but I can’t help that. I can only repeat what I suggested to you earlier:-

    “Then I suggest that you skip my posts. It’s easy enough to do. Just look at the name and, if it’s mine, skip to the next post. Then I won’t waste any of your time. Easy huh? ;)”

  439. Matt:

    While I agree with your comments regarding spammy links and links that are not relevant, I can only assume that your comment that a mortgage link on a real estate website was not relevant was an error.

    While the way the link was implemented – amongst other non-relevant links, and done purely for PR or rank – was poor quality, realtors work with mortgage brokers every day, and the referrals that go back and forth between them in real life confirm the relevance of the two professions, and therefore of the related websites.

    What do you say?

    John

  440. Dave (Original)

    RE: “Please get off the silence thing…”

    For someone who doesn’t want to go there, I find it odd that you mention it every single time. Perhaps what you really mean is “Let me have the last say on it” πŸ™‚

    RE: “You see? It was all about what had been written in the thread at that time”

    Then I guess we should NOW “assume” the tide has turned against you πŸ™‚

    I know what you wrote about blackhat work (I read it) and it makes you a blackhat. Period. Or is a cheat only a cheat if they cheat ALL the time?

    RE: “What I’m getting at about paid links (ads), Adam, is that it isn’t a search engine’s place to tell webmasters that there’s a right and wrong way of putting paid links on their pages, as happened last year. The problem that engines have is internal, and should be sorted out internally.”

    It is sorted “internally”. Those that do not toe the line, in return for a FREE Google placement, are dealt with “internally”. You ALWAYS have a CHOICE and nobody can force you. Why not extend your belief to hidden text, cloaking, etc.?

    Do you have rules for your forum, blog or whatever? I thought so.

    RE: “I don’t disagree that engines can treat links in any way they want”

    Yes, you do! You disagree that Google treats sites with poor/few/no links differently from those that have good/many links.

    RE: “I don’t have the inclination to go through it sentence by sentence, as you have done”

    That’s because I’m not cherry-picking.

    Anyway, as you are admittedly cherry-picking, it is pointless debating further. I will end with my thoughts on the issue.

    Google is the most popular SE in the history of SEs. They have guidelines that are written in a VERY clear manner IMO. The one statement below says it all: “Following these guidelines will help Google find, index, and rank your site.”

    In ADDITION, Matt also helps many by posting his beliefs and thoughts etc. Rather than moan, gripe and complain from an UNINFORMED position, I do as they ask and apply “common sense”. In return I have a full listing in Google.

    There have been MANY times in the past when 1000’s of my pages have ‘disappeared’ from Google. But rather than moan, gripe and complain from an UNINFORMED position, I think “What can I do to help Google find all my pages?”. Many times they simply come back without me doing anything.

    So, in summary, I look after my site and try to make it useful, popular and of quality, and let Google get on with running their site. Guess what? It has worked for nearly 10 years 🙂

  441. Matt,

    I heard a rumor that Google might penalize the sites of those who hijack your blog comments and turn them into a personal rant space, using it primarily to attack each other in public without actually adding much content to the conversation. Is this true?

    -Michael

  442. So how would you suggest affiliates make money then? If Google discounts all affiliate links then people with affiliate sites will be starving.

    First of all, when did Google become this huge monopolistic entity that so greatly controlled traffic that not being indexed by it would make or break a segment of the website “population” (for lack of a better term)? There are millions of other places to draw traffic from…and webmasters that don’t know what those places are will have much deeper issues than indexing/ranking. No site or search engine should be so prevalent in a site’s stats that even the slightest fluctuation can deeply impact the bottom line of a business.

    Second, I never said once that affiliates couldn’t make money. If someone is paying them a percentage of sales from a hyperlink, more power to them. That’s their choice to do so.

    The problem with that logic is twofold:

    1) “Affiliates” can take away business from operations that have arranged for proper supplier relationships, taken care of shipping issues, RMAs, customer problems, and have generally put in a lot more effort. How difficult is it to maintain a hyperlink on a site in comparison to someone who has to deal with all of that stuff?

    2) The question of whether the hyperlink would still be there if the affiliate arrangement isn’t there has not yet been answered in such a way as to suggest it would. (In most cases, it likely wouldn’t).

    Jack, you need to step back and look at this from another set of eyes. The beauty of the comments that I’m making is I have no interest, personally or professionally, in your site whatsoever. I don’t know anyone that would compete with it on any level, directly or indirectly, and I sure wouldn’t.

    In your case, you’re negatively affected, and you’re upset about that. I understand that, and can empathize to some extent. But you need to realize that, while you have put in more effort than a “typical” affiliate site, it’s still an affiliate marketing site and does present some bias, whether you choose to accept that logic or not (there doesn’t seem to be one free link on your site). They’re not “your” products…when you send people to the affiliated sites, they’re not your problem after that point…so you really don’t have to do much other than tell people about your site to maintain your “business model”.

    When it comes down to it, all you’ve done is used what appears to be a stock osCommerce skin, thrown some hyperlinks into it, and said “here’s my so-called storefront.”

    And from the standpoint of the end user, that’s not as valuable a resource because of the bias presented. If you’ve already got Adsense on your site, why not offer a truly useful shopping resource with free links to things? You know, compare those sites that don’t offer you any financial compensation for doing so and be as thorough as you can about it.

    More links and info = more content = more traffic = more money in the long run than the bits and pieces you’ll pick up from affiliate links in the short.

    No matter how you explain it, affiliate marketing (in its online incarnation) represents a penny-ante game…minimal effort, minimal return. Why should Google reward that?

    On a side note, it needs to be said that an affiliate is basically doing the same thing as AdSense, i.e. advertising a good or service. Somehow I cannot see Google penalizing sites for AdSense, so how are affiliate links any different?

    There are major differences between Adsense and what you’re doing in the body of your site.

    Adsense is a contextual advertising solution designed to provide complementary ads for content. The sites themselves don’t necessarily have to “sell” anything as their major premise…in fact, most of the good ones don’t. It’s advertising, and it’s clearly delineated as such. And those advertisers who choose to participate don’t gain anything from search engines by doing it…it’s a straight exchange of money for user traffic. That’s what non-SE advertising is supposed to be about…money for traffic.

    With Adsense, the webmaster cannot fully control the ads that appear…they’re served by Google in an attempt to be contextual. Webmasters can’t turn around and say “I want eBay because their ads pay more” or “I don’t want mom-and-pop because they only pay $0.10 per clickthrough.” Webmasters don’t know, and that means they have minimal influence (other than via content) over the links that show up.

    Just some stuff to ponder for you…maybe you’re looking at this from the wrong angle.

  443. Dave (Original)

    Michael, I heard Google was going to target those who try and stir the pot.

  444. Hey Adam:

    Your post was very good reading. The main thing I wanted to emphasize in my analysis was that my affiliate sites and AdSense are both advertising IMO. To me that is what I do: advertise specific products or specials. I know I could add some informational content, but if I wanted that kind of site I’d do a sales site. To me a sales site and a content site are two very different things. I do have a movie blog that links to my movie sales site, and that site just has movie reviews and movie news. Lastly, bear in mind I don’t mind pages ranking low, just the fact that pages aren’t being indexed, even in the supplemental index.

  445. Can anyone explain to me what “linking to spammy neighborhoods on the web” means?

  446. Matt,

    I’m glad that you and the guys at Google are working hard to improve things and that you’re still listening. However, having read the post by Lunov above, I have to say I’m seeing something very similar. Our site seems to be listed on the DCs (I have done site: on most DCs), yet it is only ranking for keywords on a limited number of them. I have no idea why this would be, nor does there seem to be any explanation from Google or any leading SEOs as to why this would happen.

  447. Dave (Original)

    Then I guess we should NOW assume the tide has turned against you

    Three people are a tide?

    I haven’t read the rest of your post. I got to that bit and decided that you haven’t anything to contribute except argumentativeness just for the sake of it. Sorry.

    Adam, Jack

    *If* affiliate links on a page have a negative effect for the page with Google, then it would be grossly hypocritical of Google, because AdSense is nothing more than affiliate links, whether they are on the actual page or not (AdSense isn’t on the page – it’s in an iframe, a different page). I don’t see that contextuality has anything to do with it. For instance, if you write an information page about various kinds of mortgages, and you add a few affiliate links to some mortgage companies, they would be on-topic, but for Google they would just be affiliate links.

  448. Hi Matt, I have been reading your blog for a long time. Now you have said that:

    “The sites that fit ‘no pages in Bigdaddy’ criteria were sites where our algorithms had very low trust in the inlinks or the outlinks of that site.”

    Our company has a network of more than 20 sites and we link to our sites under “Our Network”. We also advertise heavily on adwords.

    How are those links treated in this update? They are not related, but the sites belong to that company’s network.

  449. Matt,

    I’ve lost eleven pages since Friday, all of which were indexed in the past 10 days. Today only 59 of our more than 400 pages are indexed, after hitting a bottom of 35 on May 3rd and a small peak of 70 on May 9th. Furthermore, from May 3rd to yesterday, while the overall number of pages indexed remained flat, what was indexed did not. Every few days one or two “old” deindexed pages reappeared while pages that survived BD (no matter what their PR or content level in our DB) disappeared. And while our keyword ranking for pages listed on any given day has returned to pre-BD levels, the SERPs vary significantly from day to day. Pages returning a top 10 position yesterday have disappeared from the top 1000 today.

    I would hardly compare this roller coaster with the recovery you alluded to in your original post. You’ve suggested that Google is crawling sites with few quality links less frequently than before; I’m actually being crawled more than I was before. Furthermore, you’ve suggested that web pages considered spammy are being put in the box for roughly 30 days. If Google considered my pages spammy, why were they indexed or reindexed since May 3rd only to be deindexed once again? This is not a gripe. I would greatly appreciate a response so that you or I can find a fix for whatever it is that my site is suffering from.

  450. Hi Matt,

    Great post – not too much of the voodoo (as my non-technical colleagues call the more hard-core lingo).

    One point I wanted to make, though, was about the real estate site with the links to the mortgage lender. What does someone usually do when they have found a house to buy? Find someone to lend them the money to buy it with.

    In this instance I think that it was a perfectly valid link (and no, I have nothing to do with either sector). It’s a small point, but I think a valid one in relation to the relevance of links.

    I’m not sure what the answer to this is, but maybe that’s why you work for Google & I don’t!

    😉

    Cheers,

    Ciarán

  451. Your post was very good reading. The main thing I wanted to emphasize in my analysis was that both my affiliate sites and Adsense were both advertising IMO. To me that is what I do, advertise specific products or specials.

    I agree totally with the first part. They’re both forms of advertising. But the manner in which they are presented, as Phil pointed out, are very different.

    Google Adsense is usually quite distinguishable, even when blended into the rest of the content, from the actual page itself. An affiliate link can be buried in content without the average user knowing it.

    Webmasters also have greater control over affiliate links than they do over Adsense.

    And therein lies the problem.

    Jack, you may advertise different specials as your site’s theme, and that’s cool…but again, I go back to my original point about bias. If you’re truly out to assist your user base in the best manner possible, then it really shouldn’t matter whether or not you’re getting a cut of the sale/special. If you only promote affiliate links, to a certain extent you’re cheating the end user and presenting partial content.

    It’s still a fair trade…you get content for your site and the opportunity to increase your userbase (since you’re still running Adsense, you’re fine that way), and you get the Adsense income from the ads themselves.

    As far as being hypocritical on Google’s part, it would only be hypocritical if both Adsense publishers and advertisers were rewarded for any contextually provided hyperlinks, and there is nothing to suggest that. With Adsense, since it’s behind a Javascript, it can be assumed relatively safely that it’s a straight traffic-for-money exchange, with no SERP benefit. (Yes, it’s possible that Googlebot could read its own Javascript and thereby extract the links that way, but there’s nothing that would establish that behaviour and I believe it’s not the case.)
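
    To make that distinction concrete: a crawler that parses only the static HTML of a page sees an inline affiliate link, but never sees ads that a script injects client-side. A minimal sketch in Python (the sample markup and URLs are invented for illustration):

        # A naive link extractor that parses only static HTML: it finds the
        # inline affiliate link, but script-served ads never appear to it.
        # The sample markup and URLs are invented for illustration.
        from html.parser import HTMLParser

        SAMPLE_PAGE = """
        <p>Check out this <a href="http://example.com/widget?aff=123">widget</a>.</p>
        <script src="http://ads.example.net/show_ads.js"></script>
        """

        class LinkExtractor(HTMLParser):
            def __init__(self):
                super().__init__()
                self.links = []

            def handle_starttag(self, tag, attrs):
                if tag == "a":
                    self.links.extend(v for k, v in attrs if k == "href")

        parser = LinkExtractor()
        parser.feed(SAMPLE_PAGE)
        print(parser.links)  # only the affiliate link; no ad links appear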

  452. Jack, you may advertise different specials as your site’s theme, and that’s cool…but again, I go back to my original point about bias. If you’re truly out to assist your user base in the best manner possible, then it really shouldn’t matter whether or not you’re getting a cut of the sale/special. If you only promote affiliate links, to a certain extent you’re cheating the end user and presenting partial content.

    Isn’t this exactly what Google is doing? Promoting only sites deemed “worthy” by the types and number of their links, and cheating the end user by presenting partial content?

    A case of “Do as I say, not as I do.”

    Dave

  453. “Promoting” should be read as “indexing”…

    Dave

  454. I don’t know where the “types” portion of it came from or what you’re referring to. So I’ll leave that part alone.

    As far as cheating the end user goes, yes there are sites that wouldn’t show up that probably deserve to be there. But there are also sites that don’t need to be there and that wouldn’t have the backlinks that could end up indexed just because someone asked. The problem is that there is no conclusive way to tell just from looking at a site. How do you tell?

  455. As with the latest update… I’ve noticed that sites like eHow and About are now leading the pack on most search terms…

    and the little sites are MIA, and dropping out fast…

    I’m starting to wonder if Google is lining their pockets more…

    Also, last night I went to buy a hard drive and noticed that Western Digital included “The Google Pack”.

    Coincidence or not, Western Digital is ranked #1 for the term hard drive…

    And as for sites selling links, Google shouldn’t get involved if the links fit the site… Who is Google to say that sites can’t sell advertising (aka links) and collect money? We couldn’t care less if the people paying for the links are after PR, link pop… or just prime real estate on the sites…

    Google used to have good results. Now I notice Google punishes sites for using the DMOZ, yet Google is showing up more and more for its own version of the directory…

  456. Missing the point, and repeating errors – yep, that pretty much sums up my problems with site indexing. Let me start out by saying I paid Yahoo to list my site. I thought at the time it was blackmail, and I admired Google for letting me go through the process for free. But honestly, I have spent significantly more money on Google if you factor in time. I work with a guy, Isaac Bowman (http://www.isaacbowman.com), who is a Google fanatic. I mention his site because he is trying to be a blogger like you, and a lot of his posts are about Google. Every day it is Google doing this or doing that, but I think even he is baffled by our situation.

    We are a new company, and we want to be active in managing our Google listing. In fact, we spend a couple of hours a day trying to improve our rankings. We pay for AdWords, we use Google Analytics, we track goals, we use Sitemaps, and we are listed a dozen times in different Wikipedia pages on our business topic. A short point on Wikipedia: our Wikipedia pages on electronic and digital signatures bring us 50 times the traffic our PPC brings. I mention this because I was SHOCKED. I had no idea people used Wikipedia to that extent. The irony is that they drive that much traffic to us because they are listed in the top 3 search results for electronic signatures and digital signatures on Google. The content that exists on Wikipedia is a direct copy from our site. We wrote it and published it there, but they get all of the love.

    We know that active Google management can and will drive business. Unfortunately, we cannot speak with anyone at Google to get help, and it is frustrating me. We all own Google stock, and we believe in the company, but the average small business person must be lost, because we are working our butts off (I wish it had a literal effect), and we are getting nowhere. In early April of 2006 we had 130 pages indexed by Google. Then one of our developers posted a robots.txt on our site blocking everything. This was brilliant – NOT – but nonetheless we have tried over the past 50 days to fix it. We even submitted a re-inclusion request last week, because we thought maybe we got blackballed or something. I know people will make mistakes, and I know there are ways to fix mistakes, but I do not know how to expedite the process, and I need help.

    This is what brings me to you. I know you don’t answer specific site questions, and this post will not make it onto your site, but Matt, I need help. Is there someone I can call or email who could suggest tools or approaches we are not using? I just want a helping hand to come down from the Google heavens and give me the answers, and even though it goes against the open world we all hope for, I would even pay. But SEO always seems like a scam, and honestly there is nothing an SEO can do that we shouldn’t be able to do ourselves with hard work and effort.

    Matt, this is very forward and you don’t know me from Adam, but I sure hope you made it through this. I know I am not alone in wanting a help line or a support service. We respect Google; we pay homage to this blog and the work you have done. I just see this as our last shot in the dark before we curl up in a ball and forget the whole thing ever happened. If you have time, and I know you don’t, please look into https://privasign.com, and help me figure out what sacrifice I can pay the gods to get back to where we once were or, heaven forbid, even better.

    Thanks Matt. You are the Google voice of reason, and we truly do appreciate the help you try and offer to others. It is obvious you care, and I hope there is an answer out there somewhere.

    Jason McKay

  457. But there are also sites that don’t need to be there and that wouldn’t have the backlinks that could end up indexed just because someone asked. The problem is that there is no conclusive way to tell just from looking at a site. How do you tell?

    I don’t think you need to tell, Adam. Imo, a general purpose search engine should index as much as it can – just because it’s there. If they can then find, with a reasonable degree of certainty, that certain pages and links shouldn’t be indexed because they are spam, or because they are links that shouldn’t be counted, then drop them, or remove the links from the index so that they don’t count for rankings. What Google is doing is simply leaving pages out on the strength of a site not scoring enough in the trustable links department – not enough juice.

    Incidentally, the site that I mentioned near the top of this thread suddenly came back yesterday. It had got down to having only 25 fully indexed pages, and now it’s up to ~14,000 of them. I’m now not certain whether it suffered from the dropped pages syndrome (DPS), or whether it had been penalised again because of its functionality and the timing was coincidental. So that’s one bit of good news.

  458. Dave (Original)

    RE: “If you’re truly out to assist your user base in the best manner possible, then it really shouldn’t matter whether or not you’re getting a cut of the sale/special. If you only promote affiliate links, to a certain extent you’re cheating the end user and presenting partial content”

    Isn’t that true of ANY selling site? That is, they don’t promote the competition.

    RE: “a general purpose search engine should index as much as it can – just because it’s there”

    Why do you assume they don’t? It looks to me like Google is far ahead of all the other SEs in that area. Like I have said (but I guess you didn’t read it), perhaps they HAVE to make choices (no, I don’t know the reason) as to which pages they index.

    We can rest assured though that a reason DOES exist.

  459. Phew, it took me some time to get through this blog entry – great post, Matt.

    I suppose what you are saying is: forget reciprocal link exchanges and concentrate on building content that people will want to link to, as these are classed as better-quality links.

    For blogs and such sites I think this is much easier – I only have to post about the effects of the South East Asia tsunami on my blog for a large number of people to link to it.

    Getting quality links to a website which is providing a service or product, though, is going to be more difficult – if I have a page promoting blue widgets, who is going to want to link to that?

    Time to get my thinking cap on 🙂

  460. Dave (Original)

    RE: “Getting quality links to a website which is providing a service or product though is going to be more difficult – if I have a page promoting blue widgets who is going to want to link to that”

    Agreed, it is harder. I would have links on that page to “how blue widgets work”, “why are blue widgets blue”, etc. On those pages I would link back to the blue widgets page.

  461. I don’t think you need to tell, Adam. Imo, a general purpose search engine should index as much as it can – just because it’s there.

    Really?

    So something like this should be indexed:

    http://www.bme.gatech.edu/groups/fontan/welcome.htm

    Or this…

    http://216.89.218.233/

    (By the way, I already know about the broken images and CSS…but since it’s not the live site, I couldn’t care less).

    The former is actually indexed in Google (if anything, showing a weakness in the engine as far as excessive indexing goes, although I suspect that’s part of a framed site and the menu frame isn’t showing.)

    There are millions of pages under construction, just like these two…and that’s just one of the reasons why your logic is extremely flawed (and it’s not even the best one). How are these pages of any use to anyone in any capacity?

    Just because a page is there doesn’t mean it should be indexed.

  462. Dave (Original)

    Thinking on the indexing issues, wouldn’t sitemaps also help Google find all pages on a site?

    If yes, then all any site needs to get all pages indexed is sitemaps.

  463. Thanks for your time.

    I worked for Geac on big library indexing in the seventies, writing bespoke add-ons and installing core systems/troubleshooting, so I think a bit like you guys, I guess – one of the few still alive and almost sane.

    Given that background: on my teddy bears site, which helps ex-alcoholics/addicts like me for free, I got high-level listings for most pages – purely designed as fun bear adventure pages to make people laugh, plus self-help for various things like weight loss, addiction, debt etc. on original pages – until some unscrupulous Russian porn guys took over my guestbook with their links.

    That was in November 2004. My mate spotted it in October 2005 (razor sharp on the uptake). It was so obvious I’m amazed I missed it for nearly a year. Though I rectified it in October 2005, I still seem to be languishing as a site.

    I have an issue with a couple of pages highlighted before in your blog, which you so kindly published for a creaking old pro like me, and which your excellent sitemap service brought to light – however, even with that fixed, I cannot see how it would have stopped all the other 200-odd pages from gaining credence – so I suspect I still have a ball and chain attached whilst I am trying to swim horizontally, man.

    Is it likely that, as things stand, my site will gradually resurface properly, or am I still, to quote Doug and Dinsdale, transgressing an unwritten law? Is this also why my head is nailed to the floor, or is that another Google algorithm at work?

    My friend and I have other sites for business, which have taken a hammering lately, but as they are genuine attempts and not spam I reckon they will resurface as you tune things up to re-include those good guys you screwed up in the crossfire; in any event, that’s business. This teddy bear site is purely there to help people help themselves and to make people laugh at the bear adventures and forget their worries for a while, i.e. to spread joy, man.

    It’s not spreading too much joy at present, as it only gets 400-odd visitors a day, whereas before its declassification/emasculation it had 1000 a day and rising. I make no money off it except for the odd donation, which everyone on the blog will know doesn’t pay the piper on a website; I used to get 200 quid an hour – I reckon I lose about 4 an hour now on a good day!

    Strangely, the bears link to some pretty odd places (a frogs site, a ferrets site), and undoubtedly by their nature some VERY odd people (like me) link to them; nevertheless all their pages are there. (Bound to disappear now.)

    Problem is, few get above position 141 in the SERPs – not prime territory for normal mortals.

    My point in all this rambling is this:

    As with a lot of other people on this blog, I have slogged my way through submitting emails to Google – very erudite emails – and I always received an answer, unlike some, but it was pretty much machine-generated, mass-produced help waffle totally disregarding any salient points at all, usually with a four-day lapse time. This is not fun: to spend hours writing up your problem, which you only got around to doing because all else had failed, only to be regaled with “MR HELP from Zog says read this help text…”

    I think it would help more, even if it took a week or so, to know you were going to get some kind of answer, within the bounds of reasonable disclosure, which pertained directly to the questions specifically asked.

    Presumably, Mr Cutts, you will now email me with “I note your blog entry and refer you to paragraph 23, sub-paragraph 67, of the Google help manual”. I would see the funny side.

    I doubt this will be published, man, but it is a genuine sadness that I cannot reach others to help them and put back a little of the expertise that I once took out.

    Sometimes these sweeping changes, somewhat cavalier in their instigation, take a toll on completely innocent sites merely trying to forge ahead quite legally and properly, whose owners often go bankrupt or mad before a solution filters through, if at all. A lot of people identified with Google as the boy made good, the underdog that bit back, and they put their faith into gaining credence and exposure on Google, often abandoning others and putting all their eggs in one basket BECAUSE THEY LIKED GOOGLE AND WHAT IT STOOD FOR. So it’s not so good if they suffer total annihilation because of that, if their sites are indeed pukka and not spam. I think Google has a slight moral obligation to try to weed out these good sites and help reassert them to proper status, and having a proper email/helpline/support line would surely allow them at least a fair crack of the whip. There is a danger that Google is forgetting its origins and primary ethics in abandoning its little people’s sites to ruin through draconian algorithm changes – is this what the original altruism envisaged?

    From my own viewpoint, it would be nice to see the teddy bears unshackled – at the very least the addiction and weight loss pages.

    Thanks for listening, malcolm pugh
    mr-bong@blueyonder.co.uk
    http://www.stiffsteiffs.pwp.blueyonder.co.uk – teddy bears site for all, man.

  464. Dave (Original)

    malcolm, are you saying your sites pages aren’t ranking as well as you would like and/or that not all pages are indexed?

  465. PhilC said:

    Intentionally leaving some or all of a site’s pages out of the index because of assumed link manipulation is morally wrong, imo.

    I wouldn’t say it’s morality is something you can argue about – but I would suggest it’s dangerously close to anti-competitive behaviour, *if* Google were to be saying that certain non-Google advertising models could result in penalties for the advertisers and publishers.

  466. Should clarify – first sentence should read:

    “I wouldn’t say morality is something you can argue about in business.”

  467. to dave original

    Nice of you to read the ramblings of an insane English systems programmer with a blissfully stressful thirty-five years at the grindstone – it hasn’t affected me………….

    Astonishingly, in this paranoid era, ALL my pages index, even discussion ones I couldn’t give a damn whether they index or not – I’m sure there are pages in there from people that are just emails, or discussion pages from Aunt Dolly to her errant little Jimmy on Mull, faring organically………

    However, I digress. Yes – these pages used to figure on page one of Google before the Russian porn guys – period. ALL of them. I guess I should count my blessings that they are all there at all, I hear you wail, but notwithstanding that, apart from two which are mentioned in DMOZ (which I think lifts them to divine absolution by the god Brian), all of them languish between about 89 and 199 in position. These are pristine pages. They validate on every validator known to man, and also on one I wrote in C specifically for Google. They pass four other programs I wrote in C for Google affinity tests I couldn’t face doing over and over by eye; they also Xenu fine; they are keyword-density-we-love-you status – they are even allowed to sing in the choir in the local church, so pure is their sound. However, they potter along in no man’s land waiting for a Google shell to put them out of their misery. They of course run riot over MSN/Yahoo/Jeeves/AltaVista et al quite happily, being the teddy bears with attitude at the top of the table and well smug… but Google still denigrates them as has-beens and pariahs – which kind of grates, as all they are there for is helping others and having a laugh in a grim world. Though my English and warped sense of humour – allowing for nearly forty years of systems programming on top of an originally probably deranged mind – may figure highly in Google’s reluctance to grant exposure. They may well be right – perhaps the world is not yet ready for teddy bears with attitude, one of whom is a cannibal. Also, they may be suppressing the fact that there is incontrovertible evidence that the original rock and roll heroes were in fact teddy bears – see teddyboy.htm on the stiffsteiffs website.

    You must forgive me – I don’t do blogs or logs usually – in fact my computing partner keeps me in the basement in a rocking chair most of the time – but it’s raining so heavily here I got let out for the day.

    So, yes – all pages feature, and no, they aren’t exactly pre-eminent – if they were gladiators they would be dead.

    A last line for all Birmingham City supporters, about Emile Heskey, from Steve Bruce during a recent match:

    “Get warmed up, son, you’re coming off.”

    This is fun, but I’d better go back to the basement. Cheers, Malc Pugh, Rubery, England.

    And yes, I know there used to be an asylum in Rubery……………

  468. Thinking on the indexing issues, wouldn’t sitemaps also help Google find all pages on a site?

    If yes, then all any site needs to get all pages indexed is sitemaps.

    According to the Sitemaps page, that is what is “supposed” to happen.

    Just like, according to Matt, when there aren’t enough results in the main index, the SI is “supposed” to be used to fill the query.

    What is “supposed” to be happening and what is actually happening are 2 different things.

    Dave

  469. Thinking on the indexing issues, wouldn’t sitemaps also help Google find all pages on a site?

    If yes, then all any site needs to get all pages indexed is sitemaps.

    No. Submitting a Sitemap only tells Google that the URLs (files) exist. According to Google, it doesn’t mean that they will be crawled and indexed.
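
    For anyone who hasn’t seen the format, here is a minimal sketch of generating a bare-bones Sitemap file. The URLs are invented, and per the Sitemaps protocol, listing a URL only announces that it exists; it is not a guarantee of crawling or indexing:

        # Minimal sketch: write a bare-bones sitemap.xml (URLs invented).
        # Listing a URL here only tells a crawler the URL exists; it does
        # not guarantee the URL will be crawled or indexed.
        urls = [
            "http://www.example.com/",
            "http://www.example.com/products.html",
        ]

        entries = "\n".join(f"  <url><loc>{u}</loc></url>" for u in urls)
        sitemap = (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            f"{entries}\n"
            "</urlset>\n"
        )
        with open("sitemap.xml", "w") as f:
            f.write(sitemap)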

  470. We’re getting more crawls now with BD than we did last year, although what is indexed is still fluctuating wildly for us — and is actually less than before.

    I’m especially puzzled about 1 thing though: how can my 7-year-old site with a PR7 homepage not even be ranked in the first 4 pages for our own business name? We get daily crawls and used to be ranked #1 for our name (with or without the hyphen, as a single word or 2 words, it didn’t matter).

  471. Mike.

    Matt said that Google is intentionally crawling more pages from a site than they will index. That could account for the lower number of indexed pages, and the higher amount of crawling.

    He also indicated that affiliate OBLs aren’t helping, but he was talking about crawling and indexing, and not about rankings. Perhaps an abundance of affiliate OBLs is now having a negative effect on rankings.

  472. Hey everyone,

    PLEASE PLEASE PLEASE (re)read Matt’s comment guidelines before posting here!

    Matt’s on a (much deserved!) vacation right now and I’m assisting with his blog. In particular, I’m unabashedly doing the best I can to uphold his comment guidelines by moderating the comment queue and sending inappropriate comments to that great big bit bucket in the sky.

    Thanks for your understanding.

    – Adam, on behalf of the Search Quality Team

  473. Dave (Original)

    RE: “According to Google, it doesn’t mean that they will be crawled and indexed.”

    I didn’t mean to insinuate that Sitemaps would *guarantee* a site being fully indexed, only that it would *help*. If a site’s pages are not being fully indexed then, for me at least, submitting a Sitemap would be common sense and my first port of call.

    At the end of the day, Google is doing a far better job than the other big 2 IMO. I guess, as Matt has stated before, webmasters tend to compare Google with perfection and not with their competition. With this in mind, some will NEVER be happy.

  474. Dave (Original)

    Mike, your business name is VERY generic and is probably a common SEO phrase. Also, I don’t think we should confuse real PR with toolbar PR. Keep in mind also that Matt has stated before that PR is only one of over 100 factors used in ranking.

    If a site’s pages are not being fully indexed then, for me at least, submitting a Sitemap would be common sense and my first port of call.

    Absolutely – even though it’s unlikely to make a difference when a site is being restricted.

  476. Matt’s on a (much deserved!) vacation right now and I’m assisting with his blog

    Hi Adam. You arrived just in time to jump right in at the deep end, huh? 😉

  477. PhilC, I don’t think there’s *ever* been a dull moment at Google 🙂

  478. Dave (Original)

    RE: “Absolutely – even though it’s unlikely to make a difference when a site is being restricted.”

    I wouldn’t say that for sure. If, however, it didn’t help, I would then seek out links from relevant sites.

  479. hello Adam – good luck (and thick armour).

    You will be pleased to know this is my very last post; I hope I haven’t broken too many laws.

    Will you be addressing my queries, or will Mr Cutts be doing it on his return?

    I am a bit with blogs like Groucho Marx was with being a member of a club.

    Thanks in anticipation; I appreciate having somewhere, at long last, to ask something and get a meaningful reply.

    yours sincerely

    malcolm pugh – rubery – england

  480. PhilC, I don’t think there’s *ever* been a dull moment at Google

    Probably not, but Matt dropped a huge bombshell with his initial post in this thread. Anyway – all the best with it 🙂

  481. Like lots of others, my site has suffered badly in the recent Google shake-up.

    I’m trying to understand why my pages should suddenly disappear overnight from the first 10 – 20 results.

    I don’t understand why over 75% of my site is now “supplemental”.
    Why is Google regularly crawling my site, but not displaying up-to-date pages?
    You are in some instances, showing way out-of-date pages, which causes confusion with my customers.
    Surely, it would be just as easy to display the current cache?
    I don’t understand it.

    I submitted a sitemap, thinking that it would give Googlebot links to all my pages.
    In fact, although Google tells me that the sitemap is being regularly downloaded, the results get worse by the day.

    So, please can someone at Google be more precise about things?

    I am told that in-bound links would help.

    How on Earth am I supposed to solicit quality in-bound links from competitors’ sites?
    I certainly will not link to theirs.
    And if I don’t show in the rankings, how will other types of site find me?
    I could arrange reciprocal links with hundreds of websites, but in the main it would not improve the quality of my website to customers.
    Surely this policy will only serve to boost the rankings of bigger, wealthier businesses, which can interlink several controlled sites and run lengthy weblogs. I see lots of evidence of this in my trade: the same high-ranking companies using lots of different websites, all inter-linked.

    I am also told that Google is comparing the content of different webpages, and dropping those that appear to be similar or “cloned”.

    I have lots of SIMILAR products, each showing on a separate webpage.
    There is a wealth of data showing in detailed drawings and photos. Each of the pages is, in fact, entirely different, but they could, I suppose, be considered similar by a robot.
    Must I now add superfluous, cluttered text to these pages just to make them “different” to Google?

  482. Hi, Matt. I’m not an SEO pro, just the editor of an architectural site (Russian). I had some supplemental results but used to think that was OK. Unfortunately, today I found that I have only 4 pages in the main index – 2 from my site and 2 from another site (I have no idea about that site). The question is: is this a temporary situation, or do I really have some trouble? I spent a lot of time writing articles, collecting original photos, etc., so I beg you to explain the situation.

  483. Hi Adam,
    I have some questions; hopefully they have not been asked before. After the majority of my pages were deindexed (due to the recent Google/Bigdaddy/Google-dance stuff), several of the pages still kept their PageRank (Google toolbar ranking) but are not currently indexed. My question to you is: what exactly does this mean? If pages still carry a PageRank but are not indexed, what part does that PageRank play when a user does a search on Google? Are my pages still going to show in web results? After all, they do have a rank, right?

    Is google going to return and index these pages that still kept their page rank?

    Any help/clarification you can provide would be a great help.

    Jamie

  484. I can see that since Bigdaddy the indexing of osCommerce sites has taken an almighty twist!

    The pages are still indexed but have not been refreshed for months; the Google cache still shows old meta data and content. Even those sites that have had their URLs rewritten to .html format and employed a 301 redirect from the previous dynamic URLs have yet to come back onto the Google radar!

    Not even a Google sitemap helps!
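
    One thing worth ruling out in a case like this is a misconfigured redirect – a 302 or a 200 where a 301 was intended. A minimal check along these lines (hostname and paths are invented) shows what the old dynamic URL actually returns; http.client does not follow redirects, so the raw status code is visible:

        # Minimal sketch: confirm the old dynamic URL answers 301 and points
        # at the new .html URL. Hostname and paths are invented.
        import http.client

        conn = http.client.HTTPConnection("www.example.com")
        conn.request("GET", "/product.php?id=42")
        resp = conn.getresponse()
        print(resp.status, resp.getheader("Location"))
        # Hoped-for output: 301 http://www.example.com/product-42.html
        conn.close()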

  485. I agree with PhilC. The expectation of the user of Google is that all relevant sites will be listed, large or small, new or old.

    Here’s the thing about refusing to index sites without “enough” trusted IBLs. If you’re a small niche site webmaster, as I am, you don’t have a lot of natural IBLs. But most of us are honest and do our best to create good content. After all, the average webmaster thinks, that should be enough. I’ll make good, unique content and put it on the web and eventually, people will find it.

    But how can people find it if it’s never fully indexed because it’s too small or new to have a lot of IBLs? How are users benefitting when they search for unique content that is on the site, but Google shows them NO RESULTS because the site is largely supplemental? It’s not as if we’ve done something WRONG or DISHONEST just because we don’t have more than a handful of IBLs.

    I built my site organically without “tricks” but I currently have three pages indexed out of 600. The message I’m getting: “Google won’t index anything more than a couple pages unless you play their game. Aggressively try to get IBLs and hope to death the people you approach don’t expect you to link back to them. Good content isn’t enough anymore.” I thought we were supposed to design sites as if search engines didn’t exist. I did so, and as a result got largely erased from Google. While I’m not going to go blackhat, I’m not surprised others are. What have they got to lose? I won’t be contriving links – therefore I probably won’t get more pages indexed unless something changes. I may even lose all but the homepage at this rate. That doesn’t seem right, does it?

    Google built its rep on being the most comprehensive. I don’t understand why suddenly that’s not important anymore. They can do what they want, of course, fair or unfair. But I’m not coming here to whine or rant. I’m coming here to express my concerns with the hope (perhaps naive) that someone will read these and realize yes, it is unfair and yes, we need to make some changes. I’m coming here because I don’t yet believe Google is a lost cause, even though currently it treats my site as practically worthless.

    If worse comes to worst and Google continues to de-index, it is only hurting itself. No company is so big that it can afford to disregard user needs.

  486. Great post! I know it took a long time to do.

    I am seeing some of the same issues with my customers.

    I wish Google would stick with a set of rules and make no other changes.
    I will not hold my breath on that one.

    Thanks

  487. Way to go, Dan, for dragging Matt into reality for a single post! 😉

  488. Adam,

    There is a thread going on at WebmasterWorld about pages still dropping out of the index. A lot of webmasters noticed that on the 17th and 18th Google began to de-index pages again.

    http://www.webmasterworld.com/forum30/34398.htm

  489. “It’s more comprehensive,” has been said regarding Bigdaddy.

    I then read that sites will no longer be indexed fully if they are not linked to according to Google’s liking. Full indexing is in the past; now we will only list 5% of your pages, and the rest will be 6-month-old supplemental pages.

    I then went on over to Google and used the operator define:comprehensive.

    All of those definitions seem to say that comprehensive means covering everything, or most things – stuff in its entirety.

    So now we have a conflict of statements. How can the index be more comprehensive while at the same time not including as many pages? I travel the same blogs and forums as most here. They are not full of people saying that their page count went up like crazy since Bigdaddy rolled out.

    Does that just mean that for every page deleted from a small site operator who doesn’t have 100 friends with websites linking to him, another eBay auction page will make the index? Because I’m not so sure that having pages for light blue widgets for sale on eBay that expired two weeks ago constantly returned as search results is better than having a page about the use of light blue widgets written by a retired gentleman in Fargo. Unfortunately for the world, he cannot get a link to his work on the Washington Post website, so Google just won’t show it, even on page 999 of the SERPs.

    What does more comprehensive mean? More pages, but from fewer sites? More of the same sites shown for the same search terms? I don’t understand the double talk.

    Thanks,

    John

  490. Jeff said, “There is a thread going on at webmaster world about pages still dropping out of the index. A lot of webmasters noticed that on the 17th and 18th that google began to de index pages again.”

    There’s also discussion going on about someone who made their home page have 10,000 links on it to product pages, and they were reindexed. The flattest of flat directory structures. There are going to be some interesting sites out there due to this.

  491. John,

    Can you post the link to that thread?

    Jeff

  492. Dave (Original)

    What might help those with not all pages indexed is seeking out directory listings – not those that *require* a link back, as those are link farms.

  493. Jeff and all,

    The WMW post discussed above is http://www.webmasterworld.com/forum30/34442.htm – it starts with someone noticing a pattern and trying a crazy fix, the fix working, and then it devolves from there.

    John

  494. Google’s original reason for being was to index the whole of the net, comprehensively, without fear or favour, allowing the small guy in the street to achieve his dream of having his website visible to all, on a level playing field with any other user, big or small, rich or poor, without money having any influence or back-door deals being possible.

    I think this was also reflected in the usage of Google by ordinary Joe Public: they wanted to champion a little guy made good, who took on the big guys but retained the integrity to look after them and nurture their websites.

    All that is deemed critical in the “Guidelines” is to create relevant content and avoid spurious actions. Nothing in there says “yea verily, if you are small, go out and multiply links like the sands”.

    In view of the driving statement of this thread, that is in fact what has been deemed to happen. Inbound links, preferably themed ones that come one-way inwards, are purported to be necessary to have more than your index page visible, though you may have forty pristine pages.

    There are two ways of looking at that statement: one, that it is actual fact, and two, that it is a convenient cloak to disguise a problem. Google is notorious for not giving anything away, to the extent that you have no idea what they are doing; then all of a sudden a missive on this thread sets out in meticulous detail how Bigdaddy works, and that small users must basically cultivate inbound links from total strangers without being able to offer even a link in exchange.

    This, to me, smells a bit off.

    However, even if it were true, is it not totally contrary to the whole ethos of Google as originally set up and marked out in clear guidelines, like tablets in stone?

    I have spent four years sweating over my own private help website, and another sweating over twelve other commercial websites, rigidly adhering to these self-same guidelines.

    Is it really good enough to say “sorry guys, Bigdaddy went and shifted the goalposts – go ye forth and seek inbound links”?

    How on earth are new websites, just setting up, ever going to get any links in that case? Who will ever see them in the first place? Is it going to be viable to say to another webmaster, “I’ve got a cool website of forty pages, man – you just have to see it and link to it.” “OK dude, let me see it. Which page should I link to?” “Well, you can only index my main index page…”? Good street cred for that site.

    This being contrary to the original guidelines seems to be a huge red herring covering over a disaster.

    The CEO said there was a huge space and machine problem.

    Bigdaddy may have been rolled out not “pretty good all in all” but pretty flawed and unretractable.

    It just seems odd that a company so intractable, and so reticent about putting data out in the public domain, suddenly becomes effusive about why things are happening.

    Even if all this is true, what this thread is saying is: “hey, all you little-people sites that supported us from the off – you don’t count any more. We have abandoned you because you are not big enough, and we have rigged it so that you will never be able to be. Sorry guys, our guidelines got trashed; here’s a new set.”

    Is this acceptable for the millions upon millions of people who have slaved blood, sweat and tears to get their websites up on Google? Is it fair to trash their dreams in March, then wait till halfway through May to say the goalposts have moved and, hey, you are all now history?

    Who searches? Who types into Google to ask for answers? Does big business sit there typing away at mundane searches, or is it Joe Public who uses Google and made it pre-eminent in search mythology?
    The same Joe Public whose websites are getting trashed by the day.

    Perhaps something should be done to put the guidelines back where they were, and to put these genuine and heartfelt personal websites – into which people have put real thought, effort, time, patience and swear words aplenty – back into full view. Full pages, full stop.

    This current situation is unfair, intolerable and shabby, and a sad way to treat the loyal people who raised Google to where it now sits, treating them like plebeians.

    We want a level playing field again, with the old guidelines, and we want it now.

    Malcolm Pugh – webmaster – http://www.stiffsteiffs.pwp.blueyonder.co.uk
    mr-bong@blueyonder.co.uk

  495. Googles original reason for being was …

    I’d guess that Google’s original reason for being was simply to launch the new engine and try to make a success of it. I don’t know of any Robin Hood attitude.

    The only thing that makes any sense to me is space. Eric Schmidt (the CEO) said that the machines are full and that they have a crisis, and yet Matt said they have all the space they need to run everything, including the index. In the first few months of this year, Google spent several hundred thousand dollars on servers, so they shouldn’t be short of space; but Google needs servers for more things than the index, so all the servers probably weren’t for the index.

    With the new servers, I can imagine that Google really does have lots of space for the index, and room to expand the index a lot. But I can also imagine that a decision was made to be more selective about what is included in the index, so that the current capacity isn’t filled too quickly, and to avoid having to keep adding more and more capacity all the time. That would make sense of this fiasco to me.

    I can’t see this fiasco being wholly about spam and/or certain types of links, because of the health care site example. It’s “a fine site”; there’s nothing wrong with its OBLs or IBLs. It’s just that it doesn’t have enough good IBLs, so spam and/or link types aren’t the problem there, and it looks like a simple limitation based on good votes for the site.

    When you think about it, is it possible for a search engine to continually add everything it finds to the index, bearing in mind the rate of expansion of pages on the Web? It probably isn’t possible without continually adding more and more capacity, and I’m not sure that that’s possible either. Both MSN and Google are in the process of building huge facilities close to large hydro-electric plants (for the electrical power), so large expansion continues; but is it really possible to index everything, or is it actually better to be a bit selective? If I were an engine, I’d certainly give selectivity a very close look.

    If that’s what’s happening now, then it’s working in the wrong way, imo. There are billions of pages in the index that nobody wants there except their owners – scrapers, etc. It may be very difficult to programmatically identify them with a reasonable degree of certainty, but that’s where the focus should be, and not on limiting the presence of perfectly clean sites in the index.

  496. Oops! That should have said that Google spent several hundred million dollars on servers this year.

  497. China, Google Video, Google Earth – these may have used up a bit of the server space; Lord knows how they expect to store every video known to man.

    What remains is that Google’s core users are being stitched up like kippers, whilst whatever sold on eBay three weeks back is readily available.

    People want quality in searches, so the bottom line is that as the standards drop, and all you get is eBay, other search results, Ufindus directory pages and out-of-date data, people will switch to other searches – this is why Google thrived originally: a natural progression to better results. Likewise, if Google abandons its little user websites, another “new” Google will evolve to take up those sites, just as Google did originally.

    It would seem to me that Google was originally altruistic. I worked for a few firms like that which went big and forgot their dreams; they are nowhere now, as their customer base went out of the window in direct proportion to their losing their grip on their original purpose, ethics and focus.

    I am surprised that so many on this blog just accept this huge swing in policy, and being sidelined big time, and merely fawn to the Google gods and ask “wherefore shall we seek these new links then, great ones?”

    I would suppose there is a core of Google employees who are secretly fuming at this dereliction of all the standards they held dear and worked for and towards – it is not a lot of fun to find that your employers, who sold you on a dream, are now watching and promoting a different movie to the one you are trying to uphold.

    I worked with great devotion for some firms in my youth, in real belief in what I was doing, only to find that the bigger they got, the more the original emphasis, camaraderie, visions and hopes for the future got blanked out by the grey reality of corporate success, to the detriment of the firm, the employees and those it originally strove to help.

    IBM was invincible in my day, until it believed it was – and thus goes Google as we speak. The same could prove true if this imbalance, and this failure to support and sustain the little man who in the first place MADE Google the institution it now is, are allowed to continue.

    Natural selection will take out what was until now the only hope for small, real, human single users and their own cherished websites, which bring them the fun of seeing their name up there on Google.

    I think it is sad that such a light in a grey universe is slowly dimming as it sinks under the waves of its own success, neglecting to remember how it came to be successful in the first place, as the only outlet of the ordinary man, whom it has now patently seen fit to betray and banish from its pages under the guise of “what a super new algorithm we have – better get those link-finding skates on, normal people”.

    And everyone seems to be scuttling to do their bidding instead of stopping to question why the hell they should be having to do so.

    Content was purported to be king, and good practices his courtiers – it would appear the queen has taken over and we must all bow to her wishes.

    It’s not good enough for me, it should not be good enough for you, and I’d be surprised if it is good enough for Google’s original employees.

    I am on MSN/Yahoo/Jeeves/AltaVista. I have articles all over the shop – real ones, written by me. I have directory entries for real things. My teddy bear site, though no longer highly listed on Google, still pulls in 600 visitors a day sans Google – and that’s in an era of Google supremacy. So if Google slips a little, should we all cry – after all, we have been left out to dry.

    What they mention in their guidelines (of old, as in the original guidelines) is still salient – good content, proper links. Couple this with good articles and directory entries, and if the current Google doesn’t get its act back together, someone else will evolve that will; or other engines will realise that it really does come down to those criteria anyway. The only thing you can’t spam or tweak is real content; the only other thing that shows you are the good guys is real visitors. These will surely surface in the end anyway, with or without Google, so I for one will not be rushing off to fabricate artificial links to satisfy Google’s whims. I will be hoping they reassert some sanity, but regardless, I will be going about my business in the right manner. I thought it salient to try to talk to them via here, to give them a chance to see it as I and many others perceive it, in case they had lost the plot and needed a reminder. This is that reminder.
    Good luck to you all – I think the original guidelines are not a bad idea to follow in essence, whether Google sinks or swims.

    malc pugh

  498. I am surprised that so many on this blog just accept this huge swing in policy and being sidelined big time, and merely fawn before the Google gods, asking “wherefore shall we seek these new links then, great ones?”

    And everyone seems to be scuttling to do their bidding instead of stopping to question why the hell they should have to do so.

    I think you’ll find the very few people in this thread have agreed with what Google is doing. In fact, I don’t remember any posts of support, except from Doug Heil, and I’m not even sure about that.

    Acceptance is different though. The reality is that, unless Google changes it, it’s here to stay, and we have no choice but to live with it (accept it). Like you, I have no intention of doing any unnatural link-building just to satisfy Google. Why the hell should I? If the site is good, it should be indexed. If Google doesn’t want to index my site properly, and do as good a job as they can for their users, then “stuff Google!” is my attitude – I said that before.

    But not everyone is in a position to adopt that attitude, and, as long as Google is the biggest provider of search engine traffic, nobody can be criticised for taking steps to fit in with the new way. It doesn’t mean that people approve of, or agree with, what Google is now doing. Judging by the posts in this thread, people strongly disagree in general, but people can’t be criticised for recognising that things have changed, and seeking to survive in the new reality.

  499. Damn! It amazes me how often a typo causes the meaning to be reversed. The first line of the last post should have been:-

    I think you’ll find that very few people in this thread have agreed with what Google is doing.

  500. There is a choice anyone can make – choosing another search engine for day-to-day searches. Google does not have to be the default search per se forever – maybe it has become too blasé on that score.

    People tend to vote with their feet eventually.

    Not a lot of traffic would flow if no one was there.

    malc pugh.

  501. Traffic is customer-driven, not Google-driven – you can have the best shop in the world, but it will shut without customers coming in the door.

    Assume the computer mags keep running their “six best searches in the world” features for the public, week in, week out. As soon as Google dips down to fourth – which is on the cards, given the amount of salient data it is dumping – what price traffic then?

    You are only as good as your last game in sport, or your last delivery in business – as soon as you take your customers for granted, as read, as safely in your pocket, you are on a slippery slope.

    If everyone bought any petrol but Brand X for two months, how would Brand X fare?

    Likewise, if Google suddenly has no punters using its search for months on end due to poorer results, what price its market share, share price and traffic then?

    Google rides on people USING it as a search engine; we don’t exist to pander to Google. It should be a symbiotic relationship, but it is currently one where the shark is eating the pilot fish and expecting the scant surviving pilot fish to clean it.

    If everyone disaffected with Google started using alternative search engines to SEARCH, I’m sure Google would soon rectify these little “anomalies”, and I’m sure they would pay a bit more attention than they are at present.

    We govern who cracks the whip by our own choices and actions; we are not governed by the retailer when it’s us who are buying and choosing.

    I will quite happily wander off and write my own search engine in the end, if I have to and all else fails – Google are not the only people who can program indices successfully.

    We all have a choice, we all have a will, we all have freedom of action.

    If Google fails us all in this, then why frequent its search and perpetuate its existence?

    I’ve said all I can on this. Good luck to everyone suffering at present.

    Yours sincerely,

    malcolm pugh

    England

  502. Dave (Original)

    Perhaps there are not enough hours in a day, weeks in a month or months in a year for Google to include everything it can reach. Criteria for deciding which pages to include would then need to be used.

  503. Can someone at Google please give a simple explanation for this:

    As I’ve said in a previous post, my site enjoyed good rankings in the first 10-20 results until the “Big Daddy” shake-up.
    On 30th March I disappeared from view.

    I submitted a sitemap to Google, cleaned up the site’s internal links etc, and nervously waited.
    During April, nearly all my pages were re-visited and current versions were gradually being displayed in the cache.
    Then, in a cruel twist, all my major keywords returned to the first 10-20 placings on 16th May.
    I whooped with joy.
    This joy lasted for just two days.

    Suddenly, the site went “supplemental”, and again I disappeared from view.

    Google has now decided to list only 20% of my site pages.
    The rest are “supplemental”.

    OK, reading the many posts on this subject, maybe this is the way of the future at Google.
    BUT, the “supplemental” cached pages being shown for my site are up to 12 months old!

    So – given that Google had an April 2006 cache of my site, WHY is Google using these OLDER versions?
    WHERE are my April 2006-cached pages?
    Have they been lost or discarded?
    IF the April 2006 cache were to be found and used by Google, would those pages still be classed as just “supplemental”?

    In other words, are the OLDER versions of my webpages affecting my rankings?

  504. While I appreciate the fact that Matt Cutts is talking to us, I hope that he is also taking the various suggestions you can read throughout the comments back to Google.

    So, here go my humble comments and, between the lines, suggestions for Google or anyone out there who cares to develop a better SE:

    I believe that Google, as well as the other search engines, should focus on finding a way to better SELECT indexed sites – quality rather than quantity. We’ve seen Google’s index grow a great deal (particularly in the last 18 months or so), making it more difficult for webmasters and SEOs to get a decent ranking when following its original guidelines. Clearly, this benefits its advertising business.

    The link development requirements are crazy and unrealistic because they only make everyone spend more time on getting links when they could be developing real partnerships (and real, bona fide links) and content (!). Even if an SEO knows the best way, the clients are pushing for results – and faster and faster results…

    Our approach is to develop quality content, create site tools that are important for the visitors, AND select link partners (yes, shame on us!) like we select friends. Very carefully. We also work on increasing overall online visibility (submitting articles, for example).

    However, whatever the product/business segment, we are competing with many, many spammers who DO get good results, despite all the talk… and if we were to report spammers, we would do nothing else all day long. Hmm, I do have a business to run, clients to take care of…

    Here, we are also not neglecting the other SEs, because our goal is to get quality traffic across the web, especially because we have growing concerns about Google’s standards – and results quality. For example, why do directories show up when I do NOT type “my keywords” + “directory” in my search? Most directories have no content, just links. If I wanted a directory, I would search for one. Also, why does Google keep old, outdated and obsolete sites in its index? It must be fairly easy to purge pages from the index that have not been updated in the last, let’s say, 180 days… Who needs a site that hasn’t had any new CONTENT (not links) since 2002??

    Google should be looking into developing more ACCURATE search results – e.g. based on business/organization location, main purpose, etc. – instead of indexing more and more and more and more pages… Perhaps revisiting its original idea/mission is not a bad idea.

    I think Google might be shooting itself in the foot: by making it so difficult for good sites to get decent rankings, it will start losing quality and thus losing users. Just look at the posts above and you see “Joe Public” starting a revolution.

    Thanks all for your contribution.

  505. Sounds like Google is cracking down on ad networks, but might end up hurting general sites (like my little baby blog) that link to whatever they want just based on what they think is interesting that day.

    We’ll see how it all shakes out – thanks for the post, Matt.

  506. I think you’ll find the very few people in this thread have agreed with what Google is doing. In fact, I don’t remember any posts of support, except from Doug Heil, and I’m not even sure about that.

    Since apparently it has to be explicitly stated now or for some bizarre reason it doesn’t count, I support what Google is doing and am okay with it.

    Hey Adam (as in the one who needs to change his name because I got here first πŸ˜› ), are you gonna be doing any guest posting for Matt or just cleaning up threads?

  507. I recently spun off a subdomain of my website to its own domain using a 301 redirect in an .htaccess file. That went all right, and it seems to have been successfully reindexed by the major SEs. However, the new domain has lost its PageRank in Google.
    Is it typical that a subdomain doesn’t carry PageRank over when redirected to a new domain? Also, is the redirected subdomain going to be treated as a brand new domain, i.e. in the sandbox for a year?
    I am planning to do the same thing with another subdomain on the same site. Can you see anything here that would get the websites penalized by Google?
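
    (Side note for anyone making a similar move: before worrying about PageRank, it’s worth confirming that the redirect really does return a 301 rather than a 302, since the engines treat a temporary redirect quite differently. A minimal Python sketch – the URL is a placeholder, and it assumes the third-party requests library is installed:

        import requests

        # Fetch the old subdomain URL without following the redirect,
        # so the redirect response itself can be inspected.
        resp = requests.get("http://old-sub.example.com/page", allow_redirects=False)
        print(resp.status_code)              # 301 = permanent, 302 = temporary
        print(resp.headers.get("Location"))  # should point at the new domain

    If that prints 301 and the right Location header, the .htaccess side is doing its job.)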

    Thanks – love your blog
    Mike

  508. Dave (Original)

    RE: “I think you’ll find the very few people in this thread have agreed with what Google is doing”

    That would be the silent majority I mentioned, or the vocal minority.

    Reading Matt’s blog, I see 95% problems all round.

    RE: “are you gonna be doing any guest posting for Matt or just cleaning up threads?”

    I think he has just quit πŸ™‚

  509. Because I am not tied to WMW and always seem to miss out on the opportunity to submit my specific case details, my website is still in the crapper. A site: search shows 300+ out of 1,800+ pages. Google seems to have only the home page and 1-2 levels crawled/indexed, and even level 2 is patchy.

    “Linking to a free ringtones site, an SEO contest, and an Omega 3 fish oil site? I think I’ve found your problem. I’d think about the quality of your links if you’d prefer to have more pages crawled. As these indexing changes have rolled out, we’ve been improving how we handle reciprocal link exchanges and link buying/selling.”

    I am disturbed by this comment – not because I do the same thing, but because there are so many reasons why this would happen in a perfectly natural way. I am beginning to wonder if I am the victim of this type of “spam prevention” mechanism. I run a resource website that links to seemingly unrelated websites A LOT. By unrelated, I mean to a machine, not to my target audience.

    Can I be sure that Google understands that cakes, travel, dresses, photographers, fireworks, babysitting services and magazines are all, in fact, related under the overall theme of my website?

    Am I being penalized because websites that deal in antique cars, a day spa, a jazz band and a chocolatier are all linking to me? Unrelated? Seems so. Poor quality? Sure – none of them are SEO-savvy, and therefore they know nothing about links and PR.

    My visitors see the connections, though.

    I do NOT engage in heavy link exchange. A few (1-10) will happen every month.

    I DO engage in link selling. IT’S CALLED PAID ADVERTISING.

    I DO get a lot of incoming links as a result of press. Are these links also unrelated and therefore of low quality (suspect)?

    I’m beginning to wonder.

    “If the site is good, it should be indexed.”

    Who says what’s good? If the website is crawlable and indexable, it should be crawled and indexed.

    “You can argue as much as you like, but you still can’t come up with a valid reason why any decent, perfectly clean, website should not be fully indexed, given that there is plenty of space in the index.”

    Like I said…

  510. Since apparently it has to be explicitly stated now or for some bizarre reason it doesn’t count, I support what Google is doing and am okay with it.

    Having just realised who you are, that doesn’t surprise me in the slightest. But, since you mention it, yes, you *do* have to express an opinion for it to be counted. Silence simply doesn’t count.

  511. My site, which follows Google’s guidelines to the letter, has gradually improved its rankings over the last year: into the 30-40 results for our actual business name, the 1-10 results for some of its keywords, and the 100-500 range for some more common keywords. Our keywords are actually very specific to angling and our location.

    Up until now we’ve concentrated solely on providing original, quality content and interesting reading to the angling community, but it’s becoming apparent that to gain higher rankings we need to obtain hundreds of IBLs – and it doesn’t matter from where!

    The most frustrating part is that two-thirds of the sites we are competing with on Google have no content at all – just thousands of inbound/outbound links; directories, in effect! These same sites then carry completely off-theme link adverts! Aaarrggh!

    Matt, why doesn’t Google penalize sites that have more outbound links than actual text?

  512. Dave (Original)

    Phil, talk is cheap. Actions are what count. Most of those happy with Google are too busy to complain πŸ™‚

    Matt, regarding PageRank: my observation is that if you have a squeaky-clean site, perfect code to W3C standards, lots of original content, a few good inbound links and absolutely minimal outbound links, then after, say, 6 months the site will be PR4 – this seems to be the baseline for Google PageRank. If a site is less than, say, PR3, then it is doing something wrong; if it’s PR5 or above, then it conforms to Google’s guidelines and is benefiting from quality inbound links.

    Google’s PageRank seems to be a good indicator of quality for most sites, but I’ve come across quite a few sites that are PR3 yet have no content at all except for ‘under construction’?

    PS: there are some good content sites that only show PR1, but I assume that is due to poor coding.

  514. As you wish. I’m sure you’ve polled the silent and busy ones, so I can’t argue with that, can I? Perhaps you’ll publish it sometime πŸ˜‰

  515. Dave (Original)

    Nah, no need to poll. I’ll just “assume” like you πŸ˜‰

  516. > are you gonna be doing any guest posting for Matt or just cleaning up threads?

    Not at this time, and no πŸ™‚

  517. Coding has nothing to do with PageRank. And I think almost everyone agrees by now that the green toolbar that shows PR is pretty meaningless.

  518. I’ve just recently noticed that the Google traffic to my blog has vanished. I don’t usually look at my stats, but now all my visitors are from Yahoo or MSN, whereas I used to have 100+ from Google every day. The somewhat amusing thing to me is that, while I am a webmaster and do have commercial sites, my blog is completely and utterly pristine: no advertising, no affiliate links, and 100% original content all the time. My only outbound links are my blogroll.

    I write pet health articles and used to have huge amounts of traffic to a few of my entries, including my most popular one, about medicating dogs who are afraid of loud noises. I discussed the pros and cons of various available treatments, and had a few visitors to that entry every day. Now that entry isn’t even in the first 100 search results for phrases I think would find it. Ironically, though, on the first few pages of search results there are links to comments made by me on the blogs of other people, where I’m talking about some of the problems I’ve had with my dog! So, my one-sentence version of my article is loved by Google, but the entire article isn’t. Btw, those comments by me which show up in the search results aren’t spammy hyperlinks, just conversation on friends’ blogs.

    I guess the problem is that most of my inbound links are reciprocal because of the blogroll? This was never intentional, it’s just that it’s normal for a community to form, and for people to link to each other as they form “blog friendships”. I read all the blogs on my blogroll each day — that’s why the links are there. They help remind me to go read, and they help people who read my blog to find blogs about related subjects.

    I think this is an example of where this algorithm is faulty. Pretend I know nothing about SEO, and I’m just a person writing original articles about a subject on which I’m well-informed, in a blog format. This is supposed to be what Google now loves, but my traffic now totally sucks. I started my blog because I wanted to be helpful about what I’ve learned about pet health over the years, but it’s not very helpful to anyone if my articles don’t show up as they should.

  519. Dave (Original)

    RE: “And I think almost everyone agrees by now that the green toolbar that shows PR is pretty meaningless.”

    I wish that were true. Unfortunately many out there live and die by TBPR and are PR junkies.

    Regarding PageRank: my observation is that if you have a squeaky-clean site, perfect code to W3C standards, lots of original content, a few good inbound links and absolutely minimal outbound links, then after, say, 6 months the site will be PR4 – this seems to be the baseline for Google PageRank. If a site is less than, say, PR3, then it is doing something wrong; if it’s PR5 or above, then it conforms to Google’s guidelines and is benefiting from quality inbound links.

    Google’s PageRank seems to be a good indicator of quality for most sites, but I’ve come across quite a few sites that are PR3 yet have no content at all except for ‘under construction’?

    PS: there are some good content sites that only show PR1, but I assume that is due to poor coding.

    Terry. You are mistaken about PageRank representing the sort of quality value that you described. Doing something right or wrong doesn’t affect PageRank. In fact, some of the spam methods increase PageRank a lot, and doing everything totally by the book doesn’t improve PageRank one iota. There are plenty of squeaky-clean sites that show PR1 in the toolbar, and plenty of not-so-clean sites that show PR6 and up.

    If you do a search on ‘pagerank’, you’ll find some good articles about it.
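
    For reference, the formula from the original PageRank paper – a paraphrase of the published version; nobody outside Google knows the current implementation – is:

        PR(A) = (1 - d) + d * ( PR(T1)/C(T1) + … + PR(Tn)/C(Tn) )

    where T1…Tn are the pages linking to page A, C(T) is the number of outbound links on page T, and d is a damping factor, usually quoted as 0.85. Notice that nothing in it refers to coding standards or content quality – only links count.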

  521. On an inspiring rather than disillusioning note, it is quite likely that predominantly disaffected people write to blogs like these, as they have, or perceive they have, a problem; whereas, as many have stated, I guess that if you are having no problems you might not even seek this blog out in the first place. The only reason I did was that I had a problem and could not get any joy anywhere else.

    Also, to their credit, Google have printed all my comments, both pro and adverse.

    If there has been more than a link issue, which I deem likely, then it is likely to have been a space issue, or a rebuild-the-index-from-scratch-and-throw-out-the-dross issue; either of which might well involve just retaining the core index page as a base reference to rebuild the link pointers onto.

    In which case, all of our websites that are down to an index page should soon “flesh out” again, as the salient data recently gleaned from better and deeper crawling is reapplied to a stable, cleaned-out base with better foundations.

    This would explain the crawling but apparent non-reindexing – a lapse between the two, possibly until new space is brought online to service it.

    If websites were deemed only good enough to warrant an index page, what is the point of even retaining the index page? It would be better to zap the site completely – effectively much the same thing, as a site with 600 pages hardly functions with just its index page anyway.

    However, if your intention is to spring-clean all the dross completely at the same time as adding more space, then you would need to retain your index pages as salient points of reference for a rebuild; so I predict that those with just an index page will gradually see all of their valid, pristine pages reappearing – or, in most people’s cases, all of their pages, full stop.

    I don’t buy the inbound-links explanation; it seems to have too many holes in it. I think we have witnessed/are witnessing a rebuild, probably initially prompted by space considerations, and then seen as a good time to revamp properly – to put the index on a sound, restructured basis, removing a lot of historical and “known” spam dross.

    Presumably a lot of valid sites, like the lady with the pets blog, may also have got caught in the crossfire, and will be addressed and rectified as they learn to fine-tune things for the innocent victims.

    If this is the case, I think I’d like to have been told “hey guys – we are revamping the index and you might lose your pages for a month or so…”, which would at least make you realise that at some stage your pages were due back online, rather than leaving you with the thought of eternal damnation and being forever excluded.

    I hope this raises a ray of hope for some, as this must have been catastrophic for some businesses, who must still wonder what the hell is going on. And remember that we here can debate this issue knowing a little of what is behind it, where to look for answers, and why things work as they do. For everyone who does, there must be thousands of ordinary people and businesses who simply cannot understand “where Google has gone” – probably the most innocent users afloat, as they haven’t a clue how any of it works. They simply wrote it as it is, “much as Google says you should”, just popped in their content and tried to lob a few salient keywords here and there. No devious intent, no machinations, no Machiavellian manoeuvres – yet they are suffering the fate apparently reserved for those who spam unmitigatedly. This, I think, is where it is most scary: this update is cleaning out websites it is supposed to be sponsoring and nurturing, run by completely innocent webmasters to whom SEO might as well be UFO.

    But I am sure this is coming good in the wash now, or soon; it’s just a pity that this, if it is the case, had not been made a little more transparent.

    Fear not, those single index pages will soon flesh out again.

    Malc Pugh
    England.

  522. Dave (Original)

    RE: “..and doing everything totally by the book doesn’t improve PageRank one iota”

    You really are the eternal pessimist (no, not realist). There are 2 serious flaws with the statement.

    1) Nobody outside Google knows what the *true* PR of any given page really is at any given time.

    2) Doing “everything totally by the book” is what will make a Web business prosper long term.

    Your condoning of SE spamming and admission to being a black hat, Phil, does nothing for your ‘professionalism’.

  523. Can’t you do anything but troll? You’re not even a good troll.

    Your #1 statement: nobody suggested any different.

    Your #2 statement: nothing to do with anything that was previously written.

    Your third statement: “professionalism” has nothing to do with black and white hats – success does πŸ˜‰ But I suggest you learn to understand English better before you comment on people’s posts, because you’re not very good at making educated guesses.

  524. Btw, Dave. Matt wrote some guidelines for comments, which you might like to read, because your posts don’t comply with them. If you really do want to discuss your views, I’d be very happy to discuss them with you somewhere else, but not in Matt’s blog. But if you are just trolling for the sake of it, then I’ll leave you to it.

  525. One is led to presume from this thread that inbound links should now be cultivated to help boost results. No doubt over the course of this blog – past, present and future – there will be many other imperatives to follow which may also give short-term gain.

    But I believe the original Google concept of good content – even to the point of being completely ignorant of SEO – is probably the most viable long-term idea.

    If you develop completely natural sites with real text, real pictures, real videos and real sound, hook it all together with like-minded people with like-minded websites, and then tell the world – via articles, directories, blogs, ezines and podcasts – all about your site and its ideas and aims, then you will have new, exciting content that people will want to view, hear and read, and will link to naturally, out of interest alone, popping back now and then.

    This may fly in the face of the SEO of the day, or month, or even season, but long term it is REAL, it is ORIGINAL and it is INTERESTING – and, most of all, it is yours. So it will prosper long after fancy ideas have squeaked through holes in the search engines’ logic.

    Plus, in that idiom, you are not tied to one search engine, one system of coding, or one basket with one set of eggs – you have men for all seasons, and baskets on each arm, in the house and in the car. We are talking versatile, fascinating and eye-catching through the rarity of being honest, true and personal: an oasis of contemporary original creation in an era of duplication and recycled material.

    We are having a hard time at present, but honest-to-goodness reality will bring long-term visitors and long-term viability – and you will not be tied to any one avenue of traffic, but served by diverse traffic from all kinds of springs.

    Malcolm Pugh
    England
    http://www.datacoms.co.uk free help site for all.

  526. Hi Matt,

    Great info on Big Daddy. I am not an SEO expert, but this was pretty straightforward even for beginners like myself. One quick question for you, though: we have a video content site where we allow many of our users to “embed” videos out across the web, typically on their blogs or MySpace pages. For a while that was really cranking up our PR, but of late it seems to have almost totally halted – wondering if our “portable” embed links are now working against us.
    Thanks,
    _MD

    If you develop completely natural sites with real text, real pictures, real videos and real sound, hook it all together with like-minded people with like-minded websites, and then tell the world – via articles, directories, blogs, ezines and podcasts – all about your site and its ideas and aims, then you will have new, exciting content that people will want to view, hear and read, and will link to naturally, out of interest alone, popping back now and then.

    When I read that, I thought you were being sarcastic, and I smiled. But you weren’t, were you? Oh dear.

  528. Dave (Original)

    Call me a “Troll” if you like, but I will still make comments on erroneous statements you make. I would think Matt would be happy to have someone pull you up on condoning SE spamming. The problem is big enough for Google without you making it worse.

    RE: “When I read that, I thought you were being sarcastic, and I smiled. But you weren’t, were you? Oh dear.”

    Now, what were you saying about “Trolls”. Oh dear πŸ™‚

  529. Dave (Original)

    RE: “If you develop completely natural sites with real text, real pictures, real videos and real sound, hook it all together with like-minded people with like-minded websites, and then tell the world – via articles, directories, blogs, ezines and podcasts – all about your site and its ideas and aims, then you will have new, exciting content that people will want to view, hear and read, and will link to naturally, out of interest alone, popping back now and then.”

    That is correct, IMO. Content always has been, and always will be, King. It is this train of thought that will likely result in Googlebot grabbing ALL the content on a site, just as Matt has hinted.

    The best links are the ones you probably do not even know about. That is, a Webmaster on another ‘like’ site sees your content as *worth* linking to and gives it a vote.

  530. Matt, I think it might be useful if you did a post in which you defined some of the terms you routinely use. For example, in this post you mentioned “refreshing supplementals” and back in late December or early January, you mentioned that Google had just done a “data refresh”. I’m sure there are lots of other terms you use frequently that are also ambiguous, but I can’t think of them offhand right now. These are terms that you probably take for granted, since you and your co-workers use them frequently in-house, but those of us on the outside tend to get confused. Perhaps this would be a great post for Adam to take on as well. Just a suggestion. πŸ™‚

  531. Dave (Original)

    That is a great idea, Dazzlindonna! I often see an acronym or buzzword used that has different meanings in different circles.

  532. Call me a Troll if you like, but I will still make comments on erroneous statements you make.

    That’s OK if that’s what you do, but your comments had nothing to do with the statement you picked me up on. I.e. we know that only Google knows a page’s actual PageRank, but that has nothing to do with “doing everything totally by the book doesn’t improve PageRank one iota”; and “Doing everything totally by the book is what will make a Web business prosper long term”, as you put it, also has nothing to do with “doing everything totally by the book doesn’t improve PageRank one iota”. It looks an awful lot like trolling to me – or it could be that you really don’t understand what PageRank is.

    I would think Matt would be happy to have someone pull you up on condoning SE spamming. The problem is big enough for Google without you making it worse.

    Er….which part of my post condoned search engine spamming? Or is it that you are stalking me here because of my views, whether you can find anything to pick me up on or not?

  533. Now, what were you saying about Trolls. Oh dear

    No trolling there – sorry. I guess my comment went over your head.

  534. Nice to see that you guys are cracking down on European websites as well as your own over in the States. In recent months the number of sites spamming and using duplicate content – especially in Europe, where I am – has grown. I sent a couple of requests to your department with no joy, but it is nice to see that you are actually doing something about it.

    I know the Spanish site that you mentioned, but I never looked in great detail into why they were well ranked.

    Keep up the good work – especially in Spain, please…

  535. “Perhaps there are not enough hours in a day, weeks in a month or months in a year for Google to include everything it can reach. Criteria for deciding which pages to include would then need to be used.”

    Most hurl-inducing, sycophantic post of this thread.

    Is Google’s mission statement to “index the world’s information,” or the Fortune 500’s?

    If MSN can do it, why can’t Google?

  536. Hey Matt,

    Some fodder for your next article: http://www.adsensepages.com

    You may find it worthy of a post or two πŸ™‚ Normally it’s not even worth pointing out bad sites, but this one seemed worth mentioning.

    al

  537. Dave (Original)

    I don’t believe MSN have as many pages as Google, by a loooong shot. If you have proof to the contrary, please share.

    RE: “Is Google’s mission statement to “index the world’s information,” or the Fortune 500’s?”

    That’s right, it’s their *mission*. That doesn’t automatically mean they have done it. Your point?

    RE: “Most hurl-inducing, sycophantic post of this thread”

    Grow up… please!

  538. Dave (Original)

    RE: “That’s OK if that’s what you do, but your comments had nothing to do with the statement you picked me up on. I.e. we know that only Google knows a page’s actual PageRank, but that has nothing to do with “doing everything totally by the book doesn’t improve PageRank one iota”; and “Doing everything totally by the book is what will make a Web business prosper long term”, as you put it, also has nothing to do with “doing everything totally by the book doesn’t improve PageRank one iota”. It looks an awful lot like trolling to me – or it could be that you really don’t understand what PageRank is.”

    Yes, it does. By your own admission, nobody outside Google knows the PR of any given page at any given time, so nobody can say what raises PR and what doesn’t – or, more specifically, which *links* get counted for PR and which don’t. However, it is common sense that black-hat methods carry a VERY HIGH risk (funny how you never mention that when condoning black-hat methods) that would outweigh any *perceived* PR gain. It is also common sense that placing good content on one’s site increases the chances of another, similar site voting for it by linking to it – thus likely raising PR.

    Now, doing everything by the book means following Google’s guidelines. If you do that, your chances of gaining PR (without cheating) are increased, with NO risk whatsoever.

    RE: “Er….which part of my post condoned search engine spamming?”

    PHIL SAID
    “Doing something right or wrong doesn’t affect PageRank. In fact, some of the spam methods increase PageRank a lot, and doing everything totally by the book doesn’t improve PageRank one iota.”

    On your site you condone cloaking (although you are very confused about what it really is) and even link to a WELL-known black hat who has cloaking software.

    On your site you also link to a well-known PR monger and take commission from them. The site you link to attempts to sell PR, as shown by the heading statement on their site:

    “Buy Text Links from our partners for high PR quality ads”

    Phil, there is only one thing worse than a self-admitted black hat, IMO. That is one who does not have the backbone to admit they are one, will not condemn black hats, and talks out of both sides of their mouth.

  539. Dave:

    My point – obvious to most – is that their current approach to indexing precludes them from fulfilling their own mission statement.

  540. Dave (Original)

    It may *appear* that way from our extremely limited view, with bias blinkers on. But some of us are willing to admit there is a bigger picture. Likely a means to an end, not the end itself.

    Do you really think Big Daddy was implemented to improve indexing etc, or make it worse? That one IS “obvious to most”.

  541. Dave, it is an incontrovertible fact that unique, honest, non-spammy sites are being erased from Google’s index. Therefore, if Big Daddy was implemented to improve indexing, it is “obvious to most” that it has taken a step in the wrong direction. I only hope Malcolm is right and that this is a temporary situation.

  542. Dave, it is an incontrovertible fact that unique, honest, non-spammy sites are being erased from Google’s index.

    I’m not saying this isn’t a fact. It may well be true. But if you’re going to make a statement like this one, be very prepared to back it up.

    For example, I’m going to ask you to list the sites to which you’re referring.

  543. Dave (Original)

    RE: “Dave, it is an incontrovertible fact that unique, honest, non-spammy sites are being erased from Google’s index.”

    I would say that is your opinion, based on your minute part of the Web. Only Google themselves would be in a position to make such a bold, sweeping statement. Besides, I bet just as many (likely more) “unique, honest, non-spammy sites” are being added to Google’s index that were NOT being included before.

    Having said this, Google indexes pages, not sites.

    Like Adam, I too would like to see you back up your “incontrovertible fact” with proof.

  544. It should be pretty obvious with all the people complaining about pages and sites disappearing and not being updated. Obviously there is something wrong on Google’s end when a page is there one day and gone the next.

  545. Dave (Original)

    Like I said, that is based on one minute part of the WWW. Matt’s blog, forums etc. are always full of nothing but *problems* – that is why they exist. To judge the whole of Google and the WWW on complaints, moans, gripes and outright bias is very naive at best, IMO.

    RE: “Obviously there is something wrong on Google’s end when a page is there one day and gone the next.”

    Disagree completely! In fact, it often means Google is outing spam. There are literally 1000’s of reasons why a “page is there one day and gone the next”. Matt himself has said many times that when he checks pages/sites that have vanished, it is mostly due to a breach of the Google guidelines.

  546. Dave (Original)

    Jack, also keep in mind that those who have ALL pages indexed would not be heard from. You know, the silent majority, or the vocal minority.

    All,
    I agree that there are reasons why Google needs to change its outlook on how pages are indexed.

    I personally know of many people who have decided to open “keyword-rich”, aka “spam”, sites with lots of outbound links that they receive 10p a click on… Many of these sites are just lists and are, in my opinion, pointless and annoying… I mean, how many sensible people actually use these sites anyway?

    The thing that is getting me is that my company website, which used to have over 100k pages indexed, is now hovering around the 10k-27k mark. The thing is that we didn’t change a thing! We’ve got no inbound links from directories, as far as I know, and we don’t link out to any… In fact we have minimal outbound links, as there are not many people who we feel need the free adverts.

    Since this has happened, I’ve completely redesigned the site, and we’re not actually going up or down at the moment, but more than 70% of our stock lists are no longer indexed… And to make matters worse, we used to be top for 90% of our products – in a competitive market!

    I’m sitting here with my boss breathing down my neck asking why this is happening… The funny thing is, does anyone actually know? How long does it take for items to be indexed?

    And why, if a page is relevant and has been spidered, does it not appear in the index? And it’s not being spidered once a week, but more like two or three times a day!

    How about Google does something drastic… DELETE EVERYTHING and start again… because half the stuff that’s in there is old (including a few of our pages) or completely pointless.

    Dave, I don’t think that anyone has ALL their pages indexed, unless their site contains only one or two pages…

    If Google is changing for the better, I wish it had been fully tested before it started to go live…

  548. Dave. Your posts just aren’t worth responding to (or even reading). You invent things people have said when they haven’t said them, and you show little to no understanding of anything to do with Google or search engines in general. You bring in topics that were never mentioned just to argue about them, and you continually appear to be wanting a fight. You ignore the offer to come and discuss whatever you like somewhere else, and you ignore Matt’s guidelines for posting here.

    Quite frankly, you haven’t shown enough knowledge or sense to merit debating with, but the offer is still open if you have the guts – though I don’t think you have. You know the forum on my site – come and discuss it there. I’ll make it one-to-one if you like, and I’ll not allow anyone else to post, so you won’t be overwhelmed by people. If you’ve got the guts to have an open debate with me, without the constraints of Matt’s guidelines, then come on over. Btw, flames and bad language are not allowed, so all you’ll get is debate. Do you have the guts for it?

  549. Dave, Adam:

    Did you really think that Matt would post an example of a non-spammy site that is having indexing problems?

    Just because it is not happening to you doesn’t mean that it is not happening. Talk to designers, developers, et al. who build new, reputable sites on a regular basis.

    At one time, if you built it, Google would come. That is no longer the case.

    Read between the lines. Matt is saying that if you do not have sufficient PageRank, you won’t be fully indexed.

    “There are literally 1000’s of reasons why a β€œpage is there one day and gone the next”.”

    Uhhh, OK – I’ll settle for 10. “Literally.”

    Keep drinking that ‘Plex Kool-Aid. It kills the critical part of your brain, but makes it easier to sing and dance.

  550. I’m not saying this isn’t a fact. It may well be true. But if you’re going to make a statement like this one, be very prepared to back it up.

    For example, I’m going to ask you to list the sites to which you’re referring.

    Adam. I assume that was addressed to everyone, and not just to Nancy. Here’s one such site: http://www.forthegoodtimes.co.uk/

    The site was specifically designed to be as clean as a whistle, and not to take search engines into consideration when it comes to linkages, except for two things: (1) some of the inner link texts, and (2) a small handful of starter IBLs – to get it noticed by the engines. As a result of Matt’s post, I removed two of the starter IBLs because they were from an off-topic site. The others are from on-topic sites.

    It links out to hundreds of on-topic sites, it doesn’t link to any off-topic sites, it doesn’t contain any reciprocals (not even back to the starter IBLs sites), it doesn’t contain any paid links of any kind, it doesn’t contain any affiliate links other than AdSense, and it has its own data.

    Before BD, Google claimed tens of thousands of its pages in the index, and plenty of people were using the site because it’s a useful resource. A few weeks ago the number of pages in the regular index went down to just 12 (13 with filter=0), and there are a few hundred pages in Supplemental. The rest of the pages seem to have vanished altogether.

    As a site, I think it is not far from being unique in its field and location (read the About page to understand that). If that site is spammy, then I’d like to know in what way.

    You only need to read the various forums to know that a great many sites have had their pages dropped for no apparent reason, Adam. And if so many people in forums say it of their sites, just imagine how many sites have suffered that are owned by people who don’t use forums.

    Googleguy posted an email address at WMW to send such sites to, but he posted it behind closed doors, and most people weren’t aware of it. If you really want example sites, then ask for them in the various forums where the problem is being discussed. There are so many people in those discussions who have had it happen to their sites.

  551. Worra bloody idiot!

    I misread the name and thought that it was the real Adam who made that post πŸ™

  552. I am the real Adam. I got here first. Lasnik’s just a stuck-up white boy fakin’ the funk, and he’s not a real brutha like me. WORD LIFE!

    And Phil, if that site is your idea of a joke, it’s not funny.

    Or should I say:

    “It’s not funny in Caithness. It’s not humourous in Caithness. It’s not amusing in Caithness. It’s not even mildly entertaining in Caithness.”

    That site has scraper site stink written all over it.

    Not to mention the fact that it has a navigation structure that forces the user who doesn’t want to search through various listings to have to browse at least 3 levels deep before they even find anything! And when they do, there’s only ever 1 or 2 listings!

    Robert G. Medford:

    I wasn’t asking Matt…I was asking Nancy. She made the statement. It’s up to her to back it up.

  553. Quite frankly, you haven’t shown enough knowledge or sense to merit debating with, but the offer is still open if you have the guts – though I don’t think you have. You know the forum on my site – come and discuss it there. I’ll make it one-to-one if you like, and I’ll not allow anyone else to post, so you won’t be overwhelmed by people. If you’ve got the guts to have an open debate with me, without the constraints of Matt’s guidelines, then come on over. Btw, flames and bad language are not allowed, so all you’ll get is debate. Do you have the guts for it?

    If I were Dave, I wouldn’t either, and the reason has absolutely nothing to do with guts. If it’s your forum, you’d have the ability to modify/edit/delete posts on a whim and he wouldn’t be able to do the same thing to you.

    I’m not saying you would; I’m saying you could, and that gives you home-court advantage right off. I don’t think Dave would be that stupid.

    If you two are gonna do that, you need a neutral turf.

    Just because it is not happening to you doesn’t mean that it is not happening. Talk to designers, developers, et al. who build new, reputable sites on a regular basis.

    I’m not saying that people aren’t having indexing issues. Obviously they are, or no one would be bitching.

    What I’m saying – and I’ve been saying this for years, even before BD came out – is that the vast majority, if not all, of these issues have been caused by some form of webmaster error. All BD has done so far is to expose errors. Whether it’s an overall lack of promotion, linking to bad areas, scraper site-like content, or whatever the error may be, errors were exposed.

    I used to be someone who would cry wolf whenever my site went down. I can admit that…I used to get pissed off and shout at the skies any time it slipped. But I learned something a long time ago that a lot of you who are complaining should learn too:

    Look within before you look without.

    In other words, assume you were the one that made the error. Check your own ego and bias at the door (yes, that’s a lot easier said than done), and try to find the error and fix it. If you’ve made a legitimate, honest effort to do so and you can’t, then you might be able to start blaming Google. But until then, any bitching is unfair.

    That’s not “Plex Kool-Aid” or “GoogleCultism” or whatever the hell else you want to call it…that’s just stepping back, being objective, and examining the situation. What they’re doing makes complete sense…it just caught a lot of webmasters with their pants down and they didn’t like it.

  555. Thanks for the info, Matt. I am late, I know, but this has been really helpful. I would also like to know if Google penalises for duplicate content – like using an article directory that’s for sale.

  556. No, it wasn’t a joke, Adam. I would have expected something a bit more serious from you, but then, since I made a mistake and thought you were someone else, I guess it’s my fault.

    Adam. The discussion is about spammy sites. You asked for examples of non-spammy sites that had their pages dropped – remember? It isn’t about your idea of how many clicks it should take to get from the homepage to the meat. Maybe you simply don’t know about directories and how they work, or maybe you are just doing your darndest to find fault where there is none. Incidentally, it takes a lot fewer clicks in that site to get to the meat than it does in Yahoo! and DMOZ πŸ˜‰ So why not cut the crap and try to stay within the actual discussion – which YOU invited.

    It isn’t a scraper site, and it doesn’t look anything like one, but perhaps you are not very familiar with scraper sites. Do you think I would have posted the URL here if it was a scraper? It looks like a perfectly ordinary niche directory, which is exactly what it is.

    So where’s the spam, Adam? You wanted examples, and I gave you one.

    It’s not funny in Caithness. It’s not humourous in Caithness. It’s not amusing in Caithness. It’s not even mildly entertaining in Caithness.

    I mentioned that in my post about the site. It’s ugly and undesirable, and I don’t like it at all, but it’s something that Google has made people do. It’s ugly but it isn’t spam. If that’s all you can find, then I’ve shown you a non-spammy site that has had its pages dropped, which is what you asked for.

    About Dave:
    I’ve no doubt that he won’t show his face in my forum, because he daren’t debate where there are no constraints as there are here. Regarding what you say could happen there – I am a very straight person, and I would never do anything that you suggested, except for removing flames and foul language, which don’t belong in my forum. I’d be more than happy to do it in a neutral place as a one-to-one debate, as long as it is understood that I won’t continue a debate with someone who resorts to flames, because it becomes totally pointless. I thoroughly enjoy a good debate, so if someone wants to set it up, I am more than willing to get to it.

  557. All BD has done so far is to expose errors. Whether it’s an overall lack of promotion, linking to bad areas, scraper site-like content, or whatever the error may be, errors were exposed.

    I disagree. For one thing, I doubt that BD has done anything about scrapers, but the main thing I disagree with is the suggestion that a lack of promotion is an error – it is not. Judging by Matt’s post, a failure to acquire enough decent IBLs is a reason for a site to have many of its pages left out of the index, but it is not an error on the site owner’s part. Search engines fail if they refuse to index unpromoted sites; that isn’t what their users expect of them.

    My proof is that I have seen so many people complaining (although their links are not available), plus of course my own personal beef. But if you choose to believe that every single complaint is from an evil blackhat spammer, fine. I have no investment in convincing you of anything, since you are not in a position to do anything about it.

    And about your silent majority – it is small websites being affected, and most of these are run by people who never think to check how many pages are being indexed, or to visit forums for webmasters. I used to be like that. Then one day I noticed my site dropping severely. At first I thought it was my fault. I examined Google’s guidelines – what was I being punished for, I wondered? When I found nothing, I did some research, and that brought me to WebmasterWorld, and finally here.

    But again, if you choose to dismiss all of these people and believe the algo is flawless brilliance, who am I to burst your bubble? The land of denial can be a nice, happy place, and you can hold hands with Google and skip through the daisies all you want. Have fun!

    My proof is that I have seen so many people complaining (although their links are not available), plus of course my own personal beef. But if you choose to believe that every single complaint is from an evil blackhat spammer, fine. I have no investment in convincing you of anything, since you are not in a position to do anything about it.

    Then you really need to learn what proof is. What you have here is emotional outcries from people who have a bias toward their own sites and products. And that’s quite understandable… not too many of us out there think our sites suck (as in, the people with the best of intentions).

    Nevertheless, it’s not proof, and if you want things “fixed” to your liking, stand back and look at it from the standpoint of anyone inside the Google engine. If you were Matt, or Adam, or anyone else involved, and someone emailed you or posted on your blog complaining about their specific site – how it got dropped, how Google is evil and how it “needs to be fixed” for their site specifically – ignoring the much larger picture and how any change to suit their needs might affect millions of others in a negative way, would you go ahead and do it? Would you deal with someone who is not being logical or emotionally objective? I know I wouldn’t. No matter how I tried to appease that person, I couldn’t win. So why waste energy?

    You’re right. You don’t have to convince me. But if you want things done in a manner that will help you, you do have to convince them. And none of you who are in the “webmasters affected negatively” boat have done that.

    I’m not saying every complaint is from an evil blackhat spammer. I suspect the vast majority are from webmasters who have legitimate intentions. The problem is that the execution is flawed, and we all need to stand back and look at this objectively and ask ourselves, “is Google fully to blame? Have I potentially done anything wrong?”

    Nancy, can you in all honesty say that your site is perfect? Can you say that it is such a valuable resource that it could not possibly be improved upon? Can you say that it is without error?

    That’s all I’m saying here…stand back, and be objective. It’s a hard thing to do, but a necessary one.

    Google’s not perfect. And I’m not saying it is. The algo is prone to error, as all algos are. But this isn’t the error.

  560. The Adam That Doesn’t Belong To Matt said “..stand back and look at this objectively and ask ourselves, ‘is Google fully to blame? Have I potentially done anything wrong?'”

    This is good advice but, unfortunately, there are some of us who have looked at all of the possible errors with our sites and found that we have done our best to “comply” with Google’s guidelines. Even after that, there is obviously something wrong somewhere that has caused our sites to drop drastically in the SERPs.

    My site, for example, is older than Google and for years ranked #1 for many search terms that now (since March ’06) list the site in the 100s. I have gone over it with a fine-tooth comb and have asked for help for the last couple of months, but no one seems to know the reason for the drastic drop. The site contains quite a bit of exclusive information that exists ONLY on my site.

    Some of the keywords that used to list my site at #1 are now listing sites that aren’t even related to the search term, or that only contain the search term once, with no relevant information after that. So from my standpoint – and I am sure there are others – if a site can dip from #1 to #141 in one day, there may be problems on the site, but there is no doubt that something changed at Google; and without an explanation or any help whatsoever, there is little hope of remedying the situation.

    I have been a Google “supporter” since the beginning but, based on what I have seen in the past couple of months, Google IS NOT returning valuable and reliable information for some search terms, period. If I can’t rely on a search engine to return the results I am looking for, why would I continue to use it?

    If anyone is willing to help me understand the drastic drop and what may be happening, please let me know. I can provide examples of keyword terms for my site that return #1 in MSN, Yahoo, Ask, AltaVista and more, but do not show up on Google. I need help in understanding the problem.

    Thanks for letting me vent.

  561. Adam. The discussion is about spammy sites. You asked for examples of non-spammy sites that had their pages dropped – remember? It isn’t about your idea of how many clicks it should take to get from the homepage to the meat. Maybe you simply don’t know about directories and how they work, or maybe you are just doing your darndest to find fault where there is none. Incidentally, it takes a lot fewer clicks in that site to get to the meat than it does in Yahoo! and DMOZ. So why not cut the crap and try to stay within the actual discussion – which YOU invited.

    Oh, but I did. And my point about the number of clicks is very relevant to this particular discussion.

    And since you brought Yahoo! and DMOZ into it:
    http://dir.yahoo.com/Health/Nursing/ two clicks, and you have content.
    http://dir.yahoo.com/Society_and_Culture/Weddings/ two clicks, and you have content.

    There are many examples of this within Yahoo! itself. Is the content regionalized? No. Is there a problem for the user who wants regionalized content? Absolutely.

    DMOZ, there are categories that accomplish the same thing.

    http://www.dmoz.org/Society/Sexuality/ Two clicks.
    http://www.dmoz.org/News/ One click.

    You should start to see at least SOMETHING within a click or two, if you want to compare yourself favourably to those two sites (or most other directories, for that matter).

    Why don’t we? Because that would interfere with the search engine work that “Google made you do.” That statement in itself is grossly flawed, but I’ll get to that in a bit.

    As far as it resembling a scraper site goes, it does bear certain similarities (specifically the link reference I made above and that you acknowledged.) Why would you do this, if not for IBL anchor text manipulation? It certainly isn’t user-friendly, which you’ve acknowledged as well. If that isn’t spammy (and there’s strong reason to believe it is), at the very least it’s a significant usability issue. It’s not a 100% spam-free site, and you haven’t established a damn thing with that link.

    Whether or not you’d be willing to post it into the blog of a search engine doesn’t establish anything either. You may be running the risk of making Google aware of something that you’re doing, but as of right now most of your site has an indexing issue anyway. So what have you really got to lose by doing so? Not much, at best.

    And without meaning to, you’ve shown an example of exactly what I was talking about as far as other possibilities existing.

    You openly admit that you don’t like the site and you can see significant room for user-based improvements.
    You openly admit that you do things specifically to please a search engine at the potential expense of the end user.
    Google’s primary goal is to present the most relevant results possible to the end user.

    If your site has obvious usability issues (and for that matter, BDDA issues, but that’s another topic…I’m just pointing it out since it’s tangentially relevant to this one), and it’s not good enough for other webmasters to link to it whether you ask them or not, then there isn’t a valid reason for it to be indexed.

    Again, this is what I’m talking about as far as personal bias goes. This is your site. And it suffered. And there is reason to believe that it deserved to, both yours and mine.

    As far as “Google made me do it”…that’s just a big steaming stinking festering pile of crap and you know it. You have complete control over your own site to build it as you see fit. If you really wanted to, you could flat-out disallow the Googlebot from ever visiting your site in the first place. And if you really wanted to, you could build your site for its users and say “I’ll get legitimate visitors and let the SE stuff fall where it may.” It can be done, it has been done, and it will be done again.
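
    (To make that concrete: shutting Googlebot out completely takes a two-line robots.txt file at the root of the site. This is just the standard robots exclusion protocol, nothing site-specific:

    User-agent: Googlebot
    Disallow: /

    Googlebot honors that, and your pages never enter the index again – entirely your call.)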

    No one’s forcing you to put keyword-altered anchor text descriptions into your site for the various cats and subcats. No one’s forcing you to go and get IBLs. Matt and Adam and Sergey and Larry and the rest of the Google gang aren’t going to come along with a lead pipe and bash you in the cojones until you comply with their qualifications for being indexed and ranked.

    It’s choice and consequence. You can choose to do as you wish with your site, but if it doesn’t rank or get indexed, then that’s a consequence of your choice, not an evil overlord implementation on the part of Big G.

  562. Actually, Adam, “emotional outcries”, as you put it, do have an effect with Google, because Google people listen, and if they find something wrong, they make changes. They need and look for such feedback to help them discover problems when they’ve made changes. That’s been true for years, but you only need to read Matt’s first post in this thread to see that it’s true.

    You appear to be a person who accepts everything that Google does as being good. For instance, you are the only non-Googler in this entire thread who has actually stated an agreement with what they have done with the new crawl/index function. (Don’t turn to the silent majority again, because they all disagree with you. You don’t think so? How do you know?) You assume that, if a site has had its pages dropped, it is almost certainly a fault with the site. In the present circumstances, that’s something that is almost certainly wrong.

    You said that you used to tear your hair out when your site dropped (my paraphrase), but you don’t any more because you believe that it must deserve it if it happens. I’m better than you. I never publicly complained about it when a site of mine suffered in any way, however badly, and I still don’t. But I do a bit of public shouting when I think that Google (or any big engine) is doing something wrong – as now. I haven’t complained about my sites – in fact I told Matt to leave the example site that I gave and to let it die.

    However, there is nothing wrong with shouting “unfair” when Google drops a perfectly innocent site, as has been happening all over the place recently. Algo changes that cause big ranking drops are one thing, but intentionally not indexing sites because they don’t have enough good IBLs is abominable, and everyone who it has happened to has every right to shout about it. People who haven’t experienced it, such as yourself, are better staying out of it if all they can say is, “it’s mostly the sites’ fault”, and ignore what everyone else is saying.

    I suggest you take your own advice – stand back and be objective. It’s good advice for everyone.

  563. Phil said:

    “intentionally not indexing sites because they don’t have enough good IBLs”

    And here we have the biggest issue of Bigdaddy, between that and the supplemental index problem. Nor can one claim it’s to provide better value, because there are still loads of spam and scraper sites listed.

  564. Agree with Philc, intentionally not indexing sites because they don’t have enough good IBLs is abominable.

    There are a lot of quality sites in terms of original information that will never get indexed by Google. What should the web author do – write more quality content, or spend all his time asking directories with no content to provide links?

  565. And when they do, there’s only ever 1 or 2 listings!

    The site has over 21 thousand listings (and the number was climbing nicely), and many categories have plenty of listings. Isn’t that enough to make it a useful resource? Try looking before you judge, Adam.

    So we’ve had a real example of a non-spammy site that’s had almost all of its pages dropped. Do you still say it’s an error with the sites, Adam, or is that site merely the exception that proves the rule?

  566. Could someone explain to me what the supplemental index is?

  567. Does anyone know where you can inform Google of scraper/spam sites?

    There are so many of them, but if everyone informs Google of them, then Google can remove them if they don’t meet their guidelines!

  568. The Supplemental index is part of Google’s Auxiliary index, and is used to store pages that they don’t particularly want to list in the results. They resort to getting results from the Supplemental index when they can’t get enough from the regular index.

    For us, it is an index where we definitely don’t want our pages to be. It used to be the case that whenever I saw supplemental results in the serps, they were usually pages that were obviously no good to anyone. But now Google is putting very large numbers of perfectly good pages in Supplemental.

  569. The site has over 21 thousand listings (and the number was climbing nicely), and many categories have plenty of listings. Isn’t that enough to make it a useful resource? Try looking before you judge, Adam.

    So we’ve had a real example of a non-spammy site that’s had almost all of its pages dropped. Do you still say it’s an error with the sites, Adam, or is that site merely the exception that proves the rule?

    Tell me something, Phil. If you were caught by the cops with a bloody knife handle in your hand with the blade stuck square in the back of a fresh corpse, you’d still deny murdering the person, wouldn’t you?

    No, we do not have a real example of a non-spammy site. You’ve even admitted that you’ve done something that could be interpreted as spammy since it was done solely for Google. It’s not exactly a squeaky-clean site.

    Seriously, you need to stop making statements that aren’t conclusively true like the one you just made. All they do is confuse people and get them riled up unnecessarily.

    As far as the 21,000-some-odd listings go, I looked in at least a dozen or so cities and regions. I saw exactly ONE with more than one listing (7 to be precise). So I did look and couldn’t find them…and perhaps your little SEO trick is part of the reason why.

  570. I’m confused as to what could possibly be wrong with the site Adam posted? I took a look and didn’t see anything spammy about it, or any sign that it was using cloaking or other black-hat techniques.

  571. Dave (Original)

    Oh great. I see Phil & Robert have quickly descended into childish personal attacks. Grow up, guys… pleeeeease!

    Phil, you simply post with nothing more than a personal attack each time you respond. Then, you have the gall to say I’m breaking Matt’s posting rules. LOL! Heard the expression “the pot calling the kettle black”?

    You continually avoid responding to what I write, so any “debate”, as you call it, would be pointless.

  572. Dave (Original)

    RE: “Just because it is not happening to you, doesn’t mean that it is not happening”

    Show me where I said it wasn’t happening???

  573. Dave (Original)

    Phil, were you really NOT joking about
    forthegoodtimes.co.uk
    ???

    If so, my friend, you have even more gall trying to tell me that I know nothing. Honestly Phil, why would Google want to list page after page of nothing more than links?

    BTW, I would not participate in ANY black hat forum ever! Matt’s blog is fine if you can refrain from the personal attacks and stick to the topic. Oh, you would need to read my post as well 🙂

  574. Dave (Original)

    What I’m really struggling to understand is that when Matt posts an example and CLEARLY states the reason why a site may not have all pages indexed, nobody who is moaning bothers to take heed. Matt has already stated that BD is more comprehensive, yet all those that are not happy with Google still base the whole of the WWW on their own minute area. Should I believe Matt (who sees the WHOLE picture) or those that land here with their own biased agendas (rhetorical).

    Expand your minds and realize there IS a bigger picture than what we read in blogs, forums etc.

    If one was to land on Earth from another Planet and view a racial hate site, they might conclude “these humans are mostly racial”. At least they are not from the same Planet and so have an excuse 🙂

  575. Dave (Original)

    RE: “these humans are mostly racial” should be “these humans are mostly racist”

  576. Of course my site isn’t perfect. Please show me where in the Google guidelines it says your site must be perfect to be indexed.

  577. Dave (Original)

    Nancy, Adam never stated a site must be perfect for it to be indexed fully.

    However, Matt HAS stated the likely reason some sites are not fully indexed. You can ignore that if you like, but then don’t complain that your site is not fully indexed.

    Also, this IS what Google state:

    “Google supports frames to the extent that it can. Frames can cause problems for search engines because they don’t correspond to the conceptual model of the web”

    Also, isn’t your site just a copy of what is already on many other sites?

    Please don’t shoot the messenger 🙂

  578. Dave said:

    Also, isn’t your site just a copy of what is already on many other sites?

    Your point? The same thing could be said of Google. It looks like Yahoo, MSN and every other search engine. Most sites are similar to other sites of the same type; that’s what we call viewer association. And regarding Matt stating the reason for sites not being indexed, there is nowhere in the thread where he has said anything that fits all the sites having trouble with getting new and old pages indexed and up to date.

  579. Dave, you’ve got to express your point of view a little more clearly, bro. I know what you were referring to, but others may not.

    Dave: you’re also slightly misinformed as to the nature of the “frame”. Nancy actually used a scrolling div with fixed width and height for the content, and while that’s not very common, it works in her case. So she wouldn’t get burned for that…as far as big G can see, it’s one page.

    I saw Nancy’s site. And I looked at it quite closely. And yeah, it’s a pretty cool site…although it’s an incontrovertible fact that the Eagles are ranked as follows:

    1) Don Henley (anyone who has the Actual Miles CD knows what I’m talkin’ ’bout. Nice solo career, and he performed with Jimmy live once.)
    2) Crazy Joe Walsh (Funk #49…kickass guitar song)
    3) Randy Meisner (I don’t like guys who sing with high voices normally, but he pulled it off)
    4) Timothy B. Schmit (see Randy Meisner)
    5) Glenn Frey

    Anyway, enough about my personal tastes in music.

    As Dave pointed out (sort of), there are a few problems here:

    1) The lyrics section. Don’t get me wrong, it’s a great feature to have, but how would your lyrics be any better than the lyrics on, say, lyricsdownload.com or song365.com? By providing that feature, you’re basically providing content that is non-unique, and therefore you’ll be competing against other, much larger sites that have the same type of thing going for them.

    Again, when you answer my question (if you choose to), step back and view it from the standpoint of the average user of your site. Someone with no bias whatsoever toward it. Someone who may have just stumbled upon it somewhere. How are your lyrics any better, or how do they hold any more importance, than any other lyrics site?

    2) The other problem that you have relates again to your IBLs, which is what started all of this.

    Now, I’m going to reveal something that I haven’t revealed before…I’m actually a “victim” of the Supplemental Hell people refer to.

    Before everyone gets out of shape and takes that comment the wrong way, understand that I’m fully aware of why these particular pages are in the supplementals, and they certainly deserve to be there. They’re pages that I created to help someone else try to prove or disprove a conclusion that they had come to that simply wasn’t true at the time, and once that was done, these pages have had, and continue to have, no other use whatsoever. In other words, as Google has quite rightly determined, they’re garbage pages.

    http://www.google.com/search?sourceid=navclient&ie=UTF-8&rls=GGLG,GGLG:2006-19,GGLG:en&q=Rontoronto

    Mine are the adamwebdesign.ca pages. They’re just a couple of links floating in outer space with no real reason for existence anymore.

    Now…how is this relevant to you, Nancy?

    Consider this: what if I were a complete and total asshole to you? What if I were to take that link, the existence of which is known to Google, and use it to copy/paste the entire contents of your site to mine? In other words, rontoronto3.html becomes the index page, all the rest of it becomes the other pages of your site.

    My site has no real IBLs (well, one but it’s also supplemental). Yours has no IBLs. Which becomes more useful? Which is the actual true resource that is being promoted?

    The potential scenario that I listed could play itself out, and it certainly wouldn’t be your fault if I chose to do that (don’t worry though…I’ve got no intention of it).

    The same scenario applies with sites that are in production vs. live and there are probably other examples I can’t think of.

    What other measure, besides IBLs, would Google have to determine who was there first?

    You may also want to consider that more IBLs would lead to more traffic that has nothing to do with search engines. Nancy, you give people some link love there…don’t you think you’re entitled to a little of that yourself? Why shouldn’t you go and ask someone for a link to your site if it’s topical and relevant? You’ve earned it…why be shy about it?

    Even ignoring the SEs, getting relevant one-way links is still a smart thing to do, as it not only gives you more traffic over time, but also gives you an idea of how well your site is doing. When you start getting relevant links, you know you’re on the right track.

    In other words, you’re making a mistake of sorts by not actively seeking links. It’s certainly not as egregious as most of the mistakes that are made. But it is, while a minor error, affecting you in more ways than one and the sooner you deal with it, the better off you’re going to be.

    Out of all the people who have raised an issue, you’re about the only one that I’ve seen that could make a legitimate claim that it deserves to be there. But you need to distinguish yourself from the millions of other voices making the same claim for pages that don’t.

  580. Dave (Original)

    My point on that one question is: why would a SE want to list page after page of the exact same text? E.g. song lyrics.

    RE: “And regarding Matt stating the reason for sites not being indexed, there is nowhere in the thread where he has said anything that fits all the sites having trouble with getting new and old pages indexed and up to date.”

    Did someone say he did?

  581. One other note:

    The friend of mine that I mentioned is not the same Rontoronto that turns up in all the gay classified ads. But if that’s your game, you do what you do, and I’ll be a lesbian, and we’ll all live happily ever after in a nice non-judgemental way, okay kids?

  582. Hey Dave,

    Look again. I do not have frames. I use .css and have created a template that mimics frames. Laypeople won’t recognize the difference, but I assume you are a webmaster yourself, and therefore I’m a bit surprised that you didn’t. At least your suggestion that my “frames” are interfering with my indexing gave me a bit of amusement in this otherwise frustrating exchange.
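
    (For anyone wondering what a CSS frame mimic looks like, the general technique is a single page with a fixed-size scrolling region – a minimal sketch, with the class name and dimensions purely illustrative rather than my actual markup:

    <div class="content-pane">… page content here …</div>

    .content-pane {
      width: 600px;    /* fixed width, like a frame */
      height: 400px;   /* fixed height, like a frame */
      overflow: auto;  /* scrollbars only when the content overflows */
    }

    Unlike a real frameset, it is all one HTML document, so a crawler sees the navigation and the content together.)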

    It also seems, like Googlebot, you have not crawled my site with much depth. I don’t really wish to get into a discussion of my site as this is a generalized problem and not particular to me. I also certainly hold no illusions that my site is this magnificent, dazzling star of the web. It is, however, more than duplicate content. I will take the time to give you one example. Buried deep within, where Google no longer crawls, I have the lyrics to unreleased songs only played live once. You won’t find those on lyrics.com or any other fansite. I transcribed them myself from bootlegs. I hardly expect you to know this, but perhaps you should reconsider being so quick to judgment. I also think it is rather hypocritical to condemn me unless your site is flawless and has 100% unique content. Can you honestly say that’s the case? If so, please link me to this glorious site so that I can gaze at it in awe and admiration.

    I never claimed that my site is the MOST AWESOME SITE EVAH!!111!!! I did, however, say it was honest, non-spammy, and offered some unique content. I stand by that. But this isn’t about my little site. It’s about a much larger issue, one which has already been outlined multiple times in this hideously long series of blog replies. I have no wish to waste time rehashing. Indeed the only reason I am continuing to post at this point is the fact that I have been directly addressed and criticized.

    In fact, since this is Cutts’ blog and not a message board, I am surprised by all the back-and-forth I see you and other individuals engaging in. I doubt Cutts’ intent was to provide a forum for such unpleasantness. I assumed (perhaps wrongly) that this blog was for the purposes of encouraging discussion between webmasters and a Google representative. I see little point in debating a matter with fellow webmasters who have no more clues about the algo than I do.

  583. Adam, I appreciate your effort to defuse the situation with some humor by your incontrovertibly flawed Eagles ranking. 😉

    Regarding IBLs – I actually have a few, although for some reason they do not show up when you do a Google search for them. However, astoundingly enough, there is only one active Eagles site and ZERO active Glenn Frey sites. I’ve gotten a couple other links from generalized music sites but I don’t see how to NATURALLY get others.

    I didn’t come here to talk about my personal site issues though – I really don’t think it’s about me, although some are convinced I only came here to whine. Indeed, I’m hardly the worst off – my site is a hobby, not my livelihood. It’s not the end of the world if it doesn’t get a lot of hits. It is, however, indicative of a larger problem that needs to be addressed.

  584. Again, the fact that pages have similarities should affect ranking, but has nothing to do with pages being indexed. You mentioned lyrics, and why one should be better than the other is beyond me. Both are useful sites if a person is looking for a lyrics site. This is not a “there can be only one site” day and age. The value of a page is dependent on what is on the page, nothing more and nothing less. It is only a matter of: did the visitor find what they were looking for?

  585. Dave (Original)

    RE: “Look again. I do not have frames. I use .css and have created a template that mimics frames. Laypeople won’t recognize the difference, but I assume you are a webmaster yourself, and therefore I’m a bit surprised that you didn’t”

    Yes I am a Webmaster. However I only took a brief look at your site and pointed out what was staring at me.

    RE: ” I hardly expect you to know this, but perhaps you should reconsider being so quick to judgment. I also think it is rather hypocritical to condemn me unless your site is flawless and has 100% unique content. Can you honestly say that’s the case? If so, please link me to this glorious site so that I can gaze at it in awe and admiration.”

    Despite me asking nicely, you are shooting the messenger here. I was only trying to help you. However, I can say my site is 90% unique. I too have many pages not indexed, but I’m not going to judge the whole of the Web on such a limited view. A frame of mind I think you, and others, should adopt.

    RE:”Indeed the only reason I am continuing to post at this point is the fact that I have been directly addressed and criticized”

    A bit of friendly advice you might want to take heed of. Some of your snide comments should be left out, especially when replying to people trying to help you.

    Jack, sorry, I missed this before: “Also, isn’t your site just a copy of what is already on many other sites?”

    No.

  586. Dave (Original)

    RE: “Again, the fact that pages have similarities should affect ranking, but has nothing to do with pages being indexed. You mentioned lyrics, and why one should be better than the other is beyond me. Both are useful sites if a person is looking for a lyrics site. This is not a ‘there can be only one site’ day and age.”

    You say similar, I say exact text. Look at it from the perspective of the SE business: indexing the WWW is expensive and time-consuming. It seems pretty clear to me that Google must create criteria for what it indexes. As yet, it cannot index every single page out there. I bet they are working on it, though!

    RE: “The value of a page is dependent on what is on the page, nothing more and nothing less”

    YES! Now you are getting it. In Google’s case, Google is the one to decide what it can, or wants to, include. Or do you think someone should make an uninformed choice for them?

  587. “Yes I am a Webmaster. However I only took a brief look at your site and pointed out what was staring at me.”

    I assumed your look was brief. That was the problem. As your mistake demonstrates, such hasty judgments tend to be flawed.

    “Despite me asking nicely, you are shooting the messenger here. I was only trying to help you. However, I can say my site is 90% unique. I too have many pages not indexed, but I’m not going to judge the whole of the Web on such a limited view. A frame of mind I think you, and others, should adopt.”

    I am disappointed you did not link to this site, then. It sounds impressive. Do you want to hazard a guess as to why so many pages are not indexed, then? Do you use frames, perhaps, as that was your first thought regarding my site? Do you have problems with IBLs? If not, it seems to me your site is an excellent example of the algo gone wrong and helps make the case for those arguing it is unfair.

    As for the second half of this, forgive me if I find your comments help make Google shine more than they benefit me. Your suggestion that I remove lyrics from a fansite about a musician would be counterproductive to visitors who expect such things. As I have said before, I make the site for visitors and not to please SEs. I still fail to see how that is so terrible that I should be de-indexed for it, while other sites who do the exact same thing are not.

    Also, you might find it surprising that one of the few pages that is indexed by Google is the lyrics for one of his hit songs.

    “A bit of friendly advice you might want to take heed of. Some of your snide comments should be left out, especially when replying to people trying to help you.”

    While I’m not sure I buy the second half of this statement, I can see the validity of the first half, so that I will grant you.

  588. Matt –

    I read your posting guidelines and see that you won’t answer site-specific questions. No problem at all. I have more of a search-specific question. I am my own webmaster and SEO guy for my wedding videography business here in Charleston, SC. My four keyword phrases are not much to the world of Google, but everything to the world of my family of four.

    Consistently I am ranked at MSN and Yahoo from 1 – 5 and have avoided any links that would be “outside” of my field. I have been #1 at Google several times, alas for only a day, then I will drop like a rock 26-100 spots the next day.

    With the other search engines the moves are slight, one spot, two spots. With Google they are extreme.

    Now, I say these things to you but I don’t need site-specific answers. I haven’t been ranked at Google long enough to see any impact on my business so I am not asking how you can help me. I am asking because when I search Google in my genre, I get VERY poor results for what I am looking for. There is a company selling jewelry ranking in front of me. There are people in Chicago ranking in front of me.

    All this to say that it makes me wonder about the other results I get from Google outside of my genre. I mean, I know my business very well, and if I see wacky results coming from Google for something as specific as Charleston Wedding Videographer, what kind of results will you give when I search for Denver Real Estate? Will I have to weed through a jewelry store or find some city in New Jersey?

    Bill

  589. Carping on at each other does not address the underlying problem, and I guess it is not what this blog was set up for in the first place. We should be using it to try to find out what is really happening.

    There would appear to be a lot of personal bickering instead of an attempt to pool some clever minds to address a strange situation rationally and logically.

    There are pages being lost from websites where, in all conscience, it would appear, at least on the surface, that these are not a result of SEO attempts at “being clever” in whatever shape or form anyone cares to mention.

    Sites where the webmaster has no appreciation or inkling of SEO or any of its techniques are being hit hard and losing valid unique content pages in a quite bewildering manner.

    This is obviously not in harmony with Google’s guidelines to webmasters, in that people have set up sites with their own quirky hobbies or completely individual interests on there, producing their own content quite oblivious to the outside world, and it has quite happily indexed up and sat with, say, 200 pages out of 200 on Google – not necessarily first page, tenth page or twentieth page, but ALL on Google.

    Then all of a sudden this plummets to 150, then 100, then 80, then 20, and finally one index page.

    This cannot possibly square with Google’s webmaster guidelines, to my mind.

    Then this thread purports that this is because of a lack of inbound links, as I understand it. And that is where we are now.

    If that is the truth, then Google have sold out/stitched up all of the people out there who stuck to what they were told to do – write nice individual websites, don’t do SEO-type things, content is king – hence pass Go, collect 200 pounds, enter heaven.

    I can’t imagine for a moment Google would renege on its founding principles in such a draconian manner – I think they have more entrenched principles than that – now either I am right there or I am not.

    If not, inbound links are necessary to offset this blip. End of story.

    If so, then what are we to deduce? Primarily that another problem is in operation that we are not aware of at present.

    Google may well have made a mistake and rolled it out.
    They may have truly temporarily run low on space.
    They may be spring cleaning the index.

    What puzzles me is: if you have taken sites that had umpteen pages and pared them down, why RETAIN the index page, or in some cases TWO pages?

    What earthly use is it to someone who had tens/hundreds/thousands of pages to have… one index page left?

    Google are no fools; they know this too. It makes more sense, in essence, if you had multiple pages, to be zapped altogether if you are perceived, even wrongly, to be in conflict with one of their “laws”, not to retain one or two pages out of many.

    Generally speaking, indexes have a start record and an end record; each record in between points forward to its successor, which points back at it, and the first and last records have zeroised backward and forward pointers respectively – in a generalised description. So it is easy to strip out the middle ones as long as you retain the first and last, or even just the first. Then a rebuild has a valid starting point of reference to rebuild from.
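
    (Roughly this sort of structure, in a Python-style sketch – purely illustrative, of course, since nobody outside Google knows how their index is actually stored:

    class IndexRecord:
        def __init__(self, url):
            self.url = url
            self.prev = None  # zeroised backward pointer on the first record
            self.next = None  # zeroised forward pointer on the last record

    def strip_middle(first, last):
        # Drop every record between the two anchors by re-linking them
        # directly to each other; the first record survives as a valid
        # starting point from which the chain can later be rebuilt.
        first.next = last
        last.prev = first

    Keep the first record – the index page – and you can always rebuild the chain from it later.)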

    This is why I postulated that they retained either just the index page or the two pages – it makes sense if you intend to rebuild the whole string eventually, and it saves space in the interim period.

    Otherwise, why else retain the index page of a website where you may have zapped 10,997 other records?

    We should be pooling our minds together – in the absence of any Google input during Matt’s absence, and given Adam’s declaration of no input – to try to understand what may be going on, instead of trying to score points or engaging in personal exchanges.

    We are all here because we are puzzled and have problems.

    We can sort them out much better if we act as one community trying to unravel something a bit bizarre than by bickering amongst ourselves. We are each capable of contributing a piece of the jigsaw, which, when viewed, may trigger other ideas off each other; then we might move forward together and understand better what we are facing and why we are facing it. For I don’t personally think the inbound links scenario fully explains what is out there, why it is out there, and what to do to circumvent what is out there.

    Personally, I think it is true that there are perfectly valid pages disappearing daily, and they have been disappearing daily since March 8th-ish.

    Personally, I think these will reappear just as they were before, as I think they are rebuilding the index on the back of the single index pages retained. However, I may well be wrong; what we should in any event be doing is analysing what is occurring, putting forward ideas as to why it might be occurring, and, if we find instances of places where it is NOT occurring, trying to find out what makes those inviolate.

    That involves working together as a whole, not fragmented, opposed and dissembling.

    Picking holes in each other’s comments does not advance the common cause.

  590. I am not so well versed in all of this; however, I do read and learn a bit each time. I don’t participate in link farming, but I do have a link directory which we manually maintain by request. I don’t actually know if my site ranking has been hurt by my links directory or something else. In December I took a big drop in ranking for a while. I’ve slowly climbed back, but my ranking seems to fluctuate constantly.

  591. Again, the fact that pages have similarities should affect ranking, but has nothing to do with pages being indexed. You mentioned lyrics, and why one should be better than the other is beyond me. Both are useful sites if a person is looking for a lyrics site. This is not a “there can be only one site” day and age. The value of a page is dependent on what is on the page, nothing more and nothing less. It is only a matter of: did the visitor find what they were looking for?

    Actually, it might. If it didn’t, consider the potential example above where I was a jerk and stole Nancy’s entire site. Word for word, graphic for graphic, every little minor detail.

    The site itself has some merit and value…but I stole it. If it gets indexed, there’s the possibility that it gets ranked for something at Nancy’s expense. Is that fair to Nancy? She put in the work, and she just got hosed by yours truly.

    That’s where the IBLs come into play. They are a factor in establishing who created what first and which is the better resource. If Nancy has 100 IBLs and I have 1, it’s a pretty safe bet that either Nancy created hers first or that others find hers better for whatever reason.

    And as mentioned before, there are other factors where a page shouldn’t necessarily be indexed and that the lack of IBLs indicates that (pages under construction, private content, production servers, copy and paste jobs, doorway pages, scraper sites, abandoned sites, pages created for testing purposes, etc.)

    Does that mean legitimate websites suffer? Yeah, some do. But the number of those sites that are a legitimate and useful resource to other people is significantly smaller than the number in one of the categories listed above.

    Google really is, to a certain extent, screwed either way. If they list everything, people complain about the spam that ranks. If they list only the stuff that wants to be there, people complain because their sites aren’t there.

    And what other measure does Google have, besides the IBLs? It needs something, or the search engine ends up full of crap that could potentially rank. As it is, supplementals can still rank for oddball stuff (I showed an example of that.)

    The point is, as I made it before, Google needs some way of telling which sites even want to be there in the first place. As of right now, IBLs provide the only such measure of indicating whether a site wants to be there. As I asked before (and no one answered), what else is there? (And “because it’s there” isn’t really an answer, because there are a lot of “because it’s there” scenarios that could potentially fill the index with crap.)

    Nancy, if you want to get some IBLs, this should get you started.

    http://www.directorycritic.com/index.php?page=free-directory-list&sort=date_added

    Yours is at least good enough for half of those directories. Submit to maybe 5-10 of those a day and you’ll be in decent shape as far as indexing goes. Ranking? Come on…gotta at least leave you some stuff to find out for yourself. 🙂

    As far as whether that’s natural, look at it this way…is going up to your friend and saying “look at my site” natural? Is putting an ad in a paper to promote your business natural? Marketing, in its purest sense, is a relatively unnatural thing. But it needs to be done. And if it’s not done, then yes, that’s a webmaster error too, just as it would be for any brick-and-mortar business that gets shut down because they didn’t market right either. It’s certainly not an egregious offense, but it’s still a mistake.

  592. Adam, I appreciate your effort to defuse the situation with some humor by your incontrovertibly flawed Eagles ranking. 😉

    Flawed my ass. Don Henley rocks. Gotta get your facts straight, darlin’. 🙂

    Seriously, if you get the chance, there’s a clip of him singing Volcano live with Jimmy Buffett (THE UNQUESTIONED GOD OF ALL MUSIC) in Boston you can download…it’s not on sale anywhere to the best of my knowledge, or I’d buy it and crank it.

  593. Tell me something, Phil. If you were caught by the cops with a bloody knife handle in your hand with the blade stuck square in the back of a fresh corpse, you’d still deny murdering the person, wouldn’t you?

    Yes I would – because I wouldn’t have done it.

    No, we do not have a real example of a non-spammy site. You’ve even admitted that you’ve done something that could be interpreted as spammy since it was done solely for Google. It’s not exactly a squeaky-clean site.

    I have admitted no such thing. If you think that those internal links are spam, then you really have no idea what you’re talking about. Do you really think I would have posted a spammy site here? Descriptive link text is now spam??? Get real, Adam. You just hate to admit that you are wrong. It’s just the same as when you fought tooth and nail to avoid answering a very simple question, because the truthful answer would be a negative one for Google.

    As a matter of fact, I had a look at Nancy’s site yesterday and I couldn’t see anything wrong with it at all. Would you like to have a look as well? Or would you rather not see another non-spammy site that has been hit?

    As far as the 21,000-some-odd listings go, I looked in at least a dozen or so cities and regions. I saw exactly ONE with more than one listing (7 to be precise). So I did look and couldn’t find them and perhaps your little SEO trick is part of the reason why.

    Go and look again – they are all there – over 21,000 of them – no tricks – just honest-to-goodness plain old listings. Try to make sense please – it does help.

  594. BTW, I would not participate in ANY black hat forum ever! Matt’s blog is fine if you can refrain from the personal attacks and stick to the topic.

    I said that a neutral place would be fine, Dave, but I knew you didn’t have the guts to debate properly.

    Adam
    I’ve just read your review of Nancy’s site – and you are wrong, as far as I can see. The site does have good IBLs (where did you look? It’s no good looking at Google for them). And the fact that many sites have the lyrics to Eagles songs on them is irrelevant. There is no reason that I can see, or that you have suggested, for that site’s pages to be dropped. The idea of which page from which site first displayed certain lyrics is pure nonsense. I can only think that all of your comments are based on your desire to believe that Google has got it right.

    Nancy said:

    It also seems, like Googlebot, you have not crawled my site with much depth

    Dave and Adam are very good at that, Nancy 😉

  595. A bit of a synopsis (because I’m bored right now)

    These are the reasons that Adam came up with as to why the two sites that he looked at might have had their pages dumped from Google’s index:

    Site #1 (mine)

    1) That site has scraper site stink written all over it. The site is a perfectly ordinary niche directory that looks nothing like a scraper site (see item #6).

    2) Users need to click 3 times to get to what they are looking for. Yes they do – just like any directory, such as Yahoo! and DMOZ.

    3) Some internal links are ugly. Yes they are, but they are not spam.

    4) Those links were done that way solely for Google. Yes they were (and for the other engines), but they are not spam, and Google encourages us to do things solely for Google.

    5) 21,360 listings in the site are not enough, so the site’s pages must be dropped.

    6) Adam couldn’t find all the listings, so it must be due to seo trickery. Everyone else can find them if they look, because they are all there in plain sight. Perhaps it would be better for Adam if I put a list of all 21,360 of them just one link away from the homepage.

    Site #2 (Nancy’s)

    1) The site contains song lyrics, as do many other sites. Therefore the site deserves to be dumped, because other sites are “much larger”.

    2) The site shouldn’t contain lyrics because they are no better, and no more important, than other sites that contain them.

    3) The site doesn’t have any IBLs. Yes it does – Adam didn’t know where to look.

    4) Nancy made a “mistake” and a “minor error” “by not actively seeking links”. That’s not a mistake or an error – it’s natural and normal. Besides, the site does have IBLs. Saying that it doesn’t, doesn’t change that.

    5) To be in the index, Nancy needs to distinguish herself [site] from other voices [sites]. Isn’t a good, useful, non-spammy site satisfactory any more? Apparently not as far as Adam is concerned.

    My conclusion

    Adam is prone to inventing things when he can’t find what he’s looking for – presumably so that he doesn’t have to admit that Google might have got it wrong. This thread needs some more sensible posts to have any value at all at this end of it.

    To be fair, an increase of on-topic one-way IBLs might actually do the trick for both sites, so Adam wasn’t totally wrong about Nancy’s site, even though he didn’t realise that the site does have some good IBLs. I’d suggest that Nancy tries to get some more.

    Personally, I’ve no intention of doing it for my site, and I don’t see why anybody should have to. I can understand what Google is trying to achieve, and I don’t disagree with their goal, but it’s no excuse for dropping the pages of a great many good and useful sites. They should find other ways of achieving their goals.

    This isn’t about a change in the algo, where rankings get bumped about – it’s a fundamental change in the basic indexing of sites, and imo, it is dead wrong.

  596. Hello;

    I have a question regarding my site. I’ll give you a bit of history: the main site is over 5 years old, and I had 3 sister sites, all with similar information, plus 1 site that was an ecommerce site (subdomain). First I shut down 2 sites and 301’d them to my main site. One site expired and was recently purchased by a spam site, but it is no longer affiliated with us. The subdomain, which was an osCommerce site, I 301’d to http://www.site.com/catalog.

    The new site was being indexed, and all of a sudden it dropped from 1000s of pages to 300, with only two pages from the /catalog directory. And a lot of old documents are in that index. Am I being penalized?

    Not sure if I can do this: my site is http://www.siservices.com and the estore is at /catalog. Should I shut down the subdomain at shop.siservices.com, which is currently being 301’d to /catalog?
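
    (For anyone wondering how such a 301 is typically set up – this is a generic Apache mod_rewrite sketch, not necessarily my exact configuration:

    # .htaccess served for shop.siservices.com
    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^shop\.siservices\.com$ [NC]
    RewriteRule ^(.*)$ http://www.siservices.com/catalog/$1 [R=301,L]

    The R=301 flag is what makes it a permanent redirect instead of the default temporary 302.)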

    Anyhow, great post.

    Ivan
    SEOWannaBIE.

  597. I would agree it is a fundamental change in the basic indexing of sites, and wonder whether that was wholly intentional, or a side effect now regretted but hard to erase or rectify.

  598. Google’s job is to index every site out there. That is what a search engine does. Even if a site has pages in the supplemental index, the latest copy should be recent, not a year old.

    As for the lyrics discussion, none of the sites are any more valuable than the other. Go ahead and rank the original higher but all the sites deserve to be indexed.

  599. You’re right, it’s indexing not algo. At this point I care little for rankings, I’m just hoping to get a decent crawl so that if someone does a search to find unique content such as my listing of the members of Frey’s touring band in 1985 or the lyrics to the two songs Frey wrote when he was 18 that didn’t get released beyond a local Detroit label, they would get a result instead of “Your search did not match any documents.”

    Adam, I have nothing against Don Henley even though I prefer Glenn Frey, but putting Frey behind Randy Meisner (who left the band 30 years ago – name one thing he’s done since) and Timothy B. Schmit (who has sung a grand total of TWO Eagles songs in his 25-year tenure with the band) – come on! Madness!!!

    I will attempt to get more inbound links, though, after all of this – even though I feel a little like it’s admitting defeat regarding the principle of worth and not IBLs determining indexing.

  600. I will attempt to get more inbound links, though, after all of this – even though I feel a little like it’s admitting defeat regarding the principle of worth and not IBLs determining indexing.

    You are not wrong, Nancy – it *is* admitting defeat. But there’s nothing wrong with that in cases like this. A few sites standing up for what’s right, and refusing to bow to what Google wants, won’t make a scrap of difference to anything. It would just mean that those sites are left out in the cold.

  601. It is very simple: first of all, Google needs to admit that they have an indexing problem.
    Our website (distribution of industrial components) has been around since 1996; two weeks ago Google had 20 pages in their index, and now just 6 pages are indexed.
    Furthermore, our 21 shopping cart pages (hosted by a third-party shopping cart provider) disappeared completely from Google’s index around May 19th. Googlebot has only shown up on a few of the shopping cart pages once since May 19th; prior to May 19th, Googlebot used to visit every shopping cart page as often as once a day.

  602. Seems like you’re right, Phil. I still think this fundamental change is also harmful to Google. Right now, many people believe that if Google can’t find it, it’s not on the web. While the public is slow to catch on to changes, eventually, it will have a detrimental effect on Google when people continually are disappointed in their search results and must turn elsewhere to get results of substance.

  603. Right now, many people believe that if Google can’t find it, it’s not on the web

    That’s probably true, and it’s a shame. No search engine should be as influential as that.

    Adam linked to what I assume is a good resource for getting one-way IBLs. You might also want to have a look at the 3 “sticky” threads at: http://www.webworkshop.net/seoforum/viewforum.php?f=20

    If you haven’t already done it, you should submit to DMOZ and to Yahoo! (I think that Y! still adds non-commercial sites for free). But be especially careful to find the right categories.

  604. Phil, you’re in the wrong line of work. You should be manufacturing cologne. I have truly never seen anyone better at making a complete pile of garbage smell like a rose. I’ve never seen anything I’ve said taken so far out of context in my life.

    The only reason I would dignify your unique brand of insanity with an answer is not because it deserves one, but because it can and will mislead other people.

    I’m just surprised you haven’t invited the rest of your Rat Pack buddies in here yet. You’re already making a complete ass out of yourself…might as well make it a party.

    But I’m going to break down each of your “points”, and I use that term very loosely, one by one.

    1) That site has scraper site stink written all over it. The site is a perfectly ordinary niche directory that looks nothing like a scraper site (see item #6).

    Pantload #1.

    We have listings for these alternative types of accommodation in Holywell:

    Bed and Breakfast in Holywell
    Campsites in Holywell
    Caravan Parks in Holywell
    Farm Holidays in Holywell
    Guest Houses in Holywell
    Holiday Cottages in Holywell
    Holiday Parks in Holywell
    Hostels in Holywell
    Hotels in Holywell
    Self Catering in Holywell

    You have no listings for 6 out of the 10 of these. Yes, that’s only one example, but it doesn’t take much to find another 100 or more just like it. I’m only posting one, because one is really all that’s necessary.

    Why do you need to repeat “Holywell” so often? Because there’s only one listing in there, and because users looking for “Holywell Bed and Breakfasts” or whatever the case may be would find that you have next to nothing there?

    What about the integrity of your site? What about it being such a great resource that such a thing wouldn’t be necessary? What about the 21,360 listings that are allegedly there, and, if they are there, are probably either duplicates, dead listings, or buried so far that no one could find them?

    2) Users need to click 3 times to get to what they are looking for. Yes they do – just like any directory, such as Yahoo! and DMOZ.

    Pantload #2.

    I see the links I posted above have been conveniently ignored.

    And “any directory”, eh?

    Let’s take a look at some.

    http://www.hedir.com … hmmm… I see top-level category links. Does anyone else see them? Yeah, pretty obviously there.

    http://www.joeant.com … looks like there are top-level listings there, too.

    I’m not going to reel off the other 1,000s of sites that do the same thing, although those of you who saw the directorycritic.com link I posted above will be able to find them and peruse them yourselves and see just how incredibly stupid this statement is.

    3) Some internal links are ugly. Yes they are, but they are not spam.

    Pantload #1a (see above.)

    4) Those links were done that way solely for Google. Yes they were (and for the other engines), but they are not spam, and Google encourages us to do things solely for Google.

    Since I suspect you’re referring to rel=”nofollow”, yes, there are things Google encourages webmasters to do that also happen to benefit Google. However, there has been absolutely nothing that Google has suggested strictly for its own benefit.

    The other issue here is the idea that Google is this monopolistic beast that somehow controls webmaster content and promotional activities and forces them to do things just to comply with its rules. Nothing could be further from the truth. Google can’t gain FTP access to your site and go messing around with the content. Google can’t control how you choose to structure your site and its internal links. That’s your choice, Phil, and your responsibility.

    It’s choice and consequence. If you do A, X behaviour occurs. If you don’t do A, Y behaviour occurs. You can go either way, but it’s still your choice. Just like it’s my choice, and just like it’s Nancy’s choice, and just like it’s Dave’s choice, and just like it’s Jack’s choice, and just like it’s everyone else’s choice that has ever built a web page. So stop trying to blame other people for your decisions.

    Yes, you did something solely for Google…and you cheated the end user in the process. If you really needed to keep those there, why not have a series of dropdowns for country, then region/city, then suburb/area/whatever England calls it (I don’t know, so I won’t ask)? If you were really concerned about your end user’s experience, this would have been a complete no-brainer, and about an hour’s worth of implementation time.
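
    (Something as simple as this would do it – sketched in plain HTML with made-up names, obviously not your actual markup:

    <select name="country" onchange="loadRegions(this.value)">
      <option value="england">England</option>
      <option value="wales">Wales</option>
    </select>
    <select name="region" id="region">
      <option>Choose a country first…</option>
    </select>

    where loadRegions() is a few lines of JavaScript that fill in the second list, and a third dropdown narrows it further. The user picks three values instead of wading through pages of links.)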

    And here is one of the biggest issues I have with you and the things you do and say, Phil. You really don’t give a damn about anyone else, as long as it fits what you happen to be doing. There’s nothing wrong with looking out for number one, but there is when it’s an exclusive focus.

    5) 21,360 listings in the site are not enough, so the site’s pages must be dropped.

    I’m still waiting to be shown where these are. Even in large cities such as London, the categories I saw had maybe 60-70 listings. (By the way, your search doesn’t work for London…try typing it in, then clicking on the London, London link.)

    And even then, so what? I don’t know if they’re accurate. I don’t know if your site is a credible source. I don’t know if there are dead listings among the bunch (as there would be of a site that size). I don’t know if other tourist-type sites like your site. And the site is a mess from a navigation point of view.

    6) Adam couldn’t find all the listings, so it must be due to seo trickery. Everyone else can find them if they look, because they are all there in plain sight. Perhaps it would be better for Adam if I put a list of all 21,360 of them just one link away from the homepage.

    This doesn’t even make enough sense to be called garbage. What does SEO trickery have to do with me not finding the listings that you claim are there? That’s bad navigation from a poorly designed site at best. At worst, it’s just another outlandish claim from a webmaster.

    1) The site contains song lyrics, as do many other sites. Therefore the site deserves to be dumped, because other sites are “much larger”.

    2) The site shouldn’t contain lyrics because they are no better, and no more important, than other sites that contain them.

    Pantload #3.

    That isn’t what I said at all. I never said the site deserves to be dumped strictly because of lyrics…I merely said that the lack of unique content that the lyrics represent indicates the possibility that the content could have been stolen or duplicated without permission from another website for the sole purpose of ranking.

    Nancy, I know you didn’t do this…I’m just pointing out that it can be done, and others have done it (if you go way back into ancient history…like say the mid-to-late 90s…it was a pretty common occurrence). It’s also just as likely to happen to you as it did to someone else if measures aren’t in place to prevent it.

    3) The site doesn’t have any IBLs. Yes it does – Adam didn’t know where to look.

    I did, and I found a total of 4. And 4 is close enough to 0 that it’s still well within the realm of possibility that a site gained “accidental” links or links from sources as part of a network link farm (although in this case, also not true). So for all practical intents and purposes, I said 0 because it’s less than the 6 Matt mentioned in the very beginning (and no, don’t go on about how wrong that is because we’ve all heard you ramble on like a drunken sailor about how evil that is and we get it).

    5) To be in the index, Nancy needs to distinguish herself [site] from other voices [sites]. Isn’t a good, useful, non-spammy site satisfactory any more? Apparently not as far as Adam is concerned.

    And with that statement, you’re intentionally making the same mistake most other people make inadvertently: the site’s good, let’s list it. And Nancy, the site is good. I like your site, and I’d actually suggest submitting it to the first directory I listed (hedir.com). As a mod there (although I haven’t been able to practice there in over a month), I’d have no problems giving it a positive vote and I think it would be a solid addition to the directory (feel free to reference this post if you like, too.)

    And this isn’t even about “as far as Adam is concerned”. The reasons have next to nothing whatsoever to do with my personal opinion of Nancy’s site. They have to do with reasons a site with no IBLs would have NOT to want to be indexed:

    1) Under construction.
    2) Private/intranet content.
    3) Confidential content.
    4) The ability for webmasters to be able to submit websites of competitors that are not yet complete and damage their credibility for potential customers.
    5) The number of pages of little to no actual content (e.g. “search results” from sites such as weatherstudio.com) that could be allowed in without having to gain backlinks.
    6) Doorway pages created for SEs specifically.

    As of right now, there is nothing outside of a manual investigation of Nancy’s site that distinguishes it from these types of things. And there are a buttload more of those types of sites than there are of Nancy’s. Does that mean Nancy gets affected by something that isn’t her fault? Yes, it does, and that’s unfortunate. But the fix isn’t an SE-specific fix either, and no harm can come from it.

  605. Adam, I have nothing against Don Henley even though I prefer Glenn Frey, but putting Frey behind Randy Meisner (who left the band 30 years ago – name one thing he’s done since) and Timothy B. Schmit (who has sung a grand total of TWO Eagles songs in his 25-year tenure with the band) – come on! Madness!!!

    Okay, I threw that last part in there to be a smartass. 🙂 Although I do like Henley better than Frey. Meisner…the first song I ever remember hearing on the radio as a kid was “Take it to the Limit”, and I guess that’s why I remember him.

    Timothy B. Schmit…I don’t really have an opinion one way or the other on him as an Eagle (although don’t get his solo album…it sucks.)

    I will attempt to get more inbound links, though, after all of this – even though I feel a little like it’s admitting defeat regarding the principle of worth and not IBLs determining indexing.

    Then don’t do it for Google. Change your perspective a little bit.

    Do it because you’ve built something worth linking to.
    Do it because you’re confident that you’ve built something that people would want to use.
    Do it because you want more traffic to your site from as many sources as reasonably possible.

    And most importantly…

    Do it because you deserve it.

    Then let the Google stuff straighten itself as it will.

    One thing I forgot to mention about the Directory Critic site…a lot of the free links are to reciprocal link directories (which isn’t indicated on the site)…if a directory requires a reciprocal link, it ain’t worth it.

  606. If Google didn’t exist how many people would go after links? None. Without Google my time is much better spent on creating more pages for people to shop from.

  607. I think Matt’s blog has been hijacked.

  608. If Google didn’t exist how many people would go after links? None. Without Google my time is much better spent on creating more pages for people to shop from.

    So with no search engines, and no links, how would you get traffic to your website so that people can actually buy the stuff?

    Without search engines, a good number of us still would “go after” links, since going after the right kinds of links still gives us some idea of how good our sites actually are, as well as providing us with that thing called visitors that we all want for our sites.

  609. Hi Matt,
    I hope you take the time to read and respond to my question. DCpages.com is one of the web sites that dropped off Google’s search results. It popped back to the first page at the beginning of May for a few days and then dropped to page 9. I set up an account for Sitemaps to run a diagnostic of what was wrong with DCpages. The funny thing is that it states that DCpages is positioned at No. 10 for the keywords “washington dc”, but that does not correspond with my own checking of our position – currently between pages 9 and 10. I also checked our extreme tracking stats and found the same results. The Google staff stated in an email that we are not being penalized, but won’t give me a straight answer. Can you please be straightforward with me and explain the problem? I hear that Google doesn’t value community portals any more – rather, blogs like this one are better search food. I would really appreciate your help. Thanks in advance.

  610. Dave (Original)

    Nancy, good luck with your site. Don’t you weaken to Google and do anything that would result in your site being fully indexed. Biting off your nose to spite your face is the way to go πŸ™‚

    Phil, your constant repetition of the word “guts” reminds me of my school days, when the local hormonal bully going through puberty and with an excess of testosterone would challenge someone to fight. Please grow up. Or are you of school age? That would explain a lot.

    I have already stated I will debate with you on the subject of this page, right here. You are the one who is not responding to my posts on the topic. You have also admitted to NOT reading my posts and then in the same breath said my posts aren’t worth your time.

    Let’s shake hands, make up and get on with the topic at hand. Please!

    Now, let’s try again with your example. I asked you why Google should list page after page of nothing but links. That is what many of your directory pages are.

    Phil said:
    RE: “A few sites standing up for what’s right, and refusing to bow to what Google wants, won’t make a scrap of difference to anything. It would just mean that those sites are left out in the cold.”

    Explain to me why Google is wrong in their endeavour to index the World’s information?

    Why do you assume they are fully able to do so at this point in time when none of the other SE’s can?

    Why should Google not decide their own fate by running their business in the way they see fit?

    Why, when you SELL links for a commission, are you so against Webmasters obtaining links as Matt has suggested so a site can be fully indexed?

    Do you truly believe you know enough about the inner workings of Google to be able to state that what they are currently doing is “wrong”?

    The only way your statements would make any sense IMO, is if Google was a pay for listing SE.

    Jack said:
    RE: “If Google didn’t exist how many people would go after links? None.”

    Same question as Adam, how would you get traffic without links??

    It’s rather funny you say this as all those who buy and trade links often use the defence “I was buying/trading links long before Google came along”.

  611. Adam. I’ll go through your points in order:-

    #1 Calling a site a scraper site doesn’t make it one. Do you know what a scraper site is? Do you know what a directory site is? What makes you think that it’s a scraper, and what makes you think that it’s not a directory? All you’ve done so far is call it a name. If directories are scrapers in your view, then… naaa – you just don’t know what scrapers are, do you?

    #2

    We have listings for these alternative types of accommodation in Holywell:

    Bed and Breakfast in Holywell
    Campsites in Holywell
    Caravan Parks in Holywell
    Farm Holidays in Holywell
    Guest Houses in Holywell
    Holiday Cottages in Holywell
    Holiday Parks in Holywell
    Hostels in Holywell
    Hotels in Holywell
    Self Catering in Holywell

    You have no listings for 6 out of the 10 of these. Yes, that’s only one example, but it doesn’t take much to find another 100 or more just like it. I’m only posting one, because one is really all that’s necessary.

    Adam. I’m going to teach you something a little basic about webpages, and I hope it helps. When words are clickable, they are links; when words are not clickable, they are not links. You can often tell by the different colours. In this case, you can tell the non-links by the fact that they are mid to light grey (they are greyed out) and the actual links are blue. If you look very carefully, you can see that 6 of them are mid to light grey, and that only 4 of them are clickable. I’ll interpret that for you – it means that only 4 of them have listings at the other end.
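
    (To make that concrete – this is only a rough sketch, with a made-up class name and URL, not the site’s actual markup:

        <style>
          a { color: blue; }       /* clickable – listings exist at the other end */
          .empty { color: #999; }  /* greyed out – plain text, not a link */
        </style>
        <a href="/hotels/wales/holywell.html">Hotels in Holywell</a>
        <span class="empty">Hostels in Holywell</span>

    The grey items are plain text, so there is nothing there for either a user or a spider to click.)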

    Now it may be that it doesn’t fit in well with your idea of good design, but this discussion isn’t about good design – it’s about you trying to find spam in the site. No luck so far.

    #3

    Why do you need to repeat “Holywell” so often? Because there’s only one listing in there, and because users looking for “Holywell Bed and Breakfasts” or whatever the case may be would find that you have next to nothing there?

    For the engines, Adam. I said it’s ugly and undesirable, but it helps them to know what the next page is about – it’s how they work, you know πŸ˜‰

    Yes, it’s true that many categories have few listings, but it’s also true that plenty have many listings. E.g. try this one:-

    http://www.forthegoodtimes.co.uk/hotels/england/south+yorkshire/sheffield.html

    Only 43 in that one, so maybe it’s not enough, huh? You really should do some research before you go around making silly statements.

    Anyway, we’re looking for spam, so still no luck. You’re not scoring very well so far.

    What about the integrity of your site? What about it being such a great resource that such a thing wouldn’t be necessary? What about the 21,360 listings that are allegedly there, and, if they are there, are probably either duplicates, dead listings, or buried so far that no one could find them?

    Er…did I mention doing some research before making silly statements? Didn’t I? I thought I did. Anyway – go and look – I know I said that before. Still no spam.

    #4

    http://www.hedir.com hmmm I see top-level category links. Does anyone else see them? Yeah, pretty obviously there.

    http://www.joeant.com looks like there are top-level listings there, too.

    Quite right. May I add to the list? Let’s try http://www.forthegoodtimes.co.uk Ah yes. I see top level category links there as well. So that’s 3 we’ve found between us. Aren’t we doing well!

    Oh just a minute. Are you saying that niche directories aren’t really directories, and that only those directories with wide-ranging categories are true directories? If that’s what you mean, I do beg your pardon. In that case, clearly the 3rd one is spammy and merits being excluded from Google’s index. It must be spammy – it’s not a general directory. But then again….. you’re wrong. So, still no spam.

    #5

    Since I suspect you’re referring to rel=nofollow, yes, there are aspects that Google encourages webmasters to do that have a positive benefit on Google. However, there has been absolutely nothing that Google has suggested strictly for its own benefit.

    rel=nofollow is one of them, yes. Another is what Matt said in this thread – build IBLs. He even suggested ways of doing it.
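
    (For anyone who hasn’t used it, rel=nofollow is just an attribute on an ordinary link that tells the engines not to treat that link as an endorsement – the URL here is only a placeholder:

        <a href="http://example.com/some-page.html" rel="nofollow">anchor text</a>

    Nothing changes for the person clicking it; only the engines treat it differently.)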

    It’s choice and consequence. If you do A, X behaviour occurs. If you don’t do A, Y behaviour occurs. You can go either way, but it’s still your choice. Just like it’s my choice, and just like it’s Nancy’s choice, and just like it’s Dave’s choice, and just like it’s Jack’s choice, and just like it’s everyone else’s choice that has ever built a web page. So stop trying to blame other people for your decisions.

    Now you’re getting even sillier. I haven’t blamed anyone for my decisions – quote me if you think I have. I blame Google for their new crawl/index function, but that’s completely different.

    Yes, you did something solely for Google and you cheated the end user in the process. If you really needed to keep those there, why not have a series of dropdowns for country, then region/city, then suburb/area/whatever England calls it (I don’t know, so I won’t ask)? If you were really concerned about your end user’s experience, this would have been a complete no-brainer, and about an hour’s worth of implementation time.

    “Cheated” is a strong accusation, Adam. Please explain how anything that I’ve done has cheated anyone – or apologise.

    The reason I didn’t use drop-downs is because (1) they are far less useful for the user, and (2) without using CSS, they would be no good for search engines. You should learn about search engines – it may help you.
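
    (A rough sketch of the difference, with made-up URLs. A script-driven dropdown like this:

        <select onchange="location=this.value">
          <option value="/hotels/england/kent.html">Kent</option>
          <option value="/hotels/england/surrey.html">Surrey</option>
        </select>

    has no <a href> in it, so a spider has nothing to follow. A plain list of links – which is all a CSS-styled menu really is underneath –

        <a href="/hotels/england/kent.html">Kent</a>
        <a href="/hotels/england/surrey.html">Surrey</a>

    gives the engines a crawlable URL for every destination.)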

    Then you went on to talk about me personally…

    And here is one of the biggest issues I have with you and the things you do and say, Phil. You really don’t give a damn about anyone else, as long as it fits what you happen to be doing. There’s nothing wrong with looking out for number one, but there is when it’s an exclusive focus.

    It is true that I look out for number one, but you have no reason whatsoever to think that I don’t give a damn about anyone else, and every reason to think otherwise. How many times have you seen me post helpful stuff in the forum you frequent? Why would I have a forum if it’s not there to help people? How many people have called me for my services, and been told that I’d rather help them to do it themselves than take their money? You don’t know the answer to that, but it’s quite a few.

    Adam, you have made some pretty strong, and completely unjustified, accusations in your post. I think you are as thick as two short planks, and a lousy website designer, but I don’t go around posting it in public, so why make such strong and stupid accusations about me? Why make it personal?

    Then we were back to the site, and talking about the 21k+ listings…

    I’m still waiting to be shown where these are. Even in large cities such as London, the categories I saw had maybe 60-70 listings.

    You’ll wait a long time then, because they are at the end of every chain of links, and I’m not going to list the thousands of URLs for you. Would you like me to put them all in a pool just for you? No category is linked to unless it has some listings – no empty category pages to clog the engines with, or to lead users to – good design huh? Yes, I know about London – it needs sorting out.

    And even then, so what? I don’t know if they’re accurate. I don’t know if your site is a credible source. I don’t know if there are dead listings among the bunch (as there would be of a site that size). I don’t know if other tourist-type sites like your site. And the site is a mess from a navigation point of view.

    Talk about clutching at straws – I shake my head in sheer disbelief. I’m not even going to comment on such stupid statements.

    So you still haven’t found any spam. You did a very lengthy, and extremely biased (not to mention very silly), site review Adam, but you were looking for spam – and you failed. The site is as clean as a whistle. The only thing you found is adding location names to links. It would have been much better if you’d said something like “I think they are spammy”, and I could have said something like “I don’t think they are spammy”, and we could have saved a lot of time.

    You’ll have to excuse me at this point. I’m only halfway through replying to your post, but it’s after 3 a.m. here and I need to sleep. Sorry.

  612. I just saw this, and I can’t resist, even though it’s very late.

    About Adam saying that Nancy’s site has no IBLs, when in fact it does, and me saying that he didn’t know where to look…

    I did, and I found a total of 4. And 4 is close enough to 0 that it’s still well within the realm of possibility that a site gained accidental links or links from sources as part of a network link farm (although in this case, also not true). So for all practical intents and purposes, I said 0 because it’s less than the 6 Matt mentioned in the very beginning (and no, don’t go on about how wrong that is because we’ve all heard you ramble on like a drunken sailor about how evil that is and we get it).

    That’s an absolute gem, and I can go to bed with laughter in my heart πŸ™‚

    4 is close enough to 0, so saying 0 was ok LOL!!!

    and…

    Saying 0 when there were 4 is fine because both 4 and 0 are less than 6 LOL!!!!

    Priceless!!!

    Thanks for that Adam – I could get to like you πŸ™‚

  613. Ok – last quickie for the night…

    Dave

    I have already stated I will debate with you on the subject of this page, right here

    That’s alright then. It’s just that you keep taking the subject off onto me personally, and my views about spam. If you want to discuss/debate the topic of this thread, without such diversions, then I’m fine with that. I’ll have a look at your post tomorrow, but I really do need to sleep now.

  614. Dave (Original)

    RE: “Now you’re getting even sillier. I haven’t blamed anyone for my decisions – quote me if you think I have. I blame Google for their new crawl/index function, but that’s completely different”

    Seems to be one-and-the-same to me. I see you blaming Google all over this page for not indexing all pages out there, including the ones you work on, because you have *decided* they have enough links already and shouldn’t need to get any more.

    RE: “but this discussion isn’t about good design – it’s about you trying to find spam in the site”

    IMO it is about finding out why the site you use as an example is not fully indexed. One reason is likely down to not enough quality inbound links for Google to crawl deeper. But Matt also stated this in his initial post. So you had a very likely reason before asking the question.

    Phil, on the about us page of the directory you work on, it states

    “Direct contact details for all accommodations, in all locations, are only 3 clicks away from the home page”

    That is simply not true, is it, and it is yet another likely reason why Google does not go as deep as it needs to without sufficient links. E.g.
    http://www.forthegoodtimes.co.uk/guest+houses/n.ireland/county+londonderry.html

    BTW, there are some typos on that page that need correcting.

  615. Dave:

    I have always been against link buying. IMO it is what has made such a wasteland of Google results and made the results so much spammier.

    Without search engines you would use forums and email contacts etc to advertise your sites and of course your other sites.

  616. Dave: it appears that you are not going to offer a reason as to why your 90% unique site has been de-indexed by Google, something that would have been far more valuable to this discussion than a snarky comment about me cutting off my nose to spite my face. Unfortunate.

    Adam: Your point about getting IBLs for reasons other than Google’s overvaluation of them is taken. But I still believe that the concept of worth-determination-by-links is fundamentally flawed. (BTW, Frey also worked with Jimmy. Get out your copy of Last Mango in Paris and give a listen to Gypsies in the Palace – guess who co-wrote that? And I already have the Walden Woods Benefit concert with Volcano in my bootleg collection, but thanks for the rec anyway).

  617. Dave (Original)

    RE: “why your 90% unique site has been de-indexed by Google, something that would have been far more valuable”

    Who said my “site has been de-indexed by Google”? All I said was that some of my pages are not indexed by Google. They come and go at random. As I have nothing outside the Google guidelines, I never worry or run around arm-waving and cursing Google. They always come back πŸ™‚ In fact, just checked, they are back.

    RE: “snarky comment”

    If you don’t like snarky comments I suggest you refrain from dishing them out. One’s own medicine is often a bitter pill to swallow πŸ™‚

    BTW, I have looked at your site in-depth and see many, many problems with it. One is canonicalization issues. I will refrain from saying anything else about it though for obvious reasons.

  618. Dave (Original)

    RE: “Without search engines you would use forums and email contacts etc to advertise your sites and of course your other sites.”

    How would you find the forums? How would you advertise without links? Remember, you said

    “If Google didn’t exist how many people would go after links? None”

    Now you are saying you would link to yourself via forums and your other sites?

  619. (BTW, Frey also worked with Jimmy. Get out your copy of Last Mango in Paris and give a listen to Gypsies in the Palace – guess who co-wrote that? And I already have the Walden Woods Benefit concert with Volcano in my bootleg collection, but thanks for the rec anyway).

    Well I’ll be damned. I never knew that.

    I do happen to have that CD in my collection (sans inside liner notes because it’s one of those remastered-type CDs where they give you the outside cover notes). But the woman is right. Frey did co-write the song with Jimmy. I just learned something. Cool deal.

    I always thought the Glenn at the end of that song sounded like Frey, but never thought to look on the CD itself. I wonder if that was him.

    Nancy: go build a Jimmy site now. I’ll visit every day and bring my schizoid Parrothead buddies…even the woman goin’ crazy on Caroline Street. πŸ™‚

  620. it is also possible google works on the basis of seeing who survives being delisted.

    if you have a “real” site you built with your own hands over years, you are likely going to retain it, not bin it, if things get down to only having an index page – somewhat the logic behind being sandboxed at the start.

    spam sites that go down to one index page will by and large just get canned and start up again under another name, whereas real ones will linger…

    given that google has problems, as anyone would, telling true from fake while they are alive, perhaps it figures that if it all but kills them, it will show who is real by who is still twitching after three months or so, when possibly it will resurrect them.

    maybe all these dormant sites with just an index page will slowly re-evolve back to being whole after a lapsed time period.

    malcolm pugh
    rubery
    england

  621. Dave (Original)

    Malcolm, I believe Big Daddy will slowly bring back many recently dropped pages. I guess some of us simply have more patience, faith, resilience and logical thinking patterns than others πŸ™‚

  622. Dave said:

    Now you are saying you would link to yourself via forums and your other sites?

    I hope this is a rhetorical question because anyone would do the same.

  623. Dave (Original)

    Nice cherry picking Jack. Why not answer the other questions? Here they are again in case you missed them;

    How would you find the forums? How would you advertise without links? Remember, you said

    β€œIf Google didn’t exist how many people would go after links? None”

    Now, you are saying you would go after links.

  624. i think the primary function of this blog is to highlight the main problem that brings us all here: this is the only place we can get a glimmer of an answer from google to serious disruptions to our visibility to the outside world.

    whereas google as a company can quite legitimately do as it likes, it has to take into account that it is now perceived and used as the number one traffic delimiter in the world, and people’s livelihoods and/or fun sites depend to a certain extent on its actions.

    given that, you would expect some avenue to be available for voicing opinions and legitimate concerns, and in turn receiving adroit and to-the-point answers, focussing on the original questions.

    i would guess a lot of us are only here on this blog because we have found it is the only viable place to get any clear-cut advice.

    to me that is a failing.

    imagine if 70 per cent of all the world’s telephones were the subject of a takeover by one company. would it then be viable that your only recourse for telephone problems would be on a blog to an employee – with all emails treated as blanket-covered by a series of robotic, standard, uninformative replies?

    i think google owes everyone a proper avenue to apply to, with skilled personnel to answer individual queries and worries. i would not care if it took a couple of weeks to come to fruition, as long as i knew i was going to get some kind of word from the mouth of god that was unquestionable truth.

    and surely this is why we are all here now – as this is the only place you ever get a snippet of what is going on, and when and how.

    dave mentioned faith/resilience/patience/logical thinking patterns and we all must surely empathise with that, as all of us have come here on the back of exasperation/confusion/long lapse times from action to results/change just when you thought it was safe to go back in the water.

    all this does is show that anyone here has been through a lot of turmoil just to have got here in the first place.

    my contention is that we would be better served, and less irritable when posting and in general, if we had a real avenue of complaint/information/feedback/knowledge – which is not the case at present.

    if you buy a software product you expect there to be a support function to talk to if there are problems/misconceptions/misuse of their product as originally intended or just mere “help”.

    google may not have been “bought” by us but nevertheless it governs our lives and future by its actions, and i think in doing so it warrants a “help” line to apply to for guidance as to what is or is not true/viable/disallowed etc.

    we would not be on this blog if we could get accurate information elsewhere, and surely it would aid google to get examples of websites from real people that maybe they have screwed up inadvertently, so they can see “hey yes man, we missed that one – well i never……” how can you stay in step with your users if you never ever talk to them?

    i think a viable email line or support team would greatly alleviate the present confused and worrying end user experience, and i think it is only what ordinary people deserve.

    malcolm pugh.
    england
    http://www.stiffsteiffs.pwp.blueyonder.co.uk

  625. Dave.

    I blame Google for the effect of their new crawl/index function on a great many sites; I don’t blame them for any of my decisions. It’s not the same thing.

    IMO it is about finding out why the site you use as an example is not fully indexed

    No Dave – you are mistaken. This discussion came about when you said something, and Nancy replied with:

    Dave, it is an incontrovertible fact that unique, honest, non-spammy sites are being erased from Google’s index.

    to which Adam responded with:

    I’m not saying this isn’t a fact. It may well be true. But if you’re going to make a statement like this one, be very prepared to back it up.

    For example, I’m going to ask you to list the sites to which you’re referring.

    and I posted the site as an example of a “unique, honest, non-spammy” site that has had its pages dropped. That’s what we are discussing. I know that Adam has expanded it into all sorts of silly things because it’s personal with him, and he couldn’t find anything to genuinely criticise.

    So getting back to what the discussion really is about – unique, honest, non-spammy sites having their pages dropped from the index…

    unique
    Quite possibly. I don’t know of any other site in its field that is as free as that one. All the sites I know about charge money to be listed, or to be listed preferentially, or cover limited areas.

    honest
    It certainly is. It’s just a plain honest to goodness niche directory that is genuinely designed to be a useful and free resource for both surfers and accommodations.

    non-spammy
    Yes, the site is non-spammy, both internally and externally. The only part of it that some people may consider slightly grey is the links that have locations in their link text, but I mentioned those when I put the site forward as an example. I’ve also said that I dislike them, and would rather not have them like that, but Google has forced things like that upon us. Imo, they are ugly, but not spammy.

    One reason is likely down to not enough quality inbound links for Google to crawl deeper. But Matt also stated this in his initial post. So you had a very likely reason before asking the question.

    I didn’t ask a question, and I didn’t ask anybody anything about my site. I know about the shortage of IBLs. The site was designed to be totally clean, and link-building for search engines isn’t clean, imo. So I gave it a small number of IBLs from my own sites just to get it noticed and started, and left it at that – as natural as I could make it. The irony is that I really did create that site to be as natural as is reasonably possible – no link building, no link exchanges.

    Phil, on the about us page of the directory you work on, it states

    “Direct contact details for all accommodations, in all locations, are only 3 clicks away from the home page”

    That is simply not true, is it, and it is yet another likely reason why Google does not go as deep as it needs to without sufficient links. E.g.
    http://www.forthegoodtimes.co.uk/guest+houses/n.ireland/county+londonderry.html

    That *is* 3 clicks from the homepage. guest-houses/n.ireland/ is one click, county+londonderry.html is the second click, and then you need a 3rd click on a town/city to get to the listings for guesthouses in the town/city.

    In any case, a bit of incorrect factual information won’t cause a site to be de-indexed.

    Adam.

    I want to thank you again for the 4 is less than 6, and 0 is close to 4, so saying 0 instead of 4 is ok – because they are both less than 6. I’m still chuckling at the squirming πŸ™‚

    But I forgot to mention that Nancy’s site has a lot more than 4 decent IBLs. If you want to know how to find them, just ask πŸ˜‰

  626. Dave (Original)

    RE: That *is* 3 clicks from the homepage.

    Of course it is, that is why I used it as an example. The untruth comes about through the fact that there are NO “Direct contact details” on that page – one of *many* pages which are 3 clicks from the homepage; I checked at random.

    I’m fully aware that the false text on the page will not hamper Google, but if a human cannot drill down to find any content, a spider has little chance without going in circles.

    I do hope you are going to answer all my questions from the same post.

  627. I have read all the above and still do not have an answer to why DCpages.com is being dropped.

    According to Google Sitemaps – Query stats

    http://www.google.com/webmasters/sitemaps/querystats?siteUrl=http%3A%2F%2Fwww.dcpages.com%2F&hl=en

    Average top position is the highest position any page from your site ranked for that query, averaged over the last three weeks. Since our index is dynamic, this may not be the same as the current position of your site for this query.

    Top search queries – Average top position
    1. washington dc – 10

    But the reality is that we have been positioned at 100 or more for the last two months, except in early May when we were back on the first page for a few days. Here is an example of where we are today, at the bottom of page 13.

    http://www.google.com/search?q=washington+dc&hl=en&lr=&start=120&sa=N

    Our extreme stats have reflected this as well. My staff showed me the Google spikes and dips during this period.

    For months I tried to find an answer to this question and never got a straight answer. I just get the same boilerplate response from Google.

    “Our Webmaster Guidelines offer helpful tips for maintaining a “Google-friendly” site: http://www.google.com/support/webmasters/b…py?answer=35769. In general, webmasters can improve a site’s visibility in our search results by increasing the number of high-quality sites that link to it.

    We appreciate your taking the time to write to us.

    Regards,
    The Google Team”

    According to Google

    http://www.google.com/search?hl=en&lr=&q=%22www.dcpages.com%22

    Results 1 – 10 of about 106,000 for “www.dcpages.com”.

    These results are from Congressional leaders, government offices, civic groups, non-profits, and local businesses that have placed a link to DCpages over the past 10 years.

    I have spent a decade building trust for our community publication

    http://en.wikipedia.org/wiki/Washington_DC_City_Pages

    This week DCpages was invited by the White House to cover the Memorial Day Ceremony in Arlington.

    http://www.dcpages.com/gallery/

    The old Google would have picked that up and spread the word. Now it does not.

    http://www.google.com/search?q=memorial+day+ceremony&hl=en&lr=&start=20&sa=N

    DCpages is my life. So I take this seriously. I just want to know the truth so I can set my community publication in the right direction.

    All the Best,

    Luke Wilbur

  628. Dave (Original)

    Phil, you have yet to show any evidence for Nancy’s statement: “it is an incontrovertible fact that unique, honest, non-spammy sites are being erased from Google’s index.”

    All you have shown is a site which doesn’t have all its pages listed in Google. That’s a far cry from the statement by Nancy.

  629. Dave. I said I’d come back to your post today so…

    Let’s shake hands, make up and get on with the topic at hand. Please!

    That’ll do for me πŸ™‚

    Now, let’s try again with your example. I asked you why Google should list page after page of nothing but links. That is what many of your directory pages are.

    Many of the results pages contain links, but most of them don’t contain any – unfortunately. I’d like it if all of them contained plenty of links, because I believe that people would like them. Anyway… All of the results pages contain addresses, phone numbers, descriptions, as well as links where there are any.

    From a user’s point of view, all of them contain contact details for what they are looking for. From a programme’s (SE’s) point of view, the pages could appear to have little of value in them, and that could be a reason for the pages being dropped.

    Explain to me why Google is wrong in their endeavour to index the World’s information?

    I didn’t say that anything is wrong with it.

    Why do you assume they are fully able to do so at this point in time when none of the other SE’s can?

    I don’t, but I do assume that they are able to index all the pages that they had in the index before dropping them.

    Why should Google not decide their own fate by running their business in the way they see fit?

    Their users don’t expect them to intentionally leave useful pages and resources out of the index – they don’t expect them to be editorial in that way.

    Why, when you SELL links for a commission, are you so against Webmasters obtaining links as Matt has suggested so a site can be fully indexed?

    Do you mean AdSense? If you do, I don’t understand the connection. I’m not against webmasters obtaining links as Matt suggested – I’m against webmasters HAVING TO obtain links for that reason.

    Do you truly believe you know enough about the inner workings of Google to be able to state that what they are currently doing is wrong?

    No, I don’t know a lot about the inner workings. I only know what we are told by Googlers such as Matt, and what I read in papers that they publish. I don’t know the reason why they are doing what they are doing with the indexing, but I have a very good idea, and if I’m right, I don’t disagree with what they are trying to do. What I do disagree with is the way it’s working out for many “unique, honest, non-spammy” sites, and I vehemently disagree with that.

  630. Dave.

    You misunderstood my post about the 3 clicks. The listings ARE 3 clicks from the homepage. The URL you showed is only 2 clicks from the homepage. All listings are 3 clicks from the homepage. I’ll show you:-

    When you are on the homepage, 1 click gets you to
    http://www.forthegoodtimes.co.uk/guest-houses/northern-ireland.html

    another click gets you to
    http://www.forthegoodtimes.co.uk/guest+houses/n.ireland/county+londonderry.html
    (that’s 2 clicks to the URL you showed)

    the next click (the 3rd one) gets you to
    http://www.forthegoodtimes.co.uk/guest+houses/n.ireland/county+londonderry/londonderry.html

    et voila!

    Phil, you have yet to show any evidence for Nancy’s statement: “it is an incontrovertible fact that unique, honest, non-spammy sites are being erased from Google’s index.”

    I didn’t make that statement, so it’s not for me to show a list. But I agree with the statement for 3 reasons.

    (1) The sheer numbers of forum users all over the place who have had pages dropped. I wouldn’t suggest that all of them were innocent, but many of them must be.

    (2) We have seen two examples here, so it is certainly happening.

    (3) Matt’s post confirms it.

  632. Hi Matt,

    This is the first time I am posting a message on your blog. I have a site that was banned some years back, which was confirmed by Google. After a lot of changes we submitted a re-inclusion request and received a reply saying that our site is no longer banned or penalized.

    We received this reply around a year ago, but to this day every page except the home page is still in the supplemental index.

    You say that generally a first-time penalty is for 30 days, which is not true in our case, because even after getting confirmation that our site is not penalized, it was never re-indexed.

    Can this be checked?

    regards

  633. Phil,

    I’m just going to say two things to you:

    1) The only thing in that entire bizarre rant that you were even remotely correct on was the 0 vs. 4 thing. However, that’s not “squirming” on my part. I often make statements “for all practical intents and purposes”, and assume most people are intelligent enough to know what I meant. That was one of those statements. I should have clarified. That was my only mistake.

    2) The fact that you’ve spent so much time trying to spin-doctor and discredit everything I say, in addition to your absolutely pathetic attempt to bully me into making an apology for something that you’ve admitted to be true, tells me that you don’t even believe what you’re saying. So why would I bother to debate with anyone who can’t even trust themselves?

    That’s all you’re going to get out of me, Phil. I know I’m right, you know I’m right. And I’ll leave it to the relative intelligence of others to see the same thing.

  634. Adam.

    I take it you don’t want to know how to find a site’s IBLs then? πŸ˜‰

    You were squirming, Adam, and it was a wonderfully entertaining squirm. I thoroughly enjoyed it πŸ™‚

    You are right about one thing though – there was a lot of bizarreness in the posts between us, but that was due to you having an attitude about me from the start. Your posts were bizarre and way off-topic, and my need to teach things to you was bizarre, but then we’ve seen it all before, haven’t we. At least you’ve learned how to recognise a real link on a page πŸ˜‰

    Here’s a little tip for you Adam. When somebody genuinely offers something, such as that example site, it’s not a good idea to make the very first comments things like “You are joking aren’t you” and “it’s a stinking scraper site” when it isn’t (my paraphrase but that’s effectively what you said). Things like that don’t lend themselves to good sensible discussion. I hope that helps πŸ™‚

    And just out of interest, the site isn’t a scraper, but it wasn’t until Dave said something later – yes, Dave (original) – that it dawned on me that its listings pages could possibly fit a programme’s analysis for scraper pages, depending on what the programme is programmed to analyse. A person wouldn’t see them as scraper pages, but a programme might.

    Now, if you had handled it in a sensible way, and said something along the lines of, “the listing pages could look like scraper pages to Google’s programming”, if that’s what you really thought, then we could have had a sensible discussion. But accusing it of something that it isn’t, was just stupid, because it accused me of something that I didn’t do.

    Btw, you have still to explain how I’ve “cheated” the end user. Do I take it that you were lying about that, or will you explain?

  635. I often make statements “for all practical intents and purposes”, and assume most people are intelligent enough to know what I meant. That was one of those statements

    Of course. How stupid of me. I’m embarrassed that I didn’t realise that, when you said the site has 0 IBLs, you actually meant that it has a few IBLs but not enough. Any intelligent person would have realised that.

    Squirm Score:

    Technical merit: 4.2

    Artistic ability: 4.7

    Sorry Adam. This squirm isn’t a patch on the other one. You’re losing your touch.

  636. “Who said my β€œsite has been de-indexed by Google”, all I said was some of my pages are not indexed by Google. They come and go at random. As I have nothing outside the Google guidelines I never worry and run around arm waving and cursing Google. They always come back πŸ™‚ In fact, just checked, they are back.”

    Imagine that, all back now! Must be because of all that patience and faith.

    “If you don’t like snarky comments I suggest you refrain from dishing them out. One’s own medicine is often a bitter pill to swallow :)”

    I purposely did so in the previous post, but since my efforts were ignored, I have little motivation to continue them.

    “BTW, I have looked at your site in-depth and see many, many problems with it. One is canonicalization issues. I will refrain from saying anything else about it though for obvious reasons. ”

    As I have said before, it is not perfect, and I never claimed it was. I am not a professional web designer. In fact I never expected such intense focus on my site – perhaps if I had known it was coming I would have taken your route and made claims without producing the site itself. Luckily, since only you have been derogatory about it, I do not regret linking to it. I appreciate you limiting your criticisms in this post, at least.

    Adam: That’s Glenn Frey alright. Did you also know that Jimmy appeared on the song “The Greeks Don’t Want No Freaks” off of the Eagles’ “Long Run” album? He is listed on the liner notes which I transcribed for my site as the head of the “Monstertones” backup singers. Sadly, that page and the ones giving the lyrics to “Gypsies in the Palace” are no longer indexed by Google.

    PhilC: I appreciate you defending my statement. In my opinion, one site as an example makes it a fact. Whether or not you believe the problem is severe enough to warrant action, the problem DOES exist. Pretending it’s not there doesn’t make it go away. Hopefully the people at Google know this, as it is only good business to be aware of such problems.

  637. Dave said:

    How would you find the forums? How would you advertise without links? Remember, you said

    β€œIf Google didn’t exist how many people would go after links? None”

    Now, you are saying you would go after links.

    ————————————————————

    I’d find the forums fine without Google. The ones I frequent I never found via a search engine. I’d advertise via my own site. But I don’t really call it advertising to list my sites at the bottom of each page of my sites.

  638. Just a quick comment about “if Google didn’t exist”…

    If Google closed tomorrow, everyone would still go after links because the other engines copied Google’s links-based ranking system.

    If Google had never arrived on the scene, it’s likely that everyone would still go after links because, before Google came along, other engines were already factoring link popularity (linkpop) into their rankings.

    If Google didn’t exist and no engine had ever used links for anything at all, then nobody would really “go after” links, but the forums, etc. would be found easily via the engines that existed.

  639. Dave (Original)

    I SAID: “Explain to me why Google is wrong in their endeavour to index the World’s information?”

    PHIL SAID: “I didn’t say that anything is wrong with it.”

    Well you did, many times. However, you now say there is nothing wrong with Google’s endeavour to index the World’s information. So we agree.

    RE: “No I don’t know a lot about the inner workings.”

    Ok, that’s good enough for me.

    RE: “I didn’t make that statement, so it’s not for me to show a list”

    Yes I know, but you jumped in with ‘so called’ proof of the statement on Nancy’s behalf. I guess we just have to conclude that there is no proof of Nancy’s statement.

  640. Dave (Original)

    RE: “In my opinion, one site as an example makes it a fact”

    One site (that IS still in the Google index) makes it fact? Surely you are not serious?

    Do you remember what you said?

    β€œit is an incontrovertible fact that unique, honest, non-spammy sites are being erased from Google’s index.”

    JACK

    RE: “I’d fine the forums fine without Google. The ones I frequent I never found via a search engine. I’d advertise via my own site. But I don’t really call it advertising to list my sites at the bottom of each page of my sites”

    Ok Jack, I think you would have been better off admitting your original statement was simply over-the-top. Even Phil has contradicted you.

  641. Dave.

    I thought you wanted to discuss sensibly. I’ve never said that anything is wrong with Google wanting to index the World’s information. If you think I have, then show me, or even tell me what I said. Apart from that, what does it have to do with this topic? Tell me that you are not off on a personal thing again.

    Yes I know, but you jumped in with ‘so called’ proof of the statement on Nancy’s behalf. I guess we just have to conclude that there is no proof of Nancy’s statement.

    Incorrect. You’ve been shown proof. I jumped in with it – remember?

    RE: “In my opinion, one site as an example makes it a fact”

    One site (that IS still in the Google index) makes it fact? Surely you are not serious?

    Do you remember what you said?

    “it is an incontrovertible fact that unique, honest, non-spammy sites are being erased from Google’s index.”

    That’s nit-picking, Dave. This whole thing is about Google dropping the pages of many sites from the index.

  642. Dave, if something is BEING erased, that means it’s not yet FULLY erased. For instance, if you were BEING robbed, the robbery would still be in progress. Therefore my statement is accurate.

    Do you have anything other than semantic issues to add to this conversation?

  643. Dave (Original)

    RE: “I thought you wanted to discuss sensibly. I’ve never said that anything is wrong with Google wanting to index the World’s information. If you think I have, then show me, or even tell me what I said.”

    All throughout this thread you have been saying Google is wrong for NOT indexing and including every single page out there. IMO Big Daddy has been implemented as a means to that end.

    RE: “Apart from that, what does it have to do with this topic? Tell me that you are not off on a personal thing again.”

    NO, nothing personal. The top of this thread IS “Indexing Timeline”.

    RE: “Incorrect. You’ve been shown proof. I jumped in with it – remember?”

    No, there has been NO *proof* of Nancy’s statement.

    RE: “That’s nit-picking, Dave”

    Not in my opinion. Nancy made the statement as fact that “unique, honest, non-spammy sites are being erased from Google’s index”. There has been nothing shown to prove this.

  644. Dave (Original)

    RE: “Dave, if something is BEING erased, that means it’s not yet FULLY erased”

    What were you saying about semantics πŸ™‚ When/if Phil’s site IS fully erased, you can then claim it *was* BEING erased.

    Erased: “Rubbed or scraped out; effaced; obliterated”

    Nancy, your statement was emotion-based and wayyyyy over the top, and I think you know that.

  645. Dave said:

    All throughout this thread you have been saying Google is wrong for NOT indexing and including every single page out there. IMO Big Daddy has been implemented as a means to that end.

    ————————————————————

    Of course that is wrong. It’s just common sense, and no one would disagree that Google has a responsibility to index every site.

    As far as the links thing goes, I do not believe links play nearly as important a role on any other search engine. I certainly don’t spend my time getting new links. The only way I get one is if someone approaches me.

  646. Dave (Original)

    Not sure what your statements “Of course that is wrong. It’s just common sense, and no one would disagree that Google has a responsibility to index every site.” have to do with my statement, but I will say this.

    Google DOES NOT have “a responsibility to index every site”. Last time I heard, those that are not fully indexed can request a full refund πŸ™‚

  647. Dave (Original)

    RE: “As far as the links thing goes, I do not believe links play nearly as important a role on any other search engine”

    That’s also incorrect. Yahoo, MSN etc all use links from other sites to discover and include Web sites. Without “links” the WWW would be TOTALLY different.

  648. Yes, but MSN, Yahoo et al don’t rely on links for rankings :)

  649. Dave (Original)

    Rely on them for ranking? Possibly not, but they likely use them as part of their algo, just like Google.

  650. Yes, I will agree they may use them as a small part, but Yahoo and MSN mainly rely on on-page stuff.

  651. Google aren’t doing anything for free. They make billions on their stock exchange rating and advertising on the back of being responsible for 70-odd per cent of all internet traffic at present.

    Yet that is where responsibility seems to end.

    One bad roll out can be rectified in a matter of months, and one mistake can be sorted in the same – yet sandboxing merrily disappears sites for 3 to 6 months, and in the interim period thousands go bankrupt on the changes.

    you can all argue till you are black and blue about whether good sites are getting zapped. i think they are, else there wouldn’t be people like nancy – who NEVER EVER went to blogs – appearing and asking what the hell is going on: “i don’t look at stats much…..but my web pages started disappearing…”. to me at any rate that is indicative of ordinary folk, very puzzled, with no hidden agenda/spam/zillions of links/dodgy area links etc. these seem to me to be normal webmasters who never trod the boards, appearing on blogs all over the place, wondering what the hell is happening to the website they took literally years to build up.

    i am one myself. i would not be here but for having exhausted all reasonable ideas on what the hell is going on. i have tested everything i knew, and have since managed to find out about, on why pages might disappear – dodgy links/reciprocal links/duplicate pages/being hijacked/guestbook links by spurious people/etc etc – so i didn’t even get here till i had exhausted all reasonable and known causes, like a lot of other pretty sharp people.

    and what do we get as the official version? google abstractly decides, contrary to its own webmaster guidelines put out in black and white for webmasters on its site, that it will arbitrarily throw in a requirement for a considerable supply of inbound links. i for one am very severely brassed off with that – it’s negated about three years’ work on my self-help site for others, and one solid year’s work for zero pennies setting up 12 other websites. out of 13 i have 9 with just a bloody index page. i am not a happy bunny.

    and where is the channel to talk to google? it does not exist. even this blog is a one-way street where choice snippets are put out now and then, piecemeal, which may or may not explain happenings that in any event are often a month or so old. big daddy happened march 8th to me – this blog mentions it middle of may – wow, big help there – i already had two months of hell and i still have 9 sites effectively dead which to my mind pass muster (not that i would put them forward for dissection here!! perish the thought).

    i don’t think it’s good enough, when you know you hold 70 per cent of all traffic, to play god with data to this extent – effectively moving the goalposts when people’s futures and livelihoods are at stake in many cases.

    if they turn around in three months’ time and say either “we made a bit of a mistake but it’s sorted now” or “here’s how you should now code to appease our whims this month”, what good is that to people in dire straits as a direct consequence of their changing emphasis totally NOW?

    this is a sea change by their own admission on this blog. the guidelines never mentioned inbound links – no one had the decency to amend the guidelines to read “there must now be inbound links”. it is just a fait accompli related two months after the fact: “oh by the way, we now require inbound links, and we rolled it out early march, guys” – this in the 2nd week of may.

    so i don’t think google or anyone else can say we didn’t buy off them; we bought in off their reputation and stock exchange street cred, and to a lesser extent because they were like us, little guys made good whom everyone supported. i don’t think you can sit there with 70 per cent of all www traffic and be absolved of any blame/responsibility/guilt/shame for changing your basic premise and rolling it out to such an extent that you are ruining people’s efforts over years and bankrupting firms who trusted in you.

    maybe i’m a heretic and others are quite happy – i have no real way of knowing, but i think i do have a sense of right and wrong, fair and unfair, truth and fiction, trust and deceit. so when i’m told in guidelines “this is how to do it to be ok by us”, and do that, and then it all is swept away like yesterday’s news, i tend to think of the wrong, unfair, fictional, deceitful words i’m hearing, that are incidentally also wrecking my websites.

    long term this is stupid anyway, as their stock rating depends on good results on searches, and punters will gradually gravitate elsewhere if the results suffer – and i cannot see anything but that happening unless this dropping of perfectly valid and unique material is reversed.

    so i do think google owes us one, big time, because we believed.

    malcolm pugh
    england

  652. Dave (Original)

    RE: “Google aren’t doing anything for free”

    They crawl, index and list my site and send me 15,000+ potential customers a day, and have done for years. The cost so far is a total of $0.00.

    I’m not alone here as millions of other Web sites also enjoy the free ride.

    The fact is that when any page is dropped from Google, another moves up to take its place. Another fact is that there are only 10 places on page 1 of any SERP. Sooooo, at any point in time there are going to be at least 90% (targeting a specific phrase) unhappy and only about 10% happy (arbitrary figures). The ratio (whatever it is) basically remains static.

  653. Maybe I’ve woken up in never-never land or something like that, because things are back to front.

    All throughout this thread you have been saying Google is wrong for NOT indexing and including every single page out there

    Dave. I have been saying that Google should index good decent sites and pages just because they are there, as you pointed out. Now, correct me if I’m wrong, but doesn’t that mean that I have nothing against Google indexing the world’s information? I *want* them to index the stuff – that’s what I’ve been saying. I have nothing against it. You’ve got it back to front.

    Just one thing though. I never said, or even hinted at, “every single page out there”. If you are going to be unreasonably picky with Nancy, you must be absolutely spot on with your own words – and you weren’t.

    Actually, you’re right in one respect. My site isn’t proof – but only because you never saw how many pages Google had indexed of it before almost all of them were dropped. You only have my word for that. However, it’s normal to believe people with things like this, particularly when the same things are being said all over the place, so absolute proof isn’t needed. The balance of probabilities, and common sense, say that it’s happening. Accept it.

    It would be far better to discuss realities, Dave, than to waste effort on the meanings of words. I see no point in arguing about a word or two. Listen to what people are saying everywhere. If you haven’t seen it for yourself, it only means that you personally haven’t seen it. If stacks of people say they have seen it for themselves, believe them.

  654. Google may not invoice you, Dave, but they do cost you money – you do pay. I should say that most site owners pay to be in their index, but not all.

    Every site owner who pays for bandwidth, or who pays to have their own server online, pays real money to be in search engine indexes. Crawling your site costs you. Every time somebody looks at Google’s cache of one of your pages, if there’s a graphic on the page, it costs you – you pay for that person to view Google’s cache of your page.

    It’s not free, Dave, and for a great number of sites, Google gives nothing back in return – they take but they do not give. From Google’s point of view, they cost individual sites money and they don’t even try to give anything back in return. So get off your “free” bit – it isn’t free. The only people for whom it is free are those who get totally free hosting and bandwidth. Everyone else pays.
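
    To put some rough, purely illustrative numbers on it: if a crawler fetches 10,000 pages a month at an average of 25 KB each, that’s roughly 10,000 × 25 KB = 250 MB of transfer – and on a metered hosting plan, that 250 MB is billed to the site owner, whether or not the engine ever sends a visitor back.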

  655. Dave (Original)

    Phil, anyone who thinks the cost of bandwidth chewed by Google is not worth it can simply use a robots.txt file. Simple. Or they can email Google and ask them to “throttle back”. It works, as I did exactly that about a year ago. Common sense stuff again.
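
    For anyone who hasn’t done it, the robots.txt route is a two-line file at the root of the site (this example shuts Googlebot out of everything; a narrower Disallow path works the same way):

        User-agent: Googlebot
        Disallow: /

    The email route is the one for throttling, since robots.txt can only allow or block paths – it can’t slow a crawler down.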

    I would estimate my ROI on ‘Google costs’ runs at about 500%. In fact, MSN uses over double the bandwidth that Google does on my site and yet only lists about 10% of my pages.

    RE: “I *want* them to index the stuff – that’s what I’ve been saying”

    You *want* them to index everything yesterday. Why not let Google run their business as they see fit and you run yours? Patience is a virtue πŸ™‚

    In regards to pages from sites being dropped: I have never denied that. What I KEEP saying is that Bigdaddy is likely a means to an end and NOT the final solution. I am also saying that while many pages have been dropped, many others have likely taken their place. Remember, we will ONLY ever hear about problems on Matt’s blog and other SEO forums. No surprise there, as this is why they exist.

    BTW, a friendly reminder. You have stated that personal attacks etc are not tolerated on your forum. It would be nice if you paid Matt the same courtesy and refrained from your personal attacks/rants on Adam. As I have mentioned before, it does nothing for your “professionalism”.

    A question for you Phil. Do you think Bigdaddy has been rolled out to get Google closer to their ultimate mission, or to hinder it? You sure sound like you believe the latter.

  656. Dave.

    Many people get more back from Google than they pay, but many people get less back from Google than they pay. I simply said that it isn’t free – and it isn’t free.

    Adam wrote posts with a huge attitude about me personally because of outside things from the past. His posts about my site were full of obvious nonsense, and he got sound logic/common sense in return (plus the odd merited snipe). He made false accusations, the worst of which was that I cheat the end user, but he refused to explain because there is no explanation, and he went away with his tail between his legs because he has no answers to common sense and truth.

    Adam started it when he stated that a perfectly ordinary niche directory is a stinking scraper site. How could I not respond to that? He chose to make it personal with complete disregard to the truth.

    I enjoy a good debate/discussion when people are being genuine, but Adam chose not to be. You seemed to want to discuss my views of spam, presumably because you have some strongly held different views about it. That’s why I wouldn’t discuss your posts in this blog. But when you said, let’s start again and discuss the topic here, you’ve had different responses from me, and I’ve answered all your questions, because you are discussing genuinely instead of trying to score points against me for other reasons.

    I’m a normal person, and I’m not known for flaming. But if someone attacks me, as Adam did, I’ll respond. If he wrongly calls me a cheat, I’ll tell him that he’s as thick as two short planks (which I honestly believe he is). I’m normal. If you want to divert the discussion and attack me for my views, as you seemed to be doing earlier, I’ll respond. I’m normal. If you want to genuinely discuss/debate our different views, I’d be happy to do it elsewhere, as long as nobody makes it personal.

    All Adam was doing was attacking me because he wanted to, and he had a complete disregard for the truth. All I did was respond to it. All of it is against what Matt wants in this blog, and I apologise to Matt – but not to Adam – Adam deserved it.

  657. Why should I have faith Google will rectify the situation when today I discovered two more pages had been removed from their index? The situation is getting WORSE, not better.

  658. P.S. And yes, the page with the song lyrics is one of the three that remain.

  659. A question for you Phil. Do you think Bigdaddy has been rolled out to get Google closer to their ultimate mission, or to hinder it? You sure sound like you believe the latter.

    I don’t really know what Google’s ultimate mission is. They’ve said something along the lines of wanting to index the world’s information, though they are never likely to be able to do that; as far as the Web is concerned, they probably would like to index all of it that is useful. In a way, I think BD is a step in that direction, but I’ll tell you what I think BD is about.

    Matt said that it is largely about the new crawl/index function, so, although it dealt with some other things, I’d say that it’s mainly about that. What I think is that they are trying to deal with types of links that they don’t want to count. When they began, they almost had a natural Web as far as links were concerned. One or two of the existing engines had started to take account of links but organising links for that purpose wasn’t anything like it became when Google became popular.

    Google has run their links-based ranking system on a Web that has become increasingly polluted with made-for-search-engines links, and I think that BD’s new crawl/index function has tried to deal with some of that, by not counting certain types of links such as affiliate links, too many reciprocals, off-topic links, etc.

    In other words, I think BD is largely about no longer crediting link pollution, but that’s just my best guess. If it’s a good guess, then I don’t disagree with what they are trying to achieve, but I do disagree with the way they are going about it.

    So, imo, BD is a step towards indexing all of the ‘good’ Web, even though it’s failing in many cases, and a step towards NOT indexing ALL of the Web. But that’s based on my best guess.

  660. I thought you said that all your pages were back, Nancy.

  661. Hell no, Phil. That was Dave.

  662. Hi Phil,
    You seem to understand what is going on here. Can you explain why Google states DCpages has an average top position of 10 for the key words “washington dc”, when in reality DCpages is at 130 and dropping farther every day? This makes no sense to me. Any help would be appreciated.

  663. So soon we will see these supplements go?

  664. I don’t know Luke. The average is over 3 weeks, and you may have had a page ranked near the top in the last 3 weeks.

    It isn’t clear how they count a ranking. I don’t think it can be assumed that they count all possible rankings for a query. E.g. if a searcher only looks at the first page of results, it can’t be assumed that the system counts all 1000 viewable rankings. Personally, I think it’s more likely that they only count the rankings that people actually see, so if nobody searches down to the page your ranking is on, then the ranking won’t count and it won’t affect the average. That’s just my guess, but it is one way the average could stay put even as the ranking falls.
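
    To put toy numbers on that guess: if 50 searchers saw a page at position 10 during the three weeks, and only 2 ever paged down far enough to see it at position 130, an impressions-weighted average would be (10 × 50 + 130 × 2) / 52 ≈ 14.6 – still reported near the top even though the current ranking is 130. The figures are invented purely to illustrate the guess; nobody outside Google knows how the statistic is really computed.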

  665. Dave: thanks for the help, but no thanks.

    As I’ve said before, I don’t give a damn if someone chooses to attack me personally. If that’s what they want to do, that’s their business. If someone’s got that much free time that they can ramble on about how I’m attacking them in a pathetic attempt to deny the truth, I say let them. They just need to keep in mind that the more they choose to do so, the weaker their own arguments become. So hey, let him. It just shows insecurity and a complete inability to defend what the truth actually is. The harder he argues the way he does, the stronger my own case gets.

    So Phil, go nuts. Take any shot you like at me on this blog. I encourage it!

    There are more than enough people who respect who I am and what I have to offer and say that I don’t need to get all bent out of shape about what he’s doing. Let him. I’m okay with that. I think Allen Iverson said it best: “there are a million people that love me, and ten million that hate me.”

    This will be my first, last, and only such post on this, and I did it only to clear the issue up. I’m not one to stray off-topic either, but I just wanted to make sure everyone knows how I feel about this stuff.

  666. Adam: That’s Glenn Frey alright. Did you also know that Jimmy appeared on the song “The Greeks Don’t Want No Freaks” off of the Eagles’ “The Long Run” album? He is listed on the liner notes, which I transcribed for my site, as the head of the “Monstertones” backup singers. Sadly, that page and the ones giving the lyrics to “Gypsies in the Palace” are no longer indexed by Google.

    That I did know, but that’s pretty impressive nonetheless. Now where’s my Jimmy site? πŸ˜‰

  667. Aaw – come on Adam – be a man, for goodness sakes. This is what you wrote about me:-

    Yes, you did something solely for Google…and you cheated the end user in the process

    Several times I asked you to explain what you mean by “cheated the end user”. So far, you’ve declined to explain it.

    You see, calling somebody a cheat is a serious accusation, and you should be prepared to back it up (remember Nancy? πŸ˜‰ ). How many times do you need to be asked to back it up and explain what you meant? I’ll ask you one more time…

    Adam, in what way do you think I cheated the end user?

    Now, if you want any credibility at all, answer the question.

  668. Now where’s my Jimmy site? πŸ˜‰

    Perhaps I made one and you just can’t find it because it’s not indexed. Now this is hitting home, eh? lol

  669. test…

  670. Adam, in what way do you think I cheated the end user?

    Now, if you want any credibility at all, answer the question.

    I’ve already done so repeatedly. The problem is that you didn’t like the answer because it was right. But, to make it very clear:

    1) Search doesn’t work. (An error you’ve acknowledged).

    2) Three links to actual content, and no, this is not the same as every other directory as you put it (including the two you cited, DMOZ and Yahoo!).

    3) 10 phrases for every city page, of which at least half usually don’t have any content. Yes, it is indicated by their not being hyperlinked, but that requires an extra little bit of thought on the part of the user to figure that out (and there are a significant number of people that wouldn’t).

    4) Above the 10 phrases, the following phrase: We have listings for these alternative types of accommodation in …

    If you do not have those types of accommodation, then this phrase should not be listed. It’s misleading to the end user (you know, cheating).

    5)

    http://www.forthegoodtimes.co.uk/holiday-parks/northern-ireland.html
    http://www.forthegoodtimes.co.uk/farm-holidays/northern-ireland.html
    http://www.forthegoodtimes.co.uk/farm-holidays/scotland.html

    All links from the opening page that suggest content that isn’t there. Misleading to the end user (cheating).

    Now if you want any credibility at all, fix it so that the user actually can find real content from the opening page (1 click at best), get the search to work, and get rid of any misleading text/links to no content/spammy descriptions.

    Oh, and you may want to look into fixing the HTML as well…you know, strip that down so that users can actually get to your wonderful content that much faster. Use CSS (I still haven’t figured out why that’s such a bad thing yet). Build for the user first.

    If you do that, you’ll never hear a word out of me as far as any SEO attempts you choose to make. But the way I see it, that’s what BD in a larger sense is shooting for, and if that’s the case, I’m behind it 100%.

  671. Clarification Note: in the previous post, I’m only pointing out the search functionality not to suggest that the user is being cheated, but merely that they are being deprived of the maximum possible benefit from the site. If it’s a problem, it should be fixed.

    The reason I pointed it out is because Phil acknowledged it as a problem. If he’s serious about this site being a valid user resource, he’d fix it.

  672. Perhaps I made one and you just can’t find it because it’s not indexed. Now this is hitting home, eh? lol

    Why must you turn this blog into a house of LIES? Make my Jimmy site. heh πŸ˜‰

  673. Ah! Finally an answer.

    Clarification Note: in the previous post, I’m only pointing out the search functionality not to suggest that the user is being cheated, but merely that they are being deprived of the maximum possible benefit from the site. If it’s a problem, it should be fixed.

    The reason I pointed it out is because Phil acknowledged it as a problem. If he’s serious about this site being a valid user resource, he’d fix it.

    So when you said, “and you cheated the end user in the process”, you didn’t really mean that I cheated the end user in the process – you meant that I didn’t cheat the end user in the process. Now I understand :). It was one of those 4 = 0, and anyone with any intelligence should know what you mean moments, right? So why take so long to explain it? Anyway…

    To be honest, I’m still not sure what you mean. The search function didn’t actually work when you wrote it, and I only discovered that it was broken yesterday. There’s a reason why it hadn’t worked for the last few weeks, but it’s fixed now.

    When you wrote about cheating, you were referring to the list at the bottom of each page. It is a list of each accommodation type. Those for which there are accommodations in the area are linked, and those for which there are no accommodations in the area are greyed out (not linked). I don’t see how “users are being deprived of the maximum possible benefit from the site” by doing that. It’s standard practice to grey out unlinked items in lists or rows of links.
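
    For readers trying to picture the pattern being argued about, the markup would be something like this – a hypothetical reconstruction for illustration, not the site’s actual code:

        <p>We have listings for these types of accommodation in Belfast:</p>
        <ul>
          <!-- a type with listings is hyperlinked -->
          <li><a href="/hotels/belfast.html">Hotels in Belfast</a></li>
          <!-- a type with no listings is plain, greyed-out text -->
          <li><span style="color: #999;">Farm Holidays in Belfast</span></li>
        </ul>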

    The only other thing that was mentioned about those links is that I add the location name to each of them, e.g. instead of “Guest Houses” being the link text, it is “Guest Houses in Location”. To be honest, I don’t see how that causes users to be “deprived of the maximum possible benefit from the site” either.

    I can’t see anything that users are being deprived of, so I can’t see anything to fix, but if anyone thinks differently, please say so.

    I never said it is a “problem”, Adam. I said that I don’t like including the locations in the links, because it isn’t necessary for users, and it doesn’t look as good, but it describes the destination pages much better for links-based engines like Google, which helps their ranking system, and helps to get the destination pages ranked where they belong. It doesn’t deprive users of “the maximum possible benefit from the site” though. It doesn’t deprive them of anything.

  674. So when you said, “and you cheated the end user in the process”, you didn’t really mean that I cheated the end user in the process – you meant that I didn’t cheat the end user in the process. Now I understand. It was one of those 4 = 0, and anyone with any intelligence should know what you mean moments, right? So why take so long to explain it?

    You see, Phil, this is why no one can ever make a point with you. You can take one small piece of a much greater point and distort it so severely that it suits your specific needs at the time.

    The point behind mentioning the search was to support the argument that the user isn’t the primary focus of the site, and never has been.

    Oh…and it’s still not fixed.

    http://www.forthegoodtimes.co.uk/search.php?searchtype=town&type=all&query=London&ex=1

    And greying out the accommodation types for which you have no listings in conjunction with the statement that you make directly above is cheating, because it’s misleading to the end user.

    When you say “We have listings for the following types of accommodation…” and then you grey out a category because there are no listings, it means you do not have listings for that type of accommodation and therefore it should not be listed. It’s a misleading phrase, and that’s cheating the end user. Either remove those phrases for which you don’t have listings entirely, or amend the statement to read that “For more information on other types of accommodation in (city), please visit the hyperlinks below. Items that are greyed out do not have listings and therefore are not hyperlinked.” (Or something to that extent). At least then, you’re not misleading anyone.

    As far as the links go, you don’t have to include the city in those links. You choose to include them because you want a higher ranking, and you’re clearly saying that you don’t like the idea. It’s ugly. It’s unattractive. It is generally agreed that it would be better if the links didn’t include the city names. I know full well why you did it…but you don’t have to do it.

    I’m not even overly concerned about that, though…I’m more concerned about the greyed-out text in conjunction with the phrase used above it. If you put a greyed-out text phrase in context with the phrase above it, you can see the contradiction quite clearly and therefore how a user could be confused/misled.

    That deprives the user of the maximum possible benefit, for aesthetic reasons. That’s cheating the end user. And if you were concerned about it, you’d fix it. But I don’t think you will.

  675. Adam.

    You didn’t like my humour, huh? I thought it was quite good – and it was true! But you brought it on yourself by taking so long to address the issue of you accusing me of cheating the end user. Don’t be surprised if the people you abuse like that don’t like it, and don’t expect them to express brotherly love to you when you finally address the issue.

    I’d forgotten about that ‘choices’ part of the search function. I must fix that. It works fine when there are no choices, but for a couple of weeks before yesterday, none of the search function worked.

    And greying out the accommodation types for which you have no listings in conjunction with the statement that you make directly above is cheating, because it’s misleading to the end user.

    There you go again. I’m cheating again. You just can’t stop, can you? First I was cheating, then I wasn’t cheating, and now I’m cheating again. And you expect me to be nice to you???

    I can actually understand what you are saying about a contradiction between the text above and the greyed out texts below, but I just don’t agree with your conclusion. I think that the text above is fine, and the fact that some ‘types’ are greyed out is perfectly indicative that there is nothing for them. I don’t think anyone is being confused or misled in any way. If I’d linked to the empty pages instead of greying the texts out, then you would have a point, but I see no reason to include a lengthy explanation about the greyed out ones. Greying out is standard – people know these things. The blue linked ones are very very obvious.

    That deprives the user of the maximum possible benefit, for aesthetic reasons. That’s cheating the end user.

    Now I’m cheating on two counts! Users are aesthetically deprived, and are therefore being cheated. That’s a new one. Hell, the whole site aesthetically deprives everyone – including me! The same applies to most sites – including yours. Sorry, Adam. Your whole reasoning is contrived for the sake of finding fault.

    We do agree about the locations in the link texts, even though you made a big thing about them earlier – twice. You’re getting there πŸ˜‰

  676. Adam.

    The point behind mentioning the search was to support the argument that the user isn’t the primary focus of the site, and never has been.

    I actually thought you’d made a mistake when you mentioned the search functionality. You first mentioned it in your “Clarification note” post, and referred to your previous post, but without any details or explanation, so it was natural to assume that you meant the links. However…

    To suggest that “the user isn’t the primary focus of the site, and never has been” is yet another accusation for which you have no evidence – only imagination. But you are correct in this case. Users and accommodation owners share joint focus in that site. But the primary focus in that site is me. Users and accommodation owners are secondary.

    I’m in good company though. Google’s primary focus is Google, Yahoo!’s primary focus is Yahoo!, Microsoft’s primary focus is Microsoft, and the primary focus in Adam’s site is Adam.

    So what point were you trying to make?

  677. There you go again. I’m cheating again. You just can’t stop, can you? First I was cheating, then I wasn’t cheating, and now I’m cheating again. And you expect me to be nice to you???

    I never once said you weren’t cheating the end user. In fact, I’ve been pretty clear on this.

    I also don’t care if you’re nice to me or not. If you’re going to be a jackass, go right ahead. It’s your prerogative. I’m not going to stop you or tell you otherwise.

    The same applies to most sites – including yours. Sorry, Adam. Your whole reasoning is contrived for the sake of finding fault.

    I’m curious as to what you find aesthetically displeasing. And when two people agree that something doesn’t look right, it doesn’t look right.

    But I do like your humour, on the rare attempts that you actually say something funny. This is a freakin’ gem.

    To suggest that “the user isn’t the primary focus of the site, and never has been” is yet another accusation for which you have no evidence – only imagination. But you are correct in this case. Users and accommodation owners share joint focus in that site. But the primary focus in that site is me. Users and accommodation owners are secondary.

    I’m in good company though. Google’s primary focus is Google, Yahoo!’s primary focus is Yahoo!, Microsoft’s primary focus is Microsoft, and the primary focus in Adam’s site is Adam.

    I can believe that first part. While you think it’s imagination, it certainly does show that you are the first person you think about when you build your site. So for me to be correct, either I got really lucky and pulled something completely out of my ass or there are aspects of your site which suggest as such.

    But to suggest that all of us think that way…we couldn’t, or almost no businesses would ever survive online or offline. That’s completely, totally, utterly ridiculous. If you don’t pay attention to your users and what they want first, you’ll never grow anything…there are too many of them and not enough of you. If you start looking at what people want and trying to serve them as best you can, you’ll be in much better stead.

  678. I guess this means I better start hiding my affiliate links:)

  679. Let me state that member “adam” and member “dave” are the only participants in this thread making any points of common sense.

    Dave wrote:
    “Malcolm, I believe Big Daddy will slowly place back in many recently dropped pages. I guess some of us simply have more patience, faith, resilience and logical thinking patterns than others.”

    Let me add the words of…… common sense….. to that statement.

    I really wish I knew who Adam and Dave were. I think they must read a lot at the ‘church of heil’ as I couldn’t have stated the points better myself. πŸ™‚

  680. I don’t even know where the Church of Heil is, but my girlfriend says I should go to church more often. You got a map or something? πŸ™‚

  681. Adam.

    Clarification Note: in the previous post, I’m only pointing out the search functionality not to suggest that the user is being cheated, but merely that they are being deprived of the maximum possible benefit from the site.

    I took that as being your statement that you didn’t mean I was cheating users after all. I know it mentions the search functionality, but that hadn’t been mentioned before so I assumed it was a mistake on your part.

    I’m curious as to what you find aesthetically displeasing

    I didn’t say that anything in it is displeasing, although for me the columns are too evenly sized and the left column navigation is poor (too widely spaced). I said that it aesthetically deprives us, and it does when you compare your website design business site with aesthetically good sites, such as this website designer’s site:- http://www.finerdesign.com/ and many, many more sites on the Web. Aesthetically, your site is neutral – neither good nor bad.

    Commercial sites are self-focussed – they have to be – it’s the nature of business. A good method of being self-focussed is to provide a very good service (site) for people. It then appears that the focus is on the user, but in fact, the primary focus is on the business itself. Business sites are like that Adam – including yours.

    Just out of interest, it would be very easy to omit the greyed texts altogether (it would have been easier not to put them in in the first place). The reason I decided to put them in is so that an accommodation type always appears in the same place, e.g. hotels is always near the bottom of the right column and never anywhere else. Imo, it makes for better usability when a person wanders around the pages, as can easily happen when s/he is looking at different ‘types’ for the same or nearby locations. Having the types jump around is bad design; having the dropdowns you suggested is even worse. The way I’ve done it is ideal because the types are always there in the same positions, and the highlighted (active) ones are seen at a glance.

    So to sum up, in your opinion, I cheat the end users because something isn’t as aesthetically good as it could be, and also because the sentence above the links isn’t totally accurate. I can live with that. Your choice of the word “cheated”, and your refusal to explain what you meant by such a serious accusation, were foolish, but I can live with such insignificant reasons. Imo, that sentence is totally accurate because it obviously only applies to the highlighted types, and as somebody once said, any intelligent person would know that.

  682. Adam. This is sincere…

    If you compare your website design site with the one I linked to in the previous post, I think you’ll agree that, judging by the sites, you would be the second choice to hire as a website designer. The two sites are poles apart. But looking at the portfolios of both designers, there’s not much difference at all. The other one put their best work into their own site, whereas you put your worst work into your own site. If your site is there to attract clients, then I suggest that your current site can’t be helping you anywhere near as much as the other people’s site must be helping them.

    You make much better sites than I would have thought judging by your business site. It’s a pity that the fire surrounds site is now defunct because the thumbnail looked really nice, and I would have liked to have seen it.

    Btw, the whole of the search function is now fixed – I think.

  683. Dave (Original)

    RE: “Greying out is standard – people know these things. The blue linked ones are very very obvious.”

    Hmm, yes, if we are to assume users are all Webmasters – but spiders don’t know these things. What IF Google were to list the page in their SERPs due to that text, which has NO content behind it? That would pee off a Google searcher. Perhaps your reason was to gain SERP position for those words? Sorry, but I cannot see how adding a *possible future* link with keyword anchor text while having no content behind it is for anything but the SEs.

    Perhaps one of the things BigDaddy does is seek out such sites and NOT index them, for obvious reasons.

  684. Sorry, but I cannot see how adding a *possible future* link with keyword anchor text while having no content behind it is for anything but the SEs.

    You’d like it to be that, wouldn’t you πŸ˜‰ but it isn’t. There is no anchor text, so the fact that there’s nothing behind it doesn’t come into it – there’s nothing for anything to be behind. They are just words on a page.

    You can’t fault it, Dave, so you might as well stop trying. It’s squeaky clean. Adam made a big thing of it, but he was just having a personal go at me. When he got right down to it, all he could say is that the sentence above isn’t 100% accurate, and that he doesn’t like the aesthetics. Frankly, I’m not interested in those opinions. He’s entitled to them, just as you are to yours, and I do have my own, of course. But none of them address this thread’s topic with respect to that site.

    If a search engine listed one of those pages high in the serps because of a greyed out phrase, then it would be a really crap engine when you consider that there are plenty of pages out there that really are about each of those phrases. They are not exactly obscure phrases. Those pages would make it into the results set (about 40,000) for those phrases, and quite possibly into the top 1000, but they wouldn’t rank anywhere where people would actually see them.

    Dwelling on that aspect of the site is barking up the wrong tree. Whatever the reason for the dropped pages, it isn’t that. There’s a much more obvious reason, which is much more in keeping with all that we’ve learned about the new BD crawl/index function – the shortage of good IBLs. And there’s a less obvious possibility that you made me see – the pages could match Google’s pattern for scraped pages, IF they have any capability of spotting scraped pages yet. I’m sure it’s the first reason, but it could be the second one.

  685. Dave (Original)

    RE: “You can’t fault it, Dave, so you might as well stop trying. It’s squeaky clean”

    Matter of opinion, but who cares anyway, other than Google πŸ˜‰

    RE: “There’s a much more obvious reason, which is much more in keeping with all that we’ve learned about the new BD crawl/index function – the shortage of good IBLs”

    Hmmm, we all knew that before you chose to link to it. Have you done anything about that yet?

    RE: “And there’s a less obvious possibility……”

    There are likely other reasons (some have been mentioned already) but so long as you have the mind-set that the site is “squeaky clean” etc and should automatically be indexed, I doubt you will fix the issue.

  686. The other one put their best work into their own site, whereas you put your worst work into your own site.

    You’re correct…sort of.

    I put my best work at the time into that site. Mind you, the layout itself is almost 3 years old, and I know I could improve it. But I can’t be bothered for various reasons…not the least of which is that I’m not taking on new business at the present time. I actually have it more up there for the articles, which my clients use and like, than for any other reason.

    So yes, as it stands right now it’s my worst work, and yes, I know it could be improved. I’ll be the first person to stand up and admit that…in fact, when I’m damn good and ready to, I will. But what would be the point in improving it to attract new business that I can’t deal with anyway? For all practical intents and purposes, it’s a dead site right now.

    And as stupid as this sounds, I started my business without ever having had a website for my own business. I had five clients before I even thought about building it…I actually put it up there originally just to shut one of them up. “You should have a site. People should see your work.” etc. I never have had a problem finding people to work with, but now I pick and choose my spots and deal with a small client base…I can serve them better, and it’s easier for me to manage.

    There’s also some better stuff that I haven’t even put in there yet.

    So yes, criticism understood and accepted. I can agree with it. But don’t expect to see anything done about it anytime before I’m damn good and ready. πŸ™‚

  687. Actually…that one part should read…when I’m damn good and ready, I’ll fix it. Not admit to it. I already do admit to that. πŸ™‚

  688. Oh yeah…and I just realized the site you’re talking about.

    The owner skipped town on me on that one. Owes me money, wouldn’t pay, so I had to shut it down. That was almost a year ago now.

  689. That’s why I said something along the lines of, “if the site is there to attract business”. At the time of writing it, I almost included “(and it may not be)”.

  690. Hmmm, we all knew that before you chose to link to it. Have you done anything about that yet?

    Certainly not. I said from the start that I’ve no intention of running round link-building for a bloody search engine.

    There are likely other reasons (some have been mentioned already)

    There may be other reasons, but none that have been mentioned here.

    but so long as you have the mind-set that the site is “squeaky clean” etc and should automatically be indexed, I doubt you will fix the issue.

    Damned right. You still keep wanting to attribute some blackhat stuff to the site when there is none, but it doesn’t stop you trying. The site was specifically designed to be squeaky clean, which it is, no matter how much you’d like it to be different. Judging by this entire thread (from the top), the only thing that can be fixed is the number of good IBLs, and I’ve no intention of pandering to a bloody search engine about that.

  691. Dave (Original)

    RE: “Certainly not. I said from the start that I’ve no intention of running round link-building for a bloody search engine.”

    Then don’t complain about the site not being fully indexed. Simple πŸ™‚

    RE: “There may be other reasons, but none that have been mentioned here.”

    You assume none have been mentioned here. I bet some of the mentioned problems are contributing though.

    RE: “Damned right. You still keep wanting to attribute some blackhat stuff to the site when there is none, but it doesn’t stop you trying. The site was specifically designed to be squeaky clean, which it is, no matter how much you’d like it to be different”

    I think you should be more concerned about what “black hat stuff” Google attributes to the site. Keywords on page with no content is likely seen as black hat.

    In regards to links and “no intention of pandering to a bloody search engine”. Have you told the client this? I would have thought that you, as an SEO, would do nothing but pander to SEs. I certainly cannot see that the site was designed for users only, with the navigation, keywords on pages with no content, etc. being how they are.

  692. I appreciate the information. Until I read this, I really didn’t understand the why of it.

    I guess it’s kind of tough writing a search engine that pleases everyone. Some are bound to get hurt in the process as my site was, and many others, too. The problem is that if some of those in the supplemental results are indeed good sites, then those webmasters will be less likely to use that search engine.

  693. Dave said:

    Keywords on page with no content

    Bear in mind that a search engine cannot distinguish between what are keywords and what is part of the content. This is why people who just list keywords do so well.

  694. Dave (Original)

    Yes, but in this case it’s page after page after page of nothing but keyword links mixed in with future keyword links and nothing else. I would guess the KWD (keyword density) of some of these keywords is up around 80%+.

    RE: “This is why people who just list keywords do so well”

    I bet there are many more that never see past page 10 of the Google SERPs, or who are not even indexed.

  695. And some sites I’ve seen not only have lots of links, but very little, if any of it, is their content. Some use images that don’t belong to them. Some use copyrighted text that doesn’t belong to them, but they’re listed.

    Then others like myself go to great lengths to come up with original content, and yet ours aren’t listed.

    I don’t think I’ll ever understand how it all works. I just do my best to come up with a good site, small as it is, but I do try, and I hope that some day it will be listed.

  696. Dave.

    Then don’t complain about the site not being fully indexed. Simple

    You have completely missed the point – completely. I didn’t complain about that site being de-indexed. I offered it as an example of one that had been, and I didn’t ask anybody’s opinion, although having offered it as an example, it was open to opinions. But don’t imagine that I asked for any or wanted any. I knew the score before I introduced it.

    You assume none have been mentioned here. I bet some of the mentioned problems are contributing though.

    You can make all the bets you like, but you are not likely to win any.

    I think you should be more concerned about what “black hat stuff” Google attributes to the site. Keywords on page with no content is likely seen as black hat.

    Dave, anyone can imagine all sorts of things, but that’s all it is – imagination. It would be a really bad way of living if we all ran around changing things all the time simply because of all the things that might be.

    However, I can put your mind at rest in this case and tell you with absolute certainty that those greyed out words are not seen by Google as spam, and if it’s all the same to you, I’d rather take Google’s word for it than yours.

    In regards to links and “no intention of pandering to a bloody search engine”. Have you told the client this? I would have thought that you, as an SEO, would do nothing but pander to SEs. I certainly cannot see that the site was designed for users only, with the navigation, keywords on pages with no content, etc. being how they are.

    What client? What I do and don’t do is none of your business. If you don’t think the site’s design is good for users, it’s ok – you are entitled to your opinion. Have you thought of spreading that opinion to all the other directories out there?

    Yes, but in this case it’s page after page after page of nothing but keyword links mixed in with future keyword links and nothing else. I would guess the KWD (keyword density) of some of these keywords is up around 80%+.

    As I said, it is absolutely certain that it is not spam in Google’s eyes. I’ll add to that – it isn’t spam in anyone’s eyes, unless the person is either very stupid or is trying to score points. Just in case you are not just trying to score points, I’ll explain it to you Dave.

    The sort of spam we are talking about tries to improve rankings. Ok so far?

    And it tries to improve the rankings of relevant pages. Stop me if I’m going too fast for you.

    All but one of the greyed out phrases are not relevant to the pages they are on, and high rankings for those phrases are not wanted for the page. Did you follow that ok?

    Now, if that’s spam, then I must be the worst spammer of all time – I’m targeting the wrong pages!!!

    The one exception is that the current page’s phrase is greyed out, because a link to it is not required, as we are already on the page. Of course, if I were spamming, then I’d link that one up as well, and get its link text to count for the page.

    You see, Dave, keyword spam is concerned with the page it is on. Not only that, but it is repeated in the page so many times that it becomes spam. Single phrases are not spam. If they were, then all pages that contain text would be spam, because all text contains phrases that are keywords for other pages.

    Let me know if that was too complicated, and I’ll try to dumb it down a bit for you πŸ™‚

  697. A serious question for Dave, or anyone else…

    it’s page after page after page of nothing but keyword links mixed in with future keyword links and nothing else

    I was thinking about that, and, although probably well over 80% of the site’s pages are listings pages, many of the listings pages are links-heavy. The site is a simple, no-frills directory, and is designed in the normal drill-down directory format. Because many listings pages contain very few listings, it means that most of the site’s pages are links-heavy, and could fit the profile of things like scrapers, if an engine has such a profile. But I can’t see any alternative.

    The question is, what alternatives are there? How can the pages be made less links-heavy? There are far too many pages for some ‘text’ pages to offset them.

    I should say that I have an extremely good reason to believe that the pages being links-heavy and fitting a ‘profile’ isn’t the reason why the site was de-listed, but I don’t rule it out altogether, so it’s worth asking the question.

    Note:
    Dave. As long as you keep trying to say that the site contains spam, I’ll keep implying that you are stupid (as in the last post). I don’t think you are stupid, but keeping on flogging a dead horse could cause a change of mind. You suggested shaking hands and starting again, and I agreed. It would be good if you stuck to that agreement.

  698. All of the above posts and similar posts on other sites indicate that Google has implemented a huge change of policy. Before the BD changes the G index was seen by most people as the best representation of the whole web. G also encouraged users to think that what they saw was an accurate reflection of the web – plus or minus a few sites that were excluded for legitimate reasons, such as portrayals of violence, criminal behaviour or outright scams. The implication was that everything and everybody was included in the index. All sites could and would be listed, and all a searcher had to do was dig a bit deeper and they could find anything they were looking for. G encouraged this viewpoint with their mission statement – ‘To organise the world’s data.’ The implication is again that it is ALL of the world’s data.

    That was before the BigDaddy changes. Now the above scenario has been changed. By insisting that websites conform to a new set of rules, G has changed the nature of the index as well. It is now a list of ‘selected’ websites that have been pre-screened by google to meet their new, tougher, parameters. The effect of this change is likely to be huge, as many websites will not make the selection and will be excluded from the previously democratic, unbiased, natural list. This appears to violate G’s own terms of service.

    ‘The search results that appear from Google’s indices are indexed by Google’s automated machinery and computers, and Google cannot and does not screen the sites before including them in the indices from which such automated search results are gathered.’

    This is no longer true – the index is drawn only from a pre-selected list of ‘approved’ sites which meet G’s new parameters. This is a whole new ball game, as it implies that G approves of the included sites and is therefore recommending them to users. The sites that are not in the index do not have G’s approval and therefore are not listed. Or that is what the users will assume: a site is not in Google’s index, therefore it is no good, of poor quality and of no value. But is that what G is trying to do – I don’t think so. G should think very carefully before going down this route, IMO, because they will end up with people assuming that sites in the index are somehow approved by google and that other sites are rubbish.

    Maybe that is the intention – to only allow high quality, G-approved sites in the index. That would be one way to go – it would soon reduce the level of spam and the volume of sites that they have to index. But it is an entirely different search engine to the one we all know and love (well, some of the time). If G want a ‘selected sites’ only search engine, that is their right, but they must come clean and tell the web users and web site owners that that is what is happening. Then all excluded sites can either try to meet the quality guidelines or go elsewhere and stop worrying about not being in the index. Of course the volume of users would drop dramatically, as people realised that they would rather make their own choices than read a ‘selected-only’ list. G should then change their mission statement to read ‘Organising some of the world’s data’ or ‘Organising those bits of data that are approved by Google’.

    These are the choices for Google at the moment – either an approved, pre-selected index or an index that all can enter, that is fair, that does not discriminate between large or small sites, technically savvy or rough and ready sites and that contains the good designs, the bad designs, and the plain ugly sites. The users will determine which ones succeed and which fail – just like they do for businesses and films and books etc.

    Anyway that’s what I think. (Another) Dave

  699. i agree with davidw – i wish i could have put it as succinctly as he has – and as he says, given that there has been a change of emphasis from that which is propounded on the google website, to that which is prevalent now, the google guidelines should be changed so that new webmasters just emerging are fully aware of the new criteria in place.

    malcolm pugh
    england.

  700. Dave (Original)

    OK Phil, you have gone back into smart arse mode *again* (I guess you can’t help yourself). I’ll leave you to have your tantrum on your own.

  701. Dave (Original)

    DavidW, I don’t expect you have read the whole Thread here, can’t say I blame you either πŸ™‚

    But basically, what I have been saying throughout is that BD is likely a means to an end (organise/index the World’s Data) and not the final solution itself.

    It’s very easy, I guess, to see/read all the noise on SEO forums etc and conclude that Google is *only* dropping pages. However, what we never hear is when pages are included (silent majority perhaps). It is quite possible that since BD, Google’s index has increased overall. Or, that BD is in the process of doing just that. The WWW is a HUGE place and indexing/organising ALL pages out there is going to take time. In the process of doing this, we would be short-sighted to take a moment in time and draw conclusions based on all the noise.

  702. Dave (Original)

    RE: “…the google guidelines should be changed so that new webmasters just emerging are fully aware of the new criteria in place.”

    It’s been written there for a long time. E.g.

    Google states;

    “Although we index billions of webpages and are constantly working to increase the number of pages we include, we can’t guarantee that we’ll crawl all of the pages of a particular site. For more information about how we find and include pages in our index, please see http://www.google.com/technology/index.html

    While we can’t guarantee that all pages of a site will consistently appear in our index, we do offer our guidelines for maintaining a Google-friendly site.”

    AND

    “When your site is ready:

    Have other relevant sites link to yours.
    Submit it to Google at http://www.google.com/addurl.html.
    Submit a sitemap as part of our Google Sitemaps (Beta) project. Google Sitemaps uses your sitemap to learn about the structure of your site and to increase our coverage of your webpages.
    Make sure all the sites that should know about your pages are aware your site is online.
    Submit your site to relevant directories such as the Open Directory Project and Yahoo!, as well as to other industry-specific expert sites”

    There is a LOT more than just this. I guess Google can only supply the guidelines; unfortunately, most don’t bother to read them, and Google cannot force the issue.
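
    For anyone who hasn’t seen one, the sitemap file mentioned in those guidelines is just a small XML document listing your URLs. A minimal sketch, with a placeholder domain and the schema used by the Google Sitemaps (Beta) project – check the official documentation for the authoritative format:

        <?xml version="1.0" encoding="UTF-8"?>
        <urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
          <url>
            <loc>http://www.example.com/</loc>
            <lastmod>2006-06-01</lastmod>
          </url>
          <url>
            <loc>http://www.example.com/about.html</loc>
          </url>
        </urlset>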

  703. Dave. I only went into smart-ass mode because you continued to talk about something that never existed. When you talk seriously, there’s no problem.

    It is quite possible that since BD, Google’s index has increased overall. Or, that BD is in the process of doing just that. The WWW is a HUGE place and indexing/organising ALL pages out there is going to take time. In the process of doing this, we would be short-sighted to take a moment in time and draw conclusions based on all the noise.

    Lots of things are “quite possible”, Dave, but you can’t take any notice of them. If you read the thread from the top, and read the “noise” from everywhere, you will see that the conclusions are drawn from one particular thing – Matt’s first post in this thread. People are not drawing conclusions from the noise, but from Matt’s post, and it’s very reasonable to draw conclusions from it. You should stick to that and not things that are “quite possible”.

  704. Artur Kowalczyk

    Is it true that after the Bigdaddy update a site can be penalised for having “too many” or irrelevant incoming links?
    If it were true, companies could destroy their competition by arranging tons of irrelevant, “bad neighbourhood”, spammy links.

    Artur Kowalczyk

  705. DavidW wrote:
    “The effect of this change is likely to be huge, as many websites will not make the selection and will be excluded from the previously democratic, unbiased, natural list. This appears to violate G’s own terms of service.”

    What is so unusual about a website ‘including’ only those sites/pages that it wants to include? Hasn’t any search engine only included those pages that it “wants” to include? Wouldn’t you want the ability to control your own website, and not cater to what “other” websites wanted you to do?

    I guess I don’t see the difference between what Google is doing now and what they or any other search engine has done before. Algos change, right? Just because Google may have stricter “parameters” now about what they see as quality does not mean they are any more or less biased than they have ever been in the past. I’m not getting your argument at all. I can personally change how or who or when or what I want to do with my own forums, right? If I wanted to start “pre-screening” members before allowing them to post, I could do that, right? If I wanted to start allowing signature files, I could do that, right? If I wanted to start allowing any live links in posts no matter what kind of site it linked to, I could allow that, right? If I wanted to change things up, and start allowing “cuss” words, and making the forums completely “adult” in nature, I could do that, right?

    So what is so different about a major search engine being allowed to do anything it wanted to do, and do it whenever they please?

    It’s almost like the argument I see being made in regards to se spam; “I should be allowed to do anything I want including spam the se’s, as it’s my right to do so.”

    While it’s always a website’s right to do anything it wants to do, another website always has its own rules and guidelines to abide by, right? What’s sooo different about Google, or Yahoo, or MSN in this regard?

    We all can whine and cry all we wish, but the bottom line is we either live with “change”, or we choose to not live with change. We either conform and change ourselves to deal with “reality”, or we choose not to, right?

    I guess I’m missing something….

  706. I guess I’m missing something….

    Correct.

  707. DavidW, your argument has one fatal flaw in it. Google isn’t selecting the pages to be included or not included. Google isn’t making any unilateral decisions to say “this site stays”, “this site goes”. They’re not pre-screening anything. Think about that…Google has X number of employees. For argument’s sake, let’s say that number is 250,000 employees (I know it’s not even close to that high, but just to be safe). They have employees that are involved with spam, employees in AdSense, employees in AdWords, employees in HR, employees in administration, employees in Google Maps, etc. and so on. Is it feasible, or possible, to even assume that Google and its employees are somehow pre-screening or filtering out websites? Of course not.

    So who is, if they’re not?

    We are.

    We who?

    Webmasters.

    We have the ability to control what Google does and doesn’t index simply by acquiring and/or using organically placed links to show where our content is and that it’s a valuable resource. Webmasters actually have gained more control to a certain extent than they ever have, and that in theory is a very good thing.

    All Google’s algorithm change has done is to reflect what we see as important, in general terms (obviously there are those assholes who will choose to link to complete crap no matter what, and the crap will still find its way in, but for the most part it’s showing good, relevant, useful stuff).

    It’s not about what Google is filtering. It’s about what we are. And if we have a problem with getting our sites listed without supplemental results, we can fix it ourselves without having to wait weeks on end for a Google response.

    It’s not a violation of Google’s ToS either…this has been going on for years now. The vast majority of website content still isn’t indexed yet. It wasn’t before BD, and it’s not now. Why is it suddenly a big issue after an algorithm update?

    And, as Doug pointed out, we have choice. With choice comes consequence. You can choose not to get IBLs or you can choose to. Each choice has a consequential effect. No one’s beating you over the head and forcing you to do anything. Do as you wish, but if the consequences aren’t what you’d like, then you need to make a different choice.

  708. Adam said:

    And if we have a problem with getting our sites listed without supplemental results, we can fix it ourselves without having to wait weeks on end for a Google response.

    ————————————————————-

    Where in the heck did you get this? LOL. If there’s a way we can fix it, I would love to know what it is.

  709. Well, thanks for the responses, Doug, Dave, Adam and others. I can’t answer every query that you have raised – most of which are valid points. I think some people are missing the point, and I may be jumping to conclusions also.

    ‘So what is so different about a major search engine being allowed to do anything it wanted to do, and do it whenever they please?’

    There is nothing wrong in an SE doing exactly what it wants to do – paid inclusion, advertising, political messages, humour (that would make a nice change) – the www could be organised along any number of lines. But the point is that they should state what they are doing and not pretend to represent the whole web in an unbiased fashion. OK, any SE is always going to have some bias – they may favour small sites or certain formats etc – but apart from that they all give the impression that they are searching the whole web and that the serps are the best matches for a user’s query. In reality, all the index is doing is showing sites that best fit the algo – in G’s case, the serps are those sites that have the most appropriate links to them, as judged by the google system. And similar results will be found in the serps of other SEs. What they are not doing is finding the most relevant or highest quality sites. But it is my reckoning that the average web surfer – who is not savvy in search or algos or seo etc – thinks that they have the best selection of sites available. But that is not the case – it would be a tall order for any SE to do this, and it could be seen as the Holy Grail of search, to use a topical analogy. (But G are on the case, I believe, as they have stated that they intend to introduce AI into the search system within the next two or three years – that will give the SEOs something to worry about.)

    OK – so what’s my point – G’s introduction of new criteria for being indexed by G is that sites must have good quality links in sufficient numbers to be included. Reciprocal linking is to be discounted or ignored. This move significantly raises the bar for many sites who have mostly reciprocal links (myself included – now gone to SI apart from the home page). It also raises the question of what is a ‘quality’ link – who decides? That is the question that needs answering. Do websites have to get .gov or .edu or .ac links, or can they be from large corporate sites, well designed sites or whatever!! Who knows – only google, and that’s the point.

    As I see it (and I’m not a webmaster – more of an editor, I suppose), a large number of sites which do not meet the new parameters are being de-indexed from the G index. Many of these have been in the index for three or more years. Why should we not be interested in that? Looking at the consequences, this move could mean that loads of smaller sites, such as small businesses (my sector), independent hotels, shops, artists, musicians and loads of other non-webby people, will be dropped out of the index. What people on these blogs don’t realise is the huge volume of people who are on the web but who are not web-savvy – many do not have a clue about this change in policy. See comments above – how many people have only just found Matt Cutts’s (highly informative) site.

    My main concern is that by upping the requirements needed to even get in the G index – let alone ranking well – google will exclude very many perfectly good but technically weak sites – and I think that would be unfair and a great shame.

    Regards DW

  710. PS

    When I talk about pre-selection I mean that G will only index sites that have the ability to attract high quality links. But that is only a relatively small portion of the websites out there. Some webmasters are well connected, some work in the ‘trusted’ sectors of .gov and .edu, some work for large corporations with huge resources available – all of these will have a large advantage over smaller, less well connected sites, which will then be excluded.

    By requiring high quality links and discounting recip. links google is pre-selecting which sites get into the main index and may end up in serps.

    DW

  711. It’s not that Google is pre-selecting which site will get in and which won’t – it’s how much or how little of a site gets in, and that *is* being pre-determined.

    The vast majority of website content still isn’t indexed yet. It wasn’t before BD, and it’s not now. Why is it suddenly a big issue after an algorithm update?

    But it was before BD for individual sites, Adam. That’s the whole point. That’s what all the outcry is about. Many sites that were fully indexed before BD are having many of their pages either dumped out altogether or dumped into Supplemental, and for no justifiable reason. Nobody would mind in the slightest if it were only crappy sites that were being hit, but it isn’t – it’s perfectly good, honest, non-spammy sites. That’s why it’s suddenly a big issue. It’s nothing like it was before BD.

    You say that people have the choice of acquiring more good IBLs or not bothering, and that’s true, but people should *never* have to go out and get IBLs to be included in a top class search engine’s index. Sites should be included on merit, and getting unnatural IBLs has nothing whatsoever to do with merit. Also, selling text ads, and putting affiliate links in the site have nothing whatsoever to do with whether or not a site merits full inclusion. By merit, I mean that scrapers, etc. should not be included, but all normal sites merit being included as far as is possible.

    Doug said that Google is entitled to do exactly what they want with their site, and that’s also true, but there are things that they cannot do and still remain a top class search engine. For instance, they cannot refuse to fully index perfectly good, honest, non-spammy sites and remain a top class search engine. Doing something like that means that their results are intentionally limited – that’s just not a top class search engine.

  712. Luke Wilbur Said,
    June 2, 2006 @ 6:40 am

    Hi Phil,
    You seem to understand what is going on here. Can you explain why Google states DCpages has an average top position of 10 for the keywords "washington dc", when in reality DCpages is at 130 and dropping farther every day?

    PhilC Said,
    June 2, 2006 @ 7:13 am

    I don’t know Luke. The average is over 3 weeks, and you may have had a page ranked near the top in the last 3 weeks.

    I wish that was true, but DCPages.com has been buried 9 pages down for a month. So something is wrong.

    Also, a friend of mine said that he heard that one of the ways Google ranks pages now is that they cookie users who use their browser toolbar. Is there any truth to that?

  713. Dave (Original)

    RE: “Dave. I only went into smart-ass mode because you continued to talk about something that never existed. When you talk seriously, there’s no problem.”

    Whatever you say Phil. I’m done trying to debate with you. You are simply not mature enough and/or capable.

  714. Dave (Original)

    RE: “But the point is that they should state what they are doing and not pretend to represent the whole web in an unbiased fashion.”

    Google's mission has not changed as far as I can tell. That is:

    "Google's mission is to organize the world's information and make it universally accessible and useful"

    I think we would be naive and shortsighted to take one point in time and base Google's success or failure on it. Matt has stated that "Bigdaddy is more comprehensive than our previous system". Let's not forget that the number of pages out there is HUGE and growing at a VERY rapid rate.

  715. Dave (Original)

    RE: "Can you explain why Google states DCpages has an average top position of 10 for the keywords 'washington dc', when in reality DCpages is at 130 and dropping farther every day?"

    That is odd IF what others are saying here is true – that is, that Google has fewer pages now than before. You would think the pages that are still in the index would all move up πŸ˜‰

  716. Dave, if you learned to debate, you would be fine, but debating doesn't appear to be your forte πŸ˜‰

    Luke. I don’t believe that user surfing is a ranking factor, because there are far too many reasons why it’s no good for that purpose. However, it’s a common idea that people discuss these days because it appears in a patent application, along with a mass of other ideas. The problem is that, when people start to talk about a theory, or an idea, or something that’s imagined, other people often take it as fact, and repeat it as such.

    I wish that was true, but DCPages.com has been buried 9 pages down for a month. So something is wrong.

    Something certainly appears to be wrong. I suppose you know that Google has many datacenters from which the results are delivered, and that the DC that we receive the results from changes – even between pages of results. It’s possible that some DCs were showing the higher ranking a lot longer than the DCs where you were getting the results from.

    There's also the possibility that the site is hosted in a country other than the U.S. and that it's been showing a higher ranking in the results for that country. That's probably unlikely, but you would know where it's hosted.

    It would be odd if the average top position stat is actually broken, but it could be.

    That is odd IF what others are saying here is true – that is, that Google has fewer pages now than before. You would think the pages that are still in the index would all move up

    I think you didn’t understand Luke’s question, Dave πŸ˜‰

    Even so, there's no IF about it. People aren't saying that Google has fewer pages now than before. They (we) are saying that Google is dumping pages either out of the index altogether, or into Supplemental. If you aren't aware of that, you should get out more and read about it in every seo forum. Those are the places where you'll find soooooooooo many people that it has happened to. Also, whether or not there are fewer pages currently in the index, rankings do change.

  717. Phil, you really need to learn what specifically I’m commenting on before you comment in turn.

    But it was before BD for individual sites, Adam. That’s the whole point. That’s what all the outcry is about.

    That's not what DavidW said, and that specifically was what I was addressing. Reread it, and if you're still not sure I'll give you the benefit of posting the part that I was addressing. However, if you guess wrong and put words into my mouth I won't.

    You say that people have the choice of acquiring more good IBLs or not bothering, and that’s true, but people should *never* have to go out and get IBLs to be included in a top class search engine’s index.

    You keep saying that as if it’s the only reason to go out and get good IBLs. “Get them because of Google.” “Webmasters will only get them because of Google.”

    Will the less savvy ones? Absolutely they will. There will be those out there who will get IBLs specifically for Google-type reasons. There is no disputing that.

    But the more savvy and somewhat confused right now will realize that Google isn’t punishing small business webmasters…they’re actually doing small business webmasters a huge collective favour.

    For those of you who still aren’t convinced, ignore Google for a minute. Pretend as if every IBL you get has no effect on getting indexed in Google. In other words, IBLs are completely independent of big G. In fact, let’s go a step further and assume it has nothing to do with any search engine. We’ll go right back to the dark ages of 1996 here for a minute. Program your DeLorean, and take a trip back at 88 miles an hour to the year 1996. You there, McFly? Great.

    Should you still go out and get them? Should you still try to gain as many one-way IBLs to your site as possible? Let’s consider all of the scenarios that come from asking for a one-way link from a site that provides them and has some form of human approval/decline system (e.g. a directory site):

    1) Your link isn’t approved. The decline itself can be useful simply because it indicates that at least one person doesn’t think your site is a useful resource. If nothing else, this would raise a caution flag to you as a savvy webmaster, as you now know that “I should do something to my site”, especially if another webmaster lets you know why your site was rejected (spam, bad content, minimal content, whatever the reason). That’s user feedback, and that’s something any savvy website owner wants as much of as they can get their hands on.

    2a) Your link is approved, and sends you no traffic. For the same reasons as your link being declined, this is useful as well. You now know that your site is “good enough” in the eyes of at least one webmaster.

    2b) Your link is approved, and sends you traffic. So, not only have you gained an IBL, but it’s sending you traffic and potential customers/end users.

    In the case of an IBL you know nothing about, 2a) and 2b) apply…in addition, you have an even better idea of how valuable your site is as a resource.

    Now…individual IBL approval/rejection doesn’t mean much and can be misleading, but when you can go out and get yourself 1000/1500 or more IBLs without a great degree of difficulty, you’ll have a lot better idea of just where your site ranks in terms of human perception, especially if you keep seeing your site gaining IBLs that you had nothing to do with.

    As far as small business/commercial sites go, is this “difficult”? It can be. Is it impossible? No.

    So again, for those of you who are complaining because you got “screwed” by Google, don’t look at it as getting screwed…look at it as an opportunity to improve both your website and how it is marketed.

    http://www.seo-scoop.com/2006/05/30/seo-rant

    It’s not the most logically worded piece (and no it’s not mine…I don’t even know who this person is, other than her name’s Donna), but the idea is very much true.

  718. Note: I’m not crazy about the links on the side of that blog site, but the content has merit.

  719. I’ll say it again. I would not go after links if it weren’t for search engines.

  720. Dave (Original)

    Jack, you have also stated that if there were no SEs you would promote your site via forums, links etc.

    Personally, if there were no SEs I would never stop chasing links.

  721. Adam.

    My mistake about your ‘before and after BD’ comment. I read it out of context – sorry.

    You keep saying that as if it’s the only reason to go out and get good IBLs. “Get them because of Google.” “Webmasters will only get them because of Google.”

    No I don’t. I keep saying it because webmasters now HAVE TO go out and get IBLs just to be treated fairly by Google. I never suggested that it’s the only reason for doing it. I object to having to do it – that’s all. *Having* to get IBLs to be treated fairly by a search engine isn’t the normal Web, and it’s very wrong.

    But the more savvy and somewhat confused right now will realize that Google isn’t punishing small business webmasters…they’re actually doing small business webmasters a huge collective favour.

    Your reasoning to back that up isn’t sound. People don’t need IBLs to tell them whether or not their sites are good. The lack of IBLs was never an indication of the lack of quality or usefulness – it wasn’t in 1996 and it isn’t now. Google hasn’t done anybody any favours with the new crawl/index function.

    But if you really want to think of things as though search engines don’t exist, you should include reciprocals, paid ads links, and affiliate links in your reasoning. They are also types of links that Google is hammering sites for. But they are all normal types of links in their own right – just as one-ways are. They also say nothing about a site’s quality/usefulness – they are normal Web links.

    No, Adam, Google isn't making people deal with their sites as though search engines don't exist – they are making people do things *because* search engines exist – if they want their sites to be fully indexed by Google, that is.

  722. Incidentally, Adam, the article you linked to isn’t anything to do with this topic. It’s about people thinking that their sites belong at the top of the rankings. It’s nothing to do with whether or not sites are fully indexed. People (including me) have said let the rankings fall where they may, but index the sites. This isn’t about rankings – it isn’t even about crawling – it’s about indexing.

    If you want an on-topic article, try this one:-

    http://www.webworkshop.net/google-madness.html

  723. Just as Adam tried to make a point by going back in time, I want to make a point by viewing the concept of Web.

    The Web is a huge, generic, public arena in which everyone can display what they want to display. In the beginning, people who visited the arena found things of interest by being directed from one display to another – it became known as surfing the Web. But as the arena grew larger, it became more difficult to find many of the things of interest, and help was needed to point people to what they were looking for.

    Help arrived in the form of directories and search engines. These utilities were like tourist centers in large cities – they were there to point people to displays that they wanted to find. They did the job very well, by including directions to every display in the arena as well as they were able. They even included a small amount of information about each of the displays, just like tourist centers.

    That’s an overview of the Web, and of the role of a search engine within the Web. Just as a tourist center will try to include details of every decent facility in a city, simply because they are there, and because each of them may suit the needs of some people, so search engines tried to include details of everything as best they could.

    Nothing has changed since then. The arena has grown, but the role of the search engine hasn’t changed. Search engines are still the equivalent of tourist centers, and it is still their role to point people to displays within the public arena. Since search engines arrived in the arena, people who put new displays up only had to register them to be included in the engines’ lists of displays to point people to – just like a tourist center. That’s how it has always been, and that’s what the Web population still expects it to be.

    Most search engines still perform that role as well as they are able, but not Google. Google has decided to stop pointing people to certain displays if the displays don’t fulfill certain criteria. Instead of being impartial, as people expect of a tourist center/search engine, they have now become the judge of the displays that they will direct people to. In other words, they have abandoned the impartial role of a proper search engine.

    But people who use Google’s utility, don’t yet know that Google isn’t doing their best for them any more. If they did, they would use a different utility that would provide them with the impartial information that they want and expect. It doesn’t take very much for a top search engine to become history. It happened to AV when the buzz about a new search engine called Google spread around.

    Having said all that, I do believe that BD is Google's way of trying to do the best they can for the Web's population, because I believe that the new crawl/index function is intended to deal with link pollution. But the way that it's worked out in practice is bad for the Web's population, and if Google accepts it, then I would have to believe that they are no longer interested in providing a top service, and that other utilities (search engines) offer a much better service for people.

    The Web is a huge, generic, public display arena, and search engines are the equivalent of impartial tourist centers. When they become partial, as Google has done, they stop being the good search engines that people want and expect.

  724. Actually, Phil, the article in a left-handed way is on topic. What we have right now is actual value vs. perception of value.

    And, although the article does refer to ranking specifically, the same logic can apply to indexing. We have a series of webmasters who think their stuff is great, and a search engine that, via an objective algorithmic process, disagrees. The only difference is that it refers to ranking instead of indexing.

    That’s the point I was trying to make…that a lot of this is emotional bias. The phrase about the blinders is especially true (although a bit harsh possibly).

    I also never said that IBLs were an indication of lack of quality or usefulness in and of themselves. What I said was that IBLs that require manual approval can be an indication of quality and/or usefulness. One of the biggest problems any webmaster faces is trying to get feedback of any kind as far as the overall quality of the site. Getting IBLs (or not getting them) from sites that require manual approval is another way to possibly gather this information. Is it 100% accurate? No. Is it accurate enough to give a general idea of how a site may be perceived? Yes, and that in itself is worth something.

    The same thing would apply to IBLs that a site receives that it never even asked for or did anything to include. If Dave sees an IBL to his site from another site in their blog, or Matt sees his blog linked in a CNN article, or I see a page on my site hyperlinked from a PHP board, then we know we’re building something others find useful. The lack of these IBLs does not imply lack of usefulness, but the presence of these IBLs does imply usefulness.

    A lot of you seem to think that search engines are the only good source of traffic. They’re not. Do they send a lot of traffic? Yes, they do. But can other sites send a lot of traffic as a collective? Yes, they can.

    Take advantage of every resource possible, not just search engines; then, when the engines do change their algos, the damage will be minimal, if any. If you don't, you're cheating yourself.

  725. I’m not very good at reading left-handed, Adam πŸ˜‰

    I can see what you’re getting at, but I haven’t seen anyone suggest that their stuff is great – just that their stuff is there, is non-spammy, and therefore merits being included. For instance, Nancy didn’t say that her site is great, and nobody suggested that it merits being excluded. Also, I don’t think that anyone would judge the site as lacking in interest and usefulness for some people. I certainly found it interesting because I used to do quite a few Eagles songs, but I didn’t know anything about them, so reading about the band on that site was very interesting for me. It’s a useful resource.

    I agree that some IBLs can be an indication of a site’s quality or usefulness, and from that point of view, it could be said that Google has done everyone a favour, but that would be by accident, and not by design. Generally speaking, people don’t want that favour from Google – they just want their sites to be treated fairly.

    I agree that people should not rely on search engines for their sites’ success, and people should always try to find other sources of traffic, but search engines provide the bulk of Web traffic, and all websites should be treated equally fairly by them. Judging a site’s worth by its IBLs and OBLs is not a fair way of treating a site – it’s very unfair, and it’s wrong to be so unfair. That’s all I’ve been saying.

  726. Good points about the ranking/indexing issues, PhilC.

    Adam – previous post

    ‘Now…individual IBL approval/rejection doesn’t mean much and can be misleading, but when you can go out and get yourself 1000/1500 or more IBLs without a great degree of difficulty, you’ll have a lot better idea of just where your site ranks in terms of human perception, especially if you keep seeing your site gaining IBLs that you had nothing to do with.’

    This statement raises a number of very important issues which I believe are at the heart of these issues/ blog.

    An important point about IBLs is that they are appreciated and understood by webmasters and other web-savvy people, particularly people who regularly visit blogs like this. But they don't mean diddly squat to the average web user or the owner of a small business. If you ask a sample of web surfers how they would rank a variety of websites they have visited, I am sure they would mention things like good, relevant content, pleasing design, easy navigation, a call to action – i.e. where to get more info – etc., but I bet none would mention the number of IBLs, or OBLs for that matter. (This would make a very good research project for somebody – a G intern maybe ;-) So IBLs are not a natural, or popular, way of judging a website's quality, except by web insiders such as webmasters.

    You say that people can go out and get 1000+ IBLs without difficulty – I disagree entirely. I've worked with several small businesses in a marketing role, and each business had around 30–40 links, which were acquired over two/three years. To get 1000's of links is next to impossible for many companies/websites. One area of business was specialist vehicles – minibuses etc – there are around 30–40 similar companies in the UK and most of those are competitors. So how do they get 1000's of links? You are forgetting that you and many others on this blog are web insiders – who know their way around the web and have contacts with which to find links – but you are not typical of the average website owner. If anything, SEO companies and similar should be downgraded in terms of links, as they have an advantage over other types of company (this will be popular!!) that are not concerned directly with the internet. By insisting on higher grade links and not reciprocal links, Google is acting unfairly with regard to smaller, non-computer/internet sites IMO.

    Another point that I take issue with is – 'you'll have a greater understanding of human perception' – presumably of the value of your website. What you are talking about is the perception of other webmasters – not ordinary surfers! You are suggesting that if you get 1000+ links then you are seen as having a useful, quality resource – but by who – other webmasters! NOT by members of the public, and that is the issue that you have missed. Your site obviously has value for other webmasters and people interested in building websites, but an ordinary surfer would probably find it a bit dull. The fact that other web-people find it interesting results in loads of inbound links. But what happens to sites that are not web/seo/blog orientated? They will find it much harder to find good quality links. How is a local artist or photographer going to get thousands of links? Should they then be excluded from the index?

    Perhaps you’re spending too much time on the web!

    As PhilC said above, I'm not personally too worried about rankings. I don't expect to rank on page one for popular search terms, but I do expect to be fully indexed. This is the main point. Much of my traffic comes from direct sources – people find the site useful – if they can find it in the first place!

  727. I agree that people should not rely on search engines for their sites’ success, and people should always try to find other sources of traffic, but search engines provide the bulk of Web traffic, and all websites should be treated equally fairly by them. Judging a site’s worth by its IBLs and OBLs is not a fair way of treating a site – it’s very unfair, and it’s wrong to be so unfair. That’s all I’ve been saying.

    The former generalization, while it may be true right now, is changing. People are finding their favourite sites/niches (e.g. this site) and are using them more and more. I'm personally finding more and more traffic from sites that, in some cases, I've never even heard of.

    And if judging a site’s worth by OBLs and IBLs is unfair (and on this one, we’re just going to have to agree to disagree), what is a better alternative?

    “Because it’s there” doesn’t work because of sites that are under construction, scraper sites, doorway pages created for SEs, mirrored copyrighted content, and other reasons (but these alone are enough).

    Using the Toolbar wouldn’t work because of backlash over privacy concerns, and because as the Alexa toolbar has shown us, it can be very easily manipulated and will have a tendency to lean toward web-design-industry-related sites (as in all aspects, including SEO).

    Social bookmarking…well, that’s an IBL of a whole different colour, and it’s still very much in its infancy.

    So what else is there? What else could possibly be used that would be any better?

  728. On-page content :)

  729. So what else is there? What else could possibly be used that would be any better?

    I completely agree about the scrapers and other page types that you mentioned – I said that way up the thread. And I’ve also said that Google has to do something about link pollution. They don’t really have any choice if they want to stem the deterioration of the serps. So we probably agree about the overall aims of the new crawl/index function. Where we disagree is the method used. We may also disagree if the new function is also intended as a means of preventing the available index capacity from filling up too quickly, as I think it might be.

    So what's the alternative to the current method? I suggested one at the top of this thread – don't count certain link types for rankings – remove them from the index – but don't use links to decide how much or how little of a site to index. That way, unwanted links won't benefit a site, because they will fail in their purpose. At the same time, all pages will at least have a chance of being found in the serps in the long tail of search terms. I know that it's far from ideal, but it would prevent unwanted links from getting high rankings for anything even partly competitive, which is where the spam is targeted, and it would allow the pages of honest sites to rank well in the long tail. In short, discount certain link types for rankings, as sketched below.
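
    A rough sketch of what I mean (hypothetical link types and names, in Python – this illustrates the suggestion, not anything Google has said it does): keep every crawled page in the index, but strip the unwanted link types out of the graph before the ranking pass.

    ```python
    from collections import Counter, namedtuple

    # Hypothetical link record; "kind" would come from however the engine
    # classifies links (reciprocal, paid, affiliate, editorial, ...).
    Link = namedtuple("Link", ["src", "dst", "kind"])

    # Link types removed from the index so they can never affect rankings.
    DISCOUNTED = {"reciprocal", "paid", "affiliate"}

    def index_and_rank(pages, links):
        index = set(pages)  # every crawled page stays indexed, regardless of links
        counted = [l for l in links if l.kind not in DISCOUNTED]
        # Crude stand-in for a real ranking pass: score pages by counted IBLs.
        scores = Counter(l.dst for l in counted if l.dst in index)
        return index, scores

    pages = ["a.com/", "b.com/"]
    links = [Link("b.com/", "a.com/", "editorial"),
             Link("b.com/", "a.com/", "reciprocal")]
    index, scores = index_and_rank(pages, links)
    # a.com/ stays fully indexed, but only the editorial link earns it credit.
    ```

    The point of the sketch is that the filtering happens on the ranking side only – index membership never depends on the link graph.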

    Another alternative would be to really work on profiling certain types of spam pages, so that they can be dropped. Profiling links to pages, so that bad pages and sites can be dropped is another alternative. Google already uses plenty of profiling, so they could work on those.

    I do believe that there are ways of targeting bad pages and sites that are different to what they have done, and that would hit bad pages and sites without hitting many good ones. They may be a long way from being reasonably able to profile certain types of spam pages, but they could quickly change their new links evaluation so that it hits rankings, but not indexing. Drop the unwanted links – remove them from the index so that rankings can never be affected by them, but using links to decide how many or how few of a site’s pages to index is wrong.

    There are alternatives.

  730. Correction:

    Profiling links to pages, so that bad pages and sites can be dropped won’t work, because it would mean that anyone could take out anyone else’s site.

  731. I have a very small site with around 25 pages. All pages contain original content and the html is 100% compliant. I have less than 45 OBLs, 90% of my OBLs are on my links page, and 90% of my links are nofollow. Pagerank is 4 for most of the main pages, dropping to three on subpages, with no pagerank for pages created in the last few months. All IBLs point to the index page. Also, the site has been submitted to Google Sitemaps.

    My question is: a few months ago, Google indexed all my pages, so why has it suddenly de-indexed 9 of the 25 pages? My only observation is that these 9 pages have fewer internal links compared to the indexed pages.

    Any ideas?

  733. Dave (Original)

    The idea of sites linking (as votes) shot Google to fame all those years ago. I know that Google is now using inbound links as a reason to, or not to, index sites. What is that reason? I don't know for sure (sure wish some others would admit to that). However, I would hazard a guess that, due to the sheer volume of pages out there, it has been FORCED (at least for now) into using criteria for indexing. I have almost no doubt that Google is STILL working on ways to index ALL pages out there.

    If Google (who know ALL the facts) sees inbound links as a valid criterion for indexing all pages of a site, then so be it. SOME of us have known for years that NOT all links are counted as votes.

    Biting off one's nose to spite one's face has never been a viable option for the business savvy πŸ™‚

  734. Dave (Original)

    Jerome, without seeing the site it’s hard to say. However, Matt has stated that he *believes* more quality links into a site will result in better/deeper indexing.

  735. Comment back to Matt on recent Google Search Results / Big Daddy Problems…

    My site has been chopped up so badly by the recent Big Daddy update problems that it's simply destroying my site's reputation and my personal economy, along with that of 1000's of people who count on me to help them find jobs. I'm sure it has affected many webmasters the same way. I only wish that Google would at least make some official public statement that the search engine service is temporarily disrupted due to updates and the results displayed may not be relevant… It's a shame this happened… Did Google make a back-up copy of the old version that worked?

    Today, on June 6th, I see a ton of the old spammy sites like superpages.com ranking high again. Do you realize that they have over 5 million pages indexed by Google? Most of those are the typical junk – Superpages: Widgets… Find Widgets at Superpages and get the best deal on widgets, blah blah blah. They're nothing but junk.

    By contrast, my site is comprised of just a couple of thousand very relevant, highly focused, on-topic, hand-built pages, void of AdSense, RSS, affiliate links and so on – the typical fodder that makes up the majority of sites these days. I know you like facts with your complaints, so I'll give you a few, which testify to the legitimacy of the site:

    A) My site, a nationwide directory of mobile entertainers – http://freedjamerica.com – is 4 years old, hand built and comprised of 1000's of disc jockey listings and other party services, which lately has been cut down from the usual 2000-something pages that were indexed at Google to only 588, and that number seems to shrink daily.

    B) My site is not a spam site, landing page site, MFA (made-for-AdSense) site, etc. It has no cloaking, hidden text, redirects, etc. ALL THE CONTENT, 100% of it, was manually published by me. ALL the listings in it are hand edited. ALL the listings were submitted and processed by real people. The site is completely static with no dynamic elements whatsoever. It has no database, no AdSense, no auto-generated pages or RSS feeds… It's just a simple hand-made static site.

    C) I have done everything I can do to make this site acceptable, and it has shown up great on Google for years, yet now it seems like I'm being punished for having followed Google's ethics guidelines, as I see sites like superpages.com once again start climbing up the rankings.

    In summary, what gives? I thought this update was supposed to clean up spam, not create it! Or is this simply about AdWords profit, as so many have suggested on the forums – eBay cloaking, hidden content and empty landing pages from all the big sites like superpages' gazillion subdomains, concert ticket sites and others. These huge sites are notorious for just catching traffic, without providing any useful on-topic content whatsoever. Granted, if someone were looking for a 1932 RCA radio, or something else in particular, then having a site like eBay pop up makes perfect sense, but for the zillion other searches for "regular stuff", how does Google allow the spam sites to rank so high? Is that REALLY what Google is encouraging webmasters to do? Or is it conveniently allowed because those companies spend huge money on AdWords?

    Google has very little worthwhile support, as all the documentation is put through the Google blender, seemingly randomizing the words in the articles and support FAQs so that it renders them void of any clear instruction. The support forums are ridiculous, because whoever the support people are, they rarely if ever respond to anyone's posts. The support form submission apparently delivers all support requests directly to the recycle bin, because of the past 4 I've sent in, I've heard nothing back at all. I hope you'll seriously consider the implications this has had for so many hard working webmasters and small business owners such as myself. I have spent at least 200 hours per month on my site, and it's incredibly depressing to see Google's irresponsible release of Big Daddy destroy perfectly legitimate search results for high quality original content sites like mine.

    PLEASE, PLEASE, PLEASE Post some response or at least reply to me in person. You have my email address.

  736. Reflecting on the impact of the recent implementation I wonder whether the Google team have considered the long term impact of their changes.

    The increased emphasis on IBL’s to get indexed creates a barrier to new sites getting indexed and hence reduces their visibility. Lack of visibility will impact their ability to naturally acquire IBL’s.

    Does this mean that Google’s index will become stale with fewer new entrants, with new content, and will it become a club for established sites?

    Secondly, for those sites currently in the index… are their IBL's being impacted by sites with few IBL's dropping out? As sites drop out, do their outbound links cease to count, and hence do some other sites lose IBL's as a consequence?

    If this is true, each iteration of the process sees established sites lose IBL's, and some will ultimately fall below the IBL volume criteria and drop out of the index (a toy model of this cascade is sketched below).
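
    As a toy model of that cascade (purely hypothetical thresholds and numbers, in Python): suppose a site needs some minimum number of counted IBL's to stay indexed, and that links from dropped sites stop counting.

    ```python
    # Toy model of the feedback loop described above. Sites below an IBL
    # threshold drop out; their outbound links then stop counting, which can
    # push further sites below the threshold on the next pass.
    def surviving_sites(inlinks, threshold):
        """inlinks maps each site to the set of sites linking to it."""
        alive = set(inlinks)
        changed = True
        while changed:
            changed = False
            for site in list(alive):
                if len(inlinks[site] & alive) < threshold:
                    alive.discard(site)   # falls out of the index
                    changed = True        # its links no longer count; re-check
        return alive

    # a has no counted IBL's and drops out, which knocks out b, then c.
    inlinks = {"a": set(), "b": {"a"}, "c": {"b"}}
    print(surviving_sites(inlinks, threshold=1))  # -> set()
    ```

    Whether any real threshold would cascade like this is unknown, but it shows why the question is worth asking.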

    For these reasons I do not consider that the recent changes can be sustainable in the long term and Google must have a rethink, or we will learn that this was a stopgap change to address some short term problems with a more considered set of changes to come.

  737. Is anything in Big Daddy going to be done about this sort of cloaking/redirect spam?

    I have noticed it a few times, and it is not the normal form of cloaking where the search engine is served a different page from the user. Here the normal page is served when you type the URL into the address bar, but a different page is served when the user clicks through from the search results.

    One example I found is below:

    Look at the search results for : http://www.google.co.uk/search?hl=en&safe=off&rls=GGLG%2CGGLG%3A2005-47%2CGGLG%3Aen&q=hen+parties&meta=

    Currently in approximate position 8 is an entry for http://www.aceweekends.co.uk – you can go to the site direct by typing it into the tool bar. However when you click the link from the search results it goes to http://www.aceweekends.co.uk/choice.php – this just shows two links to affiliated websites (which are even on the same IP address!!).

    If you dig a bit further under the surface you can find even more suspect stuff going on:

    First check the backlinks on Google:

    http://www.google.com/search?as_lq=www.aceweekends.co.uk&btnG=Search

    Here I see a grand total of 16 links (not much for a site that is in the top 10 for a highly competitive search term – so what's going on?).

    Now let's try the same on Yahoo:

    http://search.yahoo.com/search?p=link%3Ahttp%3A%2F%2Fwww.aceweekends.co.uk&ei=UTF-8&fr=sfp&n=20&fl=0&x=wrt

    Yahoo shows a total of 441,000 links!!!

    Dig a bit deeper and check some of those Yahoo links, and you find that they are almost all derived from Digital Point's Co-op.

    So although it seems that Google has not counted the Co-op links in its backlink results, it does seem to be counting them in its algorithm (hence the high position in the Google SERPS).

    So this seems like a simple but very effective way to spam Google, without the possibility of the main sites you are trying to target being penalised:

    - Set up a very basic website about your topic.
    - Point a huge weight of links at the site using your key terms, using something like Digital Point's Co-op.
    - Wait for the search engines to give you the high rankings.
    - Put a script onto the site to detect whether the visiting user is coming from one of the main search engines (see the sketch below).
    - If the user is coming from a search engine result set, either redirect them to the target site, or serve them another page with in-your-face links to your targeted websites.

    Pretty simple, hey! And the best part is that the sites you are really trying to target are clean, so even if Google penalises the doorway site it doesn't matter, as it was just something you knocked up in a few hours anyway – so you can just set up a new doorway site and do the same thing.

    I am not advocating this at all, but I am hoping that by putting this in the public domain and advising people close to Google, they will close this loophole down before it gets out of control!
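
    To make the loophole easier to recognise, here is a minimal sketch (Python, with made-up page names and referrer patterns) of the referrer check described in the steps above:

    ```python
    # Illustrative only: the referrer-based cloak described above. A doorway
    # page serves innocent content to direct visitors and crawlers, but hands
    # search-engine click-throughs a different page of links.
    SE_REFERRERS = ("google.", "search.yahoo.", "search.msn.")  # assumed patterns

    def page_to_serve(referrer):
        """Pick the page the doorway would serve for a given HTTP referrer."""
        from_serp = bool(referrer) and any(se in referrer for se in SE_REFERRERS)
        return "links-page.html" if from_serp else "innocent-page.html"

    # Typing the URL in directly (no referrer) gets the clean page;
    # clicking through from a Google result triggers the swap.
    assert page_to_serve("") == "innocent-page.html"
    assert page_to_serve("http://www.google.co.uk/search?q=hen+parties") == "links-page.html"
    ```

    Because crawlers and direct visitors never send a SERP referrer, the doorway always looks clean to everyone except searchers clicking through.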

  738. Man. This is all confusing.

    If I make a few pages on my site full of Amazon affiliate links to potentially generate a bit of revenue – pages that ARE related to my site content – will I get penalized? NOT on the front page or anything – my site is legit.

    My site is doing great in the Google search engine again, although pre-Big D it seemed to hide in the sandbox; now we're back.

    Crap. Somebody save me. This stuff goes over my head at times. Maybe I’ll put the affiliate links in my blog instead.

    Any advice out there?

  739. I heard Matt took vacation as a "cooling off" period and to reflect on whether or not he wants to retire. This happened about the time he found out Google no longer cares about search quality, since they are too busy trying to release new products that have nothing to do with search every other day.

  740. Thx for the reply Dave. I completely agree with you that, due to the hugely increasing volume of pages out there, Google has been forced into indexing fewer pages per site, which is a great pity, but hopefully they will get it sorted soon.

    Yes, I have been requesting more IBLs in the last few weeks, although I'm not convinced it's the best indicator of a quality site. I'd far prefer to be concentrating on writing new interesting content than requesting links.

    The good news today is that my site has jumped up to No. 1 for our main keyword phrase (from 31) (the same thing happened for just one day last month, so I won't get too excited!)

    The de-indexed pages are still missing though!

  741. Dave (Original)

    Jerome, my site too drops by about 20,000+ pages at times. They always end up coming back though.

    Likewise my SERP position. That is, I notice a page jump onto page 1 only to fall back (after a refresh) to page n. The good news is that it's normally an indication that the page will *stay* on page 1 after some time (weeks, normally).

    RE: "Yes, I have been requesting more IBLs in the last few weeks, although I'm not convinced it's the best indicator of a quality site. I'd far prefer to be concentrating on writing new interesting content than requesting links."

    I can only assume Google knows what they are doing and are the ONLY ones with the BIG picture in mind. I too spend most time writing content for my site and, although it’s slow, it is the best way IMO to obtain good quality one-way inbound links.

  742. Hi Matt,
    Let me first thank you for taking the time to write about the ins and outs of Google. I think at least 80% of us want to play the game correctly; it's just that we don't always know what Google considers "correctly".

    A good example is my latest "uh oh". I built a website for a friend, and somehow Google found it, indexed it and even PageRanked it. Normally this would be a good thing. However, I didn't know all the Google rules, so I copied the site, MySQL and all, moved it to the company's URL and took it live. I deleted all the main pages from the test site, and placed a URL to the new site on the old site's home page. The new and actual site is not being indexed, of course. It's probably because it's 100% duplicate pages. Well, there's a lesson learned the hard way – working to clean up my mess.

    But now for my main reason for writing.

    Google Sitemap: When I make any major content changes to a page, I resend my sitemap, so the proper new wording is indexed and people find what they are searching for.

    However, there is so little I know about the process. After I submit the new sitemap to reflect content changes, how long before the page's indexed content is updated?

    Does Google just grab that one page, or do they refresh the whole site?

    After Google downloads the sitemap – what happens next, at Google?

    Best regards,
    Jeff
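
    (For reference, regenerating a sitemap after a content change is roughly this simple – a sketch with made-up URLs; the sitemap schema URL is the one Google's Sitemaps documentation used at the time, so treat it as an assumption:)

    ```python
    # Minimal sketch of rebuilding a sitemap so a changed page carries a fresh
    # <lastmod> date – the hint that tells Google which pages changed.
    from datetime import date

    def sitemap_xml(pages):
        """pages: list of (url, last_modified_iso_date) tuples."""
        entries = "\n".join(
            "  <url><loc>%s</loc><lastmod>%s</lastmod></url>" % (url, mod)
            for url, mod in pages)
        return ('<?xml version="1.0" encoding="UTF-8"?>\n'
                '<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">\n'
                + entries + "\n</urlset>")

    # Hypothetical example: one page edited today, one untouched.
    print(sitemap_xml([
        ("http://www.example.com/changed.html", date.today().isoformat()),
        ("http://www.example.com/static.html", "2006-01-15"),
    ]))
    ```

    The per-URL <lastmod> is the main signal a sitemap gives about what changed; how and when Google acts on it is exactly the kind of detail Jeff is asking about.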

  743. Matt,
    Today DCpages.com is completely wiped off your search. I can see it in my stats, and I could not even find it in your search results. Besides the Washington Post and the Washington DC Convention Center, I would say we deserve to be on the first page for Washington DC. I do not know if you read your own blog, but can you please give me a reason why we just keep dropping? When I call, your staff says that we are not being penalized. Obviously that is not true. All I want is a straight answer. How did we drop, after 10 years, to nothing on your search? We have followed every rule and been community centric in our focus. According to a local Washingtonian expert, the total links for http://www.dcpages.com are 389,635.

  744. Luke,

    You’re wasting your time on two levels:

    1) Matt’s on vacation.

    2) Matt’s already stated that he won’t answer specific questions on sites because of the number of webmasters that would ask and fill the blog full of crap.

  745. Hi Matt
    I realise that you cannot possibly answer every question asked of you, but would you try this one please. I have seen it asked before, but not answered. In Google's Sitemaps program, lots of people are getting "Potential indexing problems: Some of your pages are partially indexed". Could you please explain what this means? Is it because of invalid HTML in the page, or because I haven't got enough links?

    All the best
    Alan

  746. Dave (Original)

    RE: “Today DCpages.com is completely wiped off your search”

    According to Google, you have 88,000 pages in its index.

    RE: “Besides the Washington Post and Washington DC Convention Center I would say we deserve to be on the first page for Washington DC”

    You and the other 10000000 shooting for the same phrase πŸ™‚

    Google doesn’t answer questions on poor rankings. If they did, oh boy oh boy…..just imagine!

    BTW, you are at about #80 in Google SERPS for “Washington DC”. To be blunt, I cannot see why your homepage should rank on page 1 for “Washington DC”.

  747. Dave,

    First I want to thank you for taking your time to answer my question. I do appreciate that.

    You are probably right that many sites are shooting for the same phrase. But if you look through the top 80 (farther down in my browser), the results list few websites focused on promoting the District's history, tourism, and community like DCpages.com. In the past we promoted our community. Now it's our community that is promoting us. While I am grateful for that, that is not our mission.

    In the past Google had all our pages of written content, now they are gone.

    Look at today's search results for Washington DC. These four actually deserve to be in their spots:

    Washington DC Convention and Visitors Association
    Washington, District of Columbia
    Washington Metropolitan Area Transit Authority
    Washington Post

    Here are examples of others that do not belong at all.

    Ex.
    weather.com – Page Not Found
    Seattle Post-Intelligencer
    SmarTraveler Legal Terms…

    I realize that DCpages is a small guppy in Google's world, but we have a staff that really cares about the District of Columbia. I can only imagine how difficult it is to write an algorithm that makes everyone happy. I also know it is very difficult for an engineer who has no relationship to what we are doing to have any empathy for our publication. But in the past we were proud of our search results with Google. Now all our traffic comes from MSN, Yahoo, and Ask. In the past, Google was above them; now we get very little. I have been asking this same question for months and still get no answer. What are we doing wrong?

    Adam,
    Thank you for responding to my question as well. I really think it is important for people who have web publications to understand what has changed in Google. At this point, would you agree that Google is the most used search engine? Would you agree that, for organizations focused on promoting their city, Google is the best tool to use? Does Google strive to find the organizations that are focused on information on a specific topic?

    I just want to be clear here. In no way am I shooting down the company that has helped our community publication in the past. I am just looking for answers to the future.

  748. I agree with you, Luke. I think it is important for everyone to understand what has changed in Google. The problem is that if they tell people who have legit sites, they also tell those who have illegit sites and that’s what imo this was all about in the first place. There are some things that should be left to secrecy and speculation, and this is probably one of those things.

    As far as the results you’re seeing goes, I don’t see those at all. I see 10 resources in the city of Washington (Smithsonian, their National Zoo, a sightseeing map). It isn’t until about the third page where I see something that I would consider to be irrelevant (germany.info).

    Buuuuuuuuuuuuut (since I’m about to get flak on this)…

    Different datacenters return different results at different times, and what you see won’t be what I see won’t be what Dave sees won’t be what…you get the idea.

    I wouldn't necessarily agree that Google is the best tool to use for sites that promote city tourism. One thing that tends to piss me off sometimes about big G is the inability to find certain entertainment-related things in my hometown of Toronto. I even have this bizarre ability to find gay-and-lesbian-related sites when looking up innocuous things. (Side note: I have nothing at all against gay people, and if that's your orientation I'm cool with that. I just don't want to know about your Naked Men's Club when I'm trying to find a bowling alley near me, that's all.)

    In other words, the traffic you get from Google is a bonus, but it’s not something that can and should be relied upon. Google is a free resource, and subject to change pretty well whenever it wants. If you’re relying on so much traffic from a resource like big G that a change in the algo is hurting you so badly that you need to ask why your site isn’t ranking, then you may well have a much deeper issue.

    You may also want to consider that your site bears certain similarities to the portal pages offered by all 3 major engines (in Google’s case, you have to be logged in and have a Personalized home. Theirs is the least similar, but there are commonalities). To a small extent, you’re a competitor. If you were a major search engine/portal, and a competitor came along with a more specialized regional portal, what would you do? “Hello, ant. Let me just get my brand-new wingtip and go SQUASH!”

    I’m not saying any SE does do that. I’m saying it’s a possibility that it may be algorithmically detected now. A very slight one, but stranger things have occurred.

  749. Luke: I should clarify that I’m neutral to your site personally. On the one hand, it’s got a lot of information on it. On the other, I find it cluttered and slightly difficult to navigate.

  750. Saw a BigDaddy complaint today in an SEO e-mail newsletter I get regularly. The author of the complaint trashed Google, asserting that its results were like PPC results. I couldn't disagree more.

    Yahoo/Overture allows me to pay to rank higher in the organic listings, through "Search Submit Express". I think MSN may also allow this, but their adCenter is too much of a hassle, so I haven't bothered to figure it out. I do give MSN credit for recognizing and indexing another one of my sites more quickly and consistently than G or Y.

    My main site generally ranks as well or better on those other two engines, and in particular for things where I shouldn't rank well. For example, both give me good rankings for "dwi" or "speeding tickets", terms that do not include location names. I am a local lawyer in the Albany NY area, handling mostly cases in this area and some other parts of upstate New York. I shouldn't rank well for a non-location-specific term like DWI – that should go to MADD or some other national organization that deals with DWI. (What's funny is that the #1 site on G for that search relates to drinking water quality in England, and #2 is about "Dancing With Intensity".)

    I do well on all the major engines for location-specific terms (i.e. Albany DWI), and that’s appropriate because I am one of the major DWI lawyers in Albany and I have a strong website with a lot of useful content, and have contributed content to other relevant sites.

    But in my mind it’s a negative reflection of the other engines that they rank me high for terms that are not location-specific. I appreciate the traffic, but at the same time it’s lower quality traffic (someone in California who searches for DWI is not likely to be looking for a DWI lawyer in Albany NY). But in kind of a Groucho Marx attitude, any engine that would rank me high for something like that is not an engine I would want to use. I give Google credit for not ranking me high on the terms that are not location-specific.

    So keep up the good work. And don’t start ranking me high for terms that aren’t location-specific. I’d lose respect for ya!

    Warren

  751. Dave (Original)

    RE: “In the past Google had all our pages of written content, now they are gone.”

    By "gone" do you really mean not ranking as well? I ask because I see 88,000+ pages in Google for your site.

    One thing I will say, and it’s about the single biggest mistake most Web sites make, is that one should never rely on one phrase for most of their traffic. It’s far better to get 1000 hits from 100 pages, than 1000 hits from 10 pages.

    True, but it does no good if the traffic is coming to pages that Google hasn't updated since 2005 :P

  753. My earlier post on 6/6/06 asked the question of why Google had de-indexed some of my pages. Since then, all but three of the pages are now back in the index. My observation is that these three pages have the fewest internal links, so fair enough – easily fixed.

    Completely agree with 'The Adam That Doesn't Belong To Matt': whatever improvements Google make, they cannot give too much away, as it will make it easier to manipulate the SERPS.

    My opinion is that you can moan about not being indexed by Google but you shouldn’t moan if your rank is not what you hope for.

  754. Dave (Original)

    RE: "True, but it does no good if the traffic is coming to pages that Google hasn't updated since 2005 :P"

    Huh? Not too sure I know what you mean. The only pages Google updates are its own. Users don't look at 'cache snapshots'.

    I will say that it can do *no harm*, and Google isn't the only SE out there. The more pages you have in SEs focusing on different keywords/phrases, the better insured you are against algo tweaks and SE fluctuations in general.

  755. Dave (Original)

    RE: “My opinion is that you can moan about not being indexed by Google but you shouldn’t moan if your rank is not what you hope for.”

    Why the difference? Google doesn’t owe anyone rank or indexing IMO.

    Matt has stated time and time again that poor rankings and indexing etc are mostly down to the site itself.

  756. You can tell when Google is showing pages with a different title and a different description. Obviously the user won't find what they are looking for when Google doesn't update these in a timely fashion.

  757. Dave (Original)

    If Google is showing pages with titles that are over 1 year old, then the site likely has big problems, or nowhere near enough links. However, I would say the problem you describe is a rare exception and certainly not the rule.

  758. It seems to be the rule with all of my sites. On my main site I get 60% of my traffic from Google, but 90% of that is supplemental results.

  759. Hi Luke,

    I didn’t look at your site before, but since it’s become the topic of a few posts, I had a look, and to be honest, I can’t really find much in it except empty frameset pages that pull other sites into the main frame. The site seems to be mostly a directory. I was looking for articles, information, etc. about DC, and I couldn’t find any. I’m not saying that the stuff isn’t there – just that I couldn’t find any of it, and almost everything I found was a directory – and all the listings link to empty frameset pages.

    I.e. most topics go to a directory structure. Every directory listing page has several listings, each of which links to an empty frameset page, which links (src) to a virtually empty top frame page – all of which are identical. So every listing page leads to several empty frameset pages, plus several virtually empty and identical top frame pages. That would make the site consist of predominantly 'nothing' pages.

    Without studying the site in great detail, I would estimate that most of the pages that the engines have of the site are empty frameset pages and virtually empty, identical top frame source pages. A person might find the site to be very useful, but how would a programme view it? (See the sketch below.)
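
    One crude way to look at a page the way a programme might (a hypothetical sketch – the 20-word cut-off is arbitrary):

    ```python
    # Sketch: fetch a page and judge whether it is an "empty frameset" –
    # a frameset wrapper with almost no indexable text of its own.
    import re
    import urllib.request

    def looks_like_empty_frameset(url, min_words=20):
        html = urllib.request.urlopen(url).read().decode("utf-8", "replace")
        has_frameset = re.search(r"<frameset\b", html, re.IGNORECASE) is not None
        text = re.sub(r"<[^>]+>", " ", html)  # strip all tags, keep visible text
        return has_frameset and len(text.split()) < min_words
    ```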

    Have I misunderstood the site?

  760. The Irony of Google and Links – an overview.

    Before Google came along with their links-based rankings, websites linked to other websites whenever it was suitable, and the Web worked just fine. Then Google became very popular and people stopped linking naturally. Instead, links were added or not added according to how it might affect the rankings in Google, and in the other main engines that had copied Google’s links-based system. People stopped linking to other sites unless the links were reciprocated, etc.

    Google’s links-based system relied on the natural linking of the Web, and the system produced much more relevant results than the other engines of the time, but it had the unfortunate effect of destroying the natural linking of the Web – the very thing that their system relied upon. The destruction of natural linking in turn caused the quality of Google’s index and rankings to deteriorate. Google’s index and rankings self-destructed.

    Google caused the deterioration of their own index and rankings. Nobody else is to blame. When a search engine states that it ranks pages according to certain criteria, as Google did, then nobody can be criticised for seeking to fulfill the criteria.

    So now we have the ludicrous situation where, because Google destroyed the natural linking of the Web, they now suggest convoluted ways in which websites can get natural links – unnaturally! Write articles, write blogs, and create a buzz, are some of the suggestions. If every website owner wrote articles, or wrote a blog, then it would all even itself out and nobody would gain anything. And it simply isn’t possible for most websites to create a buzz.

    The whole thing is utterly ludicrous. What's done is done, and we have to live with it because the clock isn't going to be turned back. But the whole mess is Google's fault, and no fault can be apportioned to anyone else.

    I’ve no doubt that Big Daddy is Google’s way of starting to sort out their own mess. It just happens to be the wrong way, imo.

  761. Dave (Original)

    RE: “It seems to be the rule with all of my sites”

    Jack, not wishing to be rude here, but it sounds like the common link is you!

  762. I use no trickery in my sites at all. The fault is definitely Google’s as no other engine has a problem.

  763. Dave (Original)

    RE: “The fault is definitely Google’s as no other engine has a problem.”

    Oh, ok πŸ˜‰

  764. Cracking the Google code.
    Found some "must read" stuff about the Google patent.
    Finally I understand what is happening! Let me know what you think.

    http://www.wwwcoder.com/main/parentid/285/site/5033/266/default.aspx

  765. Jack. You occasionally find the odd person who twists every which way to show that Google does nothing wrong, but it's best to ignore them and look at the realities. You'll probably find the answer to your sites' problems at the top of this thread.

  766. All my sites’ index pages are pretty much “not been cached since May 26th” at the latest. Perhaps Google is about to roll out new results.

  767. I use no trickery in my sites at all. The fault is definitely Google’s as no other engine has a problem.

    Presenting affiliate hyperlinks on a site that uses storefront software as if they’re your own products and services doesn’t really qualify as “no trickery”, dude. You don’t want to hear that because it takes money out of your own pocket, which is understandable; but if you were an average user looking to buy the stuff you have online, would you not prefer to stay on the storefront the whole way through until your credit card was entered in?

  768. Look a little closer. You’ll see that there is no cart button on the site. I just use the software to present the links, but the visitor is easily able to tell that it leads to the merchant’s site, just like every other affiliate site out there.

  769. Presenting affiliate hyperlinks on a site that uses storefront software as if they’re your own products and services doesn’t really qualify as “no trickery”, dude.

    You are totally wrong, Adam. Affiliate links are nothing whatsoever to do with trickery, as the word “trickery” is used concerning search engines. You may personally have an issue with affiliate marketing, but that’s just you. If you don’t understand the word “trickery”, as it is used concerning search engines, do a search for the phrase, “search engine spam” – you can learn a lot πŸ˜‰

    Just to set you straight, the thread is about the effects of Google’s Big Daddy update on websites, and is nothing to do with what users may or may not think of a site. Furthermore, you are unable to speak for users.

  770. Dave (Original)

    Alan, nice article over-all. The only thing I will say is, just because Google has filed a patent (1000’s I’ve heard) doesn’t mean they are using it. Now (for the troll out there), that’s not to say they aren’t using it, just that it’s not an automatic progression.

    Jack, I’m curious. Do you want to resolve your site issues, or be right? If the former you should understand that Google can easily live without your site being indexed. Can you?

  771. The only thing I will say is, just because Google has filed a patent (1000’s I’ve heard) doesn’t mean they are using it.

    At last – something from you that I agree with, Dave – except for the “1000s” πŸ˜‰

    Filing a patent for something doesn’t mean that it will ever be used.

  772. Sure I want it indexed, just don’t say it’s because it’s an affiliate site:)

  773. Dave (Original)

    Oh Jack, I think your troubles are deeper than just that πŸ™‚ When you buy the papers on Monday to read about your favorite sports, Jack, would you still buy *more than one* if all the papers were using the same reporters’ writings?

  774. Depends how they are arranged. Not that I read papers LOL Unless it’s to get the coupons and ads:)

  775. I also forgot to say that coupon sites have all the same coupons yet you don’t see a problem there:)

  776. You are totally wrong, Adam. Affiliate links are nothing whatsoever to do with trickery, as the word “trickery” is used concerning search engines. You may personally have an issue with affiliate marketing, but that’s just you. If you don’t understand the word “trickery”, as it is used concerning search engines, do a search for the phrase, “search engine spam” – you can learn a lot

    And once again, Phil, you have no idea what I’m referring to. If I were talking about SEs in that case, you might have a point.

    But…I wasn’t. My comment was strictly from the viewpoint of the end user, and really had nothing to do with search engines.

    Look a little closer. You’ll see that there is no cart button on the site. I just use the software to present the links, but the visitor is easily able to tell that it leads to the merchant’s site, just like every other affiliate site out there.

    Jack, it’s obvious from that comment that you haven’t built a lot of sites. And that’s okay. We all start out having built 0 sites.

    But take this comment from someone who has built over 50…there is nothing that can be assumed as it pertains to a visitor. Your site visitors are capable of being extremely tech-savvy and being able to determine the affiliate links (as most of us would have on first glance), and they’re just as capable of being easily confused by the different sites they end up going to and wondering where Jack’s Retail went and how come the pretty colours went away and who the hell is bentgear.com and why am I there?

    I’ve even seen one man who couldn’t figure out how to navigate past MSN.com from Internet Explorer. I’d go up, look at his machine, and it was fine. He’d complain a day later.

    Finally, I sat him down and got him to show me exactly what he was doing. Turns out he was clicking the X because he thought that maximized the window (since X is in Maximize, you see).

    The moral of all this is never to assume any intelligence from visitors, as negative an attitude as that is.

    If your site grows in popularity to any degree at all, you will answer some of the dumbest questions in the history of mankind. Things like, “do you sell left-handed safety scissors?” or “when will you be adding XXX product which has no affiliate hyperlink?”

    “Every other affiliate site” does it isn’t really a justification for doing it yourself, either. How would doing what every other affiliate site does distinguish your affiliate site from Cliff’s affiliate site from Michael Wong’s affiliate site (obscure inside joke here) or anyone else’s?

    Understand that I have nothing against you personally. You seem like someone who’s probably an okay guy and just wants to make a few bucks and is pissed off because some of it got taken away. I’m just trying to point out to you that you may need to open up your eyes a little bit and see it from different standpoints besides your own.

  777. Anyway…the reason I came back (before I go to bed) is because my evil little twisted psychotic disturbed brain has managed to come up with a rather unusual outside-the-box idea to deal with most (if not all) of these issues.

    I asked earlier if there was a better alternative to IBLs…and…well…someone came up with an answer. The weird part is that the someone was…me.

    Anyway…here’s my twisted idea and since it’s almost 5AM, I’m gonna try and sleep:

    http://www.searchenginefriendlylayouts.com/permission-based_indexing/

  778. Dave (Original)

    RE: “I also forgot to say that coupon sites have all the same coupons yet you don’t see a problem there:)”

    You should really start to concentrate on what you don’t see in Google rather than assume that all you do see automatically means it’s always ok, for all, anywhere, anytime. For example, I see many site pages with hidden text, so should we all assume hidden text is just fine and dandy with Google? (rhetorical, Jack).

    To be blunt Jack, I have never seen someone more in denial than yourself. You state that YOUR site not being fully in Google is Google’s problem. That is so funny it hurts. Do you truly think Google has “problems” because your site, and others it probably doesn’t even want, isn’t in its 8 billion+ page index?

    Like so many who have issues with Google, they always blame Google and will never consider that it’s them and/or their site. As the saying goes: “you can lead a horse to water…..” πŸ™‚

    Adam, must you keep spoiling good stories with facts. :))

  779. And once again, Phil, you have no idea what I’m referring to. If I were talking about SEs in that case, you might have a point.

    I knew exactly what you were talking about, Adam, and I replied to it in context. Jack wanted to know why his pages were dropped from the regular index, and you wrote about it. If you were doing a site review, I’m sure you would have said so.

    My comment was strictly from the viewpoint of the end user, and really had nothing to do with search engines.

    Then it didn’t belong here.

    Your article is interesting, but there are a few points to make. One is that you weren’t the first to come up with a viable solution – if yours is viable – we shall see.

    Another is that you should learn that not all users are Web savvy, so you should design your pages so that everyone can easily use them. I.e. I got to the bottom of the first page and wondered where the new idea was. Many people wouldn’t have gone looking for more pages, but I did, and I found the links at the top of the left column. A good design would have a “next” link at the bottom of each page’s text. Say thank you for the tip, Adam πŸ˜‰

    A third point is that it is factually inaccurate:-

    (1) You obviously don’t understand what a search engine’s profiling is, and I’m not going to explain it to you.

    (2) Your reasons why Google shouldn’t index pages just because they are there are way off, simply because, if Google isn’t able to recognise such pages (e.g. pages under construction) in one scenario, they aren’t able to recognise them in any scenario, so whatever they do, they are going to end up indexing them, even if other pages are left out.

    But the biggest flaw in your suggestion is its very concept, which can be summed up in one very short sentence – allow webmasters to verify their sites and they will be fully crawled and indexed. Yeah, right. Do I really need to explain what’s wrong with that, Adam? From your article…

    The verification technology outlined above, sans sitemap, can be used by Google as a means of obtaining permission from webmasters to crawl, index, and deindex websites. Webmasters would be required to verify their desire to be included or not included in the index, as well as indicate to Google that they want their site crawled without being required to submit a Sitemap.

    You don’t seem to have grasped what Google’s problem is, and what BD is about. Google doesn’t want or need permission to crawl and index sites – a mechanism for that already exists, anyway. And Google doesn’t need sites to be verified to be indexed. They don’t have a problem with any of that. I’m sorry, but you have spent a lot of time writing something that is worthless.

    So the only viable alternative that has been suggested is the one that I put forward very early in the thread, and again lower down when you asked for alternatives. I assumed that you didn’t respond to it because it is so blindingly obvious. But then you don’t respond to anything where you’ve been shown to be wrong – you just go away for a while.

  780. Jack.

    Ignore them. Adam has gone into site review mode, which neither belongs in this thread nor attempts to answer your problem, and Dave – well, Dave is just Dave – he doesn’t really say anything, except that it’s everyone’s fault except Google’s.

    Study Matt’s posts in this thread. If you compare what he wrote with the links in and out of your site, I’m sure you’ll find the reasons why your pages have been dropped.

  781. Dave (Original)

    Oh boy, some trolls need to be spoon fed don’t they πŸ™‚ Jack was told over a week ago that he needs more links (by Matt himself, I *think*). One of his (Jack’s) replies:

    “Why are people still talking about the influence links should have on getting a site indexed? This has nothing, or should have nothing to do with getting a site indexed”.

    Now, for the crux of the whole matter! Why has Jack not got enough links pointing to his site? See my and Adam’s posts above πŸ™‚

    DOH πŸ™‚

  782. Oh boy, some trolls need to be spoon fed don’t they

    I wouldn’t know what utensil they feed you with, but I’ll take your word for it πŸ˜‰

  783. Adam. I decided that I’d better add something, in case you really can’t understand what’s wrong with your alternative. You might twig it if you think of the phrase “spam central” πŸ˜‰

  784. I’ve noticed that when I type keywords, e.g. “Hotels cityname”, the same site is listed twice on the first page:
    domain.com
    domain.com/doma.htm

    Is this an update, or a problem with Google’s index?

    We’d like to find different results on the first pages, and never a monopoly of the free index.

  785. Dear Matt,

    I find more and more blogs ranking well for competitive keywords.
    In China, if you search “机η₯¨” (airplane ticket), you will find 5 blogs on the first page.

  786. Phil:

    I’m going to say a few things to you, short and sweet.

    1) I don’t see anything wrong with going into “site review” mode. Even though my intentions were non-search-engine-related, they were meant to help Jack realize the fundamental flaw in his logic (as well as the logic of the vast majority of affiliate sites). And it may actually have some relevance to the search engine portion of things as well.

    Besides, I’m not the one who made the “no trickery” statement. He didn’t specify whether he was referring to search engines or not. So therefore, I don’t need to specify whether I’m doing a “site review” or not.

    2) Out of everything you’ve attempted to find fault with, the only thing you were correct on was something in the original design of the piece that I had commented out as part of a test and merely forgot to comment back in: the navigation links at the bottom between the pages. Thank you for pointing that out.

    3) The rest of that is so far off the mark that it doesn’t even merit commenting, and it is based on the conventional and prevailing logic that “Google has to fix it”. Which would take longer: for Google to fix it without help, or with help? With the idea I’ve come up with, webmasters get to contribute, and that’s something you’ve gone on about for a while now.

    Besides, I really don’t see a better idea coming from you. If I haven’t responded to it, I either didn’t see it or it simply wasn’t worthy of a response (likely the latter). So…until you come up with a better way, you haven’t got any grounds to criticize what someone else is doing.

    So…now’s your chance. Indicate your ‘better way’.

  787. Adam, must you keep spoiling good stories with facts.

    I apologize, Dave. I’ll never use logic, sound reasoning, and common sense again.

    Here, let me practice on you.

    YOU’RE WRONG. I’m not going to tell you why you’re wrong, or give you any valid objective reason, but YOU’RE WRONG. So like, just shut up or do a Google search on why YOU’RE WRONG, okay?

    How’d I do? πŸ˜€

  788. I have actually added some links but Google is very slow to add them. Link count has not changed according to them and is miles away from what Yahoo shows. Yahoo is in the thousands whereas Google only shows 20. Yes I know Google only shows a sample, but they still shouldn’t be that far off.

  789. Adam:

    The only other thing I can do is open links in new windows that go to the merchant’s site, which I have done on some pages. However, this really doesn’t touch on Google’s indexing of the site – user quality perhaps, but not search engine indexing.

  790. Adam.

    Besides, I’m not the one who made the “no trickery” statement. He didn’t specify whether he was referring to search engines or not. So therefore, I don’t need to specify whether I’m doing a “site review” or not.

    Site reviews are not what this thread is about, and you certainly do have to say that you are just doing a site review in the midst of this discussion when that is what you are doing. Otherwise you are liable to impart the wrong understanding. Jack said, “I use no trickery in my sites at all. The fault is definitely Google’s as no other engine has a problem.”, and you responded with, “Presenting affiliate hyperlinks on a site that uses storefront software as if they’re your own products and services doesn’t really qualify as “no trickery”, dude.” It didn’t take many brain cells to know that Jack meant search engine spam trickery. It obviously went over your head, though.

    I won’t quote the rest of your post, but I’ll answer it all, hopefully in a way that even you might understand…

    Your alternative is useless because it doesn’t address the problem that BD is addressing, there is nothing to be gained by doing it, and there is a lot to lose by doing it. What gain would there be in only indexing sites where webmasters have verified that they have some control of them? Does it get rid of links for rankings (reciprocals, bought/sold, etc.)? Nope. Does it prevent spam pages, scraper pages, or any other kind of unwanted pages from getting in the index? Nope. Does it address any of the link problems that Google is dealing with in BD? Nope. What *is* there to be gained by Google by doing what you suggest? Absolutely nothing. So what’s the point? What makes you think it’s an alternative?

    Let’s look at the other side. Would your idea cause any problems for Google? Yep. If Google was stupid enough to do what you suggest, and to trust webmasters in the way you described, it would be open season on spamming the index and serps. You might quickly make an addition to your idea now that I’ve pointed it out – by suggesting that every site would still need to get through spam filters, etc., but then we’d be back where we are now – sites and pages have to get through spam filters, etc., and links would still need to be scored the way they are now. So nothing would have been gained in terms of spam, and nothing would have been gained in terms of links.

    When you get right down to it, your idea isn’t an alternative to the way that BD is treating links (that’s what the alternative was about, remember). It’s simply something additional, which is neither needed nor useful.

    I’m sorry that I’ve trashed your idea – especially since you put so much time into it – but you can’t expect a round of applause for an idea that would gain nothing, solve nothing, and would be like Christmas for spammers, can you?

    I accept that your memory isn’t very good, and you missed both times that I suggested a real alternative to the way that BD is treating links, even though you specifically asked for alternatives, so I’ll do as you now ask and repeat it yet again. But if you want honesty, I don’t think that at all. You ignored the alternative when I wrote it, because it was a very sensible answer to your question, and you can’t cope with that. Anyway, here it is again…

    By all means, identify the link types that you (Google) don’t want to count for anything, and don’t index them. That way they won’t count for anything. They won’t help rankings, and they won’t help PageRank. It will be as though they don’t exist. If they are there to get spam pages into the index, the spam pages won’t be indexed. If they are there for their link text (rankings), they won’t affect the rankings. Whatever they are there for won’t succeed, because, as far as Google is concerned, they don’t exist.

    Now I think that’s a very good alternative to identifying unwanted link types and using the information to decide how many pages of a site to have in the index. And it’s a whole lot fairer than what Google is doing now.
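
    To make the suggestion concrete, here’s a toy sketch (hypothetical Python with made-up link data, obviously nothing like Google’s real code). The unwanted links stay on the pages, and the pages stay in the index; the links are simply removed from the graph before any scoring is done, so they can’t help rankings or PageRank.

        def pagerank(links, damping=0.85, iterations=30):
            # links: {page: [pages it links to]}; assumes every linked
            # page is also a key in the dict.
            pages = list(links)
            rank = {p: 1.0 / len(pages) for p in pages}
            for _ in range(iterations):
                new = {p: (1.0 - damping) / len(pages) for p in pages}
                for page, outlinks in links.items():
                    if not outlinks:
                        continue  # dangling page: fine for a toy example
                    share = damping * rank[page] / len(outlinks)
                    for target in outlinks:
                        new[target] += share
                rank = new
            return rank

        def strip_unwanted(links, unwanted):
            # unwanted: a set of (source, target) pairs the engine has
            # flagged (reciprocal swaps, bought/sold links, etc.). As far
            # as scoring goes, those links simply don't exist.
            return {page: [t for t in out if (page, t) not in unwanted]
                    for page, out in links.items()}

        links = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
        unwanted = {("a", "b"), ("b", "a")}  # a reciprocal link swap
        print(pagerank(strip_unwanted(links, unwanted)))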

  791. Dave (Original)

    RE: “Yes I know Google only shows a sample, but they still shouldn’t be that far off.”

    Why do you say that? I have seen my backlinks vary by 70% or more at times; no big deal. It’s only a number.

    Remember, you need “quality” links, not just any old links as someone suggested. I would truly focus on the long term and add good useful content to your site so it *naturally* attracts links. That, I believe, is the crux of your indexing problem.

    Adam, may I suggest ignoring PhilC.

  792. I thought about that, Dave, but there are people out there that listen to him and may actually adopt some of the irrational, inconsistent, and illogical statements he comes up with as complete, utter, and absolute truth. I’m just trying to nip the cult in the bud before too many people drink the Kool-Aid.

    You know what Phil’s about.
    I know what he’s about.
    Doug Heil knows what Phil’s about.
    Phil even knows what Phil’s about. He’d never admit it, because his “points” gain credence on the sole basis of his ability to present them through a fun-house mirror.
    At least 95% of the intelligent-thinking population knows what Phil’s about.

    It’s the 5% who might buy into whatever crap he happens to be spouting off that I’m addressing, because if one or two of them that might fall victim to his bullshit see the light, it’s worth it. If we don’t use common sense, logic, and sound reasoning to point out how flawed and vested the thinking of a guy like Phil is, then that leads to more spam (which we’re all trying to avoid, right?).

    Phil also made two very large assumptions, which I would have expected a guy like Phil to make:

    1) He assumed that in what I proposed, there would be no spam filters. I never said that, but he assumed it. The spam filters would remain completely intact – something that would be quite obvious to anyone with half a brain.

    Buuuuut…that’s Phil. He reads what he wants, and interprets it how he wants. It’s a great way to argue, really…you select the facts that make you appear to be right and ignore the ones that prove you wrong.

    2) He assumes that I thought he’d like the idea. I actually expected and hoped that he’d hate it. I also expected he’d selectively misunderstand it. And that’s cool.

    I like when Phil pulls apart things I do and tries to distort them. I used to hate it, I’ll admit that: but when he needs to take that much time to respond to an idea, it means that deep down inside, he knows who’s really right and wrong. But for him to admit that, it would require humility, honesty, and the ability to look at a situation without being self-centered. And we know that isn’t going to happen.

    So Phil, go ahead. Tell me how wrong my idea is. Tell me how stupid it is. The harder you try to argue it and distort the truth, the more you end up proving it right. I openly encourage you to take anything I say and misinterpret it (intentionally or otherwise).

    I would be more concerned if someone like Dave, or Doug Heil, or Nancy, or someone else who has earned respect and deserves it found fault. But since your opinion, the amount of respect you have from people besides you, and $5 are worth a total of $5, it means nothing.

    Now, as far as your “idea” goes, I’m not going to “ignore it and run with my tail between my legs” or whatever spin doctoring you’re going to put on it. The reasons it won’t work have been discussed at length, the flaws have been established, and it doesn’t merit further discussion. Your idea is asinine, you know why it is, and you should just let it go. Bury it before you try to explain it anymore.

  793. But Dave, “quality links” depends on the searcher. As you well know, a search engine has no way to determine if a link is useful to visitors. If a person comes to my site looking for diamond rings, they may also be interested in travel destinations for honeymoons. Search engines would never see these two topics as related, but a searcher would.

    I should say that a couple of recent links I got were from good PR sites, and I fully expected to see them with a Google link: command, but they still haven’t shown.

  794. Dave (Original)

    No Jack, “quality” links into your site depend on Google. The type Google likely sees as “quality” is the type that you gain naturally. The higher the TRUE PR (not Toolbar) of the page supplying the natural link, the higher the “quality” Google likely sees in it. Hence you need content for long-term success.

    RE: “I should say that a couple of recent links I got were from good PR sites, and I fully expected to see them with a Google link: command, but they still haven’t shown.”

    So what??? Google’s link: command is not going to show only what Google sees as quality links; otherwise they would be revealing part of their algo.

  795. Dave:

    Supposedly though Google has always shown links that were PR 4 or above.

  796. And don’t say you need content for long-term success, as there are many sites that don’t fit this and do great. Once again I point to coupon sites as just one example.

  797. Dave (Original)

    RE: “Supposedly though Google has always shown links that were PR 4 or above”

    That was over a year ago, and even then it was only what was perceived!

    RE: “And don’t say you need content for long-term success, as there are many sites that don’t fit this and do great. Once again I point to coupon sites as just one example.”

    I can show you samples of hidden text sites doing just great too. Why not use that trick also? The same goes for cloaking, doorway pages etc. Each day is a new day though, and for every spam or no/low content site you see, there are likely 1000s of others that you can’t see.

    Perhaps these “coupon sites” have good quality links pointing to them and/or they are not targeting competitive phrases. Or any one of 100 other factors your site doesn’t have.

    Jack, it’s time to wake up and smell the coffee. Your mind-set and ways are NOT getting your site fully indexed. I don’t expect you will ever do well in the ever-changing arena of Google, as you will not listen until you hear what you want to hear.

    Ta ta and best of luck, you’re going to need it πŸ™‚

  798. Dave (Original)

    RE: “I thought about that, Dave, but there are people out there that listen to him and may actually adopt some of the irrational, inconsistent, and illogical statements he comes up with as complete, utter, and absolute truth. I’m just trying to nip the cult in the bud before too many people drink the Kool-Aid”

    Trouble is, he has his own site that promotes all sorts of black hat methods. Buying links for PR, cloaking and all sorts of other high risk stuff.

    Notice how only he replied when I said

    “Oh boy, some trolls need to be spoon fed don’t they”

    That says a lot πŸ™‚

  799. Phil even knows what Phil’s about. He’d never admit it, because his “points” gain credence on the sole basis of his ability to present them through a fun-house mirror.
    At least 95% of the intelligent-thinking population knows what Phil’s about.

    I can say with absolute certainty that Phil is about two things in this thread. One is discussing the BD update rationally and sensibly, which I’ve done from the start. The other is something I’d rather not have to get into, but I think it’s necessary. It’s to prevent you (Adam) from spreading misinformation due to your lack of knowledge and understanding.

    Now to get back to your so-called “alternative”…

    Your #1 point – about spam filters
    Your piece describes a simple system where webmasters verify that they have some control of a site and Google indexes it all. You said nothing about spam. It’s not my fault that you either forgot to mention it, or that you didn’t realise that it would be permanent Christmas for spammers until I mentioned it, so don’t come on at me for your failing.

    Your #2 point – you wanting me to hate your idea
    It’s not exactly a point, is it? However, I don’t hate it – I think it’s just stupid. In fact it’s the most stupid thing that I’ve ever seen you write.

    I like when Phil pulls apart things I do and tries to distort them. I used to hate it, I’ll admit that: but when he needs to take that much time to respond to an idea, it means that deep down inside, he knows who’s really right and wrong. But for him to admit that, it would require humility, honesty, and the ability to look at a situation without being self-centered. And we know that isn’t going to happen.

    So Phil, go ahead. Tell me how wrong my idea is. Tell me how stupid it is. The harder you try to argue it and distort the truth, the more you end up proving it right. I openly encourage you to take anything I say and misinterpret it (intentionally or otherwise).

    That is the second most stupid thing that I’ve ever seen you write. It really does take the biscuit for twisted logic. In a nutshell, you say that the more I find fault with your idea, the more the idea must be right. Hello??? Anyone home??? Ever heard of logic, Adam? It’s interesting to note that you used to hate it when I pulled something apart. You don’t suppose it could have been because I made good sense, and you had no answers to the good sense, do you? I do. Let’s take your “alternative” idea as an example…

    I asked you some very good questions about your ‘alternative’, but you chose to ignore them – presumably because you don’t have any sensible answers. Here are some of them again…

    What gain would there be in only indexing sites where webmasters have verified that they have some control of them? Does it get rid of links for rankings (reciprocals, bought/sold, etc.)? Does it prevent spam pages, scraper pages, or any other kind of unwanted pages from getting in the index? Does it address any of the link problems that Google is dealing with in BD? What *is* there to be gained by Google by doing what you suggest?

    Perhaps you would like to answer them now, instead of ignoring them and hoping they’ll go away.

    I really would like you to answer those questions, please. If you want your idea to have any credibility, then answer them.

    Now, as far as your “idea” goes, I’m not going to “ignore it and run with my tail between my legs” or whatever spin doctoring you’re going to put on it. The reasons it won’t work have been discussed at length, the flaws have been established, and it doesn’t merit further discussion. Your idea is asinine, you know why it is, and you should just let it go. Bury it before you try to explain it anymore.

    So, whilst saying that you are not going to ignore it, you ignore it yet again – another example of your twisted logic. Would it be because you can’t find fault with the idea? It has not been discussed at length, or even briefly. In fact, it hasn’t been discussed at all. If you think it has, then show us. If you can’t find where it’s been discussed, then please provide a brief synopsis. I can’t be any fairer than that. If you don’t want to do it for me, then do it for the people who are following this thread – you don’t want them to have misinformation, do you πŸ˜‰

    Adam. I’ve put you on the spot concerning your idea. I’ve publicly asked you some sensible questions about it – twice. If you don’t answer them, anyone who reads this must conclude that you don’t have any sensible answers to support your own idea.

    I’ve also put you on the spot concerning the alternative that I suggested at the top of this thread. I’ve asked you to either show us where it’s been “discussed at length”, or to give us a brief synopsis of why it won’t work. If you fail to do one of those, then people who are following this can’t avoid the conclusion that you were just trying to avoid addressing it because you can’t find any fault with it.

    I like when Phil pulls apart things I do and tries to distort them

    I don’t distort anything, Adam, and you can’t find anything that I’ve written that distorts anything. I do find fault with your idea, but finding fault isn’t distorting it. If you think it is, then nobody is preventing you from correcting it.

    Anyway, the ball is in your court concerning both ideas. You have good questions in front of you about both of them, and the world awaits πŸ™‚

  800. Trouble is, he has his own site that promotes all sorts of black hat methods. Buying links for PR, cloaking and all sorts of other high risk stuff.

    My site is written in English, Dave, but maybe English isn’t your native language, so I’ll explain it to you. A few of the articles in the site discuss blackhat methods rationally and logically, but they don’t promote them (that’s where you were wrong). In fact, the site promotes NOT using them unless whitehat methods won’t work, which is sometimes the case.

    The reason for the articles is because I’m a person who thinks for himself and draws his own conclusions about things, unlike a sheep that just follows. But don’t misunderstand me – I’m not against sheep – I love eating them! πŸ˜‰

  801. No, Phil, that’s not what you’ve done.

    Let me explain to you what you have done, in simple, plain English:

    1) You’ve suggested something that couldn’t possibly work, since it in no way addresses the original issue of people being required to get inbound links (you still need at least one to form an inbound link profile), thus doing absolutely nothing to solve the issue for webmasters that haven’t got any; and it also doesn’t address the ability for a webmaster to create “external site links” from sites they may own but appear not to from seemingly legitimate pages to other pages from other sites outside of the link structure of the other sites.

    Those are just two reasons. But since they stand up by themselves and therefore I don’t have to list others, you’re wrong, give it up, stop trying, move on, that’ll be $3.95, drive through to the first window to pay.

    As far as whatever “sensible” questions you’re asking pertaining to my idea, as soon as you ask one, I’ll answer it.

    And, to make sure it’s not some question that you’re just creating for the sake of finding fault which does not exist (by the way…so far, you’re the only one out of hundreds going on thousands that have read the thing in its entirety), I will only answer your question if someone else posts in agreement and it can be reasonably determined that it’s someone with a legit interest in asking. In other words, at least a semi-regular poster, or someone who has shown at least a legitimate interest in learning if not helping.

    That way, you don’t get to judge a sensible question, I don’t get to judge a sensible question…it’s left in the hands of others who use this site to learn and/or help. Let’s face it, neither of us is in a fair position to judge our own opinions, so I’m not about to.

    As far as promoting something when there is “no other way” to promote it, I am reminded of a line from Kung Fu: The Legend Continues (the greatest TV show with horrible production values of all time, for those of you who have a bad TV fetish like I do):

    “There is always…another way.”

  802. Adam, Phil directly asked you some questions concerning your idea. Are you afraid to answer them? You have made quite a few posts containing nothing since he asked them, which surely looks like an attempt to change the topic.

    To be honest, I am actually curious about your idea. However, when you refuse to answer some pointed questions honestly, it makes your idea lack any merit at all.

  803. Adam.

    Yet again you’ve avoided the questions about your idea. Don’t you think that your idea is worth discussing? Do you really want people to think that you have no answers to the questions, because that’s the only conclusion that can be drawn from your refusal to answer them?

    Your post is all about my alternative suggestion, so we’ll continue with that…

    I asked you to show us where it had been discussed at length, as you stated it had been, or even to give us a brief synopsis. You haven’t shown us any discussion(s) – because none exist – so I guess your post is some sort of synopsis.

    You are right that my suggestion doesn’t incorporate the need for at least one IBL to get a page indexed, but that isn’t an issue with BD, or with Google, or with website owners, so there is no need to incorporate it in the BD alternative.

    And, if I’ve understood your second point correctly, you are also right that it doesn’t address what you wrote:-

    it also doesn’t address the ability for a webmaster to create “external site links” from sites they may own but appear not to from seemingly legitimate pages to other pages from other sites outside of the link structure of the other sites.

    It is badly worded, but I think you are talking about IBLs from people’s own sites, when the ownerships of the sites can’t be determined. I don’t see the need to address that issue either, because, if the sites are good, as you suggested they are, then there isn’t a problem.

    Now, to get this back on track, I’ll refresh your memory as to what it is we are discussing. Big Daddy seeks to eliminate the benefits of unwanted links, by recognising certain types of links for that purpose. You wanted to know what alternatives there are, and I suggested one (although I’d already suggested it at the top of the thread). Since BD recognises and handles certain types of links, I suggested still recognising them, but handling them in a different way – in a way that would eliminate their benefits – and everyone who expressed an opinion agreed. Other issues, whether real or imagined, don’t come into it.

    The suggestion is only slightly different to what BD does. BD doesn’t count the unwanted links for crawling and indexing. I suggested not counting them for rankings and PageRank instead. That’s the only difference. So, as far as any of us can see, what I suggested would work perfectly well as an alternative to the aims of BD.

    You seem to have gone way outside an alternative to what BD does, which would be fine, except that you asked for an alternative to what BD does. You don’t want to answer any questions about it, so first I’ll show the summary of it again, and then I’ll tell you what would happen if it were applied.

    allow webmasters to verify their sites, and the sites will be fully crawled and indexed

    A webmaster would be able to verify a site, which would be fully indexed, and all the unwanted links in them (bought/sold, etc.) would be effective for rankings. Webmasters would be able to verify multiple sites, and the same would happen, only more so. Spammers would be able to verify many sites (different Google accounts – multiple sites on each account), and every day would be Christmas day.

    It would do nothing about the unwanted links that BD is addressing – the thing that you wanted alternatives for. So, you see, it really isn’t an alternative to the aims of BD at all. I’m sorry, Adam, but your idea doesn’t even begin to address the unwanted links problem that Google is dealing with in BD. It simply isn’t an alternative.

    The only similar thing about your idea and BD is that they both use some means of deciding what to have in the index. BD decides on the basis of unwanted links, whereas yours decides on the basis of a website owner demonstrating that s/he is able to control the site. If I were a links-based search engine, I’d prefer to do it in a way that eliminates the benefits of unwanted links, if I were to do it at all.

    Your idea would address a space problem *only if* large numbers of sites weren’t verified, but if there is a space problem, BD addresses it, and combines it with eliminating the benefits of unwanted links, and without the commitment to fully index all sites that your idea has. If everyone verified their sites, then any space problem that there may be would be made much worse, so your idea would make things worse – not better.

    I can’t see any merit for anyone in your idea. It does nothing about unwanted links, and it would make a space problem worse, if space is an issue (as I believe it is).

  804. LMAO!

    I’d say that’s a nice try, Phil, but come on, that was weak. I expected more out of you.

    You’ll notice how I specifically said people who are at least semi-regular posters, not your forum buddies. Naturally, I expected you to bring Willie into this. (Of course, I’m expecting Janeth at some point too. Bring her in. Make it a party!)

    For those of you who don’t know what I’m referring to, WilliamC and Phil travel in the same pack.

    You’re gonna have to do better than that, Phil.

  805. On second thought, Phil, this is getting kind of old. When you have to resort to gang mentality because you don’t have the mental capacity to come up with an effective argument to anything I’m saying without distortion or bias, then well…you’ve got nothing. So please…don’t bother.

  806. Will’s appearance here was a surprise to me. I promise you that it wasn’t arranged or expected. And we don’t travel in any sort of “pack” – that’s a lie that was invented by Minstrel, who has the distinction of being one of the biggest trolls on the Web.

    But his post was spot on. You are just avoiding anything that makes you sound wrong, or even mistaken – like discussing your own idea, because you can’t defend it. Will’s post doesn’t give you an excuse to not discuss it. You introduced the thing here, so discuss the thing here. Otherwise it can only be seen for what it is. If you can’t defend it, you could at least be honest enough to admit it.

  807. Sorry that you feel you cannot honestly answer the questions Phil posed, Adam. I quite honestly had a few questions myself that would have actually helped your idea along, but you have shown it has no merit by not even being able to intelligently debate the very idea you dreamt up.

    I have seen enough. Good day on this topic.

  808. Dave (Original)

    RE: “A few of the articles in the site discuss blackhat methods rationally and logically, but they don’t promote them (that’s where you were wrong).”

    Not true at all. You even have an affiliate link to a well-known PR-selling site. On this site it states:

    “Buy Text Links from our partners for high PR quality ads”

    RE: “In fact, the site promotes NOT using them unless whitehat methods won’t work, which is sometimes the case.”

    Like I say, you promote black hat methods. Using them only if white hat methods don’t work is ludicrous, to say the least. A LOT like saying, “but I only cheat at cards if I’m not winning”. Good grief Phil, are you serious?

    Like I have said before, there is only one thing worse than a self-confessed black hat and that’s a black hat that lies (goes with the territory I guess) and denies it all.

  809. Dave (Original)

    RE: “Big Daddy seeks to eliminate the benefits of unwanted links, by recognising certain types of links for that purpose”

    That’s an awfully bold statement Phil.

    Answer From Matt on BD;

    Q: What’s new and different in Bigdaddy?
    A: It has some new infrastructure, not just better algorithms or different data. Most of the changes are under the hood, enough so that an average user might not even notice any difference in this iteration.

    So what facts are you basing your statement of fact on, Phil?

  810. Dave (Original)

    Oops, sorry Phil. I forgot I was ignoring you. Don’t bother with a reply; like Adam, I’m done with your immature, smart-arse, rude attitude.

  811. Dave.

    You can throw insults as much as you like. They largely reflect on the person who throws them.

    Yes, my site has a link to a link-selling site. Something wrong with that? But, as I said, the site doesn’t promote blackhat methods unless whitehat methods won’t work. The reason I suggest not using them unless it is necessary is because of the risk of penalties, and not because there is anything “wrong” with the methods. If you want to discuss why I say there is nothing “wrong” with blackhat methods, you are welcome to debate it with me somewhere else.

    As for my “awfully bold statement”, and “So what facts are you basing your statement of fact on Phil?”…

    I am basing my view on what Matt has written. He has said that the Big Daddy infrastructure change is largely to do with the crawl/index function, and everything that he has said about it, including the examples, indicates that the changes concern types of links, both into and out of a site. It’s all there if you care to read it.

    Google has a huge, self-inflicted problem with unnatural links on the Web – links that are there to boost rankings – and the problem continues to get worse. They have caused the quality of Google’s index/results to deteriorate, and Big Daddy is the start of Google’s way of dealing with the problem.

    What they’ve done is devalue certain types of links, both into and out of a site, so that they don’t help a site to be crawled and indexed, which is why many sites have had some, many, or most of their pages dumped from the index. In that way, “Big Daddy seeks to eliminate the benefits of unwanted links, by recognising certain types of links for that purpose.”

    That’s my view of Big Daddy. It may be wrong in part or in whole, but it’s what I understand from what Matt has written. I’ll go further, and suggest that this is only Google’s first big attack on links-for-rankings. I expect more to follow. After all, devaluing them for indexing purposes doesn’t stop them from boosting the rankings of pages that get in the index. It limits the number of pages that a site can have in the index, but I’ve found no pattern as to which pages get in and which are left out. There’s a lot more for them to do.

    So, rightly or wrongly, I believe that Big Daddy is mainly an attack on links-for-rankings, and not simply a method of determining how many pages of each site to have in the index. That’s why I suggested that not indexing the links would be a better way of doing it, because it would prevent them from affecting the rankings, so it wouldn’t matter if all the pages from a site were indexed or not.

    That’s what I believe about BD, but I can also see that it could be that limiting a site’s presence in the index is to do with index space, and not primarily an attack on links-for-rankings.

    Adam’s idea seems to be all about choosing which sites to index, so it may be that he sees the reason for the Big Daddy changes as being about that, whereas I see it as being an attack on links-for-rankings. I don’t see any point in choosing which sites/pages to index, just for the sake of it, so if Google is doing that, it must surely be because of indexing space, or future indexing space. In which case, Adam’s idea won’t work, because it allows all pages from all sites to be indexed. My suggestion would only be marginally better, so it wouldn’t work either.

    To be honest, I’m torn between the two views – index space, or an attack on unwanted links. It makes very good sense to me if BD is simply an attack on links-for-rankings (reciprocals, bought/sold, etc.), and everything that Matt has said indicates that that’s what it is. But I don’t see the sense in limiting a site’s pages in the index because of them, especially since there is no discernible pattern to which pages are in and which are out, when it would surely be much better to not index the links so that they can’t affect the rankings. It also makes very good sense to me if Google is limiting the pages so that the index capacity doesn’t get filled up too quickly, and they don’t have to keep on adding new capacity all the time. Attacking the unwanted links-for-rankings could well be the way they have chosen to do it, albeit a less than perfect way of doing it.

    So, Big Daddy seeking to eliminate the benefits of unwanted links is a very valid understanding of the reason behind BD, and it’s the one that I favour, but I can see that index space is another valid possibility.

  812. Big Daddy is broke.

    Sites that have the noindex/noarchive tags are showing up. Also, it used to be that when doing obscure searches for information, Google was good at returning some good results at the top. Now, it seems worthless AdSense sites show up more and more. Good for Google’s bottom line if people actually click a link on them; bad for searchers. I don’t see how those sites ever got approved for AdSense in the first place. Must have been a broken bot doing the review of the site.

    So much for the “Do no Evil” thing.

    I’m quickly losing faith in my most favorite search engine, and it has nothing to do with links for me. The search quality is in the toilet.
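
    For reference, the noindex/noarchive tag in question is the standard robots meta tag, and an engine that honours it is supposed to keep the page out of its index and show no cached copy. A minimal sketch of the check (hypothetical Python, just to show the mechanism; real crawlers parse this far more carefully):

        import re

        def robots_directives(html):
            # Grab the content attribute of <meta name="robots" content="...">.
            # Assumes the attributes appear in this order; enough for a sketch.
            m = re.search(
                r'<meta\s+name=["\']robots["\']\s+content=["\']([^"\']*)["\']',
                html, re.IGNORECASE)
            return {d.strip().lower() for d in m.group(1).split(",")} if m else set()

        page = '<html><head><meta name="robots" content="noindex,noarchive"></head></html>'
        directives = robots_directives(page)
        if "noindex" in directives:
            print("should not appear in the index at all")
        if "noarchive" in directives:
            print("may be listed, but no cached copy shown")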

  813. I have no problem with Google discounting links coming into or going from any site, but my concern is more about a penalty being attached to this.

    According to Matt

    Linking to a free ringtones site, an SEO contest, and an Omega 3 fish oil site could cause your site not to be crawled.

    He went on to say that mortgages sites, credit card sites, and exercise equipment sites could also cause a problem for your site.

    Is this because the sites are off topic with the original site or is it because of the types of sites they are?

    We own a web design company and are proud of the sites we build, we sometimes link to sites we have built to show visitors the type of sites we design.

    Will this cause a penalty now?

  814. It should be noted that Matt Cutts is on holiday/retired and his relief person does not comment or filter, so this blog is perhaps effectively no longer a Google outlet anyway – and it was the only means of discernment, education, and information dissemination.

    We are reduced to squabbling between ourselves in the absence of any Google-instigated input or comment.

    I am going back into hibernation now, but I am not amused at my sites being trashed for no good reason, along with many others as far as I can see, and then never getting any sane, relevant, or cohesive feedback that adequately explains why.

    I don’t think the way Google has treated its loyal users has been either fair or even in accordance with its own webmaster guidelines.

    Good luck to all here.

    Malcolm Pugh

    England

  815. Well said, Malcolm. I visit this blog a few times a week hoping to get new insight, and all I find is people bashing each other for petty reasons. Very annoying. I still believe the sitemap has played a role, directly or indirectly, in the deindexing. I have yet to hear from someone who has corrected all the possible problems and regained all the pages they lost after they submitted a sitemap. If Big Daddy can detect duplicate pages, etc. when it visits a site, then why not let webmasters know so something can be done?

    There are too many of us out there with similar problems for this to be a coincidence. We need solid answers. If they are saying they can alert webmasters through the sitemap interface (e.g. 404 errors found, etc.), then why not let us know the other errors they found too? I don’t see why this is so difficult to do.

  816. Malcolm. What has happened has happened, and we have to live with it, because it isn’t going to change. I really don’t think that things are going to slowly get back to normal in the way that some upheavals have done in the past. This change is too fundamental for that, imo.

    Matt showed the way to deal with it in his original post in this thread. As far as I can see, that’s the only way forward for all sites – not only those that were hit.

    … and then never getting any sane, relevant, or cohesive feedback that adequately explains why.

    I don’t think we will get the answer to “why” in the near future. I have an inclination as to why, but nothing has been said about it in this thread. I actually think the answer to “why” may be more fundamental than any of us have even hinted at – except elsewhere.

    P.S. Don’t forget that the match starts at 5 p.m. tomorrow (Thursday) πŸ˜‰

  817. You mean the fact that they are short of space? :P

  818. Space related, yes – that’s been discussed for quite a while now – but a little more fundamental than just that.

  819. This is what I’m thinking, and it’s just conjecture…

    From the start, search engines have indexed whole websites as well as they were able. But the Web continues to grow at such a rate that no search engine can keep pace with it. So Google decided not to even try to keep pace with the growth. Instead, they’ve decided to allow all sites to be represented in the index, but not necessarily to be fully indexed.

    That would be a fundamental change to the way that they intend to index the Web. The basic intention would be changed from fully indexing all sites as well as they are able, to allowing all sites to be represented in the index, but not necessarily to be fully indexed. It would fit in with Matt saying that they are now indexing more sites than before (I think that’s what he said), and it would also fit in with him saying that Google’s index is now “more comprehensive” than it was before – if “more comprehensive” means that pages from many additional sites are being indexed. If they have representations from all sites in the index, then the index could be seen as being more comprehensive than if many sites had to be completely left out.

    To do it, they needed a reasonable way of deciding how much of each site to index, and evaluating the links around a site would be reasonable, because it would have a tendency to limit the representation of sites like scrapers in the index, which don’t tend to get ‘good’ IBLs, whereas normal sites do tend to get ‘good’ IBLs. It would also limit the pages of sites that have been link-building for rankings, which is what they don’t want to happen. So, in general, it would limit the pages from iffy(?) sites more than from non-iffy sites. I’m not using the word “iffy” judgmentally.

    But there’s a possible flaw in that idea. Why haven’t all sites had some of their pages dumped? Well, maybe a lot more sites *have* had pages dumped, but it’s only those that have been reduced drastically that we hear about, or that even get noticed. Or there may be a threshold score, above which a site is ‘good’ and can be fully indexed. Or it may be that this is just the start, and it isn’t yet necessary to limit all sites if their links evaluation scores well enough.

    So the fundamental change is to stop trying to fully index the Web, as they tried to do in the past, and instead to try and have all sites at least partially represented in the index.

    Remember that this is just conjecture. The reason that I’m leaning towards it is because I can’t see the sense in hitting at the point of indexing. Why use link evaluations to hit there, instead of using them to hit the rankings?
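
    If that conjecture is right, the mechanism could be as simple as this toy sketch (hypothetical Python with made-up numbers; pure speculation, like everything above): every site keeps at least a token presence in the index, and the link evaluation only decides how deep the indexing goes.

        def pages_to_index(site_pages, link_trust, max_pages=10000):
            # link_trust: an invented 0.0-1.0 score from evaluating the
            # links in and out of a site. Reciprocal swaps, bought/sold
            # links and spammy neighbourhoods would push it down; natural
            # links would push it up.
            quota = max(1, int(len(site_pages) * link_trust))  # always >= 1 page
            return site_pages[:min(quota, max_pages)]

        site = ["/", "/about", "/products", "/articles", "/links"]
        print(pages_to_index(site, link_trust=0.4))  # represented by 2 of 5 pages
        print(pages_to_index(site, link_trust=1.0))  # trusted: fully indexed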

  820. It sounds like you’ve gone from being upset about what they are doing to agreeing with it.

    Is this correct or were you only explaining your thoughts?

    Do you still feel what they’ve done is a bad idea?

  821. If that conjecture is correct, then I can’t honestly find fault with it. I only find fault if they are using links to determine how many of a site’s pages to index, just to do something about unwanted links.

    I can’t see that any engine can keep up with indexing the whole of the Web because of the Web’s rate of growth, and they all have to do something about it sooner or later.

    or were you only explaining your thoughts?

    They were only my thoughts. Maybe some people can come up with reasons why it can’t be correct, so it’s worth writing the thoughts down. If it’s correct, it would mean that both Adam and I were aiming in the wrong directions for alternatives.

  822. PhilC,

    I think you are on to something there.

  823. Dave (Original)

    Janeth, the fundamentals (Web growing too fast) of what Phil *claims* as his own conjecture were mentioned in the blog over a week ago a number of times.

    In a nutshell, I mentioned that perhaps, due to the number of pages out there growing so rapidly, there are not enough hours in a day, or days in a week, to index the lot, at least not at this point in time. So, Google has come up with criteria (links perhaps) to determine which pages are indexed/included and which ones aren’t.

    I have no doubt in my mind, though, that Google is STILL aiming to index and organise the World’s data. Perhaps the sheer volume of pages has outpaced their ability and technology…….for now anyway.

  824. I’m willing to email Google my pages and even mail them a hard drive to hold them on, if that would help. Maybe they could purge the cache of the thousands of old pages they have of mine – would that help out their space problem? I want to be part of the solution, not the problem.

  825. Dave (Original)

    JLH, it’s apparently not a space problem.

    the fundamentals (Web growing too fast) of what Phil *claims* as his own conjecture were mentioned in the blog over a week ago a number of times.

    In a nutshell, I mentioned that perhaps, due to the number of pages out there growing so rapidly, there are not enough hours in a day, or days in a week, to index the lot, at least not at this point in time. So, Google has come up with criteria (links perhaps) to determine which pages are indexed/included and which ones aren’t.

    Yes, Dave. The shortage of space idea, as a reason for dropping pages, was brought up as soon as this thread was started – Matt mentioned it right at the start – way before you mentioned it. Heck, we were discussing it in forums long before this thread was even started. It was Google’s CEO (Eric Schmidt) who first raised it. Matt mentioned it in his initial post because people in forums were discussing whether or not it is the reason for the pages being dropped. No points there for you, I’m afraid πŸ˜‰

    The conjectured fundamental change is because of index space – it isn’t a suggestion that there might be a shortage of space, and it isn’t the idea that the Web is growing too fast, and it’s not just that Google has come up with criteria to determine which pages are indexed and which aren’t. It’s a bit more fundamental than that, and it hadn’t been suggested anywhere before I wrote it in my forum, and then here. You’ve misunderstood what I wrote, Dave.

  827. I’ll try to make it a bit clearer…

    Like the other engines, Google started their crawler on the Web, and it crawled and indexed everything that it found, by following links from page to page and from site to site. That’s basically what search engines have done.

    But the time came when it was realised that the Web is continuing to grow at such a rate that it will never be possible to index it all, and yet the crawl/index function still continued to try – it continued to index everything that it found.

    So Google decided to abandon the idea of indexing the whole Web, and instead decided to focus on having every website represented in the index, but not necessarily all of its pages. If that happened, it would be a fundamental change to what Google is aiming to do concerning the indexing of the Web.

    That’s the fundamental change that I suggest might have happened. It’s a change from aiming to index the whole Web, to aiming to have every website represented in the index, but not necessarily all of its pages.

    When Google started, there was no need to do anything other than index everything that the crawler found, because the Web was much smaller then, and Google’s system was highly scalable. So the aim, and system, was to index everything. With the growth of the Web since then, that aim has become an impossibility, and they had to abandon the idea of indexing it all sooner or later. I’m suggesting that BD is the change.

    Since the start of this thread, I’ve written about the unfairness of dumping perfectly good pages from the index, but if BD really is the change that I’ve suggested, then I would have to say that it is fair, because it allows all sites to be represented in the index. Matt said that they are now indexing more sites than before, so I’m guessing that they must have had to leave many sites out previously, or maybe just took their index pages.

    On the other hand, if the suggestion is wrong, and BD is about dealing with unwanted links, then it is grossly unfair.

  828. Wow, want to see some major spam? I just reported it to Google. Over 4 billion pages according to Google…

    site:cgq7wm.org
    site:eiqz2q.org
    site:t1ps2see.com
    site:etlz8o.org
    site:viwhha.org
    site:qge6f7.org
    site:rfni70.org
    site:jkthy0.org
    site:geku8h.org

  829. Dave (Original)

    RE: “Yes, Dave. The shortage of space idea, as a reason for dropping pages, was brought up as soon as this thread was started – Matt mentioned it right at the start – way before you mentioned it.”

    Err no Phil, I never mentioned anything about a shortage of space.

    RE: “You’ve misunderstood what I wrote, Dave.”

    Yet again Phil, it’s the other way around. Tell me, do you even read what others write?? I doubt you do.

  830. Ha! For a change, you got something right, Dave. You didn’t mention space. Even so, you didn’t understand the conjecture, or you wouldn’t have written what you wrote to Janeth (who never mentioned anything about it).

    Janeth, the fundamentals (Web growing too fast) of what Phil *claims* as his own conjecture were mentioned in the blog over a week ago a number of times.

    You see? Complete lack of understanding of what I wrote. It wasn’t that the “Web [is] growing too fast”. If you read my later post, you might understand.

    Why don’t you learn to discuss things properly, instead of forever trying to score points and insult people? You haven’t managed to score any points yet, and you’ve tried enough times.

  831. Incidentally, Dave. I don’t agree that there aren’t enough hours in the day, days in the week, etc. for Google to crawl and index the entire Web. They crawl vast numbers of sites on a daily basis, irrespective of whether or not there are frequent changes (freshness) in the sites. They only need to crawl them less often than that, to free up a huge amount of time. So I don’t see time as being a problem.

    And before you try to score any points on that, I know you didn’t say that it is a problem. I know that you only suggested that it might be. I’m merely discussing it.

    Enough space to index the entire Web is the problem, imo, and I suggest that they’ve now stopped trying – they’ve changed tack. And I suggest that BD is the first big change in that respect. Before BD, their system tried to index everything it could. With BD, they don’t do that any more.

  832. Jeff, I found a nice explanation of that at this URL: http://www.duggmirror.com/technology/How_One_Spammer_Got_BILLIONS_of_Pages_into_Google_in_3_Weeks/

    I think that’s one of the flaws of the Bigdaddy update.

  833. Dave (Original)

    RE: “Ha! For a change, you got something right, Dave. You didn’t mention space”

    Yes Phil, I know, as would *you* IF you read what people write rather than jumping in with smart arse irrelevant remarks.

    I have no desire to discuss anything further with you Phil.

  834. Then I wish you’d stop posting in response to me. You keep saying you’ve stopped and then you come back with more – even writing a response to me, but addressing it to someone else, who never mentioned anything about what you wrote about.

    But you were wrong about one thing there, Dave. You haven’t tried to discuss anything with me, so you can hardly have no desire to discuss anything “further” with me. All you’ve done is throw insults and try to score points – but you’re just not very good at even that.

  835. Results are changing dramatically, as predicted. Pages are reappearing. This would point at a space problem now being resolved back to normality.

    malc

    stiffsteiffs teddy bears with attitude.

  836. Of the sites I’m watching, I see one site down massively (almost wiped out – again), another site up 50% but still half of what it was before BD, and another one up from 3 pages in the regular index to ~150 in the regular index.

    So there are gains and losses – a bit of a shuffle. I don’t think that things will get back to pre-BD normality.

  837. I’ve written about the unfairness of dumping perfectly good pages from the index, but if BD really is the change that I’ve suggested, then I would have to say that it is fair, because it allows all sites to be represented in the index.

  838. Just past the one year mark for our pages in the SI – cache shows pages from 14 June 2005!!

    I still don’t understand what Google is up to! Gbot visits every day – it covered the whole site a few days back, but none of the crawl appears in the index! What is the point of crawling if they don’t update the index? This has gone on for six months now. This still points to problems that either they are not aware of or are unable to resolve. Admittedly we have few links, and they are reciprocal, but if they are going to de-index us, why not just do it – why keep crawling?

    I have to admit that we do not understand what Google wants any more, and I think this is true of very many site owners. But there comes a point where you say: what’s the point anyway – just go with the other SEs and forget G. And surprise, surprise, traffic from other sources is increasing steadily as they move to increase their coverage of the web. So far Ask and MSN are the leaders.

    And Snap seems to be making rapid progress – I like the emphasis on the graphic elements of the website. Good for sites with lots of graphics, which get a poor deal with G, IMO.

  839. DavidW. Matt said earlier in the thread that they crawl more pages than they will index. He said it is because they don’t know what they might be missing.

    They might be missing links to other sites that they don’t yet know about, and, if what we have understood about BD is correct, they need to know about all the site’s links so that they can evaluate how much of it to index.

    I have a working/usable site that I haven’t yet ‘launched’, and googlebot crawled many thousands of pages from it more than a month ago. But I didn’t expect any of them to be indexed, except the homepage, and they weren’t.

  840. Incidentally, that site is very old, and none of its old content has existed for some time. But Google is suddenly showing some of the old pages as Supplementals, and their caches are dated Feb. 2005.

    I think that old caches like that, and why they turn up, are important to understanding what’s happening now. We don’t know why or where they have been stored, or why they are sometimes online and most times not. They show up sometimes in both the regular and supplemental indexes, but it doesn’t usually last for very long. Maybe they are backup indexes that are brought online while the current ones are being updated. But I don’t think it matters to us, because they don’t last very long.

  841. Sorry – that should have read:-

    I think that old caches like that, and why they turn up, are unimportant to understanding what’s happening now.

  842. On the contrary, they are very important. I just got around 200-some pages indexed on my newest site and they are all in the supp index.

  843. PhilC – Yeah, I know what was said – but what is the logic of it? If they visit a site/page, then presumably it is important enough to be noticed/crawled, so surely it follows that it is important enough to be indexed. If G go to the trouble of visiting a page, then they may as well put it in the index – if you see what I mean. It’s like reading the first few chapters of a book to see if it’s worth reading. Or visiting a hotel to see if it’s worth a visit. Weird logic in my mind.
    While I’m on the hotel theme, I may as well expand it. The Google system is like a hotel visitor relying only on recommendations from other, previous guests as to the quality of a hotel. These other guests are complete strangers; you do not know their tastes, reasons for visiting or whatever. You don’t even know what they thought of the hotel, only that they visited in the past. Surely we would all rather rely on our own judgement as to whether the hotel was any good or not.
    Google should develop their own method of assessing the quality of a website and not rely solely on links from other sites. Linking is a very imprecise way of assessing the quality/relevance of a site. Other SEs (some being developed) recognise the limitations of link relevance and are working fast to develop new algos – G have to move fast to address these issues or they will be overtaken by other, more progressive SEs. (I won’t post the URLs as this is Matt’s blog – just look around – and make it snappy!!)

    regards DW

  844. Jack, I meant (and said) that I think that old caches turning up are unimportant. We don’t know why they sometimes turn up, and I suspect that we are seeing a backup index when it happens, but they don’t stay very long, so I think they’re unimportant for us.

    New pages being in Supplemental does matter, though.

    DavidW. We don’t know the actual reasons why Google crawls more than they index, but the two suggestions that I made don’t seem weird to me. For instance, if I were an engine and I wanted to assess the links around a site to decide how much of it to put in the index, I’d want to know about them all.

    On the hotel visitor theme, I’d say that a link to a site/page is like someone who has visited the hotel once, and thought so much of it that s/he keeps on recommending it to other people.

    Linking is a very imprecise way of assessing the quality/ relevance of a site

    When Google first used link texts to assess relevancy, it worked very well. Their results outshone all the other engines at the time. It’s only because Google started to do it that way that link pollution has caused it not to work as well any more. Links were never a method of assessing quality, though.

    I doubt that Google is backward in bringing new methods and factors into their algorithm.

  845. I should say that over the last few days I have seen several updated caches for supp results. So hopefully that is a sign of good things to come. I can already tell the difference in sales.

  846. Dave (Original)

    RE: “I can already tell the difference in sales”

    Good news, I do mean that.

    So is Google now NOT broken πŸ™‚

  847. How is it that the largest spam attack in Google’s history is going on, and yet the comments continue to be dominated by a select few intent on arguing? I mean, kind of funny, kind of sad.

    The ability to hold billions of additional documents without seriously slowing down the searches, or even attracting attention, does put to rest the whole “out of space” theory anyways. πŸ™‚

    -Michael

  848. Dave (Original)

    RE: “does put to rest the whole β€œout of space” theory anyways”

    Matt put that to rest in his original post πŸ™‚

    Here ya go;

    “and we definitely aren’t dropping documents because we’re β€œout of space.””

  849. I will say it is showing improvement, Dave. Spam is still very bad though, especially those sites that have been posted on the forums, achieved with subdirectories. Don’t ask me to explain – it’s all way over my head, LOL. One of the search forums I read did say that we could continue to see better improvements re the supp index.

  850. How is it that the largest spam attack in Google’s history is going on, and yet …

    I understand that the spam attack has been dealt with.

    does put to rest the whole “out of space” theory anyways

    Out of space – yes. As Dave said, Matt mentioned it in his first post:-

    First, I believe the crawl/index team certainly has enough machines to do its job, and we definitely aren’t dropping documents because we’re “out of space.”

    Matt didn’t say that they have the space to index everything – only that they have enough space to do their job. And he didn’t say that the dropping of pages isn’t related to space – only that they are not dropping them because they are out of space.

    It reminds me a bit of DMOZ. Not too long ago they had a huge backlog of suggested sites waiting for review. But now they don’t have a backlog at all, because they have changed their minds about how they go about things. Suggested sites are no longer a pool of sites to be reviewed – they are now merely a pool of sites that may or may not be looked at, so there’s no ‘backlog’. If Google have changed the way they do their job, by intentionally not indexing whole sites any more, then they save the need for a great deal of space, and they have enough space to do their job – as Matt said.

    Is it coincidence that the massive attack succeeded at a time when a very large number of pages had been dropped from the index, freeing up a lot of space, or did the spam get in because the space had been made?

  851. I understand that the spam attack has been dealt with.

    If so, then I’m pretty sure you misunderstood. It is being dealt with, yes, but it is still going strong. Unless you meant that the manual banning of a small handful of domains that were registered about 3 weeks ago warrants calling the issue “dealt with”. So far (and I just woke up), I haven’t seen anyone from Google announce this was fixed.

    -Michael

  852. Are sites usually punished for linking internally? I’m working with a homepage that is almost completely links, but they are all internal. Is that going to affect the crawler?

  853. I must have misunderstood then. I read that they started to deal with it on Sunday morning, but I haven’t checked for myself.

  854. Thanks for your very informative entry!

    Also, where do we go if we want to ask a question? I read your guidelines, and my question is not related to the current topic.

  855. Hi Matt,
    I hope my English is good enough to explain my meaning.

    You have said: “bunch of reciprocal links, don’t be surprised if the new crawler has different crawl priorities and doesn’t crawl as much.”

    Example: web design.
    A web designer creates many websites, and the owners of these different websites give the web designer’s website credits – most of the time in the footer. The website of the web designer has an extra category, “my portfolio”, with links back ==> reciprocal in perfection.

    Or software, vBulletin, etc. …

    Now G would like to say: OK, your problem, we would not like to have reciprocal links… you are a very bad web designer?????

    If I saw a website with no reciprocal links, I’d know it is the work of an SEO.

    And the website of G itself: I’m sure it’s the website with the greatest bunch of reciprocal links in all the world. πŸ˜‰

    I wish you well
    Monika

  856. When it comes to linking, there is no such thing as a free lunch. Instead of the Bigdaddy update, it should be called the no-free-lunches update.

  857. Dave (Original)

    I thought this was VERY interesting, considering all the posts about Google losing ground since BD.

    MediaPost reports

    “MORE THAN 2.78 BILLION SEARCHES were conducted on Google last month, marking 32 percent growth since May of 2005, according to new data by Nielsen//NetRatings. Google’s share of searches reached 49.1 percent–far surpassing that of rivals Yahoo (22.9 percent) and MSN (10.6 percent). “

  858. Every time Google does something, and there are major changes in the serps, some people want to hit back, and one of the ways they hit back is by saying that Google has shot itself in the foot and will lose market share, and stuff like that. But the people who say such things don’t even represent a drop in the ocean of the Web population, so their predictions never come true.

    But the growth rate has nothing to do with market share, so the Nielsen//NetRatings figures are no real indicator. Google’s market share has declined significantly since its peak a few years ago, but that’s mainly due to business arrangements and not to users moving on.

    Even so, Google can easily lose market share due to users moving on. People continue to report that they find the results to have too many useless pages of various types listed. I haven’t seen it myself, although it must occur. Something like that only needs to grow, and people will move to engines where they don’t find all the crap stuff listed. Google’s position isn’t set in stone – they have to continually work at getting rid of the crap, or people will move on and they will go into further decline.

    A search engine’s position is never secure. Look what happened to Inktomi. Not many years ago, they provided the results for so many search engines and services that a website only needed to be indexed by Inktomi to have most of the searchers covered. Not long afterwards, being indexed by Inktomi didn’t matter at all. That was due to business arrangements, but it happened. AltaVista was the #1 engine at one time, but it simply went into decline and now it’s an irrelevance. That was due to people moving on. There hasn’t yet been a major search engine that has survived as a major engine for very long. Google certainly have to be on their toes, or they will follow suit.

  859. Concerned Webmaster

    Hi

    I’ve just checked Google Sitemaps and there seems to be a lot of action: googlebot (whatever it’s called now) has been crawling lots of pages – 24,000+. All of these pages are old and were removed a long time ago. To try and remove them from the Google index, I took the advice I read on this and other sites: I added lines to my robots.txt file and did a 301 on the pages. Within Sitemaps, Google shows these 24,000+ pages as blocked in the robots.txt file. Therefore I would have assumed that they and their supplemental results would have disappeared from the index. Oh no, they’re still there, every single one of them. How the hell do you remove things from this beast?
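    For reference, the setup described above would look something like this sketch (the paths and domain are hypothetical):

        # robots.txt – block crawling of the retired section (hypothetical path)
        User-agent: *
        Disallow: /old-pages/

        # .htaccess – 301 each retired URL to its replacement
        # (Apache mod_alias; hypothetical URLs)
        Redirect permanent /old-pages/widget.html http://www.example.com/widgets/widget.html

    One caveat that may explain the stuck entries: a URL that is disallowed in robots.txt can no longer be fetched, so the crawler never gets to see the 301 on it. The usual choice is one or the other – leave the URLs crawlable so the redirects can be seen, or block them and request removal with Google’s URL removal tool instead.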

  860. From my own experience as a web searcher, I have seen the quality of Google SERPS decrease steadily over the last two years.

    This trend cannot continue indefinitely without having an effect on Google’s market share and, eventually, the share price of Google stock.

    At this point, you are correct that very few people have noticed the increasing quality issues from which Google is suffering. However, a few articles in the right major publications can change that very quickly.

  861. I am still confused… How long does it take for a new site to be fully indexed?

  862. I guess I should have looked before posting.
    Both the sites I just posted are now back in the indexes.
    Weird.

  863. At last, Google’s results are once again the most accurate of the major search engines.

    I have to say in the last six months I was losing faith in Google’s search results and often gave up in frustration and found what I was looking for on MSN or Yahoo instead.

    It seems that Big Daddy has finally completed its task and Google is now producing quality search results again.

  864. I guess only Google is allowed to sell advertising.

  865. I have read the post about the “site:” operator issues (http://sitemaps.blogspot.com/2006/05/issues-with-site-operator-query.html) and I also posted some messages in the Google Sitemaps forum http://groups.google.ro/group/google-sitemaps/tree/browse_frm/thread/5918416e6cc9a08a/82468255bbd96906?rnum=31&hl=ro&_done=%2Fgroup%2Fgoogle-sitemaps%2Fbrowse_frm%2Fthread%2F5918416e6cc9a08a%3Fscoring%3Dd%26hl%3Dro%26&scoring=d#doc_d1476b7c7e2cd7cc (the whole post is named “G deleted about 300 of my webpages from the index; thinking about putting up an XML sitemap, but many posts here say that is HURTING them! What to do, yes or no??”),

    because I really think that, technically, the Google Sitemaps protocol hurts the site index (please read the Google Sitemaps post for more explanation).

    I am also taking into consideration that this might be a “site:” operator issue, but I am 90% sure that it is not (judging by the googlebot behaviour).

  866. Concerned Webmaster

    Hi

    What is the point of having 89,000 pages in the Google Supplemental Results index if the pages no longer exist? Is this adding value for Google users? Googlebot has been crawling my site and these URLs; it sees that robots.txt blocks them (also 301 redirects), and still I see these results in its Supplemental Results index. There is a bigger issue than just this: when I log in to Sitemaps I see no penalty for my site (it states that my site is in the index), but there must be one, because my index page, which is not Supplemental, is buried in the Supplemental Results. I have a theory that within the Supplemental Results there is duplicate content and that is the reason why. HAS ANYONE GOT ANY IDEAS about this issue, including Matt if he’s back from holiday yet?

  867. Why are sites like this put at the top of Google and businesses are not?

    http://www.smartresell.com/seo-help.html

  868. Hi Matt,

    I work for a web design firm and we’ve recently had several clients come to us for SEO with sites that have a weird PR 1 penalty/filter. The homepage of each site has a PR 1 when it should be much higher, and the other pages in the site have their normal PR (higher than 1).

    This seems to be the result of them either getting links too quickly, or getting links with too similar anchor text. Do you have any insight you could share on this penalty/filter, and how to improve a site and/or its links to remove it? There doesn’t seem to be any real info on this anywhere else, and we’ve seen it with at least 5 different sites.

    Thanks for your time,
    Phil

  869. Has the algorithm part of bigdaddy been aborted? I seem to be seeing all the indexes in their pre-bigdaddy state; so, if the indexing was supposed to be complete (and if “sf giants” is supposed to find “sfgiants.com” as the top result), what happened to the new index?

    …or did the sf giants query become a bad litmus test?

  870. I wish that WMW would not be the Gate Keeper for a dialogue with Google. First of all, you cannot get the info unless you pony up $150 per year to Brad Tabke, and Tabke has a way of not enabling your account even when you pay him the $150. Then when you complain about it, you are lucky to get a response. In May of 2005, I signed up for an account. It took until August for Tabke to enable it. Did he compensate me for the months of an inactive account? Hell no. I am not sending him any more money.

    But what is it about Tabke? Why do I have to send this moron $150 to hear what Google has to say?

  871. I wanted to write and say “thank you” for Big Daddy. It has drained away the last percent of my Google traffic from my travel affiliate site, so I now get zero referrals (except from Google Images, which provides no sales conversions and is just people plagiarising my pictures!). This has devastated my online business – which is currently my only source of income.

    It has to be said that Google has “had it in” for affiliate sites since 2005 and has taken every opportunity to nail any site which has achieved anything in the Google SERPS and displays affiliate content. Big Daddy is no exception. It’s almost like you know when an affiliate gets top 10 and instantly apply a penalty the day it happens.

    I have continually striven to build quality content around my affiliate site to make it have value add, but that clearly hasn’t been good enough for Google. Quality content takes time and to try and recover something from the time I invest in my website means that I have to advertise on it. What’s wrong with that?

    It’s understandable why many people, myself included, deeply detest Google’s arrogance, and having read your “Indexing Timeline blurb” it’s easy to see why. You make comments on removing indexed content from a website just because it links to dissimilar-theme sites. Who are you to pass judgment on who a website may or may not link to? It may be that some of these sites have a perfectly good reason to link to a few dissimilar-theme sites; you don’t understand their business, so how can you make this judgement?

    In addition, Google “has it in” for small business websites too, presumably so you can make more money from AdWords. You have deindexed loads of the directories that we relied upon to give us one-way links. This has reduced our backlink count and trashed our sites’ rankings even further through link attrition. Then you penalise us for too much reciprocal linking when we try to recover – even when we partner with similar-theme quality sites.

    I am sorry to say that I think Google’s popularity will wane with time unless it starts treating people and their livelihoods more fairly. Even many of my non-SEO business associates complain about the continual SERPS churn with Google. It really has become an index of “here one day, gone the next”. Your SERPS are all over the place as you continually apply penalties here, there and everywhere. Your SERPS also heavily favour large authority sites and filter the rest out of existence. The Sandbox filter for new sites is just another example of Google unfairness.

    I hope sometime soon Google gets a kick in the butt that will really hurt, just as you have hurt a lot of small business livelihoods. You are, after all, a bunch of fat, bagel-eating yanks who feel you own the web.

  872. So why would Google have my site indexed, but only the homepage?

    All my other pages are good and have content… but the homepage is all that’s there… And it has PR 0. What would be up with that?

  873. I have a reference site that a lot of people link to without telling me. For years, I have maintained a “reviews” page from where I link to the sites that say interesting things about my site. My pagerank has dropped (from 7 to 6, not the end of the world) and it occurs to me that most of the links from my site are reciprocal, although the other sites didn’t ask for it. How can a dumb algorithm distinguish this type of reciprocal linking from the artificial type?

    I may have to get rid of those links to protect my site and theirs from ranking retribution.

  874. I’ve switched to MSN search, since the Google algorithm now thinks that newspaper articles are obsolete content… (they get indexed 6 months later, and when I go to the newspaper I get a “sorry, this article has expired” message.)
    Sorry Google, but in trying to ban spammers you banned lots of real content.
    Since the info I was looking for was on some little local newspaper, they don’t and never will have lots of quality links, so they’ll get crawled once every other year…
    I also know several students who switched to Yahoo or MSN because of the fresher content that they can find. I’m sorry to say it, but Google broke their rule number 1; by censoring minor newspaper content from their results *they became evil*.

  875. To echo our friend Steve, a few above this, it seems Google feel they can absolve themselves from any guilt over other people’s small businesses crashing and burning, by the simple resolve of saying that they should not base their business plan around Google – yet it is these same people who have made Google what it is, and bought their shares. It’s a bit like getting into political office on the back of 90 per cent of the population voting for you and then disenfranchising them once elected. Does no one at Google feel even the slightest remorse that the antics of their algorithm, now known as a world leader, has in its eccentric and sometimes mistaken path taken out countless innocent businesses and private websites which grew into being with it, only to be betrayed by it? It’s like having an all-powerful guided missile with inherently duff navigation once in a while.

    Every tinkering with the algorithm that fails may be just an exercise in data manipulation gone wrong at Google, but in the ensuing three to four months of chaos it causes to real people with real businesses in a real world, it is cataclysmic and ruinous. I have just gone bankrupt, having had all 12 of my websites reduced to one page. Now I’m bankrupt they are back up there, but of no use to me now. Now that to me is serious, but it seems that to Google it is “pass the bowl that I might wash my hands”. I don’t think it is good enough to know you are not just number 1 but of almost deity status in search lore on one hand, yet proclaim no accountability or blame when the algorithm plainly goes wrong and condemns countless people and businesses to bankruptcy, and suicide even, in the interim until things are eventually rectified.

    Through all of this there is the arrogance that no one can directly contact or question Google, except if it deigns to put out missives on this blog via Matt Cutts. How many of you out there ever got much more than a bloody machine-generated answer to a question you did not ask in the first place, after 5 days of waiting, with no attempt to read your complaint/query however valid? This is why we are all consigned to writing in this blog, because there is NO OTHER WAY OF CONTACT OR DISCUSSION, yet their actions arbitrarily ruin our years and years of work at a stroke, without even noting anyone’s problems or queries.

    Like Steve, I will not be crying when Google go under, as all before them have done as soon as they thought they were better than their “customers”. This algorithm is flawed now, and creaking under the strain of simply having enough space to service everything they are now into – a war on eight fronts, with us in no man’s land till they figure it out.

    Google was fun, and we counted to them; now it is just another big business making money which just doesn’t see us anymore or care what happens to our sites.

    Malc Pugh
    England.

  876. How about just expiring all supplemental results after one year?

  877. Matt,

    I read your “comment guidelines”; I think this meets the criteria.

    TO LINK OR NOT TO LINK, that is the question.
    I have been in touch with two SEO experts. One I am paying by the hour, and the other I am more or less paying, since I purchased his software for SEO ranking.

    I am getting two completely different answers from them on LINKS, and what I have found on the internet regarding this subject makes me feel like it is a raging debate.

    One has told me to completely remove my LINK page. The other has said to add more “similar” LINKS to my LINK page.

    So is it:????
    a link page or not
    inbound links only
    outbound links only
    reciprocating links
    links embedded in text that are germane to the topic being discussed
    link directories or not
    link farms or not (how do you know farms from directories?) [I think I saw link farms referred to as web rings] (are they the same?)

    You can also see my web site has a hyphen, so I have felt the effects of BIGDADDY, and I am still trying to recover and get back to page 1.

    If you haven’t figured it out, I am not technical. I have just been trying to learn what I can on the topic of SEO, specifically Google. I am on page 1 on most of the others.

    If this last part is too site-specific, please ignore (delete) it and focus on my (and others’) confusion about what role linking plays in getting a high ranking.

    Any definitive advice would really be helpful and most appreciated.

    Thank you for your time.

  878. I have read all comments (took a while), and I have to say that I am part of the silent majority.

    I also have to say that all our increases in traffic came from real-world events and activities that caused other people to naturally link to us. I think that having a business based on a website is not like having a business in the “real world”, and that it’s only natural that it goes up and down depending on search engine changes.

    If you get all your money from affiliates, ads, etc., with no real value on your website… well, it’s only natural that any change in the web ecosystem takes down your whole business. Sorry to be so harsh.

  879. I mean the “silent majority” that was unaffected by Big Daddy, and finds that Google is doing OK.

  880. dear matt

    does google still penalise for spurious website links in guestbooks?? i have none any more, but occasionally someone “invents” one for me which has porn links and the like, and which i sometimes miss for days/weeks. is google aware of this practice and hence does not penalise, or is this still viewed as an odious practice by the actual webmaster and synonymous with evil, and hence punishable by banishment/death/oblivion?

    thanks

    malc pugh

  881. Hi Matt,

    I was looking for a more recent blog post that put broader emphasis on the latest updates within Google’s algorithm, but unfortunately I was unsuccessful.

    My questions are mainly focused on the most recent PR changes that I have noticed across the DMOZ directory. Many pages have been stripped of their previous high PR rating. For example, large categories such as http://www.dmoz.org/Shopping/ have a “0” PR, but http://dmoz.org/Shopping/ has a rating. This was not the case at the beginning of this year, or even just 3-4 months ago. Google has had a recent export of backlinks to all servers, as well as an update to the toolbar PageRank, but I have not noticed any visible updates to the algo. I have also noticed that Google now gives less importance to backlinks from directories, except of course the Yahoo Directory, DMOZ and Business.com.

    Great blog,
    db

  882. Can anybody tell me: if I want to write my own search engine, where should I start??

    Please post your answer to me… any answer from you makes sense for me.

    Thanks… Mehul
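    For what it’s worth – and this is only an illustrative sketch, nothing Google-specific – the conventional first step in building a search engine is an inverted index: a map from each word to the documents that contain it. A minimal version in Python (the sample documents are made up):

        # A minimal inverted index: the core data structure of a search engine.
        # It maps each word to the set of document ids containing that word.
        from collections import defaultdict

        docs = {
            1: "google crawls and indexes the web",
            2: "the supplemental index holds extra pages",
        }

        index = defaultdict(set)
        for doc_id, text in docs.items():
            for word in text.lower().split():
                index[word].add(doc_id)

        def search(word):
            """Return the ids of documents containing the word."""
            return sorted(index.get(word.lower(), set()))

        print(search("index"))  # -> [2]

    Crawling, ranking and scale are where the real work lies, but everything discussed in this thread – supplemental indexes included – sits on top of a structure like this.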

  883. Matt,

    I read the entire section on this page, and I’m new to the SEO stuff (not new to programming logic, though)… but it seems that what you post is a lot of guesswork and just a needle-in-the-haystack kind of take on the SERPs. I think you work for Google, but is it entirely possible that you are misleading people? Not intentionally, but because your own resources are limited.

    Just a skeptic on the net.

  884. dear matt

    still await answer to guest book question above

    cheers

    malc pugh

  885. I have a very straightforward question, at least I hope it is. I manage an online catalog of ~30M individual SKUs (no word-stem games or alternate representations; they are all true SKUs). Would I be better served to put them all under one domain (~40M pages of content), or to split the catalog up under multiple domains? I do not want to be viewed as spamming the index, but I would imagine 40M pages on one domain would trip a filter or two. At the same time, if I split the catalog up and interlink the sites amongst themselves, would that not also be a bad thing? Please help, I am not sure what to do next…

    Thank you kindly
    J-man

  886. How many domains do you plan to use? Even if you use 40, it will be 1M pages/domain. How about creating domains for each category/group in your catalog, and subdomains for each subgroup within that domain:

    Catalog –
    ——–domain1
    —————sub 1
    —————sub 2
    ——–domain 2
    —————sub 1
    —————sub 2

    Now, what can contain 30-40 Million SKUs, hmmmm
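    A different, purely mechanical way to split such a catalog – not what the comment above proposes, just a sketch under the assumption that the split only needs to be stable – is to hash each SKU into one of N domains, so a given SKU’s URL never moves between domains (the domain names are hypothetical):

        # Assign each SKU to one of N catalog domains by hashing, so the
        # mapping is deterministic and a SKU's URL never changes domain.
        import hashlib

        DOMAINS = ["catalog%d.example.com" % i for i in range(1, 5)]  # hypothetical

        def domain_for(sku):
            digest = hashlib.md5(sku.encode("utf-8")).hexdigest()
            return DOMAINS[int(digest, 16) % len(DOMAINS)]

        print(domain_for("SKU-0001234"))  # always the same domain for this SKU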

  887. Holy Crap of all crap! People are still commenting on this one! That’s got to be some sort of blog record or something.

  888. I am still trying to figure out why some supplementals are listed so high in the search results. They are supplemental, after all, aren’t they?

  889. Thanks so much for the useful info, especially about the supplemental results. I was just wondering about this…

  890. How many reciprocal links are too many? Are there any guidelines on reciprocal linking?

  891. But do supplemental pages get rankings? Are they shown in SERPs? Will they be placed higher than the pages which are in Google’s main index? Is there a way to remove supplemental pages quickly?

    It would be great, Matt, if you could mail me. Thank you for the great, informative post.

  892. Thanks Matt,
    Best post ever. I agree that the Bigdaddy update did go very well. The supplemental pages issue is certainly sorted out (for the sites I was monitoring, anyway).

    Replying to UK Loans: supplemental pages appear to get rankings; indeed, I understand that every page that is indexed gets ranked in some shape or form. They will not appear above pages that a) are not in the supplemental index and b) contain the same search phrase.

    The best way to remove SPs, in my experience, is to make sure the headers returned when these pages are requested contain a status of 404 (a minimal check of this is sketched just below)… and submit sitemaps to Google Sitemaps.

    And BTW, Matt is too busy to email you, der!
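    To be concrete about what “a status of 404” means here: it is the HTTP status code in the response headers, not words on the page. A minimal way to check it – just a sketch, with a hypothetical URL, using only Python’s standard library:

        # Confirm that a retired page really answers with HTTP 404,
        # since crawlers key off the status code, not the page text.
        import urllib.request
        import urllib.error

        def http_status(url):
            try:
                with urllib.request.urlopen(url) as resp:
                    return resp.status            # 2xx/3xx responses land here
            except urllib.error.HTTPError as err:
                return err.code                   # 4xx/5xx raise HTTPError

        print(http_status("http://www.example.com/retired-page.html"))  # expect 404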

  893. Anyone still with questions about links might want to have a read of this post from Steven Bradley: http://www.yellowhousehosting.com/resources/

    I read your post, Matt, and browsed the comments; Steven’s post answers a lot of the questions within the comments your post received. Providing people do read!

  894. Matt, my apologies for posting here just to offer a link back to myself, but I think Thomas meant to link to a different page on my site.

    The link Thomas posted above takes you to the main page of my blog, but since the content of that page changes with each new post I wanted to add the link to the right post to prevent people from clicking a link and not finding what they were looking for on the other end.

    I believe Thomas meant to link to the post 8 Signs Of A Quality Link

  895. Matt, as always you rock!

  896. Well, to be honest, no one can predict what Google will do next.
    Just try to stay within the guidelines.
    What I also want to say, as advice: stay away from Google Sitemaps if your site is already indexed. Use it only if you want to resubmit your site for indexing after a ban.
    This is my own experience.

  897. Nothing about Google can be predicted! Looks like we’ll have to obey their rules and stay within the guidelines.

    I quit using Google Sitemaps and my sites are doing pretty well.

    Maybe it’s nice to make an internal sitemap as an XML file and link it from the main page, just to speed up the crawling process, nothing else. The XML sitemap could be removed later (a minimal example follows below).
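    For reference, a minimal sitemap file of the kind described – with hypothetical URLs, using the standard sitemaps.org 0.9 schema – looks like this:

        <?xml version="1.0" encoding="UTF-8"?>
        <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
          <url>
            <loc>http://www.example.com/</loc>
            <lastmod>2006-06-01</lastmod>
            <changefreq>weekly</changefreq>
          </url>
          <url>
            <loc>http://www.example.com/about.html</loc>
          </url>
        </urlset>

    Only the <loc> element is required per URL; <lastmod> and <changefreq> are optional hints.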

    From time to time I see glitches in the SERPs, but this is usually corrected within a couple of days.

  898. A good link is usually a link from a web page with a very similar theme.

    Forget PageRank! πŸ™‚

  899. Try to keep your link exchanges as natural as you can, which means: don’t always use the same link text, and do recips!

  900. This is the closest place I could find to tech support. πŸ™‚

    My site used to have 12,000 incoming links on Google. Yahoo and Alexa and Technorati still do. Now I have -0-. Have I been delisted? How do I find out why? The Webmaster error tools say everything is fine, but none of my content is listed anymore.

  901. Nice stuff… I need a few days to understand all of this.
    – Bigdaddy is a new crawler/proxy, possibly shared with AdSense (? AdSense has a different agent signature)
    – Indexing is a separate architecture layer; it includes a subset of crawled pages
    – The “supplemental index” is an old system (April 2006); possibly some parallel upgrades (April 2007)

    P.S.
    Please remove the meaningless “Supplemental Results” tag from the SERPs. It looks like a “Supplement” for Big (very young) Daddy.

  902. Thanks for all the info, and I will stay away from ringtone links.

  903. Hello
    I’ve read the above comments and got a huge amount of information. I’d like to give my suggestions on how to reduce supplemental results from your website. It is quite natural when you are going against the rules. What rules do we have to acknowledge? Do not link to sites that have too many inbound links from unrelated websites, and if you feel uncomfortable with your website’s indexing, submit your sitemap again for reindexing. The more often the sitemap is submitted, the more results you will get. I started building my website at the beginning of this month, and soon I realized that my pages were going to Google’s supplementary ‘hell’ section. I was surprised to see this unusual behaviour from Google. I had heard before that Google drops backlinks, and about its algo, but I was shocked when I could not get results for the supplementary indexing of Google. In other words, Google’s supplementary results are nothing more than a hell. But I didn’t lose hope, and I submitted my sitemap to Google twice in a week. The results are really amazing: Google indexed my 13 pages, and most of the pages came back from the hell area of Google and are now displaying as cached pages. I hope you like my idea for getting your site back from Google’s hell section.

    Thanks

    Bilal

  904. Yes, the sitemap also matters, but Google crawls all pages without a sitemap.
    If we use a sitemap, we get the maximum number of pages indexed. Thanks Matt… nice post.

  905. Can someone define, once and for all, the term spam!!!

  906. Can we trust people who call themselves SEO gurus?

  907. When will Google say something official about SEO?

  908. Wow, I think I have made that mistake before – the one involving the http://www.domain vs non-www host names.
    I think it’s a common mistake that people make (the usual fix is sketched below).
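    For anyone who hits the same www vs non-www canonicalization issue, the usual fix is a site-wide 301 from one host name to the other – a sketch assuming Apache with mod_rewrite enabled, and a hypothetical domain:

        # .htaccess sketch: 301 every non-www request over to the www host
        RewriteEngine On
        RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
        RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

    Either direction works; what matters is picking one host name and redirecting the other to it consistently.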

  909. It is evident that many disregard the quality of the sites they exchange reciprocal links with. While they think that all those links help their ranking, the links can really have a negative effect on ranking. Am I right? And what about submitting to directories – what is Google’s stand on that?

  910. Natural link text is something that Google should always count. It is what separates a meaningful link from the billions and billions of “click here” links all over the internet. SEO professionals know this, and they always advise using natural link text.

  911. Coding has nothing to do with PageRank. And I think almost everyone agrees by now that the green toolbar that shows PR is pretty meaningless.

  912. Some of this stuff sounds like no-brainers! Why would a real estate site link to a quit-smoking site? Can you say RED FLAG?

  913. I am very interested in the question of linking and how it affects the positions in SERPS. I agree that relevant links are required for good positions.

  914. What exactly does “natural link text” stand for? A regular link? Thanks,

  915. Matt, why does it happen that Google sometimes seems to go through a phase of picking up the wrong pages for a site? At the moment I am experiencing this in almost every instance with my site (linked in my name). For example, I would do a search for ‘Joe Weider Pro 9635 Home Gym’ (I have a full-length post about that), but the listing in Google is for my page on the Joe Weider 1120 Home Gym. I can give numerous similar examples.

    This has now been going on for about 2 or 3 weeks and has been affecting my listings severely. It is extremely frustrating and is causing a significant drop in income. I haven’t done anything strange on the site just before it occurred. My posts are all written by myself – no duplicate content issues. No black hat SEO. It would be great if I could understand why this happens. Is there anything I can do to get my very steady, relevant listings back?

  916. Great!
    So relevant inlinks are always welcome, but what if I, as a web developer, sign the websites I develop with a footer link to my personal website?
    How does this affect my clients’ websites?
    How does this affect my personal website?
    Is this good practice?

  917. Matt / All

    There seem to be many sites that Google has not re-indexed for months and in some cases years, resulting in inaccurate information appearing in the results (snippet descriptions). These seem to be especially prevalent with domain information web sites, including

    http://www.souv.net/whois-com/ip-address/domain.com
    http://www.who.is/whois/
    http://whois.domaintools.com
    http://www.domaincrawler.com/domains/view/

    For example when performing the following search

    DotCorner.com last update date site:http://www.souv.net/whois-com/ip-address/

    The snippet text reads:

    Jul 24, 2008 … Last Update Date: 2007-09-05. Name Servers: … Registration Service Provided By: DotCorner.com. Contact: …

    The live page shows

    Last Update Date: 2008-09-07

    Indicating that Google has not re-indexed the page for at least 11 months.

    As a result there are many cases where performing a name search on an individual shows up domains that the individual may have owned years ago. In my case, Google is showing me as the owner of domains which now either contain explicit content or names suggesting explicit content.

    Is there a reason for this, and what can be done to fix the issue?

  918. In Google search, I can find results which were published just minutes before, whereas in the case of my blog, I can see my posts indexed only after a few days. This problem started from my second post onwards; my first post was indexed and available in Google search within a few minutes of publishing. Is this due to some setting problem with my XML map or robots.txt plugin?

  919. Hi, my website indexed status is:

    URLs submitted: 8,079
    Indexed URLs: 445

    I don’t understand why that is. If 8,079 URLs were submitted, why are only 445 URLs indexed? Please advise. Thanks,
    tina

  920. Hi Matt,

    I’ve been in the industry in the UK since before you guys started your world domination. Today I spent some time looking through some old bookmarks on a PC I have not used for years and found this one. Talk about a trip down memory lane – Big Daddy was just one of many waves to come from you guys, though I think Florida was ‘The’ industry tidal wave that will never be forgotten.

    Anyway, I hope I’m not too off topic here; I was just amazed and full of nostalgia.

  921. I’ve just read this, and it seems that spammers are the only people who will have a site that ends up penalized by Google’s indexing rate. I’ve been searching high and low these past few days for any reason my own site has stopped getting indexed quickly. I really can’t see any reason.

    I would really love to have some light shone on it, but after visiting the webmaster forum I was met by guys who do nothing but talk down to you, and ever since then the indexing has gotten worse. Matt, I’m getting real depressed about this. I really don’t know what’s going on. Everything I do is for the people; I’m not even building any links to the site – it’s all natural – but since 15th June indexing has gotten really bad, and even worse since inquiring on the help forum.

    I know you don’t have to reply, but I’ve got real respect for you and Google and for the people who visit my blog. Not being able to publish posts on hot topics and get traffic for them is really making me feel depressed when me and big Steve are working real hard on this site. I do hope you can reply to me.

  922. This is true. In this age of internet business, many people spam just in order to create a negative impact on your site. So try to keep a safe distance from such spammers.

  923. I’ve noticed an excellent decrease in indexing time over the past few years. Google’s really good at picking up new content and indexing it in no time, especially if you use WP.

  924. Great ‘old’ post! Especially so now the Panda update has been released (and not before time). Funny reading about Big Daddy – a lot of time has passed since then, and I have to say I think Google has maybe lost a little focus, as with all the updates I still see irrelevant sites popping up in searches above sites with genuine content and offerings. Hopefully you guys (Google) are on the right track with Panda. Time will tell, but we are seeing some good sites lose rankings as well as the bad.

  925. Matt you are a rock star! Thanks

  926. A very informative article for those who are looking forward to a career in search engine optimization.
    Thanks a lot.

  927. It’s interesting that 8 years later people are still trying the same techniques of spammy affiliate blogs and link farms. 8 years, people!!
