WordCamp 2007 talk: Whitehat SEO tips for bloggers

By the way, if you enjoyed my Straight from Google: What You Need to Know talk from WordCamp 2009, you might also enjoy my WordCamp 2007 talk: Whitehat SEO tips for bloggers.

For convenience, I’ll include the video below:

And here are the slides from the 2007 WordCamp talk:

Not everyone has seen it, so I hope folks enjoy this talk from 2007!

Whitehat SEO tips for bloggers

Okay, I’ve got a bunch of pointers to summarize my WordCamp 2007 talk.

First off, here’s the PowerPoint deck that I presented. Google’s PR team was kind enough to verify that it was okay to release. I made the slides from scratch (not even a Google template), so there shouldn’t be any problems with notes in the slides or other metadata. Also note that I made this entire presentation the day of the conference, so let me know if there are unclear parts.

“But Matt, some of that talk is just bullet points! Where’s the context?” you might comment. Ah, I’m glad you mentioned that. John Pozadzides attended WordCamp and taped the talks, and he recently put up a video of the talk.

“But Matt, I don’t have an hour to spare to watch the video!” you might comment. Ah, I’m glad that you mentioned that. David Klein was at WordCamp, and he transcribed the talk into text form.

“But Matt, that transcript has a lot of words. It could take me 20-30 minutes to read all that!” you might comment. Well, I’ve already pointed to Stephanie Booth’s write-up of the session. You could also read the summary that Lisa Barone wrote. Or check out Stephan Spencer’s coverage for CNET.

Now you understand why I blogged about Alex Chiu a while ago; I used him as an example in my talk, so I wanted to explain what those two urls in my PowerPoint meant.

If you read Stephan Spencer’s write-up, he says some people thought that underscores are the same as dashes to Google now, and I didn’t quite say that in the talk. I said that we had someone looking at that now. So I wouldn’t consider it a completely done deal at this point. But note that I also said if you’d already made your site with underscores, it probably wasn’t worth trying to migrate all your urls over to dashes. If you’re starting fresh, I’d still pick dashes.
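If you're starting fresh and want dash-separated urls, here's a minimal sketch of how a blog might build them from post titles. The function name `slugify` and the exact cleanup rules are my own assumptions for illustration, not anything Google or WordPress prescribes.

```python
import re


def slugify(title: str) -> str:
    """Turn a post title into a lowercase url slug with dash-separated words."""
    # Collapse every run of characters that are not a-z or 0-9
    # (spaces, underscores, punctuation) into a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    # Drop any dash left at the start or end.
    return slug.strip("-")


if __name__ == "__main__":
    print(slugify("Whitehat SEO tips for bloggers"))
```

This only covers new urls. It deliberately does nothing about existing underscore urls, since migrating those probably isn't worth the effort.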

I also wanted to point out something I’m pretty proud of. If you were at the site review session at Pubcon last year in Vegas, you might remember that there was a chiropractor who wanted to do well for the query [san diego chiropractor]. At the time, Danny Sullivan teased him a bit and said “Well, you might want to put the words ‘San Diego Chiropractor’ together on the page that you want to rank.”

Well Danny, that site owner was David Klein and he took all the PubCon advice from the panel to heart. He started a blog, tweaked the copy on his site, and has even started to learn great linkbaiting techniques. For one thing, he transcribed the video of my talk, trading some effort on his part for a useful resource. Even better, he came to WordCamp with a creative idea, a pad of paper, and a digital camera. As he met folks at WordCamp, he had each person write their name, their website, and something that they wanted to do. Then he created an original cartoon of that person doing that thing. Go to the post with Matt Mullenweg and click on the picture of Matt to see what I mean. Matt said he wanted to be a writer, so David posted a cartoon of Matt as a writer.

How is this smart? People love to talk about themselves, and love to see themselves in the spotlight. So these little cartoons are natural linkbait: “Hey look, he drew me as a Photoshop plug-in developer!” How much did it cost to do this particular idea? Practically nothing: just the initial creative brainstorming and a little bit of elbow grease.

It was neat to see a regular site owner go from not knowing much about SEO in November 2006 to really improving his traffic with some creativity and straightforward changes. A good SEO can tune up your web site. But if someone is willing to take the time to study SEO, look for fresh ideas, and put in some effort, a regular person can definitely improve their website (and rankings!) as well. To see that come true with a chiropractor that several of us gave feedback to just last year was really exciting. That’s one of the big things that has stayed with me from WordCamp.

Update: Clarifying that Stephan’s write-up didn’t say that dashes and underscores were the same. Thanks, Stephan!

What was new at Searchology?

Google launched 3-4 new features at Searchology today. You can read about Search Options, Google Squared, Rich Snippets, or Sky Map in my previous post. But I also pay attention to the small things that Google said. I noticed several tidbits that I don’t think we’ve said in public before.

– Pat Riley mentioned a couple of internal code names for spell-check features. There’s the normal “Did you mean:” spellcheck link in red at the top of the search results. Then there’s a more aggressive feature (internal Google code name: “Chameleon”) that does mid-page suggestions:

[Image: Chameleon result for labor]

Finally, there’s an even more aggressive feature (internal Google code name: “Spellmeleon”) for when we really think the user messed up. In that case, we’ll include a couple results for the corrected query first, then results for the user’s original query. Take the query [ipodd] for example. Our algorithms strongly suggest that the user meant to type “ipod” so we’ll include those search results first.

[Image: Spellmeleon result for ipodd]

By the way, if you’re a power searcher and you really did want “ipodd” then you can do the query [+ipodd] with a ‘+’ character in front of the word that you want to match exactly. Let me just say that Spellmeleon makes life *so* much better for my webspam team. Tons of spammers target typos and misspelled queries all the time. If users see a couple of valid results before they see results for a misspelled/typo query, well, let’s just say that users are exposed to a lot less webspam in Google. I’m a big fan of Spellmeleon. 🙂
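To make the ordering concrete, here's a toy sketch of the blending described above: a couple of results for the corrected query first, then the results for the query as typed. This is not how Google implements Spellmeleon. The function name `blend_results`, the `lead` parameter, and the made-up result lists are all assumptions for illustration.

```python
def blend_results(corrected, original, lead=2):
    """Put a few results for the corrected query ahead of the original query's results."""
    # Take the first `lead` results for the corrected spelling.
    head = corrected[:lead]
    # Follow with results for the query as typed, skipping anything
    # already shown so no result appears twice.
    tail = [result for result in original if result not in head]
    return head + tail


if __name__ == "__main__":
    # Hypothetical result lists, not real search results.
    ipod_results = ["ipod-page-1", "ipod-page-2", "ipod-page-3"]
    ipodd_results = ["typo-page-1", "ipod-page-1", "typo-page-2"]
    print(blend_results(ipod_results, ipodd_results))
```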

– Pat Riley also mentioned that if you do some of these search improvements in a naive way, the additional server load is equivalent to if Germany and France just appeared out of nowhere and started sending all their daily searches to Google. So you have to do some smart things to make this search improvement viable.

– Scott Huffman revealed that mobile search results are blended between results from the mobile web and results from the regular/normal web. Makes sense, but not everyone knows that.

– Marissa Mayer mentioned that about 1 in 4 searches triggers a universal/blended search result.

– Marissa also mentioned that 40% of searches on any given day are repeat searches for that user (I’m not sure if that means repeated that day, or just repeated compared to past searches). She mentioned that to explain why SearchWiki can be useful, because if you’re repeating a search, you may want to customize the results to your taste. Marissa also said SearchWiki receives hundreds of thousands of annotations each day.

– Someone asked what Google is doing to crawl the deep web. My advice is to check out Jayant Madhavan’s paper to read more. Here’s a direct link to a copy (PDF) of the deep web crawling paper.

– Someone asked how important it is to search video with a text search query. Google did this for political videos during the election and I’d really like to see more in this area. Together with fellow Googler Wysz, I’ve made about 50 videos to answer common webmaster questions. Right now it’s a pain to create caption files for those videos. If Google could give me a rough speech-to-text transcript (with timecodes) and let me edit the transcript to correct errors, that would be fantastic. Then someone in Turkey could read my videos even if they didn’t understand English. I would love that.

– In answer to a question from Vanessa Fox, Kavi Goel mentioned that Rich Snippets will roll out slowly at first (probably beginning as a whitelist of trusted sites) but that over time, more and more sites could show up with rich snippets. You can read more about rich snippets on the Google webmaster blog or see example code. And if you’re really into RDFa or microformats or rich snippets, the folks at O’Reilly did a nice interview with two Googlers (Othar and Guha) involved with the project.

– TechCrunch got some video of Google Squared. The whole video is interesting, but the part that I thought was funny was 4 minutes, 12 seconds into the video where the Google demo person signs into Google Squared and Michael Arrington does the polite “password lookaway” and looks at Steve Gillmor, who is also doing the polite password lookaway.

– Finally, Tara Calishain asks for a custom date range form (hear hear!) and then points out something I missed. Once you move into searching with date ranges, you can sort Google results by date. This opens up lots of options for power searchers. Here’s a search for [hubble telescope] with sort-by-date selected:

[Image: Sorting by date for hubble telescope]

That’s pretty useful.

– If you want to see the slides from Searchology, it looks like Yvo Schaap took the time to snapshot each slide as it appeared. Until/unless Google releases the slide deck, that’s where you can see the slides unofficially. My favorite is slide #8.

Q: Why doesn’t my site show in SafeSearch?, or “Do you hate Metallica?”

We recently received a question about why Metallica’s site doesn’t show up in SafeSearch (e.g. note the results for [metallica] and then compare with adding the parameter “&safe=on”).

The answer is in their robots.txt. The site has

User-agent: *
Disallow: /

This means that we have crawled zero pages from the site. If you look carefully, we show a description of the site from the Open Directory Project, not any snippet from their page. SafeSearch can’t return uncrawled/empty documents (unless they have been whitelisted), because the documents might turn out to be unsafe. Hopefully it makes sense that SafeSearch shouldn’t return a document to a user if we don’t know what that document actually has in it–what if the document had porn on it? So while we could whitelist the site, the correct answer is for their webmaster to allow us to crawl their site.
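If you want to check what a robots.txt like this means for a crawler, Python’s standard library can evaluate the rules for you. This is a minimal sketch: the helper name `can_crawl` and the example.com urls are placeholders I made up, and it shows how a well-behaved crawler reads the file, not Google’s own crawling code.

```python
from urllib.robotparser import RobotFileParser

# The two rules quoted above: every user agent is barred from every path.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# For contrast: an empty Disallow value permits everything.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain.
    print(can_crawl(BLOCK_ALL, "Googlebot", "http://example.com/"))
    print(can_crawl(ALLOW_ALL, "Googlebot", "http://example.com/"))
```

With the blocking rules, no url on the site is fetchable, which is why there is no crawled page to build a snippet from.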

It’s easy to see how this misunderstanding could happen. For example, if you do the search [Nissan Motors] you get back a pretty useful snippet: “Manufactures automobiles including passenger cars, buses, trucks and related parts and accessories. (Nasdaq: NSANY).” It almost looks like we’ve crawled the page–but we haven’t. Nissan also forbids all search engines from crawling their site with a robots.txt, so that snippet also comes from the Open Directory Project.

Several years ago, the Library of Congress had a robots.txt that didn’t allow any search engines (they do now), so it wouldn’t show up in SafeSearch. So we changed it so a whitelist can trump an uncrawled document in SafeSearch. We studied the .gov domain and didn’t find any pornographic content (the closest we found was the Kenneth Starr report).

P.S. Metallica isn’t in my regular playlist, but I did watch Some Kind of Monster recently. It’s a much more nuanced view of the band than the Napster Bad Flash parody.

SEO article in Newsweek

If you haven’t seen it, Newsweek discusses search engine optimization (SEO). I talked to Brad Stone for this article, and it was clear that he had done his homework to delve into the world of SEO. I was worried that the article might sensationalize SEO, but in my mind it was pretty even-handed. Here are a few comments on the article from my perspective:

  • I knew Brad was going to write about Rand Fishkin because he asked me a few questions about him, but I didn’t realize that he was also planning to write about a blackhat SEO (or if Brad mentioned it, I missed that).
  • Ironically, Brad decided to profile Earl Grey, one of the co-creators of a forum where blackhat SEOs sometimes chat. Why is this ironic? Well, I was doing some training on Friday, and one of the things I talked about was how to trace from one spam domain to find more spam domains. Guess what one of the examples was: Earl Grey’s sites! Small world, huh? A page on one of Earl’s sites says that he’s based in East Buffalo in New York, but we saw how that wasn’t true; it looks like he lives in Yorkshire in the UK.
  • By the way, if Earl didn’t want his identity known, he may not be happy with the Newsweek article. It gives a specific search plus the positions of his site on Yahoo and MSN, so it’s not too hard to discover the site.
  • The article mentions that the blackhat site doesn’t rank on Google and implies that it may be because Google can take longer to rank sites. I’m happy to say that’s not the reason; this domain was already caught for spamming (both algorithmically and manually) before the Newsweek article came out. I don’t know what else to say about this other than “woohoo!” back to the team at Google that works on quality.
  • The article also implies that the Avatar site is ranking higher because Rand Fishkin bought some backlinks. We’ve already covered this territory. Rand, those paid links from the Harvard Crimson and elsewhere aren’t helping the site. In fact, it looks like you bought links from the same network that the other two sites at the site clinic were buying from. 🙂 And I doubt Rand was expecting any direct PageRank impact from Avatar’s press release. But what is helping is good content like the articles about non-conforming loans and the new blog on that site. That’s why when I see strong links from Yahoo’s directory, Dmoz, and Wikipedia to Avatar, I’m not very surprised.

My takeaway: the blackhat’s site wasn’t ranking in Google (we’d caught it before the article appeared), and Rand has been building up the content on his client’s site. From a cursory look, that’s what is making the real difference for that site’s better ranking. One thing I’m really happy about is that the article didn’t portray SEOs and search engines as automatically being in opposition:

[Search engines] deplore the so-called black-hat SEOs who use unsavory techniques, like spamming the Web with dummy pages full of links, in an effort to make their sites appear popular. But they are increasingly tolerant of ethical or “white hat” SEOs like Fishkin, who primarily help their clients knock down the virtual walls that prevent search engines from fully indexing their site. … It’s good for Google and SEOs: better-organized sites increase the amount of content in Google’s index, while improving SEO rankings.

I couldn’t agree more. Google does not consider SEO to be spam. To Google, SEO only becomes spam when it goes against our quality guidelines and moves into things like hidden text, hidden links, cloaking, or sneaky redirects.

Reading through the piece, there’s a bit of an undercurrent of “SEOs must do some deep magic; maybe I should hire one?” Truthfully, much of the best SEO is common-sense: making sure that a site’s architecture is crawlable, coming up with useful content or services that have the words that people search for, and looking for smart marketing angles so that people find out about your site (without trying to take shortcuts). Google will keep working to make SEO easier and spamming harder. In my ideal world, a site owner wouldn’t need to think about SEO at all: Google would always find your content with no help. However, things as simple as a site map can improve how well search engines can crawl (and rank) your site.
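A site map can be as simple as a list of your urls. As a rough sketch, assuming you want an XML file in the sitemaps.org format (an HTML page of links is another way to read “site map”), something like this would generate one. The function name `build_sitemap` and the example.com urls are placeholders for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per url."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder urls; swap in the pages of your own site.
    print(build_sitemap(["http://example.com/", "http://example.com/about"]))
```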

In the coming days, I’m going to give some tips for great ways to get links that will help in Google. I’m sure I’ll pick on a spammer or two as well. 🙂