AIRWeb 2007 papers available

AIRWeb 2007 has announced its accepted papers; Bill Slawski has posted the full list, with links to each paper, over at Search Engine Land. I was on the program committee and helped review papers, but I’m not sure whether I’ll be able to make it to the WWW conference or the AIRWeb workshop. If you’re interested in webspam, these papers are fun to read.

20 Comments »

  1. Matt Cutts Said,

    April 15, 2007 @ 8:50 pm

    By the way, if you run a periodic event like a conference, here’s an SEO tip: make one site for your event, and then add subdirectories or subdomains for each event. For example, which of these is a better idea?
    http://www.siggraph.org/s2003/
    or
    http://www.sigir2003.org/

    At first glance, you might think the latter is easier to remember. But someone who only visits your site once or twice a year probably wouldn’t remember the domain anyway. The subdirectory approach is easier to maintain because your pages are all on one site, and you only have to renew that domain (as opposed to renewing domains like sigir2003.org, sigir2004.org, etc.). Finally, if you run a popular event, you run the risk of people trying to get ahead of you and registering the logical name of one of your future events (CuttsCon2008 or whatever). Better instead to put your events under one domain.

  2. Search EnginesWeb Said,

    April 15, 2007 @ 11:03 pm

    It still boils down to a band-aid, “Us against Them” strategy…

    This high-tech ping-pong game will go on for years, with adversaries constantly ‘one-upping’ each other…

    The real answer lies in taking a DNA approach, to discover the voids and frustrations that eventually force some ‘Good’ Webmasters to do these things. Not the Bad Webmasters intent on harming, but the masses who resort to these tactics.

    It is only fair to address WHY

  3. TheSEOGuru Said,

    April 15, 2007 @ 11:15 pm

    “Using Spam Farm to Boost PageRank” by Ye Du, Yaoyun Shi and Xin Zhao sounds good. Good time to compare all this with previous papers.

    Thanks
    Matt

  4. JohnMu Said,

    April 16, 2007 @ 12:17 am

    Any guesses on the webspam detection contest? I would have loved a shot at that, but I don’t really have the horsepower to do any useful analysis.

  5. Tim Wintle Said,

    April 16, 2007 @ 4:57 am

    Thank you, very interesting, especially “Combating Spam in Tagging Systems” - good to see that people are recognising we can’t keep treating them like standard semantics. Haven’t read “Web Spam Detection via Commercial Intent Analysis” yet, but it sounds like a very good idea, and something you can probably implement relatively easily over at Google.

    Out of interest, what journals do papers on webspam normally get submitted to? (not my official field, so I haven’t come across many of them)

  6. Deb Said,

    April 16, 2007 @ 5:05 am

    But Matt, if you register a domain like sigir.org and then create a folder, so the URL is sigir.org/2003/, what is the problem?

  7. Harith Said,

    April 16, 2007 @ 6:24 am

    Matt

    Talking about linkspam.

    Here is a suggestion for your consideration.

    - Disable PR on Google toolbar.

    - Sites that want the Google toolbar to display their PR value have to subscribe and agree to add rel=nofollow to all paid links, in addition to providing a human-readable disclosure that a link/review/article is paid.

    Thoughts?

  8. Andi Said,

    April 16, 2007 @ 8:40 am

    Web spam… I don’t know if you remember me Matt but almost two years ago when this blog was new you saved my website and many thousands of hours of my work by informing me that my site had been reincluded–I was about to abandon it because it had been banned as spam.

    Well, it has happened again: my Google traffic abruptly stopped on Saturday, though my site is still in the index. I do follow your blog, but I don’t have time to keep up on the details of what counts as spam; I do my site and it gets continually better. Well, it would if it got any traffic. I will abandon it if you think it’s spam, and your searchers will be denied my work.

    Over the past two years I’ve done very well as an AdSense publisher and Google stockholder but I guess I’ll have to find a new occupation. I would appreciate your giving my site just one more look before letting it die. Thanks

  9. Tim Wintle Said,

    April 16, 2007 @ 9:11 am

    Andi, I’m not connected with Google, but my advice would be that it is still in the index, so Google thinks it is worth people visiting it; i.e., the problem is with how well it is ranking, so it just needs some good old SEO. I would say:

    Cut down on the number of links on a page, and add more (unique) text about each site being linked to. There is a lot of talk about Google trying to keep search results and directories from coming up, as they want the search to take people direct to a shop. If you are really adding some value to the user by giving detailed, informative reviews of sites, then Google’s algorithms will be more likely to return your pages.

    Have you used Google Webmaster Console?
    http://www.google.com/webmasters
    If there is a problem with your site it will tell you there. It will also let you see how many times your pages have come up in search results but have not been clicked on, you may find that people are just clicking on your result in the SERPs less than they used to.

    Hope this helps, doubt Matt can really give SEO advice, so I thought I would give my two cents.

  10. Andi Said,

    April 16, 2007 @ 9:28 am

    Thanks for your comments, Tim. What you are suggesting would require a huge effort without any assurance of a return.

    I’m not looking for advice, simply stating my position. If Google thinks my work is spam I’m outta here. If their policy is a guessing game I can find better things to do with my time, I’ve travelled that SEO road once before and it’s just not worth competing with the spammers, it’s not what I do. Many people have used my site and found it worthwhile but I can serve them better offline if Google thinks it’s spam.

    It would be a shame to flush all that work down the drain but it’s their choice not mine.

  11. Andi Said,

    April 16, 2007 @ 9:57 am

    I should add that I’m not too arrogant to make changes; I am simply fed up with playing silly games and being kept in the dark. The site paid well for a time but if it’s over, it’s over. I am just asking that Matt give it one last look before pulling the plug.

  12. Harith Said,

    April 16, 2007 @ 2:29 pm

    Andi

    “The site paid well for a time but if it’s over, it’s over.”

    That site has served you well and generated a lot of $$$ for you. Time now to let it rest in peace. God rest its sole.

  13. Andi Said,

    April 16, 2007 @ 3:47 pm

    >>> God rest its sole. (shoe directory too…)

    It is now the best edited, most comprehensive, most up-to-date women’s apparel directory on the web, I should know I’ve spent years building it and surveying the competition.

    Women’s apparel is a huge industry, larger than search, and not dominated by one name.

    I don’t need Google to monetize my talents or this database though its creation was underwritten with AdSense dollars and it has been a sweet ride with Google, I’m sad to see that end.

    So the loss belongs to the people who search for clothing on Google. When GOOG hits 505, I’ll cash out and travel in Europe for a while, maybe with my database. I’ll be fine, though I do feel sorry for Google’s clothing searchers.

  14. Harith Said,

    April 16, 2007 @ 11:57 pm

    Andi

    “I’ll cash out and travel in Europe for a while, maybe with my database.”

    Good Idea. We have very nice weather at present.

    Honestly, that site of yours has spent its life carrying so many links on its shoulders. It couldn’t take it any more. Passed away suffering of Linksfobia :)

  15. Andi Said,

    April 17, 2007 @ 12:05 am

    Harith your sick obsession with death is disgusting, you haven’t a clue.

  16. Harith Said,

    April 17, 2007 @ 12:56 am

    Andi

    Ok. No more Linkfobia talk. Here are some serious things which you might wish to consider.

    Since the introduction of Google’s new infrastructure, BigDaddy, we have been introduced to successive data pushes / data refreshes which can affect some sites’ rankings in a remarkable way.

    Some webmasters have reported that their sites keep moving back and forth on GOOG’s SERPs.
    In addition, we really don’t know for sure whether the friends at the plex have introduced new algos or filters which cause the “instability” of ranking/indexing of some sites.

    Maybe your site will bounce again within a week or so. Good news, right :)

  17. Andi Said,

    April 17, 2007 @ 5:44 am

    Harith I have already written off that web site and am reconfiguring the database for a post-Google business. If traffic returns that will be a gift but I am moving on. I am one being driven from the web entirely by Google’s “instability.” Maybe next decade. Thank you for your interest.

  18. Joel Lesser Said,

    April 17, 2007 @ 10:51 am

    This one is full of misinformation:
    http://www2007.org/workshops/paper_116.pdf

    “One trick that some webmasters play is that they announce they will only accept link exchanges from sites that are topically related, and not entertain link exchange requests from topically unrelated sites. Such sites are sill [they did not spell check their article evidently] participating in link exchanges, and in some cases, such web sites end up dominating the top search ranks for queries related to the topic”.

    The article implies that sites that link exchange with relevance for the end user are “spamming”. That’s outrageous.

    Nowhere in the article is there a mention of “editorial discretion”, “editor”, or any words synonymous with editorial discretion on making links. The article implies that anyone exchanging links for the end user is “spamming”.

    Let’s hope that Google engineers are smarter than this. The article in itself sends the wrong message to webmasters who are making linking decisions for the end users.

    I’ll wager that the kids who wrote this article have never marketed a new site on the web in their lives.

  19. John Nagle Said,

    April 17, 2007 @ 1:13 pm

    The AIRWeb papers for 2007 are big on link analysis and related graph theory. I’m not sure that link analysis means much for moderate-traffic business sites any more. For too many sites, most inbound links were created for promotional purposes. It’s possible to detect those. But where would one find “legitimate” inbound links to, say, a local plumber? Inbound links will probably be from directories, blogs, or ads, all of which are spammable. Links thus indicate something significant when they link to useful information, but not when they link to businesses.

    The paper on “Web Spam Detection via Commercial Intent Analysis” is straightforward. Detecting “commercial intent” works just like Bayesian filtering of e-mail spam; you train on some commercial content and some non-commercial content, and build a classifier. As the paper points out, Yahoo has had that up for years. It’s easier than classifying e-mail, since sites that sell something have to look like they’re selling something or nobody will buy.

  20. Andi Said,

    May 2, 2007 @ 4:11 am

    A follow-up on my slightly off-topic comments above. I have exited my recent “Google hell,” for good I hope.

    If by automation, congratulations on a superior though flawed algorithm; if by human intervention, thank you for a sagacity surpassing superior bot-love.
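
An aside on comment 19: the idea of training on some commercial and some non-commercial content and building a Bayesian classifier might look roughly like the sketch below. It is a minimal multinomial naive Bayes with add-one smoothing in plain Python. The class name, the helper names, and the four toy training sentences are all made up for illustration; none of it comes from the paper or from any search engine's actual system.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Tiny multinomial naive Bayes text classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()           # every token seen during training

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def log_score(self, text, label):
        counts = self.word_counts[label]
        total = sum(counts.values())
        # Log prior: share of training documents carrying this label.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        # Log likelihood of each token, with add-one (Laplace) smoothing.
        for tok in tokenize(text):
            score += math.log((counts[tok] + 1) / (total + len(self.vocab)))
        return score

    def classify(self, text):
        return max(self.doc_counts, key=lambda label: self.log_score(text, label))


def build_toy_classifier():
    """Train on four hypothetical snippets; real training sets are far larger."""
    clf = NaiveBayes()
    clf.train("buy cheap shoes online free shipping order now", "commercial")
    clf.train("discount prices add to cart checkout sale", "commercial")
    clf.train("the history of the roman empire and its emperors", "noncommercial")
    clf.train("lecture notes on graph theory and link analysis", "noncommercial")
    return clf


clf = build_toy_classifier()
print(clf.classify("order shoes now with free shipping"))
print(clf.classify("notes on the history of graph theory"))
```

With so little training data the output only shows the mechanics. A usable classifier would need a large labeled sample of pages and probably features beyond bare word counts.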
