Call for Papers: AIRWeb 2006

I’m on the program committee again for AIRWeb, which is a workshop on Adversarial Information Retrieval that will be held in conjunction with SIGIR (the conference for the Special Interest Group on Information Retrieval) up in Seattle. If you’re a spammer in Budaors (located in Hungary) and you’ve thought of the smartest way in the world to spam, this is your chance to document it. Come on, Leos: I know you want to brag. 😉 Or of course if you’re someone who has an idea for countering spam. 🙂 In case you’re thinking of submitting a paper, I’ll include the call for papers below.

Second Call for Papers (with revised deadlines)

AIRWeb 2006

Second International Workshop on
Adversarial Information Retrieval on the Web

Part of the 29th Annual International ACM SIGIR Conference on Research and
Development on Information Retrieval
10 August 2006 – Seattle, WA


The attraction of hundreds of millions of web searches per day provides
significant incentive for many content providers to do whatever is
necessary to rank highly in search engine results, while search engine
providers want to provide the most accurate results. The conflicting goals of
search and content providers is adversarial, and the use of techniques that
push rankings higher than they belong is often called search engine spam. Such
methods typically include textual as well as link-based techniques, or their
combination.

This, the second AIRWeb workshop, builds on last year’s successful meeting in
Chiba, Japan as part of WWW2005. This year we solicit submissions on any
aspect of adversarial information retrieval on the Web. Particular areas of
interest include, but are not limited to:

– search engine spam and optimization,
– crawling the web without detection,
– link-bombing (a.k.a. Google-bombing),
– comment spam, referrer spam,
– blog spam (splogs),
– malicious tagging,
– reverse engineering of ranking algorithms,
– advertisement blocking, and
– web content filtering.

Papers addressing higher-level concerns (e.g., whether ‘open’ algorithms can
succeed in an adversarial environment, whether permanent solutions are
possible, etc.) are also welcome.

Full papers are limited to 8 pages in SIGIR format; works-in-progress will be
permitted 4. At least three anonymous reviews will be provided per paper,
judged on the usual basis of relevance, originality, quality, and presentation.
Proceedings of the workshop will be placed online, and distributed at the
workshop. A selection of best papers will be invited to submit expanded
versions to an appropriate journal.


Important dates

5 May 2006: E-mail intention to submit (optional, but helpful)
12 May 2006: Deadline for submissions
12 June 2006: Notification of acceptance
30 June 2006: Camera-ready copy due
10 August 2006: Date of workshop


Organizers

Tim Converse, Yahoo! Search
Brian D. Davison, Lehigh University
Marc Najork, Microsoft Research


Program Committee

Sibel Adali, Rensselaer Polytechnic Institute, USA
Lada Adamic, University of Michigan, USA
Einat Amitay, IBM Research Haifa, Israel
Andrei Broder, Yahoo! Research, USA
Carlos Castillo, Universita di Roma “La Sapienza”, Italy
Abdur Chowdhury, AOL Search, USA
Nick Craswell, Microsoft Research Cambridge, UK
Matt Cutts, Google, USA
Dennis Fetterly, Microsoft Research, USA
Zoltan Gyongyi, Stanford University, USA
Matthew Hurst, BuzzMetrics, USA
Mark Manasse, Microsoft Research, USA
Jan Pedersen, Yahoo!, USA
Bernhard Seefeld, Switzerland
Erik Selberg, Microsoft Search, USA
Andrew Tomkins, Yahoo! Research, USA
Tao Yang, Ask Jeeves/Univ. of California-Santa Barbara, USA


Yes, this workshop and SIGIR will be happening at the exact same time as Search Engine Strategies San Jose, almost directly after SIGGRAPH 2006 in Boston, and soon after the Conference on Email and Anti-Spam. Sigh. Too many fun conferences all at the same time.

Update: Changed to use the revised call for papers, which has different dates.

29 Responses to Call for Papers: AIRWeb 2006

  1. Good morning Matt

    I see two points that deserve special attention:

    “The conflicting
    goals of search and content providers is adversarial, and the use of
    techniques that push rankings higher than they belong is often called search engine spam.”

    How are you gonna define factors to determine that a document is ranked higher than it belongs?

    A publisher will always think that his document is of such quality that it deserves to rank high 🙂

    Then we have this one:

    “Particular areas of interest include, but are not limited to:

    – search engine spam and optimization,”

    I don’t think that it’s fair to mix spam and SEO together in such a manner. That shows a lack of understanding and very narrow thinking, IMO, with all due respect.

    I might follow up with more comments 🙂

  2. So many conferences, so little time 😉

  3. Jenstar, wouldn’t it be neat to lay out a full year of fun conferences and events:
    SXSW, Burning Man, Cannes, SIGGRAPH, SIGIR, SIGCHI, SES, WMW, Daffodil season, and then just go off and do that for a year? 🙂

    Harith, I didn’t draft the CFP. 🙂

  4. Jeremy Wong 黃泓量


    You got home so early today. Congratulations!

    Apart from web-spam issues, I am interested in ways to improve the quality of a web site. I would be grateful if you could post some links on that topic in your weblog later.

  5. Hi Matt

    I know 🙂

    However, I see that the biggest challenge in future will be establishing full cooperation and understanding between search engines and whitehat SEOs. Keep in mind that SEOs themselves are “usually” under pressure from clients to achieve high rankings and high PR values regardless of the methods applied to achieve that.

    Educating clients and making them understand the importance of ethical SEO conduct is another challenge.

    So in fact SEO specialists are squeezed like a “Happy Meal” burger: search engines on one side and clients on the other 🙁

    Have a great day.

  6. I really like the idea of this conference. One thing that will probably be talked about is scraper sites and strategies for weeding them out of the SERPs. I would have to say that they are an undesirable consequence of contextual advertising and may get worse as Yahoo and MSN increase the reach of their own brands of ‘Adsense’.

    Meanwhile honest webmasters must believe that all their efforts and patience will pay off because the major search engines are on our side, and will eventually prevail over those out to make a fast buck.

  7. Perhaps this should also cover the side-effects on third parties of black-hat spamming tactics (e.g. blog comment spam, scraper sites, etc), and ways to minimize that? (the nofollow link attribute probably helped to a small extent, perhaps a call for similar proposals?)

  8. Hmm, from the title of the session I would have said that it’s more about using SEs to hack, or in the probe stages of a run.

    check out

    I’ll have to have a look at that conference in more detail.

  9. >> incentive for many content providers to do whatever is

    Content PROVIDERS? Oh, boy, some serious education needed here, hmm? Or are they using a deliberately broad definition of “content provider”?

    Interesting that there’s no explicit mention of cross site techniques (unless that’s considered to come under one of the other headings), or other “technical” methods (hijacking etc)

  10. Jammin’ and spammin’ – wow that’s got to hurt! 😉

    If I change my “permalinks” in WordPress (currently end in .html) to end in “/” will Google recrawl and have no issue with this change?

    Matt – If you could answer or direct me to someone who can answer it would be great thanks.

  11. Sounds like a lot of quality hotel time and the ping that comes from a hotel connection. It will be no excuse, we expect blog updates! 🙂

  12. Hi Matt,

    Interesting post.

    Especially notable is the mention to Google bombing since Google never officially acknowledged the existence of this spam technique that has hurt many.

    All in all it seems that the SE’s are a little out of tune with what is going on out there.

    If you for example open a shop, do you wait until your few customers start referring you to other people or do you start advertising to get known even if your product is not superior to your competitors?
    Are any of the Unilever products better than the ones from Procter & Gamble or is it all down to advertising?

    Please, let’s not forget: this is not strictly academic information retrieval. The online and offline business worlds do not sit so far apart.

  13. Matt,

    One of the topics is right up my alley except in reverse:

    – crawling the web without detection

    I stop the crawlers that think they’re undetected; I stop a LOT of them, as a matter of fact. Not sure I want to share how it’s done so they can make it harder to spot them; that wouldn’t be any fun.

  14. Matt

    I applaud Google’s attempt to combat spam, but many seasoned spectators puzzle over why glaringly obvious spam is not detected. Here’s an example: have a big ol’ block have keywords stuffed in the content and hidden with inline CSS.

    Not only that, but those boys are spamming AdWords. They’re bidding on lots of competing trade names (‘seth lovis’, ‘fentons’, ‘irwin mitchell’, etc.) displaying a destination URL of but directing to Against TOS?

    Hang ’em high! ;o)

  15. Mike wrote: “Especially notable is the mention to Google bombing since Google never officially acknowledged the existence of this spam technique that has hurt many.”

    Marissa Mayer, Director of Consumer Web Products, officially acknowledged the Googlebombing technique on the Google Blog back in September 2005.

  16. Say hello to Prof. Sibel Adali for me. I took one of her database courses at RPI.

    Remember it’s RPI… not RIT!

  17. thanks sukhchander

    serves me right not having done my homework on this one. I think that I had Google bowling in mind though

  18. Being a former full voting member of ACM and a member of a number of its special interest groups I’d go to SIGIR or SIGGRAPH and skip the rest.

    Has the IEEE Computer Society got anything coming up?

  19. seems like more people every day are no longer spamming the index – it’s easier and a lot more profitable to just do click fraud

    ps you can remove the san diego omni’s ip address from the blacklist at least until next year 🙂

  20. Does this mean you won’t be at SES SJ? That’d be a shame… not having you there. What would all the Cuttlets do?

  21. I wish I could be there.
    Would be rather nice to see what you guys were planning.

  22. *Yawn*


  23. Dear Matt,

    “If you’re a spammer in Budaors (located in Hungary)…”

    You may have come across the above address in an actual case, but as a citizen of the mentioned country I can tell you that it is not representative. In any case, we are rather tired of that American attitude of confusing Bucharest with Budapest or Bulgaria with Hungary. You had better concentrate on the USA, Africa or Ukraine.

    Now, as much as I do like Google and do appreciate its services (I even like your context-sensitive advertisement method) I still think that the source of most problems regarding spam is just Google itself. This, you may just know well.

    I realize that you did combat spam in the past and do combat it right now but I can assure you – unless you give up the present method of pageranking which is just food for a rather ambiguous profession: the so called SEO – that it will be a neverending story.

    It may be entertaining for a while to find clever and ingenious ways to defeat spam but after a while there remains only helplessness and fury. Just see the history of the past few years.

    I am not a professional, not an expert. I examined the past history of spam fighting and the spammers themselves as they are openly available everywhere. I am thoroughly impressed with both parties.

    I would suggest you examine the following:

    – whose interest is it ?
    – how can they do it ?
    – become a spammer yourself so you can understand it.

    If they cannot make money from it then there will be no more spam.

    And also, it is an intentional misinterpretation of all kinds of rights (human, or constitutional) that people think they have the right to shove everything they want down our throats.

    It is called: parasitism.

  24. May I be so impolite as to ask what is the exact meaning of the phrase “Adversarial Information Retrieval” ? It seems to be a key phrase, but yet I can’t seem to find any definition ?!

    I have read the Wikipedia article on “Adversarial process” but that did not make me any wiser, although IR might be thought of as a process…

  25. Matt, you’ve been work-spammed. In essence, you just googled “what to do on vacation” and got results that SHOULDN’T be relevant to the user 🙂

  26. heh, because of that comment, Matt is now #1 for “what to do on vacation”

  27. Did they know about THIS Technique?

  28. Matt, I see all these posts from you about Congoo. Don’t you work for Google? Not that Congoo is going to destroy Google, but aren’t you supposed not to promote competing engines? I only ask ’cause I see they use Yahoo links on their results pages….

    Just curious

  29. Matt,
    Let the people decide – just add a like button – and include it in the algorithm