A word about metrics, part I

I’ve been reading the brouhaha about Hitwise’s press release about MySpace and Yahoo!, and I wanted to talk about metrics a bit.

Let me tell an Old Timey Story. When I joined in 2000, Google was a scrappy underdog search engine. Back then, Altavista was vastly more popular and reported 50 million searches a day. Google was popular among savvy webmasters and at many universities, and usage was growing quickly by word-of-mouth, but the smart folks at Google were eager for the company to be better known. The metrics services of the day vastly underrated the number of searches done on Google every day. Month after month, every report seemed to show that Google had a tiny share of the market.

At some point, one of the metrics services (which shall remain nameless) came to Google so that we could try to reconcile our data with their claims. I wasn’t in the meeting, so afterwards I caught an engineer and asked what happened; why did our numbers differ by so much? “They solicit people to install an application for them” was the answer. “But that’s a horrible methodology!” I said. “That would get you a ton more novice users; expert users wouldn’t see the value and probably wouldn’t install the application as much.” The other engineer agreed.

That was an eye-opener for me. At the time, Google was much more popular with highly-technical users, who were less likely to show up in that metric. So while Google gained market share, that particular methodology always lagged in showing Google’s growth. In a way, it was a blessing in disguise: if competitors took the metrics at face value, they would underestimate Google and how fast it was growing. Ever since, I’ve taken every metric with a grain of salt–you have to think about underlying assumptions and limitations in the data.
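
To put rough numbers on that, here’s a toy calculation in Python. Every figure in it is invented for illustration (the expert/novice split, each group’s share of searches on the engine, the install rates); none of it is real data from Google or from any metrics service. It’s only a sketch of how an opt-in panel that over-recruits novices would understate an engine that experts prefer.

```python
# Hypothetical population: all figures below are made up for illustration.
GROUPS = {
    # name: (share of all users,
    #        share of that group's searches done on engine G,
    #        probability that a user in the group installs the panel app)
    "expert": (0.20, 0.60, 0.01),
    "novice": (0.80, 0.05, 0.05),
}


def true_share(groups):
    """Engine G's share of searches across the whole population.

    Assumes every user does the same number of searches.
    """
    return sum(pop * g_share for pop, g_share, _ in groups.values())


def panel_share(groups):
    """Engine G's share as seen through the opt-in panel.

    Each group counts in proportion to how many of its members
    actually installed the app, not to its real size.
    """
    weights = {name: pop * install for name, (pop, _, install) in groups.items()}
    total = sum(weights.values())
    return sum(weights[name] / total * groups[name][1] for name in groups)


if __name__ == "__main__":
    print(f"true share:  {true_share(GROUPS):.1%}")
    print(f"panel share: {panel_share(GROUPS):.1%}")
```

With these made-up inputs the engine’s true share is 16%, but the panel reports about 7.6%. Change the install rates so both groups opt in equally and the two numbers match.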

Let’s do a simple exercise to see if you’ve been paying attention. Suppose someone calls you up on the phone to ask you to record what you’ve been watching on TV. “How did you choose me?” you ask. The caller says, “Oh, we go by the last four digits of your phone number.” Now, what limitations will there be in the data? People without phones will be left out in the cold. People who have two phone lines are more likely to get a call. And someone who ditched their landline for a cell phone might not get a call. That will absolutely skew the selection of people unless the group doing the survey makes special efforts.
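
The phone example can be sketched as a small Python calculation. The household mix and viewing rates below are made up, and the sketch assumes (hypothetically) that viewing habits differ by how many phone lines a household has. If they don’t differ, all three numbers come out the same and the skew vanishes.

```python
# Hypothetical households: every figure here is invented for illustration.
# (number of phone lines, share of all households, fraction watching show X)
HOUSEHOLDS = [
    (0, 0.10, 0.50),
    (1, 0.70, 0.30),
    (2, 0.20, 0.10),
]


def population_rate(households):
    """Fraction of all households watching show X."""
    return sum(share * rate for _, share, rate in households)


def naive_survey_rate(households):
    """Expected survey result when phone numbers are dialed at random.

    A household's chance of being called is proportional to its
    number of lines, so zero-line households never appear.
    """
    reach = [(lines * share, rate) for lines, share, rate in households]
    total = sum(weight for weight, _ in reach)
    return sum(weight * rate for weight, rate in reach) / total


def reweighted_survey_rate(households):
    """Same calls, with each response weighted by 1/lines.

    This undoes the over-counting of multi-line households, but
    households without a phone are still missing entirely.
    """
    reach = [
        ((lines * share) * (1 / lines), rate)
        for lines, share, rate in households
        if lines > 0
    ]
    total = sum(weight for weight, _ in reach)
    return sum(weight * rate for weight, rate in reach) / total


if __name__ == "__main__":
    print(f"all households:    {population_rate(HOUSEHOLDS):.1%}")
    print(f"raw phone survey:  {naive_survey_rate(HOUSEHOLDS):.1%}")
    print(f"reweighted survey: {reweighted_survey_rate(HOUSEHOLDS):.1%}")
```

With these invented inputs, 28% of all households watch the show, the raw survey says about 22.7%, and reweighting gets to about 25.6%. Reweighting helps with the two-line households, but nothing brings back the households that can’t be called at all.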

Ah, writing down what TV you watch isn’t accurate anyway, you say. Let’s buy metrics data from TiVo–they can pinpoint exactly what their users watch! Well, where’s the flaw in that? Does everyone have a TiVo? No way! TiVo viewers skew toward the hip and smart (and moneyed). Plus some providers (Cox? Comcast? DirecTV? Dish?) may not use TiVo as much because they offer their own DVR. So TiVo’s data is biased too.

Now that you’re appropriately jaded and cynical, let’s look at something out there right now. Here’s a recent post that appeared on Podcasting News:

Nielsen: Podcasts More Popular than Blogging
July 12, 2006

Nielsen//NetRatings announced today that 6.6 percent of the U.S. adult online population, or 9.2 million Web users, have recently downloaded an audio podcast. 4.0 percent, or 5.6 million Web users, have recently downloaded a video podcast.

These figures put the podcasting population on a par with those who publish blogs, 4.8 percent …

Okay, if you think more people podcast than blog, raise your hand. Anyone? No? The thing to notice is that Podcasting News contrasted downloaders of podcasts with producers of blogs. The headline might have been technically correct; it would probably not be correct if the headline were “Podcasting More Popular than Blogging” (notice how I turned “podcasts” into a verb?). Yet that article was at the top of Techmeme, and your average reader could easily miss the distinction.

The story has a happy ending. I went back this morning to check if it was still on Techmeme, and Scoble and another podcasting site are calling people on it. In the instance above, the Nielsen numbers may have been completely accurate, but you still have to analyze how someone takes those numbers and think critically about what claims they make (or imply).

This is long enough and I haven’t even *begun* to talk about Alexa or Hitwise, so let’s split it up. Today is Meeting Galore Day, but there will be at least a part II.

83 Responses to A word about metrics, part I

  1. The old discussion about metrics or statistics. When I worked for a telecom startup a couple of years ago we needed to have 99.95% (or something like that) availability on our services. The first month we had 2 customers of which 1 had problems and didn’t get service the first month. So our average availability was of course 50% for that month.

    The next month the average was calculated again, new customers were included and our average availability was going up. After a year we were at 95%, or so we thought.

    Then I moved to the department that was maintaining these statistics and found out that the management was furious about the fact that availability was still not at the promised rate.

    Why weren’t we reaching the promised availability?

    Because we were still including the first month in the availability statistics! Which was ridiculous of course. You need to calculate availability on a weekly, monthly and yearly basis. Once I changed the calculations the availability turned out to be just fine. (agreed, some weeks it was slightly below the minimum value, but now we were able to explain that and the next week it was way above the minimum keeping the monthly values ok).

    It’s all about the preset assumptions. But as you can read here there is a 30% chance that statistics never lie. 🙂

  2. There is actually a 100% chance that I will read part II of this blog post. Actually, if you podcast it, the chance that I will download the podcast remains the same.

  3. There is a 50% chance I may understand my own analytics, a 50% chance that you’ll understand what I tell you about my analytics, and a 50% chance that they were right in the first place. That leaves a minimal chance (1/2 * 1/2 * 1/2 = 1/8, or 12.5%) that I can correctly convey the meaning of analytics to you.

    I don’t know if those 50% numbers are accurate (yet another variable… LOL), but they’re probably not all that far off when dealing with web analytics.

  4. Matt,

    Your argument is built on the assumption that getting the data requires having to install a toolbar or crap of some kind.

    Some companies may have access to logs of different ISPs.

    Re: every method of measurement will have a bias. True. But the statement doesn’t show the hitwise study to be false.

    I would still consider hitwise’s study a very plausible and considerable kind, if not outright true.

  5. “Opt in panels” for traffic measurement certainly have the adverse selection problem that Matt described — that savvy users are unlikely to want to install the software.

    But it’s worse than that. It’s not obvious that the “opt in” software is actually opt-in at all. Recall typical spyware installation methods: Misleading bundles, licenses not actually shown to users, nested licenses (where one license references another), etc. Then there are security exploits. I have video and packet log proof of monitoring programs installing through security exploits, without user consent. This is probably overdue for a write-up on my site…

  6. I don’t believe there will ever be a way to make metrics completely accurate.

    Having written a commercial website metric application myself, I can say there’s always going to be some uncertainty. There’s just so many things a user can do that you can’t plan for.

    I remember the meetings vividly: “I used my firefox extension to open the page in an IE tab, then disabled javascript, cleared my cookies and restarted my cable modem before clicking submit, but it showed up as 3 visits and didn’t associate my search keywords with my purchase… that’s a problem, fix it”

    or

    “they visited from 24.65.xxx.xxx and that resolves to company Y.. who if you search google has the homepage of http://www.zzzz.com that lists a contact address of qqq@zzzz.com … can you show this in the stats for me so I can email the visitor?”

    anyway, enough complaining about that.. Metrics will never be 100% accurate save for the release of a “patriot browser” that records every action a user makes on their end… and even then they still won’t be accurate.

    I think the topic of metrics is a good one though, as very few business owners or webmasters actually understand them. If I had a webpage for every person I’ve met who thinks hits = visitors then I’d have a bigger index than google.

    as for MySpace, a LOT of their pageviews come from their crappy site design. What is done with 1 pageview (and a lot of ajax) on some sites, is done with 4 or 5 page reloads on MySpace, so pageviews in this case are useless.

    It takes me 6 pageviews to send a “bulletin” on myspace (homepage, login, bulletin page, compose page, confirmation page, thank you page) yet at the most only 2 pageviews on yahoo or gmail to send an email.

  7. iTWire Article Quote

    “Hitwise collects Internet usage information via a combination of ISP data partnerships and opt-in panels.”

    Using Matt’s way of thinking, what about all of the people who aren’t “opted in” and ISPs that do not have a partnership with Hitwise?

    I’ve never cared much for statistics on the internet because, as Matt has demonstrated for us above, there are a million ways to twist bad statistics to make them look good.

    In order for something to be true, it has to be proven. You might have 100 people surveyed on whether they like peanut butter, and after 99 responses of yes, you could say (and assume) that “Everyone loves peanut butter”. But if that last person doesn’t respond and actually hates peanut butter, the truth isn’t reflected at all. Point is, unless you have “all” of the information, it’s a guessing game at best and you can only hope that the data you’ve collected represents the majority of the rest. “Limitations” as master Matt says. 🙂

  8. http://www.clicktale.com/
    These guys look like they will have some of the most accurate data, although parsing it and actually deriving something useful will be an interesting project.

  9. I think you guys say it best on your own site in this article from Jim Novo:

    http://www.google.com/analytics/cu/ac_monitor_visitor_conversion.html

    “People seem to complain a lot about the quality of web data, and some hard-core stats people have various problems with the way both log-based and tag-based analyzers measure activity. I say, get over it. What matters most in tracking interactive behavior is trends, and even if the data is not 100% accurate in some way, as long as you continue to use data collected in the same way each time, you can still build trend charts.

    People obsess way too much about finding an absolute answer (hard exact numbers), wasting a lot of time and resources, when a relative answer (is it getting better or worse) can be just as insightful, if not more. Trend charts are a great way to look at relative performance stats; that’s what I use. So do the best you can to get clean data to work with, but don’t waste a lot of time and effort looking for needles in the data haystack.”

  10. Excellent points. Statistics don’t lie, but people don’t think enough about what the numbers really mean, and statistics *spin* is the bane of reasonable analysis. However if the data suggests that pageviews at Myspace are greater than those at Yahoo’s non-email family of sites then it suggests you can reasonably call myspace a “higher traffic” website.

  11. The terrible thing is that so many people take what they hear at face value. They will read the “quick version” and think that that is the cold hard facts. Very few actually dig into the “real data” and make decisions for themselves.

    Regarding perfect sampling (or the lack thereof) I will always remember the mantra my college stats professor would always repeat, “Close enough is good enough”.

  12. Cool post about Metrics – Matt. I usually don’t see you getting into Metrics at all – I guess Hitwise must have gotten under your skin – I think they try to make the most with the data they pull to get visibility.

    It seems all the metrics services do that – they can only measure so much – and they try to make a story out of it – a story they often invent to get some publicity.

    I posted about it on http://www.webmetricsguru.com/2006/07/a_word_about_metrics_from_matt.html

  13. Here’s a nice visual example of how far off services like Alexa can be for mundane reasons:

    http://blog.jbyers.com/2006/07/12/linkedin-uncloaks/

  14. Matt,

    Your view on metrics is good; however, the general public does not get the inside picture.

    Take for example yahoo’s index size increased twice?

    flomarc.

  15. Matt, you’re accurate that almost all independent internet metric companies’ data has some sort of undercounting going on, since they are all basing figures on some sort of sample set of usage data. As you mention with the TV watching tracking example, the relative accuracy of their reports will be based upon how representative of overall users their sample set really is.

    Alexa’s a great example, too — the demographic that uses Alexa toolbar is likely very skewed, compared with average internet users.

    No external, sample-based usage estimation is going to be as accurate as one’s own internal usage metrics.

    I’m surprised you don’t mention comScore or I/PRO, though. I/PRO provides independent auditing of a site’s logfiles, so you could quote usage figures for your site if you wished, and they will validate your figures as independent auditors. They grew out of the print advertising business, where they’d use methods to assess how much distribution newspapers and magazines really had, so that advertisers could try to compare circulation figures consistently when assessing whether to buy media.

    Also, comScore/MediaMetrix provide independent “audience” numbers for all major internet sites, so they’d also be worthwhile to mention. Though, with them, there will be the same sorts of criticisms that their audience might be skewed in some way. My belief is that their sample size may be a bit too low, or they may not be sufficiently evenly distributed geographically to accurately compare usage. Though, I think their overall metrics are really pretty good, and may be quantitatively in the ballpark.

  16. Speaking of metrics, it would be nice to see some Google metrics. Number of searches per day, number of keywords used, countries served and so on (and I don’t mean Google Trends, that’s just charts with no scale). Fancy feeding us with some stats Matt? 🙂

  17. This feels a lot like Hitwise trading their credibility for some cheap press headlines. – Jeremy Zawodny

    If people do not start standing up against incorrect data (even if it is from someone in your niche resulting in loss of relationships) we will all be living in a make believe world run by spin doctors.

    Sad!

  18. what’s wrong with a world run by spin doctors? That band rocks! Little miss can’t be wrong…

  19. I guess it’s safe to assume that all those statistics on this page are B.S.
    http://www.google.com/ads/metrics.html

  20. we will all be living in a make believe world run by spin doctors

    In other “news”, the war on terrorism continues.

  21. Yeah, and Alexa data is skewed in other directions.

    Even if we’d find a better measurement than calling up random people asking them to participate, as you mention in your example, that’s still far different from *impact*. Let’s say there’s a search engine which is heavily popular among bloggers. So they use it to search for info, and they sometimes pass on the info they found, filtered by the search engine. Now in a way, that search engine had impact on all those blogs’ readers.

    Or let’s say there’s a search engine which wouldn’t want to have the largest market share for *all* kinds of searches, but which just would want to have a market share for searches that are important to the world and have an impact. So what they’d be focused on is to make sure searchers come to them when they want to search for e.g. “human rights”, “democracy” or “bird flu”, because that’s what affects people’s lives more than, say, searching for online games.

  22. great post!
    People just don’t understand how to ‘read’ metrics most often.
    They completely forget all the biases, all the surrounding factors and just look at absolute figures..
    I’m looking forward to what you have to say about hitwise and Alexa 😉

    and besides that.. Is there a (perfect/good) way to measure anything? aren’t all data biased?

  23. There’s been so much speculation about whether you folks use surfing info taken from the Google Toolbar to evaluate the performance of websites (number of page views, “stickiness”, etc.). Obviously Alexa data is inaccurate. I wonder if the Google Toolbar represents a better demographic segment though . . .

  24. This calls for a new metric.

    Let’s name it…BrouhahaRank, where the number of blogs that refer to a topic determine just how blown out of proportion it is. Naturally, any blog that is referred to by other blogs with respect to the same topic is deemed to be more Brouhish and therefore should have a higher BrouhahaRank.

    The more BrouhahaRank links a blog gets, the better its BrouhahaRank becomes, and the more important a Brouhaha Link from that blog becomes in determining who else is blowing things out of proportion.

    No need to thank me, but all links to my blog will be appreciated (I don’t actually care about the Hitwise report, so I didn’t blog about it).

  25. Maybe google should make their toolbar data public? 😉

    Somebody mentions above about myspace having poor design and requiring lots of pageviews to perform an action. This is the genius of its design and the reason most web2.0 sites will never win in the CPM advertising stakes.

  26. Great points…. The use of podcast vs blogs and users vs producers is an illustration of the larger overall issue: reliability and validity of what is being called research and metrics. It surfaced 2 weeks ago with Jupiter Research’s report on corporate blogging and the Nielsen report is more of the same.

    I don’t mean to be critical or provide a guest lecture because I agree with your overall point, sampling error… but your illustration of the phone survey doesn’t exactly make the point. Sampling is used, as I am sure you know, to select representative individuals within a population.

    The population requirements are determined by the research so if the purpose of the research was to measure what shows individuals were watching in the US then the sample should be designed to represent the US population in some agreed upon manner: age, region, gender and so on. Owning a phone, number of phones or kind of phone is probably not relevant because phone ownership or preferences is not part of the research on TV and you can reach an adequate sample size without people who don’t have phones, etc.

    If phones had something to do with having a representative sample for a specific TV research issue, then it might be relevant to the sample; and, ensuring a representative sample should not be considered “special efforts” but sound research design.

    As is clearly defining what you are measuring and any relevant terminology. A user is not a producer and not knowing the difference will skew the results just as counting men would when you are trying to measure the incidence of teen pregnancy.

    Marianne

  27. Matt,

    I think at the end of the day these services including Hitwise and Nielsen are more for directional purposes than anything. Yes, their methodology is flawed, but we can still use it to extrapolate what the general audience is visiting.

  28. What is statistical probability that the meteor will hit the Earth within next 10 minutes and kill us all?

    50% – It might happen or it might not happen.

  29. It is practical to analyze all of the Metric Sources and factor in the pros & cons of each to get a balance. Usually that balance is relevant, because they all tend to neutralize each other’s weaknesses.

    And of course, reading the critiques of each metric release will decrease the likelihood of misinterpretations

    ———————————-

    As most early Search Engine Professionals will tell you…

    In early 2000 Altavista was the Geek Search Engine and Yahoo the Popular Directory.

    Suddenly their Database and Algo changed dramatically ONE DAY
    in an attempt to produce a better ROI http://www.ysearchblog.com/archives/000264.html

    for a few months Geeks navigated to DogPile – then around Mid 2000 they adopted AllTheWeb

    around the autumn of 2000 they began using BOTH AlltheWeb and Google

    then around early 2001 Google became the Geek search engine – finally overtaking Yahoo’s Popularity the NEXT Year

  30. Speaking of metrics…and you were…is there a PR metrics weather report coming soon?

  31. Matt Belkin, an Omniture VP, had a good post on comparing statistics from Web Analytics products (i.e. Omniture, WebSideStory, Google Analytics, etc.) and Panel-Based Audience Measurement (i.e. HitWise, Nielsen, etc.).

    http://www.omniture.com/blog/node/22

  32. Matt is talking about comscore, which is completely bogus. First they distributed their spyware app via file sharing networks. Second they report Absolute uniques per month, which means absolutely nothing.

    Hitwise reports average daily visitors to a site. In other words hitwise reports actual traffic statistics from 20 million internet users by parsing ISP log files.

    Now if you buy comscore’s data you can see although the number 2 site is ranked number 2 it only gets 6 million pageviews a month, yet the number 3 site gets 500 million a month. There are many many examples of sites doing popup advertising and various forms of spam that show up insanely high on comscore but don’t even make the top 50 in hitwise because they have no “real visitors”

  33. Matt –

    At one point in my career I was involved with customer satisfaction research and we faced similar issues, so I am familiar with this topic. But I have to ask…

    If you believe it, then why a Google toolbar? Do you throw all of the data out? It’s basically junk information. There is no way to project it to a population. At least Alexa publishes their information. Does Google just throw it out?

  34. Dave (Original)

    RE: “The caller says, “Oh, we go by the last four digits of your phone number.” Now, what limitations will there be in the data?”
    ====================================

    I’d say none really as it’s totally random. Unless there is a link between what people watch and whether they have 0, 1,2, 3, or more phones it doesn’t matter in the slightest. About the only vital factor would be the size of the random sample. Basically, the bigger the sample, the more accurate the results.

  35. Not being nasty in any way but you only have to look at this blog’s Alexa score to see how metrics can be skewed by data input – in this case a lot of visitors to this site must have the Alexa toolbar installed.

  36. Toby – Exactly.

    Matt Chat?

    Haha!!!

  37. I’d say none really as it’s totally random. Unless there is a link between what people watch and whether they have 0, 1,2, 3, or more phones it doesn’t matter in the slightest. About the only vital factor would be the size of the random sample. Basically, the bigger the sample, the more accurate the results.

    Matt’s right on this. Besides the points he made, there are unlisted phone numbers (assuming they use the phone book); those of us who have a non-telco line (e.g. VoIP users); those of us who are able to filter out solicitations/telephone surveys/other crap through the features on our lines (THANK YOU PRIMUS CANADA); users of distinctive ringing; and a few others.

    So using phone numbers as a random sample to determine consumer/user behaviour doesn’t work. I don’t often disagree with you, Dave, but on this one, I’m gonna have to.

    And to show how bass-ackwards this topic is becoming, I’m actually going to agree with SEW on something. Rather than analyze individual metrics, the best thing any marketing type can do is to examine all available metrics and all the statistics from such. The more information that the site owner has available to him/her, the less likely the chance of misinterpretation.

    Alexa is flawed because of webmaster skew.
    Google Analytics and other Javascript-based metrics are flawed because some people disable Javascript.
    Some of the on-the-page counters are skewed because of the ease of manipulation.
    Raw log-based stats packages such as AWStats are flawed because raw logs don’t always track the referring URL accurately (they often miss the querystrings, for example.)

    But if you can collect all the information from different packages, filter, manipulate, and interpret, you may come up with something.

  38. Back to Matt’s podcasting data example, the problem with the article wasn’t the data itself, it was the interpretation (and misrepresentation) of the data. And probably the agenda was to make news where there wasn’t really any news – I mean, who cares whether 3% or 4% of the population downloads podcasts. That’s why they needed to spice up the data by comparing it to something, and they grabbed onto the blogging number without really thinking through the meaning of what they were writing.. (duh!) .. Wouldn’t it have been more interesting to talk about trends: Podcast downloads are up 3% from last year, blogging is up x%, and so on. I think the same applies to many web metrics.

    Gradiva Couzin
    http://www.yourseoplan.com/

  39. @Aaron Pratt

    “Toby – Exactly.

    Matt Chat?

    Haha!!! ”

    Mattchat is envisaged to be a place where off topic comments can be taken so that this blog’s comments stay on track 🙂 Matt has (indirectly) helped me so much, as well as some of the Cuttletts, so I just wanted to give something back.

    Anyway, I put out some beanbags, prepared the pringles and other snacks, put on some funky music and chilled the beer – but not many people have come to visit…. and everyone’s invited! 🙂

  40. As far as the Alexa data SPEAK ON IT BROTHER!!
    I am so tired of people sending me graphs from Alexa of their site compared to another site and as if, it had any real world reflection to the actual traffic on either site.

  41. Dave (Original)

    RE: “Besides the points he made, there are unlisted phone numbers (assuming they use the phone book); those of us who have a non-telco line (e.g. VoIP users); those of us who are able to filter out solicitations/telephone surveys/other crap through the features on our lines (THANK YOU PRIMUS CANADA); users of distinctive ringing; and a few others.”
    ==================================

    Unless there is a link between these anomalies and what people watch on their TV, it won’t have any impact in the slightest.

    If the purpose of the survey, or whatever, was something to do with phones, it may matter. However, as there is no link between phones and what people watch on their TV it has no impact.

  42. & i still don’t know how they know 4 million people watch a particular TV show last night.

  44. It’s a bit of an amazing thought to think that a company can literally grab the collective consciousness of the world on a rotating daily basis..

    Take the toolbar for example. On a simplistic user level it has various bits and bobs that users like and find useful. But what of all that query level data, and the stuff about where people go post query.

    There’s probably something algorithmic in there already that helps weed out scraper/doorway pages.

    If user follows query and stays on next page and hits affiliate link in less than 8 seconds then page is equal to poo, kinda thing.

    And that’s just a spam reduction benefit.

    What about all that marketing stuff! Wow wow wow, can you imagine the value to be had from seeing where people go, measuring the percentage of times that people going to say, https pages and then deducing that site x has a high conversion factor for z market area…dream hand job material for marketers.

    Then there’s all the other stuff too, looking at deviant queries, where they go, what do they read, how long do they read it for, area demographics etc etc

    Scary really.

    [i]Matt I used some symbols in my previous post and WP stripped it all out, feel free to delete the previous item and edit this too[/i]

  45. I am so tired of people sending me graphs from Alexa of their site compared to another site and as if, it had any real world reflection to the actual traffic on either site.

    Actually, Alexa can be useful – so long as you know what you are doing with it.

    To compare geek site with teen site is pointless – two totally different user bases.

    But if you compare websites that target a similar demographic – then you can be reasonably sure that both sites would have similar percentages of Alexa users (high or low user base).

    Now, if you know your own site stats, check Alexa – then check a competitor site and you should be able to make reasonable inferences about their traffic.

    No doubt – Hitwise is the best, I just wish they weren’t so damn expensive.

  46. Here’s another story from the old days.

    Circa 2002, there was this search engine growing in popularity and creating a whole new industry in its wake. See, this search engine was so popular, and its algorithm decided a site was “relevant” for particular results based largely on link popularity.

    So the average webmaster soon realised that these results could be manipulated by artificially inflating the link popularity of their site(s) through a variety of techniques. The result was that a lot of search results didn’t rank sites based on actual quality – they ranked them based on their perceived (ie artificially inflated) popularity.

    As a young marketeer, I said to a colleague, “But that’s a horrible methodology!”. “That would mean novice users would be largely unaware of these issues and expert users (ie, the group of people whose support helped launch said search engine into the mass market) would simply not rely on the results. And what use is a search engine if savvy users can’t rely on the results?”. My colleague agreed.

    Let’s do a simple exercise. Suppose you were searching for advice on a particular subject and found a site. The site owner asks, “how did you find me?” and you respond – you were high in the SERPs.

    Now, what limitations will there be in the data? Experts without SEO knowledge will be excluded. Experts without marketing budgets will be excluded. But those with the knowledge to communicate (via SE’s) will be perceived as experts.

    OK, OK a little rant-ish but intended as humour more than anything. 🙂

    There are hundreds of thousands of webmasters of the past few years that are pretty annoyed that Google hasn’t interpreted their data in the way that they would like, leaving them with an unfair slice of the pie.

    2002 – 2004, every Google update saw hundreds of webmasters being adversely affected by changes in the algo, and this isn’t just market share on paper – it’s income, businesses and financial stability.

    The business of metrics is just the same as the business of search engines – it’s just taking sets of data and disseminating it in a certain way based on the way you perceive end users will need and want to use it.

    In fairness though, although Google still has some flaws, there is a greater feedback mechanism for end users than many businesses do, or indeed need to, provide.

    So, this isn’t a rant. Just pointing out the similarities to the topic and how things change when you’re on the other side of the fence. 😉

    MG

  47. What is 1+1?
    Here is how the following people might answer:
    Math, 1+1 is 2
    Engineer, 1+1 is between 1.999 and 2.001
    Journalist, 1+1 is 11 (eleven)
    Accountant, 1+1 is… what do you want it to be?
    Statistician, 1+1 is… I couldn’t tell you, the population is not big enough.

    😉

  48. Well, I totally agree that these metrics companies do not provide accurate information. We were told by a company [nameless] that only 35% of people tend to buy cell phones online, but after working it out ourselves and consulting a reputable expert company [referred by a very prestigious SEO specialist], we came to know that over 55% of surfers tend to buy cell phones online. This is a major difference if we look at the raw figures, and it had a major impact on the way we were planning our new strategies.

    Things turned out well once we knew that; we now have quite detailed information, and our strategic decisions were well planned on the basis of the new information.

    Regards,
    Overridex!

  49. Most issues with web metrics boil down to the confluence of two factors which tend to “muddy the waters”:

    * People have a long history of both inadvertently and knowingly abusing statistics, as many have noted above. This is compounded by a tendency to perceive “numbers” as authoritative, beyond rebuke.

    This is perhaps best articulated in Darrell Huff & Irving Geis’ 1954 classic, “How to Lie with Statistics”:

    [http://www.amazon.com/exec/obidos/ASIN/0393310728/antezeta-20 US]
    [http://www.amazon.co.uk/exec/obidos/ASIN/0140136290/antezeta-21 UK]

    * The Internet is relatively young as a medium. Metrics terminology and techniques are not (yet!) commonly shared and defined, nor are they pervasive.

    Consider that although the oldest bank in the world was founded in Siena in 1472, the financial community is still struggling to adopt internationally recognized accounting standards [http://en.wikipedia.org/wiki/Generally_Accepted_Accounting_Principles Generally Accepted Accounting Principles]. That there is still confusion in the definition, measurement and use of web statistics shouldn’t be such a surprise.

    This leads to well-intentioned but ill-informed comparison of “apples and oranges”. Or perhaps, even worse, how often do you still hear internet professionals speak of “hits”, one of the most useless site metrics?

    Yet there is hope. Web analytics practitioners and vendors formed the [http://www.webanalyticsassociation.org/ Web Analytics Association] with a goal of standardizing terminology and methods in the internet metrics field.

    Eric Peterson’s “Web Analytics Demystified” is also an excellent contribution to helping clarify web metrics techniques and tools. He does seem to have a bias against web log based systems which I do not share. [http://www.amazon.com/exec/obidos/ASIN/0974358428/antezeta-20 US only]

    In any case, web metrics do need to be put in perspective. William Hesketh Lever, one of the founders of today’s Unilever, is reputed to have once lamented: “Half of my advertising is wasted, and the trouble is, I don’t know which half.”

    Web metrics, properly understood and implemented, are capable of delivering insights based on hard data unrivaled in many other marketing areas. I do believe that Web Trends, one of the first significant commercial web analytics tools, got it right when they named their product – it’s all about verifiable trends, not absolutes “down to the penny”.

  50. no, I don’t think an engineer would answer between 1.999 and 2.001 as there are infinite numbers there that qualify. that’s not good enough.. we’d say 2.

  51. Alexa data is skewed. Funny thing about the Alexa toolbar is that most antispyware programs pick it up when you run a scan. It is not widely used anymore by the average surfer. I would say that most webmasters have it installed to help their own numbers out, I suppose. Honestly, to what benefit though?

    At one point in time (years ago) it was useful, when people were using Alexa to search. Funny thing is, I asked 10 people yesterday who they use to search: 7 said Google, 2 MSN, 1 Yahoo. All 10 said they had never heard of Alexa.

  52. “Alexa data is skewed” – “Obviously Alexa data is inaccurate”

    Alexa showed Matt was on vacation! It might not be a precise instrument to gauge traffic but there must be some truth to the stats. I turn up the AdWords and the rankings in Alexa go higher, and I don’t get SEO and viewers from Korea.

  53. The Adam That Doesn’t Belong To Matt Said –

    Actually, you couldn’t be more wrong and Dave is closer to the truth.

    Your “counting and examining all metrics” just leads to a worse statistical problem of potential “double dipping”. In addition, I can guarantee that most metrics will not follow the same scale, the same accounting logic, etc. so using the “take everything and draw your conclusions” approach is a far worse statistical approach than a random sampling.

    It’s simple regression analysis – with an appropriately large sample size that doesn’t discriminate on a directly relational basis, a random sampling is the “best effort” metric.

    Remember, metrics are just that – metrics. They aren’t 100% truths – they are best guesses based on the information known.

  54. William Donelson

    Matt, exactly right. This is one of my big complaints about schools preparing kids for the future. Sure, there’s readin’ and writin’ and ‘rithmetic, but what about CRITICAL THINKING?

    If I know the future (and I do), I KNOW that it’s going to be different from today in many, many ways. Change is the only constant. So what schools should be teachin’ their students (along with the mainstays) is the Scientific Method (or whatever you want to call it) – a way to carefully look at a situation, and identify all (or at least, most) of the variables.

    When I watch (even) “high quality” news on TV, I am appalled by the journalists’ often huge oversights in ASKING QUESTIONS. If journalists routinely make these huge oversights, then we are in trouble.

    Kids MUST be able to grow up and KNOW how to tell when Ads, Politicians, Business people, religious authorities, neighbours and strangers ARE MISLEADING them.

    Here’s a vote for Critical Thinking! Always Question “Authority” !!!

    Cheers, Matt!

  55. William,

    While I tend to be one of the types that questions authority and thinks independently to the point where it gets me into trouble, I disagree that it can and should be taught in schools (moreso the former). Critical thinking is about forming an individual opinion based on analysis, interpretation, prediction, and other factors. Some of these can probably be taught, but things such as interpretation and prediction cannot.

    In Canada, we tend to do the opposite…we teach our children so little of the fundamentals and so much about “how to interact with others” and “social skills development” and “the emotional impact of 2+2 on the squirrels and chipmunks” that we end up having a nation full of people who can’t string four words together to form a sentence. Even if we could form the sentence, we’d never be able to count the words in it.

    In other words, a pragmatic knowledge base needs to be laid down at any and all times before the critical thinking stage.

  56. “Hitwise collects Internet usage information via a combination of ISP data partnerships and opt-in panels.” ~ Gary

    This is a misleading statement in the context of the discussion regarding how user behavior is monitored.

    HitWise collects its user-behavior data directly from ISP networks. Then, in order to OVERLAY demographic information with that data,

    “Hitwise also combines this rich ISP data with a worldwide opt-in panel to overlay demographic, lifestyle and transactional behavior across the thousands of websites that are reported on every day.”

    So, HitWise does not use opt-in software to track user behavior, but merely to supplement the behavior of their tracked users. Compare this to the wholly insufficient methods employed by Nielsen and I think you’ll see the effectiveness of HitWise’ data-gathering methodology.

  57. Great post Matt, you are right. I think everyone has to look at their own market and see which statistics will tell them more about how they are doing. For example, at my company we know that visitors to our destination grew 30% last year and our reservation numbers are increasing, but our direct customers are declining. If we read the industry news, though, we realize that more and more people are arriving with their whole trip reserved: 80% via the internet and 20% through local travel agencies wherever they came from. Statistics can help you realize many things about your business, but you have to know which ones you should pay attention to. By the way, I can’t wait for the Alexa post!

  58. William Donelson

    Adam that doesn’t…

    I see no reason why they shouldn’t be taught together.

    The ability to see the world clearly, identify its separate parts, and base actions on analysis of those parts is the Definition of learning and knowledge. Young children do this automatically, every day, in learning how to live in the world. As they get older, those skills should be encouraged and developed.

    Instead, modern schools do NOT favour individuality; they favour rote-learning, memorisation and conformity (it’s less expensive that way). In classrooms with 30 students or more, it’s a full-time job just maintaining order, much less teaching the subject at hand.

    Still, I believe that supporting and enhancing students’ discriminatory and analytical skills would help in all their courses, no matter the subject, AND would probably improve attention spans and increase participation in all subjects, thereby reducing demands on teachers and improving social skills as well.

    I have lived in the UK for 21 years now, after 33 years in the USA, and I can tell you that BOTH the UK and the USA would benefit from citizens who can tell when they’re being conned !

  59. I have been pretty harsh on a few SEO’s lately and would now like to say something to Google.

    Dear Google,

    You cannot blame newbie webmasters for needing metrics tools (which are often made by SEO’s for linkbait) because they come from a “community” where real live people can talk about them and have access to those who made them. They are the “us”; you are the “them”, currently.

    For you to offer metrics tools then taketh them away hurts. Today I look at my Pagerank and can make no sense of it so what will I do? You got it, I will end up back in the SEO forums asking people who do not work for Google (and in some cases want to destroy their reputation with incorrect data) for help.

    So, I am troubled by the bad advice I am getting from SEO’s but at the same time am bothered by Google being so darn secretive.

    You get my point and I see the improvement, not sure if others are.

    That is all.

  60. William,

    Much as I’d like to see applied knowledge and the ability to use common sense be taught, the problem with the theory is that you’re talking about the ability to apply knowledge, which isn’t something that can be taught without the individual first having some natural ability or inclination in that regard. And most people (in Canada at least) simply do not have this ability or inclination.

    I tend to agree with you about the conformity of learning, and the excessive focus on memorization. I’ve been of this opinion for several years now, having attended a post-secondary institution in Canada briefly before I came to a realization similar to this one:

    http://imprint.uwaterloo.ca/issues/012497/2Forum/forum01.html

    By the way, this article was written by someone I knew pretty well at the University of Waterloo and has been edited quite significantly. I actually have the original in hard copy but I don’t want to put it in digital format for certain reasons.

    Simply put, the education system isn’t going to help you, me, or anyone else. I believe the short-term future for issues such as independent thinking and applied knowledge will pertain to sites such as Matt’s and any information provided within.

    Matt, please don’t take this the wrong way, but sometimes you are prone to ambiguity. It’s very understandable, since you can only say so much without a Google lawyer breathing down your neck or without revealing too much of the secret sauce, but nevertheless there are some things in this blog that are said that are subject to interpretation.

    But…that’s actually a good thing. The same skills that would be required to correctly interpret the various statistical measures which you’ve discussed are the same skills that would be required to interpret much of the information out there.

    And therein lies what I believe will be the future of critical thinking…the large, for the most part uncategorized information base that the Web represents. People can sift through billions of web pages and disseminate, interpret, and apply as they see fit. In other words, the Web represents both an information medium and a means of refining and adopting both applied and factual knowledge skills. The schools just ain’t goin’ cut it no more.

  61. William Donelson

    My theory of how to fix 😉 education in the USA:

    Pay students for better performance, with the teachers getting a percentage of students’ awards.

    Students would take national standardized tests, and would get awards based on how much they’d improved each semester (I assume two tests per year).

    Hurray for Capitalism!

  62. I totally agree with Matt’s statement on needing to “think critically” about the claims someone makes with statistics. There are many, many examples of bad data where people pick up the torch because it fits into a conventional wisdom that appeals to them. I just finished a great book on the topic: Freakonomics: A Rogue Economist Explores the Hidden Side of Everything by Steven D. Levitt and Stephen J. Dubner – William Morrow (www.freakonomics.com). I highly recommend it.

  63. Until students started sharing the standardized test answers through their favourite P2P program and the system got screwed up. 🙂

  64. Do any of these traffic monitoring companies release their margins of error? I look at some of the keyword traffic that some of these guys list as high ranking, and there are obvious flaws with the process, like not having large enough samples or having samples that are too demographically skewed. What these folks are saying may be true for their samples; it’s just that their samples are lousy.
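
For reference, the sampling margin of error asked about above can be approximated for a simple random sample. The sketch below uses the standard normal-approximation formula for a proportion; the 10% share and the panel sizes are illustrative assumptions, not figures from any metrics vendor.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation interval for a proportion.

    p_hat: observed share in the sample (0..1)
    n:     sample size
    z:     z-score for the confidence level (1.96 is roughly 95%)
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Hypothetical panel sizes; the 10% observed share is made up for illustration.
for n in (100, 1_000, 10_000, 100_000):
    moe = margin_of_error(0.10, n)
    print(f"n={n:>7}: 10% share, +/- {moe * 100:.2f} percentage points")
```

Note that this only covers random sampling error. A demographically skewed sample, the other flaw mentioned above, is not captured by this number at all, no matter how large the panel gets.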

  65. Bob,

    If they did release that kind of info, it would be a lot harder for their aggressive telemarketing sales team to dupe otherwise intelligent people into signing up for a $30K+ per year service.

    I can see the disclosure statement now:
    “Actual data collected may be of lower quality than what you seem to get in our impressive demo. Service is actually based on the activities of 18 unemployed dial-up users.”

  66. Did you all see that MySpace got more page views than Yahoo and Google this month? Now that’s some stats for you.

  67. Dave (Original)

    Surely nobody really believes Alexa stats?

  68. “The thing to notice is that Podcasting News contrasted downloaders of podcasts with producers of blogs.”

    Podcasting News didn’t do that. Nielsen did.

  69. Any survey or sample that is voluntary or requires the action of someone to become part of the sample group will always be skewed. Sometimes the implications of the skew are not all that clear.

    The Alexa toolbar one is the best. Most virus and trojan removers will attempt to remove the toolbar (because it reports back your moves) or will at least flag it as a potentially bad component. Therefore, anyone who is savvy enough to run a virus scan or a trojan scan is savvy enough not to be part of their sample.

    From my personal experience, you can move the ranking of your site up by a sizable amount simply by putting the toolbar on your own PC and surfing your own site on a daily basis. If you have a desktop and a laptop, you can go for a major double dip!
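
The self-selection effect described above can be put into rough numbers. This is a minimal sketch with entirely made-up group sizes, toolbar install rates and visit rates (none of them come from Alexa); it compares a site's true share of users with the share a voluntary toolbar panel would report.

```python
def true_and_panel_share(groups):
    """Compare a site's real reach with what an opt-in panel would see.

    groups: list of (population, install_rate, visit_rate) tuples, where
    install_rate is the fraction of the group running the toolbar and
    visit_rate is the fraction of the group that visits the site.
    Returns (true_share, panel_share) as expected values.
    """
    true_visitors = sum(pop * visit for pop, _, visit in groups)
    true_total = sum(pop for pop, _, _ in groups)
    panel_visitors = sum(pop * inst * visit for pop, inst, visit in groups)
    panel_total = sum(pop * inst for pop, inst, _ in groups)
    return true_visitors / true_total, panel_visitors / panel_total


# Hypothetical numbers for a webmaster-oriented site:
# webmasters are few, install the toolbar often, and visit the site a lot;
# casual surfers are many, rarely install it, and almost never visit.
groups = [
    (10_000, 0.20, 0.50),    # webmasters
    (990_000, 0.01, 0.001),  # casual surfers
]

true_share, panel_share = true_and_panel_share(groups)
print(f"true share of users:  {true_share:.2%}")
print(f"panel-reported share: {panel_share:.2%}")
```

Under these assumptions the panel overstates the site's reach by more than a factor of ten, simply because the people who install the toolbar are also the people who visit the site.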

  70. It’s why statisticians go to school pretty much to learn how to skew any metric in their favor. Just as you (and others) have mentioned… look at Alexa. The only thing it’s arguably good for is comparing your own traffic against itself historically (but hopefully you have something like Google Analytics or the like that does a much better/more accurate job).

    It cracks me up to see digitalpoint.com as the 129th most trafficked domain (according to Alexa). Which of course is skewed because who installs the Alexa toolbar the most? Webmasters. In reality, I would think it was cool if digitalpoint.com was the 10,000th most trafficked domain (based on REAL numbers). 🙂

  71. Dave (Original)

    RE: “Which of course is skewed because who installs the Alexa toolbar the most?”
    =========================================

    Digitalpoint forum members.

  72. You also have to understand that some media releases, such as the one above about podcasting, are now written in such a way that they invite comments on blogs. It’s done as a traffic generation and backlinking technique.

  73. I don’t think that installing the toolbar gives you an increase in PR!

  74. I too will be interested in your Hitwise analysis – we just spent a nice chunk of money on a couple of seats for their package, and the number one question that was never satisfactorily answered was “what sort of skew can we expect in your data?”.

    For those who don’t know, Hitwise sample ISP logs (in the UK, they represent one third of all internet users) and build their reports on these. Reports available include which keywords drive traffic to your competitors but not you, the top 20/100 upstream and downstream sites from any page, etc etc. Interestingly they are currently unable to distinguish between PPC and organic traffic from Google etc, however I understand this functionality is imminent.

    Hitwise assured me that they use a wide range of ISPs, from your Tesco to your Demon – but it’s still slightly troubling to be basing major strategic and tactical decisions on data which is, in the end, unverifiable.

    By the way, Google is still skewed towards a certain demographic of tech-savvy users. Because it isn’t the default search engine on most new PCs, and obviously because of the built-in search functionality of IE, and because Yahoo attracts more ‘casual’ browsers to their webmail, horoscopes, searches etc – Google can be a hard place to get ROI for certain markets, as opposed to Overture.

    PS; nice captcha.

  75. I remember the days when google was small, and Altavista owned the world.

    And I still don’t know why “hits” are measured at all. Add some extra images to your site, and instantly your hits increase – lol.

  76. Despite their OTT comparisons, Hitwise are a pretty decent setup. They (apparently) take anonymous logs from UK ISPs in order to determine who is visiting what (and when, and for how long etc).

    It’s a very expensive service, but it allowed us to improve our CPO on google adwords and get an insight into what online marketing our competitors were doing.

  77. Here’s a question, and I really don’t mean to be smug or stomp my foot. I think it’s a legitimate one.

    All this talk of skewed data and incomplete samples and accurate metrics makes me think of only one thing: the sandbox, or whatever you want to call it. It seems to me we could easily substitute the sandbox for any of the skewed sample examples above.

    It seems that actually was less a question than a comment – sorry.

  78. I remember the days when google was small, and Altavista owned the world.

    And I still don’t know why “hits” are measured at all. Add some extra images to your site, and instantly your hits increase – lol.

    spacer.gif, spacer2.gif, spacer3.gif, spacer4.gif…;)
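
To make the spacer.gif joke concrete: a "hit" is any request the server answers, so every embedded image or stylesheet adds one, while a page view counts only the page itself. The sketch below is a simplified illustration; the extension list and the sample request paths are assumptions, not how any particular log analyzer classifies requests.

```python
# File extensions treated as embedded assets rather than pages (an assumption).
ASSET_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def count_hits_and_page_views(request_paths):
    """Return (hits, page_views) for an iterable of requested URL paths."""
    hits = 0
    page_views = 0
    for path in request_paths:
        hits += 1  # every request is a hit, whatever it fetched
        clean = path.split("?", 1)[0].lower()  # ignore any query string
        if not clean.endswith(ASSET_EXTENSIONS):
            page_views += 1  # only non-asset requests count as pages
    return hits, page_views


# One visitor loading one page that embeds four spacer images and a stylesheet.
requests = [
    "/index.html",
    "/spacer.gif",
    "/spacer2.gif",
    "/spacer3.gif",
    "/spacer4.gif",
    "/style.css",
]

hits, page_views = count_hits_and_page_views(requests)
print(f"hits: {hits}, page views: {page_views}")
```

Adding more spacer images raises the hit count without a single extra visitor, which is why hits say so little about actual traffic.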

  79. Dave (Original)

    RE: “And I still don’t know why “hits” are measured at all”

    Probably because ignorance is bliss 🙂 Some probably refresh all day long to keep their hits up.

  80. Most people don’t question the metrics they read; many people don’t think fully about the metrics they post (or have a slanted agenda).

    I don’t know how to solve the first problem. The second problem gets policed (all too infrequently) by people like you.

  81. Great article, people get way too focused on the numbers sometimes and don’t always understand what exactly constitutes them…this can lead to bad decisions if that’s all you’re basing them off.

    Mark

  82. Yup… I do agree with Mark. People focusing on the numbers presented should know what those numbers actually represent. On the whole, metrics are a great benefit, since they help us understand what is happening with whatever we monitor, and why, so we can plan actions based on what we conclude. Thanks.
