It's been two years since my inaugural blog post on April 29th, 2006: The Trouble With RSS. Over my site's second year, I wanted to do some long-term analysis on how different web analytics tools track hits, visits, and the like. As expected, they don't agree with each other:
- SiteMeter: 89,800 visits (132,000 hits)
- Google Analytics: 84,000 visits (140,000 hits)
- Webalizer: 431,000 visits (3,660,000 hits)
In contrast with the other two, Webalizer uses raw Apache logs to determine hit count, so it tracks every single dang hit... Over 3 million hits in one year??? That's clearly too many... I'm not that interesting... but the visit count might be more accurate. Webalizer is the only one of the three that tracks folks who view my site with RSS readers, which may hit my site several times per day... thus the higher visit count. The hit count is hyper-inflated because it also counts search engine spiders, spammers, and hack attempts (some better than others).
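For the curious, here's a rough sketch of why a raw-log tool reports so many more hits than visits. This is not Webalizer's actual code: the 30-minute visit timeout, the bot keyword list, the `summarize` function name, and the sample log lines are all assumptions made up for illustration. It only assumes the logs are in Apache's Combined Log Format and in chronological order.

```python
import re
from datetime import datetime, timedelta

# Apache Combined Log Format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

VISIT_TIMEOUT = timedelta(minutes=30)            # assumed timeout, not a verified default
BOT_HINTS = ("bot", "spider", "crawl", "slurp")  # illustrative keywords only


def summarize(lines):
    """Count hits, likely-bot hits, and visits from log lines in time order."""
    hits = 0
    bot_hits = 0
    visits = 0
    last_seen = {}  # ip -> timestamp of that ip's most recent hit
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        hits += 1     # every request counts as a hit: pages, CSS, feeds, bots
        agent = match.group("agent").lower()
        if any(hint in agent for hint in BOT_HINTS):
            bot_hits += 1
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        ip = match.group("ip")
        previous = last_seen.get(ip)
        # A new visit starts when an ip is new, or was quiet longer than the timeout.
        if previous is None or when - previous > VISIT_TIMEOUT:
            visits += 1
        last_seen[ip] = when
    return {"hits": hits, "bot_hits": bot_hits, "visits": visits}


# Made-up sample: one browser page view plus its stylesheet, a spider,
# and a feed reader polling from the same ip an hour later.
SAMPLE = [
    '10.0.0.1 - - [29/Apr/2008:10:00:00 -0500] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [29/Apr/2008:10:00:01 -0500] "GET /style.css HTTP/1.1" 200 900 "http://example.com/" "Mozilla/5.0"',
    '10.0.0.2 - - [29/Apr/2008:10:05:00 -0500] "GET /robots.txt HTTP/1.1" 200 64 "-" "Googlebot/2.1"',
    '10.0.0.1 - - [29/Apr/2008:11:00:00 -0500] "GET /rss.xml HTTP/1.1" 200 2048 "-" "ExampleFeedReader/1.0"',
]

summary = summarize(SAMPLE)
```

In this toy sample, four hits collapse into three visits, and a feed reader that polls once an hour shows up as a fresh visit every time... which is roughly the kind of inflation I suspect in my own numbers.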
All told, if the majority of folks view my site with RSS, then Webalizer's count is more accurate. If most of them view it the old-fashioned way, in a browser, then the other two are more accurate. I'm probably in the 100,000 - 200,000 visits per year range.
Unfortunately, none of these numbers include the folks who read my site through online RSS readers, like Google Reader or Bloglines. These sites hit my RSS feed once, then share it with dozens of folks who subscribe to the feed... To get a better estimate, I could pipe my RSS feed through something like Feedburner. Feedburner keeps track of how many subscribers you have on the online feed readers, and produces decent stats on it... however, once you move your feed to Feedburner, it's almost impossible to move it out... so I'm not happy with that option. Even so, that still wouldn't track those who view my content through RSS aggregators like Central Standard Tech, or Orana, or other sites that run Planet.
Well, what about the data from Alexa? That site ranks web pages based on those who surf the web with a toolbar that tracks their every move. Personally, I think people who surf with that toolbar are opening up a major security hole... so their viewing audience is probably restricted to folks who are kind of tech savvy, but don't take security precautions. In other words, newbie geeks. I've never broken into the top 100,000 sites ranked on Alexa, but I frequently break the top 100,000 sites ranked by Technorati... although Technorati only ranks blogs.
UPDATE: As Phil noted in the comments below, most people use Alexa just to boost their own page rank. For example, you could have your web team install and enable the Alexa toolbar, but only when browsing your own web page. That would make your Alexa rank huge without any actual hits from the greater internet...
Even if we could accurately count how many people hit the site, we're still at a loss to know who paid attention. Google Analytics tries to measure "time on the page"; other metrics include bounce rate, or even the number of comments.
Oh well... A reliable measure of relevance will always be elusive... but at least we have enough estimates to support a cottage industry of people analyzing those metrics to prove anything they are told to prove ;-).
Back to my anniversary... Lots of stuff has changed since my first anniversary post: I've traveled to South Africa, Brazil, and Argentina... I've remodeled my kitchen, I've nearly completed my second book on Oracle enterprise content management, I've given technology presentations at Oracle Open World, AIIM Minnesota, BarCamp Minnesota, and IOUG Collaborate in Denver. I've trained both salespeople and consultants on what Enterprise Content Management actually is, and I helped negotiate a settlement to an 18-month lawsuit against a local non-profit. Oh yeah... I implemented about a dozen ECM solutions as well...
Next year, I hope to have even more goin' on... and a few more web site visits.