The Trouble With RSS

I've had a love/hate relationship with RSS Feeds (and their cousin Atom) for some time...

These feeds are great for blogs, and occasionally other web sites. They are also useful for pushing simple content from one web server to another. However, the current technology suffers from several serious limitations that will delay its widespread adoption.


I put together the RSS Feeds samples for Stellent. Some was my own work from years ago, but most was a formalization of samples I collected from several developers (and non developers) at Stellent.

I love RSS feeds for some things. I like publishing server error logs as RSS feeds so I get instant notification when anything bad happens. It's a great way to syndicate your log files, which is a good best-practice for security. I also like being able to define a search, and save the results as a feed. It's a quick and easy way for non-administrators to set up their own subscriptions.

A coworker loves RSS so much that he hardly visits our intranet site anymore: he gets feeds of all content checked-in by his team, his workflow items, etc. I'm not that gung ho yet...

However, integrating RSS tightly with the Content Server was a bit frustrating. I bumped head-first into several significant problems and limitations in RSS that I believe will slow its widespread adoption. Maybe I'm being nit-picky, but when WebDAV originally reared its head, we were not vocal enough about its limitations. Some people got caught up in the fever and thought it was the ideal and only way to implement enterprise content management.

As it turns out... not so much.

RSS has even bigger limitations than WebDAV. The companion OPML format addresses some of these, but not all. Despite the limitations, RSS is easy to set up and can greatly increase your productivity... so you should plan on implementing something with RSS this year.

But be warned... don't buy into the hype just yet. And if you already have, allow me to crush your spirits by presenting my list of the top six major problems with RSS:

One: Most RSS Readers are Just Plain Awful

I don't want to point fingers here... RSS is a simple format, so everybody and his dog decided to prove they were cool by creating an open source RSS reader. Most of them aren't worth the bandwidth it costs to download them. A handful of commercial readers are better, such as NetNewsWire (one of the earliest desktop readers), but I feel that every one of them is either too clunky or too buggy for the average user.

Precious few readers support multibyte languages. It takes too many clicks to add a new feed, and when you do, the folder structures are counterintuitive. Many of them don't even support the RFC 822 date format that is mandated in the RSS specification. I've had to munge date formats to get certain readers to parse them at all.
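For what it's worth, handling RFC 822 dates does not have to be hard. Here is a minimal sketch using only the Python standard library; the function names parse_pub_date and format_pub_date are my own inventions for illustration, not part of any reader or of the RSS specification.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def parse_pub_date(text):
    """Parse an RFC 822 style date string into an aware UTC datetime.

    Returns None when the text cannot be parsed, so a reader can fall
    back to "date unknown" instead of dropping the whole item.
    """
    try:
        parsed = parsedate_to_datetime(text.strip())
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return None
    if parsed.tzinfo is None:
        # A "-0000" offset yields a naive datetime; assume it means UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def format_pub_date(moment):
    """Format an aware datetime the way a feed generator would emit it."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)
```

A reader built this way would accept both "GMT" and numeric offsets such as "+0200", and a feed generator could use format_pub_date so it never has to munge dates by hand.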

Many of them have a hard time telling one feed item from another. They usually use the LINK element as the global identifier. This is fine if you have a blog, but not if you want one web page to have multiple feed listings. Depending on your reader, you may have to munge up a fake URL just to get them to display new data.

Despite the fact that RSS was popularized by blogs, and seems specifically designed for blogs, even blog readers avoid it. According to the 2005 Blogads survey:

"Only 12 percent of the blog reading audience said it used RSS always or often."

Might this have something to do with the poor quality of the currently available readers???

UPDATE: The 2006 Blogads survey was about the same: only 11% use RSS always or often. The number of people who use RSS occasionally got a slight bump, from 17% to 19%. Still not stellar numbers.

Microsoft has promised to incorporate RSS feed support into their next major release of Outlook. But will it be a good RSS reader, or just fulfill the minimum requirements for "buzzword compliance"? That remains to be seen.

I believe that Outlook's RSS reader will need to be hands-down superior to all other RSS readers available to have any impact whatsoever. Otherwise, it won't be used. The geeky amongst us will use our alternate readers, and the standard business users will continue to ignore RSS entirely.

My favorite reader, despite several extremely wacky design choices, is Bloglines. It is a web application, not a heavy client, nor a plug-in to your email software. Because it aggregates the feeds on the same web page, it's easy to scan for items of interest.

Using it has enabled me to fly through about 50 feeds per day. This is mainly because its UI is so weird that one wrong click will delete all your feeds before you even get a chance to read them...

...and it's STILL the best RSS reader I've found.

That makes me sad.

Two: There is no Concept of Revisioning or Updates for Items

The RSS 2.0 specification supports what's called the GUID element. This allows you to set a globally unique ID for your feed item, which bypasses the problem of requiring a unique URL for each feed item. The Atom format has a similar element.
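To make the difference concrete, here is a rough sketch of how a reader could tell items apart: prefer the guid, and fall back to the link only when no guid is present. The sample feed, its URLs, and the names item_key and unique_items are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: two log entries that share one page URL.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Server log</title>
<link>http://example.com/log</link>
<description>Made-up feed for illustration</description>
<item><title>Disk full</title><link>http://example.com/log</link>
<guid isPermaLink="false">log-entry-1</guid></item>
<item><title>Login failed</title><link>http://example.com/log</link>
<guid isPermaLink="false">log-entry-2</guid></item>
</channel></rss>"""


def item_key(item):
    """Return the guid if the item has one, else the link, else None."""
    guid = (item.findtext("guid") or "").strip()
    link = (item.findtext("link") or "").strip()
    return guid or link or None


def unique_items(feed_xml):
    """Map each distinct item key to its title, keeping the first occurrence."""
    seen = {}
    for item in ET.fromstring(feed_xml).iter("item"):
        key = item_key(item)
        if key is not None and key not in seen:
            seen[key] = item.findtext("title")
    return seen
```

With this sample, keying on the guid yields two distinct items, while a reader that keys on the link alone would collapse them into one.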

However, neither RSS, nor RDF, nor Atom supports the concept of revisioning a feed item. In the real world, content needs at least two identifiers: one that is shared between all revisions of the item, and one for each specific revision.

If you have ever tried to consume RSS feeds from Reuters, you know how painful this limitation is.

You see, Reuters updates its news articles any time it makes a minor addition or correction. But it is vital to keep the older stories on the site and in the feed since they are all "official" news releases. If Reuters did otherwise, it would be accused of trying to re-write history, as some lesser organizations are prone to do.

Since RSS doesn't support grouping by revision, you sometimes get the same news article in your feed three or four times. The full story is different, but the summary in the RSS is exactly the same. This is extremely annoying when you are trying to speed-read your feeds to get the day's news. This isn't Reuters' fault; the limitation is in RSS.

Three: Feeds Cannot Contain Other Feeds

Now this one REALLY bugs me.

Why can't I say "here's a feed item, and here's a link to another RSS feed of related items"? Or why not "here's a list of all the feeds on my site, and when they were last updated"? Or how about "this file contains all the RSS feeds that I subscribe to"? Is it me? Am I asking for too much here?

OPML allows you to create a list of RSS feeds, but it's very limited. It doesn't support timestamps for when those feeds were last updated. Plus, why should we need yet another XML format? How about we get one right, and standardize on that?
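For reference, here is roughly what reading an OPML subscription list looks like. This is only a sketch: the sample document and the list_subscriptions helper are made up, and the attributes shown (type, text, xmlUrl) are the ones I have typically seen on subscription entries. Note that nothing in an entry says when the feed last changed.

```python
import xml.etree.ElementTree as ET

# Hypothetical subscription list; the feed URLs are placeholders.
SAMPLE_OPML = """<opml version="2.0">
  <head><title>My subscriptions</title></head>
  <body>
    <outline text="News">
      <outline type="rss" text="Example News"
               xmlUrl="http://example.com/news.xml"/>
    </outline>
    <outline type="rss" text="Example Blog"
             xmlUrl="http://example.org/blog.xml"/>
  </body>
</opml>"""


def list_subscriptions(opml_xml):
    """Return (text, xmlUrl) pairs for every outline that points at a feed."""
    root = ET.fromstring(opml_xml)
    return [
        (outline.get("text"), outline.get("xmlUrl"))
        for outline in root.iter("outline")
        if outline.get("xmlUrl")  # folders have no xmlUrl, so skip them
    ]
```

A reader still has to fetch every feed in the list to learn whether anything is new.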

Jeez... I have to say, if you aren't going to support the 'E' in XML (eXtensible Markup Language), why use XML at all? Just make RSS a bunch of tab-delimited text files, and be done with it. Then at least you'd have an excuse for the flatness of the format.

Four: RSS does not Scale Well

I recall an article in 2003 about how RSS feeds might help stem problems with spam. The author predicted that web sites would begin offering monthly newsletters, sale offers, coupons, etc., as RSS feeds instead of email. This way users would not have to give their email address to receive notifications. They would ping the RSS feed every hour (perhaps more often) to see if new data is available.

This is the "pull" model, similar to web pages. This is the opposite of the "push" model, used in email and instant messaging. Advertisers and spammers love the latter, but consumers prefer the former.

Imagine if telephones gave you this option... then telemarketers would have to wait for you to call them! Not a bad idea... but it wouldn't work very well if everybody decided to do it.

For example, my main airline (Northwest) offers RSS Feeds about airfare specials. Instead of giving Northwest your email address -- which they already have if you buy eTickets -- you subscribe to their feed to get notices about cheap tickets.

Hmmmmm... so instead of sending out 1 million emails once a week, Northwest could instead have 1 million people hitting their web site EVERY TEN MINUTES... and downloading the same file each time.

When Northwest finally puts a new item in the feed, then at last there is something new to download! But of course, you do not just download the one new item. Oh no. You must download the entire file, which contains one new item and twenty older items that you downloaded a hundred times already.

This doesn't sound like a winning strategy.
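The back-of-the-envelope math can be sketched as follows. It uses only the numbers from the scenario above (1 million subscribers, a ten-minute polling interval, one new item plus twenty old ones) and assumes every subscriber's reader polls around the clock; the function names are mine.

```python
SUBSCRIBERS = 1_000_000
POLL_INTERVAL_MINUTES = 10
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080


def polls_per_week(subscribers, interval_minutes):
    """Total feed requests per week if every subscriber polls non-stop."""
    return subscribers * (MINUTES_PER_WEEK // interval_minutes)


def redundant_fraction(new_items, old_items):
    """Share of a downloaded feed that the reader has already seen."""
    return old_items / (new_items + old_items)


weekly_requests = polls_per_week(SUBSCRIBERS, POLL_INTERVAL_MINUTES)
# 1,008 polls per subscriber per week, versus one email per week.
requests_per_email = weekly_requests // SUBSCRIBERS
# 20 of the 21 items in the file are repeats.
wasted = redundant_fraction(new_items=1, old_items=20)
```

Under those assumptions the server fields about a billion requests a week to replace a million emails, and roughly 95% of each changed file is data the reader already had.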

I recall when Slashdot first offered RSS feeds, we couldn't download them at Stellent. Initially, their feeds were dynamically generated. In an attempt to prevent the Slashdot effect on Slashdot, they blocked access if someone from your IP address had downloaded the feed within the previous ten minutes. Of course, if you were behind a web proxy or had Network Address Translation, all your IP addresses looked the same to Slashdot... so only a lucky few could read the feeds.

This sent Alec off on one of his anti-NAT rants, but that's another story.

Of course, these days Slashdot publishes their RSS feeds to static files... which allows the web server to check the file's timestamp for you. If your RSS reader obeys HTTP caching headers (which is a BIG if), then this performance hit isn't so bad. But when the file changes, you will still download a file that contains mostly data that you already have.
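For the curious, a well-behaved reader would do something like the conditional GET sketched below. It uses the standard If-None-Match and If-Modified-Since request headers with Python's urllib; the function names and the idea of storing the validators between polls are my own assumptions, not a description of any particular reader.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build request headers from the validators saved on the last fetch."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None, timeout=10):
    """Return (body, etag, last_modified); body is None on a 304 response."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as error:
        if error.code == 304:
            # Not Modified: nothing to download, keep the old validators.
            return None, etag, last_modified
        raise
```

The reader keeps the returned ETag and Last-Modified values and passes them back on the next poll. That saves bandwidth when nothing changed, but it does nothing about the twenty old items that come along with the one new one.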

Five: Most Content Is Not Created With RSS In Mind

Ever notice the difference between RSS feeds from newspapers, and the ones from bloggers?

Newspapers work extremely hard to make sure they use good headlines for their articles. There is simply too much information these days, and almost nobody reads an entire newspaper. They scan for articles of interest based on the headline, or at most the first paragraph.

This effort translates very well into the RSS model: quick punchy titles, quick summaries, and a link to find out more.

Most bloggers, on the other hand, have headlines like OMG! or WTF?! or possibly ROTFL! These may be fine for email, but that's only because your audience will probably read your message regardless of the title. However, if one of those were the title of an RSS article, I'd probably skip it.

I have too many feeds and emails to read in a day. If you cannot get to the point in your title, then you certainly won't get to the point in the rest of the article... so it's probably a total waste of my time.

The problem is that it's extremely difficult to write good headlines, and good first paragraphs. This is more of a social problem than a technological problem... but it significantly reduces the value of your RSS feeds. Unless your articles have descriptive titles, and coherent summaries in the first paragraph, they will not make good RSS feeds no matter what technology you use.

Six: Polling A Web Server Means Lost Messages

Face facts, folks... Syndication feeds lose data.

Let's assume you check an RSS feed once per day, and it is configured to display the latest 10 articles. Fine... one problem: today they had a whole ton of activity, and posted 20 articles in one day.

Result? You have lost half of the relevant data. Do you really want to use something like this for a mission critical application?
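If you want to see the loss for yourself, the scenario is easy to simulate. This is a toy model, not a real reader: articles are numbered in publication order, the feed shows only the newest feed_window of them, and the reader polls once after every posts_per_poll articles. All names are made up.

```python
def simulate_polling(total_posts, feed_window, posts_per_poll):
    """Return (seen, missed) article numbers for a polling reader.

    Articles published after the final poll are counted as missed,
    since the reader never gets another chance to see them here.
    """
    seen = set()
    for published in range(posts_per_poll, total_posts + 1, posts_per_poll):
        # At poll time the feed shows only the newest `feed_window` articles.
        oldest_visible = max(1, published - feed_window + 1)
        seen.update(range(oldest_visible, published + 1))
    missed = [n for n in range(1, total_posts + 1) if n not in seen]
    return sorted(seen), missed


# One poll per day, 20 articles that day, feed shows the latest 10.
seen, missed = simulate_polling(total_posts=20, feed_window=10, posts_per_poll=20)
```

Here missed comes back as articles 1 through 10. Polling twice as often (posts_per_poll=10) loses nothing in this model, but the reader has no way of knowing in advance how often is often enough.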

Asynchronous messaging is popular precisely because things like polling are not robust enough, and do not guarantee a decent quality of service. This is why people invented things like XMPP/Jabber and ActiveMQ.


Despite the buzz, I doubt that RSS will make much of an impact until Microsoft has a decent implementation built into Outlook. It will remain a toy for geeks and bloggers until it's as scalable and easy to use as email.

If the past is any indicator, Microsoft will do one of two things:

  1. Put a clunky, buggy, and feature-poor RSS reader into Outlook.
  2. Say "nuts to the specifications," and invent a proprietary extension to
    RSS, then implement that very well. It will have lots of bells and whistles
    that few people will use, except virus writers.

I'm betting on #2. There's just too much buzz about RSS for Microsoft to think it's OK to drop the ball on this one. And if anybody from Microsoft is reading this, please cram the following extensions down everybody's throat:

  1. The concept of a revision ID, as well as a feed item ID.
  2. Feed items that either are, or contain, 'child RSS feeds.'
  3. Timestamp support in 'child feeds' stating when they were last generated, so
    readers only have to check one file to see what feeds to download.
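To show what a reader could do with those three extensions, here is a purely hypothetical sketch. None of these fields exist in RSS today; FeedItem, ChildFeed, revision_id, last_generated, and both helper functions are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FeedItem:
    item_id: str      # shared by every revision of the same article
    revision_id: int  # hypothetical: increases with each revision
    title: str


@dataclass(frozen=True)
class ChildFeed:
    url: str
    last_generated: datetime  # hypothetical: when the child feed was built


def latest_revisions(items):
    """Collapse a feed to the newest revision of each item."""
    newest = {}
    for item in items:
        current = newest.get(item.item_id)
        if current is None or item.revision_id > current.revision_id:
            newest[item.item_id] = item
    return list(newest.values())


def feeds_to_download(children, last_checked):
    """Return URLs of child feeds regenerated since we last fetched them.

    `last_checked` maps a feed URL to the datetime of our previous fetch.
    """
    stale = []
    for child in children:
        checked = last_checked.get(child.url)
        if checked is None or child.last_generated > checked:
            stale.append(child.url)
    return stale
```

With something like this, three revisions of one Reuters story would show up once, and a reader could check a single parent file to decide which child feeds are worth fetching.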

After these core technological problems are fixed, RSS will be worthy of its buzz. Then we can move on to the next big problem with RSS feeds: getting bloggers to write about something interesting.


The Trouble With RSS | Bex Huff
