URLs are Backwards...

Is it just me, or are URLs totally backwards? For example, take this email address:

bob@finance.company.com

Nothing too odd... the email is going to bob, who works in finance at the company. Not many folks do email addresses like this; they might instead use bob0099@company.com. But I wrote it that way to compare it against a typical URL:

http://blog.company.com/2008/june/my-hands-are-bananas

Nothing too odd there, eh? You are going to the blog for the company, the article named my-hands-are-bananas, published in June, 2008.

What always bugged me is how they mixed up the order. A URL is supposed to be directions to find information... and directions always start off general (head east on I-94) and end up very specific (turn off the paved road and stop at the fifth pink trailer home).

But URLs totally mix up the order:

http://specific.general.very-general/very-specific/very-very-specific

Putting directions in that order makes about as much sense as these directions: turn left at reception, go to this company, go to France, then make a right.

A properly consistent URL should actually be structured like so:

http://com.company.blog/2008/june/my-hands-are-bananas
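
For the curious, here is a rough Python sketch of that rewrite. It only illustrates the idea: `reverse_host` is a made-up helper name, and no browser or DNS resolver would understand the URL it returns.

```python
from urllib.parse import urlsplit, urlunsplit

def reverse_host(url):
    """Return url with its hostname labels in general-to-specific order.

    Hypothetical helper for illustration only. It drops any user:password
    part and lowercases the hostname, because urlsplit().hostname does.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # blog.company.com -> com.company.blog
    reversed_host = ".".join(reversed(host.split(".")))
    # Keep an explicit port, if there was one
    netloc = reversed_host if parts.port is None else f"{reversed_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

print(reverse_host("http://blog.company.com/2008/june/my-hands-are-bananas"))
# http://com.company.blog/2008/june/my-hands-are-bananas
```

The path is left alone, since it already runs general-to-specific.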

Adding to the oddness... things like .com and .org are called top-level domains. Yeah... it really makes sense to call something "top" when it's actually on the "bottom."

Louis in the comments suggested that maybe this would be even better:

http://my-hands-are-bananas/june/2008/blog.company.com

That would sure make type-ahead URL matching a hell of a lot easier...

Attention internet: please change.

But if you go that far....

Why have all that bizarre http, www, and slash this and slash that stuff? It's all counter to user-centricity.

And these days doesn't everyone just browse by Google anyway?

PS - The funniest thing is that I got this when trying to comment:

The URL of your homepage is not valid. Remember that it must be fully qualified, i.e. of the form http://example.com/directory.

or rss... or bookmarks

the slash is a necessary evil. It delimits the steps in the instructions.

Also, most of the URL is the where; things like HTTP, HTTPS, or FTP are the how. It's perfectly valid to put the "how" at either the very beginning or very end of a location set... but it's long past time that web applications assumed a reasonable default.
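
A "reasonable default" could be as small as the sketch below. The helper name `with_default_scheme` and the choice of http as the fallback are my assumptions, and it only recognizes schemes written with "://", so something like a mailto: address would need extra handling.

```python
def with_default_scheme(address, default="http"):
    """Prepend a default scheme when the user typed none.

    Hypothetical helper: only the "scheme://" form is detected.
    """
    address = address.strip()
    if "://" in address:
        # The "how" was already given; leave it alone
        return address
    return f"{default}://{address}"

print(with_default_scheme("example.com/directory"))
# http://example.com/directory
print(with_default_scheme("ftp://example.com/pub"))
# ftp://example.com/pub
```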

Either that or the path is backwards

After all, why wouldn't you want to have the most meaningful information first?

my-hands-are-bananas/june/2008/blog.company.com

Shouldn't all paths be that way? That's how real-world addresses are built after all: person, house number, street, city, state, etc.

Re: Either that or the path is backwards

good point... I updated my blog to add this idea. Thanks!

Wow, I think you just blew

Wow, I think you just blew my mind a little bit. Where do I sign the petition?

I think it goes the right way...

addresses are specific to general... but directions should be general to specific.

so... is a URL an address, or is it directions? I'd argue that it started out as an address, but findability is much better if we flip it and consider a URL to be directions instead.

A new bubble

Does that mean we can get started on a new comdot bubble?

Asia versus the rest of the world

Full disclosure: I'm so white I'm almost clear.

In most Asian countries the custom is to be lastname firstname, yet here in the USA we use firstname lastname. Perhaps there's a connection here to your URL rant!

:-)

hmmm...

that is possible... in the west, names are sometimes specific-to-general, like Product Marketing Manager... but other times it's general-to-specific, like Vice President of Sales.

Likewise, addresses can go specific-to-general, as mentioned above. However, directions always go general-to-specific.

Great point. I look forward

Great point. I look forward to the revolution.

Does that mean Java got something right?

With all their com.corp.utility.db.oracle class naming? I admit, the Java naming conventions I've seen (I'm not a developer, so I don't understand all the reasons) are pretty easy to understand and I suppose they organize fairly easily too.

P.S. Sorry for the late comment--just getting through my feeds after a hiatus.

probably a little...

Java's package names do seem backwards at first, but the reality is that it's simply the correct way to describe paths and locations.
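
One practical payoff of general-to-specific names is that a plain sort groups related things together. Here is a small Python sketch of that; the host names are invented for the example, and `general_to_specific` is a hypothetical helper, not part of any library.

```python
# Made-up host names, purely for illustration
hosts = [
    "blog.company.com",
    "www.example.org",
    "shop.company.com",
    "mail.example.org",
    "api.blog.company.com",
]

def general_to_specific(host):
    """Sort key: the host's labels with the most general one first."""
    return tuple(reversed(host.split(".")))

# A plain alphabetical sort interleaves the two organizations
print(sorted(hosts))

# Sorting on the reversed labels keeps each organization together
for host in sorted(hosts, key=general_to_specific):
    print(".".join(general_to_specific(host)))
# com.company.blog
# com.company.blog.api
# com.company.shop
# org.example.mail
# org.example.www
```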

You're mixing up two different concepts here.

The host-id (and optionally, the port) is a component of the URL, and may be a fully qualified domain name, a host name on the local network or an IP address. Figuring out what machine to forward the request to is the responsibility of the transport protocol (TCP) and the lower layers of the OSI network model (e.g. IP, ARP, Ethernet, Wi-Fi and such). Network adapters, routers, switches and gateways do that stuff, and don't care about the application scheme (e.g. http or ftp), nor do they care about the url-path (e.g. /2008/06/urls-are-backwards).

The naming convention for fully-qualified domain names was invented for the convenience of DNS long before URLs starting with "http" were relevant. Local network names are determined by the network OS (e.g. Novell or Active Directory, which both follow a specific-to-general format). Unlike FQ domain names and local network names, IP addresses follow a general-to-specific format as you parse the four octets left to right. Those things are like that for lots of reasons; making URLs roll off your tongue more naturally isn't one of them.

The url-path portion, on the other hand, is the purview of the application layers. It is interpreted by the web/ftp servers, browsers, ftp clients and such. The general-to-specific convention for url-path had different motivations behind it than the FQ domain name spec, and I suspect it's really a side effect of what's easier to parse in the programming languages that early implementations were written in.

Look at all the nonsense in the world:

Right now in my time zone it's 3:02 PM. Let's see, that's general (hour), specific (minutes), then more general (AM/PM means which half-day we're in). Military time would be better (15:02, with no ambiguity about half-day), but only soldiers, sailors and nerds use it, because apparently you need special training to "get it" viscerally.

What about dates? In the US, today is 07-17-2008, but in Europe it's 17.07.2008.

Don't get me started on the inability of Americans to deal with the metric system, or the fact that 7 days in a week makes no sense, or the bizarre collection of historical anecdotes that govern the number of days in a month.

My point is that the world is full of nonsensical crap like this, and URLs aren't the most egregious.
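
To make the split between the host-id and the url-path concrete, here is a quick Python sketch using the standard library's URL parser. The sample URL, including its port 8080, is invented for the example, and `describe` is just a throwaway helper name.

```python
from urllib.parse import urlsplit

def describe(url):
    """Split a URL into the pieces that different layers care about."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,  # the "how" (application protocol)
        "host": parts.hostname,  # resolved via DNS, reached by the network layers
        "port": parts.port,      # None when the URL does not name one
        "path": parts.path,      # interpreted by the server application
    }

print(describe("http://blog.company.com:8080/2008/06/urls-are-backwards"))
# {'scheme': 'http', 'host': 'blog.company.com', 'port': 8080, 'path': '/2008/06/urls-are-backwards'}
```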

although...

as you correctly stated:

IP addresses follow a general-to-specific format as you parse the four octets left to right.

So... low-level IP addresses are general-to-specific... file systems are general-to-specific... and the paths used by high-level application protocols like FTP and HTTP are general-to-specific... so these URLs follow the general-to-specific rule perfectly:

http://12.34.56.78/top/middle/bottom.html
file:///home/nobody/htdocs/top/middle/bottom.htm
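
Here is a small Python sketch of what "general-to-specific" means for the IP address in the first URL. It assumes prefixes that line up with the octets (/8, /16, /24), which is a simplification: real allocations use CIDR prefixes of any length. `enclosing_networks` is a made-up helper name.

```python
import ipaddress

def enclosing_networks(address, prefixes=(8, 16, 24)):
    """List the networks containing address, widest first.

    Octet-aligned prefixes are assumed only to mirror reading the
    dotted quad left to right.
    """
    ip = ipaddress.ip_address(address)
    # strict=False lets ip_network mask off the host bits for us
    return [str(ipaddress.ip_network(f"{ip}/{p}", strict=False)) for p in prefixes]

print(enclosing_networks("12.34.56.78"))
# ['12.0.0.0/8', '12.34.0.0/16', '12.34.56.0/24']
```

Each octet read from the left narrows the search to a smaller network, just like each path segment narrows to a smaller directory.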

However... for whatever reason, URLs with DNS names mixed this up, going specific-to-general in some places and general-to-specific in others.

That's why I say URLs are backwards, and I think it's a pretty defensible argument... although not an important enough one to re-jigger the whole web.
