Articles about computer software, hardware, and the internet.

Oracle Open World, 2009

I'm off to Open World! I came early this year, because Oracle is doing the ACE Director briefing on Friday. That's always a bit tense for me: sneak previews on cool technology that I'm not allowed to blog about! Alas, I'll survive... It will be nice to see all the other Oracle ACEs again, like Sten, Lonneke and Chris. I already bumped into Jason Jones at the airport.

For the first time, I'm not presenting anything this year. I had planned a few talks on security and Site Studio 10gr4, but this summer was busier than normal, and I couldn't put them together in time for the deadline. Kind of a bummer, but no big deal: I'll just present them at Collaborate 2010, or the local Minnesota Stellent Users' Group.

I don't know what I'll be able to share after my briefing today, but I'll do what I can. Also, if you are heading to Open World, and you'd like to meet up, send me an email!

Site Studio Performance Tuning: Now Posted

In case you missed my talk last month... IOUG has posted the full video of my Site Studio Performance Tuning Webcast. This was an hour-long talk containing tips and tricks for making your web sites faster. Only half of it is specific to Site Studio or Oracle UCM: I also share tips on making general HTML pages faster, which should apply no matter what kind of system you use.

As usual... my presentation is available for download from Slideshare, if you'd like a copy... Although this one lacks the panache of the video version.

PS: sorry that it's in WMV format... I had no control over that...

The Deep, Dark, Secret Origin Of Oracle UCM's Security Model

On a recent blog post about Oracle UCM -- Should Oracle Be On Your Web Content Management Short List? -- CMS Watch analyst Kas Thomas commented that he thought Oracle's security model was a bit spooky. He admitted that this may be because he didn't know enough about it: his concern stemmed from an overly stern warning in Oracle's documentation.

Alan Baer from Oracle soothed his fears and said that the documentation needed a bit of work... The documentation mentioned that changing the security model might cause data loss, which is in no way true. It should say that changing the security model might cause the perception of data loss, when in fact the repository is perfectly fine... the problem is that when you make some kinds of changes to the security model, you'll need to update the security settings of all your users so they can access their content.

Nevertheless, I thought it might be a good idea to explain why Oracle UCM's security model is how it is...

Back in the mid 1990s when UCM was first designed, it had a very basic security model. It was the first web-based content management system, so we were initially happy just to get stuff online! But immediately after that first milestone, the team had to make a tough decision on how to design the security model. We needed to get it right, because we would probably be stuck with it for a long time.

  1. Should it be a clone of other content management systems, which had access-control lists?
  2. Should it be a clone of the unix file permissions, with directory and file based ownership?
  3. Or, should it be something completely different?

As with many things, the dev team went with door number 3...

Unix file permissions were simply not flexible enough to manage documents that were "owned" by multiple people and teams. The directory model was compelling, but we needed something more.

Access Control Lists (ACLs) are certainly powerful and flexible, because you store who (Bob, Joe) gets what rights (read, delete) to which documents. The ACLs are set by the content contributors when they submit content. However, ACLs are horribly slow and impossible to administer. For example, I as an administrator have very little control over how you as a user set up your access control lists. Let's say some kinds of content are so important that I want Bob to always have access, but Joe never gets access. If Bob gets to set the ACLs on check-in, then there's a risk he gives Joe access. It's tough to solve this problem in any real way without a bazillion rules and exceptions that are impossible to maintain or audit.

Instead, the team decided to design their security model with seven primary parts:

  • SECURITY GROUPS are like a classification of a piece of content. Think: restricted, classified, secret, top secret, etc. As Jay mentioned in the comments, these are groups of content items, and not groups of users.
  • ACCOUNTS are like the directory location of where a content item resides in a security hierarchy. Think: HR, R&D, London offices, London HR, etc. These are typically department-oriented, but it's also easy to make cross-departmental task-specific accounts for special projects.
  • DOCUMENTS are required to have one and only one security group. Accounts are optional. This information is stored with the metadata of the document (title, author, creation date, etc.) in the database.
  • PERMISSIONS are rules about what kind of access is available to a document. You could have read-access-to-Top-Secret-documents, or delete-access-to-HR-documents. If the document is in an account, then the user's access is the intersection of account and group permissions. For example, if you had read access to the Top Secret group, and read access to only the HR account, you'd be able to read Top-Secret-HR content. However, you would not see Top-Secret-R&D content.
  • ROLES are collections of security group permissions, so that they are easier to administer. For example, a contributor role would probably have read and write access to certain kinds of documents, whereas the admin role would have total control over all documents. Change the role, and you change the rights of all users with that role.
  • USERS are given roles, which grants them different kinds of access to different kinds of documents. They can also be granted account access.
  • SERVICES are defined with a hard-coded access level. So a "search" service would require "read" access to a document, otherwise it won't be displayed to the user. A "contribution" service would require that the user have "write" access to the specific group and account, otherwise you will get an access denied error.
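To make the intersection rule concrete, here is a minimal sketch of how such a check could work. None of these names come from Oracle UCM's actual API: the bitmask values, the `roles` table, the user object, and the prefix-based account hierarchy are all assumptions made up for illustration.

```javascript
// Hypothetical rights bitmask; not taken from the real product.
const PERMS = { read: 1, write: 2, delete: 4, admin: 8 };

// Roles map security groups to a bitmask of rights (illustrative data).
const roles = {
  contributor: { Public: PERMS.read | PERMS.write, TopSecret: PERMS.read },
  guest: { Public: PERMS.read },
};

// Union of the rights granted by every role the user holds for this group.
function groupRights(user, group) {
  return user.roles.reduce(
    (bits, name) => bits | ((roles[name] || {})[group] || 0),
    0
  );
}

// Rights the user holds on the document's account.
// Assumption: accounts nest by path prefix, so "London" covers "London/HR".
function accountRights(user, account) {
  if (!account) return ~0; // accounts are optional: no account, no restriction
  let bits = 0;
  for (const [prefix, rights] of Object.entries(user.accounts)) {
    if (account === prefix || account.startsWith(prefix + "/")) bits |= rights;
  }
  return bits;
}

// Effective access is the intersection of group rights and account rights.
function hasAccess(user, doc, needed) {
  const effective =
    groupRights(user, doc.securityGroup) & accountRights(user, doc.account);
  return (effective & needed) === needed;
}

// The example from the list: read on Top Secret, read on the HR account only.
const bob = { roles: ["contributor"], accounts: { HR: PERMS.read } };
console.log(hasAccess(bob, { securityGroup: "TopSecret", account: "HR" }, PERMS.read));
console.log(hasAccess(bob, { securityGroup: "TopSecret", account: "R&D" }, PERMS.read));
```

The first check passes and the second fails, which matches the Top-Secret-HR versus Top-Secret-R&D example. A service with a hard-coded access level would simply call something like `hasAccess` with its required right.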

This kind of security model has many advantages... firstly, it is easy to maintain. Just give a user a collection of roles, and say what department they are in, and then they should have access to all the content needed to do their job. It works very well with how LDAP and Active Directory grant "permissions" to users. That's why it is usually a minimal amount of effort to integrate Oracle UCM with existing identity management solutions.

Secondly, this model scales very well. It is very, very fast to determine if a user has rights to perform a specific action, even if you need to do a security check on thousands of content items. For example, when somebody searches for "documents with 'foo' in the title," all the content server needs to do is append a security clause to the query. For a "guest" user, the query becomes "documents with 'foo' in the title AND in the security group 'Public'." Simple, scalable, and fast.
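As a rough illustration of that "append a security clause" trick, here is a sketch that uses a made-up SQL-like query syntax. The field name `securityGroup`, the `CONTAINS` operator, and both function names are assumptions, not the content server's real query language.

```javascript
// Build a clause limiting results to the security groups the user may read.
// Hypothetical syntax; group names are quoted with doubled single quotes.
function securityClause(readableGroups) {
  if (readableGroups.length === 0) return "1 = 0"; // no groups: match nothing
  const quoted = readableGroups.map((g) => "'" + g.replace(/'/g, "''") + "'");
  return "securityGroup IN (" + quoted.join(", ") + ")";
}

// Wrap the user's query and append the security clause to it.
function secureQuery(userQuery, readableGroups) {
  return "(" + userQuery + ") AND " + securityClause(readableGroups);
}

// A guest who may only read the 'Public' group:
console.log(secureQuery("title CONTAINS 'foo'", ["Public"]));
// prints: (title CONTAINS 'foo') AND securityGroup IN ('Public')
```

The point is that the security check costs one extra clause per query, rather than one lookup per content item in the result set.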

There are, of course, dozens of ways to enhance this model with add-on components... The optional "Collaboration Server" add-on includes ACLs, along with the obligatory documentation on how ACLs don't scale as well as the standard security model... The optional "Need To Know" component opens up the security a bit to let people see some parts of a content item, but not all. For example, they could see the title and date of the "Hydrogen Bomb Blueprints" document, but they would not be able to download the document. The "Records Management" component adds a whole bunch of new permissions, such as "create record" and "freeze record." I've written some even weirder customizations before... they aren't much effort, and are very solid.

I asked Sam White if he could do it all over again, would he do it the same? For the most part, he said yes. Although he'd probably change the terminology a bit -- "classification" instead of "role," "directory" instead of "account." In other words, he'd make it follow the LDAP terminology and conventions as closely as possible... so it would be even easier to administer.

I do think it is a testament to the skills of the UCM team that the security model so closely mirrors how LDAP security is organized... considering LDAP was designed over many years by an international team of highly experienced security nerds. I'm also happy when it gets the "thumbs-up" from very smart, very paranoid, federal government agencies...

Enterprise 2.0: Ignore the Fads, Follow the Trends

A few years back, Andrew McAfee "coined" the term "Enterprise 2.0." Recently, he's been criticized on the web here, here, and here, for his definition... Critics are saying his definition is outdated, unhelpful, and flawed. Some of this criticism is a tad harsh, but a lot of it is valid. McAfee responded by re-stating what E2.0 is:

Enterprise 2.0 is the use of emergent social software platforms within companies, or between companies and their partners or customers.

Kind of light on the details, eh? He continued to define related terms like "social software", "platforms", "emergent", and "free-form"... which fleshed out the definition a bit... but still, I'm left with a big question. How is any of this actually helpful??? It doesn't mention technologies... it doesn't mention purpose... it doesn't mention value. Based on this definition alone, there's not really a compelling reason for anybody to get excited about it. Luckily, because of the Web 2.0 Kool-Aid, anything with a 2.0 after it will generate buzz, so people latched on.

Let's contrast this with the definition of ECM by AIIM:

Enterprise Content Management (ECM) is the strategies, methods and tools used to capture, manage, store, preserve, and deliver content and documents related to organizational processes. ECM tools and strategies allow the management of an organization's unstructured information, wherever that information exists.

It's not perfect, but it should be pretty dang clear to any businessperson what problems ECM solves, and what everyday tasks will be easier if it is done right. It also makes it obvious that it's about strategies and methods; not just tools and technologies.

I frequently lament that anybody is trying to define what Enterprise 2.0 is, before we even know what it is. The 2.0 clearly means that it is intended for the "next generation" of enterprise software... but what is the next generation of enterprise software? If it's nothing more than enterprise social software -- which is what McAfee says -- then why on earth do we also need the term "Enterprise 2.0"? If it's just blogs, wikis, and next generation collaborating tools, then we already have a term: Web 2.0. In either case, the phrase "Enterprise 2.0" is useless.

Now, if Enterprise 2.0 is truly meant to define the "next generation" of enterprise software tools, then the term will one day become useful. However, since these tools are still being envisioned and designed as we speak, a definition is still fairly useless... since we don't know what Enterprise 2.0 is yet!

If anything, the definition of "Enterprise 2.0" should reflect the trends in enterprise software, not just the fads. Ignore blogs and wikis. Shun social software. Instead, take a good, hard look at the broad trends that will have a major effect over the next 10 years. Here is a small sample:

  • The never-ending increase in computer power: storage, network bandwidth, processor speed, and cloud computing... there will soon be another tipping point like there was in the early 1990s.
  • Retiring baby boomers, who are taking a lot of institutional knowledge with them en masse.
  • The millennials, who have never known a world without the internet, and who are natives to online collaboration.
  • Globalization: more competition means you need better tools to test out innovations. Companies need to fail faster, and learn better if they are to survive.

What do all these trends mean for Enterprise 2.0 software? It's hard to say for sure... but what is clear is that more and more of the most important data and software will emerge on the "edge" of your networks. Why have a central repository at all when the average laptops are powerful enough to run their own content management systems? The average user now has tremendous power to create content, and run easy-to-install collaboration tools. The genie is out of the bottle my friends... all we can do now is try to control the damage. Identity management, enterprise search, and distributed information management can help with security and content... but for the application proliferation problem, I'd bet on enterprise mashups.

As the baby boomers retire, you can forget the idea of teaching them new software so they can share their knowledge. No way, no how, ain't going to happen. Instead, you need a new system for capturing "people" knowledge as effortlessly as possible. My idea is to just rip-off Robert Scoble. He made a name for himself with nothing more sophisticated than a camcorder and some editing software. You want knowledge from technophobes? Why not engage them in one-on-one taped interviews? Low tech people-oriented solutions are frequently the best option for capturing content and context, although you will need something like an enterprise YouTube for consumption.

As the millennials enter the work force -- what some people call the "gamer generation" -- what will their needs be? The obvious solution is that they want something like Facebook for the enterprise. News flash: there already is Facebook for the enterprise... it's called Facebook. More compelling is the idea that employee management and business process management will evolve into enterprise simulation software. Something like "SimCity Enterprise Version". Software like this will need to be seeded with a ton of historical data, information about your processes and employees, and information about the current market. Then, you can run a simulation on the "what if" scenarios in a world of interdependent agents. This may seem far-fetched, but there is a lot of software out there right now that solves one specific piece of this puzzle... it's just that nobody has put all the pieces together yet.

We don't need another word for Enterprise Social Software... nor do we need to ride the coattails of Web 2.0 to sell the same old application with a Wiki bolted on. However, we do need to be aware that the enterprise will change a lot in the next 10 years: and not because of fads, but because of trends.

Web Form Tip: Add Excel-Like Calculation To Input Fields

When I'm filling out web forms -- especially ones with financial data -- I find myself frequently missing the ability to use Excel-like math syntax. For example, you could type this into an Excel field:

= 111 + 222

And the moment you moved to another cell, it would calculate the answer, and place 333 in the cell for you. This is extremely handy, but alas, very few web sites allow this feature. So much power in web browsers, and yet this little touch of usability is relatively absent. It reminds me of the scene from Futurama, where Philip J. Fry and Bender the surly robot are trying to hammer out their monthly budget:

    BENDER: Now to figure out how much money I'm raking in off those twerps!  
                (Scribbles out some numbers with a pencil and paper) 
                Awwwwww, I need a calculator.
    FRY:    You are a calculator!
    BENDER: I mean a good calculator.

In an effort to help make web sites more like "good" calculators, I'd suggest adding some simple JavaScript to turn any number field into an Excel-calculator field. It's pretty simple, really... just capture the "onBlur" and "onKeyPress" events. If the user moves to a new field or hits "return", the code evaluates if the cell begins with a "=" character. You can try it out in the fields below... The relevant source code follows.

<script>
	// create the always useful 'trim' function
	function trim(str) {
		return str.replace(/^\s+|\s+$/g, "");
	}

	// check that the input is plain arithmetic before it reaches 'eval':
	// an '=' followed only by digits, whitespace, and + - * / % . ( )
	function isMathProblem(str) {
		return /^=[\d\s+\-*\/%.()]+$/.test(str);
	}

	// check to see if the 'return' key is pressed
	function isReturnKeyPressed(e) {
		// older browsers report the key in 'keyCode' or 'which'
		var characterCode = e.keyCode || e.which || 0;
		return characterCode == 13;
	}

	// evaluate the math, if it starts with a '=', ignore errors
	function doExcelMath(field, e) {
		// called without an event on blur, and with one on key press
		if (typeof e != "undefined" && !isReturnKeyPressed(e))
			return;

		var val = trim(field.value);
		if (val.charAt(0) == '=' && isMathProblem(val)) {
			val = val.substring(1);
			try {
				val = eval(val);
				field.value = val;
				field.focus();
				field.select();
			}
			catch (ignore) {
				// leave the field untouched if the expression is malformed
			}
		}
	}

</script>
<form name="excel-webform" method="get" action="#">
<b>Value 1</b> <input type="text" name="value1" onBlur="doExcelMath(this)" onKeyPress="doExcelMath(this, event)"><br />
<b>Value 2</b> <input type="text" name="value2" onBlur="doExcelMath(this)" onKeyPress="doExcelMath(this, event)"><br />
<b>Value 3</b> <input type="text" name="value3" onBlur="doExcelMath(this)" onKeyPress="doExcelMath(this, event)"><br />
</form>

If this doesn't catch on, I might have to make a Greasemonkey script instead to cram this code into every web site I use...

NOTE: it can be risky to allow a user to call the JavaScript 'eval' function on arbitrary input data. They could accidentally munge up their page, or insert cookies into their browser. In most cases this will not lead to a successful cross-site-scripting attack... but depending on what you keep in your cookies, you should run some penetration tests to make sure nothing bad can happen.
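If the 'eval' risk bothers you, one alternative is to skip 'eval' entirely and parse the arithmetic yourself. The following is only a sketch of that idea, not what the form above uses: `evaluateMath` is a name I made up, and it handles just numbers, parentheses, unary signs, and the four basic operators.

```javascript
// Evaluate + - * / ( ) arithmetic with a small recursive-descent parser.
// Anything that is not a number, operator, or parenthesis throws an error.
function evaluateMath(expr) {
  // Split into numbers, operators, parentheses; any other character
  // becomes its own token so the parser can reject it.
  const tokens = expr.match(/\d+\.?\d*|\.\d+|[-+*\/()]|\S/g) || [];
  let pos = 0;

  // expression := term (('+' | '-') term)*
  function parseExpression() {
    let value = parseTerm();
    while (tokens[pos] === "+" || tokens[pos] === "-") {
      const op = tokens[pos++];
      const rhs = parseTerm();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }

  // term := factor (('*' | '/') factor)*
  function parseTerm() {
    let value = parseFactor();
    while (tokens[pos] === "*" || tokens[pos] === "/") {
      const op = tokens[pos++];
      const rhs = parseFactor();
      value = op === "*" ? value * rhs : value / rhs;
    }
    return value;
  }

  // factor := number | '(' expression ')' | ('-' | '+') factor
  function parseFactor() {
    const tok = tokens[pos++];
    if (tok === "-") return -parseFactor();
    if (tok === "+") return parseFactor();
    if (tok === "(") {
      const value = parseExpression();
      if (tokens[pos++] !== ")") throw new SyntaxError("expected ')'");
      return value;
    }
    if (/^(\d+\.?\d*|\.\d+)$/.test(tok)) return parseFloat(tok);
    throw new SyntaxError("unexpected token: " + tok);
  }

  const result = parseExpression();
  if (pos < tokens.length) {
    throw new SyntaxError("unexpected token: " + tokens[pos]);
  }
  return result;
}

console.log(evaluateMath("111 + 222"));       // 333
console.log(evaluateMath("2 * (3 + 4) - 1")); // 13
```

To try it in the form, you would call `evaluateMath(val)` where `doExcelMath` currently calls `eval(val)`. The existing try/catch already swallows the errors thrown for input like `alert(1)`.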

Webcast: Site Studio Performance Tuning

UPDATE: My presentation on Site Studio Performance Tuning is now posted online.

The webcast will be hosted by the Independent Oracle Users Group (IOUG). I'm going to be talking about general web site performance challenges, some of which will probably surprise you. I'll also cover what kinds of hardware and software will speed things up, as well as little-known Site Studio features that you can take advantage of.

If you can't make it, don't worry! We'll be posting it on the IOUG archived webcasts page as well.

Quote of the Day

"... the problem with object-oriented languages is they’ve got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding a banana AND the entire jungle." -- Joe Armstrong

Ahhh... them Lisp bigots never change! Until they turn into Erlang bigots... but the guy has a valid point.

Folks who write Ruby and Python based web frameworks kind of understand the "banana problem" and work very hard to overcome it. They use principles like late binding, code injection, Don't Repeat Yourself, etc... But a lot of Java frameworks for web sites seem to not care about the banana problem at all... which leads to these kinds of layers:

  1. Database table
  2. XML Schema definition of table
  3. Auto generated Java objects with attributes matching table columns
  4. EJB Business Object Interchange Layer
  5. EJB Business objects
  6. GUI Interchange Layer
  7. HTML Scripting Form
  8. Finished HTML Page

That's pretty much like saying... "Well, as long as we dragged the entire Jungle along with us, we might as well make you swing from every tree before we give you a banana!"

Oh, and watch out for the Gorilla...

Inventor of Oracle UCM Named "Oracle Innovator"

This is nice to see... my former boss Sam White was named an "Oracle Innovator:"

That's pretty cool... only 23 people in all of Oracle are "innovators," which is pretty impressive considering the company probably has 100,000 employees. That's like the top 0.02%!

I personally thought this product was innovative because of how Sam took a very holistic view of the problem. Way back in the 1990s he saw how people were misusing Java as an "applet" platform... when it was really excellent at being a server. He invented this component plug-in architecture that was way ahead of its time. Only in the past few years have people recognized this model and labeled it as the "Inversion of Control" pattern... but few still understand its power. Not to mention that the entire system was based on web services, before there was even a word for it!

Sam also eschewed "object oriented programming" in favor of "data-driven programming," which was also against the grain. People sometimes look at me sideways when I bash object oriented philosophy... but once you work a lot with "pure" objects you see what a maintenance mess it can be. In addition, Alan Kay -- the inventor of object oriented programming -- also agrees that the current philosophy is not what he intended and easily leads people into the weeds.

Congratulations, Sam!

Blog, or Blorphan?

Blogs, Wikis, and other Web 2.0 goodies have an important place in a broader enterprise content management strategy... but some caution is advised. As I mentioned in last year's talk "Enterprise 2.0: How You Will Fail," I think it might be more important to focus on the practical realities.

The free-flow of information is great and all, but does it translate into actual productivity? Or are you just creating faddish tools that will eventually be abandoned by users, after the novelty wears off?

Let's take blogs for example... Technorati's State Of The Blogosphere 2008 report claims that there are 133 million blogs in the world... Sounds great so far... but only 7.4 million (5.6%) of these blogs posted an article in the last 4 months! A mere 1.5 million (1.1%) posted an article within the last week, and about 900,000 posted in the last day (0.68%).

If these numbers are reflective of what you would find in a corporate blogging initiative, the outlook is fairly bleak. Assume you have a large push to get your employees blogging, and you succeed in getting 1000 bloggers in your company. If these statistics hold, that means that only 6 blogs out of 1000 will have useful, up-to-date information! Another 50 may have useful information, but it could be up to 4 months old... and possibly stale.
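For anyone who wants to check the arithmetic, here is a quick sketch that scales the Technorati figures quoted above down to a company of 1000 bloggers. The variable and function names are mine, and it assumes corporate bloggers behave like the blogosphere at large.

```javascript
// Figures as quoted from the State Of The Blogosphere 2008 report.
const totalBlogs = 133e6;
const activeBlogs = { last4Months: 7.4e6, lastWeek: 1.5e6, lastDay: 0.9e6 };

// Expected number of active blogs if a company's bloggers post at the
// same rates as the blogosphere as a whole.
function expectedActive(bloggers, activeCount) {
  return (bloggers * activeCount) / totalBlogs;
}

for (const [period, count] of Object.entries(activeBlogs)) {
  console.log(period, expectedActive(1000, count).toFixed(1));
}
```

The daily figure comes out between 6 and 7 blogs, and the four-month figure at roughly 56, so about 50 more blogs than the daily posters.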

The rest of them could very well languish as "blorphans." One or two posts initially... but then only updated when the author is bored. In general, these posts will be tiny gems of knowledge strewn about your enterprise; usually outdated, and frequently without context.

So much for using blogs to measure the "pulse" of your company!

If you start a corporate blogging initiative, please do not attempt it without a strategy for giving people the tools and encouragement they need to keep going:

  • Have lots of helpful info on how to blog, perhaps coupled with a training program.
  • Give incentives for blogging... not monetary, but have public rankings of hot topics, hot bloggers, most linked content, most forwarded content, and the like.
  • Have a "blog for blogs," where people can exchange tips on blogging, and teach each other on the benefits of blogging.
  • Have a "president's club" for bloggers, elected by their peers, for bloggers that genuinely helped them. This could be for the best tips and tricks, best breaking news, or the best analysis.
  • Use blogging tools that are easy to use, and which allow people to track their popularity, and how people tag their blog.
  • Rate improvement in blogging skills on yearly employee review forms, and be sure to give them time to blog.

Most people agree that public blogs help companies by making them more "transparent." Even if customers love your products, they will always have the fear that you might "go away" and not be able to help them in the future. Blogs from real people with real passion can help your customers feel more connected to the "pulse" of your company... even if that "pulse" is filled with stale information.

However... for internal people, the best way to keep everybody up-to-date is likely a more formal knowledge sharing process. Or you can just stick to rumors and innuendo, since company rumors are 80% correct anyway...

New Site: samplecode.oracle.com

For a long time, folks have been asking for a SourceForge-like site for Oracle consultants where they could share free code snippets. I've been trying to get one of these going for a while... but I knew it would be nothing without Oracle branding and an internal push. Well, Oracle recently announced this site:

You need to be either an Oracle employee, or an OTN member in order to use it. It's backed with Subversion (yay!), so you'll need that to contribute. The number of projects is still fairly small at the moment... and it doesn't have a category for Oracle UCM yet. However, once it does, I'm sure we could get the number of projects there up to 20 or so ;-)

A Modest Proposal For Bug Bounty Bonuses

Every software manager wants to reduce the number of bugs in their code... but debugging code is a painful process, and developers don't enjoy it. So what is a software manager to do?

A common mistake is to put a "bounty" on bugs... say, $10 for every bug found. This is a rookie mistake, because it won't take long before developers intentionally insert bugs so they can be found, fixed, and collect their bounty. Other managers tried to game the system differently... such as only giving the bonus to testers. But, this leads to back-channel markets where developers tip off the testers, and get a kickback.

In general, development managers use non-monetary encouragement to get their team to fix bugs. This can be public shame -- who broke the build, who has the most bugs in their code, etc. -- or it can be public praise -- who has the fewest bugs, who fixed the most bugs, etc.

But part of me thinks that there has to be a way to make a bug bounty game that is less prone to abuse... so I came up with this:

  1. Have a code review about once per quarter; set aside two weeks or so. Do not tell the developers when this code review will happen.
  2. Set aside a fixed bonus that each member of the team will get that quarter, say $1,000.
  3. During code review, you get $10 for each severe bug you find in somebody else's code... add this to the bonus.
  4. If somebody finds a bug in your code, you lose $20 from your bonus.
  5. If you fix a bug in your own code, you only lose $10.
  6. This process will continue until the median bonus on your team hits some kind of trigger value, say $900.
  7. After the trigger value is hit, there would be some kind of minimum bonus that you would give. If some new code had a ton of bugs in it, then the developers would get the minimum... say $800.
  8. After the code review is complete, everybody gets their bonus, along with a chart showing who got what for a bonus.

The idea here is to inspire some kind of healthy competition to fix other people's bugs... but also to minimize the number of bugs in your code, and to gain familiarity with the code of others. Naturally, developers will always come up with some kind of strategy to game the system... so let's look at how a dev team of 10 might behave:

One strategy is everybody does the minimum... they each fix 10 of their own bugs, the median drops to $900, and everybody gets a nice bonus. This is fine with management, because 100 bugs got fixed.

Another strategy is not to play at all... Well, then the bonuses never fall below the $900 trigger median, so nobody gets a bonus. The cash rolls into next quarter's code review.

But... what if one person tries a different strategy? Say, everybody is lazy and does nothing. So, the one bug fixer will fix 6 bugs in each of the other 9 developers' code. That's an extra $540 for him, and it brings the median down to $880, just under the $900 trigger. He gets $1540, everybody else gets $880... and everybody knows this. Hopefully, this will encourage a good mix of lazy and energetic bug fixers... which beats the alternative.
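If you want to play with the numbers, here is a rough sketch of the bonus math for these first scenarios. The dollar values come from the rules above, but the function names and record fields are invented for illustration. It also simplifies rule 7 by always flooring a bonus at the $800 minimum.

```javascript
// Rule values from the proposal; all names here are hypothetical.
const BASE = 1000;      // fixed quarterly bonus (rule 2)
const FOUND = 10;       // gained per bug found in somebody else's code (rule 3)
const CAUGHT = 20;      // lost per bug somebody finds in your code (rule 4)
const SELF_FIXED = 10;  // lost per bug you fix in your own code (rule 5)
const MINIMUM = 800;    // minimum bonus (rule 7, simplified to a floor)

function bonus(dev) {
  const raw =
    BASE +
    FOUND * dev.foundInOthers -
    CAUGHT * dev.caughtByOthers -
    SELF_FIXED * dev.selfFixed;
  return Math.max(raw, MINIMUM);
}

function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Scenario 1: all 10 developers fix 10 of their own bugs.
const everyoneDoesMinimum = Array.from({ length: 10 }, () => ({
  foundInOthers: 0, caughtByOthers: 0, selfFixed: 10,
}));

// Scenario 2: one developer finds 6 bugs in each of the other 9 developers' code.
const loneFixer = [
  { foundInOthers: 54, caughtByOthers: 0, selfFixed: 0 },
  ...Array.from({ length: 9 }, () => ({
    foundInOthers: 0, caughtByOthers: 6, selfFixed: 0,
  })),
];

console.log(median(everyoneDoesMinimum.map(bonus))); // 900
console.log(loneFixer.map(bonus)[0]);                // 1540
console.log(median(loneFixer.map(bonus)));           // 880
```

Under rule 4 as written, the nine lazy developers in the second scenario each lose $120 and land at $880, which also puts the median at $880.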

Will this discourage people from writing tricky code before a code review? For example, let's say somebody is working very hard on some Big New Feature, and it won't be complete for another month after the code review. Everybody gangs up on him to find bugs, since that part of the code is rife with them. They each find 10 bugs, dragging his bonus down by $1800, to negative $800!!! Of course, this triggers the median being below $900, so everybody gets a bonus. The poor developer of the Big New Feature gets the minimum $800 bonus, and everybody else gets $1100.

That's a slight disincentive... but you can fix it with having the code review at a random time in the quarter so nobody can predict it. This will discourage people from writing code modules that will take 3 months to complete... but I see that as a positive side effect.

What if two developers conspire; one makes the bugs, the other finds them? One of the commenters brought this one up. I could make bugs that are hard to find/fix, then I tell you how to do it. You find 50 bugs in my code, and drive my bonus to $0. You get an extra $500, and the median falls below $900 for the team... Total for you is $1500, total for me is the $800 minimum. We split the difference, and walk away $350 richer. That would work once... maybe twice... but then you'd have to explain to your boss in your yearly review why your code is so loaded with ghastly, obscure bugs! In which case, it's highly doubtful you'd get a nice raise. If you want to make $350 once, and sacrifice a $3500 raise a year from now, be my guest!!!

There is a slight risk that developers would hoard bugs. Say, they would find a nice juicy bug in somebody else's code, but not tell anybody until the next code review. No problem... as long as the bug gets fixed. That's why you want these quarterly. However, there isn't much advantage to hiding bugs in your own code... but there is a slight advantage to sneaking in a bug into somebody else's code. But, as long as you have source control, it's easy to track down who made what code addition... so unless you have team members hacking into each other's accounts, this really isn't an issue.

Finally, some developers might find it demeaning. If the same people keep getting the minimum bonus, that might affect their morale and productivity. However, getting the minimum is a sign that this person probably needs help. A senior developer should mentor the poor newbies, show them how to write bug-free code, and help them find bugs in other people's code. Maybe even help them track down 5 bugs in other people's code, and hang on to them until the next code review. This is where bug hoarding can play a positive role for morale... And besides, if the developer consistently gets the minimum bonus even with mentoring, then you might want to reconsider whether this person is a positive addition to the team...

So what do you think? Is this a bug bounty bonus game that might actually work???

The W3C Kills XHTML in Favor of HTML 5!

File this one under it's about frigging time!

The W3C has announced that it is dropping support for the XHTML 2 working group, in favor of HTML 5. Now, I'm a big fan of HTML5, but I don't see this as necessarily good news...

In the past, the W3C tried to dump HTML entirely in favor of XHTML... Here's an idea! You know all those web sites you just made? Well get ready to do it all over again... because with new and improved XHTML your rewritten pages will look and act exactly the same! Obviously, XHTML failed utterly to gain in popularity. Probably because the W3C is just plain awful at making specifications that are actually useful...

Not to be left out, the HTML faithful decided to create the Web Hypertext Application Technology Working Group (WHATWG) to try to continue development... they were upset that HTML was neglected and stuck on version 4.01 since 1999! This group included folks from Apple, Mozilla, and Opera, with some help from Google. That working group came up with some great ideas and actual functioning technology, so the W3C finally caved and made a working group for HTML 5 back in 2007... Fast forward 2 years, and the W3C finally realizes that XHTML 2 was going nowhere, so it was time to ditch it in favor of HTML 5.

Now... part of me feels this is bad... because now that they can no longer ruin XHTML, the W3C can just focus all of its energies on ruining HTML. I'm also irked by the justifications the W3C put forward about dropping XHTML:

HTML and XHTML 2 working groups were formed by W3C in March 2007. "Basically, two years ago we chartered two working groups to work on similar things, and that created confusion in the marketplace," said Ian Jacobs, W3C representative.

Yeah, right... they were doing "similar things." Not even close, Ian. Allow me to translate: the W3C doesn't have a clue what people need the web to do, so we're going to allow Apple and Mozilla to figure it out for us, then we'll claim credit!

Classy...

Let's hope the good folks at WHATWG can see through this garbage, and don't let the W3C ruin HTML 5.

Email Patterns Can Predict The Health Of Your Company

As I mentioned previously and in my latest book, data mining your corporate email can yield some pretty interesting information... even if you don't read the contents. My angle is that by analyzing who emails whom and when, you can get a sense of who is "friends" with whom... and by doing so you can hit the ground running with any Enterprise 2.0 social software initiatives.

One nugget that I never thought of was how the emergence of email "cliques" can determine whether or not your company is in serious trouble... Two researchers -- Ben Collingsworth and Ronaldo Menezes -- recently analyzed the email patterns at Enron to see if there were any predictors of the impending doom. Initially, they thought they would find interesting changes immediately prior to a large crisis... However, what they found was that the biggest change in email patterns happened one full month prior to the crisis!

For example, the number of active email cliques, defined as groups in which every member has had direct email contact with every other member, jumped from 100 to almost 800 around a month before the December 2001 collapse. Messages were also increasingly exchanged within these groups and not shared with other employees... Menezes thinks he and Collingsworth may have identified a characteristic change that occurs as stress builds within a company: employees start talking directly to people they feel comfortable with, and stop sharing information more widely [prior to a crisis].
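To make that clique definition concrete, here is a minimal Python sketch of how you might count them from nothing but a sender/recipient log. It is not the researchers' method: I'm assuming that a single message in either direction counts as "direct contact," that only maximal groups are counted, and that a clique needs at least three members. The function names and the sample data are made up for illustration.

```python
def build_contact_graph(messages):
    """Map each person to the set of people they have exchanged email with.

    `messages` is an iterable of (sender, recipient) pairs. Direction is
    ignored: one message either way counts as contact (an assumption).
    """
    graph = {}
    for sender, recipient in messages:
        if sender == recipient:
            continue  # mail to yourself says nothing about a group
        graph.setdefault(sender, set()).add(recipient)
        graph.setdefault(recipient, set()).add(sender)
    return graph


def maximal_cliques(graph):
    """Return every maximal clique, using the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        # Nothing left to add and nothing skipped: `current` is maximal.
        if not candidates and not excluded:
            cliques.append(set(current))
            return
        for person in list(candidates):
            neighbors = graph[person]
            expand(current | {person},
                   candidates & neighbors,
                   excluded & neighbors)
            candidates = candidates - {person}
            excluded = excluded | {person}

    expand(set(), set(graph), set())
    return cliques


def count_cliques(messages, min_size=3):
    """Count maximal cliques with at least `min_size` members (threshold assumed)."""
    graph = build_contact_graph(messages)
    return sum(1 for clique in maximal_cliques(graph) if len(clique) >= min_size)


# Hypothetical log: ann, bob, and cat all email each other; dan only emails cat.
sample_log = [("ann", "bob"), ("bob", "cat"), ("ann", "cat"), ("cat", "dan")]
print(count_cliques(sample_log))
```

Run something like this over one month of mail headers at a time and you would get the kind of trend line the researchers describe, without ever opening a message body. On a real corporate mail log the naive recursion could get slow, so treat it as a toy.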

Interesting stuff... although this is only one data point. The increase of "active email cliques" is probably a good indicator of the amount of stress and negative rumors in your company, or in a specific division. However, as an actual predictor, it might not work so well. It will be difficult to know for sure, because it's really difficult for researchers to get access to random corporate emails.

Also, if you institute any kind of email data mining system, people will alter their behavior. These email cliques will simply go offline if they think that big brother is watching... they will probably leave some kind of a trail, but it will be more subtle, and lead to lots of false positives.

Ultimately, as a manager you're probably better off just talking with your employees to see if they are demoralized... because spying on them might only make matters worse.

(Hat Tip: Nat Torkington)

HTML 5 Versus Flash/Flex

There's been some chatter lately about how HTML 5, the next version of HTML, might make Flash irrelevant. And not only Flash, but also Adobe Flex, Microsoft Silverlight, and Oracle JavaFX might similarly become useless.

This is because the latest version of HTML has a lot of features that were previously confined to advanced animation plug-ins... the three I like the most are:

  • The <audio> and <video> elements, which allow for embedding rich media directly into the browser; the #1 use case of Flash.
  • The <canvas> element, which allows for images and vector-graphics to be directly rendered with JavaScript, which allows simple animations; the #2 use case of Flash.
  • Offline data storage so your users can keep a 5 MB database offline, manipulate data, and re-sync the data later; an uncommon use case, but vital for rich internet applications that you can use on an airplane.

These features have been necessary for a long time... and even though HTML 5 is not yet a finished standard, most of it is already supported in major browsers: Firefox 3, Internet Explorer 8, and Safari 4. This means that you can create an HTML 5 application right now! Probably the most famous HTML 5 application out there is Google Wave for email, which we are all just dying to try out!

I feel that this kind of competition will be healthy... I'd wager that 90% of what people currently use Flash for could just as easily be done in HTML 5. Also, by being standards compliant, you'll have fewer concerns about vendor lock-in. What happens if Adobe gets into trouble, then is bought out by Computer Associates? No more Flash for you!

However, there is still a problem... currently HTML 5 compliant browsers are only 60% of the market... I know quite a few enterprises that are still on IE 6, fer crying out loud... Flash has the distinct advantage of working on older browsers, and has about a 95% market penetration. Although, last year at this time only 5% of users had an HTML 5 compliant browser, so maybe by May 2010 HTML 5 will be as popular as Flash?

Hard to say...

UPDATE 1: Well, it's now May 2010, so I redid the numbers... and according to the browser numbers from W3Schools about 75% of the market is using HTML5 compliant browsers. Now that Google has dropped support for IE6, I'd wager this number will be close to 95% in May 2011...

This question came up recently in the content management universe... a few weeks back EMC/Documentum unveiled their latest UI at the Gartner conference on Portals and Collaboration... and it was a pretty slick Flex-based UI. A daring move... However, slick UIs don't need Flex. Billy and I got a demo from Jason Bright about Media Beacon's latest app. It was very flashy, and uses pure HTML, CSS, and JavaScript. As Jason told CMS Watch:

"Flex, like ActiveX, Silverlight, and Java Applets before them are, in a sense, replacements to the browser. Each replaces the web browser in a proprietary way. While I love Flex as a technology, I do not think it is a good strategic decision to throw out the traditional browser for a new client-server model no matter how attractive"

The problem boils down to this: there are millions of people dedicated to making the web better; but only one small part of Adobe is dedicated to making Flash better. The same holds true for Silverlight and JavaFX.

If I were writing a one-off rich internet application, I might choose something like Flex, because Flex development time is half what it would be for a similar HTML/CSS/JavaScript app. There are so many browser bugs, and oddities in JavaScript, that it's always a long slog to debug it. With the possible exception of the Google Web Toolkit, there really are no good ways to easily design a flashy HTML/CSS/JavaScript application... whereas designing an application with Flex is relatively simple.

But... if I were making an application for resale, or one that I intended to have other people maintain, I'd be more hesitant to use anything but web standards. HTML 5 is right around the corner; product development cycles are long; and HTML 5 browsers could reach 90% market saturation in 12 months.

All things considered, the best option now is HTML 5...

UPDATE 2: in case you have been living in a cave, and missed the launch of Apple's new iPad, you might have missed the fact that the iPad will not support Flash or Flex. I'm uncertain whether this new device will really take the world by storm, but if it does, it will be one more reason to switch to an HTML 5 code base.

UPDATE 3: it appears that Steve Jobs has gone on record about why the iPad and iPod will NEVER support Flash. Steve-o brings up a few more reasons I did not cover here: Flash is a power hog, it doesn't support "touch" interfaces, and it crashes a lot. Steve Jobs ends with a plea: Adobe should use its brainpower to make a cross-platform IDE for HTML5, and stop trying to cram Flash down our throats. If they don't, then the "next Adobe" certainly will...

The Revolution Will Not Be Televised; It Will Be Tweeted -- Epic Win

This photo is of an Iranian protester helping evacuate an injured cop, and get away from an angry mob... As Sullivan said:

How To Tell Who The Good Guys Are? They're the ones who sometimes rescue a beleaguered riot policeman.

Skip the mainstream media... Go to Andrew Sullivan's blog, The Big Picture, or #iranelection on Twitter... something unbelievable is happening...

Joel on Platform Vendors

A while back I blogged about the lack of Oracle UCM "vertical applications". A vertical application is an add-on to an existing product or platform, but one that is industry specific. A lot of Oracle UCM consultants have created very general add-ons, and have sold them along with their services.

On occasion, Oracle implements one of these general features, and the add-on product becomes obsolete. Unsellable... and this can cause some grumpiness... but it doesn't have to be this way.

Joel on Software has recently had a similar rant about people who make add-ons to platforms... but in this case, he's referring to the iPhone. Similar to Oracle UCM, the iPhone is a platform... so you'll get some folks who just "fill the gaps," and others who create entirely new markets. A lot of gap-fillers had their profits crushed when the new iPhone OS rendered their add-ons obsolete. Some quotes:

A good platform always has opportunities for applications that aren’t just gap-fillers. These are the kind of application that the vendor is unlikely ever to consider a core feature, usually because it’s vertical — it’s not something everyone is going to want. There is exactly zero chance that Apple is ever going to add a feature to the iPhone for dentists. Zero.

Or, more succinctly, as Dave Winer once said:

Sometimes developers choose a niche that’s either directly in the path of the vendor, or even worse, on the roadmap of the vendor. In those cases, they don’t really deserve our sympathy.

Yes... If you make a general add-on to Oracle UCM, you have a wider possible audience... but that doesn't mean you'll be able to sell to it all! You'll have a tiny bit of market penetration, and then one day Oracle will just write a clone of what you did.

When it comes to add-ons to platforms, verticals are almost always more profitable. The market might be smaller, but it is much easier to highlight the need to your market, and the competition is less. If you make something good, odds are you'll be able to sell it for a looooong time.

"Web 2.0" is the Millionth English Word???

Well, isn't this convenient... according to the Global Language Monitor, the phrase "Web 2.0" has become the one-millionth word in the English language... narrowly beating out "Noob," "Slumdog," and "Cloud Computing."

Firstly... yes, English does have more words than any other language. The British Empire kind of spread English everywhere... and unlike French and Spanish, English acts like a sponge, absorbing every word it can find! Taboo, Tattoo, Tortilla, you get the picture.

But... I call shenanigans. I think this thing was rigged to get maximum press coverage. "Web 2.0" is not a word, it's a phrase. Also, it has been around for about 7 years now, and was hugely popular in the technology field for the past 5. It is a much more common phrase than "Cloud Computing." The word count folks claim that it needs to be mentioned 25,000 times before it's an "official" word... But the New York Times alone mentioned it on 2,700 occasions! I'm sure a survey of other sites would demonstrate that this word hit the 25,000 sweet spot many years ago...

Others are likewise skeptical:

Part of what makes determining the number of words in a language so difficult is that there are so many root words and their variants, said Sarah Thomason, president of the Linguistic Society of America and a linguistics professor at the University of Michigan... Thomason called the million-word count a "sexy idea" that is "all hype and no substance."

I'll agree there...

How Software Engineers Think...

A Software Engineer, a Hardware Engineer, and a Departmental Manager were on their way to a meeting in Switzerland. They were driving down a steep mountain road when suddenly the brakes on the car failed. The car careened almost out of control down the road, bouncing off the crash barriers, until it miraculously ground to a halt scraping along the mountainside. The car's occupants, shaken but unhurt, now had a problem: They were stuck half way down a mountain in a car with no brakes. What were they to do?

"I know," said the Departmental Manager. "Let's have a meeting, propose a Vision, formulate a Mission Statement, define some Goals, and by a process of Continuous Improvement find a solution to the Critical Problems, and be on our way."

"No, no," said the Hardware Engineer. "That will take far too long, and besides, that method has never worked before. I've got my Swiss Army knife with me, and in no time at all I can strip down the car's braking system, isolate the fault, fix it, and we'll be on our way."

"Well," said the Software Engineer, "before we do anything, I think we should push the car back up the road and see if it happens again..."

HA!

(Hat tip Dreaming in Code by Scott Rosenberg)

Oracle Glassfish Now Supports Jython and Django

Oracle -- as you know -- plans on purchasing Sun and all their Java-licious technology. This includes the open source Glassfish application server, which is a free competitor to Weblogic, which Oracle obtained in the BEA acquisition... and they both competed with OC4J, which was Oracle's application server prior to 2008.

I -- along with everybody else -- am very curious to see how all this plays out... It certainly appears that OC4J has lost favor, and Weblogic stole the show... but now Oracle "owns" an open-source alternative to Weblogic as well. So which one should you choose? Naturally, this depends a lot on what out-of-the-box features and integrations you need... But if I were a developer creating a new application from scratch, I'd probably go with Glassfish. Besides being open source, it will soon have built-in support for the JRuby/Rails and Jython/Django web frameworks. To me, that says the people behind Glassfish really "get it" when it comes to delivering web frameworks that make developers more productive...

According to Vivek Pandey's blog, the latest preview release of Glassfish v3:

  1. Provides GlassFish v3 connector and deployer as OSGi module. Which means that deployment of a Python application will trigger Jython Container code.
  2. Wire up the HTTP request and response at very low level by implementing a GrizzlyAdapter, hence resulting in better runtime performance and scalability using grizzly scalable NIO framework.
  3. WSGI (Web Server Gateway Interface) is a Python standard to wire a Web Server to Python web frameworks such as Django or TurboGears etc. Jython Container implements WSGI interface and so it would be pretty easy to add support for various Python web frameworks. Currently, we have Django and we will have others such as TurboGears, Pylons etc.
  4. Currently Jython Container is available thru GlassFish v3 Update Tool. In the future it may appear with GlassFish v3 core distribution.
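If you have never seen WSGI, the whole contract is one callable. Here is a minimal sketch of a WSGI application, written against current Python 3 and the standard library rather than the Jython container described above, so take it as an illustration of the interface and not as Glassfish code. The names `application` and `call_in_process` are my own; the `(environ, start_response)` signature is the part the standard defines.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A WSGI app: takes the request environ, returns an iterable of bytes."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from WSGI, you asked for {path}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_in_process(app, path):
    """Invoke a WSGI app directly, the way a container would, without a socket.

    This helper is hypothetical and only here so the sketch can be exercised.
    """
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the other required environ keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_in_process(application, "/docs")
print(status, body.decode("utf-8"), end="")
```

Any server that speaks WSGI can host the same `application` object unchanged; in plain CPython, `wsgiref.simple_server.make_server("localhost", 8000, application)` would do it. That portability is why a container that implements WSGI can pick up Django, TurboGears, or Pylons without special-casing each one.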

His blog also has step-by-step instructions about how to enable Jython and Django... with luck, this will be rolled into the final release, so these steps will be easier.

I'm also curious to see what Jake and the AppsLabs boys might think about Glassfish... those guys are building some of Oracle's most "social" applications, and they are big JRuby/Rails fans. I'm more of a Python/Django guy myself. I've said many times that if I were to rewrite the Oracle Content Server from scratch, I'd probably have picked Django as the core framework... But Django in a Java container??? That's even better! Quick coding, easy modifications, plus the reliability of Java.

But that's just for my needs... others may prefer the "Weblogic way" for different reasons.

Looking forward To The Weekend...

Man, this has been a hectic few weeks... I just launched one site for a client. It went smoothly, but it was a lot of work and late nights. I've been spending so much time writing documentation that I lost the will to blog. Unfortunate... considering what happened this week.

I'm talking specifically about the highly disruptive Google I/O conference. It looks like Google Wave is going to be huge... it will no doubt set the standard for web-based email collaboration.

I'm happy that it's using XMPP instead of HTTP behind the scenes... this is a great idea, since XMPP is a high-end instant messaging protocol, whereas HTTP is a freaking dinosaur. I'm hoping this push will mean that browsers might natively support XMPP in the near future... Imagine that! Being able to get data -- like RSS Feeds, new email messages, and bundles of web sites -- pushed to you when they change, instead of having to poll the web site a bazillion times... or use awkward and obtuse asynchronous JavaScript. This technology choice has caused a few folks to predict the downfall of HTTP.
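To show why push is so appealing, here is a toy Python simulation of the two styles. It is neither XMPP nor HTTP, just an in-process sketch: the `Feed` class, its counters, and the tick numbers are all invented for illustration.

```python
class Feed:
    """A toy data source that can be either polled or subscribed to."""

    def __init__(self):
        self._items = []
        self._subscribers = []
        self.poll_count = 0   # how many times a client had to ask
        self.push_count = 0   # how many notifications the feed sent

    def poll(self, since):
        """Pull style: the client asks for everything after index `since`."""
        self.poll_count += 1
        return self._items[since:]

    def subscribe(self, callback):
        """Push style: the client registers once and then just waits."""
        self._subscribers.append(callback)

    def publish(self, item):
        self._items.append(item)
        for callback in self._subscribers:
            self.push_count += 1
            callback(item)


def simulate(ticks, publish_at):
    """Run one polling client and one push client against the same feed."""
    feed = Feed()
    polled, pushed = [], []
    feed.subscribe(pushed.append)
    for tick in range(ticks):
        if tick in publish_at:
            feed.publish(f"item@{tick}")
        # The polling client has to ask on every tick, news or no news.
        polled.extend(feed.poll(since=len(polled)))
    return feed.poll_count, feed.push_count, polled, pushed


polls, pushes, polled, pushed = simulate(ticks=10, publish_at={3, 7})
print(polls, pushes, polled == pushed)
```

Both clients end up with the same two items, but the poller made ten requests to get them while the subscriber received exactly two notifications. Multiply that by every feed and inbox you watch and you can see where the bazillion comes from.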

Nothing would make me happier than the death of HTTP, but it's not happening yet... As others have noted, Google Wave is still very dependent on HTTP... it only uses XMPP for server-to-server communication. The web browser still has to poll the server for more data. Although, I'd wager that once this takes off and Google servers are swamped, they might sneak XMPP into Google Gears and use that instead.

It looks like Wave will be easy to integrate with, and it's all open-source... You don't need to host it at Google, you can install their server, or just implement the protocol. This is good, considering how many enterprises might want to make Microsoft Exchange more Wave-y. I have a couple of ideas for Wave plug-ins... but I have to wait until Google gets me a user account for testing :-(

Oh well... It's probably for the best. I could probably use one less distraction this month...
