I may have read too much into James' criticism yesterday... he clarified his point that WSDLs in the ECM space are awkward, which I agree with.
I feel this is probably because most Enterprise Content Management vendors are still pretty green when it comes to the value of web services and service-oriented architectures (SOA). Stellent was using SOAs nine years ago, before there was even a name for them. It helped us solve specific problems, and add features rather quickly. Adding an XML/SOAP interface was a natural extension to the existing "content services"... so, provided Oracle learns through osmosis, it's safe to say Oracle "gets it" when it comes to SOA and ECM.
I'm confused when he claims I'm pro-ReST, because I'm quite clearly a critic of ReST for enterprise systems... although I like its simplicity. And Stellent/Oracle ECM services are both clearly coarse-grained and stateless. I cover that in chapter 2 of my book on Stellent, which I would encourage him to read before judging. And, as I said earlier, custom security integrations are fairly simple... so "deferring authorization" is possible in a number of ways.
Here's a quick sample of what Stellent web services looked like nine years ago:
And here's what they look like via SOAP:
And that's just one of the ways to get SOAP output... Note: it ain't perfect, and never will be. I'd like Oracle to make a handful of changes so it's easier for the consumer to know what to do with the response... but that limitation is mainly due to the fact that Stellent was several orders of magnitude smaller than the competition, and thus comprehensive developer documentation was a lower priority...
However, it's a genuine web service, and not a lame wrapper around an existing fine-grained API... which I believe is what James was complaining about.
James is at it again... this time inciting me by claiming I support crappy WSDLs as an Enterprise Content Management (ECM) standard.
I like SOAP and SOA... but I hope it won't come as a shock that I hate WSDLs, and I'm hardly a Java fanboy. So the odds are quite low that I'd support a crap WSDL standard that names elements arg0, uses a session ID, demands a plethora of complex types, or is too Java-y.
As I've said before in my anti-ReST rants, WSDLs and the Microsoft WS-* stack ruined the simplicity of SOAP on purpose. Why? According to Tim O'Reilly, a Microsoft architect confessed that they did it to force people to use Microsoft tools to SOAP-enable their applications. Typical... unfortunately the Java folks followed suit, and now anything other than bare-bones SOAP is a nightmare to code without an IDE.
Even if that weren't true, I still don't like the idea of WSDLs because they lull people like James into a false sense of security... once we have WSDLs, it will be easy to get all of our systems to communicate!
They said the same damn thing about everybody using databases, everybody using portal servers, everybody using XML data files, etc... and I'm not amused. Integrating systems always requires a thoughtful hand, unless the applications are nearly worthless commodities... such as a toaster and an electrical plug.
The problem is not lazy developers; the problem is the fundamental difference between syntax and semantics and its effect on the human/computer interface. Even in mind-numbingly well defined web services problems -- such as stock trading data -- people call the same thing by different names. Why? Because it's useful. It adds value. And that causes people who create web services for stock market info to give different variable names to the same exact thing... even though the terminology of the stock market has been the same for decades.
So... let's assume they eventually get together and agree on terms, and now all APIs to stock market info are exactly the same. Great! Now their service is a worthless commodity. In order to survive, they need to innovate. What does that mean? Inventing new ways of looking at the same data... then others will follow... again, using different words for the same damn thing. Suddenly, their APIs no longer mesh.
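To make the syntax-versus-semantics point concrete, here's a tiny Python sketch. Everything in it is made up for illustration: the two feeds, their field names, and the shared vocabulary are assumptions, not any real stock quote API. Both feeds are perfectly parseable; what makes them mesh is a mapping that somebody had to write by hand.

```python
# Two hypothetical quote services describing the same trade with different names.
FEED_A = {"symbol": "ORCL", "last": 20.5, "vol": 1200}
FEED_B = {"ticker": "ORCL", "lastTradePrice": 20.5, "sharesTraded": 1200}

# Hand-written mappings onto one shared vocabulary (an assumption for this sketch).
FIELD_MAPS = {
    "feed_a": {"symbol": "symbol", "last": "price", "vol": "volume"},
    "feed_b": {"ticker": "symbol", "lastTradePrice": "price", "sharesTraded": "volume"},
}

def normalize(source, record):
    """Translate one feed's record into the shared vocabulary."""
    mapping = FIELD_MAPS[source]
    unknown = set(record) - set(mapping)
    if unknown:
        # A vendor "innovated": nobody has agreed yet on what the new field means.
        raise KeyError(f"no agreed meaning for fields: {sorted(unknown)}")
    return {mapping[name]: value for name, value in record.items()}
```

Both feeds normalize to the same record, but the moment one vendor adds a new field, the mapping breaks until a human decides what that field means.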
Anybody recall when linguists decided to invent Esperanto? Gosh, imagine it! A universal language that everybody can communicate with! All language problems were instantly solved overnight! Hmm... then why on earth did they also have to invent Interlingua, Ido, and other IALs? Because they were solving the wrong damn problem.
Now, I'm not saying all standards are crap, I'm saying that there are already four of them that treat ECM like a worthless commodity. There's absolutely no point in demanding a fifth one that's exactly the same, except based on fancy-new-buzzword-X. That is, unless there's something specifically wrong with the existing ones that can only be solved with a WSDL. Otherwise, what's the point? His complaints thus far can be solved with a tiny bit of refactoring, and better documentation.
And finally, nothing gets done without incentives. Plenty of ECM vendors have done quite well by delivering what the business users want. Some anticipated demand, some created demand, but most were followers. Unless somebody actually sees or anticipates a demand for a fifth, sixth, or seventh ECM standard based on WSDLs, it's relatively pointless to give people the same mistake in a different wrapper.
SOA is still not being done well in the enterprise... people need to fail a little before best practices emerge. Until other apps are as SOA enabled as (certain) ECM systems, it's a bit pointless to push so hard. I stand by my original statement that a decent ECM standard won't be worthwhile until 2009... but my current money is on something similar to WebDAV or APP; it won't be a Java-specific standard (JSR170 or JSR283), nor will it be anything as pointlessly complex as WSDL... plus it won't take off until Microsoft endorses it.
So ultimately, James and I agree. A WSDL standard for ECM would be a nightmare... but not because of limitations in the ECM market: rather because of limitations and broken promises on behalf of WSDLs.
UPDATE: I may have read a little too much into James' post... see my re-response.
So I was out having pasta last Thursday with a buddy of mine... his job is setting up Active Directory Federation Services (ADFS) at Microsoft HQ, and has been for many years... When stuff doesn't work right, he's one of the guys who gets to tweak the system till it does. When things go really wrong, he gets to phone a real live Active Directory developer, shake him out of bed, and see if the developer knows the voodoo incantations to get things working again.
Fun job... and I'm sure a lot of Active Directory admins would give their left arm for his rolodex...
Under the hood, ADFS uses Kerberos (plus voodoo) for authentication, and a SAML token for authorization -- a.k.a. entitlement management... he's helped set up federated access between Microsoft and several partners (such as Intel). He said it's a whole lot easier now than it used to be. It's still far from simple to configure and manage, but setting up certificates is a breeze compared to the early betas.
I told him about my reservations regarding SAML (noted by certain bloggers)... I like the goals and all, but it was so complex I just didn't think it was (yet) worth the well-known maintenance effort. I preferred a "wait and see" approach. If I saw it hit a critical mass, then I'd bite. Then he said, "don't you understand? SAML completely eliminates the concept of the extranet!"
Then it hit home...
I've been doing web content management (WCM) for so long I'm stuck in the internet/intranet/extranet thought mode, and I just assumed that people would keep doing it that way... but a SAML integration would mean that one single logical server could satisfy the security needs for all audiences.
As Alec says: that's the way the internet used to be, and it's about time it went back.
Not that such a pipe dream would necessarily happen... you might currently have dozens of content silos. However, if they are bound to a SAML enabled user repository, you could have no fear allowing access to that content from your extranets. Extracting data from a silo is a whole separate issue, naturally... but at least when you do so, it's secure.
Of course, the extranet as a concept isn't fully eliminated... I'd like my partner Company Foo to see a Company Foo branded site... and my other partner Company Bar to see a Company Bar branded site... but that's more personalization than anything.
Naturally, the devil's in the details. Just because it's possible to do it all with one system, that doesn't mean it'll happen. Setting up flexible personalization, reusable content, and getting everybody to agree on an "Entitlement Management" system won't be a picnic... The three p's always rear their heads: politics, paranoia, and performance. Also, from a security standpoint, some may consider SAML to be brittle and not defensible -- a single point of failure, in other words. Also, it's probably economically infeasible to force everybody onto one logical system... much to the chagrin of IT.
I'm still not sold, but I'm warming up to SAML...
Content silos -- the sworn enemy of enterprise content management -- are perhaps inevitable... because consolidating all that information is a never ending task. Consolidation helps mitigate the negative effects of silos, but even better are tools and systems that make consolidation unnecessary...
James McGovern was busy this weekend... one of his blogs asks the simple question of why software vendors sell insecure products... another asks specifically why Enterprise Content Management (ECM) people avoid talking about it, and another laments how few ECM blogs are out there.
Well, not to be boastful, but I personally blogged twice about security holes in ECM products in the past 2 weeks: security holes in JSR170, and security holes in Web 2.0 apps. I don't know why other bloggers in the ECM field are being so lazy ;-)
Now... regarding why software vendors sell bad security products... I believe the Nobel Prize-winning economist George Akerlof covered that back in 1970 with his paper about buying used cars: The Market for Lemons. You can thank Bruce Schneier for bringing this to everybody's attention.
I blogged about lemons in security software when Schneier's original article came out... basically the problem is one of information asymmetry: the seller knows much more than the buyer. When it comes to used cars, a decent used car will cost $10,000, and a lemon costs $5,000. Unfortunately, the buyer isn't sure if that expensive used car is a lemon. Thus, to save money on the off chance that the car sucks, buyers typically pay the average price in situations with information asymmetry problems... This creates an economic situation where decent used cars don't sell well... thus fewer people bother to sell decent used cars... and eventually the used car lot is nothing but a bunch of lemons.
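The arithmetic behind that downward spiral fits in a few lines of Python. The $10,000 and $5,000 prices are the ones above; the 50/50 starting mix, and the rule that sellers of decent cars walk away whenever the offer is below what their car is worth, are simplifying assumptions of mine and a much cruder model than Akerlof's paper.

```python
GOOD_VALUE = 10_000   # what a decent used car is worth
LEMON_VALUE = 5_000   # what a lemon is worth

def offered_price(share_good):
    """A buyer who cannot tell the cars apart offers the expected value."""
    return share_good * GOOD_VALUE + (1 - share_good) * LEMON_VALUE

def market_after_rounds(share_good, rounds):
    """Track (share of decent cars, offered price) round by round."""
    history = []
    for _ in range(rounds):
        price = offered_price(share_good)
        history.append((share_good, price))
        if price < GOOD_VALUE:
            # Assumed rule: decent cars are withdrawn, only lemons remain.
            share_good = 0.0
    return history
```

Starting from an even mix, `market_after_rounds(0.5, 3)` opens with an offer of $7,500, which no owner of a decent car accepts, so every later round is all lemons at $5,000.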
You can fix the information problem with economic signals, such as having a trusted mechanic look over the vehicle before you purchase it... that way even in a bad market, you can find good products.
The same holds true about security software: the seller knows much more than the buyer... vastly more in fact. Thus the market ensures that most people buy security software "lemons". To make matters worse, the economic signals are either worthless (such as endorsements by analysts), or prohibitively expensive (such as a complete penetration test by a security firm).
Thus, until the market situation changes, people will continue to purchase mediocre security systems... even if decent ones exist.
It's a bit unfair to blame the enterprise, or even the software vendors, for this market reality... they're all just people at a used car lot trying to do their best. Until there are genuine economic incentives for having secure systems (discounts on liability insurance), and there are decent economic signals about what products are actually secure (endorsements from insurance companies), it doesn't make business sense to purchase expensive security software when you have no evidence that it is actually secure.
As Schneier has said many times, the best economic option is for enterprises to talk big, but do as little as possible.
I'm sure that's frustrating as hell for the guys in Information Technology... but that's why we need Enterprise Architects...
No sooner did I link to an article on Steve Jobs' blog than he decided to take a break from blogging. While Steve's away, he decided to hand off to his good buddy -- and fellow hater of the microtards -- Larry Ellison from Oracle. Steve has already warned us:
Larry Ellison has been bugging his PR people for a while to let him start his own blog at Oracle. But they're like, No friggin way are we letting you show the world how absolutely bonkers you are. So, fine. He comes to me and he says, Hey, in the summer, when it's slow, how about I take a turn on your blog.
Which is a long way of saying: Larry is going to be guest-blogging for me over the next few weeks, until Labor Day, and then I'll pick up again. I warn you: Larry is nuts.
I don't know about you, but I'm pretty curious about how this is going to play out...
UPDATE: Fake Larry Ellison was fired after a mere three posts, and it looks like somebody from the New York Times discovered the true identity of Fake Steve Jobs... it's Daniel Lyons, a senior editor at Forbes. I knew this would happen soon after the fake biography of Steve Jobs was announced... apparently Lyons made the idiotic move of claiming to be a "writer for a major business magazine" in his faux bio.
Oh well, it was funny while it lasted.
Some of them are links back to my blog here about the Stellent Book, but I've also put up a few presentations I gave... such as the one about Enterprise Mashups, and the always popular Introduction to Writing Stellent Integrations.
I plan on putting more items up there eventually... I'm just having too much fun with the free-form blog to write long, dry, and boring whitepapers... When I add more, I'll announce it here.
"Web 2.0 tools and technologies are the latest in a long line of technologies that have taken root with consumers who then smuggle them into the business world. IM is one notable example. To this point, the Web 2.0 tools that we inquired about fall well short of the value that businesspeople associate with IM. Thirty-seven percent of respondents reported substantial business value from IM, compared with an average of just 16% for the other Web 2.0 tools."
This makes me chuckle... because as far as I know, IM came waaaaaaay before the whole web 2.0 stuff. I believe ICQ predates blogs and Wikipedia by many years. And AOL IM has been annoying people for at least a decade. Looks like another attempt by the web 2.0 hipsters to claim parenthood over something that came before them...
Some of the data surprised me... such as the fact that more IT people believe podcasting has substantial value (21%) than believe wikis do (14%). Social networking and blogging are also far down on the list. I'm surprised folksonomy and social bookmarking weren't covered, but the full report may have more data.
The popularity of IM and podcasting does fascinate me... because that's classic centralized Web 1.0 technology: I speak, you comply. Blogs, wikis, and social networking are more about the wisdom of crowds, where the correct decision is more of an emergent property of the system... Blogs act as meme breeders, wikis document the conventional wisdom and zeitgeist, and social networks help you connect with the right people to get things done. The speaker is secondary... whereas podcasts are far more fascinating to the speaker than the listener.
Personally, I feel that the entirety of IM's value can be accomplished with a combination of text messaging, email, presence (Twitter, Pownce, etc.), and on-screen collaboration (WebEX, LogMeIn, Yugma, etc.). Of course, IM is simple, so many people will prefer the one swiss-army-knife approach to using 4 separate systems that don't work together...
That is, until some genius integrates Twitter and WebEX into Google Apps... or any combination of the above... then we might finally have something that deserves the title "Decent Collaboration Software".
About a month ago, I was awarded the title Oracle Fusion Middleware Regional Director. And I know what you're thinking... what the heck is an Oracle Fusion Middleware Regional Director? Let me explain:
- I don't work for Oracle,
- I'm not directing anybody, and
- I don't have a region.
Apparently, this name was confusing to many people, so they decided to merge it with the Oracle ACE Program, and promoted me to an Oracle ACE Director... which sounds even cooler. Although my profile looks a bit funny now...
Seriously... I'm kind of excited about this. Partially because I got the title, but also because Oracle thinks that people who do what I do deserve official recognition. Oracle started this program for people who are something of a developer's advocate: somebody who helps out the Oracle community with tips, tricks, articles, or by working closely with local user groups. There are about 40 worldwide, probably growing to 60 eventually.
I also get the chance to chat with people on the product team to hopefully steer product direction based on developer need. So that means whenever I go out drinking with Alec, Andy, or my wife, it's a business expense. Ha!
Anyway... Oracle is -- like Stellent was -- primarily a software company. Sure, they do consulting and training, but that's not their main focus. Therefore, a strong developer community is essential for the success of their business. Developers hate paying for consulting and training... so a strong community always means giving away great information for free. That's the only way to convince excellent developers to love the product. Trust me. When it comes to paying for software or training, the smarter the developer, the stingier the developer.
Oracle decided that it made lots of sense to bring me on board, since I already do the kinds of things a director should do:
- moderate the Stellent User Group,
- create sample code,
- give presentations on content management,
- write books, and
I sure hope I can also swing a free pass to Oracle Open World out of this... maybe even a T-shirt.
The main disagreement is about SAML. I didn't see its value, and detailed Oracle/Stellent's architecture to explain why. James mostly agreed, except for one interesting use case:
If ECM vendors simply leveraged Active Directory not solely for authentication but also as a user store and mapped to it at runtime then the need for SAML disappears within most scenarios within the enterprise. It still ignores a potential scenario where your users aren't stored in any repository that the enterprise owns.
Bingo... the one situation where something like SAML comes in handy. Somebody has totally valid credentials to access the repository. However, the authentication and authorization of that user must be done by connecting to a server that is not owned by the enterprise. Stellent/Oracle can handle multiple user repositories, but typically only if they're within the enterprise.
For example, assume the person trying to access your ECM system is a business partner, prospect, or customer... They already have passwords and credentials stored behind their organization's firewall, but if you can't access it, you need to duplicate all that info, and make them log in again. Until fairly recently, you were forced to do it this way: you could have SSO across an enterprise, but not easily between enterprises. Things like SXIP and SAML fix this, so you can have federated (or distributed) single sign on.
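The core trick behind federated sign on can be sketched in Python: the partner's identity server signs a short statement about its own user, and your server checks the signature instead of reaching behind the partner's firewall. To be loud about the assumptions: this is not SAML or SXIP. Real SAML uses XML assertions signed with certificates; the shared-secret HMAC, the JSON payload, and the function names here are stand-ins I picked to keep the sketch short.

```python
import base64
import hashlib
import hmac
import json

def issue_assertion(secret, user, groups, expires_at):
    """Partner's identity server: sign a statement about one of its own users."""
    payload = json.dumps(
        {"user": user, "groups": groups, "exp": expires_at}, sort_keys=True
    ).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + signature

def verify_assertion(secret, token, now):
    """Your server: accept the statement only if it is authentic and unexpired."""
    try:
        encoded, signature = token.split(".")
        payload = base64.urlsafe_b64decode(encoded.encode())
    except ValueError:
        return None  # malformed token
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return None  # forged or tampered
    claims = json.loads(payload)
    if claims["exp"] <= now:
        return None  # expired
    return claims
```

Your server never sees the partner's password store; it only needs the agreed key (or, in the real protocols, the partner's certificate) to decide whether to believe the claims.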
Imagine: one password to connect to the entire internet... The developers at Stellent knew a while back that something like this was the ultimate endpoint, but the question was which protocol was going to win out? SSL certificates are a management nightmare... Should we follow SAML/XACML because it's a standard, or OpenID/SXIP because they are (fairly) open source, simple, and usable right now?
Which is better? Without a clear contender, or any specific market demand, it's very risky to take the lead... the safe bet is to be knowledgeable and reactive. If somebody asks for SAML, it's no problem to add it to Oracle. However, at present my money is against SAML/XACML for the long-term.
I've never deployed either enterprise wide, so I cannot speak about the maintenance problems... perhaps SAML is easy to maintain, but given its complexity, I'd find that surprising.
I'm also very nervous about SAML because it is endorsed by Microsoft, whose first attempt to solve this problem was the god-awful Microsoft Passport. Also, Microsoft has a long history of ruining open standards that threaten them. Active Directory is huge money, as is the enterprise search market, not to mention Sharepoint. I don't expect Microsoft to play nice for long...
Don't think so? Remember their proprietary Kerberos extensions? Or how about how they ruined SOAP with the ungodly complex WS-* stack? If Google tries to press harder into the ECM space -- and not just enterprise search -- then the other shoe will certainly drop, and decent SAML implementations without Active Directory may be impossible.
I sense danger...
And now I'm also nervous that SAML might be catching on in the ECM zeitgeist... one recent proposal included the terrible, rotten, just plain awful idea of integrating XACML, internet search, and ECM together. I challenge Guy Huntington to put his money where his mouth is, and implement something like that himself. I defy him to get his pet project to scale well or perform without millions in hardware for every ECM on the planet.
OK, I think I've figured out the disconnect between me and James McGovern regarding SAML... When he asked if Oracle's ECM supported SAML, I was about as puzzled as if he had asked if it supported client connections via JDBC. Well... I suppose you could make that happen, but why not just connect directly to the database? It just made no sense...
Here's why: James has apparently never used Oracle's ECM solution, and is commenting on the poor architecture of other enterprise applications. I believe if he took a peek at chapter 2 of my book, he'd recognize that SAML support is unnecessary in this case... (psst, bug Billy for a free one ;-)
Here's the deal... back in version 3 of the product (we're now at version 10), the dev team saw the emergence of LDAP and Active Directory. We knew it made no sense for an ECM product to be both a user repository and a content repository. That just made things overly complicated. Plus, we could never keep up with the feature requirements. Instead, we recommend integrations that "slave" the content server to an existing user repository.
Put your users in a user repository, put your content in a content repository. It just makes sense.
Here's how a basic request operates: first, the content server asks the external system to authenticate the user's password (or token), and also return a "blob" of info about him. Every user repository has a different API, but this "blob" usually contains group memberships and attributes. The next step is to map the user data to content server specific security groups and security accounts. This mapping can be done in many many ways, from zero configuration to a few dozen lines of custom Java (or C++). Again, depends on the system. Finally, the security check determines if this user is allowed to execute the specific service (like GET_FILE), with the specific document, based on the security groups of the document, the security level of the service, and the user's roles & accounts.
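The three steps above can be sketched roughly as follows. This is a minimal Python illustration, not the content server's actual code: the `FakeDirectory` class, the group names, the roles, the privilege tables, and the EXAMPLE_CHECKIN service are all made up for the example. Only the GET_FILE service name comes from the description above.

```python
class FakeDirectory:
    """Hypothetical stand-in for an external user repository (LDAP, AD, ...)."""

    def __init__(self, users):
        self._users = users  # name -> (password, list of groups)

    def authenticate(self, name, password):
        """Step 1: check the password and return a "blob" of user info."""
        record = self._users.get(name)
        if record is None or record[0] != password:
            return None
        return {"name": name, "groups": list(record[1])}

# Assumed mapping from directory groups to content-server roles.
GROUP_TO_ROLE = {"cn=staff": "contributor", "cn=guests": "guest"}

# Assumed privileges of each role within each document security group.
ROLE_PRIVILEGES = {
    "contributor": {"Public": "RW", "Internal": "RW"},
    "guest": {"Public": "R"},
}

# Assumed security level each service requires (R = read, W = write).
SERVICE_LEVEL = {"GET_FILE": "R", "EXAMPLE_CHECKIN": "W"}

def map_roles(blob):
    """Step 2: map external group memberships to content-server roles."""
    return {GROUP_TO_ROLE[g] for g in blob["groups"] if g in GROUP_TO_ROLE}

def is_allowed(roles, service, doc_security_group):
    """Step 3: may any of these roles run this service on this document?"""
    needed = SERVICE_LEVEL[service]
    return any(
        needed in ROLE_PRIVILEGES.get(role, {}).get(doc_security_group, "")
        for role in roles
    )

def handle_request(directory, name, password, service, doc_security_group):
    blob = directory.authenticate(name, password)
    if blob is None:
        return False
    return is_allowed(map_roles(blob), service, doc_security_group)
```

Note that the user repository knows nothing about documents, and the content side stores nothing about users beyond the mapping tables.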
It can get a little more complex with ACLs, personalization, and workflows, but you get the picture.
This happens on the fly: no authorization data is replicated; it's only cached for a few minutes for performance reasons. Thus, all user management is where it should be: in the user repository. The content server does a mapping to a content-specific security model, no more.
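The "cached for a few minutes" part might look something like this in Python. It's only a sketch of the general idea: the `TtlCache` name, the 300 second default, and the injectable clock are my assumptions, not how the content server implements it.

```python
import time

class TtlCache:
    """Keep looked-up user info briefly, then go back to the user repository."""

    def __init__(self, loader, ttl_seconds=300, clock=time.monotonic):
        self._loader = loader      # function that queries the user repository
        self._ttl = ttl_seconds    # assumed value; "a few minutes"
        self._clock = clock        # injectable so the cache is easy to test
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # still fresh, skip the repository
        value = self._loader(key)  # stale or missing, ask the repository again
        self._entries[key] = (now + self._ttl, value)
        return value
```

Because entries simply expire, a change made in the user repository shows up after at most one TTL, and nothing needs to be synchronized back.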
This is called an External user. People also set up Local users, which are just stored in the database. Local users are discouraged in production systems, thus they are typically only used for testing and superusers. A small handful of customers use exclusively Local users, but they typically don't need, have, or want an enterprise user repository... thus, the only people who could possibly benefit from a SAML interface to Oracle's ECM would never use it.
But what if the Active Directory domain controller is on the other side of the planet, and performance sucks? It appears that some ECM systems make the interesting choice of replicating the user repository... but we'd suggest instead using a product that is explicitly designed to replicate a user repository, and "slave" the content server to that... such as Active Directory Application Mode (ADAM). Some customers went so far as to create home-brewed LDAP spiders to cache data, and then integrate all their apps with the cache.
I feel that making every application on the planet support SAML is a silly duplication of effort... I think it's better that applications allow for loose slave-like integrations with dedicated user repositories. Use the right tool for the right job.
Now... SXIP and OpenID? Those are genuinely interesting... I'd bet that people will be willing to pay for an integration with them before they'll pay for SAML. Plenty of clients use SalesForce.com, and might be interested in a cleaner integration between content and customers.
Hopefully this clears things up...
I'm going to have to be more clear in my rants... my anti-ECM-standards rant is getting some people so hopped up they can't see straight. The latest is from Craig Randall:
Bex Huff left a comment on Mark’s post, which referenced his reply-via-post. Bex makes several good points, but at the same time what I perceive is that if an ECM standard isn’t reasonably or capably an end-all-be-all standard for the domain, then why bother. (Bex, if I misunderstood your post, please leave a comment to set me straight.)
Huh... I actually said almost exactly the opposite.
In previous posts on my blog, I said that an end-all-be-all ECM standard is impossible. ECM is a marketing term, not a technical term, thus over 100 apps can claim to support "ECM", but can deliver whatever the hell they want. Good luck creating a standard interface to a marketing buzzword.
If you want some modest ECM standards, and a simple interface, fine. There are 4 such standards already, just pick a damn horse. Stellent/Oracle supports 3, and will probably support all 4 soon... just in time for the 5th to be finalized. Joy.
Not that it matters... nobody uses the standards that already exist, yet they keep asking for more. I understand why: every ECM standard is far far too simple to be useful. Why should somebody shell out thousands of dollars for an ECM system, and access it with a "standard API" that hides 90% of what they paid for? At the same time, an enterprise can have several ECM systems at once... and it would be nice if a middleware layer could have a single API to access them all. Nice, but not nice enough that they will willingly sacrifice important features...
I'm tired of wasting cycles on the pipe dream of a useful ECM standard, until the market changes enough for one to be feasible. That will happen after Microsoft fixes SharePoint, more consolidation happens in the market, and the vendors who merely claim to have ECM either shut up or go away. Like I said, probably not before 2009.
James McGovern kindly linked to my screed against REST, but I think he misunderstood me when I talked about SAML. No problem... if I read as many blogs per day as he does, I'd do the same.
His quote was this:
Bex Huff provides an interesting perspective on REST within the ECM domain. His comment: you could "punt" and rely on wacky SAML, but that just seems to complicate things beyond necessity... seems as if folks in the ECM domain don't believe in the notion of SSO and would rather force complexity in other ways such as making folks log into different systems of course using different passwords, making enterprise administrators duplicate identity stores instead of leveraging an existing one such as Active Directory and so on.
Now... everybody I know in the ECM space cares about Single Sign On (SSO). In fact, Stellent/Oracle supports Active Directory and LDAP out of the box, a few minor tweaks get you SSL certificates, plus we've made dozens of customizations for SiteMinder and other custom/exotic SSO systems. I even made an ANT script that could build a custom security integration with just about anything with a few lines of C++.
Trust me, we all know and love SSO.
The problem I have is more specific to SAML. I just don't like it. In fact, I hate SAML. Nothing personal, I just start out hating all technology. I have to. Otherwise, I find it difficult to discover its flaws. If I don't know the flaws, I can't effectively recommend when to use it. There is no silver bullet, and after working with computers for 20 years I've learned to distrust almost everything.
So, I started out hating SAML four or five years ago, when I first heard of it. Guess what? Thus far I've encountered no reason whatsoever to reduce my dislike.
Most of the cool stuff in identity management seems to be with OpenID and SXIP. SAML has been around forever, and who is using it? It's not saying "here's some useful technology," it's saying "here's how things should be done." It feels like something from the peaks of the XML ivory tower that makes the claim (yet again) that the entire world would magically be better if we took all information and put <angle brackets> around it... Where's the evidence? Where's the proof?
I get why people are hot about Active Directory, SXIP, and OpenID... I just don't believe SAML has proven it deserves any hype. It might make somebody's job easier, but at what cost? I'm totally open to the possibility that I'm wrong, or that SAML 2.0 is a million times better... but I'll believe that when I see it.
Only a select few people get to play with Oracle's 11g database (beta) before November -- and I'm not one of them :-(
No matter... it looks like they are sneaking in some pretty awesome features, one of which appears to be something along the lines of a just-in-time query optimizer for screaming fast performance:
Iggy Fernandez, editor of the NoCOUG Journal, the official newsletter for the Northern California Oracle Users Group, praises what he says is 11g's unheralded "learning optimizer" feature.
"When relational databases replaced hierarchical and network databases in the 1980s, the promise was that programmers would no longer need to optimize their queries by hand," said Fernandez. While no database has "completely realized" that vision, 11g, says Fernandez, is a "great step -- the query optimizer simply learns from its mistakes -- in fact, it can stop a query that is already in progress and try a different approach!"
Awesome... although if my guess is correct, this will only work for parameterized queries, and not SQL statements built on-the-fly.
I did also hear a fairly detailed rumor of some unbelievably cool performance features they have planned for a "future release", which may or may not be 11g. No, nobody inside Oracle leaked this to me... all parties are independent of big red. I just know a guy who knows a guy who owns the patent. And that's ALL I plan to say about it...
Update: If you got here from James McGovern's blog, you should read this as well.
OK, Mr. Process Perfection (and FileNet user) Mark Masterson has responded to my anti-standards screed, and raises some interesting ideas. Instead of following the RSS hype, skip to ATOM for an ECM standard. I have to say, ATOM plus the REST-based ATOM Publishing Protocol is an interesting idea. If only I didn't dislike REST so much I might have endorsed it.
But I still sense danger for these reasons:
- A resource-oriented interface to an enterprise content management repository may not be the best approach... especially when you need to do things like workflows, subscriptions, conversion, multiple taxonomies, or basic business process management. You need a service-oriented interface that focuses on the action, not the back-end implementation. See my anti-REST rant for more info.
- The response that search doesn't matter, and to just "Use Google" to find content is beyond glib... using Google means losing metadata. Instead, the interface should make it brain-dead easy to discover lists of items based on metadata, as well as "related content" in nice little buckets. If we're just going to kick metadata in the head, we might as well just use ZFS over iSCSI, and call it good.
- I can't say for certain yet, but I'd suspect it would be tricky to embed multiple content items in one feed item, as well as binary data... Can APP be used to "chunk" files larger than 2GB? Since APP is so resource-based, what if I want a batch check-in of multiple resources for better performance? What about syndicating secure data out of the repository? Again, I sense danger...
For the record, it wasn't tough to get Stellent to output RSS feeds. That only took a few hours... What was a royal pain was discovering how rotten most RSS readers are, and trying to tweak the output just right so that everybody could consume it.
Switching to ATOM may be a tiny bit tougher because it supports more metadata... it's an obvious replacement for RSS, but I think a few more pieces need to be added to the "ATOM Stack" before it could do as much as WebDAV. Search, specifically... and I'd push towards a service-oriented publishing model.
Regarding the security layer... you could "punt" and rely on wacky XACML/SAML, but that just seems to complicate things beyond necessity... and that ain't good for anybody except security consultants. A simpler idea would be an ATOM and LDAP Mashup, and make every single resource identity aware. If done right, you can authenticate with the enterprise LDAP server, and authorize with the department's federated LDAP server. Seems pretty simple to me...
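As a rough sketch of that split, here's a toy version in Python. Everything in it is an assumption for illustration: the two "directories" are plain dictionaries standing in for the enterprise and departmental LDAP servers, the feed paths and group names are invented, and no real LDAP or ATOM library is involved. It only shows the shape of the idea: one directory answers "who are you?", the other answers "what may you see?", and each resource carries its own list of allowed groups:

```python
import hashlib
import hmac

def _digest(password):
    # Illustration only: a real directory would use salted, slow hashing.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical stand-in for the enterprise LDAP server (identity only).
ENTERPRISE_DIRECTORY = {"jsmith": _digest("s3cret")}

# Hypothetical stand-in for the department's directory (group membership).
DEPARTMENT_DIRECTORY = {"jsmith": {"legal", "contracts"}}

# Each resource is "identity aware": it lists the groups allowed to read it.
RESOURCES = {
    "/feeds/contracts/42": {"contracts"},
    "/feeds/hr/7": {"hr"},
}

def authenticate(user, password):
    # Step 1: check the credentials against the enterprise directory.
    stored = ENTERPRISE_DIRECTORY.get(user)
    return stored is not None and hmac.compare_digest(stored, _digest(password))

def authorize(user, resource):
    # Step 2: compare department groups with the resource's allowed groups.
    groups = DEPARTMENT_DIRECTORY.get(user, set())
    allowed = RESOURCES.get(resource, set())
    return bool(groups & allowed)

def can_read(user, password, resource):
    # Access requires both steps to succeed.
    return authenticate(user, password) and authorize(user, resource)
```

With this data, `can_read("jsmith", "s3cret", "/feeds/contracts/42")` succeeds, while the same user is refused the HR feed and a wrong password is refused everywhere.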
And thanks Mark, for not turning this into a Nerd Fight.
Just a brief rant before this gets out of hand...
James McGovern and others are back talking about Enterprise Content Management standards... his latest advice is to take a closer look at RSS or WebDAV as a standard. Others chimed in that there may be something there...
If I may offer some advice: HOLY CRAP, NO!
RSS is a nice model for consumption of streams of text, but it has many many problems. It doesn't have revisions. It can't handle large binary data streams. Even by jumping to ATOM and adding metadata, you still can't do searching, editing, or contribution for crying out loud. In a word, NO!
WebDAV is good for quick changes to web files on a shared filesystem... but authentication is a mess, it cannot handle dates properly, nor metadata-based search, nor metadata-based contribution without massive kludges. AIIM says that Records Management will be a huge factor for why people purchase ECM... and WebDAV just doesn't have the metadata muscle to keep up. In a word, NO!
Also, I believe Laurence Hart completely misses the point of Billy Cripe's comment. Standards only enable business if they have sufficient features. The entire point of a standard is that you lose functionality by standardizing, but you gain flexibility. The question is, does that help more than it hurts? At present, it hurts more.
I totally disagree that a standard -- or anything else for that matter -- is inherently good. If standards were, then everybody would stop whining about the lack of ECM standards, and freaking use one of the four existing standards. Stop worshiping technology for technology's sake, and make something useful.
The four existing standards are crap, and the next 4 will be as well... unless:
- The analysts stop letting Microsoft get away with calling Sharepoint an ECM system,
- The top 5 (or 10) ECM vendors get together and decide what a real ECM standard needs to be useful, and
- All these niche repositories either write an interface, or go away.
Like I said, probably not before 2009. Those niche products are still pretty useful, even if they aren't ECM.
Alternatively, a large neutral company -- like BEA, Sun, or Sybase -- designs a "universal connector" for each specific ECM system. Several small firms have made these, but they were bought up and shut down by Documentum. IBM used to have a good one for WebSphere, but it also has languished because it means people could dump Content DB or FileNet whenever they wanted. Great for the WebSphere team, but bad for their ECM team.
And forgive me if I find Microsoft, EMC, and IBM to be completely disingenuous when calling for decent ECM standards. Those 3 companies either blocked decent open standards, or shut down universal connectors.
Many times vendors try to sell Business Process Management (BPM) along with Enterprise Content Management (ECM) as a means of helping companies get their information process under control. However, there's a huge disconnect in both the buying habits and implementation schemes for these two systems.
After many years, organizations finally admit that they need to get their content under control, and support ECM as enterprise infrastructure. They even feel good about using hosted solutions for their content management -- good news for 3 of my clients.
However, BPM is still stuck in the fiefdom stage. Very few implementations are company-wide: they are highly departmental. Despite management believing BPM is highly useful, there is a strong problem with lack of ownership, which may be a big cause of the perceived lack of success.
Individual departmental groups will find BPM highly useful... but without enterprise-wide owners, you'll be unable to get enterprise-wide information process under control.
Over the weekend, James McGovern complained about the lack of Enterprise Content Management (ECM) standards. He was brutally criticising one EMC Documentum blogger for his general lack of enthusiasm on the subject. Many customers don't care one iota about the standards, other than as a "checkbox feature" to be fully buzzword compliant. However, McGovern saw it as ECM's responsibility to push customers in the right direction.
Now, I can understand why an enterprise architect would be screaming for an ECM standard. After all, it's their job to make sure their portal server can easily interface with the half dozen repositories in the organization. It's not unusual for a large company to have many ECM systems -- due to mergers, acquisitions, or departments that hate each other -- and it would be nice to have one standard way to interface with them all.
However, I totally disagree that a useful ECM standard will be developed any time in the near term. Why? One simple reason... there are already four separate ECM standards, none of which are much used.
First, there was ODMA, which some used for content management. Then BEA came up with the Service Provider Interface. Then came WebDAV, whose biggest supporter was Microsoft. Then the Java folks chimed in with JSR170. Now, we are awaiting the fifth: JSR283. Guess what? They all suck.
Why??? My opinion is that it's because there are 40 different organizations who claim to be "Enterprise Content Management," and they all have a different definition of what that means. Some have limited metadata, some have extensive metadata. Some use a file and folder structure, others realize that an ECM system with findable content must have multidimensional metadata taxonomies which render a folder structure obsolete. Some have workflows and business process management, others barely have revisioning. Some can render a Word document into thumbnails on the fly, others barely support files. Some have compound documents, others don't.
Thus, any standard will be forced to be the lowest common denominator between all 40 systems. Any customer limiting their enterprise to just those basic ECM services will be horribly disappointed at the lack of features. I know of one major customer who was extremely gung-ho about "standard" ECM interfaces, until they realized that the standards lacked vitally important features. Now they never use them. Other customers were more extreme: one actually banned WebDAV across the entire enterprise.
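The "lowest common denominator" effect is easy to see with a toy model. The sketch below is an illustration only: the three repositories and their feature lists are invented (loosely based on the differences listed above), not a survey of real products. A standard that every system must implement can only contain the intersection of their feature sets:

```python
# Hypothetical repositories and feature sets, invented for illustration.
systems = {
    "repo_a": {"check_in", "search", "extensive_metadata", "workflow",
               "thumbnails", "compound_documents"},
    "repo_b": {"check_in", "search", "folders", "revisions"},
    "repo_c": {"check_in", "search", "folders", "limited_metadata"},
}

# A standard every system can implement is the intersection of all sets.
standard = set.intersection(*systems.values())

# What each system's customers give up by sticking to the standard.
lost = {name: features - standard for name, features in systems.items()}

for name in sorted(lost):
    print(name, "loses", sorted(lost[name]))
```

Here the "standard" shrinks to check-in and search, and the richest repository loses the most, which is the disappointment described above.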
A true ECM system is vastly more interesting than the search/edit/save model that standards bodies would have you believe... A fully-featured "standard" ECM interface won't look anything like SQL. Databases are about structure, ECMs are about semantics and context as well.
Now, there is some value in a highly simplified interface to an ECM repository, but we must all acknowledge that this is the goal, and design a highly simple interface that everyone agrees is barely useful. We should stop wrapping everything in rigid and obtuse XML like WebDAV, or in over-engineered and under-useful standards like JSR-everything. One reader suggested a ReSTful API to simplify things. Now, I'm not a ReST fanboy, but I see the merits of a resource-based interface like ReST for "dumb" content access... as long as the hype acknowledges its uselessness, it won't cause much harm.
Until the market defines the minimum requirements of an ECM system, there's not much point in making a beefy interface. Almost everybody agrees that such a definition would leave out Sharepoint because of its wretched metadata engine... everybody, that is, except Microsoft... if the analysts continue to support Microsoft's claims that Sharepoint is ECM, then a decent standard will never get traction, because there's no way Sharepoint could support it. Jeez, Sharepoint barely supports WebDAV... what are the odds it will support JSR283???
So until Sharepoint gets its act together, some more consolidation happens in the market, and some of these niche players stop calling themselves ECM, I don't foresee efforts towards a decent standard ever bearing fruit. That might happen by 2009, but don't hold your breath.
CMS Watch has recently released their Web CMS Kudos and Shortcomings, where they reviewed the top 40 open source and commercial Content Management Systems (CMS). A kindly reader made a spreadsheet of the scores. For some reason, they ranked the Python-based Plone 2.0 the #1 CMS. It was the only CMS to have a positive score of 2, the rest were negative.
Huh, one positive score, and 39 negative scores. That speaks volumes about their methodology.
I like Plone 2.0, but a #1 rank? No freaking way... It doesn't have nearly the performance, features, usability, flexibility, or simplicity of Drupal 5.x... which is the best open source CMS, IMHO. Don't believe me? Then ask Google why they keep funding Drupal so much... this is despite Plone being written in Google's favorite language (Python), whereas Drupal is in dirty, dirty PHP. Because of my Python bias I tried really hard to like Plone, but it never quite stacked up.
That said, I'm glad to see Oracle/Stellent ranked as the "least sucky" of the major market players, with a mere -5... or "second least sucky" if you include Day at -4. IBM, Documentum, and Microsoft scored -12 or worse. Alfresco was dead last with -16. Curiously, Interwoven scored only a -8...
According to the AIIM Blog, the latest AIIM State Of The Industry Survey is available... One of the questions was what were the top 3 obstacles encountered during your Enterprise Content Management (ECM) implementation? The results are below:
- 44%=Underestimated process and organizational issues
- 32%=Lack of knowledge or training among our internal staff
- 30%=Project derailed by internal politics
- 29%=Uneven usage due to poor procedures and lack of enforcement
- 21%=Underestimated the effort to distill and migrate content
- 20%=Excessive "scope creep"
- 19%=Failed to address taxonomy and metadata concerns
- 18%=Low user acceptance due to poor design or clumsy implementation
- 16%=Failed to think of benefits/issues beyond our business unit
- 16%=Poorly defined business case
- 13%=Lack of knowledge or training among our external staff/suppliers
- 13%=Budget was overrun
- 12%=Failed to prioritize "high-value" content
What I found interesting was that very few of these issues can be tied to failures in the technology... it seems to be a failure of the organization. Lack of training/resources/knowledge could possibly point to an overly complex product... "scope creep" could be due to inflexible technology, or inflexible project managers... but bad taxonomy was blamed more often than the "clumsy implementation."
This kind of reinforces what I always believed about ECM... it's not really about the technology, it's not even about the content; it's about people. I always hated those snake oil salesmen selling coblaberation instead of collaboration. A hunk of technology isn't going to make all your process problems go away. If you have good processes, technology will make your life simpler. If you have bad processes, technology will amplify your problems.
Despite my bashing of Agile analogies yesterday, I am a fan of its goals... and I frequently encourage people to audit their process and see what pieces of Agile will work for them.
CIO Magazine has a great article about how a grumpy old IT manager converted to Agile. It was a good article, and reminded me of how the Stellent dev team worked.
"Eric [the Agile consultant] recommended re-architecting the core in order to empower parallel development."
Parallel development is one of the key reasons why 3 Stellent developers could easily do the work of 30. It was unique to the Stellent core eight years ago, and still fairly unique today.
Dubbed component architecture, it allowed developers to write new apps without interfering with others' code. Nearly every new feature could be first designed as a component delivered on the side, then merged into the core codebase when appropriate. For all intents and purposes, a properly written component behaved exactly as if it were a fundamental part of the core product, but with vastly fewer change management issues.
There were only a few parts of the product not accessible through component architecture... usually by design. Sometimes we'd open them up to modification... but sometimes they were so important for the stability of the system that we had to disallow modification with components.
This design eliminated the need for most developer meetings, as well as the fear of stomping on somebody else's work. Whenever a conflict arose, we would fix it, and fix the core so that similar conflicts would not happen in the future.
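For readers who haven't worked with this kind of design, here is a minimal sketch of the general idea in Python. It is not Stellent's actual API: the `Core` class, the service names, and the component functions are all hypothetical. It shows three things described above: components add services on the side, a component can wrap an existing service without editing core code, and a few services are locked against modification for stability:

```python
class Core:
    """Toy core product: a registry of named services."""

    def __init__(self):
        self._services = {}
        self._locked = set()

    def register(self, name, handler, locked=False):
        # Locked services are closed to components, by design.
        if name in self._locked:
            raise PermissionError(f"service {name!r} is closed to components")
        self._services[name] = handler
        if locked:
            self._locked.add(name)

    def get(self, name):
        return self._services[name]

    def call(self, name, **params):
        return self._services[name](**params)


core = Core()
# Services shipped with the (hypothetical) core product.
core.register("CHECKIN", lambda title: f"checked in {title}")
core.register("AUDIT_LOG", lambda event: f"logged {event}", locked=True)

def load_thumbnail_component(core):
    # A component delivering a brand-new feature on the side.
    core.register("GET_THUMBNAIL", lambda title: f"thumbnail for {title}")

def load_notify_component(core):
    # A component extending an existing service by wrapping it.
    original = core.get("CHECKIN")
    core.register("CHECKIN", lambda title: original(title=title) + " (notified)")

load_thumbnail_component(core)
load_notify_component(core)
```

After loading, `core.call("CHECKIN", title="report.doc")` behaves as if the notification had always been part of the core, while any attempt to re-register `AUDIT_LOG` raises `PermissionError`.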
"In his Turing award lecture 'The Humble Programmer,' E.W. Dijkstra wrote: 'One of the most important aspects of any computing tool is its influence on the thinking habits of those that try to use it, and... I have reasons to believe that that influence is many times stronger than is commonly assumed.'"
Very very true... If your programming methodology is so process-oriented that even a dumb programmer can follow it, then the end result will be dumb programmers. If, however, your tools allow for agility and improvisation without planning, the end result will be programmers too clever for their own good.
I myself prefer something in the middle: process for infrastructure, agility for applications. This means you need to have a good grasp of where the flexibility needs to be in your system, and switch methodologies appropriately. You must alter both your software development methodologies, and your change management protocols.
Unleash the agile developers when flexibility is key... but use the process-oriented developers when doing the mundane tasks that absolutely must be done according to a regulated standard.
Otherwise, your system will be similar to a universe where avant-garde Jazz singers do your taxes, and straight laced economists make music.
No offense to Mick Jagger...