This section of the blog contains articles about the Oracle suite of Enterprise Content Management applications. This includes Universal Content Management (UCM), Web Content Management (WCM), Universal Records Management (URM), and a little bit of Information Rights Management (IRM). I helped create several of these products, and thus am very opinionated about how they should be used... I also cover technologies and topics relevant to content management in general, such as enterprise search, and identity management.
Besides the articles in this section, you may also benefit from the following sources:
If you know of another notable Oracle ECM site, send me an email! I'd define "notable" as any "official" site, or a site that posts useful information at least once per month...
There's been some chatter lately about how HTML 5, the next version of HTML, might make Flash irrelevant. And not only Flash, but also Adobe Flex, Microsoft Silverlight, and Oracle JavaFX might similarly become useless.

This is because the latest version of HTML has a lot of features that were previously confined to advanced animation plug-ins... the three I like the most are:
These features have been necessary for a long time... and even though HTML 5 is not yet a finished standard, most of it is already supported in major browsers: Firefox 3, Internet Explorer 8, and Safari 4. This means that you can create an HTML 5 application right now! Probably the most famous HTML 5 application out there is Google Wave for email, which we are all just dying to try out!
I feel that this kind of competition will be healthy... I'd wager that 90% of what people currently use Flash for could just as easily be done in HTML 5. Also, by being standards compliant, you'll have fewer concerns about vendor lock-in. What happens if Adobe gets into trouble, then is bought out by Computer Associates? No more Flash for you!
However, there is still a problem... currently HTML 5 compliant browsers are only 60% of the market... I know quite a few enterprises that are still on IE 6, fer crying out loud... Flash has the distinct advantage of working on older browsers, and has about a 95% market penetration. Although, last year at this time only 5% of users had an HTML 5 compliant browser, so maybe by May 2010 HTML 5 will be as popular as Flash?
Hard to say...
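For what it's worth, here is a back-of-the-envelope sketch of that guess. It assumes adoption keeps growing in a straight line at last year's pace (5% to 60% in twelve months), which is a big assumption: real adoption curves flatten out as they approach the ceiling. The function name and parameters are just for illustration.

```python
def months_to_reach(target, last_year, now, months_between=12.0):
    """Months until `target` percent share, assuming growth stays linear."""
    rate_per_month = (now - last_year) / months_between
    return (target - now) / rate_per_month

# Figures quoted above: 5% a year ago, 60% today, Flash at roughly 95%.
months = months_to_reach(target=95.0, last_year=5.0, now=60.0)
print("%.1f months" % months)  # prints "7.6 months"
```

So on a straight-line guess, HTML 5 browsers would catch up with Flash in well under a year. Treat that as an optimistic bound, since the IE 6 holdouts are exactly the users who upgrade last.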
This question came up recently in the content management universe... a few weeks back EMC/Documentum unveiled their latest UI at the Gartner conference on Portals and Collaboration... and it was a pretty slick Flex-based UI. A daring move... However, slick UIs don't need Flex. Billy and I got a demo from Jason Bright about Media Beacon's latest app. It was very flashy, and uses pure HTML, CSS, and JavaScript. As Jason told CMS Watch:
"Flex, like ActiveX, Silverlight, and Java Applets before them are, in a sense, replacements to the browser. Each replaces the web browser in a proprietary way. While I love Flex as a technology, I do not think it is a good strategic decision to throw out the traditional browser for a new client-server model no matter how attractive"
The problem boils down to this: there are millions of people dedicated to making the web better; but only one small part of Adobe is dedicated to making Flash better. The same holds true for Silverlight and JavaFX.
If I were writing a one-off rich internet application, I might choose something like Flex, because Flex development time is half what it would be for a similar HTML/CSS/JavaScript app. There are so many browser bugs, and oddities in JavaScript, that it's always a long slog to debug it. With the possible exception of the Google Web Toolkit, there really are no good ways to easily design a flashy HTML/CSS/JavaScript application... whereas designing an application with Flex is relatively simple.
But... if I were making an application for resell, or one that I intended to have other people maintain, I'd be more hesitant to use anything but web standards. HTML 5 is right around the corner; product development cycles are long; and HTML 5 browsers could reach 90% market saturation in 12 months.
All things considered, the best option now is HTML 5...
A while back I blogged about the lack of Oracle UCM "vertical applications". A vertical application is an add-on to an existing product or platform, but one that is industry specific. A lot of Oracle UCM consultants have created very general add-ons, and have sold them along with their services.
On occasion, Oracle implements one of these general features, and the add-on product becomes obsolete. Unsellable... and this can cause some grumpiness... but it doesn't have to be this way.
Joel on Software has recently had a similar rant about people who make add-ons to platforms... but in this case, he's referring to the iPhone. Similar to Oracle UCM, the iPhone is a platform... so you'll get some folks who just "fill the gaps," and others who create entirely new markets. A lot of gap-fillers had their profits crushed when the new iPhone OS rendered their add-ons obsolete. Some quotes:
A good platform always has opportunities for applications that aren’t just gap-fillers. These are the kind of application that the vendor is unlikely ever to consider a core feature, usually because it’s vertical — it’s not something everyone is going to want. There is exactly zero chance that Apple is ever going to add a feature to the iPhone for dentists. Zero.
Or, more succinctly, as Dave Winer once said:
Sometimes developers choose a niche that’s either directly in the path of the vendor, or even worse, on the roadmap of the vendor. In those cases, they don’t really deserve our sympathy.
Yes... If you make a general add-on to Oracle UCM, you have a wider possible audience... but that doesn't mean you'll be able to sell to it all! You'll have a tiny bit of market penetration, and then one day Oracle will just write a clone of what you did.
When it comes to add-ons to platforms, verticals are almost always more profitable. The market might be smaller, but it is much easier to highlight the need to your market, and the competition is less. If you make something good, odds are you'll be able to sell it for a looooong time.
Oracle -- as you know -- plans on purchasing Sun and all their Java-licious technology. This includes the open source Glassfish application server, which is a free competitor to Weblogic, which Oracle obtained in the BEA acquisition... and they both competed with OC4J, which was Oracle's application server prior to 2008.
I -- along with everybody else -- am very curious to see how all this plays out... It certainly appears that OC4J has lost favor, and Weblogic stole the show... but now Oracle "owns" an open-source alternative to Weblogic as well. So which one should you choose? Naturally, this depends a lot on what out-of-the-box features and integrations you need... But if I were a developer creating a new application from scratch, I'd probably go with Glassfish. Besides being open source, it will soon have built-in support for the JRuby/Rails and Jython/Django web frameworks. To me, that says the people behind Glassfish really "get it" when it comes to delivering web frameworks that make developers more productive...
According to Vivek Pandey's blog, the latest preview release of Glassfish v3:
- Provides GlassFish v3 connector and deployer as OSGi module. Which means that deployment of a Python application will trigger Jython Container code.
- Wire up the HTTP request and response at very low level by implementing a GrizzlyAdapter, hence resulting in better runtime performance and scalability using grizzly scalable NIO framework.
- WSGI (Web Server Gateway Interface) is a Python standard to wire a Web Server to Python web frameworks such as Django or TurboGears etc. Jython Container implements WSGI interface and so it would be pretty easy to add support for various Python web frameworks. Currently, we have Django and we will have others such as TurboGears, Pylons etc.
- Currently Jython Container is available thru GlassFish v3 Update Tool. In the future it may appear with GlassFish v3 core distribution.
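To make the WSGI point concrete, here is a minimal sketch of what a WSGI application looks like. This is plain WSGI as the Python standard defines it, not anything specific to the GlassFish Jython Container; the greeting text and the use of PATH_INFO are just illustrative choices.

```python
def application(environ, start_response):
    """Smallest useful WSGI app: reply with a plain-text greeting."""
    # The server passes request details in the `environ` dictionary.
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from %s\n" % path).encode("utf-8")

    # Status line and headers go through the callback supplied by the server.
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])

    # The response body is returned as an iterable of byte strings.
    return [body]
```

Any WSGI-capable container only needs to know about that one callable, which is why frameworks like Django can be plugged into very different servers. In a standard Python install you could serve the same function locally with wsgiref.simple_server from the standard library.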
His blog also has step-by-step instructions about how to enable Jython and Django... with luck, this will be rolled into the final release, so these steps will be easier.
I'm also curious to see what Jake and the AppsLabs boys might think about Glassfish... those guys are building some of Oracle's most "social" applications, and they are big JRuby/Rails fans. I'm more of a Python/Django guy myself. I've said many times that if I were to rewrite the Oracle Content Server from scratch, I'd probably have picked Django as the core framework... But Django in a Java container??? That's even better! Quick coding, easy modifications, plus the reliability of Java.
But that's just for my needs... others may prefer the "Weblogic way" for different reasons.
Sorry for the lack of blogging, folks... Last week was IOUG Collaborate, and I was usually indisposed. For those who didn't make it, you can check out my presentations on Slideshare. I gave my talk on A Pragmatic Strategy for Oracle ECM, as well as my Top 10 Ways To Integrate With Oracle ECM.
Billy put up some of his talks as well. The How To Be A Rock Star with ECM talk was well received... although the slides don't quite capture the whole presentation.
Overall, I was pleased with the turnout... I was kind of bummed out that there wasn't a bigger Oracle ACE presence there. I saw Dan Norris a few times, but there wasn't an 'official' ACE briefing. Oh well... I guess I'll need to wait for Oracle Open World in the fall. With all the new Sun customers and partners, that place is going to be chaos.
If you are attending the IOUG Collaborate conference this year, you might want to check out my talks:
These are both repeats of the ones I gave at Open World 2008 a few months back... although the "Top 10 Ways" have changed a little bit since the introduction of the RIDC connector... I'm planning on something completely different for Oracle Open World this year. ;-)
I'm also doing a book signing after my "Pragmatic Strategy" talk... It will be at 2:30pm on Monday, at the bookstore. The bookstore is in the middle of level 2, outside the entrance to the exhibit hall. If you'd like a signature for either book, swing on by!
Unfortunately, there aren't any plans for the Oracle ACEs to get together... although I'm pretty sure that Dan Norris and others will be attending.

In case you haven't heard, Oracle bought Sun... after Sun was teased by IBM and watched its stock price plummet, Oracle began talks with Sun last Thursday about a possible acquisition...
If you were surprised, don't feel bad... Neither IBM nor Microsoft had a clue this was going to happen.
First thoughts... holy crap! Oracle sure saved Sun from becoming a part of the IBM beast... and now Oracle (more or less) owns Java, and has access to all those developers who maintain it. This is win-win for them both, in my opinion. Sun gets most of their revenue from hardware, which Oracle avoided doing for decades, so overall there's not much overlap in product offerings -- unlike last year's BEA acquisition.
The hardware-software blend is a compelling story... Imagine getting all your Oracle applications and databases pre-installed on a hardware appliance! Not bad... You could even get one of them data centers in a box, slap a bunch of Coherence nodes on each, and have a plug-and-play "cloud computer" of your very own.
Second thoughts... how the heck is the software integration plan going to work? Sun helps direct a lot of open source projects... including JRuby, Open Office, and the MySQL database... not to mention the OpenSSO identity management solution, and the GlassFish portal/enterprise service bus/web stack. The last two are award winning open-source competitors to existing Oracle Fusion Middleware products. Oracle now owns at least 5 portals, and at least 4 identity management solutions... unlike past acquisitions, existing Oracle product lines are going to have to justify themselves against free competitors. I can foresee a lot of uneasy conversations along the lines of:
So, Product Manager Bob... I notice that your team costs the company a lot of money, but your product line isn't even as profitable as the stuff we give away for free... Can you help me out with the logic here?
There are a lot of open source developers shaking in their boots over this... but I'm being cautiously optimistic. Oracle can't "kill" MySQL: there are too many "forked" versions of MySQL already, any one could thrive if Oracle tried to cripple the major player. Likely they will simply try to profit from those who choose to use a bargain brand database. Case in point, Oracle could sell them their InnoDB product, which allows MySQL to actually perform transactions.
Middleware is the big question mark... but with a huge injection of open source developers, products, and ideas, I'm again cautiously optimistic that -- after an inevitable shake-up -- the Middleware offerings would improve tremendously.
And Open World 2009 is going to be a lot more crowded...
Sometimes when I'm working on a big-ish project, I need to quickly whip out a script to alter items in the content server. The old-school way to do this would be to use the IdcCommand application... other folks might prefer a Java application written with the J2EE connectors in the Content Integration Suite (CIS), or maybe even SOAP... but my preference would be to do it all in a scripting language. In particular, Jython.
Jython is a Java implementation of the Python programming language... which is my favorite language these days. Jython did stagnate for many years, stuck on Python 2.2, and more than a little buggy... but the project is alive and kicking and just released Jython 2.5 beta 3, which I recommend you use. I'd wager that the Jython project was revived partly because of envy about the rise of Ruby and JRuby. Whatever the reason, I'm always happy to have new code to play with.
You can invoke any Java libraries in Jython, so naturally you could use SOAP or CIS to make administrative scripts. However, I think the majority of people would prefer a new-ish Java connector for Oracle UCM: the Remote IntraDoc Client (RIDC). In contrast with both CIS and SOAP, the RIDC connector is very lightweight, very fast, and very simple to use. There's no WSDL or J2EE bloat at all; RIDC is just a "Plain Old Java Object" wrapper around UCM web services... so it's very easy to embed in a Java application.
To get started, download the most recent version of the Content Integration Suite from Oracle. This ZIP file contains two folders: one for the new RIDC connector, and one for the standard CIS connector. I'd suggest you take a look at the "ridc-developer-guide.pdf" before you go any further. The samples and JavaDocs are also very useful, but you can peruse them later.
Next, download Jython 2.5b3, and run the installer.
Next, make a folder to contain your UCM Jython scripts. Copy the "jython" launcher file from its install directory to this directory. On Windows, this file is named "jython.bat". Also copy the RIDC library "oracle-ridc-client-10g.jar" to this folder.
Next, edit your copy of the Jython launcher file to make sure the Java classpath includes the RIDC library. You can set this near where they set JAVA_HOME at the top. On Windows, you would edit "jython.bat" and add this:
set CLASSPATH=%CLASSPATH%;D:\FOOBAR\oracle-ridc-client-10g.jar
On Unix, you would edit the "jython" text file, and add something like this:
CLASSPATH=$CLASSPATH:/FOOBAR/oracle-ridc-client-10g.jar
That's it! Now just run "jython" on the command line, and you'll get an interactive shell where you can load Java classes, and use them. Loading them is fairly similar to how you load libraries in Python. For example, the script below will load the RIDC libraries, connect to the content server, run a search, and dump out the results:
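A minimal sketch of such a script, assuming a content server listening for socket connections at idc://localhost:4444 and a user named sysadmin (both placeholders, as are the function names and the sample query):

```python
# Runs under Jython with oracle-ridc-client-10g.jar on the classpath.
# Host, port, user name, and query are placeholders (assumptions).

def build_search_params(query, count=20):
    """Local parameters for a GET_SEARCH_RESULTS service call."""
    return {
        "IdcService": "GET_SEARCH_RESULTS",
        "QueryText": query,
        "ResultCount": str(count),
    }

def run_search(url, user, query, count=20):
    """Connect to the content server, run a search, return (name, title) pairs."""
    # Imported here so this file still loads where the RIDC jar is absent.
    from oracle.stellent.ridc import IdcClientManager, IdcContext

    manager = IdcClientManager()
    client = manager.createClient(url)   # e.g. "idc://localhost:4444"
    context = IdcContext(user)           # the user to run the service as

    # Fill a binder with the service name and its parameters.
    binder = client.createBinder()
    for name, value in build_search_params(query, count).items():
        binder.putLocal(name, value)

    # Send the request and pull the search results out of the response.
    response = client.sendRequest(context, binder)
    results = response.getResponseAsBinder().getResultSet("SearchResults")
    return [(row.get("dDocName"), row.get("dDocTitle"))
            for row in results.getRows()]

# Example use from the interactive shell:
#   hits = run_search("idc://localhost:4444", "sysadmin",
#                     "dDocTitle <substring> `test`")
#   for name, title in hits:
#       print("%s: %s" % (name, title))
```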
Remember: whitespace is relevant in Python, so watch your indentations...
You can easily expand on this to create scripts to run archives, batch update metadata fields, resubmit items to the indexer, or run them through a converter to generate PDFs or HTML. Also, there are multiple ways you can set up the security if you don't want to send the password with every request, or if you want to use SSL instead of clear-text sockets. See the RIDC documentation for examples.
Enjoy!
On my recent book tour, I presented some real-world examples of successful UCM strategies. It included some tips and warnings that Andy and I used to help us write the book... and shared some hard return-on-investment numbers from existing UCM clients. I uploaded the presentation to Slideshare, for those of you who were interested... or the lazy can just view it below:
I spent some time talking about the basics... what problems does ECM solve? What causes initiatives to fail? How do you define and measure success? And what are some tips for ensuring success? This is more strategy than technical, so hopefully everybody on your ECM team will get something useful out of it.
The hard numbers for cost savings came from the Survive or Thrive With UCM talks that Oracle has been touting recently. There's a lot of good information in those webcasts, which I won't repeat here. I'll just strongly encourage you to check them out.
Naturally... the people who had the best success to report were those who were the most disciplined in taking metrics. How much less paper are we printing? How much time is saved because the process is now automated? How much easier is it for employees / customers / partners to find the information they need? How much faster can you deploy new web sites? How much faster can you perform comprehensive audits? How much extra revenue can we credit to the system? How can you prove value to your boss?
If you don't ask the hard questions, and make measurements before and after, it will be difficult to ever quantify success...
Last week Bob Rhubart interviewed Billy Cripe, Vince Salvato and myself about Enterprise 2.0. Bob will be releasing it in three chunks, which you can download with the links below:
It was fun to put these together... and thanks a lot to Bob for editing all of our ramblings into easy to follow chunks! Feel free to comment on these podcasts below...
If you watched Michelle's Oracle ECM Community Call on March 10 -- or you spotted one of the leaks in the twitscape -- you would have heard about the new site for Oracle ECM announcements: ECM Alerts.
She worked really hard to put this together and promote it, and already its Google Rank is impressive...
The goal behind the blog is to give a forum for Oracle ECM Product Managers to announce the latest news about each of their products. It will contain product release information, integrations, samples, and general how-tos for most of the products in the Universal Content Management suite. The site makes good use of tags and categories, so you can subscribe to only the product announcements that matter to you. And because everything is piped through Feedburner, you can subscribe to alerts by email or RSS.
The product managers seem to like the idea, and already there are a good number of product alerts. I'd wager that it will take a few more weeks to get everybody on board with this... after which it will likely be the best place to get Oracle ECM Announcements.
For existing ECM customers who were used to the Stellent customer newsletters, this will be a welcome addition.

The White House just launched their latest Democracy 2.0 web site: Recovery.gov. It helps you get up-to-date info about how your stimulus money is being spent. It's pretty slick, although it appears to be down right now. It's running the open source Drupal content management system... which is the same CMS I use to run my own blog.
As Alex noted in the comments, they are using a customization of theme recovery_v3, and they appear to have re-written a lot of the components from scratch. Might they contribute their customizations back to the Drupal community?
Another stimulus-related web site you should check out is StimulusWatch.org, which lets citizens vote on prospective city projects that might get some of the federal money. These are not yet approved by the federal government, so voice your opinion before it's too late!
Naturally, I would have gotten a warm fuzzy if Recovery.gov used Oracle ECM, but I'm just jazzed that they are using version control at all! I'd like to take this to the next level, and force Congress to use something like Subversion to write legislation... Then we'd know exactly who to blame for specific bills ;-)
Jake recently had a good post over at Apps Labs about the importance of "Social Search". He has promised a part 2 today... so I encourage you to check it out.
The question is, how do we make enterprise search better? Some people complain that enterprise search should behave more like Google search, which I vehemently disagree with, for one primary reason: enterprise search is a FUNDAMENTALLY different problem than internet search. Here are some examples:
The internet search problem is like this:
The whole problem reminds me of a scene from The Zero Effect:
Now, a few words on looking for things. When you look for something specific, your chances of finding it are very bad... because of all things in the world, you only want one of them. When you look for anything at all, your chances of finding it are very good... because of all the things in the world, you're sure to find some of them.
Internet search is like looking for anything at all... whereas enterprise search is like looking for something specific:
Trying to solve both problems with the same exact tool will only lead to frustration...
Now... Solving this problem with social tools is a much easier, and arguably better, approach. People usually don't want to know the answer, people usually want to know who knows the answer. This is an observation as old as Mooers' Law (1959) about information management:
“An information retrieval system will tend not to be used whenever it is more painful and troublesome for a customer to have information than for him not to have it.”
Fifty years later, and folks still don't quite seem to get it... The average user does not want to read enterprise content! They don't read documentation on the subject, nor do they read books on the subject, nor do they read blogs on the subject... In general, people don't care to actually learn anything new; they just want the quick answer that lets them move on and get back to their normal job. Most people look for information so they can perform some kind of task, and then they'll be more than happy to forget that information afterward. It's a rare individual who learns for the sake of knowledge... These folks are sometimes called Mavens, and everybody wants to be connected with these Mavens so they can do their jobs better. As a result, these Mavens will always be overwhelmed with phone calls, emails, and meeting invites.
As those mediums became flooded, some of your resources fled to other places -- like Twitter, or Facebook, or enterprise social software -- and forced would-be connectors to follow. This constant movement (or hiding) helps a bit... but it's only a matter of time before those mediums get flooded as well, and the noise overwhelms the signal.
In order to truly solve the enterprise search problem, you need to first understand why people may choose to never use enterprise search, no matter how good it is... then try to bring them back into the fold with socially enabled enterprise search tools. Don't just help people find information; help them find somebody who understands what the information means. Connecting people with mere words can easily backfire, and might actually make these people a burden on society. Instead, connect them with real, live humans who are eager to teach the knowledge being sought. At the same time, you need to work hard to protect these Mavens, so they don't flee your system in favor of another.
This is a problem that Google's search engine cannot solve -- mainly for privacy and trust reasons -- but it is 100% solvable in the enterprise. I'm just wondering why so few have done it...
There's a great developer site out there called 99 Bottles Of Beer. It shows you how to output the lyrics of the oh-so-annoying camp song in well over 1000 different programming languages.
Whoa... 1000 languages, you say? Yes, there are well over 1000 known programming languages, but please keep in mind how developers think. Most of these languages are clunky, impractical, or intentionally impossible to use. These are sometimes called esoteric languages, or even Turing tarpits. Here are some of my favorite bizarre programming languages:
Kidding aside, there's a pretty good argument that learning how to print out 99 bottles of beer is a useful exercise when learning a new language. You need to learn the syntax of variables, conditionals, text output, and loops. Not to mention the fact that every language has nuances that can sometimes help you to further minimize your code base, but not sacrifice clarity... there are probably a dozen ways to write it in each language, each with different benefits.
So -- seeing how Oracle UCM was being left out -- I submitted the below code to their site. 99 Bottles of Beer, in IdocScript:
<$numBottles = "99", bottleStr = " bottles "$>
<$loopwhile (numBottles > 0)$>
    <$verse = numBottles & bottleStr & "of beer on the wall,\n" & numBottles & bottleStr & "of beer!\n" & "Take one down, pass it around,\n"$>
    <$numBottles = numBottles - 1$>
    <$if numBottles > 0$>
        <$if numBottles == 1$>
            <$bottleStr = " bottle "$>
        <$endif$>
        <$verse = verse & numBottles & bottleStr & "of beer on the wall!\n"$>
    <$else$>
        <$verse = verse & "no more bottles of beer on the wall!\n"$>
    <$endif$>
    <$verse$>
<$endloop$>
Nifty, eh?
Naturally, there are multiple ways to do this... you could use resource includes, localization strings, result sets, etc. But that's part of the fun of learning a new language. I'll leave it as an exercise for my audience to make it better.
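For comparison, here is a rough Python translation of the same logic. It's a sketch meant to mirror the IdocScript above step for step, not a polished entry; the function name and the list-of-verses return value are my own choices for illustration.

```python
def verses(start=99):
    """Return the song as a list of verses, mirroring the IdocScript loop."""
    result = []
    bottle_str = " bottles "
    num_bottles = start
    while num_bottles > 0:
        verse = ("%d%sof beer on the wall,\n"
                 "%d%sof beer!\n"
                 "Take one down, pass it around,\n"
                 % (num_bottles, bottle_str, num_bottles, bottle_str))
        num_bottles -= 1
        if num_bottles > 0:
            # Switch to the singular once a single bottle is left.
            if num_bottles == 1:
                bottle_str = " bottle "
            verse += "%d%sof beer on the wall!\n" % (num_bottles, bottle_str)
        else:
            verse += "no more bottles of beer on the wall!\n"
        result.append(verse)
    return result

# Print the whole song, one verse per block.
for verse in verses():
    print(verse)
```

Same variables, same conditionals, same loop... which is kind of the point of the exercise.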
One of the biggest challenges in social networks is keeping them updated. When you first log in, it's a blank slate, and you have to find all your friends and make connections to them. This is a bit of a pain, so sites like Facebook and LinkedIn allow you to import your email address book. They then data-mine the address book to see who you know that might already be in the network, which helps you make lots of connections quickly.
Ignoring the obvious security and privacy concerns, there are still two big problems with this:
In my latest book, I give some practical advice about how Content Management fits in with social software and Enterprise 2.0 initiatives... One of the ideas that I liked to drive home is that not all connections are equal, and it takes a lot of effort to keep quality information in your social software systems. Who is connected to whom? Which connections are genuine? And who is just a "link mooch" who is spamming people with "friend" requests just to ratchet up his ranking?
That latter one is particularly problematic on LinkedIn... It's littered with sub-par recruiters who send friend request spam so they can get something from you... but they never care to do anything for you.
Luckily, in the enterprise these problems can be solved relatively easily: data mine your email archives for who is connected to whom! By monitoring a host of statistics on who emails whom, about what, and when, you have a tremendously powerful tool for building social maps. You can determine who is connected to whom, who is an expert on which subject, and where the structural holes are in your enterprise. And you never need to maintain your connections! Any time you send a message to a friend, your social map is automatically rebuilt for you!
In order to do so, you'll need to run some data mining tools to find answers to the following questions:
Unfortunately, many employers have a policy against using company email for personal communications. Ironically, this policy could hurt the employer in the long run, because analyzing the violations of that policy is frequently the best way to determine who is well connected in your company! So, before you deploy any social software in the enterprise, encourage your employees to goof off via email (within reason), and set up some technology to data-mine your email archives (like Oracle Universal Online Archive, or something similar). Then keep tuning your map based on the email messages people send.
That will help you hit the ground running with enterprise social software...
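To give a flavor of what that data mining might look like, here is a toy sketch. It assumes you have already extracted (sender, recipient) pairs from your archive, and it uses one simple heuristic of my own choosing: a connection only counts as genuine if mail has flowed in both directions. The function names are hypothetical, and a real tool would also weigh subject matter and timing.

```python
from collections import Counter

def build_social_map(messages):
    """Map each reciprocated pair of people to their total message count.

    `messages` is an iterable of (sender, recipient) tuples.
    """
    sent = Counter(messages)
    ties = {}
    for (sender, recipient), count in sent.items():
        replies = sent.get((recipient, sender), 0)
        # One-way mail (spam, announcements) is not treated as a connection.
        if sender != recipient and replies > 0:
            ties[frozenset((sender, recipient))] = count + replies
    return ties

def most_connected(ties):
    """Rank people by how many genuine connections they have."""
    degree = Counter()
    for pair in ties:
        for person in pair:
            degree[person] += 1
    return degree.most_common()

messages = [
    ("ann", "bob"), ("bob", "ann"), ("ann", "bob"),
    ("ann", "dee"), ("dee", "ann"),
    ("carl", "ann"),  # never answered, so not a tie
]
ties = build_social_map(messages)
print(most_connected(ties)[0])  # prints ('ann', 2)
```

Because the map is rebuilt from the message log each time, nobody has to maintain their connections by hand, and the one-way "link mooch" traffic drops out on its own.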
UPDATE: This book tour has been rescheduled for March 17th-19th.
Well, it's not really a book tour... but Andy and I will be visiting 3 cities for roundtable discussions on "Pragmatic Content Management". Oracle is organizing the whole shindig, and space will be limited... Andy will be giving a talk on Pragmatic ECM strategy, then I will present on implementation advice. Then there will be a 30-minute roundtable discussion, and we'll wrap it up before lunch.
For more specific information, please read the official invitation from Oracle. Here are the cities and dates:
If you want a book signed, please register and drop by!
I'm a power hater. I don't hate often, but when I do, I do it with gusto. So I have to say, this pile of vaporware called "The Semantic Web" is really starting to tick me off...
I'm not sure why, but recently it seems to be rearing its ugly head again in the information management industry, and wooing new potential victims (like Yahoo). I think it's trying to ride the coattails of Web 2.0 -- particularly folksonomies and microformats. Nevertheless, I feel the need to expose it as the massive waste of time, energy, and brainpower that it is. People should stay focused on the very solvable problem of context, and thoroughly avoid the pipe dreams about semantics. Keep it simple, and you'll be much happier.
First, let's review what the "Semantic Web" is supposed to be... A semantic web is about a system that understands the meaning of web pages, and not merely the words on the page. It's about embedding information in your pages so computers can understand what things are, and how they are related. Such a beast would have tremendous value:
"I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize." -- Tim Berners-Lee, Director of the W3C, 1999
Gee. A future where human thought is irrelevant. How fun.
First, notice that this quote was from 1999. It's been ten years since Timmy complained that the semantic web was taking too long to materialize. So what has the W3C got to show for their decade of effort? A bunch of bloated XML formats that nobody uses... because we apparently needed more of those. By way of comparison, Timmy released the first web server on August 6, 1991... Within 3 years there were 4 public search engines, a solid web browser, and a million web pages. If there was actually any value in the "Semantic Web," why hasn't it emerged some time in the past 18 years?
I believe the problem is that Timmy is blinded by a vision and he can't let go... I hate to put it this way, but when compared against all other software pioneers, Timmy's kind of a one trick pony. He invented the HTTP protocol and the web server, and he continues to milk that for new awards every year... while never acknowledging the fact that the web's true turning point was when Marc Andreessen invented the Mosaic Web Browser. I'm positive Timmy's a lot smarter than I, but he seems stuck in a loop that his ego won't let him get out of.
The past 10,000 years of civilization have taught us the same thing over and over: machines cannot replace people, they can only make people more productive by automating the mundane. Once machines become capable of solving the "hard problems," some wacky human goes off and finds even harder problems that machines can't solve alone... which then creates demand for humans to solve that next problem alone, or build a new kind of machine to do so.
Seriously... this is all just basic economics...
Computers can only do what they are told; they never "understand" anything. There will always be a noticeable gap between how a computer works, and how a human thinks. All software programs are based on symbol manipulation, which is a far cry from processing a semantically rich paragraph about the meaning of data. Well... isn't it possible to create a software program that uses symbol manipulation to "understand" semantics? Mathematicians, psychologists, and philosophers say "hell no..."
The Chinese Room thought experiment pretty clearly demonstrates that a symbol manipulation machine can never achieve true "human" intelligence. This is not to imply human brains are the only way to go... merely that if your goal is to mimic a human you're out of luck. Even worse, Gödel's Incompleteness Theorems show that any consistent system of formal logic powerful enough to express arithmetic (which covers mathematics, software, algorithms, etc.) is fundamentally limited. There are true statements it can never prove, and a system that claims to prove everything must be inconsistent, which means it proves false statements too! Clearly, there are fundamental limits to what computers can do, one of which is to understand "meaning".
Therefore, even in theory, a true "semantic web" is impossible...
Well... who the hell cares about philosophical purity, anyway? There are many artificial intelligence experts working on the semantic web, and they rightly observe that the system doesn't have to be equivalent to human intelligence... As long as the system behaves like it has human intelligence, that's good enough. This is pretty much the Turing Test for artificial intelligence. If a human judge interacts with a machine, and the judge believes he is interacting with a real live human, then the machine has passed the test. This is what some call "weak" artificial intelligence.
Essentially, if it walks like a duck, and talks like a duck, then it's a duck...
Fair enough... So, since we can't give birth to true AI, we'll get a jumble of smaller systems that together might behave like a real, live human. Or at least a duck. This means a lot of hardware, a lot of software, a lot of data entry, and a lot of maintenance. Ideally these systems would be little "agents" that search for knowledge on the web, and "learn" on their own... but there will always be a need for human intervention and sanity checks to make sure the "smart agents" are functioning properly.
That raises the question, how much human effort is involved in maintaining a system that behaves like a "weak" semantic web? Is the extra effort worth it when compared to a blend of simpler tools and manual processes?
Unfortunately, we don't have the data to answer this question. Nobody can say, because nobody has gotten even close to building a "weak" semantic web with much breadth... Timmy himself said in 2006 that "This simple idea, however, remains largely unrealized." Some people have seen success with highly specialized information management problems that had strict vocabularies. However, I'd wager that they would have equivalent success with simpler tools like a controlled thesaurus, embedded metadata, a search engine, or pretty much any relational database in existence. That ain't rocket science, and each alternative is older than the web itself...
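To show just how un-rocket-science those simpler tools are, here's a minimal sketch of a controlled thesaurus plus embedded metadata plus a tiny keyword index. Everything in it is made up for illustration: the thesaurus entries, the sample documents, and the function names `normalize`, `build_index`, and `search` are all my own assumptions, not anybody's real product.

```python
# Hypothetical controlled thesaurus: maps variant terms to one preferred term.
THESAURUS = {
    "automobile": "car",
    "auto": "car",
    "car": "car",
    "invoice": "invoice",
    "bill": "invoice",
}

# Hypothetical documents with embedded metadata (keywords picked by a human).
DOCUMENTS = [
    {"id": 1, "title": "Fleet maintenance schedule", "keywords": ["automobile", "maintenance"]},
    {"id": 2, "title": "Q3 vendor billing", "keywords": ["bill", "vendor"]},
    {"id": 3, "title": "Parking policy", "keywords": ["car", "parking"]},
]


def normalize(term):
    """Return the preferred term for a word, or the lowercased word if unknown."""
    term = term.lower()
    return THESAURUS.get(term, term)


def build_index(documents):
    """Build an inverted index from preferred terms to document ids."""
    index = {}
    for doc in documents:
        for keyword in doc["keywords"]:
            index.setdefault(normalize(keyword), set()).add(doc["id"])
    return index


def search(index, query):
    """Return sorted ids of documents tagged with every term in the query."""
    terms = [normalize(word) for word in query.split()]
    if not terms:
        return []
    matches = set(index.get(terms[0], set()))
    for term in terms[1:]:
        matches &= index.get(term, set())
    return sorted(matches)


INDEX = build_index(DOCUMENTS)
```

With this toy data, a search for "auto" finds the documents tagged "automobile" and "car", because a human decided ahead of time that those words mean the same thing. No "understanding" required, just a lookup table and somebody willing to maintain it.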
Now... to get the "weak semantic web" we'll need to scale up from one highly specialized problem to the entire internet... which yields a bewildering series of problems:
I'm sorry... but you're fighting basic human nature if you expect all this to happen... my feeling is that for most "real world" problems, a "semantic web" is far from the most practical solution.
So, where does this leave us? We're not hopeless, we're just misguided. We need to come down a little, and be reasonable about what is and is not feasible. I'd prefer if people worked towards the much more reachable goal of context sensitivity. Just make systems that gather a little bit more information about a user's behavior, who they are, what they view, and how they organize it. This is just a blend of identity management, metadata management, context management, and web trend analysis. That ain't rocket science... And don't think for one second that you can replace humans with technology: instead, focus on making tools that allow humans to do their jobs better.
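Here's a rough sketch of what I mean by context sensitivity, assuming nothing fancier than a list of tagged documents a user has already viewed. The document structure, the tags, and the function names `build_profile` and `rerank` are hypothetical; a real system would pull this data from its identity and metadata stores.

```python
from collections import Counter


def build_profile(viewed_docs):
    """Count how often each tag appears in the documents a user has viewed."""
    profile = Counter()
    for doc in viewed_docs:
        profile.update(doc["tags"])
    return profile


def rerank(results, profile):
    """Order results by tag overlap with the profile; ties keep their original order."""
    def score(doc):
        return sum(profile[tag] for tag in doc["tags"])
    return sorted(results, key=score, reverse=True)


# Hypothetical viewing history for one user.
viewed = [
    {"title": "Records retention policy", "tags": ["records", "legal"]},
    {"title": "Legal hold checklist", "tags": ["legal"]},
]

# Hypothetical search results, in the order the search engine returned them.
results = [
    {"title": "Web site style guide", "tags": ["web", "design"]},
    {"title": "Contract archive howto", "tags": ["legal", "archive"]},
    {"title": "Records disposal form", "tags": ["records"]},
]

reranked = rerank(results, build_profile(viewed))
```

The search engine still does the searching, and the human still does the thinking. The only "smarts" here is a tally of what this person tends to look at, used to nudge the more relevant results toward the top.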
Of course, if the Semantic Web goes away, then I'll need to find something else to power hate. I'm open to suggestions...
Yikes... Confusing, unclear, and cluttered since July of 2007... Not quite a ringing endorsement from the "crowd," eh?
The Wikipedia article for the Association for Information and Image Management isn't any better... at least Stellent's tiny tiny page is excusable since it doesn't exist as a company anymore. Considering the fact that folks like IBM, Oracle, EMC, and Microsoft all have product suites in this industry -- and considering how all of them tout blogs and wikis -- you'd think that somebody would have cleaned up Wikipedia by now.
I guess we all have better things to do...
Personally, I find this a refreshing reminder that the "semantic web" will NOT save you. Unless you do the hard work of creating new business processes around new information management technology, you'll just be cluttering your enterprise with ever more outdated, useless, and false data.
Back at Oracle Open World 2008, Oracle gave some lip service to how they would get into cloud computing... in case you are not familiar with the term, "cloud computing" is a way of designing your systems so that your data resources (and sometimes your services) behave as if they are "in the internet cloud." It's a combination of a service-oriented architecture, software-as-a-service, and storage-as-a-service. Developers love it, but system administrators are still a bit wary...
Basically, you rent the computational power and storage you need, and only pay for what you use. In theory you can rely on your provider -- such as Google or Amazon.com -- to take care of backups for you. It's a great idea for startups (Twitter does it) and mid-sized companies, so they can keep costs down, while still leaving room to grow. For large companies with their own dedicated data centers, cloud computing makes less sense for production software... but it's usually a great idea for development and testing.
Anyway... I was curious how Oracle's "Cloud" strategy would develop... and I was pleasantly surprised to find some recent collaboration between Amazon and Oracle. They put together some Best Practices for Oracle In The Cloud, which I found on Justin Kestelyn's blog:
I really like the idea of encrypting database backups, and storing them in the cloud. That's an excellent idea, for pretty much anybody... and it is supported back to Oracle Database 9i. Check out the Cloud Backup Whitepaper for more info...
I also really see the value in using the Amazon cloud as the persistence layer for archives. The Oracle Universal Online Archive could be a real killer app, but proving its value will need about a terabyte of storage, just to do a proof-of-concept. Unfortunately, that's not exactly something you can run on a VMware virtual machine... but you could do it as an Amazon Machine Image (AMI).
I wouldn't be surprised if we saw more and more archiving solutions that use Amazon's Cloud for persistence...
I had expected that it would take another 3 weeks to release this, but my second book is now available for purchase! As promised, this is more of a business strategy book, and less of a technical book... however, Andy and I did sneak in some good implementation details along the way. We designed this book so every member of your ECM team should get something useful out of it.
The purpose of the book is to present what we call a "pragmatic strategy for content management." For multiple reasons -- both political and technical -- it is rarely feasible for all of your content management products to be from one vendor. Perhaps you just merged with another company and you each have different vendors; perhaps you need blogs and wikis now and cannot wait for your ECM vendor to create a decent offering; perhaps SharePoint has grown like a fungus in your enterprise, and now you need some way to manage the insanity.
Some say the solution is rationalization: consolidate all content into one system... but that's not the whole story. You don't want to wind up like those poor saps running Lotus Notes, do you? Your users will rebel if you take away their nice collaboration tools, or if you tell them they can't have new ones. Entire departments will collapse if you eliminate content silos without any concern for users' productivity.
Instead, the pragmatic approach is to do the following:
The book is 250 pages long... but you don't have to read the whole thing. The chapter breakdown is as follows:
Chapters 1, 2, and 8 are relevant no matter which vendor you use for Enterprise Content Management. We do mention Oracle numerous times, but you can just BLEEEEEEP over that if you use tools from different vendors.
Chapters 3 through 7 show how to implement a "pragmatic ECM strategy" using Oracle tools. Some of this data may or may not be relevant to non-Oracle customers. In most cases, you should find it helpful to see what is possible, so you can determine the distance between where you are now, and where you want to be tomorrow.
I worked pretty hard on this, and I'm relatively pleased with the results... but I'm sure the haters out there will find something to complain about ;-)
Enjoy!