A friend of mine was on a flight a few years back with a Business Process Management (BPM) expert, who had done a lot of consulting for various car companies... the fellow told an interesting tale about how one single bad business metric made him swear off GM cars forever... and it just might be a major reason for their downfall.
The business process seemed innocuous enough at first glance... GM wanted to control costs on their auto parts. So, the process was to test every auto part, and look for the most expensive parts that were also the most reliable. Next, ask suppliers for a cheaper version of that part. Sure, the cheaper version of that part might fail more often... but so what? These are the parts that were severely over-engineered. You don't need space shuttle quality parts in a minivan... so what's the harm?
Notice the problem? If not, don't worry... neither did the millionaires who ran GM.
Now, consider how a rival car company dealt with the same problem. They would also run tests on every car part. They would also keep metrics on which car part failed the least, and which failed the most. However, they did very different things with the data. The rival company took the parts that failed the most often, and either demanded higher quality versions, or searched for a new vendor. Sometimes these changes would increase the price of the part, sometimes they would decrease it.
Now do you see the problem? Each business process had an overall side effect on the quality of the cars produced. As rival companies made their cars more and more reliable, GM was making theirs less and less reliable! Instead of focusing cost cutting on the overall finished product, they decided to tie cost cutting directly to making lower quality cars!
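If the side effect is hard to picture, here is a toy simulation of the two rules. Everything in it is an assumption made up for illustration: the part names, the prices, the failure rates, and the size of each change per round. It only shows the direction each rule pushes the total failure count, not anything measured at a real car company.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Part:
    name: str
    cost: float          # dollars per unit (made-up numbers)
    failure_rate: float  # failures per 1000 cars (made-up numbers)


def cheapen_most_reliable(parts):
    """The cost-cutting rule from the story: take the most reliable part
    and swap it for a cheaper version that fails more often."""
    target = min(parts, key=lambda p: p.failure_rate)
    cheaper = replace(target, cost=target.cost * 0.8,
                      failure_rate=target.failure_rate + 5)
    return [cheaper if p is target else p for p in parts]


def improve_least_reliable(parts):
    """The rival's rule: take the part that fails most often and demand
    a better one, even if the price goes up."""
    target = max(parts, key=lambda p: p.failure_rate)
    better = replace(target, cost=target.cost * 1.1,
                     failure_rate=target.failure_rate / 2)
    return [better if p is target else p for p in parts]


def total_failures(parts):
    """Sum of failure rates across every part in the car."""
    return sum(p.failure_rate for p in parts)


def run(policy, parts, rounds):
    """Apply the same purchasing rule for a number of review rounds."""
    for _ in range(rounds):
        parts = policy(parts)
    return parts


START = [
    Part("alternator", 120.0, 2.0),
    Part("water pump", 60.0, 8.0),
    Part("door latch", 15.0, 20.0),
]

cost_cutter = run(cheapen_most_reliable, START, 5)
rival = run(improve_least_reliable, START, 5)
print(total_failures(START), total_failures(cost_cutter), total_failures(rival))
```

Each rule looks sensible in any single round. The difference only shows up when you apply the same rule over and over, which is exactly what a business process does.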
After realizing this, and being completely unable to get anybody at GM to change their metrics, the gentleman decided to swear off GM cars forever...
Peter Senge warned about similar problems back in 1990 in his famous book The Fifth Discipline. Side effects, negative feedback loops, and simple delays in cause-and-effect can wreak havoc on any business process you put together. No matter what your metric, there is always a case where a "good" result is a very bad thing... The key is to try to predict how that could be possible... otherwise, by doing your job better and better, you just might be dooming your company to mediocrity.
Naturally, it's probably unfair to blame this all on one single business process. There are also the armies of people asleep at the switch who should have done something to correct it. Unfortunately, the folks who design the business processes are usually unable or unwilling to accept this harsh reality... and if they are politically powerful, bad processes remain. This is why we need "nice" tools that help gather information about these negative consequences that are outside of the model, and make it clear that it's time to change...
That's one of the stated goals of Enterprise 2.0 tools... but even they can't help you unless you first try to build up trust and camaraderie in your company. Only then is it easy for people to accept the harsh reality.
As many of you know, Twitter limits you to only 140 characters in each "tweet." That doesn't sound like much... but if you try you can cram a good deal of data in those 140 characters! In fact, Quasimondo figured out how to tweet a pretty good version of the Mona Lisa!
The technique was pretty clever: Tweet in Chinese! Twitter allows you to use UTF-8 characters, which means if you pick a language with a lot of possible letters -- like Chinese -- you can encode a great deal of data into one single letter. If properly encoded, you can cram 210 bytes of data into 140 Chinese letters: at 12 bits per letter, 140 letters carry 1,680 bits, which is exactly 210 bytes.
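For the curious, here is a minimal sketch of that kind of packing. It is not necessarily the scheme Quasimondo used. The assumption is a plain 12-bits-per-character mapping onto the first 4,096 code points of the CJK Unified Ideographs block (U+4E00 through U+5DFF), and the names `pack` and `unpack` are invented for this example.

```python
BASE = 0x4E00        # first code point of the CJK Unified Ideographs block
BITS_PER_CHAR = 12   # 2**12 = 4096 characters, U+4E00..U+5DFF


def pack(data: bytes) -> str:
    """Pack bytes into CJK characters, 12 bits per character."""
    bits = int.from_bytes(data, "big")
    nbits = len(data) * 8
    # Pad on the right so the bit count is a multiple of 12.
    pad = (-nbits) % BITS_PER_CHAR
    bits <<= pad
    nchars = (nbits + pad) // BITS_PER_CHAR
    chars = []
    for i in range(nchars):
        shift = (nchars - 1 - i) * BITS_PER_CHAR
        chars.append(chr(BASE + ((bits >> shift) & 0xFFF)))
    return "".join(chars)


def unpack(text: str, nbytes: int) -> bytes:
    """Reverse pack(); nbytes is needed to drop the padding bits."""
    bits = 0
    for ch in text:
        bits = (bits << BITS_PER_CHAR) | (ord(ch) - BASE)
    pad = len(text) * BITS_PER_CHAR - nbytes * 8
    return (bits >> pad).to_bytes(nbytes, "big")


payload = bytes(range(210))   # stand-in for 210 bytes of image data
tweet = pack(payload)
print(len(payload), "bytes ->", len(tweet), "characters")
```

The receiver has to know the original byte count, because a payload that is not a multiple of three bytes gets padded out to a whole character.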
So, the guy came up with a way to sketch the Mona Lisa in about 200 bytes, then encoded it into 140 Chinese letters. You can see the results below, which look pretty cool. The English translation is a tad odd, however:
The whip is war
that easily comes
framing a wild mountain.
Hello, you in the closet,
singing--posing carved peaks
of sound understanding.
Upon a kitchen altar
visit a prostitute--
an ugly woman saint--
lonesome mountain valley,
your treasury: a dumb corpse and
funeral car, idle choke open.
exactly what you would call nervous.
Well, do not suggest recalcitrance
those who donated sad.
The smell of a rugged frame
strikes cement block once.
Cape. Cylinder. Cry.
Interesting... It makes me wonder what the Tao Te Ching would "look" like... It also makes me wonder what kind of word salad we would get if we "translated" corporate logos into Chinese...
Continuing on my anti-semantic-web rants... I feel obligated to note that I expect very little of use to come out of the latest pony in this show: Wolfram Alpha. There were the obvious insanely overly-optimistic reviews that said it's a search engine that will change the universe forever!!! A few other folks were cautiously optimistic that it might have limited long-term value... I think Spiegel Online had the best summary:
Clever presentation, but a weak database: The soon-to-be-launched Wolfram Alpha search engine is already being touted as the "Google killer." SPIEGEL ONLINE has tested a preliminary version. The conclusion: It knows a lot about aspirin, a little about culture -- and it thinks German Chancellor Angela Merkel's political party is an airport.
It's not really a "Google Killer." It's not even a search engine per se... I'd describe it as a slightly smarter almanac... and it's going to take about 1000 full time employees just to keep it vaguely useful. In general, the more clever your code gets, the more likely it is to go off the deep end and give you very very bad data.
Personally, if the inventors dial back their claims, this might have limited use... if not, it will probably languish like pretty much every similar hunk of software in the past... (anybody out there remember Cyc?)
Back in the early days of the web, Peter Deutsch from Sun penned a classic list: The Fallacies of Distributed Computing. Peter took a long, hard look at dozens of networked systems that failed, and realized that almost every failure made one or more catastrophic assumptions:
- The network is reliable.
- Latency is zero.
- Bandwidth is infinite.
- The network is secure.
- Topology doesn't change.
- There is one administrator.
- Transport cost is zero.
- The network is homogeneous.
Any time you make an assumption along the lines of the fallacies above, your project will almost certainly fail. These fallacies are best explained in an article by Arnon Rotem-Gal-Oz, but today I will focus on fallacy #5: Topology doesn't change, and how the semantic web will fail partially because its creators made this fatal assumption.
As I mentioned before, proponents of the "Semantic Web" are trying to dial down their more grandiose claims, and focus on items with more concrete value. The term that Tim Berners-Lee is using these days is Linked Data. The core idea is to encourage people to put highly structured data on the web, and not just unstructured HTML documents, so the data is easier for machines to read and understand.
Funny thing, people have been doing this for decades. Tons of folks make structured data available as "scrapable" HTML tables, as formatted XML files, or even as plain ol' Comma Separated Value (CSV) files that you can open in Excel. Not to mention the dozens of open web services and APIs... allowing you to do anything from checking stock quotes to building a Google Maps mashup. There really is nothing groundbreaking here... and I find it painfully disingenuous for somebody to claim that such an obvious step was "their magic idea."
Well, not so fast... in an attempt to breathe relevance back into the "Semantic Web," Tim claims that "Real Linked Data" needs to follow three basic rules:
- URLs should not just go to documents, but structured data describing what the document is about: people, places, products, events, etc.
- The data should be important and meaningful, and should be in some kind of standard format.
- The returned structured data has relationships to other kinds of structured data. If a person was born in Germany, the data about that person should contain a link to the data about Germany.
OK... so your data has to not only be in a standard format... but it needs links to other data objects in standard formats. And this is exactly where they fail to heed the warnings about the fallacies of distributed computing! Your topology will always change... not only physical topology, but also the logical topology.
Or, more succinctly, what the heck is the URL to Germany?!?!?
Look... links break all the time. People move servers. People shut down servers. People go out of business. People start charging for access to their data. People upgrade their architecture, and choose a different logical hierarchy for their data. Companies get acquired, or go out of business. Countries merge, or get conquered. Embarrassing content is removed from the web. Therefore, if you use links for identifiers, don't expect your identifiers to work for very long. You will need to spend a lot of time and energy maintaining broken links, when quite frankly you could do quite fine without them in the first place.
An identifier says what something is. A link says where you can find it. These concepts should be kept absolutely separated. It's a bad bad bad bad bad idea to blend the "where" with the "what" into one single identifier... even the much touted Dereferenceable URIs won't cut it, especially from a long-term data maintenance perspective... because the data they dereference to might no longer be there!
So, where does that leave us? Exactly where we are. There are plenty of ways to create a system of globally unique IDs, whether you are a major standards body, or a small company with your own product numbers. But we shouldn't use brittle links... we should use scoped identifiers instead. We need a simple, terse way to describe what something is, that in no way, shape, or form looks like a URL. The identifier is the "what." We need a secondary web service -- like Google -- to tell us the most likely "where." At most, data pages should contain a link to a "suggested web service" to translate the "what" into the "where." Of course... that web service might not exist in 5 years, so proceed with caution.
For example, we could use something similar to Java package names to make sure anybody with a DNS name can create their own identifier... And for countries, there's already a perfectly good ISO standard for country names. So you tell me, which is a better identifier for Germany?
I don't know... Openlinksw.com and DBPedia might not be around in 3 years, and data is supposed to be permanent. Wikipedia will probably be around for a while, but should it go to the English page or the German page? The ISO 3166 identifier may not be clickable, but at least it works for both German and English speakers! Also, if you remove the dots and Google it, the first hit gives you exactly the info you need. Plus, these ISO codes will exist forever, even if the ISO itself gets overrun by self-aware semantic web agents.
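To make the "what" versus "where" split concrete, here is a rough sketch. Treat all of it as hypothetical: the dotted spelling `iso.3166.DE`, the `ScopedId` and `Resolver` names, and the example.org and example.net URLs are placeholders invented for illustration, not an existing standard or service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class ScopedId:
    """The 'what': a naming scope plus a key that is unique inside it."""
    scope: str   # e.g. "iso.3166" (hypothetical dotted spelling)
    key: str     # e.g. "DE"

    @classmethod
    def parse(cls, text: str) -> "ScopedId":
        # Everything before the last dot is the scope, the rest is the key.
        scope, _, key = text.rpartition(".")
        if not scope or not key:
            raise ValueError(f"not a scoped identifier: {text!r}")
        return cls(scope, key)

    def __str__(self) -> str:
        return f"{self.scope}.{self.key}"


class Resolver:
    """The 'where': a replaceable lookup service that maps an identifier
    to its current location. The data itself never stores the URL."""

    def __init__(self) -> None:
        self._locators: Dict[str, Callable[[str], str]] = {}

    def register(self, scope: str, locator: Callable[[str], str]) -> None:
        self._locators[scope] = locator

    def locate(self, ident: ScopedId) -> Optional[str]:
        locator = self._locators.get(ident.scope)
        return locator(ident.key) if locator else None


germany = ScopedId.parse("iso.3166.DE")
resolver = Resolver()
resolver.register("iso.3166", lambda key: f"https://example.org/countries/{key}")
first_home = resolver.locate(germany)

# The server moves: only the resolver changes, the stored identifier does not.
resolver.register("iso.3166", lambda key: f"https://example.net/geo/{key.lower()}")
second_home = resolver.locate(germany)
print(germany, first_home, second_home)
```

When the topology changes, you fix one entry in the resolver instead of hunting down every record that ever mentioned Germany.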
I just can't shake the feeling that using links for identifiers leads to a false sense of reliability. Your identifiers are some of the most important parts of your data: they should be something globally unique and permanent... and the web is anything but permanent.
Let's accept the fact that the topology will change, create a system of globally unique identifiers that are independent of topology, and go from there.
In a recent TED Talk, Tim Berners-Lee laid out his next vision for the world wide web... something he likes to call "Linked Data." Instead of putting fairly unstructured documents on the web, we should also put highly structured raw data on the web. This data would have relationships with other pieces of data on the web, and these relationships would be described by having data files "link" to each other with URLs.
This sounds similar to his previous vision, which he called the "semantic web," but the "linked data" web sounds a bit more practical. This change of focus is good, because as I covered before, a "true" semantic web is at best impractical, and at worst impossible. However, just as before, I really don't think he's thought this one through...
The talk is up on the TED conference page if you'd like to see it. As is typical of all his speeches, the first 5 minutes is him tooting his own horn...
- Ever heard of the web? Yeah, I did that.
- I I I.
- Me me me.
- The grass roots movement helped, but let's keep talking about me.
- I also invented shoes.
I'll address his idea of Linked Data next week -- preview: I don't think it will work. -- but I first need to get this off my chest. No one single person toiled in obscurity and "invented the web." I really wish he would stop making this claim, and stop fostering this "web worship" about how the entire internet should be the web... because it's actually hurting innovation.
Let's be clear: Tim took one guy's invention (hypertext) and combined it with another guy's invention (file sharing) by extending another guy's invention (network protocols). Most of the cool ideas in hypertext -- two-way links, and managing broken links -- were too hard to do over a network, so he just punted and said 404! In addition, the entire system would have languished in obscurity without yet another guy's invention (the web browser). There are many people more important than Tim who laid the groundwork for what we now call "the web," and he just makes himself look foolish and petty for not giving them credit. Tim's HTTP protocol was just a natural extension of other people's inventions that were truly innovative.
Now, Tim did invent the URL -- which is cool, but again, hardly innovative. Anybody who has seen an email address would be familiar with the utility of a "uniform resource identifier." And as I noted before, URLs are kind of backwards, so it's not like he totally nailed the problem.
As Ken says... anybody who claims to have "invented the web" is delusional. It would be exactly as if a guy 2000 years ago asked: "wouldn't it be great if we could get lots of water from the lake, to the center of the town?" And then claimed to have invented the aqueduct.
As Alec says... the early 90s was an amazing time for software. There was so much computing power in the hands of so many people, all of whom understood the importance of making data transfer easier for the average person... Every data transfer protocol was slightly better than the last, and more kept coming every day. It was only a matter of time until some minor improvement on existing protocols was simple enough to create a critical mass of adoption. The web was one such system... along with email and instant messaging.
Case in point: any geek graduate of the University of Minnesota would know that the Gopher hyperlinking protocol pre-dated HTTP by several years. It was based on FTP, and the Gopher client had easily clickable links to other Gopher documents. It failed to gain popularity because it imposed a rigid file format and folder structure... plus Minnesota shot themselves in the foot by demanding royalty fees from other Universities just when HTTP became available. So HTTP exploded in popularity, while Gopher stagnated and never improved.
But, the popularity of the web is a double-edged sword. Sure, it helps people collaborate and communicate, enabling faster innovation in business. But ironically, the popularity of the web is hurting new innovation on the internet itself. Too much attention is paid to it, and better protocols get little attention... and the process for modifying HTTP is so damn political, good luck making it better.
For example... most companies love to firewall everything they can, so people can't run interesting file sharing applications. It wasn't always like this... because data transfer was less common, network guys used to run all kinds of things that synced data and transferred files. But, as the web became much more popular, threats became more common, and network security was overwhelmed. They started blocking applications with firewalls, and emails with ZIP attachments just to lessen their workload... But they couldn't possibly block the web! So they left it open.
This is a false sense of security, because people will figure ways around it. It's standard hacker handbook stuff: just channel all your data through port 80, and limp along with the limitations. These are the folks who can tunnel NFS through DNS... they'll find their way through the web to get at your data.
What else could possibly explain the existence of WebDAV, CalDAV, RSS, SOAP, and REST? They sure as hell aren't the best way for two machines to communicate... not by a long shot. And they certainly open up new attack vectors... but people use them because port 80 is never blocked by the firewall, and they are making the best of the situation. As Bruce Schneier said, "SOAP is designed as a firewall friendly protocol, which is like a skull friendly bullet." If it weren't for the popularity of the web, maybe people would think harder about solving the secure server-to-server communication problem... but now we're stuck.
All this "web worship" is nothing more than the fallacy of assuming something is good just because it's popular. Yes, the web is good... but not because of the technology; it's good because of how people use it to share information... and frankly, if Tim never invented the web, no big loss; we'd probably be using something much better instead... but now we're stuck. We can call it Web 2.0 to make you feel better, but it's nowhere near the overhaul of web protocols that is so badly needed... It's a bundle of work-arounds that Microsoft and Netscape and open source developers bolted on to Web 1.0 to make it suck less... and now it too is reaching a critical mass. Lucky us: we'll be stuck with that as well.
What would this "better protocol" be like? Well... it would probably be able to transfer large files reliably. Imagine that! It would also be able to transfer lots and lots of little files without round-trip latency issues. It would also support streaming media. It would have built-in distributed identity management. It would also support some kind of messaging, so instead of "pulling" a site's RSS feed a million times per day, you'd get "pushed" an alert when something changes. Maybe it would have some "quality of service" options. Most importantly, it would allow bandwidth sharing for small sites with popular content, to improve the reach of large niche data.
All these technologies already exist in popular protocols... but they are not in "the web." All of these technologies are likewise critical for anything like Tim's "Linked Data" vision to be even remotely practical. All things being equal, the web is almost certainly the WORST way to achieve a giant system of linked data. Just because you can do it over the web, that doesn't mean you should. But again... we're stuck with the web... so we'll probably have to limp along, as always. Developers are accustomed to legacy systems... we'll make it work somehow.
Now that I've gotten that out of my system, I'll be able to do a more objective analysis of "Linked Data" next week.
- PART ONE: we began by discussing the importance of "social media," both in and out of the enterprise. We also touched on the importance of a Web 2.0 infrastructure to enable the "right kind" of collaboration. I spent a tiny bit of time discussing the importance of social search, in contrast to typical enterprise search, but not in the depth that I would have liked...
- PART TWO: this talk was more about collaboration. Bob begins with the question, isn't Enterprise 2.0 just "anarchyware?" Meaning, it might do such a good job of "flattening" the corporate structure that it leads to poor decisions and chaos. We dealt with how you can avoid anarchy, sometimes with a slow adoption that your corporate culture can tolerate, and sometimes by putting extra seat belts in these systems. The best enterprise 2.0 architectures should be natural extensions of systems that allow effective committee-based decisions... believe it or not, there are several good ways a big committee can make decisions, although it takes a bit of discipline. I also challenged Friedman's assertion that "The World Is Flat" with a rant about how The World Is Spiky. Random collaboration is pretty much just noise; true innovation will only occur when you get the right people to collaborate...
- PART THREE: finally, we get talking about the digital natives: the latest generation of workers versus the baby boomers. The former will love the latest E 2.0 systems, because they will help them expand their influence, whereas the latter will hate changing their habits and sharing their knowledge so close to retirement... which is a shame, because that is exactly what businesses need. Luckily, systems like Facebook and Twitter are becoming so fun, that there's still hope to bring similar systems into the enterprise. We also discuss why architects should care at all about enterprise 2.0.
It was fun to put these together... and thanks a lot to Bob for editing all of our ramblings into easy to follow chunks! Feel free to comment on these podcasts below...
The question is, how do we make enterprise search better? Some people complain that enterprise search should behave more like Google search, which I vehemently disagree with, for one primary reason: enterprise search is a FUNDAMENTALLY different problem than internet search. Here are some examples:
The internet search problem is like this:
- Heavily linked pages, which can be analyzed for "relevance" and "importance"
- Spam is a constant problem
- People don't want you to monitor their behavior
- People obsess about their Google PageRank
- People obsess about their hit count
- People aren't looking for the answer, they are looking for an answer
The whole problem reminds me of a scene from The Zero Effect:
Now, a few words on looking for things. When you look for something specific, your chances of finding it are very bad... because of all things in the world, you only want one of them. When you look for anything at all, your chances of finding it are very good... because of all the things in the world, you're sure to find some of them.
Internet search is like looking for anything at all... whereas enterprise search is like looking for something specific:
- People don't want general information; they want the 100% definitive answer
- The trust level is usually higher between co-workers, than between random web surfers... or at least it should be. Otherwise, you got bigger problems than information management.
- You know exactly who is running the search
- You know exactly what department they are in, and what content they are likely to need
- You know exactly their previous search history, possibly even their favorite "tags"
- Spam is minimal, or non-existent
- Content uses few, if any, hyperlinks to help determine relevance
- People usually write content because of obligation, and do not usually care about making it easy for their audience to understand
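Because you know who is searching, an enterprise engine can re-rank results in ways a public one can't. Here is a minimal sketch of that idea. The `Searcher` and `Document` shapes, the boost weights, and the sample titles are all assumptions made up for illustration; `text_score` stands in for whatever keyword relevance your existing engine already returns.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Searcher:
    name: str
    department: str
    favorite_tags: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Document:
    title: str
    department: str
    tags: FrozenSet[str]
    text_score: float  # keyword relevance from the underlying engine


def personalized_score(doc: Document, who: Searcher) -> float:
    """Boost documents from the searcher's own department and documents
    carrying tags the searcher has used before (weights are guesses)."""
    score = doc.text_score
    if doc.department == who.department:
        score += 1.0
    score += 0.5 * len(doc.tags & who.favorite_tags)
    return score


def rank(docs: List[Document], who: Searcher) -> List[Document]:
    """Order results for one specific, known person."""
    return sorted(docs, key=lambda d: personalized_score(d, who), reverse=True)


docs = [
    Document("Expense policy (Finance)", "finance",
             frozenset({"policy", "expenses"}), 2.0),
    Document("Expense policy (Sales)", "sales",
             frozenset({"policy"}), 2.2),
]
accountant = Searcher("Pat", "finance", frozenset({"expenses"}))
rep = Searcher("Sam", "sales")

print([d.title for d in rank(docs, accountant)])
print([d.title for d in rank(docs, rep)])
```

The same query returns a different "definitive answer" depending on who asks, which is the whole point.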
Trying to solve both problems with the same exact tool will only lead to frustration...
Now... Solving this problem with social tools is a much easier, and arguably better, approach. People usually don't want to know the answer, people usually want to know who knows the answer. This is an observation as old as Mooers' Law (1959) about information management:
“An information retrieval system will tend not to be used whenever it is more painful and troublesome for a customer to have information than for him not to have it.”
Fifty years later, and folks still don't quite seem to get it... The average user does not want to read enterprise content! They don't read documentation on the subject, nor do they read books on the subject, nor do they read blogs on the subject... In general, people don't care to actually learn anything new; they just want the quick answer that lets them move on and get back to their normal job. Most people look for information so they can perform some kind of task, and then they'll be more than happy to forget that information afterward. It's a rare individual who learns for the sake of knowledge... These folks are sometimes called Mavens, and everybody wants to be connected with these Mavens so they can do their jobs better. As a result, these Mavens will always be overwhelmed with phone calls, emails, and meeting invites.
As those mediums became flooded, some of your resources fled to other places -- like Twitter, or Facebook, or enterprise social software -- and forced would-be connectors to follow. This constant movement (or hiding) helps a bit... but it's only a matter of time before those mediums get flooded as well, and the noise overwhelms the signal.
In order to truly solve the enterprise search problem, you need to first understand why people may choose to never use enterprise search, no matter how good it is... then try to bring them back into the fold with socially enabled enterprise search tools. Don't just help people find information; help them find somebody who understands what the information means. Connecting people with mere words can easily backfire, and might actually make these people a burden on society. Instead, connect them with real, live humans who are eager to teach the knowledge being sought. At the same time, you need to work hard to protect these Mavens, so they don't flee your system in favor of another.
This is a problem that Google's search engine cannot solve -- mainly for privacy and trust reasons -- but it is 100% solvable in the enterprise. I'm just wondering why so few have done it...
There's a great developer site out there called 99 Bottles Of Beer. It shows you how to output the lyrics of the oh-so-annoying camp song in well over 1000 different programming languages.
Whoa... 1000 languages, you say? Yes, there are well over 1000 known programming languages, but please keep in mind how developers think. Most of these languages are clunky, impractical, or intentionally impossible to use. These are sometimes called esoteric languages, or even Turing tarpits. Here are some of my favorite bizarre programming languages:
- Whitespace: no letters, no numbers, no symbols... the only valid syntax is tabs, spaces, and line feeds.
- LOLCODE: the syntax looks like something you'd see on a LOL cats poster. I HAZ A BEERZ ITZ 99! IM IN YR LOOP! IZ VAR LIEK 0? KTHXBYE!
- Piet: just damn pixels on a screen... no letters even!
- Cow: instead of numbers and symbols, you only get moOmOOmoOmOoOOM.
- Brainf**k: trust me... you do NOT want to maintain code written in this language...
Kidding aside, there's a pretty good argument that learning how to print out 99 bottles of beer is a useful exercise when learning a new language. You need to learn the syntax of variables, conditionals, text output, and loops. Not to mention the fact that every language has nuances that can sometimes help you to further minimize your code base, but not sacrifice clarity... there are probably a dozen ways to write it in each language, each with different benefits.
So -- seeing how Oracle UCM was being left out -- I submitted the below code to their site. 99 Bottles of Beer, in IdocScript:
<$numBottles = "99", bottleStr = " bottles "$>
<$loopwhile (numBottles > 0)$>
    <$verse = numBottles & bottleStr & "of beer on the wall,\n" & numBottles & bottleStr & "of beer!\n" & "Take one down, pass it around,\n"$>
    <$numBottles = numBottles - 1$>
    <$if numBottles > 0$>
        <$if numBottles == 1$>
            <$bottleStr = " bottle "$>
        <$endif$>
        <$verse = verse & numBottles & bottleStr & "of beer on the wall!\n"$>
    <$else$>
        <$verse = verse & "no more bottles of beer on the wall!\n"$>
    <$endif$>
    <$verse$>
<$endloop$>
Naturally, there are multiple ways to do this... you could use resource includes, localization strings, result sets, etc. But that's part of the fun of learning a new language. I'll leave it as an exercise for my audience to make it better.
One of the biggest challenges in social networks is keeping them updated. When you first log in, it's a blank slate, and you have to find all your friends and make connections to them. This is a bit of a pain, so sites like Facebook and LinkedIn allow you to import your email address book. They then data-mine the address book to see who you know that might already be in the network, which helps you make lots of connections quickly.
Ignoring the obvious security and privacy concerns, there are still two big problems with this:
- These systems find connections, but they ignore the strength and quality of those connections.
- You have to constantly import your address book if you keep making new friends.
In my latest book, I give some practical advice about how Content Management fits in with social software and Enterprise 2.0 initiatives... One of the ideas that I like to drive home is that not all connections are equal, and it takes a lot of effort to keep quality information in your social software systems. Who is connected to whom? Which connections are genuine? And who is just a "link mooch" who is spamming people with "friend" requests just to ratchet up his ranking?
That latter one is particularly problematic on LinkedIn... It's littered with sub-par recruiters who send friend request spam so they can get something from you... but they never care to do anything for you.
Luckily, in the enterprise these problems can be solved relatively easily: data mine your email archives for who is connected to whom! By monitoring a host of statistics on who emails whom, about what, and when, you have a tremendously powerful tool for building social maps. You can determine who is connected to whom, who is an expert on which subject, and where the structural holes are in your enterprise. And you never need to maintain your connections! Any time you send a message to a friend, your social map is automatically rebuilt for you!
In order to do so, you'll need to run some data mining tools to find answers to the following questions:
- Who do you send emails to? These are the people you claim to be connected to.
- Does this person reply to your emails? If so, the connection is mutual.
- How often do you email? A one-time email is probably not a connection, but a weekly email might be a strong connection.
- How long does it take them to reply to you? A faster reply usually means your communications get priority to them, and they feel a stronger connection to you.
- How long do you take to reply to them? Again, a faster reply from you means that their communications get priority from you, meaning you feel a strong connection as well.
- Do you answer emails about a topic, or just forward them along? Just because you are the "point man" for Java questions, that doesn't mean you "know" Java... but it probably means you "know who knows" Java, which is sometimes even better.
- Does one person usually do all of the initiation of new emails? If so, then this might be a lopsided friendship, or it might just mean that one person has more free time.
- What are the topics of conversation? In reality, the more often you discuss work, the weaker the connection! If you also discuss gossip, news, current events, sports, movies, family, or trivia, then you probably have a stronger connection. The more topics you discuss, the more likely you are to be close friends.
- What is the flow of email from one department to another? If it's peer-to-peer, then these departments are comfortable sharing information. If it always goes through the chain of command, then these departments are socially isolated, and probably unlikely to trust each other.
- Who do you email outside the company? If an employee in the marketing department emailed a friend who works at the company Ravenna, and your sales person is trying to connect with somebody at Ravenna, then these two employees might want to connect.
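A few of these questions can be answered with surprisingly little code. The sketch below covers only three of them: who you email, whether the connection is mutual, and how quickly replies come back. The `Email` record and its field names are assumptions for illustration. A real archive would need header parsing, topic detection, and a lot of privacy review first.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Email:
    msg_id: str
    sender: str
    recipient: str
    sent: datetime
    in_reply_to: Optional[str] = None  # msg_id this message answers, if any


def connection_stats(emails: List[Email]) -> Dict[Tuple[str, str], dict]:
    """Summarise each directed pair (sender -> recipient)."""
    by_id = {e.msg_id: e for e in emails}
    sent_count: Dict[Tuple[str, str], int] = defaultdict(int)
    reply_delays: Dict[Tuple[str, str], List[timedelta]] = defaultdict(list)

    for e in emails:
        pair = (e.sender, e.recipient)
        sent_count[pair] += 1
        original = by_id.get(e.in_reply_to) if e.in_reply_to else None
        if original is not None and original.sender == e.recipient:
            # e.sender answered original.sender: record how long it took.
            reply_delays[pair].append(e.sent - original.sent)

    stats = {}
    for pair, count in sent_count.items():
        a, b = pair
        delays = reply_delays.get(pair, [])
        stats[pair] = {
            "sent": count,                              # how often a emails b
            "mutual": sent_count.get((b, a), 0) > 0,    # does b ever email a?
            "median_reply": median(delays) if delays else None,
        }
    return stats


t0 = datetime(2009, 3, 2, 9, 0)
log = [
    Email("m1", "alice", "bob", t0),
    Email("m2", "bob", "alice", t0 + timedelta(minutes=10), in_reply_to="m1"),
    Email("m3", "alice", "carol", t0 + timedelta(hours=1)),
]
stats = connection_stats(log)
for pair, info in stats.items():
    print(pair, info)
```

In this tiny log, alice and bob come out as a mutual connection with a ten minute reply time, while alice's message to carol is only a claimed connection until carol writes back.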
Unfortunately, many employers have a policy against using company email for personal communications. Ironically, this policy could hurt the employer in the long run, because analyzing the violations of that policy is frequently the best way to determine who is well connected in your company! So, before you deploy any social software in the enterprise, encourage your employees to goof off via email (within reason), and set up some technology to data-mine your email archives (like Oracle Universal Online Archive, or something similar). Then keep tuning your map based on the email messages people send.
That will help you hit the ground running with enterprise social software...
UPDATE: This book tour has been rescheduled for March 17th-19th.
Well, it's not really a book tour... but Andy and I will be visiting 3 cities for roundtable discussions on "Pragmatic Content Management". Oracle is organizing the whole shindig, and space will be limited... Andy will be giving a talk on Pragmatic ECM strategy, then I will present on implementation advice. Then there will be a 30-minute roundtable discussion, and we'll wrap it up before lunch.
For more specific information, please read the official invitation from Oracle. Here are the cities and dates:
- Cincinnati: Tuesday, March 17, 2009
- Memphis: Wednesday, March 18, 2009
- Houston: Thursday, March 19, 2009
If you want a book signed, please register and drop by!
I'm a power hater. I don't hate often, but when I do, I do it with gusto. So I have to say, this pile of vaporware called "The Semantic Web" is really starting to tick me off...
I'm not sure why, but recently it seems to be rearing its ugly head again in the information management industry, and wooing new potential victims (like Yahoo). I think it's trying to ride the coattails of Web 2.0 -- particularly folksonomies and microformats. Nevertheless, I feel the need to expose it as the massive waste of time, energy, and brainpower that it is. People should stay focused on the very solvable problem of context, and thoroughly avoid the pipe dreams about semantics. Keep it simple, and you'll be much happier.
First, let's review what the "Semantic Web" is supposed to be... A semantic web is about a system that understands the meaning of web pages, and not merely the words on the page. It's about embedding information in your pages so computers can understand what things are, and how they are related. Such a beast would have tremendous value:
"I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize." -- Tim Berners-Lee, Director of the W3C, 1999
Gee. A future where human thought is irrelevant. How fun.
First, notice that this quote was from 1999. It's been ten years since Timmy complained that the semantic web was taking too long to materialize. So what has the W3C got to show for their decade of effort? A bunch of bloated XML formats that nobody uses... because we apparently needed more of those. By way of comparison, Timmy released the first web server on August 6, 1991... Within 3 years there were 4 public search engines, a solid web browser, and a million web pages. If there was actually any value in the "Semantic Web," why hasn't it emerged some time in the past 18 years?
I believe the problem is that Timmy is blinded by a vision and he can't let go... I hate to put it this way, but when compared against all other software pioneers, Timmy's kind of a one trick pony. He invented the HTTP protocol and the web server, and he continues to milk that for new awards every year... while never acknowledging the fact that the web's true turning point was when Marc Andreessen invented the Mosaic Web Browser. I'm positive Timmy's a lot smarter than I, but he seems stuck in a loop that his ego won't let him get out of.
The past 10,000 years of civilization have taught us the same things over and over: machines cannot replace people, they can only make people more productive by automating the mundane. Once machines become capable of solving the "hard problems," some wacky human goes off and finds even harder problems that machines can't solve alone... which then creates demand for humans to solve that next problem alone, or build a new kind of machine to do so.
Seriously... this is all just basic economics...
Computers can only do what they are told; they never "understand" anything. There will always be a noticeable gap between how a computer works, and how a human thinks. All software programs are based on symbol manipulation, which is a far cry from processing a semantically rich paragraph about the meaning of data. Well... isn't it possible to create a software program that uses symbol manipulation to "understand" semantics? Mathematicians, psychologists, and philosophers say "hell no..."
The Chinese Room thought experiment pretty clearly demonstrates that a symbol manipulation machine can never achieve true "human" intelligence. This is not to imply human brains are the only way to go... merely that if your goal is to mimic a human you're out of luck. Even worse, Gödel's Incompleteness Theorems prove that any consistent system of formal logic powerful enough to describe arithmetic (and, by extension, the software and algorithms built on it) is fundamentally incomplete. There will always be true statements that it cannot prove, and the only way to "prove everything" is with an inconsistent system that proves false statements too! Clearly, there are fundamental limits to what computers can do, one of which is to understand "meaning".
Therefore, even in theory, a true "semantic web" is impossible...
Well... who the hell cares about philosophical purity, anyway? There are many artificial intelligence experts working on the semantic web, and they rightly observe that the system doesn't have to be equivalent to human intelligence... As long as the system behaves like it has human intelligence, that's good enough. This is pretty much the Turing Test for artificial intelligence. If a human judge interacts with a machine, and the judge believes he is interacting with a real live human, then the machine has passed the test. This is what some call "weak" artificial intelligence.
Essentially, if it walks like a duck, and talks like a duck, then it's a duck...
Fair enough... So, since we can't give birth to true AI, we'll get a jumble of smaller systems that together might behave like a real, live human. Or at least a duck. This means a lot of hardware, a lot of software, a lot of data entry, and a lot of maintenance. Ideally these systems would be little "agents" that search for knowledge on the web, and "learn" on their own... but there will always be a need for human intervention and sanity checks to make sure the "smart agents" are functioning properly.
That raises the question, how much human effort is involved in maintaining a system that behaves like a "weak" semantic web? Is the extra effort worth it when compared to a blend of simpler tools and manual processes?
Unfortunately, we don't have the data to answer this question. Nobody can say, because nobody has gotten even close to building a "weak" semantic web with much breadth... In 2006, Timmy himself said "This simple idea, however, remains largely unrealized." Some people have seen success with highly specialized information management problems that had strict vocabularies. However, I'd wager that they would have equivalent success with simpler tools like a controlled thesaurus, embedded metadata, a search engine, or pretty much any relational database in existence. That ain't rocket science, and each alternative is older than the web itself...
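To show how little machinery the "simpler tools" need, here is a minimal Python sketch of a controlled thesaurus feeding a keyword search. Everything in it is made up for illustration: the thesaurus entries, the sample documents, and the `normalize` and `search` functions are assumptions, not any real product's behavior.

```python
# Hypothetical controlled vocabulary: each variant maps to one preferred term.
THESAURUS = {
    "car": "automobile",
    "auto": "automobile",
    "automobile": "automobile",
    "invoice": "invoice",
    "bill": "invoice",
}


def normalize(text):
    """Lower-case, strip punctuation, and map words to preferred terms."""
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return {THESAURUS.get(w, w) for w in words if w}


def search(query, documents):
    """Return the ids of documents that contain every normalized query term."""
    wanted = normalize(query)
    return sorted(
        doc_id for doc_id, body in documents.items()
        if wanted <= normalize(body)
    )


docs = {
    "a": "Please pay the bill for the auto repair.",
    "b": "The automobile invoice is attached.",
    "c": "Lunch menu for Friday.",
}
hits = search("car invoice", docs)
```

Here a query for "car invoice" finds documents "a" and "b" even though neither contains the word "car". That only works because a human agreed on the vocabulary up front, which is exactly the "strict vocabularies" condition mentioned above.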
Now... to get the "weak semantic web" we'll need to scale up from one highly specialized problem to the entire internet... which yields a bewildering series of problems:
- Who gets to tag their web pages with metadata about what the page is "about"?
- What about SPAM? There's a damn good reason why search engines in the 90s began to ignore the "keywords" meta tag.
- Who will maintain the billions of data structures necessary to explain everything on the web?
- What about novices? Bad metadata and bad structures dilute the entire system, so each one of those billion formats will require years of negotiation between experts.
- Who gets to "kick out" bad metadata pages, to prevent pollution of the semantic web?
- What about vandals? I could get you de-ranked and de-listed if you fail to observe all ten billion rules.
- Who gets to absorb web pages to extract the knowledge?
- What about copyrights? Your "smart agent" could be a "derivative work," so some of the best content may remain hidden.
- Who gets to track behavior to validate the semantic model?
- What about privacy? If my clicks help you sell to others, I should be compensated.
- Will we require people to share analytical data so the semantic web can grow?
- What about incentives? Nobody using the web for commerce will share, unless there's a clear profit path.
I'm sorry... but you're fighting basic human nature if you expect all this to happen... my feeling is that for most "real world" problems, a "semantic web" is far from the most practical solution.
So, where does this leave us? We're not hopeless, we're just misguided. We need to come down a little, and be reasonable about what is and is not feasible. I'd prefer if people worked towards the much more reachable goal of context sensitivity. Just make systems that gather a little bit more information about a user's behavior, who they are, what they view, and how they organize it. This is just a blend of identity management, metadata management, context management, and web trend analysis. That ain't rocket science... And don't think for one second that you can replace humans with technology: instead, focus on making tools that allow humans to do their jobs better.
Of course, if the Semantic Web goes away, then I'll need to find something else to power hate. I'm open to suggestions...
In the early days of computer science, people discovered what was later to be called "Conway's Law":
Any organization that designs a system (defined more broadly here than just information systems) will inevitably produce a design whose structure is a copy of the organization's communication structure.
In other words, let's say you are designing a complex system -- an auto manufacturing plant, a new financial market, a hospital, the World Health Organization, or a large software solution -- the efficiency of the end result will always be limited by the efficiency of how the committee communicates. Let's say two segments of your system need to communicate with each other... however, the two designers of those systems were unable to communicate effectively with each other. The end result will invariably be a system where those two segments are unable to exchange important information properly. If I have to run an idea by my boss before handing it off to my peer in another department, then I'll almost always design a system that uses the same paths for sending important messages... whether or not it's the optimal approach.
This helps explain why large companies love Enterprise Service Buses, but small companies think they are the spawn of the devil... neither is correct; however, both opinions derive from the communication structure in their respective organizations.
This goes beyond the obvious communication problems between silos and corporate fiefdoms... even the physical components you design will inevitably mirror your ability (or inability) to communicate. From Wikipedia:
Consider a large system S that the government wants to build. The government hires company X to build system S. Say company X has three engineering groups, E1, E2, and E3 that participate in the project. Conway's law suggests that it is likely that the resultant system will consist of 3 major subsystems (S1, S2, S3), each built by one of the engineering groups. More importantly, the resultant interfaces between the subsystems (S1-S2, S1-S3, etc) will reflect the quality and nature of the real-world interpersonal communications between the respective engineering groups (E1-E2, E1-E3, etc).
Another example: Consider a two-person team of software engineers, A and B. Say A designs and codes a software class X. Later, the team discovers that class X needs some new features. If A adds the features, A is likely to simply expand X to include the new features. If B adds the new features, B may be afraid of breaking X, and so instead will create a new derived class X2 that inherits X's features, and puts the new features in X2. So, in this example, the final design is a reflection of who implemented the functionality.
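The two outcomes in that second example can be sketched in Python. The quoted passage names only X and X2; the method names (`render`, `export`) and the stand-in class `XAsExtendedByA` are placeholders invented for illustration.

```python
class X:
    """The original class, designed and coded by engineer A."""

    def render(self):
        return "rendered"


# Outcome 1: A adds the feature. A knows the code, so X itself simply grows.
# (Written as a separate class here only so both outcomes fit in one file.)
class XAsExtendedByA:
    def render(self):
        return "rendered"

    def export(self):
        return "exported:" + self.render()


# Outcome 2: B adds the feature. B is wary of breaking X, so X stays
# untouched and the new behavior lives in a derived class.
class X2(X):
    def export(self):
        return "exported:" + self.render()
```

Both versions behave the same from the caller's side. The only difference is the shape of the code, and that shape records who felt safe editing what.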
How do you avoid becoming a similar statistic? Simple: be flexible.
The more flexible you are when making the design, the more flexible you are to adopt new ideas and new ways of communicating, the more likely you are to create a useful product. For those who looooooooooove process, then what you need is a process for injecting flexibility into your system when metrics demonstrate a communication problem.
The number one task of any business is to make money. The number two task is to improve inter-departmental communication. After that, all problems can be solved.
I've always said, the most important skill a technical person can possess is the ability to communicate... you might not have a remarkable impact on any one feature, but you'll be better positioned to understand the whole problem, and the whole solution. Talk with your peers, and make sure that the lines of communications are 100% open across divisions... Especially divisions that hate each other. Make sure people feel connected, and that they can trust the opinions and needs of others.
Only then will a committee be able to design a system less dysfunctional than itself...
Yikes... Confusing, unclear, and cluttered since July of 2007... Not quite a ringing endorsement from the "crowd," eh?
The Wikipedia article for the Association for Information and Image Management isn't any better... at least Stellent's tiny tiny page is excusable since it doesn't exist as a company anymore. Considering the fact that folks like IBM, Oracle, EMC, and Microsoft all have product suites in this industry -- and considering how all of them tout blogs and wikis -- you'd think that somebody would have cleaned up Wikipedia by now.
I guess we all have better things to do...
Personally, I find this a refreshing reminder that the "semantic web" will NOT save you. Unless you do the hard work of creating new business processes around new information management technology, you'll just be cluttering your enterprise with ever more outdated, useless, and false data.
I had expected that it would take another 3 weeks to release this, but my second book is now available for purchase! As promised, this is more of a business strategy book, and less of a technical book... however, Andy and I did sneak in some good implementation details along the way. We designed this book so every member of your ECM team should get something useful out of it.
The purpose of the book is to present what we call a "pragmatic strategy for content management." For multiple reasons -- both political and technical -- it is rarely feasible for all of your content management products to be from one vendor. Perhaps you just merged with another company and you each have different vendors; perhaps you need blogs and wikis now and cannot wait for your ECM vendor to create a decent offering; perhaps SharePoint has grown like a fungus in your enterprise, and now you need some way to manage the insanity.
Some say the solution is rationalization: consolidate all content into one system... but that's not the whole story. You don't want to wind up like those poor saps running Lotus Notes, do you? Your users will rebel if you take away their nice collaboration tools, or if you tell them they can't have new ones. Entire departments will collapse if you eliminate content silos without any concern for users' productivity.
Instead, the pragmatic approach is to do the following:
- Consolidate content when possible into a "strategic ECM infrastructure." This can -- if desired -- be the single repository that satisfies all your content management needs; however, this is not a requirement.
- Federate content services to tactical and legacy applications. This means managing content in other repositories with a combination of enterprise search, universal records management, and enterprise mashups.
- Secure content wherever it lives. Ironically, in most cases your data is only secure when it is not in use! Once you move it from one system to another, it is at risk. Your information should always be secure, whether it is locked down in a database, or in a USB thumb drive at the bottom of your sock drawer.
The book is 250 pages long... but you don't have to read the whole thing. The chapter breakdown is as follows:
- The State of Information Management: a good grounding in what exactly ECM is all about, and why it is important.
- A Pragmatic ECM Architecture: the steps you need to take in order to realize the value of an ECM initiative.
- Assessing Your Environment: make a big list of what needs to be done, and by whom. Which content should be consolidated, and which is best left where it is?
- Strategic ECM Infrastructure and Middleware: this is the "strategic" part of the puzzle. Consolidate to this system whenever cost-effective, and extend it to your portals and enterprise applications with SOAs, ESBs, or ECM standards (WebDAV, CMIS, etc.).
- Managing Legacy and Non-Strategic Content Stores: all the tools for "tactical" integrations with systems that are not (yet) cost effective to consolidate. Your content management strategy should never punish you for failing to consolidate: the goal is to make content manageable.
- Secure Information Wherever It Lives: tools for making sure content is secure, even when it leaves a secure repository.
- Bringing Structured and Unstructured Strategies Together: your ECM initiative should be a part of a broader information management initiative. This chapter presents tools that help you bridge this gap.
- ECM and Enterprise 2.0: here we present a (better) definition of Enterprise 2.0, and how ECM fits into the ecosystem. It presents a strategy for Pragmatic Enterprise 2.0, and explains how many Enterprise 2.0 initiatives could fail without a comprehensive strategy.
Chapters 1, 2, and 8 are relevant no matter which vendor you use for Enterprise Content Management. We do mention Oracle numerous times, but you can just BLEEEEEEP over that if you use tools from different vendors.
Chapters 3 through 7 show how to implement a "pragmatic ECM strategy" using Oracle tools. Some of this data may or may not be relevant to non-Oracle customers. In most cases, you should find it helpful to see what is possible, so you can determine the distance between where you are now, and where you want to be tomorrow.
I worked pretty hard on this, and I'm relatively pleased with the results... but I'm sure the haters out there will find something to complain about ;-)
I usually like to give verbose book reviews... but I realized that I've fallen more than a little behind lately. Writing my second book sucked up a lot of my free time, but I was still able to squeeze in about one non-fiction book per month... not to mention the hundreds of hours of podcasts on ancient history (my current obsession).
I decided to avoid books on programming and technology this year, and focus mainly on business and communication. I think it was a good idea: partly because I get the best software news from blogs, partly because of the utter lack of software innovation in 2007 and 2008, but mostly because I felt the need to read more about economics and management. If more software geeks did the same, I think the world would run a lot more smoothly...
Anyway... below are the books I read in 2008 that I felt worthy of a review on Amazon and my blog. I hope you find them useful:
Alexander Hamilton by Ron Chernow -- I read this because I had a long standing white-hot hatred of Hamilton, so much so that it deeply amused my friends. Sam White suggested I read this book to get a different perspective. After a few years, I finally did, and it really did turn me around a bit. I now have a lot more respect for Hamilton, and can see through the obvious propaganda that was set against him... Hamilton is still a political fool, but he's a military and financial genius, and the USA would be much worse off without him. And Aaron Burr was a tool.
Getting To Yes by Fisher, Ury, and Patton -- Highly recommended! This is a great book about negotiation, both in theory and practice. It demonstrates how there are three general kinds of negotiators: soft, hard, and principled. It's the last category that will always be able to find a solution that satisfies both parties, without either party feeling like they gave in. I've used this concept multiple times recently -- sometimes with more success than others. It will always remain a useful tool to help me find the win-win situation in every conflict.
The Influencer by Patterson, Grenny, Maxfield, McMillan, and Switzler -- the follow-up to the book "Crucial Conversations," this book gives some pretty practical advice on how to set up systems that promote positive change. This is a combination of individuals, social groups, and the environment itself... all 3 areas need systems that encourage both the ability and the motivation for positive change; otherwise it will not last. In each area, there are multiple tools that can help, but a true "Influencer" will know what tools to use and when. Highly recommended for anybody who wants to make lasting change.
Speak Peace in a World of Conflict by Marshall Rosenberg -- This is a good grounding in the principles of Non-Violent Communication. It shows some basic techniques for how to communicate in a language of needs, rather than in a language of good/evil/right/wrong. It has more real-world examples for folks, which makes it more accessible to skeptics, and first-timers. If you like it, I would also recommend Non-Violent Communication, Getting To Yes, and Crucial Conversations.
Three Cups of Tea by Mortenson and Relin -- This was a fun read... it's a real-world story about a man who failed to climb K2, and wound up lost in a remote area of Pakistan. The people there were so kind to him, he promised to return to build a school. After multiple setbacks -- and some hard lessons about life in this region -- Mortenson now runs the Central Asia Institute, and has built nearly 80 schools in the region. He gives an interesting perspective into the instability of the region, including the Taliban and the real causes of 9/11.
The Turnaround Kid by Steve Miller -- A fairly timely book for anybody curious about the US automotive industry. I've always been fascinated by turnaround CEOs: people who relish taking a failing company, and making it profitable again. Steve Miller was one of my heroes there, because he engineered the turnaround of about a dozen companies... most recently Delphi. In case you didn't know, Miller was the real brains behind Lee Iacocca's turnaround of Chrysler in the 1980s. He has quite a few words of advice for US manufacturers, which you might want to heed before you need his help!
The Warren Buffett Way by Robert Hagstrom -- Forget it. You will never be Warren Buffett. Accept it. Don't invest your money in the stock market: invest in your business, or yourself. Even if the stock market is your business, you're probably not going to pick stocks better than a computer. Nevertheless, if you want to know how Warren Buffett made his billions, this is a good primer. The book also constantly reminds you to not get carried away: put your money in an S&P index fund, and get back to work. Stock speculation is only profitable for insiders with nearly illegal insider information, or people who work amazingly hard at it every day (like Buffett).
Founding Brothers by Joseph Ellis -- I liked this book... it's a short book, geared for both US history buffs, and the general public. It was a good overview of six important moments in US history: the Hamilton/Burr duel, the Hamilton/Jefferson/Madison dinner about debt assumption and the creation of Washington DC, the early arguments about the slave trade, Washington's retirement after a mere 2 terms, the early Adams and Jefferson presidencies, and the later friendship between Adams and Jefferson. I'm not positive it deserved the Pulitzer Prize, but it was certainly one of the better history books I've read.
E-Myth Mastery by Michael Gerber -- the latest in the E-Myth series. This book helps entrepreneurs create systems that allow their company to run, so that they can free up their time to build and grow the company. As a computer geek who has observed highly ineffectual business process, I was skeptical that this book could teach me anything. I was pleasantly surprised... it's a bit big, and I wouldn't recommend it unless you are actually running a business -- or a part of a business -- but it certainly opened my eyes to the value of a culture of entrepreneurialism. It does suffer from a fairly tedious writing style, and perhaps others in the E-Myth series would be a better fit -- such as the E-Myth Revisited -- but it opened my eyes a bit so I'll give it a solid 3 stars.
The Undercover Economist by Tim Harford -- decent coverage of scarcity theory, and slight coverage of comparative advantage, but not much groundbreaking information here. It's not as good as Freakonomics, which I also disliked. I'm still looking for a book on classical economic theory that I can tolerate... any suggestions are highly welcome!
Blink by Malcolm Gladwell -- this follow-up to The Tipping Point was a bit of a disappointment. There was a lot of good data in it, but I felt that his entire thesis was flawed. It's all about "thinking without thinking," by trusting your "gut." Yeah, that always works out... there was some good data about folks who could read emotions by observing facial muscles, and how the mind operates when under stress, but otherwise it wasn't very thoughtful. Worth reading if you take it with a grain of salt.
Well... this is pretty negative...
CMS Watch came out with their 12 predictions for 2009, and number seven was "Oracle will fall behind in the battle for knowledge workers." Here's the relevant quote:
At one level, Oracle had a banner year in 2008: completing or consolidating numerous large acquisitions that bring in heavy streams of ever-beloved maintenance revenues. But 2009 will expose Oracle's weakness with front-office applications at a time when Microsoft, IBM, and many smaller players are fighting for the hearts and minds of knowledge workers.
Customers are already feeling indigestion, as different Oracle teams market overlapping and often incomplete solutions. For example, Oracle is struggling to combine four different enterprise portal offerings, and many customers are chafing at the financial and architectural challenges of aligning with the putative winner, Oracle WebCenter Suite (OWS). Similarly, collaboration and social software services remain divided between OWS and the new Beehive offering -- a bad situation made worse by the fact that both are really development platforms and not finished toolsets. Meanwhile, longtime Stellent UCM customers complain that Oracle is moving away from the product's Web CMS roots to emphasize heavy-duty document and records management.
First, the acquisition of BEA did really shake up Oracle's whole knowledge management / collaboration / Enterprise 2.0 strategy... and yes, there is considerable overlap in the product offerings. However, ultimately this will be a good thing, because only the best of the best will become strategic products under the "WebCenter" brand. This will take time to digest... it may or may not be "all better" by the 11g release in 2009, but I remain optimistic based on the previews I've seen... so the architecture will likely become much more simplified.
Although, I do have to agree that a lot of Oracle's offerings here are platforms, instead of complete applications -- Stellent/ECM being one exception. The WebCenter platform will never be huge, unless it has pre-packaged "Killer Apps" built on it. This is a general fact about all platforms, and is very much true here as well. There are several in the works -- collectively called "Fusion Applications" -- but I have no clue when they will be released.
Second, regarding the financial challenges, I guess I don't know what he means here... the current WebCenter bundle is a bit pricey, mainly because it's a bundle of so many different tools. Remember, WebCenter is a brand, and not just a single piece of technology. Oracle will probably figure out smaller, cheaper bundles that sell better, so I don't see this as that much of a long term problem. Maybe some folks are upset about the price of migration from older platforms to WebCenter... but nobody is forcing them to upgrade. They'll have to do a technology refresh at some point, and Oracle will continue to support and make new releases of their non-strategic product lines... so I guess I'll need to hear more before I can respond.
Third, regarding existing Stellent UCM customers, Oracle is actually moving in both WCM and document/records management at the same time. The heavy-duty document and records management offerings are badly needed by many of their existing enterprise customers, so there's a lot of sales opportunity in productizing a few enterprise-level integrations. At the same time, they have spent a lot of time and energy on the next version of Site Studio (Web Content Management), including their Open WCM initiative... This will be big in 2009.
The Stellent faithful have been hearing this line for a long time, but their patience will be rewarded as soon as January.
For those who watched the December 10 customer call, you'd know that you will be able to play with this next-generation of Site Studio relatively soon. A lot of it will be released as Site Studio 10gr4 at the beginning of 2009. The rest will be released in 11g, which is slated for some time in 2009. Alan Baer will be doing a Deep Dive into Oracle Site Studio 10gr4 in January, if you want to know more.
And finally, we should note that of the dozen 2008 predictions by CMS Watch, they claim seven came true, three did not, and two are in the "maybe" pile... so take this prediction with a grain of salt. Oracle has several decent ECM products due out in 2009... so this warning could be both a wake-up call, and a self-defeating prophecy.
There have been millions of technological innovations since cave men first invented the wheel... many of them -- such as the printing press, the sewing machine, and the robot -- have put people out of a job. However, it is completely illogical to state that technology eliminates jobs. If that were true, then 10,000 years of innovation would mean no jobs left on the planet... The relationship between technology and jobs is much more complex than that.
Put simply, innovations may be disruptive, but they can never replace a human who actually gives a damn. This may be difficult to believe -- especially if you recently lost your job because a robot/computer could do it faster... but innovations don't fire people; managers fire people... and both labor and management use technology as a scapegoat.
Here's my theory on how this all works:
- For better or worse, the majority of people are motivated by economic means. Not entirely, mind you, but significantly... and everybody would prefer to have more money if possible.
- The primary thing that keeps an economic system growing and creating new wealth is increased worker productivity.
- Technological innovations make workers more efficient.
- This means a short-sighted employer can purchase new technology, lay off workers, and maintain existing production levels... however, this trick is easy for the competition to replicate, so it's a terrible long-term solution.
- Alternatively, workers could learn how to work with new technology, and become phenomenally more productive than technology alone. This is difficult for the competition to replicate, because it relies on a culture of training, sharing knowledge, and institutional learning... so it's a great long-term solution.
- Therefore, employers who use new innovations plus retrained labor will always be more competitive, and the first to find and cultivate new markets.
- When this happens, overall worker productivity increases, and more wealth is created for everybody: investors, innovators, managers, and workers.
Scribes lost their jobs when the printing press was invented... but cheap books created huge demand for new kinds of books, and the printing industry boomed. Tailors lost their jobs when the sewing machine was invented... but cheap clothes created huge demand for new fashions, and the clothing industry boomed. Naturally, this doesn't always work for low-skilled workers, and all this amoral capitalism is painful for people who lose their jobs... so a smart government would provide its citizens with temporary unemployment pay, education, and jobs programs to help them through the disruptive phase. But, that's a blog post for a different web site ;-)
This same rule applies to knowledge workers... don't think of them being "replaced" with software, think of them being "empowered" by software.
I am personally highly skeptical about "Enterprise 2.0" software that claims to help people effortlessly find content, seamlessly connect with people, and make effective business decisions as a "crowd". That's not to say these tools have no value... but they are no replacement for people who know what they are doing, and have a desire to get better at it.
Neither Wikipedia nor Google can replace people who intuitively understand a subject, and can weed out "false" information from the mountain of badly written presentations, reports, and blogs... Neither LinkedIn nor Facebook can replace the people who genuinely love connecting with thousands of friends, staying in touch, and helping people out... And nothing, nothing can replace a manager with leadership and consensus building skills. All these people have a genuine talent for discovering useful information, connecting people to each other, and managing a group.
If you have talented employees, you can never replace them. If you don't have them, then software is a stop-gap solution; not a substitute. Technology can only raise the bar a little... ordinary folks will use technology to become slightly better than average at a task... but those with talent can use the exact same technology, and leave everybody else in the dust.
"A wealth of information creates a poverty of attention" -- Herbert Simon
For the longest time, many people believed that one of the biggest information management problems was simply getting access to data. Previously, data was hidden away in a "silo," making it difficult to obtain, because you had to deal with an "information broker." You know the stereotype: somebody who rations their hoard of information for job security purposes, and refuses to share unless forced to... Not always, but many times, this "broker" acted more like a "bottleneck."
20 years ago, this led people to believe that if we could only bypass these brokers, and access the information directly, then all our problems would be solved! If only we could get the raw, unfiltered data, we would be much more efficient!
It turns out, not so much...
If you want a successful ECM/Knowledge Management/Enterprise 2.0 initiative, one of the biggest mistakes you could make is to focus too much on sharing information.
Don't get me wrong: there are many situations where sharing information is vital... however, those situations are pretty obvious. You won't need to look hard to find them, and solving information access problems is fairly straightforward. Sometimes it's technical, in which case basic content management tools can help out. Sometimes the problem is political, in which case the optimal information management strategy requires you to first understand the cultural reasons why your employees refuse to share information... after which you can either choose a software solution, or just force everybody to go to anger management training.
But... assuming that you break down the cultural bottlenecks to innovation, now you have another problem: "infoglut." In other words, information overload. Emails you don't read, presentations that don't make sense, reports that don't flow, the proliferation of websites, blogs, wikis, and social software across the enterprise... people talking a lot, but communicating very little.
In many ways, the best solution to infoglut is to bring back the information broker. Now that information is completely free, you need some kind of filter to let you know what information is relevant. This does not mean that you should bring back the information bottlenecks... instead, you need tools that make the information broker more effective. This may include:
- Smarter search engines: the current ones try to determine relevance just based on keywords, which doesn't work so well. Google's search engine does a great job on the heavily hyperlinked web, but it's a terrible tool for the whole Enterprise. Smarter search engines need to use identity management and analytics to know what content is currently popular with people similar to the current user. They also need a human to maintain a controlled thesaurus, in order to get a vague idea about what content is similar.
- Polite relevance filters for email: how many of you would love to have a customer support queue for your email inbox? "Thank you for your email! I currently have 1000 unread items in my inbox, all flagged 'important.' The average wait time for a response is 97 hours." Of course, you might also need one for your phone...
- Recommendation sites: these allow your greater audience to "vote" on what content is relevant. Examples include Digg and Reddit, which are both good, but only for very broad topics. I know what's hot in "Technology," but not what's hot in "Embedded Linux."
- Better tools for human information brokers: I firmly believe the broker is the solution, not the problem. Instead of replacing them with technology, work with them to design systems that are more effective at brokering information. If they see their influence increase the more they share, they will do everything they can to get the right information to the right people at the right time. There isn't much commercial software along these lines at the moment... but that doesn't stop sites like AllTop from doing it anyway.
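To make the "smarter search engine" idea a bit more concrete, here is a minimal sketch in Python. It is purely illustrative: the function names, the data layout, and the 0.5 popularity weight are all my own assumptions, not anything taken from a real search product. It expands the query with a human-maintained thesaurus, then boosts documents that are popular with the current user's peer group.

```python
def expand_terms(terms, thesaurus):
    """Add human-curated synonyms (the controlled thesaurus) to the query terms."""
    expanded = set(terms)
    for term in terms:
        expanded |= thesaurus.get(term, set())
    return expanded


def score(doc, terms, peer_views):
    """Keyword overlap, boosted by how often the user's peers viewed the document."""
    words = set(doc["text"].lower().split())
    keyword_hits = len(words & terms)
    if keyword_hits == 0:
        return 0.0  # popularity alone never makes an unrelated document match
    # The 0.5 weight is an arbitrary assumption; a real system would tune it.
    return keyword_hits + 0.5 * peer_views.get(doc["id"], 0)


def search(query, docs, thesaurus, views_by_group, user_group):
    """Return matching document ids, best first, for a user in the given group."""
    terms = expand_terms(query.lower().split(), thesaurus)
    peer_views = views_by_group.get(user_group, {})
    scored = [(score(doc, terms, peer_views), doc["id"]) for doc in docs]
    matches = [(s, doc_id) for s, doc_id in scored if s > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in matches]


# Hypothetical sample data: three documents, one thesaurus entry, view counts per group.
docs = [
    {"id": "a", "text": "Embedded Linux kernel tuning guide"},
    {"id": "b", "text": "Linux desktop tips"},
    {"id": "c", "text": "Quarterly sales report"},
]
thesaurus = {"linux": {"kernel"}}
views_by_group = {"firmware": {"a": 4}, "desktop": {"b": 6}}

firmware_results = search("linux", docs, thesaurus, views_by_group, "firmware")
desktop_results = search("linux", docs, thesaurus, views_by_group, "desktop")
```

The same query returns a different ordering depending on who is asking, which is the whole point: the thesaurus is where the human broker's judgment lives, and the view counts are where the "people similar to you" signal comes in.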
In short, don't blindly bash content silos and information brokers. Sure, they kept data hidden from you... but it was data you probably didn't want to see in the first place. Replacing a human broker with a Digg-clone may work for a while... but that's easy. And since it's easy, soon everybody will be doing it, and it will cease to be a competitive advantage. As I've said many times, no technology can ever replace a human being who actually gives a damn.
Technology should empower your brokers, not replace them. Otherwise, you'll soon be swimming in useless information... just like everybody else.
UPDATE: There's a bit of confusion about what I mean by "broker." I do not mean to imply that information should go back to being "locked away." That is both illogical and impossible. Rather, I'm stating that the role of the "broker" has changed into being more of a "filter," and is arguably more important than ever. As Clay Shirky says, "It’s Not Information Overload. It’s Filter Failure." Sometimes a broker is one person, sometimes it's several people, sometimes it's a software algorithm. None is superior to the others: they all have merits. Therefore, it's important for the filter to be a combination: better search algorithms based on relevance, Digg clones for the enterprise, and better tools to help individuals who choose to become brokers. This means multiple silos will be built upon the same raw information, based solely on which filter you pick. Like it or not, it's already happening... and people will use whichever one makes them more productive.
When there is a lack of unified purpose, information sharing leads to chaos... and sometimes can cause more problems than it solves. To illustrate this point, I'd like to share the legend of King Ammon.
In a dialog between himself and Phaedrus, Socrates told the tale of King Ammon. He was a wise and just ruler, and all the gods admired him and his virtues.
One day, Ammon was met by the Egyptian god Thoth, who was an inventor, and the "scribe of the gods." Thoth admired Ammon, and wanted to share his inventions with Ammon and all his Egyptian subjects. Ammon was impressed with most of the inventions... except for one: writing.
Ammon was not a fan of writing... and chided Thoth for creating it:
What you have discovered is a receipt for recollection, not memory. And as for wisdom, your pupils will have the reputation for it but not the reality: they will receive a quantity of information without proper instruction, and in consequence be thought very knowledgeable when they are for the most part quite ignorant. And because they are filled with the conceit of wisdom instead of real wisdom, they will be a burden to society.
Hmmm... so Ammon feared what would happen if somebody read something, didn't understand it, quoted it anyway to appear wise, but in actuality had no real wisdom... and in doing so became more powerful, perhaps even respected, so that people even followed him... but because he only appeared to be wise, he made bad decisions, and ultimately became a significant burden to his fellow men.
Gee... sound like anybody you know?
Naturally, we only have this great story because of the written word... so nobody would go so far as to claim that writing is bad. However this legend does bring up a valuable point for knowledge management systems:
We should NOT focus on sharing information; we should focus on teaching knowledge.
You shouldn't just dump data to a blog and expect people to read it... you shouldn't dump half-baked documentation into a wiki and expect others to maintain it... you shouldn't just deploy an enterprise search or ECM system, then allow it to become a dumping ground for "data."
What we need are systems that teach; not systems that share. Because without that context, without teaching, and without experience, sharing information could very likely lead to problems...
...and it might actually make you a burden to your fellow men.
As I mentioned previously, Oracle is finally supporting the UCM Blogs and Wikis that Stellent released almost 3 years ago...
This led many people to ask, where can I get them?!?!? I recently got word that these will not be released as a separate download from OTN, but instead will be released as a patch on Metalink. The officially released name is Content Server Blogs and Wikis Components (2008_09_18), which you download from Metalink as patch number 7504090.
Please note, in order to make these work, you'll need the latest Site Studio patches as well. Specifically, the Site Studio 10.1.3.3.4 - October 2008 roll-up patch (Build 220.127.116.113), which is patch number 7007799 on Metalink.
Naturally, you will need a Metalink account to download these... which means you need to be an Oracle customer... or a partner who shells out extra for access even though you should get it for free just by being an ACE Director :-\
But I digress... Enjoy the Web 2.0 goodness!