Why Google Will Never Be Good At Enterprise Search

Jake recently had a good post over at Apps Labs about the importance of "Social Search". He has promised a part 2 today... so I encourage you to check it out.

The question is, how do we make enterprise search better? Some people complain that enterprise search should behave more like Google search, which I vehemently disagree with, for one primary reason: enterprise search is a FUNDAMENTALLY different problem than internet search. Here are some examples:

The internet search problem is like this:

  • Heavily linked pages, which can be analyzed for "relevance" and "importance"
  • Spam is a constant problem
  • People don't want you to monitor their behavior
  • People obsess about their Google PageRank
  • People obsess about their hit count
  • People aren't looking for the answer, they are looking for an answer

The whole problem reminds me of a scene from the film Zero Effect:

Now, a few words on looking for things. When you look for something specific, your chances of finding it are very bad... because of all things in the world, you only want one of them. When you look for anything at all, your chances of finding it are very good... because of all the things in the world, you're sure to find some of them.

Internet search is like looking for anything at all... whereas enterprise search is like looking for something specific:

  • People don't want general information; they want the 100% definitive answer
  • The trust level is usually higher between co-workers than between random web surfers... or at least it should be. Otherwise, you've got bigger problems than information management.
  • You know exactly who is running the search
  • You know exactly what department they are in, and what content they are likely to need
  • You know exactly their previous search history, possibly even their favorite "tags"
  • Spam is minimal, or non-existent
  • Content uses few, if any, hyperlinks to help determine relevance
  • People usually write content because of obligation, and do not usually care about making it easy for their audience to understand

Trying to solve both problems with the same exact tool will only lead to frustration...

Now... Solving this problem with social tools is a much easier, and arguably better approach. People usually don't want to know the answer, people usually want to know who knows the answer. This is an observation as old as Mooers' Law (1959) about information management:

“An information retrieval system will tend not to be used whenever it is more painful and troublesome for a customer to have information than for him not to have it.”

Fifty years later, and folks still don't quite seem to get it... The average user does not want to read enterprise content! They don't read documentation on the subject, nor do they read books on the subject, nor do they read blogs on the subject... In general, people don't care to actually learn anything new; they just want the quick answer that lets them move on and get back to their normal job. Most people look for information so they can perform some kind of task, and then they'll be more than happy to forget that information afterward. It's a rare individual who learns for the sake of knowledge... These folks are sometimes called Mavens, and everybody wants to be connected with these Mavens so they can do their jobs better. As a result, these Mavens will always be overwhelmed with phone calls, emails, and meeting invites.

As those mediums became flooded, some of your resources fled to other places -- like Twitter, or Facebook, or enterprise social software -- and forced would-be connectors to follow. This constant movement (or hiding) helps a bit... but it's only a matter of time before those mediums get flooded as well, and the noise overwhelms the signal.

In order to truly solve the enterprise search problem, you need to first understand why people may choose to never use enterprise search, no matter how good it is... then try to bring them back into the fold with socially enabled enterprise search tools. Don't just help people find information; help them find somebody who understands what the information means. Connecting people with mere words can easily backfire, and might actually make these people a burden on society. Instead, connect them with real, live humans who are eager to teach the knowledge being sought. At the same time, you need to work hard to protect these Mavens, so they don't flee your system in favor of another.

This is a problem that Google's search engine cannot solve -- mainly for privacy and trust reasons -- but it is 100% solvable in the enterprise. I'm just wondering why so few have done it...

When people say they want

When people say they want enterprise search to "work like Google" what they mean is that they want a simple box to type words into, and they want to get the right answer on the first page, and they want it to be fast.

It's that simple.

the hard part is getting "right answers on the first page"

The simple box is easy... making it fast is easy... but making it relevant is hard.

Even Google's search appliance has had limited success. Enterprise content is just not created with the same care that web content is... Which is another reason why enterprise RSS has not really gotten off the ground.

my comment has no subject :)

It's amazing how many vendors, Google included, have forged into enterprise search with the intention of being Google for the intranet. Algorithms and relevancy are clearly not the same inside the firewall, which makes you wonder why so many have gone that route.

Anyway, I think adding a social dimension to existing search engines would go a long way toward improving the overall results without a huge investment in relevancy and weighting.

I guess we'll see.

one step further

"In general, people don't really want the information they are looking for; they just want to do their jobs. "

I will take this one step further. Lately, it seems like people just want a paycheck, they don't really WANT to do their jobs, it's a price they pay for the paycheck. More and more people just want a handout and the gov't seems to be willing to give it to them. Sorry, do I sound bitter?

Seriously though, the apathy and selfishness seem to be gaining ground, which is further emphasized by your statement
"People usually don't want to know the answer, people usually want to know who knows the answer."

I just need to find someone to do or answer _______________ for me.

Re: one step further

sounds like somebody is a bitter Maven ;-)

By the way, I altered the prose in my quote to make it flow a little better...

a topic close to my heart

And to my job, since I'm the Chief Scientist of Endeca, I've blogged about the subject a fair amount. Here are a couple of posts that hit the topic head-on:

Why Enterprise Search Will Never Be Google-y

Can Search be a Utility?

Searching gmail has been an

Searching gmail has been an interesting experience. It's definitely fast, but it doesn't offer much in terms of sorting other than date. Filtering by sender, subject, etc. helps narrow things but it still has a way to go to figure out which email about jucy lucy's was the most important one it could return.

Strangely, they don't seem to look at signals like conversation length, stars, etc. that they have access to.

Re: Searching gmail has been an

ummm... so how many emails about "jucy lucy's" do you get in the average month??? ;-)

Maybe you should just search for "Matt's," since they were obviously the inventors of it...

Enterprise Search has the future

''Some people complain that enterprise search should behave more like Google search, which I vehemently disagree with''.

With this point I totally agree. With Google search you are looking for something, and more answers can help you. But with enterprise search you are looking for a specific answer!

Thank you for this post.

Is that one question or two?

Hi Bex,

I'm not sure that your "enterprise search problem" isn't actually the problem that internet users wish they could solve ... but accept that given the constraints of the internet, it is practically insoluble. And to date, Google has the best solution on offer.

Within the enterprise, I think the problems go to a deeper social dynamic. An issue of matching the effort for contribution, the expectations of payoff, and the actual benefit achieved. It is a simple socio-economic problem, regardless of the technology used.

Within the enterprise, our "audience" is massively constrained by definition. I could (and certainly do) update wiki pages if I think it will benefit myself in the short-term. I might do the updates if I thought my co-workers would benefit (but only if they have a culture of acknowledging the contributions of others). And I probably wouldn't, if I thought the only audience was my boss/the organisation (since they are not in the habit of valuing such efforts when it comes to critical issues like promotion or salary review).

So, Google or not, I think we are back to the fundamental challenges of knowledge management in the enterprise that have been going around and around ever since "KM" became a certified acronym.

"...but it is 100% solvable in the enterprise. I'm just wondering why so few have done it..." .. isn't that an argument against the very premise of the statement? I don't believe for a minute that this problem is an easy fix and 100% solvable; while technology and the rising tide of the internet/google culture helps, it is not the solution either!

Re: Is that one question or two?

"isn't that an argument against the very premise of the statement? I don't believe for a minute that this problem is an easy fix and 100% solvable; while technology and the rising tide of the internet/google culture helps, it is not the solution either!"

That's not what I meant... the goal is to help people find the information they need via an "enterprise search" tool that makes people think "this is as useful as Google." That just is not possible without social search as well.

The problem with enterprise search tools is that they focused too heavily on the information. Even today, the solutions to the search problems are usually information based: extract metadata, use microformats, embed RDF and give birth to the semantic web, etc. etc. etc. These tools all have the same limitation: they depend on the quality of the existing content, and fall apart when confronted with lazy users.

As I said in my article -- and you reiterated with your comments about wikis -- employees will only create content when financially incented to do so. There are numerous exceptions, but it's a good general rule. Even so, clear & concise writing isn't exactly a common skill... even more rare is the ability to empathize with your audience and communicate in a manner they would understand. So even if your workers were generously incented, you're still hosed.

Therefore, the best bet is a blended approach... when a user searches, the tool should query Google, but also your local enterprise, and any social media you have lying around, to give a comprehensive result. I've seen enterprise search startups that do exactly this... and yes, the underlying technology is pretty darn simple. However, you need to bite the bullet and do the hard work of generating and maintaining social maps in order for the data to remain relevant. You also need to instill policies and procedures to "protect your mavens."
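To make the blended approach a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `blended_search` function, the three stand-in back ends, and the example URLs are all hypothetical, and round-robin interleaving is just one simple way to merge the lists, not what any particular product does.

```python
from itertools import zip_longest


def blended_search(query, sources):
    """Query every source and interleave the results, dropping duplicate URLs.

    `sources` maps a label (e.g. "web", "enterprise", "social") to a
    hypothetical search function that returns an ordered list of URLs.
    """
    ranked_lists = [
        [(label, url) for url in search(query)]
        for label, search in sources.items()
    ]
    seen, blended = set(), []
    # Round-robin: take the top hit from each source, then the second, etc.
    for tier in zip_longest(*ranked_lists):
        for hit in tier:
            if hit is None:  # this source has run out of results
                continue
            label, url = hit
            if url not in seen:
                seen.add(url)
                blended.append((label, url))
    return blended


# Stand-in back ends with canned results, purely for illustration.
sources = {
    "web": lambda q: [
        "http://example.com/java-tutorial",
        "http://example.com/java-faq",
    ],
    "enterprise": lambda q: ["http://intranet.example/java-standards"],
    "social": lambda q: [
        "http://intranet.example/people/alice",
        "http://example.com/java-faq",
    ],
}
results = blended_search("java", sources)
```

The merging really is the simple part. The stand-in "social" back end is where the hard work of building and maintaining social maps would have to live.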

Google can actually be good for enterprise

Bex,

I believe that your list of enterprise search problems is a bit out of date, or at least it applies to an "old fashioned enterprise". I do agree that you typically have full information about the user, his search profile, etc. At the same time I do not think that "trusting co-workers" is actually relevant. More often than not you are looking for information that has been authored by people working in another division, branch, etc., as you are normally rather well aware of what your close co-workers are doing. So I would be talking about "trusting the experts" where their level of expertise is ideally established using some sort of automated ranking algorithm that takes into account documents, articles, etc. that these people authored, commented about, etc. - something similar to the Google algorithm.

"Content has few hyperlinks" typically applies to a company that perhaps deployed a CMS, but didn't really master their Intranet web presence. Quite a few companies these days have Intranet portals that allow you not only to cross-reference content, including documents residing in a document repository or a file system, but also to track usage of this content. Taken together, that allows you to build the content rank, again using something similar to the Google algorithm.

"Writing because of obligation" is also only partially true - these days there are so many blogs on a typical Intranet...

I believe that the problem lies with the search engines that are often used on corporate Intranets and happened to be there as a result of CMS software deployment. I won't name the names:) but these search engines often rely on a simple text search, often even without gramm or dictionary support.

Re: Google can actually be good for enterprise

"Content has few hyperlinks" typically applies to a company that perhaps deployed a CMS, but didn't really master their Intranet web presence.

Incorrect... the vast majority of digital content available on the web was not created for the web... Hell, most web pages I visit don't appear to be designed for the web. Not to mention the volume of email archives, powerpoints, and reports that make limited use of hyperlinks. Converting this content to HTML is simple, but repurposing it for the web with tons of links isn't usually cost-effective. Relationships between these documents are usually stored outside of the documents. Enterprise search can use these relationships to build lists of similar content, but relevance is another problem entirely.

So I would be talking about "trusting the experts" where their level of expertise is ideally established using some sort of automated ranking algorithm that takes into account documents, articles, etc. that these people authored, commented about, etc. - something similar to the Google algorithm

That's what we mean by "social search." When I search for the word "Java", I would get documents back relevant to "Java", but I would also do a social search to determine who has been highly rated by their peers for "Java" knowledge. I would recommend against using their comments history to inflate social rank, mainly because those systems would be ripe for abuse. People would say "Java Java Java" if they wanted higher rankings... and they would avoid sharing information with the word "COBOL" to avoid being associated with topics they find "uncool"...
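As a rough illustration of the peer-rating idea, here is a small Python sketch. The `rank_experts` function and the `(endorser, expert, topic)` record format are hypothetical, made up for this example. The only point is that rank comes from how many distinct co-workers vouch for you, so repeating "Java Java Java" yourself buys nothing.

```python
from collections import defaultdict


def rank_experts(endorsements, topic):
    """Rank people by how many distinct peers endorsed them for a topic.

    `endorsements` is a list of (endorser, expert, topic) tuples, a
    hypothetical record of peer ratings. Self-endorsements are ignored, and
    each peer counts once no matter how often they repeat themselves.
    """
    supporters = defaultdict(set)
    for endorser, expert, endorsed_topic in endorsements:
        if endorsed_topic.lower() == topic.lower() and endorser != expert:
            supporters[expert].add(endorser)
    # Most distinct supporters first; ties broken alphabetically.
    return sorted(supporters, key=lambda person: (-len(supporters[person]), person))


endorsements = [
    ("alice", "bob", "Java"),
    ("carol", "bob", "Java"),
    ("dave", "dave", "Java"),    # self-promotion: ignored
    ("dave", "dave", "Java"),
    ("alice", "carol", "Java"),
    ("alice", "carol", "Java"),  # repeat from the same peer: counted once
    ("bob", "carol", "COBOL"),
]
experts = rank_experts(endorsements, "java")
```

A real system would still need human processes around it, since co-workers can trade endorsements just as easily as they can stuff keywords.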

I do not think that "trusting co-workers" is actually relevant.

I would strongly disagree... let's say I'm a sales guy, and I need a definitive answer about whether our product supports "buzzword XYZ." There is no official documentation of this feature, or this support... but it does exist. The developers made it and tested it, but it never got into the official documentation set. Or so says some random blog on your intranet...

This is a vital piece of information... it can mean the difference between a big bonus, or getting fired. So what will I do? I'm going to keep asking the same question over and over until I find somebody I trust, who gives me the information I need. Trust is critical when you are looking for the definitive answer, which is what enterprise search is all about.

search engines often rely on a simple text search, often even without gramm or dictionary support.

so very true... which makes me sad... but it's the same old problem where people try to just slap down software, and then expect it to solve everything...

A bit of follow up

Bex,

I think that you are missing the point here...

1) There are indeed tons of mail archives, documents in the file systems, etc. that were not designed for the web. But you do not need to convert them into HTML to turn them into web documents. You can easily crawl these archives, file systems, etc. into an Intranet portal. Then use links to these documents on the portal pages or your collaborative tool, etc. That will then allow you to collect and analyse click-throughs to these documents as well as cross-references.

2) You could indeed inflate social ranking if you just count the number of times that somebody said "Java, Java". But that is exactly what the ranking algorithms that Google (or other providers) use are designed to prevent. Relevance is given to cross-references and/or page use. In the case of a person, that would be references to materials that she has authored, commented on, etc.

3) You just proved yourself that "trusting co-workers" is not that relevant. You need to "trust an expert" who has proved her expertise, if you wish via social ranking. In your example, if you are a sales guy, trusting official company documentation would be the last and worst thing to do. You may end up losing the deal trying to prove that "your products are tightly integrated ...or something" when in real life they are not. You would be much better off trusting an expert who knows this product inside out.

Igor

Re: A bit of follow up

1) popularity analysis is simple, commonly used, and insufficient. For Google-quality relevance, you need to make it "social." Analyzing click-through rates for popularity is better than nothing... but where you get the biggest bang for the buck is by analyzing who the person is who ran the search, and what they clicked on. For example, when project managers search for "Java," the top 10 links would be much different than when HR folks search for "Java", and more different still than when developers search for "Java."

2) Really? And who will tune that algorithm to prevent cheaters? Google has to CONSTANTLY update their PageRank algorithm, and the only reason it works is because its inner workings are kept very secret... Who will pay for the army of tuners every enterprise will need to prevent cheaters? How will you educate and train this army? And won't this just create an army with the knowledge of how to rig the system? A better solution is likely a blend of technology and human processes. For example, a relatively simple algorithm that eliminates the obviously bad ideas... then a human "popularity contest" to sort through the rest.

3) How can you say "trusting co-workers is not that relevant," then two sentences later claim "you would be much better off trusting an expert." We're talking enterprise search here... the "expert" about your company's product had damn well better be a co-worker! If not, your company has HUGE problems. You absolutely need to trust your co-workers to appoint and help you locate this expert. If nobody in your company has expertise on the subject, then by all means hop on Google and look for an outside expert... but you would be wise to force an outside expert to demonstrate their expertise before trusting them.
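The role-based click analysis from point 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `rank_for_role` function, the `(role, query, document)` click log, and the document names are all hypothetical, and a real deployment would pull the role from whatever identity management system the enterprise already has.

```python
from collections import Counter


def rank_for_role(click_log, query, role):
    """Order documents by clicks from searchers who share the given role.

    `click_log` is a hypothetical list of (role, query, document) tuples
    recorded each time a signed-in user clicks a search result.
    """
    clicks = Counter(
        doc
        for logged_role, logged_query, doc in click_log
        if logged_role == role and logged_query.lower() == query.lower()
    )
    # Most-clicked first; ties broken alphabetically for stable output.
    return sorted(clicks, key=lambda doc: (-clicks[doc], doc))


click_log = [
    ("developer", "Java", "jvm-tuning-guide"),
    ("developer", "Java", "jvm-tuning-guide"),
    ("developer", "Java", "coding-standards"),
    ("hr", "Java", "java-developer-job-description"),
    ("project manager", "Java", "java-migration-schedule"),
    ("hr", "Java", "java-developer-job-description"),
    ("hr", "Java", "training-budget"),
]
developer_results = rank_for_role(click_log, "java", "developer")
hr_results = rank_for_role(click_log, "java", "hr")
```

The same query gives developers and HR two completely different top results, which is something plain popularity counting cannot do.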

It's the find vs re-find problem

Bex,

This is a great post. I have long ranted to anyone who will listen to me that searching for information on the internet is not the same problem as trying to find an email or document. I call this the "find vs re-find" problem. Searching Google is a find problem - you're looking for information and anything that delivers that information is (usually) good enough. There are possibly hundreds of sites on the web that could deliver that information to you. Google happens to (IMHO) deliver better success on highlighting top locations to find information.

However - Gmail and Google Desktop completely fail at helping re-find information. With the re-find problem you are not looking for just any information, you want specific information - and there may only be one result (file, email) that meets that criteria. Personally, the email re-find issue is a recurring problem at work. I get 200+ emails a day (which is its own problem), and generic searches that return 100+ results are NOT helpful. I do not want just any email about XYZ account. I want THE email that JOE sent me LAST WEEK about XYZ account where he talked about the WIDGETS. And that's just if it was written down in email - to your other point, usually I just have to go find Joe...

I love the quote about internet search - "People aren't looking for THE answer, they are looking for AN answer "

Most enterprises hire smart people to provide THE answer for the variety of problems that company solves on a regular basis. If ANY answer would do they would outsource that task, or even automate it. I don't think it's only a problem of mavens or gurus - it is finding THE answer from the right file, email, database, or person - wherever or whomever that may be.

Re: It's the find vs re-find problem

I think this is the disconnect about enterprise search. Enterprise search might focus too much on the re-find problem... so much so that the only way you can find something in the enterprise is if you already know it exists.

Internet search is geared more towards browsing and serendipity. Which is excellent on the greater web, but not always what you need in the intranet.

Some say the solution is to push the enterprise more towards internet search... but I think that because these problems are different, the enterprise needs some completely different tools to get things done here. And because of the environment -- strong identity management, monitoring of user behavior, and enterprise social software -- I think they would get better bang for their buck by focusing on social search.

A bit of a leap

I agree that there are two separate problems as exemplified in the following quote:

“When you look for something specific, your chances of finding it are very bad... because of all things in the world, you only want ONE of them.”

True, but this doesn’t negate relevancy. You still want the most relevant result at the top. A tool with great relevancy can excel in both cases.

Another piece of this that is hard to nail down is what people want and how they behave.

1. “People usually don't want to know the answer, people usually want to know who knows the answer.”

2. “In general, people don't care to actually learn anything new; they just want the quick answer that lets them move on and get back to their normal job.”

I see these two premises as being somewhat in opposition. This often happens when using vague axioms to build an argument. They apply differently in different situations. Do people want a quick answer or do they want to know who knows the answer? Let’s just admit that we may not have figured out exactly what people want just yet.

“In order to truly solve the enterprise search problem, you need to first understand why people may choose to never use enterprise search, no matter how good it is...”

If this is really true then it’s not only Google that will never be good at enterprise search.

I understand that there is a social component that could be useful, but I am not buying this: “Now... Solving this problem with social tools is a much easier, and arguably better approach.”

That's easy to say, but simply pointing out that there are two very different problems being tackled by Google technology, and that there is some vague general principle about people wanting to know which people know the information, is one thing. Taking the leap that non-social Enterprise Search itself is totally flawed and Google will never excel at Enterprise Search is a bit of a leap.

Re: A bit of a leap

"I see these two premises as being somewhat in opposition."

ummm... how??? I would agree that they are not always in agreement, but I've never seen them to be in opposition. It all boils down to lazy users, who just want a fast answer, without bothering with having to "learn" anything new.

"If this is really true then it’s not only Google that will never be good at enterprise search."

I fully agree... I don't mean to pick on Google's enterprise search platform per se... I'm simply trying to make people wake up and stop trying to squeeze the "content" for better search relevance in the enterprise. If your goal is to find information -- in a non-annoying way -- then you are better off with socially aware search that protects your Mavens.

"Taking the leap that non-social Enterprise Search itself is totally flawed and Google will never excel at Enterprise Search is a bit of a leap."

Now I gotta disagree. Part of my argument is that in order for Google to "be good," they would have to "be evil." In other words, monitor user behavior everywhere. The other part is based on experience. I've been programming for 25 years, and in the information management industry for a decade. I have brutally evaluated the claims of numerous products, much to the chagrin of would-be salesmen. I can say with certainty that enterprise search has not progressed in any practically useful sense in the past 6 years... with the singular exception of "socially aware" search. Many have challenged me on this claim; none have produced any data to support their position.

I'd be ecstatic if you could prove me wrong... because then I'd have a product to recommend to my clients.

