Monday, April 28, 2008

Elsevier's Y.S. Chi Points to the Value of Engineering Content "Experiences"

We had a good time at the Buying and Selling eContent conference in Scottsdale this year, our first time at this event. Certainly there were many familiar faces from the content industry there, including Y.S. Chi, Vice-Chair of Elsevier, who gave what I thought was probably the most insightful presentation of the conference. Y.S. highlighted how the content industry needs to focus more on developing valuable "experiences" than more content.

What did he mean by this? Well, certainly Y.S. had in mind many of the workflow-oriented tools that publishers are beginning to emphasize in trying to add value to long-established database services. And I would think that it might also include some of the advanced social media projects that Elsevier and other major scientific publishers are embarking on that enable scientists to collaborate on building valuable reference content and research.

But I think that Y.S. was also pointing towards one of the key factors that makes publishing so hard for many established companies these days: owning content is not as important oftentimes as getting content to do something useful for people. If content can be thought of as the raw materials of publishing, then getting content from point "A" to point "B" is no longer such a great business to be in now that the Web and corporate intranets make the A-B value proposition pretty low on the value totem pole.

Search engines that publishers put on top of their own content collections help to find those raw materials more easily, but the value of those searchable services is considered high only when they are able to locate all of the possible content that applies to a given problem or task. Leaving content out of the equation means that you have only part of what you need to build a valuable experience. That's kind of like building a fantastic roller coaster but leaving out a few hundred feet of track. Sometimes doing just part of the job very well is just not enough.

The traditional solution that publishers would use to address the "missing track" issue would be to license more content or to create it themselves. That worked pretty well when there were relatively few sources of licensed content and relatively few people willing to create it themselves. But most sizable enterprises have very sophisticated sources of internal content as well as a growing array of sources that are generated by their peers in other companies or universities that help them to meet their goals. Add in Web sites that are growing sources of fresh information about businesses and key trends and it's not so easy to fill in that missing track. It's as if the roller coaster that the customer wants keeps growing far faster than the publishers' ability to fill the gap.

In the experience economy that Y.S. references, it's all about anticipating the gaps and finding more innovative ways to fill them more quickly than someone else, so that the raw materials of content can be transformed into experiences as efficiently as possible. Well, if the value of experiences is so high, then why do publishers still place so much emphasis on getting content integrated into the back end of databases when its greatest value is found outside of a database? In other words, why not push the point of content integration as close as possible to the point at which content is experienced? This will enable more sources to be brought together from more places more easily and efficiently.

Well, not surprisingly, that's the key to what MuseGlobal does with content. The Muse Content Machine is a content integration platform that enables a publisher to assemble more searchable sources of content more quickly and more effectively than anyone else. Instead of trying to create one master database with one login and one search engine, The Muse Content Machine can enable a publisher to build applications that access multiple searchable sources from a single query. To your customers it will look like you've built a rich application from one commonly indexed database. But behind the scenes The Muse Content Machine federates content from thousands of different types of content sources and delivers the freshest and most relevant content from each source. Nasty details such as multiple source logins, different data formats and different types of content sources - search engines, databases, Web harvesting, feeds, video and audio, catalogs - are all ironed out very neatly and efficiently by The Muse Content Machine.
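To make the "different data formats" point concrete, here is a minimal Python sketch of what ironing out formats can look like in principle. It is an illustration only and not how The Muse Content Machine is implemented: the three adapter functions, the field names (headline, link, name, href) and the common title/url record shape are all assumptions invented for this example.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical common shape that every source is mapped into.
Record = Dict[str, str]


def from_json(payload: str) -> List[Record]:
    """Adapter for a source that returns a JSON list (assumed field names)."""
    return [{"title": item["headline"], "url": item["link"]}
            for item in json.loads(payload)]


def from_csv(payload: str) -> List[Record]:
    """Adapter for a source that returns CSV with a header row (assumed columns)."""
    reader = csv.DictReader(io.StringIO(payload))
    return [{"title": row["name"], "url": row["href"]} for row in reader]


def from_rss(payload: str) -> List[Record]:
    """Adapter for an RSS-style XML feed: one record per <item> element."""
    root = ET.fromstring(payload)
    return [{"title": item.findtext("title", default=""),
             "url": item.findtext("link", default="")}
            for item in root.iter("item")]
```

Once every source has an adapter like these, the application on top only ever sees one record shape, no matter how many formats sit behind it.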

The Muse Content Machine lets publishers focus on getting the right content into the experiences that their customers want, regardless of whether those sources are in their key databases, at the customer's site or available from the Web. With thousands of different types of content sources already integrated into The Muse Content Machine, our answer to "can you integrate this?" is usually "been there, done that." Best of all, when changes to a source occur, The Muse Content Machine makes it easy to respond to those changes and keep all of your sources working together. To your customers it will be just one big powerful experience - but to you it will be the miracle of the most advanced content integration capabilities making everything that's not unified appear to be a unified source of content. The result: you'll have spent a lot less on developing content and a lot more on developing the experiences that will bring in higher revenues with lower maintenance costs.

Keep your eye on Elsevier as they begin to take advantage of Y.S. Chi's vision - and keep an eye on MuseGlobal as we help publishers, software companies and other media players to create more realizable value from unified content.

Monday, April 21, 2008

Dow Jones and Generate: Making the Most of Web Harvesting Services

Kudos to Dow Jones Enterprise Media for announcing their acquisition of Generate, the online business information harvesting service that has been a center of much buzz for the past several months. What made Generate such a hot topic of discussion? While there are a lot of companies out there involved in harvesting content from the Web, Generate was taking this content, brushing up its quality and enabling it to shine in high-end business applications that help sales and marketing professionals to understand quickly how to translate Web content into sales and business development opportunities. Instead of focusing just on the mass markets Generate was the first company that tried to turn harvested Web content into a high-end business information application.

A great story, but if it was so great why go the acquisition route now? I see two factors that made this a good time for Generate to cash in with Dow Jones. First, Dow Jones brings the Generate team a much larger and entrenched sales force already selling Factiva and Dow Jones feeds, products which have proven themselves but which are not likely to make huge new sales strides in a rougher economy. Adding Generate to their sales kit will enable them to penetrate more accounts more quickly without the "will this startup survive or not" question hanging over their heads. The second factor, though, is probably more important: having corralled all of the Web content that they could get their hands on through Web harvesting, how was Generate going to add more value to the product? Well, inevitably the answer would have been to add more content from licensed databases and from client databases.

Adding Factiva content to the Generate quiver of content sources is certain to give a boost to their value-add analysis capabilities. The question is, how many more sources can be integrated quickly with Generate's platform - or any other Web harvesting platform, for that matter? Web harvesting is a highly potent way to gather great business information - it's something that MuseGlobal does as well - but it's hardly the end point for making a great workflow application for today's enterprises as quickly as possible. Internal databases, subscription databases, file management platforms, Web site content management systems - all of these need to be sources for enterprise applications that are going to deliver the most valuable answers to today's professionals. Web harvesting is tuned to do one thing: to get the most important content out of Web pages as efficiently as possible. Integrating content from other sources, including real-time feeds, is not necessarily Web harvesting's strength.

Web harvesting engines are essentially Web search engine crawlers with special processing to extract specific fields of content from Web pages. That's great for what it is, but that's not necessarily going to get you timely content from other sources such as document servers, databases and datafeeds. Access methods, protocols, update cycles, security and logins, proprietary data formats - all these and more can make it difficult or downright impossible to use the same software that you use for Web harvesting to access other content sources and return the freshest information available. Our ten years of experience in developing content integration technology shows that it's a far better approach to let Web mining do what it does best and to use other techniques to integrate content from other sources into applications driven by federated content sources.
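As a rough illustration of the "special processing to extract specific fields" idea, here is a minimal Python sketch using only the standard library's html.parser. The class names it looks for (company, ceo) and the sample page are assumptions made up for this example; a real harvesting engine would add crawling, scheduling and far more robust parsing.

```python
from html.parser import HTMLParser
from typing import Dict, Iterable, Optional


class FieldExtractor(HTMLParser):
    """Collect the text of elements whose class attribute names a wanted field."""

    def __init__(self, wanted_classes: Iterable[str]):
        super().__init__()
        self.wanted = set(wanted_classes)
        self.fields: Dict[str, str] = {}
        self._current: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in self.wanted:
            self._current = cls
            self.fields[cls] = ""

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the capture, so nested
        # markup inside a field is not handled by this sketch.
        self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data


def extract_fields(html: str, wanted_classes: Iterable[str]) -> Dict[str, str]:
    """Return {field name: text} for the wanted fields found in the page."""
    parser = FieldExtractor(wanted_classes)
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.fields.items()}
```

Notice that everything here assumes the source is an HTML page. Nothing in it helps with a database login, a proprietary feed format or a document server, which is exactly the limitation described above.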

This is where content integration technology from MuseGlobal can help Web harvesting applications to shine. Instead of trying to get Web harvesting software to integrate other sources, why not use The Muse Content Machine to let Web harvesting do what it does best in a content integration architecture that's already able to integrate both Web harvesting and thousands of other types of content sources? The Muse Content Machine will enable you to take search engines, subscription databases, Muse Web harvesting or your own Web harvesting technologies, client databases, real-time news and data feeds and any other content source you need and get them to produce rich, federated content for your Web sites and client applications quickly and effectively. We can configure The Muse Content Machine to integrate all of the content sources that you need and return results from a single query into a single or multiple streams of updates or alerts tailored to your specifications - or build front-end applications with easy-to-use Muse application development tools that can federate content from all of your key content sources into rapidly developed user applications.

So our hats are off to Dow Jones for picking one of the most valuable up-and-coming companies harvesting insights from Web content. Generate's Web harvesting and semantic analysis combined with Dow Jones databases is sure to accelerate the power of business information in today's major enterprises. The Muse Content Machine can help these kinds of integrations of Web harvesting to move from concept to reality far faster than you may imagine. Then again, if you're familiar with our track record as the leader in OEM content integration, perhaps you can imagine it.

Monday, April 7, 2008

Why are Search Engines Still Searching for Answers to Content Integration?

Recently I read an interesting article entitled "Searching for an Answer in the Enterprise." The article notes that enterprises are beginning to come up against the uncomfortable fact that today's enterprise search engines were never really meant to deal with all of the various types of content resources that the typical major enterprise uses to store information. While your typical search engine can be trained fairly well to traverse some of the more common data sources such as major databases and document servers, there's a wealth of legacy systems, document management systems, content management systems and ERP systems that store much of the typical enterprise's internal wealth of content.

Unfortunately search engines aren't designed to deal with such a wide variety of content sources easily. Search engines are designed to look at accessible documents and to make an index of all of the information that's in them, so that someone can come along later and search that index. That's great - if you have access to all of the documents that you need to index on a regular basis. Unfortunately in many enterprises that's a very, very difficult thing to arrange. Many enterprise databases and other data repositories are enormous and are being updated constantly. To incorporate their contents in a separate search engine index would take enormous resources both for allowing the search engine to crawl such a database with some regularity and to update and store the search engine index - not to mention questions about security and data integrity that such a crawl might raise. In an era in which data leaks can mean huge legal and public relations issues, solving search can be a lot harder than it sounds in slide presentations.
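For readers who like to see the idea in code, here is a minimal Python sketch of the index-then-search approach described above. It is a toy inverted index under simplifying assumptions (whitespace tokenizing, every query word must match), and the function names are invented for this example rather than taken from any real search product.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Crawl every document once and map each word to the documents containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_index(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return the ids of documents that contain every word of the query."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set(index.get(terms[0], set()))
    for term in terms[1:]:
        matches &= index.get(term, set())
    return sorted(matches)
```

The catch shows up as soon as the source changes: a document added after build_index has run stays invisible to search_index until the whole source is crawled and indexed again, which is the cost that grows so quickly with large, constantly updated repositories.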

Federated search technology can help to overcome these limitations. Instead of trying to build one enormous index of every document that someone might need to search, a federated search engine will take one search request and then formulate queries for each of the sources where there's content that might answer someone's question. Instead of relying on occasionally updated Web search indexes, a federated search enables the index associated with each content source to be traversed separately. This means that as soon as each of those individual indexes is updated you're going to have access to the most current information. The results from each source are then combined to display the most relevant content available from all sources.
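The flow described above (one request in, one query per source, results merged by relevance) can be sketched in a few lines of Python. This is a simplified illustration and not Muse code: the Hit record, the make_connector helper and the word-overlap score are all assumptions made for the example, and the in-memory "sources" stand in for real databases and search engines.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str
    title: str
    score: float  # fraction of query words matched, between 0 and 1


# A connector takes the user's query and returns hits from one source.
Connector = Callable[[str], List[Hit]]


def make_connector(name: str, documents: Dict[str, str]) -> Connector:
    """Build a toy connector over an in-memory source (hypothetical)."""
    def search(query: str) -> List[Hit]:
        terms = query.lower().split()
        hits = []
        for title, text in documents.items():
            words = text.lower().split()
            matched = sum(1 for term in terms if term in words)
            if matched:
                hits.append(Hit(name, title, matched / len(terms)))
        return hits
    return search


def federated_search(query: str, connectors: List[Connector]) -> List[Hit]:
    """Send one query to every source in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(connectors))) as pool:
        per_source = list(pool.map(lambda connector: connector(query), connectors))
    merged = [hit for hits in per_source for hit in hits]
    # Highest score first; sorted() is stable, so ties keep source order.
    return sorted(merged, key=lambda hit: hit.score, reverse=True)
```

Because each connector queries its source at request time, there is no central index to go stale. The hard part in practice is the merge step, since real sources report relevance on very different scales.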

This certainly helps to solve the problem of getting the very precise results that the typical enterprise user expects - they don't want a pretty relevant report on last quarter's results, they want THE report on last quarter's results - but it's an approach that only works when you have access to every source of content that someone may need. If you don't then people will be disappointed and go off to use another way to find the information that they need. In other words, your investment in search technology will be ignored.

So to do federated searching the right way you need access to ALL of the possible content sources that someone may need to access. Yet this article points out:
In addition, Microsoft has moved to the forefront in the promotion of new federated search capabilities based on Creative Commons' OpenSearch standard. Several companies, including Open Text Corp., Business Objects, Cognos and EMC Corp., are developing federated search connectors to enable Microsoft's enterprise search customers to connect to their information systems.
Well, this is good news, I suppose, but why settle for "some" when you need all? This is of course where Muse comes in. We've been doing nothing but develop technology for federated content integration for more than ten years. Of course our thousands of pre-built source connectors can help you to do all of the federated searching that you need, but that's just where federated content integration starts. In addition to search engines, databases, ERP, ECM and the other alphabet soup acronyms that make up today's universe of enterprise content repositories Muse technology enables our partners to deliver content from the Web, from subscription content sources, from datafeeds and from web mining applications as well. Our partners get OEM software from Muse that enables them to equip any search service, Web application, email newsletter service or enterprise software application with all of the content that they need to deliver.

So our hats are off to all of those enterprise software vendors who are working hard to solve part of the federated content integration problem. You're doing great, but let us know when we can help you with a solution to all of your federated content integration needs. We're ready when you are!