MuseGlobal

Monday, April 21, 2008

Dow Jones and Generate: Making the Most of Web Harvesting Services

Kudos to Dow Jones Enterprise Media for announcing their acquisition of Generate, the online business information harvesting service that has been the center of much buzz for the past several months. What made Generate such a hot topic of discussion? While there are a lot of companies out there involved in harvesting content from the Web, Generate was taking this content, brushing up its quality and enabling it to shine in high-end business applications that help sales and marketing professionals quickly understand how to translate Web content into sales and business development opportunities. Instead of focusing just on the mass markets, Generate was the first company that tried to turn harvested Web content into a high-end business information application.

A great story, but if it was so great, why go the acquisition route now? I see two factors that made this a good time for Generate to cash in with Dow Jones. First, Dow Jones brings the Generate team a much larger, entrenched sales force already selling Factiva and Dow Jones feeds, products which have proven themselves but which are not likely to make huge new sales strides in a rougher economy. Adding Generate to their sales kit will enable them to penetrate more accounts more quickly without the "will this startup survive or not" question hanging over their heads. The second factor, though, is probably more important: having corralled all of the Web content that they could get their hands on through Web harvesting, how was Generate going to add more value to the product? Well, inevitably the answer would have been to add more content from licensed databases and from client databases.

Adding Factiva content to the Generate quiver of content sources is certain to give a boost to their value-add analysis capabilities. The question is, how many more sources can be integrated quickly with Generate's platform - or any other Web harvesting platform, for that matter? Web harvesting is a highly potent way to gather great business information - it's something that MuseGlobal does as well - but it's hardly the end point for making a great workflow application for today's enterprises as quickly as possible. Internal databases, subscription databases, file management platforms, Web site content management systems - all of these need to be sources for enterprise applications that are going to deliver the most valuable answers to today's professionals. Web harvesting is tuned to do one thing well: to get the most important content out of Web pages as efficiently as possible. Integrating content from other sources, including real-time feeds, is not necessarily Web harvesting's strength.

Web harvesting engines are essentially Web search engine crawlers with special processing to extract specific fields of content from Web pages. That's great for what it is, but that's not necessarily going to get you timely content from other sources such as document servers, databases and datafeeds. Access methods, protocols, update cycles, security and logins, proprietary data formats - all these and more can make it difficult or downright impossible to use the same software that you use for Web harvesting to access other content sources and return the freshest information available. Our ten years of experience in developing content integration technology shows that it's a far better approach to let Web mining do what it does best and to use other techniques to integrate content from other sources into applications driven by federated content sources.
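To make the "special processing to extract specific fields" idea concrete, here is a minimal sketch in Python using only the standard library. It assumes, purely for illustration, that the fields of interest are marked with CSS class names such as `company` and `city`; the class names and the `FieldExtractor` and `extract_fields` helpers are hypothetical and are not part of any Generate or Muse product.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collects the text of elements whose class attribute names a wanted field.

    Simplified on purpose: capture stops at the first closing tag seen,
    so deeply nested markup is not handled.
    """

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes)
        self.fields = {}
        self._current = None  # field currently being captured, if any

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.wanted:
                self._current = name
                self.fields.setdefault(name, "")

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


def extract_fields(html, wanted_classes):
    """Return a dict mapping each wanted class name found to its text."""
    parser = FieldExtractor(wanted_classes)
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.fields.items()}
```

A real harvesting engine adds crawling, scheduling and far more robust parsing on top of this. The point of the sketch is only that the extraction logic is tied to how Web pages are marked up, which is why it does not carry over to databases or feeds.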

This is where content integration technology from MuseGlobal can help Web harvesting applications to shine. Instead of trying to get Web harvesting software to integrate other sources, why not use The Muse Content Machine to let Web harvesting do what it does best in a content integration architecture that's already able to integrate both Web harvesting and thousands of other types of content sources? The Muse Content Machine will enable you to take search engines, subscription databases, Muse Web harvesting or your own Web harvesting technologies, client databases, real-time news and data feeds and any other content source you need and get them to produce rich, federated content for your Web sites and client applications quickly and effectively. We can configure The Muse Content Machine to integrate all of the content sources that you need and return results from a single query into one or more streams of updates or alerts tailored to your specifications - or build front-end applications with easy-to-use Muse application development tools that can federate content from all of your key content sources into rapidly developed user applications.
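The general pattern behind "results from a single query" across many sources can be sketched in a few lines of Python. This is an illustration of federated search in general, not The Muse Content Machine's actual interface: the `federated_search` function and the idea of representing each source as a plain callable are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def federated_search(query, sources):
    """Send one query to every source and merge the results.

    `sources` maps a source name to a callable that takes the query and
    returns a list of dicts (hypothetical record shape). A source that
    fails is reported in `errors` instead of aborting the whole search.
    """
    merged, errors = [], {}
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        # Fan the same query out to all sources in parallel
        futures = {name: pool.submit(fn, query) for name, fn in sources.items()}
        for name, future in futures.items():
            try:
                for record in future.result():
                    # Tag each record with the source it came from
                    merged.append({**record, "source": name})
            except Exception as exc:
                errors[name] = str(exc)
    return merged, errors
```

Each connector hides its own access method, login and data format behind the same callable shape, so a Web harvester, a subscription database and a news feed can all sit side by side. Ranking, de-duplication and streaming of alerts would be layered on top of the merged list.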

So our hats are off to Dow Jones for picking one of the most valuable up-and-coming companies harvesting insights from Web content. Generate's Web harvesting and semantic analysis combined with Dow Jones databases is sure to accelerate the power of business information in today's major enterprises. The Muse Content Machine can help these kinds of integrations of Web harvesting to move from concept to reality far faster than you may imagine. Then again, if you're familiar with our track record as the leader in OEM content integration, perhaps you can imagine it.
