
Wednesday, May 2, 2012

Federated Search & Big Data gets bigger

The world of independent Federated Search is diminishing; last week IBM announced that they will be acquiring Vivisimo.[1]  There are a number of interesting aspects to this, and the analysts have covered some of them [2],[3], but some particular quotes from IBM itself and the analysts piqued my interest:

“The combination of IBM's big data analytics capabilities with Vivisimo software will further IBM's efforts to automate the flow of data into business analytics applications …” [IBM]
“IBM also intends to use Vivisimo's technology to help fuel the learning process for their Watson applications.” [IDC]
“Overall, this is a very smart move for IBM, and it indicates that unstructured information is going to play an increasingly large role in the Big Data story…” [IDC]

All this shows that handling structured and unstructured information together is growing in importance.

What does IBM want Vivisimo for? It all seems to stem from Big Data and the analytics that it can produce to enable better corporate decisions. Of course, there’s also the lovely teaser of a better-performing Watson! Both Watson and Analytics massage vast amounts of data and information to draw conclusions, assign values, and create relationships. But, like all such endeavors, the quality of the result depends critically on the quality of the incoming data. GIGO says it all!

Big Data analytics work very well with structured data, where the “meaning” of each number or term is exactly known and can be algorithmically combined with its peers, parents, siblings, and opposites to give a visualization of the state of play at the moment or over time. Gathering such data is a tedious process (hooray for computers!), but is not intrinsically difficult. All that needs to happen is to set up a mapping from each data Source to the master and let it run. The mappings are precise and the process effective, but the volumes are vast and the time-to-repeat is rather slow for today’s fast-paced world.
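As a rough illustration of what "a mapping from each data Source to the master" can look like, here is a minimal Python sketch. The source names, field names, and the to_master helper are all made up for the example; a real deployment would read its mappings from a schema mapping table rather than hard-code them.

```python
from typing import Any

# Hypothetical mapping tables: source field name -> master field name.
SOURCE_MAPPINGS: dict[str, dict[str, str]] = {
    "crm": {"cust_id": "customer_id", "amt": "amount", "dt": "date"},
    "billing": {"CustomerNo": "customer_id", "Total": "amount", "InvoiceDate": "date"},
}


def to_master(source: str, row: dict[str, Any]) -> dict[str, Any]:
    """Rename the fields of one source row to the (assumed) master schema."""
    mapping = SOURCE_MAPPINGS[source]
    return {master: row[src] for src, master in mapping.items() if src in row}


# Made-up sample rows from two differently shaped Sources.
crm_row = {"cust_id": 17, "amt": 99.5, "dt": "2012-05-02"}
billing_row = {"CustomerNo": 17, "Total": 12.0, "InvoiceDate": "2012-04-30"}

master_rows = [to_master("crm", crm_row), to_master("billing", billing_row)]
```

Once the table exists, adding another Source of the same kind is only a new dictionary entry, which is why this side of the problem is tedious rather than hard.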

However, now add the fact that not everything you want to know is held in those nice regular relational database tables, and the picture looks far less rosy. Product reviews are unstructured, press releases are vague, social comments are fleeting, and technical and legal documents tend to be abstruse. But all these are vital if you want to make a really informed decision. So bring in Federated Search to the rescue.

Federated Search is a real-time activity. It is focused on just the data or information that is needed now, and it provides quality data. It is directed to just those Sources needed for “this report”, and it analyzes them in terms of known semantics so that the reviews, blogs, etc. mesh with the numerical analytics and provide the essential “external view” of the situation. And this is done right now, in real time. For the knowledge-based systems (like Watson) the FS Sources provide in-depth data pertinent to the current problem. And if the Sources don’t have it, FS goes and finds it, thus allowing Watson (as an example) to add it to its knowledge base and provide a more informed opinion.

So that is why IBM is adding Federated Search to its armory. What are the issues? In a word (or two): coverage and completeness.

All the Big Data systems use standardized access to the massive databases of the corporation’s transaction and repository systems. Most of these understand SQL or some other standard access language, and the customization is a matter of reading a schema mapping table. That mapping table is the same for every SharePoint or Exchange system (or similar), so once created, it is easily deployed. These standardized access methods are often referred to as “Indexing Connectors” because they extract enough data to enable the content to be indexed and searched. (For more on this see a future post on the deep differences between Connectors and Crawlers.)

Now, move to the world of web data and the complexity and difficulty escalate enormously. The number of formats and access methods multiplies almost to the point of one-to-one for each Source. As an example, look at the two press releases for this acquisition: IBM’s is a press release, with an initial dateline and no tags; Vivisimo’s [4] is a blog post with tags and an author. The same Connector will not make sense of both at the level of detail needed for a decision-making analysis.

Add in the velocity of the data in the social media (“velocity”, as you will recall, is one of the 3 “v”s that define Big Data – Volume, Variety, Velocity) and the relatively slow aggregation times of conventional databases become a problem. Timing is an issue because of volume, but also because applications have to analyze input data from users and other sources, store it in their transactional database, and then the ETL function has to extract from that database and move the data to the analytics database or storage area. These are two stages, both relatively slow, that must be batched together.

So, once you move from structured data to unstructured data, and from the sheltered waters of the corporation to the rough seas of the Web, a very different set of techniques is needed. And that is where Federated Search (FS) comes in. This is the truly hard part, and it’s where MuseGlobal shines. But first, some more information on what FS is, and what it needs to do.

FS is immediate, which involves many synchronization and “freshness” issues, but essentially solves the “velocity” problem by obtaining data as it is needed. That is because FS is an “on-demand” service. It is brought into play just-in-time to get the data when needed, not in batch mode to store it away just-in-case. Since it is used when needed, it must be able to target the Sources of interest right now. That means it is flexible and dynamically configured, not painstakingly set up ahead of time and left alone.
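To make the “on-demand” idea concrete, here is a small Python sketch of a federated query that fans out to a list of Sources chosen at request time and merges whatever comes back before a deadline. It is only an illustration under stated assumptions: the three connector functions are in-memory stand-ins invented for the example, and the timeout value is arbitrary. It does not describe how any vendor's product is built.

```python
import asyncio


# Hypothetical in-memory "Sources"; real connectors would speak each Source's own protocol.
async def search_news(query: str) -> list[dict]:
    await asyncio.sleep(0.01)
    return [{"source": "news", "title": f"{query} press release"}]


async def search_blogs(query: str) -> list[dict]:
    await asyncio.sleep(0.02)
    return [{"source": "blogs", "title": f"{query} blog post"}]


async def search_slow(query: str) -> list[dict]:
    await asyncio.sleep(5)  # simulates a Source that does not answer in time
    return [{"source": "slow", "title": "too late"}]


async def federated_search(query: str, connectors, timeout: float = 0.5) -> list[dict]:
    """Query only the chosen Sources, at request time, and merge what arrives in time."""

    async def guarded(connector):
        try:
            return await asyncio.wait_for(connector(query), timeout)
        except asyncio.TimeoutError:
            return []  # a slow Source must not hold up the whole report

    # All Sources are queried concurrently; gather keeps the connector order.
    batches = await asyncio.gather(*(guarded(c) for c in connectors))
    return [hit for batch in batches for hit in batch]


results = asyncio.run(
    federated_search("Vivisimo", [search_news, search_blogs, search_slow])
)
```

The list of connectors is just an argument, so each request can target a different set of Sources without any ahead-of-time setup, which is the contrast with batch harvesting drawn above.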

Since it is a focused operation, targeting only the data needed, it must be able to get the maximum out of each Source. This requires two levels of complexity not common in other types of connectors or crawlers. First is access: these Sources have specific protocols and search languages, and often security requirements. All these must be handled by the FS Connector so that the search is faithfully translated to the language of the Source, and the results are accurately retrieved. Second is getting the retrieved data into a usable form (and format). This involves a “deep extract” covering record formats, field/tag/schema semantics, content semantics, data normalization and cleansing, reference to ontologies, field splitting, field combination, entity extraction on rules and vocabularies, conversion to standard forms, enhancement with data from third Sources, and other manipulations. None of this is off-the-shelf processing where a single connector can be parameterized to work with all Sources. So FS starts at the “single, deep” end of the spectrum (crawlers are the epitome of the “broad, shallow” end) and builds Connectors to the characteristics of each Source.
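A toy version of the “deep extract” step might look like the Python sketch below: two Source-specific parsers, one for a dateline-style press release and one for a tagged blog post, both producing the same normalized record. The Record fields, the two input layouts, and the sample texts are all invented for illustration; real Connectors handle far more (ontologies, entity extraction, enrichment) than this shows.

```python
import re
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import Optional


@dataclass
class Record:
    """Assumed common form that every Connector must produce."""
    source: str
    published: date
    body: str
    author: Optional[str] = None
    tags: list = field(default_factory=list)


def parse_press_release(text: str) -> Record:
    """Connector for a made-up press-release layout: 'CITY - DD Mon YYYY: body'."""
    m = re.match(r"(?P<city>[^-]+) - (?P<day>\d{1,2} \w{3} \d{4}): (?P<body>.+)", text, re.S)
    if m is None:
        raise ValueError("press release layout changed")
    published = datetime.strptime(m["day"], "%d %b %Y").date()  # conversion to a standard form
    return Record("press", published, m["body"].strip())


def parse_blog_post(post: dict) -> Record:
    """Connector for a made-up blog export with author and free-form tags."""
    published = datetime.strptime(post["posted"], "%Y-%m-%d").date()
    tags = sorted({t.strip().lower() for t in post.get("tags", [])})  # cleansing + normalization
    return Record("blog", published, post["content"].strip(), post.get("author"), tags)


# Invented sample inputs, one per layout.
press = parse_press_release(
    "SPRINGFIELD - 02 May 2012: Example Corp today announced a sample acquisition."
)
blog = parse_blog_post({
    "posted": "2012-05-02",
    "author": "A. Writer",
    "tags": ["Big Data", " search "],
    "content": " We are joining Example Corp. ",
})
```

Neither parser can be reused for the other Source, yet downstream analytics see identical Record objects and can compare dates or merge results directly.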

These Connectors bring focused, quality data, but they come at a price. Vivisimo, MuseGlobal, and the other FS vendors build a very special type of software – something that we know will eventually fail when the characteristics of the Source change. This needs a special dynamic architecture to accommodate it. It needs very powerful ways to build Connectors which can involve data analysts and programmers, as well as highly sophisticated tools, such as the Muse Connector Builder. It needs a robust and automated way to check for end-of-life situations, such as the Muse Source Checker, and a highly automated build and deploy process – the Muse Source Factory has been delivering automated software updates for 11 years now. Source Connectors *will* stop working, and a big part of a viable FS ecosystem is being able to get them back online quickly and reliably. MuseGlobal has put together a data virtualization platform with thousands of Connectors, because we know there’s a one-to-one relationship with each data source if you want to connect to the world out there. Figuring out the unstructured data problem was one of our main goals at Muse from the very beginning, some 11 years ago.
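One simple way to detect a Connector that has stopped working is to run a known probe query through it on a schedule and validate what comes back. The Python sketch below shows that idea only; it is not a description of the Muse Source Checker, and the required fields, status names, and sample connectors are assumptions made for the example.

```python
from typing import Callable

REQUIRED_FIELDS = ("title", "published")  # assumed minimum for a usable record


def check_connector(connector: Callable[[str], list], probe_query: str) -> str:
    """Return 'ok', 'empty', or 'broken' for one connector, using a known probe query."""
    try:
        records = connector(probe_query)
    except Exception:
        return "broken"  # the Source changed its protocol, layout, or login
    if not records:
        return "empty"  # suspicious: a probe query is chosen so it always has hits
    for record in records:
        if any(not record.get(name) for name in REQUIRED_FIELDS):
            return "broken"  # fields went missing, so the extraction rules are stale
    return "ok"


# Hypothetical connectors standing in for three Sources.
def healthy(query):
    return [{"title": "t", "published": "2012-05-02"}]


def stale(query):
    return [{"title": "", "published": "2012-05-02"}]


def down(query):
    raise ConnectionError("Source unreachable")


report = {fn.__name__: check_connector(fn, "probe") for fn in (healthy, stale, down)}
```

Anything not reported as "ok" would be queued for repair, which is where an automated build and deploy process earns its keep.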

Of course, building Connectors in the first place is an equal challenge, including the human element of dealing with a multitude of companies publishing information and data. This is something all FS vendors have to handle, and MuseGlobal chose to create a Content Partner Program about 10 years ago, through which we talk regularly to hundreds of major Sources and content vendors. Breadth of coverage of the Connector library is a major factor in “getting up and running” time, and a major investment for the FS vendors. We believe that Muse has one of the largest libraries, with over 6,000 Source-specific Connectors, as well as the standard API, protocol, and search-language ones for access where that is appropriate – but still with the “deep extraction” which is the hallmark of Federated Search.

It is not an easy task to get right at a quality and sustainable level, but a few vendors have produced the technology. MuseGlobal is one – and Vivisimo is another.

IBM Analytics and Watson are set for a real quality revolution!

Another analyst's comments can be found on the enterprise search blog at [6].

(*) You will need to be a subscriber to see the report

