MuseGlobal

Monday, May 14, 2012

The Rise of the Connector - Part 1



Connectors are the heart and soul of Federated Search (FS) engines, and with the rise in importance of FS in today’s fast-paced, Big Data, analyze-everything world, they are crucial to smooth and efficient data virtualization and flow. MuseGlobal has been building Connectors, along with the architecture to use them (the Muse/ICE platform) and to maintain and support them (the Muse Source Factory), for over 12 years. The people who design and build Connectors must have rich technical expertise and a deep understanding of data and information in its myriad formulations.

This series of posts will look at the problems arising as data grew in volume, spread across systems, moved outside the enterprise, and became all-important for the business intelligence which informs current corporate decisions. Not surprisingly, as a leading FS platform, Muse and its ecosystem are at the forefront of providing solutions to data problems in the modern world.

This first post considers the growing importance of being able to access data from inside an organization. (The second post looks at the problems arising as data is needed from outside the enterprise, and the complexities of access and extraction that result.)

Part 1 - Wanted: data from over there, over here


As the world of Big Data grows daily and the importance of unstructured data becomes more evident to information workers and managers everywhere, methods of accessing that data become critical to success.

Typically, the majority of an enterprise’s data is held in relational DBMSs attached to the transaction systems that generate and use it. These include HR, Bill of Materials, Asset Management systems and the like. However, it is difficult for managers to make strategic decisions on even this data, because they need to see it all at once. The analysis managers need is performed by a Business Intelligence (BI) system, which works on data held in its own (OLAP) database, specially structured to give quick answers to pre-formulated questions.

And here is the first problem: transaction systems with lots of data, and an analysis system with an empty database. The solution: set up and run a batch process for each working database that takes a snapshot of its data, then transforms and loads it into the OLAP database. This is ETL (Extract, Transform, Load), and it is where most big company systems are at the moment. The transaction systems have no method of exporting the data, and the analysis engine just works from what it has. This three-part solution works, and it works well, but it has some problems.
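The batch process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `orders` and `sales_by_customer` tables, their columns, and the in-memory SQLite databases are hypothetical stand-ins for a real transaction system and OLAP store, not anything taken from Muse or a particular BI product.

```python
import sqlite3

def extract(source):
    """Extract: take a full snapshot of the (hypothetical) orders table."""
    return source.execute(
        "SELECT customer, product, amount_cents FROM orders"
    ).fetchall()

def transform(rows):
    """Transform: roll raw transactions up into per-customer totals."""
    totals = {}
    for customer, _product, amount_cents in rows:
        totals[customer] = totals.get(customer, 0) + amount_cents
    return sorted(totals.items())

def load(olap, totals):
    """Load: replace the analysis table with the fresh snapshot."""
    olap.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_customer "
        "(customer TEXT PRIMARY KEY, total_cents INTEGER)"
    )
    olap.execute("DELETE FROM sales_by_customer")
    olap.executemany("INSERT INTO sales_by_customer VALUES (?, ?)", totals)
    olap.commit()

def run_nightly_etl(source, olap):
    """Sweep up everything in the source, regardless of what is later asked."""
    load(olap, transform(extract(source)))

def make_demo_source():
    """Build a tiny in-memory stand-in for a transaction system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (customer TEXT, product TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [("acme", "widget", 1000), ("acme", "gadget", 250), ("globex", "widget", 400)],
    )
    conn.commit()
    return conn
```

Note that `extract` reads every row on every run; that is what makes the approach simple, and also what makes it slow and stale as volumes grow.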

Running a snapshot ETL on each working system at “midnight” obviously takes time, and the data can be nearly a day old before the process even starts. This lack of “freshness” didn’t matter too much 5 or even 2 years ago. It took so long to change systems as a result of the analysis that data a day or so old was not on the critical path. But today’s systems can adapt much more rapidly, and business decisions need to be based on hourly or even by-the-minute data. (Of course, if you are in the stock and financial markets then your timescale is down to microseconds, and you have specialist systems tailored for that level of response.) So first we need to improve on our timing.

In order to do that we need to move from a just-in-case operation to a just-in-time one. Rather than collect all the data once a day, we need to be able to gather it exactly when we need it. Of course, gathering it overnight as historic data is still important, and it makes the whole process work more smoothly: the just-in-time data is now only a few hours’ worth, and so can be processed that much more quickly to get it into the BI system. Now we have a two-legged approach: batch bulk and focused immediate updates. Sounds good, but the ETL software for the batch work will not handle the real-time nature of the j-i-t data requests.

For a start, the ETL process grabs everything in the transaction system database: all customers, all products, all markets. But a manager is generally going to ask for a report on a specific customer or product. It would be endlessly wasteful to grab all that “fresh” data for all customers when only data for one is needed. So the j-i-t process has to be able to query the transaction system, rather than sweep up everything. It is also almost certain that the required report will need data from more than one transaction system, but probably not all of them. ETL is not set up to do this; the job needs a system capable of directing queries at designated systems and transforming those results. And, finally, the extracted data may well need to be in a different format. After all, we are now loading the data directly into the Business Intelligence analysis engine for this report (for speed), and not importing it into the OLAP database. This means that the structure and semantics are all different.
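A minimal sketch of such a just-in-time request follows. Everything in it is hypothetical: the source names, the record fields, and the `jit_report` helper are invented for illustration, and plain Python lists stand in for the transaction systems.

```python
def make_source(rows):
    """Wrap a list of records as a queryable stand-in for a transaction system."""
    def query(customer):
        # Only the rows for the requested customer leave the source.
        return [row for row in rows if row["customer"] == customer]
    return query

# Hypothetical transaction systems and their contents.
SOURCES = {
    "billing": make_source([
        {"customer": "acme", "metric": "open_invoices", "value": 3},
        {"customer": "globex", "metric": "open_invoices", "value": 1},
    ]),
    "assets": make_source([
        {"customer": "acme", "metric": "leased_units", "value": 12},
    ]),
    "hr": make_source([
        {"customer": "acme", "metric": "account_managers", "value": 2},
    ]),
}

def jit_report(customer, source_names, sources=SOURCES):
    """Query only the designated sources, for one customer, and reshape the rows."""
    report = []
    for name in source_names:
        for row in sources[name](customer):
            # Transform: tag each row with its origin and keep only the
            # fields the (assumed) analysis engine wants.
            report.append(
                {"source": name, "metric": row["metric"], "value": row["value"]}
            )
    return report
```

Calling `jit_report("acme", ["billing", "assets"])` touches two of the three systems and returns only the rows for that one customer, which is the contrast with the sweep-everything batch job.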

Increasingly, the tools of choice for these j-i-t operations are Federated Search (FS) systems such as MuseGlobal’s Muse platform. They can search a designated set of sources (transaction systems), run a specific query against them, and then re-format the results and send them directly to the analysis engine. Initial examples of FS systems are user-driven, but for this data integration purpose the more sophisticated FS systems are able to accept command strings and messages in a wide variety of protocols, formats and languages, and act on them. This allows the FS system to act as a middleman, getting the data the BI engine needs exactly when it needs it. Muse, for example, through its use of “Bridges”, can accept command inputs in over a dozen distinctly different protocols, and can query all the major enterprise management suites in a native or standards-based protocol.


Should we move?


The need for speed of analysis and the volume of data involved grow every day, it seems. It takes time to extract all that data and to build a big OLAP database just in case we want it. What’s more, building the structure, and changing it to adapt to changing analysis needs, takes time, and a lot of it.

So modern BI systems have moved to holding their database in memory, rather than on disk, just so everything is that much faster. Modern analysis engines, many based on the Apache project’s Hadoop engine, can handle a lot of data in a big computer, and do it rapidly. Both Oracle (Exalytics) and SAP (HANA) have introduced these combined in-memory database plus analytics engines, and others are coming. (InformationWeek has covered the war of words surrounding them.) These engines can be rapidly configured (often in real time, through a dashboard) to give a new analysis report, as long as they have the data!

Moving all that data from the transaction system takes time, so the current mode is to leave it there and rely on real-time acquisition of what is needed. This is of course much less disruptive, fresher, and much more focused on the analysis at hand. This is not to say that historical data is not important; it is, and it is used by these engines, but the emphasis is more and more on that last bar on the graph.

So, again, we need a delivery engine to get our data for us from all the corporate data silos, get it when it is needed, and then deliver it to the maw of the BI analytics engine. Once again the systems integration, dynamic configuration and deep extraction technologies of a Federated Search engine come to the rescue. Muse provides the real-time capabilities, parallel processing architecture, session management, and protocol flexibility needed to deliver large quantities of data when asked for, or on a continuing “feed” basis.
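To make the idea concrete, here is a rough sketch of a federated query fanned out in parallel across connectors, each of which speaks its source’s native format and normalizes the results into one common shape. It is not Muse’s API: the `HrConnector` and `BillingConnector` classes, their `search` and `normalize` methods, and the record layout are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

class HrConnector:
    """Hypothetical connector whose source returns tuples."""
    name = "hr"
    _rows = [("E1", "acme", "Ada"), ("E2", "globex", "Grace")]

    def search(self, customer):
        return [row for row in self._rows if row[1] == customer]

    def normalize(self, raw):
        return {"source": self.name, "customer": raw[1], "detail": raw[2]}

class BillingConnector:
    """Hypothetical connector whose source returns dictionaries."""
    name = "billing"
    _rows = [
        {"cust": "acme", "invoice": "INV-7"},
        {"cust": "globex", "invoice": "INV-8"},
    ]

    def search(self, customer):
        return [row for row in self._rows if row["cust"] == customer]

    def normalize(self, raw):
        return {"source": self.name, "customer": raw["cust"], "detail": raw["invoice"]}

def query_connector(connector, customer):
    """Run one query against one source and normalize its results."""
    return [connector.normalize(raw) for raw in connector.search(customer)]

def federated_search(connectors, customer):
    """Query all designated connectors concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(connectors))) as pool:
        # pool.map yields results in the order the connectors were given.
        per_source = pool.map(lambda c: query_connector(c, customer), connectors)
        return [record for records in per_source for record in records]
```

The design point is that the caller sees a single record shape regardless of how each source stores its data; adding a new silo means adding a connector rather than changing the analysis side.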
