Friday, April 07, 2006

How Robust is Your ESB?

Organizations are increasingly looking at using an Enterprise Service Bus as the technical foundation for building SOAs. With the ESB playing such a critical role in your SOA infrastructure, it is important to understand the characteristics by which to measure the robustness of your ESB. There are various definitions of an ESB in the industry, so the best way to look at them is by the capabilities that people commonly have in mind when they refer to an ESB.

Although the functionality of an ESB can vary greatly from vendor to vendor, you will find that common to all of them is a core set of functionality that includes the following:

  • Protocol Adapters
  • Transformation
  • Routing
  • Orchestration
  • Synchronous/Asynchronous Messaging
  • Endpoint virtualization

Based on this reference set of capabilities and an understanding of how this type of functionality is typically implemented, one can get an idea of where there may be potential performance bottlenecks, fragile points, and other robustness issues that can plague an ESB. Knowing what these are can help you make a more informed decision when choosing your ESB vendor and product.

Protocol Adapters
One of the core functions of an ESB is protocol adaptation, that is, connecting services that run on one protocol to services that run on another. Examples include adapting from SOAP to JMS, FTP to SOAP, or SOAP to IIOP. ESBs that are built on top of message-oriented middleware will typically use adapters that connect different protocols into the messaging bus. This is similar to how some products accomplish data transformation using a canonical data model, but here the ESB uses a canonical protocol internally, based on messaging. Adapters just transform other protocols to and from this format, while internal components communicate using the normalized protocol. ESBs that are not built on top of message-oriented middleware will typically use a pipes-and-filters architecture in which messages flow through a sequence of interceptors. This type of architecture usually performs both protocol adaptation and data transformation as the messages flow through the interceptors.
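
To make the canonical-protocol idea concrete, here is a minimal sketch in Python. It is not taken from any ESB product: `CanonicalMessage`, `JsonAdapter`, `KeyValueAdapter`, and `bridge` are hypothetical names, and the two toy wire formats merely stand in for real protocols such as SOAP or JMS. Each adapter only knows how to convert to and from the internal format, so no adapter needs to know about any other.

```python
import json
from dataclasses import dataclass, field


@dataclass
class CanonicalMessage:
    """Hypothetical normalized format used inside the bus."""
    headers: dict = field(default_factory=dict)
    body: str = ""


class JsonAdapter:
    """Stand-in protocol #1: a JSON envelope with 'headers' and 'body'."""

    def to_canonical(self, raw: str) -> CanonicalMessage:
        doc = json.loads(raw)
        return CanonicalMessage(headers=doc.get("headers", {}), body=doc["body"])

    def from_canonical(self, msg: CanonicalMessage) -> str:
        return json.dumps({"headers": msg.headers, "body": msg.body})


class KeyValueAdapter:
    """Stand-in protocol #2: 'key=value' header lines, a blank line, then the body."""

    def to_canonical(self, raw: str) -> CanonicalMessage:
        head, _, body = raw.partition("\n\n")
        headers = dict(line.split("=", 1) for line in head.splitlines() if line)
        return CanonicalMessage(headers=headers, body=body)

    def from_canonical(self, msg: CanonicalMessage) -> str:
        head = "\n".join(f"{key}={value}" for key, value in msg.headers.items())
        return f"{head}\n\n{msg.body}"


def bridge(raw: str, inbound, outbound) -> str:
    """Adapt a message from one protocol to another via the canonical format."""
    return outbound.from_canonical(inbound.to_canonical(raw))
```

With N protocols this design needs N adapters rather than one converter per pair of protocols, at the cost of two conversions per message. That double conversion is one place to look when measuring adapter overhead.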

Transformation
Data transformation is very resource intensive. All the parsing that is required makes it CPU intensive. The resulting parse trees are loaded into memory for manipulation, which makes it memory intensive as well. For very large datasets, not everything can be held in memory at once, so the operations also become I/O intensive. Because of these characteristics, it is important to look for an ESB with a high-quality transformation engine. Evaluate the engine using different types of datasets and transformations. For example, how does it handle large, complex source and target schemas? How does it handle complex transformations? How does it handle large datasets? Run tests using these scenarios and observe the CPU, memory, and I/O usage of the transformation engine.
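
As a rough illustration of the kind of test described above, the following Python sketch times a toy XML transformation and records its peak memory at several dataset sizes. Everything in it is an assumption made for illustration: the `<orders>` and `<invoices>` documents are invented, the standard-library `xml.etree.ElementTree` parser stands in for a real transformation engine, and I/O is not measured at all because the data never leaves memory.

```python
import time
import tracemalloc
import xml.etree.ElementTree as ET


def make_orders(n: int) -> str:
    """Build a synthetic source document with n <order> elements."""
    items = "".join(
        f'<order id="{i}"><amount>{i * 1.5}</amount></order>' for i in range(n)
    )
    return f"<orders>{items}</orders>"


def transform(source_xml: str) -> str:
    """Toy transformation: rename elements and move <amount> into an attribute."""
    root = ET.fromstring(source_xml)  # the whole parse tree is held in memory
    out = ET.Element("invoices")
    for order in root.iter("order"):
        ET.SubElement(
            out, "invoice", ref=order.get("id"), total=order.findtext("amount")
        )
    return ET.tostring(out, encoding="unicode")


def profile(n: int) -> dict:
    """Measure CPU time and peak Python memory for transforming n orders."""
    source = make_orders(n)
    tracemalloc.start()
    cpu_start = time.process_time()
    result = transform(source)
    cpu_seconds = time.process_time() - cpu_start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "orders": n,
        "invoices": result.count("<invoice "),
        "cpu_seconds": cpu_seconds,
        "peak_bytes": peak_bytes,
    }


if __name__ == "__main__":
    for size in (1_000, 10_000, 100_000):
        print(profile(size))
```

The interesting result is how the numbers grow as the dataset grows, not any single reading. A real evaluation would run the vendor's own engine against your own schemas and watch the process from outside as well.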

Routing
Services communicate with each other by placing messages onto the bus, and the ESB determines how to route those messages to the correct destinations. There are a couple of different ways in which this functionality may be implemented. Some implementations use a generic set of dispatchers that listen on a set of generic incoming channels and dispatch the messages to their destinations. Another approach uses specific channels for the different services. With this approach, the service consumer just places the message onto the channel of its intended service, and a dispatcher specific to that service dispatches the message to the destination.
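
The two routing styles described above can be sketched side by side. This is a single-threaded Python toy, not a description of how any particular ESB is built: `GenericDispatcher` and `ChannelPerService` are hypothetical names, in-process `queue.Queue` objects stand in for bus channels, and plain callables stand in for services.

```python
from queue import Queue


class GenericDispatcher:
    """One shared inbound channel; the destination is read from each message."""

    def __init__(self, handlers):
        self.handlers = handlers  # service name -> callable
        self.inbound = Queue()

    def send(self, service, body):
        # The destination travels with the message as a header.
        self.inbound.put({"to": service, "body": body})

    def drain(self):
        while not self.inbound.empty():
            msg = self.inbound.get()
            self.handlers[msg["to"]](msg["body"])  # per-message lookup


class ChannelPerService:
    """One channel per service; the channel itself identifies the destination."""

    def __init__(self, handlers):
        self.handlers = handlers
        self.channels = {name: Queue() for name in handlers}

    def send(self, service, body):
        self.channels[service].put(body)

    def drain(self):
        for name, channel in self.channels.items():
            while not channel.empty():
                self.handlers[name](channel.get())
```

The trade-off is visible even at this scale. The generic dispatcher inspects every message on a single shared channel, which can become a hot spot. The per-service variant avoids that inspection but multiplies the number of channels and dispatchers the bus has to manage.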

Some implementations route messages by having different queues for different destinations. Message routers often use multiple threads to listen for incoming messages and route them to their destinations. Observe the CPU and memory usage of the message bus processes to see whether the message router is saturating the message bus. How does the message router perform for different types of routes, multiple hops, and so on? Parsing of the message itinerary can also be expensive.

Orchestration
Orchestration allows developers to hook together multiple services using complex logic to form a larger integrated process. Most ESBs provide orchestration capabilities based on some type of BPEL (or equivalent) orchestration engine. Orchestration is a complex capability, and there are many issues to consider when evaluating an ESB’s orchestration capabilities. Does the engine pre-compile the orchestration instructions to improve performance, or does it process the instructions on the fly? If the instructions are pre-compiled, it will be more difficult to dynamically update the process to respond to new events. If they are processed on the fly, interpreting them at run time may cost performance. A process is typically long running and thus maintains state and context, which means that the orchestration engine is usually a stateful service. Being stateful makes it more difficult to distribute for failover and scalability. For example, if the ESB instance hosting the orchestration engine in which the process is running fails, how will the process and its state be migrated to a backup instance? The statefulness also creates an affinity with a particular ESB instance: messages have to flow through that instance, which makes load distribution across multiple instances difficult, if not impossible.

Keep an eye on the memory usage of the orchestration engine, since many implementations build an entire object graph of the process. How many processes can it run concurrently?
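
One way to picture the failover question is a process instance whose state is checkpointed to a shared store after every step, so that a backup engine can pick it up. The Python sketch below assumes exactly that design, which is only one of several possibilities and is not claimed to be what any BPEL engine does. The step names, `ProcessInstance`, and the dictionary used as a shared store are all hypothetical.

```python
import json

# Hypothetical process definition: three steps executed in order.
STEPS = ["reserve_stock", "charge_card", "ship_order"]


class ProcessInstance:
    """A long-running process: its state is the next step plus a context."""

    def __init__(self, process_id, next_step=0, context=None):
        self.process_id = process_id
        self.next_step = next_step
        self.context = context if context is not None else {}

    @property
    def finished(self):
        return self.next_step >= len(STEPS)

    def run_one_step(self):
        step = STEPS[self.next_step]
        self.context[step] = "done"  # stand-in for invoking a real service
        self.next_step += 1

    def checkpoint(self) -> str:
        """Serialize everything a backup engine would need to resume."""
        return json.dumps(
            {
                "process_id": self.process_id,
                "next_step": self.next_step,
                "context": self.context,
            }
        )

    @classmethod
    def restore(cls, snapshot: str) -> "ProcessInstance":
        return cls(**json.loads(snapshot))


def failover_demo() -> ProcessInstance:
    shared_store = {}  # stand-in for a database both engines can reach

    primary = ProcessInstance("order-42")
    primary.run_one_step()
    shared_store[primary.process_id] = primary.checkpoint()
    del primary  # simulate the primary ESB instance failing

    backup = ProcessInstance.restore(shared_store["order-42"])
    while not backup.finished:
        backup.run_one_step()
        shared_store[backup.process_id] = backup.checkpoint()
    return backup
```

The price of this approach is a write to the shared store on every step, so it is worth asking a vendor how often state is persisted and what happens to work done since the last checkpoint.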

Synchronous/Asynchronous Messaging
The messaging backbone is what enables the highly distributed nature of an ESB. As a result, the capabilities of the ESB can be distributed across multiple nodes, which is one of the primary distinctions between an ESB and the more traditional EAI hubs. Some vendors whose products don't have a messaging backbone will argue that one is not necessary to achieve this highly distributed characteristic. Regardless of how the capability is implemented, most ESBs support highly distributed synchronous and asynchronous messaging, so it is important to understand which characteristics matter when evaluating it. The important things to consider are the throughput of the messaging, the size of the messages, the latency of message transfer, and the reliability of message delivery. Observe how these metrics vary across different messaging patterns:

  • Small, chatty messages
  • Large, bursty messages
  • High volumes of inbound messages
  • High volumes of outbound messages
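
A small harness along the following lines can compare throughput and latency across messaging patterns. This Python sketch makes heavy assumptions: an in-process `queue.Queue` stands in for the bus, so the absolute numbers say nothing about a real ESB, and the message sizes and counts are arbitrary. Reliability of delivery is not measured. The function name `measure` is hypothetical.

```python
import time
from queue import Queue
from statistics import mean


def measure(message_size: int, count: int) -> dict:
    """Send `count` messages of `message_size` bytes through a stand-in channel."""
    channel = Queue()  # placeholder for the transport under test
    payload = b"x" * message_size
    latencies = []

    start = time.perf_counter()
    for _ in range(count):
        channel.put((time.perf_counter(), payload))
        sent_at, _body = channel.get()
        latencies.append(time.perf_counter() - sent_at)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero

    return {
        "messages": count,
        "messages_per_second": count / elapsed,
        "bytes_per_second": count * message_size / elapsed,
        "mean_latency_seconds": mean(latencies),
    }


if __name__ == "__main__":
    print("small, chatty:", measure(message_size=128, count=10_000))
    print("large, bursty:", measure(message_size=1_000_000, count=50))
```

To test a real product, the `put` and `get` calls would be replaced with the ESB's own send and receive operations, ideally with the sender and receiver on separate nodes.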

Endpoint Virtualization
Clients are abstracted away from the actual physical network location of a service. They invoke services by using logical addresses for the service endpoints provided by the ESB. This allows for such things as multiple instances of that service for load-balancing and failover. One of the concerns here is whether or not a particular ESB's implementation of endpoint virtualization is interoperable with the WS-Addressing standards.
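
In code, endpoint virtualization amounts to a lookup from a logical address to one of several physical ones. The sketch below is a minimal Python illustration under assumed names: `EndpointRegistry`, the `esb://orders` logical address, and the `10.0.0.x` URLs are all made up, and simple round-robin with a manually maintained "down" list stands in for whatever load-balancing and health-checking a real ESB performs. It does not attempt to model WS-Addressing.

```python
class EndpointRegistry:
    """Maps logical service addresses to physical endpoints (round-robin)."""

    def __init__(self):
        self._endpoints = {}  # logical address -> list of physical URLs
        self._cursors = {}    # logical address -> index of next URL to try
        self._down = set()    # physical URLs currently considered unavailable

    def register(self, logical, physical_urls):
        self._endpoints[logical] = list(physical_urls)
        self._cursors[logical] = 0

    def mark_down(self, url):
        self._down.add(url)

    def mark_up(self, url):
        self._down.discard(url)

    def resolve(self, logical):
        """Return the next healthy physical endpoint for a logical address."""
        urls = self._endpoints[logical]
        for _ in range(len(urls)):
            index = self._cursors[logical]
            self._cursors[logical] = (index + 1) % len(urls)
            if urls[index] not in self._down:
                return urls[index]
        raise LookupError(f"no healthy endpoint for {logical}")


if __name__ == "__main__":
    registry = EndpointRegistry()
    registry.register(
        "esb://orders",
        ["http://10.0.0.1:8080/orders", "http://10.0.0.2:8080/orders"],
    )
    print(registry.resolve("esb://orders"))  # first instance
    registry.mark_down("http://10.0.0.2:8080/orders")
    print(registry.resolve("esb://orders"))  # skips the failed instance
```

The client only ever sees the logical address, so instances can be added, removed, or marked down without any client changes.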


