Sunday, July 20, 2008

The Space as a Messaging Backbone - Why You Should Care

One of the main themes of GigaSpaces 6.5 XAP release (which is also true for our direction as a company in general), is the fact that we consider ourselves as a very complete application platform, and not just an IMDG solution. This has been evident in prior versions of the product, by with 6.5 we believe we got even closer to that goal, and that in many cases we can replace altogether a JEE application server.
We are taking this quite seriously, and actually invested a lot in making sure that this vision is also realized. A major part of this effort is a project we took upon ourselves to test and document the migration process from a typical JEE application to full Space Based Architecture. To keep the comparison as unbiased as possible, we delegated the actual coding and testing work to a team of professionals at Grid Dynamics (which BTW did a great job).
They first implemented a typical OLTP application using JEE (JMS, EJB, Hibernate, Spring) deployed it on a leading app server and tested the application in terms of latency, throughput, CPU utilization, etc. Next, they migrated the application in a few steps to full SBA.
Each step was documented and tested to assess its affect on the system and validate the benefits it gives. The steps were as follows:
  1. Add 2nd level cache to Hiberante, which didn't require any code change.
  2. Change the data acccess layer to use the space instead of the database, with data base write behind (or Persistency as a Service) in the background
  3. Change the messaging layer to use the Space instead of the app server's JMS implementation
  4. Implement full SBA by collocating business logic, data and messaging in the same JVM by utilizing GigaSpaces processing units
Interestingly, steps 1 and 2 provided nice improvement in latency, but almost none in the throughput front, as shown here. After analyzing these results, we realized that the main bottleneck for throughput increase was the application server's messaging infrastructure, which uses the disk to maintain resiliency for JMS "persistent messages" and is working in an "active-standby" topology (which means only one node is handling messages at any given moment).
When replacing it with the Space, we saw a major increase in throughput, as seen in steps 3 and 4 in the graph above. Another important fact which is not evident from the graph, is that GigaSpaces is much more scalable since the Space can be partitioned across multiple JVMs, and message load is shared between partitions (unlike most MOMs, which end up writing to the disk and typically use the active-standby topology).

Who Else Should Care
Besides ordinary JEE applications, this information is also very relevant to ESB-based applications. When implementing such applications, most people overlook the above, and use messaging servers or even worse, relational databases to handle the transport for their ESB based applications. For example, 65.3 % of Mule users use JDBC for transport, 48.4% use JMS, 30% use MQ Series for tarnsport (accroding to the 2007 Mule user survey).
So you see the benefit in using GigaSpaces as a transport layer instead of the traditional solutions.
With GigaSpaces 6.5, we have built in the integration with Mule into the product, so you can enjoy the GigaSpaces scalable transport (and other integration points) out of the box.
In addition, there are a number of other ESB integrations available.
You can find a very complete integration package with Apache ServiceMix in our community web site, OpenSpaces.org. Also, here's a nice integration project of JavaSpaces and Apache Camel which also enables you to use GigaSpaces as a transport layer with Camel.

Going back to the migration project I mentioned above, we plan to publish the application itself and the results we came up with in the process so people can check it our for themselves and understand how we did things. So stay tuned, there's a lot more interesting stuff to follow.

My TechTalk at TSS.com

You're all welcome to watch my talk at TSSJS Prague about the challenges in scaling web 2.0 applications. It's now online at TSS.com