Sunday, January 13, 2008

OpenSpaces Dynamic Scripting Support

One of the things I intend to do under my new hat as developer community manager is regularly publish posts about new and cool product features. Since our Early Access Program will go live soon, I can also write about planned features so that GigaSpaces users can experience them by downloading a beta version and trying them out.
The main goal here is to make the community aware of the new product features and get feedback from developers even before the version is released.
The first new feature I wanted to introduce is the new OpenSpaces Scripting Support.
You're probably wondering what scripting has to do with distributed, ultra-scalable systems.
After all, when one hears the words Groovy, JavaScript or Ruby the almost immediate association is web applications, web browsers and HTML.
With the emergence of the likes of Groovy and JRuby, scripting is becoming more and more mainstream and used for building many types of applications. In addition, the realization that domain specific languages can be very elegant, powerful and useful in many cases also contributes to this important mind shift.

So how can a script be useful in a distributed space based application?
For many of our customers, one of the most appealing features of GigaSpaces is the ability to perform calculations and business logic on the space nodes.
This can be useful for a number of cases, such as performing aggregations on space data, validating data as it is written to space, enriching this data, etc.
Before version 6.0 was released, the only way to do that was to use Space filters and the custom query pattern. While very powerful, this was quite complex and cumbersome to use.
In 6.0, we introduced Space Based Remoting as part of the OpenSpaces framework.
This is a very powerful abstraction, enabling your application to transparently enjoy all the goodies the space can give, such as high availability, load balancing, sync/async execution and parallel processing via a Map/Reduce style API.
However, as with any remoting implementation, you have to physically deploy the remoting endpoint on the server side (in our case you define it within your processing unit).
For many applications, this wouldn't be a limitation, but for some there is a need to control what gets executed in the remote node in a more dynamic fashion - or better yet, let the client application decide what should be done on the server (space) side. This can be extremely useful for applications that need to dynamically define the execution logic, such as algorithmic trading applications that let the trader define the trading algorithm.
And this is where OpenSpaces Scripting support fits in.
The idea is that instead of deploying the endpoint on the server, every space has a built in generic script executor. The space client can then submit the script to be executed on any of the spaces, or even on all of them simultaneously using Map/Reduce style API. So clients can actually change the execution logic dynamically.
You can think of it as a sophisticated form of a database driver - only instead of submitting SQL statements and being limited to the relational model, you can now submit an actual program to your grid, which can do most anything Java code can do!!
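To make the "submit to all spaces and combine the results" idea concrete, here is a minimal plain-Java sketch. It does not use the OpenSpaces API: the Partition class and the broadcastAndSum method are hypothetical stand-ins for space partitions and the Map/Reduce style call, and the "script" is just a Java function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class BroadcastSketch {
    // Hypothetical stand-in for one space partition holding some data.
    static class Partition {
        final List<Integer> values;
        Partition(List<Integer> values) { this.values = values; }
    }

    // "Map": run the same client-supplied logic on every partition in parallel.
    // "Reduce": sum the partial results on the client side.
    static int broadcastAndSum(List<Partition> partitions,
                               Function<Partition, Integer> script) {
        List<CompletableFuture<Integer>> partials = new ArrayList<>();
        for (Partition p : partitions) {
            partials.add(CompletableFuture.supplyAsync(() -> script.apply(p)));
        }
        int total = 0;
        for (CompletableFuture<Integer> partial : partials) {
            total += partial.join(); // wait for each partition's result
        }
        return total;
    }

    public static void main(String[] args) {
        List<Partition> grid = List.of(
                new Partition(List.of(1, 2, 3)),
                new Partition(List.of(10, 20)),
                new Partition(List.of(100)));
        // The client decides at runtime what each partition computes.
        int total = broadcastAndSum(grid,
                p -> p.values.stream().mapToInt(Integer::intValue).sum());
        System.out.println(total); // prints 136
    }
}
```

Swapping the lambda changes what runs on every partition without redeploying anything, which is the point of the dynamic approach.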
The client can also control whether the script will be cached or not, and whether it will execute synchronously or asynchronously.
When caching is enabled, the scripts are cached in their compiled form on the space side to support faster execution.
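The caching behavior can be pictured with a small plain-Java sketch. This is an illustration only, not the OpenSpaces implementation: ScriptCacheSketch, its compile step and the "${name}" template syntax are all assumptions made up for the example. The idea is simply that a cached script is compiled once per script name and reused afterwards.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class ScriptCacheSketch {
    // Counts how many times the (supposedly expensive) compile step ran.
    final AtomicInteger compileCount = new AtomicInteger();

    // Compiled scripts, keyed by script name.
    private final Map<String, Function<String, String>> compiled =
            new ConcurrentHashMap<>();

    // Hypothetical stand-in for compilation: turns a template into a callable.
    private Function<String, String> compile(String source) {
        compileCount.incrementAndGet();
        return name -> source.replace("${name}", name);
    }

    // With cache == true the compiled form is looked up by name and reused;
    // with cache == false the source is compiled on every call.
    String execute(String scriptName, String source, String param, boolean cache) {
        Function<String, String> script = cache
                ? compiled.computeIfAbsent(scriptName, key -> compile(source))
                : compile(source);
        return script.apply(param);
    }

    public static void main(String[] args) {
        ScriptCacheSketch executor = new ScriptCacheSketch();
        System.out.println(executor.execute("greet", "Hello ${name}", "Uri", true));   // Hello Uri
        System.out.println(executor.execute("greet", "Hello ${name}", "Cohen", true)); // Hello Cohen
        System.out.println(executor.compileCount.get()); // 1
    }
}
```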

And now for some code samples
Setting up a scripting client is very simple.
On the space side, if you're using our EDG edition and starting a data grid, it's already enabled automatically. If you're deploying your own processing unit, you need to include the following in your pu.xml file:
<os-core:space id="space" url="/./mySpace">
    <os-core:filter-provider ref="serviceExporter"/>
</os-core:space>

<!-- A GigaSpace instance (that can be used within scripts) -->
<os-core:giga-space id="gigaSpace" space="space"/>

<!-- The scripting executor remoting support -->
<bean id="scriptingExecutorImpl"
      class="org.openspaces.remoting.scripting.DefaultScriptingExecutor"/>

<os-remoting:service-exporter id="serviceExporter">
    <os-remoting:service ref="scriptingExecutorImpl"/>
</os-remoting:service-exporter>

<os-events:polling-container id="remotingContainer" giga-space="gigaSpace">
    <os-events:listener ref="serviceExporter"/>
</os-events:polling-container>
That's it, now your processing unit is all set for script execution.
As for the client side, things are even simpler. In this post, I will only show how to do it from a Spring based client application. However, this can also be done from your code using our configurers (I will cover it in a future post).
One of the nicest things here is the approach we took to configuration via annotations. It follows the spirit of what the guys at SpringSource did with Spring 2.5, enabling you to configure dependency injection via annotations. Note that the client class implements the Spring Framework's InitializingBean interface, which provides a convenient callback at application startup:
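public class ScriptRunner implements InitializingBean {

    @AsyncScriptingExecutor
    private ScriptingExecutor asyncScriptingExecutor;

    @SyncScriptingExecutor
    private ScriptingExecutor syncScriptingExecutor;

    public void afterPropertiesSet() throws Exception {
        // The script names and the classpath location are illustrative placeholders
        asyncScriptingExecutor.execute(new ResourceLazyLoadingScript()
                .name("myScript").type("groovy")
                .script("classpath:myScript.groovy")
                .parameter("name", "Uri"));
        syncScriptingExecutor.execute(new StaticScript().cache(true)
                .name("myStaticScript").type("groovy")
                .script("println name")
                .parameter("name", "Cohen"));
    }
}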
As you can see, all the client code has to define is a field of type ScriptingExecutor which is annotated either with @AsyncScriptingExecutor or @SyncScriptingExecutor (for asynchronous or synchronous script execution).
The client Spring beans XML file would then look like this:
<os-core:space id="space" url="jini://*/*/mySpace" lookup-groups="uri"/>
<os-core:giga-space id="gigaSpace" space="space"/>
<os-remoting:annotation-support/>

<bean id="scriptRunner" class="ScriptRunner"/>

All we had to do for the annotations to get picked up is introduce the <os-remoting:annotation-support/> tag, which directs the OpenSpaces infrastructure to process all the beans in the application context that contain the appropriate annotations.
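For readers curious how this kind of annotation processing can work in general, here is a rough plain-Java sketch using reflection. It is not how OpenSpaces or Spring implement it: the InjectExecutor annotation, the SketchExecutor interface and the Client class are hypothetical names invented for this illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

class AnnotationInjectionSketch {
    // Hypothetical marker annotation, standing in for the sync/async annotations.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface InjectExecutor {
        boolean async();
    }

    // Hypothetical executor abstraction; it only reports how it would run a script.
    interface SketchExecutor {
        String mode();
    }

    // A client bean that declares annotated fields and nothing else.
    static class Client {
        @InjectExecutor(async = true)
        private SketchExecutor asyncExecutor;

        @InjectExecutor(async = false)
        private SketchExecutor syncExecutor;

        String modes() {
            return asyncExecutor.mode() + "," + syncExecutor.mode();
        }
    }

    // Scans the bean's fields and injects an executor into each annotated one.
    static void process(Object bean) {
        for (Field field : bean.getClass().getDeclaredFields()) {
            InjectExecutor marker = field.getAnnotation(InjectExecutor.class);
            if (marker == null) {
                continue; // not an injection point
            }
            SketchExecutor executor = marker.async() ? () -> "async" : () -> "sync";
            field.setAccessible(true);
            try {
                field.set(bean, executor);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        Client client = new Client();
        process(client);
        System.out.println(client.modes()); // prints async,sync
    }
}
```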
And finally, here's the Groovy script file referenced above:
for (i in 0..9999) {
    gigaSpace.write(new Message("blabla"))
}
'Done writing 10000 objects to the space'

A few more interesting things to note here:
  • The execution unit is an object which implements the org.openspaces.remoting.scripting.Script interface. Here we show two such implementations of it - one that lazily loads a script file from a location in the classpath, and another one which simply takes in the script itself as an argument.
  • We define the script type, in this case Groovy. If the space does not contain the groovy libraries in its classpath this will fail. As mentioned before, we support JavaScript, Groovy and JRuby out of the box. If you favor another scripting language you can easily add support for it as well.
  • Note the method chaining approach we took to Script configuration (as we also did with other configuration elements of GigaSpaces). We believe it's a very elegant way of configuring your application (and probably the closest one can get to a domain specific language in Java :)).
  • The script itself references a gigaSpace variable. Every script running inside the space has a number of contextual variables that are available to it automatically, such as the space itself and the Spring ApplicationContext in which it's defined.
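The method chaining style mentioned in the list above boils down to setters that return the object itself. A minimal sketch, with a hypothetical ScriptSpec class that only mimics the shape of such a configuration object and is not part of OpenSpaces:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ScriptSpec {
    private String name;
    private String type;
    private String source;
    private boolean cache;
    private final Map<String, Object> parameters = new LinkedHashMap<>();

    // Each setter returns this, so calls can be chained into one expression.
    ScriptSpec name(String name) { this.name = name; return this; }
    ScriptSpec type(String type) { this.type = type; return this; }
    ScriptSpec script(String source) { this.source = source; return this; }
    ScriptSpec cache(boolean cache) { this.cache = cache; return this; }
    ScriptSpec parameter(String key, Object value) {
        parameters.put(key, value);
        return this;
    }

    String describe() {
        return type + ":" + name + " cache=" + cache + " params=" + parameters;
    }

    public static void main(String[] args) {
        ScriptSpec spec = new ScriptSpec()
                .name("hello")
                .type("groovy")
                .script("println name")
                .cache(true)
                .parameter("name", "Cohen");
        System.out.println(spec.describe());
        // prints groovy:hello cache=true params={name=Cohen}
    }
}
```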
OpenSpaces Dynamic Scripting Support is a very powerful tool that adds a great deal of flexibility to grid applications. Let's recap the benefits of this new feature in short:
  1. It provides you with a means to invoke dynamic scripts on any or all grid members, much the same way you would invoke a SQL statement or even a program on a database.
  2. It gives the combined power of Java alongside new and powerful scripting languages such as Groovy and JRuby.
  3. It enables you to perform distributed aggregations using a Map/Reduce style API.
I hope I managed to give you a short glimpse of what this feature can do and how powerful it is.
In one of my next posts I'll go into this a bit deeper and discuss some more interesting capabilities of this feature, such as distributed aggregations and interceptors, as well as other aspects that come up in this context, such as performance, class loading and security.
