Wednesday, October 27, 2010

SpringOne 2010: Creating the Next Generation of Online Transaction Authorization

I spent the second-last week of October at SpringOne 2GX 2010 in Chicago, and I thought some of you might get something useful out of my notes. These aren't complete notes on every slide, just things I jotted down that seemed interesting enough to remember or look into further.

Creating the Next Generation of Online Transaction Authorization
presented by Maudrit Martinez, Anatoly Polinsky and Vipul Savjani

These three guys from Accenture presented patterns of architecture with Spring Batch and Spring Integration that they have used in production systems for both online and batch processing of financial transactions.

Their diagram showed two technologies – Pacemaker and Corosync – that I hadn’t heard of before. Apparently Corosync is the clustering technology recommended by the Rabbit guys. They also used a product called Hazelcast for a distributed cache and GridGain for a compute grid.

They combined Spring Batch with GridGain in order to partition the processing of a batch of transactions across multiple nodes. The presenter was fairly impressed with GridGain’s over-the-wire classloading. (To be fair, this idea has been around at least since RMI was released in '97.)
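
The basic idea of partitioning a batch across nodes can be sketched in a few lines of plain Scala. This is only an illustration, not the Spring Batch or GridGain API; `Partition` and `partition` are made-up names:

```scala
// Hypothetical sketch: split a batch of transaction IDs into contiguous
// slices, one per grid node, in the spirit of Spring Batch partitioning.
case class Partition(node: Int, ids: Seq[Long])

def partition(ids: Seq[Long], gridSize: Int): Seq[Partition] = {
  // Each node gets roughly ids.size / gridSize items, contiguously.
  val chunk = math.ceil(ids.size.toDouble / gridSize).toInt
  ids.grouped(chunk).zipWithIndex.map { case (slice, i) => Partition(i, slice) }.toSeq
}
```

In the real setup, each slice would become a step execution shipped to a GridGain node rather than a local value.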

Rather than passing the transaction data around their whole integration network, they instead placed the data in the distributed cache and passed around only the keys to the items in the cache.
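
That approach can be sketched as a minimal "claim check": stash the payload, route only the ticket. Here a `TrieMap` stands in for the distributed cache (Hazelcast in the talk), and all the names are invented for illustration:

```scala
import scala.collection.concurrent.TrieMap
import java.util.UUID

object ClaimCheck {
  // Stand-in for the distributed cache.
  private val cache = TrieMap.empty[String, Array[Byte]]

  // "Check in" the payload, returning a small claim ticket to route instead.
  def checkIn(payload: Array[Byte]): String = {
    val key = UUID.randomUUID().toString
    cache.put(key, payload)
    key
  }

  // Downstream consumers redeem the ticket for the original payload.
  def checkOut(key: String): Option[Array[Byte]] = cache.remove(key)
}
```

Only the small `key` string travels through the messaging channels; the bulky transaction data stays in the cache until a consumer redeems it.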

They made use of a library called ScalaR, which is a DSL for using GridGain in Scala. They used Scala to process the transactions chiefly because of the availability of the ScalaR DSL, and also because Scala's Actors simplify concurrent programming.

They mentioned that parts of GridGain (though perhaps only the DSL) have reportedly been rewritten in Scala, and that the GridGain team chose Scala over Groovy because its compiled, static typing provides better performance than an interpreted language.

They showed where their code was calling Hazelcast and I noted that there wasn’t any attempt to decouple the cache implementation – a Hazelcast instance was retrieved by calling a static method. Perhaps it was just some demo code they'd thrown together.
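
For what it's worth, decoupling here needn't be heavyweight: hide the cache behind a small trait and inject the implementation instead of calling a static factory. `CacheStore` and `MapCacheStore` are names I've made up for the sketch:

```scala
// A minimal cache abstraction; a Hazelcast-backed implementation
// would wrap its map the same way the in-memory one below does.
trait CacheStore {
  def put(key: String, value: AnyRef): Unit
  def get(key: String): Option[AnyRef]
}

// In-memory stand-in, useful for tests or demos.
class MapCacheStore extends CacheStore {
  private val backing = scala.collection.concurrent.TrieMap.empty[String, AnyRef]
  def put(key: String, value: AnyRef): Unit = backing.put(key, value)
  def get(key: String): Option[AnyRef] = backing.get(key)
}
```

Code written against `CacheStore` never sees which cache product sits behind it.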

I noticed a cool way of converting a Scala list to a Java list that I hadn’t seen before:
new ArrayList ++ myScalaList
From what I can tell, this ++ operator isn't standard (at least you can't use it in the Scala 2.8 REPL), but it was an interesting, succinct syntax that caught my eye.
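
The supported route for this conversion is the converter machinery in the standard library. A sketch, using the Scala 2.13+ import (earlier versions used `scala.collection.JavaConverters`):

```scala
import scala.jdk.CollectionConverters._

val myScalaList = List(1, 2, 3)

// .asJava wraps the Scala list in a java.util.List view without copying;
// copy into an ArrayList if a real java.util.ArrayList is required.
val javaView: java.util.List[Int] = myScalaList.asJava
val javaCopy = new java.util.ArrayList[Int](javaView)
```

The wrapper view is cheap but reflects later changes to a mutable source; the `ArrayList` copy is independent.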

They mentioned the STOMP protocol, which is a text-based protocol for message-broker interoperability supported by RabbitMQ, among others.
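
Being text-based, a STOMP frame is easy to eyeball: a command line, some headers, a blank line, the body, and a terminating NUL byte (shown here as `^@`). A hypothetical SEND frame, with a made-up destination, might look like:

```
SEND
destination:/queue/authorizations

{"txnId":"1234","amount":"10.00"}
^@
```

That simplicity is much of the appeal for cross-language broker interoperability.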

The Spring Integration config they used to send a message to Rabbit didn’t include any implementation code, just an interface they had defined, which Spring then proxied to place the payload onto the channel.
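
This sounds like Spring Integration's gateway support, which is typically declared with something along these lines (interface and channel names invented for the sketch):

```xml
<!-- Hypothetical sketch: Spring proxies the interface, and a call to its
     method becomes a message whose payload is the method argument. -->
<int:gateway id="authorizationGateway"
             service-interface="com.example.AuthorizationGateway"
             default-request-channel="authorizationChannel"/>
```

The calling code depends only on the plain interface, with no messaging API in sight.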

They mentioned a couple of times that the advantage of RabbitMQ over JMS is that Rabbit's AMQP is an on-the-wire protocol whereas JMS is a Java API. They didn’t elaborate on why this is an advantage, but I suppose a protocol easily allows other programming languages to integrate with the messaging, whereas a Java interface doesn’t offer any standard way to do that.

Their implementation for processing transactions used a chain of three actors: the first coordinated the authorisation of the transaction (I think – it may have been coordinating the authorisation of multiple transactions), the second performed the authorisation, which chiefly meant looking up a collection of rules, and the third executed those rules.
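
As I understood it, the pipeline breaks down roughly like this. Ordinary function calls stand in for the actor message-sends here, and `Transaction`, `Rule` and the rule lookup are all invented for the sketch:

```scala
case class Transaction(id: Long, amount: BigDecimal)
case class Rule(name: String, check: Transaction => Boolean)

// Actor 3's role: execute the rules against a transaction.
def executeRules(txn: Transaction, rules: Seq[Rule]): Boolean =
  rules.forall(_.check(txn))

// Actor 2's role: look up the applicable rules, then hand off for execution.
def authorize(txn: Transaction, ruleBook: Map[String, Seq[Rule]]): Boolean =
  executeRules(txn, ruleBook.getOrElse("default", Nil))

// Actor 1's role: coordinate authorisation across a batch of transactions.
def coordinate(txns: Seq[Transaction], ruleBook: Map[String, Seq[Rule]]): Map[Long, Boolean] =
  txns.map(t => t.id -> authorize(t, ruleBook)).toMap
```

In the actual system each stage would be an actor mailbox rather than a direct call, which is what buys the concurrency.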

While searching for an online profile for Anatoly Polinsky, I found this great presentation on Spring Batch that he apparently authored. It also looks like he has released some of the code from the presentation in a project called 'gridy-batch' on github.

2 comments:

  1. Graham,

    Thank you for your feedback. Here are clarifications on the demo:

    The reason I chose Scala to process offline transactions in the demo was not speed, but the fact that I was already using ScalaR, which is a GridGain DSL written in Scala. I actually like Groovy a lot, and when I mentioned performance, I was referring to the reason why GridGain did not choose Groovy for their DSL: they actually saw a big performance improvement with Scala.

    In order to pass transactions around the system, I used one of the many good Enterprise Integration Patterns called "claim check": http://www.enterpriseintegrationpatterns.com/StoreInLibrary.html, since it makes a lot of sense to only care about the reference when going from one channel to another. Good observation :)

    The reason I did not decouple the Hazelcast APIs is that they are very unintrusive: they return you Java data structures (java.util.Collection, java.util.concurrent, etc.). I am actually thinking of including an abstraction layer in a Spring Data project based on Hazelcast, so stay tuned :)

    Your assumption is correct: the advantage of AMQP over JMS is that it is a protocol rather than an API, so it does not impose an implementation on end users, plus it decouples the consuming and producing sides with a layer of Bindings between Exchanges and Queues.

    And yes, I still give kudos to the GridGain team for the network classloading – not so much for the concept, but for the simplicity and complete transparency that you get with GridGain.

    Again, thanks for the feedback,
    /Anatoly

  2. Thanks for the extra comments, Anatoly. I've made some updates to incorporate your clarifications.

    I see where you're coming from now when you say that GridGain's approach to network classloading is really easy. It's been a long time since I worked with RMI, but I do recall having to jump through quite a few hoops (mostly security-related) in order to get it working. It's like they assumed that the normal use case would be to create a server for executing completely untrusted code, but in reality the majority of applications are probably doing exactly the opposite.

    Cheers,
    Graham.
