Saturday, November 6, 2010

SpringOne 2010: Concurrent and Distributed Applications with Spring

I spent the second-last week of October at SpringOne 2GX 2010 in Chicago and I thought some of you might get something useful out of my notes. These aren’t my complete reinterpretations of every slide, but just things I jotted down that I thought were interesting enough to remember or look into further.

Concurrent and Distributed Applications with Spring
presented by Dave Syer

My favourite quote from this talk, and possibly from the whole conference, is one which I want to take back to my workplace and put into practice with a vengeance:

Using single-threaded algorithms on 32-core machines is a waste of money

Dave also presented a really simple but useful definition of thread-safety:

Thread safety = properly managing concurrent access to shared, mutable state

Applying this definition, you can see there are three ways to tackle thread-safety: you can eliminate mutability, you can eliminate sharing, or you can eliminate concurrency. Eliminating concurrency is the core aim of mutexes and locks, e.g. synchronized blocks. Eliminating mutability is one of the chief design idioms of functional programming.
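To make the three options concrete, here is a minimal Java sketch of each one. None of it comes from Dave's slides: the class names (Point, Summer, Counter) are made up for illustration.

```java
// Hypothetical illustrations of the three strategies; not code from the talk.

// 1. Eliminate mutability: every field is final, so instances can be shared freely.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // A "modification" returns a new object instead of changing this one.
    Point translate(int dx, int dy) {
        return new Point(x + dx, y + dy);
    }
}

// 2. Eliminate sharing: the running total is a local variable,
//    so it is only ever visible to the calling thread.
final class Summer {
    static long sum(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }
}

// 3. Eliminate concurrency: the intrinsic lock lets only one thread
//    at a time read or update the shared count.
final class Counter {
    private long count;

    synchronized long increment() { return ++count; }
    synchronized long get() { return count; }
}
```

The trade-off is roughly: the first two avoid locking altogether, while the third keeps shared mutable state but serialises access to it.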

On the topic of eliminating shared resources, Dave pointed to the frequent use of ThreadLocal within the Spring Framework to associate unshared resources with individual threads. He noted the potential for memory leaks with ThreadLocal: because most servers use long-running (often pooled) threads, you have to make sure you clean each ThreadLocal up when you’re finished, otherwise your data will hang around for as long as the thread does. (Sounds like going back to pairing new & delete!)
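The clean-up discipline usually amounts to a try/finally around the work. The sketch below is my own illustration, not Spring's code: RequestContext, runAs and currentUser are hypothetical names, and only ThreadLocal itself is the real JDK class.

```java
import java.util.function.Supplier;

// Hypothetical holder class; it only illustrates the set/remove pairing.
final class RequestContext {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Returns the value bound to the calling thread, or null if none is bound.
    static String currentUser() {
        return CURRENT_USER.get();
    }

    // Binds the user for the duration of the work, then always clears it.
    static <T> T runAs(String user, Supplier<T> work) {
        CURRENT_USER.set(user);
        try {
            return work.get();
        } finally {
            // Without this, a pooled thread would keep the value alive
            // and could leak it into the next task it runs.
            CURRENT_USER.remove();
        }
    }
}
```

A call such as RequestContext.runAs("alice", () -> RequestContext.currentUser()) sees the bound value inside the lambda, and the thread is left clean afterwards even if the work throws.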

Dave talked about a method on ExecutorService that I've never used before called invokeAny(). It executes the tasks in a given list concurrently (assuming a multi-threaded ExecutorService implementation) and returns the result of the first one to complete successfully, i.e. without throwing an exception. The remaining tasks are cancelled. I imagine you might use this where you have two or three different algorithms, each of which can outperform the others for certain structures of data, but where the most efficient algorithm for a given input can't be (easily) determined before execution. So, on a many-core machine, you have the option of just running all three against the same data, taking the result from the first algorithm to complete and killing the others.
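Here is a small sketch of that racing idea. The "algorithms" are stand-ins that just sleep for different lengths of time, and FirstResultWins, algorithm and race are names I made up; the only real API being shown is ExecutorService.invokeAny().

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class FirstResultWins {

    // Stand-in for a real algorithm: it sleeps, then returns its own name.
    // Thread.sleep throws InterruptedException when the task is cancelled.
    static Callable<String> algorithm(String name, long millis) {
        return () -> {
            Thread.sleep(millis);
            return name;
        };
    }

    // Runs all three candidates and returns whichever succeeds first.
    static String race() throws InterruptedException, ExecutionException {
        // One thread per task, so all candidates really do run concurrently.
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Callable<String>> tasks = Arrays.asList(
                    algorithm("slow", 2000),
                    algorithm("fast", 50),
                    algorithm("medium", 1000));
            // Blocks until one task completes without throwing;
            // tasks that have not completed are cancelled on return.
            return pool.invokeAny(tasks);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With these sleep times I would expect race() to return "fast". Note that cancellation works by interrupting the worker threads, so a real algorithm only stops early if it checks for interruption or blocks in an interruptible call.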

Dave briefly discussed an emerging (I think?) pattern for concurrency called Staged Event-Driven Architecture or SEDA.

He mentioned that Spring Integration 2.0 (RC1 released Oct 29) includes support for transactional, persistent message queues.

He highlighted the difference between distributed Applications (running the same binary on multiple nodes) and distributed Systems (running related, communicating applications across multiple nodes). He said that it was wise to prefer more loosely coupled messaging architectures for distributed Systems because of the likelihood of unsynchronised release cycles.

