Tuesday, December 09, 2008

Discovering Processor in Coherence - Part (I)

Besides strong support for event-driven architecture, Oracle Coherence has a concept of processors running inside the grid. One construct that makes Coherence programming a little different is its ability to invoke lightweight processes inside the grid instead of bringing the data over the wire and then processing it. To do so, the application passes an agent to the cache, and the agent is executed against the data. The agent passed in does not necessarily have to be fully implemented, though.
Problem statement: Be able to change the implementation of a processor on the fly, without having to redeploy it.
The following approach is inspired by Coherence's Command Pattern, but let's start from the beginning. We need a common interface that all commands implement:

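import java.io.Serializable;

// Contract shared by every command that can be looked up and executed.
public interface Command extends Serializable {
    public Object process(Object object);
}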

These commands will be looked up at run time rather than passed in as instances. Let's have a few implementations:
public class MyLogger implements Command {...}
public class MyProcessor implements Command {...}
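
The bodies of MyLogger and MyProcessor are elided above. As a minimal sketch of what an implementation could look like, the hypothetical UpperCaseCommand below (not part of the original code) implements the same Command contract and is pushed through plain Java serialization, which is roughly what happens to a command when it is stored in and read back from a distributed cache that uses default Java serialization. The CommandRoundTrip class and its roundTrip helper are likewise assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Locale;

// Same contract as the Command interface above, repeated so the sketch compiles on its own.
interface Command extends Serializable {
    Object process(Object object);
}

// Hypothetical command: upper-cases the string form of whatever it is given.
class UpperCaseCommand implements Command {
    private static final long serialVersionUID = 1L;

    @Override
    public Object process(Object object) {
        return String.valueOf(object).toUpperCase(Locale.ROOT);
    }
}

class CommandRoundTrip {
    // Serializes a command to bytes and reads it back, as a stand-in for a cache put/get.
    static Command roundTrip(Command command) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(command);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Command) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Command could not be serialized", e);
        }
    }

    public static void main(String[] args) {
        Command copy = roundTrip(new UpperCaseCommand());
        System.out.println(copy.process("hello grid")); // prints HELLO GRID
    }
}
```

The point of the round trip is that a command has to survive serialization intact, since the copy that runs is not the instance that was originally put into the cache.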

We need a place to put these commands: a distributed cache named Command, configured as follows.
<!DOCTYPE cache-config SYSTEM "cache-config.dtd">
<cache-config>
  <caching-scheme-mapping>
    <cache-mapping>
      <cache-name>Command</cache-name>
      <scheme-name>distributed-scheme</scheme-name>
    </cache-mapping>
  </caching-scheme-mapping>
  <caching-schemes>
    <distributed-scheme>
      <scheme-name>distributed-scheme</scheme-name>
      <service-name>DistributedCache</service-name>
      <backing-map-scheme>
        <local-scheme></local-scheme>
      </backing-map-scheme>
      <autostart>true</autostart>
    </distributed-scheme>
  </caching-schemes>
</cache-config>

Now, how do we execute these commands so that the execution logic can change while the executor stays the same? I like map listeners because they can be attached to a cache in multiple ways and also provide a framework for automatic processing.
For the sake of simplicity, let's implement the following logic: for each entry put into a cache, run a process on that entry, where the process itself can be swapped. The listener will look something like this:
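import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.util.MapEvent;
import com.tangosol.util.MapListener;

public class DiscoveryListener implements MapListener {
    // Name of the cache this listener is attached to; also used as the key into the Command cache.
    private String cacheName;
    private static NamedCache dCache;

    static {
        dCache = CacheFactory.getCache("Command");
    }

    public DiscoveryListener(String cacheName) {
        this.cacheName = cacheName;
    }

    public void entryInserted(MapEvent mapEvent) {
        // Look the command up on every event so that a replaced command takes effect right away.
        Command command = (Command) dCache.get(cacheName);
        Object value = mapEvent.getNewValue();
        if (command != null) {
            command.process(value);
        }
    }
    ....
}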

And, of course, load the commands into the Command cache:

NamedCache cCache = CacheFactory.getCache ("Command");
cCache.put ("cache1", new MyLogger ());
cCache.put ("cache2", new MyProcessor ());

Depending on which cache the listener is attached to, it runs the corresponding command. This gives you the ability to swap MyLogger () for SomeProcessor () for a given cache on the fly, while the rest of the application remains agnostic.
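
To make the swap concrete without a running cluster, here is a rough sketch in which a plain ConcurrentHashMap stands in for the Command cache and a hypothetical Dispatcher class plays the role of DiscoveryListener.entryInserted. Dispatcher, onInsert, HotSwapDemo, and the two lambda commands are assumptions for illustration, not Coherence APIs. Unlike the listener above, onInsert returns the command's result so the effect of the swap is visible.

```java
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Same contract as the Command interface above, repeated so the sketch compiles on its own.
interface Command extends Serializable {
    Object process(Object object);
}

// Hypothetical stand-in for DiscoveryListener: the command is looked up on every event.
class Dispatcher {
    private final Map<String, Command> commands; // plays the role of the "Command" cache
    private final String cacheName;

    Dispatcher(Map<String, Command> commands, String cacheName) {
        this.commands = commands;
        this.cacheName = cacheName;
    }

    // Mirrors entryInserted: fetch the current command and apply it to the new value.
    // Returns null when no command is registered for this cache name.
    Object onInsert(Object newValue) {
        Command command = commands.get(cacheName);
        return command == null ? null : command.process(newValue);
    }
}

class HotSwapDemo {
    public static void main(String[] args) {
        Map<String, Command> commands = new ConcurrentHashMap<>();
        Dispatcher dispatcher = new Dispatcher(commands, "cache1");

        // First behaviour registered for cache1.
        commands.put("cache1", value -> "logged: " + value);
        System.out.println(dispatcher.onInsert("a")); // prints logged: a

        // Replace the command; the dispatcher itself is untouched.
        commands.put("cache1", value -> "processed: " + value);
        System.out.println(dispatcher.onInsert("b")); // prints processed: b
    }
}
```

Because the lookup happens per event rather than once at construction time, replacing the map entry is all it takes to change behaviour. The price is one extra lookup for every insert.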
What's next?
The problem statement has still not been fully addressed.
  1. All the processor classes still have to be deployed and available at deployment time. This has to change.
  2. Why use the Command interface? Could it be something else?
  3. Who puts the commands into the Command cache? Can we have a command feeder, an external controller that dictates what gets executed?

2 comments:

Robert Varga said...

Hi Ashish,

are you trying to introduce a class-loader getting its class files from a cache?

Wouldn't it be too slow, as it is problematic if not impossible to programmatically traverse the entire class hierarchy referenced from the command class?

Would you put the entire jar file into the cache instead? That again may be slow...

Otherwise it is doable, but isn't it easier to just implement the processor in Groovy and compile it on the fly? Of course you would still have the problem of the referenced classes and also it depends on Java 6 JDK being in place.


Best regards,

Robert

Ashish said...

Hi Rob,
To use Groovy I have to learn it first ;) Yes, you are right: what I am trying to do is provide a facility in which not only the data but also the processing units can change, or are at least built in such a way that the rules of execution can change. In part (II) of this blog I will try to build something with a custom named cache that looks up and downloads the processor before executing it (if it is not already present, of course). And I agree that speed is an issue in all of the approaches I have thought of so far.