Wednesday, January 18, 2012

Measuring a Toddler's weight and an Object's size in the heap

...And the common problem is that both need a warm-up phase. Toddlers are notorious for not standing still, and if you have one you know that the first reading is almost always wrong. But for these naughty kids there is a way to measure: hold them in your arms, weigh yourself with the kid, then weigh yourself alone and subtract. The same model can be applied when Objects need to be measured for their size in a heap. Why? Because JVMs are flaky too, and from sheer observation the first reading is almost always wrong. So, not being part of a JVM team, how am I supposed to measure an Object's size fairly accurately? By reducing the flakiness. By measuring it multiple times. By minimising the effect of chaos.
Problem: Provide an API that measures the size of an Object passed to it.

Solution: Clean up the JVM and measure the used heap. Create multiple instances of the Object and hold strong references to them. Clean up the JVM again and measure the used heap once more. Take the difference and divide it by the number of Objects created.

If you want to measure the size of Objects in a Coherence cache, look into Coherence's MemoryCalculator APIs. The solution below uses Coherence's PoF framework for serializing and deserializing non-Serializable Objects.

1. We need multiple instances of the Object passed in, held through strong references so that the GC cannot collect them. How do we make those copies? There are multiple options -

  • If the Object implements Serializable or one of its subtypes (such as Externalizable) - Serialize the Object into a byte array, and reconstruct a new instance from that byte array each time one is needed.

  • If the Object is Cloneable - Clone the Object to create each new instance.

  • If the Object is neither Serializable nor Cloneable - Oracle Coherence provides a mechanism, called Portable Object Format, to serialize a non-serializable Object. PoF, as it is commonly called, allows programmers to write external serializers for an Object that does not implement the Serializable interface. These serializers, and the Objects they serialize and deserialize, are defined in a PoF configuration file that is loaded through the Coherence system property tangosol.pof.config. Once this is done the Object is ready to be serialized.
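For the first option, a copy through plain Java serialization could look like the sketch below. It assumes the whole object graph is Serializable; the class name SerialCopy and the helpers toBytes and fromBytes are made up for illustration and are not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

public class SerialCopy {
    /** Serialize once; the bytes can be reused to build any number of instances. */
    public static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    /** Each call rebuilds a brand-new instance from the same bytes. */
    public static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> original = new ArrayList<>();
        original.add("toddler");
        byte[] bytes = toBytes(original);

        Object copy = fromBytes(bytes);
        System.out.println("equal content: " + copy.equals(original));
        System.out.println("distinct instance: " + (copy != original));
    }
}
```

Serializing once and deserializing many times keeps the cost of making the 1000 copies down, and every copy is a separate instance that can be strongly referenced.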

For the third option Coherence provides a utility in ExternalizableHelper to convert the Object into a Binary:

ConfigurablePofContext pofContext = new ConfigurablePofContext ("my-pof-config.xml");
then:
Binary binObj = ExternalizableHelper.toBinary (objToBeMeasured, pofContext);

This binObj can then be used to create new instances and hold multiple strong references:
Object[] objects = new Object [1000];
Runtime runTime = Runtime.getRuntime();
long beforeSize = 0L;

for (int i = -1; i < 1000; ++i) {
    Object o = ExternalizableHelper.fromBinary(binObj, pofContext);
    if (i >= 0) {
        objects[i] = o;
    } else {
        // Reject the first object, then take the baseline reading
        o = null;
        runGC(); // stands for the "Execute GC" routine; runGC is a placeholder name
        beforeSize = runTime.totalMemory() - runTime.freeMemory();
    }
}

runGC(); // Execute GC again
long afterSize = runTime.totalMemory() - runTime.freeMemory();

Use the difference: (afterSize - beforeSize) / 1000

I also found a very good implementation of "Execute Garbage Collection" in an article on javaworld.com, which I am reproducing here:

for (int i = 0; i < 4; ++i) {
    long m1 = runTime.totalMemory() - runTime.freeMemory();
    long m2 = Long.MAX_VALUE;
    // Keep collecting until the used memory stops shrinking
    for (int j = 0; (m1 < m2) && (j < 500); ++j) {
        runTime.runFinalization();
        runTime.gc();
        Thread.yield();
        m2 = m1;
        m1 = runTime.totalMemory() - runTime.freeMemory();
    }
}
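The whole recipe can also be tried without Coherence. The sketch below is only an approximation under some assumptions: instances come from a Supplier rather than from a PoF Binary, the names SizeEstimator, estimate and runGC are invented here, and runFinalization() is left out because newer JDKs deprecate it. The number it prints varies by JVM, collector and heap settings.

```java
import java.lang.ref.Reference;
import java.util.function.Supplier;

public class SizeEstimator {
    private static final Runtime RUNTIME = Runtime.getRuntime();

    private static long usedMemory() {
        return RUNTIME.totalMemory() - RUNTIME.freeMemory();
    }

    /** Requests GC repeatedly until the used memory stops shrinking. */
    private static void runGC() {
        for (int i = 0; i < 4; ++i) {
            long m1 = usedMemory();
            long m2 = Long.MAX_VALUE;
            for (int j = 0; (m1 < m2) && (j < 500); ++j) {
                RUNTIME.gc();
                Thread.yield();
                m2 = m1;
                m1 = usedMemory();
            }
        }
    }

    /** Estimates the average heap footprint of the objects produced by the factory. */
    public static long estimate(Supplier<?> factory, int count) {
        Object[] holder = new Object[count];
        factory.get();                 // warm-up instance, discarded like the first reading
        runGC();
        long before = usedMemory();
        for (int i = 0; i < count; ++i) {
            holder[i] = factory.get(); // strong references keep the instances alive
        }
        runGC();
        long after = usedMemory();
        Reference.reachabilityFence(holder); // holder must stay reachable until measured
        return (after - before) / count;
    }

    public static void main(String[] args) {
        long size = estimate(() -> new byte[1024], 1000);
        System.out.println("approx. bytes per object: " + size);
    }
}
```

Allocating the holder array before the baseline reading keeps its own footprint out of the difference.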

Watching your baby's weight and your Object's weight is critical, and you know why. Enjoy!

Monday, January 16, 2012

One More Couplet

When I touched the sky, the clouds burst open,
and all the longings of the heart were washed away in the water.
Turning into a thousand drops, they were lost somewhere,
and the tears of my eyes, too, mingled into the same.

Wednesday, December 28, 2011

Anna's show a big flop in Mumbai - Think!

Anna called off his fast in Mumbai because of a poor response from the very people he thought he was fighting for. Stop. When Mahatma Gandhi returned from South Africa he did not fight just one war with the British; he fought three. The first he fought against himself, to get rid of pretentiousness, ego and bias, and this constant battle of self-cleansing continued till he died. The second was in fact against the Indian people themselves. This was the war to wake them up - not just the lawyers in Mumbai but also people in villages who did not even know what human dignity and self-rule meant. It was a war because the majority were afraid even to get rid of the British empire, fearing an even more bitter rule by some of their own brethren. Unless you awaken people and show them what they should be looking for, how could you even think of fighting the decisive third war against the British? And these three wars were chaotic, bleak at times and hurtful, but remained physically non-violent.
What happened in Mumbai is no surprise. A poor show for Anna is no reason to get disheartened. It only shows that people are not ready; they are not awakened yet. Instead of getting rid of a corrupt system they see it as an opportunity to join in and loot. It has become a diamond mine where people think queuing is unnecessary and see only a chance to get in and loot. And don't forget that the Mahatma's bugle did not sound from a podium in Mumbai. It sounded from a place called Champaran that nobody even knew existed, a place where people did not discuss whether a choice was good or bad, because there was no choice at all. Anna has to find his Champaran.

Tuesday, December 06, 2011

One More Couplet

This fog, and people shivering in blankets, asleep by the roadside,
people sipping tea at the crossroads, and those wandering rickshaw pullers,
red lights tearing past from afar, cutting through the glow,
that little girl begging, half-naked, leaning on her mother,
the story of those lanes and streets still comes back to me,
and my Lucknow, wrapped in this fog, still makes me weep.

*Picture - courtesy Lucknow page on facebook

Wednesday, November 09, 2011

Transaction, Semaphore and a Joke

A DBA walks into a NoSQL bar, hangs around for a few minutes, then turns back and leaves. He couldn't find a table. [Quoted]

When Data Grids came into existence they tried to solve a different set of problems, ones that a typical relational database was not geared to address. Some failed trying to imitate a database, some remained distributed cache managers and some just got lost in space. The ones that succeeded were the ones that remained focused on solving the performance, scalability, manageability and predictability of the application state in the middle tier - actually all of them. But as they continued to sink into enterprise application strategies, one question remained to be solved completely - Transactions. Oracle Coherence couldn't defer a solution for too long either, and in v3.6 it announced a new cache scheme to support true transactions, extending and then deprecating its TransactionMap. Before I show one way this new scheme could be used, let's talk about how some of these transactional problems were addressed earlier.

If an analogy has to be made between a database and Coherence, we can think of cache schemes as database schemas and caches as database tables.

Different solutions popped up for different levels of desired guarantees.

Problem: Update Person and its Address together

Solving it in domain model
Solving it in the scheme
Solving it in process
And, solving it correctly

Domain Model
Coherence EntryProcessors provide a unique guarantee - without the use of an explicit lock, only one process can modify an Entry at a given time, no matter how many backups of that Entry are maintained in the grid. And when a node fails, the in-flight executions move to the new owners of the partitions as the Entries are repartitioned. For the clients it is as if nothing has happened, other than some delays.

public class Person {
private Address address;
}
NamedCache nCache = CacheFactory.getCache("Person");
nCache.put ("person1", new Person());

nCache.invoke("person1", new PersonAddressUpdaterEntryProcessor());
The invoke() call above makes sure that the Person and his Address are updated together while the Person is locked. The key to this solution lies in the domain modelling: any direct access to the Address is hidden by encapsulating it inside the Person who owns it.
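The same idea, one atomic step per entry that updates the Person together with its embedded Address, can be tried out in a single JVM. The sketch below is only a loose analogy and not Coherence code: it uses ConcurrentHashMap.compute(), which runs its function atomically for a key, and the Person, Address and moveTo names are hypothetical stand-ins.

```java
import java.util.concurrent.ConcurrentHashMap;

public class AtomicPersonUpdate {
    /** Hypothetical immutable stand-in for the Address class. */
    public static final class Address {
        public final String city;
        public Address(String city) { this.city = city; }
    }

    /** Hypothetical immutable stand-in for the Person that owns an Address. */
    public static final class Person {
        public final String name;
        public final Address address;
        public Person(String name, Address address) {
            this.name = name;
            this.address = address;
        }
    }

    /** Replaces the Person's Address in one atomic step for the given key. */
    public static Person moveTo(ConcurrentHashMap<String, Person> cache, String key, String city) {
        // compute() runs the function atomically for this key; a missing key stays missing
        return cache.compute(key,
                (k, p) -> p == null ? null : new Person(p.name, new Address(city)));
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Person> cache = new ConcurrentHashMap<>();
        cache.put("person1", new Person("person one", new Address("Lucknow")));
        moveTo(cache, "person1", "Mumbai");
        System.out.println(cache.get("person1").address.city);
    }
}
```

Because the Address is reachable only through the Person, no caller can change one without going through the atomic update of the other.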
But what if the Address is not touched every time a Person is updated? Then Person and Address can be split into separate caches, so that when only the Person is updated the deserialization cost is contained. But what about those cases when both objects are to be updated?

Key Association and Transaction Lite
Everyone who has bought or rented a place knows the following three words - Location, Location, Location. When it comes to transactions it's colocation, colocation, colocation. Move related data together so that failures of the network or of nodes, and partial failures, can be contained within a single node. It is not completely safe in all scenarios, but as long as Addresses are not updated directly, a clever solution can be used to update them together.
public class Person {
private String id;
public String getKey() {
return id;
}
}
public class Address implements KeyAssociation {
    private final String personId;

    public Address (String personId) {
        this.personId = personId;
    }

    // Co-locate this Address with the Person that owns it
    public Object getAssociatedKey () {
        return personId;
    }
}
NamedCache pCache = CacheFactory.getCache("Person");
NamedCache aCache = CacheFactory.getCache("Address");

pCache.put ("person1", new Person());
aCache.put ("address1", new Address("person1"));
pCache.invoke ("person1", new MyEntryProcessor());

The key association between the Person and the Address makes sure that, even though two different caches are used for these two Entries, they end up in the same partition as long as the two caches use the same cache scheme. And then the following:
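public class MyEntryProcessor implements InvocableMap.EntryProcessor {
    public Object process (InvocableMap.Entry entry) {
        final BackingMapManagerContext ctx = ((BinaryEntry) entry).getContext();
        // Backing maps hold keys and values in their internal (Binary) form
        Map map = ctx.getBackingMap ("Address");
        Object binKey = ctx.getKeyToInternalConverter().convert ("address1");
        Address address = (Address) ctx.getValueFromInternalConverter().convert (map.get (binKey));
        ...
        entry.setValue(...);
        return null;
    }
    // processAll() omitted
}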
This works as long as no other process updates the Address while the Person is locked by the EntryProcessor. If you are already on v3.7, the getBackingMap() API has been deprecated and replaced with the more thread-safe getBackingMapEntry() API. With some changes in how and when the backups of these entries are updated, this entire collection of APIs is now called Transaction Lite. If you can colocate all associated entries in the same partition, Transaction Lite is worth looking at.
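To see why key association keeps a Person and its Address in the same partition, here is a toy model. It is not Coherence's partitioning code: the Associated interface, the AddressKey class, the partitionOf function and the partition count are all assumptions made for illustration. The sketch declares the association on a key class, which, as far as I know, is also where Coherence looks for it.

```java
public class ColocationSketch {
    /** Hypothetical partition count for the toy model. */
    static final int PARTITION_COUNT = 257;

    /** Stand-in for an association interface: a key names the key it travels with. */
    public interface Associated {
        Object getAssociatedKey();
    }

    /** Hypothetical Address key that is associated with its owner's key. */
    public static final class AddressKey implements Associated {
        private final String addressId;
        private final String personId;

        public AddressKey(String addressId, String personId) {
            this.addressId = addressId;
            this.personId = personId;
        }

        @Override
        public Object getAssociatedKey() {
            return personId;
        }

        @Override
        public String toString() {
            return addressId + "@" + personId;
        }
    }

    /** Routes a key by its associated key when it has one, otherwise by the key itself. */
    public static int partitionOf(Object key) {
        Object routingKey = (key instanceof Associated)
                ? ((Associated) key).getAssociatedKey()
                : key;
        return Math.floorMod(routingKey.hashCode(), PARTITION_COUNT);
    }

    public static void main(String[] args) {
        int personPartition = partitionOf("person1");
        int addressPartition = partitionOf(new AddressKey("address1", "person1"));
        System.out.println(personPartition == addressPartition);
    }
}
```

The toy model also shows the downside discussed next: every key that points at the same associated key lands in one partition, however many there are.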

A dirty little way, if you know crossing your fingers has yielded positive results
The problem of transactions becomes severe when associated objects are meant to be updated in parallel. What if the same Address is shared by multiple People? Using KeyAssociation is then not recommended. As more and more People share the same Address (like a dorm), KeyAssociation will make the partition bloated and unevenly balanced, something that should preferably be avoided. If provisioning is done right (or even over-provisioned), cluster nodes appear not to fail, and atomicity is good to have but not absolutely required, then an EntryProcessor can be invoked from within another EntryProcessor as long as the Person and Address caches use different cache schemes (different service names). But all bets are off once, say, the Address's EP succeeds but the Person's fails. Setting the backup count to '0' would minimize the error window, but this is still just not a solution. Coherence did have an API, TransactionMap, that allowed multiple operations to be committed in a single transaction, provided nothing went wrong as described here. This API has now been deprecated and replaced with the Transactional scheme.

Transactional Scheme
A new cache scheme was added in v3.6 that allows updates to multiple cache entries in a single transaction, as long as the caches belong to the same transactional scheme. Make sure the following cache scheme is defined:
<cache-config> ...
  <caching-schemes>
    <transactional-scheme>
      <scheme-name>transactional-scheme</scheme-name>
      <service-name>TransactionalCache</service-name>
      <autostart>true</autostart>
    </transactional-scheme>
  </caching-schemes>
</cache-config>

If Person and Address use the same TransactionalCache scheme


This is a simpler problem to solve with the current scheme.
DefaultConnectionFactory factory = new DefaultConnectionFactory();
Connection connection = factory.createConnection("TransactionalCache");
connection.setAutoCommit (false);
connection.setIsolationLevel(Isolation.READ_COMMITTED);

OptimisticNamedCache personCache = connection.getNamedCache("Person");
OptimisticNamedCache addressCache = connection.getNamedCache("Address");

try {
    Person p = (Person) personCache.get("p1");
    Address a = (Address) addressCache.get("a1");

    update(p, a);

    connection.commit();
} catch (Exception e) {
    // Undo everything done on this connection since the last commit
    connection.rollback();
} finally {
    connection.close();
}

Twist - Person and Address in two different cache schemes

Modified Problem: Update two Persons in a single Transaction and then update the Address if the previous transaction succeeds

This is tricky, and this is what I call a dependency transaction.

Let's take a twist and introduce a new type of InvocableMap.Entry:
public interface TransactionalEntry extends InvocableMap.Entry {
    Status getTransactionStatus();
    Object getAnotherEntry (Object key);
    void update (Object key, Object value, Filter predicate);
}


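Let's use this new interface in line with running an EntryProcessor:
public class SomeUtilClass {
    // cacheName is passed in because the anonymous entry below looks the cache up by name
    public <K> Object invoke (final String cacheName, final K key, final InvocableMap.EntryProcessor processor) {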
final DefaultConnectionFactory factory = new DefaultConnectionFactory();
final Connection connection = factory.createConnection ("TransactionalCache");
connection.setAutoCommit(false);
connection.setIsolationLevel (Isolation.READ_COMMITTED);

final TransactionState state = connection.getTransactionState();

final TransactionalEntry entry = new TransactionalEntry () {
private final OptimisticNamedCache oCache = connection.getNamedCache(cacheName);

@Override
public Object getKey() {
return key;
}

@Override
public Object getValue() {
return oCache.get(key);
}

@Override
public Object getAnotherEntry(Object relatedKey) {
return oCache.get(relatedKey);
}

@Override
public void update(Object key, Object value, Filter predicate) {
oCache.update(key, value, predicate);
}

@Override
public Status getTransactionStatus() {
return state.getStatus();
}

// -- Other methods
...
};
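Object result = null;
try {
    result = processor.process (entry);
    connection.commit();
} catch (Exception exp) {
    connection.rollback();
} finally {
    connection.close();
    // Wake up any thread that is waiting on the status monitor
    final Status status = entry.getTransactionStatus();
    synchronized (status) {
        status.notifyAll();
    }
}
return result;

}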

}

Now, which processor is passed to the invoke method declared above? It is an EntryProcessor that is not necessarily executed as an EntryProcessor, but this model could gain some solid points for design consistency.
public class MyProcessor implements InvocableMap.EntryProcessor {
   private final ValueExtractor extractor = new ReflectionExtractor ("currentVersion", ...);

public Object process (final InvocableMap.Entry entry) {
final TransactionalEntry txEntry = (TransactionalEntry) entry;
final Person firstPerson = (Person) txEntry.getValue();
Filter predicate = new WhateverFilter (extractor, firstPerson.currentVersion());
firstPerson.incrementVersion();

Person anotherPerson = (Person) txEntry.getAnotherEntry ("anotherPersonKey");
anotherPerson.incrementVersion();

// -- This needs to be transactional
doSomething (firstPerson, anotherPerson);

txEntry.update (entry.getKey(), firstPerson, predicate);

// -- Use the Status as a Semaphore
executorService.execute (new Runnable () {
@Override
public void run () {
Status status = ((TransactionalEntry) entry).getTransactionStatus();
synchronized (status) {
try {
status.wait();
} catch (InterruptedException i) { ... }

switch (status) {
case COMMITTED:
// -- Update the Address.
// -- At this point the two Person objects are guaranteed to have been updated
// -- in a single transaction. The Address could have been updated the same way
// -- had it used the same Transactional scheme.
// -- If the Address is in a different cache scheme and the system is provisioned
// -- well enough that the put() succeeds, the Address only gets updated after
// -- the Person(s) have been successfully updated.
// -- updateAddress() could use another EntryProcessor.
updateAddress (address);
break;
case ROLLEDBACK:
break;
default:
}
}
}
});
return firstPerson;
}
}
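One caveat with using the Status as a semaphore: in Java every enum constant is a separate object with its own monitor, so a thread that calls wait() on one constant is only woken by a notifyAll() on that same constant. A variation that does not depend on this is to hand the outcome over through a future. The sketch below is plain Java and not Coherence code; DependentUpdate, its Status enum and afterCommit are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class DependentUpdate {
    /** Hypothetical stand-in for the transaction outcome. */
    public enum Status { COMMITTED, ROLLEDBACK }

    /**
     * Runs the dependent step only when the first transaction reports COMMITTED.
     * The returned future tells the caller whether the dependent step ran.
     */
    public static CompletableFuture<Boolean> afterCommit(CompletableFuture<Status> outcome,
                                                         Runnable dependent) {
        return outcome.thenApply(status -> {
            if (status == Status.COMMITTED) {
                dependent.run();
                return true;
            }
            return false;
        });
    }

    public static void main(String[] args) {
        AtomicInteger addressUpdates = new AtomicInteger();
        CompletableFuture<Status> personTx = new CompletableFuture<>();
        CompletableFuture<Boolean> ran = afterCommit(personTx, addressUpdates::incrementAndGet);

        // The code that commits the Person transaction completes the future
        personTx.complete(Status.COMMITTED);
        System.out.println("address updated: " + ran.join());
    }
}
```

Here the code that commits or rolls back the first transaction completes the future in its finally block, and the Address update is chained onto it instead of waiting on a shared monitor.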

Enjoy!

Monday, October 31, 2011

Politicians should be traded in stock exchange

Let's see who we bet on when it comes to our hard-earned money. It would be a people's verdict on a daily basis, and let's see if they would be willing to lose their "investments" because of selections based on caste, religion or region. And let's see who turns out to be the consistent performers.

Saturday, October 29, 2011

They should be conferred Bharat Ratnas

For their exemplary achievement in their field of work.

  • Rajani Kanth
  • Sachin Tendulkar
  • Amitabh Bachchan
  • Anna Hazare
  • Sam Manekshaw