Monday, October 05, 2009

Integrating LDAP with Coherence

Please read Securing a Coherence Cache as a precursor to this post. It explains how to externalize the cache security provider so that it can be set in the Coherence cache configuration. The security provider class implements the SecurityProvider interface, which has a single method, checkAccess (Subject). The Subject is passed in by the "cache client" and has to be authenticated and authorized by the security provider. Since I started Oracle Coherence consulting, the need to integrate Coherence with an LDAP provider has come up time and again: user accounts are managed in a Directory server, and each application/user account must be limited to the caches, and the kind of access, it is entitled to. So let's think about it again and see if we can streamline this solution and build something generic.

Problem Statement: Set up Coherence caches so that access to each individual cache is driven by the Enterprise directory.

Let's think about the architectural decision points:
  • LDAP Server is an external data source. Use CacheLoader.
  • Avoid accessing LDAP for each Cache request. Cache user authentications (An admin cache).
  • Protect the admin cache that manages the user-auth to not allow any access. Protect the protector.
  • Protect the proxy from *Extend access. Cluster member has inherent trust. Use authorized-hosts for cluster members.
  • Use JAAS
  • Manage authorization locally but authentication centrally.
How about a quick activity diagram?

Authentication
User authentication has to happen only once per period (typically once every 24 hours). That is an acceptable cost to incur once a day, since user accounts change very infrequently, if at all. Once an account is verified against the Directory server, the authentication result can be cached. We also need to make sure that the cache holding the authentication information is inaccessible to unauthorized users and applications. How do we do that?
  • Create an Admin cache.
  • Plug a custom CacheLoader that interacts with an external Directory server.
  • Build the cache key from the cache name and the user credentials; the cache value is Boolean.TRUE or Boolean.FALSE.
  • Using the <entitled> XML element, configure DisAllowSecurityProvider as its security-provider.
  • DisAllowSecurityProvider denies all requests to this cache other than those made by a "chosen few". Scroll down for its implementation.
So what would such an Admin cache configuration look like?
<distributed-scheme>
  <scheme-name>admin-distributed-scheme</scheme-name>
  <service-name>AdminDistributedService</service-name>
  <backing-map-scheme>
    <read-write-backing-map-scheme>
      <internal-cache-scheme>
        <local-scheme>
          <high-units>200KB</high-units>
          <unit-calculator>BINARY</unit-calculator>
          <expiry-delay>86400000</expiry-delay>
        </local-scheme>
      </internal-cache-scheme>
      <cachestore-scheme>
        <class-scheme>
          <class-name>LDAPCacheLoader</class-name>
          <init-params>
            <init-param>
              <param-type>string</param-type>
              <param-value>ldap.server.com</param-value>
            </init-param>
            <init-param>
              <param-type>int</param-type>
              <param-value>389</param-value>
            </init-param>
          </init-params>
        </class-scheme>
      </cachestore-scheme>
    </read-write-backing-map-scheme>
  </backing-map-scheme>
  <autostart>true</autostart>
  <entitled>
    <security-provider>DisAllowSecurityProvider</security-provider>
  </entitled>
</distributed-scheme>

So the Admin cache is size limited, its entries expire 24 hours after the first authentication, and it does not allow any outside access. How could that be? LDAPCacheLoader's load () method can be very simple. The cache key passed in could carry "username$password", which can be parsed and authenticated against the Directory server using LDAP APIs. If authentication succeeds, return Boolean.TRUE; otherwise return Boolean.FALSE. So how is load () invoked, and from where?
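
Before answering that, here is a rough sketch of what the body of load () might delegate to. It is not the actual LDAPCacheLoader source: the class name LdapAuthSketch, the key layout "username$password$$cacheName" and the DN pattern "uid=...,ou=people,dc=example,dc=com" are all assumptions made for illustration, and it uses only the JNDI classes that ship with the JDK.

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.InitialDirContext;
import javax.naming.ldap.Rdn;

/** Hypothetical helper; a real LDAPCacheLoader.load(key) could delegate to it. */
final class LdapAuthSketch {
    private final String host;
    private final int port;

    LdapAuthSketch(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Splits "username$password$$cacheName" into {username, password, cacheName}. */
    static String[] parseKey(String key) {
        int cacheSep = key.lastIndexOf("$$");
        String credentials = cacheSep < 0 ? key : key.substring(0, cacheSep);
        String cacheName = cacheSep < 0 ? "" : key.substring(cacheSep + 2);
        int pwSep = credentials.indexOf('$');
        if (pwSep < 0) {
            throw new IllegalArgumentException("expected username$password");
        }
        return new String[] {
            credentials.substring(0, pwSep), credentials.substring(pwSep + 1), cacheName };
    }

    /** Tries an LDAP simple bind; TRUE on success, FALSE on rejection or any error. */
    Boolean authenticate(String key) {
        String[] parts = parseKey(key);
        if (parts[1].isEmpty()) {
            // Many servers treat an empty password as an anonymous bind; reject it here
            return Boolean.FALSE;
        }
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://" + host + ":" + port);
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        // Assumed DN layout; adjust to the directory's real tree
        env.put(Context.SECURITY_PRINCIPAL,
                "uid=" + Rdn.escapeValue(parts[0]) + ",ou=people,dc=example,dc=com");
        env.put(Context.SECURITY_CREDENTIALS, parts[1]);
        try {
            new InitialDirContext(env).close(); // a successful bind means valid credentials
            return Boolean.TRUE;
        } catch (NamingException e) {
            return Boolean.FALSE;
        }
    }
}
```

With the host and port taken from the init-params in the configuration, load () would then reduce to something like return new LdapAuthSketch(host, port).authenticate((String) key).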

Default Security Provider
Caching user authentication is a luxury that can be centralized. Applications deal with two aspects of cache security, authentication and authorization, and these can be split into two classes. Building on cached authentication, let's write an abstract default security provider. Any security provider that extends it gets the "performance" for free.

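import javax.security.auth.Subject;
import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

public abstract class DefaultSecurityProvider implements SecurityProvider {
    // Admin cache that holds the cached authentication results
    private final NamedCache nCache = CacheFactory.getCache("USER_CRED");
    protected String cacheName;

    protected DefaultSecurityProvider(String cacheName) {
        this.cacheName = cacheName;
    }

    public boolean checkAccess(Subject subject) {
        if (subject == null || subject.getPrincipals().isEmpty()) {
            return false;
        }
        // The first principal's name carries the user credentials
        String user_pw = subject.getPrincipals().iterator().next().getName();
        String userName = getUserName(user_pw); // helper not shown in this post
        // A cache miss goes through to LDAPCacheLoader.load()
        Boolean isPresent = (Boolean) nCache.get(user_pw + "$$" + cacheName);
        return Boolean.TRUE.equals(isPresent) && authorize(userName);
    }

    public abstract boolean authorize(String userName);
}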
Authorization
Like authentication, authorization should be relatively inexpensive. There are two possible approaches. One is to use the Directory server to store the authorization attributes too. That is perfectly doable, but authorization concerns Coherence or the application and should be "owned" by it. Central governance should apply only to authentication, not to authorization. So let's find an inexpensive way... how about Java Permission objects driven by a policy file? Let's write a policy file:
grant Principal CustomPrincipal "Principal1" {
    permission java.util.PropertyPermission "Cache1", "read, write";
    // .. More can be added here...
};
grant Principal CustomPrincipal "Principal2" {
    permission java.util.PropertyPermission "Cache2", "read, write";
    // ... More can be added here...
};
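
To make the semantics of those grants concrete, here is a small in-memory sketch of the same idea. It does not read a policy file or involve the SecurityManager; it only shows how PropertyPermission objects answer "may this principal do this to that cache?". The class name CachePolicySketch and its methods are made up for this illustration.

```java
import java.security.Permissions;
import java.util.HashMap;
import java.util.Map;
import java.util.PropertyPermission;

/** Hypothetical in-memory stand-in for the per-principal grants in the policy file. */
final class CachePolicySketch {
    private final Map<String, Permissions> grants = new HashMap<>();

    /** Mirrors one "permission java.util.PropertyPermission ..." line for a principal. */
    void grant(String principalName, String cacheName, String actions) {
        grants.computeIfAbsent(principalName, k -> new Permissions())
              .add(new PropertyPermission(cacheName, actions));
    }

    /** True only if a grant for this principal implies the requested action on the cache. */
    boolean isAllowed(String principalName, String cacheName, String action) {
        Permissions perms = grants.get(principalName);
        return perms != null
            && perms.implies(new PropertyPermission(cacheName, action));
    }
}
```

For example, after grant("Principal1", "Cache1", "read,write"), a call to isAllowed("Principal1", "Cache1", "write") succeeds, while the same principal asking for "Cache2" is refused because nothing was granted on it.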

What about the custom security provider?

import java.util.PropertyPermission;

public class MyCustomSP extends DefaultSecurityProvider {

    public MyCustomSP(String cacheName) {
        super(cacheName);
        this.cacheName = cacheName;
    }

    public boolean authorize(final String user) {
        if (user == null) {
            System.out.println("Auth not in USER_CRED cache");
            return false;
        }
        try {
            PropertyPermission fp =
                new PropertyPermission(cacheName, "write");
            // Throws SecurityException if the policy does not grant fp.
            // Principal-based grants apply only when this call runs with
            // the Subject attached, e.g. inside Subject.doAs(...).
            new SecurityManager().checkPermission(fp);
            return true;
        } catch (SecurityException exp) {
            ...
        }
        return false;
    }
}
In this implementation, a User Principal that has "write" permission gets access. But if each NamedCache method is classified as either a read or a write, then the invoked method can be passed to checkAccess () along with its classification. Instead of a hard-coded "write" for every access, each NamedCache method can then have fine-grained user authorization. Of course, you are free to create your own Permission class with its own set of actions and use that.
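
One possible shape for that classification is a simple lookup from method name to action. The sketch below is an assumption on my part, not code from the framework: CacheActionSketch is a made-up name, the list of read methods is deliberately incomplete, and anything not listed is treated as a "write" so that an unclassified method never slips through with only read permission.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical mapping from a NamedCache method name to a permission action. */
final class CacheActionSketch {
    // Methods that only observe cache state (illustrative, not exhaustive)
    private static final Set<String> READ_METHODS = new HashSet<>(Arrays.asList(
        "get", "getAll", "containsKey", "containsValue", "size", "isEmpty",
        "keySet", "values", "entrySet"));

    /** Returns "read" for listed methods; every other method is treated as "write". */
    static String actionFor(String methodName) {
        return READ_METHODS.contains(methodName) ? "read" : "write";
    }
}
```

The wrapping NamedCache could then build new PropertyPermission(cacheName, CacheActionSketch.actionFor("get")) instead of always asking for "write".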

I am not done yet!
In the activity diagram there is a logical concept of a Gatekeeper. Who is it, and how does it do its job? The gatekeeper is a combination of a custom NamedCache (EntitledNamedCache) and a SecurityProvider called DisAllowSecurityProvider. EntitledNamedCache is auto-magically configured for caches that have the <entitled> element defined (read Securing a Coherence Cache for more information), while DisAllowSecurityProvider is configured on the Admin cache (USER_CRED) that stores the authentication info.

What does DisAllowSecurityProvider do?
import javax.security.auth.Subject;
import com.tangosol.util.Base;

public class DisAllowSecurityProvider implements SecurityProvider {
    public DisAllowSecurityProvider() {
    }

    public DisAllowSecurityProvider(String cacheName) {
    }

    public boolean checkAccess(Subject subject) {
        StackTraceElement[] elements = new Throwable().getStackTrace();
        // elements[0] is this checkAccess() frame itself, which is always a
        // SecurityProvider, so testing it would let every caller through.
        // Only the frame expected to hold the original caller is tested.
        if (elements.length <= 3) {
            return false;
        }
        StackTraceElement e3 = elements[3];

        try {
            // Allow access only when the caller is itself a SecurityProvider
            return SecurityProvider.class.isAssignableFrom(Class.forName(e3.getClassName()));
        } catch (ClassNotFoundException f) {
            Base.log(f);
            return false;
        }
    }
}
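
The caller check depends on a fixed stack depth, which shifts if the call path changes. As a variation (my own sketch, not what DisAllowSecurityProvider does), the whole stack can be scanned for a frame whose class implements a marker type. GuardMarker, CallerCheckSketch, TrustedCaller and UntrustedCaller are hypothetical names standing in for SecurityProvider and its callers.

```java
/** Hypothetical marker type standing in for SecurityProvider. */
interface GuardMarker {
}

final class CallerCheckSketch {
    /** True if any calling frame belongs to a class that implements the marker type. */
    static boolean calledFrom(Class<?> marker) {
        StackTraceElement[] frames = new Throwable().getStackTrace();
        ClassLoader loader = CallerCheckSketch.class.getClassLoader();
        // frames[0] is calledFrom() itself, so start with its caller
        for (int i = 1; i < frames.length; i++) {
            try {
                Class<?> c = Class.forName(frames[i].getClassName(), false, loader);
                if (marker.isAssignableFrom(c)) {
                    return true;
                }
            } catch (ClassNotFoundException | LinkageError e) {
                // Frame class not visible from this loader; skip it
            }
        }
        return false;
    }
}

/** A caller that implements the marker, so the check lets it through. */
final class TrustedCaller implements GuardMarker {
    boolean read() {
        return CallerCheckSketch.calledFrom(GuardMarker.class);
    }
}

/** A caller without the marker; the check refuses it. */
final class UntrustedCaller {
    boolean read() {
        return CallerCheckSketch.calledFrom(GuardMarker.class);
    }
}
```

Scanning every frame is more forgiving than a fixed index, but it also admits any call that merely passes through a marker class somewhere up the stack, so it is a trade-off rather than a strict improvement.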
So here you go, you get a decently flexible Coherence Cache security implementation. Enjoy!

**One of my colleagues, Steve Brockman, asked if it was possible to extend the security to other cluster nodes besides the proxy nodes. The solution is a little different but easy to build. Here are the steps:
  1. Copy coherence-cache-config.xml to, say, alt-cache-config.xml
  2. Open alt-cache-config.xml in an editor and remove all the <entitled> sections from the configuration.
  3. Edit ExtendedCacheFactory and look for FILE_CFG_CACHE in the file. The next line is where the class sets the cache configuration name. Hardcode the param-value to alt-cache-config.xml (Or, be more creative but set it to alt-cache-config.xml).
  4. Deploy the alt-cache-config.xml on all the cluster nodes.
  5. Set -Dtangosol.coherence.override=proxy-override.xml on all cluster nodes.

10 comments:

JK said...

Ashish, there are a number of issues with what you have done here and if you (and your other Oracle colleagues) are interested in securing Coherence you should look at what I, and now a few others here in London, have been doing to secure Coherence. It is possible, with some work, to secure Coherence properly against Kerberos, Active Directory etc; this includes cluster security, Extend security and JMX security.

Ashish said...

Hi JK,
Do you mind sharing what issues you see?

-Ashish

JK said...

OK. As I have done a lot of work over the last few months on Coherence security I kind of know a lot of the shortfalls.

You have based this blog and your previous one on the same subject on the example on Coherence's wiki which uses a wrapped named cache. This is very fine grained access control and would have quite a performance hit considering most people use Coherence for speed of access to data.

Your example only pulls the username out of the subject and then uses that to check access. It is very simple to write a piece of code that can pretend to be any user it likes by setting the user name into a principal.

Your example covers Extend access to caches but does not secure Invocation services. I know you could write your invocables to do some checking but that relies on app teams coding things correctly.

Your example does not really cover cluster security or if anyone requires it JMX security. JMX security is straight forward as Sun have made it pretty easy but you still need to extend the Coherence com.tangosol.net.management.MBeanConnector class and disable http access to JMX.

To be properly secure you need at least username and password or even better to use something like Kerberos or Active Directory.

Once you start digging into this more there are lots of edge cases that crop up with Extend and secured clusters.

We have started a project here that covers the work we are doing: http://code.google.com/p/coherence-security/ so you might find it interesting. It is still work in progress so although it does all work the documentation on how to run it all is not quite there yet.

Ashish said...

1. Yes the solution's usecase is to provide fine grained cache authorization. Like one account can do puts but not invoke an EntryProcessor sort of thing. It can be controlled using a custom NamedCache and Actions of a custom permission.
2. The framework does not require a username/password based authorization. As you might see, the SecurityProvider is externalized, so you can have any implementation you want as long as you pass a Subject. Only AccessSecurityProvider requires name/password. Also it is not only a username and password alone but how they are constructed. Yes it is not foolproof but adding an additional factor decreases the probability of unauthorized access. And for application accounts I have found this to be an acceptable solution as enterprises use Database connections in a similar fashion too. If you have username and password you construct a JDBC url and get in. Why with Coherence is this bad?
3. As you mentioned the example only protects a proxy but is very trivial to extend it for all cluster nodes. InvocationService comes with a little bit more inherent security as Invocable has to be deployed on the nodes for it to be able to be invoked. Vis-a-vis Invocables, the first protection is I should not be able to write and execute my own Invocable which is not possible. Second is if I execute an Invocable without authorization but this is pretty trivial to do too as you mention.
4. Typical Coherence installation enterprise wise is already protected behind a firewall and other standard infrastructures. And with a set of authorized-host-list in the override plus unique multicast IP/port plus a unique cluster name, making cluster members any more secure could be a moot point unless we have a very specific usecase. There are a number of inherent ways for not allowing an unauthorized process or host to become a cluster member.
5. JMX security is out of scope of this solution too but that is pretty trivial to add.
If you look at the source (though not completely available for reuse) the cost of authentication and authorization is kept very low by introducing an Admin cache. I also have tested it under load and still get pretty decent and acceptable performance numbers. The problem is when it comes to security there is no right answer; it all depends on a specific usecase. What this blog talks about is a majority of usecases that I have come across and how acceptable this solution is with the customers but still provide a generic enough framework to expand.
Thanks for sending the link, I looked at it and it looks pretty good. But I haven't read it in detail. Maybe we can hook up sometime and see if something can be merged.

JK said...

In reply to your points:
1. Yes, that is OK if the customer wants fine grained authorisation and is willing to accept the performance hit on every cache access.

2. Username/Password does not work with Extend. The POF serializer for a Subject only serializes the names of the Principals contained in the Subject. This means that out of the box Extend does not support anything more than passing a name, which is not very secure. You end up having to write your own custom Principal and corresponding POF serializer. I am not saying Coherence is bad, and if you want to just do username and password that is fine, just that Coherence Extend does not support passing a password.

4. While Cluster security is a bit more straight forward as JAAS is supported, there are a few problems you then run across when combining a secure cluster and secure Extend. One of these, for example, is that if your cluster is secured and an Extend connection is made with no Subject, your Extend proxy authentication code will pick up the Subject your cluster is running as instead of a null subject.

4. Yes, in a lot of cases you can use authorized host lists and firewalls to protect your cluster. This breaks down if your extend client can be any of a large number of client machines (e.g. an engine on a compute grid). If you need to have your cluster machines somehow dynamically allocated so you can quickly add to your cluster then again authorized hosts does not really work. I know you can now use an AddressProvider for the authorized hosts list so you could build something dynamic.

5. Yes, I just added JMX as it is something to think about if you are securing Coherence. I have seen people run the http JMX server on production clusters where anyone in the company could then point their browser to the JMX url and play around with Coherence settings!

Coherence security is a little weak at the moment and you need to do a bit of work to make it usable and cover all the bases. This is only going to become more important as the visibility of Coherence grows in large companies (like Banks) and once internal audit teams realize that all this data sits in memory and is pretty much open to anyone who can connect on the right ports.

I am not being critical of what you have done in your blog, and I did exactly the same things when I started looking at how to secure Coherence. I also realise that what you did is fine for some people and provides all the functionality they want. I just think security is an important subject for some people and if you are going to do a secure Coherence cluster then you need to do it properly.

Unknown said...

hi...
what is the role of subject in the code...entitlercachefactory role...

Ashish said...

Hi Vidhya,
I will quote a javadoc on Subject: "To authorize access to resources, applications first need to authenticate the source of the request. The JAAS framework defines the term subject to represent the source of a request. A subject may be any entity, such as a person or a service. Once the subject is authenticated, a javax.security.auth.Subject is populated with associated identities, or Principals". More is at
http://java.sun.com/j2se/1.4.2/docs/guide/security/jaas/JAASRefGuide.html#Subject

Unknown said...

I know this is almost a year after this blog was posted, but I wanted to mention that as of Coherence 3.6 there is a pluggable identity feature that allows passing a proper Subject from Extend clients to a Proxy Service.

SC said...

Is this implementation still current, that is, valid for Coherence 3.6 and up? Thanks

- Srikanth

Ashish said...

Srikanth,
3.6 introduced some more security features that you may wanna check out first. This solution deals with fine grained authorizations that go beyond communication security. And yes this solution is still valid with 3.6+

-Ashish