Data sharding and handling it in code

I was on a project where the data was somewhat sharded. Each customer lived in a customer database (accessible through webservice calls), but we also kept some additional customer data of our own. This led to the (I think somewhat unfortunate) situation where the application reasoned from its own business object and, when necessary, retrieved data from the relation system.

I feel this is flawed: ideally, the split in the customer data should be invisible to the business logic of the application. Where I would normally like to have an anemic domain model, this data sharding makes that hard. Let's say you have relation R and our extended data S. In our application, we want to reason about RS (or at least, not have to make that distinction). So maybe S has some additional attribute which may (or may not) be filled, depending on whether R is available. That sounds like a nice idea, but wherever R is needed and only S is at hand, R has to be retrieved.
And that is where the flaw is, since this doesn't help reusability. Our components are built using S, so it is quite possible that we retrieve R multiple times!

That is not desired. We can go two ways:
* We have a factory which always enriches our content at the start, so we never have a bare S; we always ensure we have RS.
* We have a thicker model, in which S can retrieve R when necessary. This entails giving S the means to retrieve R.
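
The first option can be sketched in plain Java roughly as follows. This is only an illustration, not code from the project: `Customer`, `CustomerClient`, `EnrichedCustomerData` and `EnrichingFactory` are hypothetical names, and the map of notes simply stands in for wherever S is really stored.

```java
import java.util.Map;

// Hypothetical stand-in for the remote customer record (R).
final class Customer {
    final String key;
    final String name;

    Customer(String key, String name) {
        this.key = key;
        this.name = name;
    }
}

// Hypothetical stand-in for the webservice client.
interface CustomerClient {
    Customer retrieve(String key);
}

// The combined "RS" view: local data (S) that always carries its R.
final class EnrichedCustomerData {
    final Customer customer;  // R, fetched up front
    final String localNote;   // an attribute that only exists in S

    EnrichedCustomerData(Customer customer, String localNote) {
        this.customer = customer;
        this.localNote = localNote;
    }
}

// Option 1: the factory is the only way to get customer data,
// and it always fetches R before handing anything out.
final class EnrichingFactory {
    private final CustomerClient client;
    private final Map<String, String> localNotes; // assumption: stands in for the local store of S

    EnrichingFactory(CustomerClient client, Map<String, String> localNotes) {
        this.client = client;
        this.localNotes = localNotes;
    }

    EnrichedCustomerData load(String customerKey) {
        Customer r = client.retrieve(customerKey); // one remote call per load
        return new EnrichedCustomerData(r, localNotes.get(customerKey));
    }
}
```

The price of this approach is that every load pays for a remote call, even when the caller never looks at R.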

To me, the last option feels the cleanest. However, both leave the issue of synchronisation: what happens when R is (implicitly) updated? This sounds like a job for our EntityManager, although an EntityManager does so much more. Still, we might be able to do something with it.

I really should test this, but so far it's just a PoC.
Usually, we have some sort of factory for S objects, which holds the EntityManager and persists S.
However, what we can do is have that factory also insert the SOAP client, so S can retrieve R if necessary. Moreover, we *can* use the EntityManager! (whoopee!)
If we register an entity listener and use some transient fields, we can have our cake and eat it too:

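import javax.persistence.Entity;
import javax.persistence.EntityListeners;
import javax.persistence.Id;
import javax.persistence.Transient;

@Entity
@EntityListeners({CustomerDataPersister.class})
public class ApplicationCustomerData {

    // Key of the customer (R) in the remote customer database.
    // Assumption for this sketch: it doubles as the primary key of S.
    @Id
    private String customerKey;

    // R itself is never stored locally; it is loaded on demand
    @Transient
    private Customer customer;

    @Transient
    private CustomerWebserviceClient webserviceClient;

    // Called by whoever creates or loads S (the factory, or a listener)
    public void setWebserviceClient(CustomerWebserviceClient webserviceClient) {
        this.webserviceClient = webserviceClient;
    }

    // Retrieves R on first use and caches it for later calls
    public Customer getCustomer() {
        if (customer == null) {
            customer = webserviceClient.retrieve(customerKey);
        }
        return customer;
    }

    // True once R has been loaded
    public boolean isEnhanced() {
        return customer != null;
    }
}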

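import javax.inject.Inject;
import javax.persistence.PrePersist;

public class CustomerDataPersister {

    // use CDI to inject the webserviceclient, see below
    @Inject
    private CustomerWebserviceClient webserviceClient;

    // Write R back to the webservice, but only if it was ever loaded
    @PrePersist // and others
    public void update(ApplicationCustomerData d) {
        if (d.isEnhanced()) {
            webserviceClient.update(d.getCustomer());
        }
    }
}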

Now we're talking! We can even do all of this behind the scenes using more entity listeners, say, one which inserts the webservice client in a method annotated with @PostLoad.

However, the big secret is injecting the webservice client in the first place. We need it in the listeners, but they are not necessarily managed beans! Luckily for us, JPA 2.1 supports CDI injection in entity listeners (JPA 2.0 does not). Stack Overflow has this covered:
http://stackoverflow.com/questions/10765508/cdi-injection-in-entitylisteners

Now we suddenly have an S object which implicitly loads R when necessary. The rest of the application is none the wiser, and a complex issue which would otherwise eat through our entire application is neatly tucked away!
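
Since the PoC above is untested, the lazy-loading behaviour can at least be checked without a container by mimicking it in plain Java. The sketch below is an analogue, not the JPA code itself: `RemoteCustomer`, `CountingClient`, `LazyCustomerData` and `WriteBack` are hypothetical names, the client is an in-memory fake that counts its calls, and `beforePersist` stands in for the listener callback.

```java
// Hypothetical stand-in for the remote customer record (R).
final class RemoteCustomer {
    final String key;
    final String name;

    RemoteCustomer(String key, String name) {
        this.key = key;
        this.name = name;
    }
}

// In-memory fake of the webservice client that counts how often it is called.
final class CountingClient {
    int retrievals = 0;
    int updates = 0;

    RemoteCustomer retrieve(String key) {
        retrievals++;
        return new RemoteCustomer(key, "name-of-" + key);
    }

    void update(RemoteCustomer customer) {
        updates++;
    }
}

// Plain-Java analogue of ApplicationCustomerData (S): R is fetched on first use only.
final class LazyCustomerData {
    private final String customerKey;
    private final CountingClient client;
    private RemoteCustomer customer; // stays null until someone asks for R

    LazyCustomerData(String customerKey, CountingClient client) {
        this.customerKey = customerKey;
        this.client = client;
    }

    RemoteCustomer getCustomer() {
        if (customer == null) {
            customer = client.retrieve(customerKey);
        }
        return customer;
    }

    boolean isEnhanced() {
        return customer != null;
    }
}

// Plain-Java analogue of the entity listener: write R back only if it was loaded.
final class WriteBack {
    static void beforePersist(LazyCustomerData data, CountingClient client) {
        if (data.isEnhanced()) {
            client.update(data.getCustomer());
        }
    }
}
```

With this in place, calling `getCustomer()` several times on the same object triggers a single retrieval, and `beforePersist` only calls `update` on objects whose R was actually touched.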

I forgot the codez...
