Tuesday, December 15, 2015

Enhance scalability with DDD. Part 2: Domain Event.

Properly identified consistency boundaries allowed us to split one large aggregate into two smaller ones, which in turn promise to scale better. Thus, two operations that modify "logically independent" data chunks can be performed in parallel. However, when it comes to modeling real domain entities, they are often not so "independent", but rather tightly coupled: modifying (or creating/deleting) one entity requires additional modifications on one or more other entities in order to keep the domain consistent.

In the previous post we considered a simple user management domain (it is highly recommended to read it first). By working with our domain experts Alice and Bob, we already discovered that the software should handle simultaneously changing a ParentUser's name and adding a new ChildUser to the same ParentUser.

5. Multiple users could simultaneously change a ParentUser's name and add a new ChildUser to the same ParentUser.

To make things slightly more complicated, let's add a new rule:

6. A ParentUser should track the number of its ChildUsers.

 public class ParentUser extends User{
   ...
   public int getNumberOfChildren(){
     //implementation
   }
   ...
 }

In other words, rule 6 means that the number of ChildUsers in our system should be consistent with the value returned by getNumberOfChildren(). It looks like we have encountered a new consistency rule which spans the ParentUser and ChildUser aggregates.

We already know that for the sake of scalability we have to keep aggregates as small as possible, and add new entities, forming larger aggregates, only when a true consistency rule is identified.

So it seems this time we could justify referencing our entities:

 public class ParentUser extends User{

   private Set<ChildUser> childUsers;

   public int getNumberOfChildren(){
     return this.childUsers.size();
   }

   public void addChild(ChildUser newChildUser){
     this.childUsers.add(newChildUser);
   }

   ...
 }

In fact, this solution brings us back to violating domain rule 5! But we don't want to disappoint Alice and Bob again!
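To see concretely why the single large aggregate breaks rule 5, consider optimistic locking: two users load the same aggregate version, one changes the name and the other adds a child, and the second commit is rejected even though the two changes are logically independent. A minimal sketch (the VersionedParent class and its commit method are hypothetical, not part of the post's model):

```java
// Hypothetical sketch: one large aggregate guarded by a version field.
// Two concurrent commands that both read version 0 will conflict,
// even though they touch "logically independent" data.
final class VersionedParent {
    int version = 0;      // optimistic-locking version
    String name = "Alice";
    int children = 0;

    // Returns false when another transaction committed first (stale version).
    boolean commit(int expectedVersion) {
        if (expectedVersion != version) {
            return false; // concurrency conflict: the whole aggregate is locked
        }
        version++;
        return true;
    }
}
```

Both commands read version 0; whichever commits second is rejected and must retry, which is exactly the contention that splitting the aggregate was meant to avoid.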

What if we keep the Disconnected Model solution, but add a simple counter to ParentUser and increment it every time a new ChildUser is added to our system?

 public class ParentUser extends User{

   private int childrenCounter;
   ...

   public int getNumberOfChildren(){
     return this.childrenCounter;
   }

   public void incrementChildNumber(){
     this.childrenCounter++;
   }
   ...
 }


 public class UserApplication{
   ...

   @Transactional
   public void addNewChildUser(Name name, UserId parentId){
     ChildUser child = new ChildUser(name, parentId);
     this.repository.save(child);

     ParentUser parent = repository.find(parentId);
     parent.incrementChildNumber();
     this.repository.save(parent);
   }
   ...
 }

The problem with the solution above is that it violates another fundamental rule of aggregate modeling.

Modify only ONE aggregate instance per transaction!

A properly designed Bounded Context modifies only one Aggregate instance per transaction in all cases. What is more, we cannot correctly reason on Aggregate design without applying transactional analysis. 
Limiting modification to one Aggregate instance per transaction may sound overly strict. However, it is a rule of thumb and should be the goal in most cases
--Vaughn Vernon. Implementing Domain-Driven Design

Keeping performance and scalability in mind, our last solution is even worse than the large-aggregate one. Indeed, it still violates rule 5 and involves more entities in the transaction. Actually, modifying multiple aggregate instances in one transaction dramatically limits scalability. For example, if we store each aggregate type in a different relational database, we would need a distributed transaction. If we store them as documents in a NoSQL database (such as MongoDB), we would run into transaction issues again (most NoSQL solutions don't support modifying several documents in one transaction). And even if we store both in the same Oracle database, we are still not free of "ORA-08177: can't serialize access for this transaction", because of a simple rule: the more data involved in a transaction, the higher the chances of concurrent conflicts.

Just because we are given a use case that requires maintaining consistency in a single transaction doesn't mean we should do that. Often, in such cases, the business goal can be achieved with eventual consistency between aggregates.

Domain Event pattern.

There is a practical solution for achieving eventual consistency in a DDD model: an Aggregate publishes a Domain Event which is delivered, after some period of time, to one or multiple subscribers.

One point could be added to the tweet above: the language of reactive systems is Events. The Event is a powerful tool, and not only for achieving eventual consistency. Sometimes, to facilitate and accelerate analysis of the target domain, DDD experts practice the Event Storming technique, which leads to a fully behavioral model with no underlying data model. Focusing primarily on behavior is a crucial strategy for a scalable solution (just recall how we came up with the Disconnected Model in the previous post), so Event Storming might also help shift the mindset of DDD beginners who tend to focus mostly on data models.

The key thing here is to properly understand what a Domain Event is. A Domain Event captures an occurrence that happened in the domain and that is of interest to domain experts. The domain expert is not interested in databases, web sockets, or design patterns, but in the facts of the business domain, and Domain Events capture those facts in a way that doesn't prescribe a particular implementation.
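As a sketch of what such an event might look like in Java: an immutable object, named in the past tense, carrying only the facts a consumer needs. The childId/parentId fields and the occurredOn timestamp below are illustrative assumptions; the implementation later in this post constructs the event from the whole ChildUser.

```java
import java.time.Instant;

// Sketch of an immutable Domain Event: a past-tense name recording a fact,
// with no infrastructure concerns. Field names are assumptions for illustration.
final class ChildUserWasCreatedEvent {
    private final String parentId;    // identifier of the ParentUser
    private final String childId;     // identifier of the new ChildUser
    private final Instant occurredOn; // when the fact happened in the domain

    ChildUserWasCreatedEvent(String parentId, String childId) {
        this.parentId = parentId;
        this.childId = childId;
        this.occurredOn = Instant.now();
    }

    String getParentId()   { return parentId; }
    String getChildId()    { return childId; }
    Instant getOccurredOn() { return occurredOn; }
}
```

Note that the event is a pure value object: once created, the fact it records never changes, which is what makes it safe to publish and replay.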


Let's introduce a new Domain Event into our possible implementation:
 public class UserApplication{
   ...

   @Transactional
   public void addNewChildUser(Name name, UserId parentId){
     ChildUser child = new ChildUser(name, parentId);
     this.repository.save(child);

     ChildUserWasCreatedEvent event = new ChildUserWasCreatedEvent(child);
     this.eventPublisher.publish(event);
   }
   ...
 }

Please pay attention: modifying the aggregate and publishing the event should be transactional, so that once the aggregate changes are persisted and the corresponding event is published, the underlying messaging infrastructure can guarantee delivery to a consumer. An event name is always in the past tense, since it represents a fact that already happened in our domain.
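One common way to get this guarantee is the "transactional outbox" idea: the event is stored in the same database transaction as the aggregate, and a separate dispatcher delivers it to the messaging infrastructure afterwards. A minimal in-memory sketch (all names are assumptions; a real outbox would be a database table, and the dispatcher would remove an event only after delivery is confirmed):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a transactional outbox: store() runs inside the same transaction
// that saves the aggregate; drain() is called later by a background dispatcher.
final class Outbox {
    private final List<String> pending = new ArrayList<>();

    // Called inside the transaction that persists the aggregate changes.
    void store(String serializedEvent) {
        pending.add(serializedEvent);
    }

    // Called by the dispatcher: hands the stored events to the messaging
    // infrastructure and clears them from the outbox.
    List<String> drain() {
        List<String> toDeliver = new ArrayList<>(pending);
        pending.clear();
        return toDeliver;
    }
}
```

Because the event row commits or rolls back together with the aggregate, we never publish an event for a change that was never persisted, and never lose an event for one that was.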

There may be one or more consumers of our Domain Event, and they consume it asynchronously. The task of a consumer is to execute the business rule that makes the domain model eventually consistent. If event processing fails, an appropriate re-delivery strategy must be provided.

 public class DomainEventListener{
   ...

   @Transactional
   public void handle(ChildUserWasCreatedEvent event){
     ParentUser parent = repository.find(event.getParentId());

     parent.incrementChildNumber();

     this.repository.save(parent);
   }
   ...
 }

If we store aggregate events permanently in an event log, we might notice that we can reconstruct an aggregate's state by applying all of its stored events in chronological order; thus the entire aggregate state can be stored as its event log. This technique is called Event Sourcing. Despite the obvious performance hit of fetching such an aggregate from the database, it opens enormous possibilities for the business, especially for analytics and data mining. Along with CQRS, this architectural style is well known and widely adopted in the DDD community (for more information, see Greg Young's talk).
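A tiny sketch of the replay idea, assuming events are recorded as simple strings (a real event log would store typed, serialized events with timestamps; the class and method names here are illustrative):

```java
import java.util.List;

// Event Sourcing sketch: instead of persisting childrenCounter directly,
// the ParentUser's state is rebuilt by replaying its event log in order.
final class EventSourcedParent {
    private int childrenCounter = 0;

    // Mutate state according to one recorded fact.
    void apply(String event) {
        if (event.equals("ChildUserWasCreated")) {
            childrenCounter++;
        }
    }

    // Reconstruct the aggregate from its full event history.
    static EventSourcedParent replay(List<String> eventLog) {
        EventSourcedParent parent = new EventSourcedParent();
        for (String event : eventLog) {
            parent.apply(event); // events applied in chronological order
        }
        return parent;
    }

    int getNumberOfChildren() { return childrenCounter; }
}
```

Replaying two ChildUserWasCreated events yields a counter of 2, the same state our earlier counter-based solution maintained explicitly.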
 
The Event pattern is an inevitable artifact of distributed, highly scalable systems; once we get the hang of using it, we will be addicted and wonder how we ever survived without it. It is worth mentioning that the biggest challenge here is to come up with an appropriate architecture and to choose the underlying infrastructure. The range of issues and possible solutions is too large for the blog format, so for deeper insights I would recommend reading Vaughn Vernon's Implementing Domain-Driven Design, chapters 4 and 8.


