The Law of Demeter: achieving "high cohesion and loose coupling"

Posted by thewooleymammoth on Fri, 11 Feb 2022 05:05:34 +0100

What is "high cohesion and loose coupling"?
How can the Law of Demeter be used to achieve "high cohesion and loose coupling"?
What kinds of code design clearly violate the Law of Demeter, and how should they be refactored?

What is "high cohesion and loose coupling"?

"High cohesion and loose coupling" is a very important design idea. It effectively improves the readability and maintainability of code and narrows the scope of the code changes caused by a change in functionality. In fact, many design principles aim to achieve it, such as the single responsibility principle and programming to an interface rather than an implementation.

It is also a general design idea that can guide the design and development of code at different granularities, such as systems, modules, classes, and even functions. It applies to different development scenarios as well, such as microservices, frameworks, components, and class libraries. To keep the explanation simple, I will take the "class" as the unit to which this design idea is applied. You can work out the other scenarios by analogy.

In this design idea, "high cohesion" guides the design of a class itself, and "loose coupling" guides the design of the dependencies between classes. The two are not completely independent, however: high cohesion contributes to loose coupling, and loose coupling in turn relies on high cohesion.

What exactly is "high cohesion"?

High cohesion means that similar functions should be placed in the same class, and unrelated functions should not. Similar functions are often modified together; when they live in the same class, the changes are concentrated in one place and the code is easier to maintain. The single responsibility principle is a very effective design principle for achieving high cohesion.
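As a rough illustration of the paragraph above, here is a minimal Java sketch. The class names (OrderPricing, OrderReceiptFormatter, CohesionSketch) and the flat 10% tax rate are hypothetical and exist only for this example. Pricing logic, which tends to change together, sits in one class; formatting, which changes for different reasons, sits in another.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical example: everything needed to price an order lives in one class.
final class OrderPricing {
    private static final long TAX_PERCENT = 10; // assumed flat rate, illustration only

    static long subtotal(List<Long> itemPricesInCents) {
        long sum = 0;
        for (long price : itemPricesInCents) {
            sum += price;
        }
        return sum;
    }

    static long total(List<Long> itemPricesInCents) {
        long net = subtotal(itemPricesInCents);
        return net + net * TAX_PERCENT / 100;
    }
}

// Formatting changes for different reasons than pricing, so it is kept separate.
final class OrderReceiptFormatter {
    static String format(long totalInCents) {
        return String.format(Locale.ROOT, "TOTAL: %d.%02d",
                totalInCents / 100, totalInCents % 100);
    }
}

final class CohesionSketch {
    public static void main(String[] args) {
        long total = OrderPricing.total(List.of(1000L, 500L));
        System.out.println(OrderReceiptFormatter.format(total));
    }
}
```

If the tax rule changes, only OrderPricing needs to be touched; if the receipt layout changes, only OrderReceiptFormatter does.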

Next, what is "loose coupling"?

Loose coupling means that the dependencies between classes are simple and clear. Even when one class depends on another, a code change in one of them rarely or never forces a code change in the other. Dependency injection, interface segregation, programming to an interface rather than an implementation, and the Law of Demeter all serve to achieve loose coupling.
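To make this concrete, the following is a minimal sketch of loose coupling through dependency injection and programming to an interface. All names here (MessageSender, ConsoleSender, RecordingSender, ReportService) are hypothetical and do not come from the examples later in this article.

```java
import java.util.ArrayList;
import java.util.List;

// The only thing ReportService knows about its collaborator.
interface MessageSender {
    void send(String message);
}

// One possible implementation: print to standard output.
final class ConsoleSender implements MessageSender {
    @Override
    public void send(String message) {
        System.out.println(message);
    }
}

// Another implementation, handy in tests: remember what was sent.
final class RecordingSender implements MessageSender {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String message) {
        sent.add(message);
    }
}

final class ReportService {
    private final MessageSender sender; // injected, never created with new here

    ReportService(MessageSender sender) {
        this.sender = sender;
    }

    void publish(String title) {
        sender.send("Report ready: " + title);
    }
}

final class LooseCouplingSketch {
    public static void main(String[] args) {
        new ReportService(new ConsoleSender()).publish("Q1");
    }
}
```

Swapping ConsoleSender for RecordingSender, or changing the internals of either, requires no change to ReportService, which is the property the paragraph above describes.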

Finally, let's look at the relationship between "cohesion" and "coupling".

As mentioned earlier, "high cohesion" contributes to "loose coupling". Conversely, "low cohesion" leads to "tight coupling". I drew a comparison chart to explain this. The code structure in the left part of the figure is "high cohesion and loose coupling"; the right part is just the opposite, "low cohesion and tight coupling".

In the design in the left part of the figure, the classes are relatively fine-grained and each class has a single responsibility. Similar functions are put into one class, while unrelated functions are divided among several classes. As a result, each class is more independent and the code is more cohesive. Because each class has a single responsibility, fewer classes depend on it and the coupling is low. A change to one class affects only a single dependent class, so we only need to test whether that one dependent class still works properly.

In the design in the right part of the figure, the classes are relatively coarse-grained and cohesion is low: each class tries to do everything, and unrelated functions are put into the same class. This causes many other classes to depend on it. When we modify the code for one function of this class, every class that depends on it is affected, and we need to test whether all three dependent classes still work. This is the proverbial "pull one hair and the whole body moves".

In addition, we can see from the figure that a code structure with high cohesion and loose coupling is simpler and clearer. Accordingly, it is much better in maintainability and readability.

The Law of Demeter is abbreviated as LoD. From the name alone, we can't guess what this principle says. However, it has another, more expressive name: the Least Knowledge Principle.

As for this design principle, let's take a look at its original English definition:

Each unit should have only limited knowledge about other units: only units "closely" related to the current unit. Or: Each unit should only talk to its friends; Don't talk to strangers.

Paraphrased, this reads as follows:

Each module (unit) should have only limited knowledge of other modules (units): only of those "closely" related to it. In other words, each module talks only to its own friends and does not talk to strangers.

Most design principles and ideas are very abstract and open to various interpretations. Applying them flexibly in real development takes accumulated practical experience, and the Law of Demeter is no exception. Therefore, drawing on my own understanding and experience, I will restate the definition above. Note that, for consistency, I have replaced "module" in the definition with "class".

There should be no dependencies between classes that should not depend on each other directly; between classes that do depend on each other, try to rely only on the necessary interfaces (that is, the "limited knowledge" in the definition).

Theoretical interpretation and code practice I

Let's first look at the first half of this principle, "there should be no dependencies between classes that should not have direct dependencies". Let me give an example to explain.

This example implements a simplified version of a search engine's web crawling function. The code contains three main classes. The NetworkTransporter class is responsible for the underlying network communication and fetches data for a given request; the HtmlDownloader class fetches web pages by URL; Document represents a web document, which is the object of the subsequent content extraction, word segmentation, and indexing. The code is as follows:

public class NetworkTransporter {
  // Properties and other methods omitted
  public byte[] send(HtmlRequest htmlRequest) {
    //...
  }
}

public class HtmlDownloader {
  private NetworkTransporter transporter; // Injected via constructor or IoC

  public Html downloadHtml(String url) {
    byte[] rawHtml = transporter.send(new HtmlRequest(url));
    return new Html(rawHtml);
  }
}

public class Document {
  private Html html;
  private String url;
  
  public Document(String url) {
    this.url = url;
    HtmlDownloader downloader = new HtmlDownloader();
    this.html = downloader.downloadHtml(url);
  }
  //...
}

Although this code "works" and provides the functionality we want, it is not "pleasant to use" and has many design defects.

First, let's look at the NetworkTransporter class. As a low-level network communication class, it should be as general as possible rather than serving only to download HTML. Therefore, it should not depend directly on HtmlRequest, a request object that is too specific. From this point of view, the design of the NetworkTransporter class violates the Law of Demeter: it depends on the HtmlRequest class, on which it should have no direct dependency.

How can we refactor the NetworkTransporter class so that it satisfies the Law of Demeter? Here is a vivid metaphor. When you go shopping, you certainly don't hand your wallet to the cashier and let the cashier take the money out of it; you take the money out of the wallet yourself and hand it to the cashier. The HtmlRequest object here is the wallet, and the address and content objects inside HtmlRequest are the money. We should give the address and content to the NetworkTransporter instead of giving it the HtmlRequest itself. Following this idea, the refactored NetworkTransporter code is as follows:

public class NetworkTransporter {
  // Properties and other methods omitted
  public byte[] send(String address, byte[] data) {
    //...
  }
}

Now let's look at the HtmlDownloader class. There is nothing wrong with the design of this class itself. However, because we changed the signature of NetworkTransporter's send() function and this class calls send(), it has to be updated accordingly. The modified code is as follows:

public class HtmlDownloader {
  private NetworkTransporter transporter; // Injected via constructor or IoC

  // HtmlDownloader must be updated to match the new send() signature
  public Html downloadHtml(String url) {
    HtmlRequest htmlRequest = new HtmlRequest(url);
    byte[] rawHtml = transporter.send(
      htmlRequest.getAddress(), htmlRequest.getContent().getBytes());
    return new Html(rawHtml);
  }
}
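The wallet metaphor used above for NetworkTransporter can itself be sketched in code. This is only an illustration under assumed names: Wallet, Cashier, Customer, and their methods are hypothetical and are not part of the crawler example.

```java
// Hypothetical classes illustrating the wallet metaphor.
final class Wallet {
    private int cents;

    Wallet(int cents) {
        this.cents = cents;
    }

    // The owner takes money out; nobody else reaches into the wallet.
    int take(int amount) {
        if (amount > cents) {
            throw new IllegalArgumentException("not enough money");
        }
        cents -= amount;
        return amount;
    }

    int balance() {
        return cents;
    }
}

final class Cashier {
    private int till;

    // The cashier accepts only money and never sees the Wallet type,
    // just as send(address, data) never sees HtmlRequest.
    void accept(int amountInCents) {
        till += amountInCents;
    }

    int till() {
        return till;
    }
}

final class Customer {
    private final Wallet wallet;

    Customer(Wallet wallet) {
        this.wallet = wallet;
    }

    void pay(Cashier cashier, int priceInCents) {
        cashier.accept(wallet.take(priceInCents));
    }
}

final class WalletSketch {
    public static void main(String[] args) {
        Wallet wallet = new Wallet(1000);
        Cashier cashier = new Cashier();
        new Customer(wallet).pay(cashier, 250);
        System.out.println(cashier.till() + " " + wallet.balance());
    }
}
```

Because Cashier has no compile-time dependency on Wallet, the wallet's internals can change without touching the cashier, which mirrors what removing HtmlRequest from send() achieves.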

Finally, let's look at the Document class. This class has quite a few problems, three in particular. First, the call to downloader.downloadHtml() in the constructor involves complex logic and takes a long time to run. It should not be in the constructor, because that hurts the testability of the code. We will discuss testability later; for now it is enough to be aware of it. Second, the HtmlDownloader object is created with new inside the constructor, which violates the idea of programming to an interface rather than an implementation and also hurts testability. Third, in terms of business meaning, a Document (a web document) has no need to depend on the HtmlDownloader class, so this dependency violates the Law of Demeter.

Although the Document class has many problems, it is relatively simple to modify, and all problems can be solved with one change. The modified code is as follows:

public class Document {
  private Html html;
  private String url;
  
  public Document(String url, Html html) {
    this.html = html;
    this.url = url;
  }
  //...
}

// Create a Document through a factory method
public class DocumentFactory {
  private HtmlDownloader downloader;
  
  public DocumentFactory(HtmlDownloader downloader) {
    this.downloader = downloader;
  }
  
  public Document createDocument(String url) {
    Html html = downloader.downloadHtml(url);
    return new Document(url, html);
  }
}
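One consequence of this refactoring is that a Document can be built without any network access. The sketch below tries to show that. The Html and Document bodies are minimal stand-ins written for this example (the article omits them), and the accessor names getUrl(), getHtml(), and size() are assumptions.

```java
import java.nio.charset.StandardCharsets;

// Minimal stand-in for Html; the real class is not shown in the article.
final class Html {
    private final byte[] raw;

    Html(byte[] raw) {
        this.raw = raw;
    }

    int size() {
        return raw.length;
    }
}

// Same constructor shape as the refactored Document above,
// with hypothetical accessors added for the demonstration.
final class Document {
    private final Html html;
    private final String url;

    Document(String url, Html html) {
        this.html = html;
        this.url = url;
    }

    String getUrl() {
        return url;
    }

    Html getHtml() {
        return html;
    }
}

final class DocumentSketch {
    public static void main(String[] args) {
        // No HtmlDownloader and no network: canned bytes are enough.
        Html canned = new Html("<html></html>".getBytes(StandardCharsets.UTF_8));
        Document doc = new Document("http://example.com", canned);
        System.out.println(doc.getUrl() + " " + doc.getHtml().size());
    }
}
```

In the original design, constructing a Document always triggered a download, so a unit test like this would not have been possible without a live network.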

Theoretical interpretation and code practice II

Now, let's take a look at the second half of this principle: "between classes that do depend on each other, try to rely only on the necessary interfaces". Again, let's explain it with an example. The following code is very simple: the Serialization class is responsible for serializing and deserializing objects. As a reminder, a similar example appeared earlier, in lesson 15; you can read the two together.

public class Serialization {
  public String serialize(Object object) {
    String serializedResult = ...;
    //...
    return serializedResult;
  }
  
  public Object deserialize(String str) {
    Object deserializedResult = ...;
    //...
    return deserializedResult;
  }
}

Looking at this class in isolation, there is nothing wrong with its design. However, once we put it into a particular application scenario, there is room for further improvement. Suppose that in our project some classes use only serialization, while others use only deserialization. According to the second half of the Law of Demeter, "between classes that do depend on each other, try to rely only on the necessary interfaces", the classes that use only serialization should not depend on the deserialization interface. Similarly, the classes that use only deserialization should not depend on the serialization interface.

Following this idea, we should split the Serialization class into two finer-grained classes, one responsible only for serialization (the Serializer class) and the other only for deserialization (the Deserializer class). After the split, the classes that use serialization only need to depend on the Serializer class, and the classes that use deserialization only need to depend on the Deserializer class. The code after splitting is as follows:

public class Serializer {
  public String serialize(Object object) {
    String serializedResult = ...;
    ...
    return serializedResult;
  }
}

public class Deserializer {
  public Object deserialize(String str) {
    Object deserializedResult = ...;
    ...
    return deserializedResult;
  }
}

You may have noticed that although the split code satisfies the Law of Demeter better, it violates the design idea of high cohesion. High cohesion requires similar functions to be placed in the same class, so that when a function is modified, the changes are not scattered. In this example, if we change the serialization implementation, say from JSON to XML, the deserialization logic has to change as well. Without the split, we only need to modify one class; after the split, we need to modify two. Clearly, this design has enlarged the scope of code changes.

If we want to violate neither high cohesion nor the Law of Demeter, how can we solve this problem? In fact, it can be solved easily by introducing two interfaces. The code is as follows. The third example in the discussion of the "interface segregation principle" uses a similar approach; you can read the two together.

public interface Serializable {
  String serialize(Object object);
}

public interface Deserializable {
  Object deserialize(String text);
}

public class Serialization implements Serializable, Deserializable {
  @Override
  public String serialize(Object object) {
    String serializedResult = ...;
    ...
    return serializedResult;
  }
  
  @Override
  public Object deserialize(String str) {
    Object deserializedResult = ...;
    ...
    return deserializedResult;
  }
}

public class DemoClass_1 {
  private Serializable serializer;
  
  public DemoClass_1(Serializable serializer) {
    this.serializer = serializer;
  }
  //...
}

public class DemoClass_2 {
  private Deserializable deserializer;
  
  public DemoClass_2(Deserializable deserializer) {
    this.deserializer = deserializer;
  }
  //...
}

Although we still pass the Serialization implementation class, which contains both serialization and deserialization, into DemoClass_1's constructor, the Serializable interface that DemoClass_1 depends on contains only the serialization operation. DemoClass_1 cannot use the deserialization interface of the Serialization class and is unaware of the deserialization operation, which satisfies the requirement in the second half of the Law of Demeter to "rely only on limited interfaces".
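The point can be checked with a small, self-contained sketch. The serialize() and deserialize() bodies below are placeholders (String.valueOf and an identity return), not real serialization logic, and the save() method on DemoClass_1 is a hypothetical addition for the demonstration.

```java
// Note: this Serializable is the article's own interface,
// not java.io.Serializable (which is never imported here).
interface Serializable {
    String serialize(Object object);
}

interface Deserializable {
    Object deserialize(String text);
}

final class Serialization implements Serializable, Deserializable {
    @Override
    public String serialize(Object object) {
        return String.valueOf(object); // placeholder, not a real format
    }

    @Override
    public Object deserialize(String text) {
        return text; // placeholder, not a real parser
    }
}

final class DemoClass_1 {
    private final Serializable serializer;

    DemoClass_1(Serializable serializer) {
        this.serializer = serializer;
    }

    // Hypothetical method added for the sketch.
    String save(Object value) {
        // serializer.deserialize("...") would not compile here:
        // the Serializable interface does not declare it.
        return serializer.serialize(value);
    }
}

final class NarrowInterfaceSketch {
    public static void main(String[] args) {
        // The full Serialization object is passed in,
        // but DemoClass_1 sees it only as a Serializable.
        DemoClass_1 demo = new DemoClass_1(new Serialization());
        System.out.println(demo.save(42));
    }
}
```

The implementation stays in one cohesive class, while each client's compile-time view is limited to the interface it declares.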

Dialectical thinking and flexible application

Do you have a different view on the final design in code practice II?

The whole class contains only two operations, serialization and deserialization. For users that need only serialization, there is no real harm in also being able to see a single deserialization function. Splitting such a simple class into two interfaces just to satisfy the Law of Demeter: isn't that a little over-designed?

A design principle is neither right nor wrong in itself; what matters is whether it is applied appropriately. Don't apply design principles for their own sake. When applying them, we must analyze the specific problem at hand.

The Serialization class above contains only two operations, so there is really no need to split it into two interfaces. But suppose we add more to the Serialization class, giving it a richer set of serialization and deserialization functions. Let's reconsider the question then. The modified code is as follows:

import java.util.List;
import java.util.Map;

public class Serialization { // See the JSON interface definition for reference
  public String serialize(Object object) { /* ... */ }
  public String serializeMap(Map map) { /* ... */ }
  public String serializeList(List list) { /* ... */ }
  
  public Object deserialize(String objectString) { /* ... */ }
  public Map deserializeMap(String mapString) { /* ... */ }
  public List deserializeList(String listString) { /* ... */ }
}

In this scenario, the second design (the one with two interfaces) is better. Given the application scenario described earlier, most of the code needs only serialization. These users have no need for any "knowledge" of deserialization, yet in the modified Serialization class that "knowledge" has grown from one function to three. Whenever any deserialization operation changes, we have to check and test whether all the code that depends on the Serialization class still works. To reduce coupling and the testing workload, we should separate the deserialization and serialization functions, as the Law of Demeter suggests.
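Applied to the larger class, the two-interface approach might look like the following sketch. The method bodies are placeholders rather than real serialization logic, the wildcard generic parameters are a choice made for this sketch, and the AuditLog client is hypothetical.

```java
import java.util.List;
import java.util.Map;

interface Serializable {
    String serialize(Object object);
    String serializeMap(Map<?, ?> map);
    String serializeList(List<?> list);
}

interface Deserializable {
    Object deserialize(String objectString);
    Map<?, ?> deserializeMap(String mapString);
    List<?> deserializeList(String listString);
}

// One cohesive implementation class; all bodies are placeholders.
final class Serialization implements Serializable, Deserializable {
    @Override public String serialize(Object object) { return String.valueOf(object); }
    @Override public String serializeMap(Map<?, ?> map) { return String.valueOf(map); }
    @Override public String serializeList(List<?> list) { return String.valueOf(list); }

    @Override public Object deserialize(String objectString) { return objectString; }
    @Override public Map<?, ?> deserializeMap(String mapString) { return Map.of(); }
    @Override public List<?> deserializeList(String listString) { return List.of(); }
}

// Hypothetical serialization-only client: none of the three
// deserialization methods is visible through its dependency.
final class AuditLog {
    private final Serializable serializer;

    AuditLog(Serializable serializer) {
        this.serializer = serializer;
    }

    String record(List<?> events) {
        return "audit:" + serializer.serializeList(events);
    }
}

final class SplitInterfacesSketch {
    public static void main(String[] args) {
        AuditLog log = new AuditLog(new Serialization());
        System.out.println(log.record(List.of("a", "b")));
    }
}
```

With this shape, a change to any of the deserialization signatures cannot break AuditLog at compile time, so the set of classes to recheck is limited to those that declare a Deserializable dependency.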

Topics: Design Pattern
