Code refactoring: how long is a "long" method?

Posted by DEVILofDARKNESS on Tue, 07 Dec 2021 15:19:36 +0100

Whenever we meet a long function, we are forced to:

Understand the entire long function
Carefully locate the logic we need inside it and adjust it as required

Almost all programmers have a similar experience. No one likes long functions, but you have to deal with all kinds of long functions all the time.

Functions that run to hundreds or even thousands of lines are obviously long, and they are not even the worst you will meet.

How long is "long"?

100 lines? That tolerance for function length is far too high, and it is the key reason long functions get written.

When reading concrete code, you need to be able to see the subtle details. Just as with tasks, the smaller the better, and this applies to code too. As your tolerance for code length drops, your sensitivity to detail rises, and you start to notice the problems hidden in those so-called details.

"The smaller the better" is a goal to pursue. However, without a specific number, there is no way to constrain everyone's behavior. So in practice we still need to define an upper limit on the number of lines, so that everyone can follow the same standard.

For a statically typed language with relatively weak expressiveness, such as Java, aim to solve the problem within 20 lines of code per function.

This should not remain a standard that is only talked about; we should turn it into one that is enforced. For example, in Java, we can add a line-count constraint to the CheckStyle configuration file:

<module name="MethodLength">
    <property name="tokens" value="METHOD_DEF"/>
    <property name="max" value="20"/>
    <property name="countEmpty" value="false"/>
</module>

In this way, long functions can be detected by executing local build scripts before we submit code.
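To make the effect of `max` and `countEmpty` more concrete, here is a rough sketch of the kind of counting such a rule performs. It is not how CheckStyle is implemented; `MethodLengthSketch`, `countLines`, and `exceeds` are hypothetical names, and the method source is passed in as plain text to keep the example small.

```java
import java.util.Arrays;

// Hypothetical illustration only: a real checker works on parsed source,
// while this sketch simply counts lines in a method's source text.
class MethodLengthSketch {

    // Counts the lines of a method; blank lines are skipped when countEmpty is false.
    static long countLines(String methodSource, boolean countEmpty) {
        return Arrays.stream(methodSource.split("\n", -1))
                .filter(line -> countEmpty || !line.trim().isEmpty())
                .count();
    }

    // True when the method is longer than the configured maximum.
    static boolean exceeds(String methodSource, int max, boolean countEmpty) {
        return countLines(methodSource, countEmpty) > max;
    }

    public static void main(String[] args) {
        String method = "void greet() {\n"
                + "    String name = \"world\";\n"
                + "\n"
                + "    System.out.println(name);\n"
                + "}";
        System.out.println(countLines(method, false));  // 4 non-blank lines
        System.out.println(exceeds(method, 20, false)); // false: within the limit
    }
}
```

With a limit of 20, this five-line method passes; lowering the limit to 3 would flag it.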

Even an upper limit of 20 lines is stricter than many people expect. The exact number can be set according to the team's actual situation, but it should not be too large. As I said earlier, if you set it to 100 lines, the number means very little and has no binding effect on the team.

What if individual lines in the function are long? Should they be wrapped? Wrapping increases the number of lines, while not wrapping forces you to keep moving the horizontal scroll bar when reading the code. The answer is to count by line of code rather than by physical line.

How long functions are produced

Limiting function length is a simple and crude solution. More importantly, you should recognize that a long function is itself a result. If you don't understand what causes long functions, it is still hard to write clean code.

In the name of performance

C, which is regarded today as a high-performance programming language, was itself once criticized for poor performance, especially for the cost of function calls.

To some people who wrote assembly language, calling a function involves pushing onto and popping off the stack, which is obviously slower than executing the instructions directly. This idea has been passed down to this day, and every new language still gets questioned for the same reason.

Therefore, many people believe that writing long functions is justified by performance. However, this view is untenable today. Performance optimization should not be the first consideration when writing code:

An actively developed programming language is itself constantly optimized; both its compiler and its runtime keep getting faster
Maintainability should take priority over performance optimization. Only when performance fails to meet the need do we measure, find the bottleneck, and optimize that specific spot. This is more focused and more meaningful than worrying about so-called performance while writing the code.
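As a rough illustration of "measure first", the sketch below computes the same sum twice: once with everything inlined in one loop, and once with the detail moved into a small helper function. All names (`MeasureFirstSketch`, `sumInline`, `sumWithHelper`, `squareMod7`, `timeNanos`) are made up for this example. The timing is deliberately crude; the numbers depend on the machine and the JIT, and a dedicated harness such as JMH is the more reliable choice for real measurements.

```java
// Hypothetical sketch: compares an inlined loop with one that calls a small helper.
class MeasureFirstSketch {

    // Written "for performance": all logic sits inside the loop.
    static long sumInline(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i % 7;
        }
        return total;
    }

    // Written for readability: the detail lives in a small named function.
    static long sumWithHelper(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += squareMod7(i);
        }
        return total;
    }

    static long squareMod7(int i) {
        return (long) i * i % 7;
    }

    // Crude wall-clock timing: elapsed nanoseconds for a single run of the task.
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 5_000_000;
        // Warm up both versions so the JIT has a chance to compile them.
        for (int i = 0; i < 5; i++) {
            sumInline(n);
            sumWithHelper(n);
        }
        System.out.println("inline: " + timeNanos(() -> sumInline(n)) + " ns");
        System.out.println("helper: " + timeNanos(() -> sumWithHelper(n)) + " ns");
    }
}
```

Both versions return the same result, so the readable one can be kept unless a measurement like this, on the real workload, shows that the function call is actually the bottleneck.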

Flat, straightforward narration

Flat, straightforward code writes down every little step as it comes to mind, one after another. For example, take the following code (if you don't want to read it carefully, you can skip ahead):

public void executeTask() {
    ObjectMapper mapper = new ObjectMapper();
    CloseableHttpClient client = HttpClients.createDefault();
    List<Chapter> chapters = this.chapterService.getUntranslatedChapters();
    for (Chapter chapter : chapters) {
        // Send Chapter
        SendChapterRequest sendChapterRequest = new SendChapterRequest();
        sendChapterRequest.setTitle(chapter.getTitle());
        sendChapterRequest.setContent(chapter.getContent());


        HttpPost sendChapterPost = new HttpPost(sendChapterUrl);
        CloseableHttpResponse sendChapterHttpResponse = null;
        String chapterId = null;
        try {
            String sendChapterRequestText = mapper.writeValueAsString(sendChapterRequest);
            sendChapterPost.setEntity(new StringEntity(sendChapterRequestText));
            sendChapterHttpResponse = client.execute(sendChapterPost);
            HttpEntity sendChapterEntity = sendChapterHttpResponse.getEntity();
            SendChapterResponse sendChapterResponse = mapper.readValue(sendChapterEntity.getContent(), SendChapterResponse.class);
            chapterId = sendChapterResponse.getChapterId();
        } catch (IOException e) {
            throw new RuntimeException(e);
        } finally {
            try {
                if (sendChapterHttpResponse != null) {
                    sendChapterHttpResponse.close();
                }
            } catch (IOException e) {
                // ignore
            }
        }


        // Translate Chapter
        HttpPost translateChapterPost = new HttpPost(translateChapterUrl);
        CloseableHttpResponse translateChapterHttpResponse = null;
        try {
            TranslateChapterRequest translateChapterRequest = new TranslateChapterRequest();
            translateChapterRequest.setChapterId(chapterId);
            String translateChapterRequestText = mapper.writeValueAsString(translateChapterRequest);
            translateChapterPost.setEntity(new StringEntity(translateChapterRequestText));
            translateChapterHttpResponse = client.execute(translateChapterPost);
            HttpEntity translateChapterEntity = translateChapterHttpResponse.getEntity();
            TranslateChapterResponse translateChapterResponse = mapper.readValue(translateChapterEntity.getContent(), TranslateChapterResponse.class);
            if (!translateChapterResponse.isSuccess()) {
                logger.warn("Fail to start translate: {}", chapterId);
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        } finally {
            if (translateChapterHttpResponse != null) {
                try {
                    translateChapterHttpResponse.close();
                } catch (IOException e) {
                    // ignore
                }
            }
        }
    }
}

This code sends the untranslated chapters to the translation engine and then starts the translation process.

The translation engine is a separate service, so requests have to be sent to it over HTTP. Relatively speaking, this code is fairly straightforward: once you know the logic just described, it is easy to follow.

The main reason the code is so long is that all of that logic is laid out flat in one place. Some of it is business logic, such as sending chapters to the translation engine and then starting the translation. Some of it is processing detail, such as converting an object to JSON and sending it through the HTTP client.

From this code, we can see two typical problems of flat, straightforward code:

Multiple business processes are implemented in one function
Different levels of detail are mixed in one function

Here, sending a chapter and starting its translation are two separate processes, which can obviously live in two different functions. By applying the Extract Function refactoring, we can break this seemingly huge function apart, and the resulting functions are much smaller, as follows:

public void executeTask() {
    ObjectMapper mapper = new ObjectMapper();
    CloseableHttpClient client = HttpClients.createDefault();
    List<Chapter> chapters = this.chapterService.getUntranslatedChapters();
    for (Chapter chapter : chapters) {
        String chapterId = sendChapter(mapper, client, chapter);
        translateChapter(mapper, client, chapterId);
    }
}

What was extracted is essentially the process of packaging an object and sending it. Take sending a chapter as an example. Here is the extracted sendChapter:

private String sendChapter(final ObjectMapper mapper,
                           final CloseableHttpClient client,
                           final Chapter chapter) {
    SendChapterRequest request = asSendChapterRequest(chapter);


    CloseableHttpResponse response = null;
    String chapterId = null;
    try {
        HttpPost post = sendChapterRequest(mapper, request);
        response = client.execute(post);
        // Read the chapter ID from the HTTP response, not from the request
        chapterId = asChapterId(mapper, response);
    } catch (IOException e) {
        throw new RuntimeException(e);
    } finally {
        try {
            if (response != null) {
                response.close();
            }
        } catch (IOException e) {
            // ignore
        }
    }
    return chapterId;
}

// Builds the HTTP POST whose body is the request serialized as JSON
private HttpPost sendChapterRequest(final ObjectMapper mapper, final SendChapterRequest sendChapterRequest) throws JsonProcessingException, UnsupportedEncodingException {
    HttpPost post = new HttpPost(sendChapterUrl);
    String requestText = mapper.writeValueAsString(sendChapterRequest);
    post.setEntity(new StringEntity(requestText));
    return post;
}

// Parses the JSON body of the HTTP response and extracts the chapter ID
private String asChapterId(final ObjectMapper mapper, final CloseableHttpResponse httpResponse) throws IOException {
    String chapterId;
    HttpEntity entity = httpResponse.getEntity();
    SendChapterResponse response = mapper.readValue(entity.getContent(), SendChapterResponse.class);
    chapterId = response.getChapterId();
    return chapterId;
}

// Maps the domain object to the request object
private SendChapterRequest asSendChapterRequest(final Chapter chapter) {
    SendChapterRequest request = new SendChapterRequest();
    request.setTitle(chapter.getTitle());
    request.setContent(chapter.getContent());
    return request;
}

This code is still not very neat, but it is at least simpler than before. We used only the most basic refactoring technique, Extract Function, to split one large function into several small ones.

Long functions often hide a naming problem as well. In the refactored sendChapter, the variable names are noticeably shorter than before, so the code costs less to understand. Because the variables live in a short context, there are far fewer naming conflicts, and names can naturally be shorter.

The key problem with flat, straightforward code is that it does not separate different things. From a design perspective, "separation of concerns" has not been done well: things at different levels are mixed together, including different business processes and different levels of processing. In my column "The Beauty of Software Design", I also said that the more concerns you separate out, the better, and the finer the granularity, the better.
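One possible way to picture this separation is sketched below, under stated assumptions: the `TranslationEngine` interface, the simplified `Chapter` class, `translateAll`, and `FakeEngine` are all hypothetical names invented for the illustration, not part of the code above. The business flow talks only in business terms, while JSON and HTTP details would live behind the interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: every name here is invented for illustration.
class SeparationSketch {

    static class Chapter {
        final String title;
        final String content;

        Chapter(String title, String content) {
            this.title = title;
            this.content = content;
        }
    }

    // Business-level view of the translation engine; JSON and HTTP stay behind it.
    interface TranslationEngine {
        String sendChapter(Chapter chapter);        // returns the chapter ID
        boolean startTranslation(String chapterId); // true if translation started
    }

    // Business flow only: send each chapter, start its translation, collect failures.
    static List<String> translateAll(List<Chapter> chapters, TranslationEngine engine) {
        List<String> failedIds = new ArrayList<>();
        for (Chapter chapter : chapters) {
            String chapterId = engine.sendChapter(chapter);
            if (!engine.startTranslation(chapterId)) {
                failedIds.add(chapterId);
            }
        }
        return failedIds;
    }

    // Stand-in engine so the sketch runs without a network; it rejects "ch-2".
    static class FakeEngine implements TranslationEngine {
        private int nextId = 1;

        @Override
        public String sendChapter(Chapter chapter) {
            return "ch-" + nextId++;
        }

        @Override
        public boolean startTranslation(String chapterId) {
            return !"ch-2".equals(chapterId);
        }
    }

    public static void main(String[] args) {
        List<Chapter> chapters = Arrays.asList(
                new Chapter("One", "..."),
                new Chapter("Two", "..."),
                new Chapter("Three", "..."));
        System.out.println(translateAll(chapters, new FakeEngine())); // prints [ch-2]
    }
}
```

A real implementation of the interface would hold the object-to-JSON conversion and the HTTP calls, so each level of detail can be read and changed on its own.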

Add a little at a time

Sometimes a piece of code is not long to begin with, like the following code, which handles errors based on the returned error code:

if (code == 400 || code == 401) {
  // Do some error handling
}

Then, new requirements come, new error codes are added, and it becomes like this:

if (code == 400 || code == 401 || code == 402) {
  // Do some error handling
}

This code has been modified many times. Over time:

if (code == 400 || code == 401 || code == 402 || ...
  || code == 500 || ...
  || ...
  || code == 10000 || ...) {
}

Whoever inherits this code will want to curse when they see it. No code can withstand this kind of unconscious accumulation. No individual did anything wrong, yet the end result is terrible. To fight this gradual decay, you need to know the "Boy Scout Rule": leave the campground cleaner than you found it.

Robert Martin brought this rule into the programming field. We should check whether our change makes the existing code worse and, if it does, improve it. The precondition is that you can tell whether your change has made the code worse, which is why it is necessary to learn about code smells.
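As one small example of leaving the code cleaner, the growing condition could be given a name and its codes moved into a set when the next code is added. This is only a sketch: the grouping, the name `isClientError`, and the returned strings are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: the grouping and the names are for illustration only.
class ErrorCodeSketch {

    // Adding a new code now means adding one element, not one more "||" clause.
    private static final Set<Integer> CLIENT_ERRORS =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList(400, 401, 402)));

    // The condition has a name, so the call site says what is being checked.
    static boolean isClientError(int code) {
        return CLIENT_ERRORS.contains(code);
    }

    static String handle(int code) {
        if (isClientError(code)) {
            // Do some error handling
            return "client error handled";
        }
        return "no error handling needed";
    }

    public static void main(String[] args) {
        System.out.println(handle(401)); // prints client error handled
        System.out.println(handle(200)); // prints no error handling needed
    }
}
```

The `if` statement stays one line long no matter how many codes are added later.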

So far, we have seen several common reasons why functions grow long:

In the name of performance
Flat, straightforward narration
Add a little at a time

Code growing longer is an unconscious problem. The people writing it do not feel that they are damaging the code. But once you recognize that a long function is a bad smell, many of the subsequent problems surface naturally. As for the solution, you have already seen that in most cases it is to split the code into small functions.

Summary

No one wants to read long functions, but many people inadvertently write long functions.

For a team, a key point is to define the standard for what counts as a long function. A standard that is too loose is meaningless. To control function size effectively, a few dozen lines is the most the limit should allow, and the lower the limit, the better.

Causes of long functions:

Performance as an excuse
Flat, straightforward code. This is the most common reason functions grow long. The code sprawls because multiple business processes are written together and code at different levels is written together. The root cause is that "separation of concerns" is not done well
Each person adds a little at a time. The main remedy is to stick to the "Boy Scout Rule", but what really supports it is a deep understanding of code smells

Write the function as short as possible.
