Troubleshooting a Spring Cloud Feign request timeout exception in production

Posted by faza on Thu, 13 Jan 2022 07:36:24 +0100

Following a recent surge in online orders, a third party reported that querying and processing of some work orders was failing. Investigation showed that our system's calls to the downstream system through FeignClient were failing (the exception is posted below).

Caused by: feign.RetryableException: Read timed out executing POST http://xxxx
        at feign.FeignException.errorExecuting(FeignException.java:84) ~[feign-core-10.1.0.jar!/:na]
        at feign.SynchronousMethodHandler.executeAndDecode(SynchronousMethodHandler.java:113) ~[feign-core-10.1.0.jar!/:na]
        at feign.SynchronousMethodHandler.invoke(SynchronousMethodHandler.java:78) ~[feign-core-10.1.0.jar!/:na]
        at feign.ReflectiveFeign$FeignInvocationHandler.invoke(ReflectiveFeign.java:103) ~[feign-core-10.1.0.jar!/:na]
        at com.sun.proxy.$Proxy141.creditReportConvert(Unknown Source) ~[na:na]
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.8.0_121]
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[na:1.8.0_121]
        at java.net.SocketInputStream.read(SocketInputStream.java:171) ~[na:1.8.0_121]
        at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[na:1.8.0_121]
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) ~[na:1.8.0_121]
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) ~[na:1.8.0_121]
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345) ~[na:1.8.0_121]
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704) ~[na:1.8.0_121]
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647) ~[na:1.8.0_121]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569) ~[na:1.8.0_121]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474) ~[na:1.8.0_121]
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) ~[na:1.8.0_121]
        at feign.Client$Default.convertResponse(Client.java:143) ~[feign-core-10.1.0.jar!/:na]
        at feign.Client$Default.execute(Client.java:68) ~[feign-core-10.1.0.jar!/:na]
        at org.springframework.cloud.openfeign.ribbon.FeignLoadBalancer.execute(FeignLoadBalancer.java:93) ~[spring-cloud-openfeign-core-2.1.1.RELEASE.jar!/:2.1.1.RELEASE]
        at org.springframework.cloud.openfeign.ribbon.FeignLoadBalancer.execute(FeignLoadBalancer.java:56) ~[spring-cloud-openfeign-core-2.1.1.RELEASE.jar!/:2.1.1.RELEASE]
        at com.netflix.client.AbstractLoadBalancerAwareClient$1.call(AbstractLoadBalancerAwareClient.java:104) ~[ribbon-loadbalancer-2.3.0.jar!/:2.3.0]
        at com.netflix.loadbalancer.reactive.LoadBalancerCommand$3$1.call(LoadBalancerCommand.java:303) ~[ribbon-loadbalancer-2.3.0.jar!/:2.3.0]

The system's slow-request capture showed that the failing request had taken only 1031 milliseconds when the Read timed out error was triggered. Both this project and the downstream project are registered on Eureka. We were puzzled by a timeout at about 1 second, so we started reading the underlying source code.
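To see where this exception comes from, here is a minimal sketch that reproduces it with the JDK's HttpURLConnection, which the stack trace shows sitting underneath Feign's default client. The class name ReadTimeoutDemo, the local /slow endpoint, the 500 ms server delay and the 100 ms read timeout are all assumptions chosen for illustration; they are not values from the production system.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.net.URI;

public class ReadTimeoutDemo {

    /** Calls the URL; returns "ok", or a "timeout: ..." string if the read timeout fires. */
    static String call(String url, int readTimeoutMillis) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(1000);
            conn.setReadTimeout(readTimeoutMillis);
            conn.getResponseCode(); // blocks until headers arrive or the read timeout fires
            return "ok";
        } catch (SocketTimeoutException e) {
            return "timeout: " + e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    /** Starts a throwaway local server (hypothetical stand-in for the downstream service). */
    static HttpServer startSlowServer(long delayMillis) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/slow", exchange -> {
                try {
                    Thread.sleep(delayMillis); // simulate slow downstream processing
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                exchange.sendResponseHeaders(200, -1); // 200 with no response body
                exchange.close();
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpServer server = startSlowServer(500);
        String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/slow";
        try {
            System.out.println(call(url, 100));  // read timeout shorter than the server delay
            System.out.println(call(url, 3000)); // read timeout longer than the server delay
        } finally {
            server.stop(0);
        }
    }
}
```

The first call gives up before the server answers and the second succeeds, which is the same pattern as a downstream service that needs slightly more than 1 second while the client only waits 1 second.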

Tracing the code shows that the constructor of Options, an inner class of feign.Request, configures a connection timeout of 10 seconds and a read timeout of 60 seconds by default. However, the request was cut off with a timeout error after only about 1 second, so our preliminary conclusion was that these default timeouts were not taking effect.

Before going further, let's review the complete call chain of a Feign request.

A Feign call passes through two layers: Hystrix and Ribbon. In newer versions Hystrix is disabled by default. It is disabled in this project as well, so what needs to be analyzed is the call configuration of the Ribbon layer.

The default read timeout and connection timeout configured in RibbonClientConfiguration are both 1000 ms = 1 s. With no timeout configured explicitly, this matches the roughly 1 second at which the call above timed out.

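In the execute method of FeignLoadBalancer, when the IClientConfig override passed in is null, the request options are built from the load balancer's own connectTimeout and readTimeout fields. As the constructor below shows, those fields are populated from Ribbon, not from the default timeouts of Feign's Options.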

public FeignLoadBalancer(ILoadBalancer lb, IClientConfig clientConfig,
        ServerIntrospector serverIntrospector) {
    super(lb, clientConfig);
    this.setRetryHandler(RetryHandler.DEFAULT);
    this.clientConfig = clientConfig;
    this.ribbon = RibbonProperties.from(clientConfig);
    RibbonProperties ribbon = this.ribbon;
    this.connectTimeout = ribbon.getConnectTimeout();
    this.readTimeout = ribbon.getReadTimeout();
    this.serverIntrospector = serverIntrospector;
}

If a timeout is configured in the application.properties file, the configured timeout is used. Otherwise the default Ribbon timeout applies. In other words, the default timeout when Feign calls a service is 1 second: if the connection or the response takes longer than 1 second, the corresponding error is reported.
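The fallback described above can be sketched as a simplified model. This is not the Spring Cloud source: the class TimeoutResolution and the method effectiveReadTimeout are hypothetical names, and the real resolution goes through Spring's Feign client configuration and Ribbon's IClientConfig. The sketch only models the outcome: a configured value wins, otherwise the 1000 ms Ribbon default applies, and the 60 second default of Feign's Options is never reached.

```java
import java.util.Properties;

public class TimeoutResolution {

    // Mirrors the 1000 ms Ribbon default described in the article.
    static final int RIBBON_DEFAULT_READ_TIMEOUT_MS = 1000;

    // Default of feign.Request.Options as described in the article; shown only for contrast.
    static final int FEIGN_OPTIONS_DEFAULT_READ_TIMEOUT_MS = 60_000;

    /** Hypothetical helper: the configured value if present, otherwise the Ribbon default. */
    static int effectiveReadTimeout(Properties props) {
        String configured = props.getProperty("feign.client.config.default.readTimeout");
        if (configured != null) {
            return Integer.parseInt(configured.trim());
        }
        // Note: the fallback is Ribbon's default, not FEIGN_OPTIONS_DEFAULT_READ_TIMEOUT_MS.
        return RIBBON_DEFAULT_READ_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        Properties unconfigured = new Properties();
        System.out.println(effectiveReadTimeout(unconfigured)); // prints 1000

        Properties configured = new Properties();
        configured.setProperty("feign.client.config.default.readTimeout", "60000");
        System.out.println(effectiveReadTimeout(configured)); // prints 60000
    }
}
```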

In practice, if a downstream service's response time exceeds 1 second, set the timeouts according to its actual response time. The properties and yml versions of the configuration are posted below (values are in milliseconds).

# properties version
feign.client.config.default.connectTimeout=60000
feign.client.config.default.readTimeout=60000

# yml version
feign:
  client:
    config:
      default:
        connectTimeout: 60000
        readTimeout: 60000

 
