How to deal with Netty out of heap memory leakage?

Posted by tmc01 on Fri, 24 Dec 2021 20:39:03 +0100

1, Background

During the development process, we encountered the problem of out of heap memory leakage - the service will fake death (it seems that the service process is still running, but there is no business response).
Check the error log of logback and record the following information:

[location-center:192.168.5.14:8017] [,] 2021-07-27 09:38:39.640 ERROR 53950 [lettuce-nioEventLoop-8-1] io.netty.util.ResourceLeakDetector LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records: 
Created at:
	io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:349)
	io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
	io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
	io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:139)
	io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:147)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)
	io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050)
	io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	java.lang.Thread.run(Thread.java:748)
[location-center:192.168.5.14:8017] [,] 2021-07-27 09:38:39.642 ERROR 53950 [lettuce-nioEventLoop-8-1] io.netty.util.ResourceLeakDetector LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records: 
Created at:
	io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:349)
	io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
	io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
	io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:115)
	io.netty.handler.codec.ByteToMessageDecoder.expandCumulation(ByteToMessageDecoder.java:532)
	io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:92)
	io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
	io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
	io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)
	io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050)
	io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	java.lang.Thread.run(Thread.java:748)
[location-center:192.168.5.14:8017] [,] 2021-07-27 10:23:56.884 ERROR 53950 [lettuce-nioEventLoop-8-1] io.netty.util.ResourceLeakDetector LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records: 
Created at:
	io.netty.buffer.SimpleLeakAwareByteBuf.unwrappedDerived(SimpleLeakAwareByteBuf.java:143)
	io.netty.buffer.SimpleLeakAwareByteBuf.readRetainedSlice(SimpleLeakAwareByteBuf.java:67)
	io.netty.handler.codec.DelimiterBasedFrameDecoder.decode(DelimiterBasedFrameDecoder.java:276)
	io.netty.handler.codec.DelimiterBasedFrameDecoder.decode(DelimiterBasedFrameDecoder.java:218)
	io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:503)
	io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:442)
	io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:281)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
	io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
	io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)
	io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050)
	io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	java.lang.Thread.run(Thread.java:748)

Anxious little partner, is already on the way to Baidu, right? At this time, if you choose Baidu, you choose "long way"

🎃 Originally, in the log, Netty has provided us with the address of the official document, which describes the problem in detail.

[danger] did you see this prompt in the log? “See https://netty.io/wiki/reference-counted-objects.html for more information.”

2, Netty out of heap memory recycling mechanism

What is off heap memory?

Off heap memory is also called direct memory. The ByteBuf used by Netty allocates out of heap memory, which is not controlled by the JVM garbage collection mechanism.

How about Netty's out of heap memory recycling mechanism?

Looking at the netty source code, we can find some entries for out of heap memory recycling:
io.netty.buffer.AbstractReferenceCountedByteBuf#release().
We can get a clearer understanding from the official documents, which can be summarized as follows:

  • Netty out of heap memory recycling algorithm is based on reference counting;
  • If the code block component receives the memory object based on the reference count and is no longer used, the component shall be responsible for reference release - the reference count will be - 1. For example, ChannelHandler is a typical reference object receiving component.

The official has provided cases to judge whether the release needs to be cited:

public ByteBuf a(ByteBuf input) {
    input.writeByte(42);
    return input;
}

public ByteBuf b(ByteBuf input) {
    try {
        output = input.alloc().directBuffer(input.readableBytes() + 1);
        output.writeBytes(input);
        output.writeByte(42);
        return output;
    } finally {
        input.release();
    }
}

public void c(ByteBuf input) {
    System.out.println(input);
    input.release();
}

public void main() {
    ...
    ByteBuf buf = ...;
    // This will print buf to System.out and destroy it.
    c(b(a(buf)));
    assert buf.refCnt() == 0;
}
Event descriptionWhat objects should be released?What object was released successfully?
1.main()createsbufbuf→main()
2.main()callsa()withbufbuf→a()
3.a()returnsbufmerely.buf→main()
4.main()callsb()withbufbuf→b()
5.b()returns the copy ofbufbuf→b(),copy→main()b()releasesbuf
6.main()callsc()withcopycopy→c()
7.c()swallowscopycopy→c()c()releasescopy

How to recycle?

Does the above content seem a little windy? In Netty, Bytebuf is generally used in ChannelHandler, and the official documents also provide more detailed cases.

  1. The Inbound messages read (channelRead()) process releases the case.
    For example:
public void channelRead(ChannelHandlerContext ctx, Object msg) {
    ByteBuf buf = (ByteBuf) msg;
    try {
        ...
    } finally {
        buf.release();
    }
}
  1. The message parser also generates objects that reference the count.
// Assuming your handler is placed next to `HttpRequestDecoder`
public void channelRead(ChannelHandlerContext ctx, Object msg) {
    if (msg instanceof HttpRequest) {
        HttpRequest req = (HttpRequest) msg;
        ...
    }
    if (msg instanceof HttpContent) {
        HttpContent content = (HttpContent) msg;
        try {
            ...
        } finally {
            content.release();
        }
    }
}

In the above two cases, if it is difficult to distinguish, Netty also provides a more concise way to release References:

public void channelRead(ChannelHandlerContext ctx, Object msg) {
    try {
        ...
    } finally {
        ReferenceCountUtil.release(msg);
    }
}

Of course, improper release of the reference of the out of queue message object will also cause out of heap memory leakage. For more content, please check the official documents by yourself. Zihan will no longer translate one by one.

3, Out of heap memory monitoring configuration

Whether the out of heap memory is recycled or not, how to monitor it in netty? There are objects platformdependent. In netty Class, in which "DIRECT_MEMORY_COUNTER" records the out of heap memory occupation. We can obtain the value of the object through reflection, so as to achieve the purpose of monitoring.

Reflection monitoring off heap memory

In the Springboot project, we can define a scheduled task to print the out of heap memory into the log in real time.

import io.netty.util.internal.PlatformDependent;
import lombok.extern.slf4j.Slf4j;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;

import javax.annotation.PostConstruct;
import javax.annotation.Resource;

@Configuration      //1. It is mainly used to mark configuration classes and has the effect of Component.
@EnableScheduling   // 2. Start scheduled task
@Slf4j
public class SaticScheduleTask {
    private static final int _1K = 1024;
    private static final String BUSINESS_KEY = "netty-direct-memory";
    private AtomicLong directMemory;
    
    @PostConstruct
    public void init(){
        Field field = ReflectUtil.getField(PlatformDependent.class,"DIRECT_MEMORY_COUNTER");
        field.setAccessible(true);
        try {
            directMemory = (AtomicLong) field.get(PlatformDependent.class);
        } catch (IllegalAccessException e) {
            e.printStackTrace();
        }
    }

    /**
     * Count the out of heap direct memory every 5s
     */
    @Scheduled(fixedRate = 30000)
    public void report(){
        int memoryInKb = (int) (directMemory.get()/_1K);
        log.info("{}:{}k",BUSINESS_KEY,memoryInKb);
    }
}

Monitoring out of heap memory on the server

cat location-center-info.log |grep netty-direct

The results are as follows:

[root@localhost location-center]# cat location-center-info.log |grep netty-direct
[location-center:192.168.5.12:8017] [dce11c40a9bc3f77,dce11c40a9bc3f77] 2021-08-02 09:05:42.332 INFO 29468 [pool-3-thread-1] com.keyou.evm.location.task.SaticScheduleTask netty-direct-memory:32768k
[location-center:192.168.5.12:8017] [f486d613826fab3c,f486d613826fab3c] 2021-08-02 09:06:12.285 INFO 29468 [pool-3-thread-1] com.keyou.evm.location.task.SaticScheduleTask netty-direct-memory:131072k
[location-center:192.168.5.12:8017] [a94a3c3012804430,a94a3c3012804430] 2021-08-02 09:06:42.285 INFO 29468 [pool-3-thread-1] com.keyou.evm.location.task.SaticScheduleTask netty-direct-memory:131072k
[location-center:192.168.5.12:8017] [2c1e844ae37e00e0,2c1e844ae37e00e0] 2021-08-02 09:07:12.285 INFO 29468 [pool-3-thread-1] com.keyou.evm.location.task.SaticScheduleTask netty-direct-memory:131072k
[location-center:192.168.5.12:8017] [67106bcae0fb116d,67106bcae0fb116d] 2021-08-02 09:07:42.285 INFO 29468 [pool-3-thread-1] com.keyou.evm.location.task.SaticScheduleTask netty-direct-memory:131072k
[location-center:192.168.5.12:8017] [b9a673e7118c05d7,b9a673e7118c05d7] 2021-08-02 09:08:12.285 INFO 29468 [pool-3-thread-1] com.keyou.evm.location.task.SaticScheduleTask netty-direct-memory:131072k
[location-center:192.168.5.12:8017] [ddfe5f1003ac4514,ddfe5f1003ac4514] 2021-08-02 09:08:42.285 INFO 29468 [pool-3-thread-1] com.keyou.evm.location.task.SaticScheduleTask netty-direct-memory:131072k
[location-center:192.168.5.12:8017] [cd1fbb7b9c84423d,cd1fbb7b9c84423d] 2021-08-02 09:09:12.285 INFO 29468 [pool-3-thread-1] com.keyou.evm.location.task.SaticScheduleTask netty-direct-memory:131072k
[location-center:192.168.5.12:8017] [fd897a29ec939f7f,fd897a29ec939f7f] 2021-08-02 09:09:42.285 INFO 29468 [pool-3-thread-1] com.keyou.evm.location.task.SaticScheduleTask netty-direct-memory:131072k
[location-center:192.168.5.12:8017] [48893694a422f0ae,48893694a422f0ae] 2021-08-02 09:10:12.285 INFO 29468 [pool-3-thread-1] com.keyou.evm.location.task.SaticScheduleTask netty-direct-memory:131072k

4, Other cases

There are other scenarios for Netty out of heap memory leaks. This booklet will not list them one by one. I hope some excellent articles I have read can be used as your reference.

This article has been included in From NIO to Netty, actual taxi 905.4 agreement

Topics: jvm Netty Memory Leak