On Netty's BlockingOperationException Problem

Posted by Sergeant on Tue, 01 Oct 2019 23:53:00 +0200

This post records a deadlock exception I ran into. It occurs when the client disconnects and then reconnects.

The reconnection code is similar to this:

	private void reconnect() {
		if (bootstrap == null)
			throw new IllegalArgumentException("bootstrap cannot be null");
		// Reconnect only if there is no active connection.
		if (channel == null || !channel.isActive()) {
			// Connect and block until the attempt completes.
			channel = bootstrap.connect().awaitUninterruptibly().channel();
		}
	}

It throws an exception like this:

io.netty.util.concurrent.BlockingOperationException: DefaultChannelPromise@40edf502(incomplete)
	at io.netty.util.concurrent.DefaultPromise.checkDeadLock(DefaultPromise.java:386)
	at io.netty.channel.DefaultChannelPromise.checkDeadLock(DefaultChannelPromise.java:159)
	at io.netty.util.concurrent.DefaultPromise.awaitUninterruptibly(DefaultPromise.java:236)
	at io.netty.channel.DefaultChannelPromise.awaitUninterruptibly(DefaultChannelPromise.java:137)
	at io.netty.channel.DefaultChannelPromise.awaitUninterruptibly(DefaultChannelPromise.java:30)

Naturally, the next step is to look at the source code:

//DefaultPromise.java
    public Promise<V> awaitUninterruptibly() {
        if (this.isDone()) {
            return this;
        } else {
            this.checkDeadLock();
            boolean interrupted = false;
            synchronized(this) {
                while(!this.isDone()) {
                    this.incWaiters();

                    try {
                        this.wait();
                    } catch (InterruptedException var9) {
                        interrupted = true;
                    } finally {
                        this.decWaiters();
                    }
                }
            }

            if (interrupted) {
                Thread.currentThread().interrupt();
            }

            return this;
        }
    }


//DefaultPromise.java
    protected void checkDeadLock() {
        EventExecutor e = this.executor();
        if (e != null && e.inEventLoop()) {
            throw new BlockingOperationException(this.toString());
        }
    }

//AbstractEventExecutor.java
    public boolean inEventLoop() {
        return this.inEventLoop(Thread.currentThread());
    }

//SingleThreadEventExecutor.java
    public boolean inEventLoop(Thread thread) {
        return thread == this.thread;
    }

It turns out that the check determines whether the current thread (the one calling awaitUninterruptibly) is the event loop thread of the promise's executor, that is, the thread responsible for completing the promise. If it is, the exception is thrown.

...

A single thread cannot both wait on the promise and be the one that notifies it, so throwing the exception here is the correct behavior.
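To make the check concrete, here is a minimal JDK-only sketch of the same idea. It is not Netty's code: MiniLoop, MiniPromise and the method names are made up for illustration, and IllegalStateException stands in for BlockingOperationException.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical model of the deadlock check; names are illustrative, not Netty's.
class DeadlockCheckSketch {

    // Plays the role of a single-threaded event loop that remembers its own thread.
    static final class MiniLoop {
        private volatile Thread thread;
        private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "mini-loop");
            t.setDaemon(true);
            thread = t;
            return t;
        });

        boolean inEventLoop() {
            return Thread.currentThread() == thread;
        }

        void execute(Runnable task) {
            executor.execute(task);
        }

        // Runs the task on the loop thread and hands back its result.
        <T> T call(Callable<T> task) {
            try {
                return executor.submit(task).get(5, TimeUnit.SECONDS);
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }

        void shutdown() {
            executor.shutdownNow();
        }
    }

    // Plays the role of a promise that is completed by work running on its loop.
    static final class MiniPromise {
        private final MiniLoop loop;
        private final CountDownLatch done = new CountDownLatch(1);

        MiniPromise(MiniLoop loop) {
            this.loop = loop;
        }

        void complete() {
            done.countDown();
        }

        void awaitUninterruptibly() {
            if (done.getCount() == 0) {
                return; // already done, nothing to wait for
            }
            if (loop.inEventLoop()) {
                // The waiting thread is the one that would have to complete us.
                throw new IllegalStateException("await on the loop thread would deadlock");
            }
            boolean interrupted = false;
            while (true) {
                try {
                    done.await();
                    break;
                } catch (InterruptedException e) {
                    interrupted = true;
                }
            }
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Awaits from inside the loop thread: the check rejects the call.
    static String awaitOnLoopThread() {
        MiniLoop loop = new MiniLoop();
        MiniPromise promise = new MiniPromise(loop);
        try {
            return loop.call(() -> {
                try {
                    promise.awaitUninterruptibly();
                    return "completed";
                } catch (IllegalStateException e) {
                    return "rejected";
                }
            });
        } finally {
            loop.shutdown();
        }
    }

    // Awaits from a different thread while the loop completes the promise.
    static String awaitOnOtherThread() {
        MiniLoop loop = new MiniLoop();
        MiniPromise promise = new MiniPromise(loop);
        try {
            loop.execute(promise::complete);
            promise.awaitUninterruptibly();
            return "completed";
        } finally {
            loop.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("from loop thread:  " + awaitOnLoopThread());
        System.out.println("from other thread: " + awaitOnOtherThread());
    }
}
```

The only difference between the two calls is which thread does the waiting, which is exactly what inEventLoop() looks at.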

After debugging for a while, I found that the problem only occurred on an older machine. That pointed to hardware performance, or more specifically to the number of threads.

    bootstrap.group(new NioEventLoopGroup())

//MultithreadEventLoopGroup.java
    private static final int DEFAULT_EVENT_LOOP_THREADS = Math.max(1, SystemPropertyUtil.getInt("io.netty.eventLoopThreads", NettyRuntime.availableProcessors() * 2));

Seeing this, I forced the group down to a single event loop thread:

    bootstrap.group(new NioEventLoopGroup(1))

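With a single thread, reconnection hit the exception 100% of the time, so the blame lands on the hardware: fewer cores mean fewer default event loop threads, which makes it more likely that the new channel is served by the very thread that is calling reconnect().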
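A rough way to see why the thread count matters is the sketch below. It rests on two assumptions that are not shown in the source above: that the group hands new channels to its event loops in round-robin order (my understanding of the default), and that reconnect() always runs on loop 0 while other connections keep advancing the counter. The class and method names are hypothetical.

```java
// Hypothetical model: how often does a round-robin pick land on the caller's own loop?
class RoundRobinCollisionSketch {

    // Counts how many of `attempts` consecutive round-robin picks select `callerLoop`.
    static int collisions(int loops, int callerLoop, int attempts) {
        int hits = 0;
        for (int counter = 0; counter < attempts; counter++) {
            int chosen = counter % loops; // assumed round-robin assignment
            if (chosen == callerLoop) {
                hits++; // the new channel would share the caller's thread
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int attempts = 16;
        for (int loops : new int[] {1, 2, 4, 16}) {
            System.out.println(loops + " loop(s): "
                    + collisions(loops, 0, attempts) + " of " + attempts
                    + " picks land on the caller's loop");
        }
    }
}
```

Under these assumptions a single loop collides on every attempt, while a larger group collides only occasionally, which would match the problem showing up reliably with NioEventLoopGroup(1) and only sporadically on the machine with few cores.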

The root cause is the await operation: any await called from an event loop thread risks a BlockingOperationException. For example, the problem some experts describe with calling only ctx.write and never flush comes down to await internally as well. In addition, set the number of event loop threads explicitly where possible.
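One way to avoid awaiting on the event loop is to react to completion in a callback. In Netty that would mean registering a listener on the ChannelFuture returned by connect() with addListener and assigning channel inside it. The sketch below shows the pattern with JDK classes only, so it runs without Netty: connectAsync and the "channel-1" value are hypothetical stand-ins, and CompletableFuture.whenComplete plays the role of the listener.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical JDK-only analog of a listener-based reconnect; not Netty code.
class CallbackReconnectSketch {

    // Stand-in for an asynchronous connect whose result is produced on the loop thread.
    static CompletableFuture<String> connectAsync(Executor loop) {
        return CompletableFuture.supplyAsync(() -> "channel-1", loop);
    }

    // Starts a reconnect from inside the loop thread without blocking it.
    static String reconnectFromLoop() {
        ExecutorService loop = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<String> result = new CompletableFuture<>();
            loop.execute(() -> {
                // We are on the loop thread here. Blocking on the future at this point
                // would never return, because the work that completes it is queued
                // behind this very task. Register a callback and return instead.
                connectAsync(loop).whenComplete((channel, error) -> {
                    if (error != null) {
                        result.completeExceptionally(error);
                    } else {
                        result.complete(channel);
                    }
                });
            });
            // Only the outside (non-loop) thread waits for the outcome.
            return result.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            loop.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("reconnected: " + reconnectFromLoop());
    }
}
```

The loop thread never waits here, so the outcome does not depend on how many event loop threads the machine happens to get.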

Topics: Java Netty