Settings related to TCP keepalive in Postgresql

Posted by chemoautotroph on Tue, 22 Feb 2022 10:54:46 +0100

Database connection description

When it comes to TCP keepalive related settings, you can't avoid the topic of database connection. At present, in most use environments, database connections are long connections, that is, connections can be reused.

Because the database connection is different from HTTP connection, HTTP is stateless, and the cost of creating a new connection is relatively small, but the cost of establishing a connection with the database is much higher, because the database connection is not stateless; For example, when the connection is closed, the open transaction, temporary table and prepare statement will be lost. Moreover, the connection process is forked by the main process of postgres, and forking also has corresponding costs. Therefore, it is normal for the database to have some idle sessions, but it is not good to keep a large number of idle sessions for too long.

Version 14 introduces idle_session_timeout parameter, which can be set before. After the set time, the database will close the idle connection. Previous versions can be used pg_timeout Plug in to achieve the same effect. But it will also close the normal idle connections we want to keep, so setting TCP keepalive is a better solution.

What is TCP keepalive?

Keepalive is a functional configuration of TCP protocol. When so is set on TCP network socket_ With the keepalive option, the timer will start running when the socket is idle. When the keepalive idle time is exceeded and the socket has no further activity, the kernel will send a "keepalive packet" to the client. If the client answers, it is considered that the connection is good, and the timer starts running again.

If there is no response, the kernel will wait for the keepalive interval before sending another keepalive packet. Repeat this process until the number of keepalive messages sent reaches the keepalive count. After that, the connection is considered to have failed, and attempting to use a network socket will result in an error.

Note that it is the operating system kernel that sends the keepalive message, not the application (database server or client). The application does not know this process.

keepalive has two purposes:

  1. Keep the network not idle. As above, send keepalive packet s periodically
  2. Check whether the other communication terminal has been disconnected, such as abnormal connection failure caused by power failure. Through this setting, the network connection can be closed.

System and database related parameter settings

Operating system kernel parameters:

tcp_keepalive_intvl (integer; default: 75; since Linux 2.4)
       The number of seconds between TCP keep-alive probes.

tcp_keepalive_probes (integer; default: 9; since Linux 2.2)
       The  maximum  number  of  TCP  keep-alive  probes  to send before giving up and killing the connection if no
       response is obtained from the other end.

tcp_keepalive_time (integer; default: 7200; since Linux 2.2)
       The number of seconds a connection needs to be idle before TCP begins sending out keep-alive probes.   Keep-
       alives  are  sent only when the SO_KEEPALIVE socket option is enabled.  The default value is 7200 seconds (2
       hours).  An idle connection is terminated after approximately an additional 11 minutes (9 probes an interval
       of 75 seconds apart) when keep-alive is enabled.

The parameters can be found under / proc/sys/net/ipv4

#When programming, the three parameters correspond to the above three parameters
TCP_KEEPCNT(keepalive Count: after continuously sending probe packets, if there is no response after reaching the count setting, the connection is considered to have failed) cover  tcp_keepalive_probes,Default 9 (Times)
TCP_KEEPIDLE(keepalive Idle time, beyond this setting, send packet) cover tcp_keepalive_time,Default 7200 (seconds)
TCP_KEEPINTVL(keepalive Interval time,send out packet after,No response, wait for the interval and send again) cover tcp_keepalive_intvl,Default 75 (seconds)
# query
cat /proc/sys/net/ipv4/tcp_keepalive_time  
or
sysctl net.ipv4.tcp_keepalive_time
#modify
sysctl net.ipv4.tcp_keepalive_time=3600
 Or modify/etc/sysctl.conf file

After knowing the above introduction, let's take a look at the parameter settings of postgresql database

# see "man 7 tcp" for details

tcp_keepalives_idle = 60                # TCP_KEEPIDLE, in seconds;
                                        # 0 selects the system default
tcp_keepalives_interval = 20            # TCP_KEEPINTVL, in seconds;
                                        # 0 selects the system default
tcp_keepalives_count = 10               # TCP_KEEPCNT;
                                        # 0 selects the system default

If both are set to 0, the setting of system kernel parameters will be used. Obviously, we use the default settings of the system. It takes too long to detect invalid connections, so I need to set parameters at the database level to detect invalid connections in a shorter time.

Here are my settings so that invalid connections can be detected in less than 5 minutes

tcp_keepalives_idle = 60                # TCP_KEEPIDLE, in seconds;
tcp_keepalives_interval = 20            # TCP_KEEPINTVL, in seconds;
tcp_keepalives_count = 10               # TCP_KEEPCNT;

In addition, the PG14 version also introduces the client_ connection_ check_ The interval parameter is used to detect whether the client is offline at regular intervals. If it is offline, the running query will be terminated quickly to prevent the connection from failing, but the query is still being executed and returned to the client, wasting database resources. The default is 0, and the default unit is milliseconds.

reference resources:
https://www.cybertec-postgresql.com/en/tcp-keepalive-for-a-better-postgresql-experience/
https://www.postgresql.org/docs/14/runtime-config-connection.html

Topics: Database PostgreSQL TCP/IP