Problem view like this:

14-Dec-2022 14:13:15.729 SEVERE [http-nio-10.64.130.81-8080-Acceptor-0] org.apache.tomcat.util.net.Acceptor.run Socket accept failed java.io.IOException: Too many open files at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)

Ok check fd this process

1
2
]$ ls -l /proc/26700/fd|wc -l
3989
1
2
3
4
5
6
7
8
9
10
11
]$ ls -l /proc/26700/fd
lrwx------ 1 tomcat tomcat 64 Nov  9 11:01 990 -> socket:[343227304]
lrwx------ 1 tomcat tomcat 64 Nov  9 11:01 991 -> socket:[344107160]
lrwx------ 1 tomcat tomcat 64 Nov  9 11:01 992 -> socket:[343195983]
lrwx------ 1 tomcat tomcat 64 Nov  9 11:01 993 -> socket:[343190714]
lrwx------ 1 tomcat tomcat 64 Nov  9 11:01 994 -> socket:[344436574]
lrwx------ 1 tomcat tomcat 64 Nov  9 11:01 995 -> socket:[344400363]
lrwx------ 1 tomcat tomcat 64 Nov  9 11:01 996 -> socket:[344413309]
lrwx------ 1 tomcat tomcat 64 Nov  9 11:01 997 -> socket:[344103141]
lrwx------ 1 tomcat tomcat 64 Nov  9 11:01 998 -> socket:[344099280]
lrwx------ 1 tomcat tomcat 64 Nov  9 11:01 999 -> socket:[344128665]

Ok we see many sockets open.

Same check using netstat

1
2
]$ netstat -nap|grep 26700|wc -l
3902
1
2
3
4
tcp6       0      0 1.1.1.1:47870      1.1.1.2:22        ESTABLISHED 26700/java
tcp6       0      0 1.1.1.1:39882      1.1.1.2:22        ESTABLISHED 26700/java
tcp6       0      0 1.1.1.1:42212      1.1.1.2:22        ESTABLISHED 26700/java
tcp6       0      0 1.1.1.1:52267      1.1.1.2:22        ESTABLISHED 26700/java

This strange things from theme

1
find /proc/26700/fd -lname "socket:*" -printf "\n%AD %AT %p"|sort -k1.8n -k1.1nr -k1|more

Script (for the next time):

1
2
3
4
5
6
7
8
9
10
11
12
#!/bin/bash
PID=${1}
CONNECTIONS=$(netstat -nape|grep ${PID}|grep "ESTABLISHED" | awk '{print $5","$8","$9}')
echo "total connections for pid ${PID}: $(echo ${CONNECTIONS} | wc -w)"
for cdevice in ${CONNECTIONS}; do
    info=$(echo ${cdevice}|awk -F',' '{print $1}')
    device=$(echo ${cdevice}|awk -F',' '{print $2}')
    timestamp=$(find /proc/${PID}/fd -lname "socket:\[${device}\]" -printf %T@ 2> /dev/null);
    # compute the time difference
    LANG=C printf '%s ( %.2f s ago)\n' "${info} $(date -d @$timestamp)" $(bc <<<"$(date +%s.%N) - $timestamp");
    echo;
done

Run:

1
./checktime.sh 26700

Ok we see something like this:

total connections for pid 31096: 1
10.64.130.81:22 Wed Dec 14 18:00:16 MSK 2022 ( 184.02 s ago)

But i know that problem occured two hours ago. And if we closer look at the script:

1
 -printf %T@ 

it means
%T - File’s last modification time. @ - seconds since Jan. 1, 1970, 00:00 GMT, with fractional part.

But we need to know when socket created, not modification time… for example i start ssh connection at 18:34:00

1
2
3
4
5
6
7
8
> stat /proc/11835/fd/3
  File: /proc/11835/fd/3 -> socket:[34517616]
  Size: 64              Blocks: 0          IO Block: 1024   symbolic link
Device: 0,20    Inode: 34907466    Links: 1
Access: (0700/lrwx------)  Uid: ( 1000/   peter)   Gid: ( 1000/   peter)
Access: 2022-12-14 19:59:29.887846821 +0300
Modify: 2022-12-14 19:59:29.887846821 +0300
Change: 2022-12-14 19:59:29.887846821 +0300

We can use ss for check:

lastsnd: how long time since- the last packet sent, the unit is millisecond. lastrcv: how long time since the last packet received, the unit is millisecond. lastack: how long time since the last ack received, the unit is millisecond

1
ss -Hnapei -O -T dst :22

And we can see thread life time

1
ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm,etime|grep 16953

After restart tomcat all sockets go to TIME_WAIT for

1
cat /proc/sys/net/ipv4/tcp_fin_timeout

to be continued…

Links: https://superuser.com/questions/565991/how-to-determine-the-socket-connection-up-time-on-linux

P.S

1
2
3
4
ss -nap dst :22
--info
-K 
ss -at '( dport = :ssh or sport = :ssh )'