[20191108] Notes on the tcp_keepalive kernel parameters and sqlnet.ora expire_time.txt

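--//A few days ago, while testing 12c DCD (SQLNET.EXPIRE_TIME), I ran into an odd problem on an 11g database: setting
--//sqlnet.expire_time seemed to have no effect, and I could not tell why. Earlier tests of mine had shown that when both
--//are set, sqlnet.ora expire_time takes precedence over the kernel parameters.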

--//My situation at the time: the kernel parameters were set as follows:
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9

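--//What the parameters mean:
/proc/sys/net/ipv4/tcp_keepalive_time    With keepalive enabled, idle time in seconds before TCP sends the first keepalive probe. Default 7200 (2 hours).
/proc/sys/net/ipv4/tcp_keepalive_intvl   Interval in seconds between probes when a probe is not acknowledged. Default 75.
/proc/sys/net/ipv4/tcp_keepalive_probes  Number of unacknowledged probes sent before the connection is considered dead. Default 9.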
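--//Taken together, the three values bound how long a dead peer can go unnoticed. A small Python sketch of that arithmetic;
--//the function name keepalive_detect_seconds is made up for illustration, and the formula is the usual approximation
--//(idle time plus one interval per unanswered probe), not an exact kernel guarantee:

```python
def keepalive_detect_seconds(idle_time: int, interval: int, probes: int) -> int:
    """Approximate seconds from the last traffic until the kernel gives up on a dead peer."""
    # First probe goes out after idle_time; each unanswered probe costs one more interval.
    return idle_time + interval * probes


# Defaults listed above: 7200 / 75 / 9
print(keepalive_detect_seconds(7200, 75, 9))
# Values used in the tests below: 60 / 10 / 4
print(keepalive_detect_seconds(60, 10, 4))
```

--//With the defaults this is a little over two hours, which is why no probe is ever seen in a short test.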

--//In sqlnet.ora, SQLNET.EXPIRE_TIME was not set (the line was commented out):
$ grep -i expire sqlnet.ora
#SQLNET.EXPIRE_TIME = 1

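--//I assumed that uncommenting the line would be enough to test, but no probe packets showed up. My thinking was a bit
--//muddled at the time, so I simply restarted the database and the listener.
--//Now that I have some spare time, let's analyze what actually caused the problem.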

1. Environment:
SYS@book> @ ver1
PORT_STRING         VERSION    BANNER
------------------- ---------- ----------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

$ cat /etc/issue
Oracle Linux Server release 5.9

--//Kernel parameters are set as follows:
# echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f
/proc/sys/net/ipv4/tcp_keepalive_intvl: 10
/proc/sys/net/ipv4/tcp_keepalive_probes: 4
/proc/sys/net/ipv4/tcp_keepalive_time: 60
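--//The one-liner above just prints each tcp_keepalive file name together with its value. A hedged Python equivalent,
--//assuming a Linux /proc layout; the function name read_keepalive_settings is invented for illustration:

```python
from pathlib import Path


def read_keepalive_settings(base="/proc/sys/net/ipv4"):
    """Return {parameter_name: int_value} for every tcp_keepalive_* file under base."""
    return {
        path.name: int(path.read_text().strip())
        for path in sorted(Path(base).glob("tcp_keepalive_*"))
    }


if __name__ == "__main__":
    # Prints nothing on systems without these /proc entries.
    for name, value in read_keepalive_settings().items():
        print(f"{name}: {value}")
```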

--//SQLNET.EXPIRE_TIME = 1 is commented out:
$ grep -i expire sqlnet.ora
#SQLNET.EXPIRE_TIME = 1

2. Tests:
--//Restart the database and the listener before testing, to avoid interference.
--//Connect to the server from the client:
SCOTT@78> @ spid
SID    SERIAL# PROCESS   SERVER    SPID  PID  P_SERIAL# C50
--- ---------- --------- --------- ----- --- ---------- --------------------------------------------
 44         11 4380:8788 DEDICATED 60897  27          6 alter system kill session '44,11' immediate;

# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
....
11:24:33.891755 IP 192.168.100.78.1521 > 192.168.98.6.61888: P 7881:8419(538) ack 9769 win 330
11:24:34.090348 IP 192.168.98.6.61888 > 192.168.100.78.1521: . ack 8419 win 16290
--//After running @spid, enter no further SQL statements.
11:25:34.091591 IP 192.168.100.78.1521 > 192.168.98.6.61888: . ack 9769 win 330
11:25:34.096620 IP 192.168.98.6.61888 > 192.168.100.78.1521: . ack 8419 win 16290
11:26:34.235540 IP 192.168.100.78.1521 > 192.168.98.6.61888: . ack 9769 win 330
11:26:34.235889 IP 192.168.98.6.61888 > 192.168.100.78.1521: . ack 8419 win 16290
--//The probes are 60 seconds apart, governed by the kernel parameter net.ipv4.tcp_keepalive_time = 60.
--//Change the kernel parameter to net.ipv4.tcp_keepalive_time = 30 and continue testing:
# echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f
/proc/sys/net/ipv4/tcp_keepalive_intvl: 10
/proc/sys/net/ipv4/tcp_keepalive_probes: 4
/proc/sys/net/ipv4/tcp_keepalive_time: 30

# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
..
11:28:34.555478 IP 192.168.100.78.1521 > 192.168.98.6.61888: . ack 1403678455 win 330
11:28:34.555755 IP 192.168.98.6.61888 > 192.168.100.78.1521: . ack 1 win 16290
11:29:04.571423 IP 192.168.100.78.1521 > 192.168.98.6.61888: . ack 1 win 330
11:29:04.571802 IP 192.168.98.6.61888 > 192.168.100.78.1521: . ack 1 win 16290
11:29:34.587578 IP 192.168.100.78.1521 > 192.168.98.6.61888: . ack 1 win 330
11:29:34.587923 IP 192.168.98.6.61888 > 192.168.100.78.1521: . ack 1 win 16290
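--//The current session now sends probes every 30 seconds, so the change to net.ipv4.tcp_keepalive_time = 30 takes
--//effect immediately, even on an existing connection. The client exits and logs in again: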
11:30:22.145455 IP 192.168.100.78.1521 > 192.168.98.6.62192: P 6670:6687(17) ack 8100 win 330
11:30:22.145744 IP 192.168.98.6.62192 > 192.168.100.78.1521: . ack 6687 win 16307 <nop,nop,sack 1 {6670:6687}>
--//After logging in again, run no SQL statements.
11:30:52.145468 IP 192.168.100.78.1521 > 192.168.98.6.62192: . ack 8100 win 330
11:30:52.145822 IP 192.168.98.6.62192 > 192.168.100.78.1521: . ack 6687 win 16307
11:31:22.171459 IP 192.168.100.78.1521 > 192.168.98.6.62192: . ack 8100 win 330
11:31:22.171807 IP 192.168.98.6.62192 > 192.168.100.78.1521: . ack 6687 win 16307
--//The interval is now 30 seconds, governed by the kernel parameter net.ipv4.tcp_keepalive_time = 30.
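--//The intervals in these captures were read off by eye. A minimal Python sketch that measures them instead, assuming
--//the old-style tcpdump line format shown here; probe_intervals and its server argument are hypothetical names, and
--//the sketch does not handle a capture that crosses midnight:

```python
import re
from datetime import datetime

# Matches the leading "HH:MM:SS.ffffff IP src > dst:" part of a tcpdump line.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) IP (\S+) > (\S+):")


def probe_intervals(lines, server="192.168.100.78.1521"):
    """Return the gaps in seconds between consecutive packets sent by `server`."""
    stamps = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(2) == server:
            stamps.append(datetime.strptime(m.group(1), "%H:%M:%S.%f"))
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


capture = """\
11:30:22.145455 IP 192.168.100.78.1521 > 192.168.98.6.62192: P 6670:6687(17) ack 8100 win 330
11:30:22.145744 IP 192.168.98.6.62192 > 192.168.100.78.1521: . ack 6687 win 16307
11:30:52.145468 IP 192.168.100.78.1521 > 192.168.98.6.62192: . ack 8100 win 330
11:30:52.145822 IP 192.168.98.6.62192 > 192.168.100.78.1521: . ack 6687 win 16307
11:31:22.171459 IP 192.168.100.78.1521 > 192.168.98.6.62192: . ack 8100 win 330
11:31:22.171807 IP 192.168.98.6.62192 > 192.168.100.78.1521: . ack 6687 win 16307
""".splitlines()

print([round(s) for s in probe_intervals(capture)])
```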

--//Now edit sqlnet.ora and uncomment the line:
$ grep -i expire sqlnet.ora
SQLNET.EXPIRE_TIME = 1

# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
..
11:32:52.219456 IP 192.168.100.78.1521 > 192.168.98.6.62192: . ack 8100 win 330
11:32:52.219822 IP 192.168.98.6.62192 > 192.168.100.78.1521: . ack 6687 win 16307
11:33:22.235450 IP 192.168.100.78.1521 > 192.168.98.6.62192: . ack 8100 win 330
11:33:22.235787 IP 192.168.98.6.62192 > 192.168.100.78.1521: . ack 6687 win 16307
11:33:52.251450 IP 192.168.100.78.1521 > 192.168.98.6.62192: . ack 8100 win 330
11:33:52.251739 IP 192.168.98.6.62192 > 192.168.100.78.1521: . ack 6687 win 16307
--//The current session is not affected; the interval is still 30 seconds. The client exits and logs in again:
11:34:19.084451 IP 192.168.100.78.1521 > 192.168.98.6.62390: P 6670:6687(17) ack 8096 win 330
11:34:19.084758 IP 192.168.98.6.62390 > 192.168.100.78.1521: . ack 6687 win 16307 <nop,nop,sack 1 {6670:6687}>
--//After logging in again, run no SQL statements.
11:36:18.847551 IP 192.168.100.78.1521 > 192.168.98.6.62390: P 6687:6697(10) ack 8096 win 330
11:36:19.047149 IP 192.168.98.6.62390 > 192.168.100.78.1521: . ack 6697 win 16304
11:37:18.858073 IP 192.168.100.78.1521 > 192.168.98.6.62390: P 6697:6707(10) ack 8096 win 330
11:37:19.058238 IP 192.168.98.6.62390 > 192.168.100.78.1521: . ack 6707 win 16302
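--//The first interval is 2 minutes, then 1 minute, and the probes are now 10-byte data packets (flag P) rather than
--//empty ACKs, so what is in effect now is SQLNET.EXPIRE_TIME = 1 from sqlnet.ora.
--//This shows that when both are set, SQLNET.EXPIRE_TIME in sqlnet.ora takes precedence, and that no restart of the
--//listener or the database is needed.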
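--//So why did my earlier test go wrong? Where exactly was the problem? Thinking back carefully over that test: could
--//the connection mode be the cause? Log in again: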
>>sqlplus scott/book@192.168.100.78:1521/book
SCOTT@192.168.100.78:1521/book> @ spid
       SID    SERIAL# PROCESS                  SERVER    SPID                     PID  P_SERIAL# C50
---------- ---------- ------------------------ --------- -------------------- ------- ---------- --------------------------------------------------
       281          1 11172:3296               SHARED    60830                     20          1 alter system kill session '281,1' immediate;
--//Note: this connection is SERVER=SHARED (shared server mode), spid=60830.

# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
..
11:41:10.865890 IP 192.168.100.78.1521 > 192.168.98.6.62708: P 11528:12076(548) ack 13033 win 330
11:41:11.067793 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
--//After logging in again, run no SQL statements.
11:42:11.131426 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:42:11.131844 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
11:42:41.147547 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:42:41.147896 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
11:43:11.163563 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:43:11.163905 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
11:43:41.179536 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:43:41.179827 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
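--//The interval is 30 seconds. This is exactly what I hit in my earlier test: there net.ipv4.tcp_keepalive_time = 7200
--//was far too large, so I never saw any probe packets on the network.
--//In other words, a shared mode login is governed by the kernel parameters here, because SQLNET.EXPIRE_TIME was not
--//set when the database was started and the corresponding processes (ora_s000_book, ora_d000_book) were already running.
--//Shared mode connections inherit the settings of those processes and so keep using the kernel parameters.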

$ ps -ef | grep 6083[0]
oracle   60830     1  0 11:21 ?        00:00:00 ora_s000_book

--//Change the kernel parameter directly to net.ipv4.tcp_keepalive_time = 10. Note that the client connection stays open during the change:
# echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f
/proc/sys/net/ipv4/tcp_keepalive_intvl: 10
/proc/sys/net/ipv4/tcp_keepalive_probes: 4
/proc/sys/net/ipv4/tcp_keepalive_time: 10

# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
..
11:48:41.339466 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:48:41.339809 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
11:49:11.355484 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:49:11.355844 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
--//Parameter changed: net.ipv4.tcp_keepalive_time = 10
11:49:21.371482 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:49:21.371840 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
11:49:31.387482 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:49:31.387762 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
11:49:41.403499 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:49:41.403822 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
--//The change takes effect immediately and the client connection is not dropped. Change it again, to net.ipv4.tcp_keepalive_time = 100:
# echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f
/proc/sys/net/ipv4/tcp_keepalive_intvl: 10
/proc/sys/net/ipv4/tcp_keepalive_probes: 4
/proc/sys/net/ipv4/tcp_keepalive_time: 100

# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
..
11:53:51.803467 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:53:51.803768 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
--//Parameter changed: net.ipv4.tcp_keepalive_time = 100
11:55:31.963461 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:55:31.963853 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
11:57:12.059436 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:57:12.059884 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
--//The change takes effect immediately: the interval is 100 seconds and the client connection is not dropped. Change it again, to net.ipv4.tcp_keepalive_time = 180:
# echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f
/proc/sys/net/ipv4/tcp_keepalive_intvl: 10
/proc/sys/net/ipv4/tcp_keepalive_probes: 4
/proc/sys/net/ipv4/tcp_keepalive_time: 180

# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
..
11:57:12.059436 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
11:57:12.059884 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
--//Parameter changed: net.ipv4.tcp_keepalive_time = 180
12:00:12.283427 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
12:00:12.283809 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
12:03:12.507470 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330
12:03:12.507982 IP 192.168.98.6.62708 > 192.168.100.78.1521: . ack 12076 win 16166
--//The change takes effect immediately: the interval is 180 seconds and the client connection is not dropped.
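--//Kernel keepalive only applies to sockets that have SO_KEEPALIVE enabled. The empty-ACK probes in these captures
--//suggest the Oracle processes enable it, but that is an inference from the trace, not something verified here.
--//A minimal Python sketch (Linux assumed; none of this is Oracle code, and keepalive_settings is a made-up helper)
--//showing that a socket with no per-socket override reports the system-wide values, while TCP_KEEPIDLE overrides
--//the idle time for that one socket only:

```python
import socket


def keepalive_settings(sock):
    """Return (enabled, idle, interval, count) as the kernel reports them for sock."""
    return (
        bool(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)),
        sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE),
        sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL),
        sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT),
    )


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # No per-socket override yet: these are the net.ipv4.tcp_keepalive_* values.
    print("system-wide values:", keepalive_settings(s))
    # Override the idle time for this socket only.
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 30)
    print("with per-socket idle:", keepalive_settings(s))
```

--//A socket that relies on the system-wide values would be expected to follow later sysctl changes, which is
--//consistent with the immediate effect seen above.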

--//Now restart the listener and the database and see what happens.
--//Kernel parameters are set as follows:
# echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f
/proc/sys/net/ipv4/tcp_keepalive_intvl: 10
/proc/sys/net/ipv4/tcp_keepalive_probes: 4
/proc/sys/net/ipv4/tcp_keepalive_time: 180

$ grep -i expire sqlnet.ora
SQLNET.EXPIRE_TIME = 1

--//Restart the database and the listener (output omitted).
# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
..
12:07:09.682897 IP 192.168.100.78.1521 > 192.168.98.6.64074: P 6669:6686(17) ack 7975 win 330
12:07:09.881094 IP 192.168.98.6.64074 > 192.168.100.78.1521: . ack 6686 win 16307
--//Client logs in with sqlplus scott/book@192.168.100.78:1521/book. Note: the connection mode is shared.
12:09:09.552062 IP 192.168.100.78.1521 > 192.168.98.6.64074: P 6686:6696(10) ack 7975 win 330
12:09:09.751574 IP 192.168.98.6.64074 > 192.168.100.78.1521: . ack 6696 win 16304
12:10:09.562266 IP 192.168.100.78.1521 > 192.168.98.6.64074: P 6696:6706(10) ack 7975 win 330
12:10:09.762723 IP 192.168.98.6.64074 > 192.168.100.78.1521: . ack 6706 win 16302
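--//The first interval is 2 minutes, then 1 minute, so what is in effect now is SQLNET.EXPIRE_TIME = 1 from sqlnet.ora,
--//not the kernel parameter seen in the earlier test.
--//This again confirms that when both are set, SQLNET.EXPIRE_TIME in sqlnet.ora takes precedence.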
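--//The two mechanisms look different on the wire in these captures: kernel keepalive shows up as an empty ACK
--//(". ack ..."), while with SQLNET.EXPIRE_TIME set the server sends a 10-byte data segment ("P ...(10)").
--//A rough Python sketch that labels capture lines on that basis. The 10-byte length is only what this environment
--//shows, so dcd_probe_len is an assumption, and classify is a hypothetical helper; a genuine 10-byte data packet
--//would be mislabelled, and "bare-ack" also covers ordinary ACKs that answer data:

```python
import re

# Old-style tcpdump output: "P first:last(len)" for data segments, ". ack N win W" for bare ACKs.
DATA_RE = re.compile(r": P \d+:\d+\((\d+)\)")
BARE_ACK_RE = re.compile(r": \. ack \d+ win \d+")


def classify(line, dcd_probe_len=10):
    """Label one capture line as 'dcd-probe', 'data', 'bare-ack' or 'other'."""
    m = DATA_RE.search(line)
    if m:
        return "dcd-probe" if int(m.group(1)) == dcd_probe_len else "data"
    if BARE_ACK_RE.search(line):
        return "bare-ack"
    return "other"


sample = [
    "12:07:09.682897 IP 192.168.100.78.1521 > 192.168.98.6.64074: P 6669:6686(17) ack 7975 win 330",
    "12:09:09.552062 IP 192.168.100.78.1521 > 192.168.98.6.64074: P 6686:6696(10) ack 7975 win 330",
    "11:42:11.131426 IP 192.168.100.78.1521 > 192.168.98.6.62708: . ack 13033 win 330",
]
for line in sample:
    print(classify(line))
```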

--//Monday, back at work, continuing the tests:
# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
..
08:37:16.765108 IP 192.168.100.78.1521 > 192.168.98.6.51673: P 7520:7885(365) ack 9749 win 330
08:37:16.765893 IP 192.168.98.6.51673 > 192.168.100.78.1521: P 9749:9770(21) ack 7885 win 16425
08:37:16.766092 IP 192.168.100.78.1521 > 192.168.98.6.51673: P 7885:8426(541) ack 9770 win 330
08:37:16.968199 IP 192.168.98.6.51673 > 192.168.100.78.1521: . ack 8426 win 16289
--//Logged in in dedicated mode; run no SQL statements:
08:39:13.257561 IP 192.168.100.78.1521 > 192.168.98.6.51673: P 8426:8436(10) ack 9770 win 330
08:39:13.459806 IP 192.168.98.6.51673 > 192.168.100.78.1521: . ack 8436 win 16287
08:40:13.268072 IP 192.168.100.78.1521 > 192.168.98.6.51673: P 8436:8446(10) ack 9770 win 330
08:40:13.464091 IP 192.168.98.6.51673 > 192.168.100.78.1521: . ack 8446 win 16284
--//The first interval is 2 minutes, then 1 minute, so SQLNET.EXPIRE_TIME = 1 from sqlnet.ora is in effect.
--//Change SQLNET.EXPIRE_TIME to 2 in sqlnet.ora without disconnecting.
08:41:13.278271 IP 192.168.100.78.1521 > 192.168.98.6.51673: P 8446:8456(10) ack 9770 win 330
08:41:13.480887 IP 192.168.98.6.51673 > 192.168.100.78.1521: . ack 8456 win 16282
08:42:13.288492 IP 192.168.100.78.1521 > 192.168.98.6.51673: P 8456:8466(10) ack 9770 win 330
08:42:13.502150 IP 192.168.98.6.51673 > 192.168.100.78.1521: . ack 8466 win 16279
--//After changing SQLNET.EXPIRE_TIME to 2 in sqlnet.ora, the interval of the existing connection does not change.
--//Now exit and log in to the database again.
# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
..
08:43:02.135091 IP 192.168.100.78.1521 > 192.168.98.6.52047: P 6671:6688(17) ack 8097 win 330
08:43:02.332762 IP 192.168.98.6.52047 > 192.168.100.78.1521: . ack 6688 win 16307
--//Logged in in dedicated mode; run no SQL statements:
08:47:01.947602 IP 192.168.100.78.1521 > 192.168.98.6.52047: P 6688:6698(10) ack 8097 win 330
08:47:02.144894 IP 192.168.98.6.52047 > 192.168.100.78.1521: . ack 6698 win 16304
08:49:01.967818 IP 192.168.100.78.1521 > 192.168.98.6.52047: P 6698:6708(10) ack 8097 win 330
08:49:02.164352 IP 192.168.98.6.52047 > 192.168.100.78.1521: . ack 6708 win 16302
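--//The first interval is 4 minutes, then 2 minutes: SQLNET.EXPIRE_TIME = 2 only takes effect after logging in again.
--//This differs from the kernel parameters, where a change takes effect immediately without reconnecting.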

3. Further tests:
--//One more test to verify my earlier conclusion about shared mode.
--//Kernel parameters are set as follows:
# echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f
/proc/sys/net/ipv4/tcp_keepalive_intvl: 10
/proc/sys/net/ipv4/tcp_keepalive_probes: 4
/proc/sys/net/ipv4/tcp_keepalive_time: 20

--//SQLNET.EXPIRE_TIME = 1 is commented out:
$ grep -i expire sqlnet.ora
#SQLNET.EXPIRE_TIME = 1

--//Restart the database and the listener.
--//Connect to the server from the client with sqlplus scott/book@192.168.100.78:1521/book; the connection mode is shared.

# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
...
09:01:06.792451 IP 192.168.100.78.1521 > 192.168.98.6.54322: P 7986:8534(548) ack 9650 win 330
09:01:06.792837 IP 192.168.98.6.54322 > 192.168.100.78.1521: . ack 8534 win 16166 <nop,nop,sack 1 {7986:8534}>
--//Connected in shared mode with sqlplus scott/book@192.168.100.78:1521/book; run no SQL statements.
09:01:26.795411 IP 192.168.100.78.1521 > 192.168.98.6.54322: . ack 9650 win 330
09:01:26.795686 IP 192.168.98.6.54322 > 192.168.100.78.1521: . ack 8534 win 16166
09:01:46.811467 IP 192.168.100.78.1521 > 192.168.98.6.54322: . ack 9650 win 330
09:01:46.811802 IP 192.168.98.6.54322 > 192.168.100.78.1521: . ack 8534 win 16166
--//The interval is 20 seconds. Now set SQLNET.EXPIRE_TIME = 1 in sqlnet.ora:
$ grep -i expire sqlnet.ora
SQLNET.EXPIRE_TIME = 1
--//Keep watching...
09:02:06.843472 IP 192.168.100.78.1521 > 192.168.98.6.54322: . ack 9650 win 330
09:02:06.843830 IP 192.168.98.6.54322 > 192.168.100.78.1521: . ack 8534 win 16166
09:02:26.875433 IP 192.168.100.78.1521 > 192.168.98.6.54322: . ack 9650 win 330
09:02:26.875828 IP 192.168.98.6.54322 > 192.168.100.78.1521: . ack 8534 win 16166
09:02:46.907427 IP 192.168.100.78.1521 > 192.168.98.6.54322: . ack 9650 win 330
09:02:46.907868 IP 192.168.98.6.54322 > 192.168.100.78.1521: . ack 8534 win 16166
--//The interval is still 20 seconds. Exit and log in again, still in shared mode:
..
09:03:34.487441 IP 192.168.100.78.1521 > 192.168.98.6.54513: P 6736:6763(27) ack 7976 win 330
09:03:34.487686 IP 192.168.98.6.54513 > 192.168.100.78.1521: . ack 6763 win 16228 <nop,nop,sack 1 {6736:6763}>
09:03:54.487425 IP 192.168.100.78.1521 > 192.168.98.6.54513: . ack 7976 win 330
09:03:54.487734 IP 192.168.98.6.54513 > 192.168.100.78.1521: . ack 6763 win 16228

SCOTT@192.168.100.78:1521/book> @ spid
       SID    SERIAL# PROCESS                  SERVER    SPID                     PID  P_SERIAL# C50
---------- ---------- ------------------------ --------- -------------------- ------- ---------- --------------------------------------------------
       281          3 9496:10444               SHARED    23039                     20          1 alter system kill session '281,3' immediate;
--//spid=23039
$ ps -ef | grep 2303[9]
oracle   23039     1  0 09:00 ?        00:00:00 ora_s000_book

$ kill -9 23039

--//Running the script in the shared mode session stalls briefly:
SCOTT@192.168.100.78:1521/book> @ spid
       SID    SERIAL# PROCESS                  SERVER    SPID                     PID  P_SERIAL# C50
---------- ---------- ------------------------ --------- -------------------- ------- ---------- --------------------------------------------------
       281          3 9496:10444               SHARED    23106                     20          2 alter system kill session '281,3' immediate;
--//spid=23106: it has changed, which is roughly equivalent to logging in again.
# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
...
09:05:55.995425 IP 192.168.100.78.1521 > 192.168.98.6.54513: . ack 11341 win 330
09:05:55.995742 IP 192.168.98.6.54513 > 192.168.100.78.1521: . ack 10305 win 16166
--//Run no SQL statements.
09:06:16.027444 IP 192.168.100.78.1521 > 192.168.98.6.54513: . ack 11341 win 330
09:06:16.027734 IP 192.168.98.6.54513 > 192.168.100.78.1521: . ack 10305 win 16166
09:06:36.059440 IP 192.168.100.78.1521 > 192.168.98.6.54513: . ack 11341 win 330
09:06:36.059751 IP 192.168.98.6.54513 > 192.168.100.78.1521: . ack 10305 win 16166
09:06:56.091432 IP 192.168.100.78.1521 > 192.168.98.6.54513: . ack 11341 win 330
09:06:56.091779 IP 192.168.98.6.54513 > 192.168.100.78.1521: . ack 10305 win 16166
--//Still 20 seconds. Now kill both the s000 and d000 processes.
$ ps -ef | grep ora_[sd]000
oracle   23037     1  0 09:00 ?        00:00:00 ora_d000_book
oracle   23106     1  0 09:05 ?        00:00:00 ora_s000_book

$ kill -9 23106 23037

# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
...
09:08:15.909145 IP 192.168.100.78.1521 > 192.168.98.6.54513: F 10305:10305(0) ack 11341 win 330
09:08:15.909449 IP 192.168.98.6.54513 > 192.168.100.78.1521: . ack 10306 win 16166

--//The original connection has been dropped. Log in again in shared mode:
SCOTT@192.168.100.78:1521/book> @ spid
       SID    SERIAL# PROCESS                  SERVER    SPID                     PID  P_SERIAL# C50
---------- ---------- ------------------------ --------- -------------------- ------- ---------- --------------------------------------------------
       267          1 7376:11148               SHARED    23132                     19          2 alter system kill session '267,1' immediate;

$ ps -ef | grep ora_[sd]000
oracle   23132     1  0 09:08 ?        00:00:00 ora_s000_book
oracle   23134     1  0 09:08 ?        00:00:00 ora_d000_book

# tcpdump -i eth0  host 192.168.98.6 and not port 22 and port 1521 -nnn
...
09:08:47.862829 IP 192.168.100.78.1521 > 192.168.98.6.54899: P 7986:8534(548) ack 9649 win 330
09:08:48.059474 IP 192.168.98.6.54899 > 192.168.100.78.1521: . ack 8534 win 16166
--//Run no SQL statements.
09:10:43.827496 IP 192.168.100.78.1521 > 192.168.98.6.54899: P 8534:8544(10) ack 9649 win 330
09:10:44.027790 IP 192.168.98.6.54899 > 192.168.100.78.1521: . ack 8544 win 16164
09:11:43.837701 IP 192.168.100.78.1521 > 192.168.98.6.54899: P 8544:8554(10) ack 9649 win 330
09:11:44.047966 IP 192.168.98.6.54899 > 192.168.100.78.1521: . ack 8554 win 16161
09:12:43.847884 IP 192.168.100.78.1521 > 192.168.98.6.54899: P 8554:8564(10) ack 9649 win 330
09:12:44.040166 IP 192.168.98.6.54899 > 192.168.100.78.1521: . ack 8564 win 16159
--//Only now is SQLNET.EXPIRE_TIME = 1 from sqlnet.ora used.

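4. Additional notes on shared mode versus dedicated mode:
--//I have been tripped up many times by shared versus dedicated connections. My test environment has always been configured like this: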
SCOTT@192.168.100.78:1521/book> show parameter dispatchers
NAME            TYPE     VALUE
--------------- -------- --------------------------------------
dispatchers     string   (PROTOCOL=TCP) (SERVICE=book,bookXDB)
max_dispatchers integer

SYS@book> show parameter service
NAME          TYPE   VALUE
------------- ------ ---------------
service_names string BOOK, BOOKSHARE

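--//The service name book supports both connection modes.
--//I recommend not using the same service name for shared and dedicated connections; keep them separate.
--//Some applications use a default connect string, i.e. the tnsnames.ora entry has no (SERVER = SHARED|DEDICATED) clause.
--//If the service name supports both modes, shared mode is chosen by preference. The same holds for ezconnect when no
--//connection mode is specified. To specify it explicitly, write:
--//sqlplus scott/book@192.168.100.78:1521/book:DEDICATED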
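--//To avoid forgetting the suffix, the connect string can be built by a small helper. A Python sketch; the function
--//name ezconnect is invented here, and it deliberately accepts only the two server types discussed in this note:

```python
def ezconnect(host, service, port=1521, server=None):
    """Build an EZConnect string of the form host:port/service[:server]."""
    if server is not None and server.upper() not in ("DEDICATED", "SHARED"):
        raise ValueError(f"unsupported server type: {server!r}")
    target = f"{host}:{port}/{service}"
    # Without an explicit server type the listener decides, which here means shared.
    return target if server is None else f"{target}:{server.upper()}"


print(ezconnect("192.168.100.78", "book"))
print(ezconnect("192.168.100.78", "book", server="dedicated"))
```

--//The second string is the one to pass to sqlplus when a dedicated server is required.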


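5. Summary:
--//This write-up is rather long and messy. My earlier problem was that I connected with ezconnect while the service
--//name happened to support both modes, so shared mode was chosen by preference.
--//That created the false impression that setting expire_time=1 in sqlnet.ora had no effect.
--//When both are set, sqlnet.ora expire_time takes precedence.
--//I personally favor setting the kernel parameter net.ipv4.tcp_keepalive_time = 590 and not setting expire_time in sqlnet.ora.
--//The value 590 comes from my own test, see: http://blog.itpub.net/267265/viewspace-2150614/ => [20180129]测量网络断开时间.txt (Measuring network disconnect time)
--//A kernel parameter change takes effect immediately, whereas a change to expire_time in sqlnet.ora only applies to new logins.
--//Beware of shared mode: if expire_time was not set in sqlnet.ora when the instance started, a database restart may be
--//needed for it to take effect (in test 3, killing the s000 and d000 processes was enough).

--//Finally, the problem I ran into is not important in itself. But if you don't dig into such things, you easily miss a chance to learn.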
