Making sense of Tuxedo's SCANUNIT, SANITYSCAN, and BLOCKTIME
Time accounting is strange in Oracle Tuxedo. First, there is the SCANUNIT
parameter which must be a value between 0
and 60
seconds and also a multiple of 2
or 5
(10
by default). Then the SANITYSCAN
to perform system health checks and BLOCKTIME
for blocking timeouts are expressed as the multipliers of that SCANUNIT
. One of the side-effects of that is timeouts are rarely exact. Most often they are off by up to SCANUNIT
seconds.
That was one of the design decisions that made no sense for me until I understood how it is implemented. Health checks and blocking timeouts are implemented by the BBL
server. We can see how it works by tracing the system calls:
$ strace -p `pidof BBL`
msgrcv(32769, 0x1fdc038, 4712, -1073741824, 0) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
alarm(10) = 0
rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
alarm(0) = 10
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0
semop(0, [{0, -1, SEM_UNDO}], 1) = 0
semop(0, [{0, 1, SEM_UNDO}], 1) = 0
rt_sigaction(SIGALRM, {sa_handler=0x4106fd, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f5d921ecb30}, NULL, 8) = 0
alarm(10) = 0
msgrcv(32769, 0x1fdc038, 4712, -1073741824, 0) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
alarm(10) = 0
rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
alarm(0) = 10
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0
semop(0, [{0, -1, SEM_UNDO}], 1) = 0
semop(0, [{0, 1, SEM_UNDO}], 1) = 0
rt_sigaction(SIGALRM, {sa_handler=0x4106fd, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f5d921ecb30}, NULL, 8) = 0
alarm(10) = 0
msgrcv(32769, 0x1fdc038, 4712, -1073741824, 0) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
So the BBL
server is using old-school multi-tasking. It is waiting for incoming service requests on the request queue (msgrcv
system call) for the .TMIB
service. But before doing so, it sets up the timer using alarm(10)
which will deliver the SIGALRM
to the process in 10 seconds. And 10
is the value of SCANUNIT
in this case. Every time SIGALRM
is delivered to the BBL
process, it checks for blocking timeouts, and once in a while checks if server processes are still alive by calling kill
with a signal equal to 0
.
Oracle Tuxedo 12c Release 1 (12.1.1) introduced millisecond granularity for SCANUNIT
. Adding the MS
suffix to the value allows specifying a SCANUNIT
between 0 and 30,000 milliseconds. It means that crashed servers can be restarted faster and timeouts have greater accuracy. But the algorithm stays the same. Instead of using the alarm
function call the setitimer
function call is used, but that is the only difference. And finally by using 1000MS
as the SCANUNIT
we can express BLOCKTIME
as the number of seconds and things start to make more sense.