Making sense of Tuxedo's SCANUNIT, SANITYSCAN, and BLOCKTIME

21 Jun 2021

Time accounting is strange in Oracle Tuxedo. First, there is the SCANUNIT parameter which must be a value between 0 and 60 seconds and also a multiple of 2 or 5 (10 by default). Then the SANITYSCAN to perform system health checks and BLOCKTIME for blocking timeouts are expressed as the multipliers of that SCANUNIT. One of the side-effects of that is timeouts are rarely exact. Most often they are off by up to SCANUNIT seconds.

SCANUNIT

That was one of the design decisions that made no sense for me until I understood how it is implemented. Health checks and blocking timeouts are implemented by the BBL server. We can see how it works by tracing the system calls:

$ strace -p `pidof BBL`
msgrcv(32769, 0x1fdc038, 4712, -1073741824, 0) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
alarm(10)                               = 0
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
alarm(0)                                = 10
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0
semop(0, [{0, -1, SEM_UNDO}], 1)        = 0
semop(0, [{0, 1, SEM_UNDO}], 1)         = 0
rt_sigaction(SIGALRM, {sa_handler=0x4106fd, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f5d921ecb30}, NULL, 8) = 0
alarm(10)                               = 0
msgrcv(32769, 0x1fdc038, 4712, -1073741824, 0) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
alarm(10)                               = 0
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
alarm(0)                                = 10
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0
semop(0, [{0, -1, SEM_UNDO}], 1)        = 0
semop(0, [{0, 1, SEM_UNDO}], 1)         = 0
rt_sigaction(SIGALRM, {sa_handler=0x4106fd, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f5d921ecb30}, NULL, 8) = 0
alarm(10)                               = 0
msgrcv(32769, 0x1fdc038, 4712, -1073741824, 0) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---

So the BBL server is using old-school multi-tasking. It is waiting for incoming service requests on the request queue (msgrcv system call) for the .TMIB service. But before doing so, it sets up the timer using alarm(10) which will deliver the SIGALRM to the process in 10 seconds. And 10 is the value of SCANUNIT in this case. Every time SIGALRM is delivered to the BBL process, it checks for blocking timeouts, and once in a while checks if server processes are still alive by calling kill with a signal equal to 0.

Oracle Tuxedo 12c Release 1 (12.1.1) introduced millisecond granularity for SCANUNIT. Adding the MS suffix to the value allows specifying a SCANUNIT between 0 and 30,000 milliseconds. It means that crashed servers can be restarted faster and timeouts have greater accuracy. But the algorithm stays the same. Instead of using the alarm function call the setitimer function call is used, but that is the only difference. And finally by using 1000MS as the SCANUNIT we can express BLOCKTIME as the number of seconds and things start to make more sense.