I have been running a very stable Kamailio 1.4 install for over a year now with no downtime. From time to time I get a message from the OS telling me that task kamailio:xxxx blocked for more than 120 seconds and a dump into syslog.
I recently setup Kamailio 3.01 and it's been running for around 3 weeks now. Purely testing at the moment, there is only 1 SIP registration and almost no activity most of the time.
[269844.208011] INFO: task kamailio:28504 blocked for more than 120 seconds.
[269844.208093] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[269844.208178] kamailio D f6d77bf8 0 28504 28484
[269844.208180] f771e140 00000082 00000002 f6d77bf8 f7091390 f771e2cc c4019020 00000000
[269844.208184] 00000000 011615c9 f7091398 00000282 00000000 00000000 00000000 ffffffff
[269844.208188] f5cba384 f5cba38c f5cba388 f771e140 c02c91ec f6969d68 f5cba38c f771e140
[269844.208191] Call Trace:
[269844.208203] [<c02c91ec>] __mutex_lock_slowpath+0x50/0x7b
[269844.208208] [<c02c9082>] mutex_lock+0xa/0xb
[269844.208210] [<c02b7d4d>] unix_dgram_recvmsg+0x3e/0x231
[269844.208214] [<c0161beb>] get_page_from_freelist+0xc1/0x3e9
[269844.208218] [<c0161045>] __rmqueue_smallest+0x83/0xe3
[269844.208221] [<c025f771>] sock_recvmsg+0xde/0xf9
[269844.208226] [<c01387b0>] autoremove_wake_function+0x0/0x2d
[269844.208231] [<c01620e9>] __alloc_pages_internal+0xb5/0x34e
[269844.208236] [<c0260595>] sys_recvfrom+0xb4/0x116
[269844.208239] [<c011af13>] do_page_fault+0x0/0x8f9
[269844.208242] [<c0108540>] do_notify_resume+0x5f8/0x649
[269844.208247] [<c011b3c5>] do_page_fault+0x4b2/0x8f9
[269844.208250] [<c0260a48>] sys_socketcall+0x135/0x19e
[269844.208254] [<c0108853>] sysenter_past_esp+0x78/0xb1
[269844.208257] [<c02c0000>] xenfb_probe+0xd1/0x35b
[269844.208261] =======================
[269844.208262] INFO: task kamailio:28506 blocked for more than 120 seconds.
[269844.208314] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[269844.208395] kamailio D d0e00947 0 28506 28484
[269844.208397] f77f8e60 00000082 00000002 d0e00947 00004292 f77f8fec c4024020 00000001
[269844.208401] 00000000 00000001 00000002 00000000 f7091398 00000001 00000001 c01211f2
[269844.208404] f5cba384 f5cba38c f5cba388 f77f8e60 c02c91ec f5cba38c f60add68 f77f8e60
[269844.208408] Call Trace:
[269844.208410] [<c01211f2>] __wake_up_sync+0x2a/0x3e
[269844.208413] [<c02c91ec>] __mutex_lock_slowpath+0x50/0x7b
[269844.208416] [<c02c9082>] mutex_lock+0xa/0xb
[269844.208418] [<c02b7d4d>] unix_dgram_recvmsg+0x3e/0x231
[269844.208421] [<c0161beb>] get_page_from_freelist+0xc1/0x3e9
[269844.208424] [<c0161045>] __rmqueue_smallest+0x83/0xe3
[269844.208427] [<c025f771>] sock_recvmsg+0xde/0xf9
[269844.208431] [<c01387b0>] autoremove_wake_function+0x0/0x2d
[269844.208435] [<c01620e9>] __alloc_pages_internal+0xb5/0x34e
[269844.208440] [<c0260595>] sys_recvfrom+0xb4/0x116
[269844.208444] [<c0180d0d>] cp_new_stat64+0xfc/0x10e
[269844.208449] [<c011b3c5>] do_page_fault+0x4b2/0x8f9
[269844.208452] [<c0260a48>] sys_socketcall+0x135/0x19e
[269844.208456] [<c0108853>] sysenter_past_esp+0x78/0xb1
[269844.208459] [<c02c0000>] xenfb_probe+0xd1/0x35b
[269844.208462] =======================
[274198.229285] INFO: task kamailio:28504 blocked for more than 120 seconds.
[274198.229340] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[274198.229422] kamailio D f6d77bf8 0 28504 28484
[274198.229425] f771e140 00000082 00000002 f6d77bf8 f7091390 f771e2cc c4019020 00000000
[274198.229429] 00000000 011615c9 f7091398 00000282 00000000 00000000 00000000 ffffffff
[274198.229432] f5cba384 f5cba38c f5cba388 f771e140 c02c91ec f6969d68 f5cba38c f771e140
[274198.229436] Call Trace:
[274198.229447] [<c02c91ec>] __mutex_lock_slowpath+0x50/0x7b
[274198.229453] [<c02c9082>] mutex_lock+0xa/0xb
[274198.229455] [<c02b7d4d>] unix_dgram_recvmsg+0x3e/0x231
[274198.229459] [<c0161beb>] get_page_from_freelist+0xc1/0x3e9
[274198.229463] [<c0161045>] __rmqueue_smallest+0x83/0xe3
[274198.229466] [<c025f771>] sock_recvmsg+0xde/0xf9
[274198.229471] [<c01387b0>] autoremove_wake_function+0x0/0x2d
[274198.229476] [<c01620e9>] __alloc_pages_internal+0xb5/0x34e
[274198.229481] [<c0260595>] sys_recvfrom+0xb4/0x116
[274198.229484] [<c011af13>] do_page_fault+0x0/0x8f9
[274198.229487] [<c0108540>] do_notify_resume+0x5f8/0x649
[274198.229492] [<c011b3c5>] do_page_fault+0x4b2/0x8f9
[274198.229495] [<c0260a48>] sys_socketcall+0x135/0x19e
[274198.229499] [<c0108853>] sysenter_past_esp+0x78/0xb1
[274198.229502] [<c02c0000>] xenfb_probe+0xd1/0x35b
[274198.229506] =======================
[274198.229507] INFO: task kamailio:28506 blocked for more than 120 seconds.
[274198.229559] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[274198.229641] kamailio D d0e00947 0 28506 28484
[274198.229643] f77f8e60 00000082 00000002 d0e00947 00004292 f77f8fec c4024020 00000001
[274198.229646] 00000000 00000001 00000002 00000000 f7091398 00000001 00000001 c01211f2
[274198.229650] f5cba384 f5cba38c f5cba388 f77f8e60 c02c91ec f5cba38c f60add68 f77f8e60
[274198.229653] Call Trace:
[274198.229656] [<c01211f2>] __wake_up_sync+0x2a/0x3e
[274198.229659] [<c02c91ec>] __mutex_lock_slowpath+0x50/0x7b
[274198.229662] [<c02c9082>] mutex_lock+0xa/0xb
[274198.229664] [<c02b7d4d>] unix_dgram_recvmsg+0x3e/0x231
[274198.229667] [<c0161beb>] get_page_from_freelist+0xc1/0x3e9
[274198.229670] [<c0161045>] __rmqueue_smallest+0x83/0xe3
[274198.229673] [<c025f771>] sock_recvmsg+0xde/0xf9
[274198.229677] [<c01387b0>] autoremove_wake_function+0x0/0x2d
[274198.229681] [<c01620e9>] __alloc_pages_internal+0xb5/0x34e
[274198.229686] [<c0260595>] sys_recvfrom+0xb4/0x116
[274198.229690] [<c0180d0d>] cp_new_stat64+0xfc/0x10e
[274198.229696] [<c011b3c5>] do_page_fault+0x4b2/0x8f9
[274198.229699] [<c0260a48>] sys_socketcall+0x135/0x19e
[274198.229703] [<c0108853>] sysenter_past_esp+0x78/0xb1
[274198.229706] [<c02c0000>] xenfb_probe+0xd1/0x35b
There was no activity when this happened. ON the 1.4 box there could be about 10 sessions setup when it happens.
Even on the old install this doesn't seem to cause any problem and same here on 3.01 but would like to try and solve it.