Re: [SR-Users] out of shm without any visible reason

11 Mar 2020


      On 11.03.20 09:04, Juha Heinanen wrote:
...
Daniel-Constantin Mierla writes:
...
It seems to be the case of a retransmission timeout:
#17 0x00007f7dc04d4aca in acc_onreply (t=0x7f7d9e3b0650, req=0x7f7d9e357650, reply=0xffffffffffffffff, code=408) at acc_logic.c:604
The code is 408 and the reply is a faked value. This case happens in the
timer process.
That explains it.  But isn't it risky that in this kind of situation
the timer process (the only one) handles the reply and accounting?
There are many cases where delays can increase the risk of
malfunctioning, whether in the timer process or in a SIP routing worker.
If a process is blocked, slots in internal hash tables (e.g., user
location) can stay locked, and no other process can continue processing
until that process unlocks them. Interactions with external systems such
as databases, API servers, DNS services ... are the typical candidates
for adding significant delay. For specific deployments, there are some
solutions to do as few blocking operations as possible, but it would
probably be impossible to do it everywhere when dealing with external
systems. Examples are the async insert added to db_mysql quite some
time ago, or mqueue+rtimer, or the async modules.
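To make the idea of moving blocking operations off the critical path more concrete, here is a minimal conceptual sketch in Python. It is not Kamailio code and does not use any Kamailio API; it only mirrors the general shape of the mqueue+rtimer approach, where the latency-sensitive path pushes a record into a queue and a separate worker does the slow write. The names DeferredWriter and slow_db_write, the queue size, and the artificial delay are all assumptions made for illustration.

```python
import queue
import threading
import time


def slow_db_write(record, store, delay=0.05):
    """Hypothetical stand-in for a slow external call (e.g. a DB insert)."""
    time.sleep(delay)
    store.append(record)


class DeferredWriter:
    """Queue records on the fast path; write them from a separate worker."""

    def __init__(self, store, maxsize=1000):
        self._q = queue.Queue(maxsize=maxsize)
        self._store = store
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, record):
        """Called from the latency-sensitive path; never waits on the store."""
        try:
            self._q.put_nowait(record)
            return True
        except queue.Full:
            # The caller decides what to do: drop, log, or fall back.
            return False

    def _drain(self):
        # Runs in the worker thread; only this thread pays the write delay.
        while True:
            record = self._q.get()
            if record is None:  # sentinel: stop the worker
                break
            slow_db_write(record, self._store)

    def close(self):
        """Flush everything queued so far, then stop the worker."""
        self._q.put(None)
        self._worker.join()


if __name__ == "__main__":
    written = []
    writer = DeferredWriter(written)
    for i in range(5):
        writer.submit({"code": 408, "id": i})
    writer.close()
    print(len(written), "records written")
```

The trade-off is the usual one for this pattern: the fast path no longer stalls when the external system is slow, but records can be lost if the queue fills up or the process exits before the worker has drained it.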
...
The problem is related to db_cluster/mariadb/debian.  If db_cluster is
not used, everything works fine.  With db_cluster, accounting hangs the
timer process at regular (about 2 hour) intervals.
If it happens periodically, maybe you can track down why: try to identify
apps accessing the database for backup, CDR generation, etc., as well
as infrastructure maintenance operations (e.g., VM backup snapshots).
Cheers,
Daniel
-- 
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com
