I don't remember by heart, but I think the child_init for PROC_MAIN is
indeed called before forking TCP worker processes, which in this case
results in propagation of the db connection.
Then the db open operation has to be moved from the child init for rank
PROC_MAIN to the one for rank PROC_POSTCHILDINIT, if the connection is
needed by the main process.
Cheers,
Daniel
On 02.05.22 20:58, Andrew Pogrebennyk wrote:
Henning,
yes, will do. For me it seems to solve the problem,
but I have doubts about this code in ims_usrloc_[sp]cscf, which
originates from usrloc:
    case DB_ONLY:
    case WRITE_THROUGH:
        /* connect to db only from SIP workers, TIMER and MAIN processes,
         * and RPC processes */
        if (_rank<=0 && _rank!=PROC_TIMER && _rank!=PROC_MAIN
                && _rank!=PROC_RPC)
            return 0;
        break;
The connection creation is skipped when _rank is less than -2, for
higher rank numbers we connect - including from the main process.
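
To make the effect of that condition concrete, here is a minimal
standalone sketch that evaluates it for a few ranks. The numeric PROC_*
values are assumptions written from memory of Kamailio's defines and
should be checked against the tree in use; usrloc_connects() and
print_rank_table() are hypothetical helpers, not functions in the module.

```c
#include <stdio.h>

/* Assumed rank values, written from memory of Kamailio's PROC_* defines;
 * verify them against the source tree you build from. */
enum {
    PROC_MAIN     = 0,
    PROC_TIMER    = -1,
    PROC_RPC      = -2,
    PROC_TCP_MAIN = -4,
    PROC_INIT     = -127
};

/* Hypothetical helper: 1 if the quoted condition lets child_init continue
 * to the DB connect, 0 if it returns early without connecting. */
int usrloc_connects(int _rank)
{
    if (_rank <= 0 && _rank != PROC_TIMER && _rank != PROC_MAIN
            && _rank != PROC_RPC)
        return 0;
    return 1;
}

/* Print the decision for a handful of representative ranks. */
void print_rank_table(void)
{
    static const struct { const char *name; int rank; } ranks[] = {
        { "PROC_INIT",     PROC_INIT },
        { "PROC_TCP_MAIN", PROC_TCP_MAIN },
        { "PROC_RPC",      PROC_RPC },
        { "PROC_TIMER",    PROC_TIMER },
        { "PROC_MAIN",     PROC_MAIN },
        { "SIP worker 1",  1 }
    };
    size_t i;

    for (i = 0; i < sizeof(ranks) / sizeof(ranks[0]); i++)
        printf("%-14s rank %4d -> %s\n", ranks[i].name, ranks[i].rank,
               usrloc_connects(ranks[i].rank) ? "connect" : "skip");
}
```

Calling print_rank_table() from a small main() lists which ranks would
connect. Under the assumed values, PROC_MAIN is among them, which is the
case questioned here.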
Following Daniel's suggestion I also checked whether the main process
closes the connection after using it, but no: as far as I can see the
main process does not close the connection, so it is then available in
the forked tcp worker processes.
As I found for IMS it works well when the PROC_MAIN does not make a
connection.
If I look at the sockets opened by kamailio 5.4 running plain usrloc, it
looks better to me with db_mode 0:
- with db_mode 0 it does not have multiple tcp sockets opened for redis
in parallel children
- with db_mode 1 the main process has a connection open for redis and
the tcp workers inherit the socket inode from the main process.
I have not tested the normal usrloc yet, so I do not know whether there
is any regression or whether it works well if I implement the changes
there.
This is the main thing holding me back from making a PR to
usrloc, ims_usrloc_pcscf and ims_usrloc_scscf.
So to me it looks like the main-process connection doesn't serve any
purpose and other users could hit the bug; it happens when two tcp
children receive two registrations at nearly the same time. Maybe not
many users are running usrloc with db_redis?
Regards,
Andrew
On Mon, May 2, 2022, 16:52 Henning Westerholt <hw(a)gilawa.com> wrote:
Hello,
thanks for the confirmation. Please create a pull request on our
tracker with the fix if your tests were successful.
Cheers,
Henning
--
Henning Westerholt – https://skalatan.de/blog/
Kamailio services – https://gilawa.com
From: sr-dev <sr-dev-bounces(a)lists.kamailio.org> On Behalf Of
Andrew Pogrebennyk
Sent: Friday, April 29, 2022 6:27 PM
To: Daniel-Constantin Mierla <miconda(a)gmail.com>
Cc: Kamailio (SER) - Development Mailing List
<sr-dev(a)lists.kamailio.org>
Subject: Re: [sr-dev] db_redis shared tcp connection issue
Daniel,
I think I found it. For a long time, ims_usrloc_scscf and
ims_usrloc_pcscf have opened a connection for the main process in
child init.
I changed the child init from:
    case WRITE_THROUGH:
        /* connect to db only from SIP workers, TIMER and MAIN processes */
        if (_rank<=0 && _rank!=PROC_TIMER && _rank!=PROC_MAIN)
            return 0;
to
    case WRITE_THROUGH:
        /* skip child init for non-worker process ranks */
        if (_rank==PROC_INIT || _rank==PROC_MAIN || _rank==PROC_TCP_MAIN)
            return 0;
Testing it.
On Fri, Apr 29, 2022 at 4:18 PM Daniel-Constantin Mierla
<miconda(a)gmail.com> wrote:
No.
Connections opened in mod init, or in child init for rank proc
main/init, must be closed again there.
If a component wants to keep the connection open, it has to open
it in child init for the ranks corresponding to sip workers,
rpcs, timers, ...
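
A minimal sketch of the rule stated above, assuming a module whose
child_init needs the database both for one-off work in the main process
and for regular use in the workers. fake_db_open(), fake_db_close(),
open_handles and sketch_child_init() are hypothetical stand-ins that only
count handles, and the PROC_* values are assumptions to be checked
against the Kamailio headers.

```c
#include <stdio.h>

/* Assumed rank values; verify against the PROC_* defines in Kamailio. */
enum {
    PROC_MAIN     = 0,
    PROC_TIMER    = -1,
    PROC_RPC      = -2,
    PROC_TCP_MAIN = -4,
    PROC_INIT     = -127
};

/* Hypothetical stand-ins for a module's DB layer: they only count how
 * many connections the current process is holding open. */
int open_handles = 0;

void fake_db_open(void)
{
    open_handles++;
}

void fake_db_close(void)
{
    if (open_handles > 0)
        open_handles--;
}

/* Sketch of a child_init that follows the rule: whatever is opened for
 * rank PROC_INIT or PROC_MAIN is closed again before returning, so no
 * connection is left to be inherited by processes forked later. */
int sketch_child_init(int rank)
{
    if (rank == PROC_INIT || rank == PROC_MAIN) {
        fake_db_open();
        /* ... one-off startup query would go here ... */
        fake_db_close();
        return 0;
    }

    if (rank == PROC_TCP_MAIN)
        return 0; /* this sketch needs no DB access in tcp main */

    /* SIP workers, timers, RPC: each process opens its own connection
     * and keeps it for its lifetime. */
    fake_db_open();
    return 0;
}
```

After sketch_child_init(PROC_MAIN) returns, open_handles is back at 0;
after a call with a worker rank such as 1 it stays at 1, which is the
per-process connection the rule asks for.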
On 29.04.22 15:25, Andrew Pogrebennyk wrote:
Hi Daniel,
I am not sure if I understood you correctly. Do you mean
that child_init should open the connection only when the
rank is proc main or proc init?
For example, in pua module we have
    static int child_init(int rank)
    {
        if (rank==PROC_INIT || rank==PROC_MAIN || rank==PROC_TCP_MAIN)
            return 0; /* do nothing for the main process */

        if (pua_dbf.init==0)
        {
            LM_CRIT("database not bound\n");
Is that correct? If I have a module that does not connect
in child_init for rank PROC_RPC, while the module it originates
from (ims_dialog vs. dialog) does establish a connection in the
RPC rank, would that be a problem? No, right? :)
Thanks for the pointer, checking it.
Andrew
On Fri, Apr 29, 2022 at 1:17 PM Daniel-Constantin Mierla
<miconda(a)gmail.com> wrote:
Hello,
this sounds like a module does a db operation in mod
init, opening the connection, but does it close it
afterwards there? It should then re-open it in child init.
It can also be in child_init(), but when the rank is
proc main or proc init. In child init the db connection
has to be left open only for the other ranks.
Try to identify which component makes the first operation.
Cheers,
Daniel
On 29.04.22 12:39, Andrew Pogrebennyk wrote:
Dear community,
I've been looking at some odd db_redis behavior: it
returns the responses to queries made by tcp
processes in mixed order.
I tested this on various kamailio 5.3 and 5.4 versions
(sipwise spce) and they show an interesting pattern.
After a restart of kamailio I ran lsof to enumerate
all the sockets open in the kamailio children.
There is a connection to db port 6379 which is
held by multiple processes at the same time.
    for i in $(ps auxww | grep kamailio.proxy | grep -v grep | awk '{print $2}'); do
        echo "print file descriptors of $i" && sudo lsof -p $i | grep 6379
    done > redis_conn.txt
...I see that lsof lists a tcp client socket to the
redis server with the same source TCP port and the same
inode number in several processes:
14199, "TIMER NH",
14200, "ctl handler",
14205, "Dialog Clean Timer",
14206, "JSONRPCS FIFO",
14210, "JSONRPCS DATAGRAM",
14213, "tcp receiver (generic) child=0",
14214, "tcp receiver (generic) child=1",
14215, "tcp receiver (generic) child=2",
14220, "tcp receiver (generic) child=3",
14224, "tcp receiver (generic) child=4",
14225, "tcp main process"
The UDP processes are safe (and some timer ones
too), because in that lsof output they have a unique
TCP client port.
This is giving me a lot of headaches because UA
registrations received by any of the TCP workers
(or IPSec ones for that matter) are randomly
failing: if two processes make the same query to
the DB in parallel, both queries appear on the wire
with the same TCP source port and the replies can
be mixed up.
This can be some bug in usage of hiredis,
impacting all users of db_redis module. Is there
any relation to the way kamailio is working its
TCP workers, where maybe tcp workers are forked
from the main attendant processes after having
opened the DB connection?
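
The fork hypothesis above can be checked in isolation with a small
sketch. It assumes an AF_UNIX socketpair as a stand-in for the redis TCP
connection, and socket_shared_after_fork() is a hypothetical helper; the
point is only that a socket opened before fork() is the same kernel
object (same inode) in parent and child, which is what lsof reports.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a socket opened before fork() has the same inode in the
 * child as in the parent, 0 if the inodes differ, -1 on error. */
int socket_shared_after_fork(void)
{
    int sv[2];        /* stand-in for the DB connection */
    int report[2];    /* pipe used by the child to report its inode */
    struct stat parent_st, child_st;
    ino_t child_ino = 0;
    ssize_t n;
    pid_t pid;
    int status;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (pipe(report) < 0 || fstat(sv[0], &parent_st) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: stat the inherited descriptor and send the inode back. */
        close(report[0]);
        if (fstat(sv[0], &child_st) < 0)
            _exit(1);
        n = write(report[1], &child_st.st_ino, sizeof(child_st.st_ino));
        _exit(n == (ssize_t)sizeof(child_st.st_ino) ? 0 : 1);
    }

    /* Parent: read the child's view and compare it with its own. */
    close(report[1]);
    n = read(report[0], &child_ino, sizeof(child_ino));
    waitpid(pid, &status, 0);
    close(report[0]);
    close(sv[0]);
    close(sv[1]);

    if (n != (ssize_t)sizeof(child_ino))
        return -1;
    return child_ino == parent_st.st_ino;
}
```

If the function returns 1, both processes hold the same socket, so
anything either one writes leaves through one connection and replies
arrive on one shared stream. A connection opened after the fork would
instead get its own inode in each process.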
P.S. Why I have the above hypothesis: when I log
redis queries with redis-cli monitor at startup of
kamailio, I see only that srem_key_lua is executed
against redis in runtime only once from that
source port, but then this connection is shared
across multiple processes.
Regards,
Andrew
_______________________________________________
Kamailio (SER) - Development Mailing List
sr-dev(a)lists.kamailio.org
https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-dev
--
Daniel-Constantin Mierla --
www.asipto.com <http://www.asipto.com>
www.twitter.com/miconda <http://www.twitter.com/miconda> --
www.linkedin.com/in/miconda <http://www.linkedin.com/in/miconda>
Kamailio Advanced Training - Online
*
https://www.asipto.com/sw/kamailio-advanced-training-online/