Hi all,
We use DNS SRV balancing and forward_tcp() to replicate registrations from one SER to the other SERs in the system.
Now when one machine crashes completely, the SER processes on all other machines hang while processing a REGISTER until the TCP connect times out. With 16 child processes per SER, this leads to a system load of ~16 per machine, and no other messages can be processed.
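To illustrate the problem outside of SER: the sketch below is plain Python, not SER code or configuration. It shows the general idea of bounding the TCP connect time per peer so that a dead machine costs each worker at most a short, fixed delay instead of the full OS connect timeout. The function names, the peer list, and the 0.5 s timeout are assumptions made up for this example.

```python
import socket


def try_connect(host, port, timeout_s=0.5):
    """Return a connected socket, or None if the peer is unreachable in time."""
    try:
        # create_connection() gives up after timeout_s instead of waiting
        # for the (much longer) default OS connect timeout.
        return socket.create_connection((host, port), timeout=timeout_s)
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None


def replicate(payload, peers, timeout_s=0.5):
    """Send payload to every reachable peer; return the peers that failed."""
    failed = []
    for host, port in peers:
        conn = try_connect(host, port, timeout_s)
        if conn is None:
            failed.append((host, port))
            continue
        with conn:
            try:
                conn.sendall(payload)
            except OSError:
                failed.append((host, port))
    return failed
```

With a bound like this, a crashed peer delays each replicated REGISTER by at most roughly timeout_s, and the returned list could be used to skip that peer for a while rather than retrying it on every request.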
I understand that replicating over UDP would solve this issue, but then replicated registrations get lost every now and then because of the unreliable transport. Also, as far as I can tell, t_replicate() can only be used for replicating to *one* other SER.
This really gets me thinking about patching out the internal location cache and looking up every location from the database instead. The additional lookup wouldn't really hurt, given that there are already ~10 other DB queries per call.
IMHO in systems with more than two SERs this cache is just a big pain.
Andy