Description

The selected algorithm is a hash over Call-ID, From, etc. If the hashed destination is not available, a new one must be selected without breaking the hash assignment: all messages with the same hash must still be forwarded to the same destination.

Expected behavior

The load of the failed destination should be distributed (hopefully evenly) over the remaining destinations.

Actual observed behavior

All the messages that were supposed to go to the failed destination are sent to just one of the remaining destinations, so it receives double the load of the rest.

Debugging Data

In this test, 6000 calls were sent to a group of six destinations, using algorithm "0" (hash over Call-ID).

All destinations active

Destination  Calls
Dest1          996
Dest2          997
Dest3          999
Dest4         1006
Dest5         1005
Dest6          997
Total         6000

[Chart: all_active]

Repeating the test, but with destination Dest4 down:

Original implementation

Destination  Calls
Dest1          975
Dest2          993
Dest3         2022
Dest4            0
Dest5         1019
Dest6          991
Total         6000

[Chart: one_down_original]

We can see that poor Dest3 receives roughly double the traffic of any other destination (2022 calls versus about 1000 each).

Possible Solutions

I implemented a solution that preserves the expected behavior (messages with the same hash are assigned to the same destination) while distributing the failed destination's load better over the remaining destinations.

New implementation

Destination  Calls
Dest1         1175
Dest2         1206
Dest3         1191
Dest4            0
Dest5         1216
Dest6         1212
Total         6000

[Chart: one_down_new]

Basically, the solution is as simple as rehashing the original hash when the selected destination is unavailable.
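To illustrate the idea, here is a minimal Python sketch, not the actual patch. It compares two fallback strategies for a down destination: stepping to an adjacent slot, which is an assumption about the original behavior that is consistent with the numbers above, and rehashing the hash. The MD5-based `base_hash` and `rehash` helpers, the function names, and the `max_tries` limit are all hypothetical stand-ins, not the module's real code.

```python
import hashlib

def base_hash(call_id: str) -> int:
    """Stable 32-bit hash of a Call-ID (stand-in for the module's real hash)."""
    return int.from_bytes(hashlib.md5(call_id.encode()).digest()[:4], "big")

def rehash(h: int) -> int:
    """Derive a new 32-bit value from a previous hash (hypothetical mixing step)."""
    return int.from_bytes(hashlib.md5(h.to_bytes(4, "big")).digest()[:4], "big")

def select_neighbor(call_id: str, active: list) -> int:
    """Fallback by stepping to the adjacent slot until an active one is found.

    The stepping direction is arbitrary here; the point is that a single
    neighbor inherits the whole load of the failed destination.
    """
    if not any(active):
        raise ValueError("no active destination")
    n = len(active)
    idx = base_hash(call_id) % n
    while not active[idx]:
        idx = (idx + 1) % n
    return idx

def select_rehash(call_id: str, active: list, max_tries: int = 32) -> int:
    """Fallback by rehashing the hash until it lands on an active slot."""
    if not any(active):
        raise ValueError("no active destination")
    n = len(active)
    h = base_hash(call_id)
    for _ in range(max_tries):
        idx = h % n
        if active[idx]:
            return idx
        h = rehash(h)  # deterministic, so equal hashes keep the same target
    # Last resort if every rehash landed on an inactive slot.
    return select_neighbor(call_id, active)

def distribute(select, active: list, calls: int = 6000) -> list:
    """Count how many synthetic calls each destination receives."""
    counts = [0] * len(active)
    for i in range(calls):
        counts[select(f"call-{i}@example.invalid", active)] += 1
    return counts

active = [True, True, True, False, True, True]  # Dest4 is down
neighbor_counts = distribute(select_neighbor, active)
rehash_counts = distribute(select_rehash, active)
print("neighbor fallback:", neighbor_counts)
print("rehash fallback:  ", rehash_counts)
```

Both strategies are deterministic, so messages with the same Call-ID always reach the same destination. The difference is that rehashing spreads the failed destination's share over all remaining slots instead of handing it to one neighbor.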

I will open a pull request with the proposed change. I have already tested it, as the previous chart shows.

Additional Information

 kamailio 5.4.0-dev4 (x86_64/linux) 0c29e8-dirty
Linux Rechitsa 5.4.19-100.fc30.x86_64 #1 SMP Tue Feb 11 22:27:11 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

