I have reached the point where a single instance of rtpproxy is about maxed out: its single process is consuming about 85% of the available CPU, and climbing. That works out to about 600 standing calls, the bulk of which are G.711.
This is a multi-processor/core system and I have several other CPUs not really doing anything, so I would like to start up multiple instances of rtpproxy, which rtpproxy says it can do, controlling them via AF_INET sockets instead of the default AF_UNIX.
How you set up rtpproxy for multiple instances seems reasonably straightforward, but what you do on the SER side to let it know that these additional copies of rtpproxy exist and how you or SER divides call load between them is less clear.
Is there any documentation or are there examples showing this done, or better still, someone with a working ser.cfg file who would be willing to show at least that part of their configuration?
Ideally, I'm looking for a case where calls are divided among rtpproxies regardless of where they came from or where they are going, as I have one particular call source/destination pair that is absurdly large compared to all others, so that pair in particular has to be distributed across multiple rtpproxies.
Thanks in advance!
Frank Durda IV frank.durda@hypercube-llc.com wrote:
> How you set up rtpproxy for multiple instances seems reasonably straightforward, but what you do on the SER side to let it know that these additional copies of rtpproxy exist and how you or SER divides call load between them is less clear.
See changelog comment of 2005-02-24 in modules/nathelper/nathelper.c.
A follow-up on using multiple instances of rtpproxy. I finally got this to work, although I had to look at a lot of source code and contradictory (and terse) bits of documentation and finally experiment a while. So here is what I would consider "real" documentation on how to use these functions and features of both SER and rtpproxy in one place.
Although not a requirement to make this work, I'll mention my hardware platform configuration is a Dell 2950 III 48V DC-powered unit with a single four-core processor. The second CPU socket is empty. FreeBSD reports:
CPU: Intel(R) Xeon(R) CPU E5430 @ 2.66GHz (2665.33-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x10676  Stepping = 6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,
    MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0xce3bd<SSE3,RSVD2,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,
    PDCM,DCA,<b19>>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  Cores per package: 4
I run rtpproxy in the two-interface mode. Judging from previous messages here I must be the only person who does, but I do have several good reasons for doing so.
I also recently reported a coding bug in rtpproxy that at least prevents the two-interface mode from working properly. The fix I came up with for rtpproxy is: (diff -u this time, since apparently some OSes' "patch" programs can't cope with diff -c)
--- main.c.STOCK 2008-02-20 18:51:44.000000000 +0000
+++ main.c 2009-03-25 20:31:34.000000000 +0000
@@ -930,7 +940,7 @@
      * cannot be trusted and address is different from one
      * that we recorded update it.
      */
-    if (spa->untrusted_addr == 0 && !(spa->addr[pidx] != NULL &&
+    if (spa->untrusted_addr[pidx] == 0 && !(spa->addr[pidx] != NULL &&
       SA_LEN(ia[0]) == SA_LEN(spa->addr[pidx]) &&
       memcmp(ia[0], spa->addr[pidx], SA_LEN(ia[0])) == 0)) {
         rtpp_log_write(RTPP_LOG_INFO, spa->log, "pre-filling %s's address "
@@ -940,7 +950,7 @@
         spa->addr[pidx] = ia[0];
         ia[0] = NULL;
     }
-    if (spa->rtcp->untrusted_addr == 0 && !(spa->rtcp->addr[pidx] != NULL &&
+    if (spa->rtcp->untrusted_addr[pidx] == 0 && !(spa->rtcp->addr[pidx] != NULL &&
       SA_LEN(ia[1]) == SA_LEN(spa->rtcp->addr[pidx]) &&
       memcmp(ia[1], spa->rtcp->addr[pidx], SA_LEN(ia[1])) == 0)) {
         if (spa->rtcp->addr[pidx] != NULL)
My setup is a bit more complicated than probably typical, so if you don't use two interfaces, simply use the one-interface invocation of rtpproxy ("-l oneipaddr") and build up your additional IP addresses as needed. You will see more on this in a moment.
Each rtpproxy process needs its own IP address(es) to bind to for RTP traffic, so you will need to alias multiple IP addresses on the same ethernet interface or use multiple interfaces. In my case, I put multiple IP addresses for each side of the rtpproxy instances on the same physical interface using "ifconfig interface-name alias" commands.
I chose to use multiple IP addresses on the same physical interface because I find that on this system I can't move more than about 40Mbit/sec per interface on a single rtpproxy instance before the processor core is at 85% utilization, so four should max-out around 160Mbit/sec, still well below the capacity of a single gigabit ethernet interface. Depending on your CPU speeds and maker as well as the system bus and ethernet card type you are using, your numbers will vary. That may mean buying plug-in cards rather than using the interfaces that happen to be on the motherboard (which frequently are slower, interrupt intensive and generally inefficient compared to others you can install).
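The headroom estimate above is simple arithmetic, and it is easy to redo with your own measurements. A minimal sketch in Python, using only the figures quoted in this message (the function name is made up for illustration, and the numbers are specific to this hardware):

```python
# Figures quoted in the post; they apply to this particular machine only.
PER_INSTANCE_MBIT = 40   # traffic one rtpproxy moved at about 85% of a core
INSTANCES = 4            # one rtpproxy per core on the four-core CPU
LINK_MBIT = 1000         # nominal capacity of a gigabit ethernet interface


def aggregate_load(per_instance_mbit, instances, link_mbit):
    """Return (total Mbit/sec, fraction of the link used)."""
    total = per_instance_mbit * instances
    return total, total / link_mbit


total_mbit, link_fraction = aggregate_load(PER_INSTANCE_MBIT, INSTANCES, LINK_MBIT)
print(f"{total_mbit} Mbit/sec, {link_fraction:.0%} of the link")
```

Substitute whatever per-instance figure you measure on your own CPU and network card before deciding how many instances one interface can carry.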
The commands to put multiple IP addresses on the same interface would be like this on BSD and similar systems: (IP addresses are altered)
ifconfig em4 10.5.6.7 netmask 255.255.255.192
ifconfig em4 alias 10.5.6.8 netmask 255.255.255.255
ifconfig em4 alias 10.5.6.9 netmask 255.255.255.255
ifconfig em4 alias 10.5.6.10 netmask 255.255.255.255
and so on. Aliases in the same netblock as the non-aliased IP address normally have a 255.255.255.255 netmask, at least in the BSD world. FreeBSD lets you do this in /etc/rc.conf like this:
ifconfig_em4="inet 10.5.6.7 netmask 255.255.255.192"
ifconfig_em4_alias0="inet 10.5.6.8 netmask 255.255.255.255"
ifconfig_em4_alias1="inet 10.5.6.9 netmask 255.255.255.255"
ifconfig_em4_alias2="inet 10.5.6.10 netmask 255.255.255.255"
(If you are running Linux, the way you do aliased IP addresses is probably different.) On FreeBSD, the result looks like:
(our inside network)
ifconfig -a
...
em4: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
        inet 10.5.6.7 netmask 0xffffffc0 broadcast 10.5.6.63
        inet 10.5.6.8 netmask 0xffffffff broadcast 10.5.6.8
        inet 10.5.6.9 netmask 0xffffffff broadcast 10.5.6.9
        inet 10.5.6.10 netmask 0xffffffff broadcast 10.5.6.10
        ether 00:15:17:5b:xx:xx
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active
(public Internet side, built with similar commands)
em5: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
        inet 66.1.2.3 netmask 0xffffffc0 broadcast 66.1.2.63
        inet 66.1.2.4 netmask 0xffffffff broadcast 66.1.2.4
        inet 66.1.2.5 netmask 0xffffffff broadcast 66.1.2.5
        inet 66.1.2.6 netmask 0xffffffff broadcast 66.1.2.6
        ether 00:15:17:5b:xx:xx
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active
...
Now that you have some IP addresses defined, start several instances of rtpproxy, each listening for instructions on its own control port (at minimum, the port must differ between instances), as in:
(As found in /etc/rc.local - non BSD systems may need this elsewhere, but this needs to be ahead of where ser is started and after the interfaces get their IP addresses.)
/usr/local/bin/rtpproxy -F -s udp:127.0.0.1:8001 -l 66.1.2.3/10.5.6.7
/usr/local/bin/rtpproxy -F -s udp:127.0.0.1:8002 -l 66.1.2.4/10.5.6.8
/usr/local/bin/rtpproxy -F -s udp:127.0.0.1:8003 -l 66.1.2.5/10.5.6.9
/usr/local/bin/rtpproxy -F -s udp:127.0.0.1:8004 -l 66.1.2.6/10.5.6.10
CAUTION: There are probable security issues with using public IP addresses for the rtpproxy control address, so avoid doing that. (rtpproxy will take instructions from anybody who can get to the port and knows how to talk to it, or might just send it junk and crash that process. If you must use a public IP, block external access to the port range that is used to control rtpproxy.)
Once running, reference those rtpproxy processes via a line like this in ser.cfg:
modparam("nathelper", "rtpproxy_sock", "udp:127.0.0.1:8001=1 udp:127.0.0.1:8002=1 udp:127.0.0.1:8003=1 udp:127.0.0.1:8004=1")
That's all on one line. You will need to comment-out the existing nathelper line that uses the AF_UNIX socket to communicate with just the one copy of rtpproxy. If you still have the old AF_UNIX rtpproxy running, kill it.
The "=1" part of the modparam is the weighting, and can be handy if some of the rtpproxy processes are running on a different computer that is faster or slower than the local one or others. If all on the same computer, they should all be equal.
The documentation on the weighting control is unusually weak (even by SER documentation standards), so I can't tell you exactly how it works, what ranges of values are valid (or whether they can be floating point), or whether the weights must add up to some total. It may also simply be a round-robin distribution, where a "1" says the next call goes to this rtpproxy before advancing to the next instance, a "2" says the next two calls go here before advancing, and so on. I'm using all "1" values for the local system and it seems to be balancing more or less evenly, so for other choices you will need to experiment.
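The rtpproxy command lines and the rtpproxy_sock modparam have to agree on the control ports, so it can help to generate both from a single list of addresses. A minimal sketch in Python; the helper name build_config and its parameters are hypothetical, and it only reproduces the exact command and modparam forms already shown in this message:

```python
# Hypothetical helper, not part of SER or rtpproxy: builds the rtpproxy
# command lines and the matching nathelper modparam from one address list.

RTPPROXY_BIN = "/usr/local/bin/rtpproxy"  # path used in this message


def build_config(address_pairs, base_port=8001, weight=1):
    """Return (commands, modparam) for one rtpproxy per address pair.

    address_pairs: list of (first_ip, second_ip) tuples, written to -l
    in the same order as in the examples in this message.
    """
    commands = []
    socks = []
    for offset, (first_ip, second_ip) in enumerate(address_pairs):
        control = f"udp:127.0.0.1:{base_port + offset}"
        commands.append(
            f"{RTPPROXY_BIN} -F -s {control} -l {first_ip}/{second_ip}"
        )
        socks.append(f"{control}={weight}")
    modparam = 'modparam("nathelper", "rtpproxy_sock", "{}")'.format(
        " ".join(socks)
    )
    return commands, modparam


pairs = [
    ("66.1.2.3", "10.5.6.7"),
    ("66.1.2.4", "10.5.6.8"),
    ("66.1.2.5", "10.5.6.9"),
    ("66.1.2.6", "10.5.6.10"),
]
commands, modparam = build_config(pairs)
for line in commands:
    print(line)
print(modparam)
```

The output is meant to be pasted into /etc/rc.local and ser.cfg respectively; every instance gets the same weight here because all of them run on the same machine.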
Once running for a while, you should see something like this in "top"
  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
 2207 xxxxxxxx    1   8    0  4180K  1056K nanslp 1  17.5H 27.05% rtpproxy
 2208 xxxxxxxx    1   8    0  4192K  1068K nanslp 2  17.5H 26.66% rtpproxy
 2206 xxxxxxxx    1 102    0  4204K  1080K RUN    3  17.4H 25.29% rtpproxy
 2205 xxxxxxxx    1   8    0  4228K  1104K nanslp 0  17.3H 19.92% rtpproxy
You should now also see in tcpdump/ngrep output that the SDP payloads have different IP addresses in them for the rtpproxy that is handling the given call. Based on limited experience so far, starting more rtpproxy instances on a machine than there are total processors/cores available doesn't seem to do anything bad, but I would think it would be less efficient than having the numbers of rtpproxy equal to or less than available CPUs.
I did not mess with any of the mechanisms SER offers that can force a given call to a given instance. Such a thing might be useful if you want to guarantee a higher level of stability in packet forwarding or if some of the instances were functionally different. In my case, all instances of rtpproxy have the same capabilities.
Good luck!
Frank Durda IV http://nemesis.lonestar.org Copyright 2009, ask before reprinting.
On Tuesday 31 March 2009, Frank Durda IV wrote:
> Once running, reference those rtpproxy processes via a line like this in ser.cfg:
> modparam("nathelper", "rtpproxy_sock", "udp:127.0.0.1:8001=1 udp:127.0.0.1:8002=1 udp:127.0.0.1:8003=1 udp:127.0.0.1:8004=1")
Very interesting. Let me ask a question:
You are setting "=1" for all the RtpProxy nodes. This means that when you call "force_rtpproxy()" it will choose one of them randomly.
If an initial INVITE-200 is handled by RtpProxy A, and later a re-INVITE arrives, how do you get "force_rtpproxy()" to contact the same RtpProxy A during the re-INVITE transaction?
Iñaki Baz Castillo wrote:
> Very interesting. Let me [ask] a question:
> You are setting "=1" for all the RtpProxies nodes. This means that when you call "force_rtpproxy()" it will choose one of them randomly.
> In case an initial INVITE-200 is handled by RtpProxy A, and later a re-INVITE arrives, how do you get "force_rtpproxy()" contacts the same RtpProxy A during re-INVITE transaction?
First off, the default isn't random. Having a higher weight increases the opportunity for calls to be assigned to a given rtpproxy, but the choice for a given call is not random. When you run multiple instances, you will quickly notice that some rtpproxy processes burn more CPU than others, which can be for a variety of reasons, including getting more calls than others because the distribution across rtpproxy instances isn't perfectly even even if you ask for it to be. You'll see why that happens in a moment.
Now, I take it you are asking what happens when the INVITE comes in from the calling party, later the 183 or 200 comes back from the called party, and a force_rtp_proxy() is performed for each of these. How do you get both directions of the call, and anything that changes after the 200 OK, to go to the same rtpproxy instance, even though nathelper maintains no call state? You have three options.
One, run with only one rtpproxy, which seems to be the most common choice made. With no choices you have no worries, but you are very limited on how many calls you can handle.
Two, force_rtp_proxy() offers the primitive capability of using the "Nn" flag to literally hard-code a specific rtpproxy instance to go to without regard to any other factors. This is only vaguely useful because it means you have to force all the calls for a given call-condition (based on some criteria, calling source IP, account credentials, destination IP, last digit of the called number, etc) to get some hand-balanced distribution. If you have six tiny call sources and one huge one, it doesn't help you much, unless the last digit trick works for you.
The limitations implicit in having the script specify the proxy via the Nn flag, combined with what I consider to be the most stupid/horrible missing bit of functionality in SER (not being able to pass variables to functions/modules*), greatly limit what smart things you can do to scale SER+rtpproxy, if you do it via the ser.cfg file.
*I note that OpenSER supposedly allows you to pass variables to functions (at least, their lexical scanner seems to have added the rules to allow this, but I don't know if it actually works or not). I would certainly use it for a large number of other things if it was available.
So choice One limits your ability to scale, and choice Two potentially means a horribly complex ser.cfg file. Fortunately, there is choice Three.
Three, SER quietly (and possibly inadvertently) takes care of this issue by using the Call-ID as the value that selects the rtpproxy. The Call-ID string is ground up in a hashing algorithm (see select_rtpp_node() in modules/nathelper/nathelper.c) and a value between 0 and N comes out. That value, combined with the total of the weights configured at start time and skipping any disabled proxies (presumably disabled because they became unresponsive in the past), dictates which rtpproxy the call is sent to. So the distribution is not at all random, nor is it the traditional orderly ascending/descending assignment one might find in circuit assignments in TDM or CAS trunk groups, which is what I thought the behavior would be when I first read what little documentation there was on the weighting system.
Even distribution of calls across equally weighted instances of rtpproxy still depends on how good the hashing algorithm is and on having a good mix of incoming Call-IDs. In practice, it will usually be somewhat off-balance, favoring one proxy over others at any given time. Just make sure you have enough CPU capacity so that any rtpproxy instance has room to run a little hot. If you are using an OS that allows you to lock processes to specific CPUs, don't use that feature on rtpproxy, unless you are giving each rtpproxy 100% of its own CPU.
So, this means that each time force_rtp_proxy() or unforce_rtp_proxy() is called, this same hash gets performed on the same Call-ID for a given call, and except for rare cases where a proxy has failed, you will end up sending the force/unforce for a given Call-ID to the same rtpproxy instance every time. At least, this is how I read the source code.
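To make the idea concrete, here is a minimal sketch in Python of Call-ID based selection among weighted proxies. It is an illustration only, not the nathelper code: the byte-sum hash, the function names and the node tuples are assumptions, and the real select_rtpp_node() may differ in its details.

```python
# Illustrative sketch only. The hash (sum of the Call-ID bytes, masked to
# 8 bits) and the data layout are assumptions made for this example.

def callid_hash(call_id):
    """Reduce a Call-ID string to a small integer (assumed hash)."""
    return sum(call_id.encode("utf-8")) & 0xFF


def select_node(call_id, nodes):
    """Pick a proxy for call_id.

    nodes: list of (name, weight, disabled) tuples in configured order.
    Disabled nodes are skipped, so they also drop out of the weight total.
    """
    alive = [(name, weight) for name, weight, disabled in nodes
             if not disabled and weight > 0]
    total = sum(weight for _, weight in alive)
    if total == 0:
        return None  # nothing usable
    cut = callid_hash(call_id) % total
    for name, weight in alive:
        if cut < weight:
            return name
        cut -= weight
    return None  # not reached when total > 0


nodes = [
    ("udp:127.0.0.1:8001", 1, False),
    ("udp:127.0.0.1:8002", 1, False),
    ("udp:127.0.0.1:8003", 1, False),
    ("udp:127.0.0.1:8004", 1, False),
]

# The same Call-ID gives the same answer on every call, with no stored state.
first = select_node("abc", nodes)
again = select_node("abc", nodes)

# If a proxy becomes disabled, the weight total changes and so can the result.
degraded = list(nodes)
degraded[2] = ("udp:127.0.0.1:8003", 1, True)
after_failure = select_node("abc", degraded)
```

In this sketch the Call-ID "abc" maps to the instance on port 8003 while all four are alive, and to the one on port 8004 once 8003 is marked disabled, which is the rare failure case mentioned above.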
I'll point out that if the initial selection of rtpproxy from the first force_rtp_proxy() of a given call session had simply been recorded as an integer somewhere with the other trivia that is maintained for the duration of a given call session, nathelper wouldn't have to burn cycles and time recomputing the hash as many as three additional times for the typical call (two more force_rtp_proxy() calls for 183 and 200 responses, then an unforce_rtp_proxy() to tear things down), but that's the limited behavior that exists in there today.
Hi,
Frank Durda IV frank.durda@hypercube-llc.com wrote:
> So, this means that each time force or unforce rtpproxy calls, this same hash gets performed on the same Call-ID for a given call, and except for rare cases where a proxy has failed, you will end up sending the force/unforce for a given Call-ID to the same rtpproxy instance every time. At least, this is how I read the source code.
Yes, that's how it works, and that was the goal when we wrote this code. In the absence of dialog state in SER (the most primitive form of it was invented a few months later), and given the need for a selection key that stays stable during normal operation, the only solution was a kind of hashing by dialog ID among the live proxies. If the set of live proxies changes, this does the best it can: calls that don't need a re-INVITE keep working, and in simple scenarios most calls still work after a re-INVITE.
> I'll point out that if the initial selection of rtpproxy from the first force_rtp_proxy() of a given call session had simply been recorded as an integer somewhere with the other trivia that is maintained for the duration of a given call session, nathelper wouldn't have to burn cycles and time recomputing the hash as many as additional three times for the typical call (two more force_rtp_proxy() calls for 183 and 200 responses, then an unforce_rtp_proxy() to tear things down), but that's the limited behavior that exists in there today.
That requires maintaining dialog state. In PortaSIP, we later moved this into the B2BUA, because it already maintains dialog state and we did not need to extend SER, which is extremely hard to program. This was around version 0.9.3. If the current implementation allows custom per-transaction and per-dialog variables that are kept for the lifetime of the corresponding object, this could be rewritten using the new mechanisms.
2009/4/1 Frank Durda IV frank.durda@hypercube-llc.com:
> Now, I take it you are asking what happens in the case of the INVITE coming in from the calling party, and later the 183 or 200 comes back from the called party, and for each of these a force_rtp_proxy() is performed. How do you get both directions of the call to go to the same rtpproxy instance, or if something else changes after the 200 OK, even though nathelper maintains no call state? You have three options.
In fact I expected that, being transaction stateful, *SER would use the same RtpProxy when force_rtpproxy() runs for a request and for the response(s) belonging to the same transaction. I'm not sure about that, though.
My doubt was about the case of re-INVITE/BYE. More inline:
> The limitations implicit in having the script specify the proxy to use via the Nn flag, combined with what I consider to be the most stupid/horrible missing bit of functionality in SER (that of not being able to pass variables to functions/modules*), and you are greatly limited in what smart things you can do to scale SER+rtpproxy, if you do it via the ser.cfg file.
> *I note OpenSER supposedly allows you to pass variables to functions (at least reading their lexical scanner seems to have added the rules to allow this, but I don't know if it actually works or not. I would certainly use it for a large number of other things if it was available.)
Yes, OpenSER allows pseudo-variables as arguments in lots of functions (not all).
> Three, SER quietly (and possibly inadvertently) takes care of this issue by using the Call-ID as the variable value that is used to select the rtpproxy. The Call-ID string is ground-up in a hashing algorithm (see select_rtpp_node() in modules/nathelper/nathelper.c) and a value between 0 and N comes out of that, and that combined with the total number of weight possibilities selected at start time, skipping any disabled proxies (presumably because they became unresponsive in the past) dictates which rtpproxy that call will be sent to.
OK, but since the chosen RtpProxy depends on the combination of Call-ID and weight, how do you ensure that force_rtpproxy() in the 200 selects the same instance as the one selected by force_rtpproxy() in the INVITE? (The same goes for re-INVITE/BYE.)
That is, a call is established under certain circumstances, and these cause the selection of a specific RtpProxy. After 30 minutes the call is put on hold by sending a re-INVITE. Now (after 30 minutes) the circumstances have changed, so the same Call-ID causes force_rtpproxy() to select a different RtpProxy.
This basically means that the RTP session would remain open in the first RtpProxy (until it expires?). It also means that the *SER script cannot rely on the return code of "force_rtpproxy(l)", which is very useful for checking whether the current in-dialog INVITE belongs to a call for which force_rtpproxy() initially took place.
> So, this means that each time force or unforce rtpproxy calls, this same hash gets performed on the same Call-ID for a given call, and except for rare cases where a proxy has failed, you will end up sending the force/unforce for a given Call-ID to the same rtpproxy instance every time. At least, this is how I read the source code.
I would believe that if the decision depended just on the Call-ID, but since it also depends on the weight... perhaps I am missing something?
> I'll point out that if the initial selection of rtpproxy from the first force_rtp_proxy() of a given call session had simply been recorded as an integer somewhere with the other trivia that is maintained for the duration of a given call session,
It could be added as parameter to the Record-Route header (obviously loose routing is required if we want the proxy to manage the RTP stream via RtpProxy).
That is:
- In the initial INVITE, force_rtpproxy() sets some variable ($rtpproxy_instance = 3).
- After that, the *SER script adds a parameter to Record-Route:
  Record-Route: sip:1.2.3.4:5060;rtpproxy_instance=3
- Later, when processing the 200 (or the ACK if it has SDP), a re-INVITE or a BYE, the *SER script reads "rtpproxy_instance" from the Route header and sets the RtpProxy according to it (in Kamailio there is a function "set_rtpproxy").
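The string handling behind this proposal can be sketched outside of *SER. The following Python is only an illustration of carrying the instance number as a URI parameter and reading it back; the function names are made up, "rtpproxy_instance" is the parameter name proposed above, and nothing here is an existing *SER or Kamailio API.

```python
# Illustration of the proposal only; these helpers are hypothetical.

PARAM = "rtpproxy_instance"  # parameter name proposed above


def add_instance_param(record_route_uri, instance):
    """Append the chosen rtpproxy instance to a Record-Route URI."""
    return f"{record_route_uri};{PARAM}={instance}"


def read_instance_param(route_uri):
    """Return the instance number found in a Route URI, or None."""
    uri = route_uri.strip().strip("<>")
    # Everything after the first ';' is a list of URI parameters.
    for param in uri.split(";")[1:]:
        name, _, value = param.partition("=")
        if name == PARAM and value.isdigit():
            return int(value)
    return None


stamped = add_instance_param("sip:1.2.3.4:5060", 3)
recovered = read_instance_param("<sip:1.2.3.4:5060;lr;rtpproxy_instance=3>")
missing = read_instance_param("sip:1.2.3.4:5060;lr")
```

A request whose Route header carries no such parameter yields None, which a script could treat as "no rtpproxy was engaged for this dialog".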
Does it make sense?
Thanks a lot for your response.