I've been trying to add support to the dispatcher module for reloading its configuration file on a fifo command. This would allow an external daemon to monitor a set of Asterisk machines, detect when they fail, and round-robin calls between the surviving ones.
I'm (I think) most of the way there. I can issue the command, and see the reload happening, but I see a core dump:
[root@test dispatcher]# serctl fifo dispatcher_reload
    2(31672) DISPATCHER:ds_load_list: dest [135343048/135343048/21] sip:1.2.3.4:5060
    2(31672) DISPATCHER:ds_load_list: dest [0/1/1] sip:1.2.3.4:5060
    2(31672) DISPATCHER:ds_load_list: found [1] dest sets
    0(31670) child process 31672 exited by a signal 11
    0(31670) core was generated
(IP address changed to protect the guilty)
I've had a look at the code, and don't really understand why it's happening. A gdb backtrace shows:
    (gdb) bt
    #0  ds_load_list (lfile=0x8112b98 "È+\021\b\025") at dispatch.c:281
    281         dp = dp->next;
    #0  ds_load_list (lfile=0x8112b98 "È+\021\b\025") at dispatch.c:281
    #1  0x0017cacf in dispatcher_reload (pipe=0xa1f5d30, response_file=0x8112378 "/tmp/ser_receiver_31623") at dispatcher.c:170
    #2  0x08057456 in start_fifo_server () at fifo_server.c:540
    #3  0x0805caf5 in main_loop () at main.c:988
    #4  0x0805e52b in main (argc=3, argv=0xbffa3584) at main.c:1568
I've modified dispatcher.c. In mod_init, I've added:
    if (register_fifo_cmd(dispatcher_reload, "dispatcher_reload", 0) < 0) {
        LOG(L_ERR, "Cannot register dispatcher_reload\n");
        return -1;
    }
and added a new function:
    static int dispatcher_reload(FILE* pipe, char* response_file)
    {
        ds_destroy_list();
        if (ds_load_list(dslistfile) == 0) {
            fifo_reply(response_file, "200 OK\n");
            return 1;
        } else {
            fifo_reply(response_file, "400 Dispatcher reload failed\n");
            return -1;
        }
    }
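One thing I have not ruled out: the crash is on a list walk (dp = dp->next) right after ds_destroy_list(), and the first dest line in the log shows garbage values. That could mean the destroy step frees the nodes but leaves a module-level pointer or counter pointing at stale memory, which the next load then walks. This is only a guess; I have not confirmed it against dispatch.c. The standalone sketch below shows the pattern I would expect a reload-safe destroy to follow. It is not the dispatcher code, and dest_t, dest_head, dest_count, dest_add, dest_destroy_list and dest_walk_count are all hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one destination entry (not the real dispatcher type). */
typedef struct dest {
    char uri[64];
    struct dest *next;
} dest_t;

/* Hypothetical module-level state, analogous to a static list head and counter. */
static dest_t *dest_head = NULL;
static int dest_count = 0;

/* Free every node AND reset the globals, so no stale pointer survives a reload. */
static void dest_destroy_list(void)
{
    dest_t *dp = dest_head;
    while (dp != NULL) {
        dest_t *next = dp->next;  /* read next before freeing dp */
        free(dp);
        dp = next;
    }
    dest_head = NULL;
    dest_count = 0;
}

/* Prepend one destination; returns 0 on success, -1 on allocation failure. */
static int dest_add(const char *uri)
{
    dest_t *dp = calloc(1, sizeof(*dp));
    if (dp == NULL)
        return -1;
    strncpy(dp->uri, uri, sizeof(dp->uri) - 1);  /* calloc left the last byte as NUL */
    dp->next = dest_head;
    dest_head = dp;
    dest_count++;
    return 0;
}

/* Walk the list and return its length; safe after destroy because the head is NULL. */
static int dest_walk_count(void)
{
    int n = 0;
    for (dest_t *dp = dest_head; dp != NULL; dp = dp->next)
        n++;
    assert(n == dest_count);  /* counter and list must stay in sync */
    return n;
}
```

With this shape, calling dest_destroy_list() followed by dest_add() any number of times starts from a clean, empty list. If the two resets at the end of dest_destroy_list() were missing, the next walk would dereference freed memory, which is the kind of failure that can show up as a signal 11 on a dp = dp->next line.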
Can anyone shed light on this?