Alex,
thank you for your extensive answer (and sorry to top-post). It's just that I have been asked to scale up an existing system (which happens to be Asterisk), and I'm now trying to get a grasp on how to do it.
Thanks, Antonio.
On 02-01-10 17:24, Alex Balashov wrote:
Hi Antonio,
As I mentioned in one of my previous posts, and will emphasise again: large scale cannot happen magically just through the addition of a few small elements.
From your description of your environment, it sounds like the systems and features being used are primarily "host-wise", i.e. tied to the runtime state/environment of a single instance of Asterisk on a single host. This limits their scalability.
To provide services on a large scale, the system has to be composed of non-hostwise components; the architecture must be distributed. This is a defining property and an organising principle of its design from the very bottom, not a setting you can simply enable when your user base grows large enough. Building distributed VoIP service delivery platforms is a very different science than building non-distributed ones.
Once again, let me make a more general cluster analogy: if you have a single instance of an application on a single server, you cannot just add some more servers, run the application in parallel on them, and expect it to work. Instead, either the whole application must be written in a way that anticipates its being deployed in parallel on multiple nodes, or single instances of it must be placed into some sort of harness that can implement a distributed/parallel abstraction layer for it while preserving for the application the illusion of single-instance runtime. Either way, multiple nodes executing the program must have a way of keeping shared logical state across the entire execution continuum (in a centralised or distributed way), passing messages amongst nodes asynchronously, synchronising storage access to prevent race conditions and mutual exclusion violations, avoiding deadlocks, etc.
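To make the shared-state point concrete, here is a minimal sketch of a cluster-wide limit on concurrent calls. Everything in it is an assumption made for illustration: the names SharedCallLimiter and simulate are hypothetical, threads stand in for separate Asterisk nodes, and an in-process lock stands in for whatever shared store (a database, a lock service) a real deployment would use. The point is only that the check and the increment must happen as one atomic step, or two nodes can both admit the "last" call.

```python
import threading


class SharedCallLimiter:
    """Illustrative stand-in for state shared by all nodes (hypothetical name)."""

    def __init__(self, max_calls):
        self._lock = threading.Lock()
        self._active = 0
        self.max_calls = max_calls

    def try_acquire(self):
        # Check and increment under one lock so two nodes cannot both
        # take the last free slot (the race described above).
        with self._lock:
            if self._active >= self.max_calls:
                return False
            self._active += 1
            return True

    def release(self):
        # Called when a call ends; never lets the counter go negative.
        with self._lock:
            if self._active > 0:
                self._active -= 1

    @property
    def active(self):
        with self._lock:
            return self._active


def simulate(nodes=4, attempts_per_node=50, max_calls=10):
    """Run several 'nodes' (threads) that all try to admit calls at once."""
    limiter = SharedCallLimiter(max_calls)
    admitted = []

    def node(name):
        for _ in range(attempts_per_node):
            if limiter.try_acquire():
                admitted.append(name)  # list.append is atomic in CPython

    threads = [
        threading.Thread(target=node, args=(f"ast{i}",)) for i in range(nodes)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(admitted), limiter.active
```

However the threads interleave, simulate() admits exactly max_calls calls in total. Across real hosts the same guarantee has to come from the shared store itself, which is why it cannot be bolted on afterwards.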
An application or service designed to run on multiple nodes to begin with will have these facilities baked into its architecture, while an application or service not designed to do that would probably have to be rewritten or, at least, very extensively modified in order to suit the new requirement. In more generic computation this probably means the use of something like the LAM/MPI libraries, or perhaps some sort of concept aspiring to Google's MapReduce and/or BigTable.
It's the same thing with VoIP and Asterisk. Much of what you've got now relies on particular Asterisk nodes performing particular functions, which just isn't how a distributed system works unless you are willing to settle for some sort of compromise involving node specialisation -- which might be okay: a dedicated conferencing server, dedicated ACD/queue server, etc. But this ultimately has scalability barriers too and represents an inefficiency.
I have mentioned one possible and common distributed Asterisk architecture before: a central FastAGI controller in which all application logic is implemented - and, in the case of things which already exist in Asterisk such as queues, often RE-implemented in a way compatible with distributed architecture - to which all calls are dispatched via N Asterisk servers. Such a backend could implement the necessary shared state for logical abstractions extended across N servers. The example I often give is one of Asterisk queues (in the sense of Queue()): a queue exists only in one Asterisk server, but you can reimplement the "user experience" aspect of a queue in FastAGI (estimated time to wait announcement, music, etc.), which would then allow you to extend one logical queue over multiple Asterisk servers and potentially support thousands of callers in "one" queue.
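As a rough sketch of the queue example, the central controller could keep one logical queue in its own memory and record which Asterisk server each caller is parked on. The names LogicalQueue, parse_agi_env, avg_handle_seconds and agents are hypothetical, and the wait estimate is a deliberately naive assumption; the only protocol detail relied on is that an AGI session opens with "agi_key: value" lines terminated by a blank line. Socket handling, and the AGI commands that would actually play music or announcements to the caller, are left out.

```python
import threading
import time
from collections import deque


def parse_agi_env(lines):
    """Parse the 'agi_key: value' lines sent at the start of an AGI session."""
    env = {}
    for line in lines:
        line = line.strip()
        if not line:
            break  # a blank line ends the header block
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


class LogicalQueue:
    """One logical queue whose callers may sit on any Asterisk server."""

    def __init__(self, avg_handle_seconds=120, agents=1):
        self._lock = threading.Lock()
        self._waiting = deque()  # entries: (unique_id, asterisk_host, joined_at)
        self.avg_handle_seconds = avg_handle_seconds  # assumed average call length
        self.agents = agents  # assumed number of agents answering

    def join(self, unique_id, asterisk_host):
        """Add a caller and return their 1-based position in the queue."""
        with self._lock:
            self._waiting.append((unique_id, asterisk_host, time.time()))
            return len(self._waiting)

    def estimated_wait(self, position):
        """Naive estimate in seconds: callers ahead, spread over the agents."""
        return (position - 1) * self.avg_handle_seconds // self.agents

    def next_caller(self):
        """Pop the longest-waiting caller, whichever server holds the channel."""
        with self._lock:
            if not self._waiting:
                return None
            unique_id, host, _ = self._waiting.popleft()
            return unique_id, host
```

Each FastAGI session would call join() with the channel's agi_uniqueid and the originating server, announce estimated_wait() to the caller, and keep the channel on hold until next_caller() hands it to an agent. Since the queue lives in the controller rather than in any one Asterisk instance, adding servers adds capacity to the same logical queue.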
Even if you do not use such an architecture, you're going to have to think along these lines. Kamailio alone cannot make anything scale; the service delivery backend has to be built with a distributed architecture in mind.
-- Alex