Saturday, March 24, 2007

Distributed Service Platform (The big picture)

Internet service applications, like other types of scalability-demanding applications, usually suffer from limitations in the design of current operating systems. In recent years, these applications have received considerable attention from researchers studying the challenges of supporting a large number of concurrent requests. A common conclusion of those studies is that existing operating system designs are ill-suited to the needs of Internet server applications, and consequently various operating-system-level mechanisms have been proposed to address those limitations. These mechanisms include scalable event delivery (the FreeBSD kqueue and Linux epoll interfaces), zero-copy sockets, direct I/O, and many others. These solutions greatly enhance the ability of the operating system to cope with the type of load experienced by busy Internet servers, but they are limited to a single instance of an operating system.
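To make the idea of scalable event delivery a bit more concrete, here is a minimal sketch in Python. It relies on the standard `selectors` module, whose `DefaultSelector` picks the most efficient mechanism available on the platform (epoll on Linux, kqueue on FreeBSD). The `echo_once` function and the use of a local socket pair are my own illustrative assumptions, not part of any existing server; a real server would register many client sockets with the same selector.

```python
import selectors
import socket


def echo_once(payload: bytes) -> bytes:
    """Pass payload through a local socket pair, reading it back only
    when the selector reports that the receiving socket is readable.

    Hypothetical helper for illustration only.
    """
    # DefaultSelector chooses epoll, kqueue, or a fallback such as
    # poll/select, depending on what the platform provides.
    sel = selectors.DefaultSelector()
    sender, receiver = socket.socketpair()
    try:
        receiver.setblocking(False)
        sel.register(receiver, selectors.EVENT_READ)

        sender.sendall(payload)

        received = b""
        while len(received) < len(payload):
            events = sel.select(timeout=1.0)
            if not events:
                break  # nothing became readable in time; give up
            for key, _mask in events:
                # The kernel told us this socket is ready, so recv()
                # will not block.
                received += key.fileobj.recv(4096)
        return received
    finally:
        sel.close()
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(echo_once(b"ping"))
```

The point of the sketch is that the application asks the kernel which sockets are ready instead of probing each one in turn, which is what lets a single process watch a very large number of connections. It still runs on one operating system instance, which is exactly the limitation noted above.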

On the other hand, parallel and distributed systems have been studied extensively by computer scientists in the contexts of high-performance computing, high availability, fault tolerance, and resource sharing. Those studies have shown that distributed systems can address scalability and performance problems very effectively. This makes distributed architectures very attractive for Internet server applications. To my knowledge, however, no attempts have been made to formally study the system-level mechanisms (operating system kernel and system software) that enable Internet server applications to use distributed system resources to meet their massive concurrency and scalability needs. The main barriers to using distributed architectures in Internet server applications are 1) the complexity of writing distributed applications, 2) the diminishing benefit of adding distributed resources, caused by existing operating system mechanisms that may limit overall system scalability, and 3) the unpredictability of resource utilization, caused by the application's lack of control over system-level mechanisms.

In my opinion, there is great practical potential for a service platform that precisely addresses the needs of complex Internet server applications and scales beyond the limits of a single physical machine, while still providing a simple and convenient programming model that fits nicely into existing software development processes.
