A Framework Providing Fault Tolerance Using the CORBA Trading Service.
This thesis describes research into the fault tolerance problem that arises in large-scale distributed systems, and presents a partial solution to it. In such systems, application components may eventually fail because of hardware problems, operator mistakes or software failures. In most environments, and in a banking environment in particular, such failures are not acceptable, so fault tolerance mechanisms must be employed to reduce a system's susceptibility to failure. This thesis provides a framework that supports the development of fault-tolerant distributed application components within an existing banking environment, such as the Jetpac infrastructure of WDR London. Potential application component failures in the Jetpac infrastructure were identified, and an architecture to overcome them was designed and implemented using CORBA, the distributed object middleware specified by the OMG. Central to this architecture is the OMG's CORBA Trading Service, which serves as the mechanism for advertising and managing service offers for fault-tolerant application components. The fault tolerance mechanisms are hidden from the client application program and therefore from the user. Quality of Service (QoS), in terms of performance, was also addressed and evaluated. The prototype implementation of the proposed architecture fits into the Jetpac infrastructure and provides a mechanism to (re)discover and (re)connect master and backup application components at run time. This improves overall system stability and scalability and contributes to a highly available system.
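The (re)discovery and (re)connection idea can be illustrated with a minimal sketch. It is not the thesis's implementation and does not use the CORBA Trading Service API: `Trader`, `Offer`, `FaultTolerantProxy`, `ComponentFailure` and `make_component` are hypothetical names, and the trader is a plain in-memory registry assumed only for illustration. The sketch shows a client-side proxy that queries for service offers, prefers the master, and transparently falls back to a backup when a call fails.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class ComponentFailure(Exception):
    """Raised when a component cannot serve a request (hypothetical)."""


@dataclass
class Offer:
    """A service offer: a service type, a role, and a callable endpoint."""
    service_type: str
    role: str  # "master" or "backup"
    invoke: Callable[[str], str]


class Trader:
    """Hypothetical in-memory stand-in for a trading service."""

    def __init__(self) -> None:
        self._offers: List[Offer] = []

    def export(self, offer: Offer) -> None:
        """Advertise a service offer."""
        self._offers.append(offer)

    def withdraw(self, offer: Offer) -> None:
        """Remove an offer, ignoring offers that are already gone."""
        if offer in self._offers:
            self._offers.remove(offer)

    def query(self, service_type: str) -> List[Offer]:
        """Return matching offers, masters before backups."""
        matches = [o for o in self._offers if o.service_type == service_type]
        return sorted(matches, key=lambda o: o.role != "master")


class FaultTolerantProxy:
    """Hides discovery and failover from the client code."""

    def __init__(self, trader: Trader, service_type: str) -> None:
        self._trader = trader
        self._service_type = service_type
        self._current: Optional[Offer] = None

    def call(self, request: str) -> str:
        while True:
            if self._current is None:
                # (Re)discover: ask the trader for the best available offer.
                offers = self._trader.query(self._service_type)
                if not offers:
                    raise ComponentFailure("no offers for " + self._service_type)
                self._current = offers[0]
            try:
                return self._current.invoke(request)
            except ComponentFailure:
                # Drop the failed offer, then loop to (re)connect elsewhere.
                # Each failure removes one offer, so the loop terminates.
                self._trader.withdraw(self._current)
                self._current = None


def make_component(name: str):
    """Build a fake component plus a state dict used to simulate failure."""
    state = {"up": True}

    def invoke(request: str) -> str:
        if not state["up"]:
            raise ComponentFailure(name + " is down")
        return f"{name} handled {request}"

    return invoke, state
```

In this sketch the client only ever calls `FaultTolerantProxy.call`; whether the reply came from the master or a backup is invisible to it, which mirrors the transparency goal described above. A real deployment would also need state transfer between master and backup, which the sketch leaves out.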