Fault localization in Distributed Adaptive Systems

dc.contributor.advisor	Clarke, Siobhán
dc.contributor.author	Raj, Amit
dc.date.accessioned	2018-05-16T15:41:54Z
dc.date.available	2018-05-16T15:41:54Z
dc.date.issued	2016
dc.identifier.citation	Amit Raj, 'Fault localization in Distributed Adaptive Systems', [thesis], Trinity College (Dublin, Ireland). School of Computer Science & Statistics, 2016
dc.identifier.other	THESIS 10961
dc.identifier.uri	http://hdl.handle.net/2262/82921
dc.description.abstract	Modern systems that execute in ubiquitous environments must be adaptive in order to maintain an acceptable quality of service. As in traditional distributed systems, faults in such systems are likely to propagate across several components and manifest as apparently unrelated faults in components other than the responsible one. Fault recovery requires an efficient and accurate run-time root cause detection mechanism, a problem that has been solved for non-adaptive systems. However, it is non-trivial in a distributed adaptive system (DAS) because of the changing nature of the system's structure and behaviour, the potential inaccessibility of adapted system components, and the potential for inaccurate diagnostic information about the system. The key driver of this research is the observation that a debugging expert wastes effort searching for the root cause within the component where a fault's symptom was observed, when the root cause may exist in another component. The key idea is to pinpoint the actual faulty component. This thesis presents FaLDAS, a fault localization approach that finds the actual faulty components from which propagated faults originate in a DAS. The contributions of the work are: (i) a novel fault propagation graph (FPG) and its construction using components' names and their run-time input/output values, (ii) faulty-candidate detection, which returns a sorted list of potentially faulty candidates, and (iii) a fine-grained fault analysis that finds the potentially corrupted variable of a faulty component whose propagation has caused a propagated fault. The key outcomes of the thesis are (i) improved efficiency over existing mechanisms, (ii) the ability to diagnose inaccessible components, and (iii) detection of the faulty component and its corrupted output variable. FaLDAS uses run-time values to identify whether a component is suspected of being faulty (a suspicious component) by traversing an FPG.
Unlike existing mechanisms, which use all the components of a system to identify the actual faulty components, FaLDAS uses only suspicious components. This reduces the search space for finding faulty components and increases efficiency. The run-time input/output values enable fault diagnosis of components whose internal designs (e.g., source code, test cases, data flow) are inaccessible. Unlike existing mechanisms, which fail to diagnose inaccessible components, FaLDAS shows substantial success in diagnosing them. In addition, detecting the corrupted variable of a faulty component enables a debugging expert to look for a specific root cause in a specific part of the component, which reduces root cause analysis effort. FaLDAS's fault localization algorithm finds a set of potentially faulty candidates. A candidate is a node in the FPG (FPGNode) that holds a reference to a potentially faulty component. In addition, FaLDAS sorts the candidates according to their probability of being faulty. The final output of FaLDAS is the sorted list of candidates, which is a recommended order in which a debugging expert should prioritize components for root cause analysis. FaLDAS was evaluated for time efficiency and accuracy on simulated systems and on a real-world system, TCAS (Traffic Collision Avoidance System), which is in widespread use on major aircraft in the United States. Simulated systems with a large number of components were used to measure time efficiency. FaLDAS shows higher time efficiency than other related approaches, by up to two orders of magnitude. FaLDAS's accuracy was evaluated on both simulated systems and TCAS. In particular, accuracy was evaluated in scenarios where the actual faulty component is inaccessible and where the required input information is inaccurate. In the former case FaLDAS achieves higher accuracy, but not in the latter. In summary, FaLDAS achieves higher efficiency than other related approaches without compromising the quality of results.
dc.format	1 volume
dc.language.iso	en
dc.publisher	Trinity College (Dublin, Ireland). School of Computer Science & Statistics
dc.relation.isversionof	http://stella.catalogue.tcd.ie/iii/encore/record/C__Rb16688910
dc.subject	Computer Science, Ph.D.
dc.subject	Ph.D. Trinity College Dublin
dc.title	Fault localization in Distributed Adaptive Systems
dc.type	thesis
dc.type.supercollection	thesis_dissertations
dc.type.supercollection	refereed_publications
dc.type.qualificationlevel	Doctoral
dc.type.qualificationname	Doctor of Philosophy (Ph.D.)
dc.rights.ecaccessrights	openAccess
dc.description.note	TARA (Trinity’s Access to Research Archive) has a robust takedown policy. Please contact us if you have any concerns: rssadmin@tcd.ie

Files in this item

Name:: Raj, Amit_TCD-SCSS-PHD-2016-10.pdf
Size:: 3.877Mb
Format:: PDF
