Inference and maintenance planning of monitored structures through Markov chain Monte Carlo and deep reinforcement learning
Item Type: Conference Paper
Citation: Christos Lathourakis, Charalampos Andriotis, Alice Cicirello, Inference and maintenance planning of monitored structures through Markov chain Monte Carlo and deep reinforcement learning, 14th International Conference on Applications of Statistics and Probability in Civil Engineering (ICASP14), Dublin, Ireland, 2023.
submission_222.pdf (PDF) 2.470 MB
Maintenance of engineering systems exposed to corrosive environments, e.g. coastal, marine, or highly acidic conditions, is of utmost importance for managing structural risks. The most beneficial sequence of maintenance decisions, i.e. the one that strikes the best balance between life-cycle intervention costs and expected failure losses, can be sought as the solution to an optimization problem. Owing to the high complexity of this sequential decision optimization problem over multi-step planning horizons, traditional methods such as threshold-based approaches often get trapped in sub-optimal solution regions. This can be attributed to the inability of such techniques to capture and control combinatorial system-level component interactions, as they typically operate on a single-component optimality basis. Recently, Deep Reinforcement Learning (DRL) frameworks have been shown to tackle such problems effectively in high-dimensional multi-component systems. Another major issue encumbering efficient decision-making is the high level of uncertainty in the deterioration processes. Bayesian principles and model updating are, therefore, key for translating data acquired through monitoring devices into actionable knowledge about the stochastic system. In this paper, an integrated framework combining DRL and Bayesian model updating is developed, aiming to determine an optimal sequence of maintenance decisions over the lifespan of continuously monitored deteriorating engineering systems. More specifically, different single- and multi-agent DRL architectures are considered, trained through double deep Q-network and proximal policy optimization, while the uncertain continuous-valued environment parameters are updated through Hamiltonian Markov Chain Monte Carlo (HMCMC) with No-U-Turn Sampling (NUTS).
The proposed methodology is first applied to an elementary problem, which also acts as a verification and validation testbed, and then to a more realistic and complicated one, pertaining to a multi-component structural frame. To highlight the benefits of the proposed method, it is compared against optimized time- and condition-based heuristic approaches in both cases. The obtained results show that the coupled DRL-HMCMC framework outperforms the benchmark decision strategies in terms of life-cycle cost minimization and policy sophistication.
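To make the Bayesian updating step concrete, the following is a minimal illustrative sketch of Hamiltonian Monte Carlo for inferring a single uncertain deterioration-rate parameter. The target log-posterior (a Gaussian with hypothetical mean 0.1 and standard deviation 0.02), the step size, and the fixed-length leapfrog integrator are all assumptions made for illustration; the paper's framework uses HMCMC with No-U-Turn Sampling, which adapts the trajectory length automatically instead of fixing it as done here.

```python
import numpy as np

# Hypothetical 1D Gaussian log-posterior over a deterioration rate theta
# (mean 0.1, std 0.02); in the actual framework the posterior would come
# from a deterioration model conditioned on monitoring data.
MU, SIGMA = 0.1, 0.02

def log_post(theta):
    return -0.5 * ((theta - MU) / SIGMA) ** 2

def grad_log_post(theta):
    return -(theta - MU) / SIGMA**2

def hmc_step(theta, rng, step=0.005, n_leapfrog=20):
    """One HMC transition: sample momentum, integrate, accept/reject."""
    p = rng.normal()                      # auxiliary momentum
    theta_new, p_new = theta, p
    # Leapfrog integration of the Hamiltonian dynamics
    p_new += 0.5 * step * grad_log_post(theta_new)
    for _ in range(n_leapfrog - 1):
        theta_new += step * p_new
        p_new += step * grad_log_post(theta_new)
    theta_new += step * p_new
    p_new += 0.5 * step * grad_log_post(theta_new)
    # Metropolis correction on the total Hamiltonian (potential + kinetic)
    h_old = -log_post(theta) + 0.5 * p**2
    h_new = -log_post(theta_new) + 0.5 * p_new**2
    return theta_new if np.log(rng.uniform()) < h_old - h_new else theta

rng = np.random.default_rng(0)
theta, samples = 0.0, []
for i in range(2000):
    theta = hmc_step(theta, rng)
    if i >= 500:                          # discard burn-in
        samples.append(theta)
print(float(np.mean(samples)))            # posterior mean, close to MU = 0.1
```

The gradient of the log-posterior is what distinguishes HMC from random-walk Metropolis: it lets proposals travel along the posterior geometry, which is what makes the approach viable for the continuous-valued environment parameters of the stochastic deterioration model.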
Other Titles: 14th International Conference on Applications of Statistics and Probability in Civil Engineering (ICASP14)
Type of material: Conference Paper
Series/Report no: 14th International Conference on Applications of Statistics and Probability in Civil Engineering (ICASP14)
Availability: Full text available