Alternative metrics for design decisions based on separating aleatory and epistemic probabilistic uncertainties

There is still much philosophical debate about whether a frequentist or subjective view of probability should be adopted. Some uncertainties (typically aleatory uncertainties) are naturally modelled using a frequentist approach, while others (epistemic uncertainties) are clearly subjective in nature. In light of this it has been argued, for example by the German philosopher Rudolf Carnap, that both potential descriptions of uncertainty should be maintained and treated separately. However, in current engineering practice it is common to make no distinction between these two types of uncertainty. Generally, uncertainty is represented by a single figure or distribution, for example a probability of failure, which incorporates both aleatory and epistemic uncertainties. This paper explores the idea of treating aleatory and epistemic uncertainties separately and proposes alternative metrics, based on the epistemic probability of an aleatory probability, which can potentially provide greater insight for the designer in engineering problems. The metrics are illustrated using two example engineering dynamics problems: the prediction of wind-induced accelerations in a tall building and optimizing the design of sound proofing in a car. It is shown that as well as providing further insight on the underlying contributing causes of uncertainty, treating aleatory and epistemic uncertainties as separate quantities, as opposed to in the traditional combined manner, can potentially lead to different design outcomes.


1. Introduction
There is still philosophical debate about the correct interpretation of probability [1]. Two broad interpretations, often termed the frequentist and subjective approaches, are generally presented. The frequentist view defines the probability of some event in terms of the relative frequency with which the event tends to occur, while the subjective view defines probability as a measure of the strength of belief regarding the true situation. Various authors have argued that one approach is superior to the other. This debate exists in both classic texts (with 'frequentist' works like those by von Mises or Feller and 'subjectivist' or 'Bayesian' ones such as those by Ramsey or Jaynes) and contemporary literature [2,3]. In his work Logical Foundations of Probability [4] the German philosopher Rudolf Carnap argues that both descriptions of probability should be maintained as the frequentist and subjectivist approaches are fundamentally different, or 'there are two fundamentally different concepts for which the term 'probability' is in general use'. Rather than promoting one view over the other, he argues that 'both concepts are important for science' and defines two separate types of probability, which he refers to as Probability 1 and Probability 2: Probability 1 corresponds to the logical or epistemic concept of degree of confirmation, while Probability 2 corresponds to the frequentist concept of long-run relative frequency. In the context of engineering design and digital twins, understanding to what extent uncertainty may be reduced, for example through monitoring, becomes increasingly beneficial, as design decisions, or even the decision as to whether or not to employ a digital twin, may depend on the balance between reducible and irreducible uncertainty.

2. Theory
2.1. Second order failure probability: $P_1[P_2(\text{fail})]$

Fig. 1 presents a typical engineering problem with associated uncertainty. There is a calculation model with some inputs, $x$, that are $P_1$ uncertain (subjective or epistemic) and others, $y$, that are $P_2$ uncertain (frequentist or aleatory). It is assumed in this study that these uncertain inputs can be represented by probability density functions $p_1(x)$ and $p_2(y)$ respectively, in line with Carnap's definitions of Probability 1 and Probability 2. The output of the calculation model is labelled $z$ and has an associated failure level, $z_f$. Conventionally, where no distinction is made between probability types, the failure probability in such a scenario is calculated as:

$$P_0 = \iint_{R(z > z_f)} p_{1,2}(x, y)\, \mathrm{d}y\, \mathrm{d}x \tag{1}$$

where $R(z > z_f)$ is the region in $x, y$ space where the output, $z$, exceeds the allowable level, $z_f$, and $p_{1,2}(x, y)$ is the joint PDF of the input variables. As per standard convention, in this study a lowercase $p$ refers to a probability density function (PDF) while an uppercase $P$ refers to a probability value or cumulative distribution function (CDF).
The calculation described by Eq. (1) makes no distinction between uncertainty types. For the purpose of this paper, this type of combined failure probability is termed $P_0$, in order to distinguish it from an epistemic, or Probability 1, type probability, $P_1$, and an aleatory, or Probability 2, type probability, $P_2$. An example application of Eq. (1), which may provide further clarity, is presented in Section 3 of this paper and illustrated in Fig. 5.
An alternative to this combined $P_0$ failure probability can be obtained by treating epistemic and aleatory uncertainties as separate quantities. A metric for this purpose is proposed in Eqs. (2) and (3). This is a two-step calculation process, where the aleatory failure probability conditional on the epistemic variables is calculated first, before the epistemic probability of this being the case is evaluated. The first step involves obtaining the conditional probability of failure given a value of $x$, which can be done as follows:

$$P_2(z > z_f \mid x) = \int_{R_1} p_2(y)\, \mathrm{d}y \tag{2}$$

where $R_1$ is the region of $y$ values where the output, $z$, exceeds the allowable level, $z_f$, for a given $x$. This calculation can be repeated for all possible values of $x$, allowing a conditional distribution to be developed. The second step involves establishing an acceptable $P_2$ failure probability, denoted $\hat{P}_2$, for example $\hat{P}_2 = 10\%$, and calculating the $P_1$ probability of the system being such that this is exceeded:

$$P_1\big[P_2(\text{fail}) > \hat{P}_2\big] = \int_{R_2} p_1(x)\, \mathrm{d}x \tag{3}$$

where $R_2$ is the region of $x$ values where the conditional $P_2$ failure probability, $P_2(z > z_f \mid x)$, exceeds the acceptable level, $\hat{P}_2$. This means that instead of a single figure for the $P_0$ failure probability, uncertainty is represented by a second order probability, $P_1[P_2]$, which can be used to provide the designer with further insight on the relative contribution of reducible and irreducible uncertainty. An example application of Eqs. (2) and (3), which may provide further clarity, is presented in Figs. 6-8. This metric, and related measures of median and average values, has previously been proposed by [14]. However, to the best of the author's knowledge, the metric appears to have gained little traction in the literature outside of work in the field of nuclear engineering, but it has importance for use in dynamics, particularly in the context of digital twin accompanied design and asset management.

Fig. 1. Typical engineering problem with input parameters with $P_1$ and $P_2$ uncertainties.

Fig. 2 presents a second typical engineering design problem, this time where a design parameter, $\eta$, is selected to minimize some cost function, $C$, which is a function of the conditional $P_2$ failure probability for a given $\eta$ and $x$, i.e. $C\big(P_2(z > z_f \mid x, \eta)\big)$.

2.2. Minimization of a cost function
The aim of the calculation described in this section is to choose an optimal design parameter, $\eta$.
There are two ways the optimum design parameter can be selected. Firstly, the expected value, or ensemble average, of the cost function, $C$, can be minimised, as in conventional optimization:

$$\eta_{\text{opt}} = \operatorname*{arg\,min}_{\eta} E[C](\eta) \tag{4}$$

where the expected value of the cost for a given $\eta$ across the range of possible values of $x$ is obtained by:

$$E[C](\eta) = \int_{-\infty}^{\infty} C\big(P_2(z > z_f \mid x, \eta)\big)\, p_1(x)\, \mathrm{d}x \tag{5}$$

It can be appreciated that the result of Eq. (5) is a function of both probability types and therefore the minimization in Eq. (4) makes no distinction between Probability 1 and Probability 2. An example application of Eqs. (4) and (5) is presented later in this paper in Section 4 and illustrated in Fig. 16.
Instead of choosing $\eta$ to minimise $E[C]$, it is possible to define a maximum acceptable cost, $\hat{c}$, and search for the design parameter value, $\eta$, that minimises the $P_1$ probability of exceeding this cost. The probability of exceeding the maximum acceptable cost, $P_1[C > \hat{c}]$, is given by:

$$P_1[C > \hat{c}] = \int_{R_3} p_1(x)\, \mathrm{d}x \tag{6}$$

where $R_3$ is the region in the epistemic probability space, or set of values of $x$, where the cost exceeds the limiting value. The optimum design parameter to minimise this probability can be obtained by conventional minimisation:

$$\eta_{\text{opt}} = \operatorname*{arg\,min}_{\eta} P_1[C > \hat{c}] \tag{7}$$

An example application of Eqs. (6) and (7) is illustrated schematically in Fig. 15. Unlike Eq. (5), this is a two-step process where the $P_2$ and $P_1$ uncertainties are propagated separately. The optimum $\eta$ value returned by Eq. (7) can be viewed as an alternative to the value returned via Eq. (5). It can also be seen that Eqs. (6) and (7) are based on a cost function. Therefore any computational method whose outputs can be used to define some cost function (for example the Finite Element Method, the Stochastic Finite Element Method or Statistical Energy Analysis) can be employed within this framework.
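To make the two routes concrete, the following minimal sketch propagates a hypothetical epistemic variable through a placeholder cost function and evaluates both the Eq. (4)-(5) and the Eq. (6)-(7) style optima on a grid. The distribution of $x$, the conditional failure model `p2_fail`, and all constants are illustrative assumptions, not the models used in the examples later in this paper.

```python
import numpy as np
from scipy import stats

# Minimal sketch of the two selection routes. The epistemic variable x, the
# conditional failure model p2_fail, and all constants are hypothetical
# placeholders, not the models used in the examples later in the paper.
p1 = stats.norm(loc=1.0, scale=0.2)   # epistemic PDF p1(x)

def p2_fail(x, eta):
    # Placeholder conditional failure probability: decreases as the design
    # parameter eta grows, increases with the unknown-but-fixed x.
    return 1.0 - np.exp(-x / (10.0 * eta))

def cost(x, eta, alpha=1.0, beta=150.0):
    # Illustrative cost: linear design cost plus expected repair cost
    # proportional to the conditional failure probability.
    return alpha * eta + beta * p2_fail(x, eta)

x = np.linspace(0.2, 1.8, 2001)       # grid over the epistemic variable
dx = x[1] - x[0]
etas = np.linspace(0.05, 8.0, 400)    # candidate design parameter values
c_hat = 60.0                          # maximum acceptable cost (arbitrary)

# Route 1, Eqs. (4)-(5): minimise the expected cost, mixing both
# uncertainty types in a single figure.
E_C = [np.sum(cost(x, e) * p1.pdf(x)) * dx for e in etas]

# Route 2, Eqs. (6)-(7): for each eta, integrate p1(x) over the region R3
# where C > c_hat, then minimise that P1 exceedance probability.
P1_ex = [np.sum((cost(x, e) > c_hat) * p1.pdf(x)) * dx for e in etas]

print(f"eta minimising E[C]:          {etas[np.argmin(E_C)]:.2f}")
print(f"eta minimising P1[C > c_hat]: {etas[np.argmin(P1_ex)]:.2f}")
```

With these placeholder choices the two criteria return different optima, previewing the behaviour demonstrated for the car example later in the paper.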

3. Illustrative example - Wind-induced acceleration of a tall building
In order to illustrate the use of the second order probability metric defined in Eq. (3), the problem of predicting the peak floor acceleration, $z$, of a tall building under wind gusting is considered. This is a suitable example problem as there is clear aleatory uncertainty associated with the wind excitation, while epistemic uncertainty is generally present in the level of damping, which in real-world scenarios is often unknown until it is measured after the building is completed. This kind of epistemic uncertainty, where damping is only known after construction, is relatively common in one-off civil engineering structures like tall buildings or pedestrian footbridges [21,22]. Furthermore, there is a clearly defined maximum allowable floor acceleration, $z_f$, specified by ISO 10137:2007 [23]. Accurate calculation of the peak floor acceleration of a tall building is difficult, and in reality typically involves wind tunnel testing, but for the purpose of this illustrative example it is assumed that the prediction method proposed in Annex C of Eurocode EN 1991-1-4 [24] is adequate for calculating peak along-wind acceleration. Details of this computational model can be found in [25]. Fig. 3 illustrates how this example problem fits neatly into the general calculation framework given in Fig. 1.

Fig. 2. Typical minimization problem with parameters with $P_1$ uncertain input and $P_2$ uncertain failure probability.
The input PDFs for the example case considered are shown in Fig. 4. A normal distribution is employed to represent the subjective uncertainty associated with the level of damping and a Gumbel distribution is used to represent the $p_2$ probability of the mean 10-minute wind speed. It is important to state that the distribution employed to represent damping is purely illustrative and is not necessarily the optimal way to represent this uncertainty in reality. Any probability distribution, for example a uniform distribution or a lognormal distribution, may be employed within the framework. Appendix A examines the impact of employing a uniform distribution, as opposed to a normal distribution, to represent uncertainty in the value for damping in this example problem. However, due to the many different forms of probability distribution that could potentially be employed, and the fact that different example problems will have different sensitivities to the epistemically uncertain parameter, it is important to emphasise that it is not possible to make universally applicable statements about the impact of the choice of distribution on the outcome. The joint PDF of these two probability distributions (which assumes that they are not fundamentally different and that it is permissible to combine them) is also shown in Fig. 4.
The example building under assessment is assumed to have a natural frequency of 0.3 Hz, which approximately corresponds to a 130 m tall structure. The Eurocode computational model is used to develop a response surface, as illustrated in Fig. 5. In this example, this is done by evaluating the peak acceleration across the entire range of feasible wind speed and damping values; however, in (realistic) cases where the computational model is more expensive, adopting a more sophisticated sampling approach may be necessary. This is discussed in more detail by [15]. From this response surface, the region where peak acceleration exceeds the allowable limit, $R_1$, can easily be established, as illustrated in Fig. 5. The $P_0$ failure probability can then be calculated by numerical integration of the joint PDF of the two input variables over this region, in line with Eq. (1). For this example case, it can be seen that there is a failure probability of 46%. It should be pointed out here that such a high failure probability is realistic, as excessive wind-induced acceleration is a serviceability failure, where some occupants may feel uncomfortable, as opposed to a critical or ultimate limit state failure leading to building collapse.
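The following sketch reproduces the structure of this calculation. The Gumbel parameters for wind speed, the power-law stand-in for the Annex C acceleration model, and the numerical value of $z_f$ are all assumptions introduced for illustration; the paper's actual response surface comes from the Eurocode model.

```python
import numpy as np
from scipy import stats

# Sketch of the P0 calculation of Eq. (1) for the building example. The Gumbel
# parameters for wind speed, the power-law stand-in for the Eurocode Annex C
# acceleration model, and the numerical z_f are assumptions for illustration.
p1_damping = stats.norm(loc=0.012, scale=0.002)   # epistemic: damping ratio zeta
p2_wind = stats.gumbel_r(loc=20.0, scale=2.5)     # aleatory: mean 10-min wind [m/s]

def peak_acceleration(v, zeta):
    # Placeholder response model: grows with wind speed, falls with damping.
    return 2.2e-5 * v**2 / np.sqrt(zeta)

z_f = 0.09  # allowable peak acceleration [m/s^2] (illustrative)

# Response surface over a (wind speed, damping) grid, cf. Fig. 5.
v = np.linspace(5.0, 45.0, 800)
zeta = np.linspace(0.005, 0.02, 600)
V, Z = np.meshgrid(v, zeta)
dv, dzeta = v[1] - v[0], zeta[1] - zeta[0]

# Eq. (1): integrate the joint PDF (independence assumed) over the region
# R(z > z_f) where the response surface exceeds the allowable level.
failure_region = peak_acceleration(V, Z) > z_f
joint_pdf = p2_wind.pdf(V) * p1_damping.pdf(Z)
P0 = np.sum(joint_pdf * failure_region) * dv * dzeta
print(f"Combined P0 failure probability: {P0:.2f}")
```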
The approach described by Eqs. (2) and (3), where Probability 1 and Probability 2 are treated separately, is then considered. Firstly, the conditional $P_2$ failure probability is calculated via Eq. (2), as illustrated in Fig. 6. For a specified damping level, the range of wind speeds where the response exceeds the allowable value for peak acceleration, $R_1$, is identified and the probability of the wind speed being in that region is calculated through numerical integration of the $p_2$ PDF over this region. Fig. 6 illustrates this calculation for a damping value of $\zeta = 1.2\%$. It can be seen that for this example scenario, the conditional $P_2$ failure probability is 44%. The calculation can be repeated across a range of feasible levels of damping, to give a curve showing how the $P_2$ failure probability changes with damping, as shown in Fig. 7. The value for $\zeta = 1.2\%$, obtained from the calculation shown in Fig. 6, is marked on the plot, illustrating that Fig. 7 is constructed by repeating this calculation for all feasible values of damping. This curve, which shows the conditional $P_2$ probability of failure for a given damping, is the starting point for Eq. (3), where the $P_1$ probability of damping is considered. In Eq. (3) this is done using a target failure probability, $\hat{P}_2$. This concept is illustrated in Fig. 8 using a target value of $\hat{P}_2 = 50\%$. The range of damping values where the conditional $P_2$ probability of failure is greater than this target value, termed $R_2$ in Eq. (3), is easily identifiable. The probability of damping corresponding to this region is then obtained by integrating the PDF of damping, $p_1(\zeta)$, over this region. In the example case shown, there is a 29% $P_1$ probability that the $P_2$ failure probability is greater than 50%. Or, phrased slightly differently, there is a 29% $P_1$ probability that the building will be such that the chance of failure due to random excitation exceeds 50%.
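A sketch of this two-step calculation, using the same assumed stand-in model and distributions as above, is given below. Because the stand-in response is monotonic in wind speed, the integral over $R_1$ reduces to a Gumbel tail probability; the printed value is illustrative and will not match the 29% quoted for the paper's model.

```python
import numpy as np
from scipy import stats

# Sketch of the two-step calculation of Eqs. (2) and (3), using the same
# assumed stand-in model and distributions as the P0 sketch above.
p1_damping = stats.norm(loc=0.012, scale=0.002)
p2_wind = stats.gumbel_r(loc=20.0, scale=2.5)
z_f = 0.09

def p2_fail_given_zeta(zeta):
    # Eq. (2): for fixed damping, R1 is the set of wind speeds with z > z_f.
    # The stand-in response is monotonic in v, so R1 is a tail of the wind
    # distribution and the Gumbel CDF gives the integral directly.
    v_crit = np.sqrt(z_f * np.sqrt(zeta) / 2.2e-5)
    return 1.0 - p2_wind.cdf(v_crit)

# Conditional P2 failure probability across feasible damping values (cf. Fig. 7).
zeta = np.linspace(0.005, 0.02, 1000)
dzeta = zeta[1] - zeta[0]
p2_curve = p2_fail_given_zeta(zeta)

# Eq. (3): P1 probability that the conditional P2 failure probability exceeds
# a target value (region R2), here 50% as in Fig. 8.
target = 0.5
in_R2 = p2_curve > target
P1_exceed = np.sum(p1_damping.pdf(zeta) * in_R2) * dzeta
print(f"P1[P2(fail) > {target:.0%}] = {P1_exceed:.2f}")
```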
A further extension can be made to Eq. (3) by repeating the calculation across the full range of possible target values, $\hat{P}_2$, from 0 to 1, allowing a complementary cumulative distribution function (C-CDF) of $P_1[P_2(\text{fail}) > \hat{P}_2]$ to be constructed, as illustrated in Fig. 9. The value at $\hat{P}_2 = 50\%$ is marked with an orange dot, again demonstrating how the C-CDF is constructed by simply repeating the calculation performed in Fig. 8 across a range of $\hat{P}_2$ values.
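Constructing the C-CDF then amounts to sweeping the target $\hat{P}_2$ from 0 to 1 and evaluating Eq. (3) at each value, as in this minimal continuation of the sketch above (same assumed stand-in model):

```python
import numpy as np
from scipy import stats

# Sweep the target value from 0 to 1 and evaluate Eq. (3) at each point to
# build the C-CDF of Fig. 9 (same assumed stand-in model as above).
p1_damping = stats.norm(loc=0.012, scale=0.002)
p2_wind = stats.gumbel_r(loc=20.0, scale=2.5)
z_f = 0.09
zeta = np.linspace(0.005, 0.02, 1000)
dzeta = zeta[1] - zeta[0]
p2_curve = 1.0 - p2_wind.cdf(np.sqrt(z_f * np.sqrt(zeta) / 2.2e-5))  # Eq. (2)

p2_hat = np.linspace(0.0, 1.0, 101)
ccdf = [np.sum(p1_damping.pdf(zeta) * (p2_curve > t)) * dzeta for t in p2_hat]
for t, c in zip(p2_hat[::25], ccdf[::25]):
    print(f"P1[P2(fail) > {t:.2f}] = {c:.2f}")
```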
The $P_1[P_2]$ metric and the C-CDF shown in Fig. 9 are potentially useful tools that can provide further insight for the designer beyond the $P_0$ failure probability shown in Fig. 5. For example, in the specific case of the design of a tall building, if there is a chance of failure, the designer, broadly speaking, has two options. Firstly, they can fundamentally change the design to reduce the probability of failure, for example through altering the building shape or stiffness. Alternatively, they can decide to proceed with the initial design and accept there is a possibility of failure which may need to be addressed with some remedial action post construction, for example through the installation of a tuned mass damper. If a digital twin is employed, the true value of damping can be calculated using measurements from the physical twin. Understanding, prior to these measurements, the contributions of aleatory and epistemic uncertainty to the failure probability, and how likely the probability of failure is to change once the true damping value for the finished building is obtained from the digital-physical twin pair, allows the risk of proceeding with a particular design to be assessed from an informed position.

There are a number of additional points worth making about this C-CDF. Firstly, the shape of the curve can inform the designer about the contribution of different uncertainties. This is illustrated in Fig. 10, which shows the C-CDF for two example design scenarios with different levels of epistemic uncertainty associated with damping. It can be seen that for the case of small epistemic uncertainty (achievable if, for example, data is available from an existing similar structure), the C-CDF becomes steeper, or viewed slightly differently, the $P_2$ failure probability becomes closer to having a fixed deterministic value. In terms of a digital twin, in the case labelled 'Design Scenario (a)', a wide range of failure probabilities (approximately 15% to 100%) appear possible, and learning the true value of $\zeta$ from measurements on the physical twin will significantly enhance understanding of the structure's future behaviour. In contrast, for the scenario labelled 'Design Scenario (b)', the designer knows that the true failure probability is close to 50% and only a limited reduction in uncertainty is achievable from physical twin measurements.

Fig. 11 presents a slightly different example, showing the evolution of the state of knowledge about a single building over time in a digital twin-physical twin pair. This shows the state of knowledge prior to any measurements, and the knowledge after a series of measurements from the physical twin are used to update the distribution representing $\zeta$ in the digital twin. Given the initial state of knowledge in the example shown, the cost/benefit of remedial action is difficult to assess. However, it can be appreciated that in a hypothetical case such as that illustrated, where measurements on the physical twin indicate that damping is lower and the $P_2$ failure probability is higher than initially anticipated, remedial action is likely to be required to reduce the failure probability.
In addition to this, it is also interesting to note that it can be shown that the area enclosed by the C-CDF is equal to the $P_0$ probability of failure calculated via Eq. (1) and illustrated in Fig. 5. In other words, for the example problem above, the area under the C-CDF in Fig. 9 is equal to the $P_0$ failure probability of 0.46 shown in Fig. 5. The proof of this is presented in Appendix B.

Fig. 11. Example of the evolution of the C-CDF of $P_1[P_2(\text{fail})]$ over time in a digital twin-physical twin pair.
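This identity is easy to check numerically. The sketch below evaluates both the area under the C-CDF and the $P_0$ integral for the assumed stand-in model used in the earlier sketches; the two values agree to within discretization error.

```python
import numpy as np
from scipy import stats

# Numerical check of the Appendix B identity for the assumed stand-in model:
# the area under the C-CDF should equal the P0 failure probability of Eq. (1).
p1_damping = stats.norm(loc=0.012, scale=0.002)
p2_wind = stats.gumbel_r(loc=20.0, scale=2.5)
z_f = 0.09
zeta = np.linspace(0.004, 0.022, 1500)
dzeta = zeta[1] - zeta[0]
p2_curve = 1.0 - p2_wind.cdf(np.sqrt(z_f * np.sqrt(zeta) / 2.2e-5))  # Eq. (2)

# Area under the C-CDF: integrate Eq. (3) over all targets in [0, 1] ...
p2_hat = np.linspace(0.0, 1.0, 2001)
dt = p2_hat[1] - p2_hat[0]
ccdf = np.array([np.sum(p1_damping.pdf(zeta) * (p2_curve > t)) * dzeta
                 for t in p2_hat])
area = np.sum(ccdf) * dt

# ... versus P0 via Eq. (B.4): the p1-weighted average of the conditional P2.
P0 = np.sum(p2_curve * p1_damping.pdf(zeta)) * dzeta
print(f"area under C-CDF = {area:.4f}, P0 = {P0:.4f}")
```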

4. Illustrative example - Optimizing the design of sound proofing in a car
The example problem of selecting the optimum level of sound proofing in a car is chosen to demonstrate the application of the metric proposed in Eq. (7). There is both aleatory and epistemic uncertainty associated with the acoustic performance of a car. In terms of aleatory uncertainty, it has been shown that there can be large differences in the interior noise levels of nominally identical vehicles arising from small variations introduced during the manufacturing process [26,27], for example due to spot weld stiffness, which can be different for every vehicle [28]. There is also epistemic uncertainty in production associated with the properties of the production line. This can arise through jig misalignment for example, where the extent of the misalignment is constant but unknown.
The presence of aleatory uncertainty means that for a given production line there is a realizable ensemble of random structures (i.e. cars) that may be produced, and therefore the frequency response function of interior sound is $P_2$ uncertain for each car from that production line. The epistemic uncertainty can be thought of as an unrealizable or imaginary ensemble of production lines, only one of which will exist in reality. As illustrated graphically in Fig. 12, this results in an ensemble of distributions describing the $P_2$ probability of failure. Failure is deemed to occur when the spatial average of the mean squared sound pressure within the cabin of a car under prescribed excitation (for example a shaker at a suspension mount), $z$, exceeds a limiting value, $z_f$. The role of the designer is to optimize the level of sound proofing in the car to prevent this limiting value being exceeded whilst minimising the cost of this sound proofing.
For the purpose of this example, calculation of interior noise is performed using a simplistic model to estimate interior sound pressure. Using Statistical Energy Analysis [29], the average $z$ value across an ensemble of random systems (in this case cars), $\mu_z$, can be shown to be inversely proportional to the loss factor of the interior, $\eta$, such that:

$$\mu_z = \frac{x}{\eta} \tag{8}$$

In terms of design, the engineer controls the level of sound proofing and thus controls $\eta$. In this simplified example, it is assumed that $x$ is a variable with some associated epistemic uncertainty arising from a lack of knowledge about the production line, for example in jig alignment or material properties. Given that this is constant but unknown, all cars from a single production line have the same value of $x$. Furthermore, it has been shown that a Gaussian Orthogonal Ensemble (GOE) statistical model can be used to quantify natural frequencies across an ensemble of random structures (for example [30,31]). Using a GOE approach, the variance, $\sigma_z^2$, of the sound pressure can be written as [27]:

$$\sigma_z^2 = \frac{\mu_z^2}{\pi m}\left[1 - \frac{1 - e^{-2\pi m}}{2\pi m}\right] \tag{9}$$

where $m$ is the modal overlap factor, given by:

$$m = \omega \eta n \tag{10}$$

where $\omega$ is the natural frequency of interest and $n$ is the modal density. Using $\mu_z$ and $\sigma_z$, a lognormal $p_2$ distribution (which has been shown to be adequate for representing the response of systems conforming to GOE [32]) for the mean square sound pressure conditional on $x$ and $\eta$, $p_2(z \mid x, \eta)$, can be constructed. From this, the $P_2$ probability of exceeding a failure threshold $z_f$, $P_2(z > z_f \mid x, \eta)$, can be calculated.
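A sketch of this conditional distribution construction is given below, using the reconstructed Eqs. (8)-(10). The analysis frequency $\omega$, the modal density value, and the conversion of the 65 dB threshold (re 20 µPa) to a mean square pressure are assumptions introduced for illustration.

```python
import numpy as np
from scipy import stats

# Sketch of the conditional distribution construction using the reconstructed
# Eqs. (8)-(10). omega, the modal density value, and the dB-to-pressure
# conversion are assumptions introduced for illustration.
omega = 2 * np.pi * 200.0             # analysis frequency [rad/s] (assumed)
n = 0.05                              # modal density [s] (assumed)
z_f = (20e-6) ** 2 * 10 ** (65 / 10)  # 65 dB threshold as mean square pressure [Pa^2]

def p2_fail(x, eta):
    mu = x / eta                  # Eq. (8): ensemble mean of z
    m = omega * eta * n           # Eq. (10): modal overlap factor
    r2 = (1 - (1 - np.exp(-2 * np.pi * m)) / (2 * np.pi * m)) / (np.pi * m)
    # Lognormal p2(z | x, eta) with mean mu and relative variance r2 (cf. [32]).
    s = np.sqrt(np.log1p(r2))
    scale = mu / np.sqrt(1 + r2)
    return 1.0 - stats.lognorm(s, scale=scale).cdf(z_f)

print(f"P2(z > z_f | x = 3e-4, eta = 0.25) = {p2_fail(3e-4, 0.25):.2f}")
```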
For the example problem considered here, the modal density is assumed to equal 50 s and $x$ is assumed to be normally distributed with a mean value of $3 \times 10^{-4}\ \mathrm{N^2/m^4}$ and a standard deviation of $1.5 \times 10^{-4}\ \mathrm{N^2/m^4}$. The goal of the design is to keep the spatially averaged interior mean squared sound pressure below 65 dB, i.e. 65 dB is the assumed failure threshold, $z_f$.
As with the tall building example in the previous section, this calculation model is a gross simplification. In reality, a detailed Hybrid Finite Element-SEA [33] model of the car would be employed to calculate interior sound pressure; however, the simplistic approach described by Eqs. (8) and (9) is adequate to illustrate the optimization procedure. Fig. 13 shows how the framework shown in Fig. 2 can be applied to this problem. To implement the minimization defined by Eqs. (4) and (7), the cost of production is defined as:

$$C(\eta, x) = \alpha \eta + \beta P_2(z > z_f \mid x, \eta) \tag{11}$$

where $\alpha$ is the unit cost of sound proofing and $\beta$ is the cost of repairing a failure case. $P_2(z > z_f \mid x, \eta)$ is the frequentist, or $P_2$ or aleatory, probability of failure for given $x$ and $\eta$. Eq. (11) employs the law of large numbers, and is therefore suitable only for a scenario where a large number of units are produced, like a car production line, as opposed to a one-off structure like a tall building.
Substituting Eq. (11) into Eq. (4), the optimum $\eta$ value to minimise the expected cost, i.e. the optimum value with no separation of uncertainty types, is given by:

$$\eta_{\text{opt}} = \operatorname*{arg\,min}_{\eta} \int_{-\infty}^{\infty} \big[\alpha \eta + \beta P_2(z > z_f \mid x, \eta)\big]\, p_1(x)\, \mathrm{d}x \tag{12}$$

Fig. 14 illustrates this calculation for $\beta = 150\alpha$. It can be observed that in this case the optimum value for $\eta$ is 0.09. Alternatively, as outlined by Eqs. (6) and (7), $\eta$ can be selected to minimise the $P_1$ probability of exceeding a limiting cost, $\hat{c}$. Fig. 15 illustrates the implementation of this process with the acceptable cost (arbitrarily) set to 60 and, as in Fig. 14, $\beta = 150\alpha$. Firstly, the region where the cost function exceeds the maximum acceptable cost is identified, as shown in the plot on the left, before the $P_1$ probability of this value being exceeded for each $\eta$ is calculated by numerically integrating $p_1(x)$ over this region. Doing this for each value of $\eta$ allows a curve showing $P_1[C > \hat{c}]$ as a function of $\eta$ to be developed, as shown by the plot on the right. From this, an optimum value, which in this example case is $\eta = 0.07$, can be identified.
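An end-to-end sketch of the two optimization routes for this example is given below, combining the reconstructed Eqs. (8)-(11) with a grid search. The constants ($\omega$, $n$, $\alpha$, $\beta$, $\hat{c}$) are assumptions, so the printed optima illustrate the procedure rather than reproduce the values of 0.09 and 0.07 quoted above.

```python
import numpy as np
from scipy import stats

# End-to-end sketch of the two optimization routes for the car example, using
# the reconstructed Eqs. (8)-(11) and a grid search. omega, n, alpha, beta and
# c_hat are assumptions; the printed optima are illustrative only.
omega, n = 2 * np.pi * 100.0, 0.05
z_f = (20e-6) ** 2 * 10 ** (65 / 10)
p1_x = stats.norm(loc=3e-4, scale=1.5e-4)   # epistemic variable x (paper's values)

def p2_fail(x, eta):
    mu = x / eta
    m = omega * eta * n
    r2 = (1 - (1 - np.exp(-2 * np.pi * m)) / (2 * np.pi * m)) / (np.pi * m)
    s = np.sqrt(np.log1p(r2))
    return 1.0 - stats.lognorm(s, scale=mu / np.sqrt(1 + r2)).cdf(z_f)

alpha, beta, c_hat = 1.0, 150.0, 30.0
x = np.linspace(1e-6, 9e-4, 1500)   # negative-x tail of the normal truncated
dx = x[1] - x[0]
etas = np.linspace(0.1, 0.9, 200)

# Route 1: minimise the expected cost E[C] (Eqs. (4), (5) and (11)).
E_C = [np.sum((alpha * e + beta * p2_fail(x, e)) * p1_x.pdf(x)) * dx
       for e in etas]
# Route 2: minimise P1[C > c_hat] (Eqs. (6) and (7)).
P1_ex = [np.sum((alpha * e + beta * p2_fail(x, e) > c_hat) * p1_x.pdf(x)) * dx
         for e in etas]

print(f"eta minimising E[C]:          {etas[np.argmin(E_C)]:.2f}")
print(f"eta minimising P1[C > c_hat]: {etas[np.argmin(P1_ex)]:.2f}")
```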

4.1. Difference in optimum value between approaches
Through some relatively straightforward manipulation of Eq. (12), it can be shown that when $E[C]$ is minimised, i.e. when no distinction is made between the uncertainty types, the optimum $\eta$ value satisfies the equation:

$$\alpha + \beta \int_{-\infty}^{\infty} \frac{\partial P_2(z > z_f \mid x, \eta)}{\partial \eta}\, p_1(x)\, \mathrm{d}x = 0 \tag{13}$$

Similarly, through a combination of Eqs. (6), (7) and (11), it can be shown that when $P_1[C > \hat{c}]$ is minimised, i.e. when the uncertainty types are treated separately, the optimum design $\eta$ value satisfies the equation:

$$\alpha + \beta \int_{R_3} \frac{\partial P_2(z > z_f \mid x, \eta)}{\partial \eta}\, p_1(x)\, \mathrm{d}x = 0 \tag{14}$$

Comparing Eq. (13) and Eq. (14), it can be appreciated that the optimum design parameter from the two approaches is different, and will only be the same if the region $R_3$ ranges from $-\infty$ to $\infty$, or in practical terms covers the entire range of feasible values of $x$. For practical implementation, it is not necessary to evaluate the derivatives in Eq. (13) or (14). These equations are presented simply to show that theoretically the optimum value of $\eta$ is different if $E[C]$ or $P_1[C > \hat{c}]$ is minimised. Instead, at least for the computationally cheap models considered in this paper, it is more practical to evaluate $E[C]$ or $P_1[C > \hat{c}]$ for a range of $x$ and $\eta$ values and extract the minimum value. This avoids challenges associated with numerical differentiation and also means that the cost function is not required to be strictly differentiable.

Fig. 13. Application of the framework outlined in Fig. 2 for selection of the optimum level of sound proofing in a car.
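As a reasoning aid, the manipulation behind Eq. (13) can be sketched as follows, assuming the cost form reconstructed in Eq. (11); Eq. (14) follows analogously with the integration restricted to $R_3$.

```latex
% Derivation sketch for Eq. (13), assuming C(eta, x) from Eq. (11).
\begin{align*}
E[C](\eta) &= \int_{-\infty}^{\infty}\big[\alpha\eta
    + \beta P_2(z > z_f \mid x, \eta)\big]\, p_1(x)\,\mathrm{d}x
  = \alpha\eta + \beta\int_{-\infty}^{\infty}
    P_2(z > z_f \mid x, \eta)\, p_1(x)\,\mathrm{d}x \\
0 = \frac{\mathrm{d}E[C]}{\mathrm{d}\eta}
  &= \alpha + \beta\int_{-\infty}^{\infty}
    \frac{\partial P_2(z > z_f \mid x, \eta)}{\partial\eta}\,
    p_1(x)\,\mathrm{d}x
    \qquad \text{(stationarity at the optimum, i.e. Eq. (13))}
\end{align*}
```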
The difference between the two approaches is illustrated in Fig. 16, which shows the optimum $\eta$ values obtained for a range of repair cost to initial cost ratios, i.e. $\beta/\alpha$ in Eq. (11). From Fig. 16, it can be appreciated that the optimum $\eta$ from the two methods is not the same, demonstrating numerically that employing the distinction between Probability 1 and Probability 2 can lead to different design decisions.

5. Limitations
The examples in the preceding sections are intended to be illustrative, and as has already been discussed, the computational models employed are excessively simple. In most realistic engineering dynamics problems more expensive computational models are required. Furthermore, this study only considers examples where the respective uncertainties derive from a single variable. In reality there are likely to be multiple parameters contributing to both forms of uncertainty, meaning that large multi-dimensional probability spaces need to be considered. Theoretically, there is no reason why the approach could not be extended to handle this. However, as implemented here, doing so would require computationally expensive nested simulations. This computational expense may make some of the calculation procedures adopted here impractical, and more sophisticated approaches, for example using Latin Hypercube Sampling or surrogate modelling, are likely to be necessary to develop a response surface for real-world, multi-variable problems.
In addition, the examples are predicated on the assumption that a probability distribution can be defined for all uncertain parameters. In reality, the values that define the distribution will themselves be uncertain, adding an extra layer of complexity to the problem. However, it is possible to expand the framework outlined to deal with these uncertainties [16], although again it would likely require more sophisticated sampling in the development of conditional distributions.
Finally, the work is limited to a probabilistic view of epistemic uncertainty. There is ongoing debate about this in the literature; some authors argue that this view is correct (for example [3]), while others argue that probability theory is insufficient to model ignorance and that some form of imprecise probability should be employed. It may be possible to develop the framework to incorporate imprecise probability theory by developing a credal set of possible probability distributions to represent epistemic uncertainty and, in the one-dimensional case, then propagating upper and lower bounds through the framework. This is likely to be more difficult for the realistic multi-dimensional case.

6. Conclusion
Carnap argued that both subjective and frequentist views of probability are necessary and should be maintained as separate quantities, which he termed Probability 1 and Probability 2. These two definitions of probability broadly correspond to epistemic and aleatory uncertainty. This paper explores the benefits of performing probabilistic analysis in engineering dynamics with aleatory and epistemic uncertainties treated as distinct quantities, in line with Carnap's ideas. Two metrics based on second order probability are examined to demonstrate the potential benefits of such an approach. It is argued that understanding the balance of reducible and irreducible uncertainties at the design stage allows design decisions to be made from a more informed position. Furthermore, understanding this balance is especially relevant when a digital twin approach is adopted, given that one of the aims of a digital twin is to reduce this uncertainty. It is also shown that treating the uncertainty types separately can lead to different solutions in the selection of optimum design parameters. Therefore, it is concluded that as well as being more philosophically consistent than the conventional approach of combining uncertainties, employing the two descriptions of probability is a potentially valuable approach in engineering dynamics.

Declaration of Competing Interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: John Hickey reports financial support was provided by Engineering and Physical Sciences Research Council.

Appendix A. Example sensitivity to choice of $p_1$ distribution
As discussed in Section 3, the choice of a normal distribution to represent damping in the tall building example is illustrative, and is not necessarily the best way to represent damping. This appendix examines the impact of changing this distribution from normal to uniform. The two distributions, and the associated $P_1[P_2]$ C-CDFs, are compared in Fig. A.1. Firstly, on a basic level, the fact that the framework is applicable in both cases illustrates that the method can work for any probability distribution. Secondly, for this example case it can be appreciated that while there are some differences, namely a slightly higher $P_1$ probability of encountering higher $P_2$ failure probabilities if a uniform distribution is employed, the C-CDF is not overly sensitive to this change. However, it is important to stress that this outcome is only applicable to this particular example problem and it is not valid to conclude that the proposed metric is insensitive to the choice of $p_1$ distribution in all cases or for all choices of distribution.

Fig. A.1. Comparison of C-CDF for the tall building example problem using an example normal distribution (mean 1.2%, standard deviation 0.2%) and uniform distribution (U[0.6%, 1.8%]) for damping.
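The comparison of Fig. A.1 can be sketched numerically as follows, reusing the assumed stand-in wind model from the main-text sketches and swapping only the $p_1$ distribution for damping:

```python
import numpy as np
from scipy import stats

# Sketch of the Fig. A.1 comparison: same assumed stand-in wind model as in
# the main-text sketches, with only the p1 distribution for damping swapped.
p2_wind = stats.gumbel_r(loc=20.0, scale=2.5)
z_f = 0.09
zeta = np.linspace(0.004, 0.022, 1500)
dzeta = zeta[1] - zeta[0]
p2_curve = 1.0 - p2_wind.cdf(np.sqrt(z_f * np.sqrt(zeta) / 2.2e-5))  # Eq. (2)

p2_hat = np.linspace(0.0, 1.0, 11)
for name, dist in [("normal ", stats.norm(loc=0.012, scale=0.002)),
                   ("uniform", stats.uniform(loc=0.006, scale=0.012))]:
    # Evaluate the C-CDF of Eq. (3) under each candidate p1 distribution.
    ccdf = [np.sum(dist.pdf(zeta) * (p2_curve > t)) * dzeta for t in p2_hat]
    print(name, np.round(ccdf, 2))
```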

Appendix B. Proof
It can be shown that the area enclosed by the C-CDF of $P_1[P_2(\text{fail}) > \hat{P}_2]$ is equal to the $P_0$ probability of failure calculated via Eq. (1). This can be shown by rewriting Eq. (3) using the Heaviside step function, $H$:

$$P_1\big[P_2(\text{fail}) > \hat{P}_2\big] = \int_{-\infty}^{\infty} H\big(P_2(z > z_f \mid x) - \hat{P}_2\big)\, p_1(x)\, \mathrm{d}x \tag{B.1}$$

and integrating across the range of possible $\hat{P}_2$ values (0 to 1) to find the area, $A$, under the C-CDF:

$$A = \int_{0}^{1} \int_{-\infty}^{\infty} H\big(P_2(z > z_f \mid x) - \hat{P}_2\big)\, p_1(x)\, \mathrm{d}x\, \mathrm{d}\hat{P}_2 \tag{B.2}$$

For a given $P_2(\text{fail} \mid x)$ the integral of the Heaviside step function is:

$$\int_{0}^{1} H\big(P_2(z > z_f \mid x) - \hat{P}_2\big)\, \mathrm{d}\hat{P}_2 = P_2(z > z_f \mid x) \tag{B.3}$$

Hence, Eq. (B.2) can be rewritten as:

$$A = \int_{-\infty}^{\infty} P_2(z > z_f \mid x)\, p_1(x)\, \mathrm{d}x \tag{B.4}$$

Substituting Eq. (2) into Eq. (B.4) gives:

$$A = \int_{-\infty}^{\infty} \int_{R_1} p_2(y)\, p_1(x)\, \mathrm{d}y\, \mathrm{d}x \tag{B.5}$$

which can be rewritten in terms of the joint PDF of $x$ and $y$, $p_{1,2}(x, y)$:

$$A = \iint_{R(z > z_f)} p_{1,2}(x, y)\, \mathrm{d}y\, \mathrm{d}x \tag{B.6}$$

which, by Eq. (1), is the $P_0$ failure probability with no distinction between uncertainty types: $A = P_0$.