Reliability Analysis of Correlated Competitive and Dependent Components Considering Random Isolation Times

In Internet of Things (IoT) systems, relay communication is widely used to reduce energy loss in long-distance transmission and to improve transmission efficiency. In Body Sensor Network (BSN) systems, biosensors communicate with receiving devices through relay nodes to make better use of their limited energy. When the relay node fails, a biosensor can communicate directly with the receiving device by increasing its transmitting power. However, if the remaining battery power of the biosensor is insufficient for direct communication with the receiving device, the biosensor is isolated from the system. Therefore, a new combinatorial analysis method is proposed to analyze the influence of random isolation time (RIT) on system reliability, taking into account the competition between biosensor isolation and propagation failure. The approach inherits the advantages of common combinatorial algorithms and provides an effective way to address the impact of RIT on the reliability of IoT systems that are affected by competing failures. Finally, the method is applied to a BSN system, and the effect of RIT on system reliability is analyzed in detail.


Introduction
With the advancement of communication technology, the development of the IoT has reached a new stage [1,2]. In the IoT environment, everyday objects become part of the Internet through their communication and computing capabilities (including microcontrollers and digital communication transceivers). The IoT extends the concept of the Internet and makes it more pervasive, allowing different devices (e.g., medical sensors, surveillance cameras, and household appliances) to interact while keeping their data secure [3,4]. Body Sensor Network technology [5] is one of the most critical technologies in modern IoT-based healthcare systems. BSN systems are often used to detect physiological characteristics of people [6,7] and are widely used in medical care, military, fitness, firefighting, and sports.
CMC, 2023, vol.76, no.3
A BSN system consists of three parts: biomedical sensors [8], relay nodes, and sink nodes. In this work, sink nodes are considered completely reliable. The other BSN components can suffer from local failures (LFs) and propagation failures (PFs). An LF causes only the failure of the component itself without affecting other components in the system, while a PF causes the failure of the component itself and also affects other components in the system [9,10]. Note that the PFs considered in this article are propagation failures with global effects (PFGEs): if one PF occurs, the entire system fails. In the BSN system, the biosensors are powered by batteries [11,12]. If the transmission distance is long, the sensed information can be transmitted to the target through relay communication [13][14][15] to save energy.
When the relay node fails, the remaining time during which a sensor node can communicate directly with the receiving device is determined by its remaining battery energy. In other words, the timing of the isolation effect is random. When the isolation effect occurs, it has a dual effect. On the one hand, the performance of the system is degraded, because once the biosensor is isolated, the receiving device no longer receives the information it senses on time. On the other hand, because the biosensor is isolated, PFs originating from this biosensor (such as jamming attacks [16,17], overheating, and short circuits [18]) are prevented from damaging other components in the system. However, when the PF of a biosensor occurs before the relay failure, or occurs after the relay failure while the biosensor still has enough power to transmit data directly to the receiving node, the propagation effect takes place and the whole system fails. Therefore, a time-domain competition exists between the PF isolation effect and the PF propagation effect. When the isolation effect occurs first, the system does not fail, and the BSN system continues to work with reduced performance. Otherwise, if the propagation effect prevails, the system fails. Many existing papers address competing failures. Unfortunately, few of them consider random isolation time (RIT). More details are given in Section 2.
The rest of this paper is structured as follows: Section 2 reviews the related work. Section 3 introduces the new combinatorial analysis algorithm. Section 4 takes the BSN system as an example to illustrate the proposed algorithm. The numerical analysis is carried out in Section 5. Section 6 summarizes the work and discusses future directions.

Related Work
Competing failure analysis has become an active research field. Different approaches to reliability analysis exist for different types of functional-dependent systems. Simulation methods are applicable to modeling the behavior of all kinds of systems, but their results are usually approximate. Improving the accuracy increases the time cost, so simulation is not suitable for accurately calculating the reliability of large-scale systems [19,20]. Markov methods can flexibly model all kinds of dynamic behavior and are commonly used for the reliability analysis of dynamic systems, but they suffer from the state space explosion problem [21,22].
The combinatorial method has the advantages of high precision and high efficiency and is widely used in the reliability analysis of systems with competing failures. Combinatorial algorithms have been used to solve the reliability analysis problem of single-stage [23] and multi-stage [24] systems with deterministic competing failures. Probabilistic competing failures [25] and random failure propagation time [26] have also been addressed. In addition, the failure propagation time has been considered for multi-function dependency groups [27], and cascading behavior [28] has been considered in competing failures.
In the case of relay failure, the study of random isolation time (RIT) and the competition effect becomes very important in reliability analysis. However, as far as we know, most existing works either do not consider RIT or assume that it is zero when conducting reliability analyses. Although the work in [29] considered the effect of competing failures and random isolation times on system reliability, its approach assumes that biosensors cannot share the same relay node when transmitting data, and it is therefore not applicable to systems in which several biosensors must use the same relay for transmission.
This paper improves on existing combinatorial methods for competing failures. It considers the effects of competing failures and random isolation time on system reliability and allows different biosensors to use the same relay when transmitting data, which addresses systems in which several biosensors must share one relay. In addition, the system components in this method can follow any failure time distribution. Although this article is based on a discussion of BSN systems, the competing failure behavior and the proposed method can be applied to other systems, such as computer networks, smart homes, and smart grids.

Proposed Method
According to the total probability theorem and divide-and-conquer principle, the reliability of the BSN system with competitive effect can be decomposed into several independent simplified problems without competitive effect.

Establish Fault Tree (FT) Model and Separate PF
The FT model is used to express the system's fault behavior while ignoring the propagated failure behavior of the components. In the BSN system, the failure of the BSN system is the top event, and the failures of the relay component and the dependent components are the basic events. The sensor communicates with the receiver through the relay; thus, there is a functional dependency between the sensor and the relay. The functional dependency (FDEP) behavior in the dynamic FT model is shown in Fig. 1.

Figure 1: Functional dependency behavior in BSN

With the PFs separated, the system reliability is expressed by Eq. (1):

$$R_{BSN}(t) = P_u(t) \cdot CR(t), \qquad P_u(t) = \prod_{i \in I} \left[1 - q_{iPF}(t)\right] \tag{1}$$

$P_u(t)$ in Eq. (1) is the probability that no PF occurs in the mission time $(0, t]$, where $I$ is the set of all components that can experience PF, and $q_{iPF}(t)$ is the probability of component $i$ experiencing PF. $CR(t)$ is the conditional system reliability given that no PF occurs, and it is calculated by the following steps.
Note that when calculating $CR(t)$, the failure probability of component $i$ should be replaced by its conditional failure probability (denoted as $q_i(t)$), that is, the probability of a local failure given that no PF occurs. Let $q_{il}(t)$ denote the probability of the local failure of a component and $q_{ip}(t)$ the probability of its propagated failure. When local and propagated failures are mutually exclusive, the conditional failure probability is calculated using Eq. (2). When local and propagated failures are s-independent, it is computed using Eq. (3):

$$q_i(t) = \frac{q_{il}(t)}{1 - q_{ip}(t)} \tag{2}$$

$$q_i(t) = q_{il}(t) \tag{3}$$
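The two quantities used in this step can be sketched numerically. The snippet below is a minimal illustration, not the authors' implementation: it assumes that $P_u(t)$ is the product of the no-PF probabilities of the components in $I$, and that the conditional failure probability is Pr(LF | no PF), which equals $q_{il}/(1 - q_{ip})$ for mutually exclusive failures and $q_{il}$ for s-independent failures. The function names and the numeric inputs are hypothetical.

```python
from math import prod


def no_pf_probability(q_pf):
    """P_u(t): probability that none of the listed components has a PF by time t.

    q_pf is an iterable of PF probabilities q_iPF(t), assumed s-independent
    across components (an assumption of this sketch).
    """
    return prod(1.0 - q for q in q_pf)


def conditional_lf_probability(q_il, q_ip, mutually_exclusive):
    """Probability of a local failure given that no PF has occurred.

    mutually_exclusive=True  -> Pr(LF | no PF) = q_il / (1 - q_ip)
    mutually_exclusive=False -> LF and PF are s-independent, so Pr(LF | no PF) = q_il
    """
    if mutually_exclusive:
        if q_ip >= 1.0:
            raise ValueError("q_ip must be < 1 for the conditional probability to exist")
        return q_il / (1.0 - q_ip)
    return q_il


# Illustrative values only (not taken from the paper's tables).
p_u = no_pf_probability([0.1, 0.2])
q_excl = conditional_lf_probability(0.09, 0.1, mutually_exclusive=True)
q_indep = conditional_lf_probability(0.09, 0.1, mutually_exclusive=False)
```

The mutually exclusive case always yields a value at least as large as the s-independent one, because conditioning on "no PF" removes probability mass that could not have contained a local failure.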

Build Event Space according to Relay LF and Dependent Components PF
The next step is to construct an event space that considers all possible combinations of failure states of the relay component and the dependent components. In the system under consideration, there is one relay node $T$ and $n$ dependent nodes $D_i$ ($i = 1, 2, \ldots, n$), which generate $2^{n+1}$ disjoint combinations. These are grouped into four events $E_1, E_2, E_3, E_4$ according to whether the relay LF and the dependent-component PFs occur.
$E_1$: The relay node works normally, and no PF occurs in any dependent component $D_i$. In this case, neither the isolation effect nor the propagation effect occurs.
$E_2$: The relay node works normally, and at least one dependent component has a PF. In this case, the propagation effect causes the system to fail.
$E_3$: LF occurs in the relay node, and PF does not occur in any dependent component. In this case, isolation effects occur without propagation effects, and the corresponding dependent components are isolated.
$E_4$: LF occurs in the relay node (TLF), and PF occurs in at least one dependent component (denoted as $D_iPF$).
When $E_4$ occurs, both TLF and $D_iPF$ occur, and the time-domain competition leads to three disjoint competition events, denoted as $CE_1$, $CE_2$, $CE_3$; that is, $E_4 = CE_1 \cup CE_2 \cup CE_3$.
$CE_1$: PF occurs in at least one dependent component before the LF of the relay node ($D_iPF \rightarrow TLF$), so a global propagation effect occurs and the system fails.
$CE_2$: The dependent components experience PFs, and all PFs occur after the LF of the relay node ($TLF \rightarrow D_iPF$), but the RIT ($D_iIT$) of at least one of these components is not less than the time between TLF and its $D_iPF$. This means that biosensor $D_i$ is isolated ($D_iI$) only after $D_iPF$ has occurred, and the system fails because of the propagation effect.
$CE_3$: The dependent components experience PFs, and all PFs occur after the LF of the relay node, but the RIT ($D_iIT$) of each of these components is less than the time between TLF and its $D_iPF$. This means that $D_iI$ occurs before $D_iPF$, so the isolation effect occurs and the PF does not propagate.
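The event definitions above can be turned into a small classifier for one sampled failure scenario. This is a sketch under stated assumptions rather than part of the proposed method: failure times are taken as given numbers, a PF or LF "occurs" if its time is within the mission time, and $CE_3$ is read as "every PF is preceded by the isolation of its component". The function name `classify_event` and its arguments are hypothetical.

```python
def classify_event(t_lf, pf_times, rits, mission_time):
    """Classify one scenario into E1, E2, E3, CE1, CE2, or CE3.

    t_lf         -- time of the relay's local failure (TLF)
    pf_times     -- PF times of the dependent components D_i
    rits         -- random isolation times of D_i, measured from TLF
    mission_time -- mission time t; later failures count as not occurring
    """
    relay_failed = t_lf <= mission_time
    # Dependent components whose PF occurs within the mission time.
    pf_idx = [i for i, tp in enumerate(pf_times) if tp <= mission_time]

    if not relay_failed:
        return "E2" if pf_idx else "E1"
    if not pf_idx:
        return "E3"
    # E4: relay LF and at least one PF both occur; resolve the competition in time.
    if any(pf_times[i] < t_lf for i in pf_idx):
        return "CE1"  # a PF beats the relay failure: propagation
    if any(rits[i] >= pf_times[i] - t_lf for i in pf_idx):
        return "CE2"  # some PF occurs before its component is isolated: propagation
    return "CE3"      # every PF occurs after isolation: isolation effect


# Illustrative scenarios (times in hours, values invented for the example).
examples = [
    classify_event(200, [300, 300], [5, 5], 100),   # nothing fails in time
    classify_event(50, [60, 300], [20, 5], 100),    # PF at 60 h, isolation at 70 h
    classify_event(50, [60, 300], [5, 5], 100),     # isolation at 55 h, PF at 60 h
]
```

Only scenarios classified as E1, E3, or CE3 can contribute to system reliability; E2, CE1, and CE2 always end in system failure.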
Table 1 provides the definitions of $(E_1, E_2, E_3, E_4)$. Here, $TLF$ and $\overline{TLF}$ denote the occurrence and nonoccurrence of the LF of $T$; $D_iPF$ and $\overline{D_iPF}$ denote the occurrence and nonoccurrence of the PF of $D_i$, respectively.
In the BSN system, SR is used to indicate system reliability. According to the event space $(E_1, E_2, E_3, CE_1, CE_2, CE_3)$ defined above, $CR_{BSN}(t)$ can be calculated as

$$CR_{BSN}(t) = \sum_{i=1}^{3} \Pr(SR \cap E_i) + \sum_{j=1}^{3} \Pr(SR \cap CE_j) \tag{4}$$

Because the propagation effect occurs in $E_2$, $CE_1$, and $CE_2$, the terms $\Pr(SR \cap E_2)$, $\Pr(SR \cap CE_1)$, and $\Pr(SR \cap CE_2)$ are equal to zero. Therefore, Eq. (4) can be simplified as follows:

$$CR_{BSN}(t) = \Pr(SR \cap E_1) + \Pr(SR \cap E_3) + \Pr(SR \cap CE_3) \tag{5}$$

Table 1: The definition of event space

Address Propagation Effects and Isolation Effects
$\Pr(SR \cap E_1)$: According to the definition of $E_1$, neither the propagation effect nor the isolation effect occurs in this case, and $\Pr(E_1)$ follows from the definition of $E_1$ and the input parameters. Using conditional probability, $\Pr(SR \cap E_1)$ can be computed by Eq. (6):

$$\Pr(SR \cap E_1) = \Pr(SR \mid E_1) \cdot \Pr(E_1) \tag{6}$$

where $\Pr(SR \mid E_1)$, the reliability of the BSN system conditional on the occurrence of $E_1$, is evaluated according to Eq. (1) and a simplified FT model. The simplified FT is generated from the original FT by removing $T$ and the corresponding FDEP gate while keeping $D_i$.
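$\Pr(SR \cap E_3)$: When $E_3$ occurs, the LF of the relay node occurs, but there is no PF from any $D_i$ ($i = 1, 2, \ldots, n$). Based on the isolation states of the dependent components, $E_3$ can be decomposed into $2^n$ disjoint events $E_3 \cap D_iI$ ($i = 0, 1, \ldots, 2^n - 1$). Thus,

$$\Pr(SR \cap E_3) = \sum_{i=0}^{2^n - 1} \Pr(SR \mid E_3 \cap D_iI) \cdot \Pr(E_3 \cap D_iI) \tag{7}$$

where $\Pr(E_3 \cap D_iI)$ is given in Eq. (8), while $\Pr(SR \mid E_3 \cap D_iI)$ can be solved by generating a simplified FT model and using the binary decision diagram (BDD) method.
$\Pr(SR \cap CE_3)$: In this case, the isolation effect occurs. By the definition of $CE_3$, it can be decomposed into $2^n - 1$ events, namely $CE_{3,1}, CE_{3,2}, \ldots, CE_{3,2^n - 1}$. Thus, the formula can be converted as follows:

$$\Pr(SR \cap CE_3) = \sum_{j=1}^{2^n - 1} \Pr(SR \mid CE_{3,j}) \cdot \Pr(CE_{3,j})$$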
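The two decompositions in this step are easy to enumerate mechanically. The following sketch lists the $2^n$ isolation states used for $E_3$ and the $2^n - 1$ non-empty subsets of dependent nodes, which is one natural reading of the sub-events $CE_{3,j}$ (the subset interpretation is an assumption of this example). The helper names are hypothetical, and the ordering of the generated states is arbitrary and need not match the paper's indexing of $D_iI$.

```python
from itertools import combinations, product


def isolation_states(names):
    """All 2^n isolated / not-isolated assignments of the dependent nodes."""
    return [dict(zip(names, bits))
            for bits in product((False, True), repeat=len(names))]


def nonempty_subsets(names):
    """The 2^n - 1 non-empty subsets of dependent nodes."""
    return [frozenset(c)
            for r in range(1, len(names) + 1)
            for c in combinations(names, r)]


def total_probability(cond_reliabilities, case_probabilities):
    """Sum of Pr(SR | case) * Pr(case) over disjoint cases."""
    return sum(r * p for r, p in zip(cond_reliabilities, case_probabilities))


nodes = ["B1", "H1"]               # the two dependent biosensors of the example
states = isolation_states(nodes)   # 4 isolation states for E_3
subsets = nonempty_subsets(nodes)  # 3 sub-events for CE_3
```

For the two dependent nodes of the case study this gives four isolation states and three sub-events, which matches the counts $2^n$ and $2^n - 1$ for $n = 2$.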

Reliability of the Integrated BSN
After $\Pr(SR \cap E_1)$, $\Pr(SR \cap E_3)$, and $\Pr(SR \cap CE_3)$ have been calculated in the above steps, $CR_{BSN}(t)$ is obtained using Eq. (5). Then, by applying Eq. (1), the reliability of the final BSN is obtained.
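The integration step can be summarized in a few lines of code. This is a hedged sketch: it assumes that the final reliability is the product of $P_u(t)$ and $CR_{BSN}(t)$, with $CR_{BSN}(t)$ being the sum of the three surviving terms of Eq. (5). The function names and the numbers are illustrative and are not taken from the paper's tables.

```python
def conditional_reliability(pr_sr_e1, pr_sr_e3, pr_sr_ce3):
    """CR_BSN(t): sum of the terms that do not vanish in the event space."""
    return pr_sr_e1 + pr_sr_e3 + pr_sr_ce3


def system_reliability(p_u, cr):
    """Final reliability, assuming the multiplicative form P_u(t) * CR(t)."""
    return p_u * cr


# Illustrative inputs only.
cr = conditional_reliability(0.80, 0.10, 0.02)
r_sys = system_reliability(0.95, cr)
```

Keeping the two steps separate mirrors the divide-and-conquer structure of the method: the event-space terms can be recomputed for a new mission time without touching the PF separation.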

Examples and Analysis
Fig. 2 shows a BSN system with five nodes. More specifically, biosensors $B_1$ and $B_2$ monitor blood pressure, and biosensors $H_1$ and $H_2$ monitor heart rate. The relay node $T$ transmits the data of biosensors $B_1$ and $H_1$ to the receiving device; the receiving device collects and processes the data from all biosensors to assist the caregiver in taking action.

Figure 2: BSN model
When $T$ experiences LF, $B_1$ and $H_1$ can increase their transmission power to communicate directly with the receiving device until their battery power is exhausted to the point that it can no longer support direct communication, at which time biosensors $B_1$ and $H_1$ are isolated from the rest of the BSN system.

Establishing the FT Model and Separating PF
The FT model of the BSN system is shown in Fig. 3. The top event of this dynamic FT model is the BSN system failure, the basic events are the local failures of the relay node and the biosensor nodes, and the FDEP gate models the functional dependence between the relay node and the biosensor nodes. Applying Eq. (1), $P_u(t)$ is the probability that no PF occurs; that is, $P_u(t) = 1 - q_{TPF}(t)$. $CR(t)$ represents the conditional system reliability given that no PF occurs in the independent components, and it can be solved by Eq. (5).

Constructing the Event Space
According to Fig. 2, this BSN system has one relay node and two dependent nodes. The event space is built as defined in Section 3.2, as shown in Table 2. According to Eq. (5) and the event space definition provided in Table 2, $\Pr(SR \cap E_1)$, $\Pr(SR \cap E_3)$, and $\Pr(SR \cap CE_3)$ can be calculated as described in Section 4.3.

Table 2: Definition of BSN event space
$\Pr(SR \cap E_1)$: In the event of $E_1$, relay node $T$ is reliable for the duration of the mission time, and neither biosensor $B_1$ nor $H_1$ undergoes PF. Fig. 4a presents the simplified system FT model for evaluating $\Pr(SR \cap E_1)$. This simplified FT model is obtained by removing $T$ and the FDEP gate from the original FT in Fig. 3, and its BDD model is shown in Fig. 4b. Because the simplified model has components that experience PF, $\Pr(SR \mid E_1)$ is calculated using Eq. (1); $CR_{E_1}(t)$ can be solved by the corresponding BDD model, as shown in Eq. (11).
$\Pr(SR \cap E_3)$: In the case of $E_3$, the LF of $T$ occurs, but without PF from any $D_i$. Depending on the isolation states of $B_1$ and $H_1$, $E_3$ can be further decomposed into $2^n = 4$ disjoint events, $D_0I$ to $D_3I$, which cover every combination of $B_1$ and $H_1$ being isolated or not. $D_0I$ indicates that neither $B_1$ nor $H_1$ is isolated after the LF of relay node $T$, and $D_3I$ indicates that both $B_1$ and $H_1$ are isolated. $\Pr(SR \cap E_3)$ is computed in Eq. (12).
$\Pr(E_3 \cap D_iI)$ can be obtained from the definition and the input parameters for each of $\Pr(E_3 \cap D_0I)$ through $\Pr(E_3 \cap D_3I)$. $\Pr(SR \mid E_3 \cap D_iI)$ is evaluated on a simplified FT model, which is generated by applying the removal rules to the FT in Fig. 3 (removing the relay node and the related PFDG).
For example, the simplified system model for $\Pr(SR \mid E_3 \cap D_1I)$ is shown in Fig. 5. The reduced FT in Fig. 5a is generated by removing the relay fault and the FDEP gate from the FT in Fig. 3, replacing the fault event "$B_1$" with "1", and then applying the Boolean reduction rules ($1 \text{ AND } B_2 = B_2$).
$\Pr(SR \cap CE_3)$: In the case of $CE_3$, the isolation effect occurs. According to Table 2, the event $CE_3$ consists of three events; namely, $CE_3 = CE_{3,1} \cup CE_{3,2} \cup CE_{3,3}$. The solution of $\Pr(SR \cap CE_3)$ is thus converted to finding $\Pr(SR \cap CE_{3,j})$ ($j = 1, 2, 3$), where each $\Pr(SR \mid CE_{3,j})$ is calculated by generating the corresponding simplified FT.
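The Boolean reduction used for $\Pr(SR \mid E_3 \cap D_1I)$ can be illustrated by fixing the isolated sensor's fault event to 1 and evaluating the remaining structure. The sketch below is only an illustration: the structure function (system fails if both blood pressure sensors fail or both heart rate sensors fail) is a hypothetical stand-in suggested by the rule $1 \text{ AND } B_2 = B_2$, not the FT of Fig. 3, and the failure probabilities are invented. Brute-force enumeration replaces the BDD evaluation used in the paper.

```python
from itertools import product


def system_fails(state):
    """Hypothetical structure function: True means the system has failed.

    state maps component names to True (failed) / False (working).
    This structure is an assumption made for illustration only.
    """
    return (state["B1"] and state["B2"]) or (state["H1"] and state["H2"])


def reliability_given(q, fixed):
    """System reliability with some fault events fixed to known values.

    q     -- failure probability of each component (s-independent components)
    fixed -- components whose state is known, e.g. {"B1": True} for an
             isolated sensor whose fault event is replaced by 1
    """
    free = [c for c in q if c not in fixed]
    reliability = 0.0
    for bits in product((False, True), repeat=len(free)):
        state = dict(fixed)
        state.update(zip(free, bits))
        if system_fails(state):
            continue
        p = 1.0
        for name, failed in zip(free, bits):
            p *= q[name] if failed else 1.0 - q[name]
        reliability += p
    return reliability


q = {"B1": 0.1, "B2": 0.2, "H1": 0.3, "H2": 0.4}   # illustrative values
r_all = reliability_given(q, {})                   # no sensor isolated
r_b1_isolated = reliability_given(q, {"B1": True}) # "B1" replaced by 1
```

With $B_1$ fixed to 1, the first branch of the hypothetical structure reduces to $B_2$ alone, so the conditional reliability depends only on $B_2$, $H_1$, and $H_2$.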

Integrated System Reliability
By calculating $\Pr(SR \cap E_1)$, $\Pr(SR \cap E_3)$, and $\Pr(SR \cap CE_3)$ in the above steps, $CR_{BSN}(t)$ can be obtained using Eq. (5). Moreover, by applying Eq. (1), the reliability of the final BSN can be further obtained.

Numerical Analysis
The combinatorial method employed herein is suitable for any failure time distribution of the biosensors. The numerical analysis of system reliability in this paper is based on the Weibull distribution. The probability density function of a random variable $c$ following a Weibull distribution is given below, where $(\alpha, k)$ denote the (scale, shape) parameters, respectively:

$$f(c) = k \alpha (\alpha c)^{k-1} e^{-(\alpha c)^k}, \quad c \ge 0$$

The expectation of $c$ is as follows:

$$E[c] = \frac{1}{\alpha} \Gamma\left(1 + \frac{1}{k}\right) \tag{28}$$

Let $(\alpha_{iLF}, k_{iLF})$, $(\alpha_{iPF}, k_{iPF})$, and $(\alpha_{iIT}, k_{iIT})$ respectively represent the Weibull parameters of the time-to-LF, the time-to-PF, and the RIT of component $i$, as shown in Table 3. Their expected values are the average time to LF (ATL), the average time to PF (ATP), and the average time to IT (ATI), respectively, which can be calculated by Eq. (28). Note that the combinatorial method presented in this paper is a system-level analytical method, and the component-level parameters are assumed based on industrial data for biosensors.
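The averages ATL, ATP, and ATI can be computed directly from the Weibull parameters. The sketch below assumes the parameterization $F(c) = 1 - e^{-(\alpha c)^k}$, in which $\alpha$ is given per hour (as suggested by the "/h" units used for $\alpha$ in this paper) and $k$ is the shape parameter; under that assumption the mean is $\Gamma(1 + 1/k)/\alpha$. The function names are hypothetical.

```python
from math import exp, gamma


def weibull_cdf(t, alpha, k):
    """Pr(c <= t) for the assumed parameterization F(t) = 1 - exp(-(alpha*t)**k)."""
    return 1.0 - exp(-((alpha * t) ** k))


def weibull_mean(alpha, k):
    """Expected value Gamma(1 + 1/k) / alpha under the same parameterization."""
    return gamma(1.0 + 1.0 / k) / alpha


# RIT parameters quoted in the paper for the PF and LF studies: alpha_IT = 8.33e-4 per h, k = 1.
ati = weibull_mean(8.33e-4, 1.0)        # roughly 1200 h
p_isolated_48h = weibull_cdf(48.0, 8.33e-4, 1.0)
```

For $k = 1$ the Weibull distribution reduces to the exponential distribution, so the ATI is simply $1/\alpha_{IT}$.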

Impact of RIT on System Reliability
This section studies the effect of the RIT of dependent components $B_1$ and $H_1$. Because the LF and PF of a component influence system reliability differently depending on whether they are disjoint or s-independent, Eq. (2) or Eq. (3) is used accordingly. This paper analyzes the case in which LF and PF are s-independent.
For ease of calculation, $B_1$ and $H_1$ have the same RIT parameters. The BSN system's reliability is analyzed by the combinatorial method detailed in Section 3. For the biosensors $B_1$ and $H_1$ with Weibull RIT, the shape parameter is $k_{iIT} = 1$. The scale parameter $\alpha_{IT}$ (per h) and the ATI of $B_1$ and $H_1$ are shown in Table 4. Table 4 also summarizes the analysis results of the system reliability at different mission times.
According to the data given in Table 4, the reliability of the system decreases as the mission time increases, because the failure probability of the biosensors in the system increases with mission time.
In all cases, the system exhibits different levels of reliability for different values of $\alpha_{IT}$. Intuitively, the ATI of the dependent components increases as $\alpha_{IT}$ decreases. As $\alpha_{IT}$ decreases, the reliability of the BSN system is increasing at 48 h, non-monotonic at 96 h, and decreasing at 144 h.
In more detail, as $\alpha_{IT}$ increases, the isolation probability of biosensors $B_1$ and $H_1$ also increases. System reliability may therefore increase (positive effect) or decrease (negative effect), depending on the occurrence probability of the event $SR \cap E_3 \cap D_iI$. Table 5 summarizes the influences of the different dominant effects on system reliability in different time regions. When the propagation and isolation effects of $B_1$ and $H_1$ compete, a dual effect occurs: first, the relay failure isolates $B_1$ and $H_1$, thus preventing the PFs of nodes $B_1$ and $H_1$ from affecting other components in the system (improvement effect); second, the isolation makes the dependent nodes $B_1$ and $H_1$ unusable or inaccessible, which degrades the system's performance (negative effect). The negative effect reduces the reliability of the system, while the improvement effect increases it. The change in BSN system reliability over the mission time results from the combination of the two effects. With increasing $\alpha_{IT}$, the system reliability decreases under the negative effect at 48 h; at 96 h, the system reliability initially decreases due to the negative effect and then increases under the improvement effect as $\alpha_{IT}$ increases further.

PF Impact on System Reliability
Table 6 summarizes the results when the value of $\alpha_{iPF}$ of the dependent components increases to 1.7e-3, 2e-3, and 2.2e-3, using the remaining parameters in Table 3 and the isolation time parameters ($\alpha_{IT}$ = 8.33e-4, $k$ = 1). As can be seen from the results in Table 6, when $\alpha_{iPF}$ increases (and ATP therefore decreases), the reliability of the whole system decreases (that is, the unreliability of the system increases), because the probability of PF in the system increases with $\alpha_{iPF}$. Moreover, the influence becomes more significant with the passage of time. As the mission time increases, the influence of $\alpha_{iPF}$ on system reliability ranges from 1.23% (calculated as (0.969440-0.957646)/0.957646) for 48 h to 9.01% (calculated as (0.763712-0.700598)/0.700598) for 144 h. This can be explained by the low likelihood that $B_1$ and $H_1$ are isolated at the beginning of the mission, because the relay is still highly reliable. Therefore, the PFs of $B_1$ and $H_1$ have a great influence on system reliability; in other words, as the PF probability of $B_1$ and $H_1$ increases, the reliability of the BSN system decreases. As the mission progresses and the relay node deteriorates, $B_1$ and $H_1$ are more likely to be isolated. Although the propagation effects of $B_1$ and $H_1$ are then weakened by isolation effects, the reliability of the system is still reduced because of the aging of other components in the system.
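The two percentages quoted above can be reproduced with a one-line helper. The function name `relative_change` is hypothetical; the formula and the four reliability values are the ones given in the text.

```python
def relative_change(higher, lower):
    """Relative difference (higher - lower) / lower, as used in the text."""
    return (higher - lower) / lower


change_48h = 100 * relative_change(0.969440, 0.957646)   # in percent
change_144h = 100 * relative_change(0.763712, 0.700598)  # in percent
print(f"{change_48h:.2f}% at 48 h, {change_144h:.2f}% at 144 h")
```

Note that the difference is normalized by the lower reliability value, so the percentages express how much larger the reliability is for the smallest $\alpha_{iPF}$ than for the largest one.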

Impact of LF on System Reliability
Table 7 summarizes the results when the value of $\alpha_{RLF}$ (the $\alpha_{iLF}$ of the relay node) increases to 1.5e-2, 2.7e-2, and 5e-2, using the remaining parameters in Table 3 and the isolation time parameters ($\alpha_{IT}$ = 8.33e-4, $k$ = 1). The results in Table 7 show that the reliability of the entire system changes to a certain extent when $\alpha_{RLF}$ increases (and ATL therefore decreases). In more detail, the system reliability first decreases and then increases at $t$ = 48 h, while at $t$ = 96 h and $t$ = 144 h it exhibits an upward trend. This is because, at the beginning of the mission, when $\alpha_{RLF}$ is relatively small, the relay still has high reliability and the probability of the isolation effect is low; the PF in a dependent node therefore cannot be isolated in time, which reduces system reliability. However, as $\alpha_{RLF}$ increases, the probability of relay failure also increases, which in turn increases the probability of the isolation effect in the BSN system and improves the system reliability. For example, at $t$ = 96 h and $t$ = 144 h, the probability of component failure in the system has increased and the isolation effect is stronger; thus, the system reliability tends to increase overall.

Evaluation
As far as we know, there are few algorithms that address random isolation time. Table 8 compares the proposed algorithm with three combinatorial algorithms, $C_1$, $C_2$, and $C_3$, for calculating the reliability of systems with competing failures. $C_1$ considers the impact of competing failures on system reliability. $C_2$ considers the influence of failure propagation time on system reliability, with propagation time parameters $(\alpha_{PT}, k_{PT})$ = (1e-2/h, 1.0). Both $C_3$ and the proposed algorithm consider the influence of RIT on system reliability; $C_3$ assumes that only the dependent node $H_1$ is affected by the relay node, with isolation time parameters $(\alpha_{IT}, k_{IT})$ = (4.17e-2/h, 2.0). $C_1$ focuses only on the evaluation of system reliability under competing failures, and $C_2$ extends $C_1$ by studying the influence of failure propagation time. Neither $C_1$ nor $C_2$ takes into account the impact of random isolation time on system reliability. $C_3$ does take RIT into account, but it is limited to the case of one dependent node. The algorithm proposed in this paper handles the calculation of system reliability for any number of dependent nodes.
Note that when the number of dependent nodes is reduced to one, the method presented in this article is the same as $C_3$. Assume that only $H_1$ depends on the relay node and that the LF and PF of the same node are s-independent. Using the parameters in Table 3 and $(\alpha_{IT}, k_{IT})$ = (4.17e-2/h, 2.0), the reliability of the system at $t$ = 24 h, 48 h, and 72 h is 0.9711, 0.9050, and 0.8165, respectively, which is consistent with the results obtained by the method in this paper.

Conclusions
In IoT systems such as BSNs, the biosensors are powered by batteries. Because of the existence of FDEPs, propagation effects and isolation effects compete with each other in time, and the RIT has to be considered. However, to the best of our knowledge, existing works assume either that the isolation time is zero or that the relay node supports only one dependent node. In this paper, a combinatorial method is proposed for reliability analysis that accounts for RIT behavior. Although the Weibull distribution is used in the case study, the method can be used with any type of failure time distribution. The method decomposes a complex problem into simpler ones that can then be solved by traditional methods. Although a BSN system is used in the case study, the method can be applied to other wireless communication systems. Notably, this method assumes that a biosensor can use only one relay for data transmission. In our future work, we intend to allow the same sensor to use different relay nodes for data transmission, addressing the correlation between multiple groups of related FDEPs.

Figure 3: FT of the BSN model

Figure 4: The FT and BDD of $CR_{E_1}$

Table 4: System reliability under Weibull distribution

Table 5: Domination effect of the BSN system

Table 6: $\alpha_{iPF}$ influence on system reliability

Table 7: $\alpha_{iLF}$ influence on system reliability

Table 8: System reliability calculation by combination algorithms