Prospect Theory Based Individual Irrationality Modelling and Behavior Inducement in Pandemic Control

It is critical to understand and model the behavior of individuals in a pandemic, as well as identify effective ways to guide people's behavior in order to better control the epidemic spread. However, current research fails to account for the impact of users' irrationality in decision-making, which is a prevalent factor in real-life scenarios. Additionally, existing disease control methods rely on measures such as mandatory isolation and assume that individuals will fully comply with these policies, which may not be true in reality. Thus, it is critical to find effective ways to guide people's behavior during an epidemic. To address these gaps, we propose a Prospect Theory-based theoretical framework to model individuals' decision-making process in an epidemic and analyze the impact of irrationality on the co-evolution of user behavior and the epidemic. Our analysis shows that irrationality can lead individuals to be more conservative when the risk of being infected is small, while it tends to make them more risk-seeking when the risk of being infected is high. We then propose a behavior inducement algorithm to guide user behavior and control the spread of disease. Simulations and real user tests validate our proposed model and analysis, and simulation results show that our proposed behavior inducement algorithm can effectively guide users' behavior.


Introduction
The outbreak of COVID-19 has resulted in a severe public health crisis and significant economic losses. Governments worldwide have implemented various measures, including lockdowns and mandatory quarantines, to curtail the spread of the disease. However, individuals may not comply with these policies, as many have their own opinions and preferences. During such public health crises, individuals tend to act irrationally, such as excessive panic in the early stages of a disease outbreak or underestimation of the dangers of the disease later in its spread, which can significantly impact their decisions and ultimately affect the spread of the epidemic. Moreover, individuals' behavior and the pandemic mutually influence each other. For example, individuals' behavior such as wearing masks, social distancing, and isolating can impede the epidemic, while people tend to adopt protective behaviors when the pandemic is more severe. Therefore, it is crucial to model individuals' behavior during disease outbreaks and determine how to control the spread of the disease by guiding people's behavior without resorting to mandatory measures.

Literature Review
In the following, we will review recent works on epidemic control over networks, user behavior modeling during an epidemic, and irrational behavior modeling.

Epidemic Control
There are numerous works studying how to control the disease spread over networks, and many works attempt to limit the disease spread on a network by removing nodes. Wang et al. demonstrate that epidemic spread on a network is highly correlated with the largest eigenvalue of the graph's adjacency matrix [1]. As a result, existing works in [2][3][4][5] attempt to manipulate the adjacency matrix's eigenvalues by removing nodes to minimize the likelihood of disease outbreaks. Additionally, the works in [6][7][8] explore macro-level approaches to control disease spread, such as restricting population movement or implementing proportional quarantine. However, these methods assume that individuals will always comply with the policies, which is often not the case in reality.

Individual Behavior Modeling During an Epidemic
There are prior works attempting to model individuals' behavior during an epidemic. The authors of [9][10][11][12] believe that as the proportion of infected individuals in the environment increases, people are more likely to adopt protective behaviors. Zhang et al. assume that individuals would be more likely to take protective behavior when the proportion of infected neighbors is high [9], and they analyze the effect of individual protective behavior on disease spread. The proposed model in [10] assumes that an individual's adoption of protective behavior is influenced by the proportion of infected neighbors, as well as regional and global infection rates. The works in [13][14][15] observe that information dissemination also affects individual protective behavior during the pandemic. However, these studies do not consider the common and critical issue of individual irrationality, which can significantly affect their decision-making process during an epidemic.

Irrational Behavior Modeling
Individuals often make irrational decisions when faced with risks, such as the risk of being infected during a pandemic. For example, many people may be overly panicked in the early stages of a pandemic and may underestimate the risk of the disease in the later stages. A challenging issue here is how to mathematically model such irrational behavior. Prospect theory offers theoretical models to quantify how individuals tend to overestimate small probabilities and underestimate high probabilities [16][17][18][19][20]. It is crucial to analyze the impact of this irrationality on individuals' decisions. Studies in [21][22][23] analyze the impact of irrationality on users' decisions on whether to get vaccinated during an epidemic, and they assume that this is a one-time binary decision problem where users only make one binary decision during the entire epidemic. However, in reality, individuals have multiple protective behaviors available to them, such as wearing a mask, washing hands, and isolating at home, and they need to continuously decide whether or not to take such protective behaviors and which behavior to take during the entire epidemic. Individuals may choose to adopt the highest-level protective measures such as self-isolation when the epidemic is rapidly spreading, while they may decide not to take any protective behaviors when the epidemic is declining. Therefore, it is crucial to investigate how the epidemic and individual decisions continuously interact with and influence each other over time.
In summary, current studies on individual behavior modeling in epidemics either neglect the impact of irrational decision-making or fail to account for behavior changes in response to the epidemic. Moreover, existing research on epidemic control often assumes users' absolute compliance with the government's policies. Consequently, it is critical to consider user irrationality and the changes in individuals' decisions and propose an effective mechanism to guide their behaviors to control the epidemic.

Our Contribution
In this paper, we model the individuals' irrational decisions during an epidemic and analyze the impact of the individuals' irrationality on their decisions as well as the epidemic. Based on the individuals' behavioral model, we propose an effective method to guide user behavior and control the spread of disease.
In summary, the main contributions of our work are:
• We build an M-choice epidemic-behavior co-evolution model to simulate individuals' irrational decision-making and analyze their impact on the epidemic. We theoretically analyze the co-evolution of user behavior and epidemic and its steady state. We also study the impact of irrationality on individuals' behavior and disease spread.
• Given the above individual behavior model, we propose a behavior inducement algorithm to guide individuals' decisions to control the epidemic.
• We validate our individual behavior model and behavior inducement algorithm through simulations. In addition, we use real user tests to validate the conclusions about the impact of irrationality on individuals' behavior.
The rest of the paper is organized as follows. Section 2 presents our proposed epidemic-behavior co-evolution model. Section 3 analyzes the steady state of the epidemic-behavior dynamics and the influence of irrationality. Section 4 presents the behavior inducement method to control the disease spread. Section 5 shows the simulation results. Section 6 shows the results of real user tests, and conclusions are drawn in Section 7.

The M-Choice Epidemic-Behavior Co-Evolution Model
During an epidemic, individual behavioral choices and the spread of the disease mutually influence each other. When the probability of infection and the potential losses are high, people tend to adopt protective behaviors. In turn, these protective behaviors can effectively inhibit the spread of the disease. In this section, we propose a model to capture the co-evolution of individual behavioral choices and disease spread during a pandemic. We consider irrational behavior in our model, and use the model with rational behavior assumption as a baseline to analyze the impact of irrationality on the evolutionary dynamics of the epidemic and its steady states. Building upon the model presented in [24], we assume that individuals have a choice among $M$ possible behaviors based on the severity of the epidemic, and these behavioral choices, in turn, impact the spread of the disease.
Following the work in [14], we use two undirected networks to represent the connections among individuals. The first network is the physical contact network where the disease spreads. The second network is the information network where individuals exchange information about their current health state and behavioral choices. It is worth noting that the two networks are different. In reality, an individual may get infected by strangers in a restaurant or on a bus, while their behavior will not be influenced by these strangers since they have not interacted with them. Similarly, users' decisions may be influenced by their friends on the Internet without any physical contact with each other. To simplify the analysis, we make the assumption that both the physical contact network and the information network are regular networks consisting of $N$ nodes, where each node represents an individual. In a regular network, each node has a fixed degree, denoted as $k$ for the physical contact network and $d$ for the information network. As individuals communicate with each other in the information network, we assume that they have knowledge of the health statuses of their neighbors in the information network. However, in the physical contact network, there is no exchange of information, and we assume that individuals do not have knowledge of the health statuses of their neighbors there. In the following, we will use the terms "graph" and "network" interchangeably, and the terms "node", "user", and "individual" interchangeably.
Our model consists of two interconnected parts: the disease spread model and the behavior change model. The disease spread model quantifies how the pandemic propagates through the network given the current behaviors of all individuals, and the behavior change model describes how individuals update their behaviors based on the current number of infected individuals and the behaviors of their neighbors. The illustration of our model is in Fig. 1. In the following, we will introduce these two components of our model in detail.

The Disease Spread Model
We use the classic Susceptible-Infected-Susceptible (SIS) model to depict the spread of the disease. Each individual can be in one of two health states: susceptible or infected. We divide time into slots of equal length. In each time slot, a susceptible individual can be infected by an infected individual at a given infection rate, and an infected individual recovers at a certain recovery rate. We assume that susceptible individuals can adopt several different protective measures to reduce their risk of infection, and that they have a total of $M$ possible behavioral options $\{a_1, a_2, \ldots, a_M\}$. For example, some may take the pandemic very seriously and adopt self-quarantine to avoid contact with infected people; some may adopt medium-level precautions such as wearing masks when going out and washing hands frequently; while others may take no protection and act as if the epidemic does not exist. To simplify the analysis, we also assume that all susceptible persons taking action $a_i$ have the same infection rate $\beta_i$. For those already infected with the disease, we consider the worst-case scenario where they do not take any protective behavior such as home isolation to prevent the disease from further spreading. This assumption is grounded in the understanding that infected individuals might not possess the same level of motivation or necessity to embrace protective measures since they are already infected. We also assume that all infected people have the same recovery rate $\gamma$ to simplify the analysis. Let $s(t)$ and $i(t)$ denote the fractions of susceptible and infected individuals, respectively, at time $t$, while $x_i(t)$ denotes the fraction of people adopting action $a_i$ among all susceptible individuals at time $t$. Then the mean-field equation of the disease spread is
\[ \frac{ds(t)}{dt} = -k\beta\, s(t)\, i(t) + \gamma\, i(t), \qquad \frac{di(t)}{dt} = k\beta\, s(t)\, i(t) - \gamma\, i(t), \tag{1} \]
where $s(t) + i(t) = 1$ and $\beta = \sum_{i=1}^{M} \beta_i x_i(t)$. Then, by plugging $s(t) = 1 - i(t)$ into (1), we get the differential equation modeling the disease spread:
\[ \frac{di(t)}{dt} = k\beta\, i(t)\,\bigl(1 - i(t)\bigr) - \gamma\, i(t). \tag{2} \]
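The mean-field SIS dynamics described above can be explored numerically. The following is a minimal sketch, assuming the disease equation takes the form $di/dt = k\beta\, i(1-i) - \gamma i$ with $\beta = \sum_i \beta_i x_i$, and assuming the behavior fractions $x_i$ stay fixed over the run (in the full model they evolve). The function names, step size, and parameter values are illustrative choices, not taken from the paper.

```python
def mean_infection_rate(betas, x):
    """Population-average infection rate: sum_i beta_i * x_i."""
    return sum(b * xi for b, xi in zip(betas, x))


def simulate_sis(i0, betas, x, k, gamma, dt=0.01, steps=20000):
    """Forward-Euler integration of di/dt = k*beta*i*(1-i) - gamma*i.

    The behavior fractions x are held constant here, which is an
    assumption made only for this sketch.
    """
    beta = mean_infection_rate(betas, x)
    i = i0
    for _ in range(steps):
        i += dt * (k * beta * i * (1.0 - i) - gamma * i)
    return i


# Hypothetical parameters: two behaviors (risky, isolating), contact degree k = 4.
final_i = simulate_sis(i0=0.01, betas=[0.2, 0.0], x=[1.0, 0.0], k=4, gamma=0.4)
print(final_i)  # settles near 1 - gamma / (k * beta) = 0.5
```

With these numbers $k\beta = 0.8 > \gamma$, so the infection persists; lowering `betas[0]` until $k\beta < \gamma$ makes the same routine decay toward zero.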

The Behavior Change Model
The individual behavior model quantifies the dynamics of $\{x_i(t)\}$, which represents the proportion of susceptible individuals adopting action $a_i$ at time $t$. The change in $x_i(t)$ can be attributed to two main factors. First, individuals may alter their decisions over time in response to the severity of the pandemic and the influence of their neighbors' behaviors. We use $D_i(t)$ to represent this part of the change in $x_i(t)$. In addition, due to nodes' changes in their health state, the proportion of susceptible individuals adopting different behaviors may also change. For example, a susceptible individual who was taking action $a_i$ at time $t-1$ may become infected at time $t$, or an infected person recovers at time $t$ and decides to take action $a_i$. We use $B_i(t)$ to represent this part of the change in $x_i(t)$. Then we have:
\[ \frac{dx_i(t)}{dt} = D_i(t) + B_i(t). \tag{3} \]
In Section 2.2.1, we focus on modeling $D_i(t)$, where individuals' decisions are influenced by their neighbors and the severity of the pandemic. In Section 2.2.2, we study $B_i(t)$ and analyze how the changes in individuals' health states may affect the change in $x_i(t)$.

Analysis of D i (t)
To model individuals' active behavior change in response to their neighbors' influence and the proportion of infected individuals, we employ evolutionary game theory. Evolutionary game theory is a useful framework to study the impact of neighbors on individuals' decisions [25,26]. The basic elements of evolutionary game theory include individual, strategy, payoff, and strategy update rules. We will introduce these elements one by one in the following.
Individual and Strategy: Each individual is represented as a node in the information network. As mentioned in Section 2.1, we assume that there are a total of $M$ possible protective behaviors $\{a_1, \ldots, a_M\}$ for susceptible individuals, and each behavior $a_i$ corresponds to one strategy for a susceptible individual. For infected individuals, as mentioned in Section 2.1, they do not take any protective behavior such as home isolation to prevent the disease from spreading. Therefore, in this work, we focus on the analysis of susceptible users' behavior and study how their decisions are influenced by their neighbors and the severity of the epidemic. In each time slot, a fraction $m$ of susceptible individuals are randomly chosen as focal individuals. These focal individuals observe and imitate their neighbors' behaviors. The remaining susceptible individuals maintain their actions unchanged during this time slot.
The Payoff: In this paper, we study the protective behavior of susceptible individuals. Note that infected individuals have different health statuses from susceptible ones, and they use the same and fixed strategy of no protective behaviors. Therefore, in this work, we assume that susceptible individuals consider these infected users' decisions not valuable to them, and that the strategies of all susceptible individuals are solely influenced by their susceptible neighbors. So we define the payoff for susceptible individuals only, and ignore the payoffs of infected individuals, as they do not impact susceptible individuals' update of their strategies.
In each time slot, every susceptible individual receives a payoff determined by the chosen strategy and interactions with neighbors. In this paper, we consider two scenarios where the susceptible individual is rational and irrational. Therefore, we define the payoff of different behavior based on the Expected Utility Theory (EUT) [27] and Prospect Theory (PT) [18], respectively, where EUT models the individual as a rational person and PT considers the individual's irrationality. Here, following the prior work in [23], to simplify the analysis, we assume that either all individuals are rational or they are all irrational, and compare the two results to analyze the impact of users' irrationality on the co-evolution of individuals' behavior and the pandemic.
Rational individuals' payoff function: Expected Utility Theory (EUT) is an economic theory that models the decision-making of rational individuals. When an individual chooses a specific behavior, denoted as $a_i$, it leads to $L$ potential actual payoffs $o_{i,1}, o_{i,2}, \ldots, o_{i,L}$ with probabilities $p_{i,1}, p_{i,2}, \ldots, p_{i,L}$, respectively. It is worth noting that an individual may receive more than one actual payoff for their behavior, and $\sum_{j=1}^{L} p_{i,j} \neq 1$ in general. For example, if a user decides to go out for dinner during an epidemic, they will receive a positive payoff from enjoying the fine cuisine, while they may also face a negative payoff if they become infected. In addition, from EUT, the perceived payoff may differ from the actual payoff. For example, the relationship between the perceived and actual payoffs is often not linear, and there is a phenomenon of diminishing marginal payoff [16]. In prior works in EUT, the value function $u_E(x)$ is often used to model the relationship between the actual payoff $x$ and the perceived payoff $u_E(x)$. There have been various forms of $u_E(x)$ used in the previous works, including the simplest form of $u_E(x) = x$, as well as the power function and the exponential function form [28]. Thus in EUT, the payoff associated with choosing behavior $a_i$ is calculated as the expected utility, denoted as $U_i^{EUT}$ and defined as:
\[ U_i^{EUT} = \sum_{j=1}^{L} p_{i,j}\, u_E(o_{i,j}). \tag{4} \]
In our behavior modeling and epidemic control problem, every susceptible individual adopting protective behavior $a_i$ would get a fixed actual payoff of $c_i$, which is the payoff from the behavior itself, and the probability of getting this outcome is 1. One example is the gain from enjoying the fine cuisine of dining outside during an epidemic. If the individual is infected at the next moment, they incur an additional loss of $c_n$ with $c_n < 0$.
For an individual adopting $a_i$, the probability to be infected at time $t$ is approximately $k\beta_i i(t)$ [29]. Therefore, in our model, the individual who takes $a_i$ will get two potential actual payoffs: a payoff of $c_i$ with probability 1, and a payoff of $c_n$ with probability $k\beta_i i(t)$. So the expected utility in (4) becomes
\[ U_i^{EUT} = u_E(c_i) + k\beta_i i(t)\, u_E(c_n). \tag{5} \]
Irrational individuals' payoff function: Different from the Expected Utility Theory (EUT), the Prospect Theory (PT) takes into account the irrational tendencies exhibited by individuals when faced with uncertainty. In PT, individuals have a tendency to overestimate the probability of small risks and underestimate the probability of large risks [16]. Therefore, not only do the actual and the perceived payoffs differ, but the actual and the perceived probabilities are also different when irrational users face uncertainties.
Similar to EUT, the value function $u_P(x)$ in PT can take different forms. One commonly used form is the power function [20]
\[ u_P(x) = \begin{cases} x^{\sigma}, & x \ge 0, \\ -\lambda(-x)^{\sigma}, & x < 0, \end{cases} \tag{6} \]
where $\lambda$ reflects the individual's sensitivity to gain and loss, and $\sigma \in (0, 1]$ reflects the curvature and shape of the value function. Our theoretical analysis in Section 3 and Section 4 does not depend on the specific form of $u_P(x)$.
In addition, instead of using the actual probability $p_i$, individuals' perceived probability is $\boldsymbol{\omega}(p_i, \alpha)$, where $\boldsymbol{\omega}(p, \alpha)$ is the probability weighting function. Following the prior work in [17], in this work, we use the following weighting function to describe the relationship between the perceived probability $\boldsymbol{\omega}(p, \alpha)$ and the actual probability $p$:
\[ \boldsymbol{\omega}(p, \alpha) = \exp\bigl(-(-\ln p)^{\alpha}\bigr), \tag{7} \]
where $\alpha$ is the irrationality coefficient. A smaller $\alpha$ indicates that the individual is more irrational (or equivalently, less rational), and the difference between the actual and the perceived probabilities is larger. Note that when $\alpha = 1$, $\boldsymbol{\omega}(p, 1) = p$ and the perceived and actual probabilities are the same. Also, $\boldsymbol{\omega}(1, \alpha) = 1$ from (7), and we define $\boldsymbol{\omega}(0, \alpha) = 0$. This ensures that $\boldsymbol{\omega}(p, \alpha) \in [0, 1]$ and $\boldsymbol{\omega}(p, \alpha)$ is an increasing function of $p$. For simplicity, we use the mean-field method and assume all individuals have the same $\alpha$.
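The behavior of the probability weighting function is easy to check numerically. The sketch below assumes the weighting function of [17] is the one-parameter Prelec form $\exp(-(-\ln p)^{\alpha})$, which matches the properties stated above ($\boldsymbol{\omega}(p, 1) = p$ and $\boldsymbol{\omega}(1, \alpha) = 1$). The function name and sample values are illustrative.

```python
import math


def prelec_weight(p, alpha):
    """Perceived probability exp(-(-ln p)**alpha), with w(0) = 0 and w(1) = 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 0.0:
        return 0.0  # defined separately because ln(0) does not exist
    if p == 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** alpha))


# Small risks are overweighted and large risks underweighted when alpha < 1.
for p in (0.01, 0.2, 1.0 / math.e, 0.6, 0.9):
    print(f"p={p:.3f}  alpha=1.0 -> {prelec_weight(p, 1.0):.3f}  "
          f"alpha=0.5 -> {prelec_weight(p, 0.5):.3f}")
```

Under this form the perceived and actual probabilities coincide at $p = 1/e$ for every $\alpha$, which is the crossover between overestimation and underestimation.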
Given the probability weighting function in (7) and the value function $u_P(x)$, when an irrational individual chooses behavior $a_i$, which leads to $L$ different potential payoffs $\{o_{i,j}\}$ with corresponding probabilities $\{p_{i,j}\}$, respectively, the expected payoff is
\[ U_i^{PT} = \sum_{j=1}^{L} \boldsymbol{\omega}(p_{i,j}, \alpha)\, u_P(o_{i,j}). \tag{8} \]
In our problem, same as the analysis of $U_i^{EUT}$ above, the individual who chooses behavior $a_i$ will get two potential actual payoffs: a payoff of $c_i$ with probability 1, and a payoff of $c_n$ with probability $k\beta_i i(t)$. So (8) becomes
\[ U_i^{PT} = u_P(c_i) + \boldsymbol{\omega}\bigl(k\beta_i i(t), \alpha\bigr)\, u_P(c_n). \tag{9} \]
Note that if the value functions of EUT and PT are identical (i.e., $u_E(x) = u_P(x)$) and the irrationality coefficient $\alpha$ is set to 1, then PT degenerates to EUT.
Strategy Update Rules: In each time unit, $mN_0(t)$ individuals are randomly selected as the focal individuals to update their strategies and others keep their strategies unchanged, where $m$ is the fraction of individuals who are chosen as the focal individuals, and $N_0(t)$ is the total number of susceptible users at time $t$ in the network. The focal individuals tend to imitate their neighbors' behavior with a high payoff. Following the work in [33], given a focal individual $v$ with strategy $a_i$ and given that $v$ randomly chooses a neighbor $w$ using strategy $a_j$, the probability $p(a_i \to a_j)$ that the individual $v$ changes its strategy to $a_j$ is given by (10), where $\omega \in (0, 1]$ measures the strength of selection, and $U_{\max}$ is the normalization term to ensure $p(a_i \to a_j) \le 1$. $U_i$ and $U_j$ are the payoffs of strategy $a_i$ and $a_j$, respectively.
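The two payoff definitions, (5) for rational and (9) for irrational individuals, can be compared side by side. The sketch below assumes the Prelec form for the weighting function and uses the identity value function for both theories so that any difference comes from probability weighting alone. The cap on the infection probability and all parameter values are additions for illustration, not part of the paper.

```python
import math


def prelec_weight(p, alpha):
    """Prelec weighting exp(-(-ln p)**alpha), with endpoints fixed at 0 and 1."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** alpha))


def infection_probability(k, beta_i, i_t):
    """Approximate per-slot infection probability k*beta_i*i(t).

    Capped at 1 for numerical safety (an assumption of this sketch).
    """
    return min(1.0, k * beta_i * i_t)


def eut_payoff(c_i, c_n, k, beta_i, i_t, value=lambda x: x):
    """Expected-utility payoff: u(c_i) + p * u(c_n)."""
    p = infection_probability(k, beta_i, i_t)
    return value(c_i) + p * value(c_n)


def pt_payoff(c_i, c_n, k, beta_i, i_t, alpha, value=lambda x: x):
    """Prospect-theory payoff: u(c_i) + w(p, alpha) * u(c_n)."""
    p = infection_probability(k, beta_i, i_t)
    return value(c_i) + prelec_weight(p, alpha) * value(c_n)


# Hypothetical numbers: gain c_i = 1 from going out, infection loss c_n = -10.
for i_t in (0.01, 0.9):
    u_eut = eut_payoff(1.0, -10.0, 4, 0.2, i_t)
    u_pt = pt_payoff(1.0, -10.0, 4, 0.2, i_t, alpha=0.5)
    print(f"i(t)={i_t}: EUT={u_eut:.3f}, PT={u_pt:.3f}")
```

In this example the risky behavior looks worse under PT than under EUT when few people are infected, and better under PT when many are infected, which is the qualitative pattern the abstract describes.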
In our work, we assume that individuals with different behaviors are uniformly distributed in the entire network. Therefore, the probability that the focal user $v$ adopts behavior $a_j$ is $x_j(t)$, which represents the proportion of susceptible individuals who choose behavior $a_j$ at time $t$ in the entire network. The proportion of focal user $v$'s susceptible neighbors adopting behavior $a_i$ is likewise the same as the proportion of susceptible users adopting behavior $a_i$ in the entire network. Then the probability that $x_i(t)$ increases by $1/N_0(t)$ due to individuals' strategy change is given by (11), where $N_0(t) = N s(t)$ is the number of susceptible individuals at time $t$. Similarly, the probability that $x_i(t)$ decreases by $1/N_0(t)$ due to individuals' strategy change is given by (12). Combining (10) and (12), we obtain $D_i(t)$ in (13).

Analysis of B i (t)
In reality, even if individuals do not change their behaviors, the proportion of susceptible individuals with different behaviors will change over time due to transitions in health states. Let $s_i(t)$ be the fraction of individuals who are susceptible and adopt behavior $a_i$ at time $t$ among the entire population, that is, $s_i(t) = s(t)x_i(t)$, where $s(t)$ is the fraction of susceptible individuals among all users in the network, and $x_i(t)$ is the fraction of individuals adopting $a_i$ among all the susceptible individuals. Note that $s(t) = \sum_{j=1}^{M} s_j(t)$. When individuals do not change their behavior, we have
\[ B_i(t) = \frac{d}{dt}\,\frac{s_i(t)}{s(t)} = \frac{s_i'(t)\,s(t) - s_i(t)\,s'(t)}{s(t)^2}. \tag{14} \]
Note that $s_i'(t)$, the first order derivative of $s_i(t)$, contains two parts. The first part represents the change caused by the infection of susceptible individuals, while the second part represents the change caused by the recovery of infected individuals. For the first part, as there are a total of $s_i(t)$ susceptible individuals adopting behavior $a_i$, and each of them has probability $\beta_i k i(t)$ to be infected, we have $s_{i1}'(t) = -s_i(t)\beta_i k i(t)$. For the second part, we assume that the recovered individuals would choose their behaviors based on the ratio of different behaviors of susceptible individuals in the network, similar to the work in [34].
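Therefore, $\gamma i(t)$ infected individuals will recover, $x_i(t)$ of whom will choose action $a_i$, and we have $s_{i2}'(t) = \gamma x_i(t) i(t)$. By combining these two parts ($s_i'(t) = s_{i1}'(t) + s_{i2}'(t)$), we have
\[ s_i'(t) = -k\beta_i\, i(t)\, s_i(t) + \gamma\, x_i(t)\, i(t). \tag{15} \]
Given (14), (15) and $s_i(t) = s(t)x_i(t)$, we have
\[ B_i(t) = k\, i(t)\, x_i(t) \Bigl( \sum_{j=1}^{M} \beta_j x_j(t) - \beta_i \Bigr). \tag{16} \]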
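As a worked check, the step from the two rate terms to $B_i(t)$ can be written out explicitly. The derivation below is a sketch that uses only $s_i(t) = s(t)x_i(t)$, $\sum_j x_j(t) = 1$, and the infection and recovery terms stated above; it is a reconstruction from those definitions rather than a copy of the original display equations.

```latex
\begin{align*}
s_i'(t) &= -k\beta_i\, i(t)\, s_i(t) + \gamma\, x_i(t)\, i(t), \\
s'(t)   &= \sum_{j=1}^{M} s_j'(t)
         = -k\, i(t)\, s(t) \sum_{j=1}^{M} \beta_j x_j(t) + \gamma\, i(t), \\
B_i(t)  &= \frac{d}{dt}\,\frac{s_i(t)}{s(t)}
         = \frac{s_i'(t) - x_i(t)\, s'(t)}{s(t)}
         = k\, i(t)\, x_i(t) \Bigl( \sum_{j=1}^{M} \beta_j x_j(t) - \beta_i \Bigr).
\end{align*}
```

The recovery terms cancel, so $\gamma$ does not appear in the result, and $\sum_i B_i(t) = 0$, as it must because the $x_i(t)$ always sum to one. Behaviors with a below-average infection rate gain share among the susceptible population simply because their adopters are infected less often.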

The Overall Behavior Change Dynamics
Combining (3), (13) and (16) gives the complete differential equations (17) describing the dynamics of individual behavior change.

The Dynamics and the Steady States of the Epidemic-Behavior Co-evolution Model
Based on the disease spread equation (2) and the behavior change dynamics (17), we obtain the M-choice epidemic-behavior co-evolution model (18). Here, the first differential equation represents the dynamics of individuals' health states, and the subsequent $M$ equations model the changes in the proportions of susceptible individuals adopting each of the $M$ behaviors. At the steady state, both the proportion of infected individuals $i(t)$ and the proportions of individuals adopting different behaviors $\{x_i(t)\}$ reach a stable state where there are no further changes in $i(t)$ and $\{x_i(t)\}$. Even if a small group of individuals becomes infected/recovered or changes their strategies, the steady state would be restored. The steady state of the M-choice model is characterized by (19). To find the steady state of our M-choice disease spread and behavior change model, we follow an approach that is similar to [34] and apply Lyapunov's first method [35]: a steady state is stable if
\[ \mathrm{Re}(\lambda_k) < 0 \quad \text{for all } k, \tag{20} \]
where $\{\lambda_k\}$ are the eigenvalues of the Jacobian matrix and $\mathrm{Re}(x)$ means the real part of $x$. As it is usually difficult to find the closed-form solution of (19), we often use numerical methods to find the steady states of the co-evolution of disease spread and behavioral choice.
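The stability test behind Lyapunov's first method can be illustrated on the simplest slice of the model. The sketch below is not the full $(M+1)$-dimensional analysis: it assumes everyone takes the risky behavior ($x_1 = 1$), so the system reduces to the single disease equation $di/dt = k\beta_1 i(1-i) - \gamma i$ and the Jacobian is one derivative, estimated here by a central difference. Function names and parameter values are illustrative.

```python
def disease_rhs(i, k, beta, gamma):
    """Right-hand side of di/dt = k*beta*i*(1-i) - gamma*i."""
    return k * beta * i * (1.0 - i) - gamma * i


def jacobian_1d(f, x, h=1e-6):
    """Central-difference estimate of df/dx, the 1x1 Jacobian."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def is_stable(f, x_star, h=1e-6):
    """A steady state is stable when the (only) eigenvalue is negative."""
    return jacobian_1d(f, x_star, h) < 0.0


# Hypothetical parameters above the epidemic threshold: k*beta1 = 0.8 > gamma = 0.4.
k, beta1, gamma = 4, 0.2, 0.4
f = lambda i: disease_rhs(i, k, beta1, gamma)
i_hat = 1.0 - gamma / (k * beta1)  # endemic level when x_1 = 1

print(is_stable(f, 0.0), is_stable(f, i_hat))  # prints: False True
```

Above the threshold the disease-free state is unstable and the endemic level is stable; with $k\beta_1 < \gamma$ the roles reverse and $i = 0$ becomes the stable state.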

Analysis of the Steady State and the Influence of Irrationality for the 2-Behavior Model
To obtain insights into the co-evolution process of disease spread and behavioral choice and their steady states, in this section, we consider a simple scenario where each susceptible individual can have two possible actions and the theoretical solution of (19) can be obtained. For example, a susceptible individual may either take risky behavior such as continuing to go out in spite of the epidemic, or take conservative behavior such as home isolation. Home isolation helps significantly reduce the risk of being infected while it also leads to substantial economic loss as well as impacts people's physical and mental well-being. During the pandemic, individuals need to choose between high-cost low-risk conservative behavior and low-cost high-risk risky behavior. Their decisions are often influenced by the severity of the epidemic and the potential loss due to home isolation. People tend to choose self-isolation when the pandemic poses a greater threat to their health, while they may be inclined to go out when the loss due to home isolation is too high (e.g., losing their jobs and income). We use this simple scenario to gain insights into the co-evolution process and its steady states, and theoretically analyze the impact of irrationality on the pandemic as well as users' behavior. For the more general scenario with more than 2 possible behavior choices, we use numerical solutions and simulation results to show the evolution process, and an example with three possible behavior choices is shown in Appendix H.
In this section, based on our model in the previous section, we analyze the evolution of the epidemic and the dynamics of individuals' choices between two behaviors: risky behavior (going out) represented by $a_1$, and conservative behavior (home isolation) represented by $a_2$. As an example, we assume that the infection rate for risky behavior is $\beta_1$, while the infection rate for conservative behavior is $\beta_2 = 0$ as isolation ensures no infection risk. Then we analyze the steady states when the individuals are all rational or irrational, respectively, and compare their results to investigate the influence of irrationality.

Steady State Analysis

The steady states with all rational individuals
When all individuals are rational, the payoff is modeled by EUT in (5). Since $x_1(t) + x_2(t) = 1$, we can replace $x_2(t)$ by $1 - x_1(t)$. With 2 possible behavior choices, and given the EUT payoff function (5), the differential equations in (18) become (21), where $k_0 = \frac{m\omega}{U_{\max}} > 0$. To find the steady states of (21), we have the following Theorem 1.
Theorem 1. The steady state $(i^E, x_1^E)$ of (21) satisfies (22), where i , there is no steady state.
Proof : See Appendix A.
From Theorem 1, there are three possible steady states, which correspond to three different situations in reality:
• Case 1: The steady state $(0, 1)$ represents the extreme situation where the infection rate is too low and the disease would die out eventually even without any protection. So all individuals choose the risky behavior of going out. The evolution process reaches the steady state $(0, 1)$ when $\frac{\beta_1}{\gamma} < \frac{1}{k}$, that is, $k < \frac{\gamma}{\beta_1}$. The term $\frac{1}{k}$ is the epidemic threshold of a homogeneous network [36]. If $\frac{\beta}{\gamma} < \frac{1}{k}$, the epidemic would die out; otherwise, it would spread out. Since $\beta = \beta_1 x_1(t) \in [0, \beta_1]$, in this scenario, the disease would die out no matter which behavior people choose. Therefore, all individuals would choose the risky behavior in this steady state since it gives a higher payoff, and when $k < \frac{\gamma}{\beta_1}$, the stable state is $i = 0$ and $x_1 = 1$.
• Case 2: For the steady state $\left(1 - \frac{\gamma}{k\beta_1}, 1\right)$, there are two constraints that need to be satisfied. The first constraint is $k > \frac{\gamma}{\beta_1}$, which means that if all individuals choose the risky behavior, the disease will spread out. The second constraint is $\Phi_1 < 0$. To understand the second constraint, note that when all individuals choose the risky behavior with $x_1 = 1$, the proportion of infected individuals will reach the stable state $\hat{i} = 1 - \frac{\gamma}{k\beta_1}$, which represents the maximum extent to which the disease can spread (proof: see Appendix B). If for all possible values of $i$ in the range $[0, \hat{i}]$, we have $\frac{dx_1}{dt} > 0$ for all $x_1(t) \in (0, 1)$, then more people will choose the risky behavior as time goes on, and ultimately all individuals will choose the risky behavior at the steady state with $x_1 = 1$. This may happen when the risky behavior's payoff is much higher than the conservative behavior's with $c_1 \gg c_2$, or when the cost of being infected $c_n$ is very low. Note that from (21), $\left.\frac{dx_1}{dt}\right|_{0 \le i \le \hat{i},\, 0 < x_1 < 1} > 0$ is equivalent to $\Phi_1 < 0$, where $\Phi_1$ is defined in (22). Therefore, when $k > \frac{\gamma}{\beta_1}$ and $\Phi_1 < 0$, the steady state $\left(1 - \frac{\gamma}{k\beta_1}, 1\right)$ is reached, where all people choose the risky behavior, and the proportion of infected people reaches the maximum level $\hat{i}$.
• Case 3: The steady state $\left(i^{(1)}, \frac{\gamma}{(1 - i^{(1)})k\beta_1}\right)$ represents the scenario other than the above two extreme cases. In this scenario, the disease neither dies out nor spreads to the maximum extent, and at the steady state, a fraction $0 < i^{(1)} < \hat{i}$ of the individuals in the network will be infected. Meanwhile, a fraction $0 < \frac{\gamma}{(1 - i^{(1)})k\beta_1} < 1$ of susceptible individuals will choose the risky behavior. This happens when $k > \frac{\gamma}{\beta_1}$ and $\Phi_1 \ge 0$.

The steady states with irrational individuals
Next, we consider the scenario where all individuals are "irrational", and model their payoff function using the Prospect Theory. Plugging the PT utility function (9) into (17), the dynamics of the epidemic and the behavior become (23). Similarly, we can get the steady state of (23) in Theorem 2.
Theorem 2. The steady state $(i^P, x_1^P)$ of (23) satisfies (24), where and $i^{(2)}$ , there is no steady state.
Proof: See Appendix C.
The three stable states in PT share similarities with those in EUT:
• Case 1: The steady state (0, 1) is the same as Case 1 under EUT. In this situation, the disease always dies out, and all individuals then choose the risky behavior.
• Case 2: The steady state $(1 - \frac{\gamma}{k\beta_1}, 1)$ is the same as Case 2 under EUT. In situations where the payoff for risky behavior is extremely high or the loss of being infected is very small, all individuals choose the risky behavior, causing the disease to spread to its maximum extent. Note that the first constraints in (22) and (24) are the same, while the second constraints are different, as shown in (25).
• Case 3: The steady state $(i^{(2)}, \frac{\gamma}{(1 - i^{(2)})k\beta_1})$ is similar to Case 3 under EUT, where the disease neither dies out nor spreads to the maximum extent.
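The maximum infection level $\hat{i} = 1 - \frac{\gamma}{k\beta_1}$ in Case 2 can be checked numerically. The sketch below assumes a standard SIS mean-field equation, $\frac{di}{dt} = k\beta_1 x_1 i(1 - i) - \gamma i$, with the risky fraction $x_1$ held fixed. This form is an assumption chosen to be consistent with the steady states stated above; it is not copied from (17) or (23), and the function name is hypothetical.

```python
def simulate_infected_fraction(k, beta1, gamma, x1=1.0, i0=0.01,
                               dt=0.1, steps=100_000):
    """Forward-Euler integration of di/dt = k*beta1*x1*i*(1-i) - gamma*i.

    Assumed mean-field dynamics with a fixed risky fraction x1.
    Returns the infected fraction after `steps` time steps.
    """
    i = i0
    for _ in range(steps):
        i += dt * (k * beta1 * x1 * i * (1.0 - i) - gamma * i)
    return i


# Case 2 setting (k*beta1 > gamma, x1 = 1): converges to 1 - gamma/(k*beta1).
i_case2 = simulate_infected_fraction(k=10, beta1=0.006, gamma=0.03)
# Case 1 setting (k*beta1 < gamma): the infection dies out.
i_case1 = simulate_infected_fraction(k=10, beta1=0.002, gamma=0.03)
```

Under these assumed dynamics, the first run settles at $1 - 0.03/0.06 = 0.5$ and the second decays toward zero, matching Cases 2 and 1.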

Analysis of Individuals' Irrationality
In this section, we analyze the influence of individuals' irrationality on the steady state. In the weighting function in (7), the irrationality coefficient $\alpha$ quantifies the degree of individuals' irrationality: a smaller $\alpha$ indicates that individuals are more irrational, with a larger difference between the actual and perceived risk. To analyze the impact of the irrationality coefficient on people's behavior, we compare the steady states $(i^P, x_1^P)$ at different $\alpha$, and we have the following Theorem 3.
Theorem 3. Given the same set of system parameters (that is, $\beta_1$, $\gamma$, $k_0$, $c_1$, $c_2$, $c_n$ and $k$) and the same value function $u^P(x)$, let $0 < \alpha < \bar{\alpha} < 1$ be two irrationality coefficients, and let $(\bar{i}^P, \bar{x}_1^P)$ and $(i^P, x_1^P)$ be the steady states with $\bar{\alpha}$ and $\alpha$, respectively, where the individuals with $\bar{\alpha}$ have low irrationality and the individuals with $\alpha$ have high irrationality. Then we have:
3a. When $k < \gamma/\beta_1$, all individuals, regardless of their irrationality degree, will choose the risky behavior with $\bar{x}_1^P = x_1^P = 1$, and the epidemic will eventually die out with $\bar{i}^P = i^P = 0$.
3b. When $k > \gamma/\beta_1$ and $\Phi_2 \ge 0$ does not hold simultaneously for $\bar{\alpha}$ and $\alpha$:
In addition, if $1 - \frac{\gamma}{k\beta_1} \le \frac{1}{k\beta_1 e}$, there are two possibilities.
- When $\Phi_2 < 0$ for both $\bar{\alpha}$ and $\alpha$, all individuals, regardless of their irrationality degree, will choose the risky behavior with $\bar{x}_1^P = x_1^P = 1$.
- When $\Phi_2 < 0$ for $\bar{\alpha}$ and $\Phi_2 \ge 0$ for $\alpha$, all individuals with low irrationality will choose the risky behavior with $\bar{x}_1^P = 1$. On the other hand, only a subset of individuals with high irrationality will opt for the risky behavior with $x_1^P < 1$.
On the contrary, if $1 - \frac{\gamma}{k\beta_1} \ge \frac{1}{k\beta_1 e}$, there are two possibilities.
- When $\Phi_2 < 0$ for both $\bar{\alpha}$ and $\alpha$, all individuals, regardless of their irrationality degree, will choose the risky behavior with $\bar{x}_1^P = x_1^P = 1$.
- When $\Phi_2 < 0$ for $\alpha$ and $\Phi_2 \ge 0$ for $\bar{\alpha}$, all individuals with high irrationality will choose the risky behavior with $x_1^P = 1$. On the other hand, only a subset of individuals with low irrationality will opt for the risky behavior with $\bar{x}_1^P < 1$.
3c. When $k > \gamma/\beta_1$ and $\Phi_2 \ge 0$ for both $\bar{\alpha}$ and $\alpha$, the epidemic neither dies out nor spreads to the maximum extent, and a fraction of individuals choose the risky behavior at the steady state.
- In addition, if $\bar{i}^P \le \frac{1}{k\beta_1 e}$, we have $\bar{i}^P \ge i^P$ and $\bar{x}_1^P \ge x_1^P$, that is, compared to individuals with high irrationality, fewer individuals with low irrationality would choose the conservative behavior. The proportion of individuals with low irrationality getting infected would be higher than that of those with high irrationality.
- On the contrary, if $\bar{i}^P \ge \frac{1}{k\beta_1 e}$, we have $\bar{i}^P \le i^P$ and $\bar{x}_1^P \le x_1^P$, that is, compared to individuals with high irrationality, more individuals with low irrationality would choose the conservative behavior. The proportion of individuals with low irrationality getting infected would be lower than that of those with high irrationality.
Proof: See Appendix D.
To better understand Theorem 3, note that in (24), Case 1 represents the situation where the infection rate is too low and the disease would eventually die out no matter how individuals choose their behavior. Therefore, individuals' irrationality will not affect the outcome when $k < \gamma/\beta_1$, as stated in Theorem 3a.
For Theorem 3b, when $k > \gamma/\beta_1$ and $\Phi_2 \ge 0$ does not hold simultaneously for $\bar{\alpha}$ and $\alpha$, if $1 - \frac{\gamma}{k\beta_1} \le \frac{1}{k\beta_1 e}$, the percentage of high irrationality individuals choosing the risky behavior will be less than or equal to the percentage of low irrationality individuals, that is, $x_1^P \le \bar{x}_1^P$. This is because, in this scenario, the risk of being infected is low (i.e., $\bar{i}^P = i^P = 1 - \frac{\gamma}{k\beta_1} \le \frac{1}{k\beta_1 e}$), and higher irrationality makes individuals overestimate this small probability of risk, causing them to be more conservative. On the contrary, if $1 - \frac{\gamma}{k\beta_1} \ge \frac{1}{k\beta_1 e}$, the percentage of high irrationality individuals choosing the risky behavior will be larger than or equal to the percentage of low irrationality individuals, that is, $x_1^P \ge \bar{x}_1^P$. This is because, in this scenario, the risk of being infected is high (i.e., $\bar{i}^P = i^P = 1 - \frac{\gamma}{k\beta_1} \ge \frac{1}{k\beta_1 e}$), and higher irrationality makes individuals underestimate this large probability of risk, causing them to be more adventurous.
For Theorem 3c, when $k > \gamma/\beta_1$ and $\Phi_2 \ge 0$ for both $\bar{\alpha}$ and $\alpha$, both $(\bar{i}^P, \bar{x}_1^P)$ and $(i^P, x_1^P)$ are in Case 3 in Theorem 2, where some individuals get infected while the rest do not. In this case, if the risk of getting infected is low when stable (i.e., $\bar{i}^P \le \frac{1}{k\beta_1 e}$), higher irrationality can motivate individuals to adopt conservative behaviors as they tend to overestimate this small risk, resulting in a decrease in the probability of being infected. On the contrary, if the risk of getting infected is high when stable (i.e., $\bar{i}^P \ge \frac{1}{k\beta_1 e}$), higher irrationality can reduce individuals' cautiousness as they tend to underestimate this large risk, resulting in more people getting infected.
Note that from Section 2.2, if the value functions of EUT and PT are identical (i.e., u E (x) = u P (x)), then EUT can be considered as a special case of PT with α = 1.Therefore, we can also apply Theorem 3 to make comparisons between rational individuals (following EUT) and irrational individuals (following PT).
In summary, irrationality tends to make users become more extreme, that is, risk-averse when the risk is small and risk-seeking when the risk is high.
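The over- and underestimation of risk described above can be illustrated with the weighting function $\omega(p, \alpha) = e^{-(-\ln p)^{\alpha}}$ used in this work (a form commonly known as the Prelec weighting function). The sketch below is a minimal illustration; the function name and the sample probabilities are chosen here for demonstration only.

```python
import math


def weight(p, alpha):
    """Perceived probability omega(p, alpha) = exp(-(-ln p)^alpha)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if p == 0.0:
        return 0.0  # omega(0, alpha) = 0 by definition
    if p == 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** alpha))


small_risk = weight(0.01, 0.6)          # larger than 0.01: overestimated
large_risk = weight(0.90, 0.6)          # smaller than 0.90: underestimated
crossover = weight(math.exp(-1), 0.6)   # p = 1/e is perceived accurately
rational = weight(0.30, 1.0)            # alpha = 1 recovers the true p
```

The crossover at $p = 1/e$ is the reason the threshold $\frac{1}{k\beta_1 e}$ separates the two regimes in Theorem 3: below it the perceived risk exceeds the actual risk, and above it the reverse holds.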

Behavior Inducement to Control the Disease Spread
In this section, based on our analysis in Sections 2 and 3, we study how to guide users' behavior and control the spread of disease through policy design, and we develop effective behavior inducement algorithms.

The Optimal Behavior Inducement Algorithms
We first discuss measures that governments can take to guide people's behavior during an epidemic.For example, they can incentivize or penalize certain behaviors, such as subsidizing risky behaviors (e.g., going out) to boost the economy, penalizing risky behavior, or encouraging conservative behaviors (such as staying at home and wearing masks) to control the disease spread.In our model, this means the parameters c 1 and c 2 can be changed.In addition, during a pandemic, people often have different perceptions of the loss of being infected, which are largely due to the various propaganda efforts.So we assume that the parameter c n can also be adjusted.Furthermore, note that propaganda via social networks and media often influences people's irrationality [37], and thus, we assume that the irrationality coefficient α can be changed as well.In this work, to simplify the analysis, we consider the simple scenario where these parameters c 1 , c 2 , c n and α can be changed to the desired values.We plan to study in our future work the more practical scenario where the optimization parameters are the actions government can take (such as rewarding or punishing specific behaviors through policies) instead of the exact values of these parameters.
Next, we discuss the goals of behavior guidance. The first goal is to control the spread of the disease at the steady state. For example, the government may wish to keep the number of infected people as low as possible. We represent the loss caused by the pandemic as $l_1(i^P)$, where $(i^P, x_1^P)$ is the steady state of PT. Also, if a large percentage of people take conservative behavior such as self-isolation, it will have a significant impact on the economy as well as people's mental health. Therefore, the second loss term we consider is the loss due to such conservative behavior, $l_2(x_1^P)$. Furthermore, changing the values of $c_1$, $c_2$, $c_n$, and $\alpha$ through behavior inducements such as propaganda, subsidies, and penalties incurs costs. The third goal is therefore to minimize the cost associated with behavior guidance, $l_3(\boldsymbol{\delta})$, where $\boldsymbol{\delta} = [\Delta\alpha, \Delta c_n, \Delta c_1, \Delta c_2]$ is the intervention vector quantifying the extent to which these variables are changed. Given $c_1$, $c_2$, $c_n$, and $\alpha$ before behavior guidance, the adjusted parameters are $\alpha + \Delta\alpha$, $c_n + \Delta c_n$, $c_1 + \Delta c_1$, and $c_2 + \Delta c_2$ (26). Since the irrationality coefficient should be in $(0, 1]$ and the payoff of being infected should be negative, we have $0 < \alpha + \Delta\alpha \le 1$ and $c_n + \Delta c_n < 0$. Our goal is to find the optimal $\boldsymbol{\delta}$ that minimizes the total loss; the resulting optimization problem is (27). In this work, we do not specify the forms of $l_1(i^P)$, $l_2(x_1^P)$, and $l_3(\boldsymbol{\delta})$, but we assume that they are differentiable, i.e., $\frac{\partial l_1}{\partial i}$, $\frac{\partial l_2}{\partial x_1}$, and $\frac{\partial l_3}{\partial \boldsymbol{\delta}}$ exist. Moreover, we assume that $\frac{\partial l_3}{\partial \Delta\alpha}$ satisfies the sign constraint (28), and the same constraint also holds for $\frac{\partial l_3}{\partial \Delta c_n}$, $\frac{\partial l_3}{\partial \Delta c_1}$ and $\frac{\partial l_3}{\partial \Delta c_2}$. This implies that, compared to the case where no behavior guidance is taken with $\boldsymbol{\delta} = \mathbf{0}$, increasing or decreasing any of the variables ($c_1$, $c_2$, $c_n$, and $\alpha$) results in an increase in $l_3(\boldsymbol{\delta})$, and $l_3(\boldsymbol{\delta})$ attains its minimum at $\mathbf{0}$, i.e., with no behavior guidance.
From (24), there are three possible steady states. To solve the optimization problem (27), we would need to consider all three possible steady states and analyze the optimal solution for each, which would be very complicated. We therefore introduce Theorem 4 to simplify this problem.
Given the system parameters and δ δ δ, let (i 0 , x 0 1 ) and (i P , x P 1 ) be the steady states without and with behavior inducement, respectively.Let δ δ δ (3) be the optimal adjustment parameter when the steady state after adjustment (i P , x P 1 ) is in Case 3 in Theorem 2. That means where Φ 2 is defined in (24).
From our analysis in Appendix E, we have the following Theorem 4.
From Theorem 4, we only need to model and solve the problem for Case 3 in (24), which greatly reduces the complexity of our problem.

Solving the Optimization Problems
In this section, we consider the following scenario and use it as an example to demonstrate how to model and solve the optimization problem in (27).Consider the scenario where the government wants to control the epidemic spread and reduce the impact on the economy and people's mental health, that means, at the steady state after behavior inducement (i P , x P 1 ), the percentage of infected people i P is no more than i m and the ratio of people taking risky behavior x P 1 is at least x m .That is, there are two constraints 0 ≤ i P ≤ i m and 1 ≥ x P 1 ≥ x m .Here, we assume that 0 ≤ i m ≤ 1 and 0 ≤ x m ≤ 1.
As we assume that the infection rate $\beta_1$, the recovery rate $\gamma$, and the average degree of the network $k$ are fixed and cannot be changed, it is possible that no $\boldsymbol{\delta}$ can make the steady state $(i^P, x_1^P)$ satisfy both constraints simultaneously. For example, from the analysis in Section 3, when the infection rate is very high, it is unlikely that the epidemic can be kept within a very small range while everyone goes out without any protective measures. Therefore, given $\beta_1$, $\gamma$ and $k$, the first step is to determine whether the two constraints $0 \le i^P \le i_m$ and $1 \ge x_1^P \ge x_m$ are feasible, that is, whether it is possible to find a $\boldsymbol{\delta}$ that makes the steady state satisfy both constraints. Section 4.2.1 studies how to determine the feasibility of the two constraints. If they are feasible, Section 4.2.2 explains the details of the optimization problem and proposes a fast algorithm to solve it. If they are not feasible, Section 4.2.3 studies how to reformulate the problem so that the steady state $(i^P, x_1^P)$ is as close as possible to the constraints.

The Feasibility Test
From Theorem 4, we only need to find the optimal solution in Case 3; the optimal solution over the whole space then follows by comparing it with $\mathbf{0}$. From (24), if the steady state $(i^P, x_1^P)$ is in Case 3, it should satisfy (30). Note that in (30), $x_1^P$ is an increasing function of $i^P$. Therefore, given $0 \le i^P \le i_m$, $x_1^P$ is bounded above by its value at $i^P = i_m$. If $x_m$ does not exceed this bound, then it is possible to find a $\boldsymbol{\delta}$ whose corresponding steady state $(i^P, x_1^P)$ satisfies $0 \le i^P \le i_m$ and $1 \ge x_1^P \ge x_m$ simultaneously, and thus the two constraints are feasible. Otherwise, the two constraints are infeasible and cannot be satisfied at the same time.
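The feasibility test can be sketched in Python as follows. The sketch assumes that the Case 3 relation takes the form $x_1^P = \frac{\gamma}{(1 - i^P)k\beta_1}$, consistent with the steady states in Section 3, and that $k\beta_1 > \gamma$. The function names and the cap of the bound at 1 are illustrative assumptions, not the authors' implementation.

```python
def max_risky_fraction(i_m, k, beta1, gamma):
    """Largest x_1 reachable in Case 3 when the infected fraction is at most i_m.

    Assumes x_1 = gamma / ((1 - i) * k * beta1), which increases with i,
    so the bound is attained at i = i_m (capped at 1 because x_1 <= 1).
    """
    if i_m >= 1.0:
        return 1.0
    return min(1.0, gamma / ((1.0 - i_m) * k * beta1))


def constraints_feasible(i_m, x_m, k, beta1, gamma):
    """True if some Case 3 steady state can satisfy i <= i_m and x_1 >= x_m."""
    return x_m <= max_risky_fraction(i_m, k, beta1, gamma)


# Example with beta1 / gamma = 2/3 and k = 10 (one of the settings of Fig. 2).
ok = constraints_feasible(i_m=0.5, x_m=0.2, k=10, beta1=0.02, gamma=0.03)
too_strict = constraints_feasible(i_m=0.5, x_m=0.4, k=10, beta1=0.02, gamma=0.03)
```

In this example the bound at $i_m = 0.5$ is $0.03 / (0.5 \times 10 \times 0.02) = 0.3$, so requiring $x_m = 0.2$ is feasible while requiring $x_m = 0.4$ is not.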
Fig. 2 shows two examples of the feasible region of (i m , x m ), where the green area includes all (i m , x m ) where the constraints are feasible, and the red area includes all the infeasible constraints (0

Solving the Optimization Problem With Feasible Constraints
If $0 \le i^P \le i_m$ and $1 \ge x_1^P \ge x_m$ are feasible, we discuss how to solve the optimization problem. Given the two constraints, we adopt the exterior-point method and transform them into the penalty function (31), where $\mu$ is a parameter that determines the intensity of the penalty and $P(x) \triangleq (\max\{0, x\})^2$. We let $l_1(i^P) = \mu[P(i^P - i_m) + P(-i^P)]$ and $l_2(x_1^P) = \mu[P(x_m - x_1^P) + P(x_1^P - 1)]$, so that the optimization problem (27) becomes (32). According to Theorem 4, the optimization problem can be transformed into the problem in Case 3, and the optimal solution over the whole space then follows by comparing it with $\mathbf{0}$. Here we only consider the situation where $k > \gamma/\beta_1$. The optimization problem in Case 3 then becomes (33). Here, $\omega[p, \alpha] = e^{-(-\ln p)^{\alpha}}$. Constraints ③, ④ and ⑤ guarantee that the steady state is in Case 3 in Theorem 2. The method for solving the problem is in Appendix F.
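The penalty terms $l_1$ and $l_2$ defined above can be written directly in Python. This is a minimal sketch of the exterior-point penalty only: the function names are hypothetical, and the intervention cost $l_3(\boldsymbol{\delta})$ and the Case 3 constraints are left out.

```python
def squared_hinge(x):
    """P(x) = (max{0, x})^2: zero when x <= 0, quadratic otherwise."""
    return max(0.0, x) ** 2


def constraint_penalty(i_p, x_p, i_m, x_m, mu):
    """l_1(i_p) + l_2(x_p) for the constraints 0 <= i_p <= i_m, x_m <= x_p <= 1."""
    l1 = mu * (squared_hinge(i_p - i_m) + squared_hinge(-i_p))
    l2 = mu * (squared_hinge(x_m - x_p) + squared_hinge(x_p - 1.0))
    return l1 + l2


inside = constraint_penalty(i_p=0.2, x_p=0.8, i_m=0.3, x_m=0.5, mu=10.0)
violated = constraint_penalty(i_p=0.4, x_p=0.8, i_m=0.3, x_m=0.5, mu=10.0)
```

A steady state that satisfies both constraints incurs no penalty, while one that violates a constraint is penalized in proportion to $\mu$ and to the squared violation, so a larger $\mu$ pushes the optimizer harder toward the feasible region.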

Reformulation of the Optimization Problem When the Constraints are Infeasible
If the constraints $0 \le i^P \le i_m$ and $1 \ge x_1^P \ge x_m$ are not feasible, we reformulate the optimization problem to make $(i^P, x_1^P)$ as close to $(i_m, x_m)$ as possible. Specifically, we move the two constraints $0 \le i^P \le i_m$ and $1 \ge x_1^P \ge x_m$ into the objective function as the distance terms $l_1(i^P) = (i^P - i_m)^2$ and $l_2(x_1^P) = (x_1^P - x_m)^2$ (see Appendix G), so that the steady state $(i^P, x_1^P)$ is close to $(i_m, x_m)$ in the $i$-$x_1$ plane. Then the optimization problem (33) becomes (34). We can use the same method as in Section 4.2.2 to solve (34), and details are shown in Appendix G.

Simulation Results
In this section, we first run simulations to validate our steady state analysis of the coevolution process and the effect of irrationality on behavior and the pandemic in Section 3. Then we validate the effectiveness of our behavior guidance algorithms proposed in Section 4. As there are few previous works that analyze how to model irrational individuals in a pandemic, we do not compare our method with other works in this section.

Simulations of the Steady States of EUT and PT
Theorems 1 and 2 give theoretical analyses of the steady states when individuals are rational and irrational, respectively. To validate the two theorems, as an example, we conducted simulations on regular networks with 500 nodes. The physical contact network has a fixed degree of 10, while the information network has a degree of 20. We observe similar trends on other types of networks and with other parameters. We set the recovery rate to $\gamma = 0.03$ as an example and let the infection rate $\beta_1$ vary.
We first run the simulation to validate the three cases of the steady state of EUT and PT. Since risky behaviors such as going out and not wearing a mask are the default behaviors most people take in their daily lives, we set $c_1 = 0$, while conservative behaviors such as isolation and wearing masks can be regarded as behaviors with losses, so we let $c_2 < 0$. In this work, we let $c_1 = 0$, $c_2 = -1$, and $c_n = -20$. To facilitate the comparison between EUT and PT, we set $u^E(x) = u^P(x)$ and use the power function in (6) with $\sigma = 0.65$ and $\lambda = 1$ as an example. For other value functions, we observe the same trend and omit the results here. For each simulation setup, we repeat the experiment 50 times and show the average result below.
Fig. 3 shows simulation results of $(i^E, x_1^E)$ and $(i^P, x_1^P)$ with different infection rates $\beta_1$. We can see that the simulation results match the theoretical results of the steady states in Theorems 1 and 2 very well. In Fig. 3, we can clearly see that the steady states can be divided into three segments, representing the three cases. The red area is Case 1: when the infection rate is too low, all individuals choose the risky behavior, and the disease does not spread. The green area is Case 2, where the proportion of infected people increases gradually as the infection rate increases. However, since the infection rate is not high enough while the payoff of risky behavior is still high, all individuals still adopt the risky behavior. The blue area is Case 3, where the infection rate continues to increase, the payoff of risky behavior gradually decreases, and more individuals adopt conservative behavior. Thus, the percentage of infected individuals decreases.
Then we validate Theorem 3 and study the impact of irrationality on the steady states using the same simulation setup as before. Fig. 4 shows the average results of 50 simulation runs. We observe the same trend for other types of networks and other parameter settings. Note that in Fig. 4, $\alpha = 1$ (EUT) corresponds to the scenario with rational users, as explained in Section 2.2.1. In Fig. 4, as $\beta_1/\gamma$ increases, Point A is the boundary point separating Cases 1 and 2, and Points B, C, and D are the boundary points separating Cases 2 and 3 with $\alpha = 0.6$, $\alpha = 0.8$, and $\alpha = 1$ (EUT), respectively. We then analyze the results of $\alpha = 0.6$ and $\alpha = 0.8$ as an example. We use $(i^P, x_1^P)$ and $(\bar{i}^P, \bar{x}_1^P)$ to represent the steady states for $\alpha = 0.6$ and $\alpha = 0.8$, respectively. From Fig. 4(a) and 4(b):
• Before Point A, that is, when $\beta_1/\gamma < 1/k$, irrationality does not influence users' behavior or the steady states in Case 1, as both curves with $\alpha = 0.6$ and $\alpha = 0.8$ have $\bar{x}_1^P = x_1^P = 1$ and $\bar{i}^P = i^P = 0$. Also, the boundary points separating Case 1 and Case 2 (Point A in Fig. 4(a) and 4(b)) are the same for different $\alpha$. The results validate Theorem 3a.
• Between Points A and B, $\beta_1/\gamma > 1/k$ and $\Phi_2 < 0$ for both $\alpha = 0.6$ and $\alpha = 0.8$. Therefore, from Theorem 3b, both individuals with low irrationality ($\alpha = 0.8$) and individuals with

in Fig. 7(a). As shown there, the loss function decreases after each iteration and converges after 10,000 iterations. Simulation results of the algorithm with infeasible constraints are shown in Fig. 6. Here, the two given constraints $(i_m, x_m)$ are below the boundary of the feasible region and impossible to satisfy at the same time. It can be seen that although the steady state with behavior guidance cannot satisfy the two constraints simultaneously, it is on the boundary of the feasible region and very close to $(i_m, x_m)$. The value of the loss function in (34) after each iteration is shown in Fig. 7(b), where we can see that it decreases and converges as the number of iterations increases.

Real User Tests
While the simulation experiments have validated our theoretical analysis, it is necessary to further validate our conclusions using real user tests.However, obtaining real user data on behavioral choices, network structure, and spread data simultaneously poses a significant challenge.Therefore, we qualitatively validate our conclusions through sociological experiments.In our test, 141 subjects are interviewed, including 61 males and 80 females.Their occupations include students, production staff, salespersons, human resources, teachers, etc.Their ages range from 18 to 60.
In our test, we first collect data to estimate the irrationality coefficient α of the subjects.Note that in Section 2 and 3, to simplify the analysis, we assume that all individuals have the same α, which can be considered as the average of α over the entire network.In reality, the irrationality coefficient α varies from person to person.Therefore, in our test, we estimate α for each individual separately.Next, we gather data on the subjects' behavioral choices in various scenarios during a pandemic.This data collection process allows us to capture individuals' risk preferences.Finally, we analyze the relationship between the irrationality coefficient α and the risk preference.By examining this relationship, we gain insights into how individuals' irrationality impacts their decisions during a pandemic and validate our analysis in Section 3.

Estimating the Irrationality Coefficient α
Data collection: Following the works in [18][19][20], we estimate the subjects' irrationality coefficient α in a way similar to gambling games.In our experiments, subjects are presented with a scenario where they have a probability p i of incurring a large financial loss.However, they are also given the option to purchase insurance at different prices to mitigate the potential loss.The subjects are then asked to make a decision regarding whether they would choose to buy the insurance.Below is an example we use in our experiment.
Question A: You have a 10% probability of losing ¥100, but if you choose to buy insurance, you can guarantee that you will not bear this loss. Then what is the insurance price you can accept?
a. When the price is lower than ¥10, I would buy it.
b. When the price is lower than ¥20 but higher than ¥10, I would buy it.
c. When the price is lower than ¥30 but higher than ¥20, I would buy it.
d. When the price is lower than ¥40 but higher than ¥30, I would buy it.
e. When the price is lower than ¥50 but higher than ¥40, I would buy it.
f. Even if the price is higher than ¥50, I would buy it.
After the subjects have made their initial choices, a refined set of choices is presented to them to obtain fine-grained results. For instance, if a subject chose option d in the above question, the prices shown in the subsequent question would be narrowed down to a range between ¥30 and ¥40, such as {¥30-¥32, ¥32-¥34, ¥34-¥36, ¥36-¥38, ¥38-¥40}. This process continues until the range is narrowed down to ¥1. For each subject, we change the probability $p_i$ in the questionnaire, repeat the above process, and get multiple pairs of $(p_i, r_i)$, where $r_i$ is the final acceptable insurance price of the subject.
Estimation of α: If the subject chooses an acceptable insurance price of $r_i$ to avoid the loss of ¥100 with probability $p_i$, then for this subject, a loss of ¥100 with probability $p_i$ is equivalent to a loss of $r_i$ with probability 1. Then from (8), the PT utilities of these two prospects are equal, which gives (35). Similar to [20], we use (7) and (6) as the probability weighting function and the value function, respectively, in our work. Note that from the definitions, we have $u^P(0) = 0$ and $\omega(0, \alpha) = 0$, so (35) reduces to
$$-e^{-(-\ln p_i)^{\alpha}} \lambda \cdot 100^{\sigma} = -\lambda r_i^{\sigma}, \qquad (36)$$
which is equivalent to
$$\ln\left(-\ln\frac{r_i}{100}\right) = \alpha \ln(-\ln p_i) - \ln\sigma. \qquad (37)$$
In (37), $\ln(-\ln(r_i/100))$ is a linear function of $\ln(-\ln p_i)$ with slope $\alpha$. We set $\sigma = 0.65$ following the work in [20]. Given the collected pairs $\{(p_i, r_i)\}$ from one subject, we use linear regression to find $\alpha$ in (37) for this subject.
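The regression step can be sketched in Python as follows. Since $\sigma$ is fixed, the intercept $-\ln\sigma$ in (37) is known, so this sketch fits only the slope $\alpha$ by least squares. This is one possible reading of "linear regression" and is an assumption, as are the function names and the synthetic data; none of it is the study's code or data.

```python
import math


def implied_price(p, alpha, sigma=0.65, loss=100.0):
    """Insurance price r that satisfies (37) for a given p and alpha."""
    return loss * math.exp(-((-math.log(p)) ** alpha) / sigma)


def estimate_alpha(pairs, sigma=0.65, loss=100.0):
    """Least-squares slope of (37) with the intercept fixed at -ln(sigma).

    pairs: iterable of (p_i, r_i) with 0 < p_i < 1 and 0 < r_i < loss.
    """
    num = 0.0
    den = 0.0
    for p, r in pairs:
        x = math.log(-math.log(p))          # ln(-ln p_i)
        y = math.log(-math.log(r / loss))   # ln(-ln(r_i / 100))
        num += x * (y + math.log(sigma))
        den += x * x
    if den == 0.0:
        raise ValueError("need at least one pair with p different from 1/e")
    return num / den


# Synthetic subject with alpha = 0.7 (illustrative values, not study data).
synthetic = [(p, implied_price(p, 0.7)) for p in (0.05, 0.1, 0.2, 0.5)]
alpha_hat = estimate_alpha(synthetic)
```

On noise-free synthetic pairs the estimate recovers the generating $\alpha$. With real questionnaire answers, the fit would also absorb the ¥1 resolution of the elicited price.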

Measuring Individuals' Risk Preference
Data collection: Then we proceed to collect data on the behavioral choices of different subjects when confronted with risky scenarios during a pandemic. In each question of this section, subjects are presented with a specific epidemic situation. Within each scenario, subjects must choose between going out and staying at home. They are informed that if they choose to go out, there is a certain probability of becoming infected. On the other hand, if they decide to stay at home, they are guaranteed not to be infected, but they will experience some form of loss. Below is an example in our experiment.
Question B: There is an epidemic spreading right now. If you come into contact with an infected person, you have a 5% chance of being infected. If you choose to go out, you will be in close contact with 20 people every day. If you decide to stay at home, you are guaranteed to avoid infection but will experience some losses. On the other hand, if you go out, there is a chance of becoming infected. The loss of being infected is 20 times that of the loss due to home isolation. Your city has a population of 1 million, then:
a. When there is no confirmed case in the city, I will go out.
b. When the number of confirmed cases in the city is less than 10, I will go out.
c. When the number of confirmed cases in the city is less than 100, I will go out.
d. When the number of confirmed cases in the city is less than 1,000, I will go out.
e. When the number of confirmed cases in the city is less than 10,000, I will go out.
f. When the number of confirmed cases in the city is less than 100,000, I will go out.
g. When the number of confirmed cases in the city is higher than 100,000, I will still go out.
Calculating Individuals' Risk Preference I x : We use the risk preference I x to reflect the behavioral tendencies of the subjects.The risk preference I x for each individual corresponds to the proportion of times choosing to engage in risky behavior in various scenarios in the questionnaire.This indicator ranges between 0 and 1, where values closer to 1 indicate a more risky behavioral tendency.It is important to note that in our model, x P 1 denotes the proportion of individuals choosing the risky behavior.A higher I x value within a group implies a greater inclination of individuals towards risky behavior, resulting in a higher x P 1 value.Then we analyze the relationship between the irrationality and behavioral choice.
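As a minimal illustration of this indicator, the sketch below computes $I_x$ as the share of scenarios in which a subject chose the risky behavior. The function name and the sample answers are hypothetical.

```python
def risk_preference(choices):
    """I_x: proportion of scenarios in which the subject chose the risky behavior.

    choices: list of booleans, True when the subject chose to go out.
    """
    if not choices:
        raise ValueError("at least one scenario is required")
    return sum(1 for c in choices if c) / len(choices)


# Hypothetical subject who goes out in 3 of 4 scenarios.
i_x = risk_preference([True, False, True, True])
```

Values near 1 correspond to subjects who almost always go out, and values near 0 to subjects who almost always stay at home.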

Analysis of the Relationship Between Irrationality and Behavioral Choice
We classify subjects into six groups based on their risk preference, so that individuals within each group share similar values of $I_x$. We calculate the mean of $\alpha$ and the mean of $I_x$ for each group and analyze the relationship between $\alpha$ and $I_x$. The results are illustrated in Fig. 8. We can see that groups with a smaller $\alpha$ (i.e., a higher degree of irrationality) also have a smaller average risk preference $I_x$, which is consistent with our conclusion. In our theoretical analysis and simulations, we find that irrationality makes individuals more conservative in most cases, and makes them more risk-seeking only when the infection rate is high and the loss from the disease is extremely low. Since the parameter settings of our real user tests do not meet this condition, irrationality leads individuals to be more conservative and to have a smaller risk preference.
To verify our theory from a statistical point of view, we compute the Pearson correlation coefficient and the Spearman correlation coefficient between $I_x$ and $\alpha$ (for the Pearson correlation coefficient, we use the group means of $I_x$ as the variable). The results are shown in Table 1. Both the Pearson correlation coefficient and the Spearman correlation coefficient reveal a significant positive correlation between $I_x$ and $\alpha$. This is consistent with irrationality leading individuals to exhibit more conservative behavior.

Parameters | Pearson correlation | Significance | Spearman correlation | Significance
$I_x$ and $\alpha$ | 0.939 | 0.005 | 1.000 | 0.000
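The two coefficients can be computed as in the following sketch, which uses only the Python standard library. The six group means below are made-up placeholders for illustration, not the values measured in our test, and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(vx * vy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda j: values[j])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2.0 + 1.0
        for j in order[pos:end + 1]:
            result[j] = average_rank
        pos = end + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Placeholder group means (hypothetical, for illustration only).
alpha_means = [0.52, 0.58, 0.61, 0.66, 0.70, 0.78]
ix_means = [0.30, 0.34, 0.45, 0.50, 0.62, 0.66]
r_pearson = pearson(alpha_means, ix_means)
r_spearman = spearman(alpha_means, ix_means)
```

A Spearman coefficient of 1.000, as in Table 1, means that the ordering of the groups by $\alpha$ coincides exactly with their ordering by $I_x$, which holds for any strictly increasing relationship and not only for a linear one.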

Conclusion
In this paper, we propose an epidemic-behavior coevolution framework to analyze the behavior of individuals and the coevolution of user behavior and the disease spread during an epidemic. Our model takes into account the irrationality of individuals' decision-making processes, and our theoretical analysis shows that individual irrationality polarizes individual behavior choices. That is, irrationality makes users risk-averse when the probability of being infected is small, while they tend to be risk-seeking when the probability of being infected is large. We then propose a behavior inducement algorithm to control the disease spread and reduce losses by guiding individual behavior. Simulation results confirm our theoretical analysis and verify the effectiveness of our behavior inducement method. We also qualitatively validate our conclusions using real user tests.

Appendix A. The proof of Theorem 1
In order to analyze the steady state, we first introduce Lemma 1.
Lemma 1. For the 2-behavior model, the necessary and sufficient conditions for $(i^*, x_1^*)$ to be a steady state are: Proof: By the definition of steady state, we first have $\frac{di}{dt} = 0$ and $\frac{dx_1}{dt} = 0$. For simplicity, we set Then we have: Since $\mathrm{Re}(\lambda_1) < 0$ and $\mathrm{Re}(\lambda_2) < 0$, we have $P < 0$ and $Q > 0$, which means ∂i
Then we prove Theorem 1 based on Lemma 1. By setting $\frac{di}{dt} = 0$ and $\frac{dx_1}{dt} = 0$, we get four points: $(0, 0)$, $(0, 1)$, $(1 - \frac{\gamma}{k\beta_1}, 1)$ and $(i^{(1)}, \frac{\gamma}{(1 - i^{(1)})k\beta_1})$. Then we discuss the stability of these points. Based on Lemma 1, the steady state should satisfy P (i,
In our model, the individual can adopt two behaviors, $a_1$ for risky behavior (like going out) and $a_2$ for conservative behavior (like self-isolation). In general, the utility of going out should be larger than that of self-isolation, so we have $c_1 - c_2 > 0$ (that means $u^E(c_1) - u^E(c_2) > 0$ and $u^P(c_1) - u^P(c_2) > 0$). In addition, $c_n$ is the loss of being infected, so $c_n < 0$ (that means $u^E(c_n) < 0$ and $u^P(c_n) < 0$).
In summary, we can solve the optimization problem (33) by Algorithm 1.

Appendix G. The method for solving the optimization problem (34)
Similar to the optimization problem (33), the objective function in (34) can be rewritten as: where $BF(\boldsymbol{\delta})$ is defined in (F.1), and its gradient is: where $l_1(i^P) = (i^P - i_m)^2$ and $l_2(x_1^P) = (x_1^P - x_m)^2$.

Fig. 2: The range of $i_m$ and $x_m$ when (a) $\beta_1/\gamma = 2/3$ and (b) $\beta_1/\gamma = 1/2$, with $k = 10$. The green area represents the feasible constraints, and the red area represents the infeasible constraints.

Fig. 3: Simulation results of the steady states with rational and irrational individuals.

Fig. 4: Simulation results of the steady states with different $\beta_1/\gamma$. (a) $i^E$ and $i^P$; (b) $x_1^E$ and $x_1^P$.

Fig. 5: Simulation results of the behavior inducement algorithm with feasible constraints $(i_m, x_m)$.

Fig. 6: Simulation results of the behavior inducement algorithm with infeasible constraints $(i_m, x_m)$.

Fig. 7: Simulation results of the behavior inducement algorithms. (a) The loss function in (33) with feasible constraints, and (b) the loss function in (34) with infeasible constraints.

Fig. 8: The relationship between risk preference $I_x$ and $\alpha$.