<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">17266</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2021.017266</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Bayesian Rule Modeling for Interpretable Mortality Classification of COVID-19 Patients</article-title>
<alt-title alt-title-type="left-running-head">Bayesian Rule Modeling for Interpretable Mortality Classification of COVID-19 Patients</alt-title>
<alt-title alt-title-type="right-running-head">Bayesian Rule Modeling for Interpretable Mortality Classification of COVID-19 Patients</alt-title>
</title-group>
<contrib-group content-type="authors">
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Yun</surname><given-names>Jiyoung</given-names></name>
</contrib>
<contrib id="author-2" contrib-type="author">
<name name-style="western"><surname>Basak</surname><given-names>Mainak</given-names></name>
</contrib> 
<contrib id="author-3" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Han</surname><given-names>Myung-Mook</given-names></name>
<email>mmhan@gachon.ac.kr</email>
</contrib>
<aff><institution>Software Department, Gachon University</institution>, <addr-line>Seongnam, 13120</addr-line>, <country>Korea</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Myung-Mook Han. Email: <email>mmhan@gachon.ac.kr</email></corresp>
</author-notes>
<pub-date pub-type="epub" date-type="pub" iso-8601-date="2021-08-23"><day>23</day>
<month>08</month>
<year>2021</year></pub-date>
<volume>69</volume>
<issue>3</issue>
<fpage>2827</fpage>
<lpage>2843</lpage>
<history>
<date date-type="received"><day>25</day><month>1</month><year>2021</year></date>
<date date-type="accepted"><day>25</day><month>3</month><year>2021</year></date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2021 Yun et al.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Yun et al.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_17266.pdf"></self-uri>
<abstract>
<p>Coronavirus disease 2019 (COVID-19) has been termed a &#x201C;Pandemic Disease&#x201D; that has infected many people and caused deaths on a nearly unprecedented level. As more people are infected each day, it continues to pose a serious threat to humanity worldwide. As a result, healthcare systems around the world are facing a shortage of medical space such as wards and sickbeds. In most cases, healthy people experience tolerable symptoms if they are infected. In other cases, however, patients may suffer severe symptoms and require treatment in an intensive care unit. Thus, hospitals should identify patients who have a high risk of death and treat them first. A number of models have been developed for mortality prediction, but they lack interpretability and generalization. To address these issues, we proposed a COVID-19 mortality prediction model that can provide new insights. We identified blood factors that affect the prediction of COVID-19 mortality. In particular, we focused on dependency reduction using partial correlation and mutual information. Next, we used the Class-Attribute Interdependency Maximization (CAIM) algorithm to bin continuous values. Then, we used Jensen-Shannon Divergence (JSD) and Bayesian posterior probability to create less redundant and more accurate rules. The result is a ruleset in which each rule carries its own posterior probability. The extracted rules take the form &#x201C;<bold><italic>if</italic></bold> <italic>antecedent <bold>then</bold> results, <bold>posterior probability</bold>(</italic><inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mi>&#x03B8;</mml:mi></mml:math></inline-formula><italic>)&#x201D;</italic>. If a sample matches the extracted rules, the result is positive. The average AUC score was 96.77&#x0025; for the validation dataset and the F1-score was 92.8&#x0025; for the test data. Compared with previous studies, the model performs well in terms of classification accuracy, generalization, and interpretability.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>COVID-19 mortality</kwd>
<kwd>explainable AI</kwd>
<kwd>bayesian probability</kwd>
<kwd>feature selection</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label><title>Introduction</title>
<p>First appearing in 2019, COVID-19 has resulted in a total of 96,877,399 infections and 2,098,879 deaths worldwide as of this writing [<xref ref-type="bibr" rid="ref-1">1</xref>]. This shows that COVID-19 has not only a high infection rate, but also a high mortality rate. Further, the situation is worsening. As COVID-19 infections have begun to increase exponentially, medical space, including wards and beds, has become increasingly scarce. Unlike other infectious diseases, COVID-19 shows wide variability in symptoms. In general, four out of five infected people have mild symptoms. About 95&#x0025; of COVID-19 infections worldwide are cured, while 5&#x0025; of patients receive emergency treatment or die in intensive care [<xref ref-type="bibr" rid="ref-2">2</xref>]. Reducing the COVID-19 death rate where medical resources are limited requires preemptive treatment for patients who face serious risks. This need has led to studies predicting the mortality of people infected with COVID-19. However, previous studies have had limited interpretability because the outcome probability per feature has been unknown. In addition, such models have a generalization problem: they perform well on the validation dataset but poorly on the test dataset. To address these weaknesses, we proposed a COVID-19 mortality prediction model with high generality and interpretability that presents important rules along with their Bayesian posterior probabilities.</p>
<p>The proposed model consisted of three steps: selecting features, making items by binning continuous data, and generating rules. In the feature selection stage, we used partial correlation to obtain a more accurate dependency value. In the rule generation stage, we used confidence-closed itemset mining, Jensen-Shannon Divergence (JSD), and Bayesian posterior probability to extract an accurate and non-redundant ruleset. Confidence-closed itemset mining removed many useless itemsets, and because JSD prunes itemsets quickly and precisely using distribution distance, it reduced the calculation time. We used Bayesian posterior probability to identify rules that are more effective in classification and to make the rules interpretable. The results showed that this model created a small but accurate ruleset. The model provided rules of the form &#x201C;<bold><italic>if</italic></bold> <italic>antecedent <bold>then</bold> results, <bold>posterior probability</bold>(</italic><inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:mi>&#x03B8;</mml:mi></mml:math></inline-formula><italic>)&#x201D;.</italic> It also predicted mortality through matched rules: if a sample matched the ruleset, it was classified as positive (death).</p>
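The matching step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the feature names and split points in the example rule are hypothetical, not the rules extracted in this study:

```python
# A rule is an (antecedent, posterior probability) pair; the antecedent is a
# set of items. The items below are hypothetical illustrations.
rules = [
    ({"Lactate dehydrogenase > 365", "(%)lymphocyte <= 14.7"}, 0.97),
]

def predict(sample_items, rules):
    """A sample is classified positive (death) if it matches any rule,
    i.e., if some rule's antecedent is a subset of the sample's items."""
    return any(antecedent <= sample_items for antecedent, _ in rules)
```

A sample carrying both items of the antecedent matches the rule regardless of what other items it carries.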
<p>As a result, 14 rules were ultimately selected. The AUC score obtained using these 14 rules was 96.77&#x0025; on average for the validation data. The F1-score was 92.8&#x0025; for the test data. The results confirmed that our model had better performance than a previously published model [<xref ref-type="bibr" rid="ref-3">3</xref>]. Compared to another study that used a fuzzy rule list [<xref ref-type="bibr" rid="ref-4">4</xref>], our model showed a 3.3&#x0025; lower AUC score for the validation data. However, it had 11&#x0025; higher performance for the test data, which confirms the generality of our model.</p>
<p>The rules presented in this study are statistical rules. All predictions are completely extracted using data. The final goal of this study is to help medical staff make the best decision, rather than perform an absolute judgment. The contributions of this study are as follows:
<list list-type="bullet">
<list-item><p>Creates a new rule extraction method that produces fast, accurate, and interpretable results.</p></list-item>
<list-item><p>Identifies important blood factors and their split points.</p></list-item>
<list-item><p>Represents a combination of important elements and their influence on the result in the form of a ruleset.</p></list-item>
<list-item><p>Proposes a mortality prediction model with better performance and generality than previous studies.</p></list-item>
</list></p>
<p>The rest of this paper is organized as follows. In Section 2, we discuss related studies. Section 3 presents the dataset used in this study. Section 4 is focused on the mortality detection model, and mainly on introducing the proposed rule extraction model. In Section 5, we present the experiment and evaluation. Finally, we conclude our study in Section 6.</p>
</sec>
<sec id="s2">
<label>2</label><title>Related Studies</title>
<sec id="s2_1">
<label>2.1</label><title>COVID-19 Mortality Group Prediction Using Blood Sample</title>
<p>Most prior studies examining COVID-19 mortality using blood samples have focused on the mortality risk associated with individual blood elements, such as D-dimer and lymphocytes, rather than building a prediction model [<xref ref-type="bibr" rid="ref-5">5</xref>,<xref ref-type="bibr" rid="ref-6">6</xref>]. Two representative studies have predicted the risk by creating interpretable models using various blood elements. The first study reported an interpretable mortality prediction model using blood samples of infected people [<xref ref-type="bibr" rid="ref-3">3</xref>]. In that study, the authors selected important features using &#x201C;Multi-tree XGBoost&#x201D; and made predictions using &#x201C;single-tree XGBoost&#x201D;. The selected features were &#x201C;<italic>Lactate dehydrogenase, (&#x0025;)lymphocyte and Hypersensitive C-reactive protein&#x201D;</italic>. That study reported an average AUC score of 95.06&#x0025; for the validation dataset, with an F1-score of 90&#x0025; for the test dataset. The second study was an interpretable mortality prediction model based on fuzzy rules [<xref ref-type="bibr" rid="ref-4">4</xref>]. That model selected important features and created a fuzzy rule set in a human-friendly way. Three experiments were conducted in that study: the first used the three features extracted in the previous study [<xref ref-type="bibr" rid="ref-3">3</xref>]. The second was conducted with five features after adding &#x201C;albumin&#x201D; and &#x201C;the International standard ratio&#x201D;. In the last experiment, the three important features were considered along with their changes over time. When time information was added, the AUC score was 100&#x0025; for the validation data, with an F1-score of 81.8&#x0025; for the test data.</p>
</sec>
<sec id="s2_2">
<label>2.2</label><title>Rule Extraction Using Bayesian Probability</title>
<p>Rule extraction using Bayesian posterior probability is a probability-based rule extraction approach that extracts items with high posterior probability to generate a list of rules. It stems from the previously proposed Bayesian Rule List (BRL) [<xref ref-type="bibr" rid="ref-7">7</xref>]. This model extracts the itemsets that exceed a support threshold, calculates their posterior probability, and generates rules from the itemsets with high posterior probability; in this way, it creates an ordered rule list. Rules are drawn from the multinomial distribution and the Dirichlet distribution, which is the conjugate distribution of the multinomial distribution. <xref ref-type="fig" rid="fig-1">Fig. 1</xref> shows the rule list in the distribution format. That study used a stroke dataset, and the model achieved an AUC score of 75.6&#x0025;.</p>
<fig id="fig-1"><label>Figure 1</label><caption><title>Rule list in distribution format [<xref ref-type="bibr" rid="ref-7">7</xref>]</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_17266-fig-1.png"/></fig>
<p>A Bayesian approach has also been proposed as a rule mining model [<xref ref-type="bibr" rid="ref-8">8</xref>]. In that study, the authors aimed to identify rare but accurate rules, noting the limitations of support-based criteria. First, they selected candidate rules and checked whether they passed selection criteria based on Bayesian posterior probability. Second, they combined rules and checked whether the combinations passed the criteria. The algorithm was executed breadth-first, like the Apriori algorithm. The experiment was conducted to find association rules rather than to classify. As a result, they found two additional association rules that could not be found with support criteria alone.</p>
</sec>
</sec>
<sec id="s3">
<label>3</label><title>Dataset</title>
<p>The dataset used in this study consisted of blood samples of patients infected with COVID-19 and was provided by Yan <italic>et al.</italic> [<xref ref-type="bibr" rid="ref-3">3</xref>]. There were 75 features, including the personal information of the infected patients and blood measurements recorded by date and time. The data consisted of a training set and a test set. The training data had 375 patients (201 discharged and 174 deceased) treated from January 10, 2020 to February 18, 2020. The test data had 110 patients (97 discharged and 13 deceased) treated from February 19, 2020 to February 24, 2020 [<xref ref-type="bibr" rid="ref-3">3</xref>]. We filled in the missing values of the training dataset with each feature&#x0027;s mean value. The model was trained using each patient&#x0027;s last-day outcome (death or discharge). The test was conducted on data from 7 days prior to each patient&#x0027;s last day.</p>
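The mean-imputation step described above can be sketched with pandas. This is a minimal sketch on a toy frame; the column names are hypothetical stand-ins for two of the dataset's 75 features:

```python
import pandas as pd

# Toy training frame with missing values; real data has 375 patients x 75 features.
train = pd.DataFrame({
    "Lactate dehydrogenase": [250.0, None, 600.0],
    "(%)lymphocyte": [30.0, 5.0, None],
})

# Fill each feature's missing values with that feature's mean.
train = train.fillna(train.mean(numeric_only=True))
```

After this step the frame contains no missing values; the second LDH entry, for example, becomes the mean of 250 and 600.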
</sec>
<sec id="s4">
<label>4</label><title>Proposed Model</title>
<p>As shown in <xref ref-type="fig" rid="fig-2">Fig. 2</xref>, our proposed model had three stages: feature selection, item generation, and rule extraction. In particular, for rule extraction, we proposed a new algorithm that used confidence-closed itemset mining, JSD, and Bayesian posterior probability. We define the terminology before explaining the proposed model. An <italic>item</italic> is a feature paired with a condition (e.g., age &#x003C; 40). An <italic>itemset</italic> is a set of items (e.g., age &#x003C; 40, sex &#x003D; female); <italic>itemsets</italic> refers to multiple itemsets.</p>
<sec id="s4_1">
<label>4.1</label><title>Feature Selection</title>
<p>Feature selection involved two tasks: reducing the dependency between features, and analyzing the interdependency between the target variable and the features. To reduce dependency, we used partial correlation, a form of correlation that controls for a subset of other variables and thus measures the correlation between two features alone. Ordinary correlation indicates the relationship between features without controlling for other features, so it cannot differentiate between direct and indirect effects [<xref ref-type="bibr" rid="ref-9">9</xref>]. Thus, we used partial correlation to obtain a more accurate and strict correlation, and we calculated it between features to remove dependent relations. The partial correlation was calculated using <xref ref-type="disp-formula" rid="eqn-1">Eq. (1)</xref>, which gives the correlation between x and y given a single controlling variable z [<xref ref-type="bibr" rid="ref-10">10</xref>]. As <xref ref-type="disp-formula" rid="eqn-1">Eq. (1)</xref> shows, solving it requires the zero-order correlations between all possible pairs of variables ((x,y), (y,z) and (x,z)) [<xref ref-type="bibr" rid="ref-10">10</xref>].
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>y</mml:mi><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>z</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>y</mml:mi><mml:mi>x</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>y</mml:mi><mml:mi>z</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>x</mml:mi><mml:mi>z</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:msqrt><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msubsup><mml:mi>r</mml:mi><mml:mrow><mml:mi>y</mml:mi><mml:mi>z</mml:mi></mml:mrow><mml:mn>2</mml:mn></mml:msubsup></mml:msqrt><mml:msqrt><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msubsup><mml:mi>r</mml:mi><mml:mrow><mml:mi>x</mml:mi><mml:mi>z</mml:mi></mml:mrow><mml:mn>2</mml:mn></mml:msubsup></mml:msqrt></mml:mrow></mml:mfrac></mml:math></disp-formula>
</p>
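As a minimal sketch of Eq. (1), the first-order partial correlation can be computed directly from the three zero-order correlations. The function name `partial_corr` is our own illustration, not part of the authors' code:

```python
import numpy as np

def partial_corr(x, y, z):
    """First-order partial correlation r_{yx.z}: the correlation between
    x and y after controlling for a single variable z (Eq. 1)."""
    r_yx = np.corrcoef(y, x)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    return (r_yx - r_yz * r_xz) / (
        np.sqrt(1 - r_yz**2) * np.sqrt(1 - r_xz**2))
```

When y = x + z, the zero-order correlation between x and y is diluted by z, while the partial correlation controlling for z recovers the direct link.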
<p>Second, we used the mutual information between the target variable and the features. Mutual information measures how much two variables X and Y co-occur: if X and Y frequently occur together, they have a high interdependency. Features were removed if their mutual information value fell below the mutual information threshold. Mutual information for discrete features was calculated with <xref ref-type="disp-formula" rid="eqn-2">Eq. (2)</xref>, and for continuous features with <xref ref-type="disp-formula" rid="eqn-3">Eq. (3)</xref> [<xref ref-type="bibr" rid="ref-11">11</xref>].
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mi>I</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>Y</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:munder><mml:mrow><mml:mo movablelimits="false">&#x2211;</mml:mo></mml:mrow><mml:mi>y</mml:mi></mml:munder><mml:mo>&#x2061;</mml:mo><mml:munder><mml:mrow><mml:mo movablelimits="false">&#x2211;</mml:mo></mml:mrow><mml:mi>x</mml:mi></mml:munder><mml:mo>&#x2061;</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>y</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:mi>I</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>Y</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:munder><mml:mrow><mml:mo largeop="false">&#x222B;</mml:mo></mml:mrow><mml:mi>Y</mml:mi></mml:munder><mml:mo>&#x2061;</mml:mo><mml:munder><mml:mrow><mml:mo largeop="false">&#x222B;</mml:mo></mml:mrow><mml:mi>X</mml:mi></mml:munder><mml:mo>&#x2061;</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>y</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mi>d</mml:mi><mml:mi>x</mml:mi><mml:mi>d</mml:mi><mml:mi>y</mml:mi></mml:math></disp-formula>
</p>
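Eq. (2) for discrete variables can be sketched directly from empirical frequencies. This is a minimal illustration with a hypothetical helper name, not the authors' implementation:

```python
import numpy as np

def mutual_info_discrete(x, y):
    """I(X;Y) = sum_x sum_y p(x,y) * log(p(x,y) / (p(x)p(y)))  (Eq. 2),
    estimated from empirical joint and marginal frequencies."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))  # joint probability
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == xv) * np.mean(y == yv)))
    return mi
```

Identical balanced binary variables give I = log 2, and independent variables give I = 0, matching the intuition that high mutual information means high interdependency.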
<fig id="fig-2">
<label>Figure 2</label><caption><title>Framework of the proposed model</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_17266-fig-2.png"/></fig>
</sec>
<sec id="s4_2">
<label>4.2</label><title>Item Generation and Filtering</title>
<p>In the item generation step, the selected features were binned. If a value was smaller than or equal to the split point, it was assigned to <italic>f<sub>k</sub> &#x2264; &#x03B4;</italic>; if it was bigger, it was assigned to <italic>&#x03B4; &#x003C; f<sub>k</sub></italic>. We used the Class-Attribute Interdependency Maximization (CAIM) algorithm for binning [<xref ref-type="bibr" rid="ref-12">12</xref>]. CAIM is a discretization algorithm that seeks the smallest number of intervals while minimizing the loss of class-attribute interdependency information; it splits the continuous values at the points with the highest CAIM value. The calculation proceeds as long as CAIM &#x003E; GlobalCAIM or p &#x003C; k, where <bold>p</bold> is the number of split points and <bold>k</bold> is the number of classes. CAIM&#x0027;s criterion is defined by <xref ref-type="disp-formula" rid="eqn-4">Eq. (4)</xref>, where <bold>C</bold> is the target variable, <bold>D</bold> is the discretized variable for <bold>F</bold>, <bold>max<sub>r</sub></bold> is the maximum class count in interval <bold>r</bold>, and <bold>N<sub>&#x002B;r</sub></bold> is the number of samples in that interval [<xref ref-type="bibr" rid="ref-12">12</xref>].
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mrow><mml:mtext>caim</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mtext>C</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mtext>D|F</mml:mtext></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msubsup><mml:mrow><mml:mo movablelimits="false">&#x2211;</mml:mo></mml:mrow><mml:mrow><mml:mi>r</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:msubsup><mml:mo>&#x2061;</mml:mo><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:msubsup><mml:mi>x</mml:mi><mml:mi>r</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mo>+</mml:mo><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mi>n</mml:mi></mml:mfrac></mml:math></disp-formula>
After creating the items, we filtered out items with low support and confidence. Thanks to the anti-monotonicity of support (the Apriori property), single items with support below the threshold can safely be deleted. Confidence, however, is not anti-monotonic, so we only deleted items with very low confidence to prevent the creation of redundant and useless itemsets.</p>
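The greedy CAIM search described above can be sketched as follows. This is a compact sketch of the published algorithm, with helper names of our own; a production implementation would handle ties and large candidate sets more carefully:

```python
import numpy as np

def caim_value(bounds, x, y):
    """caim(C, D|F) = (1/n) * sum_r max_r^2 / N_+r  (Eq. 4), where n is the
    number of intervals and max_r is the largest class count in interval r."""
    n_intervals = len(bounds) - 1
    total = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        mask = (x > lo) & (x <= hi)
        if mask.sum() == 0:
            continue
        counts = np.bincount(y[mask])
        total += counts.max() ** 2 / mask.sum()
    return total / n_intervals

def caim_discretize(x, y):
    """Greedily add the split that maximizes CAIM while CAIM > GlobalCAIM
    or the number of split points is below the number of classes."""
    x, y = np.asarray(x, float), np.asarray(y)
    uniq = np.unique(x)
    candidates = list((uniq[:-1] + uniq[1:]) / 2)  # midpoints as candidates
    bounds = [uniq[0] - 1, uniq[-1]]
    k = len(np.unique(y))
    global_caim = 0.0
    while candidates:
        best, c = max((caim_value(sorted(bounds + [c]), x, y), c)
                      for c in candidates)
        if best > global_caim or len(bounds) - 1 < k:
            bounds = sorted(bounds + [c])
            candidates.remove(c)
            global_caim = best
        else:
            break
    return bounds[1:-1]  # interior split points
```

On a toy feature where the two classes separate cleanly, the algorithm recovers the single split point between them.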
</sec>
<sec id="s4_3">
<label>4.3</label><title>Rule Extraction Using JSD and Posterior Probability</title>
<p>In this stage, the itemset with a high posterior probability was extracted, and rules were generated. The rule extraction consisted of three stages: mining the itemset, filtering with JSD, and filtering with Bayesian posterior probability.</p>
<sec id="s4_3_1">
<label>4.3.1</label><title>Creating Itemset</title>
<p>First, we mined itemsets using FP-growth [<xref ref-type="bibr" rid="ref-13">13</xref>], filtering them during mining with the support and confidence thresholds. Then, we checked whether each itemset satisfied the confidence-closed itemset condition.</p>
<p>The confidence-closed itemset is a closed itemset defined with respect to confidence. A closed itemset is typically defined with respect to support: if an itemset and one of its supersets have the same support, the subset is removed [<xref ref-type="bibr" rid="ref-14">14</xref>]. Here, we focused on confidence instead of support. We deleted any superset whose confidence was lower than that of one of its subsets (e.g., if {a, b} &#x003D; 0.8 and {a, b, c} &#x003D; 0.7, then {a, b, c} was deleted). Since we extracted a ruleset, any sample matching the superset ({x, y, z}) necessarily also matches the subset ({x, y}). However, confidence alone could not tell us that the subset ({x, y}) was the better rule, so at this stage we deleted only the supersets with lower confidence, not all supersets. Because lower confidence means a lower probability of co-occurrence of the itemset and the outcome (class &#x003D; 1 in this model), the additional items can harm the creation of an effective rule. Checking this condition avoided creating inaccurate rules and wasting mining time. Using confidence-closed itemset filtering, we extracted itemsets that were less redundant but more accurate.</p>
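The filtering rule above can be sketched as a single pass over the mined itemsets. This is a minimal sketch with a hypothetical function name; a production version would index subsets rather than scan all pairs:

```python
def confidence_closed(itemsets):
    """Drop any itemset that is a proper superset of another itemset with
    strictly higher confidence (e.g., {a,b}=0.8 and {a,b,c}=0.7 drops
    {a,b,c}). `itemsets` maps frozenset -> confidence."""
    kept = {}
    for items, conf in itemsets.items():
        dominated = any(
            other < items and oconf > conf  # proper subset, higher confidence
            for other, oconf in itemsets.items()
        )
        if not dominated:
            kept[items] = conf
    return kept
```

A superset whose confidence exceeds that of its subsets survives the filter, since its extra items genuinely sharpen the rule.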
</sec>
<sec id="s4_3_2">
<label>4.3.2</label><title>Itemset Filtering with JSD</title>
<p>In this stage, we calculated the distance between the mined itemsets and the training dataset using JSD. JSD is a variation of the Kullback&#x2013;Leibler divergence (KL), which represents the divergence between two distributions. KL is the expected information loss when one distribution is used to approximate another, and it is used to measure the similarity of two distributions. However, KL is not symmetric, as shown in <xref ref-type="disp-formula" rid="eqn-5">Eq. (5)</xref>, and therefore is not a true distance [<xref ref-type="bibr" rid="ref-15">15</xref>]. In this equation, <bold>P</bold> and <bold>Q</bold> are the distributions.
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:mrow><mml:mtext>DKL</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext>P</mml:mtext></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mtext>Q</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>&#x2260;</mml:mo><mml:mrow><mml:mtext>DKL</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext>Q</mml:mtext></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mtext>P</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:math></disp-formula>
To make the measure symmetric, JSD computes the KL between each of the two distributions and their mixture <bold>M</bold> &#x003D; (<bold>P</bold> &#x002B; <bold>Q</bold>)/2. This allows the distance between the two probability distributions to be calculated. JSD&#x0027;s equation is shown in <xref ref-type="disp-formula" rid="eqn-6">Eq. (6)</xref> [<xref ref-type="bibr" rid="ref-15">15</xref>].
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:mrow><mml:mtext>JSD</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mtext>P</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mtext>Q</mml:mtext></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:mi>D</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mtext>|</mml:mtext></mml:mrow><mml:mi>M</mml:mi><mml:mrow><mml:mtext>)</mml:mtext></mml:mrow><mml:mo>+</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:mi>D</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>Q</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mtext>|</mml:mtext></mml:mrow><mml:mi>M</mml:mi><mml:mrow><mml:mtext>)</mml:mtext></mml:mrow><mml:mspace width="thickmathspace" /></mml:math></disp-formula>
</p>
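Eqs. (5) and (6) for discrete distributions can be sketched as follows; the helper names are our own illustrations:

```python
import numpy as np

def kl(p, q):
    """D_KL(P||Q) = sum_i p_i * log(p_i / q_i); asymmetric, so it is not a
    true distance (Eq. 5)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def jsd(p, q):
    """JSD(P, Q) = 0.5*KL(P||M) + 0.5*KL(Q||M) with M = (P + Q)/2 (Eq. 6):
    symmetric in P and Q, and zero when the distributions coincide."""
    m = (np.asarray(p, float) + np.asarray(q, float)) / 2
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Swapping the arguments changes KL but leaves JSD unchanged, which is exactly the symmetry that makes JSD usable as a distance.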
<p>We calculated the distance between the distribution of each itemset and the distributions of the dataset. For each itemset, we built a multivariate normal distribution from the means and covariances of its features. We also built multivariate normal distributions of the death (class &#x003D; 1) and discharged (class &#x003D; 0) samples using the same features. Here, <bold>I<sub>n</sub></bold> denotes the n<sub>th</sub> itemset&#x0027;s distribution, <bold>D</bold> the class 1 (death) distribution, <bold>S</bold> the class 0 (survival) distribution, and <bold>D<sub>n</sub></bold> the distance of <bold>I<sub>n</sub></bold>. The calculation follows <xref ref-type="disp-formula" rid="eqn-7">Eq. (7)</xref>.
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mtext>JSD</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mtext>I</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mtext>D</mml:mtext></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mrow><mml:mtext>JSD</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mtext>I</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mtext>D</mml:mtext></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>+</mml:mo><mml:mrow><mml:mtext>JSD</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mtext>I</mml:mtext></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mtext>S</mml:mtext></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac></mml:math></disp-formula>
Since the number of features could affect the absolute distance between distributions, the distance to the death distribution was divided by the sum of the two distances (to the death and discharged distributions). After calculating each itemset&#x0027;s distance, itemsets were filtered according to the distance threshold. This stage reduced the training time: because calculating the posterior probability takes a long time, we first filtered out the itemsets whose distance was too large.</p>
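Because JSD between multivariate normals has no closed form, Eq. (7) has to be estimated, for example by Monte Carlo. The sketch below is one possible approach under that assumption; the function names are our own, and the sample size is illustrative:

```python
import numpy as np
from scipy.stats import multivariate_normal

def mc_jsd(p, q, n=20000, seed=0):
    """Monte Carlo estimate of JSD between two frozen Gaussians:
    0.5*E_P[log p/m] + 0.5*E_Q[log q/m], with mixture M = (P + Q)/2."""
    rng = np.random.default_rng(seed)
    def half(a, b):
        x = a.rvs(size=n, random_state=rng)
        la, lb = a.logpdf(x), b.logpdf(x)
        lm = np.logaddexp(la, lb) - np.log(2)  # log-density of the mixture M
        return 0.5 * float(np.mean(la - lm))
    return half(p, q) + half(q, p)

def normalized_distance(I, D, S):
    """D_n = JSD(I, D) / (JSD(I, D) + JSD(I, S))  (Eq. 7)."""
    jd, js = mc_jsd(I, D), mc_jsd(I, S)
    return jd / (jd + js)
```

An itemset distribution close to the death distribution and far from the survival distribution yields a small D_n, so thresholding D_n keeps death-like itemsets.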
</sec>
<sec id="s4_3_3">
<label>4.3.3</label><title>Itemset Filtering with Posterior Probability</title>
<p>We calculated the Bayesian posterior probability of each itemset and removed those that did not exceed the threshold. This stage was used to obtain a more reliable probability: without sampling, the probability would come from our dataset alone, which can bias the estimate. We therefore created an approximated posterior distribution using a large number of samples and used this distribution for explanation and classification.</p>
<p>The posterior probability was calculated with Bayes' theorem. Since this was a binary classification, the binomial distribution was used as the likelihood. The parameter <bold>n</bold> was the total number of occurrences of the itemset, while the parameter <bold>k</bold> was the number of occurrences that resulted in 1 (death). The parameter <bold>&#x03B8;</bold> was the binomial probability of k. The beta distribution, which is the conjugate prior of the binomial, was used as the prior distribution [<xref ref-type="bibr" rid="ref-16">16</xref>]. The posterior probability could then be calculated as follows [<xref ref-type="bibr" rid="ref-17">17</xref>].
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>k</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>k</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>k</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>k</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo largeop="true">&#x222B;</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>k</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi>d</mml:mi><mml:mi>&#x03B8;</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula>
<xref ref-type="disp-formula" rid="eqn-8">Eq. (8)</xref> was derived from the nature of the joint distribution and the identity <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:mi>P</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mi>P</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi><mml:mrow><mml:mtext>|</mml:mtext></mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mi>P</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>Y</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula>. The numerator, the joint distribution, can be expressed as follows by multiplying the binomial likelihood and the beta prior distribution [<xref ref-type="bibr" rid="ref-17">17</xref>].
<disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x03B8;</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtable rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mi>n</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>k</mml:mi></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mi>B</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x03B1;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03B2;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac><mml:mrow><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:msup><mml:mo stretchy="false">)</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mi>&#x03B2;</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></disp-formula>
</p>
<p>The denominator is expressed as follows. <xref ref-type="disp-formula" rid="eqn-10">Eq. (10)</xref> substitutes the likelihood and prior used in <xref ref-type="disp-formula" rid="eqn-9">Eq. (9)</xref> into the integral. <xref ref-type="disp-formula" rid="eqn-11">Eq. (11)</xref> is derived from the fact that the probability density function of the beta distribution with parameters <italic>k&#x002B;&#x03B1;</italic> and <italic>n-k&#x002B;&#x03B2;</italic> integrates to 1 [<xref ref-type="bibr" rid="ref-17">17</xref>].
<disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:mo largeop="true">&#x222B;</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>k</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi>d</mml:mi><mml:mi>&#x03B8;</mml:mi><mml:mo>=</mml:mo><mml:munderover><mml:mrow><mml:mo>&#x222B;</mml:mo></mml:mrow><mml:mn>0</mml:mn><mml:mn>1</mml:mn></mml:munderover><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>&#x03B8;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi>d</mml:mi><mml:mi>&#x03B8;</mml:mi><mml:mo>=</mml:mo><mml:munderover><mml:mrow><mml:mo>&#x222B;</mml:mo></mml:mrow><mml:mn>0</mml:mn><mml:mn>1</mml:mn></mml:munderover><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtable rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mi>n</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>k</mml:mi></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msup><mml:mfrac><mml:mrow><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>&#x03B1;</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x03B2;</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mi>B</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x03B1;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03B2;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac><mml:mi>d</mml:mi><mml:mi>&#x03B8;</mml:mi></mml:math></disp-formula>
<disp-formula id="eqn-11"><label>(11)</label><mml:math id="mml-eqn-11" display="block">
 <mml:mrow>
  <mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo>
   <mml:mrow>
    <mml:mtable>
     <mml:mtr>
      <mml:mtd>
       <mml:mi>n</mml:mi>
      </mml:mtd>
     </mml:mtr>
     <mml:mtr>
      <mml:mtd>
       <mml:mi>k</mml:mi>
      </mml:mtd>
     </mml:mtr>
    </mml:mtable></mml:mrow>
  <mml:mo>)</mml:mo></mml:mrow><mml:mfrac>
   <mml:mn>1</mml:mn>
   <mml:mrow>
    <mml:mi>B</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow>
  </mml:mfrac>
  <mml:munderover>
   <mml:mo>&#x222B;</mml:mo>
   <mml:mn>0</mml:mn>
   <mml:mn>1</mml:mn>
  </mml:munderover>
  <mml:msup>
   <mml:mi>&#x03B8;</mml:mi>
   <mml:mrow>
    <mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow>
  </mml:msup>
  <mml:msup>
   <mml:mrow>
    <mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow>
   <mml:mrow>
    <mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mi>&#x03B2;</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow>
  </mml:msup>
  <mml:mi>d</mml:mi><mml:mi>&#x03B8;</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo>
   <mml:mrow>
    <mml:mtable>
     <mml:mtr>
      <mml:mtd>
       <mml:mi>n</mml:mi>
      </mml:mtd>
     </mml:mtr>
     <mml:mtr>
      <mml:mtd>
       <mml:mi>k</mml:mi>
      </mml:mtd>
     </mml:mtr>
    </mml:mtable></mml:mrow>
  <mml:mo>)</mml:mo></mml:mrow><mml:mfrac>
   <mml:mrow>
    <mml:mi>B</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow>
   <mml:mrow>
    <mml:mi>B</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow>
  </mml:mfrac>
  </mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Using the numerator <xref ref-type="disp-formula" rid="eqn-9">Eq. (9)</xref> and the denominator <xref ref-type="disp-formula" rid="eqn-11">Eq. (11)</xref> calculated above, the posterior probability can be obtained as follows [<xref ref-type="bibr" rid="ref-17">17</xref>].
<disp-formula id="eqn-12"><label>(12)</label><mml:math id="mml-eqn-12" display="block"><mml:mi>p</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>k</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mi>B</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x03B1;</mml:mi><mml:mo>+</mml:mo><mml:mi>k</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03B2;</mml:mi><mml:mo>+</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>k</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac><mml:mrow><mml:msup><mml:mi>&#x03B8;</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mi>&#x03B1;</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03B8;</mml:mi><mml:msup><mml:mo stretchy="false">)</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mi>&#x03B2;</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo>&#x223C;</mml:mo><mml:mi>B</mml:mi><mml:mi>e</mml:mi><mml:mi>t</mml:mi><mml:mi>a</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x03B1;</mml:mi><mml:mo>+</mml:mo><mml:mi>k</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03B2;</mml:mi><mml:mo>+</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>k</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:math></disp-formula></p>
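Because the prior is conjugate, the posterior in Eq. (12) is available in closed form. A minimal sketch with SciPy follows; the flat Beta(1, 1) prior and the counts n, k are hypothetical illustration values, not figures from the study.

```python
from scipy.stats import beta

# Eq. (12): with a Beta(alpha, beta) prior and k deaths out of n
# occurrences of an itemset, the posterior over theta is
# Beta(alpha + k, beta + n - k).
alpha_prior, beta_prior = 1.0, 1.0   # flat prior -- illustrative choice
n, k = 40, 36                        # hypothetical itemset counts

posterior = beta(alpha_prior + k, beta_prior + n - k)
posterior_mean = posterior.mean()    # (alpha + k) / (alpha + beta + n)
credible_95 = posterior.interval(0.95)
```

With these numbers the posterior mean is (1 + 36) / (2 + 40) = 37/42, and the credible interval quantifies how much the small sample size still matters, which is the uncertainty the MCMC approximation in the next step targets.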
<p>The posterior probability was updated by multiplying the likelihood and the prior probability. Markov chain Monte Carlo (MCMC) sampling was performed to generate an approximated Bayesian posterior probability distribution for each item [<xref ref-type="bibr" rid="ref-18">18</xref>]. The Metropolis-Hastings algorithm was used for sampling. To approximate the target distribution p(x), it requires a sampling distribution g(x) that is proportional to the target, together with a conditional proposal distribution q(x<sub>t&#x002B;1</sub>&#x007C;x<sub>t</sub>) [<xref ref-type="bibr" rid="ref-19">19</xref>]. The sampling distribution g(x) in this model is the beta-binomial distribution. The process is as follows [<xref ref-type="bibr" rid="ref-19">19</xref>]:
<list list-type="simple">
<list-item><label>&#x02460;</label><p> Pick a distribution g(x) for sampling.</p></list-item>
<list-item><label>&#x02461;</label><p> Choose X<sub>0</sub> which is the start point of the Markov chain.</p></list-item>
<list-item><label>&#x02462;</label><p> Sample y, the candidate for X<sub>t&#x002B;1</sub>, from g(x) when the chain is at X&#x003D;x<sub>t</sub>.</p></list-item>
<list-item><label>&#x02463;</label><p> Calculate the acceptance ratio using <xref ref-type="disp-formula" rid="eqn-13">Eq. (13)</xref>.
<disp-formula id="eqn-13"><label>(13)</label><mml:math id="mml-eqn-13" display="block"><mml:mtable columnalign="left" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mrow><mml:mi>f</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>y</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi>g</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mi>y</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mi>g</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>y</mml:mi><mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
</p></list-item>
<list-item><label>&#x02464;</label><p> Sample u from the uniform distribution U(x) and get X<sub>t&#x002B;1</sub> using the following process.
<disp-formula id="eqn-14"><label>(14)</label><mml:math id="mml-eqn-14" display="block"><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mi>y</mml:mi><mml:mspace width="thickmathspace" /><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>u</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mo movablelimits="true" form="prefix">min</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mtable columnalign="left" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mrow><mml:mi>f</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>y</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi>g</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mtext>|</mml:mtext></mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mi>g</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mrow><mml:mtext>|</mml:mtext></mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-15"><label>(15)</label><mml:math id="mml-eqn-15" display="block"><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>e</mml:mi><mml:mi>l</mml:mi><mml:mi>s</mml:mi><mml:mi>e</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:math></disp-formula>
</p></list-item>
<list-item><label>&#x02465;</label><p> Repeat steps &#x02462;&#x2013;&#x02464; until the stationary distribution appears.</p></list-item>
</list></p>
<p>In step &#x02464;, <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mi>m</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mfrac><mml:mrow><mml:mi>f</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>y</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mi>g</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mtext>|</mml:mtext></mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mi>g</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mrow><mml:mtext>|</mml:mtext></mml:mrow><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the acceptance probability. If the acceptance probability equals 1, the sampled value <italic>y</italic> is always accepted; if it is smaller than 1, acceptance depends on <italic>u</italic>. More than one chain can be created to ensure an accurate and clear result. Once a stationary distribution is obtained, point estimation is performed to derive the optimal Bayesian posterior probability of the item [<xref ref-type="bibr" rid="ref-20">20</xref>]. <xref ref-type="table" rid="table-5">Algorithm 1</xref> represents the flow of obtaining the optimal posterior probability.</p>
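The steps above can be condensed into a minimal Metropolis-Hastings loop. For simplicity this sketch uses a symmetric random-walk proposal, so the g terms cancel in the acceptance ratio of Eq. (13); the paper's beta-binomial sampling distribution, the flat prior, and the counts n, k here are assumptions for illustration.

```python
import numpy as np

def log_target(theta, alpha, beta_, n, k):
    # unnormalised log posterior from Eq. (12):
    # theta^(k+alpha-1) * (1-theta)^(n-k+beta-1)
    if theta <= 0.0 or theta >= 1.0:
        return -np.inf
    return (k + alpha - 1) * np.log(theta) + (n - k + beta_ - 1) * np.log(1 - theta)

def metropolis_hastings(alpha, beta_, n, k, num_samples=20000, step=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = 0.5                                    # step 2: start of the chain
    samples = []
    for _ in range(num_samples):
        y = x + rng.normal(0.0, step)          # step 3: symmetric random-walk proposal
        # symmetric proposal => the g terms of Eq. (13) cancel
        log_ratio = log_target(y, alpha, beta_, n, k) - log_target(x, alpha, beta_, n, k)
        if np.log(rng.uniform()) <= min(0.0, log_ratio):   # steps 4-5: accept or keep x
            x = y
        samples.append(x)
    return np.array(samples[num_samples // 4:])  # discard burn-in

samples = metropolis_hastings(alpha=1, beta_=1, n=40, k=36)
estimate = samples.mean()   # approaches the analytic mean (k+alpha)/(n+alpha+beta)
```

The chain's mean converges to the closed-form posterior mean 37/42, which is how the sampled approximation can be checked against the conjugate result of Eq. (12).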
<table-wrap id="table-5"><label>Algorithm 1</label><caption><title>Getting posterior probability with instance probability</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
</colgroup>
<tbody>
<tr><td align="left"><bold>Input:</bold> <italic>&#x03B1;</italic>, <italic>&#x03B2;</italic>, <italic>k</italic>, <italic>n</italic>, <italic>numsamples, numchains, tune</italic></td></tr>
<tr><td align="left"><bold>Result:</bold> Get optimal posterior probability</td></tr>
<tr><td align="left">p-beta &#x2190; <italic>Beta distribution</italic>(<italic>&#x03B1;</italic>, <italic>&#x03B2;</italic>);</td></tr>
<tr><td align="left">p-binom &#x2190; <italic>Binomial distribution</italic>(<italic>p-beta</italic>, <italic>k</italic>, <italic>n</italic>);</td></tr>
<tr><td align="left">posterior-distribution &#x2190;</td></tr>
<tr><td align="left">&#x2003;&#x2003;<italic>MCMC sampling(p-beta, p-binom, numsamples, numchains, tune)</italic>;</td></tr>
<tr><td align="left">probability &#x2190; <italic>Point estimation(posterior-distribution)</italic>;</td></tr>
<tr><td align="left">return probability</td></tr>
</tbody>
</table>
</table-wrap>
<p>The itemsets were then filtered based on the calculated posterior probability, which extracted the itemsets with a high posterior probability. Since our model is rule-based and a high posterior probability already confirms an itemset's reliability, we also filtered out supersets of accepted itemsets, as described in Section 4.3.1.</p>
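The superset filtering can be sketched as follows. The helper name, the rule encoding as (antecedent, posterior probability) pairs, and the toy items are assumptions for illustration; the idea is only that a rule whose antecedent strictly contains an already-accepted antecedent is redundant.

```python
def filter_supersets(rules):
    # rules: iterable of (frozenset_of_items, posterior_probability).
    # Processing shorter antecedents first guarantees that any subset
    # of a candidate has already been seen before the candidate itself.
    kept = []
    for items, prob in sorted(rules, key=lambda r: len(r[0])):
        if not any(base < items for base, _ in kept):   # strict-subset check
            kept.append((items, prob))
    return kept

rules = [
    (frozenset({"LDH>339"}), 0.97),
    (frozenset({"LDH>339", "CRP>42.3"}), 0.98),          # superset: dropped
    (frozenset({"D-dimer>2.04", "INR>1.13"}), 0.977),
]
kept = filter_supersets(rules)
```

Here the two-item rule containing "LDH>339" is removed because the single-item rule already covers every record it matches, leaving two rules.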
</sec>
</sec>
</sec>
<sec id="s5">
<label>5</label><title>Experiment</title>
<p>The experiment was conducted using a COVID-19 patient blood dataset. The results and threshold values of the proposed model are described here. The threshold values mentioned in this study were determined empirically through several experiments.</p>
<sec id="s5_1">
<label>5.1</label><title>Feature Extraction</title>
<p>First, we deleted features with a dependency value above 0.7, our dependency threshold. Next, we deleted features with a mutual information value lower than 0.3, our mutual information threshold. We then selected 10 features <bold><italic>{Lactate dehydrogenase, neutrophils(&#x0025;), Hypersensitive C-reactive protein, (&#x0025;)lymphocyte, monocytes(&#x0025;), procalcitonin, D-dimer, International standard ratio, Amino-terminal brain natriuretic peptide precursor(NT-proBNP), albumin}</italic></bold> from the 75 features.</p>
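The mutual-information stage can be illustrated with a simple plug-in estimate over discrete values. This helper and the toy arrays are an assumption-level sketch (the paper does not specify its estimator); it only shows what the 0.3 threshold is comparing.

```python
import numpy as np

def mutual_information(x, y):
    # plug-in estimate of MI (in nats) between two discrete arrays:
    # sum over cells of p(x,y) * log( p(x,y) / (p(x) p(y)) )
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                px, py = np.mean(x == xv), np.mean(y == yv)
                mi += pxy * np.log(pxy / (px * py))
    return mi

# a binned feature identical to the label carries maximal information,
# log 2 nats for a balanced binary label
label   = [0, 1, 0, 1, 0, 1]
feature = [0, 1, 0, 1, 0, 1]
mi = mutual_information(feature, label)
```

A feature independent of the label would score near zero and be dropped by the 0.3 threshold, while an informative one like the example above scores log 2 &#x2248; 0.693 nats.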
</sec>
<sec id="s5_2">
<label>5.2</label><title>Item Generation</title>
<p>In the item generation step, the continuous features were binned. First, we used the CAIM algorithm [<xref ref-type="bibr" rid="ref-12">12</xref>] to split the continuous values. <?A3B2 "tbl1",5,"anchor"?><xref ref-type="table" rid="table-1">Tab. 1</xref> lists the split point for each feature.</p>
<table-wrap id="table-1">
<label>Table 1</label><caption><title>Split point for each feature</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left">Features</th>
<th align="left">Split point</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left"><italic>Lactate dehydrogenase</italic></td>
<td align="left">339.0</td>
</tr>
<tr>
<td align="left"><italic>neutrophils(&#x0025;)</italic></td>
<td align="left">79.4</td>
</tr>
<tr>
<td align="left"><italic>Hypersensitive </italic><break/><italic>C-reactive protein</italic></td>
<td align="left">42.3</td>
</tr>
<tr>
<td align="left"><italic>(&#x0025;)lymphocyte</italic></td>
<td align="left">13.0</td>
</tr>
<tr>
<td align="left"><italic>monocytes(&#x0025;)</italic></td>
<td align="left">4.2</td>
</tr>
<tr>
<td align="left"><italic>procalcitonin</italic></td>
<td align="left">0.095</td>
</tr>
<tr>
<td align="left"><italic>D-dimer</italic></td>
<td align="left">2.04</td>
</tr>
<tr>
<td align="left"><italic>International standard ratio</italic></td>
<td align="left">1.13</td>
</tr>
<tr>
<td align="left"><italic>Amino-terminal brain natriuretic peptide precursor(NT-proBNP)</italic></td>
<td align="left">305.0</td>
</tr>
<tr>
<td align="left"><italic>albumin</italic></td>
<td align="left">31.9</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>After creating the items using each split point, we filtered each item using support and confidence thresholds. The support threshold was 0.1 and the confidence threshold was 0.3. After filtering, we extracted a total of 10 items <bold><italic>{305.0&#x003C;Amino-terminal brain natriuretic peptide precursor(NT-proBNP), monocytes(&#x0025;)&#x003C;&#x003D;4.2, (&#x0025;)lymphocyte&#x003C;&#x003D;13.0, 0.095&#x003C;procalcitonin, 339.0&#x003C;Lactate dehydrogenase, albumin&#x003C;&#x003D;31.9, 1.13&#x003C;International standard ratio, 2.04&#x003C;D-dimer, 79.4&#x003C;neutrophils(&#x0025;), 42.3&#x003C;Hypersensitive C-reactive protein}</italic></bold>.</p>
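Turning a patient record into items with the CAIM split points can be sketched as follows. The helper names and the mapping of which side of each split is the risky one are assumptions for illustration, chosen to match the item strings listed above; the two split points are taken from Tab. 1.

```python
# split points from Tab. 1 (subset, for illustration)
split_points = {"Lactate dehydrogenase": 339.0, "albumin": 31.9}

def make_items(record, splits, high_risk_side):
    # high_risk_side says whether the risky interval lies above (">")
    # or below ("<=") the split point for each feature (assumed mapping)
    items = []
    for feat, split in splits.items():
        if high_risk_side[feat] == ">" and record[feat] > split:
            items.append(f"{split}<{feat}")
        elif high_risk_side[feat] == "<=" and record[feat] <= split:
            items.append(f"{feat}<={split}")
    return items

record = {"Lactate dehydrogenase": 410.0, "albumin": 29.5}
items = make_items(record, split_points,
                   {"Lactate dehydrogenase": ">", "albumin": "<="})
```

For this hypothetical record both binned conditions fire, yielding the items "339.0&#x003C;Lactate dehydrogenase" and "albumin&#x003C;&#x003D;31.9" in the same string format as the extracted item list.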
</sec>
<sec id="s5_3">
<label>5.3</label><title>Rule Extraction Using JSD and Posterior Probability</title>
<p>In the mining process, we filtered out itemsets whose support was lower than the support threshold (0.15) or whose confidence was lower than the confidence threshold (0.9). Next, we calculated the distance between each itemset's distribution and class 1 (death), and between the itemset's distribution and class 0 (discharged). We deleted itemsets with a distance higher than 0.035. Using JSD, we selected 65 itemsets from the 171 itemsets. <?A3B2 "fig3",5,"anchor"?><xref ref-type="fig" rid="fig-3">Fig. 3</xref> shows the itemset and class distributions. The left column has a short distance to 1 (death), while the right column has a long distance to 1 (death). The top row is the itemset distribution, the middle row is the 1 (death) class distribution, and the bottom row is the 0 (survival) class distribution. As the figure shows, under the short-distance condition, the itemset and class 1 distributions had similar shapes: their peak positions and values were close. Under the long-distance condition, the itemset and class 1 distributions differed in both shape and values.</p><fig id="fig-3"><label>Figure 3</label><caption><title>Distribution of itemset, class 1, and class 0</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_17266-fig-3.png"/></fig>
<p>After filtering based on distance, we calculated the Bayesian posterior probability for each remaining itemset. The posterior probability threshold was 0.96. <?A3B2 "fig4",5,"anchor"?><xref ref-type="fig" rid="fig-4">Fig. 4</xref> shows the approximated posterior probability distributions. The upper row has a low probability while the bottom row has a high probability; the left column shows the distribution and the right column shows the probability of the samples. The highest point is the estimated point.</p>
<fig id="fig-4"><label>Figure 4</label><caption><title>Approximated posterior distribution and probability for samples</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_17266-fig-4.png"/></fig>
<p>Through the Bayesian posterior probability filtering, we selected 20 rules. After the superset filtering, we deleted 6 redundant rules. Therefore, in total, 14 rules were extracted.</p> 
<table-wrap id="table-6">
<table frame="hsides">
<colgroup>
<col align="left"/>
</colgroup>
<tbody>
<tr><td align="left">42.3&#x003C;Hypersensitive C-reactive protein and 79.4&#x003C;neutrophils(&#x0025;) and 339.0&#x003C;Lactate dehydrogenase &#x003D;&#x003E; High_Risk 0.9701269092135602</td></tr>
<tr><td align="left">(&#x0025;)lymphocyte&#x003C;&#x003D;13.0 and 339.0&#x003C;Lactate dehydrogenase &#x003D;&#x003E; High_Risk 0.9698961739048645</td></tr>
<tr><td align="left">(&#x0025;)lymphocyte&#x003C;&#x003D;13.0 and 42.3&#x003C;Hypersensitive C-reactive protein &#x003D;&#x003E; High_Risk 0.9693271238901066</td></tr>
<tr><td align="left">albumin&#x003C;&#x003D;31.9 and 0.095&#x003C;procalcitonin and 339.0&#x003C;Lactate dehydrogenase &#x003D;&#x003E; High_Risk 0.9781133691560419</td></tr>
<tr><td align="left">albumin&#x003C;&#x003D;31.9 and 42.3&#x003C;Hypersensitive C-reactive protein &#x003D;&#x003E; High_Risk 0.9670296292557119</td></tr>
<tr><td align="left">1.13&#x003C;International standard ratio and 42.3&#x003C;Hypersensitive C-reactive protein and 305.0&#x003C;Amino-terminal brain natriuretic peptide precursor(NT-proBNP) &#x003D;&#x003E; High_Risk 0.974622922099393</td></tr>
<tr><td align="left">1.13&#x003C;International standard ratio and albumin&#x003C;&#x003D;31.9 and 339.0&#x003C;Lactate dehydrogenase &#x003D;&#x003E; High_Risk 0.9865627254211912</td></tr>
<tr><td align="left">2.04&#x003C;D-dimer and 305.0&#x003C;Amino-terminal brain natriuretic peptide precursor(NT-proBNP) and 0.095&#x003C;procalcitonin &#x003D;&#x003E; High_Risk 0.9665619146313702</td></tr>
<tr><td align="left">2.04&#x003C;D-dimer and 42.3&#x003C;Hypersensitive C-reactive protein and 339.0&#x003C;Lactate dehydrogenase &#x003D;&#x003E; High_Risk 0.9660343990274075</td></tr>
<tr><td align="left">2.04&#x003C;D-dimer and 1.13&#x003C;International standard ratio and 339.0&#x003C;Lactate dehydrogenase &#x003D;&#x003E; High_Risk 0.9768106539506374</td></tr>
<tr><td align="left">2.04&#x003C;D-dimer and 1.13&#x003C;International standard ratio and 42.3&#x003C;Hypersensitive C-reactive protein &#x003D;&#x003E; High_Risk 0.9865986602150896</td></tr>
<tr><td align="left">monocytes(&#x0025;)&#x003C;&#x003D;4.2 and 339.0&#x003C;Lactate dehydrogenase &#x003D;&#x003E; High_Risk 0.9781133691560419</td></tr>
<tr><td align="left">monocytes(&#x0025;)&#x003C;&#x003D;4.2 and albumin&#x003C;&#x003D;31.9 and 0.095&#x003C;procalcitonin &#x003D;&#x003E; High_Risk 0.9727626191036757</td></tr>
<tr><td align="left">monocytes(&#x0025;)&#x003C;&#x003D;4.2 and 2.04&#x003C;D-dimer &#x003D;&#x003E; High_Risk 0.9745469068256394</td></tr>
</tbody>
</table>
</table-wrap>
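A minimal sketch of how a ruleset like the one above could be applied at prediction time. The decision logic (flag High_Risk as soon as any rule's antecedent is contained in the patient's items) and the truncated probabilities are our assumptions for illustration, not the authors' stated inference procedure.

```python
def classify(patient_items, rules, default="Low_Risk"):
    # a patient is High_Risk if any rule's antecedent is a subset of
    # the patient's item set (assumed decision logic)
    patient = set(patient_items)
    for antecedent, prob in rules:
        if antecedent <= patient:
            return "High_Risk", prob
    return default, None

rules = [
    (frozenset({"(%)lymphocyte<=13.0", "339.0<Lactate dehydrogenase"}), 0.9699),
]
label, prob = classify(
    {"(%)lymphocyte<=13.0", "339.0<Lactate dehydrogenase", "albumin<=31.9"},
    rules)
```

Because each rule carries its posterior probability, the matched rule doubles as the explanation: the clinician sees exactly which item combination triggered the High_Risk label and with what confidence.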
</sec>
<sec id="s5_4">
<label>5.4</label><title>Performance Evaluation</title>
<p>We evaluated the performance using a validation dataset split off at a 20&#x0025; ratio; the test dataset was provided separately. Since early prediction is important in real situations, test performance was evaluated using data from 7 days before the outcome (survival or death) was released. The performance evaluation on the validation dataset was conducted through 100 rounds of five-fold cross-validation: the model was executed once per fold, and each fold's ruleset was evaluated over the 100 rounds. The ROC curve, precision, recall, F1-score, AUC score, and accuracy were used as performance metrics.</p>
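The reported metrics follow from the binary confusion matrix. A self-contained sketch with hypothetical labels (the arrays are illustrative, not data from the study):

```python
def classification_metrics(y_true, y_pred):
    # counts from the binary confusion matrix
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

y_true = [1, 1, 1, 0, 0, 0, 0, 1]   # 1 = death, 0 = discharged (toy labels)
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

In the mortality setting, recall on the death class is the metric a missed high-risk patient hurts most, which is why Tabs. 2 and 3 report it alongside precision and F1.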
<p><?A3B2 "tbl2",5,"anchor"?><xref ref-type="table" rid="table-2">Tabs. 2</xref> and <?A3B2 "tbl3",5,"anchor"?><xref ref-type="table" rid="table-3">3</xref> present the performances of the XGBoost model, the fuzzy model, and the proposed model obtained using the validation dataset and the test dataset. The XGBoost model is from [<xref ref-type="bibr" rid="ref-3">3</xref>] and the fuzzy model is from [<xref ref-type="bibr" rid="ref-4">4</xref>]. On the validation dataset, our model had better performance and smaller variance than the XGBoost [<xref ref-type="bibr" rid="ref-3">3</xref>] model. On the test dataset, the proposed model showed the best performance. Thus, overall, our model shows good performance and generalization.</p>
<table-wrap id="table-2"><label>Table 2</label><caption><title>Performances of various models using the validation dataset</title></caption>
<table frame="hsides">
<colgroup>
  <col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left">Metric</th>
<th align="left">XGBoost [<xref ref-type="bibr" rid="ref-3">3</xref>]</th>
<th align="left">Fuzzy Model [<xref ref-type="bibr" rid="ref-4">4</xref>]</th>
<th align="left">Proposed Model</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Accuracy</td>
<td align="left">0.95283</td>
<td align="left">1.0</td>
<td align="left">0.96693&#x2009;&#x00B1;&#x2009;0.01773</td>
</tr>
<tr>
<td align="left">Precision</td>
<td align="left">0.96491</td>
<td align="left">1.0</td>
<td align="left">0.95218&#x2009;&#x00B1;&#x2009;0.03035</td>
</tr>
<tr>
<td align="left">Recall</td>
<td align="left">0.94827</td>
<td align="left">1.0</td>
<td align="left">0.97484&#x2009;&#x00B1;&#x2009;0.02349</td>
</tr>
<tr>
<td align="left">F1-score</td>
<td align="left">0.95652</td>
<td align="left">1.0</td>
<td align="left">0.963001&#x2009;&#x00B1;&#x2009;0.01959</td>
</tr>
<tr>
<td align="left">AUC score</td>
<td align="left">0.9506&#x2009;&#x00B1;&#x2009;0.0221</td>
<td align="left">1.0</td>
<td align="left">0.96778&#x2009;&#x00B1;&#x2009;0.01738</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap id="table-3"><label>Table 3</label><caption><title>Performances of various models using test dataset</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left">Metric</th>
<th align="left">XGBoost [<xref ref-type="bibr" rid="ref-3">3</xref>]</th>
<th align="left">Fuzzy Model [<xref ref-type="bibr" rid="ref-4">4</xref>]</th>
<th align="left">Proposed Model</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Accuracy</td>
<td align="left">0.97</td>
<td align="left">0.949</td>
<td align="left">0.98181</td>
</tr>
<tr>
<td align="left">Precision</td>
<td align="left">0.81</td>
<td align="left">0.9</td>
<td align="left">0.86666</td>
</tr>
<tr>
<td align="left">Recall</td>
<td align="left">1.0</td>
<td align="left">0.75</td>
<td align="left">1.0</td>
</tr>
<tr>
<td align="left">F1-score</td>
<td align="left">0.9</td>
<td align="left">0.81818</td>
<td align="left">0.92857</td>
</tr>
<tr>
<td align="left">AUC score</td>
<td align="left">-</td>
<td align="left">0.868</td>
<td align="left">0.98969</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><?A3B2 "fig5",5,"anchor"?><xref ref-type="fig" rid="fig-5">Fig. 5</xref> illustrates the performance of SVM, Random Forest, Gradient Boosting, KNN, and the proposed model on the validation dataset, shown as box plots. The results indicate that our proposed model achieved good performance with very small variance on the validation dataset; in particular, it performed similarly to or better than black-box models such as random forest and gradient boosting. <?A3B2 "fig6",5,"anchor"?><xref ref-type="fig" rid="fig-6">Fig. 6</xref> shows the performance of SVM, Random Forest, Gradient Boosting, KNN, and the proposed model on the test dataset; here, our proposed model achieved the best performance.</p>
<fig id="fig-5"><label>Figure 5</label><caption><title>Performances of different models using the validation dataset</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_17266-fig-5.png"/></fig>
<fig id="fig-6"><label>Figure 6</label><caption><title>Performances of different models using the test dataset</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_17266-fig-6.png"/></fig>
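<p>The box-plot comparison can be reproduced in outline as follows. This is a hypothetical sketch: the synthetic data stands in for the COVID-19 blood-sample features, and the per-fold scores collected here are what a call such as <monospace>plt.boxplot(fold_scores.values())</monospace> would render as a figure like Fig. 5.</p>

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neighbors import KNeighborsClassifier

# Placeholder data standing in for the blood-sample feature matrix.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

models = {
    "SVM": SVC(),
    "RandomForest": RandomForestClassifier(random_state=0),
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
}

# Collect per-fold accuracies for each baseline classifier.
fold_scores = {}
for name, model in models.items():
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    fold_scores[name] = cross_val_score(model, X, y, cv=cv)
```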
<p>To evaluate the performance of the rule extraction algorithm separately, we compared our model with that of the BRL paper [<xref ref-type="bibr" rid="ref-7">7</xref>]. We used our own generated item dataset because the BRL code does not include code for identifying feature ranges. <?A3B2 "tbl4",5,"anchor"?><xref ref-type="table" rid="table-4">Tab. 4</xref> lists the comparison results of rule extraction against BRL on the validation dataset. <?A3B2 "fig7",5,"anchor"?><xref ref-type="fig" rid="fig-7">Fig. 7</xref> shows the confusion matrix of each model on the test dataset; the left matrix is from our model and the right one from BRL [<xref ref-type="bibr" rid="ref-7">7</xref>].</p>
<p>The results revealed that our model improved performance, especially on the test dataset, indicating that it generalizes well.</p>
<table-wrap id="table-4"><label>Table 4</label><caption><title>Comparison of rule extraction with BRL using the validation dataset</title></caption>
<table frame="hsides">
  <colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
  </colgroup>
<thead>
<tr>
<th align="left">Metric</th>
<th align="left">BRL [<xref ref-type="bibr" rid="ref-7">7</xref>]</th>
<th align="left">Proposed Model</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Accuracy</td>
<td align="left">0.95755&#x2009;&#x00B1;&#x2009;0.01939</td>
<td align="left">0.96693&#x2009;&#x00B1;&#x2009;0.01773</td>
</tr>
<tr>
<td align="left">Precision</td>
<td align="left">0.98113&#x2009;&#x00B1;&#x2009;0.01677</td>
<td align="left">0.95218&#x2009;&#x00B1;&#x2009;0.03035</td>
</tr>
<tr>
<td align="left">Recall</td>
<td align="left">0.94352&#x2009;&#x00B1;&#x2009;0.03078</td>
<td align="left">0.97484&#x2009;&#x00B1;&#x2009;0.02349</td>
</tr>
<tr>
<td align="left">F1-score</td>
<td align="left">0.96163&#x2009;&#x00B1;&#x2009;0.01799</td>
<td align="left">0.963001&#x2009;&#x00B1;&#x2009;0.01959</td>
</tr>
<tr>
<td align="left">AUC score</td>
<td align="left">0.95971&#x2009;&#x00B1;&#x2009;0.01838</td>
<td align="left">0.96778&#x2009;&#x00B1;&#x2009;0.01738</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-7"><label>Figure 7</label><caption><title>Comparison of rule extraction with BRL using test dataset</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_17266-fig-7.png"/></fig>
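<p>The confusion matrices in Fig. 7 are computed in the standard way; the sketch below uses hypothetical labels (1 = death) rather than the actual test-set outcomes.</p>

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical ground-truth and predicted test-set outcomes (1 = death).
y_true = np.array([0, 0, 0, 1, 1, 0, 1, 0])
y_pred = np.array([0, 0, 1, 1, 1, 0, 1, 0])

# Rows are true classes and columns predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred)
```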
</sec>
</sec>
<sec id="s6">
<label>6</label><title>Conclusion</title>
<p>The results show that the proposed model achieves higher accuracy together with improved interpretability. It provides posterior probabilities along with rules, explaining why it makes the predictions it does, which can foster user trust. It also offers further insight: for example, the probabilities it produces agree with the argument that probabilistic modeling is important in clinical practice [<xref ref-type="bibr" rid="ref-21">21</xref>]. The model performed strongly in 100-round five-fold cross-validation on both the validation and test datasets, demonstrating high accuracy and generality. In other words, the model addresses the trade-off between interpretability and accuracy using blood samples of patients infected with COVID-19. Future research should focus on applications and on improving the model. To strengthen its use in real situations, we will experiment with our model on various medical datasets to achieve high generalizability and interpretability. To improve the model, we will optimize the choice of threshold: the proposed model currently selects thresholds based on the results of multiple experiments, which is time-consuming and does not guarantee an optimal value. To solve this problem, we will study optimization algorithms such as the genetic algorithm [<xref ref-type="bibr" rid="ref-22">22</xref>] and Bayesian optimization [<xref ref-type="bibr" rid="ref-23">23</xref>] to find the optimal threshold.</p>
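<p>The experiment-driven threshold selection described above can be illustrated by a simple grid search over candidate cut-offs; this is a hypothetical sketch, with <monospace>probs</monospace> standing in for the model&#x0027;s posterior probabilities, and it is exactly this exhaustive search that genetic or Bayesian optimization would replace.</p>

```python
import numpy as np
from sklearn.metrics import f1_score

def best_threshold(probs, y_true, candidates=None):
    """Pick the probability cut-off maximizing F1 on validation data."""
    if candidates is None:
        candidates = np.linspace(0.1, 0.9, 17)
    # Evaluate every candidate threshold; slow but exhaustive.
    scores = [f1_score(y_true, (probs >= t).astype(int))
              for t in candidates]
    return candidates[int(np.argmax(scores))]
```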
</sec>
</body>
<back>
<ack>
<p>I thank my mother, my advising professor and lab colleagues, my friends, and my cats for their endless support.</p>
</ack>
<fn-group>
<fn fn-type="other"><p><bold>Funding Statement:</bold> This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2021&#x2013;2020&#x2013;0&#x2013;01602) supervised by the IITP (Institute for Information &#x0026; Communications Technology Planning &#x0026; Evaluation).</p></fn>
<fn fn-type="conflict"><p><bold>Conflicts of Interest:</bold> The authors declare that they have no conflicts of interest to report regarding the present study.</p></fn>
</fn-group>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><collab>World Health Organization (WHO)</collab></person-group>, &#x201C;<article-title>Coronavirus disease (COVID-19) pandemic, number at a glance</article-title>,&#x201D; <year>2020</year>. Retrieved from <uri xlink:href="https://www.who.int/emergencies/diseases/novel-coronavirus-2019">https://www.who.int/emergencies/diseases/novel-coronavirus-2019</uri>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><collab>World Health Organization (WHO)</collab></person-group>, &#x201C;<article-title>COVID-19 severity</article-title>,&#x201D; <year>2020</year>. Retrieved from <uri xlink:href="https://www.who.int/westernpacific/emergencies/covid-19/information/severity">https://www.who.int/westernpacific/emergencies/covid-19/information/severity</uri>. </mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Yan</surname></string-name>, <string-name><given-names>H. T.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Goncalves</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Xiao</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Wang</surname></string-name> <etal>et al.</etal></person-group> &#x201C;<article-title>An interpretable mortality prediction model for COVID-19 patients</article-title>,&#x201D; <source>Nature Machine Intelligence</source>, vol. <volume>2</volume>, no. <issue>5</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>6</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>P.</given-names> <surname>Gemmar</surname></string-name></person-group>, <article-title>&#x201C;An interpretable mortality prediction model for covid-19 patients-alternative approach&#x201D;</article-title>, <source>MedRxiv</source>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Yan</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Fan</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Liu</surname></string-name> <etal>et al.</etal></person-group> &#x201C;<article-title>D-dimer levels on admission to predict in-hospital mortality in patients with covid-19</article-title>,&#x201D; <source>Journal of Thrombosis and Haemostasis</source>, vol. <volume>18</volume>, no. <issue>6</issue>, pp. <fpage>1324</fpage>&#x2013;<lpage>1329</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Tan</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Ding</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Huang</surname></string-name> <etal>et al.</etal></person-group> &#x201C;<article-title>Lymphopenia predicts disease severity of COVID-19: A descriptive and predictive study</article-title>,&#x201D; <source>Signal Transduction and Targeted Therapy</source>, vol. <volume>5</volume>, no. <issue>1</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>3</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>B.</given-names> <surname>Letham</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Rudin</surname></string-name>, <string-name><given-names>T. H.</given-names> <surname>McCormick</surname></string-name> and <string-name><given-names>D.</given-names> <surname>Madigan</surname></string-name></person-group>, &#x201C;<article-title>Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model</article-title>,&#x201D; <source>The Annals of Applied Statistics</source>, vol. <volume>9</volume>, no. <issue>3</issue>, pp. <fpage>1350</fpage>&#x2013;<lpage>1371</lpage>, <year>2015</year>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L. I. L.</given-names> <surname>Gonz&#x00E1;lez</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Derungs</surname></string-name> and <string-name><given-names>O.</given-names> <surname>Amft</surname></string-name></person-group>, &#x201C;<article-title>A Bayesian approach to rule mining</article-title>,&#x201D; <source>arXiv preprint, arXiv:1912.06432</source>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Nie</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>P. M.</given-names> <surname>Matthews</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Xu</surname></string-name> and <string-name><given-names>Y.</given-names> <surname>Guo</surname></string-name></person-group>, &#x201C;<article-title>Minimum partial correlation: An accurate and parameter-free measure of functional connectivity in fMRI</article-title>,&#x201D; <source>Int. Conf. on Brain Informatics and Health</source>, vol. <volume>9250</volume>, pp. <fpage>125</fpage>&#x2013;<lpage>134</lpage>, <year>2015</year>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H.</given-names> <surname>Griffin</surname></string-name></person-group>, &#x201C;<article-title>A further simplification of the multiple and partial correlation process</article-title>,&#x201D; <source>Psychometrika</source>, vol. <volume>1</volume>, no. <issue>3</issue>, pp. <fpage>219</fpage>&#x2013;<lpage>228</lpage>, <year>1936</year>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>T. M.</given-names> <surname>Cover</surname></string-name></person-group>, &#x201C;<article-title>Introduction and preview</article-title>,&#x201D; in <source>Elements of Information Theory</source> <edition>2</edition><sup>nd</sup> ed., <publisher-loc>Hoboken, New Jersey, USA</publisher-loc>: <publisher-name>John Wiley &#x0026; Sons</publisher-name>, pp. <fpage>1</fpage>&#x2013;<lpage>13</lpage>, <year>1999</year>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L. A.</given-names> <surname>Kurgan</surname></string-name> and <string-name><given-names>K. J.</given-names> <surname>Cios</surname></string-name></person-group>, &#x201C;<article-title>CAIM discretization algorithm</article-title>,&#x201D; <source>IEEE Transactions on Knowledge and Data Engineering</source>, vol. <volume>16</volume>, no. <issue>2</issue>, pp. <fpage>145</fpage>&#x2013;<lpage>153</lpage>, <year>2004</year>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Han</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Pei</surname></string-name> and <string-name><given-names>Y.</given-names> <surname>Yin</surname></string-name></person-group>, &#x201C;<article-title>Mining frequent patterns without candidate generation</article-title>,&#x201D; <source>ACM Sigmod Record</source>, vol. <volume>29</volume>, no. <issue>2</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>12</lpage>, <year>2000</year>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>C. C.</given-names> <surname>Aggarwal</surname></string-name>, <string-name><given-names>M. A.</given-names> <surname>Bhuiyan</surname></string-name> and <string-name><given-names>M. A.</given-names> <surname>Hasan</surname></string-name></person-group>, &#x201C;<chapter-title>Frequent pattern mining algorithms: a survey</chapter-title>,&#x201D; in <source>Frequent Pattern Mining</source>, <publisher-name>Springer, Cham</publisher-name>, pp. <fpage>19</fpage>&#x2013;<lpage>64</lpage>, <year>2014</year>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>F.</given-names> <surname>Nielsen</surname></string-name></person-group>, &#x201C;<article-title>On the jensen&#x2013;Shannon symmetrization of distances relying on abstract means</article-title>,&#x201D; <source>Entropy</source>, vol. <volume>21</volume>, no. <issue>5</issue>, pp. <fpage>485</fpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Forbes</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Evans</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Hastings</surname></string-name> and <string-name><given-names>B.</given-names> <surname>Peacock</surname></string-name></person-group>, &#x201C;<article-title>Beta Distribution</article-title>,&#x201D; in <source>Statistical Distributions</source>, <edition>4</edition><sup>th</sup> ed., <publisher-loc>Hoboken, New Jersey, USA</publisher-loc>: <publisher-name>John Wiley &#x0026; Sons</publisher-name>, pp. <fpage>55</fpage>&#x2013;<lpage>61</lpage>, <year>2011</year>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M. K.</given-names> <surname>Cowles</surname></string-name></person-group>, &#x201C;<article-title>Introduction to One-Parameter Models: Estimating a Population Proportion</article-title>,&#x201D; in <source>Applied Bayesian statistics: With R and OpenBUGS examples</source>, <edition>1</edition><sup>st</sup> ed., vol. <volume>98</volume>, <publisher-loc>Berlin/Heidelberg, Germany</publisher-loc>: <publisher-name>Springer Science &#x0026; Business Media</publisher-name>, pp. <fpage>25</fpage>&#x2013;<lpage>46</lpage>, <year>2013</year>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A. E.</given-names> <surname>Gelfand</surname></string-name> and <string-name><given-names>A. F. M.</given-names> <surname>Smith</surname></string-name></person-group>, &#x201C;<article-title>Sampling-based approaches to calculating marginal densities</article-title>,&#x201D; <source>Journal of the American Statistical Association</source>, vol. <volume>85</volume>, no. <issue>410</issue>, pp. <fpage>398</fpage>&#x2013;<lpage>409</lpage>, <year>1990</year>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Andrieu</surname></string-name>, <string-name><given-names>N. D.</given-names> <surname>Freitas</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Doucet</surname></string-name> and <string-name><given-names>M. I.</given-names> <surname>Jordan</surname></string-name></person-group>, &#x201C;<article-title>An introduction to MCMC for machine learning</article-title>,&#x201D; <source>Machine Learning</source>, vol. <volume>50</volume>, no. <issue>1</issue>, pp. <fpage>5</fpage>&#x2013;<lpage>43</lpage>, <year>2003</year>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><given-names>C. P.</given-names> <surname>Robert</surname></string-name></person-group>, &#x201C;<chapter-title>Bayesian point estimation,</chapter-title>&#x201D; in <source>the Bayesian Choice</source>, <publisher-name>Springer</publisher-name>, <publisher-loc>New York, NY</publisher-loc>, <publisher-name>Springer Texts in Statistics</publisher-name>, pp. <fpage>137</fpage>&#x2013;<lpage>177</lpage>, <year>1994</year>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>R.</given-names> <surname>Bellazzi</surname></string-name> and <string-name><given-names>B.</given-names> <surname>Zupan</surname></string-name></person-group>, &#x201C;<article-title>Predictive data mining in clinical medicine: Current issues and guidelines</article-title>,&#x201D; <source>International Journal of Medical Informatics</source>, vol. <volume>77</volume>, no. <issue>2</issue>, pp. <fpage>81</fpage>&#x2013;<lpage>97</lpage>, <year>2008</year>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K. F.</given-names> <surname>Man</surname></string-name>, <string-name><given-names>K. S.</given-names> <surname>Tang</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Kwong</surname></string-name></person-group>, &#x201C;<article-title>Genetic algorithms: Concepts and applications [in engineering design]</article-title>,&#x201D; <source>IEEE Transactions on Industrial Electronics</source>, vol. <volume>43</volume>, no. <issue>5</issue>, pp. <fpage>519</fpage>&#x2013;<lpage>534</lpage>, <year>1996</year>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M.</given-names> <surname>Pelikan</surname></string-name>, <string-name><given-names>D. E.</given-names> <surname>Goldberg</surname></string-name> and <string-name><given-names>E.</given-names> <surname>Cant&#x00FA;-Paz</surname></string-name></person-group>, &#x201C;<article-title>BOA: The Bayesian optimization algorithm</article-title>,&#x201D; <source>Proc. of the Genetic and Evolutionary Computation Conf. GECCO</source>, vol. <volume>99</volume>, no. <issue>1</issue>, pp. <fpage>525</fpage>&#x2013;<lpage>532</lpage>, <year>1999</year>.</mixed-citation></ref>
</ref-list>
</back>
</article>