<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMES</journal-id>
<journal-id journal-id-type="nlm-ta">CMES</journal-id>
<journal-id journal-id-type="publisher-id">CMES</journal-id>
<journal-title-group>
<journal-title>Computer Modeling in Engineering &#x0026; Sciences</journal-title>
</journal-title-group>
<issn pub-type="epub">1526-1506</issn>
<issn pub-type="ppub">1526-1492</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">52830</article-id>
<article-id pub-id-type="doi">10.32604/cmes.2024.052830</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Determination of the Pile Drivability Using Random Forest Optimized by Particle Swarm Optimization and Bayesian Optimizer</article-title>
<alt-title alt-title-type="left-running-head">Determination of the Pile Drivability Using Random Forest Optimized by Particle Swarm Optimization and Bayesian Optimizer</alt-title>
<alt-title alt-title-type="right-running-head">Determination of the Pile Drivability Using Random Forest Optimized by Particle Swarm Optimization and Bayesian Optimizer</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Cheng</surname><given-names>Shengdong</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-2" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Gao</surname><given-names>Juncheng</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><email>gaojc@dlut.edu.cn</email></contrib>
<contrib id="author-3" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Qi</surname><given-names>Hongning</given-names></name><xref ref-type="aff" rid="aff-2">2</xref><email>qhn2080@csu.edu.cn</email></contrib>
<aff id="aff-1"><label>1</label><institution>State Key Laboratory of Eco-Hydraulics in Northwest Arid Region, Xi&#x2019;an University of Technology</institution>, <addr-line>Xi&#x2019;an, 710048</addr-line>, <country>China</country></aff>
<aff id="aff-2"><label>2</label><institution>School of Resources and Safety Engineering, Central South University</institution>, <addr-line>Changsha, 410083</addr-line>, <country>China</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Authors: Juncheng Gao. Email: <email>gaojc@dlut.edu.cn</email>; Hongning Qi. Email: <email>qhn2080@csu.edu.cn</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2024</year></pub-date>
<pub-date date-type="pub" publication-format="electronic"><day>20</day><month>8</month><year>2024</year></pub-date>
<volume>141</volume>
<issue>1</issue>
<fpage>871</fpage>
<lpage>892</lpage>
<history>
<date date-type="received">
<day>16</day>
<month>4</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>11</day>
<month>6</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2024 The Authors.</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Published by Tech Science Press.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMES_52830.pdf"></self-uri>
<abstract>
<p>Driven piles are used in many geological environments as a practical and convenient structural component. Hence, determining the drivability of piles is of great importance in complex geotechnical applications. Conventional methods of predicting pile drivability often rely on simplified physical models or empirical formulas, which may lack accuracy or applicability in complex geological conditions. Therefore, this study presents a practical machine learning approach, namely a Random Forest (RF) optimized by Bayesian Optimization (BO) and Particle Swarm Optimization (PSO), which not only enhances prediction accuracy but also better adapts to varying geological environments, to predict the drivability parameters of piles (i.e., maximum compressive stress, maximum tensile stress, and blows per foot). In addition, support vector regression, extreme gradient boosting, k-nearest neighbor, and decision tree models are applied for comparison purposes. To train and test these models, 3258 of the 4072 collected datasets with 17 model inputs were randomly selected for training, and the remaining 814 datasets were used for model testing. Lastly, the results of these models were compared and evaluated using two performance indices, i.e., the root mean square error (RMSE) and the coefficient of determination (R<sup>2</sup>). The results indicate that the optimized RF model achieved lower RMSE values than the other prediction models in predicting the three parameters, specifically 0.044, 0.438, and 0.146, and higher R<sup>2</sup> values than the other implemented techniques, specifically 0.966, 0.884, and 0.977. In addition, the sensitivity and uncertainty of the optimized RF model were analyzed using Sobol sensitivity analysis and Monte Carlo (MC) simulation. It can be concluded that the optimized RF model could be used to predict the performance of the pile, and it may provide a useful reference for solving problems under similar engineering conditions.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Random forest</kwd>
<kwd>regression model</kwd>
<kwd>pile drivability</kwd>
<kwd>Bayesian optimization</kwd>
<kwd>particle swarm optimization</kwd>
</kwd-group>
<funding-group>
<award-group id="awg1">
<funding-source>National Science Foundation of China</funding-source>
<award-id>42107183</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>The role of piles is often to transfer structural loads from the upper structure to the lower geotechnical layers through the formation media in most engineering environments. Pile driving usually involves hammering, where the impact forces can generate tensile and compressive stresses within the pile. When these stresses exceed the strength of the pile material, fracture or damage of the pile may occur [<xref ref-type="bibr" rid="ref-1">1</xref>,<xref ref-type="bibr" rid="ref-2">2</xref>]. Therefore, maximum tensile stress (MTS), maximum compressive stress (MCS), and blows per foot (BPF) are critical technical parameters that must be carefully considered during pile design and driving processes [<xref ref-type="bibr" rid="ref-3">3</xref>,<xref ref-type="bibr" rid="ref-4">4</xref>]. By predicting these parameters through calculations in the design phase, the pile design can be effectively optimized, achieving a construction that is both safe and economical.</p>
<p>In early engineering practices, the driving behaviour of piles was often predicted based on the point mass model from Newtonian mechanics. However, this method, which overlooks many practical factors, has limited accuracy. In 1931, Isaac [<xref ref-type="bibr" rid="ref-5">5</xref>] discovered that energy is transmitted through the propagation of impact stress waves in the hammer assembly and the pile, which fundamentally differs from Newton&#x2019;s point mass model. In 1960, Smith [<xref ref-type="bibr" rid="ref-6">6</xref>] employed the theory of stress waves to propose an empirical formula for predicting pile driving characteristics. Although this formula simplifies the pile into a series of discrete mass points, it ignores the pile&#x2019;s lateral vibrations and the complex nonlinear behaviour of the soil. In 1990, Nath [<xref ref-type="bibr" rid="ref-7">7</xref>] introduced the finite element analysis technique using the &#x201C;continuous method&#x201D; for pile driving analysis. While theoretically providing a more detailed analysis, this approach faces challenges of time consumption and parameter calibration in large-scale engineering applications. Additionally, traditional methods of pile driving analysis exhibit many uncertainties and nonlinear responses [<xref ref-type="bibr" rid="ref-8">8</xref>&#x2013;<xref ref-type="bibr" rid="ref-10">10</xref>], complicating and increasing the uncertainty in problem analysis. Therefore, it is necessary to develop more accurate predictive models, such as machine learning models. Machine learning involves learning from historical data and dynamically adjusting and refining algorithms based on the encountered data patterns rather than strictly adhering to predetermined static models. This not only enhances the stability of predictions but also improves accuracy.</p>
<p>In recent years, many researchers have utilized artificial intelligence (AI) algorithms [<xref ref-type="bibr" rid="ref-11">11</xref>&#x2013;<xref ref-type="bibr" rid="ref-13">13</xref>] such as support vector machines (SVM) [<xref ref-type="bibr" rid="ref-14">14</xref>], artificial neural networks (ANN) [<xref ref-type="bibr" rid="ref-15">15</xref>&#x2013;<xref ref-type="bibr" rid="ref-17">17</xref>], and back-propagation neural networks (BPNN) [<xref ref-type="bibr" rid="ref-18">18</xref>] as effective solutions to geotechnical problems. These research methods have important guiding significance for engineering design. For example, Das et al. [<xref ref-type="bibr" rid="ref-19">19</xref>] predicted the bearing capacity of piles using an ANN model. In addition, an SVM model was proposed by Kordjazi et al. [<xref ref-type="bibr" rid="ref-20">20</xref>] to evaluate the bearing capacity of piles under axial load conditions. Later, Zhang et al. [<xref ref-type="bibr" rid="ref-21">21</xref>] evaluated the ultimate bearing capacity of driven piles using a BPNN regression model and multivariate adaptive regression splines and performed performance comparisons on the developed methods [<xref ref-type="bibr" rid="ref-22">22</xref>]. Although machine learning has made significant progress in addressing geotechnical engineering issues, existing models still exhibit limitations under various environmental conditions. Training an ANN is often a time-consuming process, largely because it is difficult to predict at the outset which network structure and parameter configuration will yield the best performance [<xref ref-type="bibr" rid="ref-23">23</xref>]. Additionally, while SVM demonstrates good accuracy when handling large datasets, its computational speed can be slow when dealing with complex problems. 
Concurrently, the random forest (RF) algorithm is renowned for its robust capability to process and interpret complex and nonlinear interactions among variables, making it particularly suitable for solving complex engineering challenges that traditional methods struggle to address [<xref ref-type="bibr" rid="ref-24">24</xref>&#x2013;<xref ref-type="bibr" rid="ref-26">26</xref>].</p>
<p>This study proposes an optimized RF method to forecast the MTS, MCS, and BPF associated with the drivability of piles. In addition to the optimized RF model, and for comparison purposes, other models were constructed, including SVM, k-nearest neighbour (KNN), extreme gradient boosting (XGBoost), and decision tree (DT). The arrangement of this article is as follows. <xref ref-type="sec" rid="s2">Section 2</xref> introduces RF and the optimization methods in detail and briefly describes XGBoost, KNN, support vector regression (SVR), and DT. <xref ref-type="sec" rid="s3">Section 3</xref> presents the definition of the data and variables. <xref ref-type="sec" rid="s4">Section 4</xref> presents the training process of the models and their comparison. Finally, the last section presents conclusions and recommendations for further research.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Methodology</title>
<sec id="s2_1">
<label>2.1</label>
<title>Machine Learning Models</title>
<p>RF enhances model generalization and predictive accuracy through the integration of multiple Decision Trees (DTs) [<xref ref-type="bibr" rid="ref-27">27</xref>,<xref ref-type="bibr" rid="ref-28">28</xref>]. Within the RF model, each tree is trained on a randomly selected subset of samples and features from the original dataset. This randomness aids the model in better adapting to diverse data distributions, thereby improving the overall stability and performance of the model [<xref ref-type="bibr" rid="ref-29">29</xref>&#x2013;<xref ref-type="bibr" rid="ref-31">31</xref>]. DTs utilize the Mean Squared Error (MSE) to select the optimal splitting point during training, continuing until no additional features are available or the minimum MSE is achieved. The final prediction result of the forest is obtained by averaging the predictions from all the trees; the formula is as follows:
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:mover><mml:mi>Y</mml:mi><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>K</mml:mi></mml:mfrac><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>K</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mi>Y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></disp-formula>where <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mover><mml:mi>Y</mml:mi><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> is the RF prediction result; <italic>K</italic> is the number of trees; <italic>Y</italic><sub><italic>i</italic></sub> is the prediction result of the <italic>i</italic>th decision tree.</p>
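As an illustration of Eq. (1), the sketch below (assuming scikit-learn; this is a minimal demonstration, not the authors' implementation) verifies that a random forest's regression output equals the mean of the predictions of its K individual trees:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 5))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=200)

# K trees, each grown on a bootstrap sample with random feature subsets
rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Eq. (1): the forest prediction is the average of the K tree predictions
per_tree = np.stack([tree.predict(X[:5]) for tree in rf.estimators_])
forest_pred = rf.predict(X[:5])
print(np.allclose(per_tree.mean(axis=0), forest_pred))  # True
```

The equivalence holds because, for regression, `RandomForestRegressor.predict` simply averages the outputs of its fitted estimators.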
<p>RF regression proves highly effective in analyzing nonlinear and collinear data, particularly as it does not require the assumption of a specific mathematical model form [<xref ref-type="bibr" rid="ref-31">31</xref>]. In this study, the RF model integrates multiple DTs to manage the diversity of pile drivability performance data, thus enabling the analysis and understanding of the drivability parameters of piles under various geological conditions. <xref ref-type="fig" rid="fig-1">Fig. 1</xref> illustrates the process of building the RF model in this study.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Flowchart of RF (RMSE: root mean square error)</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-1.tif"/>
</fig>
<p>In addition to the RF model, this study employs several commonly used regression methods to predict the drivability of piles, including XGBoost, DT, SVR, and KNN. For a detailed explanation of these models, readers are referred to previously published literature [<xref ref-type="bibr" rid="ref-32">32</xref>&#x2013;<xref ref-type="bibr" rid="ref-35">35</xref>].</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Particle Swarm Optimization</title>
<p>Particle Swarm Optimization (PSO), developed by Eberhart and Kennedy in 1995, is a metaheuristic optimization algorithm inspired by the foraging behaviour of birds. In PSO, each candidate solution to the optimization problem is represented as a &#x201C;particle&#x201D; in the search space. Each particle possesses a fitness value determined by the objective function being optimized and a velocity dictating its direction and distance of movement. As the optimization progresses, particles adjust their movements to converge toward the current optimal solutions within the solution space. Suppose that <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>3</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the position vector of particle <italic>i</italic>, and <inline-formula id="ieqn-3"><mml:math
id="mml-ieqn-3"><mml:msub><mml:mi>V</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>3</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the velocity vector of particle <italic>i</italic> [<xref ref-type="bibr" rid="ref-36">36</xref>,<xref ref-type="bibr" rid="ref-37">37</xref>], where <italic>n</italic> is the dimension size of the optimization problem. To have more control on the velocity, an inertia weight (<italic>W</italic>) can be described in the velocity equation [<xref ref-type="bibr" rid="ref-38">38</xref>]. Then, the speed vector iteration formula can be presented as follows [<xref ref-type="bibr" rid="ref-39">39</xref>&#x2013;<xref ref-type="bibr" rid="ref-42">42</xref>]:
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:msub><mml:mi>&#x03BD;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>W</mml:mi><mml:msub><mml:mi>&#x03BD;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>P</mml:mi><mml:mi>b</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>G</mml:mi><mml:mi>b</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x03BD;</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></disp-formula></p>
<p>In the above formulas, <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mi>P</mml:mi><mml:mi>b</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:mi>G</mml:mi><mml:mi>b</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represent the historical best position vector of particle <italic>i</italic> and the best position vector in the history of the population, respectively; the parameters <italic>c</italic><sub><italic>1</italic></sub> and <italic>c</italic><sub><italic>2</italic></sub> are called learning factors; <italic>r</italic><sub><italic>1</italic></sub> and <italic>r</italic><sub><italic>2</italic></sub> are two random values uniformly distributed in [0,1]; and <italic>W</italic> is the inertia weight, which is used to balance global and local search ability [<xref ref-type="bibr" rid="ref-43">43</xref>].</p>
<p>Through this mechanism, the PSO algorithm continuously learns from its own and the population&#x2019;s historical information, optimizing the search process and gradually converging to the optimal solution. Additionally, the inertia weight <italic>W</italic> is an important parameter in PSO, as it influences the velocity update and the global search capability of the particles. In this study, the inertia weight <italic>W</italic> was varied adaptively: it decreases linearly during the iterations, starting from an initial value <italic>W</italic><sub><italic>max</italic></sub> &#x003D; 0.9 and reducing to a final value <italic>W</italic><sub><italic>min</italic></sub> &#x003D; 0.4. <xref ref-type="fig" rid="fig-2">Fig. 2</xref> illustrates the PSO optimization process.</p>
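The update rules of Eqs. (2) and (3), combined with the linearly decreasing inertia weight (0.9 to 0.4) described above, can be sketched as follows. This is a minimal Python illustration on a simple test function; the swarm size, learning factors, and velocity clamp are hypothetical choices, not taken from the paper:

```python
import numpy as np

def pso(f, dim=2, n_particles=30, iters=200, w_max=0.9, w_min=0.4,
        c1=1.5, c2=1.5, seed=0):
    """Minimal PSO sketch implementing Eqs. (2)-(3) with a linearly
    decreasing inertia weight W (w_max -> w_min)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, (n_particles, dim))   # positions X_i
    v = np.zeros_like(x)                             # velocities V_i
    pbest = x.copy()
    pbest_f = np.apply_along_axis(f, 1, x)           # personal best fitness
    gbest = pbest[pbest_f.argmin()].copy()           # global best position
    for t in range(iters):
        w = w_max - (w_max - w_min) * t / (iters - 1)  # linear decay 0.9 -> 0.4
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)  # Eq. (2)
        v = np.clip(v, -2.0, 2.0)                    # clamp to keep the swarm stable
        x = x + v                                    # Eq. (3)
        fx = np.apply_along_axis(f, 1, x)
        improved = fx < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], fx[improved]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest, pbest_f.min()

# Minimize the 2-D sphere function, whose optimum is at the origin
best_x, best_f = pso(lambda z: float(np.sum(z ** 2)))
print(best_x, best_f)
```

In the actual study the fitness function would be the cross-validated RMSE of an RF model given a hyper-parameter vector, rather than this toy objective.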
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>PSO optimization flowchart</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-2.tif"/>
</fig>
</sec>
<sec id="s2_3">
<label>2.3</label>
<title>Bayesian Optimization</title>
<p>Bayesian optimization (BO) differs from traditional gradient-based methods in that it uses a Gaussian process model to reveal the hidden relationships between hyper-parameters and the loss function. In the Bayesian tuning process, suppose a set of hyper-parameter combinations <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:mi>X</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> (<inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents the value of a hyper-parameter), and this set of hyper-parameters and the loss function <italic>f</italic>(<italic>x</italic>) that we need to optimize have a functional relationship [<xref ref-type="bibr" rid="ref-44">44</xref>]. When the value of <italic>x<sup>&#x002A;</sup></italic> is taken, the optimal <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:mi>Y</mml:mi><mml:mo>=</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>x</mml:mi><mml:mrow><mml:mo>&#x2217;</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> can be obtained. Since any finite set of values drawn from a Gaussian process follows a multivariate normal distribution, the process of finding the optimal parameters can itself be modelled as a Gaussian process [<xref ref-type="bibr" rid="ref-45">45</xref>,<xref ref-type="bibr" rid="ref-46">46</xref>]. 
The core steps of the BO algorithm are as follows: First, the sample set <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:mi>D</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula> is used to estimate and update the Gaussian process <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x223C;</mml:mo><mml:mi>G</mml:mi><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi>K</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi>x</mml:mi><mml:mrow><mml:mo>&#x2032;</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. Then, the acquisition function <italic>EI</italic>(<italic>x</italic>) is used to guide the new sampling.
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mi>E</mml:mi><mml:mi>I</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>x</mml:mi><mml:mrow><mml:mo>+</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03BE;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mi>&#x03C6;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>Z</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mi>&#x03D5;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>Z</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:mi>Z</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x03BC;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>x</mml:mi><mml:mrow><mml:mo>+</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>&#x03BE;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula>where <italic>&#x03C6;</italic> is the cumulative distribution function of the standard normal distribution, <italic>&#x03D5;</italic> is the probability density function of the standard normal distribution, and <italic>&#x03BE;</italic> is a non-negative parameter that controls the trade-off between exploration and exploitation.</p>
<p>BO, in principle, models the hidden relationship between hyper-parameters and the loss function using a Gaussian process. This fitted function provides guidance on the optimal parameters for the next iteration. By continuously adding sample points, the posterior distribution of the objective function is updated [<xref ref-type="bibr" rid="ref-47">47</xref>,<xref ref-type="bibr" rid="ref-48">48</xref>]. For more comprehensive insights and discussions, readers are referred to other studies available in the literature [<xref ref-type="bibr" rid="ref-49">49</xref>,<xref ref-type="bibr" rid="ref-50">50</xref>]. <xref ref-type="fig" rid="fig-3">Fig. 3</xref> illustrates the process of BO.</p>
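The acquisition step of Eqs. (4) and (5) can be illustrated with a short sketch (assuming NumPy and SciPy; the posterior means and standard deviations below are hypothetical stand-ins for a fitted Gaussian process, and the maximization convention follows the text):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best, xi=0.01):
    """Eqs. (4)-(5): EI(x) = (mu(x) - f(x+) - xi) * Phi(Z) + sigma(x) * phi(Z),
    with Z = (mu(x) - f(x+) - xi) / sigma(x). Phi is the standard normal CDF,
    phi its PDF, xi the exploration-exploitation trade-off parameter."""
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    imp = mu - f_best - xi
    with np.errstate(divide="ignore", invalid="ignore"):
        z = imp / sigma
        ei = imp * norm.cdf(z) + sigma * norm.pdf(z)
    # Where sigma = 0 the posterior is certain, so EI reduces to max(imp, 0)
    return np.where(sigma > 0, ei, np.maximum(imp, 0.0))

# Hypothetical candidate points with GP posterior mean/std; current best f(x+) = 0.8
mu = np.array([0.2, 0.5, 0.9])
sigma = np.array([0.30, 0.10, 0.01])
ei = expected_improvement(mu, sigma, f_best=0.8)
next_point = int(ei.argmax())  # index of the next sampling location
print(ei, next_point)
```

Here the third candidate wins because its mean already exceeds the incumbent best; a point with a low mean but large uncertainty can also score well, which is how EI balances exploration against exploitation.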
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>BO flowchart</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-3.tif"/>
</fig>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Materials</title>
<sec id="s3_1">
<label>3.1</label>
<title>Database for Modelling</title>
<p>This article collected a database of more than 4000 piles from the North Carolina project [<xref ref-type="bibr" rid="ref-51">51</xref>] that have been used in construction projects. The inputs of this database comprise 17 characteristic parameters (concerning the hammer, the hammer cushion material, the pile, and the soil), and the outputs are three target parameters (i.e., MCS, MTS, and BPF). The frequency distributions of the MCS, MTS, and BPF are shown in <xref ref-type="fig" rid="fig-4">Fig. 4</xref>. It can be seen in <xref ref-type="fig" rid="fig-4">Fig. 4</xref> that MCS is approximately normally distributed, while MTS and BPF show positively skewed distributions, with most observations concentrated at smaller values. The input and output parameters with their ranges are described in <xref ref-type="table" rid="table-1">Table 1</xref>. Moreover, the correlation analysis between the influencing factors is presented in <xref ref-type="fig" rid="fig-5">Fig. 5</xref>, which shows the correlation coefficients between the variables. The closer a correlation coefficient is to 0, the weaker the correlation between the two variables.</p>
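The random 80/20 partition used for modelling (3258 training and 814 testing records out of 4072) can be reproduced schematically as follows. This is a sketch assuming scikit-learn; the feature and target values below are random stand-ins, since the actual database is not distributed with the article:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.uniform(size=(4072, 17))  # 17 input features X1..X17 (stand-in values)
y = rng.uniform(size=(4072, 3))   # three targets: MCS, MTS, BPF (stand-in values)

# Random split: 3258 records for training, 814 held out for testing
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=814, random_state=42)
print(len(X_tr), len(X_te))  # 3258 814
```

Passing an integer `test_size` fixes the exact number of held-out records, matching the counts reported in the study.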
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>Frequency distribution of each output variable: (a) MCS data, (b) MTS data, and (c) BPF data</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-4.tif"/>
</fig><table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Summary of variables definition</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Number</th>
<th>Parameter</th>
<th>Description</th>
<th>Operation</th>
</tr>
</thead>
<tbody>
<tr>
<td>X1</td>
<td>Hammer weight (kN) (1.76&#x2013;7.00)</td>
<td rowspan="2">Hammer</td>
<td rowspan="17">Inputs</td>
</tr>
<tr>
<td>X2</td>
<td>Energy (kN&#x002A;m) (17.60&#x2013;75.43)</td>
</tr>
<tr>
<td>X3</td>
<td>Area (m<sup>2</sup>) (225.00&#x2013;416.00)</td>
<td rowspan="4">Hammer cushion material</td>
</tr>
<tr>
<td>X4</td>
<td>Elastic modulus (GPa) (175.00&#x2013;540.00)</td>
</tr>
<tr>
<td>X5</td>
<td>Thickness (m) (1.00&#x2013;7.00)</td>
</tr>
<tr>
<td>X6</td>
<td>Helmet weight (kN) (0.89&#x2013;7.74)</td>
</tr>
<tr>
<td>X7</td>
<td>Length (m) (9.84&#x2013;100.06)</td>
<td rowspan="5">Pile information</td>
</tr>
<tr>
<td>X8</td>
<td>Penetration (m) (9.84&#x2013;100.10)</td>
</tr>
<tr>
<td>X9</td>
<td>Diameter (m) (12.00&#x2013;14.00)</td>
</tr>
<tr>
<td>X10</td>
<td>Section area (m<sup>2</sup>) (11.50&#x2013;21.40)</td>
</tr>
<tr>
<td>X11</td>
<td>L/D (8.43&#x2013;100.10)</td>
</tr>
<tr>
<td>X12</td>
<td>Quake at toe (0.10&#x2013;0.33)</td>
<td rowspan="4">Soil information</td>
</tr>
<tr>
<td>X13</td>
<td>Damping at shaft (s/m) (0.05&#x2013;0.25)</td>
</tr>
<tr>
<td>X14</td>
<td>Damping at toe (s/m) (0.06&#x2013;0.25)</td>
</tr>
<tr>
<td>X15</td>
<td>Shaft resistance (%) (10.00&#x2013;95.00)</td>
</tr>
<tr>
<td>X16</td>
<td colspan="2">Ultimate pile capacity Qu (kN) (31.00&#x2013;650.00)</td>
</tr>
<tr>
<td>X17</td>
<td colspan="2">Stroke (m) (3.36&#x2013;11.35)</td>
</tr>
<tr>
<td>MCS</td>
<td colspan="2" align="center">Maximum compressive stress (3.18&#x2013;61.23)</td>
<td rowspan="3">Output</td>
</tr>
<tr>
<td>MTS</td>
<td colspan="2" align="center">Maximum tensile stress (0&#x2013;31.77)</td>
</tr>
<tr>
<td>BPF</td>
<td colspan="2" align="center">Blow per foot (2.30&#x2013;299.80)</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>Correlation analysis between influencing factors</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-5.tif"/>
</fig>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Performance Assessment</title>
<p>If a model&#x2019;s training and test performance cannot be quantitatively evaluated, it is difficult to judge the quality of the model [<xref ref-type="bibr" rid="ref-52">52</xref>,<xref ref-type="bibr" rid="ref-53">53</xref>]. In general, the accuracy of a regression model is quantitatively evaluated with indicators such as the root mean square error (RMSE) and the coefficient of determination (R<sup>2</sup>); in theory, a model with an RMSE of 0 and an R<sup>2</sup> of 1 is perfect. The RMSE is the square root of the mean of the squared deviations between the predicted and observed values over the <italic>N</italic> observations [<xref ref-type="bibr" rid="ref-54">54</xref>,<xref ref-type="bibr" rid="ref-55">55</xref>]; it measures the deviation between predicted and true values, so the smaller the RMSE, the better the predictive ability of the model. R<sup>2</sup> is often used as a measure of the predictability of the model: it represents the squared correlation between the predicted and actual values of the target variable. A model with an R<sup>2</sup> of 0 is completely uninformative about the target variable, while a model with an R<sup>2</sup> of 1 predicts it perfectly. The relevant equations for RMSE and R<sup>2</sup> are as follows [<xref ref-type="bibr" rid="ref-56">56</xref>,<xref ref-type="bibr" rid="ref-57">57</xref>]:
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:mi>R</mml:mi><mml:mi>M</mml:mi><mml:mi>S</mml:mi><mml:mi>E</mml:mi><mml:mo>=</mml:mo><mml:msqrt><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:munderover><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:msqrt></mml:math></disp-formula></p>
<p><disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:msup><mml:mi>R</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:mfrac><mml:mrow><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mrow><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x2212;</mml:mo><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mfrac></mml:math></disp-formula>where <italic>y<sub>i</sub></italic> represents the observed value, <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:math></inline-formula> is the value predicted by the model, <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo accent="false">&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> represents the average of the observed values, and <italic>N</italic> denotes the number of samples in the training or testing stage.</p>
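As a concrete illustration of Eqs. (6) and (7), the two indicators can be computed in a few lines of NumPy (a sketch for illustration; the study does not specify its own implementation):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error, Eq. (6)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination, Eq. (7)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)

y_obs = [3.0, 5.0, 7.0, 9.0]   # illustrative observed values
y_hat = [2.8, 5.1, 7.3, 8.9]   # illustrative predictions
print(rmse(y_obs, y_hat))  # ≈ 0.194
print(r2(y_obs, y_hat))    # ≈ 0.993
```

A perfect model gives RMSE = 0 and R² = 1, while a model that always predicts the mean of the observations gives R² = 0, matching the interpretation in the text.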
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Models Development</title>
<p>In this study, various machine learning methods were employed to predict the drivability of piles. Following a performance comparison, several optimization techniques were applied to fine-tune the parameters of the RF model. The main modelling and optimization steps can be summarized as follows:
<list list-type="simple">
<list-item><label>1.</label><p>Data preprocessing: Before model construction, relevant input parameters were selected through correlation assessment. For predicting different pile drivability parameters, the output variables underwent a logarithmic transformation, and missing values were removed. The constructed dataset comprised 4072 samples, with 80% allocated to the training set for model training and the remaining 20% assigned to the test set for model validation.</p></list-item>
<list-item><label>2.</label><p>Model construction: Initially, models were built without optimization, using default hyperparameters for RF, SVR, KNN, XGBoost, and DT. RMSE and the coefficient R&#x00B2; were selected as the performance evaluation metrics. Models were trained on the training set and evaluated on the test set, with performance metrics recorded for each model in their initial state.</p></list-item>
<list-item><label>3.</label><p>Hyperparameter optimization: BO and PSO methods were employed to optimize three hyperparameters of the RF model: the maximum number of features (max_feature), the number of estimators (n_estimators), and the minimum number of samples required to split an internal node (min_samples_split). The distribution of each hyperparameter during the optimization process was illustrated for the three prediction targets (MCS, MTS, and BPF).</p></list-item>
<list-item><label>4.</label><p>Model comparison: Using the optimized hyperparameter configurations, the models were retrained and tested. The optimized models were then compared with the initial models to analyze their performance, discuss their generalization capabilities, and evaluate their fit to the data.</p></list-item>
</list></p>
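The preprocessing in step 1 (missing-value removal, logarithmic transformation of the output, 80/20 split) can be sketched as follows; the synthetic arrays and fixed seed are illustrative assumptions standing in for the real 4072-sample dataset:

```python
import numpy as np

rng = np.random.default_rng(42)  # illustrative seed

# Toy stand-in for the dataset: 17 input variables, 1 output.
X = rng.uniform(0.0, 1.0, size=(4072, 17))
y = rng.uniform(2.3, 299.8, size=4072)           # e.g., the BPF range
y[rng.choice(4072, 10, replace=False)] = np.nan  # inject some missing labels

# Remove samples with missing output values.
mask = ~np.isnan(y)
X, y = X[mask], y[mask]

# Logarithmic transformation of the output variable.
y = np.log(y)

# 80/20 train/test split by random permutation.
idx = rng.permutation(len(y))
split = int(0.8 * len(y))
X_train, y_train = X[idx[:split]], y[idx[:split]]
X_test, y_test = X[idx[split:]], y[idx[split:]]
print(X_train.shape, X_test.shape)
```

In practice a utility such as scikit-learn's `train_test_split` would typically be used for the final step; the explicit permutation above only makes the 80/20 logic visible.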
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Results and Discussion</title>
<p>In this section, the performance of the initial unoptimized models, including RF, SVR, KNN, XGBoost, and DT, is first evaluated to identify the best-performing model. Subsequently, BO and PSO are applied to fine-tune the hyperparameters of the selected optimal model. Finally, the performance of the optimized model is compared with the other models to determine the most effective model for predicting MCS, MTS, and BPF.</p>
<sec id="s4_1">
<label>4.1</label>
<title>Predictive Performance of the Initial Model</title>
<p><xref ref-type="table" rid="table-2">Tables 2</xref>&#x2013;<xref ref-type="table" rid="table-4">4</xref> present the evaluation performance metrics, including R&#x00B2; and RMSE, for the constructed initial models predicting MCS, MTS, and BPF, respectively. Each model&#x2019;s performance on the training and test sets was evaluated using a scoring system to identify the optimal model.</p>
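The scoring system is not spelled out in the text; one scheme consistent with the per-metric scores in Tables 2&#x2013;4 ranks the five models on each metric, awards 5 points to the best and 1 to the worst, and sums over the four metric/stage combinations. A hedged sketch using the MCS test-set metrics from Table 2:

```python
def rank_scores(values, higher_is_better):
    """Award k points to the best of k values and 1 to the worst (ties not handled)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=higher_is_better)
    scores = [0] * len(values)
    for points, i in zip(range(len(values), 0, -1), order):
        scores[i] = points
    return scores

# Test-set metrics for MCS from Table 2 (RF, KNN, XGBoost, SVR, DT).
models = ["RF", "KNN", "XGBoost", "SVR", "DT"]
test_rmse = [0.066, 0.131, 0.059, 0.072, 0.102]
test_r2 = [0.926, 0.709, 0.941, 0.912, 0.825]

rmse_pts = rank_scores(test_rmse, higher_is_better=False)  # lower RMSE is better
r2_pts = rank_scores(test_r2, higher_is_better=True)       # higher R² is better
for m, a, b in zip(models, rmse_pts, r2_pts):
    print(m, a, b)
```

Running this reproduces the testing-stage scores in Table 2 (e.g., XGBoost 5/5, RF 4/4, KNN 1/1); the training-stage scores are computed the same way and added in.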
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Model performance evaluation and ranking for MCS</title>
</caption>
<table frame="hsides">
<colgroup>
<col/>
<col/>
<col/>
<col align="left"/>
<col/>
<col/>
<col/>
<col align="left"/>
<col/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Model (MCS)</th>
<th colspan="4" align="center">Training</th>
<th colspan="4" align="center">Testing</th>
<th></th>
</tr>
<tr>
<th/>
<th>RMSE</th>
<th><inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula></th>
<th>RMSE score</th>
<th><inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> score</th>
<th>RMSE</th>
<th><inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula></th>
<th>RMSE<break/>score</th>
<th><inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> score</th>
<th>Total score</th>
</tr>
</thead>
<tbody>
<tr>
<td>RF</td>
<td>0.027</td>
<td>0.986</td>
<td>4</td>
<td>4</td>
<td>0.066</td>
<td>0.926</td>
<td>4</td>
<td>4</td>
<td>16</td>
</tr>
<tr>
<td>KNN</td>
<td>0.105</td>
<td>0.801</td>
<td>1</td>
<td>1</td>
<td>0.131</td>
<td>0.709</td>
<td>1</td>
<td>1</td>
<td>4</td>
</tr>
<tr>
<td>XGBoost</td>
<td>0.06</td>
<td>0.934</td>
<td>3</td>
<td>3</td>
<td>0.059</td>
<td>0.941</td>
<td>5</td>
<td>5</td>
<td>16</td>
</tr>
<tr>
<td>SVR</td>
<td>0.081</td>
<td>0.881</td>
<td>2</td>
<td>2</td>
<td>0.072</td>
<td>0.912</td>
<td>3</td>
<td>3</td>
<td>10</td>
</tr>
<tr>
<td>DT</td>
<td>0.001</td>
<td>1.0</td>
<td>5</td>
<td>5</td>
<td>0.102</td>
<td>0.825</td>
<td>2</td>
<td>2</td>
<td>14</td>
</tr>
</tbody>
</table>
</table-wrap><table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Model performance evaluation and ranking for MTS</title>
</caption>
<table frame="hsides">
<colgroup>
<col/>
<col/>
<col/>
<col align="left"/>
<col/>
<col/>
<col/>
<col align="left"/>
<col/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Model (MTS)</th>
<th colspan="4" align="center">Training</th>
<th colspan="5" align="center">Testing</th>
</tr>
<tr>
<th/>
<th>RMSE</th>
<th><inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula></th>
<th>RMSE score</th>
<th><inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> score</th>
<th>RMSE</th>
<th><inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula></th>
<th>RMSE score</th>
<th><inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> score</th>
<th>Total score</th>
</tr>
</thead>
<tbody>
<tr>
<td>RF</td>
<td>0.245</td>
<td>0.964</td>
<td>4</td>
<td>4</td>
<td>0.493</td>
<td>0.853</td>
<td>5</td>
<td>5</td>
<td>18</td>
</tr>
<tr>
<td>KNN</td>
<td>0.643</td>
<td>0.751</td>
<td>3</td>
<td>3</td>
<td>0.742</td>
<td>0.667</td>
<td>2</td>
<td>2</td>
<td>10</td>
</tr>
<tr>
<td>XGBoost</td>
<td>0.674</td>
<td>0.726</td>
<td>2</td>
<td>2</td>
<td>0.584</td>
<td>0.794</td>
<td>4</td>
<td>4</td>
<td>12</td>
</tr>
<tr>
<td>SVR</td>
<td>0.764</td>
<td>0.648</td>
<td>1</td>
<td>1</td>
<td>0.747</td>
<td>0.663</td>
<td>1</td>
<td>1</td>
<td>4</td>
</tr>
<tr>
<td>DT</td>
<td>0.007</td>
<td>1.0</td>
<td>5</td>
<td>5</td>
<td>0.615</td>
<td>0.771</td>
<td>3</td>
<td>3</td>
<td>16</td>
</tr>
</tbody>
</table>
</table-wrap><table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Model performance evaluation and ranking for BPF</title>
</caption>
<table frame="hsides">
<colgroup>
<col/>
<col/>
<col/>
<col align="left"/>
<col/>
<col/>
<col/>
<col align="left"/>
<col/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Model (BPF)</th>
<th colspan="4" align="center">Training</th>
<th colspan="4" align="center">Testing</th>
<th></th>
</tr>
<tr>
<th/>
<th>RMSE</th>
<th><inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula></th>
<th>RMSE score</th>
<th><inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> score</th>
<th>RMSE</th>
<th><inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula></th>
<th>RMSE score</th>
<th><inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:msup><mml:mrow><mml:mtext>R</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> score</th>
<th>Total score</th>
</tr>
</thead>
<tbody>
<tr>
<td>RF</td>
<td>0.116</td>
<td>0.986</td>
<td>4</td>
<td>4</td>
<td>0.184</td>
<td>0.969</td>
<td>5</td>
<td>5</td>
<td>18</td>
</tr>
<tr>
<td>KNN</td>
<td>0.235</td>
<td>0.942</td>
<td>2</td>
<td>2</td>
<td>0.294</td>
<td>0.907</td>
<td>1</td>
<td>1</td>
<td>6</td>
</tr>
<tr>
<td>XGBoost</td>
<td>0.225</td>
<td>0.947</td>
<td>3</td>
<td>3</td>
<td>0.21</td>
<td>0.953</td>
<td>4</td>
<td>4</td>
<td>14</td>
</tr>
<tr>
<td>SVR</td>
<td>0.297</td>
<td>0.908</td>
<td>1</td>
<td>1</td>
<td>0.287</td>
<td>0.911</td>
<td>2</td>
<td>2</td>
<td>6</td>
</tr>
<tr>
<td>DT</td>
<td>0.002</td>
<td>1.0</td>
<td>5</td>
<td>5</td>
<td>0.215</td>
<td>0.95</td>
<td>3</td>
<td>3</td>
<td>16</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In predicting MCS, despite the DT model performing perfectly on the training set, its performance declined sharply on the test set (R&#x00B2; &#x003D; 0.825, RMSE &#x003D; 0.102), indicating severe overfitting. The KNN model also performed poorly on the test set in terms of R&#x00B2; and RMSE, suggesting that it is ineffective at handling such high-dimensional data. The XGBoost model achieved a relatively high R&#x00B2; (0.941) on the test set and matched the RF model&#x2019;s total score of 16 points. The RF model, with an R&#x00B2; of 0.926 and an RMSE of 0.066 on the test set and more consistent performance across the training and test sets, nevertheless demonstrated outstanding overall performance.</p>
<p>In predicting MTS, the RF model again demonstrated its stability and efficiency. On the test set, the RF model achieved an R&#x00B2; of 0.853 and an RMSE of 0.493, with a total score of 18 points. Although the DT model showed excellent performance on the training set (R&#x00B2; &#x003D; 1.0, RMSE &#x003D; 0.007), its R&#x00B2; dropped to 0.771, and RMSE increased to 0.615 on the test set, indicating overfitting. Both the KNN and SVR models performed poorly on the test set, further highlighting the advantages of the RF model.</p>
<p>In predicting BPF, while the DT model showed the best R&#x00B2; and RMSE on the training set, its performance on the test set was inferior to that of the RF model. The RF model achieved an R&#x00B2; of 0.969 and an RMSE of 0.184 on the test set, with the highest total score of 18 points. The XGBoost model also performed well in predicting BPF, with an R&#x00B2; of 0.953 and an RMSE of 0.210 on the test set, but its overall score remained lower than that of the RF model.</p>
<p>In summary, the RF model demonstrated the best performance in predicting MCS, MTS, and BPF. Therefore, we will focus on hyperparameter optimization of the RF model to further enhance its predictive performance.</p>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Determination of Hyper-Parameters of Model</title>
<p>To develop a better model, BO, PSO, and random search methods were used to optimize the hyperparameters of the RF model. The optimization was conducted on the training set, and the average RMSE over 5-fold cross-validation was used as the objective to obtain the optimal parameter combination. The optimization processes of the different methods were compared using kernel density estimation.</p>
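The 5-fold cross-validated average RMSE that serves as the optimization objective can be sketched as below; an ordinary least-squares fit stands in for the RF model, since the study's own implementation is not shown:

```python
import numpy as np

def cv_rmse(X, y, fit, predict, k=5, seed=0):
    """Average RMSE over k cross-validation folds (the optimization objective)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errs = []
    for i, val in enumerate(folds):
        trn = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(X[trn], y[trn])          # train on k-1 folds
        err = predict(model, X[val]) - y[val]  # validate on the held-out fold
        errs.append(np.sqrt(np.mean(err ** 2)))
    return float(np.mean(errs))

# Ordinary least squares stands in for the learner being tuned.
fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
predict = lambda w, X: X @ w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
score = cv_rmse(X, y, fit, predict)
print(score)  # close to the 0.1 noise level
```

An optimizer (BO, PSO, or random search) would call `cv_rmse` with candidate hyperparameter settings and keep the combination with the lowest value.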
<p>Kernel density estimation is a non-parametric method from probability theory for estimating an unknown density function; intuitively, it is a smoothed histogram. A kernel density plot shows the distribution characteristics of the data sample directly and characterizes the density of the data at any position. The ordinate is the estimated kernel density, which indicates how likely values near a given point on the <italic>x</italic>-axis are.</p>
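A Gaussian kernel density estimate of this kind is simply an average of one smooth bump per sample; a minimal sketch (the bandwidth rule below, Scott's rule, is an assumed common default, not necessarily the one used for Fig. 6):

```python
import numpy as np

def kde(samples, grid, bandwidth=None):
    """Gaussian kernel density estimate: a smoothed histogram of the samples."""
    samples = np.asarray(samples, float)
    if bandwidth is None:
        # Scott's rule of thumb (an assumed, commonly used default).
        bandwidth = samples.std(ddof=1) * len(samples) ** (-1.0 / 5.0)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    # Average of one Gaussian bump per sample.
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (len(samples) * bandwidth * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(0)
samples = rng.normal(loc=5.0, scale=1.0, size=500)
grid = np.linspace(0.0, 10.0, 401)
density = kde(samples, grid)
area = density.sum() * (grid[1] - grid[0])  # Riemann-sum check: ~1
peak = grid[np.argmax(density)]             # mode estimate: near 5
print(area, peak)
```

The location of the density peak is what identifies the "most likely" hyperparameter values read off the kernel density maps.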
<p><xref ref-type="fig" rid="fig-6">Fig. 6</xref> shows the kernel density distributions of each hyper-parameter selected during the optimization process for the three predicted targets, where &#x201C;tpe&#x201D; stands for &#x201C;Tree-structured Parzen Estimator,&#x201D; a BO method that guides the search for the best parameters by modelling the distributions of good and bad hyper-parameters. When the predicted label is MCS, the most likely value of max_feature during parameter optimization lies between 0.5 and 0.75, where max_feature is the fraction of features the RF considers. The most likely value of n_estimators lies between 75 and 100, where n_estimators is the number of trees in the RF. The most likely value of min_samples_split is around 5, where min_samples_split restricts further splitting of a subtree: a node containing fewer samples than this value is not divided further. When the predicted label is MTS, the most likely value of max_feature lies between 0.8 and 1.0, that of n_estimators is around 100, and that of min_samples_split lies between 0 and 5. When the predicted label is BPF, the most likely value of max_feature is about 0.8, that of n_estimators is again about 100, and that of min_samples_split lies between 0 and 5.</p>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>Hyper-parameter kernel density maps during optimization: (a) MCS data, (b) MTS data, and (c) BPF data</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-6a.tif"/><graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-6b.tif"/>
</fig>
<p>It can be observed that the BO method demonstrates distinct peaks in the selection of hyper-parameters, whereas the density distributions for the random search and PSO methods are more uniform. Specifically, the BO method exhibits a stronger tendency toward certain values in the three hyper-parameters, indicating its higher effectiveness in optimizing these parameters. Additionally, the average optimization times for the BO, PSO, and random search methods are 453, 211, and 319 s, respectively. Considering both optimization effectiveness and time cost, the BO and PSO methods have been chosen as the primary optimization techniques for the subsequent model refinement. The relationship between the loss value and each hyper-parameter in the iterative optimization process can be seen in <xref ref-type="fig" rid="fig-7">Fig. 7</xref>.</p>
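The PSO used for comparison here can be illustrated with a generic textbook variant (the inertia weight 0.7 and acceleration coefficients 1.5 below are common defaults, not the authors' settings); in the actual workflow the objective would be the cross-validated RMSE of the RF rather than the toy function used here:

```python
import numpy as np

def pso(f, bounds, n_particles=30, n_iter=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over a box with a basic particle swarm."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, float).T
    x = rng.uniform(lo, hi, size=(n_particles, len(lo)))  # positions
    v = np.zeros_like(x)                                  # velocities
    pbest, pbest_val = x.copy(), np.array([f(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()                  # global best
    for _ in range(n_iter):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        val = np.array([f(p) for p in x])
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], val[improved]
        g = pbest[pbest_val.argmin()].copy()
    return g, float(pbest_val.min())

# Toy objective standing in for the cross-validated RMSE of the RF.
sphere = lambda p: float(np.sum((p - 0.3) ** 2))
best, best_val = pso(sphere, bounds=[(0, 1), (0, 1), (0, 1)])
print(best, best_val)  # near [0.3, 0.3, 0.3]
```

Each particle tracks its personal best; the velocity update pulls particles toward both their personal bests and the swarm's global best, which is what makes the method fast but also prone to the local optima mentioned in Section 4.5.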
<fig id="fig-7">
<label>Figure 7</label>
<caption>
<title>Relationship between parameter range and loss during optimization: (a) MCS data, (b) MTS data, and (c) BPF data</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-7a.tif"/><graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-7b.tif"/>
</fig>
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Comprehensive Analysis between Models</title>
<p>After optimizing the RF model using PSO and BO methods, we conducted a detailed comparative analysis of the performance of all models. <xref ref-type="fig" rid="fig-8">Figs. 8</xref>&#x2013;<xref ref-type="fig" rid="fig-10">10</xref> show the scatter distributions of the training and testing datasets for each regression model with MCS, MTS, and BPF as the predicted parameters, respectively. It can be seen that the optimized RF model outperforms the other models in all three prediction tasks. Whether on testing data or training data, its scatter distribution is very tight, demonstrating excellent predictive accuracy. The small performance discrepancy between the training and testing sets indicates that most models have strong generalization ability without over-fitting or under-fitting. The performance of XGBoost and SVR models is also good but slightly inferior to the optimized RF model in terms of predictive accuracy and consistency. KNN and DT models perform poorly across all three prediction tasks, with their scatter distributions being more dispersed and showing larger prediction errors.</p>
<fig id="fig-8">
<label>Figure 8</label>
<caption>
<title>Comparison of predicted and measured values of each model for MCS: (a) testing data and (b) training data</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-8.tif"/>
</fig><fig id="fig-9">
<label>Figure 9</label>
<caption>
<title>Comparison of predicted and measured values of each model for MTS: (a) testing data and (b) training data</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-9.tif"/>
</fig><fig id="fig-10">
<label>Figure 10</label>
<caption>
<title>Comparison of predicted and measured values of each model for BPF: (a) testing data and (b) training data</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-10.tif"/>
</fig>
<p>Furthermore, recalling the earlier frequency distributions of the output variables, the scatter of each model&#x2019;s predictions is also related to the distribution of the data itself. Because the MCS data are roughly normally distributed, their scatter plots fit well; the MTS data are non-normally distributed, so their scatter is not as tight as that of MCS. The bar chart in <xref ref-type="fig" rid="fig-11">Fig. 11</xref> compares the performance test results of the models more intuitively and clearly. Evidently, across the different sub-sections of the results in <xref ref-type="fig" rid="fig-11">Fig. 11</xref>, the optimized RF model is the optimal predictive technique for MCS, MTS, and BPF.</p>
<fig id="fig-11">
<label>Figure 11</label>
<caption>
<title>Evaluation indicators for each model: (a) MCS data, (b) MTS data, and (c) BPF data</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-11.tif"/>
</fig>
</sec>
<sec id="s4_4">
<label>4.4</label>
<title>Sensitivity and Uncertainty Analysis</title>
<p>Sobol sensitivity analysis is a global sensitivity analysis method used to quantitatively evaluate the influence of model input variables on output variables [<xref ref-type="bibr" rid="ref-58">58</xref>]. This method calculates the contribution of each input variable and its interactions with the model output by decomposing the variance of the output variable. The first-order sensitivity index measures the direct impact of a single input variable on the output variable, while the total sensitivity index measures the total contribution of an input variable to the output variable, including its interaction effects with other variables.</p>
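The first-order index S<sub>i</sub> = Var(E[Y|X<sub>i</sub>])/Var(Y) described above can be estimated by a pick-freeze Monte Carlo scheme; the sketch below uses the Saltelli form of the estimator on a toy linear model whose indices are known analytically (in practice a library such as SALib would typically be used):

```python
import numpy as np

def first_order_sobol(f, d, n=100_000, seed=0):
    """Saltelli-style estimator of first-order Sobol indices of f on [0,1]^d."""
    rng = np.random.default_rng(seed)
    A = rng.random((n, d))  # two independent sample matrices
    B = rng.random((n, d))
    fA, fB = f(A), f(B)
    var = fA.var()
    S = np.empty(d)
    for i in range(d):
        AB = A.copy()
        AB[:, i] = B[:, i]  # replace only column i ("pick-freeze")
        S[i] = np.mean(fB * (f(AB) - fA)) / var
    return S

# Toy model: y = 2*x1 + x2; analytically S1 = 4/5 and S2 = 1/5.
f = lambda X: 2.0 * X[:, 0] + X[:, 1]
S = first_order_sobol(f, d=2)
print(S)  # ≈ [0.8, 0.2]
```

Total-order indices are estimated the same way from an extra term, which is how the gap between total and first-order indices in Fig. 12b reveals interaction effects.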
<p>In this study, Sobol sensitivity analysis was used to evaluate the optimized models, and <xref ref-type="fig" rid="fig-12">Fig. 12</xref> shows the influence of each input variable on the prediction outputs for MCS, MTS, and BPF. In <xref ref-type="fig" rid="fig-12">Fig. 12a</xref>, X16 and X17 exhibit significantly higher sensitivity indices, indicating that these two variables have the most substantial impact on MCS prediction. In <xref ref-type="fig" rid="fig-12">Fig. 12b</xref>, the total sensitivity index is significantly higher than the first-order sensitivity index, indicating that MTS is influenced by interactions between the variables in the model; among them, X13 and X17 are the most influential variables. In <xref ref-type="fig" rid="fig-12">Fig. 12c</xref>, the sensitivity index of X16 is close to 1 while those of the other variables are relatively low, indicating that X16 dominates the influence on BPF. Overall, in the predictions of MCS, MTS, and BPF, X16 and X17, representing the ultimate pile capacity and the stroke, respectively, are the primary influencing variables, through both their direct effects and their interactions with other variables.</p>
<fig id="fig-12">
<label>Figure 12</label>
<caption>
<title>Sobol sensitivity analysis results for each model: (a) MCS, (b) MTS, and (c) BPF</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-12.tif"/>
</fig>
<p>Monte Carlo (MC) simulation is a numerical technique that uses random sampling for uncertainty analysis and risk assessment. By conducting numerous random samples and repeated experiments, MC simulation can generate distributions of variables, thereby quantifying the uncertainty in model predictions [<xref ref-type="bibr" rid="ref-59">59</xref>]. In this study, an optimized RF model was used for MC simulation to explain the model&#x2019;s uncertainty. A total of 1000 simulations were conducted to ensure the robustness and reliability of the results.</p>
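The MC uncertainty analysis described above can be sketched as follows; the perturbation scheme (resampling the inputs with small Gaussian noise) and the linear surrogate standing in for the trained RF are illustrative assumptions, not the authors' exact procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear surrogate standing in for the trained RF model.
weights = rng.uniform(-1.0, 1.0, size=17)
predict = lambda X: X @ weights

# Baseline inputs (toy stand-in for the test set: 200 samples, 17 features).
X_test = rng.uniform(0.0, 1.0, size=(200, 17))

# 1000 MC replications: perturb the inputs with small Gaussian noise,
# rerun the model, and collect the prediction distribution.
n_sim = 1000
sims = np.empty((n_sim, len(X_test)))
for k in range(n_sim):
    noise = rng.normal(0.0, 0.05, size=X_test.shape)
    sims[k] = predict(X_test + noise)

# Compare central tendency and spread with the unperturbed predictions.
base = predict(X_test)
lo, med, hi = np.percentile(sims, [2.5, 50.0, 97.5], axis=0)
print(float(np.mean(np.abs(med - base))))  # medians track the baseline
print(float(np.mean(hi - lo)))             # typical 95% interval width
```

If the simulated medians track the model's own predictions and the interval widths stay stable across samples, the comparison supports the stability claim made for Fig. 13.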
<p><xref ref-type="fig" rid="fig-13">Fig. 13</xref> compares the outputs for MCS, MTS, and BPF between the MC simulation and the optimized RF model. In the figure, green represents the results generated by the MC simulation and orange the results produced by the optimized RF model. The distribution shapes of the optimized RF model&#x2019;s predictions are very similar to those of the MC simulation results, indicating that the model&#x2019;s behaviour under different input conditions aligns with the expected distribution and enhancing confidence in its predictive reliability. Specifically, the median and interquartile range of the optimized RF model&#x2019;s predictions are close to those of the MC simulation results, suggesting that the two sets of predictions share similar central tendencies and degrees of dispersion. The similarity in the peak positions, widths, and shapes of the two results indicates that the model captures the main characteristics of the input data and generates stable predictions, demonstrating good stability and reliability.</p>
<fig id="fig-13">
<label>Figure 13</label>
<caption>
<title>Distribution of prediction results before and after MC simulation, (a, c and e): training data; (b, d and f): testing data</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_52830-fig-13.tif"/>
</fig>
</sec>
<sec id="s4_5">
<label>4.5</label>
<title>Limitations and Future Research</title>
<p>Accurate prediction of pile drivability can effectively optimize pile design, achieving construction that is both safe and economical. In this study, we utilized BO and PSO to perform hyperparameter tuning on the RF model to evaluate and predict pile drivability. However, there are several limitations that need to be addressed in future research:
<list list-type="simple">
<list-item><label>(1)</label><p>The optimized RF model significantly improves prediction accuracy and stability. However, as the dimensionality of the optimization search space increases, the computational complexity and resource consumption of BO increase significantly, while PSO tends to get trapped in local optima and is sensitive to parameter settings. Therefore, in practical applications, the specific characteristics of the problem should be considered comprehensively when selecting the most suitable optimization strategy, balancing the pros and cons of each method.</p></list-item>
<list-item><label>(2)</label><p>Future research can incorporate data from various engineering contexts, including different geological regions, soil types, and construction conditions, to provide a broader validation and prediction scope for the model. By introducing more diverse datasets, the predictive capability of the model will become more generalized, offering stronger support for pile drivability predictions under complex geological conditions.</p></list-item>
<list-item><label>(3)</label><p>To further enhance the performance and applicability of the model, future research can attempt to hybridize different optimization algorithms. By combining the strengths of various algorithms, the limitations of a single algorithm can be overcome. Additionally, new optimization algorithms, such as those based on deep learning or adaptive optimization, should be explored and developed to improve the model&#x2019;s performance in high-dimensional spaces.</p></list-item>
</list></p>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Conclusions</title>
<p>This study applied the RF machine learning model optimized by BO and PSO to predict the MCS, MTS, and BPF of piles under various relevant factors. The established dataset contains 4072 samples, of which 80% were used to train the models and the remaining 20% to test them. The performance of the established RF was then compared with that of the KNN, SVR, XGBoost, and DT models, using RMSE and R<sup>2</sup>, two of the most popular performance indices for predictive models. The predictive performance of the optimized RF was found to be higher than that of the other implemented techniques. The results also showed that the RF model can still be trained effectively when the sample feature dimension is high; it exhibits small generalization errors and strong generalization ability for such problems and can accurately reflect the complex relationship between piling and the related parameters. In the test results, when predicting MCS, the RMSE and R<sup>2</sup> of the optimal RF-PSO model are 0.044 and 0.966, respectively; when predicting MTS, the RMSE and R<sup>2</sup> of the optimal RF-Baye model are 0.438 and 0.884, respectively; and when predicting BPF, the RMSE and R<sup>2</sup> of the optimal RF-Baye model are 0.146 and 0.977, respectively. These results show that the RF model optimized with PSO and BO is an effective method for solving such complex engineering problems and can provide a reference for similar engineering problems in the future.</p>
</sec>
</body>
<back>
<ack><p>The authors thank the National Natural Science Foundation of China for its support.</p>
</ack>
<sec><title>Funding Statement</title>
<p>This research was supported by the National Natural Science Foundation of China (Grant No. 42107183).</p>
</sec>
<sec><title>Author Contributions</title>
<p>The authors confirm contribution to the paper as follows: study conception and design: Shengdong Cheng; data collection: Juncheng Gao; analysis and interpretation of results: Shengdong Cheng, Juncheng Gao, Hongning Qi; draft manuscript preparation: Shengdong Cheng, Hongning Qi. All authors reviewed the results and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability"><title>Availability of Data and Materials</title>
<p>All data generated or analyzed during this study are included in this published article [<xref ref-type="bibr" rid="ref-51">51</xref>].</p>
</sec>
<sec sec-type="COI-statement"><title>Conflicts of Interest</title>
<p>The authors declare that they have no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>1.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Xu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Dai</surname> <given-names>G</given-names></string-name>, <string-name><surname>Gong</surname> <given-names>W</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Haque</surname> <given-names>A</given-names></string-name>, <string-name><surname>Gamage</surname> <given-names>RP</given-names></string-name></person-group>. <article-title>A review of research on the shaft resistance of rock-socketed piles</article-title>. <source>Acta Geotech</source>. <year>2021</year>;<volume>16</volume>(<issue>3</issue>):<fpage>653</fpage>&#x2013;<lpage>77</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s11440-020-01051-2</pub-id>.</mixed-citation></ref>
<ref id="ref-2"><label>2.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Mohanty</surname> <given-names>R</given-names></string-name>, <string-name><surname>Suman</surname> <given-names>S</given-names></string-name>, <string-name><surname>Das</surname> <given-names>SK</given-names></string-name></person-group>. <article-title>Prediction of vertical pile capacity of driven pile in cohesionless soil using artificial intelligence techniques</article-title>. <source>Int J Geotech Eng</source>. <year>2018</year>;<volume>12</volume>(<issue>2</issue>):<fpage>209</fpage>&#x2013;<lpage>16</lpage>. doi:<pub-id pub-id-type="doi">10.1080/19386362.2016.1269043</pub-id>.</mixed-citation></ref>
<ref id="ref-3"><label>3.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Suman</surname> <given-names>S</given-names></string-name>, <string-name><surname>Das</surname> <given-names>SK</given-names></string-name>, <string-name><surname>Mohanty</surname> <given-names>R</given-names></string-name></person-group>. <article-title>Prediction of friction capacity of driven piles in clay using artificial intelligence techniques</article-title>. <source>Int J Geotech Eng</source>. <year>2016</year>;<volume>10</volume>(<issue>5</issue>):<fpage>469</fpage>&#x2013;<lpage>75</lpage>. doi:<pub-id pub-id-type="doi">10.1080/19386362.2016.1169009</pub-id>.</mixed-citation></ref>
<ref id="ref-4"><label>4.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Harandizadeh</surname> <given-names>H</given-names></string-name>, <string-name><surname>Jahed Armaghani</surname> <given-names>D</given-names></string-name>, <string-name><surname>Khari</surname> <given-names>M</given-names></string-name></person-group>. <article-title>A new development of ANFIS-GMDH optimized by PSO to predict pile bearing capacity based on experimental datasets</article-title>. <source>Eng Comput</source>. <year>2021</year>;<volume>37</volume>(<issue>1</issue>):<fpage>685</fpage>&#x2013;<lpage>700</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s00366-019-00849-3</pub-id>.</mixed-citation></ref>
<ref id="ref-5"><label>5.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Isaacs</surname> <given-names>DV</given-names></string-name></person-group>. <article-title>Reinforced concrete pile formulae</article-title>. <source>J Instit Eng Australia</source>. <year>1931</year>;<volume>3</volume>(<issue>9</issue>):<fpage>305</fpage>&#x2013;<lpage>23</lpage>.</mixed-citation></ref>
<ref id="ref-6"><label>6.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Smith</surname> <given-names>EAL</given-names></string-name></person-group>. <article-title>Pile-driving analysis by the wave equation</article-title>. <source>J Soil Mech Found Div</source>. <year>1960</year>;<volume>86</volume>(<issue>4</issue>):<fpage>35</fpage>&#x2013;<lpage>61</lpage>.</mixed-citation></ref>
<ref id="ref-7"><label>7.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Nath</surname> <given-names>B</given-names></string-name></person-group>. <article-title>A continuum method of pile driving analysis: comparison with the wave equation method</article-title>. <source>Comput Geotech</source>. <year>1990</year>;<volume>10</volume>(<issue>4</issue>):<fpage>265</fpage>&#x2013;<lpage>85</lpage>.</mixed-citation></ref>
<ref id="ref-8"><label>8.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>W</given-names></string-name>, <string-name><surname>Goh</surname> <given-names>AT</given-names></string-name></person-group>. <article-title>Multivariate adaptive regression splines and neural network models for prediction of pile drivability</article-title>. <source>Geosci Front</source>. <year>2016</year>;<volume>7</volume>(<issue>1</issue>):<fpage>45</fpage>&#x2013;<lpage>52</lpage>.</mixed-citation></ref>
<ref id="ref-9"><label>9.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>W</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>C</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Samui</surname> <given-names>P</given-names></string-name></person-group>. <article-title>Assessment of pile drivability using random forest regression and multivariate adaptive regression splines</article-title>. <source>Georisk: Assess Manag Risk Eng Syst Geohazards</source>. <year>2019</year>;<volume>15</volume>(<issue>1</issue>):<fpage>27</fpage>&#x2013;<lpage>40</lpage>.</mixed-citation></ref>
<ref id="ref-10"><label>10.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Heidarie Golafzani</surname> <given-names>S</given-names></string-name>, <string-name><surname>Eslami</surname> <given-names>A</given-names></string-name>, <string-name><surname>Jamshidi Chenari</surname> <given-names>R</given-names></string-name></person-group>. <article-title>Probabilistic assessment of model uncertainty for prediction of pile foundation bearing capacity; static analysis, SPT and CPT-based methods</article-title>. <source>Geotech Geol Eng</source>. <year>2020</year>;<volume>38</volume>(<issue>5</issue>):<fpage>5023</fpage>&#x2013;<lpage>41</lpage>.</mixed-citation></ref>
<ref id="ref-11"><label>11.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Chen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Yong</surname> <given-names>W</given-names></string-name>, <string-name><surname>Li</surname> <given-names>C</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Predicting the thickness of an excavation damaged zone around the roadway using the DA-RF hybrid model</article-title>. <source>Comput Model Eng &#x0026; Sci</source>. <year>2023</year>;<volume>136</volume>(<issue>3</issue>):<fpage>2507</fpage>&#x2013;<lpage>26</lpage>. doi:<pub-id pub-id-type="doi">10.32604/cmes.2023.025714</pub-id>.</mixed-citation></ref>
<ref id="ref-12"><label>12.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>W</given-names></string-name>, <string-name><surname>Gu</surname> <given-names>X</given-names></string-name>, <string-name><surname>Hong</surname> <given-names>L</given-names></string-name>, <string-name><surname>Han</surname> <given-names>L</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>L</given-names></string-name></person-group>. <article-title>Comprehensive review of machine learning in geotechnical reliability analysis: algorithms, applications and further challenges</article-title>. <source>Appl Soft Comput</source>. <year>2023</year>;<volume>136</volume>(<issue>1</issue>):<fpage>110066</fpage>.</mixed-citation></ref>
<ref id="ref-13"><label>13.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Qiu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Short-term rockburst prediction in underground project: insights from an explainable and interpretable ensemble learning model</article-title>. <source>Acta Geotech</source>. <year>2023</year>;<volume>18</volume>(<issue>12</issue>):<fpage>6655</fpage>&#x2013;<lpage>85</lpage>.</mixed-citation></ref>
<ref id="ref-14"><label>14.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Goh</surname> <given-names>AT</given-names></string-name>, <string-name><surname>Goh</surname> <given-names>SH</given-names></string-name></person-group>. <article-title>Support vector machines: their use in geotechnical engineering as illustrated using seismic liquefaction data</article-title>. <source>Comput Geotech</source>. <year>2007</year>;<volume>34</volume>(<issue>5</issue>):<fpage>410</fpage>&#x2013;<lpage>21</lpage>.</mixed-citation></ref>
<ref id="ref-15"><label>15.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Asteris</surname> <given-names>PG</given-names></string-name>, <string-name><surname>Plevris</surname> <given-names>V</given-names></string-name></person-group>. <article-title>Anisotropic masonry failure criterion using artificial neural networks</article-title>. <source>Neural Comput Appl</source>. <year>2017</year>;<volume>28</volume>(<issue>8</issue>):<fpage>2207</fpage>&#x2013;<lpage>29</lpage>.</mixed-citation></ref>
<ref id="ref-16"><label>16.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Asteris</surname> <given-names>PG</given-names></string-name>, <string-name><surname>Kolovos</surname> <given-names>KG</given-names></string-name>, <string-name><surname>Douvika</surname> <given-names>MG</given-names></string-name>, <string-name><surname>Roinos</surname> <given-names>K</given-names></string-name></person-group>. <article-title>Prediction of self-compacting concrete strength using artificial neural networks</article-title>. <source>Eur J Environ Civil Eng</source>. <year>2016</year>;<volume>20</volume>(<issue>sup1</issue>):<fpage>s102</fpage>&#x2013;<lpage>22</lpage>. doi:<pub-id pub-id-type="doi">10.1080/19648189.2016.1246693</pub-id>.</mixed-citation></ref>
<ref id="ref-17"><label>17.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Asteris</surname> <given-names>PG</given-names></string-name>, <string-name><surname>Nikoo</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Artificial bee colony-based neural network for the prediction of the fundamental period of infilled frame structures</article-title>. <source>Neural Comput Appl</source>. <year>2019</year>;<volume>31</volume>(<issue>9</issue>):<fpage>4837</fpage>&#x2013;<lpage>47</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s00521-018-03965-1</pub-id>.</mixed-citation></ref>
<ref id="ref-18"><label>18.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Luo</surname> <given-names>J</given-names></string-name>, <string-name><surname>Ren</surname> <given-names>R</given-names></string-name>, <string-name><surname>Guo</surname> <given-names>K</given-names></string-name></person-group>. <article-title>The deformation monitoring of foundation pit by back propagation neural network and genetic algorithm and its application in geotechnical engineering</article-title>. <source>PLoS One</source>. <year>2020</year>;<volume>15</volume>(<issue>7</issue>):<fpage>e0233398</fpage>. doi:<pub-id pub-id-type="doi">10.1371/journal.pone.0233398</pub-id>; <pub-id pub-id-type="pmid">32609717</pub-id></mixed-citation></ref>
<ref id="ref-19"><label>19.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Das</surname> <given-names>SK</given-names></string-name>, <string-name><surname>Basudhar</surname> <given-names>PK</given-names></string-name></person-group>. <article-title>Undrained lateral load capacity of piles in clay using artificial neural network</article-title>. <source>Comput Geotech</source>. <year>2006</year>;<volume>33</volume>(<issue>8</issue>):<fpage>454</fpage>&#x2013;<lpage>9</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.compgeo.2006.08.006</pub-id>.</mixed-citation></ref>
<ref id="ref-20"><label>20.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kordjazi</surname> <given-names>A</given-names></string-name>, <string-name><surname>Nejad</surname> <given-names>FP</given-names></string-name>, <string-name><surname>Jaksa</surname> <given-names>MB</given-names></string-name></person-group>. <article-title>Prediction of ultimate axial load-carrying capacity of piles using a support vector machine based on CPT data</article-title>. <source>Comput Geotech</source>. <year>2014</year>;<volume>55</volume>(<issue>1</issue>):<fpage>91</fpage>&#x2013;<lpage>102</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.compgeo.2013.08.001</pub-id>.</mixed-citation></ref>
<ref id="ref-21"><label>21.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>W</given-names></string-name>, <string-name><surname>Goh</surname> <given-names>AT</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Multivariate adaptive regression splines application for multivariate geotechnical problems with big data</article-title>. <source>Geotech Geol Eng</source>. <year>2016</year>;<volume>34</volume>(<issue>1</issue>):<fpage>193</fpage>&#x2013;<lpage>204</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s10706-015-9938-9</pub-id>.</mixed-citation></ref>
<ref id="ref-22"><label>22.</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Cao</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>C</given-names></string-name></person-group>. <article-title>Prediction of ultimate axial load-carrying capacity for driven piles using machine learning methods</article-title>. In: <conf-name>Proceedings of the 2019 IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC)</conf-name>; <year>2019</year>; <publisher-loc>Chengdu, China</publisher-loc>: <publisher-name>IEEE</publisher-name>. p. <fpage>334</fpage>&#x2013;<lpage>340</lpage>.</mixed-citation></ref>
<ref id="ref-23"><label>23.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Huang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Asteris</surname> <given-names>PG</given-names></string-name>, <string-name><surname>Koopialipoor</surname> <given-names>M</given-names></string-name>, <string-name><surname>Armaghani</surname> <given-names>DJ</given-names></string-name>, <string-name><surname>Tahir</surname> <given-names>MM</given-names></string-name></person-group>. <article-title>Invasive weed optimization technique-based ANN to the prediction of rock tensile strength</article-title>. <source>Appl Sci</source>. <year>2019</year>;<volume>9</volume>(<issue>24</issue>):<fpage>5372</fpage>. doi:<pub-id pub-id-type="doi">10.3390/app9245372</pub-id>.</mixed-citation></ref>
<ref id="ref-24"><label>24.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Qiu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Short-term rockburst damage assessment in burst-prone mines: an explainable XGBOOST hybrid model with SCSO algorithm</article-title>. <source>Rock Mech Rock Eng</source>. <year>2023</year>;<volume>56</volume>(<issue>12</issue>):<fpage>8745</fpage>&#x2013;<lpage>70</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s00603-023-03522-w</pub-id>.</mixed-citation></ref>
<ref id="ref-25"><label>25.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hajihassani</surname> <given-names>M</given-names></string-name>, <string-name><surname>Abdullah</surname> <given-names>SS</given-names></string-name>, <string-name><surname>Asteris</surname> <given-names>PG</given-names></string-name>, <string-name><surname>Armaghani</surname> <given-names>DJ</given-names></string-name></person-group>. <article-title>A gene expression programming model for predicting tunnel convergence</article-title>. <source>Appl Sci</source>. <year>2019</year>;<volume>9</volume>(<issue>21</issue>):<fpage>4650</fpage>. doi:<pub-id pub-id-type="doi">10.3390/app9214650</pub-id>.</mixed-citation></ref>
<ref id="ref-26"><label>26.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>YL</given-names></string-name>, <string-name><surname>Qin</surname> <given-names>YG</given-names></string-name>, <string-name><surname>Armaghani</surname> <given-names>DJ</given-names></string-name>, <string-name><surname>Monjezi</surname> <given-names>M</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Enhancing rock fragmentation prediction in mining operations: a hybrid GWO-RF model with SHAP interpretability</article-title>. <source>J Cent South Univ</source>. <year>2024</year>;<volume>31</volume>(<issue>6</issue>):<fpage>1</fpage>&#x2013;<lpage>14</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s11771-024-5699-z</pub-id>.</mixed-citation></ref>
<ref id="ref-27"><label>27.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Breiman</surname> <given-names>L</given-names></string-name></person-group>. <article-title>Bagging predictors</article-title>. <source>Mach Learn</source>. <year>1996</year>;<volume>24</volume>(<issue>2</issue>):<fpage>123</fpage>&#x2013;<lpage>40</lpage>.</mixed-citation></ref>
<ref id="ref-28"><label>28.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Breiman</surname> <given-names>L</given-names></string-name></person-group>. <article-title>Random forests</article-title>. <source>Mach Learn</source>. <year>2001</year>;<volume>45</volume>(<issue>1</issue>):<fpage>5</fpage>&#x2013;<lpage>32</lpage>.</mixed-citation></ref>
<ref id="ref-29"><label>29.</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Kuhn</surname> <given-names>M</given-names></string-name>, <string-name><surname>Johnson</surname> <given-names>K</given-names></string-name></person-group>. <source>Applied predictive modeling</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2013</year>; vol. <volume>26</volume>.</mixed-citation></ref>
<ref id="ref-30"><label>30.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Rodriguez-Galiano</surname> <given-names>V</given-names></string-name>, <string-name><surname>Sanchez-Castillo</surname> <given-names>M</given-names></string-name>, <string-name><surname>Chica-Olmo</surname> <given-names>M</given-names></string-name>, <string-name><surname>Chica-Rivas</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Machine learning predictive models for mineral prospectivity: an evaluation of neural networks, random forest, regression trees and support vector machines</article-title>. <source>Ore Geol Rev</source>. <year>2015</year>;<volume>71</volume>(<issue>1</issue>):<fpage>804</fpage>&#x2013;<lpage>18</lpage>.</mixed-citation></ref>
<ref id="ref-31"><label>31.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Gislason</surname> <given-names>PO</given-names></string-name>, <string-name><surname>Benediktsson</surname> <given-names>JA</given-names></string-name>, <string-name><surname>Sveinsson</surname> <given-names>JR</given-names></string-name></person-group>. <article-title>Random forests for land cover classification</article-title>. <source>Pattern Recognit Lett</source>. <year>2006</year>;<volume>27</volume>(<issue>4</issue>):<fpage>294</fpage>&#x2013;<lpage>300</lpage>.</mixed-citation></ref>
<ref id="ref-32"><label>32.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Dehghanbanadaki</surname> <given-names>A</given-names></string-name>, <string-name><surname>Khari</surname> <given-names>M</given-names></string-name>, <string-name><surname>Amiri</surname> <given-names>ST</given-names></string-name>, <string-name><surname>Armaghani</surname> <given-names>DJ</given-names></string-name></person-group>. <article-title>Estimation of ultimate bearing capacity of driven piles in c-&#x03C6; soil using MLP-GWO and ANFIS-GWO models: a comparative study</article-title>. <source>Soft Comput</source>. <year>2021</year>;<volume>25</volume>(<issue>5</issue>):<fpage>4103</fpage>&#x2013;<lpage>19</lpage>.</mixed-citation></ref>
<ref id="ref-33"><label>33.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Jong</surname> <given-names>SC</given-names></string-name>, <string-name><surname>Ong</surname> <given-names>DEL</given-names></string-name>, <string-name><surname>Oh</surname> <given-names>E</given-names></string-name></person-group>. <article-title>State-of-the-art review of geotechnical-driven artificial intelligence techniques in underground soil-structure interaction</article-title>. <source>Tunnelling Undergr Space Technol</source>. <year>2021</year>;<volume>113</volume>(<issue>1</issue>):<fpage>103946</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.tust.2021.103946</pub-id>.</mixed-citation></ref>
<ref id="ref-34"><label>34.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kardani</surname> <given-names>N</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>A</given-names></string-name>, <string-name><surname>Nazem</surname> <given-names>M</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>SL</given-names></string-name></person-group>. <article-title>Estimation of bearing capacity of piles in cohesionless soil using optimised machine learning approaches</article-title>. <source>Geotech Geol Eng</source>. <year>2020</year>;<volume>38</volume>(<issue>2</issue>):<fpage>2271</fpage>&#x2013;<lpage>91</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s10706-019-01085-8</pub-id>.</mixed-citation></ref>
<ref id="ref-35"><label>35.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhou</surname> <given-names>J</given-names></string-name>, <string-name><surname>Qiu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Khandelwal</surname> <given-names>M</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>S</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>X</given-names></string-name></person-group>. <article-title>Developing a hybrid model of Jaya algorithm-based extreme gradient boosting machine to estimate blast-induced ground vibrations</article-title>. <source>Int J Rock Mech Min Sci</source>. <year>2021</year>;<volume>145</volume>(<issue>1</issue>):<fpage>104856</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.ijrmms.2021.104856</pub-id>.</mixed-citation></ref>
<ref id="ref-36"><label>36.</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Shi</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Eberhart</surname> <given-names>RC</given-names></string-name></person-group>. <article-title>Parameter selection in particle swarm optimization</article-title>. In: <conf-name>Evolutionary Programming VII: Proceedings of the 7th International Conference</conf-name>; <year>1998 Mar 25&#x2013;27</year>; <publisher-loc>San Diego, CA, USA. Berlin</publisher-loc>: <publisher-name>Springer</publisher-name>. p. <fpage>591</fpage>&#x2013;<lpage>600</lpage>.</mixed-citation></ref>
<ref id="ref-37"><label>37.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Li</surname> <given-names>X</given-names></string-name>, <string-name><surname>Yin</surname> <given-names>M</given-names></string-name></person-group>. <article-title>A particle swarm inspired cuckoo search algorithm for real parameter optimization</article-title>. <source>Soft Comput</source>. <year>2016</year>;<volume>20</volume>(<issue>4</issue>):<fpage>1389</fpage>&#x2013;<lpage>1413</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s00500-015-1594-8</pub-id>.</mixed-citation></ref>
<ref id="ref-38"><label>38.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Poli</surname> <given-names>R</given-names></string-name>, <string-name><surname>Kennedy</surname> <given-names>J</given-names></string-name>, <string-name><surname>Blackwell</surname> <given-names>T</given-names></string-name></person-group>. <article-title>Particle swarm optimization: an overview</article-title>. <source>Swarm Intell</source>. <year>2007</year>;<volume>1</volume>(<issue>1</issue>):<fpage>33</fpage>&#x2013;<lpage>57</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s11721-007-0002-0</pub-id>.</mixed-citation></ref>
<ref id="ref-39"><label>39.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Jahed Armaghani</surname> <given-names>D</given-names></string-name>, <string-name><surname>Kumar</surname> <given-names>D</given-names></string-name>, <string-name><surname>Samui</surname> <given-names>P</given-names></string-name>, <string-name><surname>Hasanipanah</surname> <given-names>M</given-names></string-name>, <string-name><surname>Roy</surname> <given-names>B</given-names></string-name></person-group>. <article-title>A novel approach for forecasting of ground vibrations resulting from blasting: modified particle swarm optimization coupled extreme learning machine</article-title>. <source>Eng Comput</source>. <year>2021</year>;<volume>37</volume>(<issue>4</issue>):<fpage>3221</fpage>&#x2013;<lpage>35</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s00366-020-00997-x</pub-id>.</mixed-citation></ref>
<ref id="ref-40"><label>40.</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Kennedy</surname> <given-names>J</given-names></string-name>, <string-name><surname>Eberhart</surname> <given-names>R</given-names></string-name></person-group>. <article-title>Particle swarm optimization</article-title>. In: <conf-name>IEEE International Conference on Neural Networks</conf-name>; <year>1995</year>; <publisher-loc>Perth, Australia</publisher-loc>. p. <fpage>1942</fpage>&#x2013;<lpage>8</lpage>.</mixed-citation></ref>
<ref id="ref-41"><label>41.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhou</surname> <given-names>J</given-names></string-name>, <string-name><surname>Qiu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>S</given-names></string-name>, <string-name><surname>Armaghani</surname> <given-names>DJ</given-names></string-name>, <string-name><surname>Li</surname> <given-names>C</given-names></string-name>, <string-name><surname>Nguyen</surname> <given-names>H</given-names></string-name>, <string-name><surname>Yagiz</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Optimization of support vector machine through the use of metaheuristic algorithms in forecasting TBM advance rate</article-title>. <source>Eng Appl Artif Intell</source>. <year>2021</year>;<volume>97</volume>(<issue>1</issue>):<fpage>104015</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.engappai.2020.104015</pub-id>.</mixed-citation></ref>
<ref id="ref-42"><label>42.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Brits</surname> <given-names>R</given-names></string-name>, <string-name><surname>Engelbrecht</surname> <given-names>AP</given-names></string-name>, <string-name><surname>van den Bergh</surname> <given-names>F</given-names></string-name></person-group>. <article-title>Locating multiple optima using particle swarm optimization</article-title>. <source>Appl Math Comput</source>. <year>2007</year>;<volume>189</volume>(<issue>2</issue>):<fpage>1859</fpage>&#x2013;<lpage>83</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.amc.2006.12.066</pub-id>.</mixed-citation></ref>
<ref id="ref-43"><label>43.</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Zheng</surname> <given-names>YL</given-names></string-name>, <string-name><surname>Ma</surname> <given-names>LH</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>LY</given-names></string-name>, <string-name><surname>Qian</surname> <given-names>JX</given-names></string-name></person-group>. <article-title>On the convergence analysis and parameter selection in particle swarm optimization</article-title>. In: <conf-name>Proceedings of the 2003 International Conference on Machine Learning and Cybernetics (IEEE Cat. No. 03EX693)</conf-name>; <year>2003 Nov</year>; <publisher-loc>Xi&#x2019;an, China</publisher-loc>: <publisher-name>IEEE</publisher-name>. p. <fpage>1802</fpage>&#x2013;<lpage>7</lpage>.</mixed-citation></ref>
<ref id="ref-44"><label>44.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Mockus</surname> <given-names>J</given-names></string-name>, <string-name><surname>Tiesis</surname> <given-names>V</given-names></string-name>, <string-name><surname>Zilinskas</surname> <given-names>A</given-names></string-name></person-group>. <article-title>The application of Bayesian methods for seeking the extremum</article-title>. <source>Towards Global Optim</source>. <year>1978</year>;<volume>2</volume>:<fpage>117</fpage>&#x2013;<lpage>29</lpage>.</mixed-citation></ref>
<ref id="ref-45"><label>45.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Garrido-Merch&#x00E1;n</surname> <given-names>EC</given-names></string-name>, <string-name><surname>Hern&#x00E1;ndez-Lobato</surname> <given-names>D</given-names></string-name></person-group>. <article-title>Dealing with categorical and integer-valued variables in Bayesian optimization with Gaussian processes</article-title>. <source>Neurocomputing</source>. <year>2020</year>;<volume>380</volume>(<issue>4</issue>):<fpage>20</fpage>&#x2013;<lpage>35</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.neucom.2019.11.004</pub-id>.</mixed-citation></ref>
<ref id="ref-46"><label>46.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Han</surname> <given-names>H</given-names></string-name>, <string-name><surname>Jahed Armaghani</surname> <given-names>D</given-names></string-name>, <string-name><surname>Tarinejad</surname> <given-names>R</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>J</given-names></string-name>, <string-name><surname>Tahir</surname> <given-names>MM</given-names></string-name></person-group>. <article-title>Random forest and Bayesian network techniques for probabilistic prediction of flyrock induced by blasting in quarry sites</article-title>. <source>Nat Resour Res</source>. <year>2020</year>;<volume>29</volume>(<issue>2</issue>):<fpage>655</fpage>&#x2013;<lpage>67</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s11053-019-09611-4</pub-id>.</mixed-citation></ref>
<ref id="ref-47"><label>47.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Qiu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Huang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Armaghani</surname> <given-names>DJ</given-names></string-name>, <string-name><surname>Pradhan</surname> <given-names>B</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>A</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>J</given-names></string-name></person-group>. <article-title>An optimized system of random forest model by global harmony search with generalized opposition-based learning for forecasting TBM advance rate</article-title>. <source>Comput Model Eng Sci</source>. <year>2024</year>;<volume>138</volume>(<issue>3</issue>):<fpage>2873</fpage>&#x2013;<lpage>97</lpage>. doi:<pub-id pub-id-type="doi">10.32604/cmes.2023.029938</pub-id>; <pub-id pub-id-type="pmid">37303558</pub-id></mixed-citation></ref>
<ref id="ref-48"><label>48.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Qiu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Li</surname> <given-names>C</given-names></string-name>, <string-name><surname>Huang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Ma</surname> <given-names>D</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>J</given-names></string-name></person-group>. <article-title>An ensemble model of explainable soft computing for failure mode identification in reinforced concrete shear walls</article-title>. <source>J Build Eng</source>. <year>2024</year>;<volume>82</volume>(<issue>1</issue>):<fpage>108386</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.jobe.2023.108386</pub-id>.</mixed-citation></ref>
<ref id="ref-49"><label>49.</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Lizotte</surname> <given-names>D</given-names></string-name></person-group>. <source>Practical Bayesian optimization</source> (Ph.D. thesis). <publisher-loc>Edmonton, AB, Canada</publisher-loc>: <publisher-name>University of Alberta</publisher-name>; <year>2008</year>.</mixed-citation></ref>
<ref id="ref-50"><label>50.</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Osborne</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Garnett</surname> <given-names>R</given-names></string-name>, <string-name><surname>Roberts</surname> <given-names>SJ</given-names></string-name></person-group>. <article-title>Gaussian processes for global optimization</article-title>. In: <conf-name>3rd International Conference on Learning and Intelligent Optimization (LION3)</conf-name>; <year>2009</year>; p. <fpage>1</fpage>&#x2013;<lpage>15</lpage>.</mixed-citation></ref>
<ref id="ref-51"><label>51.</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Jeon</surname> <given-names>JK</given-names></string-name>, <string-name><surname>Rahman</surname> <given-names>MS</given-names></string-name></person-group>. <chapter-title>Fuzzy neural network models for geotechnical problems</chapter-title>. In: <source>Research project FHWA/NC/2006-52</source>. <publisher-loc>Raleigh, NC, USA</publisher-loc>: <publisher-name>North Carolina State University</publisher-name>; <year>2008</year>.</mixed-citation></ref>
<ref id="ref-52"><label>52.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>J</given-names></string-name>, <string-name><surname>Armaghani</surname> <given-names>DJ</given-names></string-name>, <string-name><surname>Tahir</surname> <given-names>MM</given-names></string-name>, <string-name><surname>Pham</surname> <given-names>BT</given-names></string-name>, <string-name><surname>Huynh</surname> <given-names>VV</given-names></string-name></person-group>. <article-title>A combination of feature selection and random forest techniques to solve a problem related to blast-induced ground vibration</article-title>. <source>Appl Sci</source>. <year>2020</year>;<volume>10</volume>(<issue>3</issue>):<fpage>869</fpage>. doi:<pub-id pub-id-type="doi">10.3390/app10030869</pub-id>.</mixed-citation></ref>
<ref id="ref-53"><label>53.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhou</surname> <given-names>J</given-names></string-name>, <string-name><surname>Li</surname> <given-names>E</given-names></string-name>, <string-name><surname>Wei</surname> <given-names>H</given-names></string-name>, <string-name><surname>Li</surname> <given-names>C</given-names></string-name>, <string-name><surname>Qiao</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Armaghani</surname> <given-names>DJ</given-names></string-name></person-group>. <article-title>Random forests and cubist algorithms for predicting shear strengths of rockfill materials</article-title>. <source>Appl Sci</source>. <year>2019</year>;<volume>9</volume>(<issue>8</issue>):<fpage>1621</fpage>. doi:<pub-id pub-id-type="doi">10.3390/app9081621</pub-id>.</mixed-citation></ref>
<ref id="ref-54"><label>54.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhou</surname> <given-names>J</given-names></string-name>, <string-name><surname>Qiu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Armaghani</surname> <given-names>DJ</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>W</given-names></string-name>, <string-name><surname>Li</surname> <given-names>C</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>S</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Predicting TBM penetration rate in hard rock condition: a comparative study among six XGB-based metaheuristic techniques</article-title>. <source>Geosci Front</source>. <year>2021</year>;<volume>12</volume>(<issue>3</issue>):<fpage>101091</fpage>.</mixed-citation></ref>
<ref id="ref-55"><label>55.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhou</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Li</surname> <given-names>S</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>C</given-names></string-name>, <string-name><surname>Luo</surname> <given-names>H</given-names></string-name></person-group>. <article-title>Intelligent approach based on random forest for safety risk prediction of deep foundation pit in subway stations</article-title>. <source>J Comput Civ Eng</source>. <year>2019</year>;<volume>33</volume>(<issue>1</issue>):<fpage>05018004</fpage>.</mixed-citation></ref>
<ref id="ref-56"><label>56.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Amjad</surname> <given-names>M</given-names></string-name>, <string-name><surname>Ahmad</surname> <given-names>I</given-names></string-name>, <string-name><surname>Ahmad</surname> <given-names>M</given-names></string-name>, <string-name><surname>Wr&#x00F3;blewski</surname> <given-names>P</given-names></string-name>, <string-name><surname>Kami&#x0144;ski</surname> <given-names>P</given-names></string-name>, <string-name><surname>Amjad</surname> <given-names>U</given-names></string-name></person-group>. <article-title>Prediction of pile bearing capacity using XGBoost algorithm: modeling and performance evaluation</article-title>. <source>Appl Sci</source>. <year>2022</year>;<volume>12</volume>(<issue>4</issue>):<fpage>2126</fpage>.</mixed-citation></ref>
<ref id="ref-57"><label>57.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Sun</surname> <given-names>G</given-names></string-name>, <string-name><surname>Hasanipanah</surname> <given-names>M</given-names></string-name>, <string-name><surname>Amnieh</surname> <given-names>HB</given-names></string-name>, <string-name><surname>Foong</surname> <given-names>LK</given-names></string-name></person-group>. <article-title>Feasibility of indirect measurement of bearing capacity of driven piles based on a computational intelligence technique</article-title>. <source>Measurement</source>. <year>2020</year>;<volume>156</volume>:<fpage>107577</fpage>.</mixed-citation></ref>
<ref id="ref-58"><label>58.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>P</given-names></string-name></person-group>. <article-title>A novel feature selection method based on global sensitivity analysis with application in machine learning-based prediction model</article-title>. <source>Appl Soft Comput</source>. <year>2019</year>;<volume>85</volume>(<issue>1</issue>):<fpage>105859</fpage>.</mixed-citation></ref>
<ref id="ref-59"><label>59.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Fang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Ma</surname> <given-names>L</given-names></string-name>, <string-name><surname>Yao</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Li</surname> <given-names>W</given-names></string-name>, <string-name><surname>You</surname> <given-names>S</given-names></string-name></person-group>. <article-title>Process optimization of biomass gasification with a Monte Carlo approach and random forest algorithm</article-title>. <source>Energy Convers Manag</source>. <year>2022</year>;<volume>264</volume>(<issue>1</issue>):<fpage>115734</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.enconman.2022.115734</pub-id>.</mixed-citation></ref>
</ref-list>
</back></article>