<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMES</journal-id>
<journal-id journal-id-type="nlm-ta">CMES</journal-id>
<journal-id journal-id-type="publisher-id">CMES</journal-id>
<journal-title-group>
<journal-title>Computer Modeling in Engineering &#x0026; Sciences</journal-title>
</journal-title-group>
<issn pub-type="epub">1526-1506</issn>
<issn pub-type="ppub">1526-1492</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">75537</article-id>
<article-id pub-id-type="doi">10.32604/cmes.2026.075537</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>LANET: A Deep Lightweight Attention Network for Skin Cancer Segmentation</article-title>
<alt-title alt-title-type="left-running-head">LANET: A Deep Lightweight Attention Network for Skin Cancer Segmentation</alt-title>
<alt-title alt-title-type="right-running-head">LANET: A Deep Lightweight Attention Network for Skin Cancer Segmentation</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Khalaf</surname><given-names>Abdulrahman Dira</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><xref ref-type="aff" rid="aff-2">2</xref><xref rid="cor1" ref-type="corresp">&#x002A;</xref><email>adk1973@uofallujah.edu.iq</email></contrib>
<contrib id="author-2" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Hamdan</surname><given-names>Hazlina</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><xref rid="cor1" ref-type="corresp">&#x002A;</xref><email>hazlina@upm.edu.my</email></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Halin</surname><given-names>Alfian Abdul</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Manshor</surname><given-names>Noridayu</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<aff id="aff-1"><label>1</label><institution>Faculty of Computer Science and Information Technology, Universiti Putra Malaysia (UPM)</institution>, <addr-line>Serdang</addr-line>, <country>Malaysia</country></aff>
<aff id="aff-2"><label>2</label><institution>Department of Computer Center, University of Fallujah</institution>, <addr-line>Anbar</addr-line>, <country>Iraq</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Authors: Hazlina Hamdan. Email: <email>hazlina@upm.edu.my</email>; Abdulrahman Dira Khalaf. Email: <email>adk1973@uofallujah.edu.iq</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2026</year>
</pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>27</day><month>5</month><year>2026</year>
</pub-date>
<volume>147</volume>
<issue>2</issue>
<elocation-id>46</elocation-id>
<history>
<date date-type="received">
<day>03</day>
<month>11</month>
<year>2025</year>
</date>
<date date-type="accepted">
<day>30</day>
<month>12</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2026 The Authors. Published by Tech Science Press.</copyright-statement>
<copyright-year>2026</copyright-year>
<copyright-holder>The Authors</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMES_75537.pdf"></self-uri>
<abstract>
<p>Current automated lesion segmentation methods have limited success, particularly for small, irregular, or heterogeneous lesions. Moreover, such models require significant computational power, which restricts their scalability and clinical application. To overcome these limitations, this study proposes LANET, a lightweight layer-attention network built on an encoder&#x2013;decoder deep-learning architecture, with the explicit goal of improving both segmentation performance and computational efficiency. LANET integrates three new modules: (i) an attention module that uses depthwise separable convolutions to reduce the number of parameters, (ii) a custom attention mechanism, and (iii) an atrous spatial pyramid pooling (ASPP) module designed to model features at multiple scales. In experiments on benchmark datasets, LANET achieved accuracies of 96.44%, 96.8%, 96.3%, and 97.9% on HAM10000, ISIC 2017, ISIC 2018, and PH2, respectively. These results exceed those of classical architectures, such as U-Net, UNet&#x002B;&#x002B;, and DeepLabv3&#x002B;, as well as more recent state-of-the-art approaches. At the same time, LANET has only 846,786 parameters, which keeps the model small and lowers the computational cost of inference. Furthermore, Grad-CAM and activation-map visualizations help explain model decisions and highlight clinically relevant regions. The results show that LANET provides a robust, scalable, and interpretable real-time segmentation system. The design specifically improves the segmentation of small or low-contrast lesions and offers a practical path for integrating efficient segmentation models into clinical workflows for skin disease analysis.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Attention</kwd>
<kwd>deep learning</kwd>
<kwd>lightweight models</kwd>
<kwd>segmentation</kwd>
<kwd>skin cancer</kwd>
<kwd>U-Net</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>Skin cancer is a major health concern worldwide, and its incidence rises every year. In 2025, the USA is expected to record 104,960 cases [<xref ref-type="bibr" rid="ref-1">1</xref>]. This figure shows that the disease is becoming more common and that there is a clear need for precise diagnostic tools. Traditional machine learning (ML) techniques for skin lesion analysis typically rely on manually crafted features, such as color, texture, and shape descriptors [<xref ref-type="bibr" rid="ref-2">2</xref>,<xref ref-type="bibr" rid="ref-3">3</xref>]. Although these approaches offer interpretability, the significant heterogeneity in lesion appearance makes reliable analysis difficult, particularly in low-contrast and uneven-margin cases. In contrast, deep learning (DL) methods learn hierarchical representations directly from data and operate on raw images, resulting in more robust and accurate segmentation performance [<xref ref-type="bibr" rid="ref-4">4</xref>].</p>
<p>The performance of segmentation has improved in recent years; however, existing methods exhibit inherent trade-offs. Fully convolutional neural network (FCNN) architectures such as U-Net [<xref ref-type="bibr" rid="ref-5">5</xref>] and its variants, including UNet&#x002B;&#x002B; [<xref ref-type="bibr" rid="ref-6">6</xref>], reduce the semantic gap between the encoder and decoder layers and achieve strong segmentation accuracy. Despite their success, these models often struggle with small or subtle lesions and require significant computational resources, which affects their reproducibility and real-time usability [<xref ref-type="bibr" rid="ref-7">7</xref>]. Models such as Dermo-Seg [<xref ref-type="bibr" rid="ref-8">8</xref>] and MSREA-Net [<xref ref-type="bibr" rid="ref-9">9</xref>] enhance feature aggregation through attention mechanisms and improve lesion identification; however, their complex architectures lead to high computational costs. Transformer-based approaches such as DermoSegDiff [<xref ref-type="bibr" rid="ref-10">10</xref>], CTH-Net [<xref ref-type="bibr" rid="ref-11">11</xref>], and UNETR [<xref ref-type="bibr" rid="ref-12">12</xref>] improve boundary modelling by capturing long-range dependencies. However, they require considerable computational resources and struggle with narrow or poorly defined margins.</p>
<p>Many existing segmentation models fail to capture subtle or poorly defined lesion edges and often require high computational power [<xref ref-type="bibr" rid="ref-9">9</xref>]. Artifacts such as hair, ruler marks, and lighting variations further degrade the segmentation accuracy [<xref ref-type="bibr" rid="ref-13">13</xref>]. To address these challenges, we introduce LANET, a lightweight attention-based network that combines depthwise convolutions, attention modules, and atrous spatial pyramid pooling (ASPP) [<xref ref-type="bibr" rid="ref-14">14</xref>], which enables effective multiscale context aggregation without increasing computational complexity. The design specifically targets small and low-contrast lesion boundaries, improving boundary sensitivity while maintaining a low computational overhead. LANET balances high performance with computational efficiency, making it both accurate and clinically practical. The main contributions of this study are as follows:<list list-type="order">
<list-item>
<p>LANET reduces the number of parameters while improving boundary detection, particularly for small- or low-contrast lesions.</p></list-item>
<list-item>
<p>The architecture integrates lightweight ASPP and attention modules to capture the lesion structure without increasing computational cost.</p></list-item>
<list-item>
<p>LANET is evaluated on four public datasets with diverse imaging conditions and lesion variations.</p></list-item>
<list-item>
<p>The results show that LANET maintains strong accuracy while preserving boundary detail and operating with a compact parameter count.</p></list-item>
</list></p>
<p>This paper presents LANET, a lightweight attention-based segmentation framework for pixel-level skin lesion delineation. The LANET combines depthwise separable convolutions, a customized attention mechanism, and an ASPP module within an encoder&#x2013;decoder structure to enhance the boundary sensitivity at a low computational cost. The model was evaluated using four benchmark datasets: HAM10000, ISIC 2017, ISIC 2018, and PH2, covering diverse imaging conditions and lesion types. The LANET achieves superior Dice, IoU, and sensitivity scores compared with the baseline models. These results demonstrate the effectiveness of the compact architecture for accurate lesion boundary extraction and support its potential clinical use. <xref ref-type="sec" rid="s2">Section 2</xref> summarizes the related work. <xref ref-type="sec" rid="s3">Section 3</xref> describes the datasets and LANET method. <xref ref-type="sec" rid="s4">Section 4</xref> presents experimental results and interpretability analysis. <xref ref-type="sec" rid="s5">Section 5</xref> reports ablation studies, followed by a discussion in <xref ref-type="sec" rid="s6">Section 6</xref>. <xref ref-type="sec" rid="s7">Section 7</xref> outlines the limitations, and <xref ref-type="sec" rid="s8">Section 8</xref> provides future research directions.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related Work</title>
<p>Accurate diagnosis of skin cancer depends on reliable lesion segmentation. Recent research has focused on deep learning (DL) and hybrid models [<xref ref-type="bibr" rid="ref-12">12</xref>,<xref ref-type="bibr" rid="ref-15">15</xref>], particularly on publicly available dermoscopic datasets, such as ISIC [<xref ref-type="bibr" rid="ref-16">16</xref>]. This section summarizes key studies across four methodological categories: (i) CNN-based, (ii) transformer-based, (iii) attention-based, and (iv) lightweight segmentation. Each category offers unique strengths and limitations that motivate the design of LANET.</p>
<sec id="s2_1">
<label>2.1</label>
<title>CNN-Based Model</title>
<p>Convolutional neural networks (CNNs) remain the foundation of most methods used to segment skin lesions. Classic architectures such as U-Net [<xref ref-type="bibr" rid="ref-5">5</xref>] and its derivatives have been widely extended to improve pixel-level accuracy. For example, UNet&#x002B;&#x002B; [<xref ref-type="bibr" rid="ref-6">6</xref>] and DFF-UNet [<xref ref-type="bibr" rid="ref-7">7</xref>] integrated deep feature fusion, whereas MRP-UNet [<xref ref-type="bibr" rid="ref-17">17</xref>] incorporated multiscale input fusion and pyramid dilated convolutions to capture lesions of varying sizes better. These fully convolutional networks (FCNs) highlight the strength of CNNs in local feature learning; however, they often struggle to effectively capture the global context. To address this, several models have begun to embed attention- or transformer-based modules in CNN frameworks.</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Transformer-Based Model</title>
<p>Transformer-based methods address the limitations of CNNs by capturing long-range dependencies alongside local features. Mas-TransUNet [<xref ref-type="bibr" rid="ref-18">18</xref>] incorporates CNN encoding and transformer modules for simultaneous global and local feature extraction. DuaSkinSeg uses two encoders, MobileNetV2 and a Vision Transformer, to combine convolutional features with contextual representations [<xref ref-type="bibr" rid="ref-19">19</xref>]. BDFormer employs a boundary-aware dual-decoder transformer to improve edge accuracy [<xref ref-type="bibr" rid="ref-20">20</xref>], and EM-Net incorporates a morphology-aware design to better represent the lesion structure [<xref ref-type="bibr" rid="ref-21">21</xref>]. These approaches enhance segmentation performance but often introduce greater model complexity. Their computational demands limit their suitability for resource-constrained environments and hinder their use in real-time clinical applications, which require speed and hardware efficiency.</p>
</sec>
<sec id="s2_3">
<label>2.3</label>
<title>Attention-Based Model</title>
<p>Introducing an attention mechanism is one way to improve segmentation precision and performance. Attention Squeeze U-Net targets embedded devices [<xref ref-type="bibr" rid="ref-22">22</xref>]. This focus is driven by the deployment environment rather than the dataset size, because large datasets are mostly required only for offline model training. Attention modules are typically lightweight, so they can improve performance while keeping the model compact. Other methods, such as boundary-aware attention and hybrid CNN-transformer encoder-decoders, enhance the fusion of region features but also introduce additional computational overhead [<xref ref-type="bibr" rid="ref-11">11</xref>]. Together, these studies reveal a trade-off between segmentation accuracy and clinical applicability.</p>
</sec>
<sec id="s2_4">
<label>2.4</label>
<title>Lightweight Models</title>
<p>Lightweight models are popular for real-time applications. DSNet is a lightweight CNN with depthwise convolutions and a boundary-aware loss, achieving strong segmentation performance with fewer parameters [<xref ref-type="bibr" rid="ref-23">23</xref>]. In addition to architectural design, some studies explore compression techniques such as pruning, quantization, and knowledge distillation to reduce inference time and memory usage. As such, &#x201C;lightweight&#x201D; in clinical settings implies both a smaller parameter count and a faster runtime on mobile or embedded systems, where GPUs are not available. Several strategies based on Swin transformers with parameter sharing and anti-aliasing upsampling have been investigated; however, challenges remain in the segmentation of ambiguous lesions [<xref ref-type="bibr" rid="ref-24">24</xref>]. Lightweight designs allow research models to be deployed at the point of care.</p>
<p>This categorization highlights incremental developments in CNN-based, transformer-based, attention-based, and lightweight methods, along with the trade-offs between segmentation accuracy and clinical utility. Despite notable progress in lesion segmentation, existing approaches face three significant challenges: (i) high computational costs that limit real-time use, (ii) reduced accuracy for small- or low-contrast lesions, and (iii) limited adaptability in resource-constrained clinical environments. These limitations indicate the importance of a lightweight yet accurate model that bridges the gap between research models and point-of-care deployment. LANET, described in the following section, addresses these gaps by combining depthwise convolutions, a customized attention block, and an optimized ASPP module.</p>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Materials and Methods</title>
<p>LANET was implemented in MATLAB 2024a and trained on an Ubuntu server with an NVIDIA Tesla V100 GPU (16 GB). The model was trained on four datasets, each randomly split 80:10:10 into training, validation, and test sets. The images were resized to 224 &#x00D7; 224 pixels for training, and a batch size of 32 was used. The model was optimized using stochastic gradient descent with momentum (SGDM) for 20 epochs.</p>
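<p>The random 80:10:10 allocation can be illustrated with a minimal Python sketch. It is not the MATLAB code used in this study: the function name <monospace>split_indices</monospace>, the fixed seed, and the rounding convention (integer division, with the remainder assigned to the test set) are assumptions made only for illustration. The image counts are those reported for the four datasets.</p>
<code language="python">
```python
import random


def split_indices(n_images, seed=0):
    """Randomly split image indices 80:10:10 (hypothetical helper)."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)   # reproducible random allocation
    n_train = n_images * 8 // 10           # 80% for training
    n_val = n_images // 10                 # 10% for validation
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]       # remainder (about 10%) for testing
    return train, val, test


# Image counts as reported for the four public datasets.
for name, n in [("HAM10000", 10015), ("ISIC 2018", 3694),
                ("ISIC 2017", 2750), ("PH2", 200)]:
    tr, va, te = split_indices(n)
    print(name, len(tr), len(va), len(te))
```
</code>
<p>Under these assumptions, HAM10000 yields 8012 training, 1001 validation, and 1002 test images; a different rounding rule would shift a single image between subsets.</p>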
<sec id="s3_1">
<label>3.1</label>
<title>Dataset</title>
<p><xref ref-type="table" rid="table-1">Table 1</xref> lists the seven classes of the Human Against Machine with 10,000 training images (HAM10000) dataset, which provides dermoscopic images and ground-truth masks [<xref ref-type="bibr" rid="ref-25">25</xref>]. The dataset consists of 10,015 images organized into seven categories, with a resolution of 600 &#x00D7; 450 pixels in the JPEG format. The ground-truth masks have the same size and are stored as binary images in the PNG format. <xref ref-type="table" rid="table-2">Table 2</xref> summarizes all the public datasets used in this study. The ISIC 2017 and 2018 datasets are widely used as benchmarks for skin cancer detection and consist of dermoscopic images. The ISIC 2017 dataset comprises 2750 images and supports three tasks: lesion segmentation, dermoscopic feature identification, and lesion categorization into melanoma, seborrheic keratosis, and benign nevi [<xref ref-type="bibr" rid="ref-16">16</xref>]. The ISIC 2018 dataset comprises 3694 images and is used for multiclass classification across seven diagnostic categories [<xref ref-type="bibr" rid="ref-26">26</xref>]. Both datasets offer expert annotations and have established themselves as benchmarks for assessing DL models in automated skin-lesion analysis. The PH2 dataset consists of 200 skin lesion images, each with its corresponding label, at a resolution of 768 &#x00D7; 560 [<xref ref-type="bibr" rid="ref-27">27</xref>]. It is commonly used to assess the generalization ability and efficacy of a model. <xref ref-type="fig" rid="fig-1">Fig. 1</xref> shows sample images and the corresponding ground-truth masks from all four datasets.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Seven classes of HAM10000.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Class</th>
<th>Abbreviation</th>
<th>Number of Images</th>
</tr>
</thead>
<tbody>
<tr>
<td>Actinic keratoses</td>
<td>AKIEC</td>
<td>327</td>
</tr>
<tr>
<td>Basal cell carcinoma</td>
<td>BCC</td>
<td>514</td>
</tr>
<tr>
<td>Benign keratoses</td>
<td>BKL</td>
<td>1099</td>
</tr>
<tr>
<td>Dermatofibroma</td>
<td>DF</td>
<td>115</td>
</tr>
<tr>
<td>Melanoma</td>
<td>MEL</td>
<td>1113</td>
</tr>
<tr>
<td>Melanocytic nevi</td>
<td>NV</td>
<td>6705</td>
</tr>
<tr>
<td>Vascular lesions</td>
<td>VASC</td>
<td>142</td>
</tr>
</tbody>
</table>
</table-wrap><table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Skin cancer public datasets.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Dataset</th>
<th>Total No. of Images (Split 80:10:10)</th>
<th>Image Size</th>
<th>Classes</th>
</tr>
</thead>
<tbody>
<tr>
<td>HAM10000</td>
<td>10,015</td>
<td>600 &#x00D7; 450</td>
<td>AKIEC, BKL, DF, VASC, MEL, NV, BCC</td>
</tr>
<tr>
<td>ISIC 2018</td>
<td>3694</td>
<td>Different sizes</td>
<td>AKIEC, BKL, DF, VASC, MEL, NV, BCC</td>
</tr>
<tr>
<td>ISIC 2017</td>
<td>2750</td>
<td>Different sizes</td>
<td>Melanoma, Nevus, Seborrheic Keratosis</td>
</tr>
<tr>
<td>PH2</td>
<td>200</td>
<td>768 &#x00D7; 560</td>
<td>Melanoma, Common Nevi, Atypical Nevi</td>
</tr>
</tbody>
</table>
</table-wrap><fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>Examples of typical dermoscopic images from each dataset; each row shows the source image (color) with its corresponding ground truth mask (black and white) for: (<bold>a</bold>) HAM10000; (<bold>b</bold>) ISIC 2018; (<bold>c</bold>) ISIC 2017; and (<bold>d</bold>) PH2.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-1.tif"/>
</fig>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Proposed Method</title>
<p>LANET is based on U-Net [<xref ref-type="bibr" rid="ref-5">5</xref>]. Its encoder-decoder architecture is designed for both efficiency and segmentation accuracy (<xref ref-type="fig" rid="fig-2">Fig. 2</xref>). It includes three main components: (i) an encoder that incorporates attention mechanisms and grouped convolutions; (ii) a decoder; and (iii) an ASPP module for multiscale contextual understanding. The model accepts an RGB image of size 224 &#x00D7; 224 &#x00D7; 3 as input and outputs a pixel-wise segmentation map with two classes (lesion and background).</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>The main framework of the LANET.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-2.tif"/>
</fig>
<sec id="s3_2_1">
<label>3.2.1</label>
<title>Encoder</title>
<p>The encoder comprises four blocks that extract features at different levels. Each block contains a 2D convolutional layer, batch normalization, and three activation functions: the Rectified Linear Unit (ReLU), leaky ReLU, and clipped ReLU, defined in <xref ref-type="disp-formula" rid="eqn-1">Eqs. (1)</xref>&#x2013;<xref ref-type="disp-formula" rid="eqn-3">(3)</xref>. The leaky ReLU scale factor was 0.01, and the clipped ReLU ceiling was 6. Depthwise separable convolutions were employed to improve computational efficiency. Multiscale pooling was applied through average and max-pooling operations, enabling feature representations at varying spatial resolutions. Moreover, channel attention was incorporated into each block via sigmoid gating from each source with a residual connection to increase feature selectivity. The blocks use 16, 32, 64, and 128 feature maps, so the feature depth increases gradually across the encoder stages. <xref ref-type="fig" rid="fig-3">Fig. 3</xref> shows the feature map.
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo movablelimits="true" form="prefix">max</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="left left" rowspacing=".2em" columnspacing="1em" displaystyle="false"><mml:mtr><mml:mtd><mml:mi>x</mml:mi><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mi>x</mml:mi><mml:mo>&#x2265;</mml:mo><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mi>x</mml:mi><mml:mo>&#x003C;</mml:mo><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="left left" rowspacing=".2em" columnspacing="1em" displaystyle="false"><mml:mtr><mml:mtd><mml:mi>x</mml:mi><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mi>x</mml:mi><mml:mo>&#x2265;</mml:mo><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>s</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>e</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mi>x</mml:mi><mml:mo>&#x003C;</mml:mo><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo movablelimits="true" form="prefix">min</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>c</mml:mi><mml:mi>e</mml:mi><mml:mi>i</mml:mi><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>g</mml:mi><mml:mo>,</mml:mo><mml:mo movablelimits="true" form="prefix">max</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="left left" rowspacing=".2em" columnspacing="1em" displaystyle="false"><mml:mtr><mml:mtd><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mi>x</mml:mi><mml:mo>&#x003C;</mml:mo><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>x</mml:mi><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mn>0</mml:mn><mml:mo>&#x2264;</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x003C;</mml:mo><mml:mi>c</mml:mi><mml:mi>e</mml:mi><mml:mi>i</mml:mi><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>g</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>c</mml:mi><mml:mi>e</mml:mi><mml:mi>i</mml:mi><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>g</mml:mi><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mi>x</mml:mi><mml:mo>&#x2265;</mml:mo><mml:mi>c</mml:mi><mml:mi>e</mml:mi><mml:mi>i</mml:mi><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>g</mml:mi></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
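<p>The three activation functions can be written as scalar Python functions, as in the short sketch below. The default values follow the reported scale factor (0.01) and ceiling (6); the function names are illustrative and do not correspond to the MATLAB layers used in this study.</p>
<code language="python">
```python
def relu(x):
    """ReLU: max(0, x)."""
    return max(0.0, x)


def leaky_relu(x, scale=0.01):
    """Leaky ReLU: negative inputs are multiplied by a small scale factor."""
    return x if x >= 0 else scale * x


def clipped_relu(x, ceiling=6.0):
    """Clipped ReLU: min(ceiling, max(0, x))."""
    return min(ceiling, max(0.0, x))


# Evaluate each activation on a negative, a mid-range, and a large input.
for value in (-2.0, 3.0, 7.5):
    print(value, relu(value), leaky_relu(value), clipped_relu(value))
```
</code>
<p>For a negative input, ReLU and clipped ReLU both return zero, whereas leaky ReLU keeps a small negative response; only clipped ReLU saturates for inputs above the ceiling.</p>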
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>Extracting feature maps by encoding layers and attention.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-3.tif"/>
</fig>
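<p>The parameter saving offered by depthwise separable convolutions can be estimated with a short calculation. The sketch below compares the weight counts of a standard convolution and a depthwise separable convolution for the reported feature depths (16, 32, 64, and 128). The 3 &#x00D7; 3 kernel size, the omission of bias terms, and the assumption that each stage maps directly from the previous depth (starting from the three input channels) are illustrative choices, so the totals are not expected to reproduce the parameter count of LANET.</p>
<code language="python">
```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k=3):
    """Weights of a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Assumed stage widths: RGB input followed by the four encoder depths.
widths = [3, 16, 32, 64, 128]
for c_in, c_out in zip(widths, widths[1:]):
    print(c_in, c_out,
          standard_conv_params(c_in, c_out),
          depthwise_separable_params(c_in, c_out))
```
</code>
<p>Under these assumptions, the 64-to-128 stage needs 73,728 weights with a standard convolution but 8768 with the separable form, which illustrates why the factorization is attractive for compact models.</p>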
</sec>
<sec id="s3_2_2">
<label>3.2.2</label>
<title>Atrous Spatial Pyramid Pooling (ASPP) Module</title>
<p>The atrous spatial pyramid pooling (ASPP) module [<xref ref-type="bibr" rid="ref-14">14</xref>] improves multiscale feature representation by employing parallel atrous convolutions with different dilation rates. Dilation enlarges the receptive field without increasing the number of model parameters, so the network can capture both fine lesion boundaries and coarser long-range information. As illustrated in <xref ref-type="fig" rid="fig-4">Fig. 4</xref>, the ASPP module in LANET uses dilation rates of 1, 6, 12, and 18 to combine local boundary details with the global semantic context, which is necessary because lesions vary widely in size, shape, and margin complexity.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>ASPP architecture: (<bold>a</bold>) original ASPP module; (<bold>b</bold>) ASPP module with dilated convolutions at different rates.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-4.tif"/>
</fig>
<p>Unlike typical U-Nets, which rely on repeated pooling and therefore tend to lose spatial information, LANET integrates ASPP to retain detailed structures and acquire rich contextual cues. This makes the model more robust to irregular, diffuse, or uncertain regions. ASPP integration helps the model capture multiscale contextual information without the significant computational cost associated with DeepLab-style models. The design seeks to improve accuracy while remaining efficient enough for real-time clinical use.</p>
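<p>The effect of the dilation rate on the receptive field can be made concrete with a small calculation. For a kernel of size k and dilation rate d, the kernel spans k + (k &#x2212; 1)(d &#x2212; 1) pixels while still holding k weights per dimension. The sketch below evaluates this span for the four dilation rates used in the ASPP module; the 3 &#x00D7; 3 kernel size and the helper names are assumptions made for illustration.</p>
<code language="python">
```python
def effective_kernel_size(k, dilation):
    """Span (in pixels) covered by a k-tap kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


def sample_offsets(k, dilation):
    """1-D tap positions of a dilated kernel relative to its centre."""
    half = k // 2
    return [dilation * (i - half) for i in range(k)]


# Dilation rates reported for the ASPP module; kernel size 3 is assumed.
for d in (1, 6, 12, 18):
    print(d, effective_kernel_size(3, d), sample_offsets(3, d))
```
</code>
<p>With these assumptions, the four branches span 3, 13, 25, and 37 pixels, respectively, while each branch keeps the same nine weights per input-output channel pair.</p>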
</sec>
<sec id="s3_2_3">
<label>3.2.3</label>
<title>Block Attention</title>
<p>The attention mechanism placed between the encoder and decoder stages captures the lesion features more effectively. LANET uses an attention mechanism inspired by the convolutional block attention module (CBAM) [<xref ref-type="bibr" rid="ref-28">28</xref>]. It combines channel and spatial attention and refines feature representations through two consecutive steps of attention map inference. This process helps the network focus on the more informative channels and spatial regions. Algorithm 1 gives the pseudocode for the pooled-unpooled attention block, which merges the original features with the additional information recovered by the block. The same process is applied to the other encoders within LANET, as indicated by the arrow lines in <xref ref-type="fig" rid="fig-5">Fig. 5</xref>. Channel attention identifies which feature channels are most informative, whereas spatial attention identifies the locations that carry valuable information.</p>
<fig id="fig-19">
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-19.tif"/>
</fig>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>Block attention modules: (<bold>a</bold>) Channel Attention; (<bold>b</bold>) Spatial Attention.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-5.tif"/>
</fig>
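<p>The two-step refinement of channel attention followed by spatial attention can be sketched in plain Python. The code below follows the generic CBAM formulation (a shared two-layer perceptron on average- and max-pooled channel descriptors, then a convolution over channel-wise mean and max maps); it is not the pooled-unpooled block of LANET, and all weights, shapes, and function names are hypothetical.</p>
<code language="python">
```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def channel_attention(x, w1, w2):
    """x: C channels of H x W lists; w1: hidden x C and w2: C x hidden shared MLP weights."""
    flat = [[v for row in ch for v in row] for ch in x]
    avg = [sum(f) / len(f) for f in flat]    # global average pooling per channel
    mx = [max(f) for f in flat]              # global max pooling per channel

    def mlp(desc):
        hidden = [max(0.0, sum(w * d for w, d in zip(row, desc))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]


def spatial_attention(x, kernel):
    """kernel: 2 x k x k weights applied to the [mean, max] maps with zero padding."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    mean_map = [[sum(x[n][i][j] for n in range(c)) / c for j in range(w)] for i in range(h)]
    max_map = [[max(x[n][i][j] for n in range(c)) for j in range(w)] for i in range(h)]
    maps = [mean_map, max_map]
    k = len(kernel[0])
    pad = k // 2
    gate = []
    for i in range(h):
        row = []
        for j in range(w):
            s = 0.0
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if ii in range(h) and jj in range(w):   # skip zero-padded taps
                            s += kernel[m][di][dj] * maps[m][ii][jj]
            row.append(sigmoid(s))
        gate.append(row)
    return gate


def cbam(x, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to the feature map x."""
    ca = channel_attention(x, w1, w2)
    x1 = [[[ca[n] * v for v in row] for row in x[n]] for n in range(len(x))]
    sa = spatial_attention(x1, kernel)
    return [[[sa[i][j] * v for j, v in enumerate(row)] for i, row in enumerate(ch)] for ch in x1]


# Sanity check with all-zero (hypothetical) weights: every gate equals
# sigmoid(0) = 0.5, so the output is 0.25 times the input.
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, -1.0], [2.0, 5.0]]]
w1 = [[0.0, 0.0]]
w2 = [[0.0], [0.0]]
kernel = [[[0.0] * 3 for _ in range(3)] for _ in range(2)]
print(cbam(x, w1, w2, kernel))
```
</code>
<p>In a trained network, the perceptron and kernel weights are learned, so the channel gates and the spatial gate differ across channels and pixels instead of being constant.</p>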
<p>The LANET introduces a pooled&#x2013;unpooled attention mechanism that restores the information removed during pooling as shown in <xref ref-type="fig" rid="fig-5">Fig. 5</xref>. The module first performs max pooling and unpooling to recover the spatial layout and then computes the pixel-wise difference between the original and unpooled features. This subtraction reveals subtle activations suppressed by pooling. It retains edge-level variations that are critical for medical segmentation. The spatial attention module enhances clinically important locations by emphasizing reconstructed border cues and reducing the influence of background artifacts. The proposed mechanism explicitly recovers lost information and incorporates leaky ReLU to improve the gradient flow for low-contrast lesions. The following simplified example shows how <bold>pool &#x2192; unpool &#x2192; subtraction</bold> recovers suppressed features.</p>
<p><bold>Input Feature Map (4 &#x00D7; 4):</bold><inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:mi>X</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnalign="left left left left" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mn>16</mml:mn></mml:mtd><mml:mtd><mml:mn>2</mml:mn></mml:mtd><mml:mtd><mml:mn>3</mml:mn></mml:mtd><mml:mtd><mml:mn>13</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>5</mml:mn></mml:mtd><mml:mtd><mml:mn>11</mml:mn></mml:mtd><mml:mtd><mml:mn>10</mml:mn></mml:mtd><mml:mtd><mml:mn>8</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>9</mml:mn></mml:mtd><mml:mtd><mml:mn>7</mml:mn></mml:mtd><mml:mtd><mml:mn>6</mml:mn></mml:mtd><mml:mtd><mml:mn>12</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>4</mml:mn></mml:mtd><mml:mtd><mml:mn>14</mml:mn></mml:mtd><mml:mtd><mml:mn>15</mml:mn></mml:mtd><mml:mtd><mml:mn>1</mml:mn></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula></p>
<p><bold>Step 1: Max-Pooling (2 &#x00D7; 2):</bold> Pooling outputs the maximum from each block. <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:mi>P</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnalign="left left" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mn>16</mml:mn></mml:mtd><mml:mtd><mml:mn>13</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>14</mml:mn></mml:mtd><mml:mtd><mml:mn>15</mml:mn></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula></p>
<p><bold>Step 2: Max-Unpooling with Stored Indices:</bold> Each pooled maximum is written back to the position it originally occupied in <italic>X</italic>, and all other positions are set to zero. <inline-formula id="ieqn-16"><mml:math id="mml-ieqn-16"><mml:mi>U</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnalign="left left left left" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mn>16</mml:mn></mml:mtd><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mn>13</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mn>14</mml:mn></mml:mtd><mml:mtd><mml:mn>15</mml:mn></mml:mtd><mml:mtd><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula></p>
<p><bold>Step 3: Residual (X &#x2212; U) Recovers Suppressed Values:</bold> <inline-formula id="ieqn-17"><mml:math id="mml-ieqn-17"><mml:mi>R</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnalign="center center center center" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mn>2</mml:mn></mml:mtd><mml:mtd><mml:mn>3</mml:mn></mml:mtd><mml:mtd><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>5</mml:mn></mml:mtd><mml:mtd><mml:mn>11</mml:mn></mml:mtd><mml:mtd><mml:mn>10</mml:mn></mml:mtd><mml:mtd><mml:mn>8</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>9</mml:mn></mml:mtd><mml:mtd><mml:mn>7</mml:mn></mml:mtd><mml:mtd><mml:mn>6</mml:mn></mml:mtd><mml:mtd><mml:mn>12</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>4</mml:mn></mml:mtd><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mn>1</mml:mn></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula></p>
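<p>The worked example above can be checked with a short script. The sketch below is a plain-Python illustration of non-overlapping 2 &#x00D7; 2 max pooling with stored indices, max unpooling, and the residual <italic>X</italic> &#x2212; <italic>U</italic>; the function names are hypothetical and the code is not taken from the LANET implementation.</p>
<code language="python">
```python
def max_pool_with_indices(x, k=2):
    # Non-overlapping k x k max pooling that also records where each maximum came from.
    rows, cols = len(x), len(x[0])
    pooled, indices = [], []
    for r in range(0, rows, k):
        p_row, i_row = [], []
        for c in range(0, cols, k):
            window = [(x[i][j], (i, j))
                      for i in range(r, r + k) for j in range(c, c + k)]
            value, pos = max(window, key=lambda t: t[0])
            p_row.append(value)
            i_row.append(pos)
        pooled.append(p_row)
        indices.append(i_row)
    return pooled, indices


def max_unpool(pooled, indices, shape):
    # Write each pooled maximum back to its stored position; everything else is zero.
    out = [[0] * shape[1] for _ in range(shape[0])]
    for p_row, i_row in zip(pooled, indices):
        for value, (i, j) in zip(p_row, i_row):
            out[i][j] = value
    return out


def residual(x, u):
    # Element-wise difference X - U: the values that pooling discarded.
    return [[a - b for a, b in zip(xr, ur)] for xr, ur in zip(x, u)]


X = [[16, 2, 3, 13], [5, 11, 10, 8], [9, 7, 6, 12], [4, 14, 15, 1]]
P, idx = max_pool_with_indices(X)
U = max_unpool(P, idx, (4, 4))
R = residual(X, U)

print(P)  # [[16, 13], [14, 15]], as in Step 1
for row in R:
    print(row)  # rows of R, as in Step 3
```
</code>
<p>The residual is zero exactly at the four positions that survived pooling and keeps the original value everywhere else, which is the information the attention block reuses.</p>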
</sec>
<sec id="s3_2_4">
<label>3.2.4</label>
<title>Decoder</title>
<p>The decoder reconstructs the segmentation mask from the encoded feature maps using transposed convolutions for upsampling. Skip connections from the corresponding encoder layers preserve spatial information. Each upsampled feature map is refined using convolutional and activation layers to ensure accurate localization and boundary recovery. The final prediction layer is a 1 &#x00D7; 1 convolution that reduces the feature dimensions to two classes: lesion and background. A softmax activation then generates pixel-wise class probabilities, producing the final segmentation output.</p>
<p>At each encoder level, the output features are generated by custom layers that compute the difference between the input and the result of max pooling followed by max unpooling. Unpooling, which reuses the pooling concept, starts the decoding operations, as illustrated in <xref ref-type="fig" rid="fig-6">Fig. 6</xref>. The learnable parameters of the LANET model were initialized from the first convolutional encoding layer. Three activation functions were used in this study: ReLU, leaky ReLU, and clipped ReLU, defined in <xref ref-type="disp-formula" rid="eqn-1">Eqs. (1)</xref>&#x2013;<xref ref-type="disp-formula" rid="eqn-3">(3)</xref>, respectively. The last convolutional decoding layer was followed by a softmax activation layer. The complete model has 846,786 learnable parameters; <xref ref-type="table" rid="table-3">Table 3</xref> details their distribution across the blocks of the proposed architecture. For more details, refer to <xref ref-type="app" rid="app-1">Appendix A</xref>.</p>
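<p>For reference, the three activation functions named above can be written in a few lines. The sketch below uses the standard textbook definitions; the negative slope of 0.01 and the clipping ceiling of 6 are assumed illustrative defaults, not values reported for LANET.</p>
<code language="python">
```python
def relu(x):
    # Passes positive inputs unchanged and zeroes out negative ones.
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    # Like ReLU, but negative inputs keep a small gradient (slope is an assumed value).
    return x if x > 0 else slope * x


def clipped_relu(x, ceiling=6.0):
    # ReLU whose output is additionally capped at `ceiling` (assumed value).
    return min(max(0.0, x), ceiling)


for value in (-2.0, 0.5, 10.0):
    print(value, relu(value), leaky_relu(value), clipped_relu(value))
```
</code>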
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>Extracting feature maps by Decoding layers.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-6.tif"/>
</fig><table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Distribution of the learning parameters of the LANET model.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Block</th>
<th>Layers</th>
<th>Output Size (H &#x00D7; W &#x00D7; C)</th>
<th>Learnable Parameters</th>
</tr>
</thead>
<tbody>
<tr>
<td>Encoder_Block 1</td>
<td>Conv, BN, GroupedConv</td>
<td>224 &#x00D7; 224 &#x00D7; 16</td>
<td>3232</td>
</tr>
<tr>
<td>Encoder_Block 2</td>
<td>Conv, BN, GroupedConv</td>
<td>112 &#x00D7; 112 &#x00D7; 32</td>
<td>15,328</td>
</tr>
<tr>
<td>Encoder_Block 3</td>
<td>Conv, BN, GroupedConv</td>
<td>56 &#x00D7; 56 &#x00D7; 64</td>
<td>60,352</td>
</tr>
<tr>
<td>Encoder_Block 4</td>
<td>Conv, BN, GroupedConv</td>
<td>28 &#x00D7; 28 &#x00D7; 128</td>
<td>239,488</td>
</tr>
<tr>
<td>Bottleneck</td>
<td>Conv, BN</td>
<td>14 &#x00D7; 14 &#x00D7; 16</td>
<td>57,536</td>
</tr>
<tr>
<td>Decoder_Block 1</td>
<td>Transposed_Conv, Conv, BN</td>
<td>28 &#x00D7; 28 &#x00D7; 128</td>
<td>329,472</td>
</tr>
<tr>
<td>Decoder_Block 2</td>
<td>Transposed_Conv, Conv, BN</td>
<td>56 &#x00D7; 56 &#x00D7; 64</td>
<td>107,392</td>
</tr>
<tr>
<td>Decoder_Block 3</td>
<td>Transposed_Conv, Conv, BN</td>
<td>112 &#x00D7; 112 &#x00D7; 32</td>
<td>27,072</td>
</tr>
<tr>
<td>Decoder_Block 4</td>
<td>Transposed_Conv, Conv, BN</td>
<td>224 &#x00D7; 224 &#x00D7; 16</td>
<td>6914</td>
</tr>
<tr>
<td><bold>Total learnable parameters</bold></td>
<td align="center" colspan="3">846,786</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>U-Net, U-Net&#x002B;&#x002B;, and U-Net&#x002B;3 were implemented to determine their numbers of parameters, which were then compared with that of LANET. LANET contains 846,786 parameters, resulting in a small memory footprint suitable for deployment on hardware with limited resources. <xref ref-type="table" rid="table-4">Table 4</xref> compares these criteria with those of state-of-the-art (SOTA) models. The memory size is computed following [<xref ref-type="bibr" rid="ref-29">29</xref>] using <xref ref-type="disp-formula" rid="eqn-4">Eq. (4)</xref>. For example, LANET requires approximately 3.2 megabytes.</p>
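<p>The total parameter count can be cross-checked against the per-block values in <xref ref-type="table" rid="table-3">Table 3</xref>. The short script below only sums the published figures; the dictionary name is hypothetical.</p>
<code language="python">
```python
# Learnable parameters per block, copied from Table 3.
block_params = {
    "Encoder_Block 1": 3232,
    "Encoder_Block 2": 15328,
    "Encoder_Block 3": 60352,
    "Encoder_Block 4": 239488,
    "Bottleneck": 57536,
    "Decoder_Block 1": 329472,
    "Decoder_Block 2": 107392,
    "Decoder_Block 3": 27072,
    "Decoder_Block 4": 6914,
}

total = sum(block_params.values())
print(total)  # 846786, the total reported in Table 3

# Share of the total held by each block, in percent.
for name, count in block_params.items():
    print(name, round(100.0 * count / total, 1))
```
</code>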

</sec>
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Cross-Validation (Improves Robustness)</title>
<p>To ensure the robustness and reliability of the proposed model, LANET was evaluated on the HAM10000 dataset using 5-fold cross-validation. The data were randomly divided into five subsets. In each cycle, the model was trained on four subsets and validated on the remaining one, such that each image was used for validation exactly once. The final performance measures are reported as the mean and standard deviation across all folds, providing a better estimate of how well the model generalizes in practice.</p>
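<p>The splitting scheme described above can be sketched as follows. This is a minimal illustration that uses only the Python standard library; the function names, the random seed, and the sample count in the usage lines are assumptions and do not reproduce the authors' exact partition.</p>
<code language="python">
```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    # Shuffle the sample indices once, then deal them into k disjoint folds.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, val))
    return splits


def mean_and_std(values):
    # Mean and sample standard deviation of the per-fold scores.
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, var ** 0.5


# Usage with a small, hypothetical dataset of 23 images.
splits = k_fold_indices(23, k=5, seed=0)
for train, val in splits:
    print(len(train), len(val))
```
</code>
<p>In each cycle the model would be trained on <monospace>train</monospace> and scored on <monospace>val</monospace>; the five scores are then summarized with <monospace>mean_and_std</monospace>.</p>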
<p>To better illustrate the learning behavior of the model, we tracked the accuracy and loss on both the training and validation sets across all iterations. As shown in <xref ref-type="fig" rid="fig-7">Fig. 7</xref>, the curves demonstrate stable learning, with consistent trends between the two sets. The validation accuracy closely followed the training accuracy, and the loss decreased smoothly for both, indicating effective convergence with no evident overfitting. The final validation accuracy reached 96.12%, which was consistent with the quantitative results reported in the main experiments. This comparison indicates that LANET generalizes well to unseen data and maintains reliable performance throughout the training process.</p>
<fig id="fig-7">
<label>Figure 7</label>
<caption>
<title>Training progress of the LANET model.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-7.tif"/>
</fig>
<p><xref ref-type="table" rid="table-4">Table 4</xref> compares the complexity of these models with that of LANET. All GFLOPs are computed for an input of 224 &#x00D7; 224 using the convention that one multiply&#x2013;accumulate (MAC) equals two floating-point operations. Compared to U-Net (7.42 GFLOPs) and U-Net&#x002B;&#x002B; (8.45 GFLOPs), the proposed LANET achieves higher segmentation accuracy (as reported in the Results section) while requiring only 4.22 GFLOPs. This highlights the superior computational efficiency of LANET.<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mrow><mml:mtext mathvariant="italic">Size</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="italic">of</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="italic">memory</mml:mtext></mml:mrow><mml:mspace width="thinmathspace" /><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="italic">MegaBytes</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mfrac><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mtext mathvariant="italic">parameters</mml:mtext></mml:mrow></mml:msub><mml:mo>&#x00D7;</mml:mo><mml:mn>4</mml:mn><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="italic">bytes</mml:mtext></mml:mrow></mml:mrow><mml:msup><mml:mn>1024</mml:mn><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mfrac></mml:mstyle></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
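<p>The MAC-to-FLOP convention stated above can be illustrated for a single convolutional layer. The sketch below uses the standard operation count for a (possibly grouped) convolution and ignores biases, normalization, and activations; the layer dimensions in the usage lines are hypothetical and are not taken from the LANET architecture.</p>
<code language="python">
```python
def conv2d_macs(h_out, w_out, kernel, c_in, c_out, groups=1):
    # Multiply-accumulate operations of one convolution (bias ignored).
    # Each output value needs kernel * kernel * (c_in / groups) MACs.
    return h_out * w_out * kernel * kernel * (c_in // groups) * c_out


def gflops(macs):
    # Convention used in the text: 1 MAC = 2 floating-point operations.
    return 2 * macs / 1e9


# Hypothetical layer: 3 x 3 convolution, 16 -> 16 channels, 224 x 224 output.
standard = conv2d_macs(224, 224, 3, 16, 16)
grouped = conv2d_macs(224, 224, 3, 16, 16, groups=4)
print(standard, gflops(standard))
print(grouped, gflops(grouped))
```
</code>
<p>Summing this quantity over all layers of a network gives its total GFLOPs; the grouped variant shows why grouped convolutions reduce the cost by the number of groups.</p>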

<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Model complexity comparison.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Model</th>
<th>Number of Parameters (Million)</th>
<th>Number of Layers</th>
<th>Memory Size (MB)</th>
<th>GFLOPs</th>
</tr>
</thead>
<tbody>
<tr>
<td>U-Net [<xref ref-type="bibr" rid="ref-5">5</xref>]</td>
<td>31</td>
<td><bold>70</bold></td>
<td>118.3</td>
<td>7.42</td>
</tr>
<tr>
<td>U-Net&#x002B;&#x002B; [<xref ref-type="bibr" rid="ref-6">6</xref>]</td>
<td>36.2</td>
<td>116</td>
<td>138.1</td>
<td>8.45</td>
</tr>
<tr>
<td>U-Net&#x002B;3 [<xref ref-type="bibr" rid="ref-30">30</xref>]</td>
<td>26.9</td>
<td>92</td>
<td>102.6</td>
<td>6.80</td>
</tr>
<tr>
<td><bold>LANET</bold></td>
<td><bold>0.846</bold></td>
<td>136</td>
<td><bold>3.2</bold></td>
<td><bold>4.22</bold></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-4fn1" fn-type="other">
<p>Note: Bold values indicate the best result in each column.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p><inline-formula id="ieqn-18"><mml:math id="mml-ieqn-18"><mml:mi>S</mml:mi><mml:mi>i</mml:mi><mml:mi>z</mml:mi><mml:mi>e</mml:mi><mml:mspace width="thinmathspace" /><mml:mi>o</mml:mi><mml:mi>f</mml:mi><mml:mspace width="thinmathspace" /><mml:mi>m</mml:mi><mml:mi>e</mml:mi><mml:mi>m</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>846786</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>4</mml:mn></mml:mrow><mml:msup><mml:mn>1024</mml:mn><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mfrac><mml:mo>=</mml:mo><mml:mfrac><mml:mn>3387144</mml:mn><mml:mn>1048576</mml:mn></mml:mfrac><mml:mo>&#x2248;</mml:mo><mml:mn>3.2</mml:mn><mml:mrow><mml:mtext>&#xA0;MB</mml:mtext></mml:mrow><mml:mo>.</mml:mo></mml:math></inline-formula> Assuming 32-bit floating-point precision (4 bytes per parameter), LANET with 846,786 parameters requires approximately 3.2 MB of memory.</p>
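<p>The same calculation can be applied to every model in <xref ref-type="table" rid="table-4">Table 4</xref>. The sketch below implements <xref ref-type="disp-formula" rid="eqn-4">Eq. (4)</xref> directly; the function name is hypothetical, and the parameter counts for the baselines are the rounded values (in millions) listed in the table.</p>
<code language="python">
```python
def memory_mb(n_params, bytes_per_param=4):
    # Eq. (4): parameter count times bytes per parameter, expressed in MB (1024 ** 2 bytes).
    return n_params * bytes_per_param / 1024 ** 2


# Parameter counts from Table 4 (baselines are rounded to the values shown there).
models = {
    "U-Net": 31.0e6,
    "U-Net++": 36.2e6,
    "U-Net+3": 26.9e6,
    "LANET": 846786,
}

for name, n_params in models.items():
    print(name, round(memory_mb(n_params), 1))
```
</code>
<p>Using a different numeric precision only changes <monospace>bytes_per_param</monospace>; for example, 16-bit storage would halve each figure.</p>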
</sec>
<sec id="s3_4">
<label>3.4</label>
<title>Evaluation Metrics</title>
<p>Five primary metrics were used to evaluate the performance of LANET; together, they characterize complementary aspects of the segmentation results. <xref ref-type="disp-formula" rid="eqn-5">Eqs. (5)</xref>&#x2013;<xref ref-type="disp-formula" rid="eqn-9">(9)</xref> define these metrics, where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>S</mml:mi><mml:mi>e</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>v</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>S</mml:mi><mml:mi>p</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>A</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:mi>u</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>D</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi><mml:mspace width="thinmathspace" /><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>e</mml:mi><mml:mi>f</mml:mi><mml:mi>f</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>e</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mspace width="thinmathspace" /><mml:mo stretchy="false">(</mml:mo><mml:mi>D</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true"><mml:mtr><mml:mtd /><mml:mtd><mml:mi>I</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mspace width="thinmathspace" /><mml:mi>o</mml:mi><mml:mi>v</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mspace width="thinmathspace" /><mml:mi>U</mml:mi><mml:mi>n</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mspace width="thinmathspace" /><mml:mo stretchy="false">(</mml:mo><mml:mi>I</mml:mi><mml:mi>o</mml:mi><mml:mi>U</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
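<p>The five metrics can be computed from the four confusion counts as sketched below. The function name and the example counts are hypothetical and serve only to illustrate <xref ref-type="disp-formula" rid="eqn-5">Eqs. (5)</xref>&#x2013;<xref ref-type="disp-formula" rid="eqn-9">(9)</xref>.</p>
<code language="python">
```python
def segmentation_metrics(tp, tn, fp, fn):
    # Direct transcription of Eqs. (5)-(9); all values are fractions in [0, 1].
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
    }


# Hypothetical counts for one image with 1000 pixels.
metrics = segmentation_metrics(tp=80, tn=900, fp=10, fn=10)
for name, value in metrics.items():
    print(name, round(100.0 * value, 2))
```
</code>
<p>For a single set of counts, Dice and IoU are linked by IoU = Dice / (2 &#x2212; Dice), so the two metrics always rank a single prediction in the same order.</p>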
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Results</title>
<p>LANET performed well on all four datasets. The results are presented as mean &#x00B1; standard deviation in <xref ref-type="table" rid="table-5">Tables 5</xref>&#x2013;<xref ref-type="table" rid="table-8">8</xref> and cover five segmentation metrics: accuracy, Dice, IoU, sensitivity, and specificity. Across the datasets, LANET achieved the highest accuracy in every case and remained stable and competitive with the baseline and contemporary architectures on the other metrics. To evaluate whether the improvements of LANET over the SOTA models were statistically significant, we applied a paired <italic>t</italic>-test.</p>
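<p>A paired <italic>t</italic>-test on per-fold scores can be sketched as follows. The per-fold Dice values below are invented purely for illustration and are not results from this study; the function name is hypothetical. The script computes the test statistic and compares it with 2.776, the two-sided 5% critical value of Student's <italic>t</italic> distribution with 4 degrees of freedom.</p>
<code language="python">
```python
import math


def paired_t_statistic(scores_a, scores_b):
    # t = mean(d) / (std(d) / sqrt(n)), where d are the paired differences.
    # Note: undefined (division by zero) if all differences are identical.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n), n - 1


# Hypothetical per-fold Dice scores (percent) for two models; NOT data from the paper.
model_a = [94.3, 94.6, 94.8, 94.4, 94.5]
model_b = [92.8, 93.1, 93.0, 92.7, 93.2]

t_stat, dof = paired_t_statistic(model_a, model_b)
T_CRITICAL_DF4 = 2.776  # two-sided, alpha = 0.05, 4 degrees of freedom
print(round(t_stat, 2), dof, abs(t_stat) > T_CRITICAL_DF4)
```
</code>
<p>With five folds the test has only 4 degrees of freedom, so a difference must be large relative to its fold-to-fold variation to reach significance.</p>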
<table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>Quantitative comparison of LANET and baseline segmentation models on the HAM10000 dataset based on Accuracy, Dice, IoU, Sensitivity, and Specificity.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Model</th>
<th>Accuracy</th>
<th>Dice</th>
<th>IoU</th>
<th>Sensitivity</th>
<th>Specificity</th>
</tr>
</thead>
<tbody>
<tr>
<td>U-Net [<xref ref-type="bibr" rid="ref-5">5</xref>]</td>
<td>90.87 &#x00B1; 0.46</td>
<td>77.24 &#x00B1; 0.52</td>
<td>56.69 &#x00B1; 0.49</td>
<td>60.87 &#x00B1; 0.47</td>
<td>93.98 &#x00B1; 0.41</td>
</tr>
<tr>
<td>U-Net&#x002B;&#x002B; [<xref ref-type="bibr" rid="ref-6">6</xref>]</td>
<td>95.65 &#x00B1; 0.32</td>
<td>89.09 &#x00B1; 0.37</td>
<td>81.31 &#x00B1; 0.33</td>
<td>86.94 &#x00B1; 0.31</td>
<td>96.35 &#x00B1; 0.29</td>
</tr>
<tr>
<td>U-Net&#x002B;3 [<xref ref-type="bibr" rid="ref-30">30</xref>]</td>
<td>95.41 &#x00B1; 0.30</td>
<td>88.24 &#x00B1; 0.35</td>
<td>77.67 &#x00B1; 0.31</td>
<td>83.70 &#x00B1; 0.29</td>
<td>95.83 &#x00B1; 0.27</td>
</tr>
<tr>
<td>DeepLabv3&#x002B; [<xref ref-type="bibr" rid="ref-31">31</xref>]</td>
<td>95.84 &#x00B1; 0.28</td>
<td>89.71 &#x00B1; 0.33</td>
<td>74.21 &#x00B1; 0.30</td>
<td>84.34 &#x00B1; 0.27</td>
<td><bold>96.87 &#x00B1; 0.25</bold></td>
</tr>
<tr>
<td>MRP-UNet [<xref ref-type="bibr" rid="ref-17">17</xref>]</td>
<td>94.64 &#x00B1; 0.26</td>
<td>92.95 &#x00B1; 0.29</td>
<td>90.18 &#x00B1; 0.26</td>
<td>89.85 &#x00B1; 0.25</td>
<td>90.18 &#x00B1; 0.23</td>
</tr>
<tr>
<td><bold>LANET</bold></td>
<td><bold>96.44 &#x00B1; 0.22</bold></td>
<td><bold>94.53 &#x00B1; 0.25</bold></td>
<td><bold>92.61 &#x00B1; 0.21</bold></td>
<td><bold>91.98 &#x00B1; 0.20</bold></td>
<td>92.48 &#x00B1; 0.19</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-5fn1" fn-type="other">
<p>Note: Bold values indicate the best result in each column.</p>
</fn>
</table-wrap-foot>
</table-wrap><table-wrap id="table-6">
<label>Table 6</label>
<caption>
<title>Evaluation of SOTA performance in comparison to LANET for the ISIC 2017 dataset.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Model</th>
<th>Accuracy</th>
<th>Dice</th>
<th>IoU</th>
<th>Sensitivity</th>
<th>Specificity</th>
</tr>
</thead>
<tbody>
<tr>
<td>U-Net [<xref ref-type="bibr" rid="ref-5">5</xref>]</td>
<td>96.46 &#x00B1; 0.36</td>
<td>86.17 &#x00B1; 0.42</td>
<td>64.79 &#x00B1; 0.40</td>
<td>70.65 &#x00B1; 0.39</td>
<td><bold>97.12 &#x00B1; 0.31</bold></td>
</tr>
<tr>
<td>U-Net&#x002B;&#x002B; [<xref ref-type="bibr" rid="ref-6">6</xref>]</td>
<td>96.36 &#x00B1; 0.34</td>
<td>87.72 &#x00B1; 0.39</td>
<td>82.98 &#x00B1; 0.33</td>
<td>88.65 &#x00B1; 0.31</td>
<td>96.43 &#x00B1; 0.29</td>
</tr>
<tr>
<td>U-Net&#x002B;3 [<xref ref-type="bibr" rid="ref-30">30</xref>]</td>
<td>96.35 &#x00B1; 0.33</td>
<td>87.27 &#x00B1; 0.38</td>
<td>73.71 &#x00B1; 0.32</td>
<td>79.04 &#x00B1; 0.30</td>
<td>96.87 &#x00B1; 0.27</td>
</tr>
<tr>
<td>DeepLabv3&#x002B; [<xref ref-type="bibr" rid="ref-31">31</xref>]</td>
<td>94.91 &#x00B1; 0.29</td>
<td>85.71 &#x00B1; 0.35</td>
<td>83.64 &#x00B1; 0.28</td>
<td><bold>90.74 &#x00B1; 0.27</bold></td>
<td>94.65 &#x00B1; 0.25</td>
</tr>
<tr>
<td>MRP-UNet [<xref ref-type="bibr" rid="ref-17">17</xref>]</td>
<td>93.51 &#x00B1; 0.26</td>
<td><bold>93.14 &#x00B1; 0.31</bold></td>
<td><bold>92.41 &#x00B1; 0.25</bold></td>
<td>90.42 &#x00B1; 0.23</td>
<td>92.41 &#x00B1; 0.22</td>
</tr>
<tr>
<td>EM-Net [<xref ref-type="bibr" rid="ref-21">21</xref>]</td>
<td>93.97 &#x00B1; 0.24</td>
<td>86.42 &#x00B1; 0.29</td>
<td>78.70 &#x00B1; 0.24</td>
<td>85.33 &#x00B1; 0.22</td>
<td>92.17 &#x00B1; 0.21</td>
</tr>
<tr>
<td>IDSNet [<xref ref-type="bibr" rid="ref-32">32</xref>]</td>
<td>94.45 &#x00B1; 0.23</td>
<td>86.53 &#x00B1; 0.27</td>
<td>78.68 &#x00B1; 0.23</td>
<td>84.41 &#x00B1; 0.21</td>
<td>93.73 &#x00B1; 0.20</td>
</tr>
<tr>
<td><bold>LANET</bold></td>
<td><bold>96.78 &#x00B1; 0.22</bold></td>
<td>92.90 &#x00B1; 0.24</td>
<td>81.90 &#x00B1; 0.21</td>
<td>90.30 &#x00B1; 0.20</td>
<td>92.83 &#x00B1; 0.19</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-6fn1" fn-type="other">
<p>Note: Bold values indicate the best result in each column.</p>
</fn>
</table-wrap-foot>
</table-wrap><table-wrap id="table-7">
<label>Table 7</label>
<caption>
<title>Evaluation of SOTA performance vs. the LANET model on the ISIC 2018 dataset.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Model</th>
<th>Accuracy</th>
<th>Dice</th>
<th>IoU</th>
<th>Sensitivity</th>
<th>Specificity</th>
</tr>
</thead>
<tbody>
<tr>
<td>U-Net [<xref ref-type="bibr" rid="ref-5">5</xref>]</td>
<td>85.96 &#x00B1; 0.48</td>
<td>75.11 &#x00B1; 0.57</td>
<td>64.58 &#x00B1; 0.54</td>
<td>75.55 &#x00B1; 0.49</td>
<td>85.65 &#x00B1; 0.45</td>
</tr>
<tr>
<td>U-Net&#x002B;&#x002B; [<xref ref-type="bibr" rid="ref-6">6</xref>]</td>
<td>89.05 &#x00B1; 0.39</td>
<td>80.26 &#x00B1; 0.44</td>
<td>68.67 &#x00B1; 0.42</td>
<td>80.34 &#x00B1; 0.37</td>
<td>88.36 &#x00B1; 0.33</td>
</tr>
<tr>
<td>U-Net&#x002B;3 [<xref ref-type="bibr" rid="ref-30">30</xref>]</td>
<td>89.71 &#x00B1; 0.37</td>
<td>81.32 &#x00B1; 0.40</td>
<td>71.01 &#x00B1; 0.36</td>
<td>81.87 &#x00B1; 0.35</td>
<td>89.09 &#x00B1; 0.31</td>
</tr>
<tr>
<td>DeepLabv3&#x002B; [<xref ref-type="bibr" rid="ref-31">31</xref>]</td>
<td>90.41 &#x00B1; 0.34</td>
<td>82.46 &#x00B1; 0.41</td>
<td>67.63 &#x00B1; 0.38</td>
<td>83.98 &#x00B1; 0.33</td>
<td>90.54 &#x00B1; 0.29</td>
</tr>
<tr>
<td>DSU-Net [<xref ref-type="bibr" rid="ref-33">33</xref>]</td>
<td>94.31 &#x00B1; 0.26</td>
<td>90.04 &#x00B1; 0.28</td>
<td>83.43 &#x00B1; 0.27</td>
<td>92.22 &#x00B1; 0.25</td>
<td>96.14 &#x00B1; 0.23</td>
</tr>
<tr>
<td>WFC_AS_KL [<xref ref-type="bibr" rid="ref-34">34</xref>]</td>
<td>94.59 &#x00B1; 0.25</td>
<td>90.27 &#x00B1; 0.29</td>
<td><bold>84.06 &#x00B1; 0.25</bold></td>
<td>86.08 &#x00B1; 0.24</td>
<td><bold>98.95 &#x00B1; 0.20</bold></td>
</tr>
<tr>
<td>EM-Net [<xref ref-type="bibr" rid="ref-21">21</xref>]</td>
<td>94.70 &#x00B1; 0.23</td>
<td>90.30 &#x00B1; 0.27</td>
<td>83.60 &#x00B1; 0.23</td>
<td><bold>92.40 &#x00B1; 0.22</bold></td>
<td>93.90 &#x00B1; 0.21</td>
</tr>
<tr>
<td>MSHV-Net [<xref ref-type="bibr" rid="ref-35">35</xref>]</td>
<td>94.79 &#x00B1; 0.21</td>
<td>89.16 &#x00B1; 0.26</td>
<td>80.43 &#x00B1; 0.22</td>
<td>87.91 &#x00B1; 0.20</td>
<td>97.01 &#x00B1; 0.18</td>
</tr>
<tr>
<td><bold>LANET</bold></td>
<td><bold>96.32 &#x00B1; 0.19</bold></td>
<td><bold>92.21 &#x00B1; 0.20</bold></td>
<td>83.37 &#x00B1; 0.18</td>
<td>90.84 &#x00B1; 0.19</td>
<td>98.22 &#x00B1; 0.17</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-7fn1" fn-type="other">
<p>Note: Bold values indicate the best result in each column.</p>
</fn>
</table-wrap-foot>
</table-wrap><table-wrap id="table-8">
<label>Table 8</label>
<caption>
<title>Comparison of SOTA performance and the LANET model on the PH2 dataset.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Model</th>
<th>Accuracy</th>
<th>Dice</th>
<th>IoU</th>
<th>Sensitivity</th>
<th>Specificity</th>
</tr>
</thead>
<tbody>
<tr>
<td>U-Net [<xref ref-type="bibr" rid="ref-5">5</xref>]</td>
<td>92.00 &#x00B1; 0.50</td>
<td>91.10 &#x00B1; 0.40</td>
<td>82.00 &#x00B1; 0.35</td>
<td>91.40 &#x00B1; 0.42</td>
<td>90.20 &#x00B1; 0.38</td>
</tr>
<tr>
<td>U-Net&#x002B;&#x002B; [<xref ref-type="bibr" rid="ref-6">6</xref>]</td>
<td>93.30 &#x00B1; 0.46</td>
<td>91.90 &#x00B1; 0.38</td>
<td>84.40 &#x00B1; 0.33</td>
<td>92.30 &#x00B1; 0.41</td>
<td>91.10 &#x00B1; 0.36</td>
</tr>
<tr>
<td>U-Net&#x002B;3 [<xref ref-type="bibr" rid="ref-30">30</xref>]</td>
<td>93.20 &#x00B1; 0.44</td>
<td>92.10 &#x00B1; 0.37</td>
<td>85.30 &#x00B1; 0.32</td>
<td>93.30 &#x00B1; 0.40</td>
<td>91.60 &#x00B1; 0.35</td>
</tr>
<tr>
<td>DeepLabv3&#x002B; [<xref ref-type="bibr" rid="ref-31">31</xref>]</td>
<td>94.70 &#x00B1; 0.39</td>
<td>93.60 &#x00B1; 0.34</td>
<td>86.10 &#x00B1; 0.29</td>
<td>93.20 &#x00B1; 0.38</td>
<td>92.00 &#x00B1; 0.34</td>
</tr>
<tr>
<td>WFC_AS_KL [<xref ref-type="bibr" rid="ref-34">34</xref>]</td>
<td>88.98 &#x00B1; 0.47</td>
<td>83.22 &#x00B1; 0.51</td>
<td>74.67 &#x00B1; 0.46</td>
<td>87.73 &#x00B1; 0.44</td>
<td>93.39 &#x00B1; 0.38</td>
</tr>
<tr>
<td>LiteMamba-Bound [<xref ref-type="bibr" rid="ref-36">36</xref>]</td>
<td>94.79 &#x00B1; 0.29</td>
<td><bold>95.62 &#x00B1; 0.33</bold></td>
<td><bold>91.70 &#x00B1; 0.25</bold></td>
<td><bold>96.20 &#x00B1; 0.27</bold></td>
<td>92.67 &#x00B1; 0.32</td>
</tr>
<tr>
<td>BDFormer [<xref ref-type="bibr" rid="ref-20">20</xref>]</td>
<td>95.14 &#x00B1; 0.26</td>
<td>92.66 &#x00B1; 0.35</td>
<td>86.32 &#x00B1; 0.28</td>
<td>95.15 &#x00B1; 0.31</td>
<td><bold>95.20 &#x00B1; 0.25</bold></td>
</tr>
<tr>
<td>EM-Net [<xref ref-type="bibr" rid="ref-21">21</xref>]</td>
<td>96.34 &#x00B1; 0.22</td>
<td>94.03 &#x00B1; 0.30</td>
<td>88.92 &#x00B1; 0.24</td>
<td>94.57 &#x00B1; 0.26</td>
<td>94.06 &#x00B1; 0.22</td>
</tr>
<tr>
<td><bold>LANET</bold></td>
<td><bold>97.89 &#x00B1; 0.18</bold></td>
<td>93.70 &#x00B1; 0.21</td>
<td>91.60 &#x00B1; 0.19</td>
<td>95.80 &#x00B1; 0.20</td>
<td>92.90 &#x00B1; 0.21</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-8fn1" fn-type="other">
<p>Note: Bold values indicate the best result in each column.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<sec id="s4_1">
<label>4.1</label>
<title>HAM10000 Dataset</title>
<p><xref ref-type="table" rid="table-5">Table 5</xref> reports the evaluation of our method on the HAM10000 dataset, where LANET achieved an accuracy of 96.44%, a Dice score of 94.53%, and an IoU of 92.61%. All three values are higher than those of U-Net, U-Net&#x002B;&#x002B;, DeepLabv3&#x002B;, and MRP-UNet. The model shows a 17.3-point gain in Dice over U-Net and a 5.4-point gain over U-Net&#x002B;&#x002B;. Compared with MRP-UNet, LANET improves Dice by 1.58 points, suggesting a stronger boundary detection ability. In the five-fold cross-validation, LANET achieved an average Dice of 96.4% &#x00B1; 0.6% and IoU of 89.1% &#x00B1; 0.7%, showing stable performance. <xref ref-type="fig" rid="fig-8">Fig. 8</xref> compares the five metrics across the models. <xref ref-type="fig" rid="fig-15">Fig. A1</xref> in <xref ref-type="app" rid="app-2">Appendix B</xref> shows the predictions in green and the ground truth in red.</p>
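<p>The Dice gains quoted above follow directly from the Dice column of <xref ref-type="table" rid="table-5">Table 5</xref>. The short script below recomputes them from the mean values in that table; the variable names are hypothetical.</p>
<code language="python">
```python
# Mean Dice scores (percent) on HAM10000, copied from Table 5.
dice = {
    "U-Net": 77.24,
    "U-Net++": 89.09,
    "U-Net+3": 88.24,
    "DeepLabv3+": 89.71,
    "MRP-UNet": 92.95,
    "LANET": 94.53,
}

# Gain of LANET over each baseline, in percentage points.
gains = {
    name: round(dice["LANET"] - value, 2)
    for name, value in dice.items()
    if name != "LANET"
}

for name, gain in gains.items():
    print(name, gain)
```
</code>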
<fig id="fig-8">
<label>Figure 8</label>
<caption>
<title>Comparison of accuracy, Dice, IoU, sensitivity, and specificity for LANET and baseline models on HAM10000.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-8.tif"/>
</fig>
<p>LANET obtained a Dice score of 94.53% and an accuracy of 96.44% on HAM10000, indicating competitive in-domain performance. When transferred to ISIC 2017, ISIC 2018, and PH2, LANET gave the best results, with Dice scores ranging from 90.61% to 91.09% and accuracies above 93% on all datasets, indicating stable segmentation performance across different datasets.</p>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>ISIC 2017 Dataset</title>
<p>LANET obtained the highest accuracy (96.78%) and the second-highest Dice score (92.90%) among all models. Its Dice score exceeds those of U-Net, U-Net&#x002B;&#x002B;, and U-Net&#x002B;3 by 6.7, 5.2, and 5.6 points, respectively. The sensitivity (90.30%) and specificity (92.83%) show that the model detects lesion areas accurately while limiting false positives. The results were statistically significant (<italic>p</italic> &#x003C; 0.05). Summary statistics of the models' performance are shown in <xref ref-type="fig" rid="fig-9">Fig. 9</xref>. <xref ref-type="fig" rid="fig-16">Fig. A2</xref> in <xref ref-type="app" rid="app-2">Appendix B</xref> shows segmented image samples. These qualitative examples support the quantitative findings: LANET traces lesion borders more accurately than the competing models and segments lesions of various shapes and sizes accurately.</p>
<fig id="fig-9">
<label>Figure 9</label>
<caption>
<title>Metric comparison for LANET and SOTA models on ISIC 2017.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-9.tif"/>
</fig>
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>ISIC 2018 Dataset</title>
<p><xref ref-type="table" rid="table-7">Table 7</xref> compares LANET with various DL models. LANET achieved the highest overall accuracy (96.32%) and Dice score (92.21%) on ISIC 2018. Its Dice score surpasses U-Net by 17.1 points and exceeds DSU-Net and EM-Net by roughly 2 points. The sensitivity (90.84%) and specificity (98.22%) demonstrate a balanced discrimination between the lesion and background regions. A statistical comparison of the performances is shown in <xref ref-type="fig" rid="fig-10">Fig. 10</xref>. Among all the SOTA models compared, LANET produced the second-highest specificity while maintaining competitive sensitivity. Representative segmentations are shown in <xref ref-type="fig" rid="fig-17">Fig. A3</xref> in <xref ref-type="app" rid="app-2">Appendix B</xref>. The qualitative outputs show improved boundary precision and reduced noise sensitivity.</p>
<fig id="fig-10">
<label>Figure 10</label>
<caption>
<title>Statistical performance of LANET and competing models on the ISIC 2018 dataset.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-10.tif"/>
</fig>
</sec>
<sec id="s4_4">
<label>4.4</label>
<title>PH2 Dataset</title>
<p><xref ref-type="table" rid="table-8">Table 8</xref> lists the performance on the PH2 dataset. LANET achieved the highest accuracy (97.89%), together with strong Dice (93.70%) and IoU (91.60%) scores. It improved Dice over U-Net by 18.4 points, U-Net&#x002B;&#x002B; by 5.3 points, and DeepLabv3&#x002B; by 2.7 points. LANET also achieved a high sensitivity (95.80%) and specificity (92.90%), indicating reliable lesion recognition with few false alarms. The model consistently outperformed the classical and transformer-based methods (<italic>p</italic> &#x003C; 0.05). The qualitative samples show accurate structural preservation even in challenging low-contrast lesions. The qualitative examples in <xref ref-type="fig" rid="fig-18">Fig. A4</xref> in <xref ref-type="app" rid="app-2">Appendix B</xref> show that LANET is more accurate at boundary rendering and lesion localization than the other models.</p>
</sec>
<sec id="s4_5">
<label>4.5</label>
<title>Cross-Dataset Evaluation</title>
<p>A cross-dataset experiment was performed to evaluate how well LANET generalizes to other imaging datasets, as shown in <xref ref-type="table" rid="table-9">Table 9</xref>. The model was trained on the HAM10000 dataset (refer to <xref ref-type="table" rid="table-5">Table 5</xref>) and tested on three unseen datasets (ISIC 2017, ISIC 2018, and PH2) without any retraining or fine-tuning. These datasets differ in illumination, lesion type, and color normalization. The cross-dataset results show that LANET maintains strong performance when applied to datasets with different imaging characteristics, indicating good generalization.</p>
<table-wrap id="table-9">
<label>Table 9</label>
<caption>
<title>Quantitative evaluation of LANET on cross-dataset testing.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Test Dataset</th>
<th>Accuracy</th>
<th>Dice</th>
<th>IoU</th>
<th>Sensitivity</th>
<th>Specificity</th>
</tr>
</thead>
<tbody>
<tr>
<td>ISIC 2017</td>
<td>93.81</td>
<td>90.79</td>
<td>82.00</td>
<td>88.89</td>
<td>91.71</td>
</tr>
<tr>
<td>ISIC 2018</td>
<td>94.02</td>
<td>91.09</td>
<td>83.07</td>
<td>89.02</td>
<td>92.12</td>
</tr>
<tr>
<td>PH2</td>
<td>93.46</td>
<td>90.61</td>
<td>82.11</td>
<td>88.50</td>
<td>91.51</td>
</tr>
</tbody>
</table>
</table-wrap>
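<p>As an illustration of how the five metrics reported in this section are conventionally computed, the following minimal Python sketch derives them from the pixel-wise confusion counts of a binary predicted mask and a binary ground-truth mask. The function name <monospace>segmentation_metrics</monospace> and the toy masks are hypothetical and are not taken from the LANET implementation; the sketch assumes that both lesion and background pixels are present in the ground truth.</p>
<preformat><![CDATA[
```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Pixel-wise metrics for two binary masks of equal shape (illustrative helper)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))      # lesion pixels predicted as lesion
    fp = int(np.sum(pred & ~truth))     # background pixels predicted as lesion
    fn = int(np.sum(~pred & truth))     # lesion pixels predicted as background
    tn = int(np.sum(~pred & ~truth))    # background pixels predicted as background
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Toy 4x4 masks (hypothetical data, not from any of the datasets above)
truth = np.array([[0, 0, 0, 0],
                  [0, 1, 1, 0],
                  [0, 1, 1, 0],
                  [0, 0, 0, 0]])
pred = np.array([[0, 0, 0, 0],
                 [0, 1, 1, 1],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0]])
metrics = segmentation_metrics(pred, truth)
```
]]></preformat>
<p>In this toy case, one missed lesion pixel and one false-positive pixel lower Dice and IoU much more than accuracy, which is why overlap metrics are reported alongside accuracy for lesion segmentation.</p>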
</sec>
<sec id="s4_6">
<label>4.6</label>
<title>Model Visualization for Interpretability</title>
<p>LANET is examined with interpretability tools (i.e., convolutional activation maps and Grad-CAM) that illustrate the image regions driving its predictions. These visualizations concentrate on the lesion regions rather than on irrelevant artifacts, which provides transparency and helps build clinical trust. Because DL models are often considered &#x201C;black boxes,&#x201D; these interpretability techniques shed light on what LANET has learned. Using the Dice coefficient, we also show that LANET achieves a good overlap between the predictions and ground-truth masks on multiple datasets. As shown in <xref ref-type="table" rid="table-10">Table 10</xref>, the high Dice scores indicate reliable and accurate segmentation and generalization, even under class imbalance and slight boundary variations.</p>
<table-wrap id="table-10">
<label>Table 10</label>
<caption>
<title>Dice scores (%) of LANET and baseline models on four public datasets.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Model</th>
<th>HAM10000</th>
<th>ISIC 2017</th>
<th>ISIC 2018</th>
<th>PH2</th>
</tr>
</thead>
<tbody>
<tr>
<td>U-Net [<xref ref-type="bibr" rid="ref-5">5</xref>]</td>
<td>77.24</td>
<td>86.17</td>
<td>75.11</td>
<td>91.10</td>
</tr>
<tr>
<td>U-Net&#x002B;&#x002B; [<xref ref-type="bibr" rid="ref-6">6</xref>]</td>
<td>89.09</td>
<td>87.72</td>
<td>80.26</td>
<td>91.90</td>
</tr>
<tr>
<td>U-Net&#x002B;3 [<xref ref-type="bibr" rid="ref-30">30</xref>]</td>
<td>88.24</td>
<td>87.27</td>
<td>81.32</td>
<td>92.10</td>
</tr>
<tr>
<td>DeepLabv3&#x002B; [<xref ref-type="bibr" rid="ref-31">31</xref>]</td>
<td>89.71</td>
<td>85.71</td>
<td>82.46</td>
<td>93.60</td>
</tr>
<tr>
<td><bold>LANET</bold></td>
<td><bold>94.53</bold></td>
<td><bold>92.90</bold></td>
<td><bold>92.21</bold></td>
<td><bold>93.70</bold></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-10fn1" fn-type="other">
<p>Note: Bold values indicate the best result in each column.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<sec id="s4_6_1">
<label>4.6.1</label>
<title>Interpretability Based on Convolutional Layers</title>
<p>Interpretability refers to the ability to explain, in clear technical terms, how a model arrives at its outputs. Visualizing the convolutional activations helps us understand how the model transforms the input lesion images and which learned features drive its decisions. Interpretability techniques highlight the image regions that influence the model&#x2019;s predictions and help verify that LANET focuses on clinically relevant structures during segmentation. This explainability is particularly important in medical imaging, where accurate but inscrutable models have limited clinical utility. <xref ref-type="fig" rid="fig-11">Fig. 11</xref> shows the feature activation maps generated by LANET at various layers. These maps demonstrate how the network gradually refines lesion-relevant information. The shallow layers respond to overall texture variations, whereas the deeper layers capture fine edges and structures. The highlighted regions show where LANET focuses during segmentation, indicating that the model attends to lesion contours and high-contrast regions.</p>
<fig id="fig-11">
<label>Figure 11</label>
<caption>
<title>Activation maps from LANET&#x2019;s convolutional layers; (<bold>a</bold>) early layers emphasize low-level cues such as textures and simple edges; and (<bold>b</bold>) deeper layers capture higher-level semantic patterns, including lesion shape and boundary complexity.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-11.tif"/>
</fig>
</sec>
<sec id="s4_6_2">
<label>4.6.2</label>
<title>Interpretability with Grad-CAM</title>
<p>To interpret how the LANET localizes lesion regions, the output scores and underlying convolutional features were linked using Grad-CAM. The strong-gradient locations correspond to the image regions that significantly affect the model decision. To explore the decision making of LANET, Grad-CAM was performed on several decoder layers. As shown in <xref ref-type="fig" rid="fig-12">Fig. 12</xref>, shallow layers focus on fine-grained details of the lesion by emphasizing the edges and irregular margins, whereas deeper layers highlight the overall structure of the lesions. This evolution shows that the LANET uses both local border cues and the global semantic context during segmentation. The class-specific activation map obtained is given by <xref ref-type="disp-formula" rid="eqn-10">Eq. (10)</xref>:<disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:msup><mml:mrow><mml:mtext>CAM</mml:mtext></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mrow><mml:mtext>ReLU</mml:mtext></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:munder><mml:msubsup><mml:mi>a</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup><mml:msup><mml:mi>A</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup><mml:mo stretchy="false">)</mml:mo></mml:math></disp-formula></p>
<fig id="fig-12">
<label>Figure 12</label>
<caption>
<title>LANET Grad-CAM overlays on ISIC 2018: (<bold>a</bold>) Input image; (<bold>b</bold>) Predicted mask; (<bold>c</bold>)&#x2013;(<bold>e</bold>) Grad-CAM overlays from shallow, intermediate, and deep layers, respectively.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-12.tif"/>
</fig>
<p><inline-formula id="ieqn-19"><mml:math id="mml-ieqn-19"><mml:msup><mml:mi>A</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is the <inline-formula id="ieqn-20"><mml:math id="mml-ieqn-20"><mml:msup><mml:mi>k</mml:mi><mml:mrow><mml:mrow><mml:mtext>th</mml:mtext></mml:mrow></mml:mrow></mml:msup></mml:math></inline-formula> feature map of the final convolutional layer and <inline-formula id="ieqn-21"><mml:math id="mml-ieqn-21"><mml:msubsup><mml:mi>a</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is the importance weight for feature map <inline-formula id="ieqn-22"><mml:math id="mml-ieqn-22"><mml:mi>k</mml:mi></mml:math></inline-formula> for class <inline-formula id="ieqn-23"><mml:math id="mml-ieqn-23"><mml:mi>c</mml:mi></mml:math></inline-formula>.</p>
<p>The weights <inline-formula id="ieqn-24"><mml:math id="mml-ieqn-24"><mml:msubsup><mml:mi>a</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> represent how strongly feature map <inline-formula id="ieqn-25"><mml:math id="mml-ieqn-25"><mml:mi>k</mml:mi></mml:math></inline-formula> influences the output score for class <inline-formula id="ieqn-26"><mml:math id="mml-ieqn-26"><mml:mi>c</mml:mi></mml:math></inline-formula>. They are computed as:<disp-formula id="eqn-11"><label>(11)</label><mml:math id="mml-eqn-11" display="block"><mml:msubsup><mml:mi>a</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>Z</mml:mi></mml:mfrac><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:munder><mml:munder><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:mfrac><mml:mrow><mml:mi mathvariant="normal">&#x2202;</mml:mi><mml:msup><mml:mi>y</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x2202;</mml:mi><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p><inline-formula id="ieqn-27"><mml:math id="mml-ieqn-27"><mml:msup><mml:mi>y</mml:mi><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is the score for class <inline-formula id="ieqn-28"><mml:math id="mml-ieqn-28"><mml:mi>c</mml:mi></mml:math></inline-formula>. <inline-formula id="ieqn-29"><mml:math id="mml-ieqn-29"><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is the activation at the spatial location <inline-formula id="ieqn-30"><mml:math id="mml-ieqn-30"><mml:mo stretchy="false">(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:math></inline-formula> in feature map <inline-formula id="ieqn-31"><mml:math id="mml-ieqn-31"><mml:mi>k</mml:mi></mml:math></inline-formula>. <inline-formula><mml:math><mml:mi>Z</mml:mi></mml:math></inline-formula> is the total number of spatial locations in <inline-formula id="ieqn-32"><mml:math id="mml-ieqn-32"><mml:msup><mml:mi>A</mml:mi><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>.</p>
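<p>The two Grad-CAM equations can be sketched in a few lines of NumPy. The example below is a minimal illustration rather than the code used for the reported figures: it assumes that the feature maps and the gradients of the class score with respect to them are already available as arrays of shape (K, H, W), whereas in practice the gradients would come from the automatic differentiation of a DL framework. The function name <monospace>grad_cam</monospace> and the toy values are hypothetical.</p>
<preformat><![CDATA[
```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM map from feature maps A^k and gradients dy^c/dA^k.

    Both inputs are arrays of shape (K, H, W). Returns the map and the weights.
    """
    # Eq. (11): average the gradients over all Z = H * W spatial locations
    weights = gradients.mean(axis=(1, 2))
    # Eq. (10): weighted sum of the K feature maps, followed by ReLU
    weighted_sum = np.tensordot(weights, activations, axes=1)
    return np.maximum(weighted_sum, 0.0), weights

# Toy input with K = 2 feature maps of size 2 x 2 (hypothetical values)
A = np.array([[[1.0, 2.0], [3.0, 4.0]],
              [[1.0, 1.0], [1.0, 1.0]]])
G = np.array([[[1.0, 1.0], [1.0, 1.0]],
              [[-2.0, -2.0], [-2.0, -6.0]]])
cam, weights = grad_cam(A, G)
```
]]></preformat>
<p>In this toy input, the second feature map receives a negative weight, so the ReLU removes the locations where it dominates and only positively contributing regions remain in the map.</p>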
<p>A more comprehensive Grad-CAM analysis over the entire ISIC 2018 dataset found that, in 88% of the cases, the highlighted regions coincided with relevant areas (e.g., lesion borders and pigment structures), as illustrated in <xref ref-type="fig" rid="fig-12">Fig. 12</xref>. Reduced overlap was observed primarily for low-contrast lesions or visually occluded regions. These results indicate that LANET reliably focuses on diagnostically meaningful regions and support the interpretability and clinical reliability of our model.</p>
</sec>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Ablation Experiment</title>
<p>The ablation study quantifies the contribution of each module to the overall performance. Specifically, the Dice and IoU scores decreased significantly without the ASPP module, confirming that multiscale context aggregation is required to delineate lesion boundaries across varying sizes. Sensitivity decreased without the attention mechanism, indicating that attention improves the focus on subtle or low-contrast lesions. Replacing standard convolutions with depthwise separable convolutions reduces the computational cost while preserving accuracy, confirming their relevance for lightweight implementations. Together, these findings justify the inclusion of each architectural element and confirm the design choices made for LANET.</p>
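<p>The parameter saving offered by depthwise separable convolutions can be illustrated with a short calculation. The sketch below is a generic comparison and not a description of how the LANET blocks are wired; the helper names are hypothetical. For 16 input and 16 output channels, the resulting counts coincide with the 3 &#x00D7; 3 convolution (2320), grouped convolution (160), and 1 &#x00D7; 1 convolution (272) entries listed in Appendix A.</p>
<preformat><![CDATA[
```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a standard kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """Depthwise (one filter per input channel) followed by a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * c_in + c_in
    pointwise = c_in * c_out + c_out
    return depthwise + pointwise

standard = conv_params(3, 16, 16)                   # 2304 weights + 16 biases = 2320
separable = depthwise_separable_params(3, 16, 16)   # 160 + 272 = 432
ratio = separable / standard                        # fraction of parameters retained
```
]]></preformat>
<p>Under these assumptions, the separable form keeps fewer than one fifth of the parameters of the standard 3 &#x00D7; 3 convolution, and the saving grows with the number of channels.</p>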
<p>To evaluate the contribution of the individual components of the LANET architecture, an ablation study was performed on the ISIC 2018 dataset. The tested configurations and their results are summarized in <xref ref-type="table" rid="table-11">Table 11</xref>; all variants were trained and evaluated using the same experimental setup for a fair comparison. The baseline U-Net model reached an Intersection over Union (IoU) of 64.58% and a Dice score of 75.11%. Incorporating depthwise separable convolutions into U-Net increased the IoU to 78.20%. Adding the attention mechanism further increased the IoU to 81.10%. Combining all of the proposed components, namely the ASPP, the bespoke attention mechanism, and depthwise convolutions, gave the best results, with an IoU of 81.90%. Overall, the full LANET architecture increased the IoU by approximately 17.3 points compared with the baseline U-Net. <xref ref-type="fig" rid="fig-13">Fig. 13</xref> shows the effect of each component on the LANET segmentations.</p>
<table-wrap id="table-11">
<label>Table 11</label>
<caption>
<title>Ablation study results for LANET using the ISIC 2018 dataset.</title>
</caption>
<table>
<colgroup>
<col align="center" width="60mm"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Configuration</th>
<th>Accuracy</th>
<th>Dice</th>
<th>IoU</th>
<th>Sensitivity</th>
</tr>
</thead>
<tbody>
<tr>
<td>U-Net (Baseline)</td>
<td>85.96 &#x00B1; 0.42</td>
<td>75.11 &#x00B1; 0.38</td>
<td>64.58 &#x00B1; 0.55</td>
<td>75.55 &#x00B1; 0.47</td>
</tr>
<tr>
<td>U-Net &#x002B; Depthwise Conv.</td>
<td>91.20 &#x00B1; 0.33</td>
<td>83.15 &#x00B1; 0.29</td>
<td>78.20 &#x00B1; 0.26</td>
<td>84.30 &#x00B1; 0.35</td>
</tr>
<tr>
<td>U-Net &#x002B; Depthwise Conv. &#x002B; Attention</td>
<td>94.80 &#x00B1; 0.27</td>
<td>87.50 &#x00B1; 0.31</td>
<td>81.10 &#x00B1; 0.24</td>
<td>87.20 &#x00B1; 0.28</td>
</tr>
<tr>
<td>U-Net &#x002B; Depthwise Conv. &#x002B; Attention &#x002B; ASPP (LANET)</td>
<td><bold>96.78 &#x00B1; 0.25</bold></td>
<td><bold>92.90 &#x00B1; 0.22</bold></td>
<td><bold>81.90 &#x00B1; 0.20</bold></td>
<td><bold>90.30 &#x00B1; 0.23</bold></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="table-11fn1" fn-type="other">
<p>Note: Bold values indicate the best result in each column.</p>
</fn>
</table-wrap-foot>
</table-wrap><fig id="fig-13">
<label>Figure 13</label>
<caption>
<title>Effect of each component of the LANET model, with the prediction in green and the ground truth in red: (<bold>a</bold>) Original images; (<bold>b</bold>) Ground truth; (<bold>c</bold>) Baseline U-Net segmentation; (<bold>d</bold>) U-Net with depthwise convolutions; (<bold>e</bold>) U-Net with depthwise convolutions and attention mechanism; and (<bold>f</bold>) LANET segmentation.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-13.tif"/>
</fig>
</sec>
<sec id="s6">
<label>6</label>
<title>Discussion</title>
<p>LANET offers a compact architecture that maintains strong segmentation accuracy while requiring minimal computational resources. By combining depthwise separable convolutions, a custom attention mechanism, and an ASPP module in a compact encoder&#x2013;decoder architecture, LANET achieves high segmentation performance with only 0.85 million parameters and a 3.2 MB memory footprint. The results across the HAM10000, ISIC 2017, ISIC 2018, and PH2 datasets indicate that LANET achieves state-of-the-art performance in comparison with U-Net, U-Net&#x002B;&#x002B;, and DeepLabv3&#x002B;, indicating a strong boundary preservation ability and robust lesion localization.</p>
<p>LANET&#x2019;s low computational demand supports deployment in clinical settings, where hardware resources may be limited. Its low latency and small memory footprint make it well suited for real-time applications in clinical workstations as well as mobile or embedded systems. These properties are relevant because many large-capacity models require hardware that is not attainable in resource-constrained environments. It is encouraging that, with proper model design, lightweight models can achieve clinically relevant accuracy without compromising interpretability or usability.</p>
<p>Extensions, such as model compression, uncertainty estimation, and multimodal integration, may further improve clinical use. Model compression or knowledge distillation approaches can further reduce memory consumption. Uncertainty estimation could flag cases that are difficult or ambiguous for the model and may support clinician trust in human&#x2013;AI collaboration. Further cross-validation and a broader multi-dataset evaluation would also serve to test the stability across populations and imaging parameters. Finally, applying LANET to other imaging modalities, such as MRI or CT, would test the generality of its feature representations and attention mechanisms.</p>
<p>The Waterloo skin-tone dataset [<xref ref-type="bibr" rid="ref-37">37</xref>] provides complementary information. LANET achieved high accuracy on all six samples and high sensitivity on five of them, indicating robust lesion detection under diverse skin tones. As shown in <xref ref-type="table" rid="table-12">Table 12</xref>, the Dice and IoU scores indicate substantial overlap with the ground truth in some cases, although over- or under-segmentation was observed in others, particularly in lesions with low contrast or irregular boundaries.</p>
<table-wrap id="table-12">
<label>Table 12</label>
<caption>
<title>Quantitative results of evaluating a sample of the Waterloo dataset.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/> </colgroup>
<thead>
<tr>
<th>Images</th>
<th>Accuracy</th>
<th>Dice</th>
<th>IoU</th>
<th>Sensitivity</th>
<th>Specificity</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0.98854</td>
<td>0.95268</td>
<td>0.90963</td>
<td>0.99776</td>
<td>0.98734</td>
</tr>
<tr>
<td>2</td>
<td>0.95105</td>
<td>0.91762</td>
<td>0.84777</td>
<td>0.96719</td>
<td>0.94472</td>
</tr>
<tr>
<td>3</td>
<td>0.92951</td>
<td>0.86534</td>
<td>0.76265</td>
<td>0.98732</td>
<td>0.91230</td>
</tr>
<tr>
<td>4</td>
<td>0.87074</td>
<td>0.21742</td>
<td>0.12197</td>
<td>1.00000</td>
<td>0.86837</td>
</tr>
<tr>
<td>5</td>
<td>0.99081</td>
<td>0.85643</td>
<td>0.74891</td>
<td>0.98708</td>
<td>0.99092</td>
</tr>
<tr>
<td>6</td>
<td>0.89346</td>
<td>0.75332</td>
<td>0.60426</td>
<td>0.60426</td>
<td>1.00000</td>
</tr>
</tbody>
</table>
</table-wrap>
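<p>For a single image, Dice and IoU are tied by the identity Dice &#x003D; 2 &#x00D7; IoU/(1 &#x002B; IoU), which offers a quick consistency check on the per-image values in Table 12. The short Python sketch below applies this identity to the six rows; the helper name <monospace>dice_from_iou</monospace> is hypothetical, and the numbers are copied from the table.</p>
<preformat><![CDATA[
```python
def dice_from_iou(iou):
    """Per-image identity between the Dice coefficient and IoU."""
    return 2.0 * iou / (1.0 + iou)

# (image, reported Dice, reported IoU) copied from Table 12
table12 = [
    (1, 0.95268, 0.90963),
    (2, 0.91762, 0.84777),
    (3, 0.86534, 0.76265),
    (4, 0.21742, 0.12197),
    (5, 0.85643, 0.74891),
    (6, 0.75332, 0.60426),
]

# Largest absolute difference between the reported and the derived Dice
max_gap = max(abs(dice_from_iou(iou) - dice) for _, dice, iou in table12)
```
]]></preformat>
<p>The same reasoning helps interpret the two weakest rows. For image 4, the sensitivity of 1.0 means that no lesion pixel was missed, so the low IoU reflects false positives only, that is, over-segmentation. For image 6, the specificity of 1.0 means that there were no false positives, so the IoU equals the sensitivity (0.60426), which points to under-segmentation.</p>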
<p>These results agree with the qualitative examination, in which most examples showed good structural agreement, although some exhibited false positives or boundary drift. These observations suggest that improvements in texture modelling and contrast normalization may further improve robustness. <xref ref-type="fig" rid="fig-14">Fig. 14</xref> shows the qualitative results of the LANET segmentation on six example images from the Waterloo dataset. These results indicate that lightweight and interpretable architectures can achieve high performance and practical clinical value.</p>
<fig id="fig-14">
<label>Figure 14</label>
<caption>
<title>Qualitative results of testing six images from the Waterloo dataset, with the prediction in green and the ground truth in red.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-14.tif"/>
</fig>
</sec>
<sec id="s7">
<label>7</label>
<title>Limitations and Future Work</title>
<p>The datasets used in this study may not cover the full range of clinical imaging devices, acquisition techniques, and patient populations. Broader evaluations across institutions and devices, on diverse large-scale datasets, would improve our understanding of LANET&#x2019;s robustness and generalization. More importantly, the model relies solely on image features and cannot leverage patient metadata or clinical knowledge, which limits it when the diagnosis is ambiguous without clinical background. Although LANET performs well on public datasets, its applicability to wider clinical populations remains uncertain. Future research should address these limitations.</p>
<p>Future studies will investigate the robustness across acquisition technologies, lighting conditions, and lesion properties, and will include more diverse datasets collected from multiple sources under various conditions. This evaluation aims to provide a comprehensive view of how well LANET can transfer learned representations to new, unseen clinical data distributions.</p>
</sec>
<sec id="s8">
<label>8</label>
<title>Conclusion</title>
<p>This study introduced LANET, a lightweight attention-based network for skin lesion segmentation. LANET integrates depthwise convolutions, attention, and multiscale context to achieve accurate segmentation at a low computational cost. It has only 0.85 million parameters and requires approximately 3.2 MB of memory. It performs better than widely used architectures on the four benchmark datasets. The results demonstrate that LANET provides precise, robust, and reusable segmentation. Its small size and low latency make it suitable for real-time use on mobile clinical hardware. These features lower the barrier to adopting DL in dermatology by providing a model that works well in both high-resource and low-resource contexts.</p>
<p>Future work will examine performance across wider datasets and clinical settings. This study provides a robust basis for the development of future diagnostic tools that can help clinicians in routine clinical practice. The use of LANET in other medical imaging applications can further demonstrate its versatility. Thus, LANET offers a feasible and effective stepping stone toward resource-efficient, interpretable, and clinically deployable segmentation of skin lesions.</p>
</sec>
</body>
<back>
<ack>
<p>The authors gratefully acknowledge the Faculty of Computer Science and Information Technology, Universiti Putra Malaysia (UPM), for the institutional support provided for this research. The authors also sincerely thank the reviewers for their careful evaluation and constructive comments, which contributed to improving the quality of the manuscript.</p>
</ack>
<sec>
<title>Funding Statement</title>
<p>The authors received no specific funding for this study.</p>
</sec>
<sec>
<title>Author Contributions</title>
<p>The authors confirm contribution to the paper as follows: study conception and design: Abdulrahman Dira Khalaf, Hazlina Hamdan; data collection: Abdulrahman Dira Khalaf; analysis and interpretation of results: Abdulrahman Dira Khalaf, Hazlina Hamdan, Alfian Abdul Halin, Noridayu Manshor; draft manuscript preparation: Abdulrahman Dira Khalaf. All authors reviewed and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability">
<title>Availability of Data and Materials</title>
<p>The datasets analyzed in this study are publicly available. HAM10000, ISIC 2017, and ISIC 2018 datasets are available from the International Skin Imaging Collaboration (ISIC) archive. The PH2 dataset is available from the Dermatology Service of Hospital Pedro Hispano. No new data was generated or analyzed in this study. All datasets are available from their respective public repositories without restriction.</p>
</sec>
<sec>
<title>Ethics Approval</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement">
<title>Conflicts of Interest</title>
<p>The authors declare no conflicts of interest.</p>
</sec>
<app-group id="appg-1">
<app id="app-1">
<title>Appendix A Detailed Layer-Wise Configuration and Learnable Parameter Distribution of the Proposed LANET Architecture</title>
<table-wrap id="table-13">
<table>
<colgroup>
<col align="center" width="35mm"/>
<col align="center" width="35mm"/>
<col align="center" width="65mm"/>
<col align="center" width="12mm"/>
</colgroup>
<thead>
<tr>
<th>Layer Type</th>
<th>Activation Size</th>
<th>Learnable Parameters</th>
<th>Count</th>
</tr>
</thead>
<tbody>
<tr>
<td>2-D Convolution_1</td>
<td>224 &#x00D7; 224 &#x00D7; 16 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 3 &#x00D7; 16, Bias 1 &#x00D7; 1 &#x00D7; 16</td>
<td>448</td>
</tr>
<tr>
<td>Batch Normalization_1</td>
<td>224 &#x00D7; 224 &#x00D7; 16 &#x00D7; 1</td>
<td>Offset 1 &#x00D7; 1 &#x00D7; 16, Scale 1 &#x00D7; 1 &#x00D7; 16</td>
<td>32</td>
</tr>
<tr>
<td>Grouped Convolution_1</td>
<td>224 &#x00D7; 224 &#x00D7; 16 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 1 &#x00D7; 1 &#x00D7; 16, Bias 1 &#x00D7; 1 &#x00D7; 1 &#x00D7; 16</td>
<td>160</td>
</tr>
<tr>
<td>2-D Convolution_2</td>
<td>224 &#x00D7; 224 &#x00D7; 16 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 16 &#x00D7; 16, Bias 1 &#x00D7; 1 &#x00D7; 16</td>
<td>2320</td>
</tr>
<tr>
<td>2-D Convolution_3</td>
<td>224 &#x00D7; 224 &#x00D7; 16 &#x00D7; 1</td>
<td>Weights 1 &#x00D7; 1 &#x00D7; 16 &#x00D7; 16, Bias 1 &#x00D7; 1 &#x00D7; 16</td>
<td>272</td>
</tr>
<tr>
<td>2-D Convolution_4</td>
<td>112 &#x00D7; 112 &#x00D7; 32 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 16 &#x00D7; 32, Bias 1 &#x00D7; 1 &#x00D7; 32</td>
<td>4640</td>
</tr>
<tr>
<td>Batch Normalization_2</td>
<td>112 &#x00D7; 112 &#x00D7; 32 &#x00D7; 1</td>
<td>Offset 1 &#x00D7; 1 &#x00D7; 32, Scale 1 &#x00D7; 1 &#x00D7; 32</td>
<td>64</td>
</tr>
<tr>
<td>Grouped Convolution_2</td>
<td>112 &#x00D7; 112 &#x00D7; 32 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 1 &#x00D7; 1 &#x00D7; 32, Bias 1 &#x00D7; 1 &#x00D7; 1 &#x00D7; 32</td>
<td>320</td>
</tr>
<tr>
<td>2-D Convolution_5</td>
<td>112 &#x00D7; 112 &#x00D7; 32 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 32 &#x00D7; 32, Bias 1 &#x00D7; 1 &#x00D7; 32</td>
<td>9248</td>
</tr>
<tr>
<td>2-D Convolution_6</td>
<td>112 &#x00D7; 112 &#x00D7; 32 &#x00D7; 1</td>
<td>Weights 1 &#x00D7; 1 &#x00D7; 32 &#x00D7; 32, Bias 1 &#x00D7; 1 &#x00D7; 32</td>
<td>1056</td>
</tr>
<tr>
<td>2-D Convolution_7</td>
<td>56 &#x00D7; 56 &#x00D7; 64 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 32 &#x00D7; 64, Bias 1 &#x00D7; 1 &#x00D7; 64</td>
<td>18,496</td>
</tr>
<tr>
<td>Batch Normalization_3</td>
<td>56 &#x00D7; 56 &#x00D7; 64 &#x00D7; 1</td>
<td>Offset 1 &#x00D7; 1 &#x00D7; 64, Scale 1 &#x00D7; 1 &#x00D7; 64</td>
<td>128</td>
</tr>
<tr>
<td>Grouped Convolution_3</td>
<td>56 &#x00D7; 56 &#x00D7; 64 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 1 &#x00D7; 1 &#x00D7; 64, Bias 1 &#x00D7; 1 &#x00D7; 1 &#x00D7; 64</td>
<td>640</td>
</tr>
<tr>
<td>2-D Convolution_8</td>
<td>56 &#x00D7; 56 &#x00D7; 64 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 64 &#x00D7; 64, Bias 1 &#x00D7; 1 &#x00D7; 64</td>
<td>36,928</td>
</tr>
<tr>
<td>2-D Convolution_9</td>
<td>56 &#x00D7; 56 &#x00D7; 64 &#x00D7; 1</td>
<td>Weights 1 &#x00D7; 1 &#x00D7; 64 &#x00D7; 64, Bias 1 &#x00D7; 1 &#x00D7; 64</td>
<td>4160</td>
</tr>
<tr>
<td>2-D Convolution_10</td>
<td>28 &#x00D7; 28 &#x00D7; 128 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 64 &#x00D7; 128, Bias 1 &#x00D7; 1 &#x00D7; 128</td>
<td>73,856</td>
</tr>
<tr>
<td>Batch Normalization_4</td>
<td>28 &#x00D7; 28 &#x00D7; 128 &#x00D7; 1</td>
<td>Offset 1 &#x00D7; 1 &#x00D7; 128, Scale 1 &#x00D7; 1 &#x00D7; 128</td>
<td>256</td>
</tr>
<tr>
<td>Grouped Convolution_4</td>
<td>28 &#x00D7; 28 &#x00D7; 128 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 1 &#x00D7; 1 &#x00D7; 128, Bias 1 &#x00D7; 1 &#x00D7; 1 &#x00D7; 128</td>
<td>1280</td>
</tr>
<tr>
<td>2-D Convolution_11</td>
<td>28 &#x00D7; 28 &#x00D7; 128 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 128 &#x00D7; 128, Bias 1 &#x00D7; 1 &#x00D7; 128</td>
<td>147,584</td>
</tr>
<tr>
<td>2-D Convolution_12</td>
<td>28 &#x00D7; 28 &#x00D7; 128 &#x00D7; 1</td>
<td>Weights 1 &#x00D7; 1 &#x00D7; 128 &#x00D7; 128, Bias 1 &#x00D7; 1 &#x00D7; 128</td>
<td>16,512</td>
</tr>
<tr>
<td>2-D Convolution_13</td>
<td>14 &#x00D7; 14 &#x00D7; 16 &#x00D7; 1</td>
<td>Weights 1 &#x00D7; 1 &#x00D7; 128 &#x00D7; 16, Bias 1 &#x00D7; 1 &#x00D7; 16</td>
<td>2064</td>
</tr>
<tr>
<td>Batch Normalization_5</td>
<td>14 &#x00D7; 14 &#x00D7; 16 &#x00D7; 1</td>
<td>Offset 1 &#x00D7; 1 &#x00D7; 16, Scale 1 &#x00D7; 1 &#x00D7; 16</td>
<td>32</td>
</tr>
<tr>
<td>2-D Convolution_14</td>
<td>14 &#x00D7; 14 &#x00D7; 16 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 128 &#x00D7; 16, Bias 1 &#x00D7; 1 &#x00D7; 16</td>
<td>18,448</td>
</tr>
<tr>
<td>Batch Normalization_6</td>
<td>14 &#x00D7; 14 &#x00D7; 16 &#x00D7; 1</td>
<td>Offset 1 &#x00D7; 1 &#x00D7; 16, Scale 1 &#x00D7; 1 &#x00D7; 16</td>
<td>32</td>
</tr>
<tr>
<td>2-D Convolution_15</td>
<td>14 &#x00D7; 14 &#x00D7; 16 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 128 &#x00D7; 16, Bias 1 &#x00D7; 1 &#x00D7; 16</td>
<td>18,448</td>
</tr>
<tr>
<td>Batch Normalization_7</td>
<td>14 &#x00D7; 14 &#x00D7; 16 &#x00D7; 1</td>
<td>Offset 1 &#x00D7; 1 &#x00D7; 16, Scale 1 &#x00D7; 1 &#x00D7; 16</td>
<td>32</td>
</tr>
<tr>
<td>2-D Convolution_16</td>
<td>14 &#x00D7; 14 &#x00D7; 16 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 128 &#x00D7; 16, Bias 1 &#x00D7; 1 &#x00D7; 16</td>
<td>18,448</td>
</tr>
<tr>
<td>Batch Normalization_8</td>
<td>14 &#x00D7; 14 &#x00D7; 16 &#x00D7; 1</td>
<td>Offset 1 &#x00D7; 1 &#x00D7; 16, Scale 1 &#x00D7; 1 &#x00D7; 16</td>
<td>32</td>
</tr>
<tr>
<td>Transposed Convolution_1</td>
<td>28 &#x00D7; 28 &#x00D7; 128 &#x00D7; 1</td>
<td>Weights 2 &#x00D7; 2 &#x00D7; 128 &#x00D7; 64, Bias 1 &#x00D7; 1 &#x00D7; 128</td>
<td>32,896</td>
</tr>
<tr>
<td>2-D Convolution_17</td>
<td>28 &#x00D7; 28 &#x00D7; 128 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 256 &#x00D7; 128, Bias 1 &#x00D7; 1 &#x00D7; 128</td>
<td>295,040</td>
</tr>
<tr>
<td>Batch Normalization_9</td>
<td>28 &#x00D7; 28 &#x00D7; 128 &#x00D7; 1</td>
<td>Offset 1 &#x00D7; 1 &#x00D7; 128, Scale 1 &#x00D7; 1 &#x00D7; 128</td>
<td>256</td>
</tr>
<tr>
<td>Grouped Convolution_5</td>
<td>28 &#x00D7; 28 &#x00D7; 128 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 1 &#x00D7; 1 &#x00D7; 128, Bias 1 &#x00D7; 1 &#x00D7; 1 &#x00D7; 128</td>
<td>1280</td>
</tr>
<tr>
<td>Transposed Convolution_2</td>
<td>56 &#x00D7; 56 &#x00D7; 64 &#x00D7; 1</td>
<td>Weights 2 &#x00D7; 2 &#x00D7; 64 &#x00D7; 128, Bias 1 &#x00D7; 1 &#x00D7; 64</td>
<td>32,832</td>
</tr>
<tr>
<td>2-D Convolution_18</td>
<td>56 &#x00D7; 56 &#x00D7; 64 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 128 &#x00D7; 64, Bias 1 &#x00D7; 1 &#x00D7; 64</td>
<td>73,792</td>
</tr>
<tr>
<td>Batch Normalization_10</td>
<td>56 &#x00D7; 56 &#x00D7; 64 &#x00D7; 1</td>
<td>Offset 1 &#x00D7; 1 &#x00D7; 64, Scale 1 &#x00D7; 1 &#x00D7; 64</td>
<td>128</td>
</tr>
<tr>
<td>Grouped Convolution_6</td>
<td>56 &#x00D7; 56 &#x00D7; 64 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 1 &#x00D7; 1 &#x00D7; 64, Bias 1 &#x00D7; 1 &#x00D7; 1 &#x00D7; 64</td>
<td>640</td>
</tr>
<tr>
<td>Transposed Convolution_3</td>
<td>112 &#x00D7; 112 &#x00D7; 32 &#x00D7; 1</td>
<td>Weights 2 &#x00D7; 2 &#x00D7; 32 &#x00D7; 64, Bias 1 &#x00D7; 1 &#x00D7; 32</td>
<td>8224</td>
</tr>
<tr>
<td>2-D Convolution_19</td>
<td>112 &#x00D7; 112 &#x00D7; 32 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 64 &#x00D7; 32, Bias 1 &#x00D7; 1 &#x00D7; 32</td>
<td>18,464</td>
</tr>
<tr>
<td>Batch Normalization_11</td>
<td>112 &#x00D7; 112 &#x00D7; 32 &#x00D7; 1</td>
<td>Offset 1 &#x00D7; 1 &#x00D7; 32, Scale 1 &#x00D7; 1 &#x00D7; 32</td>
<td>64</td>
</tr>
<tr>
<td>Grouped Convolution_7</td>
<td>112 &#x00D7; 112 &#x00D7; 32 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 1 &#x00D7; 1 &#x00D7; 32, Bias 1 &#x00D7; 1 &#x00D7; 1 &#x00D7; 32</td>
<td>320</td>
</tr>
<tr>
<td>Transposed Convolution_4</td>
<td>224 &#x00D7; 224 &#x00D7; 16 &#x00D7; 1</td>
<td>Weights 2 &#x00D7; 2 &#x00D7; 16 &#x00D7; 32, Bias 1 &#x00D7; 1 &#x00D7; 16</td>
<td>2064</td>
</tr>
<tr>
<td>2-D Convolution_20</td>
<td>224 &#x00D7; 224 &#x00D7; 16 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 32 &#x00D7; 16, Bias 1 &#x00D7; 1 &#x00D7; 16</td>
<td>4624</td>
</tr>
<tr>
<td>Batch Normalization_12</td>
<td>224 &#x00D7; 224 &#x00D7; 16 &#x00D7; 1</td>
<td>Offset 1 &#x00D7; 1 &#x00D7; 16, Scale 1 &#x00D7; 1 &#x00D7; 16</td>
<td>32</td>
</tr>
<tr>
<td>Grouped Convolution_8</td>
<td>224 &#x00D7; 224 &#x00D7; 16 &#x00D7; 1</td>
<td>Weights 3 &#x00D7; 3 &#x00D7; 1 &#x00D7; 1 &#x00D7; 16, Bias 1 &#x00D7; 1 &#x00D7; 1 &#x00D7; 16</td>
<td>160</td>
</tr>
<tr>
<td>2-D Convolution_21</td>
<td>224 &#x00D7; 224 &#x00D7; 2 &#x00D7; 1</td>
<td>Weights 1 &#x00D7; 1 &#x00D7; 16 &#x00D7; 2, Bias 1 &#x00D7; 1 &#x00D7; 2</td>
<td>34</td>
</tr>
</tbody>
</table>
</table-wrap>
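<p>The learnable-parameter counts in the table above follow from summing the element counts of each layer's learnable arrays (weights plus bias, or offset plus scale). The following is a minimal sketch in Python for checking a few rows; it is not the authors' code, and the helper name <monospace>param_count</monospace> and the choice of rows are illustrative assumptions.</p>
<preformat preformat-type="code" xml:space="preserve">
```python
from math import prod


def param_count(*shapes):
    """Sum the element counts of all learnable arrays of one layer.

    Hypothetical helper for illustration; each shape is a tuple of dimensions.
    """
    return sum(prod(shape) for shape in shapes)


# (layer name, weight or offset shape, bias or scale shape, count listed in the table)
rows = [
    ("2-D Convolution_8", (3, 3, 64, 64), (1, 1, 64), 36928),
    ("Grouped Convolution_4", (3, 3, 1, 1, 128), (1, 1, 1, 128), 1280),
    ("Transposed Convolution_2", (2, 2, 64, 128), (1, 1, 64), 32832),
    ("Batch Normalization_10", (1, 1, 64), (1, 1, 64), 128),
    ("2-D Convolution_21", (1, 1, 16, 2), (1, 1, 2), 34),
]

for name, first, second, listed in rows:
    computed = param_count(first, second)
    status = "matches" if computed == listed else "differs from"
    print(f"{name}: computed {computed} {status} listed {listed}")
```
</preformat>
<p>The same check can be applied to any other row by adding its shapes and listed count to <monospace>rows</monospace>.</p>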
</app>
<app id="app-2">
<title>Appendix B Qualitative Segmentation Results on Public Skin Lesion Datasets</title>
<fig id="fig-15">
<label>Figure A1</label>
<caption>
<title>Examples of HAM10000 segmentations. Green indicates LANET predictions; red indicates ground truth.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-15.tif"/>
</fig>
<fig id="fig-16">
<label>Figure A2</label>
<caption>
<title>Examples of ISIC 2017 segmentations with predictions in green and ground truth in red.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-16.tif"/>
</fig>
<fig id="fig-17">
<label>Figure A3</label>
<caption>
<title>Segmentation examples from ISIC 2018 showing predictions (green) and ground truth (red).</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-17.tif"/>
</fig>
<fig id="fig-18">
<label>Figure A4</label>
<caption>
<title>Segmentation results from the PH2 dataset with predictions shown in green and ground truth in red.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMES_75537-fig-18.tif"/>
</fig>
</app>
</app-group>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Siegel</surname> <given-names>RL</given-names></string-name>, <string-name><surname>Kratzer</surname> <given-names>TB</given-names></string-name>, <string-name><surname>Giaquinto</surname> <given-names>AN</given-names></string-name>, <string-name><surname>Sung</surname> <given-names>H</given-names></string-name>, <string-name><surname>Jemal</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Cancer statistics, 2025</article-title>. <source>CA Cancer J Clin</source>. <year>2025</year>;<volume>75</volume>(<issue>1</issue>):<fpage>10</fpage>&#x2013;<lpage>45</lpage>. doi:<pub-id pub-id-type="doi">10.3322/caac.21871</pub-id>; <pub-id pub-id-type="pmid">39817679</pub-id></mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hameed</surname> <given-names>M</given-names></string-name>, <string-name><surname>Zameer</surname> <given-names>A</given-names></string-name>, <string-name><surname>Raja</surname> <given-names>MAZ</given-names></string-name></person-group>. <article-title>A comprehensive systematic review: advancements in skin cancer classification and segmentation using the ISIC dataset</article-title>. <source>Comput Model Eng Sci</source>. <year>2024</year>;<volume>140</volume>(<issue>3</issue>):<fpage>2131</fpage>&#x2013;<lpage>64</lpage>. doi:<pub-id pub-id-type="doi">10.32604/cmes.2024.050124</pub-id>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kassem</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Hosny</surname> <given-names>KM</given-names></string-name>, <string-name><surname>Dama&#x0161;evi&#x010D;ius</surname> <given-names>R</given-names></string-name>, <string-name><surname>Eltoukhy</surname> <given-names>MM</given-names></string-name></person-group>. <article-title>Machine learning and deep learning methods for skin lesion classification and diagnosis: a systematic review</article-title>. <source>Diagnostics</source>. <year>2021</year>;<volume>11</volume>(<issue>8</issue>):<fpage>1390</fpage>. doi:<pub-id pub-id-type="doi">10.3390/diagnostics11081390</pub-id>; <pub-id pub-id-type="pmid">34441324</pub-id></mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yang</surname> <given-names>G</given-names></string-name>, <string-name><surname>Luo</surname> <given-names>S</given-names></string-name>, <string-name><surname>Greer</surname> <given-names>P</given-names></string-name></person-group>. <article-title>Advancements in skin cancer classification: a review of machine learning techniques in clinical image analysis</article-title>. <source>Multimed Tools Appl</source>. <year>2025</year>;<volume>84</volume>(<issue>11</issue>):<fpage>9837</fpage>&#x2013;<lpage>64</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s11042-024-19298-2</pub-id>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Ronneberger</surname> <given-names>O</given-names></string-name>, <string-name><surname>Fischer</surname> <given-names>P</given-names></string-name>, <string-name><surname>Brox</surname> <given-names>T</given-names></string-name></person-group>. <chapter-title>U-Net: convolutional networks for biomedical image segmentation</chapter-title>. In: <source>Medical image computing and computer-assisted intervention&#x2014;MICCAI 2015</source>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>; <year>2015</year>. p. <fpage>234</fpage>&#x2013;<lpage>41</lpage>. doi:<pub-id pub-id-type="doi">10.1007/978-3-319-24574-4_28</pub-id>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Zhou</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Rahman Siddiquee</surname> <given-names>MM</given-names></string-name>, <string-name><surname>Tajbakhsh</surname> <given-names>N</given-names></string-name>, <string-name><surname>Liang</surname> <given-names>J</given-names></string-name></person-group>. <chapter-title>UNet&#x002B;&#x002B;: a nested U-Net architecture for medical image segmentation</chapter-title>. In: <source>Deep learning in medical image analysis and multimodal learning for clinical decision support</source>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>; <year>2018</year>. p. <fpage>3</fpage>&#x2013;<lpage>11</lpage>. doi:<pub-id pub-id-type="doi">10.1007/978-3-030-00889-5_1</pub-id>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Khan</surname> <given-names>S</given-names></string-name>, <string-name><surname>Khan</surname> <given-names>A</given-names></string-name>, <string-name><surname>Teng</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>DFF-UNet: a lightweight deep feature fusion U-Net model for skin lesion segmentation</article-title>. <source>IEEE Trans Instrum Meas</source>. <year>2025</year>;<volume>74</volume>:<fpage>5030214</fpage>. doi:<pub-id pub-id-type="doi">10.1109/TIM.2025.3565715</pub-id>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Arshad</surname> <given-names>S</given-names></string-name>, <string-name><surname>Amjad</surname> <given-names>T</given-names></string-name>, <string-name><surname>Hussain</surname> <given-names>A</given-names></string-name>, <string-name><surname>Qureshi</surname> <given-names>I</given-names></string-name>, <string-name><surname>Abbas</surname> <given-names>Q</given-names></string-name></person-group>. <article-title>Dermo-Seg: ResNet-UNet architecture and hybrid loss function for detection of differential patterns to diagnose pigmented skin lesions</article-title>. <source>Diagnostics</source>. <year>2023</year>;<volume>13</volume>(<issue>18</issue>):<fpage>2924</fpage>. doi:<pub-id pub-id-type="doi">10.3390/diagnostics13182924</pub-id>; <pub-id pub-id-type="pmid">37761291</pub-id></mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yang</surname> <given-names>G</given-names></string-name>, <string-name><surname>Nie</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>S</given-names></string-name></person-group>. <article-title>MSREA-Net: an efficient skin disease segmentation method based on multi-level resolution receptive field</article-title>. <source>Appl Sci</source>. <year>2023</year>;<volume>13</volume>(<issue>18</issue>):<fpage>10315</fpage>. doi:<pub-id pub-id-type="doi">10.3390/app131810315</pub-id>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Bozorgpour</surname> <given-names>A</given-names></string-name>, <string-name><surname>Sadegheih</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Kazerouni</surname> <given-names>A</given-names></string-name>, <string-name><surname>Azad</surname> <given-names>R</given-names></string-name>, <string-name><surname>Merhof</surname> <given-names>D</given-names></string-name></person-group>. <chapter-title>DermoSegDiff: a boundary-aware segmentation diffusion model for skin lesion delineation</chapter-title>. In: <source>Predictive intelligence in medicine</source>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer Nature</publisher-name>; <year>2023</year>. p. <fpage>146</fpage>&#x2013;<lpage>58</lpage>. doi:<pub-id pub-id-type="doi">10.1007/978-3-031-46005-0_13</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ding</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Yi</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Xiao</surname> <given-names>J</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>M</given-names></string-name>, <string-name><surname>Guo</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Liao</surname> <given-names>Z</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>CTH-Net: a CNN and transformer hybrid network for skin lesion segmentation</article-title>. <source>iScience</source>. <year>2024</year>;<volume>27</volume>(<issue>4</issue>):<fpage>109442</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.isci.2024.109442</pub-id>; <pub-id pub-id-type="pmid">38523786</pub-id></mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Hatamizadeh</surname> <given-names>A</given-names></string-name>, <string-name><surname>Tang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Nath</surname> <given-names>V</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>D</given-names></string-name>, <string-name><surname>Myronenko</surname> <given-names>A</given-names></string-name>, <string-name><surname>Landman</surname> <given-names>B</given-names></string-name>, <etal>et al.</etal></person-group> <article-title>UNETR: transformers for 3D medical image segmentation</article-title>. In: <conf-name>2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV); 2022 Jan 3&#x2013;8</conf-name>; <publisher-loc>Waikoloa, HI, USA</publisher-loc>. p. <fpage>1748</fpage>&#x2013;<lpage>58</lpage>. doi:<pub-id pub-id-type="doi">10.1109/WACV51458.2022.00181</pub-id>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Khalaf</surname> <given-names>AD</given-names></string-name>, <string-name><surname>Hamdan</surname> <given-names>H</given-names></string-name>, <string-name><surname>Abdul Halin</surname> <given-names>A</given-names></string-name>, <string-name><surname>Manshor</surname> <given-names>N</given-names></string-name></person-group>. <article-title>Segmentation and classification of skin cancer diseases based on deep learning: challenges and future directions</article-title>. <source>IEEE Access</source>. <year>2025</year>;<volume>13</volume>(<issue>1</issue>):<fpage>90163</fpage>&#x2013;<lpage>84</lpage>. doi:<pub-id pub-id-type="doi">10.1109/ACCESS.2025.3569170</pub-id>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Chen</surname> <given-names>LC</given-names></string-name>, <string-name><surname>Papandreou</surname> <given-names>G</given-names></string-name>, <string-name><surname>Kokkinos</surname> <given-names>I</given-names></string-name>, <string-name><surname>Murphy</surname> <given-names>K</given-names></string-name>, <string-name><surname>Yuille</surname> <given-names>AL</given-names></string-name></person-group>. <article-title>DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs</article-title>. <source>IEEE Trans Pattern Anal Mach Intell</source>. <year>2018</year>;<volume>40</volume>(<issue>4</issue>):<fpage>834</fpage>&#x2013;<lpage>48</lpage>. doi:<pub-id pub-id-type="doi">10.1109/TPAMI.2017.2699184</pub-id>; <pub-id pub-id-type="pmid">28463186</pub-id></mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>S</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>G</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>W</given-names></string-name>, <string-name><surname>Lei</surname> <given-names>B</given-names></string-name>, <string-name><surname>Wen</surname> <given-names>Z</given-names></string-name></person-group>. <article-title>FAT-Net: feature adaptive transformers for automated skin lesion segmentation</article-title>. <source>Med Image Anal</source>. <year>2022</year>;<volume>76</volume>(<issue>9</issue>):<fpage>102327</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.media.2021.102327</pub-id>; <pub-id pub-id-type="pmid">34923250</pub-id></mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Codella</surname> <given-names>NCF</given-names></string-name>, <string-name><surname>Gutman</surname> <given-names>D</given-names></string-name>, <string-name><surname>Celebi</surname> <given-names>ME</given-names></string-name>, <string-name><surname>Helba</surname> <given-names>B</given-names></string-name>, <string-name><surname>Marchetti</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Dusza</surname> <given-names>SW</given-names></string-name>, <etal>et al.</etal></person-group> <article-title>Skin lesion analysis toward melanoma detection: a challenge at the 2017 international symposium on biomedical imaging (ISBI), hosted by the international skin imaging collaboration (ISIC)</article-title>. In: <conf-name>2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018); 2018 Apr 4&#x2013;7</conf-name>; <publisher-loc>Washington, DC, USA</publisher-loc>. p. <fpage>168</fpage>&#x2013;<lpage>72</lpage>. doi:<pub-id pub-id-type="doi">10.1109/ISBI.2018.8363547</pub-id>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Gong</surname> <given-names>X</given-names></string-name>, <string-name><surname>Li</surname> <given-names>F</given-names></string-name></person-group>. <article-title>Skin lesion segmentation with a multiscale input fusion U-Net incorporating Res2-SE and pyramid dilated convolution</article-title>. <source>Sci Rep</source>. <year>2025</year>;<volume>15</volume>(<issue>1</issue>):<fpage>7975</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41598-025-92447-1</pub-id>; <pub-id pub-id-type="pmid">40055411</pub-id></mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Upadhyay</surname> <given-names>AK</given-names></string-name>, <string-name><surname>Bhandari</surname> <given-names>AK</given-names></string-name></person-group>. <article-title>MaS-TransUNet: a multiattention swin transformer U-Net for medical image segmentation</article-title>. <source>IEEE Trans Radiat Plasma Med Sci</source>. <year>2025</year>;<volume>9</volume>(<issue>5</issue>):<fpage>613</fpage>&#x2013;<lpage>26</lpage>. doi:<pub-id pub-id-type="doi">10.1109/TRPMS.2024.3477528</pub-id>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ahmed</surname> <given-names>A</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>G</given-names></string-name>, <string-name><surname>Bilal</surname> <given-names>A</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Ebad</surname> <given-names>SA</given-names></string-name></person-group>. <article-title>Precision and efficiency in skin cancer segmentation through a dual encoder deep learning model</article-title>. <source>Sci Rep</source>. <year>2025</year>;<volume>15</volume>(<issue>1</issue>):<fpage>4815</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41598-025-88753-3</pub-id>; <pub-id pub-id-type="pmid">39924555</pub-id></mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ji</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Ye</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Ma</surname> <given-names>X</given-names></string-name></person-group>. <article-title>BDFormer: boundary-aware dual-decoder transformer for skin lesion segmentation</article-title>. <source>Artif Intell Med</source>. <year>2025</year>;<volume>162</volume>(<issue>6</issue>):<fpage>103079</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.artmed.2025.103079</pub-id>; <pub-id pub-id-type="pmid">39983372</pub-id></mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhu</surname> <given-names>K</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Feng</surname> <given-names>R</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>D</given-names></string-name>, <string-name><surname>Fan</surname> <given-names>B</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>EM-Net: effective and morphology-aware network for skin lesion segmentation</article-title>. <source>Expert Syst Appl</source>. <year>2025</year>;<volume>285</volume>(<issue>1</issue>):<fpage>127668</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.eswa.2025.127668</pub-id>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Pennisi</surname> <given-names>A</given-names></string-name>, <string-name><surname>Bloisi</surname> <given-names>DD</given-names></string-name>, <string-name><surname>Suriani</surname> <given-names>V</given-names></string-name>, <string-name><surname>Nardi</surname> <given-names>D</given-names></string-name>, <string-name><surname>Facchiano</surname> <given-names>A</given-names></string-name>, <string-name><surname>Giampetruzzi</surname> <given-names>AR</given-names></string-name></person-group>. <article-title>Skin lesion area segmentation using attention squeeze U-Net for embedded devices</article-title>. <source>J Digit Imaging</source>. <year>2022</year>;<volume>35</volume>(<issue>5</issue>):<fpage>1217</fpage>&#x2013;<lpage>30</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s10278-022-00634-7</pub-id>; <pub-id pub-id-type="pmid">35505265</pub-id></mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Chen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>G</given-names></string-name>, <string-name><surname>Dong</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zeng</surname> <given-names>J</given-names></string-name>, <string-name><surname>Qin</surname> <given-names>C</given-names></string-name></person-group>. <article-title>DSNET: a lightweight segmentation model for segmentation of skin cancer lesion regions</article-title>. <source>IEEE Access</source>. <year>2025</year>;<volume>13</volume>:<fpage>31095</fpage>&#x2013;<lpage>104</lpage>. doi:<pub-id pub-id-type="doi">10.1109/ACCESS.2025.3539521</pub-id>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Naveed</surname> <given-names>A</given-names></string-name>, <string-name><surname>Naqvi</surname> <given-names>SS</given-names></string-name>, <string-name><surname>Khan</surname> <given-names>TM</given-names></string-name>, <string-name><surname>Iqbal</surname> <given-names>S</given-names></string-name>, <string-name><surname>Wani</surname> <given-names>MY</given-names></string-name>, <string-name><surname>Khan</surname> <given-names>HA</given-names></string-name></person-group>. <article-title>AD-Net: attention-based dilated convolutional residual network with guided decoder for robust skin lesion segmentation</article-title>. <source>Neural Comput Appl</source>. <year>2024</year>;<volume>36</volume>(<issue>35</issue>):<fpage>22277</fpage>&#x2013;<lpage>99</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s00521-024-10362-4</pub-id>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Tschandl</surname> <given-names>P</given-names></string-name>, <string-name><surname>Rosendahl</surname> <given-names>C</given-names></string-name>, <string-name><surname>Kittler</surname> <given-names>H</given-names></string-name></person-group>. <article-title>The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions</article-title>. <source>Sci Data</source>. <year>2018</year>;<volume>5</volume>(<issue>1</issue>):<fpage>180161</fpage>. doi:<pub-id pub-id-type="doi">10.1038/sdata.2018.161</pub-id>; <pub-id pub-id-type="pmid">30106392</pub-id></mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Codella</surname> <given-names>N</given-names></string-name>, <string-name><surname>Rotemberg</surname> <given-names>V</given-names></string-name>, <string-name><surname>Tschandl</surname> <given-names>P</given-names></string-name>, <string-name><surname>Celebi</surname> <given-names>ME</given-names></string-name>, <string-name><surname>Dusza</surname> <given-names>S</given-names></string-name>, <string-name><surname>Gutman</surname> <given-names>D</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Skin lesion analysis toward melanoma detection 2018: a challenge hosted by the international skin imaging collaboration (ISIC)</article-title>. <comment>arXiv:1902.03368. 2019</comment>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Mendon&#x00E7;a</surname> <given-names>T</given-names></string-name>, <string-name><surname>Ferreira</surname> <given-names>PM</given-names></string-name>, <string-name><surname>Marques</surname> <given-names>JS</given-names></string-name>, <string-name><surname>Marcal</surname> <given-names>ARS</given-names></string-name>, <string-name><surname>Rozeira</surname> <given-names>J</given-names></string-name></person-group>. <article-title>PH2&#x2014;a dermoscopic image database for research and benchmarking</article-title>. In: <conf-name>2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC); 2013 Jul 3&#x2013;7</conf-name>; <publisher-loc>Osaka, Japan</publisher-loc>. p. <fpage>5437</fpage>&#x2013;<lpage>40</lpage>. doi:<pub-id pub-id-type="doi">10.1109/EMBC.2013.6610779</pub-id>; <pub-id pub-id-type="pmid">24110966</pub-id></mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Woo</surname> <given-names>S</given-names></string-name>, <string-name><surname>Park</surname> <given-names>J</given-names></string-name>, <string-name><surname>Lee</surname> <given-names>JY</given-names></string-name>, <string-name><surname>Kweon</surname> <given-names>IS</given-names></string-name></person-group>. <chapter-title>CBAM: convolutional block attention module</chapter-title>. In: <source>Computer vision&#x2014;ECCV 2018</source>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>; <year>2018</year>. p. <fpage>3</fpage>&#x2013;<lpage>19</lpage>. doi:<pub-id pub-id-type="doi">10.1007/978-3-030-01234-2_1</pub-id>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Goodfellow</surname> <given-names>I</given-names></string-name>, <string-name><surname>Bengio</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Courville</surname> <given-names>A</given-names></string-name></person-group>. <source>Deep learning</source>. Vol. <volume>1</volume>. <publisher-loc>Cambridge, MA, USA</publisher-loc>: <publisher-name>MIT Press</publisher-name>; <year>2016</year>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><surname>Huang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>L</given-names></string-name>, <string-name><surname>Tong</surname> <given-names>R</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Iwamoto</surname> <given-names>Y</given-names></string-name>, <etal>et al.</etal></person-group> <article-title>UNet 3&#x002B;: a full-scale connected UNet for medical image segmentation</article-title>. In: <conf-name>ICASSP 2020&#x2014;2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2020 May 4&#x2013;8</conf-name>; <publisher-loc>Barcelona, Spain</publisher-loc>. p. <fpage>1055</fpage>&#x2013;<lpage>9</lpage>. doi:<pub-id pub-id-type="doi">10.1109/icassp40776.2020.9053405</pub-id>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="book"><person-group person-group-type="author"><string-name><surname>Chen</surname> <given-names>LC</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Papandreou</surname> <given-names>G</given-names></string-name>, <string-name><surname>Schroff</surname> <given-names>F</given-names></string-name>, <string-name><surname>Adam</surname> <given-names>H</given-names></string-name></person-group>. <chapter-title>Encoder-decoder with atrous separable convolution for semantic image segmentation</chapter-title>. In: <source>Computer vision&#x2014;ECCV 2018</source>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>; <year>2018</year>. p. <fpage>833</fpage>&#x2013;<lpage>51</lpage>. doi:<pub-id pub-id-type="doi">10.1007/978-3-030-01234-2_49</pub-id>.</mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>S</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>P</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>B</given-names></string-name></person-group>. <article-title>IDSNet: unifying local and global context for skin lesion image segmentation</article-title>. <source>Biomed Signal Process Control</source>. <year>2025</year>;<volume>108</volume>(<issue>1</issue>):<fpage>107961</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.bspc.2025.107961</pub-id>.</mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhong</surname> <given-names>L</given-names></string-name>, <string-name><surname>Li</surname> <given-names>T</given-names></string-name>, <string-name><surname>Cui</surname> <given-names>M</given-names></string-name>, <string-name><surname>Cui</surname> <given-names>S</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>L</given-names></string-name></person-group>. <article-title>DSU-Net: dual-stage U-Net based on CNN and transformer for skin lesion segmentation</article-title>. <source>Biomed Signal Process Control</source>. <year>2025</year>;<volume>100</volume>(<issue>1</issue>):<fpage>107090</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.bspc.2024.107090</pub-id>.</mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kumari</surname> <given-names>P</given-names></string-name>, <string-name><surname>Agrawal</surname> <given-names>RK</given-names></string-name>, <string-name><surname>Priya</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Weighted fuzzy clustering approach with adaptive spatial information and Kullback-Leibler divergence for skin lesion segmentation</article-title>. <source>Int J Mach Learn Cybern</source>. <year>2025</year>;<volume>16</volume>(<issue>7</issue>):<fpage>5317</fpage>&#x2013;<lpage>37</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s13042-025-02575-3</pub-id>.</mixed-citation></ref>
<ref id="ref-35"><label>[35]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Qu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Gao</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>MSHV-Net: a multi-scale hybrid vision network for skin image segmentation</article-title>. <source>Digit Signal Process</source>. <year>2025</year>;<volume>162</volume>(<issue>2</issue>):<fpage>105166</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.dsp.2025.105166</pub-id>.</mixed-citation></ref>
<ref id="ref-36"><label>[36]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ho</surname> <given-names>QH</given-names></string-name>, <string-name><surname>Nguyen</surname> <given-names>TNQ</given-names></string-name>, <string-name><surname>Tran</surname> <given-names>TT</given-names></string-name>, <string-name><surname>Pham</surname> <given-names>VT</given-names></string-name></person-group>. <article-title>LiteMamba-Bound: a lightweight Mamba-based model with boundary-aware and normalized active contour loss for skin lesion segmentation</article-title>. <source>Methods</source>. <year>2025</year>;<volume>235</volume>(<issue>5</issue>):<fpage>10</fpage>&#x2013;<lpage>25</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.ymeth.2025.01.008</pub-id>; <pub-id pub-id-type="pmid">39864606</pub-id>.</mixed-citation></ref>
<ref id="ref-37"><label>[37]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><surname>Mahavar</surname> <given-names>F</given-names></string-name></person-group>. <article-title>Skin cancer detection</article-title> [Internet]. <publisher-loc>San Francisco, CA, USA</publisher-loc>: <publisher-name>Kaggle</publisher-name>; <year>2023</year> <comment>[cited 2026 Jan 1]</comment>. Available from: <ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/fatemehmehrparvar/skin-cancer-detection">https://www.kaggle.com/datasets/fatemehmehrparvar/skin-cancer-detection</ext-link>.</mixed-citation></ref>
</ref-list>
</back></article>