<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">JNM</journal-id>
<journal-id journal-id-type="nlm-ta">JNM</journal-id>
<journal-id journal-id-type="publisher-id">JNM</journal-id>
<journal-title-group>
<journal-title>Journal of New Media</journal-title>
</journal-title-group>
<issn pub-type="epub">2579-0129</issn>
<issn pub-type="ppub">2579-0110</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">31113</article-id>
<article-id pub-id-type="doi">10.32604/jnm.2022.031113</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Semi-Supervised Medical Image Segmentation Based on Generative Adversarial Network</article-title>
<alt-title alt-title-type="left-running-head">Semi-Supervised Medical Image Segmentation Based on Generative Adversarial Network</alt-title>
<alt-title alt-title-type="right-running-head">Semi-Supervised Medical Image Segmentation Based on Generative Adversarial Network</alt-title>
</title-group>
<contrib-group content-type="authors">
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Tan</surname><given-names>Yun</given-names></name><xref ref-type="aff" rid="aff-1">1</xref>
<xref ref-type="aff" rid="aff-2">2</xref></contrib>
<contrib id="author-2" contrib-type="author">
<name name-style="western"><surname>Wu</surname><given-names>Weizhao</given-names></name><xref ref-type="aff" rid="aff-2">2</xref></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Tan</surname><given-names>Ling</given-names></name><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Peng</surname><given-names>Haikuo</given-names></name><xref ref-type="aff" rid="aff-2">2</xref></contrib>
<contrib id="author-5" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Qin</surname><given-names>Jiaohua</given-names></name><xref ref-type="aff" rid="aff-2">2</xref><email>qinjiaohua@163.com</email></contrib>
<aff id="aff-1"><label>1</label><institution>Hunan Applied Technology University</institution>, <addr-line>Changde, 415000</addr-line>, <country>China</country></aff>
<aff id="aff-2"><label>2</label><institution>Central South University of Forestry and Technology</institution>, <addr-line>Changsha, 410004</addr-line>, <country>China</country></aff>
<aff id="aff-3"><label>3</label><institution>The Second Xiangya Hospital of Central South University</institution>, <addr-line>Changsha, 410011</addr-line>, <country>China</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Jiaohua Qin. Email: <email>qinjiaohua@163.com</email></corresp>
</author-notes>
<pub-date pub-type="epub" date-type="pub" iso-8601-date="2022-06-11"><day>11</day>
<month>06</month>
<year>2022</year></pub-date>
<volume>4</volume>
<issue>3</issue>
<fpage>155</fpage>
<lpage>164</lpage>
<history>
<date date-type="received"><day>10</day><month>4</month><year>2022</year></date>
<date date-type="accepted"><day>11</day><month>5</month><year>2022</year></date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2022 Tan et al.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Tan et al.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_JNM_31113.pdf"></self-uri>
<abstract>
<p>At present, medical image segmentation is mainly based on fully supervised model training, which consumes a great deal of time and labor for dataset labeling. To address this issue, we propose a semi-supervised medical image segmentation model based on the generative adversarial network framework for the automated segmentation of arteries. The network is composed of two parts: a segmentation network for medical image segmentation and a discriminant network for evaluating segmentation results. In the initial stage of training, a fully supervised method is adopted so that the segmentation network and the discriminant network acquire basic segmentation and discrimination capabilities. The model is then trained in a semi-supervised manner, in which the discriminant network generates pseudo-labels from the segmentation results for semi-supervised training of the segmentation network. The proposed method can use a small portion of an annotated dataset to segment medical images, effectively alleviating the problem of insufficient medical image annotation data.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Medical image</kwd>
<kwd>semi-supervised</kwd>
<kwd>U-net</kwd>
<kwd>generative adversarial network</kwd>
<kwd>image segmentation</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1"><label>1</label><title>Introduction</title>
<p>In recent years, with the continuous optimization of the performance of convolutional neural networks, deep learning models have outperformed humans in some image processing tasks.</p>
<p>Computer Aided Diagnosis (CAD) uses computer technology to help medical staff deal with complex tasks. With the application of artificial intelligence in CAD, systems can more effectively extract information from patients&#x2019; medical images, provide doctors with diagnostic suggestions, and improve diagnostic efficiency. Medical image segmentation and detection [<xref ref-type="bibr" rid="ref-1">1</xref>] is a basic task in CAD, which can effectively extract the focal organ region. At present, medical image segmentation methods based on deep learning can be grouped into three categories: methods based on candidate regions, methods based on fully convolutional networks (FCNs), and weakly supervised segmentation methods [<xref ref-type="bibr" rid="ref-2">2</xref>]. Among the FCN-based methods, the U-Net network proposed by Ronneberger&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-3">3</xref>] in 2015 is often used for medical image semantic segmentation. It introduced the U-shaped structure and skip connections: during upsampling, high-level semantic features at different scales are fused with low-level features, so that the network output has more refined edge information. Hou&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-4">4</xref>] improved medical image segmentation using an attention mechanism and feature fusion. Zhou&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-5">5</xref>] applied a sequential feature pyramid and attention mechanism to multi-label segmentation and achieved state-of-the-art performance on their dataset.</p>
<p>Although fully supervised convolutional neural networks show remarkable results in segmentation tasks, they incur substantial labor costs for data cleaning and labeling. Moreover, the semantic complexity of medical images makes their annotation more time-consuming and labor-intensive than that of natural images. Semi-supervised algorithms can rely on a small amount of labeled data and introduce easily obtained unlabeled data to complete model training, which reduces the cost of dataset production and has gradually demonstrated its value in image segmentation.</p>
<p>At present, weakly supervised medical image segmentation is mainly designed around the Mean Teacher structure [<xref ref-type="bibr" rid="ref-6">6</xref>]. Sun&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-7">7</xref>] completed weakly supervised training for liver lesion segmentation using a teacher-student architecture. Goodfellow&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-8">8</xref>] proposed the Generative Adversarial Network (GAN) in 2014, a structure that trains models through adversarial learning. Son&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-9">9</xref>] used a GAN to segment retinal blood vessels in 2017 and achieved good results. Liu&#x00A0;et&#x00A0;al.&#x00A0;[<xref ref-type="bibr" rid="ref-10">10</xref>] proposed a semi-supervised segmentation model, CDR-GANs, which splits the optic disc and optic cup segmentation task on fundus images into two independent stages, each trained separately with a GAN. Its performance is close to that of fully supervised learning models.</p>
<p>This work draws on the structural framework of the GAN and, building on [<xref ref-type="bibr" rid="ref-11">11</xref>], introduces U-Net to construct a new semi-supervised semantic segmentation network. The network exploits the excellent performance of U-Net in medical image segmentation to achieve semi-supervised segmentation of medical images, which can effectively mitigate the shortage of labeled medical image data. In this work, a CT image dataset of aortic angiography containing 2598 arterial images is constructed. The dataset is used to train the model, which ultimately performs automatic multi-class segmentation of the pulmonary artery and the ascending and descending aorta.</p>
</sec>
<sec id="s2"><label>2</label><title>Method</title>
<sec id="s2_1"><label>2.1</label><title>Overall Structure</title>
<p>The overall framework of the implemented semi-supervised semantic segmentation model is shown in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>.</p>
<fig id="fig-1"><label>Figure 1</label><caption><title>Framework diagram of semi-supervised segmentation model based on GAN framework</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="JNM_31113-fig-1.png"/></fig>
<p>The framework consists of two parts: a segmentation network S based on the U-Net [<xref ref-type="bibr" rid="ref-12">12</xref>] network and a discriminative network D that evaluates the segmentation map. The segmentation network takes a medical image of size (512, 512, 3) as input and outputs a classification probability map of size (512, 512, 3), where the 3 in the output is the number of categories the segmentation network must predict. Each entry (<italic>i, j, k</italic>) of the probability map represents the probability that pixel (<italic>i, j</italic>) of the original image belongs to category <italic>k</italic>. The input of the discriminative network is either the ground-truth label information or the classification probability map output by the U-Net segmentation network; its output is a confidence map.</p>
<p>We use fully supervised training in the early stage so that the networks acquire a basic segmentation ability, and switch to semi-supervised training in the later stage. During fully supervised training, the segmentation network is optimized with the cross-entropy <italic>L<sub>ce</sub></italic> between the segmentation results and the ground truth, together with the adversarial loss <italic>L<sub>adv</sub></italic>. Since label data is not used during semi-supervised training, we instead use the high-confidence maps output by the discriminator as pseudo-labels. With the addition of pseudo-labels, the loss function of the segmentation network becomes the combination of the cross-entropy of the segmentation result <italic>L<sub>ce</sub></italic>, the cross-entropy against the pseudo-labels <italic>L<sub>semi</sub></italic>, and the adversarial loss <italic>L<sub>adv</sub></italic>. The discriminant network itself is trained in a fully supervised manner, using labeled data only.</p>
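<p>The two-stage schedule described above can be outlined as follows. This is an illustrative sketch rather than the authors&#x2019; released code: the function names are hypothetical, and the 1500-epoch warm-up boundary is taken from Section 3.3.2.</p>

```python
def segmentation_loss_terms(epoch, warmup_epochs=1500):
    """Loss terms used to train the segmentation network S at a given epoch:
    a fully supervised warm-up (L_ce + lam_adv * L_adv), then semi-supervised
    training that adds the pseudo-label term lam_semi * L_semi."""
    if epoch >= warmup_epochs:
        return ("L_ce", "L_adv", "L_semi")   # semi-supervised stage
    return ("L_ce", "L_adv")                 # fully supervised stage


def discriminator_loss_terms(epoch):
    """The discriminator D is always trained fully supervised, with a
    per-pixel BCE on real labels and on generated segmentation maps."""
    return ("L_bce_real", "L_bce_fake")
```

In each epoch the two networks are updated alternately; only the segmentation network's objective changes when the semi-supervised stage begins.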
</sec>
<sec id="s2_2"><label>2.2</label><title>Segmentation Network</title>
<p>U-Net is a semantic segmentation model based on a fully convolutional neural network. Its encoder-decoder forms a distinctive U-shaped structure, and it introduced skip connections to preserve feature information more effectively. The basic structure is shown in <xref ref-type="fig" rid="fig-2">Fig. 2</xref>.</p>
<fig id="fig-2"><label>Figure 2</label><caption><title>U-Net network structure</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="JNM_31113-fig-2.png"/></fig>
<p>The segmentation network adopts U-Net as the backbone because of its segmentation capability in the medical imaging domain: it is locally perceptive and can be trained with relatively few samples. The U-Net network consists of two paths: the left contracting path, which downsamples the image to extract high-dimensional features, and the right expansive path, which fuses the feature maps back to the same size as the original image.</p>
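<p>The feature fusion performed by the skip connections can be illustrated at the shape level with plain NumPy: a feature map saved on the contracting path is concatenated, channel-wise, with the upsampled decoder feature map of matching spatial size. This is a sketch of the data flow only; a real U-Net interleaves convolutions at every stage, which are omitted here.</p>

```python
import numpy as np

def downsample(x):
    """2x2 max pooling on a (C, H, W) feature map (contracting path)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def upsample(x):
    """Nearest-neighbour 2x upsampling (expansive path)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def skip_connect(decoder_feat, encoder_feat):
    """Concatenate along the channel axis, as U-Net's skip connection does."""
    return np.concatenate([upsample(decoder_feat), encoder_feat], axis=0)

x = np.random.rand(16, 64, 64)       # encoder feature map
bottleneck = downsample(x)           # shape (16, 32, 32)
fused = skip_connect(bottleneck, x)  # shape (32, 64, 64): channels doubled
```

The channel doubling after concatenation is why the expansive path of U-Net halves the channel count again with its subsequent convolutions.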
</sec>
<sec id="s2_3"><label>2.3</label><title>Discriminant Network</title>
<p>The discriminative network is based on the discriminator design in DCGAN [<xref ref-type="bibr" rid="ref-13">13</xref>]. The discriminator receives either the 512&#x2009;&#x00D7;&#x2009;512&#x2009;&#x00D7;&#x2009;3 classification probability map output by the segmentation network or a one-hot encoded label. The backbone of the network consists of 6 convolutional layers with the same settings as the DCGAN discriminator, as shown in <xref ref-type="fig" rid="fig-3">Fig. 3</xref>. Pooling operations and batch normalization are removed; instead, 4&#x2009;&#x00D7;&#x2009;4 convolution kernels with stride 2 perform both convolution and downsampling, and Leaky ReLU [<xref ref-type="bibr" rid="ref-14">14</xref>] is used as the activation function so that the discriminant network trains stably. Finally, the output of the discriminant network is upsampled by interpolation so that the loss value can be computed at full resolution.</p>
<fig id="fig-3"><label>Figure 3</label><caption><title>Network structure of discriminator</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="JNM_31113-fig-3.png"/></fig>
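<p>With six stride-2, 4&#x2009;&#x00D7;&#x2009;4 convolutions, each layer halves the spatial resolution, so a 512&#x2009;&#x00D7;&#x2009;512 input shrinks to 8&#x2009;&#x00D7;&#x2009;8 before interpolation restores the full resolution. The arithmetic below assumes padding 1, which is typical for DCGAN-style discriminators but is an assumption here, as is the derived interpolation factor.</p>

```python
def conv_out(n, k=4, s=2, p=1):
    """Output spatial size of one conv layer (standard conv-size formula)."""
    return (n + 2 * p - k) // s + 1

size = 512
sizes = [size]
for _ in range(6):            # six DCGAN-style stride-2 conv layers
    size = conv_out(size)
    sizes.append(size)

# Interpolation factor needed to bring the confidence map back to 512 x 512.
upsample_factor = 512 // sizes[-1]
```

With these settings the resolution sequence is 512, 256, 128, 64, 32, 16, 8, so the final interpolation scales the confidence map up by a factor of 64.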
</sec>
</sec>
<sec id="s3"><label>3</label><title>Experiment and Result Analysis</title>
<sec id="s3_1"><label>3.1</label><title>Dataset Construction</title>
<p>The dataset used in this work is an arterial angiography dataset consisting of 2608 images in total. The mask labels were completed under the guidance of medical experts. The images mainly include the aorta (AA) and pulmonary artery (PA), and the image size is 512&#x2009;&#x00D7;&#x2009;512. The images and annotation labels of the constructed arterial dataset are visualized in <xref ref-type="fig" rid="fig-4">Fig. 4</xref>, with the data images at the top and the corresponding labels at the bottom. In the label maps, the red area is AA and the green area is PA.</p>
<fig id="fig-4"><label>Figure 4</label><caption><title>The constructed arterial dataset</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="JNM_31113-fig-4.png"/></fig>
</sec>
<sec id="s3_2"><label>3.2</label><title>Training of Discriminative Networks</title>
<p>The discriminator proposed in this work is mainly used to distinguish whether its input comes from the output of the segmentation network <italic>S(X<sub>n</sub>)</italic> or from the ground-truth label <italic>Y<sub>n</sub></italic>. In each training epoch, we alternately train the segmentation network, denoted <italic>S(</italic>&#x22C5;<italic>)</italic>, and the discriminant network <italic>D(</italic>&#x22C5;<italic>)</italic>; the discriminator is trained on both the segmentation maps generated by the segmentation network and the real label maps. When the output probability of the discriminator is closer to 1, the input <italic>X<sub>n</sub></italic> is judged to be close to a real segmentation map; conversely, when the output of <italic>D(</italic>&#x22C5;<italic>)</italic> is closer to 0, the input is judged to be a generated (fake) map. We employ a binary cross-entropy loss, denoted <italic>L<sub>bce</sub></italic>, computed on each pixel of the input to constrain the training of the network. The loss function <italic>L<sub>D</sub></italic> of the discriminant network is shown in <xref ref-type="disp-formula" rid="eqn-1">Eq. (1)</xref>, where <italic>D(</italic>&#x22C5;<italic>)<sup>(h, w)</sup></italic> denotes the confidence map output at position <italic>(h, w)</italic>. We use gradient descent for network training to achieve rapid convergence.
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mi>D</mml:mi></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>b</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>Y</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>b</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>S</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow><mml:mo>,</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
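<p>Eq. (1) can be computed directly as a per-pixel binary cross-entropy with target 1 on the confidence map of the ground-truth label and target 0 on the confidence map of the segmentation output, summed over all positions. The following is a NumPy sketch under the assumption that the discriminator's outputs are probabilities in (0, 1).</p>

```python
import numpy as np

def bce(p, t, eps=1e-7):
    """Per-pixel binary cross-entropy between probability map p and target t."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))

def discriminator_loss(d_on_label, d_on_seg):
    """L_D = sum_{h,w} L_bce(D(Y_n)^(h,w), 1) + L_bce(D(S(X_n))^(h,w), 0)."""
    return bce(d_on_label, 1.0).sum() + bce(d_on_seg, 0.0).sum()

# A near-perfect discriminator (about 1 on real labels, about 0 on generated
# maps) incurs a loss close to zero.
d_real = np.full((4, 4), 0.99)
d_fake = np.full((4, 4), 0.01)
loss = discriminator_loss(d_real, d_fake)
```

Swapping the two confidence maps (a discriminator that mistakes fake maps for real ones) makes the loss large, which is what drives the discriminator updates.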
</sec>
<sec id="s3_3"><label>3.3</label><title>Training of Segmentation Networks</title>
<sec id="s3_3_1"><label>3.3.1</label><title>Fully Supervised Training</title>
<p>During GAN training, the goal of the segmentation network is to generate label maps realistic enough to fool the discriminator, so the segmentation network and the discriminator network are in constant opposition. To make this adversarial training proceed reasonably, a fully supervised training method is adopted for the segmentation network in the early stage. The loss function for fully supervised training of the segmentation network, denoted <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:msubsup><mml:mi>L</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:msubsup></mml:math></inline-formula>, is given in <xref ref-type="disp-formula" rid="eqn-2">Eq. (2)</xref>.
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:msubsup><mml:mi>L</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi></mml:mrow><mml:mi>f</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:msub><mml:mi>&#x03BB;</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></disp-formula></p>
<p>In <xref ref-type="disp-formula" rid="eqn-2">Eq. (2)</xref>, <italic>L<sub>ce</sub></italic> is the multi-class cross-entropy loss function and <italic>L<sub>adv</sub></italic> is the adversarial loss, shown in <xref ref-type="disp-formula" rid="eqn-4">Eq. (4)</xref>. We add the adversarial loss to the semantic segmentation network under the GAN framework and adjust its proportion in the segmentation loss through the hyperparameter <italic>&#x03BB;<sub>adv</sub></italic>.</p>
<p>For fully supervised training of the segmentation network, the segmentation result of the input image and the one-hot encoded ground-truth label <italic>Y<sub>n</sub></italic> are used to compute the loss <italic>L<sub>ce</sub></italic> in <xref ref-type="disp-formula" rid="eqn-3">Eq. (3)</xref>, where C denotes the set of classes: pulmonary artery, aorta, and background. The term <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:msubsup><mml:mi>Y</mml:mi><mml:mrow><mml:mtext>n</mml:mtext></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi><mml:mo>,</mml:mo><mml:mi>c</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> represents the mask of class c: pixels <italic>(h, w)</italic> belonging to class c have value 1, and all other entries are 0. The loss function for adversarial training is given in <xref ref-type="disp-formula" rid="eqn-4">Eq. (4)</xref>.
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:munder><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:munder><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>c</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>C</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:msubsup><mml:mi>Y</mml:mi><mml:mi>n</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi><mml:mo>,</mml:mo><mml:mi>c</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>S</mml:mi><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi><mml:mo>,</mml:mo><mml:mi>c</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>b</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mi>S</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p>
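<p>The fully supervised objective of Eqs. (2)&#x2013;(4) can be sketched in NumPy: the multi-class cross-entropy against the one-hot label plus the weighted adversarial term. Following the standard generator-side GAN objective, the adversarial term scores the discriminator's confidence map against the &#x201C;real&#x201D; target 1; the weight value used below is illustrative, not the one from the paper.</p>

```python
import numpy as np

def multiclass_ce(seg_prob, onehot, eps=1e-7):
    """L_ce = -sum_{h,w} sum_c Y^(h,w,c) * log S(X)^(h,w,c)."""
    return -(onehot * np.log(np.clip(seg_prob, eps, 1.0))).sum()

def bce(p, t, eps=1e-7):
    p = np.clip(p, eps, 1.0 - eps)
    return -(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))

def full_supervised_seg_loss(seg_prob, onehot, d_conf, lam_adv=0.01):
    """L_seg^f = L_ce + lam_adv * L_adv, where L_adv pushes D's confidence
    on the generated map towards 1 (the generator-side GAN objective)."""
    return multiclass_ce(seg_prob, onehot) + lam_adv * bce(d_conf, 1.0).sum()

h = w = 4
onehot = np.zeros((h, w, 3)); onehot[..., 0] = 1.0   # every pixel is class 0
good = np.zeros((h, w, 3)); good[..., 0] = 0.98; good[..., 1:] = 0.01
d_conf = np.full((h, w), 0.9)                        # D mostly fooled
loss = full_supervised_seg_loss(good, onehot, d_conf)
```

A confident, correct prediction yields a small loss; a uniform (uninformative) prediction yields a much larger one, which is the gradient signal driving the segmentation network.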
</sec>
<sec id="s3_3_2"><label>3.3.2</label><title>Semi-Supervised Training</title>
<p>Since unlabeled data accounts for the majority of medical image datasets, we perform fully supervised training of the segmentation network for the first 1500 epochs so that it acquires preliminary segmentation capability; through the alternating updates, the discriminative network acquires discriminative power at the same time. Training then continues in a semi-supervised manner.</p>
<p>After fully supervised training, the discriminative network is able to produce high-confidence probability maps for the results generated by the segmentation network. Therefore, for unlabeled data, the GAN can generate a high-quality confidence map which, combined with a corresponding binarization process, yields pseudo-labels for supervised model training. The semi-supervised segmentation network loss <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:msubsup><mml:mi>L</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi></mml:mrow><mml:mi>s</mml:mi></mml:msubsup></mml:math></inline-formula> is shown in <xref ref-type="disp-formula" rid="eqn-5">Eq. (5)</xref>:
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:msubsup><mml:mi>L</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi></mml:mrow><mml:mi>s</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:msub><mml:mi>&#x03BB;</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:msub><mml:mi>&#x03BB;</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>m</mml:mi><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>m</mml:mi><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></disp-formula>where <italic>L<sub>ce</sub></italic> and <italic>L<sub>adv</sub></italic> are the same as in the fully supervised loss, and <italic>L<sub>semi</sub></italic> denotes the multi-class cross-entropy loss during semi-supervised training. Because pseudo-labels are introduced to update the network parameters, a hyperparameter <italic>&#x03BB;<sub>semi</sub></italic> is used to reduce the impact of incorrect pseudo-labels on the network and prevent large errors.</p>
<p>The semi-supervised multi-class cross-entropy loss function <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>m</mml:mi><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is given in <xref ref-type="disp-formula" rid="eqn-6">Eq. (6)</xref>.
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:mrow><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>m</mml:mi><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:munder><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:munder><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>c</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>C</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:mi>I</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>S</mml:mi><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi><mml:mo>,</mml:mo><mml:mi>c</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x003E;</mml:mo><mml:mrow><mml:msub><mml:mi>T</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>m</mml:mi><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msubsup><mml:mi>Y</mml:mi><mml:mi>n</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi><mml:mo>,</mml:mo><mml:mi>c</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>S</mml:mi><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi><mml:mo>,</mml:mo><mml:mi>c</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>where <italic>T<sub>semi</sub></italic> is the threshold used to binarize the confidence map of the discriminative network, and the indicator function <italic>I(</italic>&#x22C5;<italic>)</italic> selects the high-confidence positions of the confidence map for pseudo-label construction.</p>
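<p>The pseudo-labelling step of Eq. (6) can be sketched as follows: the discriminator's confidence map is thresholded by <italic>T<sub>semi</sub></italic>, pixels above the threshold receive a hard pseudo-label taken from the argmax of the segmentation output, and the multi-class cross-entropy is accumulated only over those trusted pixels. This is a NumPy sketch; the threshold value below is illustrative.</p>

```python
import numpy as np

def semi_loss(seg_prob, d_conf, t_semi=0.2, eps=1e-7):
    """L_semi: cross-entropy against pseudo-labels, masked to the
    high-confidence pixels of the discriminator's confidence map."""
    mask = (d_conf > t_semi).astype(float)       # indicator I(D(S(X)) > T_semi)
    pseudo = np.argmax(seg_prob, axis=-1)        # hard pseudo-label per pixel
    onehot = np.eye(seg_prob.shape[-1])[pseudo]  # one-hot pseudo-label map
    ce = -(onehot * np.log(np.clip(seg_prob, eps, 1.0))).sum(axis=-1)
    return (mask * ce).sum()

h = w = 4
seg = np.zeros((h, w, 3)); seg[..., 2] = 0.9; seg[..., :2] = 0.05
conf = np.zeros((h, w)); conf[:2] = 0.8          # only half the pixels trusted
loss = semi_loss(seg, conf)
```

Low-confidence pixels contribute nothing, which is exactly how the indicator function limits the damage an unreliable pseudo-label can do.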
</sec>
</sec>
<sec id="s3_4"><label>3.4</label><title>Experimental Environment and Parameter Settings</title>
<p>The model code is written with the PyTorch framework, and the model is trained on an Nvidia RTX 3060 GPU with 12&#x00A0;GB of memory. The specific parameter settings are shown in <xref ref-type="table" rid="table-1">Tab. 1</xref>.</p>
<table-wrap id="table-1"><label>Table 1</label><caption><title>Parameter settings for semi-supervised training</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left">Parameter</th>
<th align="left">Epoch</th>
<th align="left"><italic>T<sub>semi</sub></italic></th>
<th align="left">Momentum</th>
<th align="left">Weight decay</th>
<th align="left">Optimization</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Setup</td>
<td align="left">5000</td>
<td align="left">1500</td>
<td align="left">0.9</td>
<td align="left">1e&#x02212;4</td>
<td align="left">Adam</td>
</tr>
</tbody>
</table>
</table-wrap>
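The settings in Tab. 1 could be wired up as follows. This is a sketch, not the authors' code: the learning rate is not reported in Tab. 1, so the value below is a placeholder, and the momentum of 0.9 is interpreted here as Adam's first-moment coefficient.

```python
import torch

# Stand-in module for the segmentation network (the paper uses a U-Net).
model = torch.nn.Conv2d(1, 2, kernel_size=3, padding=1)

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-4,               # placeholder: the learning rate is not listed in Tab. 1
    betas=(0.9, 0.999),    # "Momentum" 0.9 from Tab. 1 mapped to Adam's beta1
    weight_decay=1e-4,     # weight decay from Tab. 1
)
```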
<p>We randomly divided all data into five parts. For fully supervised training, each part is used in turn as the test set, and the other four are used as training data. For semi-supervised training, each part is likewise used in turn as the test set; of the remaining four parts, two are used as labeled data for supervised training and the other two serve as unlabeled data for semi-supervised training.</p>
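The split described above can be sketched with the standard library as follows (a minimal illustration; the function name and the fixed seed are assumptions):

```python
import random

def five_fold_semi_splits(indices, seed=0):
    """For each fold i: fold i is the test set, two folds are labeled
    training data, and the remaining two are treated as unlabeled."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)
    folds = [idx[i::5] for i in range(5)]  # five roughly equal parts
    splits = []
    for i in range(5):
        rest = [folds[j] for j in range(5) if j != i]
        labeled = rest[0] + rest[1]        # two folds keep their labels
        unlabeled = rest[2] + rest[3]      # two folds used without labels
        splits.append((folds[i], labeled, unlabeled))
    return splits
```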
</sec>
<sec id="s3_5"><label>3.5</label><title>Evaluation Criteria</title>
<p>In this work, the performance of the model is evaluated from multiple perspectives using several criteria: mean intersection over union (MIOU), Dice coefficient, specificity, and precision. The semantic segmentation outcomes can be categorized as shown in <xref ref-type="fig" rid="fig-5">Fig. 5</xref>, where 0 and 1 denote the two classes of a sample, respectively.</p>
<fig id="fig-5"><label>Figure 5</label><caption><title>Four cases of semantic segmentation results</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="JNM_31113-fig-5.png"/></fig>
<p>Mean Intersection Over Union (<italic>MIOU</italic>) calculates the ratio of the intersection to the union of the predicted and ground-truth sets, as shown in <xref ref-type="disp-formula" rid="eqn-7">Eq. (7)</xref>.
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:mi>M</mml:mi><mml:mi>I</mml:mi><mml:mi>O</mml:mi><mml:mi>U</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p>The Dice coefficient (<italic>DICE</italic>) is a set similarity measure used to calculate the similarity between two samples. Its formula is given in <xref ref-type="disp-formula" rid="eqn-8">Eq. (8)</xref>. Dice takes values between 0 and 1; larger values indicate a larger overlapping region and a better segmentation prediction.
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:mi>D</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p>The specificity (<italic>S</italic>) is the proportion of actual negative samples that are correctly predicted as negative, as given in <xref ref-type="disp-formula" rid="eqn-9">Eq. (9)</xref>.
<disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:mi>S</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p>Precision (<italic>P</italic>) is the proportion of samples predicted as positive that are actually positive, as given in <xref ref-type="disp-formula" rid="eqn-10">Eq. (10)</xref>.
<disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:mrow><mml:mi>P</mml:mi><mml:mo>=</mml:mo></mml:mrow><mml:mfrac><mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
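The four criteria above can be computed from the pixel-level confusion counts with a small helper (a sketch; the function name is an assumption, MIOU follows Eq. (7) for the foreground class, and Dice is written in its standard form):

```python
def segmentation_metrics(tp, fp, tn, fn):
    """Evaluation criteria given true/false positive/negative pixel counts."""
    miou = tp / (tp + fp + fn)             # intersection over union, Eq. (7)
    dice = 2 * tp / (2 * tp + fp + fn)     # Dice coefficient (standard form)
    specificity = tn / (tn + fp)           # Eq. (9)
    precision = tp / (tp + fp)             # Eq. (10)
    return miou, dice, specificity, precision
```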
</sec>
<sec id="s3_6"><label>3.6</label><title>Experimental Results</title>
<p>We first evaluate the fully supervised training performance of the adversarial learning model with different neural networks on the arterial dataset, as shown in <xref ref-type="table" rid="table-2">Tab. 2</xref>. <italic>MIOU</italic>, <italic>DICE</italic>, <italic>S</italic> and <italic>P</italic> denote mean intersection over union, Dice coefficient, specificity, and precision, respectively. Compared with Attention U-Net, which adds an attention mechanism to U-Net, the results of adversarial learning averaged over three runs are improved in <italic>MIOU</italic>, <italic>DICE</italic> and <italic>P</italic>, which demonstrates the advantage of adversarial learning.</p>
<table-wrap id="table-2"><label>Table 2</label><caption><title>Fully supervised training evaluation results</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left">Method</th>
<th align="left"><italic>MIOU</italic></th>
<th align="left"><italic>DICE</italic></th>
<th align="left"><italic>S</italic></th>
<th align="left"><italic>P</italic></th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">U-Net</td>
<td align="left">0.81792</td>
<td align="left">0.87864</td>
<td align="left">0.98587</td>
<td align="left">0.90327</td>
</tr>
<tr>
<td align="left">Attention U-Net</td>
<td align="left">0.83178</td>
<td align="left">0.89190</td>
<td align="left"><bold>0.98900</bold></td>
<td align="left">0.90367</td>
</tr>
<tr>
<td align="left">Adv U-Net(Proposed)</td>
<td align="left"><bold>0.85057</bold></td>
<td align="left"><bold>0.90507</bold></td>
<td align="left">0.98487</td>
<td align="left"><bold>0.92367</bold></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Due to the small amount of data, we performed five-fold cross-validation for semi-supervised training. When the labeled data is reduced to 1/2, 1/4, and 1/8 of the original labeled data, the segmentation results are shown in <xref ref-type="table" rid="table-3">Tab. 3</xref>. Under the premise of the same total amount of data, the proposed semi-supervised method achieves performance comparable to fully supervised U-Net when the labeled training data is reduced to 1/2.</p>
<table-wrap id="table-3"><label>Table 3</label><caption><title>Semi-supervised learning results with different ratios</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left"/>
<th align="left"><italic>MIOU</italic></th>
<th align="left"><italic>DICE</italic></th>
<th align="left"><italic>S</italic></th>
<th align="left"><italic>P</italic></th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">1/2 semi-supervised</td>
<td align="left">0.81028</td>
<td align="left">0.87595</td>
<td align="left">0.98127</td>
<td align="left">0.90787</td>
</tr>
<tr>
<td align="left">1/4 semi-supervised</td>
<td align="left">0.77837</td>
<td align="left">0.84930</td>
<td align="left">0.98293</td>
<td align="left">0.88400</td>
</tr>
<tr>
<td align="left">1/8 semi-supervised</td>
<td align="left">0.72426</td>
<td align="left">0.81167</td>
<td align="left">0.98373</td>
<td align="left">0.84600</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The experimental results in <xref ref-type="table" rid="table-3">Tab. 3</xref> verify the feasibility of the method and provide a new idea for medical image segmentation. <xref ref-type="fig" rid="fig-6">Fig. 6</xref> shows the prediction results of this method under each setting. The 1/2 semi-supervised segmentation performance is comparable to full supervision, while the 1/8 semi-supervised model is inferior, especially for pulmonary aorta segmentation. For slices containing only the aorta, the semi-supervised models trained at all ratios segment the aorta with good contours.</p>
<fig id="fig-6"><label>Figure 6</label><caption><title>The prediction result of the proposed model. 1/2, 1/4, 1/8 represent the results of the semi-supervised training process reducing the labeled data to the proportion of the original fully supervised data, respectively</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="JNM_31113-fig-6a.png"/><graphic mimetype="image" mime-subtype="png" xlink:href="JNM_31113-fig-6b.png"/></fig>
</sec>
</sec>
<sec id="s4"><label>4</label><title>Concluding Remarks</title>
<p>In this work, we adopt an adversarial learning framework for semi-supervised training of segmentation networks. The model trained with half of the labeled data performs close to the fully supervised model, which demonstrates the advantage of semi-supervised learning in medical image segmentation: a segmentation network can be trained with only a small amount of labeled data. A remaining drawback is that training requires repeated adjustment of the threshold to optimize pseudo-label generation.</p>
</sec>
</body>
<back>
<ack>
<p>The authors would like to thank the support of Central South University of Forestry &#x0026; Technology and the support of the Second Xiangya Hospital of Central South University.</p>
</ack>
<fn-group>
<fn fn-type="other"><p><bold>Funding Statement:</bold> This work was supported in part by the National Natural Science Foundation of China (No. 62002392); in part by the Key Research and Development Plan of Hunan Province (No. 2019SK2022); in part by the Natural Science Foundation of Hunan Province (No. 2020JJ4140 and 2020JJ4141); in part by the Postgraduate Excellent teaching team Project of Hunan Province [Grant [2019] 370&#x2013;133].</p></fn>
<fn fn-type="conflict"><p><bold>Conflicts of Interest:</bold> The authors declare that they have no conflicts of interest to report regarding the present study.</p></fn>
</fn-group>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Tan</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Tan</surname></string-name>, <string-name><given-names>X. Y.</given-names> <surname>Xiang</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Tang</surname></string-name>, <string-name><given-names>J. H.</given-names> <surname>Qin</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Automatic detection of aortic dissection based on morphology and deep learning</article-title>,&#x201D; <source>Computers, Materials &#x0026; Continua</source>, vol. <volume>62</volume>, no. <issue>3</issue>, pp. <fpage>1201</fpage>&#x2013;<lpage>1215</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S. A.</given-names> <surname>Taghanaki</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Abhishek</surname></string-name>, <string-name><given-names>J. P.</given-names> <surname>Cohen</surname></string-name> and <string-name><given-names>G.</given-names> <surname>Hamarneh</surname></string-name></person-group>, &#x201C;<article-title>Deep semantic segmentation of natural and medical images: A review</article-title>,&#x201D; <source>Artificial Intelligence Review</source>, vol. <volume>54</volume>, no. <issue>1</issue>, pp. <fpage>137</fpage>&#x2013;<lpage>178</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>O.</given-names> <surname>Ronneberger</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Fischer</surname></string-name> and <string-name><given-names>T.</given-names> <surname>Brox</surname></string-name></person-group>, &#x201C;<article-title>U-Net: Convolutional networks for biomedical image segmentation</article-title>,&#x201D; in <conf-name>Proc. Int. Conf. on Medical Image Computing and Computer-Assisted Intervention</conf-name>, <conf-loc>Springer, Cham</conf-loc>, pp. <fpage>234</fpage>&#x2013;<lpage>241</lpage>, <year>2015</year>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>G.</given-names> <surname>Hou</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Qin</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Xiang</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Tan</surname></string-name> and <string-name><given-names>N. N.</given-names> <surname>Xiong</surname></string-name></person-group>, &#x201C;<article-title>AF-net: A medical image segmentation network based on attention mechanism and feature fusion</article-title>,&#x201D; <source>Computers, Materials &#x0026; Continua</source>, vol. <volume>69</volume>, no. <issue>2</issue>, pp. <fpage>1877</fpage>&#x2013;<lpage>1891</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Q. Y.</given-names> <surname>Zhou</surname></string-name>, <string-name><given-names>J. H.</given-names> <surname>Qin</surname></string-name>, <string-name><given-names>X. Y.</given-names> <surname>Xiang</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Tan</surname></string-name> and <string-name><given-names>Y.</given-names> <surname>Ren</surname></string-name></person-group>, &#x201C;<article-title>MOLS-Net: Multi-organ and lesion segmentation network based on sequence feature pyramid and attention mechanism for aortic dissection diagnosis</article-title>,&#x201D; <source>Knowledge-Based Systems</source>, vol. <volume>239</volume>, no. <issue>2022</issue>, pp. <fpage>107853</fpage>&#x2013;<lpage>107864</lpage>, <year>2022</year>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Tarvainen</surname></string-name> and <string-name><given-names>H.</given-names> <surname>Valpola</surname></string-name></person-group>, &#x201C;<article-title>Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results</article-title>,&#x201D; in <conf-name>Proc. the 31st Int. Conf. on Neural Information Processing Systems</conf-name>, <conf-loc>Long Beach, California, US</conf-loc>, pp. <fpage>1195</fpage>&#x2013;<lpage>1204</lpage>, <year>2017</year>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><given-names>L. Y.</given-names> <surname>Sun</surname></string-name>, <string-name><given-names>J. X.</given-names> <surname>Wu</surname></string-name>, <string-name><given-names>X. H.</given-names> <surname>Ding</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Huang</surname></string-name>, <string-name><given-names>G. S.</given-names> <surname>Wang</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>A teacher-student framework for semi-supervised medical image segmentation from mixed supervision</article-title>,&#x201D; arXiv preprint arXiv:2010.12219, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>I.</given-names> <surname>Goodfellow</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Pouget-Abadie</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Mirza</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Xu</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Warde-Farley</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Generative adversarial nets</article-title>,&#x201D; <source>Advances in Neural Information Processing Systems</source>, vol. <volume>27</volume>, pp. <fpage>2672</fpage>&#x2013;<lpage>2680</lpage>, <year>2014</year>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Son</surname></string-name>, <string-name><given-names>S. J.</given-names> <surname>Park</surname></string-name> and <string-name><given-names>K. H.</given-names> <surname>Jung</surname></string-name></person-group>, &#x201C;<article-title>Towards accurate segmentation of retinal vessels and the optic disc in fundoscopic images with generative adversarial networks</article-title>,&#x201D; <source>Journal of Digital Imaging</source>, vol. <volume>32</volume>, pp. <fpage>499</fpage>&#x2013;<lpage>512</lpage>, <year>2017</year>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S. P.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>J. M.</given-names> <surname>Hong</surname></string-name>, <string-name><given-names>J. P.</given-names> <surname>Liang</surname></string-name>, <string-name><given-names>X. P.</given-names> <surname>Jia</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Ouyang</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Medical image segmentation using semi-supervised conditional generative adversarial nets</article-title>,&#x201D; <source>Ruan Jian Xue Bao/Journal of Software</source>, vol. <volume>31</volume>, no. <issue>8</issue>, pp. <fpage>2588</fpage>&#x2013;<lpage>2602</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="other"><person-group person-group-type="author"><string-name><given-names>G.</given-names> <surname>French</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Aila</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Laine</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Mackiewicz</surname></string-name> and <string-name><given-names>G.</given-names> <surname>Finlayson</surname></string-name></person-group>, &#x201C;<article-title>Semi-supervised semantic segmentation needs strong, varied perturbation</article-title>,&#x201D; arXiv preprint arXiv: 1906.01916, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Z.</given-names> <surname>Zeng</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Xie</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Zhang</surname></string-name> and <string-name><given-names>Y.</given-names> <surname>Lu</surname></string-name></person-group>, &#x201C;<article-title>RIC-Unet: An improved neural network based on unet for nuclei segmentation in histology images</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>7</volume>, pp. <fpage>21420</fpage>&#x2013;<lpage>21428</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Fang</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>V. S.</given-names> <surname>Sheng</surname></string-name> and <string-name><given-names>Y. W.</given-names> <surname>Ding</surname></string-name></person-group>, &#x201C;<article-title>A method for improving CNN-based image recognition using DCGAN</article-title>,&#x201D; <source>Computers, Materials &#x0026; Continua</source>, vol. <volume>57</volume>, no. <issue>1</issue>, pp. <fpage>167</fpage>&#x2013;<lpage>178</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Wang</surname></string-name> and <string-name><given-names>D. L.</given-names> <surname>Liu</surname></string-name></person-group>, &#x201C;<article-title>A modified leaky ReLU scheme (MLRS) for topology optimization with multiple materials</article-title>,&#x201D; <source>Applied Mathematics and Computation</source>, vol. <volume>352</volume>, pp. <fpage>188</fpage>&#x2013;<lpage>204</lpage>, <year>2019</year>.</mixed-citation></ref>
</ref-list>
</back>
</article>