<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">57269</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2024.057269</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Secure Medical Image Retrieval Based on Multi-Attention Mechanism and Triplet Deep Hashing</article-title>
<alt-title alt-title-type="left-running-head">Secure Medical Image Retrieval Based on Multi-Attention Mechanism and Triplet Deep Hashing</alt-title>
<alt-title alt-title-type="right-running-head">Secure Medical Image Retrieval Based on Multi-Attention Mechanism and Triplet Deep Hashing</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Zhang</surname><given-names>Shaozheng</given-names></name></contrib>
<contrib id="author-2" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Zhang</surname><given-names>Qiuyu</given-names></name><email>zhangqy@lut.edu.cn</email></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Tang</surname><given-names>Jiahui</given-names></name></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Xu</surname><given-names>Ruihua</given-names></name></contrib>
<aff><institution>School of Computer and Communication, Lanzhou University of Technology</institution>, <addr-line>Lanzhou, 730000</addr-line>, <country>China</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Qiuyu Zhang. Email: <email>zhangqy@lut.edu.cn</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2025</year>
</pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>17</day><month>02</month><year>2025</year>
</pub-date>
<volume>82</volume>
<issue>2</issue>
<fpage>2137</fpage>
<lpage>2158</lpage>
<history>
<date date-type="received">
<day>13</day>
<month>8</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>01</day>
<month>11</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2025 The Authors.</copyright-statement>
<copyright-year>2025</copyright-year>
<copyright-holder>Published by Tech Science Press.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_57269.pdf"></self-uri>
<abstract>
<p>Medical institutions frequently use cloud servers to store digital medical imaging data in order to lower both storage and computational costs. However, cloud servers are third-party providers whose reliability cannot always be guaranteed. To prevent the exposure and misuse of private personal information while achieving secure and efficient retrieval, this paper proposes a secure medical image retrieval method based on a multi-attention mechanism and triplet deep hashing (abbreviated as MATDH). Specifically, the method first enhances chest X-ray images using contrast-limited adaptive histogram equalization applicable to color images. Next, a purpose-designed multi-attention mechanism focuses on important local features during the feature extraction stage. Moreover, a triplet loss function is used to learn discriminative hash codes, yielding a compact and efficient triplet deep hashing scheme. Finally, upsampling is used to restore the original resolution of the images during retrieval, thereby enabling more accurate matching. To ensure the security of medical image data, a lightweight image encryption method based on frequency domain encryption is designed to encrypt the chest X-ray images. Experimental results on the COVIDx dataset indicate that, compared with various advanced image retrieval techniques, the proposed approach improves the precision of feature extraction and retrieval. Additionally, it offers enhanced protection for the confidentiality of medical images stored in cloud settings and demonstrates strong practicality.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Secure medical image retrieval</kwd>
<kwd>multi-attention mechanism</kwd>
<kwd>triplet deep hashing</kwd>
<kwd>image enhancement</kwd>
<kwd>lightweight image encryption</kwd>
</kwd-group>
<funding-group>
<award-group id="awg1">
<funding-source>National Natural Science Foundation of China</funding-source>
<award-id>61862041</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>As medical imaging methods become increasingly widespread, the volume of medical imaging information has surged dramatically, creating critical challenges in storage, retrieval, information security, and optimal utilization for all healthcare institutions [<xref ref-type="bibr" rid="ref-1">1</xref>]. To extract more of the information latent in medical imaging data, technologies for retrieving similar medical images have attracted widespread attention from researchers [<xref ref-type="bibr" rid="ref-2">2</xref>]. For instance, content-based medical image retrieval (CBMIR) technology is capable of automatically extracting visual features from images [<xref ref-type="bibr" rid="ref-3">3</xref>&#x2013;<xref ref-type="bibr" rid="ref-5">5</xref>]. Furthermore, deep hashing-based medical image retrieval technology provides a new approach to image feature extraction, maintaining the original semantic information of images by constructing high-quality compact binary hash codes [<xref ref-type="bibr" rid="ref-6">6</xref>]. In addition, information security in medical imaging is a critical issue that cannot be ignored: medical imaging information stored in the cloud must be kept safe, and the leakage of personal, institutional, and national healthcare-related information must be prevented.</p>
<p>Currently, deep hashing technology has become an essential approach for medical image retrieval. It offers rapid retrieval and reduced storage costs [<xref ref-type="bibr" rid="ref-7">7</xref>,<xref ref-type="bibr" rid="ref-8">8</xref>], enhancing the precision and effectiveness of similar medical image searches. Representative methods include TSDSH [<xref ref-type="bibr" rid="ref-9">9</xref>], MCRLDH [<xref ref-type="bibr" rid="ref-10">10</xref>], DUDH [<xref ref-type="bibr" rid="ref-11">11</xref>], SWTH [<xref ref-type="bibr" rid="ref-12">12</xref>], and DGSSH [<xref ref-type="bibr" rid="ref-13">13</xref>].</p>
<p>In recent years, the advent of cloud storage solutions has significantly improved the ability of healthcare institutions to store medical images. Although the vast storage capacity and extensive computing power of cloud servers (CS) reduce the strain on local image storage and management, healthcare institutions consequently forfeit direct control over medical image data [<xref ref-type="bibr" rid="ref-14">14</xref>]. An increasingly serious issue in the cloud storage of medical data is its illegal replication, modification, and forgery [<xref ref-type="bibr" rid="ref-15">15</xref>]. Therefore, it is essential to protect data integrity and prevent unauthorized access. Chang et al. [<xref ref-type="bibr" rid="ref-16">16</xref>] addressed cloud platform insecurities by adopting symmetric encryption to prevent unauthorized access to or modification of patient data, ensuring medical data security during storage. Safeguarding the privacy of medical images in the cloud thus necessitates encryption techniques, which are vital for secure medical image retrieval. Traditional encryption algorithms such as the Data Encryption Standard (DES), Advanced Encryption Standard (AES), and Rivest-Shamir-Adleman (RSA) are not well suited to encrypting multimedia data. Homomorphic encryption (HE) is key to privacy-preserving computation, but no currently feasible HE scheme can encrypt large volumes of images, and HE remains complex and time-consuming. By contrast, chaotic systems, known for their sensitivity to initial conditions, randomness, and ergodicity, are extensively used in multimedia data encryption.</p>
<p>To ensure the privacy and security of medical imaging data, this paper proposes a secure retrieval method for chest X-ray (CXR) images using deep hashing, a multi-attention mechanism, and lightweight encryption. The method, termed MATDH (Multi-Attention Triplet Deep Hashing), offers secure and efficient medical image retrieval. Key contributions are as follows:</p>
<p>1) CXR images are enhanced using contrast-limited adaptive histogram equalization, which adjusts the enhancement level dynamically according to local contrast, thereby better preserving details and improving image discriminability and expressiveness.</p>
<p>2) A triplet deep hashing model is designed, combining channel and enhanced spatial attention mechanisms to focus on local features, significantly improving the accuracy of important region extraction in CXR images.</p>
<p>3) Building on the principle of frequency domain encryption, a lightweight CXR image encryption method based on chaos theory is proposed. The method secures medical images while reducing encryption complexity and speeding up both encryption and decryption.</p>
<p>The paper is organized as follows: <xref ref-type="sec" rid="s2">Section 2</xref> reviews related research, <xref ref-type="sec" rid="s3">Section 3</xref> details the medical image secure retrieval method, <xref ref-type="sec" rid="s4">Section 4</xref> validates the scheme through experiments and compares its performance with existing methods, and <xref ref-type="sec" rid="s5">Section 5</xref> concludes the work.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related Work</title>
<sec id="s2_1">
<label>2.1</label>
<title>Medical Image Retrieval Using Deep Hashing Techniques</title>
<p>Given the excellent performance of artificial intelligence in various fields, deep hashing technology has been extensively applied to medical image retrieval [<xref ref-type="bibr" rid="ref-17">17</xref>&#x2013;<xref ref-type="bibr" rid="ref-20">20</xref>]. Medical image retrieval methods using deep hashing are generally divided into supervised deep hashing methods [<xref ref-type="bibr" rid="ref-8">8</xref>,<xref ref-type="bibr" rid="ref-21">21</xref>&#x2013;<xref ref-type="bibr" rid="ref-23">23</xref>] and unsupervised deep hashing methods [<xref ref-type="bibr" rid="ref-18">18</xref>,<xref ref-type="bibr" rid="ref-24">24</xref>,<xref ref-type="bibr" rid="ref-25">25</xref>]. However, for high-precision medical image retrieval, unsupervised hashing methods suffer from a semantic gap because label information is absent during network training, which lowers the retrieval accuracy of medical images [<xref ref-type="bibr" rid="ref-26">26</xref>]. Therefore, for medical image datasets with complete label information, supervised deep hashing methods have become the preferred solution for constructing binary hash codes. For example, Wang et al. [<xref ref-type="bibr" rid="ref-8">8</xref>] integrated the triplet constraint directly into medical image feature learning to capture the intricate relationships among medical images, and used encoding-decoding networks to enhance the discriminative strength of the generated hash codes. However, this method cannot focus on the detail areas and important features of medical images because it does not enhance the images before retrieval. Fang et al. [<xref ref-type="bibr" rid="ref-22">22</xref>] proposed an attention-based triplet hashing network that effectively retains classification and limited sample information while learning binary hash codes. 
This method combines cross-entropy loss with triplet loss, simultaneously training similarity loss and classification loss to maintain classification information in hash codes, achieving maximum class discrimination and hash code distinguishability. However, the attention mechanism designed in this method still cannot effectively focus on the important areas of medical images.</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Secure Medical Image Retrieval</title>
<p>Given that medical images contain patient information, which is considered highly sensitive data, hospitals or other healthcare institutions, as data owners, do not wish for every user to have access to and view these images. Instead, users are granted access only after authorization [<xref ref-type="bibr" rid="ref-27">27</xref>]. To ensure privacy and confidentiality during the medical image retrieval process, various security services are used, among which image encryption technology has become a crucial means of protecting medical image information from attacks [<xref ref-type="bibr" rid="ref-28">28</xref>&#x2013;<xref ref-type="bibr" rid="ref-32">32</xref>]. Haddad et al. [<xref ref-type="bibr" rid="ref-28">28</xref>] utilized the cipher block chaining mode in AES and combined it with image watermarking technology to encrypt medical images, allowing tracking and controlling the reliability of medical images from encrypted or compressed domains. However, this method not only incurs high encryption costs and complexity but also introduces image quality loss. Guo et al. [<xref ref-type="bibr" rid="ref-29">29</xref>] developed a convolutional neural network (CNN) framework that protects privacy by allowing the use of homomorphic encryption technology for classifying and retrieving encrypted medical images, enhancing the security of the retrieval scheme. Nevertheless, the communication and computational costs significantly increase, making it unsuitable for use in resource-constrained scenarios. Kumar et al. [<xref ref-type="bibr" rid="ref-30">30</xref>] employed the idea of dual encryption to ensure the privacy of medical image retrieval processes. The first-level image encryption employs chaotic Arnold mapping to encrypt the images while preserving their statistical characteristics. The second level of encryption further boosts security and reduces the chances of cryptographic attacks. 
However, the simple chaotic Arnold mapping used in the first-level encryption cannot achieve perfect secrecy for query images: the encrypted visual content still reveals information about the true content of the query images, posing a risk of attack. In addition, the methods used for classification and relevance scoring may lead to privacy leaks and other issues.</p>
<p>To address these limitations, this paper proposes a lightweight medical image encryption algorithm that combines chaos-based frequency domain encryption with the PRESENT algorithm to ensure security and reduce costs in image retrieval. In addition, a triplet deep hashing method with multi-attention (MATDH), integrating channel and enhanced spatial attention mechanisms, improves the discriminability of CXR images. The resulting secure medical image retrieval method enhances efficiency and accuracy while safeguarding data privacy.</p>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>The Proposed Method</title>
<sec id="s3_1">
<label>3.1</label>
<title>System Model</title>
<p><xref ref-type="fig" rid="fig-1">Fig. 1</xref> shows the system model for secure medical image retrieval, which is designed to prevent cloud servers from leaking medical images that contain private patient information.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>System model</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-1.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>, the model consists primarily of three entities: Cloud Server (CS), Data Owner (DO), and Data User (DU). Their main responsibilities in the system model are as follows:</p>
<p>1) Data Owner (DO): DO encrypts medical images using lightweight encryption and uploads them to the cloud. The images are enhanced via contrast-limited adaptive histogram equalization and processed to generate triplet deep hashing codes. DO then creates an image feature index set by linking image numbers with hashing codes and uploads it to the cloud as a hash index table.</p>
<p>2) Cloud Server (CS): CS stores the encrypted images and the hash index table. Upon receiving a query, CS returns <italic>r</italic> semantically similar encrypted images to DU.</p>
<p>3) Data User (DU): DU generates a triplet deep hashing code for the query image, and submits it to CS, which performs similarity matching and returns <italic>r</italic> encrypted images. DU decrypts them and identifies the top-<italic>k</italic> similar images.</p>
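<p>The matching step performed by CS can be illustrated with a minimal sketch. It assumes that hash codes are stored as integers and ranked by Hamming distance; the function names, the toy index table, and the choice of Hamming distance as the similarity measure are illustrative assumptions rather than details fixed by the system model.</p>
<code language="python"><![CDATA[
```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query_code: int, hash_index: dict, r: int) -> list:
    """Return the IDs of the r indexed images closest to the query code.

    Ties are broken by image ID so that the ranking is deterministic.
    """
    ranked = sorted(
        hash_index,
        key=lambda image_id: (hamming_distance(query_code, hash_index[image_id]), image_id),
    )
    return ranked[:r]


# Hypothetical hash index table: image number -> 8-bit hash code.
hash_index = {
    "img_001": 0b10110010,
    "img_002": 0b10110011,
    "img_003": 0b01001101,
    "img_004": 0b10010010,
}

query = 0b10110110
print(retrieve(query, hash_index, r=2))  # ['img_001', 'img_002']
```
]]></code>
<p>In this sketch CS only ever sees hash codes and image numbers, so it can rank candidates without access to the plaintext images; DU would then decrypt the <italic>r</italic> returned images locally.</p>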
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Medical Image Enhancement</title>
<p>CXR images are crucial in medical diagnosis, and image quality significantly affects diagnostic accuracy. However, CXR images often suffer from insufficient contrast or uneven illumination, which affects the accuracy of feature extraction [<xref ref-type="bibr" rid="ref-33">33</xref>]. Therefore, to better display the details in CXR images and enhance the precision of feature extraction, this paper utilizes the contrast-limited adaptive histogram equalization technique applicable to color images [<xref ref-type="bibr" rid="ref-34">34</xref>] to enhance CXR images. This method dynamically adjusts the enhancement level based on the local contrast of CXR images, thereby better preserving image details and avoiding issues such as excessive noise enhancement or detail loss that may occur with traditional grayscale histogram equalization methods [<xref ref-type="bibr" rid="ref-35">35</xref>].</p>
<p>The detailed process for enhancing CXR images includes the following steps:</p>
<p><bold>Step 1:</bold> Convert the color image from the RGB color space to the YUV color space, as given by <xref ref-type="disp-formula" rid="eqn-1">Eq. (1)</xref>:
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>Y</mml:mi><mml:mi>U</mml:mi><mml:mi>V</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>R</mml:mi><mml:mi>G</mml:mi><mml:mi>B</mml:mi><mml:mn>2</mml:mn><mml:mi>Y</mml:mi><mml:mi>U</mml:mi><mml:mi>V</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>R</mml:mi><mml:mi>G</mml:mi><mml:mi>B</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mspace width="negativethinmathspace" /><mml:mo>.</mml:mo></mml:math></disp-formula></p>
<p><bold>Step 2:</bold> Separate the luminance channel. The luminance channel <italic>Y</italic> is extracted from the <italic>YUV</italic> color space, as shown in <xref ref-type="disp-formula" rid="eqn-2">Eq. (2)</xref>:
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mi>Y</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub><mml:mo>&#x00D7;</mml:mo><mml:mi>R</mml:mi><mml:mo>+</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mo>&#x00D7;</mml:mo><mml:mi>G</mml:mi><mml:mo>+</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mrow><mml:mi>b</mml:mi></mml:mrow></mml:msub><mml:mo>&#x00D7;</mml:mo><mml:mi>B</mml:mi><mml:mo>,</mml:mo></mml:math></disp-formula>where <italic>W</italic><sub><italic>r</italic></sub>, <italic>W</italic><sub><italic>g</italic></sub>, and <italic>W</italic><sub><italic>b</italic></sub> are the weight factors for the red, green, and blue channels, respectively.</p>
<p><bold>Step 3:</bold> Enhance the luminance channel <italic>Y</italic> using the contrast-limited adaptive histogram equalization technique applicable to color images.</p>
<p><bold>Step 4:</bold> Reconstruct the <italic>YUV</italic> image by recombining the enhanced luminance channel with the chrominance channels.</p>
<p><bold>Step 5:</bold> Convert the enhanced color image back to the RGB color space, enabling it to serve as the correct input image for subsequent network models.</p>
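<p>The five steps can be sketched in plain Python as follows. This is a simplification under stated assumptions, not the implementation used in this paper: the luminance weights are the BT.601 values (the text does not fix <italic>W</italic><sub><italic>r</italic></sub>, <italic>W</italic><sub><italic>g</italic></sub>, and <italic>W</italic><sub><italic>b</italic></sub>), and the clip-limited equalization treats the whole image as a single tile, whereas full contrast-limited adaptive histogram equalization works on a grid of tiles and interpolates between their mappings. All function names and the clip_limit parameter are illustrative.</p>
<code language="python"><![CDATA[
```python
def rgb_to_yuv(r, g, b):
    """Steps 1-2: RGB -> YUV with BT.601 weights (assumed, not from the paper)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, 0.492 * (b - y), 0.877 * (r - y)


def yuv_to_rgb(y, u, v):
    """Step 5: exact inverse of rgb_to_yuv."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


def clipped_equalize(values, clip_limit=4.0, levels=256):
    """Step 3 (single tile): histogram equalization with a clipped histogram."""
    hist = [0] * levels
    for value in values:
        hist[value] += 1
    # Clip each bin and spread the excess uniformly; this bounds the contrast gain.
    limit = max(1, int(clip_limit * len(values) / levels))
    excess = sum(max(0, h - limit) for h in hist)
    hist = [min(h, limit) + excess / levels for h in hist]
    # Map every level through the normalized cumulative histogram.
    total = sum(hist)
    cdf, running = [], 0.0
    for h in hist:
        running += h
        cdf.append(running / total)
    return [round((levels - 1) * cdf[value]) for value in values]


def clamp8(x):
    """Round and clamp a value to the 8-bit range."""
    return min(255, max(0, round(x)))


def enhance(image, clip_limit=4.0):
    """image: list of rows of (R, G, B) tuples with 8-bit components."""
    yuv = [[rgb_to_yuv(*px) for px in row] for row in image]
    flat_y = [clamp8(y) for row in yuv for (y, _, _) in row]
    new_y = clipped_equalize(flat_y, clip_limit)
    width = len(image[0])
    out = []
    for i, row in enumerate(yuv):
        new_row = []
        for j, (_, u, v) in enumerate(row):
            # Step 4: enhanced luminance recombined with the original chrominance.
            rgb = yuv_to_rgb(new_y[i * width + j], u, v)
            new_row.append(tuple(clamp8(c) for c in rgb))
        out.append(new_row)
    return out


# Toy low-contrast 2x2 gray image.
image = [[(100, 100, 100), (101, 101, 101)],
         [(102, 102, 102), (103, 103, 103)]]
print(enhance(image))
```
]]></code>
<p>Only the luminance channel is modified, so the chrominance of each pixel is carried through unchanged apart from the final rounding and clamping to the 8-bit range.</p>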
<p><xref ref-type="fig" rid="fig-2">Fig. 2</xref> compares a CXR image before and after enhancement with the proposed method.</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Enhanced image <italic>vs</italic>. the original: (a) Original Image, (b) Enhanced Image</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-2.tif"/>
</fig>
<p><xref ref-type="fig" rid="fig-2">Fig. 2</xref> shows that the enhanced image exhibits clearer details than the original, unenhanced CXR image, without excessive distortion. This indicates that the method is effective and suitable for enhancing CXR and similar medical images.</p>
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Construction of Triplet Deep Hashing with Multi-Attention Mechanism</title>
<p>Within the feature extraction module shown in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>, based on the image characteristics of CXR images, this paper combines the channel attention mechanism [<xref ref-type="bibr" rid="ref-36">36</xref>] with the designed enhanced spatial attention mechanism. This combination enables the neural network to focus more on locally important features during feature extraction. Additionally, the triplet loss function is utilized to improve the discriminative ability of CXR image features and reduce the redundancy in the embedding space, thereby achieving triplet deep hashing. <xref ref-type="fig" rid="fig-3">Fig. 3</xref> shows the network architecture of MATDH.</p>
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>The network structure of MATDH</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-3.tif"/>
</fig>
<p>During feature extraction, a UNet-based architecture is used [<xref ref-type="bibr" rid="ref-37">37</xref>], incorporating a dual attention mechanism in residual blocks to emphasize important local features of CXR images. Training combines triplet loss and reconstruction loss to enhance feature distinctiveness. As shown in <xref ref-type="fig" rid="fig-3">Fig. 3</xref>, the feature encoder includes convolutional layers, max-pooling layers, downsampling, residual blocks with channel and enhanced spatial attention (RCESA), average pooling layers, fully connected layers, and hash layers.</p>
<p>For the <italic>i-</italic>th CXR image, two large-kernel 2D convolutional layers are applied to capture local features, followed by a max pooling operation to downsize the image for better focus on local content. Four RCESA modules are then stacked to extract features, with downsampling after the first three modules to capture more abstract high-level features. The feature map from the fourth module is processed through an average pooling layer and a dense layer, producing a 1000-dimensional feature vector <italic>f</italic><sub><italic>i</italic></sub>.</p>
<p>During feature decoding, alternating max-pooling and upsampling operations reconstruct the feature maps from the encoding phase. A composite loss function integrates triplet loss, which enhances feature discriminability by ensuring that similar samples lie closer together than dissimilar ones. The multi-attention mechanism highlights important regions, allowing the network to focus on key features and thereby improving the distinction between similar and dissimilar samples. This approach retains essential information while minimizing redundancy. After the loss is minimized, a <italic>k</italic>-bit deep hash code representing CXR features is generated in the hashing layer, which consists of a fully connected layer and a hash function. The resulting hash codes are stored in a hash index table for future retrieval. The hash code of the <italic>i-</italic>th CXR image can be represented as <xref ref-type="disp-formula" rid="eqn-3">Eq. (3)</xref>:
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:msub><mml:mi>b</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>h</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03C9;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:math></disp-formula>where <italic>b</italic><sub><italic>i</italic></sub> &#x2208; {0, 1}<sup><italic>k</italic></sup> is the <italic>k</italic>-bit hash code of the <italic>i-</italic>th CXR image <italic>X</italic><sub><italic>i</italic></sub>, <italic>f</italic><sub><italic>i</italic></sub> is its feature vector, <italic>&#x03C9;</italic> represents the trainable mapping in the fully connected layer, and <italic>h</italic>(<italic>x</italic>) denotes the hash function, applied element-wise and defined as <xref ref-type="disp-formula" rid="eqn-4">Eq. (4)</xref>:
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mi>h</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="left" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mn>0</mml:mn><mml:mspace width="2em" /><mml:mi>x</mml:mi><mml:mspace width="thinmathspace" /><mml:mo>&#x003C;</mml:mo><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>1</mml:mn><mml:mspace width="2em" /><mml:mi>x</mml:mi><mml:mspace width="thinmathspace" /><mml:mo>&#x2265;</mml:mo><mml:mn>0.</mml:mn></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow></mml:math></disp-formula></p>
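<p>Taken together, Eqs. (3) and (4) amount to thresholding each output of the fully connected layer at zero. The short sketch below illustrates this; the function names and the eight sample values are hypothetical and serve only to show the mapping.</p>
<code language="python"><![CDATA[
```python
def hash_function(x: float) -> int:
    """Eq. (4): 0 for negative inputs, 1 otherwise."""
    return 1 if x >= 0 else 0


def to_hash_code(projected: list) -> list:
    """Eq. (3): apply the hash function to each of the k layer outputs."""
    return [hash_function(x) for x in projected]


# Hypothetical outputs of the fully connected layer for one image (k = 8).
projected = [0.73, -1.20, 0.00, 2.41, -0.05, 0.18, -3.30, 0.90]
code = to_hash_code(projected)
print("".join(map(str, code)))  # 10110101
```
]]></code>
<p>Note that an output of exactly zero maps to 1, following the second case of Eq. (4).</p>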
<sec id="s3_3_1">
<label>3.3.1</label>
<title>Multi-Attention Mechanism</title>
<p>To determine the importance of key regions and of individual channels in CXR images concurrently and adaptively, this paper combines the channel attention mechanism with the designed enhanced spatial attention mechanism in the residual block to form a multi-attention mechanism. <xref ref-type="fig" rid="fig-4">Fig. 4</xref> shows the structure of the residual block (RCESA) with the multi-attention mechanism.</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>The structure of RCESA</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-4.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="fig-4">Fig. 4</xref>, incorporating residual blocks into the network supports image feature extraction and helps tackle the issue of network degradation. The intermediate feature vector <italic>M</italic><sub><italic>in</italic></sub> extracted by the neural network is taken as input. The weight of important areas is first increased by the enhanced spatial attention mechanism, whose output then serves as the input to the channel attention mechanism, improving the accuracy of feature extraction. Specifically, within each residual block, convolutional layers of different sizes are first used to extract multi-scale features, yielding the multi-scale feature vector <italic>M</italic><sub><italic>F</italic></sub>. The extracted features then undergo a max-pooling operation, which further downsamples the feature map to reduce the computational load and strengthen the model&#x2019;s ability to abstract the important features of CXR images. Next, features are further extracted through a 3 &#x00D7; 3 convolutional layer, and bilinear interpolation is used for upsampling to match the feature map size with that of the input. The upsampled result is then combined with the preceding feature map to generate the feature fusion vector <italic>M</italic><sub><italic>A</italic></sub>. Feature fusion is performed through a 1 &#x00D7; 1 convolution, allowing the network to use features of varying scales more effectively. 
Finally, the fused feature vector is normalized to the [0, 1] range with the sigmoid activation function to obtain the weight matrix <italic>m</italic>, which is multiplied by the input feature <italic>M</italic><sub><italic>in</italic></sub> to produce the weighted feature map <italic>M</italic><sub><italic>out</italic></sub>, the output of the enhanced spatial attention operation. In this way, the network model uses the crucial regional information in input CXR images more effectively. This series of operations is represented by <xref ref-type="disp-formula" rid="eqn-5">Eq. (5)</xref>, in which <italic>x</italic> denotes the input feature <italic>M</italic><sub><italic>in</italic></sub>:
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>u</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mi>S</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mi>m</mml:mi><mml:mi>o</mml:mi><mml:mi>i</mml:mi><mml:mi>d</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>v</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>v</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>I</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>M</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>v</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>v</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>v</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>v</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:math></disp-formula></p>
<p>Subsequently, the feature vector <italic>M</italic><sub><italic>out</italic></sub> processed by the enhanced spatial attention mechanism is used as the input of the channel attention mechanism. Initially, the convolutional feature vector <italic>M</italic><sub><italic>G</italic></sub> is obtained through two convolutional layers. This is followed by max-pooling and average-pooling to generate two vectorized representations. These vectors are then inputted into a shared multi-layer perceptron (MLP). The representations generated by the MLP are summed to form an attention vector, which is then multiplied with <italic>M</italic><sub><italic>G</italic></sub> to produce the feature map <italic>M</italic><sub><italic>H</italic></sub> with channel attention. This series of operations is represented by <xref ref-type="disp-formula" rid="eqn-6">Eq. (6)</xref>:
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>H</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>&#x03C3;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03C6;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>&#x03C9;</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>G</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2295;</mml:mo><mml:mi>&#x03C6;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>&#x03C9;</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>G</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2297;</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>G</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:math></disp-formula>where <italic>&#x03C3;</italic> is the sigmoid function, <italic>&#x03C6;</italic> represents the trainable transformation of the shared MLP, <italic>&#x03C9;</italic><sub>1</sub> and <italic>&#x03C9;</italic><sub>2</sub> represent the max-pooling and average-pooling operations, respectively, and &#x2295; and &#x2297; denote element-wise addition and multiplication. After this step, the shortcut connection defined by <xref ref-type="disp-formula" rid="eqn-7">Eq. (7)</xref> is applied to obtain the output <italic>M</italic><sub><italic>e</italic></sub> of the RCESA.
<disp-formula id="eqn-7"><label>(7)</label><mml:math id="mml-eqn-7" display="block"><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>u</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2295;</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>H</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:math></disp-formula></p>
<p>Each RCESA&#x2019;s MLP comprises two convolutional layers with a kernel size of 1 &#x00D7; 1, referred to as Layer-1 and Layer-2. The detailed settings of the MLP are shown in <xref ref-type="table" rid="table-1">Table 1</xref>.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>The configuration of MLP</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Layers</th>
<th>RCESA-1</th>
<th>RCESA-2</th>
<th>RCESA-3</th>
<th>RCESA-4</th>
</tr>
</thead>
<tbody>
<tr>
<td>Layer-1 (Input/Output)</td>
<td>64/4</td>
<td>128/8</td>
<td>256/16</td>
<td>512/32</td>
</tr>
<tr>
<td>Layer-2 (Input/Output)</td>
<td>4/64</td>
<td>8/128</td>
<td>16/256</td>
<td>32/512</td>
</tr>
</tbody>
</table>
</table-wrap>
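<p>The channel attention of <xref ref-type="disp-formula" rid="eqn-6">Eq. (6)</xref> and the shortcut of <xref ref-type="disp-formula" rid="eqn-7">Eq. (7)</xref> can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors&#x2019; implementation: the function names, the nested-list tensor layout, the ReLU between Layer-1 and Layer-2, and pooling over the whole spatial extent are all assumptions, and <italic>M</italic><sub><italic>G</italic></sub> is taken as given rather than computed by the two convolutional layers.</p>
<code language="python"><![CDATA[
```python
import math


def channel_attention(m_g, w1, w2):
    """Eq. (6): scale each channel of m_g by sigmoid(MLP(max) + MLP(avg)).

    m_g is a list of C channels, each an H x W list of rows.
    w1 holds the Layer-1 weights (C/r rows of length C) and w2 the Layer-2
    weights (C rows of length C/r); on a pooled C-vector a 1 x 1 convolution
    reduces to a matrix-vector product.
    """
    def mlp(vec):
        # ReLU between the two layers is an assumption, not stated in the text.
        hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    flat = [[v for row in channel for v in row] for channel in m_g]
    max_vec = [max(values) for values in flat]                # omega_1
    avg_vec = [sum(values) / len(values) for values in flat]  # omega_2
    logits = [a + b for a, b in zip(mlp(max_vec), mlp(avg_vec))]
    gates = [1.0 / (1.0 + math.exp(-z)) for z in logits]      # sigma
    return [[[g * v for v in row] for row in channel]
            for g, channel in zip(gates, m_g)]


def shortcut(m_out, m_h):
    """Eq. (7): element-wise sum of m_out and m_h (same shape assumed)."""
    return [[[a + b for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(ch_a, ch_b)]
            for ch_a, ch_b in zip(m_out, m_h)]
```
]]></code>
<p>With the Table 1 settings for RCESA-1, <monospace>w1</monospace> would have 4 rows of 64 weights and <monospace>w2</monospace> 64 rows of 4 weights, so the gate vector has one entry per input channel.</p>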
</sec>
<sec id="s3_3_2">
<label>3.3.2</label>
<title>Triplet Loss</title>
<p>To capture the complex relationships among the input samples, triplet constraints serve as the loss function that guides the training of the deep network, improving the discriminative ability of the learned deep feature hash codes. The training CXR images are denoted as {<italic>X</italic><sub>1</sub>, &#x2026;, <italic>X</italic><sub>i</sub>, &#x2026;, <italic>X</italic><sub>M</sub>}, and the associated class labels as {<italic>L</italic><sub>1</sub>, &#x2026;, <italic>L</italic><sub>i</sub>, &#x2026;, <italic>L</italic><sub>M</sub>}, where <italic>L</italic><sub>i</sub>&#x2208;{1, &#x2026;, <italic>c</italic>} and <italic>c</italic> is the number of categories. <xref ref-type="fig" rid="fig-5">Fig. 5</xref> shows an example of triplet training.</p>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>An illustration of triplet learning</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-5.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="fig-5">Fig. 5</xref>, the triplet loss is designed to increase similarity among samples from the same class while decreasing similarity between samples from different classes [<xref ref-type="bibr" rid="ref-38">38</xref>]. Formally, consider a triplet unit {<italic>q</italic>, <italic>p</italic>, <italic>n</italic>}, where <italic>q</italic> is the query sample, <italic>p</italic> is the positive sample, and <italic>n</italic> is the negative sample. The goal of the triplet loss is to satisfy the following constraint:
<disp-formula id="eqn-8"><label>(8)</label><mml:math id="mml-eqn-8" display="block"><mml:mi>d</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>q</mml:mi><mml:mo>,</mml:mo><mml:mi>p</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>m</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi>d</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>q</mml:mi><mml:mo>,</mml:mo><mml:mspace width="thinmathspace" /><mml:mi>n</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mspace width="negativethinmathspace" /><mml:mo>,</mml:mo></mml:math></disp-formula>where <italic>d</italic>(<italic>q</italic>, <italic>p</italic>) is the distance between the query and the positive sample, <italic>d</italic>(<italic>q</italic>, <italic>n</italic>) is the distance between the query and the negative sample, and <italic>m</italic> is a preset positive margin.</p>

<p>In this study, for a given query image <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:msubsup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, a triplet <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:mrow><mml:mo>{</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula> is generated through random selection. To obtain a more discriminative representation for effective CXR image retrieval, this paper applies triplet constraints to both the deep features and the hash codes simultaneously, as given in <xref ref-type="disp-formula" rid="eqn-9">Eq. (9)</xref>:
<disp-formula id="eqn-9"><label>(9)</label><mml:math id="mml-eqn-9" display="block"><mml:msub><mml:mrow><mml:mrow><mml:mi>&#x02112;</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mo movablelimits="false">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo>[</mml:mo><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>m</mml:mi><mml:mo>+</mml:mo><mml:mi>d</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>d</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>m</mml:mi><mml:mo>+</mml:mo><mml:mi>d</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mi>f</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>f</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>d</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msubsup><mml:mi>f</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>f</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:math></disp-formula>where <italic>T</italic> is the total number of triplet units across the training images, and <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:msubsup><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:msubsup><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:msubsup><mml:mi>h</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> denote the hash codes of the query image <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:msubsup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, positive sample image <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:msubsup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, and negative sample image <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:msubsup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> in the <italic>i</italic>-th triplet unit, respectively.
Similarly, <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:msubsup><mml:mi>f</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:msubsup><mml:mi>f</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula id="ieqn-11"><mml:math id="mml-ieqn-11"><mml:msubsup><mml:mi>f</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> represent the deep features of the query image <inline-formula id="ieqn-12"><mml:math id="mml-ieqn-12"><mml:msubsup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, positive sample image <inline-formula id="ieqn-13"><mml:math id="mml-ieqn-13"><mml:msubsup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, and negative sample image <inline-formula id="ieqn-14"><mml:math id="mml-ieqn-14"><mml:msubsup><mml:mi>X</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> in the <italic>i</italic>-th triplet unit, respectively; <italic>m</italic> denotes the margin value, and <inline-formula id="ieqn-15"><mml:math id="mml-ieqn-15"><mml:mi>d</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mo>&#x22C5;</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the Euclidean distance.</p>
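<p>The joint constraint on hash codes and deep features can be illustrated with a short plain-Python sketch. It uses the standard hinge form max(0, <italic>m</italic> + <italic>d</italic>(<italic>q</italic>, <italic>p</italic>) &#x2212; <italic>d</italic>(<italic>q</italic>, <italic>n</italic>)), which is zero once the negative sample is at least <italic>m</italic> farther from the query than the positive sample. The function names and the list-of-tuples input layout are assumptions made for illustration; the default margin of 0.2 follows the experimental settings reported in Section 4.1.</p>
<code language="python"><![CDATA[
```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hinge(query, positive, negative, margin):
    """Zero once the negative is at least `margin` farther than the positive."""
    return max(0.0, margin + euclidean(query, positive) - euclidean(query, negative))


def triplet_loss(hash_triplets, feature_triplets, margin=0.2):
    """Sum the hinge terms on hash codes and on deep features over all triplets.

    Each element of hash_triplets / feature_triplets is a (query, positive,
    negative) tuple of vectors; the two lists are aligned by triplet index.
    """
    total = 0.0
    for (hq, hp, hn), (fq, fp, fn) in zip(hash_triplets, feature_triplets):
        total += hinge(hq, hp, hn, margin) + hinge(fq, fp, fn, margin)
    return total
```
]]></code>
<p>In this sketch a triplet whose hash codes already satisfy the margin contributes nothing from the hash term, while its deep features can still contribute if they violate the margin, which is the point of constraining both representations.</p>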
<p>The specific learning algorithm for MATDH is shown in Algorithm 1.</p>
<fig id="fig-13">
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-13.tif"/>
</fig>
</sec>
</sec>
<sec id="s3_4">
<label>3.4</label>
<title>Lightweight Image Encryption Algorithm</title>
<p>The encrypted image database is built by encrypting the original images with the lightweight CXR image encryption method shown in <xref ref-type="fig" rid="fig-6">Fig. 6</xref>.</p>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>Lightweight CXR image encryption processing</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-6.tif"/>
</fig>
<p>To overcome the shortcomings of simple encryption algorithms, the proposed method scrambles the original image before encrypting its Discrete Cosine Transform (DCT) coefficients in the frequency domain. First, in the spatial domain, the two-dimensional Arnold transform scrambles the pixels of the original image. The scrambled image is then divided into blocks, and the DCT is applied to each block to move from the spatial representation to the frequency representation. Next, the two-dimensional Logistic chaotic map is used to encrypt the DCT coefficients of each block. Finally, after the inverse Discrete Cosine Transform (IDCT), a secondary encryption with the PRESENT algorithm and a 128-bit key produces the final encrypted image. The specific steps of the lightweight CXR image encryption are as follows:</p>
<p><bold>Step 1:</bold> Apply a two-dimensional Arnold transformation to scramble the pixels of the original image. The two-dimensional Arnold transformation is defined in <xref ref-type="disp-formula" rid="eqn-10">Eq. (10)</xref>:
<disp-formula id="eqn-10"><label>(10)</label><mml:math id="mml-eqn-10" display="block"><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnalign="left" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnalign="center center" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:mn>1</mml:mn></mml:mtd><mml:mtd><mml:mi>b</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>c</mml:mi></mml:mtd><mml:mtd><mml:mi>b</mml:mi><mml:mi>c</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnalign="left" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mtext>mod</mml:mtext></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mspace width="negativethinmathspace" /><mml:mo>,</mml:mo></mml:math></disp-formula>where <italic>b</italic> and <italic>c</italic> are arbitrary positive integers and <italic>N</italic> is the side length of the square image. The determinant of the transformation matrix equals 1, so the map is area-preserving and only permutes pixel positions within the CXR image. The two-dimensional Arnold transformation is iterated <italic>m</italic> times, producing a further scrambled image in each iteration. The values of parameters <italic>b</italic>, <italic>c</italic>, and <italic>m</italic> serve as the encryption key.</p>
<p><bold>Step 2:</bold> Divide the scrambled image into 8 &#x00D7; 8 pixel blocks.</p>
<p><bold>Step 3:</bold> Apply the DCT to each image block to transform it from the spatial domain to the frequency domain. The two-dimensional DCT of an <italic>M</italic> &#x00D7; <italic>N</italic> matrix is given by <xref ref-type="disp-formula" rid="eqn-11">Eq. (11)</xref>:
<disp-formula id="eqn-11"><label>(11)</label><mml:math id="mml-eqn-11" display="block"><mml:mi>C</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x03B1;</mml:mi><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>&#x03B1;</mml:mi><mml:mrow><mml:mi>v</mml:mi></mml:mrow></mml:msub><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mi>M</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:munderover><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:munderover><mml:mi>I</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mi>cos</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x03C0;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>x</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>M</mml:mi></mml:mrow></mml:mfrac><mml:mi>cos</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x03C0;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>y</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>N</mml:mi></mml:mrow></mml:mfrac><mml:mo>.</mml:mo></mml:math></disp-formula></p>
<p><bold>Step 4:</bold> Encrypt the DCT coefficients of each pixel block using the two-dimensional Logistic chaotic map, which is defined in <xref ref-type="disp-formula" rid="eqn-12">Eq. (12)</xref>:
<disp-formula id="eqn-12"><label>(12)</label><mml:math id="mml-eqn-12" display="block"><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign="left" rowspacing="4pt" columnspacing="1em"><mml:mtr><mml:mtd><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>&#x03B4;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable><mml:mo fence="true" stretchy="true" symmetric="true"></mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:math></disp-formula>where <italic>x</italic><sub><italic>n</italic></sub> and 
<italic>y</italic><sub><italic>n</italic></sub> are the two state components at iteration <italic>n</italic>, <italic>r</italic><sub>1</sub> and <italic>r</italic><sub>2</sub> are control parameters, and <italic>&#x03B4;</italic> is the coupling parameter.</p>
<p><bold>Step 5:</bold> Perform the IDCT on each pixel block to obtain the preliminarily encrypted image. The two-dimensional IDCT of an <italic>M</italic> &#x00D7; <italic>N</italic> matrix is given by <xref ref-type="disp-formula" rid="eqn-13">Eq. (13)</xref>:
<disp-formula id="eqn-13"><label>(13)</label><mml:math id="mml-eqn-13" display="block"><mml:mi>I</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>u</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mi>M</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:munderover><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>v</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:munderover><mml:msub><mml:mi>&#x03B1;</mml:mi><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>&#x03B1;</mml:mi><mml:mrow><mml:mi>v</mml:mi></mml:mrow></mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mi>cos</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x03C0;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>x</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>M</mml:mi></mml:mrow></mml:mfrac><mml:mi>cos</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x03C0;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn>2</mml:mn><mml:mi>y</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>N</mml:mi></mml:mrow></mml:mfrac><mml:mo>.</mml:mo></mml:math></disp-formula></p>
<p><bold>Step 6:</bold> Utilize the PRESENT algorithm to perform secondary encryption on the preliminarily encrypted image, obtaining the final encrypted image.</p>
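<p>The primitives behind Steps 1, 3, 4, and 5 can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors&#x2019; implementation: all function names are hypothetical, the image is assumed to be square, and the orthonormal scaling used for <italic>&#x03B1;</italic> is the common DCT-II convention rather than something stated in the text. How the chaotic sequence is combined with the DCT coefficients is not specified here, so the sketch only generates the sequence; the PRESENT stage of Step 6 is not covered.</p>
<code language="python"><![CDATA[
```python
import math


def arnold_scramble(image, b, c, rounds):
    """Eq. (10): move pixel (x, y) to (x + b*y, c*x + (b*c + 1)*y) mod N."""
    n = len(image)  # square N x N image assumed
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(x + b * y) % n][(c * x + (b * c + 1) * y) % n] = image[x][y]
        image = out
    return image


def arnold_unscramble(image, b, c, rounds):
    """Undo arnold_scramble by reading each pixel back from its target."""
    n = len(image)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[x][y] = image[(x + b * y) % n][(c * x + (b * c + 1) * y) % n]
        image = out
    return image


def _alpha(k, size):
    # Orthonormal DCT-II scaling (an assumption; the text leaves alpha undefined).
    return math.sqrt((1.0 if k == 0 else 2.0) / size)


def _basis(x, u, size):
    return math.cos(math.pi * (2 * x + 1) * u / (2 * size))


def dct2(block):
    """Eq. (11): two-dimensional DCT of an M x N block."""
    m, n = len(block), len(block[0])
    return [[_alpha(u, m) * _alpha(v, n)
             * sum(block[x][y] * _basis(x, u, m) * _basis(y, v, n)
                   for x in range(m) for y in range(n))
             for v in range(n)] for u in range(m)]


def idct2(coeffs):
    """Eq. (13): inverse transform, summing over the frequency indices u, v."""
    m, n = len(coeffs), len(coeffs[0])
    return [[sum(_alpha(u, m) * _alpha(v, n) * coeffs[u][v]
                 * _basis(x, u, m) * _basis(y, v, n)
                 for u in range(m) for v in range(n))
             for y in range(n)] for x in range(m)]


def logistic_2d(x, y, r1, r2, delta, steps):
    """Eq. (12): iterate the coupled map and return the visited (x, y) states."""
    states = []
    for _ in range(steps):
        x, y = (r1 * x * (1 - x) + delta * (1 - r1) * y,
                r2 * y * (1 - y) + delta * (1 - r2) * x)
        states.append((x, y))
    return states
```
]]></code>
<p>Because the Arnold map is a permutation with a finite period that depends on <italic>N</italic>, an iteration count that is a multiple of that period would return the original image; with <italic>b</italic> = <italic>c</italic> = 1 and <italic>N</italic> = 4, for example, three iterations restore the input.</p>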
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Simulation Results and Analysis</title>
<sec id="s4_1">
<label>4.1</label>
<title>Experimental Settings</title>
<p>The experiments were run on an Intel(R) Core(TM) i7-13700H CPU @ 2.20 GHz, an NVIDIA GeForce RTX 4060 Laptop GPU, and 12 GB of memory. The software environment comprised Windows 11, JetBrains PyCharm Community Edition 2023.2 (x64), and Anaconda, with PyTorch as the deep learning framework. The objective function is optimized with Adam, using a learning rate of 0.001, a batch size of 32, 30 training iterations, and a triplet loss margin of 0.2.</p>
<p><bold>Dataset:</bold> The proposed method is evaluated on the public COVIDx dataset [<xref ref-type="bibr" rid="ref-39">39</xref>], which includes CXR images related to COVID-19. To enhance the model&#x2019;s generalization ability during training, the 29,986 CXR images are divided into multiple subsets, which are used in turn as the test set to reduce the risk of overfitting. In addition, the independent test set comprises 400 medical images: 200 COVID-19-negative images and 200 COVID-19-positive images.</p>
<p><bold>Data preprocessing:</bold> CXR images are resized to 256 &#x00D7; 256 for uniformity, randomly cropped to 224 &#x00D7; 224, and horizontally flipped to enhance the model&#x2019;s ability to handle spatial variations and orientation of chest structures.</p>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Network Model and Performance Analysis</title>
<p>The parameters of the network model were tuned experimentally while monitoring accuracy and loss. After multiple experiments, the best-performing deep network model based on the UNet architecture was selected. <xref ref-type="fig" rid="fig-7">Fig. 7</xref> shows the training and testing curves for the network model.</p>
<fig id="fig-7">
<label>Figure 7</label>
<caption>
<title>The training and testing curves for the network model</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-7.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="fig-7">Fig. 7</xref>, the training and testing accuracy curves rise during the early iterations and then stabilize, with no significant jumps or drastic fluctuations. The two curves stay close to each other, which suggests that the network model generalizes well and does not overfit. After 30 training iterations, the training accuracy is 94.56% and the testing accuracy is 94.42%, which supports the use of this network model for retrieval.</p>
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Retrieval Performance Analysis</title>
<sec id="s4_3_1">
<label>4.3.1</label>
<title>Retrieval Accuracy Analysis</title>
<p>Mean Average Precision (mAP), a metric also widely used to evaluate object detection models, averages the average precision (AP) across categories; a higher mAP indicates that more similar images appear at the top of the retrieved list. In addition, P@H &#x2264; 2, the precision within a Hamming radius of 2, is a key measure in hashing-based image retrieval because it reflects how well compact hash codes capture similarity within the feature space. This paper computes it as the precision, averaged over queries, among the database images whose Hamming distance to the query is no more than 2. To mitigate bias from model initialization, five independent experiments were conducted.</p>
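<p>The two metrics can be made concrete with a small plain-Python sketch. It is a hedged illustration rather than the evaluation code used in the experiments: the function names, the representation of hash codes as bit strings, the choice to rank the whole database by Hamming distance, and the convention of returning 0 when no image falls inside the radius are all assumptions.</p>
<code language="python"><![CDATA[
```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def precision_within_radius(query_code, query_label, database, radius=2):
    """P@H <= radius for one query; database is a list of (code, label) pairs.

    Returns 0.0 when no database code falls inside the radius (a convention
    assumed here).
    """
    near = [label for code, label in database
            if hamming(query_code, code) <= radius]
    if not near:
        return 0.0
    return sum(label == query_label for label in near) / len(near)


def average_precision(query_code, query_label, database):
    """AP for one query after ranking the whole database by Hamming distance."""
    ranked = sorted(database, key=lambda item: hamming(query_code, item[0]))
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == query_label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_over_queries(metric, queries, database):
    """Average a per-query metric over a list of (code, label) queries."""
    return sum(metric(code, label, database) for code, label in queries) / len(queries)
```
]]></code>
<p>Calling <monospace>mean_over_queries(average_precision, queries, database)</monospace> gives an mAP value in this sketch, and passing <monospace>precision_within_radius</monospace> instead gives the averaged P@H &#x2264; 2.</p>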
<p>To validate the CXR image retrieval performance of the proposed method, a comparison was conducted between our MATDH method and six other deep hashing methods, namely TCDH [<xref ref-type="bibr" rid="ref-8">8</xref>], SWTH [<xref ref-type="bibr" rid="ref-12">12</xref>], ATH [<xref ref-type="bibr" rid="ref-22">22</xref>], DPN [<xref ref-type="bibr" rid="ref-40">40</xref>], ASH [<xref ref-type="bibr" rid="ref-41">41</xref>], and DBDH [<xref ref-type="bibr" rid="ref-42">42</xref>]. For fairness, the comparison was conducted using the same data and parameter settings as MATDH. <xref ref-type="table" rid="table-2">Table 2</xref> shows mAP values for different methods across various hash code lengths.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>The comparison of mAP values for different methods</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Methods</th>
<th>Bits &#x003D; 16</th>
<th>Bits &#x003D; 24</th>
<th>Bits &#x003D; 32</th>
<th>Bits &#x003D; 48</th>
<th>Bits &#x003D; 64</th>
</tr>
</thead>
<tbody>
<tr>
<td>TCDH</td>
<td>0.8152</td>
<td>0.8257</td>
<td>0.8359</td>
<td>0.8691</td>
<td>0.8271</td>
</tr>
<tr>
<td>SWTH</td>
<td>0.7847</td>
<td>0.7919</td>
<td>0.8022</td>
<td>0.8219</td>
<td>0.8386</td>
</tr>
<tr>
<td>DPN</td>
<td>0.8066</td>
<td>0.8297</td>
<td>0.8418</td>
<td>0.8511</td>
<td>0.8643</td>
</tr>
<tr>
<td>ASH</td>
<td>0.7691</td>
<td>0.7774</td>
<td>0.7866</td>
<td>0.7978</td>
<td>0.8246</td>
</tr>
<tr>
<td>ATH</td>
<td>0.8073</td>
<td>0.8196</td>
<td>0.8485</td>
<td>0.8681</td>
<td>0.8704</td>
</tr>
<tr>
<td>DBDH</td>
<td>0.6392</td>
<td>0.6524</td>
<td>0.6549</td>
<td>0.6697</td>
<td>0.6866</td>
</tr>
<tr>
<td>MATDH(Ours)</td>
<td><bold>0.8331</bold></td>
<td><bold>0.8427</bold></td>
<td><bold>0.8648</bold></td>
<td><bold>0.9117</bold></td>
<td><bold>0.8789</bold></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As shown in <xref ref-type="table" rid="table-2">Table 2</xref>, MATDH outperforms the compared deep hashing methods at every hash code length. For example, with a 48-bit hash code, MATDH (0.9117) improves the mAP by approximately 0.043 (about 4.3 percentage points) over the best-performing baseline, TCDH (0.8691). This is because the features extracted from CXR images after image enhancement under the dual attention mechanism are more accurate and, guided by the triplet loss function, the network can extract more discriminative features. Furthermore, a <italic>t</italic>-test comparing the mAP of MATDH with that of the six other methods found no significant difference between MATDH and ATH, but significant differences with the other five methods. This indicates that the improvement achieved by MATDH is statistically significant relative to most of the compared methods, providing evidence for selecting and optimizing medical image retrieval methods. <xref ref-type="fig" rid="fig-8">Fig. 8</xref> shows the curve of P@H &#x2264; 2 for MATDH compared to six other methods.</p>
<fig id="fig-8">
<label>Figure 8</label>
<caption>
<title>Line graph comparing P@H &#x2264; 2 across various methods</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-8.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="fig-8">Fig. 8</xref>, the P@H &#x2264; 2 curves further illustrate the method&#x2019;s effectiveness within a Hamming radius of 2. The proposed MATDH method consistently outperforms the other methods across hash code lengths from 16 to 64 bits. At a hash code length of 48 bits, its precision is approximately 20% higher than that of the DPN method, which suggests that MATDH retrieves more images from the same category when the Hamming radius is set to 2. Moreover, the precision attained by MATDH consistently exceeds 0.85 as the number of hash bits increases. This highlights MATDH&#x2019;s effectiveness in CXR image retrieval and its robustness under Hamming ranking.</p>

<p>Additionally, MATDH achieves its highest mAP value (0.9117 in <xref ref-type="table" rid="table-2">Table 2</xref>) with a 48-bit hash code length. This suggests that, for CXR images, 48-bit hash codes best preserve the main features of the images under MATDH. Therefore, the MATDH method in this paper adopts a 48-bit deep hash code.</p>
</sec>
<sec id="s4_3_2">
<label>4.3.2</label>
<title>Recall Rate and Precision Analysis</title>
<p>Beyond mAP, users of medical image retrieval often care most about the accuracy of the highest-ranked results. Therefore, this paper also uses the recall rate of the <italic>top-K</italic> retrieval results (Recall@<italic>K</italic>) and the precision of the <italic>top-N</italic> retrieval results (P@<italic>N</italic>) to evaluate the effectiveness of CXR image retrieval methods [<xref ref-type="bibr" rid="ref-8">8</xref>].</p>
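<p>As a minimal sketch of these two measures, assume each query yields a ranked list of 0/1 relevance flags (1 when the returned image shares the query&#x2019;s class). The function names and this input format are assumptions made for illustration.</p>
<code language="python"><![CDATA[
```python
def precision_at_n(ranked_relevance, n):
    """P@N: fraction of the top-n returned images that are relevant."""
    return sum(ranked_relevance[:n]) / n


def recall_at_k(ranked_relevance, k, total_relevant):
    """Recall@K: fraction of all relevant database images found in the top k."""
    if total_relevant == 0:
        return 0.0
    return sum(ranked_relevance[:k]) / total_relevant
```
]]></code>
<p>Precision is normalized by the number of returned images, whereas recall is normalized by the number of relevant images in the database, so recall can only grow as <italic>K</italic> increases while precision may fall.</p>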
<p><xref ref-type="fig" rid="fig-9">Fig. 9</xref> shows the recall rate curves of the <italic>top-K</italic> images for the various methods at a consistent 48-bit hash code length, and <xref ref-type="table" rid="table-3">Table 3</xref> compares the P@1, P@5, and P@10 values obtained by the various methods with 48-bit deep hash codes on the COVIDx dataset. The XMIR method [<xref ref-type="bibr" rid="ref-43">43</xref>] is not a deep hashing method, so its results do not depend on hash code length.</p>
<fig id="fig-9">
<label>Figure 9</label>
<caption>
<title>Recall curve for different methods</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-9.tif"/>
</fig><table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>The comparison of P@<italic>N</italic> results for different methods</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Method</th>
<th>mAP</th>
<th>P@1</th>
<th>P@5</th>
<th>P@10</th>
</tr>
</thead>
<tbody>
<tr>
<td>TCDH</td>
<td>0.8691</td>
<td>0.9142</td>
<td>0.8857</td>
<td>0.8831</td>
</tr>
<tr>
<td>SWTH</td>
<td>0.8219</td>
<td>0.8766</td>
<td>0.8544</td>
<td>0.8395</td>
</tr>
<tr>
<td>DPN</td>
<td>0.8511</td>
<td>0.8917</td>
<td>0.8845</td>
<td>0.8798</td>
</tr>
<tr>
<td>ASH</td>
<td>0.7978</td>
<td>0.8524</td>
<td>0.8408</td>
<td>0.8233</td>
</tr>
<tr>
<td>ATH</td>
<td>0.8681</td>
<td>0.8874</td>
<td>0.8780</td>
<td>0.8677</td>
</tr>
<tr>
<td>DBDH</td>
<td>0.6697</td>
<td>0.7586</td>
<td>0.7468</td>
<td>0.7153</td>
</tr>
<tr>
<td>XMIR</td>
<td>0.8616</td>
<td>0.9015</td>
<td>0.9001</td>
<td>0.8815</td>
</tr>
<tr>
<td>MATDH (Ours)</td>
<td><bold>0.9117</bold></td>
<td><bold>0.9333</bold></td>
<td><bold>0.9154</bold></td>
<td><bold>0.9081</bold></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As shown in <xref ref-type="fig" rid="fig-9">Fig. 9</xref>, the recall rate of MATDH remains consistently higher than that of other methods as the number of returned CXR images increases from 0 to 200. For example, when 180 images are returned, the recall rate is approximately 30% higher than that of the DBDH method. This indicates that MATDH retrieves a higher number of relevant images, consistent with the comparative analysis results.</p>

<p>As can be seen from <xref ref-type="table" rid="table-3">Table 3</xref>, MATDH outperforms the other methods at every cutoff on the number of returned images (P@1, P@5, and P@10). This is because, under the effect of the multi-attention mechanism and with the iterative training of the deep neural network, the constructed hash codes represent the images more accurately, thereby improving the accuracy of CXR image retrieval.</p>

<p>Although this study mainly uses the COVIDx dataset for experiments, the proposed MATDH method has the potential for clinical application. It can serve as an auxiliary diagnostic tool, helping doctors make faster and more accurate judgments on COVID-19 infections. The model can be integrated into hospital imaging systems for real-time radiological image analysis, improving early detection accuracy. Additionally, the proposed MATDH method can assist doctors in quickly and accurately retrieving medical imaging data, reducing the workload of data queries during treatment, especially when medical resources are limited.</p>
</sec>
</sec>
<sec id="s4_4">
<label>4.4</label>
<title>Visual Analysis</title>
<p>This section uses the 48-bit deep hash codes to display the 10 most similar CXR images obtained through various methods. <xref ref-type="fig" rid="fig-10">Fig. 10</xref> shows the visualization results. The images within the red boxes represent those returned in the wrong categories compared to the query image.</p>
<fig id="fig-10">
<label>Figure 10</label>
<caption>
<title>Visualization of the top 10 CXR images retrieved by different methods</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-10.tif"/>
</fig>
<p>As can be seen from <xref ref-type="fig" rid="fig-10">Fig. 10</xref>, compared with the other methods, most of the images returned by MATDH are correct. This indicates that the retrieval results of MATDH are highly accurate and suggests that the multi-attention mechanism and optimized loss function used in MATDH give it an advantage over other advanced methods. Additionally, the visualization results of the different methods are consistent with the P@10 results.</p>

</sec>
<sec id="s4_5">
<label>4.5</label>
<title>Ablation Study</title>
<sec id="s4_5_1">
<label>4.5.1</label>
<title>Impact of Network Architecture and CXR Image Enhancement Methods</title>
<p>To objectively evaluate the network model and framework of the proposed method, ablation experiments are conducted by removing or substituting modules in the network architecture. For the network structure, the ResNet18 backbone of the UNet network used by the proposed method is replaced with AlexNet [<xref ref-type="bibr" rid="ref-44">44</xref>] and ResNet34 [<xref ref-type="bibr" rid="ref-45">45</xref>]; these two variants are denoted as MATDH-AlexNet and MATDH-ResNet34, respectively. Furthermore, a variant that does not enhance the original images before deep feature extraction is denoted as MATDH-NoEnhanced. All experiments use 48-bit deep hash codes. The detailed results are given in <xref ref-type="table" rid="table-4">Table 4</xref>, where the best results are shown in bold.</p>
<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>The impact of network architecture and CXR image enhancement methods</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Method</th>
<th>mAP</th>
<th>P@1</th>
<th>P@5</th>
<th>P@10</th>
</tr>
</thead>
<tbody>
<tr>
<td>MATDH-NoEnhanced</td>
<td>0.8344</td>
<td>0.9150</td>
<td>0.8979</td>
<td>0.8868</td>
</tr>
<tr>
<td>MATDH-AlexNet</td>
<td>0.7806</td>
<td>0.8109</td>
<td>0.7945</td>
<td>0.7748</td>
</tr>
<tr>
<td>MATDH-ResNet34</td>
<td>0.7909</td>
<td>0.8255</td>
<td>0.8215</td>
<td>0.7806</td>
</tr>
<tr>
<td>MATDH (Ours)</td>
<td><bold>0.9117</bold></td>
<td><bold>0.9333</bold></td>
<td><bold>0.9154</bold></td>
<td><bold>0.9081</bold></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As can be seen from <xref ref-type="table" rid="table-4">Table 4</xref>, image enhancement improves retrieval performance (mAP rises from 0.8344 for MATDH-NoEnhanced to 0.9117 for MATDH), and the ResNet18-based MATDH outperforms the AlexNet and ResNet34 variants on all four metrics.</p>

</sec>
<sec id="s4_5_2">
<label>4.5.2</label>
<title>The Impact of Multi-Attention Mechanism</title>
<p>To examine how the dual attention mechanism within the RCESAs module influences CXR image retrieval, this section modifies the attention mechanisms used in the RCESAs module of the MATDH framework. One variant uses only the channel attention mechanism and is referred to as MATDH-CA, while the other uses only the enhanced spatial attention mechanism and is referred to as MATDH-ESA. <xref ref-type="fig" rid="fig-11">Fig. 11</xref> compares MATDH with these variants for CXR image retrieval on the COVIDx dataset, using a 48-bit hash code length, with the top outcomes emphasized in bold.</p>
<fig id="fig-11">
<label>Figure 11</label>
<caption>
<title>The comparison of MATDH and its variants</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-11.tif"/>
</fig>
<p>According to the results in <xref ref-type="fig" rid="fig-11">Fig. 11</xref>, MATDH with a dual-attention mechanism outperforms the standalone CA and ESA mechanisms in retrieval performance. The specific contributions to this performance improvement can be summarized as follows:</p>

<p>1) Feature enhancement: The dual-attention mechanism effectively captures and enhances the correlation between useful feature channels, enabling the model to better identify features related to lesions in CXR images, thereby improving retrieval accuracy.</p>
<p>2) Redundant information suppression: This mechanism effectively reduces the influence of unnecessary data and noisy channels in CXR images, allowing the network to focus more on key information, thus enhancing the clarity of feature representation.</p>
<p>3) Dynamic attention adjustment: The dual-attention mechanism dynamically adjusts the focus of the deep neural network on different regions of CXR images, enabling the model to better capture important spatial features. This flexibility allows MATDH to maintain high retrieval performance across a diverse set of input images.</p>
<p>4) Combined effect: By integrating the above factors, the dual-attention mechanism significantly enhances the overall performance of MATDH, demonstrating its effectiveness and superiority in medical image retrieval tasks.</p>
</sec>
</sec>
<sec id="s4_6">
<label>4.6</label>
<title>Encryption Performance Analysis</title>
<p>The cryptographic performance analysis in this section is demonstrated using three CXR images selected from the COVIDx dataset, as shown in <xref ref-type="fig" rid="fig-12">Fig. 12</xref>, referred to as CXR-1, CXR-2, and CXR-3.</p>
<fig id="fig-12">
<label>Figure 12</label>
<caption>
<title>The three CXR images used for encryption analysis: (a) CXR-1; (b) CXR-2; (c) CXR-3</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_57269-fig-12.tif"/>
</fig>
<sec id="s4_6_1">
<label>4.6.1</label>
<title>Information Entropy Analysis</title>
<p>In cryptography, information entropy indicates the degree of unpredictability in image data. In an ideally random image, all grayscale values occur with equal probability. The formula for calculating information entropy is shown in <xref ref-type="disp-formula" rid="eqn-14">Eq. (14)</xref>:</p>
<p><disp-formula id="eqn-14"><label>(14)</label><mml:math id="mml-eqn-14" display="block"><mml:mi>H</mml:mi><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:munderover><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>log</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>&#x2061;</mml:mo><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:math></disp-formula>where <italic>L</italic> represents the total number of grayscale levels in the image and <italic>p</italic>(<italic>i</italic>) denotes the probability of occurrence of grayscale value <italic>i</italic>. For an image with <italic>L</italic> &#x003D; 256, the entropy reaches its maximum value of 8 when every grayscale value occurs with equal probability. A higher entropy indicates greater randomness in the image and therefore better encryption performance. <xref ref-type="table" rid="table-5">Table 5</xref> compares the entropy of the original and encrypted images, both with dimensions of 512 &#x00D7; 512.</p>
<table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>Information entropy comparison</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Image</th>
<th>The entropy of the original image</th>
<th>The entropy of the encrypted image</th>
</tr>
</thead>
<tbody>
<tr>
<td>CXR-1</td>
<td>6.5566</td>
<td>7.9992</td>
</tr>
<tr>
<td>CXR-2</td>
<td>7.7805</td>
<td>7.9994</td>
</tr>
<tr>
<td>CXR-3</td>
<td>7.6164</td>
<td>7.9993</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As can be seen from <xref ref-type="table" rid="table-5">Table 5</xref>, the average entropy of the encrypted images generated by the encryption method presented in this paper is 7.9993, approaching the optimal value of 8. This suggests that the encryption technique presented here achieves strong performance, resulting in encrypted images that show significant randomness.</p>
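<p>The entropy measure of <xref ref-type="disp-formula" rid="eqn-14">Eq. (14)</xref> can be sketched as follows for an image supplied as a flat sequence of grayscale values. This is an illustrative computation, not the authors&#x2019; code; the function name is hypothetical, and grayscale levels that never occur are skipped under the usual convention that 0 log 0 equals 0.</p>

```python
import math
from collections import Counter


def image_entropy(pixels):
    """Shannon entropy, in bits per pixel, of a flat sequence of grayscale values."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("the image must contain at least one pixel")
    # Only levels that occur contribute; absent levels have p(i) = 0.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

<p>An image in which all 256 levels occur equally often attains the maximum of 8 bits, whereas a constant image has an entropy of 0.</p>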

</sec>
<sec id="s4_6_2">
<label>4.6.2</label>
<title>Comparative Performance Analysis with Existing Encryption Schemes</title>
<p>Because the encryption method described in this study is intended as a secure, lightweight algorithm, this section compares it with existing schemes on three indicators that reflect the lightweight nature and security of an encryption algorithm: key space size, encryption speed, and information entropy. <xref ref-type="table" rid="table-6">Table 6</xref> shows the comparison between the encryption method introduced here and several medical image encryption schemes [<xref ref-type="bibr" rid="ref-46">46</xref>&#x2013;<xref ref-type="bibr" rid="ref-48">48</xref>].</p>
<table-wrap id="table-6">
<label>Table 6</label>
<caption>
<title>Comparison with existing encryption schemes</title>
</caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th>Method</th>
<th>Keyspace</th>
<th>Encryption speed (Mbit/s)</th>
<th>Information entropy</th>
</tr>
</thead>
<tbody>
<tr>
<td>Proposed</td>
<td><bold>2</bold><sup><bold>256</bold></sup></td>
<td>7.9482</td>
<td><bold>7.9993</bold></td>
</tr>
<tr>
<td>Castro et al. [<xref ref-type="bibr" rid="ref-46">46</xref>]</td>
<td><bold>2</bold><sup><bold>256</bold></sup></td>
<td>4.8840</td>
<td>N/A</td>
</tr>
<tr>
<td>Abdelfatah et al. [<xref ref-type="bibr" rid="ref-47">47</xref>]</td>
<td>2<sup>478</sup></td>
<td>1.4873</td>
<td>7.9971</td>
</tr>
<tr>
<td>Inam et al. [<xref ref-type="bibr" rid="ref-48">48</xref>]</td>
<td>2<sup>240&#x002A;4</sup></td>
<td><bold>7.9886</bold></td>
<td>7.9992</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>From <xref ref-type="table" rid="table-6">Table 6</xref>, it is evident that the key space of the encryption method in this paper is the same as that in Castro et al. [<xref ref-type="bibr" rid="ref-46">46</xref>] but significantly smaller than those in Abdelfatah et al. [<xref ref-type="bibr" rid="ref-47">47</xref>] and Inam et al. [<xref ref-type="bibr" rid="ref-48">48</xref>]. This suggests that the proposed method is more resource-efficient, making it suitable for lightweight encryption of medical images. Although its key space is smaller, it is still far greater than 2<sup>100</sup>, which is enough to resist exhaustive attacks [<xref ref-type="bibr" rid="ref-48">48</xref>]. The encryption speed is higher than in Castro et al. [<xref ref-type="bibr" rid="ref-46">46</xref>] and Abdelfatah et al. [<xref ref-type="bibr" rid="ref-47">47</xref>] and slightly lower than in Inam et al. [<xref ref-type="bibr" rid="ref-48">48</xref>], owing to the frequency domain operations. These operations slow down the process but offer better attack resistance. Additionally, the proposed method has the highest information entropy among the schemes that report it, indicating strong randomness and resistance to entropy attacks. Taken together, these results make it a secure lightweight encryption algorithm for medical images in comparison with the other schemes [<xref ref-type="bibr" rid="ref-46">46</xref>&#x2013;<xref ref-type="bibr" rid="ref-48">48</xref>].</p>
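<p>To give the encryption speeds in <xref ref-type="table" rid="table-6">Table 6</xref> a concrete scale, the sketch below converts a throughput in Mbit/s into the time needed for one image. It assumes an 8-bit grayscale image of 512 &#x00D7; 512 pixels and 1 Mbit = 10<sup>6</sup> bits; both are assumptions made for illustration, and the speeds reported by the compared schemes may have been measured on different images and hardware.</p>

```python
def seconds_per_image(speed_mbit_per_s, width=512, height=512, bits_per_pixel=8):
    """Time in seconds to process one image at the given throughput.

    Assumes 1 Mbit = 1,000,000 bits (an assumption, not stated in the paper).
    """
    if speed_mbit_per_s == 0:
        raise ValueError("speed must be non-zero")
    total_bits = width * height * bits_per_pixel
    return total_bits / (speed_mbit_per_s * 1_000_000)
```

<p>Under these assumptions, a speed of 7.9482 Mbit/s corresponds to roughly 0.26 s per image, and a speed of 1.4873 Mbit/s to roughly 1.41 s.</p>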

</sec>
<sec id="s4_6_3">
<label>4.6.3</label>
<title>Analysis of Resistance to Different Types of Attacks</title>
<p>Building on the insights from the previous two sections, this part demonstrates the advantages of the encryption algorithm in resisting various types of attacks.</p>
<p>First, as discussed in <xref ref-type="sec" rid="s4_6_1">Section 4.6.1</xref>, it is understood that the ciphertext generated by this algorithm exhibits high entropy, meaning that it is statistically close to a random distribution. This randomness complicates the process for attackers trying to obtain useful information through frequency analysis, as well as through statistical attacks such as known-plaintext or chosen-plaintext attacks. The existence of high-entropy ciphertext significantly enhances the algorithm&#x2019;s defense against these forms of attacks.</p>
<p>Secondly, the analysis in <xref ref-type="sec" rid="s4_6_2">Section 4.6.2</xref> shows that the key space of this encryption algorithm is 2<sup>256</sup>, and such an extensive key range renders brute-force attacks virtually impossible. Even if attackers possess powerful computational resources, attempting to exhaust all possible keys within a reasonable time frame is still unlikely, providing additional security for the algorithm.</p>
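<p>The scale of a 2<sup>256</sup> key space can be illustrated with a short calculation. The attacker speed used in the note after the code (10<sup>12</sup> key trials per second) is a hypothetical figure chosen only for illustration, and the function name is likewise an assumption.</p>

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_exhaust(key_bits, keys_per_second):
    """Worst-case time, in years, to try every key of the given bit length."""
    return 2 ** key_bits / keys_per_second / SECONDS_PER_YEAR
```

<p>At the hypothetical rate of 10<sup>12</sup> trials per second, exhausting all 2<sup>256</sup> keys would take on the order of 10<sup>57</sup> years; the expected time to find the key is half of that, which does not change the conclusion.</p>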
<p>In summary, based on the aforementioned points, this encryption algorithm can be considered to possess strong security, effectively resisting brute-force attacks, statistical attacks, known-plaintext attacks, chosen-plaintext attacks, and chosen-ciphertext attacks, among other common attack methods. The high-entropy characteristic and the vast key space complement each other, ensuring the reliability and robustness of the encryption algorithm under various threats.</p>
</sec>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Conclusions</title>
<p>This paper proposes a secure technique for retrieving medical images utilizing a multi-attention mechanism and triplet deep hashing, addressing issues like poor feature extraction, low retrieval precision, and inadequate security in existing solutions. The approach enhances CXR images using contrast-limited adaptive histogram equalization, which reduces noise and highlights details for better feature extraction. The multi-attention mechanism dynamically allocates channel attention and focuses on important local features, improving retrieval accuracy through deep hash codes. A lightweight CXR image encryption method enhances system security while maintaining efficiency. A limitation is the small dataset of CXR disease types, with future work aimed at improving retrieval accuracy and security on more complex datasets.</p>
</sec>
</body>
<back>
<ack>
<p>The authors are grateful to all the editors and anonymous reviewers for their comments and suggestions and thank all the members who have contributed to this work with us.</p>
</ack>
<sec><title>Funding Statement</title>
<p>This work is supported by the National Natural Science Foundation of China (No. 61862041).</p>
</sec>
<sec><title>Author Contributions</title>
<p>The first author, Shaozheng Zhang, conceived the main idea and wrote the paper. The second author, Qiuyu Zhang, reviewed the manuscript and proposed revisions. The third author, Jiahui Tang, and the fourth author, Ruihua Xu, conducted the data collection. All authors reviewed the results and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability"><title>Availability of Data and Materials</title>
<p>The data supporting the findings of this study can be obtained from the corresponding author upon reasonable request.</p>
</sec>
<sec><title>Ethics Approval</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement"><title>Conflicts of Interest</title>
<p>The authors declare no conflicts of interest to report regarding the present study.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M. K.</given-names> <surname>Hasan</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Islam</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Sulaiman</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Khan</surname></string-name>, <string-name><given-names>A. A.</given-names> <surname>Hashim</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Habib</surname></string-name></person-group>, &#x201C;<article-title>Lightweight encryption technique to enhance medical image security on internet of medical things applications</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>9</volume>, pp. <fpage>47731</fpage>&#x2013;<lpage>47742</lpage>, <year>2021</year>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2021.3061710</pub-id>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>P.</given-names> <surname>Shamna</surname></string-name>, <string-name><given-names>V. K.</given-names> <surname>Govindan</surname></string-name>, and <string-name><given-names>K. A. A.</given-names> <surname>Nazeer</surname></string-name></person-group>, &#x201C;<article-title>Content-based medical image retrieval by spatial matching of visual words</article-title>,&#x201D; <source>J. King Saud Univ.-Comput. Inf. Sci.</source>, vol. <volume>34</volume>, no. <issue>2</issue>, pp. <fpage>58</fpage>&#x2013;<lpage>71</lpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1016/j.jksuci.2018.10.002</pub-id>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>&#x015E;.</given-names> <surname>&#x00D6;zt&#x00FC;rk</surname></string-name>, <string-name><given-names>E.</given-names> <surname>&#x00C7;elik</surname></string-name>, and <string-name><given-names>T.</given-names> <surname>&#x00C7;ukur</surname></string-name></person-group>, &#x201C;<article-title>Content-based medical image retrieval with opponent class adaptive margin loss</article-title>,&#x201D; <source>Inf. Sci.</source>, vol. <volume>637</volume>, no. <issue>1</issue>, <year>2023, Art. no. 118938</year>. doi: <pub-id pub-id-type="doi">10.1016/j.ins.2023.118938</pub-id>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>V. H.</given-names> <surname>Vu</surname></string-name></person-group>, &#x201C;<article-title>Content-based image retrieval with fuzzy clustering for feature vector normalization</article-title>,&#x201D; <source>Multimed. Tools Appl.</source>, vol. <volume>83</volume>, no. <issue>2</issue>, pp. <fpage>4309</fpage>&#x2013;<lpage>4329</lpage>, <year>2024</year>. doi: <pub-id pub-id-type="doi">10.1007/s11042-023-15215-1</pub-id>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>X.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Bai</surname></string-name>, and <string-name><given-names>K.</given-names> <surname>Kpalma</surname></string-name></person-group>, &#x201C;<article-title>OMCBIR: Offline mobile content-based image retrieval with lightweight CNN optimization</article-title>,&#x201D; <source>Displays</source>, vol. <volume>76</volume>, no. <issue>5</issue>, <year>2023, Art. no. 102355</year>. doi: <pub-id pub-id-type="doi">10.1016/j.displa.2022.102355</pub-id>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>X.</given-names> <surname>Chen</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Recent advances and clinical applications of deep learning in medical image analysis</article-title>,&#x201D; <source>Med. Image Anal.</source>, vol. <volume>79</volume>, <year>2022, Art. no. 102444</year>. doi: <pub-id pub-id-type="doi">10.1016/j.media.2022.102444</pub-id>; <pub-id pub-id-type="pmid">35472844</pub-id></mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S. M.</given-names> <surname>Alizadeh</surname></string-name>, <string-name><given-names>M. S.</given-names> <surname>Helfroush</surname></string-name>, and <string-name><given-names>H.</given-names> <surname>M&#x00FC;ller</surname></string-name></person-group>, &#x201C;<article-title>A novel Siamese deep hashing model for histopathology image retrieval</article-title>,&#x201D; <source>Expert Syst. Appl.</source>, vol. <volume>225</volume>, no. <issue>4</issue>, <year>2023, Art. no. 120169</year>. doi: <pub-id pub-id-type="doi">10.1016/j.eswa.2023.120169</pub-id>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Ma</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Zhang</surname></string-name> and <string-name><given-names>M.</given-names> <surname>Liu</surname></string-name></person-group>, &#x201C;<article-title>Triplet-constrained deep hashing for chest X-ray image retrieval in COVID-19 assessment</article-title>,&#x201D; <source>Neural Netw.</source>, vol. <volume>173</volume>, no. <issue>12</issue>, <year>2024, Art. no. 106182</year>. doi: <pub-id pub-id-type="doi">10.1016/j.neunet.2024.106182</pub-id>; <pub-id pub-id-type="pmid">38387203</pub-id></mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Hussain</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Li</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Ali</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Ali</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Abbas</surname></string-name> and <string-name><given-names>M.</given-names> <surname>Hussain</surname></string-name></person-group>, &#x201C;<article-title>An optimized deep supervised hashing model for fast image retrieval</article-title>,&#x201D; <source>Image Vis. Comput.</source>, vol. <volume>133</volume>, no. <issue>10</issue>, <year>2023, Art. no. 104668</year>. doi: <pub-id pub-id-type="doi">10.1016/j.imavis.2023.104668</pub-id>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Cui</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Huo</surname></string-name>, and <string-name><given-names>T.</given-names> <surname>Fang</surname></string-name></person-group>, &#x201C;<article-title>Deep hashing with multi-central ranking loss for multi-label image retrieval</article-title>,&#x201D; <source>IEEE Signal Process. Lett.</source>, vol. <volume>30</volume>, pp. <fpage>135</fpage>&#x2013;<lpage>139</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1109/LSP.2023.3244516</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>D.</given-names> <surname>Wu</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Dai</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Li</surname></string-name>, and <string-name><given-names>W.</given-names> <surname>Wang</surname></string-name></person-group>, &#x201C;<article-title>Deep uncoupled discrete hashing via similarity matrix decomposition</article-title>,&#x201D; <source>ACM Trans. Multimed. Comput., Commun., Appl.</source>, vol. <volume>19</volume>, no. <issue>1</issue>, <year>2023, Art. no. 22</year>. doi: <pub-id pub-id-type="doi">10.1145/3524021</pub-id>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Peng</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Qian</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Liu</surname></string-name>, and <string-name><given-names>Y.</given-names> <surname>Dong</surname></string-name></person-group>, &#x201C;<article-title>Swin transformer-based supervised hashing</article-title>,&#x201D; <source>Appl. Intell.</source>, vol. <volume>53</volume>, no. <issue>14</issue>, pp. <fpage>17548</fpage>&#x2013;<lpage>17560</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1007/s10489-022-04410-6</pub-id>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>H.</given-names> <surname>Zhou</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Qin</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Hou</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Dai</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Huang</surname></string-name> and <string-name><given-names>W.</given-names> <surname>Zhang</surname></string-name></person-group>, &#x201C;<article-title>Deep global semantic structure-preserving hashing via corrective triplet loss for remote sensing image retrieval</article-title>,&#x201D; <source>Expert Syst. Appl.</source>, vol. <volume>238</volume>, no. <issue>2</issue>, <year>2024, Art. no. 122105</year>. doi: <pub-id pub-id-type="doi">10.1016/j.eswa.2023.122105</pub-id>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>D.</given-names> <surname>Li</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>L&#x00FC;</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Liao</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Xiang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Wu</surname></string-name> and <string-name><given-names>J.</given-names> <surname>Le</surname></string-name></person-group>, &#x201C;<article-title>AVPMIR: Adaptive verifiable privacy-preserving medical image retrieval</article-title>,&#x201D; <source>IEEE Trans. Depend. Secure Comput.</source>, vol. <volume>21</volume>, no. <issue>5</issue>, pp. <fpage>4637</fpage>&#x2013;<lpage>4651</lpage>, <year>2024</year>. doi: <pub-id pub-id-type="doi">10.1109/TDSC.2024.3355223</pub-id>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K. N.</given-names> <surname>Singh</surname></string-name>, <string-name><given-names>O. P.</given-names> <surname>Singh</surname></string-name>, <string-name><given-names>A. K.</given-names> <surname>Singh</surname></string-name>, and <string-name><given-names>A. K.</given-names> <surname>Agrawal</surname></string-name></person-group>, &#x201C;<article-title>WatMIF: Multimodal medical image fusion-based watermarking for telehealth applications</article-title>,&#x201D; <source>Cogn. Comput.</source>, vol. <volume>16</volume>, no. <issue>4</issue>, pp. <fpage>1947</fpage>&#x2013;<lpage>1963</lpage>, <year>2024</year>. doi: <pub-id pub-id-type="doi">10.1007/s12559-022-10040-4</pub-id>; <pub-id pub-id-type="pmid">35818513</pub-id></mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Chang</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Ren</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Ji</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Xu</surname></string-name>, and <string-name><given-names>R.</given-names> <surname>Xue</surname></string-name></person-group>, &#x201C;<article-title>Secure medical data management with privacy-preservation and authentication properties in smart healthcare system</article-title>,&#x201D; <source>Comput. Netw.</source>, vol. <volume>212</volume>, no. <issue>1</issue>, <year>2022, Art. no. 109013</year>. doi: <pub-id pub-id-type="doi">10.1016/j.comnet.2022.109013</pub-id>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>E.</given-names> <surname>&#x00D6;zbay</surname></string-name> and <string-name><given-names>F. A.</given-names> <surname>&#x00D6;zbay</surname></string-name></person-group>, &#x201C;<article-title>Interpretable pap-smear image retrieval for cervical cancer detection with rotation invariance mask generation deep hashing</article-title>,&#x201D; <source>Comput. Biol. Med.</source>, vol. <volume>154</volume>, no. <issue>2</issue>, <year>2023, Art. no. 106574</year>. doi: <pub-id pub-id-type="doi">10.1016/j.compbiomed.2023.106574</pub-id>; <pub-id pub-id-type="pmid">36738706</pub-id></mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Xu</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Zeng</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Zheng</surname></string-name>, and <string-name><given-names>W.</given-names> <surname>Li</surname></string-name></person-group>, &#x201C;<article-title>Multi-manifold deep discriminative cross-modal hashing for medical image retrieval</article-title>,&#x201D; <source>IEEE Trans. Image Process.</source>, vol. <volume>31</volume>, no. <issue>4</issue>, pp. <fpage>3371</fpage>&#x2013;<lpage>3385</lpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1109/TIP.2022.3171081</pub-id>; <pub-id pub-id-type="pmid">35507618</pub-id></mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Chen</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Tang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Huang</surname></string-name>, and <string-name><given-names>S.</given-names> <surname>Xiong</surname></string-name></person-group>, &#x201C;<article-title>Multi-scale triplet hashing for medical image retrieval</article-title>,&#x201D; <source>Comput. Biol. Med.</source>, vol. <volume>155</volume>, <year>2023, Art. no. 106633</year>. doi: <pub-id pub-id-type="doi">10.1016/j.compbiomed.2023.106633</pub-id>; <pub-id pub-id-type="pmid">36827786</pub-id></mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Suganyadevi</surname></string-name>, <string-name><given-names>V.</given-names> <surname>Seethalakshmi</surname></string-name>, and <string-name><given-names>K.</given-names> <surname>Balasamy</surname></string-name></person-group>, &#x201C;<article-title>A review on deep learning in medical image analysis</article-title>,&#x201D; <source>Int. J. Multimed. Inf. Retr.</source>, vol. <volume>11</volume>, no. <issue>1</issue>, pp. <fpage>19</fpage>&#x2013;<lpage>38</lpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1007/s13735-021-00218-1</pub-id>; <pub-id pub-id-type="pmid">34513553</pub-id></mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Xie</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Song</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Zheng</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Liu</surname></string-name>, and <string-name><given-names>J.</given-names> <surname>Wang</surname></string-name></person-group>, &#x201C;<article-title>Dermoscopic image retrieval based on rotation-invariance deep hashing</article-title>,&#x201D; <source>Med. Image Anal.</source>, vol. <volume>77</volume>, no. <issue>5</issue>, <year>2022, Art. no. 102301</year>. doi: <pub-id pub-id-type="doi">10.1016/j.media.2021.102301</pub-id>; <pub-id pub-id-type="pmid">34836790</pub-id></mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Fang</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Fu</surname></string-name>, and <string-name><given-names>J.</given-names> <surname>Liu</surname></string-name></person-group>, &#x201C;<article-title>Deep triplet hashing network for case-based medical image retrieval</article-title>,&#x201D; <source>Med. Image Anal.</source>, vol. <volume>69</volume>, no. <issue>4&#x2013;5</issue>, <year>2021, Art. no. 101981</year>. doi: <pub-id pub-id-type="doi">10.1016/j.media.2021.101981</pub-id>; <pub-id pub-id-type="pmid">33588123</pub-id></mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>&#x015E;.</given-names> <surname>&#x00D6;zt&#x00FC;rk</surname></string-name></person-group>, &#x201C;<article-title>Class-driven content-based medical image retrieval using hash codes of deep features</article-title>,&#x201D; <source>Biomed. Signal Process. Control</source>, vol. <volume>68</volume>, no. <issue>2</issue>, <year>2021, Art. no. 102601</year>. doi: <pub-id pub-id-type="doi">10.1016/j.bspc.2021.102601</pub-id>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>P.</given-names> <surname>Huang</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Zhou</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Wei</surname></string-name>, and <string-name><given-names>G.</given-names> <surname>Guo</surname></string-name></person-group>, &#x201C;<article-title>Energy-based supervised hashing for multimorbidity image retrieval</article-title>,&#x201D; in <conf-name>Med. Image Comput. Comput. Assist. Interv.&#x2014;MICCAI 2021: 24th Int. Conf.</conf-name>, <publisher-loc>Strasbourg, France</publisher-loc>, <publisher-name>Springer International Publishing</publisher-name>, <year>2021</year>, pp. <fpage>205</fpage>&#x2013;<lpage>214</lpage>. doi: <pub-id pub-id-type="doi">10.1007/978-3-030-87240-3_20</pub-id>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Y.</given-names> <surname>Qi</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Gu</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Jia</surname></string-name>, and <string-name><given-names>Y.</given-names> <surname>Su</surname></string-name></person-group>, &#x201C;<article-title>Unsupervised deep hashing by joint optimization for pulmonary nodule image retrieval</article-title>,&#x201D; <source>Measurement</source>, vol. <volume>159</volume>, no. <issue>1</issue>, <year>2020, Art. no. 107785</year>. doi: <pub-id pub-id-type="doi">10.1016/j.measurement.2020.107785</pub-id>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S. R.</given-names> <surname>Dubey</surname></string-name></person-group>, &#x201C;<article-title>A decade survey of content based image retrieval using deep learning</article-title>,&#x201D; <source>IEEE Trans. Circuits Syst. Video Technol.</source>, vol. <volume>32</volume>, no. <issue>5</issue>, pp. <fpage>2687</fpage>&#x2013;<lpage>2704</lpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1109/TCSVT.2021.3080920</pub-id>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Hafsa</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Gafsi</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Malek</surname></string-name>, and <string-name><given-names>M.</given-names> <surname>Machhout</surname></string-name></person-group>, &#x201C;<article-title>FPGA implementation of improved security approach for medical image encryption and decryption</article-title>,&#x201D; <source>Sci. Program.</source>, vol. <volume>2021</volume>, pp. <fpage>1</fpage>&#x2013;<lpage>20</lpage>, <year>2021</year>. doi: <pub-id pub-id-type="doi">10.1155/2021/6610655</pub-id>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Haddad</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Coatrieux</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Moreau-Gaudry</surname></string-name>, and <string-name><given-names>M.</given-names> <surname>Cozic</surname></string-name></person-group>, &#x201C;<article-title>Joint watermarking-encryption-JPEG-LS for medical image reliability control in encrypted and compressed domains</article-title>,&#x201D; <source>IEEE Trans. Inf. Foren. Secur.</source>, vol. <volume>15</volume>, pp. <fpage>2556</fpage>&#x2013;<lpage>2569</lpage>, <year>2020</year>. doi: <pub-id pub-id-type="doi">10.1109/TIFS.2020.2972159</pub-id>.</mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C.</given-names> <surname>Guo</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Jia</surname></string-name>, <string-name><given-names>K. K. R.</given-names> <surname>Choo</surname></string-name>, and <string-name><given-names>Y.</given-names> <surname>Jie</surname></string-name></person-group>, &#x201C;<article-title>Privacy-preserving image search (PPIS): Secure classification and searching using convolutional neural network over large-scale encrypted medical images</article-title>,&#x201D; <source>Comput. Secur.</source>, vol. <volume>99</volume>, no. <issue>11</issue>, <year>2020, Art. no. 102021</year>. doi: <pub-id pub-id-type="doi">10.1016/j.cose.2020.102021</pub-id>.</mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Kumar</surname></string-name>, <string-name><given-names>S. K.</given-names> <surname>Agarwal</surname></string-name>, and <string-name><given-names>S. S.</given-names> <surname>Ahmad</surname></string-name></person-group>, &#x201C;<article-title>A secure medical image retrieval technique using encrypted query image</article-title>,&#x201D; in <conf-name>2022 2nd Int. Conf. Emerg. Front. Electr. Electron. Technol. (ICEFEET)</conf-name>, <publisher-loc>Patna, India</publisher-loc>, <publisher-name>IEEE</publisher-name>, <year>2022</year>, pp. <fpage>1</fpage>&#x2013;<lpage>4</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ICEFEET51821.2022.9847826</pub-id>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>D.</given-names> <surname>Zhu</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Zhu</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Lu</surname></string-name>, and <string-name><given-names>D.</given-names> <surname>Feng</surname></string-name></person-group>, &#x201C;<article-title>An accurate and privacy-preserving retrieval scheme over outsourced medical images</article-title>,&#x201D; <source>IEEE Trans. Serv. Comput.</source>, vol. <volume>16</volume>, no. <issue>2</issue>, pp. <fpage>913</fpage>&#x2013;<lpage>926</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1109/TSC.2022.3149847</pub-id>.</mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>G.</given-names> <surname>Cai</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Wei</surname></string-name>, and <string-name><given-names>Y.</given-names> <surname>Li</surname></string-name></person-group>, &#x201C;<article-title>Privacy-preserving CNN feature extraction and retrieval over medical images</article-title>,&#x201D; <source>Int. J. Intell. Syst.</source>, vol. <volume>37</volume>, no. <issue>11</issue>, pp. <fpage>9267</fpage>&#x2013;<lpage>9289</lpage>, <year>2022</year>. doi: <pub-id pub-id-type="doi">10.1002/int.22991</pub-id>.</mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>T.</given-names> <surname>Rahman</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Exploring the effect of image enhancement techniques on COVID-19 detection using chest X-ray images</article-title>,&#x201D; <source>Comput. Biol. Med.</source>, vol. <volume>132</volume>, no. <issue>2</issue>, <year>2021, Art. no. 104319</year>. doi: <pub-id pub-id-type="doi">10.1016/j.compbiomed.2021.104319</pub-id>; <pub-id pub-id-type="pmid">33799220</pub-id></mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>G.</given-names> <surname>Alwakid</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Gouda</surname></string-name>, and <string-name><given-names>M.</given-names> <surname>Humayun</surname></string-name></person-group>, &#x201C;<article-title>Deep learning-based prediction of diabetic retinopathy using CLAHE and ESRGAN for enhancement</article-title>,&#x201D; <source>Healthcare</source>, vol. <volume>11</volume>, no. <issue>6</issue>, <year>2023, Art. no. 863</year>. doi: <pub-id pub-id-type="doi">10.3390/healthcare11060863</pub-id>; <pub-id pub-id-type="pmid">36981520</pub-id></mixed-citation></ref>
<ref id="ref-35"><label>[35]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Xiong</surname></string-name> <etal>et al.</etal></person-group>, &#x201C;<article-title>Application of histogram equalization for image enhancement in corrosion areas</article-title>,&#x201D; <source>Shock Vib.</source>, vol. <volume>2021</volume>, no. <issue>1</issue>, pp. <fpage>1</fpage>&#x2013;<lpage>13</lpage>, <year>2021</year>. doi: <pub-id pub-id-type="doi">10.1155/2021/8883571</pub-id>.</mixed-citation></ref>
<ref id="ref-36"><label>[36]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>Z.</given-names> <surname>Qin</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Wu</surname></string-name>, and <string-name><given-names>X.</given-names> <surname>Li</surname></string-name></person-group>, &#x201C;<article-title>FcaNet: Frequency channel attention networks</article-title>,&#x201D; in <conf-name>Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV)</conf-name>, <year>2021</year>, pp. <fpage>783</fpage>&#x2013;<lpage>792</lpage>.</mixed-citation></ref>
<ref id="ref-37"><label>[37]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>O.</given-names> <surname>Ronneberger</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Fischer</surname></string-name>, and <string-name><given-names>T.</given-names> <surname>Brox</surname></string-name></person-group>, &#x201C;<article-title>U-Net: Convolutional networks for biomedical image segmentation</article-title>,&#x201D; in <conf-name>Proc. 18th Int. Conf. Med. Image Comput. Comput.-Assist. Interv.</conf-name>, <year>2015</year>, pp. <fpage>234</fpage>&#x2013;<lpage>241</lpage>. doi: <pub-id pub-id-type="doi">10.1007/978-3-319-24574-4_28</pub-id>.</mixed-citation></ref>
<ref id="ref-38"><label>[38]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>D.</given-names> <surname>Yao</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Sui</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>E.</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Jiaerken</surname></string-name>, and <string-name><given-names>N.</given-names> <surname>Luo</surname></string-name></person-group>, &#x201C;<article-title>A mutual multi-scale triplet graph convolutional network for classification of brain disorders using functional or structural connectivity</article-title>,&#x201D; <source>IEEE Trans. Med. Imag.</source>, vol. <volume>40</volume>, no. <issue>4</issue>, pp. <fpage>1279</fpage>&#x2013;<lpage>1289</lpage>, <year>2021</year>. doi: <pub-id pub-id-type="doi">10.1109/TMI.2021.3051604</pub-id>; <pub-id pub-id-type="pmid">33444133</pub-id></mixed-citation></ref>
<ref id="ref-39"><label>[39]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>Z. Q.</given-names> <surname>Lin</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Wong</surname></string-name></person-group>, &#x201C;<article-title>COVID-Net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images</article-title>,&#x201D; <source>Sci. Rep.</source>, vol. <volume>10</volume>, no. <issue>1</issue>, <year>2020, Art. no. 19549</year>. doi: <pub-id pub-id-type="doi">10.1038/s41598-020-76550-z</pub-id>; <pub-id pub-id-type="pmid">33177550</pub-id></mixed-citation></ref>
<ref id="ref-40"><label>[40]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Fan</surname></string-name>, <string-name><given-names>K. W.</given-names> <surname>Ng</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Ju</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Zhang</surname></string-name>, and <string-name><given-names>C. S.</given-names> <surname>Chan</surname></string-name></person-group>, &#x201C;<article-title>Deep polarized network for supervised learning of accurate binary hashing codes</article-title>,&#x201D; in <conf-name>IJCAI&#x2019;20: Proc. Twenty-Ninth Int. Joint Conf. Artif. Intell.</conf-name>, <year>2021</year>, pp. <fpage>825</fpage>&#x2013;<lpage>831</lpage>.</mixed-citation></ref>
<ref id="ref-41"><label>[41]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>J.</given-names> <surname>Fang</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Xu</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Hu</surname></string-name>, and <string-name><given-names>J.</given-names> <surname>Liu</surname></string-name></person-group>, &#x201C;<article-title>Attention-based saliency hashing for ophthalmic image retrieval</article-title>,&#x201D; in <conf-name>Proc. IEEE Int. Conf. Bioinf. Biomed.</conf-name>, <year>2020</year>, pp. <fpage>990</fpage>&#x2013;<lpage>995</lpage>. doi: <pub-id pub-id-type="doi">10.1109/BIBM49941.2020.9313536</pub-id>.</mixed-citation></ref>
<ref id="ref-42"><label>[42]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>X.</given-names> <surname>Zheng</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Zhang</surname></string-name>, and <string-name><given-names>X.</given-names> <surname>Lu</surname></string-name></person-group>, &#x201C;<article-title>Deep balanced discrete hashing for image retrieval</article-title>,&#x201D; <source>Neurocomputing</source>, vol. <volume>403</volume>, no. <issue>3</issue>, pp. <fpage>224</fpage>&#x2013;<lpage>236</lpage>, <year>2020</year>. doi: <pub-id pub-id-type="doi">10.1016/j.neucom.2020.04.037</pub-id>.</mixed-citation></ref>
<ref id="ref-43"><label>[43]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>B.</given-names> <surname>Hu</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Vasu</surname></string-name>, and <string-name><given-names>A.</given-names> <surname>Hoogs</surname></string-name></person-group>, &#x201C;<article-title>X-MIR: Explainable medical image retrieval</article-title>,&#x201D; in <conf-name>Proc. IEEE/CVF Winter Conf. Appl. Comput. Vis. (WACV)</conf-name>, <year>2022</year>, pp. <fpage>440</fpage>&#x2013;<lpage>450</lpage>.</mixed-citation></ref>
<ref id="ref-44"><label>[44]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>E.</given-names> <surname>Cort&#x00E9;s</surname></string-name> and <string-name><given-names>S.</given-names> <surname>S&#x00E1;nchez</surname></string-name></person-group>, &#x201C;<article-title>Deep learning transfer with AlexNet for chest X-ray COVID-19 recognition</article-title>,&#x201D; <source>IEEE Lat. Am. Trans.</source>, vol. <volume>19</volume>, no. <issue>6</issue>, pp. <fpage>944</fpage>&#x2013;<lpage>951</lpage>, <year>2021</year>. doi: <pub-id pub-id-type="doi">10.1109/TLA.2021.9451239</pub-id>.</mixed-citation></ref>
<ref id="ref-45"><label>[45]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Q.</given-names> <surname>Zhuang</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Gan</surname></string-name>, and <string-name><given-names>L.</given-names> <surname>Zhang</surname></string-name></person-group>, &#x201C;<article-title>Human-computer interaction based health diagnostics using ResNet34 for tongue image classification</article-title>,&#x201D; <source>Comput. Methods Prog. Biomed.</source>, vol. <volume>226</volume>, no. <issue>3</issue>, <year>2022, Art. no. 107096</year>. doi: <pub-id pub-id-type="doi">10.1016/j.cmpb.2022.107096</pub-id>; <pub-id pub-id-type="pmid">36191350</pub-id></mixed-citation></ref>
<ref id="ref-46"><label>[46]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>F.</given-names> <surname>Castro</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Impedovo</surname></string-name>, and <string-name><given-names>G.</given-names> <surname>Pirlo</surname></string-name></person-group>, &#x201C;<article-title>A medical image encryption scheme for secure fingerprint-based authenticated transmission</article-title>,&#x201D; <source>Appl. Sci.</source>, vol. <volume>13</volume>, no. <issue>10</issue>, <year>2023, Art. no. 6099</year>. doi: <pub-id pub-id-type="doi">10.3390/app13106099</pub-id>.</mixed-citation></ref>
<ref id="ref-47"><label>[47]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>R. I.</given-names> <surname>Abdelfatah</surname></string-name>, <string-name><given-names>H. M.</given-names> <surname>Saqr</surname></string-name>, and <string-name><given-names>M. E.</given-names> <surname>Nasr</surname></string-name></person-group>, &#x201C;<article-title>An efficient medical image encryption scheme for (WBAN) based on adaptive DNA and modern multi chaotic map</article-title>,&#x201D; <source>Multimed. Tools Appl.</source>, vol. <volume>82</volume>, no. <issue>14</issue>, pp. <fpage>22213</fpage>&#x2013;<lpage>22227</lpage>, <year>2023</year>. doi: <pub-id pub-id-type="doi">10.1007/s11042-022-13343-8</pub-id>.</mixed-citation></ref>
<ref id="ref-48"><label>[48]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Inam</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Kanwal</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Firdous</surname></string-name>, and <string-name><given-names>F.</given-names> <surname>Hajjej</surname></string-name></person-group>, &#x201C;<article-title>Blockchain based medical image encryption using Arnold&#x2019;s cat map in a cloud environment</article-title>,&#x201D; <source>Sci. Rep.</source>, vol. <volume>14</volume>, no. <issue>1</issue>, <year>2024, Art. no. 5678</year>. doi: <pub-id pub-id-type="doi">10.1038/s41598-024-56364-z</pub-id>; <pub-id pub-id-type="pmid">38453988</pub-id></mixed-citation></ref>
</ref-list>
</back></article>