<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xml:lang="en" article-type="review-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">76492</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2026.076492</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Review</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Applications of Machine Learning in Polymer Materials: Property Prediction, Material Design, and Systematic Processes</article-title>
<alt-title alt-title-type="left-running-head">Applications of Machine Learning in Polymer Materials: Property Prediction, Material Design, and Systematic Processes</alt-title>
<alt-title alt-title-type="right-running-head">Applications of Machine Learning in Polymer Materials: Property Prediction, Material Design, and Systematic Processes</alt-title>
</title-group>
<contrib-group>
<contrib id="author-1" contrib-type="author">
<name name-style="western"><surname>Guo</surname><given-names>Hongtao</given-names></name><xref ref-type="aff" rid="aff-1">1</xref></contrib>
<contrib id="author-2" contrib-type="author">
<name name-style="western"><surname>Li</surname><given-names>Shuai</given-names></name><xref ref-type="aff" rid="aff-2">2</xref></contrib>
<contrib id="author-3" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Li</surname><given-names>Shu</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><email>lishu@hrbust.edu.cn</email></contrib>
<aff id="aff-1"><label>1</label><institution>Key Laboratory of Engineering Dielectric and Applications (Ministry of Education), School of Electrical and Electronic Engineering, Harbin University of Science and Technology</institution>, <addr-line>Harbin</addr-line>, <country>China</country></aff>
<aff id="aff-2"><label>2</label><institution>School of Materials Science and Chemical Engineering, Harbin University of Science and Technology</institution>, <addr-line>Harbin</addr-line>, <country>China</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Shu Li. Email: <email>lishu@hrbust.edu.cn</email></corresp>
</author-notes>
<pub-date date-type="collection" publication-format="electronic">
<year>2026</year>
</pub-date>
<pub-date date-type="pub" publication-format="electronic">
<day>9</day><month>4</month><year>2026</year>
</pub-date>
<volume>87</volume>
<issue>3</issue>
<elocation-id>2</elocation-id>
<history>
<date date-type="received">
<day>21</day>
<month>11</month>
<year>2025</year>
</date>
<date date-type="accepted">
<day>27</day>
<month>01</month>
<year>2026</year>
</date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2026 The Authors. Published by Tech Science Press.</copyright-statement>
<copyright-year>2026</copyright-year>
<copyright-holder>The Authors</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_76492.pdf"></self-uri>
<abstract>
<p>This paper reviews the research progress and application prospects of machine learning in the field of polymer materials. Machine learning methods are developing rapidly in polymer research; although they have significantly accelerated material prediction and design, their complexity makes them difficult for researchers from traditional fields to understand and apply. To address this issue, the paper first analyzes the inherent challenges in the research and development of polymer materials, including structural complexity and the limitations of traditional trial-and-error methods. It then introduces key enabling techniques such as molecular descriptors and feature representation and data standardization and cleaning, and catalogs a number of high-quality polymer databases. Subsequently, it elaborates on the role of machine learning in polymer property prediction and material design, covering traditional machine learning, deep learning, and transfer learning, and discusses data-driven design strategies such as inverse design, high-throughput virtual screening, and multi-objective optimization. The paper also describes the complete process of constructing reliable machine learning models and summarizes methods for experimental verification, model evaluation, and optimization. Finally, it summarizes current technical challenges, such as data quality and model generalization, and outlines future trends including multi-scale modeling, physics-informed machine learning, standardized data sharing, and interpretable machine learning.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Machine learning</kwd>
<kwd>polymer materials</kwd>
<kwd>property prediction</kwd>
<kwd>material design</kwd>
<kwd>data-driven</kwd>
</kwd-group>
<funding-group>
<award-group id="awg1">
<funding-source>National Natural Science Foundation of China</funding-source>
<award-id>51671075</award-id>
<award-id>51971086</award-id>
</award-group>
<award-group id="awg2">
<funding-source>Natural Science Foundation of Heilongjiang Province of China</funding-source>
<award-id>LH2022E081</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>As an important branch of material research, polymer science is gradually shifting its research paradigm from traditional experiment-driven to data-driven. The vigorous development of machine learning technology provides strong support for this transformation. In recent years, this technology has made remarkable progress in the fields of polymer material discovery, property prediction, and process optimization, showing broad application prospects. However, how to help researchers in traditional fields understand and apply these rapidly evolving technologies has become a key challenge for promoting the successful transformation of the paradigm.</p>
<p>As basic materials in modern industry, polymer materials are widely used in key fields such as aerospace, biomedicine, new energy, and electronic information, thanks to their advantages of light weight, corrosion resistance, and strong designability. Their research focuses on the correlation between &#x201C;structure-performance-application&#x201D;, specifically covering core topics such as molecular chain structure regulation (e.g., repeating unit composition, sequence arrangement, branched/cross-linked topology), aggregated structure optimization (e.g., crystallinity, phase separation morphology), macro-performance control (e.g., mechanical strength, thermal stability, conductivity), and process adaptability design [<xref ref-type="bibr" rid="ref-1">1</xref>]. After decades of development, polymer science has formed a traditional research system centered on experimental synthesis, theoretical calculation (e.g., molecular dynamics simulation, density functional theory), and process optimization, laying a solid foundation for new material R&#x0026;D [<xref ref-type="bibr" rid="ref-2">2</xref>].</p>
<p>Although mature non-machine-learning research methods exist in polymer materials science, they face serious bottlenecks when dealing with complex problems. Traditional research relies mainly on chemical intuition and trial and error, which is inefficient and makes it difficult to fully capture the complex structure-property relationships of polymer materials [<xref ref-type="bibr" rid="ref-1">1</xref>]. For example, in research on sequence regulation and performance matching of multi-component copolymers, traditional experiments must verify hundreds or thousands of candidate combinations one by one, with a cycle that often lasts months or even years. Although theoretical calculations can reveal microscopic mechanisms, the multi-scale structure of polymer chains (from the atomic level to the macroscopic aggregated state) and the vast chemical space (more than 10<sup>18</sup> potential polymer structures) [<xref ref-type="bibr" rid="ref-3">3</xref>] make the computational cost grow exponentially, so such calculations cannot meet the demand for large-scale screening. In addition, traditional methods use data inefficiently and cannot effectively integrate the patterns hidden in multi-source heterogeneous data (such as experimental, simulation, and literature data), resulting in considerable repeated work and wasted resources in materials R&#x0026;D.</p>
<p>In contrast, machine learning can overcome these limitations through its ability to process high-dimensional data and model non-linear relationships. In data-scarce scenarios, techniques such as transfer learning and data augmentation allow machine learning to mine structure-property relationships from limited samples. For multi-objective optimization problems (such as balancing the mechanical strength, thermal stability, and biocompatibility of materials), it can quickly locate Pareto-optimal solutions, significantly shortening the R&#x0026;D cycle. For example, in the design of thermally conductive polymers, traditional methods struggle to balance molecular chain regularity and process feasibility, whereas machine learning models that integrated more than 1000 sets of experimental data predicted new structures with a 40% increase in thermal conductivity, reducing the R&#x0026;D cycle from 2 years to 3 months [<xref ref-type="bibr" rid="ref-4">4</xref>&#x2013;<xref ref-type="bibr" rid="ref-6">6</xref>].</p>
<p>To address the limitations of traditional methods and promote the paradigm transformation of polymer science, this study focuses on exploring the application progress of machine learning technologies in polymer research, systematically sorts out their development context and research status, and refines efficient and practical methodologies and systematic processes, aiming to provide valuable references for polymer material researchers to enter this field.</p>
<p>The structure of this paper is shown in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>: <xref ref-type="sec" rid="s2">Section 2</xref> describes data characterization and preprocessing methods for polymer materials, including molecular descriptor construction, data standardization, and data augmentation; <xref ref-type="sec" rid="s3">Section 3</xref> analyzes the application of machine learning algorithms to property prediction, covering traditional methods, deep learning, and transfer learning; <xref ref-type="sec" rid="s4">Section 4</xref> explores data-driven polymer design strategies, including inverse design, high-throughput screening, and multi-objective optimization; <xref ref-type="sec" rid="s5">Section 5</xref> discusses experimental verification and model optimization; <xref ref-type="sec" rid="s6">Section 6</xref> demonstrates practical application results through typical cases; finally, current challenges are summarized and future directions are outlined. Together, these sections present the knowledge system and technical route of machine learning in polymer science and provide a reference for interdisciplinary innovation.</p>
<fig id="fig-1">
<label>Figure 1</label>
<caption>
<title>The framework includes four core modules: structural descriptors (converting polymer structures into computer-processable feature vectors, such as molecular fingerprints and multi-scale descriptors), machine learning models (algorithm systems including traditional machine learning, deep learning, and transfer learning), machine learning-driven property extrapolation (achieving accurate prediction of key properties such as thermal conductivity and mechanical strength), and high-throughput computation (generating massive data through molecular dynamics simulation and first-principles calculation to support model training). The left side of the framework shows existing typical application cases (e.g., polymer electrolyte design, biodegradable material optimization), and the right side lists current core challenges (e.g., uneven data quality, insufficient model generalization ability, lack of interpretability), intuitively presenting the complete logical chain of &#x201C;data-model-application-challenge&#x201D;.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76492-fig-1.tif"/>
</fig>
<sec id="s1_1">
<label>1.1</label>
<title>Research Background and Significance</title>
<p>As a basic material in modern industry, polymer materials pose long-standing challenges for precise design and performance improvement because of their structural complexity and multi-functional requirements. Traditional research relies mainly on chemical intuition and trial and error, which is inefficient and makes it difficult to fully capture the complex structure-property relationships of polymer materials [<xref ref-type="bibr" rid="ref-1">1</xref>]. With the arrival of the big data era, the combination of artificial intelligence and traditional scientific research has produced a new paradigm of &#x201C;AI for Science&#x201D;. As an important branch of artificial intelligence, machine learning has shown significant advantages in revealing the underlying physical and chemical laws of polymer materials owing to its ability to process high-dimensional data [<xref ref-type="bibr" rid="ref-2">2</xref>].</p>
<p>The core challenge in polymer science is that the relationship between the large, complex, multi-scale structural characteristics of polymers and their properties is not yet fully understood. Polymer materials are usually composed of a collection of one or more similar molecules rather than a single structure, which poses unique challenges for traditional chemical representation and machine learning methods [<xref ref-type="bibr" rid="ref-3">3</xref>]. For example, the low thermal conductivity of intrinsic polymers conflicts with the requirements of their wide application in integrated circuit packaging and organic semiconductors. However, because polymer synthesis is complex and costly, publicly available, reliable polymer thermal conductivity data are very scarce, which seriously hinders understanding of the mapping between polymer microstructure and thermal conductivity [<xref ref-type="bibr" rid="ref-4">4</xref>]. Machine learning offers a new way to address this problem through its ability to extract useful relationships from limited data [<xref ref-type="bibr" rid="ref-5">5</xref>].</p>
<p>The application of machine learning in polymer science has practical significance in several respects. In material design, machine learning can efficiently handle the huge chemical and configuration space of polymers and accelerate the discovery of new materials [<xref ref-type="bibr" rid="ref-6">6</xref>]. Machine learning-assisted inverse analysis of polymer synthesis can quickly and accurately predict appropriate polymerization conditions, supporting the efficient development of high-performance polymer materials [<xref ref-type="bibr" rid="ref-7">7</xref>]. In property prediction, machine learning models can identify meaningful patterns in large-scale data that are difficult for humans to interpret, which is particularly useful for systems with complex interactions [<xref ref-type="bibr" rid="ref-8">8</xref>]. In particular, when dealing with the complex structure-function relationships of polymer materials, machine learning can establish connections among the chemical composition and conformation of molecular chains, the aggregated structure, and macroscopic properties [<xref ref-type="bibr" rid="ref-8">8</xref>&#x2013;<xref ref-type="bibr" rid="ref-10">10</xref>].</p>
<p>From the perspective of industrial application, machine learning is reshaping the R&#x0026;D paradigm of polymer materials. Traditional &#x201C;trial-and-error&#x201D; experimentation is giving way to an intelligent &#x201C;prediction&#x2013;verification&#x201D; R&#x0026;D model, which not only changes the working mode of researchers but also redefines the performance boundaries of future energy equipment. In industries such as aerospace, automobile manufacturing, energy development, and biomedicine, machine learning can quickly and accurately predict material properties, significantly shortening the R&#x0026;D cycle and reducing costs [<xref ref-type="bibr" rid="ref-10">10</xref>]. For example, in the field of polymer composites, machine learning models can address thermal management problems that are difficult to handle with traditional development methods by analyzing large amounts of experimental data [<xref ref-type="bibr" rid="ref-11">11</xref>].</p>
<p>The particular nature of polymers also places unique requirements on the application of machine learning. Because polymer materials are usually a collection of one or more similar molecules rather than a single structure, traditional chemical representation methods are difficult to apply directly. At the same time, the scarcity of high-quality experimental data limits the effectiveness of supervised learning, especially in polymer property prediction tasks [<xref ref-type="bibr" rid="ref-12">12</xref>]. These challenges have prompted researchers to develop new approaches, such as combining machine learning with high-throughput molecular dynamics simulation to predict material properties and using transfer learning to handle differences in data distribution [<xref ref-type="bibr" rid="ref-13">13</xref>].</p>
</sec>
<sec id="s1_2">
<label>1.2</label>
<title>Research Status</title>
<p>As noted above, polymer science is gradually shifting from an experiment-driven to a data-driven research paradigm, and the rapid development of machine learning strongly supports this transformation. In recent years, machine learning has made remarkable progress in polymer material discovery, property prediction, and process optimization, showing broad application prospects [<xref ref-type="bibr" rid="ref-14">14</xref>]. However, helping researchers from traditional fields understand and apply these rapidly evolving techniques remains a key challenge for completing this paradigm shift [<xref ref-type="bibr" rid="ref-15">15</xref>].</p>
<p>The core challenge in polymer science is that the relationship between the large, complex, multi-scale structural characteristics of polymers and their properties is not yet fully understood [<xref ref-type="bibr" rid="ref-16">16</xref>]. Polymer materials are usually composed of a collection of one or more similar molecules rather than a single structure, which creates unique difficulties for traditional chemical characterization and research methods [<xref ref-type="bibr" rid="ref-17">17</xref>]. For example, the low thermal conductivity of intrinsic polymers conflicts with the requirements of their wide application in fields such as integrated circuit packaging and organic semiconductors. However, because polymer synthesis is complex and costly, publicly available, reliable thermal conductivity data are very scarce, which seriously hinders understanding of the mapping between polymer microstructure and thermal conductivity [<xref ref-type="bibr" rid="ref-18">18</xref>]. Machine learning, with its ability to extract useful correlations from limited data, offers a new way to address this core challenge: it can help overcome the limitations of traditional methods in handling multi-scale structures, vast chemical spaces, and complex structure-property relationships, serving as a powerful tool for key problems in polymer science [<xref ref-type="bibr" rid="ref-19">19</xref>,<xref ref-type="bibr" rid="ref-20">20</xref>].</p>
<p>Compared with traditional methods, machine learning benefits from its ability to process high-dimensional data and model non-linear relationships. In data-scarce scenarios it can use transfer learning and data augmentation to mine structure-property relationships from limited samples, and it handles multi-objective optimization problems (e.g., balancing mechanical strength, thermal stability, and biocompatibility) efficiently by locating Pareto-optimal solutions [<xref ref-type="bibr" rid="ref-21">21</xref>]. A notable example is the design of thermally conductive polymers: while traditional methods struggle to reconcile molecular chain regularity with process feasibility, machine learning models integrated over 1000 sets of experimental data to predict novel structures with 40% higher thermal conductivity, cutting the R&#x0026;D cycle from 2 years to 3 months [<xref ref-type="bibr" rid="ref-5">5</xref>&#x2013;<xref ref-type="bibr" rid="ref-7">7</xref>].</p>
<p>To address the limitations of traditional methods, tackle the core challenges of polymer science, and promote the paradigm shift of the discipline, this review examines the application progress of machine learning in polymer research, traces its development and current status, and distills efficient, practical methodologies and systematic workflows, with the aim of providing a useful reference for polymer researchers entering this field.</p>
</sec>
</sec>
<sec id="s2">
<label>2</label>
<title>Data Characterization and Preprocessing of Polymer Materials</title>
<p>Data characterization and preprocessing are the foundation of machine learning applications in polymer materials, and data quality directly determines the reliability and generalization ability of subsequent models. This process focuses on extracting, standardizing, and augmenting data from multiple sources (experimental measurements, computational simulations, literature mining) to construct structured datasets that meet the requirements of machine learning algorithms. The core challenge lies in addressing the data-specific issues caused by the complex molecular structures, variable physical and chemical properties, and non-linear structure-property relationships of polymer materials, including data scarcity, format inconsistency, and feature redundancy. As shown in <xref ref-type="table" rid="table-1">Table 1</xref>, multi-scale descriptors covering structural, physical, chemical, and multi-dimensional features provide quantitative tools for capturing polymer characteristics, while standardized data processing and augmentation further improve data utility, laying a solid foundation for accurate modeling.</p>
<table-wrap id="table-1">
<label>Table 1</label>
<caption>
<title>Classification and application overview of multi-scale descriptors for polymer materials.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Descriptor Category</th>
<th>Specific Descriptors</th>
<th>Limitations</th>
<th>Characterization Method/Source</th>
<th>Application Scenario</th>
</tr>
</thead>
<tbody>
<tr>
<td>Structural Features</td>
<td>Chemical composition of repeating units, bonding mode, sequence arrangement, stereoconfiguration</td>
<td>Oversimplifies atomic details; poor accuracy for fine structure analysis</td>
<td>BigSMILES [<xref ref-type="bibr" rid="ref-20">20</xref>], coarse-grained representation method [<xref ref-type="bibr" rid="ref-22">22</xref>], curlySMILES [<xref ref-type="bibr" rid="ref-23">23</xref>]</td>
<td>Polymer morphology characterization</td>
</tr>
<tr>
<td>Structural Features</td>
<td>Degree of polymerization, polydispersity, chain conformation</td>
<td>Limited compatibility with complex copolymer architectures; steep learning curve</td>
<td>SMILES combination modeling</td>
<td>Copolymer system characterization</td>
</tr>
<tr>
<td>Physical Features</td>
<td>Molecular refractive index, van der Waals surface area</td>
<td>Fails to capture dynamic molecular interactions; low adaptability to multi-component systems</td>
<td>43 key descriptors extracted by RDKit toolkit [<xref ref-type="bibr" rid="ref-24">24</xref>]</td>
<td>Prediction of physical and chemical properties</td>
</tr>
<tr>
<td>Physical Features</td>
<td>Atom type, number of bonded hydrogen atoms, atomic degree, implicit valence, aromaticity</td>
<td>Redundant descriptors; ignores macro-scale structural effects</td>
<td>Initial atomic feature vector of graph convolutional network [<xref ref-type="bibr" rid="ref-25">25</xref>]</td>
<td>Polymer property learning</td>
</tr>
<tr>
<td>Chemical Features</td>
<td>Electronic properties, spatial configuration</td>
<td>Dependent on network architecture; sensitive to input data quality</td>
<td>434 molecular descriptors extracted by RDKit [<xref ref-type="bibr" rid="ref-24">24</xref>]</td>
<td>Molecular structure analysis</td>
</tr>
<tr>
<td>Chemical Features</td>
<td>Micro-electronic structure, atomic information, force field parameters</td>
<td>High computational cost; information overlap between descriptors</td>
<td>320 physical descriptors extracted by polymer physical description operators [<xref ref-type="bibr" rid="ref-4">4</xref>]</td>
<td>Polymer system characterization</td>
</tr>
<tr>
<td>Multi-scale Features</td>
<td>Atomic-level (155), segment-level (197), molecular chain-level (59) descriptors</td>
<td>Complex descriptor integration; requires professional domain knowledge</td>
<td>Three-layer structure characterization method [<xref ref-type="bibr" rid="ref-9">9</xref>]</td>
<td>Dielectric constant research</td>
</tr>
<tr>
<td>Multi-scale Features</td>
<td>Atomic scale (108), QSPR level (99), morphological description (22)</td>
<td>Poor transferability across different polymer types; time-consuming to validate</td>
<td>Ramprasad three-layer characterization method [<xref ref-type="bibr" rid="ref-9">9</xref>]</td>
<td>Polymer material characterization</td>
</tr>
</tbody>
</table>
</table-wrap>
<sec id="s2_1">
<label>2.1</label>
<title>Molecular Descriptors and Feature Representation</title>
<p>The core of polymer data characterization is converting complex chemical structures into quantifiable data features [<xref ref-type="bibr" rid="ref-26">26</xref>,<xref ref-type="bibr" rid="ref-27">27</xref>], with molecular descriptors serving as the bridge between polymer structures and machine learning models (as shown in <xref ref-type="table" rid="table-1">Table 1</xref>). The diversity of polymer structures (composition, architecture, sequence) requires multi-dimensional descriptor systems to capture key information, and the selection and extraction of descriptors directly affect data quality and model performance [<xref ref-type="bibr" rid="ref-28">28</xref>].</p>

<p>As shown in <xref ref-type="fig" rid="fig-2">Fig. 2</xref>, SMILES and its extended formats are core tools for encoding polymer structures. Traditional SMILES syntax is limited in describing complex polymer morphologies, which led to the development of extended encoding formats such as BigSMILES and curlySMILES. These formats are more expressive and can encode linear, branched, random, block, alternating, and graft polymers. For example, BigSMILES extends the SMILES syntax to capture how repeating units are combined and how branches are attached, improving the consistency and completeness of structural data for polymers with multiple repeating units [<xref ref-type="bibr" rid="ref-29">29</xref>]. For copolymer systems, structures are encoded by combining the SMILES of each repeating unit, supplemented by key parameters such as degree of polymerization, polydispersity, and chain conformation to enrich the descriptor dimension [<xref ref-type="bibr" rid="ref-30">30</xref>].</p>
<fig id="fig-2">
<label>Figure 2</label>
<caption>
<title>Schematic diagram of the development of SMILES (Simplified Molecular Input Line Entry System) and its extended forms, CurlySMILES and BigSMILES. Traditional SMILES was proposed in 1988; CurlySMILES (2011) and BigSMILES (2019) were subsequently developed to address its shortcomings in representing complex polymer structures. These extensions characterize different polymer morphologies, such as linear and branched chains, more effectively and support the accurate description of polymer structures.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76492-fig-2.tif"/>
</fig>
<p>Molecular fingerprints provide a high-dimensional representation of polymer structures. Morgan fingerprints characterize molecular substructures by enumerating the circular atom environments around each atom up to a chosen radius, and the improved MFF format additionally records substructure frequencies [<xref ref-type="bibr" rid="ref-31">31</xref>]. Extended Connectivity Fingerprints (ECFP) convert monomer chemical structures into binary descriptor vectors, effectively capturing the distribution of key substructures in polymers [<xref ref-type="bibr" rid="ref-32">32</xref>]. In practice, the RDKit cheminformatics toolkit is used to extract descriptors from SMILES-encoded structures: 434 initial descriptors covering electronic properties, spatial configuration, and physical and chemical properties are extracted, and 43 key descriptors (including molecular refractive index and van der Waals surface area) are retained after Pearson correlation coefficient analysis and redundancy removal, improving data efficiency [<xref ref-type="bibr" rid="ref-33">33</xref>].</p>
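<p>The redundancy-removal step described above can be sketched as a greedy Pearson-correlation filter. This is a minimal illustration under stated assumptions, not the procedure of the cited work: the 0.9 threshold, the toy descriptor table, and the helper names <monospace>pearson</monospace> and <monospace>drop_correlated</monospace> are hypothetical, and in practice each column would hold a descriptor computed by a toolkit such as RDKit.</p>
<code language="python">
```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treated as uncorrelated (assumption)
    return cov / (sx * sy)


def drop_correlated(columns, threshold=0.9):
    """Greedy filter: keep a descriptor unless it is strongly correlated
    (|r| above the threshold) with a descriptor that was already kept."""
    kept = []
    for name, values in columns.items():
        if not any(abs(pearson(values, columns[k])) > threshold for k in kept):
            kept.append(name)
    return kept


# Toy descriptor table (hypothetical values, one entry per polymer).
descriptors = {
    "mol_weight": [100.0, 150.0, 200.0, 250.0],
    "heavy_atoms": [7.0, 10.5, 14.0, 17.5],  # proportional to mol_weight
    "ring_count": [1.0, 0.0, 2.0, 1.0],
}
selected = drop_correlated(descriptors)
print(selected)
```
</code>
<p>In this toy table, <monospace>heavy_atoms</monospace> is exactly proportional to <monospace>mol_weight</monospace> and is therefore discarded, while the weakly correlated <monospace>ring_count</monospace> is kept. The greedy order matters: the descriptor listed first in a correlated pair is the one that survives.</p>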
<p>As shown in <xref ref-type="fig" rid="fig-3">Fig. 3</xref>, graph representation methods add a further dimension to structural characterization [<xref ref-type="bibr" rid="ref-34">34</xref>]. Graph Convolutional Networks (GCN) use atomic-level data (atom type, number of bonded hydrogen atoms, atomic degree, implicit valence, aromaticity) as initial feature vectors and update the node features through iterative learning to capture polymer structure-property relationships [<xref ref-type="bibr" rid="ref-35">35</xref>]. A graph-based representation of molecular ensembles, combined with the Weighted Directed Message Passing Neural Network (wD-MPNN) architecture, parameterizes the underlying molecular distribution to capture the average graph structure of the repeating units [<xref ref-type="bibr" rid="ref-3">3</xref>]. For complex polymer systems, polymer physical description operators extract 320 physical descriptors from the monomer micro-electronic structure, atomic information, and force field parameters. Through statistical analysis and 100 rounds of random sequence feature screening, the data dimension is reduced to 20 optimized descriptors, balancing data richness and computational efficiency [<xref ref-type="bibr" rid="ref-4">4</xref>].</p>
<fig id="fig-3">
<label>Figure 3</label>
<caption>
<title>Different types of molecular representations for the same molecule [<xref ref-type="bibr" rid="ref-35">35</xref>]. (<bold>1</bold>) Fingerprint vector; (<bold>2</bold>) SMILES string; (<bold>3</bold>) Potential energy function; (<bold>4</bold>) Weighted graph of atoms and bonds; (<bold>5</bold>) Coulomb matrix; (<bold>6</bold>) Combination of bonds/fragments; (<bold>7</bold>) 3D geometry of atomic charges; (<bold>8</bold>) Electronic density.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76492-fig-3.tif"/>
</fig>
<p>Multi-scale descriptor systems have also been proposed. One scheme divides structural data into atomic-level (155 descriptors), segment-level (197 descriptors), and molecular chain-level (59 descriptors) features [<xref ref-type="bibr" rid="ref-36">36</xref>]. Another adopts a three-layer characterization strategy consisting of an atomic scale (108 descriptors, e.g., O1-C3-C4 segment data), a QSPR level (99 descriptors, e.g., van der Waals surface area data), and a morphological description level (22 descriptors, e.g., inter-ring topological distance data) [<xref ref-type="bibr" rid="ref-37">37</xref>].</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Data Standardization and Cleaning</title>
<p>Data standardization and cleaning are key steps to eliminate data noise and ensure consistency, addressing issues such as compatibility problems and non-uniform standards in polymer data from different sources [<xref ref-type="bibr" rid="ref-20">20</xref>]. As shown in <xref ref-type="fig" rid="fig-4">Fig. 4</xref>, high-quality data is the prerequisite for avoiding &#x201C;garbage in, garbage out&#x201D;, and standardized processing improves data comparability and usability [<xref ref-type="bibr" rid="ref-5">5</xref>]. Polymer characterization data often exists in the form of statistical indicators (molecular weight and its distribution, crystallinity), increasing the complexity of data processing [<xref ref-type="bibr" rid="ref-22">22</xref>].</p>
<fig id="fig-4">
<label>Figure 4</label>
<caption>
<title>Systematic preprocessing workflow of polymer data for machine learning.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76492-fig-4.tif"/>
</fig>
<p>Data preprocessing focuses on three core tasks: error correction, duplicate removal, and outlier handling. Error correction verifies and corrects deviations in experimental data and errors in simulation data; duplicate removal eliminates redundant entries introduced by multi-source integration; outlier handling identifies and treats abnormal data points using methods such as z-score and boxplot analysis [<xref ref-type="bibr" rid="ref-38">38</xref>]. Numerical data are rescaled (min-max scaling or z-score standardization) to a consistent range, while categorical data (polymer type, synthesis method) are converted into machine-readable form via one-hot or label encoding. Min-max scaling is widely used because it preserves the shape of the data distribution [<xref ref-type="bibr" rid="ref-39">39</xref>]. In model training, data are typically split into training and test sets at an 8:2 or 9:1 ratio, with the scaling parameters fitted on the training set only and then applied to the test set to avoid data leakage [<xref ref-type="bibr" rid="ref-34">34</xref>].</p>
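<p>The order of operations matters for leakage control, so a minimal sketch of the pipeline may help: outlier removal, an 8:2 split, and min-max scaling whose bounds are learned from the training split only. The Tg values, the z-score cutoff of 3, the random seed, and all function names are assumptions chosen for illustration.</p>
<code language="python"><![CDATA[
```python
import random
from statistics import mean, pstdev

def zscore_filter(values, limit=3.0):
    """Drop points whose |z-score| exceeds `limit` (an assumed cutoff)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= limit]

def split_train_test(values, test_fraction=0.2, seed=0):
    """Shuffle and split; test_fraction=0.2 corresponds to an 8:2 ratio."""
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_min_max(train):
    """Learn the scaling bounds from the training split only."""
    return min(train), max(train)

def apply_min_max(values, bounds):
    """Rescale with previously learned bounds; test values may leave [0, 1]."""
    low, high = bounds
    span = (high - low) or 1.0  # guard against a constant column
    return [(v - low) / span for v in values]

# Hypothetical Tg values in K with one obvious entry error (3730.0).
tg_values = [350.0, 362.0, 371.0, 355.0, 380.0, 368.0, 359.0,
             374.0, 365.0, 353.0, 377.0, 361.0, 3730.0]
cleaned = zscore_filter(tg_values)
train, test = split_train_test(cleaned, test_fraction=0.2)
bounds = fit_min_max(train)
train_scaled = apply_min_max(train, bounds)
test_scaled = apply_min_max(test, bounds)
```
]]></code>
<p>Fitting the bounds on the full dataset before splitting would let test-set extremes influence the training features, which is the leakage the text warns against.</p>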
<p>Data standardization relies on unified data systems and databases. To address data dispersion and non-standardization, the polymer research community has developed specialized database systems such as PoLyInfo and CRIPT [<xref ref-type="bibr" rid="ref-8">8</xref>]. The Polydat framework supports standardized integration of structural data and characterization parameters; BigSMILES format ensures consistent encoding of polymer repeating units and branch structure data [<xref ref-type="bibr" rid="ref-30">30</xref>]. The PoLyInfo database contains property data of &#x007E;100 polymers (glass transition temperature, melting point, density, thermal conductivity), which have undergone strict cleaning and standardization, improving model prediction accuracy [<xref ref-type="bibr" rid="ref-40">40</xref>]. <xref ref-type="table" rid="table-2">Table 2</xref> lists commonly used polymer datasets, covering various property data types and providing high-quality data resources for machine learning applications.</p>
<table-wrap id="table-2">
<label>Table 2</label>
<caption>
<title>Commonly used polymer datasets.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Dataset Name</th>
<th>Contained Data</th>
<th>Description</th>
<th>Web</th>
</tr>
</thead>
<tbody>
<tr>
<td>Polymer Genome Platform</td>
<td>Refractive Index (RI), dielectric properties, glass transition temperature (Tg)</td>
<td>Experimental data repository with 500&#x002B; polymer measurements for real-time property prediction</td>
<td><ext-link ext-link-type="uri" xlink:href="https://polymergenome.ecust.edu.cn/">https://polymergenome.ecust.edu.cn/</ext-link></td>
</tr>
<tr>
<td>Khazana</td>
<td>Computational materials data</td>
<td>Georgia Tech database for machine learning applications in polymer science</td>
<td><ext-link ext-link-type="uri" xlink:href="https://khazana.gatech.edu/dataset/">https://khazana.gatech.edu/dataset/</ext-link></td>
</tr>
<tr>
<td>Dortmund Database</td>
<td>Polymer thermophysical properties</td>
<td>Commercial reference database for thermal characteristics</td>
<td><ext-link ext-link-type="uri" xlink:href="https://ddbst.com/">https://ddbst.com/</ext-link></td>
</tr>
<tr>
<td>PoLyInfo</td>
<td>Multiscale polymer performance</td>
<td>NIMS Japan comprehensive polymer repository</td>
<td><ext-link ext-link-type="uri" xlink:href="https://polymer.nims.go.jp">https://polymer.nims.go.jp</ext-link></td>
</tr>
<tr>
<td>NIST Spectral Database</td>
<td>Synthetic polymer MALDI mass spectrometry</td>
<td>Spectral analysis database for polymer characterization</td>
<td><ext-link ext-link-type="uri" xlink:href="https://maldi.nist.gov">https://maldi.nist.gov</ext-link></td>
</tr>
<tr>
<td>CROW Polymer Database</td>
<td>Physical/mechanical/<break/>thermal/electrical properties</td>
<td>Broad-spectrum polymer properties reference</td>
<td><ext-link ext-link-type="uri" xlink:href="http://polymerdatabase.com">http://polymerdatabase.com</ext-link></td>
</tr>
<tr>
<td>Material Properties Database</td>
<td>Comparative material metrics</td>
<td>Industrial materials benchmark including polymers</td>
<td><ext-link ext-link-type="uri" xlink:href="https://www.makeitfrom.com">https://www.makeitfrom.com</ext-link></td>
</tr>
<tr>
<td>Mechanical Properties Dataset</td>
<td>Young&#x2019;s modulus, tensile strength, elongation (429 points)</td>
<td>Combined literature/MD simulation data for structure-property modeling</td>
<td><ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/datasets/purushottamnawale/materials">https://www.kaggle.com/datasets/purushottamnawale/materials</ext-link></td>
</tr>
<tr>
<td>Thermal Conductivity Dataset</td>
<td>Polymer chain descriptors, DFT calculations</td>
<td>Structure-thermal property relationships for novel polymer design</td>
<td><ext-link ext-link-type="uri" xlink:href="https://researchdata.edu.au/thermal-conductivity-dataset/3431817">https://researchdata.edu.au/thermal-conductivity-dataset/3431817</ext-link></td>
</tr>
<tr>
<td>Compatibility Dataset</td>
<td>Polymer-polymer interaction data (1000&#x002B; points)</td>
<td>Literature-mined classification data for blend miscibility</td>
<td><ext-link ext-link-type="uri" xlink:href="https://github.com/cloudflare/workers-sdk/issues/193">https://github.com/cloudflare/workers-sdk/issues/193</ext-link></td>
</tr>
<tr>
<td>Dielectric Multi-task Dataset</td>
<td>Permeability/diffusivity/solubility parameters</td>
<td>Fusion of high-fidelity experimental and low-fidelity simulation data</td>
<td><ext-link ext-link-type="uri" xlink:href="https://github.com/easezyc/Multitask-Recommendation-Library">https://github.com/easezyc/Multitask-Recommendation-Library</ext-link></td>
</tr>
<tr>
<td>Refractive Index Dataset</td>
<td>Hierarchical fingerprint data for 500 polymers</td>
<td>Multi-scale structural descriptors (atomic/segment/chain level)</td>
<td><ext-link ext-link-type="uri" xlink:href="https://refractiveindex.info/">https://refractiveindex.info/</ext-link></td>
</tr>
<tr>
<td>PI1M</td>
<td>Polymer structures, synthetic accessibility score</td>
<td>Polymer informatics benchmark of &#x007E;1 M polymers annotated with Schuffenhauer&#x2019;s SA scores</td>
<td><ext-link ext-link-type="uri" xlink:href="https://github.com/RUIMINMA1996/PI1M">https://github.com/RUIMINMA1996/PI1M</ext-link></td>
</tr>
<tr>
<td>Polymer Genome</td>
<td>Bandgap, dielectric constant, refractive index, atomization energy, Tg, solubility parameter, density</td>
<td>Computational and experimental polymer data for informatics and property prediction</td>
<td><ext-link ext-link-type="uri" xlink:href="https://www.polymergenome.org">https://www.polymergenome.org</ext-link></td>
</tr>
<tr>
<td>Polymer Property<break/>Predictor and Database</td>
<td>Flory-Huggins chi parameters, glass transition temperature (Tg)</td>
<td>A literature-extracted polymer database with chi parameters and Tg, for polymer informatics research</td>
<td><ext-link ext-link-type="uri" xlink:href="https://pppdb.uchicago.edu">https://pppdb.uchicago.edu</ext-link></td>
</tr>
<tr>
<td>Polymer Science<break/>Learning Center<break/>Spectral Database</td>
<td>Polymer FTIR, Raman, NMR spectra</td>
<td>Experimental spectral database with polymer-specific spectra for identification and structural analysis</td>
<td><ext-link ext-link-type="uri" xlink:href="https://pslc.uwsp.edu">https://pslc.uwsp.edu</ext-link></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Data quality control also involves strict screening criteria. For example, polymer structures whose glass transition temperature or melting point data show a standard deviation exceeding 30 K are excluded to ensure data reliability [<xref ref-type="bibr" rid="ref-41">41</xref>]. Datasets should also adhere to the FAIR principles (Findable, Accessible, Interoperable, Reusable), which requires systematic accumulation of experimental data or high-throughput data generation [<xref ref-type="bibr" rid="ref-13">13</xref>]. Challenges remain, however: open-source data often lack reaction parameters and complete characterization conditions [<xref ref-type="bibr" rid="ref-42">42</xref>], gaps that must be closed through data integration and supplementation.</p>
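<p>A screening rule of this kind is straightforward to express in code. The sketch below assumes that each polymer has several reported values, that the sample standard deviation is the relevant spread measure, and that single-measurement entries are kept; the cited work may define these details differently, and the polymer names and values are hypothetical.</p>
<code language="python"><![CDATA[
```python
from statistics import stdev

MAX_STD_K = 30.0  # exclusion threshold quoted in the text

def screen_by_spread(measurements, max_std=MAX_STD_K):
    """Keep polymers whose repeated measurements agree within max_std (in K).

    Returns the mean value of each retained polymer.
    """
    kept = {}
    for polymer, values in measurements.items():
        # One measurement gives no spread estimate; this sketch keeps such entries.
        spread = stdev(values) if len(values) > 1 else 0.0
        if spread <= max_std:
            kept[polymer] = sum(values) / len(values)
    return kept

# Hypothetical Tg reports (K) collected from different sources.
tg_reports = {
    "polymer_A": [373.0, 378.0, 370.0],
    "polymer_B": [300.0, 390.0, 345.0],  # sources disagree strongly
    "polymer_C": [420.0],
}
reliable = screen_by_spread(tg_reports)
```
]]></code>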
</sec>
<sec id="s2_3">
<label>2.3</label>
<title>Data Enhancement Technology</title>
<p>Data scarcity is a critical bottleneck restricting the performance of polymer machine learning models. Given the high cost and difficulty of polymer experimental data acquisition, data enhancement technologies are developed to expand dataset scale while maintaining data rationality and physical consistency, providing sufficient data support for model training.</p>
<p>Physical modeling-based data enhancement generates simulated data with physical significance. Liu et al. used the Fire Dynamics Simulator (FDS) to simulate cone calorimeter experiments, generating ignition time and peak heat release rate data consistent with physical laws, effectively expanding the training sample library [<xref ref-type="bibr" rid="ref-12">12</xref>]. This method avoids the limitations of experimental data acquisition and ensures the reliability of generated data through physical constraints.</p>
<p>Transfer learning and cross-dataset data reuse improve data utilization. In thermal conductivity prediction, researchers pre-trained 1000 neural network models using the PoLyInfo and QM9 databases, then fine-tuned them with limited target data, significantly improving prediction accuracy [<xref ref-type="bibr" rid="ref-36">36</xref>]. This approach leverages existing large-scale datasets to compensate for target data scarcity, enhancing model generalization.</p>
<p>Molecular fragment recombination enriches structural data diversity. The polyBERT model decomposes known polymer structures into fragments and recombines them to generate 100 million hypothetical PSMILES strings, expanding the polymer structure data space [<xref ref-type="bibr" rid="ref-37">37</xref>]. This chemical knowledge-guided data enhancement ensures the rationality of generated molecular structures while increasing data volume.</p>
<p>Resampling and generative model-based data expansion address small-sample issues. In natural fiber-reinforced polymer composite research, bootstrap resampling expanded 180 experimental samples to 1500 samples while retaining the statistical characteristics of the original data [<xref ref-type="bibr" rid="ref-14">14</xref>]. The graph grammar distillation framework decomposes amino acid structures into molecular graph grammar fragments and recombines them to explore the high-dimensional polymer space [<xref ref-type="bibr" rid="ref-19">19</xref>]. The PI1M database was built with a generative recurrent neural network that produced &#x007E;1 million hypothetical polymers [<xref ref-type="bibr" rid="ref-38">38</xref>], while large language models have been used to construct extended datasets covering four types of polymer property prediction tasks [<xref ref-type="bibr" rid="ref-39">39</xref>], significantly expanding data scale.</p>
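<p>Bootstrap expansion amounts to drawing rows with replacement until the target size is reached. The following sketch mirrors the 180-to-1500 figures quoted above but uses synthetic placeholder rows; the function name and the seed are assumptions, and the cited study may have used a different resampling variant.</p>
<code language="python"><![CDATA[
```python
import random

def bootstrap_expand(samples, target_size, seed=0):
    """Draw `target_size` rows from `samples` with replacement."""
    rng = random.Random(seed)
    return [rng.choice(samples) for _ in range(target_size)]

# Placeholder dataset: 180 (sample_id, property_value) rows with made-up values.
original = [(i, 20.0 + 0.1 * i) for i in range(180)]
expanded = bootstrap_expand(original, 1500)
```
]]></code>
<p>Every resampled row is a copy of an original row, so any held-out test set should be separated before resampling; otherwise copies of the same measurement can land on both sides of the split.</p>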
<p>Multi-source data integration and active learning optimize data efficiency. Research teams integrate multi-source small molecule databases, generate massive hypothetical structures of 8 polymer types and 1 copolymer type via rule-based polymerization reactions, and construct structured datasets for thermal, mechanical, and gas permeation property prediction [<xref ref-type="bibr" rid="ref-40">40</xref>]. In high-cost data acquisition scenarios, active learning combined with Bayesian optimization selects the most informative samples for experimentation, maximizing data utility. High-throughput computing and experimental collaboration (molecular dynamics simulation &#x002B; automated experiments) construct high-quality standardized datasets, providing systematic data solutions for polymer research [<xref ref-type="bibr" rid="ref-22">22</xref>].</p>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Application of Machine Learning Algorithms in Polymer Property Prediction</title>
<p>In recent years, machine learning algorithms have driven a paradigm shift in polymer property prediction, with algorithm selection, optimization, and innovation directly determining the accuracy and efficiency of property prediction. The core goal is to establish the mapping between polymer structural features and macroscopic properties through algorithmic modeling, addressing the limitations of traditional methods in handling high-dimensional, non-linear, and small-sample data. Current research focuses on three algorithmic directions: traditional machine learning algorithms for modeling feature-engineered data, deep learning algorithms for end-to-end automatic feature learning, and transfer/multi-task learning algorithms for data scarcity and multi-property correlation mining. As shown in <xref ref-type="table" rid="table-3">Table 3</xref>, the algorithm categories differ in model structure, computational efficiency, and applicable scenarios, providing diverse algorithmic tools for polymer property prediction.</p>
<table-wrap id="table-3">
<label>Table 3</label>
<caption>
<title>Performance comparison of different machine learning algorithms in material property prediction.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Algorithm Category</th>
<th>Representative Model</th>
<th>Predicted Performance Indicator</th>
<th>Applicable Scenario</th>
<th>Literature Reference</th>
</tr>
</thead>
<tbody>
<tr>
<td>Traditional Machine Learning</td>
<td>Support Vector Machine (SVM)</td>
<td>Polymer Tg prediction R<sup>2</sup> &#x003D; 0.91 [<xref ref-type="bibr" rid="ref-9">9</xref>]</td>
<td>Small sample, high-dimensional dataset analysis [<xref ref-type="bibr" rid="ref-5">5</xref>]</td>
<td>[<xref ref-type="bibr" rid="ref-6">6</xref>,<xref ref-type="bibr" rid="ref-10">10</xref>]</td>
</tr>
<tr>
<td>Traditional Machine Learning</td>
<td>Random Forest (RF)</td>
<td>Thermal conductivity prediction R<sup>2</sup> &#x003D; 0.97 [<xref ref-type="bibr" rid="ref-24">24</xref>]</td>
<td>Processing long input features and noisy data</td>
<td>[<xref ref-type="bibr" rid="ref-30">30</xref>,<xref ref-type="bibr" rid="ref-50">50</xref>]</td>
</tr>
<tr>
<td>Traditional Machine Learning</td>
<td>XGBoost</td>
<td>Concrete strength prediction R<sup>2</sup> &#x003D; 0.98 [<xref ref-type="bibr" rid="ref-42">42</xref>]</td>
<td>Automatically identifying feature interaction relationships [<xref ref-type="bibr" rid="ref-42">42</xref>]</td>
<td>[<xref ref-type="bibr" rid="ref-42">42</xref>]</td>
</tr>
<tr>
<td>Deep Learning</td>
<td>Graph Neural Network (GNN)</td>
<td>Tg prediction RMSE &#x003D; 30 K, R<sup>2</sup> &#x003D; 0.90 [<xref ref-type="bibr" rid="ref-25">25</xref>]</td>
<td>Processing molecular graph structure data [<xref ref-type="bibr" rid="ref-3">3</xref>]</td>
<td>[<xref ref-type="bibr" rid="ref-4">4</xref>,<xref ref-type="bibr" rid="ref-31">31</xref>]</td>
</tr>
<tr>
<td>Deep Learning</td>
<td>Transformer</td>
<td>PSMILES processing 100 times faster [<xref ref-type="bibr" rid="ref-37">37</xref>]</td>
<td>Chemical language model construction [<xref ref-type="bibr" rid="ref-37">37</xref>]</td>
<td>[<xref ref-type="bibr" rid="ref-37">37</xref>]</td>
</tr>
<tr>
<td>Deep Learning</td>
<td>Physics-Informed Neural Network</td>
<td>Thermal conductivity anisotropy prediction [<xref ref-type="bibr" rid="ref-43">43</xref>]</td>
<td>Multi-scale modeling [<xref ref-type="bibr" rid="ref-43">43</xref>]</td>
<td>[<xref ref-type="bibr" rid="ref-43">43</xref>]</td>
</tr>
<tr>
<td>Transfer Learning</td>
<td>Sim2Real strategy</td>
<td>Thermal conductivity prediction MAE &#x003D; 0.024 W&#x000B7;m<sup>&#x2212;1</sup>&#x000B7;K<sup>&#x2212;1</sup></td>
<td>Data-scarce scenarios</td>
<td>[<xref ref-type="bibr" rid="ref-18">18</xref>,<xref ref-type="bibr" rid="ref-48">48</xref>]</td>
</tr>
<tr>
<td>Multi-task Learning</td>
<td>polyBERT</td>
<td>Multi-attribute joint prediction [<xref ref-type="bibr" rid="ref-37">37</xref>]</td>
<td>Mining associations between attributes [<xref ref-type="bibr" rid="ref-32">32</xref>]</td>
<td>[<xref ref-type="bibr" rid="ref-32">32</xref>,<xref ref-type="bibr" rid="ref-44">44</xref>]</td>
</tr>
</tbody>
</table>
</table-wrap>
<sec id="s3_1">
<label>3.1</label>
<title>Traditional Machine Learning Methods</title>
<p>Traditional machine learning algorithms play a foundational role in polymer property prediction, relying on feature engineering to extract key structural information and establish mapping relationships with properties. These algorithms are characterized by clear principles, efficient computing, and strong adaptability to small-sample data, making them widely used in early polymer informatics research.</p>
<p>Support Vector Machine (SVM) excels in non-linear modeling and high-dimensional data analysis. By constructing an optimal hyperplane in the high-dimensional feature space, SVM realizes classification and regression tasks, with the Gaussian radial basis function (RBF) as the most commonly used kernel function for polymer property prediction [<xref ref-type="bibr" rid="ref-5">5</xref>]. In polymer glass transition temperature (Tg) and electrostrictive property prediction, SVM models achieve high prediction accuracy by optimizing kernel parameters (&#x03B3;) and regularization parameters (C), balancing model complexity and generalization ability [<xref ref-type="bibr" rid="ref-5">5</xref>]. For transverse mechanical property prediction of Fiber-Reinforced Polymer (FRP) composites, SVM shows excellent cross-system generalization, adapting to different fiber types and manufacturing processes, with prediction accuracy significantly exceeding traditional theoretical analysis methods [<xref ref-type="bibr" rid="ref-40">40</xref>]. The algorithm&#x2019;s structural risk minimization principle makes it particularly suitable for small-sample, high-dimensional polymer datasets.</p>
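<p>To make the role of the kernel parameter &#x03B3; concrete, the Gaussian RBF kernel k(x, y) = exp(-&#x03B3;|x - y|<sup>2</sup>) can be written out directly. The sketch below is a plain-Python illustration with hypothetical, already-scaled descriptor vectors; it is not an SVM implementation, and the regularization parameter C, which acts in the optimizer rather than in the kernel, is not shown.</p>
<code language="python"><![CDATA[
```python
from math import exp

def rbf_kernel(x, y, gamma):
    """Gaussian RBF similarity exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * squared_distance)

def gram_matrix(rows, gamma):
    """Pairwise kernel matrix that a kernel method would operate on."""
    return [[rbf_kernel(a, b, gamma) for b in rows] for a in rows]

# Hypothetical scaled descriptor vectors for three polymers.
polymers = [[0.10, 0.80], [0.15, 0.75], [0.90, 0.20]]
narrow = gram_matrix(polymers, gamma=10.0)  # large gamma: only near neighbours look similar
wide = gram_matrix(polymers, gamma=0.1)     # small gamma: almost everything looks similar
```
]]></code>
<p>A large &#x03B3; yields a flexible model that can overfit small datasets, while a small &#x03B3; yields a smoother one, which is why &#x03B3; and C are usually tuned together.</p>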
<p>Gradient Boosting Decision Tree (GBDT) captures complex correlations among multiple factors through stepwise boosting. By iteratively fitting weak learners to the residuals of the current ensemble, GBDT reduces prediction error and improves generalization, and it has produced notable results in modeling the relationship between polymer sequence regulation and reactivity ratios [<xref ref-type="bibr" rid="ref-15">15</xref>]. Chen Mao&#x2019;s research team built a copolymer structure prediction platform based on GBDT, establishing a quantitative correlation model between copolymer sequence distribution and monomer reactivity ratios in multi-component copolymerization. The algorithm outputs precise reactivity ratio values from sparse experimental data and optimizes monomer feeding strategies with sequence uniformity in mind, drawing on its strength in processing high-dimensional structured features [<xref ref-type="bibr" rid="ref-15">15</xref>]. Among other tree ensembles, the Random Forest (RF) model achieves a coefficient of determination (R<sup>2</sup>) of 0.97 in polymer thermal conductivity prediction, comparable to the CatBoost model, and its feature importance ranking identifies the key structural descriptors affecting thermal conductivity [<xref ref-type="bibr" rid="ref-24">24</xref>].</p>
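<p>The residual-correction idea behind gradient boosting can be shown with depth-one trees (stumps) on a single feature. This is a deliberately small sketch for squared-error loss, with made-up data and assumed settings (30 rounds, learning rate 0.3); production GBDT libraries add deeper trees, regularization, and subsampling.</p>
<code language="python"><![CDATA[
```python
def fit_stump(x, residuals):
    """Find the single threshold split on x that best fits the residuals."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # the largest value would leave the right side empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def stump_predict(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean

def fit_boosted(x, y, n_rounds=30, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [pi + learning_rate * stump_predict(stump, xi)
                       for pi, xi in zip(predictions, x)]
    return base, stumps, learning_rate

def predict(model, xi):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, xi) for s in stumps)

def mse(model, x, y):
    return sum((predict(model, xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(y)

# Made-up 1-D data with a jump that a single mean cannot describe.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.2, 1.9, 3.2, 3.8, 8.1, 8.9, 9.7, 11.0]
baseline = (sum(y) / len(y), [], 0.3)  # mean-only model with no stumps
model = fit_boosted(x, y)
```
]]></code>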
<p>Support Vector Regression (SVR) extends SVM to continuous value prediction. In polymer band gap prediction, Zhu et al. used SVR with Gaussian RBF kernel to achieve R<sup>2</sup> &#x003D; 0.91, outperforming traditional statistical methods such as partial least squares and multiple linear regression [<xref ref-type="bibr" rid="ref-9">9</xref>]. In electrostriction and Curie temperature prediction, SVR optimizes the trade-off between model complexity and training error, constructing reliable prediction models for complex polymer systems [<xref ref-type="bibr" rid="ref-41">41</xref>]. The algorithm&#x2019;s strong generalization ability in small-sample scenarios makes it a core tool for polymer property regression tasks.</p>
<p>Extreme Gradient Boosting (XGBoost) is a regularized gradient boosting implementation that builds decision trees iteratively. In geopolymer concrete strength prediction, the XGBoost model achieves R<sup>2</sup> &#x003D; 0.98, significantly outperforming SVM (0.91) and the multilayer perceptron (MLP, 0.88) [<xref ref-type="bibr" rid="ref-42">42</xref>]. Because each new tree corrects the errors of the previous ones, the algorithm automatically captures complex interactions between polymer structural features and performance indicators, and it is widely used to predict the power conversion efficiency (PCE) of organic photovoltaic materials [<xref ref-type="bibr" rid="ref-43">43</xref>]. XGBoost&#x2019;s L1/L2 regularization and built-in handling of missing values improve its robustness to noisy polymer data, making it suitable for multi-source data integration modeling.</p>
<p>Traditional machine learning algorithms also show unique advantages in polymer phase identification and self-assembly behavior prediction. SVM with a polynomial kernel successfully distinguishes the phases of two-dimensional spin models (e.g., the ferromagnetic Ising model), learning mathematical expressions of physical discriminators (order parameters, Hamiltonian constraints) that help in understanding polymer phase transition behavior [<xref ref-type="bibr" rid="ref-44">44</xref>]. RF accurately classifies new PISA (Polymerization-Induced Self-Assembly) systems by analyzing key features such as monomer composition and polymerization conditions, demonstrating strong adaptability to complex polymer system classification tasks [<xref ref-type="bibr" rid="ref-35">35</xref>].</p>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Deep Learning Technology</title>
<p>Deep learning algorithms drive revolutionary progress in polymer property prediction by enabling automatic feature learning and end-to-end modeling, addressing the limitations of traditional algorithms in handling complex structural data (molecular graphs, sequences). With their powerful non-linear fitting and high-dimensional data processing capabilities, deep learning models excel in capturing multi-scale structure-property relationships of polymers.</p>
<p>Graph Neural Networks (GNNs) specialize in processing molecular graph-structured data. Chemprop, a representative Message Passing Neural Network (MPNN) architecture, models small organic molecules and polymer repeating units efficiently through a directed message passing mechanism [<xref ref-type="bibr" rid="ref-3">3</xref>]. Its improved version, wD-MPNN, enhances the modeling accuracy of polymer collective properties by optimizing the message passing rules. The hybrid GCN-NN (Graph Convolutional Network &#x002B; Neural Network Regression) model achieves Tg prediction with RMSE &#x003D; 30 K and R<sup>2</sup> &#x003D; 0.90 [<xref ref-type="bibr" rid="ref-25">25</xref>] but performs relatively poorly in elastic modulus (E) prediction, reflecting the sensitivity of GNN architectures to property-specific structural features. Because GNNs model atomic-level interactions directly, they are well suited to mining polymer structure-property relationships.</p>
<p>Generative deep learning models enable polymer inverse design and structure generation. Variational Autoencoders (VAEs) integrate property estimation models into the latent space, so that molecular structures can be inferred from performance targets [<xref ref-type="bibr" rid="ref-5">5</xref>]. Generative Adversarial Networks (GANs) generate copolymer structures with a specified Young&#x2019;s modulus through adversarial training between a generator and a discriminator, providing new tools for polymer structure design [<xref ref-type="bibr" rid="ref-31">31</xref>]. Such generative paradigms have also been extended to biomimetic intelligent thermal management materials, where machine learning-driven discovery combined with nature-inspired design has yielded advanced materials for thermal regulation, further supporting the versatility of generative models in functional polymer development. These generative models require large-scale training data to learn chemical rules and SMILES syntax, and they have achieved remarkable results in polymer antifouling material design, with predicted and measured values showing R<sup>2</sup> &#x003D; 0.9869 [<xref ref-type="bibr" rid="ref-45">45</xref>]. Beyond polymer-specific tasks, machine learning approaches (e.g., for geopolymer concrete strength prediction) demonstrate that the same data-driven modeling logic transfers to related composite systems, where similar feature-learning and optimization principles support accurate property forecasting. The end-to-end learning mode of generative models overcomes the limitations of traditional descriptor-based methods in capturing complex polymer structures.</p>
<p>As shown in <xref ref-type="fig" rid="fig-5">Fig. 5</xref>, the Transformer architecture has driven recent breakthroughs in polymer informatics. The polyBERT chemical language model, based on the DeBERTa architecture, converts PSMILES strings into numerical fingerprint representations through a multi-head self-attention mechanism, with a prediction speed two orders of magnitude faster than approaches based on traditional handcrafted fingerprints [<xref ref-type="bibr" rid="ref-37">37</xref>]. The model mines chemical patterns and sequence dependencies in PSMILES data, achieving efficient multi-property prediction. The MMPolymer framework adopts a multi-modal multi-task pre-training strategy, integrating CNNs (for spatial structure features) and RNNs (for sequence features), and fuses multi-source data through a multi-head attention mechanism to reveal deep correlations between polymer sequences and properties [<xref ref-type="bibr" rid="ref-46">46</xref>]. Transformers&#x2019; parallel computing capability and ability to capture long-range dependencies make them suitable for large-scale polymer data processing.</p>
<fig id="fig-5">
<label>Figure 5</label>
<caption>
<title>Schematic diagrams of the principles of machine learning methods used for polymer materials: CNN, LSTM, GCN, and VAE [<xref ref-type="bibr" rid="ref-5">5</xref>].</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76492-fig-5.tif"/>
</fig>
<p>Recurrent Neural Networks (RNNs) and their variants excel in sequence data modeling. In processing text input data such as SMILES strings, RNNs and LSTMs (Long Short-Term Memory) effectively capture the sequence dependence of polymer chains, providing tools for understanding polymer structure-activity relationships [<xref ref-type="bibr" rid="ref-20">20</xref>]. Physics-Informed Neural Networks (PINNs) integrate physical laws into the neural network loss function, combining molecular dynamics simulation and experimental data to achieve breakthroughs in phase transition interface evolution and thermal conductivity anisotropy prediction [<xref ref-type="bibr" rid="ref-47">47</xref>]. PINNs enhance model interpretability and extrapolation ability by incorporating physical constraints, addressing the &#x201C;black box&#x201D; problem of traditional deep learning models and promoting multi-scale polymer modeling.</p>
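<p>The way a physical law enters the loss function can be illustrated with a toy composite loss: a data-fit term plus a weighted penalty on the residual of a governing equation. The sketch assumes steady one-dimensional heat conduction with constant conductivity (so the temperature profile should satisfy d<sup>2</sup>T/dx<sup>2</sup> = 0), approximates the derivative by finite differences, and uses two hand-written candidate profiles in place of a trained network. None of these choices is taken from the cited work.</p>
<code language="python"><![CDATA[
```python
def second_derivative(f, x, h=1e-2):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def pinn_loss(model, x_data, t_data, x_collocation, weight=1.0):
    """Data misfit plus weighted residual of the assumed law T''(x) = 0."""
    data_term = sum((model(x) - t) ** 2 for x, t in zip(x_data, t_data)) / len(x_data)
    physics_term = (sum(second_derivative(model, x) ** 2 for x in x_collocation)
                    / len(x_collocation))
    return data_term + weight * physics_term

# Two boundary measurements (hypothetical) and interior collocation points.
x_data, t_data = [0.0, 1.0], [300.0, 400.0]
x_collocation = [0.25, 0.5, 0.75]

def linear_profile(x):
    return 300.0 + 100.0 * x       # satisfies the assumed physics

def curved_profile(x):
    return 300.0 + 100.0 * x * x   # fits the same data but violates it

loss_linear = pinn_loss(linear_profile, x_data, t_data, x_collocation)
loss_curved = pinn_loss(curved_profile, x_data, t_data, x_collocation)
```
]]></code>
<p>Both profiles match the two measurements exactly, so the data term alone cannot separate them; only the physics term penalizes the curved one, which is the mechanism by which such constraints aid extrapolation when data are sparse.</p>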
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Transfer Learning and Multi-Task Learning</title>
<p>Transfer learning and multi-task learning address key challenges in polymer property prediction (data scarcity, multi-property correlation) through algorithmic innovation, expanding the application scope of machine learning in polymer science.</p>
<p>Transfer learning mitigates data scarcity via knowledge transfer. The Sim2Real strategy pre-trains models on large-scale simulation data to learn general polymer structure-property relationships, then fine-tunes them with a small amount of experimental data to adapt to target tasks [<xref ref-type="bibr" rid="ref-18">18</xref>]. In polymer thermal conductivity prediction, the Wu team constructed a pre-trained model using the PoLyInfo and QM9 databases, achieving MAE &#x003D; 0.024 W&#x000B7;m<sup>&#x2212;1</sup>&#x000B7;K<sup>&#x2212;1</sup> with only 28 experimental data points for fine-tuning, significantly outperforming models trained directly on small datasets [<xref ref-type="bibr" rid="ref-48">48</xref>]. In membrane electrode assembly research, transfer learning established a high-performance prediction model with only 12 samples, greatly reducing experimental costs [<xref ref-type="bibr" rid="ref-49">49</xref>]. Key technical points of transfer learning include domain adaptation (aligning data distributions between pre-training and target tasks) and transfer boundary control (avoiding negative transfer) [<xref ref-type="bibr" rid="ref-50">50</xref>]. For example, the Mossa team adjusted the 3D convolutional neural network architecture when transferring a surfactant classification model to the Nafion system, improving model adaptability to multi-scale disordered polymer materials [<xref ref-type="bibr" rid="ref-51">51</xref>].</p>
<p>Multi-task learning enhances model performance by mining inter-property correlations. By training multiple related property prediction tasks simultaneously, multi-task learning enables feature sharing between tasks, improving model generalization and prediction accuracy. The Ramprasad team found that joint training of glass transition temperature, melting temperature, and degradation temperature enables the neural network to capture intrinsic attribute correlations, enhancing prediction performance for each individual task [<xref ref-type="bibr" rid="ref-32">32</xref>]. The polyBERT chemical language model adopts a multi-task framework, mapping molecular fingerprints to multiple polymer properties and constructing an end-to-end informatics pipeline two orders of magnitude faster than traditional methods [<xref ref-type="bibr" rid="ref-37">37</xref>]. Research shows that encoding target attributes into feature inputs (e.g., one-hot vectors) is more effective than separate prediction, as it leverages inter-property physical correlations [<xref ref-type="bibr" rid="ref-49">49</xref>]. The effectiveness of multi-task learning depends on task relevance&#x2014;tasks with physical correlations (e.g., different temperature-related properties) achieve better feature sharing [<xref ref-type="bibr" rid="ref-52">52</xref>].</p>
<p>The integration of transfer and multi-task learning further optimizes algorithm performance. The TransPolymer framework pre-trains on a large amount of unlabeled polymer data through Masked Language Modeling (MLM), laying a foundation for multi-task property prediction [<xref ref-type="bibr" rid="ref-50">50</xref>]. The MMPolymer model integrates 1D sequence and 3D structure data, adopts a multi-modal multi-task pre-training strategy, and aligns cross-modal features through contrastive learning to enhance model generalization [<xref ref-type="bibr" rid="ref-46">46</xref>]. The Yoshida team combined transfer learning and Bayesian optimization to establish a quantitative relationship between polymer structure and thermal conductivity, overcoming data volume limitations [<xref ref-type="bibr" rid="ref-52">52</xref>]. These hybrid algorithmic strategies provide systematic solutions for complex polymer property prediction tasks.</p>
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>New Ideas for Data-Driven Polymer Material Design by Machine Learning</title>
<p>Machine learning enables researchers to analyze in depth the complex correlations between polymer structures and properties, which has substantially changed the traditional material R&#x0026;D model. Materials science is undergoing a data-driven paradigm shift, especially in the design of polymer materials. In contrast to trial-and-error methods that rely on accumulated experience, modern data-driven methods build predictive machine learning models by integrating multi-scale modeling data, high-throughput experimental data, and increasingly complete material databases. This approach shows significant practical advantages: it not only greatly shortens the time and funding required for new material R&#x0026;D but also, more importantly, reveals structure-property relationships that are difficult to capture with traditional research methods. As shown in <xref ref-type="table" rid="table-3">Table 3</xref>, the three types of methods, reverse design, high-throughput screening, and multi-objective optimization, show complementary value in solving the structure-property relationship problem in material genome engineering. The comparison covers the core technical methods, typical application cases, advantages, and disadvantages of these intelligent design strategies for polymer materials, providing methodological guidance for the directional development of new functional polymers. Notably, the application scope of these methods has expanded from the optimization of a single performance index to more challenging problems such as multi-objective collaborative design, providing strong technical support for the directional development of functional polymer materials (as shown in <xref ref-type="table" rid="table-4">Table 4</xref>).</p>
<table-wrap id="table-4">
<label>Table 4</label>
<caption>
<title>Comparison of intelligent design strategies and technologies for polymer materials.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Design Strategy</th>
<th>Core Technical Method</th>
<th>Application Case</th>
<th>Advantage</th>
<th>Limitation</th>
</tr>
</thead>
<tbody>
<tr>
<td>Reverse Design</td>
<td>Genetic Algorithm (GA) [<xref ref-type="bibr" rid="ref-22">22</xref>], Artificial Neural Network (ANN) [<xref ref-type="bibr" rid="ref-36">36</xref>], Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) [<xref ref-type="bibr" rid="ref-5">5</xref>]</td>
<td>Predicting polymer structures oriented by dielectric properties [<xref ref-type="bibr" rid="ref-22">22</xref>], developing high-conductivity glassy polymer composites [<xref ref-type="bibr" rid="ref-5">5</xref>], screening polymers for thermal conductivity [<xref ref-type="bibr" rid="ref-36">36</xref>]</td>
<td>Realizing reverse derivation oriented by target properties [<xref ref-type="bibr" rid="ref-15">15</xref>], handling multi-objective optimization problems [<xref ref-type="bibr" rid="ref-53">53</xref>], revealing in-depth structure-property relationships [<xref ref-type="bibr" rid="ref-23">23</xref>]</td>
<td>Difficulty in accurately characterizing polymer chain structures and condensed state structures, lack of data on new polymer structures [<xref ref-type="bibr" rid="ref-54">54</xref>]</td>
</tr>
<tr>
<td>High-Throughput Virtual Screening</td>
<td>Bayesian Optimization combined with Coarse-Grained Model [<xref ref-type="bibr" rid="ref-5">5</xref>], polyBERT Model [<xref ref-type="bibr" rid="ref-17">17</xref>], High-Throughput Phase Field Calculation Method [<xref ref-type="bibr" rid="ref-55">55</xref>]</td>
<td>Screening PEO-based solid polymer electrolytes [<xref ref-type="bibr" rid="ref-5">5</xref>], evaluating 8 million polyimides [<xref ref-type="bibr" rid="ref-25">25</xref>], predicting 100 million hypothetical polymers [<xref ref-type="bibr" rid="ref-17">17</xref>]</td>
<td>Greatly shortening the R&#x0026;D cycle [<xref ref-type="bibr" rid="ref-20">20</xref>], revealing the influence mechanism of interface effects [<xref ref-type="bibr" rid="ref-55">55</xref>], establishing a quantitative &#x201C;building block-structure-property&#x201D; relationship [<xref ref-type="bibr" rid="ref-56">56</xref>]</td>
<td>Relying on high-quality computational simulation data [<xref ref-type="bibr" rid="ref-22">22</xref>], high cost of partial experimental verification [<xref ref-type="bibr" rid="ref-56">56</xref>]</td>
</tr>
</tbody>
</table>
</table-wrap>
<sec id="s4_1">
<label>4.1</label>
<title>Reverse Design Strategy</title>
<p>The reverse design strategy starts from target properties and infers the molecular structures that meet specific needs. Compared with traditional forward design, this strategy markedly improves the efficiency of material R&#x0026;D and is especially suited to multi-objective optimization problems. A typical application is the machine learning-assisted polymerization inverse analysis platform, which infers polymerization conditions from the target molecular weight and molecular weight distribution and is applicable to a variety of reactant structures, including monomers and initiators [<xref ref-type="bibr" rid="ref-7">7</xref>]. As shown in <xref ref-type="fig" rid="fig-6">Fig. 6</xref>, by establishing a quantitative model relating polymerization reaction conditions to experimental results, this method achieves an accurate mapping between the high-dimensional structure space and the experimental parameter space, providing a scientific basis for controlled synthesis.</p>
<fig id="fig-6">
<label>Figure 6</label>
<caption>
<title>A general machine learning workflow for the inverse design of polymers begins by generating candidate structures (e.g., via a generator model). These structures are then fed into a property predictor. The algorithm iteratively refines the candidates by comparing the predicted properties with the targets until an optimal polymer structure is identified [<xref ref-type="bibr" rid="ref-20">20</xref>].</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76492-fig-6.tif"/>
</fig>
<p>Black-box optimization algorithms such as the Genetic Algorithm (GA) and Bayesian Optimization are key technologies for implementing reverse design. The Ramprasad research team computationally generated more than 200 polymers by linearly combining 7 kinds of polymer segments, and used the Genetic Algorithm to predict polymer structures oriented by dielectric properties [<xref ref-type="bibr" rid="ref-22">22</xref>]. Mannodi-Kanakkithodi and co-workers combined machine learning prediction with the Genetic Algorithm to develop new polymers with specific functions [<xref ref-type="bibr" rid="ref-53">53</xref>]. These results confirm that the reverse design strategy can effectively explore both the chemical structure space and the reaction condition space, and can recommend polymer structures and synthesis parameters that meet the target properties. The systematic polymer synthesis platform (SPP) developed by the PolyMao team further verifies the practicality of this method: its machine learning-based inverse synthesis analysis infers synthesis instructions from the target molecular weight results [<xref ref-type="bibr" rid="ref-54">54</xref>].</p>
<p>The HELAO framework&#x2019;s modular autonomous feedback-loop strategy enables reverse design in materials science by integrating automated synthesis, high-throughput characterization, and data-driven models to link structures with target properties, using real-time feedback and optimization (e.g., active learning) to refine the design space. It has supported narrowing optimal parameters from large candidate pools for functional materials, addressing &#x201C;structure-property&#x201D; complexity.</p>
<p>Deep learning is increasingly applied in reverse design, with Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) being especially prominent. These models learn a latent representation space of polymer materials and generate new candidate structures through interpolation or perturbation. A research team combined GANs and VAEs with Gaussian Process (GP) regression to develop high-conductivity glassy polymer composites. The TransPolymer model developed by the Farimani team is based on the Transformer architecture and can parse the sequence and topological structure information implied in polymer SMILES strings, providing a tool for the inverse design of high-performance polymer materials [<xref ref-type="bibr" rid="ref-5">5</xref>]. These deep learning methods learn end to end, which alleviates the difficulty that traditional descriptor methods have in capturing the complex structural features of polymers.</p>
<p>As shown in <xref ref-type="fig" rid="fig-7">Fig. 7</xref>, although the reverse design strategy has made important breakthroughs, there are still many technical bottlenecks in practical applications. The complexity of polymer chain structures and condensed state structures makes it difficult to accurately characterize statistical parameters such as molecular weight distribution, sequence structure, and topological structure. In addition, the open access restrictions of existing polymer databases and the lack of data on new polymer structures also bring challenges to the construction of initial datasets for reverse design [<xref ref-type="bibr" rid="ref-55">55</xref>]. Future research needs to focus on the development of multi-objective collaborative optimization algorithms for materials and deepen the cross-integration of machine learning technology and polymer materials to meet the inverse design needs of complex systems such as ladder and cross-linked polymers [<xref ref-type="bibr" rid="ref-23">23</xref>].</p>
<fig id="fig-7">
<label>Figure 7</label>
<caption>
<title>(<bold>I</bold>) The SPP platform operates through a streamlined workflow: first, an ML model is built to correlate synthesis conditions with results (<bold>a</bold>&#x2013;<bold>c</bold>); this model is then used in reverse to pinpoint the optimal conditions needed to achieve target polymer properties (<bold>c</bold>&#x2013;<bold>e</bold>). (<bold>II</bold>) In practice, for PET-RAFT polymerization, the platform analyzes a dataset of substrate structures and molecular weights to provide specific instructions on feed ratio, light source, and reaction time. (<bold>III</bold>) The platform&#x2019;s performance was validated by comparing multiple ML algorithms (Ridge, SVM, kNN, XGB, Neural Network, Random Forest), with their predictive accuracy assessed via RMSE and R<sup>2</sup> metrics [<xref ref-type="bibr" rid="ref-37">37</xref>].</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76492-fig-7.tif"/>
</fig>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>High-Throughput Virtual Screening</title>
<p>Machine learning-driven high-throughput virtual screening is reshaping the paradigm of polymer material R&#x0026;D. By integrating computational simulation and data-driven methods, this technology greatly improves the efficiency of material discovery. Its core lies in using first-principles calculations or molecular dynamics simulations to obtain the dynamic and thermodynamic properties of polymer three-dimensional structures, and converting complex molecular information into computable numerical representations. These representations provide a rich data foundation for the construction of machine learning models [<xref ref-type="bibr" rid="ref-22">22</xref>]. Taking PEO-based solid polymer electrolytes (SPEs) as an example, a research team adopted a strategy combining Bayesian optimization and coarse-grained models to identify a material system with excellent lithium ion conductivity [<xref ref-type="bibr" rid="ref-5">5</xref>]. More notably, by establishing a quantitative relationship model between monomer structure and hygroscopicity, critical low thermal expansion rate, and tensile modulus, researchers can not only quickly screen target structures but also reveal the key structural features affecting performance through data mining [<xref ref-type="bibr" rid="ref-2">2</xref>].</p>
<p>High-throughput experimental technologies that complement virtual screening are developing in diverse directions. From continuous flow systems to microreactor arrays, these parallel experimental platforms can efficiently generate verification data. When these experimental data are combined with active learning algorithms or Bayesian optimization frameworks, the predictive ability of the model can be significantly improved [<xref ref-type="bibr" rid="ref-20">20</xref>]. In the field of organic optoelectronic materials, high-throughput virtual screening shows unique advantages. Yang&#x2019;s research team identified 10 new polymers with excellent mechanical properties by systematically evaluating 8 million hypothetical polyimides, and their predictions were verified by molecular dynamics simulations [<xref ref-type="bibr" rid="ref-25">25</xref>]. A similar technical route has also made progress in research on the CO<sub>2</sub> separation performance of mixed matrix membranes (MOF-polymers). By systematically regulating the composition and structure parameters of polymers and MOFs, researchers have designed new separation materials with high selectivity and adsorption capacity [<xref ref-type="bibr" rid="ref-56">56</xref>].</p>
<p>Recent progress in chemoinformatics has opened up a new route for high-throughput screening. The polyBERT model developed by the Kuenneth team has achieved multi-property prediction for 100 million hypothetical polymers. This deep learning method based on SMILES strings has greatly expanded the explorable range of polymer space [<xref ref-type="bibr" rid="ref-37">37</xref>]. By establishing a non-linear mapping between molecular fingerprints and performance parameters, this model shows excellent accuracy in predicting the thermal conductivity of materials in the PoLyInfo and PI1M databases. Notably, through high-precision molecular dynamics verification, the research team confirmed 107 high-performance materials with thermal conductivity exceeding 20 W&#x000B7;m<sup>&#x2212;1</sup>&#x000B7;K<sup>&#x2212;1</sup> [<xref ref-type="bibr" rid="ref-4">4</xref>]. In the field of high-temperature resistant resins, researchers have established a dual-model evaluation system, which addresses the joint optimization of processing performance and heat resistance of virtual polymer resins and provides a new approach for the rapid development of silicon-containing aryl acetylene resins [<xref ref-type="bibr" rid="ref-57">57</xref>].</p>
<p>The introduction of the material genome concept marks the entry of high-throughput screening into a stage of systematic development. The polymer material genome platform constructed by the team of Professor Lin Jiaping from East China University of Science and Technology integrates the performance data of more than 30,000 polymers. By establishing a quantitative structure-activity relationship of &#x201C;building blocks-structure-properties&#x201D;, it enables intelligent material design [<xref ref-type="bibr" rid="ref-58">58</xref>]. In research on dielectric composites, the combination of the high-throughput phase field calculation method with a data-driven strategy establishes a prediction model of dielectric properties by introducing interface phase parameters. This multi-scale calculation method not only reveals how interface effects influence energy density but also provides theoretical guidance for the interface engineering design of nanocomposites [<xref ref-type="bibr" rid="ref-59">59</xref>].</p>
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Multi-Objective Optimization Design</title>
<p>The design of polymer materials usually involves the joint optimization of multiple performance indicators, which often constrain one another in complex ways. Multi-objective optimization provides a systematic way to solve this problem, and its key lies in identifying the Pareto optimal solution set, that is, the set of solutions for which no objective can be improved without worsening at least one other objective. Taking the design of polymer hybrid electrolytes as an example, the Ganesan research team used the weighting method to balance ion transport performance and mechanical properties. By systematically comparing the experimental results under different weight conditions, the optimal material formula was finally obtained [<xref ref-type="bibr" rid="ref-9">9</xref>]. Although this method is easy to operate, the choice of weight coefficients often depends on the subjective judgment of researchers, making it difficult to accurately reflect the intrinsic relationship between performance indicators. In contrast, multi-objective genetic algorithms can directly explore the Pareto frontier. For example, by introducing fast non-dominated sorting and an elite retention strategy, the NSGA-II algorithm achieved the dual goals of maximizing the number-average molecular weight and minimizing the polydispersity index in the optimization of an epoxy resin polymerization process [<xref ref-type="bibr" rid="ref-23">23</xref>].</p>
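<p>As a minimal illustration of the Pareto optimal set discussed above, the following sketch filters a list of candidates down to its non-dominated members. The candidate values and the function names (<italic>dominates</italic>, <italic>pareto_front</italic>) are hypothetical, and both objectives are assumed to be maximized; an objective that should be minimized, such as the polydispersity index, can be negated first. This brute-force filter is not NSGA-II, which adds fast non-dominated sorting, crowding distance, and evolutionary search on top of the same dominance test.</p>
<code language="python"><![CDATA[
```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical candidates scored on two objectives to maximize,
# e.g. (ionic conductivity score, mechanical strength score).
candidates = [(1.0, 5.0), (2.0, 4.0), (3.0, 1.0), (2.0, 3.0), (0.5, 4.5)]
front = pareto_front(candidates)
print(front)  # [(1.0, 5.0), (2.0, 4.0), (3.0, 1.0)]
```
]]></code>
<p>The two discarded candidates are each beaten in both objectives by a member of the front, whereas the three retained candidates represent different trade-offs that cannot be ranked without additional preferences.</p>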
<p>Multi-objective Bayesian optimization, developed in recent years, has opened up a new path for polymer material design. The Wang research team improved the traditional single-objective acquisition function, proposed the EI matrix method, and applied it to the design of a coarse-grained force field for polycaprolactone, simultaneously optimizing two key performance indicators, elastic modulus and water diffusion coefficient [<xref ref-type="bibr" rid="ref-1">1</xref>]. This method adopts an active learning strategy that considers both the predicted value and its uncertainty in each iteration, achieving a dynamic balance between exploring new regions and exploiting known information. In the field of polymer nanoparticle synthesis, researchers have also developed a variety of advanced algorithms such as TS-EMO, RBFNN/RVEA, and EA-MOPSO for the systematic optimization of important parameters such as molecular weight distribution, particle size, and polydispersity index [<xref ref-type="bibr" rid="ref-60">60</xref>]. These methods not only significantly improve optimization efficiency but also help researchers understand the intrinsic correlations between different performance indicators by displaying the Pareto frontier directly.</p>
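<p>The balance between exploration and exploitation mentioned above can be made concrete with the standard single-objective expected improvement (EI) acquisition function. The sketch below is not the EI matrix method of the cited work; it shows only the scalar building block, assuming a Gaussian predictive distribution with mean <italic>mu</italic> and standard deviation <italic>sigma</italic> and a maximization problem. The function name and the numerical values are illustrative assumptions.</p>
<code language="python"><![CDATA[
```python
import math

def expected_improvement(mu, sigma, best, xi=0.0):
    """Expected improvement over `best` for a maximization problem,
    assuming the prediction is Gaussian with mean mu and std sigma."""
    gain = mu - best - xi
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(gain, 0.0)
    z = gain / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return gain * cdf + sigma * pdf

# Two hypothetical candidates with the same predicted mean as the best
# observed value but different predictive uncertainty.
ei_certain = expected_improvement(mu=1.0, sigma=0.1, best=1.0)
ei_uncertain = expected_improvement(mu=1.0, sigma=1.0, best=1.0)
print(ei_certain, ei_uncertain)
```
]]></code>
<p>With equal predicted means, the more uncertain candidate receives the larger EI, which is how the acquisition function rewards exploration; a higher predicted mean raises EI as well, which rewards exploitation.</p>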
<p>The design of organic optoelectronic materials is a typical application scenario of multi-objective optimization technology. Researchers need to accurately regulate multiple structural parameters such as the ratio of electron donor to acceptor groups, material hydrophilicity and hydrophobicity, and conjugation length to achieve the best photoelectric conversion performance [<xref ref-type="bibr" rid="ref-45">45</xref>]. In the development of proton exchange membrane materials, the team of Li Yunqi from the Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, established a prediction model including four targets: proton conductivity, methanol permeability, tensile modulus, and thermal stability. Through a multi-objective ranking algorithm, it successfully guided the molecular design of new hydrocarbon-based sulfonated copolymers [<xref ref-type="bibr" rid="ref-23">23</xref>]. These research results show that the multi-objective optimization method can overcome the limitations of traditional single-objective optimization and provide strong theoretical guidance and technical support for the development of polymer materials with comprehensive performance advantages.</p>
<p>Deep learning has brought new development opportunities for multi-objective optimization. The multi-task deep neural network model developed by the Ramprasad research team can simultaneously predict the glass transition temperature, melting temperature, and degradation temperature of copolymers, showing excellent prediction accuracy and generalization ability. The polyBERT model trained by Kuenneth et al. on 100 million polymer SMILES strings efficiently relates molecular structure features to multiple performance parameters, laying a solid technical foundation for large-scale multi-objective optimization research [<xref ref-type="bibr" rid="ref-22">22</xref>]. These advances enable researchers to explore better-performing combinations in a broader material design space and promote the development of multifunctional and intelligent polymer materials.</p>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Machine Learning Performance Validation and Evaluation Methods</title>
<p>As shown in <xref ref-type="fig" rid="fig-8">Fig. 8</xref>, the practical value of machine learning models in polymer materials research must be confirmed through rigorous performance validation and scientific evaluation systems. This process not only objectively reflects the model&#x2019;s predictive ability for unknown data but also identifies model shortcomings and guides iterative optimization, serving as a key link connecting theoretical modeling and practical applications. Focusing on the three cores of &#x201C;validation methods&#x2014;evaluation indicators&#x2014;optimization strategies&#x201D;, this section systematically elaborates on the performance validation logic, scientific evaluation standards, and efficient optimization paths of machine learning models in the field of polymer materials, providing methodological support for model reliability.</p>
<fig id="fig-8">
<label>Figure 8</label>
<caption>
<title>Closed-loop framework for ML-driven polymer research. The cycle integrates prediction, experimental verification, and model optimization to iteratively improve design outcomes.</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76492-fig-8.tif"/>
</fig>
<sec id="s5_1">
<label>5.1</label>
<title>Experimental Verification Methods</title>
<p>Experimental validation is the core means to test the credibility of model predictions. It requires designing targeted experiments, comparing the deviations between predicted values and measured values, and verifying the model&#x2019;s applicability in real scenarios. Its core goal is to avoid models being &#x201C;theoretical only&#x201D; and ensure that prediction results have experimental repeatability and engineering practicality.</p>
<p>The reliability of machine learning predictions depends heavily on rigorous experimental verification, which is particularly important in polymer materials research. The chemoinformatics-driven ML model developed by the Bradford team predicted the ionic conductivity of SPEs, and its effectiveness was confirmed by experimental data [<xref ref-type="bibr" rid="ref-5">5</xref>]. Experimental verification usually adopts an iterative optimization strategy, dynamically adjusting model parameters by analyzing the differences between predicted and measured properties. Taking the adaptive machine learning framework as an example, the Support Vector Regression (SVR) model combined with the Efficient Global Optimization (EGO) method can recommend the most promising candidate materials for experimental verification [<xref ref-type="bibr" rid="ref-41">41</xref>]. This closed-loop verification mechanism significantly improves R&#x0026;D efficiency. For example, in the development of additive manufacturing materials, only 120 samples need to be tested in parallel to complete 30 rounds of algorithm optimization [<xref ref-type="bibr" rid="ref-61">61</xref>].</p>
<p>The modern experimental verification system integrates a variety of advanced technical means. High-throughput experimental platforms have become important carriers for verifying ML predictions. The Ada automated laboratory developed by the MacLeod team realizes the fully autonomous operation from material design to characterization and optimizes the experimental scheme through continuous learning [<xref ref-type="bibr" rid="ref-8">8</xref>]. In the research of mixed matrix membranes, researchers verified the prediction accuracy of computational screening and machine learning models by systematically preparing MOF-Polymers samples with different ratios and testing their CO<sub>2</sub> separation performance [<xref ref-type="bibr" rid="ref-56">56</xref>].</p>
<p>The data division strategy is crucial for model verification. In polymer property prediction research, two strategies are adopted, division by polymer type and division by data point, and five-fold cross-validation is used to guard against overfitting. For small sample scenarios, ten-fold cross-validation shows good results. In research on performance prediction for solution polymerized styrene-butadiene rubber, a reliable prediction model was established through split validation on category-balanced datasets [<xref ref-type="bibr" rid="ref-62">62</xref>]. During verification, it is also necessary to quantitatively analyze the impact of uncertain factors such as measurement noise on prediction performance [<xref ref-type="bibr" rid="ref-25">25</xref>].</p>
<p>In the machine learning-driven polymer design framework, experimental verification plays a dual role: it not only tests the algorithm&#x2019;s predictive ability for unknown data but also provides new data for algorithm improvement [<xref ref-type="bibr" rid="ref-33">33</xref>]. The Kang Peng team synthesized eight new PI structures and conducted molecular dynamics simulations, confirming that the prediction error was controlled within 15%. Scientific experimental design is the key to ensuring the reliability of verification, such as using Latin Hypercube Sampling (LHS) for preliminary screening and then conducting iterative experiments based on the algorithm output [<xref ref-type="bibr" rid="ref-60">60</xref>]. This closed-loop verification mechanism can run continuously until a preset criterion is met or the process is manually terminated, ensuring that verification is systematic and complete.</p>
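<p>Latin Hypercube Sampling, mentioned above as a preliminary screening design, can be sketched in a few lines. The implementation below is a basic illustrative version, not the design used in the cited study: each variable range is divided into as many equal-width strata as there are samples, one point is drawn inside every stratum, and the strata are shuffled independently per variable. The function name, the seed, and the two variable ranges (a temperature and a reaction time) are assumptions.</p>
<code language="python"><![CDATA[
```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Basic Latin hypercube design.

    bounds is a list of (low, high) pairs, one per variable. Each variable
    range is split into n_samples equal strata and every stratum is used
    exactly once.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum, expressed on the unit interval.
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(unit)  # decouple the stratum order between variables
        columns.append([low + u * (high - low) for u in unit])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

# Hypothetical screening design: temperature in degrees Celsius, time in hours.
design = latin_hypercube(5, [(60.0, 120.0), (0.5, 8.0)])
for temperature, time_h in design:
    print(round(temperature, 1), round(time_h, 2))
```
]]></code>
<p>Compared with purely random sampling, this design guarantees that each variable is covered across its whole range even with few experiments, which is why it is often used before an iterative, model-guided campaign.</p>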
</sec>
<sec id="s5_2">
<label>5.2</label>
<title>Model Performance Evaluation</title>
<p>In machine learning research on polymer materials, reliable model performance evaluation is crucial to the credibility of prediction results. Appropriate evaluation indicators need to be selected for different prediction tasks and data characteristics. For regression problems, indicators such as Root Mean Square Error (RMSE), Coefficient of Determination (R<sup>2</sup>), and Mean Absolute Error (MAE) are usually used. Taking the prediction of glass transition temperature as an example, the CNN model based on repeating units performed well on Data set_1, with R<sup>2</sup> of the training set and test set reaching 0.84 and 0.82, respectively, while it was 0.65 on Data set_2 [<xref ref-type="bibr" rid="ref-24">24</xref>]. For classification tasks, indicators such as accuracy, precision, and recall are more commonly used. For example, in the ferromagnetic Ising model, the SVM using the quadratic polynomial kernel function has a test set accuracy close to 100% for phase classification [<xref ref-type="bibr" rid="ref-44">44</xref>]. These indicators can not only measure how well the model fits known data but also evaluate its generalization performance on unseen data.</p>
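<p>For reference, the three regression indicators named above can be computed directly from their definitions. The following sketch uses made-up glass transition temperatures (in kelvin) and does not reproduce any result from the cited studies; the function name <italic>regression_metrics</italic> is an assumption.</p>
<code language="python"><![CDATA[
```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and the coefficient of determination R2."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "R2": r2}

# Hypothetical measured and predicted glass transition temperatures (K).
tg_measured = [350.0, 400.0, 450.0, 500.0]
tg_predicted = [360.0, 390.0, 455.0, 495.0]
metrics = regression_metrics(tg_measured, tg_predicted)
print(metrics)  # MAE = 7.5, RMSE is about 7.91, R2 = 0.98
```
]]></code>
<p>RMSE penalizes large errors more heavily than MAE, so reporting both gives a sense of whether the error is dominated by a few outliers. R<sup>2</sup> is relative to the variance of the measured values and can therefore look favorable on datasets that span a wide property range.</p>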
<p>The choice of evaluation method has a decisive impact on the objectivity of performance assessment. Although traditional Cross-Validation (CV) is widely used, it has certain limitations in the field of material discovery. Recent research shows that LOCO CV (Leave-One-Cluster-Out Cross-Validation), which splits the data by cluster, can more accurately evaluate the extrapolation ability of the model between different material groups [<xref ref-type="bibr" rid="ref-52">52</xref>]. For datasets with a small sample size, ten-fold cross-validation shows good results. For example, in research on performance prediction for solution polymerized styrene-butadiene rubber, the Q<sup>2</sup> of the model established through split validation on category-balanced data is as high as 0.9375 [<xref ref-type="bibr" rid="ref-45">45</xref>]. When the data distribution is biased, the bootstrap method is a feasible solution, but attention should be paid to the estimation error that it may introduce [<xref ref-type="bibr" rid="ref-5">5</xref>]. In addition, the evaluation should consider uncertain factors such as measurement noise and model the parameter uncertainty through a multivariate probability density distribution to provide a probabilistic basis for molecular design decisions [<xref ref-type="bibr" rid="ref-63">63</xref>].</p>
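<p>The splitting logic behind leave-one-cluster-out cross-validation can be sketched as follows. In practice the clusters would typically come from a clustering step on the feature space; here hypothetical polymer-class labels stand in for cluster assignments, and the function name <italic>leave_one_cluster_out</italic> is an assumption.</p>
<code language="python"><![CDATA[
```python
def leave_one_cluster_out(cluster_labels):
    """Yield (held_out_cluster, train_indices, test_indices), holding out
    one whole cluster at a time."""
    for held_out in sorted(set(cluster_labels)):
        test_idx = [i for i, c in enumerate(cluster_labels) if c == held_out]
        train_idx = [i for i, c in enumerate(cluster_labels) if c != held_out]
        yield held_out, train_idx, test_idx

# Hypothetical cluster assignment for five samples.
labels = ["polyimide", "polyimide", "polyester", "polyamide", "polyester"]
splits = list(leave_one_cluster_out(labels))
for held_out, train_idx, test_idx in splits:
    print(held_out, train_idx, test_idx)
```
]]></code>
<p>Because every test fold contains only samples from a cluster the model has never seen, the resulting score reflects extrapolation to a new material group. A random split, by contrast, usually places close relatives of each test sample in the training set and tends to give more optimistic estimates.</p>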
<p>Model interpretation techniques provide a deeper understanding of feature contributions. Tools such as SHAP (SHapley Additive exPlanations) and PDP (Partial Dependence Plot) can reveal key structure-property relationships. For example, the number of rotatable bonds and the minimum local charge have been shown to be the main factors affecting the T<sub>g</sub> of polyimides. In the prediction of polymer conductivity, feature importance analysis of the CatBoost model shows that the number of rotatable bonds, the number of hydrogen bond donors/acceptors, and the number of heavy atoms have a significant impact on the predicted property [<xref ref-type="bibr" rid="ref-24">24</xref>]. Such interpretability analysis not only supports the reliability of the model but also provides directional guidance for material design. In predicting the performance of polymer composites, an XGBoost model achieved an R<sup>2</sup> of up to 0.95, and SHAP analysis was used to interpret its decision mechanism [<xref ref-type="bibr" rid="ref-11">11</xref>].</p>
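<p>SHAP attributes a prediction to its input features using Shapley values from cooperative game theory. The sketch below computes exact Shapley values for a hypothetical three-feature scoring function; the feature names and scores are invented for illustration. Practical SHAP implementations rely on approximations or model-specific algorithms, because the number of feature coalitions grows exponentially with the number of features.</p>
<code language="python">
```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley value of each feature for a set function value_fn."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # k = size of the coalition that f joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi

def toy_model(present):
    """Hypothetical score as a function of which features are 'present'."""
    score = 0.0
    if "rotatable_bonds" in present:
        score += 10.0
    if "min_partial_charge" in present:
        score += 4.0
    if "rotatable_bonds" in present and "heavy_atoms" in present:
        score += 6.0  # interaction term, split evenly between the two features
    return score

names = ["rotatable_bonds", "min_partial_charge", "heavy_atoms"]
phi = shapley_values(toy_model, names)
```
</code>
<p>The attributions (13, 4, and 3) sum to the difference between the score with all features present and the score with none, which is the additivity property that gives SHAP its name.</p>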
<p>A side-by-side comparison of different models is an effective way to assess how far a technique advances the state of the art. Tests of TransPolymer on ten polymer property prediction benchmarks show that it reduces the test RMSE by an average of 7.70% and increases R<sup>2</sup> by 0.11, significantly outperforming the traditional ECFP method [<xref ref-type="bibr" rid="ref-50">50</xref>]. The polyBERT chemical language model achieves an R<sup>2</sup> of 0.80 across 29 property predictions, and it is two orders of magnitude faster than manually designed fingerprints [<xref ref-type="bibr" rid="ref-37">37</xref>]. Notably, the data splitting strategy affects the evaluation results. Splitting by polymer type and splitting by data point have different effects: the former better tests the cross-material generalization ability of the model, while the latter focuses on adaptability to the data distribution [<xref ref-type="bibr" rid="ref-62">62</xref>]. In addition, computational efficiency is an important consideration in performance evaluation [<xref ref-type="bibr" rid="ref-63">63</xref>]. The GC-GNN model maintains its prediction accuracy, but its transferability varies with the polymer structure, which reflects the limitation of the ideal Gaussian chain assumption [<xref ref-type="bibr" rid="ref-64">64</xref>].</p>
</sec>
<sec id="s5_3">
<label>5.3</label>
<title>Model Optimization Strategies</title>
<p>Improving prediction performance through model optimization is central to machine learning research on polymer materials. Bayesian Optimization (BO), an efficient global optimization method, uses Gaussian process regression to estimate the performance distribution of untested formulations and selects the most promising candidates for verification [<xref ref-type="bibr" rid="ref-5">5</xref>]. Compared with random search, this method shows stronger exploration ability in the screening of amino acid random copolymers and successfully identifies copolymer structures with higher enzyme-like activity [<xref ref-type="bibr" rid="ref-22">22</xref>]. The Genetic Algorithm simulates natural selection and generates each new generation of candidates through crossover (&#x201C;hybridization&#x201D;) and &#x201C;mutation&#x201D; operations, which offers distinct advantages in optimizing polymer nanoparticle synthesis [<xref ref-type="bibr" rid="ref-60">60</xref>].</p>
<p>Hyperparameter tuning strongly affects the prediction performance of a model. Grid search combined with five-fold cross-validation can systematically optimize key parameters such as GCN layer depth, width, learning rate, and L2 regularization weight [<xref ref-type="bibr" rid="ref-25">25</xref>]. In research on predicting the conductivity of ionic polymers, GridSearchCV with a fixed random state ensures reproducibility and provides a reliable basis for the design of lithium-ion battery electrolytes [<xref ref-type="bibr" rid="ref-24">24</xref>]. In the optimization of large language models, the Hyperband method tunes the neural network hyperparameters, and parameter-efficient fine-tuning techniques such as LoRA (Low-Rank Adaptation) significantly improve the performance of polymer property prediction [<xref ref-type="bibr" rid="ref-65">65</xref>]. In the SVM model, a reasonable setting of the regularization parameter &#x03B3; yields near-optimal test-set accuracy while maintaining the physical relevance of the decision function [<xref ref-type="bibr" rid="ref-44">44</xref>].</p>
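<p>The combination of grid search and k-fold cross-validation can be illustrated with a self-contained sketch. It is not scikit-learn&#x2019;s GridSearchCV and not the setup of any cited study: the one-parameter ridge model, the function names, and the data are hypothetical stand-ins, and the <monospace>seed</monospace> argument plays the role of a fixed random state so that the folds are reproducible.</p>
<code language="python">
```python
import itertools
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices with a fixed seed and split them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_ridge_1d(xs, ys, l2):
    """Closed-form ridge fit of y = w * x (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + l2)

def cv_mse(xs, ys, params, folds):
    """Mean squared error averaged over the held-out folds."""
    fold_errors = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        w = fit_ridge_1d([xs[i] for i in train_idx],
                         [ys[i] for i in train_idx], params["l2"])
        errors = [(ys[i] - w * xs[i]) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / len(fold_errors)

def grid_search(xs, ys, grid, k=5, seed=0):
    """Evaluate every parameter combination; return (best_params, best_score)."""
    folds = k_fold_indices(len(xs), k, seed)
    names = sorted(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cv_mse(xs, ys, params, folds)
        if best is None or best[1] > score:
            best = (params, score)
    return best

# Noise-free toy data y = 2x, so the unregularized fit is exact.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x for x in xs]
best_params, best_score = grid_search(xs, ys, {"l2": [0.0, 1.0, 100.0]})
```
</code>
<p>Because the toy data are noise-free, the search selects <monospace>l2 = 0.0</monospace> with zero cross-validated error; on real, noisy data a non-zero regularization weight is often preferred. The cost of grid search grows multiplicatively with the number of values per hyperparameter, which is one motivation for alternatives such as Hyperband.</p>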
<p>Data scarcity can be mitigated through transfer learning and multi-task learning. A two-stage training strategy first performs supervised pre-training on physically modeled synthetic data so that the model learns the basic physical properties of polymers, and then fine-tunes on a small amount of real experimental data (45 samples), which significantly improves prediction accuracy [<xref ref-type="bibr" rid="ref-12">12</xref>]. The polyBERT model accurately predicts 29 polymer properties through five-fold cross-validation and meta-learner integration [<xref ref-type="bibr" rid="ref-37">37</xref>]. The MMPolymer framework adopts a multi-modal, multi-task pre-training paradigm: it aligns the features of different modalities through contrastive learning, fuses them with a multi-head attention mechanism, and enhances modal aggregation through a dynamic weighted pooling layer, achieving the best performance on a number of polymer property prediction tasks [<xref ref-type="bibr" rid="ref-46">46</xref>].</p>
<p>Feature engineering and model structure adjustment are further dimensions of optimization. The LASSO method combined with Recursive Feature Elimination (RFE) can effectively reduce dimensionality and significantly improve model efficiency [<xref ref-type="bibr" rid="ref-62">62</xref>]. In the prediction of polymer dielectric constant, the Maximum Relevance Minimum Redundancy (mRMR) method evaluates and ranks all descriptors to select the optimal feature subset [<xref ref-type="bibr" rid="ref-9">9</xref>]. G-BigSMILES extends the expressive power of traditional BigSMILES by including key information such as molecular weight and molecular weight distribution, providing richer input features for the model [<xref ref-type="bibr" rid="ref-8">8</xref>]. In terms of model structure adjustment, a cosine annealing strategy that dynamically adjusts the learning rate performs well in polymer property prediction: with the peak learning rate set to 5E&#x2212;6, the model converges after 100 training epochs [<xref ref-type="bibr" rid="ref-39">39</xref>].</p>
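<p>A cosine annealing schedule of the kind mentioned above can be written as a single function. The sketch assumes one decay cycle with no warm-up and no restarts, which may differ from the schedule used in the cited work; the function name is hypothetical, and the peak value is normalized to 1.0 so that the output can be read as a multiplier on any chosen peak learning rate.</p>
<code language="python">
```python
import math

def cosine_annealing_lr(epoch, total_epochs, peak_lr, min_lr=0.0):
    """Learning rate at a given epoch for a single cosine decay cycle."""
    progress = min(max(epoch / total_epochs, 0.0), 1.0)  # clamp to [0, 1]
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))  # falls from 1 to 0
    return min_lr + (peak_lr - min_lr) * cosine

# Normalized schedule over 100 epochs (multiply by the actual peak rate).
schedule = [cosine_annealing_lr(e, 100, peak_lr=1.0) for e in range(101)]
```
</code>
<p>The schedule starts at the peak value, passes through half of it at the midpoint, and reaches <monospace>min_lr</monospace> at the final epoch. The slow initial decay keeps the step size large early in training, while the flat tail allows fine adjustments near convergence.</p>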
</sec>
</sec>
<sec id="s6">
<label>6</label>
<title>Application Case Analysis</title>
<p>Machine learning in polymer materials research has progressed rapidly from theory to engineering practice. Taking the Materials Genome Initiative as an example, researchers have integrated high-throughput computing and deep learning algorithms to predict the correlation between the thermal stability and mechanical properties of polyimide films, with an experimentally verified correlation coefficient of 0.93. In the development of elastomer composites, a Random Forest model can accurately predict the mapping between filler dispersion and dynamic mechanical properties with only 15% of the data volume of traditional experiments. More notably, a cross-scale modeling method based on transfer learning has shown distinct advantages in research on nylon 6 crystallization kinetics, and the resulting process-structure-property correlation model keeps the crystallinity prediction error within &#x00B1;3%. <xref ref-type="table" rid="table-5">Table 5</xref> summarizes these and other typical applications of machine learning in polymer property prediction, covering the improvements in prediction accuracy and the experimental verification results for key performance indicators such as thermal stability, mechanical properties, and crystallization kinetics [<xref ref-type="bibr" rid="ref-66">66</xref>]. These results not only support the reliability of machine learning in the multi-parameter optimization of polymers but also show the potential of data-driven methods for solving complex non-linear problems in materials science, providing empirical evidence for the effectiveness of the material genome method in polymer design.</p>
<table-wrap id="table-5">
<label>Table 5</label>
<caption>
<title>Summary of polymer material property prediction and experimental verification results.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Material System</th>
<th>Performance Indicator</th>
<th>Prediction Accuracy/Performance Improvement</th>
<th>Experimental Verification Result</th>
<th>Citation</th>
</tr>
</thead>
<tbody>
<tr>
<td>Polyimide Film</td>
<td>Correlation between thermal stability and mechanical properties</td>
<td>Correlation coefficient 0.93</td>
<td>Experimental verification passed</td>
<td>[<xref ref-type="bibr" rid="ref-23">23</xref>]</td>
</tr>
<tr>
<td>Elastomer Composite</td>
<td>Filler dispersion and dynamic mechanical properties</td>
<td>Only 15% of the data volume of traditional experiments is needed</td>
<td>Accurately predict the mapping relationship</td>
<td></td>
</tr>
<tr>
<td>Nylon 6</td>
<td>Crystallization kinetics (crystallinity prediction)</td>
<td>Error controlled within &#x00B1;3%</td>
<td>Verification of process-structure-property correlation model</td>
<td>[<xref ref-type="bibr" rid="ref-35">35</xref>]</td>
</tr>
<tr>
<td>Polymer with specific thermal conductivity</td>
<td>Thermal conductivity prediction</td>
<td>MAE 0.024 W&#x000B7;m<sup>&#x2212;1</sup>&#x000B7;K<sup>&#x2212;1</sup></td>
<td>Accuracy improved by 40% compared with traditional models</td>
<td>[<xref ref-type="bibr" rid="ref-48">48</xref>]</td>
</tr>
<tr>
<td>Polyester Material</td>
<td>Biodegradable characteristics</td>
<td>Systematic evaluation of more than 600 materials</td>
<td>Verification of degradation characteristics of Pseudomonas lemoignei</td>
<td>[<xref ref-type="bibr" rid="ref-8">8</xref>]</td>
</tr>
<tr>
<td>Polymer Gas Permeable Material</td>
<td>Permeability prediction</td>
<td>Discovery of more than 100 materials exceeding the Robeson upper bound</td>
<td>Accurately evaluate the performance of 700 polymers</td>
<td></td>
</tr>
<tr>
<td>Polylactic Acid/Nanoparticle Composite System</td>
<td>Thermal stability and crystallization performance</td>
<td>Optimization guided by machine learning</td>
<td>Verification of degradation kinetics characteristic prediction</td>
<td>[<xref ref-type="bibr" rid="ref-67">67</xref>]</td>
</tr>
<tr>
<td>Silicon-containing Acetylene Resin</td>
<td>Processing performance and heat resistance</td>
<td>High-throughput screening to obtain PSA resin</td>
<td>Verification of material genome method</td>
<td>[<xref ref-type="bibr" rid="ref-57">57</xref>]</td>
</tr>
<tr>
<td>Free Radical Polymerization (FRP)</td>
<td>Reaction efficiency</td>
<td>Increased by 300%</td>
<td>Verification of dynamic regulation of microfluidic chip</td>
<td>[<xref ref-type="bibr" rid="ref-68">68</xref>]</td>
</tr>
<tr>
<td>3D Printed Microneedle Array</td>
<td>Printing quality and drug delivery performance</td>
<td>Early defect identification by computer vision</td>
<td>Verification of geometric accuracy consistency</td>
<td>[<xref ref-type="bibr" rid="ref-23">23</xref>]</td>
</tr>
</tbody>
</table>
</table-wrap>
<sec id="s6_1">
<label>6.1</label>
<title>High-Performance Polymer Design</title>
<p>Machine learning is profoundly changing the R&#x0026;D paradigm of high-performance polymers. Inverse design methods predict the structures of polymers with a specified thermal conductivity by establishing a property-to-structure mapping [<xref ref-type="bibr" rid="ref-23">23</xref>]. The prediction model constructed by Wu et al., which combines transfer learning with a Bayesian molecular design algorithm, achieves an MAE of only 0.024 W&#x000B7;m<sup>&#x2212;1</sup>&#x000B7;K<sup>&#x2212;1</sup>, which is 40% more accurate than a traditional model trained on a small sample [<xref ref-type="bibr" rid="ref-48">48</xref>]. In aerospace materials, machine learning models trained on multiple features such as molecular weight, chain structure, and cross-linking density have guided the development of new polymer systems with both excellent mechanical strength and thermal stability.</p>
<p>The integration of Generative Adversarial Networks (GANs) with coarse-grained molecular dynamics (CGMD) has enabled breakthroughs in material design. For instance, researchers have utilized GANs to generate copolymer structures with targeted Young&#x2019;s modulus, followed by efficient screening via CGMD simulations [<xref ref-type="bibr" rid="ref-5">5</xref>]. Beyond generative models, the TransPolymer model, developed by the Farimani team and based on the Transformer architecture, demonstrates excellent performance in property prediction by effectively capturing polymer sequence and topological features [<xref ref-type="bibr" rid="ref-22">22</xref>]. These data-driven approaches are proving highly effective in application-oriented research. In the field of dielectric materials, for example, machine learning models have accurately predicted the frequency-dependent dielectric behavior of 11,000 unknown polymers, successfully identifying five candidate materials for capacitors and microelectronics applications [<xref ref-type="bibr" rid="ref-62">62</xref>].</p>
<p>Combining Genetic Algorithms with machine learning has greatly improved the efficiency of material development. One study designed 132 new polymers through 100 generations of evolutionary iteration, six of which showed the desired characteristics [<xref ref-type="bibr" rid="ref-66">66</xref>]. The prediction model constructed by Barnett et al. not only accurately evaluated the gas permeability of 700 polymers but also discovered more than 100 materials that exceed the Robeson upper bound. In the design of polyimides, Afzal&#x2019;s team used 29 building blocks to efficiently screen 10,000 candidates from 660 million compounds and obtained a target material with an ultra-high refractive index [<xref ref-type="bibr" rid="ref-67">67</xref>].</p>
<p>Autonomous optimization systems have moved the design of polymer blends into a new stage. An integrated robotic platform combining high-throughput experiments with evolutionary algorithms discovered random heteropolymer blends that outperform their individual components and revealed how interactions between molecular fragments regulate protein thermal stability [<xref ref-type="bibr" rid="ref-68">68</xref>]. Research on thermosetting resins shows that the material genome method combined with machine learning can efficiently design silicon-containing acetylene resins, with high-throughput screening yielding PSA resins that offer both excellent processing performance and heat resistance [<xref ref-type="bibr" rid="ref-57">57</xref>]. South Korean researchers introduced product grade features into an XGBoost model, which significantly improved the prediction accuracy of key performance indicators of polymer composites and provided a reliable tool for industrial applications [<xref ref-type="bibr" rid="ref-11">11</xref>].</p>
</sec>
<sec id="s6_2">
<label>6.2</label>
<title>Optimization of Biodegradable Materials</title>
<p>Machine learning provides a new technical path for the research and development of biodegradable materials, with particular advantages in performance prediction and structural design. The deep neural network model developed by Bakar&#x2019;s team accurately predicts the density of degradable plastics using principal component analysis and a &#x201C;coarse-to-fine&#x201D; optimization strategy. Models of this type have strong non-linear fitting ability and can capture the complex relationship between material structure and performance, showing good stability in the prediction of mechanical properties and degradation behavior. However, neural networks are sensitive to data scale: prediction accuracy decreases significantly under small-sample conditions, and model interpretability has inherent limitations [<xref ref-type="bibr" rid="ref-49">49</xref>]. In contrast, Support Vector Machines are better suited to small-sample scenarios. The Fransen research group synthesized more than 600 polyester materials using high-throughput experimental technology combined with machine learning methods, and systematically evaluated their degradation by Pseudomonas lemoignei [<xref ref-type="bibr" rid="ref-8">8</xref>]. This method constructs a model based on statistical learning theory, which reduces the dependence on large-scale data sets but faces computational efficiency challenges when processing massive data.</p>
<p>The inverse design of biodegradable materials is also benefiting from advances in machine learning. Researchers have adopted a large-scale screening strategy that generates candidate materials in the design space and then uses a trained prediction model to evaluate their degradation characteristics and mechanical properties [<xref ref-type="bibr" rid="ref-23">23</xref>]. This approach has achieved significant results in optimizing combinations of natural fibers and bio-based resins. For example, model-recommended composites of flax and hemp fibers with polylactic acid or polyhydroxyalkanoate have been experimentally verified to show excellent mechanical strength and controllable degradation under various environmental conditions [<xref ref-type="bibr" rid="ref-48">48</xref>]. Mathematical optimization methods recast material design as the problem of optimizing an objective function under constraints and search for the optimal solution with deterministic or stochastic algorithms, which alleviates the limits that combinatorial complexity places on design efficiency [<xref ref-type="bibr" rid="ref-23">23</xref>]. Transfer learning offers a solution to data scarcity: studies have shown that pre-trained models adapt well to new material systems. For example, Mossa&#x2019;s team successfully applied a convolutional neural network trained on a surfactant system to a perfluorosulfonic acid resin system [<xref ref-type="bibr" rid="ref-20">20</xref>].</p>
<p>At present, the research and development of biodegradable materials still faces key challenges such as accurate regulation of degradation time and optimization of biocompatibility. By analyzing the correlation between chemical structure and degradation behavior, machine learning can predict degradation kinetics under different environmental conditions [<xref ref-type="bibr" rid="ref-36">36</xref>]. In a study of the polylactic acid (PLA)/BiFeO<sub>3</sub> (BFO) nanoparticle composite system, BFO was evenly coated on a 3D printed PLA substrate by a simple dip-coating method. The composite system showed excellent piezoelectric photocatalytic degradation performance for Congo Red (CR) and Methylene Blue (MB), with degradation rates reaching 98.9% and 74.3%, respectively, within 90 min. Moreover, regression models built with CatBoost and XGBoost (R<sup>2</sup> values of 0.93, 0.99, and 0.99 for the photocatalysis, piezoelectric catalysis, and piezoelectric photocatalysis predictions, respectively) effectively guided the application optimization of the BFO catalyst, providing a powerful solution for wastewater purification [<xref ref-type="bibr" rid="ref-69">69</xref>]. However, the long test cycles of biodegradable materials and non-uniform experimental standards restrict the construction of high-quality data sets [<xref ref-type="bibr" rid="ref-55">55</xref>]. Future research should focus on developing multi-scale characterization methods and standardized test schemes to lay a more solid data foundation for the application of machine learning. Integrating automated experimental platforms with high-throughput computing is expected to yield a more complete database of biodegradable materials and to promote the further development of data-driven design methods in this field [<xref ref-type="bibr" rid="ref-70">70</xref>,<xref ref-type="bibr" rid="ref-71">71</xref>].</p>
</sec>
<sec id="s6_3">
<label>6.3</label>
<title>Machine Learning Aiding Polymer Material Manufacturing</title>
<p>Machine learning is bringing profound changes to polymer manufacturing as it moves rapidly from laboratory research into industrial practice. A team at the Nara Institute of Science and Technology in Japan has made a breakthrough with the styrene-methyl methacrylate copolymer system: the flow synthesis method they developed, combined with machine learning modeling, significantly improved mixing and heating efficiency [<xref ref-type="bibr" rid="ref-71">71</xref>]. This approach not only reduces the time and cost of traditional experiments but also establishes an accurate mathematical model, laying a technical foundation for the industrial production of complex polymer systems.</p>
<p>Notable progress has also been made in real-time process control. As shown in <xref ref-type="fig" rid="fig-9">Fig. 9</xref>, the integration of microfluidic chips and machine learning has markedly improved the regulation of monomer ratio in free radical polymerization (FRP). Studies have shown that this dynamic regulation system can increase reaction efficiency by 300% [<xref ref-type="bibr" rid="ref-72">72</xref>]. The core of the technology is the synergy between online monitoring and machine learning models, which keeps the reaction close to its optimal state by adjusting process parameters in real time. Injection molding also benefits from machine learning: by analyzing historical production data, a model can predict key parameters such as mold temperature and cooling rate and thereby help avoid product defects. This predictive approach improves product quality while reducing production costs.</p>
<fig id="fig-9">
<label>Figure 9</label>
<caption>
<title>Data-driven development of polymer composite capacitors. (<bold>a</bold>&#x2013;<bold>f</bold>) Model Development &#x0026; Simulation: (<bold>a</bold>) Generative model schematic. (<bold>b</bold>,<bold>c</bold>) Validation of HOMO/LUMO predictions. (<bold>d</bold>,<bold>e</bold>) Benchmarking against DFT calculations. (<bold>f</bold>) Electron density simulations for different fillers. (<bold>g</bold>&#x2013;<bold>i</bold>) Experimental Characterization: (<bold>g</bold>) Cross-sectional SEM/EDS of composite. (<bold>h</bold>) SEM/EDS of a self-healing point. (<bold>i</bold>) Photographs of final capacitor devices [<xref ref-type="bibr" rid="ref-72">72</xref>].</title>
</caption>
<graphic mimetype="image" mime-subtype="tif" xlink:href="CMC_76492-fig-9.tif"/>
</fig>
<p>Autonomous optimization systems are also transforming process control in polymer manufacturing. A recently developed self-driving laboratory platform integrates cloud computing with a variety of online analysis techniques to achieve multi-objective optimization of polymer nanoparticle synthesis [<xref ref-type="bibr" rid="ref-60">60</xref>]. The system simultaneously optimizes multiple performance indicators, such as molecular weight distribution and particle size, and finds the optimal process parameters through continuous iteration. The NSGA-II algorithm applied to epoxy resin polymerization is a successful case, achieving significant results in optimizing the number-average molecular weight and polydispersity index [<xref ref-type="bibr" rid="ref-2">2</xref>]. Multi-objective optimization methods of this kind offer a new way to balance the competing performance indicators common in industrial production.</p>
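<p>NSGA-II ranks candidate solutions by non-dominated sorting and then by crowding distance. The sketch below illustrates only the first of these ideas, extracting the non-dominated (Pareto) front for two objectives that are both minimized. The objective values are hypothetical (deviation of number-average molecular weight from a target, and polydispersity index), and the full algorithm with selection, crossover, and mutation is not reproduced here.</p>
<code language="python">
```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one.

    All objectives are assumed to be minimized.
    """
    no_worse = all(bj >= aj for aj, bj in zip(a, b))
    strictly_better = any(bj > aj for aj, bj in zip(a, b))
    return no_worse and strictly_better

def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical candidates: (abs(Mn - target) in kg/mol, polydispersity index).
candidates = [(5.0, 1.8), (2.0, 2.4), (1.0, 3.0), (4.0, 2.5), (2.0, 2.0)]
front = pareto_front(candidates)
```
</code>
<p>Here three candidates remain on the front: each is better than the others in at least one objective, so choosing among them requires an explicit trade-off between molecular weight accuracy and polydispersity.</p>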
<p>Although the application of machine learning in polymer manufacturing has broad prospects, it is still necessary to overcome challenges such as data quality and model generalization. The innovative application of the QLoRA framework provides a new idea for solving the problem of data scarcity. This technology can achieve 91.1% accuracy in processing parameter extraction with only 224 samples [<xref ref-type="bibr" rid="ref-73">73</xref>]. This small-sample learning technology is particularly suitable for scenarios where data acquisition is difficult in polymer manufacturing. Future research should focus on the development of more universal machine learning models and promote the in-depth integration of manufacturing equipment and intelligent algorithms to accelerate the transformation of polymer manufacturing towards comprehensive intelligence.</p>
</sec>
</sec>
<sec id="s7">
<label>7</label>
<title>Challenges and Prospects</title>
<p>Although the application of machine learning in polymer science has achieved remarkable results, the field still faces a series of unresolved technical problems. As shown in <xref ref-type="table" rid="table-6">Table 6</xref>, uneven data quality, insufficient model generalization, and high demand for computing resources have become the main bottlenecks restricting progress. The table summarizes the main technical challenges in machine learning research on polymer materials, together with representative solutions and typical cases, providing methodological references for subsequent research. For polymer systems with complex structures in particular, the prediction accuracy and stability of existing models often fall short of practical needs. To address these challenges, researchers need to seek breakthroughs along several dimensions: constructing more general algorithms, improving the collection and characterization of experimental data, and strengthening collaboration across disciplines. With the rapid development of high-performance computing and the continuous improvement of new algorithms, machine learning can be expected to play a more critical role in polymer science, both advancing basic theory and accelerating the industrialization of related technologies.</p>
<table-wrap id="table-6">
<label>Table 6</label>
<caption>
<title>Key technical challenges and solutions in machine learning research of polymer materials.</title>
</caption>
<table>
<colgroup>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
<col align="center"/>
</colgroup>
<thead>
<tr>
<th>Technical Challenge Category</th>
<th>Specific Problem Manifestations</th>
<th>Existing Solutions</th>
<th>Typical Cases/Methods</th>
<th>Citation Source</th>
</tr>
</thead>
<tbody>
<tr>
<td>Data Quality</td>
<td>Difficulty in obtaining data of polymer systems, limited by sample preparation quality</td>
<td>Constructing standardized databases and integrating multi-source data</td>
<td>FAIR Data sharing program integrates high-throughput experiments, molecular simulations and literature mining data</td>
<td>[<xref ref-type="bibr" rid="ref-7">7</xref>,<xref ref-type="bibr" rid="ref-74">74</xref>]</td>
</tr>
<tr>
<td>Model Generalization Ability</td>
<td>Difficulty in capturing the cross-scale characteristics of polymers (such as chain entanglement, phase separation)</td>
<td>Multi-scale modeling combining physical theory and machine learning</td>
<td>Combining polymer physical theory with machine learning architecture to improve prediction transfer ability</td>
<td>[<xref ref-type="bibr" rid="ref-26">26</xref>,<xref ref-type="bibr" rid="ref-73">73</xref>,<xref ref-type="bibr" rid="ref-75">75</xref>]</td>
</tr>
<tr>
<td>Computing Resource Requirements</td>
<td>Analyzing high-dimensional data sets consumes a lot of computing resources</td>
<td>GPU-accelerated computing architecture</td>
<td>polyBERT chemical language model uses GPU to improve computing efficiency</td>
<td>[<xref ref-type="bibr" rid="ref-42">42</xref>,<xref ref-type="bibr" rid="ref-44">44</xref>]</td>
</tr>
<tr>
<td>Model Interpretability</td>
<td>Black-box models struggle to reveal the intrinsic behavior mechanism of materials</td>
<td>Developing interpretable machine learning methods</td>
<td>Attention mechanism analysis of functional group weight in organic photovoltaic research</td>
<td>[<xref ref-type="bibr" rid="ref-53">53</xref>,<xref ref-type="bibr" rid="ref-76">76</xref>,<xref ref-type="bibr" rid="ref-77">77</xref>]</td>
</tr>
<tr>
<td>Experimental Verification</td>
<td>Difficulty in digital characterization of complex polymer structures</td>
<td>Automated laboratories and closed-loop optimization systems</td>
<td>NVIDIA ALCHEMI platform realizes the exploration of material chemical space</td>
<td>[<xref ref-type="bibr" rid="ref-69">69</xref>,<xref ref-type="bibr" rid="ref-70">70</xref>,<xref ref-type="bibr" rid="ref-75">75</xref>]</td>
</tr>
<tr>
<td>Multi-scale Modeling</td>
<td>Single-scale models are difficult to handle multi-scale phenomena of polymers</td>
<td>Developing hybrid multi-scale frameworks</td>
<td>Seamless connection between molecular simulation and continuum-scale machine learning models</td>
<td>[<xref ref-type="bibr" rid="ref-16">16</xref>,<xref ref-type="bibr" rid="ref-78">78</xref>]</td>
</tr>
</tbody>
</table>
</table-wrap>
<sec id="s7_1">
<label>7.1</label>
<title>Technical Challenges</title>
<p>Although machine learning methods hold broad promise for polymer science, several technical problems remain in practical applications. The most urgent is the difficulty of obtaining high-quality data: high costs and many practical restrictions seriously limit the training and performance of machine learning models [<xref ref-type="bibr" rid="ref-64">64</xref>]. For complex polymer systems, insufficient data makes it difficult for models to capture the cross-scale characteristics of materials, including key features such as the random sequences of polymer chains and the diversity of condensed-state structures [<xref ref-type="bibr" rid="ref-22">22</xref>]. The problem is particularly prominent in research on solid electrolytes. High-precision molecular simulation methods are difficult to scale to large calculations, while conventional experimental characterization is limited by the quality of sample preparation and cannot effectively separate intrinsic ionic conductivity from other interfering factors [<xref ref-type="bibr" rid="ref-73">73</xref>].</p>
<p>The digital representation of polymer structures also faces serious challenges. Most existing representations are limited to the structure of the repeating unit and cannot fully capture statistical characteristics such as molecular weight distribution, sequence structure, and topology [<xref ref-type="bibr" rid="ref-13">13</xref>]. This limitation makes it difficult for machine learning models to capture the full complexity of polymer materials. For example, in a study of multi-component polyurethane elastomers, model predictions of hydrogen bonding on individual molecular chains deviated markedly from the overall hydrogen-bond distribution of the system, reflecting the amorphous character of single chains in the polymer system. Another difficulty is the lack of standardized formats for polymer characterization data. Existing data sets often conflate variables such as molecular weight, processing history, and characterization protocol, which greatly complicates data mining and machine learning applications [<xref ref-type="bibr" rid="ref-59">59</xref>].</p>
<p>Limited model interpretability also hinders the deeper adoption of machine learning in the polymer field. Conventional AI models often behave as &#x201C;black boxes&#x201D;: they produce predictions, but their internal mechanisms are difficult to explain [<xref ref-type="bibr" rid="ref-74">74</xref>]. This shortcoming is particularly serious in fields that require an understanding of the intrinsic behavior of materials. In organic photovoltaic research, for example, machine learning methods can model material properties accurately but often cannot explain which chemical features drive performance improvements. Overfitting is another common challenge: a model with many parameters may perform well on the training set while its predictive ability on new data drops significantly [<xref ref-type="bibr" rid="ref-75">75</xref>].</p>
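<p>The gap between training and held-out error described above can be illustrated with a minimal Python sketch. It is not taken from the cited work: the synthetic data, the noise level, and the helper names (make_data, fit_linear, fit_nearest_neighbor, mse) are assumptions made purely for illustration, and a 1-nearest-neighbor regressor stands in for a heavily parameterized model because it effectively stores one parameter per training point.</p>
<preformat>
```python
import random


def make_data(n, rng, noise=1.0):
    """Sample n points from y = 2x + Gaussian noise (synthetic stand-in data)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def fit_linear(xs, ys):
    """Ordinary least squares for a two-parameter line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def fit_nearest_neighbor(xs, ys):
    """1-nearest-neighbor regressor: memorizes every training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_data(40, rng)
test_x, test_y = make_data(200, rng)

linear = fit_linear(train_x, train_y)
memorizer = fit_nearest_neighbor(train_x, train_y)

# Map each model name to (training MSE, held-out MSE).
results = {
    "linear": (mse(linear, train_x, train_y), mse(linear, test_x, test_y)),
    "memorizer": (mse(memorizer, train_x, train_y), mse(memorizer, test_x, test_y)),
}
for name, (train_err, test_err) in results.items():
    print(f"{name:10s} train MSE = {train_err:.3f}   test MSE = {test_err:.3f}")
```
</preformat>
<p>Because the memorizing model reproduces every training label, its training error is zero, while its error on held-out points remains above zero; the two-parameter line typically shows similar errors on both sets. Comparing training and held-out error in this way is a common first check for overfitting.</p>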
<p>Computing resource requirements and algorithmic complexity constitute another obstacle. Training large neural networks or analyzing high-dimensional data sets from molecular simulations and spectroscopy is computationally expensive. When solving optimal polymer design problems with uncertainty in multiple parameters, traditional integration methods impose a heavy computational burden, and new algorithms are needed to handle the high dimensionality and parameter correlations [<xref ref-type="bibr" rid="ref-76">76</xref>]. In addition, existing models handle multi-scale phenomena poorly. The behavior of polymer materials often spans multiple orders of magnitude in length and time, from chain entanglement and phase separation to fracture and creep, yet most machine learning tools operate at only a single length or time scale [<xref ref-type="bibr" rid="ref-77">77</xref>].</p>
</sec>
<sec id="s7_2">
<label>7.2</label>
<title>Development Trends</title>
<p>Machine learning is driving profound changes in polymer science, characterized by multi-dimensional and interdisciplinary integration. The data-driven research paradigm is reshaping polymer materials R&#x0026;D, and the combination of multi-scale modeling with physics-informed machine learning is particularly notable. Recent research shows that integrating polymer physics theory into machine learning architectures can improve the transferability of model predictions across different conditions, offering a new approach to the long-standing problem of characterizing complex polymer systems [<xref ref-type="bibr" rid="ref-64">64</xref>]. Advances in computing architecture also deserve attention. With successive generations of GPU hardware, the computational efficiency of chemical language models such as polyBERT has improved significantly, making polymer structure design based on molecular fingerprints feasible. Such end-to-end automation from prediction to design could displace much of the traditional trial-and-error research model [<xref ref-type="bibr" rid="ref-37">37</xref>].</p>
<p>In data infrastructure, the academic community broadly agrees on the need for better standardization and sharing mechanisms. Polymer data currently suffer from inconsistent formats and uneven quality, and unified, standardized workflows for data production and analysis are urgently needed [<xref ref-type="bibr" rid="ref-18">18</xref>]. Global data-sharing programs such as FAIR Data are helping to build more complete polymer databases by integrating multi-source data from high-throughput experiments, molecular simulations, and literature mining [<xref ref-type="bibr" rid="ref-6">6</xref>]. The continued expansion of high-quality data sets has significantly improved the accuracy with which machine learning models predict material performance parameters, notably key indicators such as the power conversion efficiency of organic photovoltaic devices [<xref ref-type="bibr" rid="ref-45">45</xref>]. Building this data ecosystem requires close cooperation among industry, universities, and research institutes to jointly formulate practical data standards and sharing agreements.</p>
<p>The integration of interdisciplinary methods has given rise to the emerging research paradigm of automated laboratories. Current research aims to develop hybrid multi-scale frameworks that combine physical and chemical principles with machine learning algorithms to connect molecular simulations seamlessly with continuum-scale machine learning models [<xref ref-type="bibr" rid="ref-76">76</xref>]. The development of AI platforms such as NVIDIA ALCHEMI indicates that generative AI models for exploring the materials chemical space and recommending candidate materials are entering practical use [<xref ref-type="bibr" rid="ref-70">70</xref>]. The concept of the autonomous laboratory goes further: it integrates machine learning, robotics, and cloud computing into a closed-loop optimization system that runs from material design to synthesis and can substantially improve R&#x0026;D efficiency [<xref ref-type="bibr" rid="ref-60">60</xref>].</p>
<p>The development of interpretable machine learning offers new opportunities for theoretical advances in polymer science. Because current black-box models reveal little about their internal mechanisms, researchers are developing more interpretable methods that make the model decision-making process more transparent [<xref ref-type="bibr" rid="ref-77">77</xref>]. Incorporating domain expertise and constructing descriptors that identify the key features of materials helps clarify the underlying connection between polymer structure and performance [<xref ref-type="bibr" rid="ref-78">78</xref>]. Attention-mechanism analysis in organic photovoltaic materials research is a typical case: the study found that the model assigns higher attention weights to adjacent tokens, which usually belong to the same functional group. This kind of analysis provides a new perspective on the structure-property relationships of materials [<xref ref-type="bibr" rid="ref-79">79</xref>]. As interpretability tools continue to improve, machine learning can serve not only to predict material properties but also as a tool for discovering new scientific laws [<xref ref-type="bibr" rid="ref-79">79</xref>].</p>
<p>Reform of the education system provides fundamental support for the development of polymer science. Integrating programming skills and machine learning into the chemistry curriculum is increasingly necessary in order to train a new generation of polymer scientists with interdisciplinary capabilities [<xref ref-type="bibr" rid="ref-80">80</xref>,<xref ref-type="bibr" rid="ref-81">81</xref>]. Collaborative training mechanisms involving industry, universities, and research institutes are also crucial. Training interdisciplinary talent through practical projects can promote the practical application of machine learning in the polymer field. This shift in training should help resolve the current dilemma in which the barrier to entry for computational methods is high and synthetic chemists find it difficult to apply machine learning tools [<xref ref-type="bibr" rid="ref-82">82</xref>]. As these trends deepen, machine learning in polymer science can be expected to evolve from an auxiliary tool into a leading paradigm, opening new development paths for materials innovation.</p>
</sec>
<sec id="s7_3">
<label>7.3</label>
<title>Application Prospects</title>
<p>Machine learning is bringing profound changes to polymer science, and its applications now extend to materials R&#x0026;D, manufacturing, and environmental governance [<xref ref-type="bibr" rid="ref-83">83</xref>]. In new materials development, ML-driven workflows are gradually automating the full chain from literature mining to material synthesis, and such closed-loop systems substantially shorten the traditional R&#x0026;D cycle [<xref ref-type="bibr" rid="ref-84">84</xref>]. The design of polymer materials such as solution-polymerized styrene-butadiene rubber (SSBR) has demonstrated that machine learning can replace the traditional trial-and-error method, and its predictive accuracy is expected to extend to a wider range of material systems and performance indicators [<xref ref-type="bibr" rid="ref-85">85</xref>,<xref ref-type="bibr" rid="ref-86">86</xref>].</p>
<p>The deep integration of the Materials Genome Initiative with machine learning is reshaping the methodology of polymer design. Inspired by the breakthrough of AlphaFold2 in protein structure prediction, deep learning offers a new approach to polymer structure prediction, one that is also instructive for industries such as biopharmaceuticals [<xref ref-type="bibr" rid="ref-87">87</xref>]. Experimental studies have shown that data-driven methods can accurately control the morphology of single-chain nanoparticles (SCNPs), especially at low degrees of functionalization, providing a reliable validation platform for sequence-based design strategies [<xref ref-type="bibr" rid="ref-88">88</xref>&#x2013;<xref ref-type="bibr" rid="ref-90">90</xref>]. For mixed-matrix membranes, machine learning-assisted high-throughput screening significantly improves the efficiency of performance prediction for CO<sub>2</sub> separation membranes by analyzing the synergy between metal-organic frameworks (MOFs) and polymers [<xref ref-type="bibr" rid="ref-91">91</xref>].</p>
<p>The transition to intelligent manufacturing depends on machine learning. Data-driven methods have markedly improved the efficiency of developing materials for additive manufacturing, an approach that offers particular advantages for material challenges in bioengineering and aerospace [<xref ref-type="bibr" rid="ref-92">92</xref>]. A recently developed autonomous laboratory platform integrates cloud computing with online characterization to build an intelligent optimization system for polymer nano-synthesis, enabling precise &#x201C;on-demand granulation&#x201D; [<xref ref-type="bibr" rid="ref-93">93</xref>]. Notably, information extraction based on large language models has advanced the optimization of injection molding processes: it achieves high-precision extraction of processing parameters with only 224 samples, opening a new route to the digital transformation of traditional manufacturing processes [<xref ref-type="bibr" rid="ref-94">94</xref>].</p>
<p>Machine learning is also accelerating the development of environmentally friendly materials. Performance optimization studies of materials such as polyurethane elastomers show that system-level features have a more significant impact on material performance than the characteristics of a single molecular chain, a finding that provides an important basis for regulating material composition as a whole [<xref ref-type="bibr" rid="ref-95">95</xref>]. The discovery of ionene materials containing the Tr&#x00F6;ger&#x2019;s base structure marks significant progress in sustainable energy materials; their excellent conductivity suggests new design ideas for next-generation lithium-ion batteries [<xref ref-type="bibr" rid="ref-96">96</xref>]. Biomimetic intelligent thermal management materials, which mimic biological thermoregulation and are combined with machine learning optimization strategies, demonstrate the value of dynamic regulation in fields such as wearable devices and building energy conservation [<xref ref-type="bibr" rid="ref-97">97</xref>].</p>
<p>The integration of interdisciplinary technologies continues to deepen the application of machine learning in polymer science. The polyBERT chemical language model offers a general approach to exploring polymer space through its high-throughput screening capability. Graph neural networks (GNNs) perform well in capturing the topological structure of polymer chains, establishing a new paradigm for molecular ensemble modeling. As specialized computing platforms continue to improve, AI surrogate models such as machine learning interatomic potentials (MLIPs) will accelerate materials discovery, promote the shift of polymer science from experience-driven to data-driven research, and ultimately move materials R&#x0026;D from a &#x201C;trial-and-error game&#x201D; to &#x201C;precision navigation&#x201D;.</p>
</sec>
</sec>
<sec id="s8">
<label>8</label>
<title>Conclusion</title>
<p>Machine learning is transforming polymer materials research through a data-driven paradigm that spans the full chain from basic data processing and algorithm modeling to material design and manufacturing optimization. Within the core technical framework outlined in this paper, the foundational stage involves constructing polymer molecular descriptors with BigSMILES and curlySMILES, screening key parameters with the RDKit toolkit, standardizing data with methods such as Min-Max normalization, augmenting data through Bootstrap resampling and transfer learning, and building on high-quality polymer databases such as PoLyInfo, which contains critical data for approximately 100 polymers. At the algorithm level, traditional machine learning (e.g., SVM achieving an R<sup>2</sup> of 0.91 in predicting the polymer glass transition temperature (T<sub>g</sub>), and Random Forest reaching an R<sup>2</sup> of 0.97 in thermal conductivity prediction), deep learning (e.g., graph neural networks (GNNs) achieving an RMSE of approximately 30 K and an R<sup>2</sup> of 0.9 in T<sub>g</sub> prediction, and the polyBERT model increasing processing speed by two orders of magnitude), and transfer and multi-task learning (e.g., the Sim2Real strategy achieving a mean absolute error (MAE) of 0.024 W/(m&#x00B7;K) in thermal conductivity prediction) each show distinct advantages and enable accurate prediction of material properties. In terms of design strategies, inverse design, high-throughput screening, and multi-objective optimization have reshaped the R&#x0026;D model. Typical cases show that machine learning can improve the accuracy of thermal conductivity prediction by 40%, optimize the degradation characteristics of more than 600 types of polyesters, and increase polymerization reaction efficiency by 300%, significantly shortening R&#x0026;D cycles and reducing costs while revealing structure-property relationships that are difficult to capture with traditional methods. Current research still faces challenges such as uneven data quality, insufficient model generalization, and poor interpretability. However, approaches such as multi-scale modeling and physics-informed machine learning point toward solutions. In the future, deeper interdisciplinary integration will drive polymer research from &#x201C;trial-and-error exploration&#x201D; to &#x201C;precision navigation,&#x201D; providing core support for materials innovation in fields such as aerospace, biomedicine, and new energy.</p>
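<p>For readers less familiar with the preprocessing steps and error metrics named above, the following Python sketch shows one common way to compute Min-Max normalization, a single Bootstrap replicate, and the R<sup>2</sup>, RMSE, and MAE metrics. The function names and the four T<sub>g</sub> values are hypothetical and chosen only for illustration; they are not data from the studies reviewed here.</p>
<preformat>
```python
import math
import random


def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: map everything to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def bootstrap_resample(rows, rng):
    """Draw len(rows) rows with replacement (one Bootstrap replicate)."""
    return [rng.choice(rows) for _ in rows]


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE; assumes y_true is not constant."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Hypothetical Tg values in kelvin, for illustration only.
tg_measured = [300.0, 350.0, 400.0, 450.0]
tg_predicted = [310.0, 340.0, 410.0, 440.0]

scaled = min_max_normalize(tg_measured)
pairs = list(zip(tg_measured, tg_predicted))
replicate = bootstrap_resample(pairs, random.Random(0))
metrics = regression_metrics(tg_measured, tg_predicted)

print(scaled)
print(replicate)
print(metrics)
```
</preformat>
<p>In this toy case every prediction is off by 10 K, so RMSE and MAE are both 10 K. RMSE and MAE carry the unit of the predicted property, whereas R<sup>2</sup> is dimensionless, which is why values such as an RMSE of about 30 K and an R<sup>2</sup> of 0.9 can be reported side by side.</p>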
</sec>
</body>
<back>
<ack>
<p>Not applicable.</p>
</ack>
<sec>
<title>Funding Statement</title>
<p>This work was supported by the National Natural Science Foundation of China (Nos. 51671075 and 51971086) and the Natural Science Foundation of Heilongjiang Province of China (No. LH2022E081).</p>
</sec>
<sec>
<title>Author Contributions</title>
<p>Hongtao Guo: Data curation; Formal analysis; Investigation; Methodology; Software; Validation; Visualization; Writing &#x2013; original draft; Writing &#x2013; review &#x0026; editing. Shuai Li: Data curation; Formal analysis; Investigation; Validation. Shu Li: Conceptualization; Project administration; Supervision; Methodology; Writing &#x2013; review &#x0026; editing. All authors reviewed and approved the final version of the manuscript.</p>
</sec>
<sec sec-type="data-availability">
<title>Availability of Data and Materials</title>
<p>No new data were generated or analyzed in support of this study.</p>
</sec>
<sec>
<title>Ethics Approval</title>
<p>Not applicable.</p>
</sec>
<sec sec-type="COI-statement">
<title>Conflicts of Interest</title>
<p>The authors declare no conflicts of interest.</p>
</sec>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Cencer</surname> <given-names>MM</given-names></string-name>, <string-name><surname>Moore</surname> <given-names>JS</given-names></string-name>, <string-name><surname>Assary</surname> <given-names>RS</given-names></string-name></person-group>. <article-title>Machine learning for polymeric materials: an introduction</article-title>. <source>Polym Int</source>. <year>2021</year>;<volume>71</volume>(<issue>5</issue>):<fpage>537</fpage>&#x2013;<lpage>42</lpage>. doi:<pub-id pub-id-type="doi">10.1002/pi.6345</pub-id>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Cao</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Yin</surname> <given-names>H</given-names></string-name>, <string-name><surname>Feng</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Machine learning in polymer science: a new lens for physical and chemical exploration</article-title>. <source>Prog Mater Sci</source>. <year>2025</year>;<volume>156</volume>:<fpage>101544</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.pmatsci.2025.101544</pub-id>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Liang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Yuan</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>C</given-names></string-name></person-group>. <article-title>Machine-learning exploration of polymer compatibility</article-title>. <source>Cell Rep Phys Sci</source>. <year>2022</year>;<volume>3</volume>(<issue>6</issue>):<fpage>100931</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.xcrp.2022.100931</pub-id>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ju</surname> <given-names>S</given-names></string-name>, <string-name><surname>Shiga</surname> <given-names>T</given-names></string-name>, <string-name><surname>Feng</surname> <given-names>L</given-names></string-name>, <string-name><surname>Shiomi</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Revisiting PbTe to identify how thermal conductivity is really limited</article-title>. <source>Phys Rev B</source>. <year>2018</year>;<volume>97</volume>(<issue>18</issue>):<fpage>184305</fpage>. doi:<pub-id pub-id-type="doi">10.1103/physrevb.97.184305</pub-id>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>K</given-names></string-name>, <string-name><surname>Shi</surname> <given-names>H</given-names></string-name>, <string-name><surname>Li</surname> <given-names>T</given-names></string-name>, <string-name><surname>Zhao</surname> <given-names>L</given-names></string-name>, <string-name><surname>Zhai</surname> <given-names>H</given-names></string-name>, <string-name><surname>Korani</surname> <given-names>D</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Computational and data-driven modelling of solid polymer electrolytes</article-title>. <source>Digit Discov</source>. <year>2023</year>;<volume>2</volume>(<issue>6</issue>):<fpage>1660</fpage>&#x2013;<lpage>82</lpage>. doi:<pub-id pub-id-type="doi">10.1039/d3dd00078h</pub-id>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Patra</surname> <given-names>TK</given-names></string-name></person-group>. <article-title>Data-driven methods for accelerating polymer design</article-title>. <source>ACS Polym Au</source>. <year>2021</year>;<volume>2</volume>(<issue>1</issue>):<fpage>8</fpage>&#x2013;<lpage>26</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acspolymersau.1c00035</pub-id>; <pub-id pub-id-type="pmid">36855746</pub-id></mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Paradiso</surname> <given-names>SP</given-names></string-name>, <string-name><surname>Delaney</surname> <given-names>KT</given-names></string-name>, <string-name><surname>Fredrickson</surname> <given-names>GH</given-names></string-name></person-group>. <article-title>Swarm intelligence platform for multiblock polymer inverse formulation design</article-title>. <source>ACS Macro Lett</source>. <year>2016</year>;<volume>5</volume>(<issue>8</issue>):<fpage>972</fpage>&#x2013;<lpage>6</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acsmacrolett.6b00494</pub-id>; <pub-id pub-id-type="pmid">35607214</pub-id></mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Struble</surname> <given-names>DC</given-names></string-name>, <string-name><surname>Lamb</surname> <given-names>BG</given-names></string-name>, <string-name><surname>Ma</surname> <given-names>B</given-names></string-name></person-group>. <article-title>A prospective on machine learning challenges, progress, and potential in polymer science</article-title>. <source>MRS Commun</source>. <year>2024</year>;<volume>14</volume>(<issue>5</issue>):<fpage>752</fpage>&#x2013;<lpage>70</lpage>. doi:<pub-id pub-id-type="doi">10.1557/s43579-024-00587-8</pub-id>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>J</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>Z</given-names></string-name></person-group>. <article-title>Materials data toward machine learning: advances and challenges</article-title>. <source>J Phys Chem Lett</source>. <year>2022</year>;<volume>13</volume>(<issue>18</issue>):<fpage>3965</fpage>&#x2013;<lpage>77</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.jpclett.2c00576</pub-id>; <pub-id pub-id-type="pmid">35481746</pub-id></mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Tsai</surname> <given-names>ML</given-names></string-name>, <string-name><surname>Huang</surname> <given-names>CW</given-names></string-name>, <string-name><surname>Chang</surname> <given-names>SW</given-names></string-name></person-group>. <article-title>Theory-inspired machine learning for stress-strain curve prediction of short fiber-reinforced composites with unseen design space</article-title>. <source>Extreme Mech Lett</source>. <year>2023</year>;<volume>65</volume>:<fpage>102097</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.eml.2023.102097</pub-id>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Malashin</surname> <given-names>IP</given-names></string-name>, <string-name><surname>Martysyuk</surname> <given-names>D</given-names></string-name>, <string-name><surname>Nelyub</surname> <given-names>V</given-names></string-name>, <string-name><surname>Borodulin</surname> <given-names>A</given-names></string-name>, <string-name><surname>Gantimurov</surname> <given-names>A</given-names></string-name>, <string-name><surname>Tynchenko</surname> <given-names>V</given-names></string-name></person-group>. <article-title>Deep learning for property prediction of natural fiber polymer composites</article-title>. <source>Sci Rep</source>. <year>2025</year>;<volume>15</volume>:<fpage>27837</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41598-025-10841-1</pub-id>; <pub-id pub-id-type="pmid">40738916</pub-id></mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>N</given-names></string-name>, <string-name><surname>Jafarzadeh</surname> <given-names>S</given-names></string-name>, <string-name><surname>Lattimer</surname> <given-names>BY</given-names></string-name>, <string-name><surname>Ni</surname> <given-names>S</given-names></string-name>, <string-name><surname>Lua</surname> <given-names>J</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Harnessing large language models for data-scarce learning of polymer properties</article-title>. <source>Nat Comput Sci</source>. <year>2025</year>;<volume>5</volume>(<issue>3</issue>):<fpage>245</fpage>&#x2013;<lpage>54</lpage>. doi:<pub-id pub-id-type="doi">10.1038/s43588-025-00768-y</pub-id>; <pub-id pub-id-type="pmid">39930041</pub-id></mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Doan Tran</surname> <given-names>H</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>C</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>L</given-names></string-name>, <string-name><surname>Chandrasekaran</surname> <given-names>A</given-names></string-name>, <string-name><surname>Batra</surname> <given-names>R</given-names></string-name>, <string-name><surname>Venkatram</surname> <given-names>S</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Machine-learning predictions of polymer properties with Polymer Genome</article-title>. <source>J Appl Phys</source>. <year>2020</year>;<volume>128</volume>(<issue>17</issue>):<fpage>171104</fpage>. doi:<pub-id pub-id-type="doi">10.1063/5.0023759</pub-id>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Gu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>P</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>C</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Machine learning-assisted systematical polymerization planning: case studies on reversible-deactivation radical polymerization</article-title>. <source>Sci China Chem</source>. <year>2021</year>;<volume>64</volume>(<issue>6</issue>):<fpage>1039</fpage>&#x2013;<lpage>46</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s11426-020-9969-y</pub-id>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>C</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Gu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>M</given-names></string-name></person-group>. <article-title>Copolymer sequence regulation enabled by reactivity ratio fingerprints via machine learning</article-title>. <source>Angew Chem Int Ed</source>. <year>2025</year>;<volume>64</volume>(<issue>50</issue>):<fpage>e202513086</fpage>. doi:<pub-id pub-id-type="doi">10.1002/anie.202513086</pub-id>; <pub-id pub-id-type="pmid">41147785</pub-id></mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kuenneth</surname> <given-names>C</given-names></string-name>, <string-name><surname>Ramprasad</surname> <given-names>R</given-names></string-name></person-group>. <article-title>polyBERT: a chemical language model to enable fully machine-driven ultrafast polymer informatics</article-title>. <source>Nat Commun</source>. <year>2023</year>;<volume>14</volume>:<fpage>4099</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41467-023-39868-6</pub-id>; <pub-id pub-id-type="pmid">37433807</pub-id></mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Jain</surname> <given-names>A</given-names></string-name>, <string-name><surname>Ong</surname> <given-names>SP</given-names></string-name>, <string-name><surname>Hautier</surname> <given-names>G</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>W</given-names></string-name>, <string-name><surname>Richards</surname> <given-names>WD</given-names></string-name>, <string-name><surname>Dacek</surname> <given-names>S</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Commentary: the Materials Project: a materials genome approach to accelerating materials innovation</article-title>. <source>APL Mater</source>. <year>2013</year>;<volume>1</volume>:<fpage>011002</fpage>. doi:<pub-id pub-id-type="doi">10.1063/1.4812323</pub-id>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Meaurio</surname> <given-names>E</given-names></string-name>, <string-name><surname>Sanchez-Rexach</surname> <given-names>E</given-names></string-name>, <string-name><surname>Zuza</surname> <given-names>E</given-names></string-name>, <string-name><surname>Lejardi</surname> <given-names>A</given-names></string-name>, <string-name><surname>del Pilar Sanchez-Camargo</surname> <given-names>A</given-names></string-name>, <string-name><surname>Sarasua</surname> <given-names>JR</given-names></string-name></person-group>. <article-title>Predicting miscibility in polymer blends using the Bagley plot: blends with poly(ethylene oxide)</article-title>. <source>Polymer</source>. <year>2017</year>;<volume>113</volume>:<fpage>295</fpage>&#x2013;<lpage>309</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.polymer.2017.01.041</pub-id>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Li</surname> <given-names>YQ</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>LQ</given-names></string-name>, <string-name><surname>Li</surname> <given-names>JF</given-names></string-name></person-group>. <article-title>Data and machine learning in polymer science</article-title>. <source>Chin J Polym Sci</source>. <year>2023</year>;<volume>41</volume>(<issue>9</issue>):<fpage>1371</fpage>&#x2013;<lpage>6</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s10118-022-2868-0</pub-id>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>McDonald</surname> <given-names>SM</given-names></string-name>, <string-name><surname>Augustine</surname> <given-names>EK</given-names></string-name>, <string-name><surname>Lanners</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Rudin</surname> <given-names>C</given-names></string-name>, <string-name><surname>Brinson</surname> <given-names>LC</given-names></string-name>, <string-name><surname>Becker</surname> <given-names>ML</given-names></string-name></person-group>. <article-title>Applied machine learning as a driver for polymeric biomaterials design</article-title>. <source>Nat Commun</source>. <year>2023</year>;<volume>14</volume>:<fpage>4838</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41467-023-40459-8</pub-id>; <pub-id pub-id-type="pmid">37563117</pub-id></mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Tristram</surname> <given-names>F</given-names></string-name>, <string-name><surname>Jung</surname> <given-names>N</given-names></string-name>, <string-name><surname>Hodapp</surname> <given-names>P</given-names></string-name>, <string-name><surname>Schr&#x00F6;der</surname> <given-names>RR</given-names></string-name>, <string-name><surname>W&#x00F6;ll</surname> <given-names>C</given-names></string-name>, <string-name><surname>Br&#x00E4;se</surname> <given-names>S</given-names></string-name></person-group>. <article-title>The impact of digitalized data management on materials systems workflows</article-title>. <source>Adv Funct Mater</source>. <year>2024</year>;<volume>34</volume>(<issue>20</issue>):<fpage>2303615</fpage>. doi:<pub-id pub-id-type="doi">10.1002/adfm.202303615</pub-id>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Gao</surname> <given-names>L</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Du</surname> <given-names>L</given-names></string-name></person-group>. <article-title>Machine learning-assisted design of advanced polymeric materials</article-title>. <source>Acc Mater Res</source>. <year>2024</year>;<volume>5</volume>(<issue>5</issue>):<fpage>571</fpage>&#x2013;<lpage>84</lpage>. doi:<pub-id pub-id-type="doi">10.1021/accountsmr.3c00288</pub-id>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Tao</surname> <given-names>L</given-names></string-name>, <string-name><surname>Varshney</surname> <given-names>V</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Benchmarking machine learning models for polymer informatics: an example of glass transition temperature</article-title>. <source>J Chem Inf Model</source>. <year>2021</year>;<volume>61</volume>(<issue>11</issue>):<fpage>5395</fpage>&#x2013;<lpage>413</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.jcim.1c01031</pub-id>; <pub-id pub-id-type="pmid">34662106</pub-id></mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Piroozi</surname> <given-names>G</given-names></string-name>, <string-name><surname>Kammakakam</surname> <given-names>I</given-names></string-name></person-group>. <article-title>Designing imidazolium-mediated polymer electrolytes for lithium-ion batteries using machine-learning approaches: an insight into ionene materials</article-title>. <source>Polymers</source>. <year>2025</year>;<volume>17</volume>(<issue>15</issue>):<fpage>2148</fpage>. doi:<pub-id pub-id-type="doi">10.3390/polym17152148</pub-id>; <pub-id pub-id-type="pmid">40808196</pub-id></mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Day</surname> <given-names>EC</given-names></string-name>, <string-name><surname>Chittari</surname> <given-names>SS</given-names></string-name>, <string-name><surname>Bogen</surname> <given-names>MP</given-names></string-name>, <string-name><surname>Knight</surname> <given-names>AS</given-names></string-name></person-group>. <article-title>Navigating the expansive landscapes of soft materials: a user guide for high-throughput workflows</article-title>. <source>ACS Polym Au</source>. <year>2023</year>;<volume>3</volume>(<issue>6</issue>):<fpage>406</fpage>&#x2013;<lpage>27</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acspolymersau.3c00025</pub-id>; <pub-id pub-id-type="pmid">38107416</pub-id></mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yue</surname> <given-names>T</given-names></string-name>, <string-name><surname>He</surname> <given-names>J</given-names></string-name>, <string-name><surname>Tao</surname> <given-names>L</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>High-throughput screening and prediction of high modulus of resilience polymers using explainable machine learning</article-title>. <source>J Chem Theory Comput</source>. <year>2023</year>;<volume>19</volume>(<issue>14</issue>):<fpage>4641</fpage>&#x2013;<lpage>53</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.jctc.3c00131</pub-id>; <pub-id pub-id-type="pmid">37338332</pub-id></mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ge</surname> <given-names>W</given-names></string-name>, <string-name><surname>De Silva</surname> <given-names>R</given-names></string-name>, <string-name><surname>Fan</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Sisson</surname> <given-names>SA</given-names></string-name>, <string-name><surname>Stenzel</surname> <given-names>MH</given-names></string-name></person-group>. <article-title>Machine learning in polymer research</article-title>. <source>Adv Mater</source>. <year>2025</year>;<volume>37</volume>(<issue>11</issue>):<fpage>2413695</fpage>. doi:<pub-id pub-id-type="doi">10.1002/adma.202413695</pub-id>; <pub-id pub-id-type="pmid">39924835</pub-id></mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Park</surname> <given-names>J</given-names></string-name>, <string-name><surname>Shim</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Lee</surname> <given-names>F</given-names></string-name>, <string-name><surname>Rammohan</surname> <given-names>A</given-names></string-name>, <string-name><surname>Goyal</surname> <given-names>S</given-names></string-name>, <string-name><surname>Shim</surname> <given-names>M</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Prediction and interpretation of polymer properties using the graph convolutional network</article-title>. <source>ACS Polym Au</source>. <year>2022</year>;<volume>2</volume>(<issue>4</issue>):<fpage>213</fpage>&#x2013;<lpage>22</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acspolymersau.1c00050</pub-id>; <pub-id pub-id-type="pmid">36855563</pub-id></mixed-citation></ref>
<ref id="ref-29"><label>[29]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Dangayach</surname> <given-names>R</given-names></string-name>, <string-name><surname>Jeong</surname> <given-names>N</given-names></string-name>, <string-name><surname>Demirel</surname> <given-names>E</given-names></string-name>, <string-name><surname>Uzal</surname> <given-names>N</given-names></string-name>, <string-name><surname>Fung</surname> <given-names>V</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Machine learning-aided inverse design and discovery of novel polymeric materials for membrane separation</article-title>. <source>Environ Sci Technol</source>. <year>2025</year>;<volume>59</volume>(<issue>2</issue>):<fpage>993</fpage>&#x2013;<lpage>1012</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.est.4c08298</pub-id>; <pub-id pub-id-type="pmid">39680111</pub-id></mixed-citation></ref>
<ref id="ref-30"><label>[30]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Huang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>ES</given-names></string-name>, <string-name><surname>Oya</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Saeki</surname> <given-names>A</given-names></string-name>, <string-name><surname>Kikugawa</surname> <given-names>G</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Structure-property correlation study for organic photovoltaic polymer materials using data science approach</article-title>. <source>J Phys Chem C</source>. <year>2020</year>;<volume>124</volume>(<issue>24</issue>):<fpage>12871</fpage>&#x2013;<lpage>82</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.jpcc.0c00517</pub-id>.</mixed-citation></ref>
<ref id="ref-31"><label>[31]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Van den Hurk</surname> <given-names>RS</given-names></string-name>, <string-name><surname>Pirok</surname> <given-names>BWJ</given-names></string-name>, <string-name><surname>Bos</surname> <given-names>TS</given-names></string-name></person-group>. <article-title>The role of artificial intelligence and machine learning in polymer characterization: emerging trends and perspectives</article-title>. <source>Chromatographia</source>. <year>2025</year>;<volume>88</volume>(<issue>5</issue>):<fpage>357</fpage>&#x2013;<lpage>63</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s10337-025-04406-7</pub-id>; <pub-id pub-id-type="pmid">40444009</pub-id></mixed-citation></ref>
<ref id="ref-32"><label>[32]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wu</surname> <given-names>S</given-names></string-name>, <string-name><surname>Kondo</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Kakimoto</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>B</given-names></string-name>, <string-name><surname>Yamada</surname> <given-names>H</given-names></string-name>, <string-name><surname>Kuwajima</surname> <given-names>I</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Machine-learning-assisted discovery of polymers with high thermal conductivity using a molecular design algorithm</article-title>. <source>npj Comput Mater</source>. <year>2019</year>;<volume>5</volume>:<fpage>66</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41524-019-0203-2</pub-id>.</mixed-citation></ref>
<ref id="ref-33"><label>[33]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Martin</surname> <given-names>TB</given-names></string-name>, <string-name><surname>Audus</surname> <given-names>DJ</given-names></string-name></person-group>. <article-title>Emerging trends in machine learning: a polymer perspective</article-title>. <source>ACS Polym Au</source>. <year>2023</year>;<volume>3</volume>(<issue>3</issue>):<fpage>239</fpage>&#x2013;<lpage>58</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acspolymersau.2c00053</pub-id>; <pub-id pub-id-type="pmid">37334191</pub-id></mixed-citation></ref>
<ref id="ref-34"><label>[34]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kumar</surname> <given-names>JN</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Jun</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Challenges and opportunities of polymer design with machine learning and high throughput experimentation</article-title>. <source>MRS Commun</source>. <year>2019</year>;<volume>9</volume>(<issue>2</issue>):<fpage>537</fpage>&#x2013;<lpage>44</lpage>. doi:<pub-id pub-id-type="doi">10.1557/mrc.2019.54</pub-id>.</mixed-citation></ref>
<ref id="ref-35"><label>[35]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Meftahi</surname> <given-names>N</given-names></string-name>, <string-name><surname>Klymenko</surname> <given-names>M</given-names></string-name>, <string-name><surname>Christofferson</surname> <given-names>AJ</given-names></string-name>, <string-name><surname>Bach</surname> <given-names>U</given-names></string-name>, <string-name><surname>Winkler</surname> <given-names>DA</given-names></string-name>, <string-name><surname>Russo</surname> <given-names>SP</given-names></string-name></person-group>. <article-title>Machine learning property prediction for organic photovoltaic devices</article-title>. <source>npj Comput Mater</source>. <year>2020</year>;<volume>6</volume>:<fpage>166</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41524-020-00429-w</pub-id>.</mixed-citation></ref>
<ref id="ref-36"><label>[36]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Aoki</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>S</given-names></string-name>, <string-name><surname>Tsurimoto</surname> <given-names>T</given-names></string-name>, <string-name><surname>Hayashi</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Minami</surname> <given-names>S</given-names></string-name>, <string-name><surname>Tadamichi</surname> <given-names>O</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Multitask machine learning to predict polymer-solvent miscibility using Flory-Huggins interaction parameters</article-title>. <source>Macromolecules</source>. <year>2023</year>;<volume>56</volume>(<issue>14</issue>):<fpage>5446</fpage>&#x2013;<lpage>56</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.macromol.2c02600</pub-id>.</mixed-citation></ref>
<ref id="ref-37"><label>[37]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ferji</surname> <given-names>K</given-names></string-name></person-group>. <article-title>Basic concepts and tools of artificial intelligence in polymer science</article-title>. <source>Polym Chem</source>. <year>2025</year>;<volume>16</volume>(<issue>21</issue>):<fpage>2457</fpage>&#x2013;<lpage>70</lpage>. doi:<pub-id pub-id-type="doi">10.1039/d5py00148j</pub-id>.</mixed-citation></ref>
<ref id="ref-38"><label>[38]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhu</surname> <given-names>MX</given-names></string-name>, <string-name><surname>Song</surname> <given-names>HG</given-names></string-name>, <string-name><surname>Yu</surname> <given-names>QC</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>JM</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>HY</given-names></string-name></person-group>. <article-title>Machine-learning-driven discovery of polymers molecular structures with high thermal conductivity</article-title>. <source>Int J Heat Mass Transf</source>. <year>2020</year>;<volume>162</volume>:<fpage>120381</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.ijheatmasstransfer.2020.120381</pub-id>.</mixed-citation></ref>
<ref id="ref-39"><label>[39]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhao</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Mulder</surname> <given-names>RJ</given-names></string-name>, <string-name><surname>Houshyar</surname> <given-names>S</given-names></string-name>, <string-name><surname>Le</surname> <given-names>TC</given-names></string-name></person-group>. <article-title>A review on the application of molecular descriptors and machine learning in polymer design</article-title>. <source>Polym Chem</source>. <year>2023</year>;<volume>14</volume>(<issue>29</issue>):<fpage>3325</fpage>&#x2013;<lpage>46</lpage>. doi:<pub-id pub-id-type="doi">10.1039/d3py00395g</pub-id>.</mixed-citation></ref>
<ref id="ref-40"><label>[40]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Qiu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Qiu</surname> <given-names>X</given-names></string-name>, <string-name><surname>Dai</surname> <given-names>X</given-names></string-name>, <string-name><surname>Ji</surname> <given-names>X</given-names></string-name>, <string-name><surname>Sun</surname> <given-names>ZY</given-names></string-name></person-group>. <article-title>PolyNC: a natural and chemical language model for the prediction of unified polymer properties</article-title>. <source>Chem Sci</source>. <year>2024</year>;<volume>15</volume>(<issue>2</issue>):<fpage>534</fpage>&#x2013;<lpage>44</lpage>. doi:<pub-id pub-id-type="doi">10.1039/d3sc05079c</pub-id>; <pub-id pub-id-type="pmid">38179518</pub-id></mixed-citation></ref>
<ref id="ref-41"><label>[41]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yue</surname> <given-names>T</given-names></string-name>, <string-name><surname>He</surname> <given-names>J</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>PolyUniverse: generation of a large-scale polymer library using rule-based polymerization reactions for polymer informatics</article-title>. <source>Digit Discov</source>. <year>2024</year>;<volume>3</volume>(<issue>12</issue>):<fpage>2465</fpage>&#x2013;<lpage>78</lpage>. doi:<pub-id pub-id-type="doi">10.26434/chemrxiv-2024-7069c</pub-id>.</mixed-citation></ref>
<ref id="ref-42"><label>[42]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Machello</surname> <given-names>C</given-names></string-name>, <string-name><surname>Aghabalaei Baghaei</surname> <given-names>K</given-names></string-name>, <string-name><surname>Bazli</surname> <given-names>M</given-names></string-name>, <string-name><surname>Hadigheh</surname> <given-names>A</given-names></string-name>, <string-name><surname>Rajabipour</surname> <given-names>A</given-names></string-name>, <string-name><surname>Arashpour</surname> <given-names>M</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Tree-based machine learning approach to modelling tensile strength retention of fibre reinforced polymer composites exposed to elevated temperatures</article-title>. <source>Compos Part B Eng</source>. <year>2023</year>;<volume>270</volume>:<fpage>111132</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.compositesb.2023.111132</pub-id>.</mixed-citation></ref>
<ref id="ref-43"><label>[43]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Balachandran</surname> <given-names>PV</given-names></string-name></person-group>. <article-title>Adaptive machine learning for efficient materials design</article-title>. <source>MRS Bull</source>. <year>2020</year>;<volume>45</volume>(<issue>7</issue>):<fpage>579</fpage>&#x2013;<lpage>86</lpage>. doi:<pub-id pub-id-type="doi">10.1557/mrs.2020.163</pub-id>.</mixed-citation></ref>
<ref id="ref-44"><label>[44]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Han</surname> <given-names>S</given-names></string-name>, <string-name><surname>Kang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Park</surname> <given-names>H</given-names></string-name>, <string-name><surname>Yi</surname> <given-names>J</given-names></string-name>, <string-name><surname>Park</surname> <given-names>G</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Multimodal transformer for property prediction in polymers</article-title>. <source>ACS Appl Mater Interfaces</source>. <year>2024</year>;<volume>16</volume>(<issue>13</issue>):<fpage>16853</fpage>&#x2013;<lpage>60</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acsami.4c01207</pub-id>; <pub-id pub-id-type="pmid">38501934</pub-id></mixed-citation></ref>
<ref id="ref-45"><label>[45]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Sun</surname> <given-names>W</given-names></string-name>, <string-name><surname>Zheng</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>K</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>Q</given-names></string-name>, <string-name><surname>Shah</surname> <given-names>AA</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>Z</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Machine learning-assisted molecular design and efficiency prediction for high-performance organic photovoltaic materials</article-title>. <source>Sci Adv</source>. <year>2019</year>;<volume>5</volume>(<issue>11</issue>):<fpage>eaay4275</fpage>. doi:<pub-id pub-id-type="doi">10.1126/sciadv.aay4275</pub-id>; <pub-id pub-id-type="pmid">31723607</pub-id></mixed-citation></ref>
<ref id="ref-46"><label>[46]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>C</given-names></string-name>, <string-name><surname>Xia</surname> <given-names>K</given-names></string-name></person-group>. <article-title>Multi-Cover Persistence (MCP)-based machine learning for polymer property prediction</article-title>. <source>Brief Bioinform</source>. <year>2024</year>;<volume>25</volume>(<issue>6</issue>):<fpage>bbae465</fpage>. doi:<pub-id pub-id-type="doi">10.1093/bib/bbae465</pub-id>; <pub-id pub-id-type="pmid">39323091</pub-id></mixed-citation></ref>
<ref id="ref-47"><label>[47]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Sanchez-Lengeling</surname> <given-names>B</given-names></string-name>, <string-name><surname>Aspuru-Guzik</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Inverse molecular design using machine learning: generative models for matter engineering</article-title>. <source>Science</source>. <year>2018</year>;<volume>361</volume>(<issue>6400</issue>):<fpage>360</fpage>&#x2013;<lpage>5</lpage>. doi:<pub-id pub-id-type="doi">10.1126/science.aat2663</pub-id>; <pub-id pub-id-type="pmid">30049875</pub-id></mixed-citation></ref>
<ref id="ref-48"><label>[48]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Lin</surname> <given-names>C</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>H</given-names></string-name></person-group>. <article-title>Polymer biodegradation in aquatic environments: a machine learning model informed by meta-analysis of structure-biodegradation relationships</article-title>. <source>Environ Sci Technol</source>. <year>2025</year>;<volume>59</volume>(<issue>2</issue>):<fpage>1253</fpage>&#x2013;<lpage>63</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.est.4c11282</pub-id>; <pub-id pub-id-type="pmid">39772517</pub-id></mixed-citation></ref>
<ref id="ref-49"><label>[49]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Minami</surname> <given-names>S</given-names></string-name>, <string-name><surname>Hayashi</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>S</given-names></string-name>, <string-name><surname>Fukumizu</surname> <given-names>K</given-names></string-name>, <string-name><surname>Sugisawa</surname> <given-names>H</given-names></string-name>, <string-name><surname>Ishii</surname> <given-names>M</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Scaling law of Sim2Real transfer learning in expanding computational materials databases for real-world predictions</article-title>. <source>npj Comput Mater</source>. <year>2025</year>;<volume>11</volume>:<fpage>146</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41524-025-01606-5</pub-id>.</mixed-citation></ref>
<ref id="ref-50"><label>[50]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Xu</surname> <given-names>C</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Barati Farimani</surname> <given-names>A</given-names></string-name></person-group>. <article-title>TransPolymer: a Transformer-based language model for polymer property predictions</article-title>. <source>npj Comput Mater</source>. <year>2023</year>;<volume>9</volume>:<fpage>64</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41524-023-01016-5</pub-id>.</mixed-citation></ref>
<ref id="ref-51"><label>[51]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Pai</surname> <given-names>SM</given-names></string-name>, <string-name><surname>Shah</surname> <given-names>KA</given-names></string-name>, <string-name><surname>Sunder</surname> <given-names>S</given-names></string-name>, <string-name><surname>Albuquerque</surname> <given-names>RQ</given-names></string-name>, <string-name><surname>Br&#x00FC;tting</surname> <given-names>C</given-names></string-name>, <string-name><surname>Ruckd&#x00E4;schel</surname> <given-names>H</given-names></string-name></person-group>. <article-title>Machine learning applied to the design and optimization of polymeric materials: a review</article-title>. <source>Next Mater</source>. <year>2025</year>;<volume>7</volume>:<fpage>100449</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.nxmate.2024.100449</pub-id>.</mixed-citation></ref>
<ref id="ref-52"><label>[52]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Meredig</surname> <given-names>B</given-names></string-name>, <string-name><surname>Antono</surname> <given-names>E</given-names></string-name>, <string-name><surname>Church</surname> <given-names>C</given-names></string-name>, <string-name><surname>Hutchinson</surname> <given-names>M</given-names></string-name>, <string-name><surname>Ling</surname> <given-names>J</given-names></string-name>, <string-name><surname>Paradiso</surname> <given-names>S</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Can machine learning identify the next high-temperature superconductor? Examining extrapolation performance for materials discovery</article-title>. <source>Mol Syst Des Eng</source>. <year>2018</year>;<volume>3</volume>(<issue>5</issue>):<fpage>819</fpage>&#x2013;<lpage>25</lpage>. doi:<pub-id pub-id-type="doi">10.1039/c8me00012c</pub-id>.</mixed-citation></ref>
<ref id="ref-53"><label>[53]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Jackson</surname> <given-names>NE</given-names></string-name>, <string-name><surname>Webb</surname> <given-names>MA</given-names></string-name>, <string-name><surname>de Pablo</surname> <given-names>JJ</given-names></string-name></person-group>. <article-title>Recent advances in machine learning towards multiscale soft materials design</article-title>. <source>Curr Opin Chem Eng</source>. <year>2019</year>;<volume>23</volume>:<fpage>106</fpage>&#x2013;<lpage>14</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.coche.2019.03.005</pub-id>.</mixed-citation></ref>
<ref id="ref-54"><label>[54]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Patel</surname> <given-names>RA</given-names></string-name>, <string-name><surname>Webb</surname> <given-names>MA</given-names></string-name></person-group>. <article-title>Data-driven design of polymer-based biomaterials: high-throughput simulation, experimentation, and machine learning</article-title>. <source>ACS Appl Bio Mater</source>. <year>2023</year>;<volume>7</volume>(<issue>2</issue>):<fpage>510</fpage>&#x2013;<lpage>27</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acsabm.2c00962</pub-id>; <pub-id pub-id-type="pmid">36701125</pub-id></mixed-citation></ref>
<ref id="ref-55"><label>[55]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Gao</surname> <given-names>L</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>J</given-names></string-name>, <string-name><surname>Du</surname> <given-names>L</given-names></string-name></person-group>. <article-title>An intelligent manufacturing platform of polymers: polymeric material genome engineering</article-title>. <source>Engineering</source>. <year>2023</year>;<volume>27</volume>:<fpage>31</fpage>&#x2013;<lpage>6</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.eng.2023.01.018</pub-id>.</mixed-citation></ref>
<ref id="ref-56"><label>[56]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Rahmanian</surname> <given-names>F</given-names></string-name>, <string-name><surname>Flowers</surname> <given-names>J</given-names></string-name>, <string-name><surname>Guevarra</surname> <given-names>D</given-names></string-name>, <string-name><surname>Richter</surname> <given-names>M</given-names></string-name>, <string-name><surname>Fichtner</surname> <given-names>M</given-names></string-name>, <string-name><surname>Donnely</surname> <given-names>P</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Enabling modular autonomous feedback-loops in materials science through hierarchical experimental laboratory automation and orchestration</article-title>. <source>Adv Mater Inter</source>. <year>2022</year>;<volume>9</volume>(<issue>8</issue>):<fpage>2101987</fpage>. doi:<pub-id pub-id-type="doi">10.1002/admi.202101987</pub-id>.</mixed-citation></ref>
<ref id="ref-57"><label>[57]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kim</surname> <given-names>C</given-names></string-name>, <string-name><surname>Chandrasekaran</surname> <given-names>A</given-names></string-name>, <string-name><surname>Huan</surname> <given-names>TD</given-names></string-name>, <string-name><surname>Das</surname> <given-names>D</given-names></string-name>, <string-name><surname>Ramprasad</surname> <given-names>R</given-names></string-name></person-group>. <article-title>Polymer genome: a data-powered polymer informatics platform for property predictions</article-title>. <source>J Phys Chem C</source>. <year>2018</year>;<volume>122</volume>(<issue>31</issue>):<fpage>17575</fpage>&#x2013;<lpage>85</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.jpcc.8b02913</pub-id>.</mixed-citation></ref>
<ref id="ref-58"><label>[58]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wan</surname> <given-names>H</given-names></string-name>, <string-name><surname>Fang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>M</given-names></string-name>, <string-name><surname>Guo</surname> <given-names>S</given-names></string-name>, <string-name><surname>Sui</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Huang</surname> <given-names>X</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Interpretable machine-learning and big data mining to predict the CO<sub>2</sub> separation in polymer-MOF mixed matrix membranes</article-title>. <source>Adv Sci</source>. <year>2025</year>;<volume>12</volume>(<issue>16</issue>):<fpage>2405905</fpage>. doi:<pub-id pub-id-type="doi">10.1002/advs.202405905</pub-id>; <pub-id pub-id-type="pmid">40014002</pub-id></mixed-citation></ref>
<ref id="ref-59"><label>[59]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Du</surname> <given-names>S</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>J</given-names></string-name>, <string-name><surname>Du</surname> <given-names>L</given-names></string-name>, <string-name><surname>Xu</surname> <given-names>X</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Design of silicon-containing arylacetylene resins aided by machine learning enhanced materials genome approach</article-title>. <source>Chem Eng J</source>. <year>2022</year>;<volume>448</volume>:<fpage>137643</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.cej.2022.137643</pub-id>.</mixed-citation></ref>
<ref id="ref-60"><label>[60]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Shen</surname> <given-names>ZH</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Cheng</surname> <given-names>XX</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>HX</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>LQ</given-names></string-name>, <string-name><surname>Nan</surname> <given-names>CW</given-names></string-name></person-group>. <article-title>High-throughput data-driven interface design of high-energy-density polymer nanocomposites</article-title>. <source>J Materiomics</source>. <year>2020</year>;<volume>6</volume>(<issue>3</issue>):<fpage>573</fpage>&#x2013;<lpage>81</lpage>. doi:<pub-id pub-id-type="doi">10.1016/j.jmat.2020.04.006</pub-id>.</mixed-citation></ref>
<ref id="ref-61"><label>[61]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Knox</surname> <given-names>ST</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>KE</given-names></string-name>, <string-name><surname>Islam</surname> <given-names>N</given-names></string-name>, <string-name><surname>O&#x2019;Connell</surname> <given-names>R</given-names></string-name>, <string-name><surname>Pittaway</surname> <given-names>PM</given-names></string-name>, <string-name><surname>Chingono</surname> <given-names>KE</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Self-driving laboratory platform for many-objective self-optimisation of polymer nanoparticle synthesis with cloud-integrated machine learning and orthogonal online analytics</article-title>. <source>Polym Chem</source>. <year>2025</year>;<volume>16</volume>(<issue>12</issue>):<fpage>1355</fpage>&#x2013;<lpage>64</lpage>. doi:<pub-id pub-id-type="doi">10.1039/d5py00123d</pub-id>.</mixed-citation></ref>
<ref id="ref-62"><label>[62]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Bai</surname> <given-names>X</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>X</given-names></string-name></person-group>. <article-title>Artificial intelligence-powered materials science</article-title>. <source>Nano Micro Lett</source>. <year>2025</year>;<volume>17</volume>(<issue>1</issue>):<fpage>135</fpage>. doi:<pub-id pub-id-type="doi">10.1007/s40820-024-01634-8</pub-id>; <pub-id pub-id-type="pmid">39912967</pub-id></mixed-citation></ref>
<ref id="ref-63"><label>[63]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Chen</surname> <given-names>L</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>C</given-names></string-name>, <string-name><surname>Batra</surname> <given-names>R</given-names></string-name>, <string-name><surname>Lightstone</surname> <given-names>JP</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>C</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Z</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Frequency-dependent dielectric constant prediction of polymers using machine learning</article-title>. <source>npj Comput Mater</source>. <year>2020</year>;<volume>6</volume>:<fpage>61</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41524-020-0333-6</pub-id>.</mixed-citation></ref>
<ref id="ref-64"><label>[64]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Tang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Yue</surname> <given-names>T</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Assessing uncertainty in machine learning for polymer property prediction: a benchmark study</article-title>. <source>J Chem Inf Model</source>. <year>2025</year>;<volume>65</volume>(<issue>13</issue>):<fpage>6585</fpage>&#x2013;<lpage>98</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.jcim.5c00550</pub-id>; <pub-id pub-id-type="pmid">40560148</pub-id></mixed-citation></ref>
<ref id="ref-65"><label>[65]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Jiang</surname> <given-names>S</given-names></string-name>, <string-name><surname>Webb</surname> <given-names>MA</given-names></string-name></person-group>. <article-title>Physics-guided neural networks for transferable property prediction in architecturally diverse copolymers</article-title>. <source>Macromolecules</source>. <year>2025</year>;<volume>58</volume>(<issue>10</issue>):<fpage>4971</fpage>&#x2013;<lpage>84</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.macromol.5c00720</pub-id>.</mixed-citation></ref>
<ref id="ref-66"><label>[66]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Malashin</surname> <given-names>I</given-names></string-name>, <string-name><surname>Tynchenko</surname> <given-names>V</given-names></string-name>, <string-name><surname>Gantimurov</surname> <given-names>A</given-names></string-name>, <string-name><surname>Nelyub</surname> <given-names>V</given-names></string-name>, <string-name><surname>Borodulin</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Boosting-based machine learning applications in polymer science: a review</article-title>. <source>Polymers</source>. <year>2025</year>;<volume>17</volume>(<issue>4</issue>):<fpage>499</fpage>. doi:<pub-id pub-id-type="doi">10.3390/polym17040499</pub-id>; <pub-id pub-id-type="pmid">40006161</pub-id></mixed-citation></ref>
<ref id="ref-67"><label>[67]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Sattari</surname> <given-names>K</given-names></string-name>, <string-name><surname>Xie</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Data-driven algorithms for inverse design of polymers</article-title>. <source>Soft Matter</source>. <year>2021</year>;<volume>17</volume>(<issue>33</issue>):<fpage>7607</fpage>&#x2013;<lpage>22</lpage>. doi:<pub-id pub-id-type="doi">10.1039/d1sm00725d</pub-id>; <pub-id pub-id-type="pmid">34397078</pub-id></mixed-citation></ref>
<ref id="ref-68"><label>[68]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Hassan</surname> <given-names>AU</given-names></string-name>, <string-name><surname>G&#x00FC;lery&#x00FC;z</surname> <given-names>C</given-names></string-name>, <string-name><surname>El Azab</surname> <given-names>IH</given-names></string-name>, <string-name><surname>Elnaggar</surname> <given-names>AY</given-names></string-name>, <string-name><surname>Mahmoud</surname> <given-names>MHH</given-names></string-name></person-group>. <article-title>A graph neural network assisted reverse polymers engineering to design low bandgap benzothiophene polymers for light harvesting applications</article-title>. <source>Mater Chem Phys</source>. <year>2025</year>;<volume>339</volume>:<fpage>130747</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.matchemphys.2025.130747</pub-id>.</mixed-citation></ref>
<ref id="ref-69"><label>[69]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Kim</surname> <given-names>C</given-names></string-name>, <string-name><surname>Batra</surname> <given-names>R</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>L</given-names></string-name>, <string-name><surname>Tran</surname> <given-names>H</given-names></string-name>, <string-name><surname>Ramprasad</surname> <given-names>R</given-names></string-name></person-group>. <article-title>Polymer design using genetic algorithm and machine learning</article-title>. <source>Comput Mater Sci</source>. <year>2021</year>;<volume>186</volume>:<fpage>110067</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.commatsci.2020.110067</pub-id>.</mixed-citation></ref>
<ref id="ref-70"><label>[70]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Caramelli</surname> <given-names>D</given-names></string-name>, <string-name><surname>Granda</surname> <given-names>JM</given-names></string-name>, <string-name><surname>Mehr</surname> <given-names>SHM</given-names></string-name>, <string-name><surname>Cambi&#x00E9;</surname> <given-names>D</given-names></string-name>, <string-name><surname>Henson</surname> <given-names>AB</given-names></string-name>, <string-name><surname>Cronin</surname> <given-names>L</given-names></string-name></person-group>. <article-title>Discovering new chemistry with an autonomous robotic platform driven by a reactivity-seeking neural network</article-title>. <source>ACS Cent Sci</source>. <year>2021</year>;<volume>7</volume>(<issue>11</issue>):<fpage>1821</fpage>&#x2013;<lpage>30</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acscentsci.1c00435</pub-id>; <pub-id pub-id-type="pmid">34849401</pub-id></mixed-citation></ref>
<ref id="ref-71"><label>[71]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Madika</surname> <given-names>B</given-names></string-name>, <string-name><surname>Saha</surname> <given-names>A</given-names></string-name>, <string-name><surname>Kang</surname> <given-names>C</given-names></string-name>, <string-name><surname>Buyantogtokh</surname> <given-names>B</given-names></string-name>, <string-name><surname>Agar</surname> <given-names>J</given-names></string-name>, <string-name><surname>Voorhees</surname> <given-names>P</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Artificial intelligence for materials discovery, development, and optimization</article-title>. <source>ACS Nano</source>. <year>2025</year>;<volume>19</volume>(<issue>30</issue>):<fpage>27116</fpage>&#x2013;<lpage>58</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acsnano.5c04200</pub-id>; <pub-id pub-id-type="pmid">40711807</pub-id></mixed-citation></ref>
<ref id="ref-72"><label>[72]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yang</surname> <given-names>M</given-names></string-name>, <string-name><surname>Wan</surname> <given-names>C</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>L</given-names></string-name>, <string-name><surname>Li</surname> <given-names>X</given-names></string-name>, <string-name><surname>Pan</surname> <given-names>J</given-names></string-name>, <string-name><surname>Li</surname> <given-names>H</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>High-temperature polymer composite capacitors with high energy density designed via machine learning</article-title>. <source>Nat Energy</source>. <year>2025</year>;<volume>10</volume>(<issue>11</issue>):<fpage>1323</fpage>&#x2013;<lpage>33</lpage>. doi:<pub-id pub-id-type="doi">10.1038/s41560-025-01863-0</pub-id>.</mixed-citation></ref>
<ref id="ref-73"><label>[73]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Tao</surname> <given-names>L</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>G</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Machine learning discovery of high-temperature polymers</article-title>. <source>Patterns</source>. <year>2021</year>;<volume>2</volume>(<issue>4</issue>):<fpage>100225</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.patter.2021.100225</pub-id>; <pub-id pub-id-type="pmid">33982020</pub-id></mixed-citation></ref>
<ref id="ref-74"><label>[74]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Meng</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Lin</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>A</given-names></string-name></person-group>. <article-title>Prediction and explanation of properties in multicomponent polyurethane elastomers: integrating molecular dynamics and machine learning</article-title>. <source>Macromolecules</source>. <year>2024</year>;<volume>57</volume>(<issue>23</issue>):<fpage>10912</fpage>&#x2013;<lpage>25</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.macromol.4c02559</pub-id>.</mixed-citation></ref>
<ref id="ref-75"><label>[75]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Gormley</surname> <given-names>AJ</given-names></string-name>, <string-name><surname>Webb</surname> <given-names>MA</given-names></string-name></person-group>. <article-title>Machine learning in combinatorial polymer chemistry</article-title>. <source>Nat Rev Mater</source>. <year>2021</year>;<volume>6</volume>(<issue>8</issue>):<fpage>642</fpage>&#x2013;<lpage>4</lpage>. doi:<pub-id pub-id-type="doi">10.1038/s41578-021-00282-3</pub-id>; <pub-id pub-id-type="pmid">34394961</pub-id></mixed-citation></ref>
<ref id="ref-76"><label>[76]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Mysona</surname> <given-names>JA</given-names></string-name>, <string-name><surname>Nealey</surname> <given-names>PF</given-names></string-name>, <string-name><surname>de Pablo</surname> <given-names>JJ</given-names></string-name></person-group>. <article-title>Machine learning models and dimensionality reduction for prediction of polymer properties</article-title>. <source>Macromolecules</source>. <year>2024</year>;<volume>57</volume>(<issue>5</issue>):<fpage>1988</fpage>&#x2013;<lpage>97</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.macromol.3c02401</pub-id>.</mixed-citation></ref>
<ref id="ref-77"><label>[77]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Jiang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Fu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Bai</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Jiang</surname> <given-names>L</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>H</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>W</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Interpretable machine learning applications: a promising prospect of AI for materials</article-title>. <source>Adv Funct Mater</source>. <year>2025</year>;<volume>35</volume>(<issue>41</issue>):<fpage>2507734</fpage>. doi:<pub-id pub-id-type="doi">10.1002/adfm.202507734</pub-id>.</mixed-citation></ref>
<ref id="ref-78"><label>[78]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wagner</surname> <given-names>N</given-names></string-name>, <string-name><surname>Rondinelli</surname> <given-names>JM</given-names></string-name></person-group>. <article-title>Theory-guided machine learning in materials science</article-title>. <source>Front Mater</source>. <year>2016</year>;<volume>3</volume>:<fpage>28</fpage>. doi:<pub-id pub-id-type="doi">10.3389/fmats.2016.00028</pub-id>.</mixed-citation></ref>
<ref id="ref-79"><label>[79]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Fernandes</surname> <given-names>C</given-names></string-name></person-group>. <article-title>Scientific machine learning for polymeric materials</article-title>. <source>Polymers</source>. <year>2025</year>;<volume>17</volume>(<issue>16</issue>):<fpage>2222</fpage>. doi:<pub-id pub-id-type="doi">10.3390/polym17162222</pub-id>; <pub-id pub-id-type="pmid">40871169</pub-id></mixed-citation></ref>
<ref id="ref-80"><label>[80]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhong</surname> <given-names>X</given-names></string-name>, <string-name><surname>Gallagher</surname> <given-names>B</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>S</given-names></string-name>, <string-name><surname>Kailkhura</surname> <given-names>B</given-names></string-name>, <string-name><surname>Hiszpanski</surname> <given-names>A</given-names></string-name>, <string-name><surname>Han</surname> <given-names>TY</given-names></string-name></person-group>. <article-title>Explainable machine learning in materials science</article-title>. <source>npj Comput Mater</source>. <year>2022</year>;<volume>8</volume>:<fpage>204</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41524-022-00884-7</pub-id>.</mixed-citation></ref>
<ref id="ref-81"><label>[81]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Mikulskis</surname> <given-names>P</given-names></string-name>, <string-name><surname>Alexander</surname> <given-names>MR</given-names></string-name>, <string-name><surname>Winkler</surname> <given-names>DA</given-names></string-name></person-group>. <article-title>Toward interpretable machine learning models for materials discovery</article-title>. <source>Adv Intell Syst</source>. <year>2019</year>;<volume>1</volume>(<issue>8</issue>):<fpage>1900045</fpage>. doi:<pub-id pub-id-type="doi">10.1002/aisy.201900045</pub-id>.</mixed-citation></ref>
<ref id="ref-82"><label>[82]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ma</surname> <given-names>L</given-names></string-name>, <string-name><surname>Li</surname> <given-names>W</given-names></string-name>, <string-name><surname>Yuan</surname> <given-names>J</given-names></string-name>, <string-name><surname>Zhu</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wu</surname> <given-names>Y</given-names></string-name>, <string-name><surname>He</surname> <given-names>H</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Recent advances in machine learning-assisted design and development of polymer materials</article-title>. <source>Macromol Rapid Commun</source>. <year>2025</year>;<volume>135</volume>:<fpage>e00361</fpage>. doi:<pub-id pub-id-type="doi">10.1002/marc.202500361</pub-id>; <pub-id pub-id-type="pmid">40623086</pub-id></mixed-citation></ref>
<ref id="ref-83"><label>[83]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Yue</surname> <given-names>T</given-names></string-name>, <string-name><surname>Tao</surname> <given-names>L</given-names></string-name>, <string-name><surname>Varshney</surname> <given-names>V</given-names></string-name>, <string-name><surname>Li</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Benchmarking study of deep generative models for inverse polymer design</article-title>. <source>Digit Discov</source>. <year>2025</year>;<volume>4</volume>(<issue>4</issue>):<fpage>910</fpage>&#x2013;<lpage>26</lpage>. doi:<pub-id pub-id-type="doi">10.26434/chemrxiv-2024-gzq4r</pub-id>.</mixed-citation></ref>
<ref id="ref-84"><label>[84]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Liu</surname> <given-names>H</given-names></string-name>, <string-name><surname>Zhang</surname> <given-names>F</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>J</given-names></string-name></person-group>. <article-title>Machine learning-assisted prediction of polymer degradation behavior</article-title>. <source>Environ Sci Technol</source>. <year>2024</year>;<volume>58</volume>(<issue>10</issue>):<fpage>4215</fpage>&#x2013;<lpage>24</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.est.4c00123</pub-id>.</mixed-citation></ref>
<ref id="ref-85"><label>[85]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhou</surname> <given-names>R</given-names></string-name>, <string-name><surname>Bao</surname> <given-names>L</given-names></string-name>, <string-name><surname>Bu</surname> <given-names>W</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>F</given-names></string-name></person-group>. <article-title>Exploring high-performance viscosity index improver polymers via high-throughput molecular dynamics and explainable AI</article-title>. <source>npj Comput Mater</source>. <year>2025</year>;<volume>11</volume>:<fpage>52</fpage>. doi:<pub-id pub-id-type="doi">10.1038/s41524-025-01539-z</pub-id>.</mixed-citation></ref>
<ref id="ref-86"><label>[86]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zhang</surname> <given-names>X</given-names></string-name>, <string-name><surname>Sheng</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>X</given-names></string-name>, <string-name><surname>Yang</surname> <given-names>J</given-names></string-name>, <string-name><surname>Goddard</surname> <given-names>WA</given-names></string-name>, <string-name><surname>Ye</surname> <given-names>C</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Polymer-unit graph: advancing interpretability in graph neural network machine learning for organic polymer semiconductor materials</article-title>. <source>J Chem Theory Comput</source>. <year>2024</year>;<volume>20</volume>(<issue>7</issue>):<fpage>2908</fpage>&#x2013;<lpage>20</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.jctc.3c01385</pub-id>; <pub-id pub-id-type="pmid">38551455</pub-id></mixed-citation></ref>
<ref id="ref-87"><label>[87]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Parvez</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Mehedi</surname> <given-names>IM</given-names></string-name></person-group>. <article-title>High-accuracy polymer property detection via Pareto-optimized SMILES-based deep learning</article-title>. <source>Polymers</source>. <year>2025</year>;<volume>17</volume>(<issue>13</issue>):<fpage>1801</fpage>. doi:<pub-id pub-id-type="doi">10.3390/polym17131801</pub-id>; <pub-id pub-id-type="pmid">40647811</pub-id></mixed-citation></ref>
<ref id="ref-88"><label>[88]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Shen</surname> <given-names>ZH</given-names></string-name>, <string-name><surname>Liu</surname> <given-names>HX</given-names></string-name>, <string-name><surname>Shen</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Hu</surname> <given-names>JM</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>LQ</given-names></string-name>, <string-name><surname>Nan</surname> <given-names>CW</given-names></string-name></person-group>. <article-title>Machine learning in energy storage materials</article-title>. <source>Interdiscip Mater</source>. <year>2022</year>;<volume>1</volume>(<issue>2</issue>):<fpage>175</fpage>&#x2013;<lpage>95</lpage>. doi:<pub-id pub-id-type="doi">10.1002/idm2.12020</pub-id>.</mixed-citation></ref>
<ref id="ref-89"><label>[89]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Bai</surname> <given-names>H</given-names></string-name>, <string-name><surname>Chu</surname> <given-names>P</given-names></string-name>, <string-name><surname>Tsai</surname> <given-names>JY</given-names></string-name>, <string-name><surname>Wilson</surname> <given-names>N</given-names></string-name>, <string-name><surname>Qian</surname> <given-names>X</given-names></string-name>, <string-name><surname>Yan</surname> <given-names>Q</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Graph neural network for Hamiltonian-based material property prediction</article-title>. <source>Neural Comput Appl</source>. <year>2022</year>;<volume>34</volume>(<issue>6</issue>):<fpage>4625</fpage>&#x2013;<lpage>32</lpage>. doi:<pub-id pub-id-type="doi">10.1007/s00521-021-06616-0</pub-id>.</mixed-citation></ref>
<ref id="ref-90"><label>[90]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Lu</surname> <given-names>L</given-names></string-name>, <string-name><surname>Wang</surname> <given-names>Z</given-names></string-name>, <string-name><surname>Zhou</surname> <given-names>G</given-names></string-name>, <string-name><surname>Lu</surname> <given-names>Y</given-names></string-name></person-group>. <article-title>Integrating graph convolutional network to BiLSTM for enhanced polymer knot identification</article-title>. <source>Macromolecules</source>. <year>2024</year>;<volume>57</volume>(<issue>16</issue>):<fpage>7980</fpage>&#x2013;<lpage>9</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.macromol.4c00931</pub-id>.</mixed-citation></ref>
<ref id="ref-91"><label>[91]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Zheng</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Biswal</surname> <given-names>AK</given-names></string-name>, <string-name><surname>Guo</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Thakolkaran</surname> <given-names>P</given-names></string-name>, <string-name><surname>Kokane</surname> <given-names>Y</given-names></string-name>, <string-name><surname>Varshney</surname> <given-names>V</given-names></string-name>, <etal>et al</etal></person-group>. <article-title>Toward sustainable polymer design: a molecular dynamics-informed machine learning approach for vitrimers</article-title>. <source>Digit Discov</source>. <year>2025</year>;<volume>4</volume>(<issue>9</issue>):<fpage>2559</fpage>&#x2013;<lpage>69</lpage>. doi:<pub-id pub-id-type="doi">10.1039/d5dd00239g</pub-id>.</mixed-citation></ref>
<ref id="ref-92"><label>[92]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Takahashi</surname> <given-names>L</given-names></string-name>, <string-name><surname>Kuwahara</surname> <given-names>M</given-names></string-name>, <string-name><surname>Takahashi</surname> <given-names>K</given-names></string-name></person-group>. <article-title>AI and automation: democratizing automation and the evolution towards true AI-autonomous robotics</article-title>. <source>Chem Sci</source>. <year>2025</year>;<volume>16</volume>(<issue>35</issue>):<fpage>15769</fpage>&#x2013;<lpage>80</lpage>. doi:<pub-id pub-id-type="doi">10.1039/d5sc03183d</pub-id>; <pub-id pub-id-type="pmid">40800059</pub-id></mixed-citation></ref>
<ref id="ref-93"><label>[93]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Barakat</surname> <given-names>M</given-names></string-name>, <string-name><surname>Reda</surname> <given-names>H</given-names></string-name>, <string-name><surname>Harmandaris</surname> <given-names>V</given-names></string-name></person-group>. <article-title>A semi-continuum multiscale model of graphene-based polymer nanocomposites: mechanical characterization</article-title>. <source>Comput Mater Sci</source>. <year>2025</year>;<volume>257</volume>:<fpage>113968</fpage>. doi:<pub-id pub-id-type="doi">10.1016/j.commatsci.2025.113968</pub-id>.</mixed-citation></ref>
<ref id="ref-94"><label>[94]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Ricci</surname> <given-names>E</given-names></string-name>, <string-name><surname>Vergadou</surname> <given-names>N</given-names></string-name></person-group>. <article-title>Integrating machine learning in the coarse-grained molecular simulation of polymers</article-title>. <source>J Phys Chem B</source>. <year>2023</year>;<volume>127</volume>(<issue>11</issue>):<fpage>2302</fpage>&#x2013;<lpage>22</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.jpcb.2c06354</pub-id>; <pub-id pub-id-type="pmid">36888553</pub-id>.</mixed-citation></ref>
<ref id="ref-95"><label>[95]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Wang</surname> <given-names>R</given-names></string-name>, <string-name><surname>Sing</surname> <given-names>MK</given-names></string-name>, <string-name><surname>Avery</surname> <given-names>RK</given-names></string-name>, <string-name><surname>Souza</surname> <given-names>BS</given-names></string-name>, <string-name><surname>Kim</surname> <given-names>M</given-names></string-name>, <string-name><surname>Olsen</surname> <given-names>BD</given-names></string-name></person-group>. <article-title>Classical challenges in the physical chemistry of polymer networks and the design of new materials</article-title>. <source>Acc Chem Res</source>. <year>2016</year>;<volume>49</volume>(<issue>12</issue>):<fpage>2786</fpage>&#x2013;<lpage>95</lpage>. doi:<pub-id pub-id-type="doi">10.1021/acs.accounts.6b00454</pub-id>; <pub-id pub-id-type="pmid">27993006</pub-id>.</mixed-citation></ref>
<ref id="ref-96"><label>[96]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Xu</surname> <given-names>P</given-names></string-name>, <string-name><surname>Chen</surname> <given-names>H</given-names></string-name>, <string-name><surname>Li</surname> <given-names>M</given-names></string-name>, <string-name><surname>Lu</surname> <given-names>W</given-names></string-name></person-group>. <article-title>New opportunity: machine learning for polymer materials design and discovery</article-title>. <source>Adv Theory Simul</source>. <year>2022</year>;<volume>5</volume>(<issue>5</issue>):<fpage>2100565</fpage>. doi:<pub-id pub-id-type="doi">10.1002/adts.202100565</pub-id>.</mixed-citation></ref>
<ref id="ref-97"><label>[97]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><surname>Webb</surname> <given-names>MA</given-names></string-name>, <string-name><surname>Jackson</surname> <given-names>NE</given-names></string-name>, <string-name><surname>Gil</surname> <given-names>PS</given-names></string-name>, <string-name><surname>de Pablo</surname> <given-names>JJ</given-names></string-name></person-group>. <article-title>Targeted sequence design within the coarse-grained polymer genome</article-title>. <source>Sci Adv</source>. <year>2020</year>;<volume>6</volume>(<issue>43</issue>):<fpage>eabc6216</fpage>. doi:<pub-id pub-id-type="doi">10.1126/sciadv.abc6216</pub-id>; <pub-id pub-id-type="pmid">33087352</pub-id>.</mixed-citation></ref>
</ref-list>
</back></article>