Advancing machine learning for organic material simulations with quantum accuracy


AI4AM2025 | event contribution
Link to conference: https://ai4am.net/2025/index.php
April 8, 2025 | San Sebastian, Spain

The rising demand for sustainable solutions to technological and societal challenges has driven significant research and development efforts to integrate machine learning (ML) techniques into computational physics and chemistry. As ML becomes more prevalent in interdisciplinary research, the amount of comprehensive quantum-mechanical (QM) property data generated to train robust predictive models has increased significantly in recent years. Recently, we introduced high-fidelity property data at the level of nonempirical hybrid density-functional theory (DFT) with a many-body treatment of van der Waals (vdW) dispersion interactions (i.e., PBE0+MBD) for both small and large drug-like molecules in equilibrium and nonequilibrium states. These datasets have been instrumental in advancing QM-based ML interatomic potentials (e.g., SO3LR) and enhancing semiempirical methods (e.g., third-order density-functional tight-binding, DFTB3), enabling accurate (bio)molecular simulations. In this presentation, we will discuss our recent efforts to improve the transferability and generalizability of the ML-corrected DFTB3 method. Within the DFTB method, the pairwise repulsive component has certain shortcomings, which we address by training a many-body repulsive potential using neural networks (NNs). Indeed, we have demonstrated that equivariant NNs (e.g., SpookyNet and MACE) significantly enhance the accuracy and scalability of ML-based many-body repulsive potentials trained on energies and forces of small organic systems and molecular dimers. The resulting framework, EN4TB, facilitates the calculation of the energetic and structural properties of large drug-like molecules and molecular dimers at a higher level of theory such as PBE0+MBD. Preliminary results are shown in Figure 1. Additionally, we have extended this approach to investigate the structural and thermodynamic properties of potential candidates for organic electrodes in Li-battery applications. We compare our results with those obtained by ML force fields trained on full DFT reference data. Overall, EN4TB highlights the benefits of integrating ML with semiempirical methods to achieve both high accuracy and computational efficiency, thereby paving the way for diverse applications in organic material simulations. See the EN4TB GitHub repository for examples of how to use our approach.
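As a rough illustration of the training setup described in the abstract, the Python sketch below assumes that the many-body repulsive potential is fitted to the residual between the reference energies and forces (e.g., PBE0+MBD) and the electronic part of DFTB3. The abstract does not spell out this construction, so treat it as an assumption. The function names, array shapes, and loss weights are hypothetical and are not taken from the EN4TB code.

```python
import numpy as np


def repulsive_targets(e_ref, f_ref, e_elec, f_elec):
    """Residual energies and forces that a learned repulsive term would fit.

    Assumption (not stated in the abstract): total = electronic DFTB part
    + repulsive part, so the target is reference minus electronic part.
    Energies have shape (n_structures,), forces (n_structures, n_atoms, 3).
    """
    e_rep = np.asarray(e_ref, dtype=float) - np.asarray(e_elec, dtype=float)
    f_rep = np.asarray(f_ref, dtype=float) - np.asarray(f_elec, dtype=float)
    return e_rep, f_rep


def energy_force_loss(e_pred, f_pred, e_target, f_target,
                      w_energy=1.0, w_force=1.0):
    """Weighted sum of mean squared errors on energies and forces.

    The weights are hypothetical; in practice they are tuned per dataset.
    """
    e_term = np.mean((np.asarray(e_pred) - np.asarray(e_target)) ** 2)
    f_term = np.mean((np.asarray(f_pred) - np.asarray(f_target)) ** 2)
    return w_energy * e_term + w_force * f_term


if __name__ == "__main__":
    # Toy numbers only; these are not data from the presentation.
    e_ref = [-10.0, -12.0]
    e_elec = [-10.5, -12.25]
    f_ref = np.zeros((2, 3, 3))
    f_elec = np.ones((2, 3, 3))
    e_rep, f_rep = repulsive_targets(e_ref, f_ref, e_elec, f_elec)
    # A model that predicts zero everywhere gives a nonzero loss.
    print(energy_force_loss(np.zeros(2), np.zeros((2, 3, 3)), e_rep, f_rep))
```

In an actual workflow, the predictions would come from an equivariant network such as SpookyNet or MACE rather than from the placeholder arrays used here.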


Presenter

Authors

Related groups
