Machine learning has proven to be an extremely valuable tool for simulations with ab initio accuracy at a computational cost between that of classical interatomic potentials and density-functional approximations. Similar efficiency can otherwise only be achieved by semi-empirical (SE) methods such as density-functional tight-binding (DFTB). However, shortcomings remain in the pairwise DFTB repulsive component and in the treatment of long-range interactions (e.g., electrostatics and van der Waals) in non-covalent systems. Therefore, building on our previous work (DFTB+NNrep) [1], we have developed a scalable methodology that corrects the overall performance of DFTB through many-body interatomic potentials built on physically inspired, state-of-the-art equivariant neural networks (ENNs) such as SpookyNet and MACE [2]. Moreover, a many-body dispersion treatment is applied to describe van der Waals interactions, which are crucial for investigating larger, more flexible molecules and molecular dimers. Our many-body potentials are trained to fit PBE0-level data for single molecules from the QM7-X dataset [3] and molecular dimers from the S66x8 dataset. First, the ML-corrected DFTB models show an improvement in capturing intramolecular interactions, as illustrated by the prediction of rotational energy profiles and conformational landscapes for organic molecules that are larger and more flexible than those in the training set. Furthermore, depending on the physical model used in the ENN, the resulting ML model can accurately predict the dissociation curves of molecular dimers as well as the interaction energies of large molecular clusters extracted from the X23 molecular crystals dataset. Hence, our new ML-corrected DFTB models combine scalability and generalisability with improved accuracy, enabling the efficient investigation of the physicochemical properties of large covalent and non-covalent molecular complexes.
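One way to read the correction scheme described above is as a Delta-learning setup: the ML potential is fitted to whatever remains of the reference energy once the DFTB and dispersion contributions are removed, and the three terms are summed again at prediction time. The following is a minimal sketch under that assumption only. The abstract does not state the exact energy decomposition, so the names `EnergyTerms`, `correction_target`, and `corrected_energy`, as well as the choice to subtract the dispersion term from the reference, are hypothetical and purely illustrative.

```python
from dataclasses import dataclass


@dataclass
class EnergyTerms:
    """Energies for one structure, all in the same unit (e.g. eV).

    Field names are illustrative assumptions, not taken from the source.
    """
    e_ref: float        # reference energy (PBE0-level data in the text)
    e_dftb_elec: float  # DFTB energy without the pairwise repulsive term (assumed split)
    e_mbd: float        # many-body dispersion energy


def correction_target(terms: EnergyTerms) -> float:
    """Residual that the ML potential would be trained to reproduce.

    Assumed Delta-learning form: reference minus DFTB minus dispersion.
    """
    return terms.e_ref - terms.e_dftb_elec - terms.e_mbd


def corrected_energy(e_dftb_elec: float, e_ml: float, e_mbd: float) -> float:
    """Total ML-corrected energy as a plain sum of the three contributions."""
    return e_dftb_elec + e_ml + e_mbd


# Toy numbers, chosen only to show that the decomposition is self-consistent.
sample = EnergyTerms(e_ref=-10.0, e_dftb_elec=-9.2, e_mbd=-0.3)
target = correction_target(sample)
total = corrected_energy(sample.e_dftb_elec, target, sample.e_mbd)
```

With a perfectly fitted ML term, `total` recovers the reference energy by construction. In practice the ML term would come from a trained ENN rather than from the residual itself.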