Doloszeski, E. (2024). Exploring phosphine ligation states through QM/MM simulations with equivariant graph neural networks [Diploma Thesis, Technische Universität Wien; ETH Zürich]. reposiTUm. https://doi.org/10.34726/hss.2024.120246
Machine-learned force fields; QM-MM; nickel-benzaldehyde
en
Abstract:
To validate a previously developed equivariant graph neural network (GNN) featuring anisotropic message passing, the ligation states of phosphine ligands in transition metal complexes were investigated. These ligands play a key role in cross-coupling reactions. By integrating anisotropic states inspired by the Cartesian multipole formalism, the network captures directional information. The neural network was successfully used to replace the quantum mechanical calculations in quantum mechanics/molecular mechanics (QM/MM) molecular dynamics (MD) simulations of 20 distinct phosphine ligands attached to a nickel-benzaldehyde complex. Substituting the ML model for the DFT method yielded a speed-up factor of 10^5 for the largest complex in the training data, which contains the ligand CataCXiumA. The model, with approximately 600 000 parameters, achieved a mean absolute error (MAE) for energies of 2.2 kJ/mol when trained on 42 000 data points, comprising coordinates and charges, to reproduce forces, energies and molecular multipole information. The model showed good transferability to four ligands not present in the training data set, with MAEs for energies between 2.3 and 6.2 kJ/mol. Chemical accuracy was achieved for three of these ligands. For all complexes, ML/MM MD trajectories spanning several hundred picoseconds could be produced. Prospective simulations indicate trends across all 13 ligands that are comparable to the experimental data of Newman-Stonebraker et al. For two complexes, (PteroPhos)2Ni(benzaldehyde) and (PCy3)2Ni(benzaldehyde), free energy profiles obtained from prospective simulations using umbrella sampling correctly predicted the experimentally observed ligation state, when transferred to the monoligated equivalent.
Remarkably, the largest system, PteroPhos, comprising 385 QM atoms and 4 000 MM atoms, was accurately predicted despite its size and its absence from the training and validation sets, demonstrating the scalability of the method. This new method allows the description of systems that were previously difficult or impossible to describe because their size and complexity prevented the use of more established methods such as pure QM calculations or the force field formalism.