email: firstname.lastname@example.org
The benchmark results shown here are for pure torchani inference; the MD interface (e.g. Amber) is still WIP.
Benchmark results on RTX 2080 Ti
Original:
Some optimization details: the timings below are for a system of 10k atoms (a different PDB file from the one above), running on a GTX 1080 and averaged over 200 iterations.
With the original ensemble, the NN is CPU-bound for small systems, because each layer is launched as a separate kernel: 8 models * 7 networks (H, C, N, O, F, S, Cl) * 7 layers (4 Linear + 3 CELU) = 392 kernel calls (doubled if the backward pass is also counted).
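The kernel-call count above can be reproduced with a few lines of arithmetic. This is only a sketch: the function and constant names are hypothetical, it assumes one kernel launch per layer as the estimate above does, and it assumes the backward pass launches roughly as many kernels as the forward pass. Actual launch counts depend on how PyTorch dispatches each operation.

```python
# Rough kernel-launch estimate for the original ensemble.
# All names here are illustrative; they are not part of torchani.

NUM_MODELS = 8                                    # models in the ensemble
ELEMENTS = ["H", "C", "N", "O", "F", "S", "Cl"]   # one network per element
LINEAR_LAYERS = 4
CELU_LAYERS = 3


def forward_kernel_calls(num_models=NUM_MODELS,
                         num_networks=len(ELEMENTS),
                         layers_per_network=LINEAR_LAYERS + CELU_LAYERS):
    """Estimate forward-pass launches, assuming one kernel per layer."""
    return num_models * num_networks * layers_per_network


forward = forward_kernel_calls()
# Assumption: the backward pass costs about as many launches as the forward pass.
forward_and_backward = 2 * forward

print(forward)               # 392
print(forward_and_backward)  # 784
```

For a small system each of these kernels does very little work, so the time spent launching them from the CPU can outweigh the GPU compute itself.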
Infer Model (ON) + MNP (OFF):
Infer Model (ON) + MNP (ON):