Scaling GPA* for complex protein folding pathway simulations

Patel, Foram
University Honors College, Middle Tennessee State University
Better tools for modeling protein folding pathways are crucial to developing more potent treatments for disorders caused by protein misfolding. The fast-folding streptococcal protein G (1GB1), which contains both α-helical and β-sheet structure, can be used to assess models of protein folding. Pathway prediction is often computationally expensive and time-consuming, so current research focuses on accelerating molecular dynamics (MD) simulations. To keep proteins from becoming trapped in local energy minima, past methods have imposed an unnatural bias on the potential and kinetic energies of the simulation environment. Finding unbiased methods for MD simulations was an open problem, which Syzonenko & Phillips (2020) addressed by combining the A* algorithm with MD simulations. The current implementation produces so many files that storage becomes a bottleneck, which prevents large-scale use. A viable alternative is to replace auxiliary file storage on disk with a key-value data structure, which would be less burdensome on the file system. Instead of invoking GROMACS commands through OS system calls, the MDAnalysis library, which reads GROMACS file formats, may be used for simulation commands and storing coordinates. Once validated on the complex and fast-folding 1GB1 protein, the approach may be applied to even larger α-β proteins.
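As an illustration of the key-value idea described in the abstract, the following is a minimal sketch, not the thesis implementation. It assumes that each search node's atom coordinates can be stored under a string key in a single SQLite table instead of in one auxiliary file per node. The class name `CoordinateStore`, the key format, and the table layout are all hypothetical choices made for this example, and only the Python standard library is used.

```python
import sqlite3
import struct
from typing import List, Optional, Tuple

Coord = Tuple[float, float, float]


class CoordinateStore:
    """Hypothetical key-value store mapping a node key to atom coordinates."""

    def __init__(self, path: str = ":memory:") -> None:
        # One database (in memory or a single file) replaces many small files.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS frames (key TEXT PRIMARY KEY, value BLOB)"
        )

    def put(self, key: str, coords: List[Coord]) -> None:
        # Flatten (x, y, z) triples and pack them as little-endian doubles.
        flat = [component for atom in coords for component in atom]
        blob = struct.pack(f"<{len(flat)}d", *flat)
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT OR REPLACE INTO frames (key, value) VALUES (?, ?)",
                (key, blob),
            )

    def get(self, key: str) -> Optional[List[Coord]]:
        row = self._db.execute(
            "SELECT value FROM frames WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        # Each double occupies 8 bytes; regroup the values into triples.
        flat = struct.unpack(f"<{len(row[0]) // 8}d", row[0])
        return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]

    def __len__(self) -> int:
        return self._db.execute("SELECT COUNT(*) FROM frames").fetchone()[0]


if __name__ == "__main__":
    store = CoordinateStore()
    # "node-0" is an illustrative key; the coordinates are made-up values.
    store.put("node-0", [(0.0, 1.5, 2.25), (3.0, 4.0, 5.0)])
    print(store.get("node-0"))
    print(len(store))
```

Keeping every node in one table means the file system sees a single file (or none, for an in-memory database) regardless of how many nodes the search expands. Whether this is fast enough at scale would need to be measured.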