Unsupervised Learning for Tertiary Structure Prediction of Protein Molecules: Systematic Review

EasyChair Preprint 15565
13 pages • Date: December 12, 2024

Abstract

Tertiary structures of molecules represent high-dimensional data containing spatial information of hundreds (even thousands) of atoms. Unsupervised learning techniques can be applied to such spatial data to uncover hidden organizations that can be subjected to further evaluation. Such techniques have already been employed in a number of relevant applications, for example tracking conformational changes in a set of structures, detecting biologically active tertiary structures among computed structures of proteins, and analyzing molecular dynamics simulations of peptides. This paper presents a comprehensive review of clustering techniques for tertiary (3D) molecular structure data, focusing on protein molecules. The article systematically organizes and analyzes the existing approaches in terms of data representation, methodology, proximity measure, and evaluation metric. It also highlights key open challenges and proposes future research directions to advance this domain.

Keyphrases: Clustering, protein tertiary structure, proximity measure, unsupervised learning