Protein Structure Prediction Using Robust Principal Component Analysis and Support Vector Machine
Bioinformatics exists to deepen our understanding of biological processes. Protein structure prediction is one of the major challenges in structural bioinformatics. With prior knowledge of the structure, the quality of secondary structure prediction, tertiary structure prediction, and function prediction from the amino acid sequence increases significantly. Recently, the gap between the number of proteins with a known sequence and the number with a known structure has widened dramatically. It is therefore essential to understand protein structure so that further functional analysis becomes easier. This research applies the Robust Principal Component Analysis (RPCA) algorithm to extract the essential features from the original high-dimensional input vectors, and then classifies the extracted features with a Support Vector Machine (SVM) using an RBF kernel. The proposed method obtains an accuracy of 84.41% on the training dataset and 89.09% on the testing dataset. The result is then compared with the same method using PCA as the feature extraction step. The prediction is assessed by analyzing the accuracy and the number of principal components selected. The results show that the combination of RPCA and SVM produces a high-quality classification of protein structure.
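The feature-extraction step described above can be illustrated with a minimal sketch. The abstract does not say which RPCA formulation was used, so this example assumes principal component pursuit, which splits the data matrix into a low-rank part and a sparse part, solved with a simple augmented Lagrangian iteration. The function names (`rpca_pcp`, `rpca_features`), the default parameters, and the choice to take principal directions from the low-rank part are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def soft_threshold(X, tau):
    """Elementwise shrinkage: pull every entry toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def rpca_pcp(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split M into low-rank L and sparse S (assumed PCP formulation).

    Approximately minimizes ||L||_* + lam * ||S||_1 subject to L + S = M.
    M must contain at least one nonzero entry.
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))          # common default weight
    if mu is None:
        mu = m * n / (4.0 * np.abs(M).sum())    # common default step size
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                        # Lagrange multipliers
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        # Low-rank update: shrink the singular values.
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft_threshold(s, 1.0 / mu)) @ Vt
        # Sparse update: shrink the individual entries.
        S = soft_threshold(M - L + Y / mu, lam / mu)
        R = M - L - S                           # constraint residual
        Y += mu * R
        if np.linalg.norm(R) <= tol * norm_M:
            break
    return L, S


def rpca_features(X, n_components):
    """Project samples (rows of X) onto the top principal directions
    of the low-rank part of X. This projection rule is an assumption."""
    X = np.asarray(X, dtype=float)
    L, _ = rpca_pcp(X)
    mean = L.mean(axis=0)
    _, _, Vt = np.linalg.svd(L - mean, full_matrices=False)
    components = Vt[:n_components]
    return (X - mean) @ components.T, mean, components


# Synthetic stand-in data, not the protein dataset from the study.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((60, 3)) @ rng.standard_normal((3, 20))
Z_demo, mean_demo, comps_demo = rpca_features(X_demo, n_components=3)
```

Held-out samples would be projected the same way, as `(X_test - mean) @ components.T`, and the resulting features passed to an RBF-kernel classifier. scikit-learn's `SVC(kernel="rbf")` is one option, though the abstract does not name the SVM implementation used.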