A total of 1262 structural parameters were collected for 23 amino acids, and principal component analysis of these data yielded a new amino acid topological descriptor score (SATD) vector. The SATD vector was used to characterize the structures of 125 peptides of different lengths, and quantitative sequence-mobility models (QSMM) of the peptides were built with support vector machine (SVM) and partial least squares (PLS) regression, respectively. The results showed that the SATD descriptor carries a large amount of information, is easy to use, and characterizes the structures of the 125 peptides well. Compared with PLS, SVM showed stronger fitting ability and good predictive ability in modeling and predicting electrophoretic mobility.
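The descriptor step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the descriptor table is random stand-in data, the three extra residue labels (`X1` to `X3`), the choice of five components, and the helper names `pca_scores` and `encode_peptide` are all hypothetical, and averaging the per-residue scores is only one possible way to map peptides of different lengths to a fixed-length vector (the abstract does not say how this was done).

```python
import numpy as np

# Stand-in data: 23 amino acids x 1262 structural parameters.
# The real table is not given in the abstract, so random numbers are used.
rng = np.random.default_rng(0)
AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY") + ["X1", "X2", "X3"]  # 3 placeholder labels
descriptors = rng.normal(size=(len(AMINO_ACIDS), 1262))


def pca_scores(X, n_components):
    """Return PCA score vectors (one row per sample) and explained-variance ratios."""
    # Autoscale each column; constant columns keep a unit divisor.
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0
    Z = (X - mean) / std
    # SVD of the standardized matrix: Z = U S Vt, so the scores are U * S.
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]


def encode_peptide(sequence, score_table):
    """Average per-residue score vectors (assumed encoding) into one fixed-length vector."""
    return np.mean([score_table[res] for res in sequence], axis=0)


# Per-residue score vectors, then fixed-length features for peptides of varying length.
scores, explained = pca_scores(descriptors, n_components=5)
score_table = dict(zip(AMINO_ACIDS, scores))
peptides = [["A", "G"], ["K", "L", "W", "D"], ["G", "G", "G"]]
features = np.array([encode_peptide(p, score_table) for p in peptides])
print(features.shape)  # (3, 5)
```

A feature matrix of this kind, paired with measured electrophoretic mobilities, could then be passed to any regression method; in scikit-learn, for example, `sklearn.svm.SVR` and `sklearn.cross_decomposition.PLSRegression` would be natural candidates for the SVM and PLS comparison.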