As large numbers of molecular descriptors are applied in QSAR/QSPR studies, selecting a descriptor set that yields stable, predictive models has become a pressing bottleneck. Starting from 1664 descriptors computed for 63 organic compounds, a preliminary preselection followed by partial least squares (PLS) variable screening yielded 42 important descriptors. Forty-three randomly selected compounds were used as a training set to model permeability through polyethylene membranes, giving a model with good fitting ability and stability (A = 6, r² = 0.9647, RMSE = 0.213, q² = 0.8364, RMSV = 0.467). Predictions for the 20 external compounds not used in training showed good predictive ability (r_p² = 0.9306, RMSP = 0.326). PLS variable screening can thus quickly and effectively identify the descriptors most closely related to activity, allowing QSAR models with good stability and predictive ability to be built.
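The screening workflow described above can be illustrated with a minimal sketch. The abstract does not say which PLS-based criterion was used to rank descriptors, so this example assumes variable importance in projection (VIP) scores from a NIPALS PLS1 fit, with the common rule of thumb of keeping descriptors whose VIP exceeds 1. The function name `pls1_vip`, the synthetic data, and the choice of two components are all illustrative assumptions; only the 63-compound total and the 43/20 split are taken from the text.

```python
import math
import random


def pls1_vip(X, y, n_components):
    """VIP scores from a NIPALS PLS1 fit (hypothetical helper, not the paper's code)."""
    n, p = len(X), len(X[0])
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - col_mean[j] for j in range(p)] for row in X]  # centred X residuals
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]                                   # centred y residuals

    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance between E and f.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break  # nothing left in y to explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this component.
        for i in range(n):
            for j in range(p):
                E[i][j] -= t[i] * load[j]
            f[i] -= q * t[i]
        weights.append(w)
        explained.append(q * q * tt)  # y sum of squares explained by the component

    total = sum(explained)
    if total == 0.0:
        raise ValueError("y has no variance that the components can explain")
    # VIP_j = sqrt(p * sum_a SS_a * w_ja^2 / sum_a SS_a), with unit-norm weights.
    return [
        math.sqrt(p * sum(s * w[j] ** 2 for s, w in zip(explained, weights)) / total)
        for j in range(p)
    ]


# Synthetic stand-in data (NOT the paper's data): 63 "compounds", 10 "descriptors".
# Only descriptors 0 and 1 actually drive the response.
rng = random.Random(0)
n_compounds, n_descriptors = 63, 10
X = [[rng.gauss(0.0, 1.0) for _ in range(n_descriptors)] for _ in range(n_compounds)]
y = [2.0 * row[0] + 2.0 * row[1] + rng.gauss(0.0, 0.1) for row in X]

# Random split into 43 training and 20 external compounds, as in the abstract.
indices = list(range(n_compounds))
rng.shuffle(indices)
train_idx, test_idx = indices[:43], indices[43:]

# Screen descriptors on the training set only; two components is an arbitrary choice.
vip = pls1_vip([X[i] for i in train_idx], [y[i] for i in train_idx], n_components=2)
selected = [j for j, v in enumerate(vip) if v > 1.0]
print("VIP scores:", [round(v, 2) for v in vip])
print("selected descriptor indices:", selected)
```

Screening on the training compounds alone keeps the 20 external compounds untouched, so any later estimate of predictive ability on them is not biased by the selection step. With real data, the number of components would normally be chosen by cross-validation rather than fixed in advance.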