Non-negative matrix factorization (NMF) is a new method for feature extraction and data dimensionality reduction. This paper is the first to introduce NMF into the processing of near-infrared (NIR) spectral data, taking the identification of the Chinese medicinal herb Houttuynia cordata as an example. The collected dried Houttuynia cordata samples were pulverized, an appropriate amount was placed in a quartz sample cup, and the NIR spectrum of each sample was acquired by integrating-sphere diffuse reflectance. After multiplicative scatter correction of the NIR spectral data of each sample, the NMF of the NIR diffuse reflectance spectra of Houttuynia cordata was computed and studied. This allows the quality of Houttuynia cordata to be identified rapidly, and it shows that the NMF decomposition better reflects a parts-based (local) representation and better captures the local features of the data. In comparison with a traditional statistical method, principal component analysis (PCA), the paper discusses the non-uniqueness of NMF solutions and proposes a simple initialization algorithm based on singular value decomposition. It discusses the choice of the number of low-dimensional features r for NMF and selects the value of r by means of PCA. It also discusses the effect of the ordering of the samples in the matrix on the result, and shows by computation that the order in which the samples are arranged has no effect on identification. Based on an examination of the reasons for the strengths and weaknesses of NMF, the paper shows that NMF is a relatively ideal method of data analysis.
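The processing chain described above (scatter correction, choice of r by PCA, SVD-based initialization, NMF) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the paper's algorithm: the abstract does not specify the initialization, so the absolute values of the leading singular vectors are used here; the multiplicative update rules of Lee and Seung stand in for the unspecified NMF solver; and the 99% cumulative-variance threshold and all function names (`msc`, `choose_r_by_pca`, `svd_init`, `nmf`) are hypothetical. The data matrix V is assumed to hold one sample per column and to be non-negative.

```python
import numpy as np

def msc(spectra):
    """Multiplicative scatter correction; rows are samples, columns are wavelengths."""
    ref = spectra.mean(axis=0)                     # mean spectrum as reference
    corrected = np.empty(spectra.shape, dtype=float)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)   # fit s ~ slope * ref + intercept
        corrected[i] = (s - intercept) / slope
    return corrected

def choose_r_by_pca(V, threshold=0.99):
    """Number of principal components needed to reach the variance threshold (assumed rule)."""
    centered = V - V.mean(axis=1, keepdims=True)   # center each wavelength across samples
    sv = np.linalg.svd(centered, compute_uv=False)
    ratio = np.cumsum(sv ** 2) / np.sum(sv ** 2)
    return min(int(np.searchsorted(ratio, threshold)) + 1, len(sv))

def svd_init(V, r, eps=1e-6):
    """Assumed SVD-based start: absolute values of the leading singular vectors."""
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    root = np.sqrt(s[:r])
    W = np.abs(U[:, :r]) * root                    # (wavelengths, r)
    H = root[:, None] * np.abs(Vt[:r])             # (r, samples)
    return W + eps, H + eps                        # eps keeps entries strictly positive

def nmf(V, r, n_iter=200, eps=1e-9):
    """Factor V ~ W @ H with multiplicative updates (Frobenius-norm objective)."""
    W, H = svd_init(V, r)
    errors = []
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))   # reconstruction error per iteration
    return W, H, errors
```

A typical call would be `V = msc(spectra).T`, then `r = choose_r_by_pca(V)` and `W, H, errors = nmf(V, r)`. The columns of H are then the low-dimensional features of the samples. Because the start point is deterministic, repeated runs give the same factors, which is one way to sidestep the non-uniqueness of randomly initialized NMF.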