A semi-supervised nearest-neighbor classification algorithm is designed that uses both labeled points and pairwise constraint information. To address the problem that some data points may not be assignable to any class label, a ratio sorting method is proposed to reduce the number of conflict points. In addition, an active learning strategy based on Citation-kNN scoring is adopted to find valuable supervisory information: it obtains the labels of points that are inconsistent with their surrounding data points, which improves the effect of semi-supervised learning. Experimental results show that the proposed learning strategy improves the clustering performance of the algorithm, and that its CRI index is better than those of the COP-kmeans and CCL algorithms.
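To make the notion of a conflict point concrete, the following is a minimal sketch of nearest-neighbor labeling under pairwise constraints. It is not the algorithm from the paper: the function name `constrained_nn`, the greedy processing order, and the representation of constraints as must-link and cannot-link index pairs are all assumptions made for illustration, and the ratio sorting and Citation-kNN scoring steps are not modeled.

```python
import math

def constrained_nn(points, seeds, must_link=(), cannot_link=()):
    """Illustrative sketch (hypothetical helper, not the paper's method).

    points: list of coordinate tuples
    seeds: dict mapping point index -> known class label
    must_link / cannot_link: iterables of (i, j) index pairs
    Returns (assigned, conflicts): labels found, and indices left unlabeled.
    """
    assigned = dict(seeds)
    classes = set(seeds.values())
    conflicts = []
    pending = set(range(len(points))) - set(assigned)

    def partner(pair, u):
        # Return the other endpoint of a constraint involving u, else None.
        a, b = pair
        return b if a == u else a if b == u else None

    while pending:
        # Greedy order: take the pending point closest to any labeled point.
        u = min(pending, key=lambda i: min(
            math.dist(points[i], points[j]) for j in assigned))
        pending.remove(u)

        allowed = set(classes)
        for pair in cannot_link:
            other = partner(pair, u)
            if other in assigned:
                allowed.discard(assigned[other])   # must differ from partner
        for pair in must_link:
            other = partner(pair, u)
            if other in assigned:
                allowed &= {assigned[other]}       # must match partner

        candidates = [j for j in assigned if assigned[j] in allowed]
        if not candidates:
            conflicts.append(u)                    # no label satisfies constraints
            continue
        nearest = min(candidates, key=lambda j: math.dist(points[u], points[j]))
        assigned[u] = assigned[nearest]
    return assigned, conflicts
```

In this sketch, a point whose constraints rule out every known class is returned in `conflicts` rather than being forced into a class. That is the kind of unassignable point the abstract says the ratio sorting method aims to make rarer.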