Sparsity is a prominent problem in applying collaborative filtering algorithms: when the set of user ratings of resources in a system is very sparse, the accuracy and coverage of the algorithm drop significantly. To address this problem, this paper analyzes the factors that affect similarity computation in resource-based collaborative filtering and proposes "resource relationship density" as a characteristic index for describing a collaborative filtering rating matrix. It analyzes and summarizes the effect of resource relationship density on typical resource-based collaborative filtering algorithms, and then proposes a virtual user filling algorithm. Experimental results show that the virtual user filling method effectively improves the accuracy and coverage of typical resource-based collaborative filtering algorithms on sparse data sets.
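To illustrate why sparsity hurts similarity computation between resources, the following is a minimal sketch, not the paper's method. It assumes that similarity between two resources is a cosine computed only over users who rated both, a common choice in item-based collaborative filtering. The function `linked_pair_fraction` is a hypothetical proxy introduced here for illustration; the abstract does not give the actual definition of "resource relationship density", and the rating data is made up.

```python
from itertools import combinations
from math import sqrt

# Toy data (made up for illustration): user -> {resource: rating}.
ratings = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 4, "B": 5},
    "u3": {"C": 2},
    "u4": {"D": 3},
}


def item_vectors(ratings):
    """Invert user -> resource ratings into resource -> {user: rating}."""
    items = {}
    for user, row in ratings.items():
        for item, score in row.items():
            items.setdefault(item, {})[user] = score
    return items


def cosine_on_corated(a, b):
    """Cosine similarity over users who rated both resources.

    Returns None when no user rated both, i.e. the similarity is undefined.
    """
    common = a.keys() & b.keys()
    if not common:
        return None
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)


def linked_pair_fraction(ratings):
    """Hypothetical proxy (an assumption, not the paper's definition):
    the share of resource pairs that have at least one co-rating user."""
    items = item_vectors(ratings)
    pairs = list(combinations(sorted(items), 2))
    if not pairs:
        return 0.0
    linked = sum(1 for x, y in pairs if items[x].keys() & items[y].keys())
    return linked / len(pairs)


items = item_vectors(ratings)
print(cosine_on_corated(items["A"], items["B"]))  # defined: two co-rating users
print(cosine_on_corated(items["A"], items["C"]))  # None: no co-rating users
print(linked_pair_fraction(ratings))              # 1 of 6 pairs is linked
```

In this toy matrix only one of the six resource pairs has any co-rating user, so most similarities are undefined and cannot contribute to predictions. This is the kind of gap that a filling strategy would aim to reduce.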