This paper points out that Web mining grew out of data mining and is an integrated technology that draws on Web technology, data mining, information science, and other fields. It introduces the concept and classification of Web mining, along with algorithms for mining the link structure between Web pages, such as HITS and PageRank, and proposes an information retrieval method based on feature extraction from sample patterns. Finally, it analyzes the problems facing Web link mining and the trends in future research.