With the rapid growth of the Internet, microblogs play a very important role in information dissemination and are gradually evolving into a new kind of online news source. People have become accustomed to using microblog platforms to learn what their friends and family are doing and to follow what is happening in the world. However, because microblog platforms hold massive amounts of information, it is difficult to detect major news or breaking events manually as they unfold. Hot topic detection for microblogs has therefore become an active research area. Existing work, however, focuses mainly on topic identification and overlooks users' need for real-time results; the few methods aimed at real-time hot topic discovery rely mainly on statistical analysis of keywords, and both their timeliness and their accuracy leave room for improvement. We observe that microblog platforms gather thousands upon thousands of views and opinions, including discussions of social events and evaluations of products, which makes microblogs a very valuable source of opinion data. By analyzing how opinions and sentiment change in real time, we can better understand the trend of a related topic and thereby help users judge whether it is a popular hot topic. This paper proposes a real-time, non-parametric hot topic detection method that draws on the temporal variation of microblog sentiment. The method computes a sentiment time-series distribution by analyzing the sentiment polarity of microblog posts and the changes in its intensity, and uses these features to build a composite model for identifying and detecting hot microblog topics. Experiments were conducted on real-world datasets from Twitter, Sina Weibo, and other sources; the results show that the proposed method identifies hot topics faster while maintaining detection accuracy.
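The abstract gives no formulas, so the following is only a minimal Python sketch of how a sentiment time series for one topic might be built and scored, not the authors' model. It assumes each post already carries a polarity score in [-1, 1] from some sentiment classifier. The names `sentiment_time_series` and `burst_score`, the fixed window length, and the median/MAD score are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from statistics import median


def sentiment_time_series(posts, window):
    """Bucket posts into fixed-length time windows.

    posts: iterable of (timestamp_seconds, polarity), polarity in [-1, 1]
           (assumed to come from an external sentiment classifier).
    window: window length in seconds (an illustrative parameter).
    Returns a list of (window_index, positive_intensity, negative_intensity),
    one entry per window from the first to the last observed window.
    """
    buckets = defaultdict(lambda: [0.0, 0.0])
    for ts, polarity in posts:
        bucket = buckets[int(ts // window)]  # creates the window if missing
        if polarity > 0:
            bucket[0] += polarity            # accumulate positive intensity
        elif polarity < 0:
            bucket[1] += -polarity           # accumulate negative intensity
    if not buckets:
        return []
    first, last = min(buckets), max(buckets)
    # Windows with no posts are filled in with zero intensity.
    return [(i, buckets[i][0], buckets[i][1]) for i in range(first, last + 1)]


def burst_score(series, floor=1.0):
    """Score the latest window against earlier ones without assuming a
    parametric distribution: deviation from the historical median,
    scaled by the median absolute deviation (MAD).

    floor: lower bound on the MAD to avoid division by zero
           (an arbitrary illustrative value).
    """
    totals = [pos + neg for _, pos, neg in series]
    if len(totals) < 2:
        return 0.0
    history, latest = totals[:-1], totals[-1]
    med = median(history)
    mad = median([abs(x - med) for x in history])
    return (latest - med) / max(mad, floor)


if __name__ == "__main__":
    # Toy data: (timestamp in seconds, polarity score).
    posts = [(0, 0.5), (10, -0.5), (70, 0.2),
             (130, 1.0), (140, 1.0), (150, -1.0)]
    series = sentiment_time_series(posts, window=60)
    print(series)
    print(burst_score(series))
```

In this sketch a topic would be flagged as a candidate hot topic when `burst_score` exceeds some threshold. How the paper actually combines polarity and intensity features into its composite model is not stated in the abstract.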