Time-series clustering of information propagation in social networks is a very effective way to study its patterns. To date, related work, particularly time-series clustering studies of Chinese social networks, has not gone deep enough. This paper makes targeted improvements to the K-SC time-series clustering algorithm: the proposed T-SC algorithm borrows the idea of agglomerative hierarchical clustering to solve the difficulty of setting the number of clusters. Extensive measurement and analysis were carried out on Renren, Tencent Weibo, and Baidu Tieba, three highly representative Chinese social networks, and the T-SC algorithm was used to cluster the measured data. The study reveals propagation patterns that are typical of each network yet differ from one another. Video sharing on Renren is clearly periodic, and within each period the sharing follows one mainstream pattern that closely tracks how the number of online Renren users varies over the course of a day. Retweet propagation on Tencent Weibo is bursty: the vast majority of retweets occur within 48 hours of posting, and the mainstream pattern is heavy propagation right after posting followed by rapid die-off. Posts on Baidu Tieba have long lifetimes, but no single propagation pattern dominates. As a novel contribution, this paper applies the clustering results to the prediction of information propagation: from a known propagation time series, it obtains a cluster-level prediction of future propagation behavior, offering a new approach to the difficult problem of propagation prediction.
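The abstract does not spell out how T-SC works internally, so the following is only a rough sketch of the general idea it describes, not the authors' algorithm. It assumes a scale- and shift-invariant distance of the kind K-SC uses, and it replaces a fixed cluster count with agglomerative merging that stops once the closest pair of clusters is farther apart than a threshold. The function names, the average-linkage rule, the symmetrised distance, and the `threshold` and `max_shift` parameters are all hypothetical choices made for illustration.

```python
import numpy as np


def shift(y, q):
    """Shift y by q time steps (positive = later), padding with zeros."""
    out = np.zeros_like(y)
    if q > 0:
        out[q:] = y[:-q]
    elif q < 0:
        out[:q] = y[-q:]
    else:
        out[:] = y
    return out


def ksc_distance(x, y, max_shift=2):
    """Scale- and shift-invariant distance in the style of K-SC.

    For each integer shift q, the best scale is alpha = <x, y_q> / <y_q, y_q>;
    the result is the minimum over q of ||x - alpha * y_q|| / ||x||.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    norm_x = np.linalg.norm(x)
    if norm_x == 0.0:
        # Assumption: an all-zero x matches only an all-zero y.
        return 0.0 if not y.any() else 1.0
    best = 1.0  # alpha = 0 always yields ||x|| / ||x|| = 1
    for q in range(-max_shift, max_shift + 1):
        y_q = shift(y, q)
        denom = y_q @ y_q
        if denom == 0.0:
            continue
        alpha = (x @ y_q) / denom
        best = min(best, np.linalg.norm(x - alpha * y_q) / norm_x)
    return best


def threshold_agglomerative(series, threshold=0.3, max_shift=2):
    """Merge clusters bottom-up until the closest pair exceeds `threshold`.

    Assumptions (not from the paper): average linkage over a symmetrised
    distance matrix. Returns a list of clusters, each a list of indices.
    """
    n = len(series)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # The distance is normalised by ||x||, so symmetrise it.
            dij = max(ksc_distance(series[i], series[j], max_shift),
                      ksc_distance(series[j], series[i], max_shift))
            d[i, j] = d[j, i] = dij

    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        best_pair, best_link = None, np.inf
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                link = d[np.ix_(clusters[a], clusters[b])].mean()
                if link < best_link:
                    best_pair, best_link = (a, b), link
        if best_link > threshold:
            break  # no remaining pair is similar enough to merge
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

With this stopping rule the number of clusters falls out of the data rather than being set in advance: series that share a shape up to scaling and a small time shift end up together, while differently shaped series stay apart. The threshold itself still has to be chosen, and the value used here is arbitrary.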