Automatic malware classification and new malware detection using machine learning

Source: Frontiers of Information Technology & Electronic Engineering (English edition)
The explosive growth of malware variants poses a major threat to information security. Traditional signature-based anti-virus systems fail to classify unknown malware into the corresponding families and to detect new kinds of malware programs. We therefore propose a machine learning based malware analysis system composed of three modules: data processing, decision making, and new malware detection. The data processing module extracts malware features from gray-scale images, opcode n-grams, and import functions. The decision-making module uses these features to classify the malware and to identify suspicious malware. Finally, the detection module uses the shared nearest neighbor (SNN) clustering algorithm to discover new malware families. Our approach is evaluated on more than 20 000 malware instances collected by Kingsoft, ESET NOD32, and Anubis. The results show that our system can effectively classify unknown malware with a best accuracy of 98.9% and successfully detect 86.7% of the new malware.
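To make two of the ingredients named in the abstract concrete, the following is a minimal Python sketch of opcode n-gram counting and of a shared-nearest-neighbor similarity. It is an illustration under stated assumptions, not the paper's implementation: the function names (`opcode_ngrams`, `k_nearest`, `snn_similarity`), the choice of n and k, and the mutual-neighbor form of the SNN similarity are all assumptions, since the abstract does not specify them.

```python
from collections import Counter


def opcode_ngrams(opcodes, n=2):
    """Count contiguous opcode n-grams in one disassembled sample.

    `opcodes` is a list of mnemonic strings; the value of n is an assumption.
    """
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


def k_nearest(distances, k):
    """For each point, return the set of indices of its k nearest other points.

    `distances` is a square matrix (list of lists) of pairwise distances.
    """
    neighbors = []
    for i, row in enumerate(distances):
        others = (j for j in range(len(row)) if j != i)
        order = sorted(others, key=lambda j: row[j])
        neighbors.append(set(order[:k]))
    return neighbors


def snn_similarity(neighbors, i, j):
    """Shared-nearest-neighbor similarity between points i and j.

    Assumed definition: the number of neighbors the two points share,
    counted only when each point is in the other's neighbor set.
    """
    if i not in neighbors[j] or j not in neighbors[i]:
        return 0
    return len(neighbors[i] & neighbors[j])


# Hypothetical toy data: one opcode trace and four points on a line.
trace = ["push", "mov", "push", "mov", "call"]
bigrams = opcode_ngrams(trace, n=2)

positions = [0, 1, 2, 10]
distances = [[abs(a - b) for b in positions] for a in positions]
neighbors = k_nearest(distances, k=2)
```

In this toy setup the isolated point at position 10 shares no mutual neighbors with the others, so its SNN similarity to them is zero. A sample that behaves this way relative to every known family is the kind of candidate an SNN-based detector could flag as belonging to a new family.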