To identify the small number of abnormal samples in network traffic more effectively, this paper presents an abnormal traffic classification method based on improved extremely randomized trees. The method first computes the information gain ratio of each feature in the data to obtain a lower-dimensional feature set. It then trains the classification model with a randomized training scheme: some base classifiers are trained on all samples, while the others are trained on resampled data. For the base classifiers trained on resampled data, a weighted counting method modifies their final voting rule. Experimental results show that the proposed method achieves an accuracy of nearly 99.56% on the NSL-KDD dataset and, compared with other ensemble classification algorithms, obtains better classification results on classes with few samples.
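The feature-selection and voting steps described in the abstract can be illustrated with a minimal sketch. It assumes discrete feature values and uses the standard definition of the information gain ratio (information gain divided by the split information of the feature). The function names, the selection threshold, and the per-classifier weights are hypothetical, since the abstract does not specify them; this is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain ratio of one discrete feature with respect to the labels."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy(feature_values)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return (entropy(labels) - conditional) / split_info


def select_features(columns, labels, threshold):
    """Keep feature names whose gain ratio meets a threshold (hypothetical cutoff)."""
    return [name for name, values in columns.items()
            if gain_ratio(values, labels) >= threshold]


def weighted_vote(predictions, weights):
    """Combine base-classifier predictions using one weight per classifier.

    The weights are an assumption: the abstract only states that classifiers
    trained on resampled data vote under a weighted rule.
    """
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)
```

For example, `select_features({"proto": [...], "flag": [...]}, labels, threshold=0.5)` would return the subset of feature names to keep, and `weighted_vote` would be called once per sample with the predictions of all base classifiers.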