The naive Bayes classifier (NB) is widely used because of its simple structure and computational efficiency, but it cannot make full use of the dependencies among attributes, which limits its performance. The hidden naive Bayes classifier (HNB) addresses this by introducing a hidden parent node for each attribute; the hidden parent combines the dependencies from all the other attributes, so that inter-attribute dependencies are put to use. HNB, however, ignores the dependency between an attribute and pairs of attributes. Building on HNB, this paper proposes an improved algorithm, double hidden naive Bayes (DHNB), which makes full use of the dependency between attribute pairs and the attribute. It also proposes a new way of defining the threshold: the threshold is chosen to maximize the ratio of classification accuracy to time complexity, which eases the conflict between the algorithm's time complexity and its classification accuracy. Simulation experiments on UCI data sets show that the classification performance of DHNB is better than that of HNB and NB, and that the method has good applicability.
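The threshold rule described in the abstract (choose the threshold that maximizes the ratio of classification accuracy to time complexity) can be illustrated with a minimal sketch. The abstract does not say how the threshold is used inside DHNB or how time complexity is measured, so everything here is an assumption: the `Trial` record, the `select_threshold` function, and the use of a single positive `cost` number as the time-complexity measure are hypothetical names chosen for illustration, and the numbers in the usage lines are made up.

```python
from typing import Iterable, NamedTuple


class Trial(NamedTuple):
    """One evaluated candidate threshold (hypothetical record type)."""
    threshold: float
    accuracy: float  # classification accuracy, assumed to lie in [0, 1]
    cost: float      # assumed time-cost measure (e.g. seconds); must be > 0


def select_threshold(trials: Iterable[Trial]) -> Trial:
    """Return the trial whose accuracy / cost ratio is largest.

    On ties, the first trial with the maximal ratio is returned.
    """
    trials = list(trials)
    if not trials:
        raise ValueError("no trials given")
    if any(t.cost <= 0 for t in trials):
        raise ValueError("cost must be positive")
    return max(trials, key=lambda t: t.accuracy / t.cost)


# Usage with made-up numbers (not results from the paper):
candidates = [
    Trial(threshold=0.01, accuracy=0.90, cost=3.0),
    Trial(threshold=0.05, accuracy=0.88, cost=1.5),
    Trial(threshold=0.10, accuracy=0.80, cost=1.2),
]
best = select_threshold(candidates)
print(best.threshold)
```

A plain ratio rewards cheap configurations heavily, so in this illustration a lower-accuracy threshold can win if it is much faster; whether the paper normalizes either quantity before taking the ratio is not stated in the abstract.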