Variable importance-weighted Random Forests

Source: Quantitative Biology (English edition)
Background: Random Forests is a popular classification and regression method that has proven powerful for various prediction problems in biological studies. However, its performance often deteriorates when the number of features increases. To address this limitation, feature elimination Random Forests was proposed, which uses only the features with the largest variable importance scores. Yet the performance of this method is not satisfactory, possibly because of its rigid feature selection and the increased correlations between the trees of the forest.

Methods: We propose variable importance-weighted Random Forests, which, instead of sampling features with equal probability at each node to build up trees, samples features according to their variable importance scores and then selects the best split from the randomly selected features.

Results: We evaluate the performance of our method through comprehensive simulation and real data analyses, for both regression and classification. Compared to the standard Random Forests and the feature elimination Random Forests methods, our proposed method has improved performance in most cases.

Conclusions: By incorporating the variable importance scores into the random feature selection step, our method can better utilize more informative features without completely ignoring less informative ones, and hence has improved prediction accuracy in the presence of weak signals and large noise. We have implemented an R package "viRandomForests" based on the original R package "randomForest", and it can be freely downloaded from http://zhaocenter.org/software.
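The node-level sampling step described under Methods can be illustrated with a minimal Python sketch. This is not the viRandomForests implementation (which is an R package); the function name `sample_candidate_features`, the `floor` parameter, and the choice to sample without replacement are assumptions made for illustration only. The small positive `floor` is one possible way to keep weakly informative features, or features with negative importance estimates, from being excluded entirely.

```python
import numpy as np


def sample_candidate_features(importance, mtry, rng, floor=1e-3):
    """Draw `mtry` distinct feature indices for one tree node.

    Features are drawn with probability proportional to their variable
    importance scores rather than uniformly. `floor` (an assumption of this
    sketch, not taken from the paper) lifts zero or negative scores to a
    small positive weight so that no feature has zero selection probability.
    """
    weights = np.maximum(np.asarray(importance, dtype=float), floor)
    probs = weights / weights.sum()
    mtry = min(mtry, len(probs))  # cannot draw more distinct features than exist
    return rng.choice(len(probs), size=mtry, replace=False, p=probs)


# Hypothetical importance scores for six features (illustrative values only).
rng = np.random.default_rng(0)
importance = np.array([5.0, 3.0, 0.1, 0.0, -0.2, 0.05])

# Candidate features for a single node; the best split would then be
# searched among these candidates only, as in a standard Random Forest.
candidates = sample_candidate_features(importance, mtry=3, rng=rng)

# Empirical selection frequency when a single feature is drawn per node.
draws = [int(sample_candidate_features(importance, 1, rng)[0]) for _ in range(2000)]
frequency = np.bincount(draws, minlength=len(importance)) / len(draws)
```

In this sketch, high-importance features are drawn most often while every feature retains a nonzero chance of being a split candidate, which mirrors the contrast the abstract draws with hard feature elimination.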