论文部分内容阅读
With the development of large-scale text processing, the dimension of text feature space has become larger and larger, which has added a lot of difficulties to natural language processing. How to reduce the dimension has become a practical problem in the field. Here we present two clustering methods, i.e. concept association and concept abstract, to achieve the goal. The first refers to the keyword clustering based on the co-occurrence of keywords in the same text, and the second refers to that in the same category. Then we compare the difference between them. Our experiment results show that they are efficient to reduce the dimension of text feature space.