A Comparative Study on Two Techniques of Reducing the Dimension of Text Feature Space

Source: 系统工程与电子技术 (Systems Engineering and Electronics) | Citations: 0 | Uploaded by: famzhang

Authors: YIN Zhonghang Wang Yongcheng C

Institution: School of Electronic & Information Technology, Shanghai Jiaotong University, Shanghai 200030, P.R.Ch

Source: 系统工程与电子技术

Publication date: 2002, Issue 1

Keywords: Text data mining; Natural language processing; Keyword clustering


Excerpt from the paper

As large-scale text processing has developed, the dimension of the text feature space has grown steadily, which makes natural language processing considerably more difficult. Reducing this dimension has become a practical problem in the field. We present two keyword clustering methods for this purpose: concept association and concept abstract. The first clusters keywords by their co-occurrence in the same text; the second clusters them by their co-occurrence in the same category. We then compare the two methods. Our experimental results show that both reduce the dimension of the text feature space efficiently.
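The difference between the two clustering units can be illustrated with a minimal Python sketch. The excerpt does not state which similarity measure or clustering algorithm the authors used, so everything here is an assumption for illustration: the Jaccard co-occurrence score, the `threshold` value, the single-link merging, the function names `cluster_keywords` and `merge_by_category`, and the toy corpus.

```python
from collections import defaultdict
from itertools import combinations


def cluster_keywords(units, threshold=0.5):
    """Group keywords whose Jaccard co-occurrence across units meets the threshold.

    units: iterable of keyword collections, one per text or one per category.
    Returns a sorted list of sorted keyword clusters (single-link merging).
    This is an assumed scheme, not the method of the paper.
    """
    occurs_in = defaultdict(set)  # keyword -> indices of the units containing it
    for idx, keywords in enumerate(units):
        for kw in set(keywords):
            occurs_in[kw].add(idx)

    parent = {kw: kw for kw in occurs_in}  # union-find forest over keywords

    def find(kw):
        while parent[kw] != kw:
            parent[kw] = parent[parent[kw]]  # path halving
            kw = parent[kw]
        return kw

    for a, b in combinations(sorted(occurs_in), 2):
        shared = len(occurs_in[a] & occurs_in[b])
        total = len(occurs_in[a] | occurs_in[b])  # never zero: each keyword occurs somewhere
        if shared / total >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for kw in sorted(occurs_in):
        clusters[find(kw)].append(kw)
    return sorted(clusters.values())


def merge_by_category(texts, labels):
    """Pool the keywords of all texts that share a category label."""
    pooled = defaultdict(set)
    for keywords, label in zip(texts, labels):
        pooled[label].update(keywords)
    return [pooled[label] for label in sorted(pooled)]


# Toy corpus invented for illustration: keyword sets and their category labels.
texts = [
    {"stock", "market"},
    {"stock", "market", "bank"},
    {"bank", "loan"},
    {"goal", "match"},
    {"goal", "match", "coach"},
    {"coach", "team"},
]
labels = ["finance", "finance", "finance", "sport", "sport", "sport"]

# Co-occurrence counted within the same text.
by_text = cluster_keywords(texts)
# Co-occurrence counted within the same category.
by_category = cluster_keywords(merge_by_category(texts, labels))

print(len(by_text), by_text)
print(len(by_category), by_category)
```

On this toy corpus the eight keywords collapse into four clusters when co-occurrence is counted per text and into two clusters when it is counted per category, so the category-level unit gives the coarser grouping here. Whether that pattern holds on real data depends on the corpus and on the assumed threshold.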
