[Purpose / Significance] This paper addresses the integration of heterogeneous information during data integration. Taking the resources and environment discipline as an example, it proposes a standardized way of processing heterogeneous information that still meets the accuracy a comprehensive literature integration system requires of information extraction. [Method / Process] Using the team's self-built knowledge ontology for the resources and environment discipline as the basis, the paper analyzes how heterogeneous information in this discipline can be standardized in three respects: geographic space, time units, and attribute extraction. From this analysis it derives an approach to standardized processing, which guides the construction of a comprehensive literature integration platform that implements information integration and supports human-computer interaction. [Results / Conclusions] The work focuses on extracting and processing knowledge in a formatted way from literature in different data formats and from different sources, which completes the data preparation stage of comprehensive literature integration. Standardized processing of heterogeneous information is only the starting point of the knowledge discovery process. Follow-up work will focus on statistical analysis and visualization of the standardized information, so as to fully realize knowledge discovery in comprehensive literature integration.