2013, Vol. 17, Issue (4): 788-801

DOI: 10.11834/jrs.20132164

Received: 2012-05-16

Revised: 2012-06-18

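A Chinese address matching algorithm based on natural language understanding
State Key Laboratory of Remote Sensing, Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100101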
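Abstract:

Building on an analysis of the three main existing classes of Chinese address matching algorithms (element-hierarchy matching, full-text retrieval, and regular-expression matching), this paper proposes a Chinese address matching algorithm based on natural language understanding. The new algorithm establishes a spatial-relationship address model to handle the abstraction of Chinese addresses, and a logical model of the address database to represent the spatial knowledge carried by address information. The complete workflow comprises five stages: preprocessing, address parsing, address element standardization, inference matching, and match registration. This paper concentrates on the two key stages, address parsing and inference matching, which draw respectively on the Chinese word segmentation and semantic inference principles of natural language understanding to process Chinese addresses written in unstructured natural language. Combining natural language understanding with address matching in this way yields a complete Chinese address matching algorithm based on natural language understanding. To validate the algorithm, an experimental intelligent Chinese address matching system was developed and used to match 1000 resident addresses from the population database of Puyang City, Henan Province; the matching rate reached 95% and the accuracy exceeded 93%.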

Address matching algorithm based on Chinese natural language understanding
Abstract:

Address matching, which has broad application prospects, is a core technology for location-based services. This paper analyzes the three major existing address matching approaches: hierarchy-based matching, full-text search, and regular-expression matching. On this basis, it proposes an address matching algorithm based on Chinese natural language understanding. The complete process consists of five stages: preprocessing, address parsing, address element standardization, reasoning-based matching, and match registration. The paper focuses on the two most important stages, address parsing and reasoning-based matching. Drawing on the principles of Chinese word segmentation and semantic reasoning from natural language understanding, the new algorithm processes Chinese addresses written in unstructured form and thereby combines natural language understanding with address matching, yielding a complete Chinese address matching algorithm. To test the algorithm, an experimental address matching system was developed. A matching experiment on 1000 resident addresses from Puyang City, Henan Province, shows a matching rate of 95% or more and an accuracy rate above 93%.
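The abstract names Chinese word segmentation as the basis of the address parsing stage but gives no implementation details. As a rough illustration only, the sketch below uses forward maximum matching against a small dictionary. This is a common dictionary-based segmentation technique, swapped in here as an assumption; it is not claimed to be the authors' method. The `GAZETTEER` entries, the level names, and the `segment_address` function are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical gazetteer: address element -> element level.
# The entries and level names are illustrative assumptions, not data from the paper.
GAZETTEER: Dict[str, str] = {
    "河南省": "province",
    "濮阳市": "city",
    "华龙区": "district",
    "中原路": "road",
}

# Longest dictionary entry, used as the initial matching window.
MAX_WORD_LEN = max(len(word) for word in GAZETTEER)


def segment_address(address: str) -> List[Tuple[str, str]]:
    """Split an address into (element, level) pairs by forward maximum matching."""
    elements: List[Tuple[str, str]] = []
    unknown = ""  # characters not covered by any dictionary entry
    i = 0
    while i < len(address):
        # Try the longest candidate first, then shrink the window.
        for length in range(min(MAX_WORD_LEN, len(address) - i), 0, -1):
            candidate = address[i:i + length]
            if candidate in GAZETTEER:
                if unknown:
                    # Flush buffered characters before the recognized element.
                    elements.append((unknown, "unknown"))
                    unknown = ""
                elements.append((candidate, GAZETTEER[candidate]))
                i += length
                break
        else:
            # No dictionary entry starts at this position; buffer the character.
            unknown += address[i]
            i += 1
    if unknown:
        elements.append((unknown, "unknown"))
    return elements


if __name__ == "__main__":
    for element, level in segment_address("河南省濮阳市华龙区中原路12号"):
        print(level, element)
```

In this sketch, text that matches no dictionary entry (such as a house number) is kept as an `unknown` element and not discarded, so that a later standardization or matching stage could still inspect it.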
