A Comparative Study on the Language Networks Based on Co-occurrence, Syntax, Semantics

Abstract

Applying network methods to language research is a new trend in the big-data era of linguistics. Language is a multi-level system of symbols, so the choice of linguistic unit for the nodes, and of the relation between units for the edges, affects the structure and function of the resulting language network. This paper surveys methods for constructing networks that take Chinese words as nodes and link them by co-occurrence (adjacency of words), by syntactic relations (dependency grammar), or by semantic relations (conceptual relations), and builds all three types of network from the same text. The syntactic network's diameter and average path length are much smaller than those of the co-occurrence network, and content words occupy the central node positions in the semantic network. This suggests that network analysis still needs to be guided by sound linguistic theory, and that the differences among the various language networks are best explained from within linguistics.

Funding: National Social Science Fund of China (11&ZD188; 14CYY046
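The two measures compared in the abstract, network diameter and average path length, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the English toy sentence, and the hand-written dependency pairs are all hypothetical, and a five-word example says nothing about the results reported for the real text. It only shows how the same sentence yields different graphs depending on whether edges come from word adjacency or from dependency relations.

```python
from collections import deque

def cooccurrence_edges(sentences):
    """Link every pair of adjacent words within each sentence (undirected)."""
    edges = set()
    for words in sentences:
        for a, b in zip(words, words[1:]):
            if a != b:
                edges.add(frozenset((a, b)))
    return edges

def dependency_edges(pairs):
    """Link each dependent to its head; pairs are (dependent, head) tuples."""
    return {frozenset((dep, head)) for dep, head in pairs if dep != head}

def adjacency(edges):
    """Build an adjacency map from a set of undirected edges."""
    adj = {}
    for edge in edges:
        a, b = tuple(edge)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter_and_avg_path(edges):
    """Return (diameter, average path length) over all reachable node pairs."""
    adj = adjacency(edges)
    total, pairs, diameter = 0, 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return diameter, (total / pairs if pairs else 0.0)

# Hypothetical toy data: one sentence and a hand-made dependency analysis.
toy_sentence = [["she", "quickly", "read", "long", "books"]]
toy_dependencies = [("she", "read"), ("quickly", "read"),
                    ("books", "read"), ("long", "books")]

co_stats = diameter_and_avg_path(cooccurrence_edges(toy_sentence))
dep_stats = diameter_and_avg_path(dependency_edges(toy_dependencies))
print("co-occurrence:", co_stats)
print("dependency:   ", dep_stats)
```

In this toy case the adjacency graph is a chain, while the dependency graph is centred on the verb, so paths through the head are shorter. A semantic network would need conceptual-relation annotations, which are not sketched here.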
