Crossings as a side effect of dependency lengths
The syntactic structure of sentences exhibits a striking regularity:
dependencies tend to not cross when drawn above the sentence. We investigate
two competing explanations. The traditional hypothesis is that this trend
arises from an independent principle of syntax that reduces crossings
practically to zero. An alternative to this view is the hypothesis that
crossings are a side effect of dependency lengths, i.e. sentences with shorter
dependency lengths should tend to have fewer crossings. We are able to reject
the traditional view in the majority of languages considered. The alternative
hypothesis can lead to a more parsimonious theory of language.
Comment: the discussion section has been expanded significantly; in press in
Complexity (Wiley
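The two quantities compared in this abstract, the number of crossings and the sum of dependency lengths, can be computed from a head-index encoding of a sentence. The following is a minimal sketch, not the authors' method: the function names and the two four-word toy trees are assumptions made for illustration only.

```python
from itertools import combinations

def arcs_from_heads(heads):
    """heads[i] is the 1-based position of the head of word i+1 (0 marks the root).
    Returns each dependency as a (left, right) pair of word positions."""
    return [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1) if h != 0]

def count_crossings(arcs):
    """Two arcs cross when exactly one endpoint of one lies strictly inside the other."""
    return sum(
        1
        for (a, b), (c, d) in combinations(arcs, 2)
        if a < c < b < d or c < a < d < b
    )

def total_dependency_length(arcs):
    """Sum of linear distances between each head and its dependent."""
    return sum(b - a for a, b in arcs)

# Hypothetical toy trees over four words (not taken from any treebank).
no_crossing_heads = [2, 0, 4, 2]    # arcs (1,2), (3,4), (2,4)
one_crossing_heads = [3, 4, 0, 3]   # arcs (1,3), (2,4), (3,4)

for heads in (no_crossing_heads, one_crossing_heads):
    arcs = arcs_from_heads(heads)
    print(heads, count_crossings(arcs), total_dependency_length(arcs))
```

In this toy pair the tree without crossings also has the smaller total dependency length (4 against 5), which is the direction of association the alternative hypothesis predicts. Two hand-built trees are of course not evidence for it.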
Constraint Based Hybrid Approach to Parsing Indian Languages
PACLIC 23 / City University of Hong Kong / 3-5 December 200
Conceptual structure representation of causative verb in Malay language and relation with syntax [Representasi argumen struktur konseptual bagi kata kerja kausatif Bahasa Melayu dan hubungannya dengan sintaksis]
Causative verbs can express an act brought about by something that is happening, the fact that nothing happens without a specific reason, the granting of a favor, or the justification of something. This study therefore examines how causative lexical verbs represent argument structure and how that structure relates to syntax. It focuses on the verbs kill, force, allow and help, drawn from the organizational class of semantic conceptual structure, with example data taken from the corpus database of the Dewan Bahasa dan Pustaka. The analysis uses the Conceptual Structure Theory of Ray Jackendoff (2011). Semantic representations are depicted as conceptual structures within the function [Event], following the formula [Event CAUSE ([THING, x], [Event ([y], [x])])]. The findings show that the conceptual structures for the verbs kill, force, allow and help are represented by the constituents [Event CAUSE ([BE X], [Y])]), NO [event LET ([ GO X], [Y])], and [Event HELP ([ GO X], [Y])] to the constituents [THING], [PLACE] and [PATH]. For the mapping from conceptual structure to syntax, the findings indicate a connection between the Head Rules and the Argument Rules, based on the role of the argument as a subset of the acceptance and verbal request to the Noun Phrase. Consequently, this study shows the systematic representation of conceptual structure and its relationship with syntax, especially in the verbal domain.
Representasi argumen gerak ruang Bahasa Melayu berdasarkan teori struktur konseptual - Representation of the spatial motion in the Malay language based on the Conceptual Structure Theory
Motion is understood as a change from one location to another.
Therefore, this study focuses on spatial motion consisting of
directional, extensional, state and causative motion. The data and
examples are drawn from the Pangkalan Data Korpus Dewan Bahasa dan
Pustaka (the Dewan Bahasa dan Pustaka corpus database), Malaysia. The
analysis uses Jackendoff’s (1997 & 2011) Theory of Conceptual Structure
as its framework. The mapping of these arguments comprises three
main stages, namely mapping to conceptual structure representation,
mapping to conceptual tree diagrams and mapping to thematic
roles. The results show that spatial motion is present in the [Event]
and [States] functions that carry the basic formula [[[MOTION
[x [Event] [Object] [Path] [Place]]]. The findings show that the
lexical items balik, berlari, tinggal, terletak, berada, bunuh, paksa,
benar and tolong have their own representation of arguments
involving constituents such as [Event], [States], [Object], [Path],
[Place], [Event REASON], [NO Causative], [TRUE Causative] and
[HELP Causative]. At the thematic-role mapping stage, the roles
involved are Actor, Theme, Goal, Source, Agent and
Beneficiary (-). Accordingly, this study shows an adequate systematic
representation of the conceptual structure, particularly in the spatial
motion domain in the Malay language.
Linguistically-Informed Neural Architectures for Lexical, Syntactic and Semantic Tasks in Sanskrit
The primary focus of this thesis is to make Sanskrit manuscripts more
accessible to the end-users through natural language technologies. The
morphological richness, compounding, free word order, and low-resource
nature of Sanskrit pose significant challenges for developing deep learning
solutions. We identify four fundamental tasks, which are crucial for developing
a robust NLP technology for Sanskrit: word segmentation, dependency parsing,
compound type identification, and poetry analysis. The first task, Sanskrit
Word Segmentation (SWS), is a fundamental text processing step for any
downstream application. However, it is challenging due to the sandhi
phenomenon that modifies characters at word boundaries. Similarly, the existing
dependency parsing approaches struggle with morphologically rich and
low-resource languages like Sanskrit. Compound type identification is also
challenging for Sanskrit due to the context-sensitive semantic relation between
components. All these challenges result in sub-optimal performance in NLP
applications like question answering and machine translation. Finally, Sanskrit
poetry has not been extensively studied in computational linguistics.
While addressing these challenges, this thesis makes various contributions:
(1) The thesis proposes linguistically-informed neural architectures for these
tasks. (2) We showcase the interpretability and multilingual extension of the
proposed systems. (3) Our proposed systems report state-of-the-art performance.
(4) Finally, we present a neural toolkit named SanskritShala, a web-based
application that provides real-time analysis of input for various NLP tasks.
Overall, this thesis contributes to making Sanskrit manuscripts more accessible
by developing robust NLP technology and releasing various resources, datasets,
and a web-based toolkit.
Comment: Ph.D. dissertatio
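To illustrate why sandhi makes word segmentation harder than splitting at a boundary, the sketch below undoes two simple vowel sandhi rules (a + i → e, a + u → o) while searching for two-word splits. It is a rule-based toy, not the neural approach the thesis proposes; the rule table, the lexicon and the function name are all assumptions made for illustration.

```python
# Toy reverse-sandhi table: fused vowel -> possible (left-final, right-initial) pairs.
# Only two vowel rules are covered; real Sanskrit sandhi has many more.
REVERSE_SANDHI = {
    "e": [("a", "i")],
    "o": [("a", "u")],
}

# Hypothetical miniature lexicon, for illustration only.
LEXICON = {"na", "iti", "ca", "uktam"}

def split_two_words(surface, lexicon):
    """Return every (left, right) pair of lexicon words that could yield `surface`,
    either by plain concatenation or by one of the fused-vowel rules above."""
    candidates = []
    for i, ch in enumerate(surface):
        # Case 1: plain split before position i, with no change at the boundary.
        left, right = surface[:i], surface[i:]
        if left in lexicon and right in lexicon:
            candidates.append((left, right))
        # Case 2: the character at position i is a fused vowel; restore both halves.
        for left_end, right_start in REVERSE_SANDHI.get(ch, []):
            left = surface[:i] + left_end
            right = right_start + surface[i + 1:]
            if left in lexicon and right in lexicon:
                candidates.append((left, right))
    return candidates

print(split_two_words("neti", LEXICON))    # na + iti
print(split_two_words("coktam", LEXICON))  # ca + uktam
```

Neither surface form contains its component words as literal substrings, so a segmenter has to reason about characters that were modified at the boundary. With a realistic lexicon and the full rule set, many candidate splits compete, which is part of what makes the task difficult.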