1,467 research outputs found

The logic and linguistic model for automatic extraction of collocation similarity

Author: Gautam Ajit Pratap Singh
Khairova N. F.
Petrasova S. V.
Publication venue: Національний університет "Львівська політехніка"
Publication date: 01/01/2015

The article discusses the process of automatically identifying collocation similarity. Semantic analysis is one of the most advanced and most difficult NLP tasks. The main problem of semantic processing is determining the polysemy and synonymy of linguistic units, and the task becomes more complicated for word collocations. The paper proposes a logical and linguistic model for automatically determining semantic similarity between collocations in the Ukrainian and English languages. The proposed model formalizes the semantic equivalence of collocations by means of the semantic and grammatical characteristics of collocates. The basic idea of the approach is that the morphological, syntactic and semantic characteristics of lexical units must be taken into account when identifying collocation similarity. The basic mathematical means of the model are logical-algebraic equations of the algebra of finite predicates. The model examines verb-noun and noun-adjective collocations in Ukrainian and English, which consist of words belonging to the main parts of speech. The model allows semantically equivalent collocations to be extracted from semi-structured and unstructured texts, and implementations of the model will allow such collocations to be recognized automatically. Using the model can increase the effectiveness of natural language processing tasks such as information extraction, ontology generation, sentiment analysis and others.

Biblioteka Nauki - repozytorium artykułów

Electronic National Technical University "Kharkiv Polytechnic Institute" Institutional Repository (eNTUKhPIIR)

Uvid u automatsko izlučivanje metaforičkih kolokacija (An insight into the automatic extraction of metaphorical collocations)

Author: Brkić Bakarić Marija
Matetić Maja
Načinović Prskalo Lucia
Publication venue: Institute of Croatian Language and Linguistics
Publication date: 01/01/2023

Collocations have been the subject of much scientific research over the years. The focus of this research is on a subset of collocations, namely metaphorical collocations. In metaphorical collocations, a semantic shift has taken place in one of the components, i.e., one of the components takes on a transferred meaning. The main goal of this paper is to review the existing literature and provide a systematic overview of the existing research on collocation extraction, as well as an overview of existing methods, measures, and resources. The existing research is classified according to the approach (statistical, hybrid, and distributional semantics) and presented in three separate sections. Various association measures and existing ways of evaluating the results of automatic collocation extraction are also described. The insights gained from existing research serve as a first step in exploring the possibility of developing a method for automatic extraction of metaphorical collocations. The methods, tools, and resources that may prove useful for future work are highlighted.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Construction of Semantic Collocation Bank Based on Semantic Dependency Parsing

Author: Ding Yu
Liu Shijun
Shao Yanqiu
Zheng Lijuan
Publication venue
Publication date: 01/01/2015

Waseda University Repository

Using conceptual vectors to get Magn collocations (and using contrastive properties to get their translations)

Author: Archer Vincent
Publication venue: Wiener Slawistischer Almanach
Publication date: 01/01/2007

This paper presents a semi-automatic approach to extracting collocations from corpora that uses the results of Conceptual Vectors as a semantic filter. First, the method estimates how likely each co-occurrence is to be a collocation, using a statistical measure based on the fact that it occurs more often than it would by chance. Then the results are automatically filtered (with conceptual vectors) to retain only one given semantic kind of collocation. Finally, a further filtering step is performed based on manually entered data. Our evaluation on monolingual and bilingual experiments shows the value of combining automatic extraction and manual intervention to extract collocations (to fill multilingual lexical databases). In particular, it shows that using conceptual vectors to filter the candidates noticeably increases precision.
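
The first step described in the abstract above, scoring co-occurrences that appear more often than chance, can be illustrated with a minimal sketch. The abstract does not name the statistical measure used, so pointwise mutual information (PMI) is an assumption here, and the function name `pmi_scores`, the `min_count` threshold and the toy sentence are all hypothetical.

```python
import math
from collections import Counter

def pmi_scores(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    PMI is an assumed stand-in for the unnamed statistical measure:
    it is positive when a pair occurs more often than independence predicts.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable estimates
        p_pair = count / n_bi
        p_independent = (unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)
        scores[(w1, w2)] = math.log2(p_pair / p_independent)
    return scores

# Toy corpus (invented for illustration only).
tokens = "heavy rain fell and heavy rain stopped and the rain fell".split()
scores = pmi_scores(tokens)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

Candidates with a high score would then go on to the semantic and manual filtering stages the abstract describes; those stages are not sketched here.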

CiteSeerX

Hal - Université Grenoble Alpes

Architectures of Meaning, A Systematic Corpus Analysis of NLP Systems

Author: Florea Malina
Freitas Andre
Landers Donal
Wysocki Oskar
Publication venue
Publication date: 16/07/2021

This paper proposes a novel statistical corpus analysis framework targeted towards the interpretation of Natural Language Processing (NLP) architectural patterns at scale. The proposed approach combines saturation-based lexicon construction, statistical corpus analysis methods and graph collocations to induce a synthesis representation of NLP architectural patterns from corpora. The framework is validated on the full corpus of SemEval tasks and demonstrates coherent architectural patterns which can be used to answer architectural questions in a data-driven fashion, providing a systematic mechanism to interpret a largely dynamic and exponentially growing field. Comment: 20 pages, 6 figures, 9 supplementary figures, Lexicon.txt in the appendix

arXiv.org e-Print Archive

The University of Manchester - Institutional Repository

Using distributional similarity to organise biomedical terminology

Author: Dowdall James
Keller Bill
Schneider Gerold
Weeds Julie
Weir David
Publication venue: John Benjamins Publishing Company
Publication date: 01/01/2005

We investigate an application of distributional similarity techniques to the problem of structural organisation of biomedical terminology. Our application domain is the relatively small GENIA corpus. Using terms that have been accurately marked-up by hand within the corpus, we consider the problem of automatically determining semantic proximity. Terminological units are defined for our purposes as normalised classes of individual terms. Syntactic analysis of the corpus data is carried out using the Pro3Gres parser and provides the data required to calculate distributional similarity using a variety of different measures. Evaluation is performed against a hand-crafted gold standard for this domain in the form of the GENIA ontology. We show that distributional similarity can be used to predict semantic type with a good degree of accuracy.
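
As a rough illustration of the idea in the abstract above, the sketch below builds a profile of syntactic contexts for each term and compares profiles. The abstract mentions a variety of measures without listing them, so cosine similarity is an assumption, and the (term, relation, context word) triples are invented toy data, not Pro3Gres output or GENIA annotations.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical dependency triples: (term, grammatical relation, context word).
triples = [
    ("IL-2", "subj_of", "activate"),
    ("IL-2", "obj_of", "express"),
    ("IL-4", "subj_of", "activate"),
    ("IL-4", "obj_of", "express"),
    ("T cell", "obj_of", "activate"),
    ("T cell", "subj_of", "proliferate"),
]

# Each term is described by the counts of the syntactic contexts it occurs in.
profiles = {}
for term, relation, context in triples:
    profiles.setdefault(term, Counter())[(relation, context)] += 1

sim_close = cosine(profiles["IL-2"], profiles["IL-4"])
sim_far = cosine(profiles["IL-2"], profiles["T cell"])
print(sim_close, sim_far)
```

In this toy data, terms that share syntactic contexts score higher than terms that do not, which is the intuition behind using distributional similarity to predict semantic type.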

ZORA

Sussex Research Online

Constructions, Collocations, and Patterns

Author: Simon Gábor
Publication venue
Publication date: 01/01/2023

ELTE Digital Institutional Repository (EDIT)