A review of approaches to solving the paraphrase identification problem

Author: Marchenko A. A.
Vrublevskyi V. N.
Publication venue: Taras Shevchenko National University of Kyiv
Publication date: 13/07/2023

The article reviews approaches to the problem of identifying paraphrases. It describes the problem's relevance and its use in tasks such as plagiarism detection, text simplification, and information retrieval. Several classes of solutions are considered. The first approach is based on manual rules: it uses hand-selected features derived from the fundamental properties of paraphrases. The second approach is based on lexical similarity and on various databases and ontologies. The paper also presents machine learning-based approaches and describes different architectures that can be used to identify paraphrases. The last approach considered is based on deep learning and modern transformer models. Pages of the article in the issue: 71-78. Language of the article: Ukrainian.
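As an illustration of the lexical-similarity class of approaches mentioned in the abstract, the following is a minimal sketch that scores two sentences by token overlap (Jaccard similarity). The function names `jaccard` and `is_paraphrase` and the 0.5 threshold are hypothetical choices made for this example; the article does not specify a particular measure or cutoff.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase whitespace tokens of two sentences."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty sentences are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_paraphrase(a: str, b: str, threshold: float = 0.5) -> bool:
    """Flag a pair as a paraphrase when overlap reaches the (assumed) threshold."""
    return jaccard(a, b) >= threshold


score = jaccard("the cat sat", "the cat slept")
```

Pure token overlap ignores synonyms and word order, which is one reason such a baseline is usually combined with lexical databases or replaced by learned models.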

Вісник Київського національного університету імені Тараса Шевченка. Серія: фізико-математичні науки

Event-based hyperspace analogue to language for query expansion

Author: Hou Yuexian
Maxwell Tamsin
Song Dawei
Yan Tingxu
Zhang Peng
Publication date: 01/07/2010

Bag-of-words approaches to information retrieval (IR) are effective but assume independence between words. The Hyperspace Analogue to Language (HAL) is a cognitively motivated and validated semantic space model that captures statistical dependencies between words by considering their co-occurrences in a surrounding window of text. HAL has been successfully applied to query expansion in IR, but has several limitations, including high processing cost and use of distributional statistics that do not exploit syntax. In this paper, we pursue two methods for incorporating syntactic-semantic information from textual ‘events’ into HAL. We build the HAL space directly from events to investigate whether processing costs can be reduced through more careful definition of word co-occurrence, and we improve the quality of the pseudo-relevance feedback by applying event information as a constraint during HAL construction. Both methods significantly improve performance compared with the original HAL, and interpolation of HAL and relevance model expansion outperforms either method alone.
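The windowed co-occurrence idea behind HAL can be sketched as follows. This is a minimal illustration of the standard HAL construction, in which a word that precedes the target at distance d within a window of size K contributes a weight of K - d + 1. It is not the event-based variant proposed in the paper, and the function name `build_hal` and the toy sentence are assumptions made for this example.

```python
from collections import defaultdict


def build_hal(tokens, window=5):
    """Build a HAL-style space as {target: {preceding_word: weight}}.

    Each word preceding the target at distance d (1 <= d <= window)
    adds window - d + 1 to the corresponding cell, so closer words
    count more.
    """
    hal = defaultdict(lambda: defaultdict(int))
    for i, target in enumerate(tokens):
        for d in range(1, window + 1):
            j = i - d
            if j < 0:
                break  # ran past the start of the text
            hal[target][tokens[j]] += window - d + 1
    return hal


tokens = "the cat sat on the mat".split()
hal = build_hal(tokens, window=2)
```

Every token pair inside the window is visited, so the cost grows with both text length and window size. That is the processing cost the abstract refers to when it proposes a more careful definition of word co-occurrence.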

Open Research Online (The Open University)

Adapting a general parser to a sublanguage

Author: Aubin Sophie
Nazarenko Adeline
Nédellec Claire
Publication date: 01/01/2005

In this paper, we propose a method to adapt a general parser (Link Parser) to sublanguages, focusing on the parsing of texts in biology. Our main proposal is the use of terminology (identification and analysis of terms) in order to reduce the complexity of the text to be parsed. Several other strategies are explored and finally combined, among them text normalization, extensions to the lexicon and the morpho-guessing module, and adaptation of grammar rules. We compare the parsing results before and after these adaptations.

arXiv.org e-Print Archive

HAL Descartes

HAL-Paris 13

Disambiguation of Super Parts of Speech (or Supertags): Almost Parsing

Author: Joshi Aravind K.
Srinivas B.
Publication date: 26/10/1994

In a lexicalized grammar formalism such as Lexicalized Tree-Adjoining Grammar (LTAG), each lexical item is associated with at least one elementary structure (supertag) that localizes syntactic and semantic dependencies. Thus a parser for a lexicalized grammar must search a large set of supertags to choose the right ones to combine for the parse of the sentence. We present techniques for disambiguating supertags using local information such as lexical preference and local lexical dependencies. The similarity between LTAG and Dependency grammars is exploited in the dependency model of supertag disambiguation. The performance results for various models of supertag disambiguation such as unigram, trigram and dependency-based models are presented. Comment: ps file. 8 page

arXiv.org e-Print Archive

ScholarlyCommons@Penn

Three New Probabilistic Models for Dependency Parsing: An Exploration

Author: Eisner Jason
Publication date: 01/01/1997

After presenting a novel O(n^3) parsing algorithm for dependency grammar, we develop three contrasting ways to stochasticize it. We propose (a) a lexical affinity model where words struggle to modify each other, (b) a sense tagging model where words fluctuate randomly in their selectional preferences, and (c) a generative model where the speaker fleshes out each word's syntactic and conceptual structure without regard to the implications for the hearer. We also give preliminary empirical results from evaluating the three models' parsing performance on annotated Wall Street Journal training text (derived from the Penn Treebank). In these results, the generative (i.e., top-down) model performs significantly better than the others, and does about equally well at assigning part-of-speech tags. Comment: 6 pages, LaTeX 2.09 packaged with 4 .eps files, also uses colap.sty and acl.bs

arXiv.org e-Print Archive

CiteSeerX