654 research outputs found

D6.1: Technologies and Tools for Lexical Acquisition

Author: Abrate Matteo
Bacciu Clara
Bel Nuria
Caselli Tommaso
Gavrilidou Maria
Korhonen Anna
Monachini Monica
Padró Muntsa
Poibeau Thierry
Prokopidis Prokopis
Quochi Valeria
Revilla Eva
Rimell Laura
Tesconi Maurizio

This report describes the technologies and tools to be used for Lexical Acquisition in PANACEA. It covers existing technologies and tools that can be built on and improved within PANACEA, as well as new technologies and tools to be developed and integrated into the PANACEA platform. The report also specifies the Lexical Resources to be produced. Four main areas of lexical acquisition are included: Subcategorization Frames (SCFs), Selectional Preferences (SPs), Lexical-semantic Classes (LCs), for both nouns and verbs, and Multi-Word Expressions (MWEs).

PUblication MAnagement

Extraction and Classification of App Features from App Reviews

Author: Yankovskaya Elizaveta
Publication date: 01/01/2017

The number of tools for bioinformatics is constantly increasing, which makes finding a suitable tool for a specific task difficult. To organize the available information and to facilitate search, various keyword ontologies are used. Today, new descriptions are annotated manually, which is time-consuming and not always correct. We propose a new annotation method that, based only on the free-text description of a tool, automatically suggests one or more annotation labels in accordance with the ontology. The method applies modern natural language processing techniques, such as latent Dirichlet allocation and word2vec. A first comparison of the manual annotation labels with the labels produced by our algorithm shows promising results.

DSpace at Tartu University Library

Bridging the Semantic Gap in Information Retrieval: Advanced Image Search Using Topic Models (情報検索における意味的ギャップの解消 : トピックモデルを用いた先進的画像探索)

Author: Nguyen Cam Tu
Publication date: 15/09/2011

Tohoku University, 徳山豪, 課

Tohoku University Repository (TOUR) / 東北大学機関リポジトリ

Institutional Repositories DataBase (IRDB)

An Insight into the Automatic Extraction of Metaphorical Collocations (Uvid u automatsko izlučivanje metaforičkih kolokacija)

Author: Brkić Bakarić Marija
Matetić Maja
Načinović Prskalo Lucia
Publication venue: Institute of Croatian Language and Linguistics
Publication date: 01/01/2023

Collocations have been the subject of much scientific research over the years. The focus of this research is on a subset of collocations, namely metaphorical collocations. In metaphorical collocations, a semantic shift has taken place in one of the components, i.e., one of the components takes on a transferred meaning. The main goal of this paper is to review the existing literature and provide a systematic overview of the existing research on collocation extraction, as well as an overview of existing methods, measures, and resources. The existing research is classified according to the approach (statistical, hybrid, and distributional semantics) and presented in three separate sections. Various association measures and existing ways of evaluating the results of automatic collocation extraction are also described. The insights gained from existing research serve as a first step in exploring the possibility of developing a method for automatic extraction of metaphorical collocations. The methods, tools, and resources that may prove useful for future work are highlighted.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Topic Segmentation: How Much Can We Do by Counting Words and Sequences of Words

Author: Alves Elsa
Dias Gael
Nunes Célia
Publication venue: Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Publication date: 01/01/2005

In this paper, we present an innovative topic segmentation system based on a new informative similarity measure that takes word co-occurrence into account, so that it does not depend on existing linguistic resources such as electronic dictionaries or lexico-semantic databases such as thesauri or ontologies. Topic segmentation is the task of breaking documents into topically coherent multi-paragraph subparts, and it has been used extensively in information retrieval and text summarization. In particular, our architecture proposes a language-independent topic segmentation system that addresses three main problems evidenced by previous research: systems based solely on lexical repetition show reliability problems; systems based on lexical cohesion rely on existing linguistic resources that are usually available only for dominant languages and therefore do not apply to less favored languages; and other systems need previously harvested training data. For that purpose, we use only statistics on words and sequences of words computed from a set of texts. This provides a flexible solution that may narrow the gap between dominant and less favored languages, thus allowing equivalent access to information.

Bulgarian Digital Mathematics Library at IMI-BAS

Exploring the feasibility and accuracy of Latent Semantic Analysis based text mining techniques to detect similarity between patent documents and scientific publications

Author: Magerman Tom
Song Xiaoyan
Van Looy Bart

Research Papers in Economics

Weakly-supervised appraisal analysis

Author: Carroll John
Read Jonathon Lee
Publication venue: CSLI Publications
Publication date: 01/01/2012

This article is concerned with the computational treatment of Appraisal, a Systemic Functional Linguistic theory of the types of language employed to communicate opinion in English. The theory considers aspects such as Attitude (how writers communicate their point of view), Engagement (how writers align themselves with respect to the opinions of others) and Graduation (how writers amplify or diminish their attitudes and engagements). To analyse text according to the theory, we employ a weakly-supervised approach to text classification, which involves comparing the similarity of words with prototypical examples of classes. We evaluate the method's performance using a collection of book reviews annotated according to Appraisal theory.

Sussex Research Online

Semantic Networks and Applications in Public Opinion Research

Author: González-Bailón Sandra
Yang Sijia
Publication venue: ScholarlyCommons
Publication date: 01/01/2018

ScholarlyCommons@Penn