654 research outputs found

    D6.1: Technologies and Tools for Lexical Acquisition

    This report describes the technologies and tools to be used for Lexical Acquisition in PANACEA. It covers existing technologies and tools that can be built on and improved within PANACEA, as well as new technologies and tools to be developed and integrated into the PANACEA platform. The report also specifies the Lexical Resources to be produced. Four main areas of lexical acquisition are included: Subcategorization Frames (SCFs), Selectional Preferences (SPs), Lexical-semantic Classes (LCs), for both nouns and verbs, and Multi-Word Expressions (MWEs).

    Extraction and Classification of App Features from App Reviews

    The number of tools used in bioinformatics grows year by year, and as a result finding a suitable tool for a specific task has become difficult. To organize the available information and to facilitate search, various keyword ontologies are used. Today, new tool descriptions are annotated manually, which is time-consuming and not always correct. We propose a new annotation method that automatically suggests one or more annotation labels in accordance with the ontology, using only the free-text description of the tool. The method applies modern natural language processing techniques such as latent Dirichlet allocation and word2vec. An initial comparison of the manual annotation labels with the labels produced by our algorithm shows promising results.

    Uvid u automatsko izlučivanje metaforičkih kolokacija (An Insight into the Automatic Extraction of Metaphorical Collocations)

    Collocations have been the subject of much scientific research over the years. The focus of this research is on a subset of collocations, namely metaphorical collocations. In metaphorical collocations, a semantic shift has taken place in one of the components, i.e., one of the components takes on a transferred meaning. The main goal of this paper is to review the existing literature and provide a systematic overview of research on collocation extraction, together with the existing methods, measures, and resources. The existing research is classified according to approach (statistical, hybrid, and distributional semantics) and presented in three separate sections. Various association measures and existing ways of evaluating the results of automatic collocation extraction are also described. The insights gained serve as a first step in exploring the possibility of developing a method for the automatic extraction of metaphorical collocations. The methods, tools, and resources that may prove useful for future work are highlighted.
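    As a rough illustration of the statistical approach mentioned in this abstract, the sketch below scores adjacent word pairs with pointwise mutual information (PMI), one commonly used association measure. The abstract does not say which measures the paper covers, so the choice of PMI, the function name `pmi_scores`, and the `min_count` threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Illustrative sketch only: `pmi_scores` and `min_count` are hypothetical
    names, not taken from the surveyed literature.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        # Rare pairs get unreliable PMI values, so skip them.
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI compares the observed pair probability with the probability
        # expected if the two words occurred independently.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


tokens = "heavy rain fell and heavy rain stopped and the rain fell".split()
for pair, score in sorted(pmi_scores(tokens).items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

    A positive score means the pair occurs more often than chance would predict. Deciding whether a high-scoring pair is metaphorical is a separate step that this sketch does not attempt.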

    Topic Segmentation: How Much Can We Do by Counting Words and Sequences of Words

    In this paper, we present an innovative topic segmentation system based on a new informative similarity measure that takes word co-occurrence into account, in order to avoid dependence on existing linguistic resources such as electronic dictionaries or lexico-semantic databases such as thesauri or ontologies. Topic segmentation is the task of breaking documents into topically coherent multi-paragraph subparts, and it has been used extensively in information retrieval and text summarization. In particular, our architecture proposes a language-independent topic segmentation system that addresses three main problems evidenced by previous research: systems based solely on lexical repetition show reliability problems; systems based on lexical cohesion rely on existing linguistic resources that are usually available only for dominant languages and consequently do not apply to less favored languages; and other systems need previously harvested training data. For that purpose, we only use statistics on words and sequences of words computed from a set of texts. This flexible solution may narrow the gap between dominant and less favored languages, thus allowing equivalent access to information.

    Weakly-supervised appraisal analysis

    This article is concerned with the computational treatment of Appraisal, a Systemic Functional Linguistic theory of the types of language employed to communicate opinion in English. The theory considers aspects such as Attitude (how writers communicate their point of view), Engagement (how writers align themselves with respect to the opinions of others) and Graduation (how writers amplify or diminish their attitudes and engagements). To analyse text according to the theory, we employ a weakly-supervised approach to text classification, which involves comparing the similarity of words with prototypical examples of classes. We evaluate the method's performance using a collection of book reviews annotated according to Appraisal theory.
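    The idea of comparing words with prototypical examples of classes can be sketched as below. This is a minimal illustration, not the article's method: the abstract does not specify the similarity measure, so cosine similarity is assumed, and the two-dimensional vectors, the class labels, and the names `cosine`, `classify`, `TOY_VECTORS`, and `PROTOTYPES` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(word, vectors, prototypes):
    """Assign `word` to the class whose prototype words are most similar on average."""
    best_label, best_score = None, float("-inf")
    for label, seeds in prototypes.items():
        score = sum(cosine(vectors[word], vectors[s]) for s in seeds) / len(seeds)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy vectors; a real system would derive word similarity
# from corpus statistics or a lexical resource.
TOY_VECTORS = {
    "good": (1.0, 0.1), "excellent": (0.9, 0.2),
    "bad": (-1.0, 0.1), "awful": (-0.9, 0.3),
    "superb": (0.8, 0.3), "dreadful": (-0.8, 0.2),
}
# A handful of prototype words per class is the only supervision used.
PROTOTYPES = {"positive": ["good", "excellent"], "negative": ["bad", "awful"]}

for word in ("superb", "dreadful"):
    print(word, "->", classify(word, TOY_VECTORS, PROTOTYPES))
```

    The approach is weakly supervised in the sense that only the short prototype lists are labeled by hand; every other word is classified by its similarity to them.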