22 research outputs found

    An approach to automated thesaurus construction using clusterization-based dictionary analysis

    This paper proposes an automated approach to constructing a terminological thesaurus for a specific domain. It uses an explanatory dictionary as the initial text corpus and a controlled vocabulary related to the target lexicon to initiate extraction of terms for the thesaurus. The terms are subdivided into semantic clusters using the CLOPE clustering algorithm. The approach reduces the cost of thesaurus creation by involving the expert only once during the whole construction process, and only to analyze a small subset of the initial dictionary. To validate the approach, the authors constructed a thesaurus for the cardiology domain.

    Ontology-based Competency Analyses in New Research Domains

    Ontology-driven methods of competence management that support scientific research in new domains are proposed. Ontologies of the research domain are matched with personal information about researchers available on the Web (for example, on social networks), and the results of their work (publications, monographs, reports, etc.) are processed by logical methods and ontological analysis. Web services and the multi-agent programming paradigm are used for the software implementation.

    Hierarchical text clustering applied to taxonomy evaluation

    In computer science, taxonomies are widely used in fields such as Artificial Intelligence, Information Retrieval, Natural Language Processing and Machine Learning. These concept classifications provide knowledge structures that guide algorithms toward an acceptable-to-nearly-optimal solution on non-deterministic problems. The main problem with taxonomies is the huge amount of effort required to build one. Traditionally, this is done by hand and involves a team of experts to assure the quality of the result. Although this is evidently the way to get the best taxonomy possible (knowledge is an exclusive quality of humans), the manpower involved makes it neither the fastest nor the cheapest one. This thesis makes an extensive review of the state of the art in taxonomy induction techniques as well as ontology evaluation methods. It argues the need for a fast, automatic, arbitrary-domain taxonomy generation method and justifies the choice of the Wikipedia encyclopedia as the dataset. A framework for working with taxonomies is proposed and implemented. In the experiments chapter, two statements are successfully refuted: that the Wikipedia categorization system forms an acyclic directed graph, and that the longest path between two nodes is equivalent to the taxonomic organization. Finally, the framework is used to explore three arbitrary domains.

    Automatic Terminology Coding for the Biomedical Domain

    The biomedical sector, rich in unstructured data from sources like clinical notes and health records, presents a prime opportunity for Natural Language Processing (NLP) applications. Especially pivotal is the task of entity linking, wherein textual mentions are mapped to medical concepts within a knowledge base, in this case, represented by the Unified Medical Language System (UMLS) Metathesaurus. Within this realm, the Italian language faces resource constraints (only 4% of UMLS 4M concepts have a label in the Italian language). Current systems like MAPS Group’s Clinika software lean on label matching to link the extracted facts to the corresponding UMLS concepts. This dissertation deals with the design of a new Clinika component aimed at enhancing entity linking for Italian terms against UMLS, even in the absence of direct Italian labels. Employing transformer-based multilingual embeddings, a novel 'concept guesser' architecture was developed to tackle the linking challenge intelligently, maximizing the level of exploitation of the currently available knowledge. This innovation not only enhances Clinika’s effectiveness but also paves the way for advanced multilingual clinical decision support systems

    Bibliographic Control in the Digital Ecosystem

    With contributions from international experts, the book explores the new boundaries of universal bibliographic control. Bibliographic control is radically changing because the bibliographic universe is radically changing: resources, agents, technologies, standards and practices. The main topics addressed include: library cooperation networks; legal deposit; national bibliographies; new tools and standards (IFLA LRM, RDA, BIBFRAME); authority control and new alliances (Wikidata, Wikibase, Identifiers); new ways of indexing resources (artificial intelligence); institutional repositories; the new book supply chain; “discoverability” in the IIIF digital ecosystem; the role of thesauri and ontologies in the digital ecosystem; bibliographic control and search engines.

    Big data-driven multimodal traffic management : trends and challenges

    Social work with airports passengers

    Social work at the airport consists of offering social services to passengers. The main methodological premise is that passengers are under stress, which is characterized by a particular set of features in appearance and behavior. In such circumstances, a passenger's actions attract attention. Only a person whom the passenger trusts can help with documents or provide psychological support.

    A robust methodology for automated essay grading

    None of the available automated essay grading systems can be used to grade essays according to the National Assessment Program – Literacy and Numeracy (NAPLAN) analytic scoring rubric used in Australia. This thesis is a humble effort to address this limitation. Its objective is to develop a robust methodology for automatically grading essays against the NAPLAN rubric, using heuristics and rules based on the English language together with neural network modelling.

    Connected Attribute Filtering Based on Contour Smoothness
