Epistemic logic for metadata modelling from scientific papers on Covid-19

Author: Cuconato Simone
Publication venue: APAV
Publication date: 31/12/2021

The field of epistemic logic has developed into an interdisciplinary area focused on explicating epistemic issues in, for example, artificial intelligence, computer security, game theory, economics, multiagent systems and the social sciences. Inspired in part by issues in these different ‘application’ areas, in this paper I propose an epistemic logic T for metadata extracted from scientific papers on COVID-19. In more detail, I introduce a structure S for syntactically and semantically modelling metadata produced by systems that extract structured metadata from born-digital scientific articles. In the logical model, these systems are treated as ‘Metadata extraction agents’ (MEA). The MEA considered here are CERMINE and TeamBeam. In an increasingly data-driven world, modelling data or metadata helps to systematise existing information and supports the research community in building solutions to the COVID-19 pandemic.

Science & Philosophy

APAV - Academy of Sciences, Letters, Arts and Technology (E-Journals)

Towards a Modular Recommender System for Research Papers written in Albanian

Author: Gani Eriglen
Greca Silvana
Hoxha Klesti
Kika Alda
Publication venue: 'The Science and Information Organization'
Publication date: 01/05/2014

In recent years there has been an increase in scientific paper publications in Albania and in neighboring countries that have large communities of Albanian-speaking researchers. Many of these papers are written in Albanian. Finding papers related to a researcher's work is very time consuming, because there is no concrete system that facilitates this process. In this paper we present the design of a modular intelligent search system for articles written in Albanian. Its main part is the recommender module, which facilitates searching by providing users with articles relevant to a given one. We used a cosine-similarity-based heuristic that weights term frequencies differently depending on their location in the article. We did not notice big differences in the recommendation results when using different combinations of the importance factors for the keywords, title, abstract and body. We got similar results when using only the title and abstract as with the other combinations. Because we got fairly good results with this initial approach, we believe that similar recommender systems for documents written in Albanian can also be built in contexts not related to scientific publishing. Comment: 8 pages
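The location-weighted cosine similarity described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the section weights, the whitespace tokenizer, and the function names (`weighted_term_vector`, `cosine_similarity`, `recommend`) are all assumptions, since the abstract does not give the importance factors or the preprocessing used.

```python
import math
from collections import Counter

# Hypothetical importance factors per article section (assumed values;
# the abstract does not state the ones actually used).
SECTION_WEIGHTS = {"keywords": 3.0, "title": 2.0, "abstract": 1.5, "body": 1.0}


def weighted_term_vector(article, weights=SECTION_WEIGHTS):
    """Sum term frequencies over sections, scaling each by its section weight."""
    vector = Counter()
    for section, weight in weights.items():
        # Naive whitespace tokenization, used here only for illustration.
        for term in article.get(section, "").lower().split():
            vector[term] += weight
    return vector


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(query, candidates, top_n=3):
    """Return indices of the candidates most similar to the query article."""
    q = weighted_term_vector(query)
    scored = [
        (cosine_similarity(q, weighted_term_vector(c)), i)
        for i, c in enumerate(candidates)
    ]
    # Highest similarity first; ties are broken by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:top_n]]
```

Restricting `SECTION_WEIGHTS` to the `title` and `abstract` keys would correspond to the title-and-abstract-only configuration that the abstract reports as giving similar results.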

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

Doctor of Philosophy

Author: Bui Duy Duc An
Publication venue: University of Utah
Publication date: 01/01/2015

Medical knowledge learned in medical school can quickly become outdated given the tremendous growth of the biomedical literature. It is the responsibility of medical practitioners to continuously update their knowledge with recent, best available clinical evidence to make informed decisions about patient care. However, clinicians often have little time to spend on reading the primary literature, even within their narrow specialty. As a result, they often rely on systematic evidence reviews developed by medical experts to fulfill their information needs. At present, systematic reviews of clinical research are manually created and updated, which is expensive, slow, and unable to keep up with the rapidly growing pace of medical literature. This dissertation research aims to enhance the traditional systematic review development process using computer-aided solutions. The first study investigates query expansion and scientific quality ranking approaches to enhance literature search on clinical guideline topics. The study showed that unsupervised methods can improve the retrieval performance of a popular biomedical search engine (PubMed). The proposed methods improve the comprehensiveness of literature search and increase the ratio of finding relevant studies with reduced screening effort. The second and third studies aim to enhance the traditional manual data extraction process. The second study developed a framework to extract and classify texts from PDF reports. This study demonstrated that a rule-based multipass sieve approach is more effective than a machine-learning approach in categorizing document-level structures and that classifying and filtering publication metadata and semistructured texts enhances the performance of an information extraction system. The proposed method could serve as a document processing step in any text mining research on PDF documents.
The third study proposed a solution for computer-aided data extraction by recommending relevant sentences and key phrases extracted from publication reports. This study demonstrated that using a machine-learning classifier to prioritize sentences for specific data elements performs as well as or better than an abstract screening approach, and might save time and reduce errors in the full-text screening process. In summary, this dissertation showed that there are promising opportunities for technology enhancement to assist in the development of systematic reviews. In this modern age, when computing resources are getting cheaper and more powerful, the failure to apply computer technologies to assist and optimize the manual processes is a lost opportunity to improve the timeliness of systematic reviews. This research provides methodologies and tests hypotheses, which can serve as the basis for further large-scale software engineering projects aimed at fully realizing the prospect of computer-aided systematic reviews.
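The general idea of a rule-based multipass sieve mentioned in this abstract can be illustrated with a small sketch: rules are applied in passes, ordered from most to least precise, and each pass only labels lines that earlier passes left undecided. The three rules, their regular expressions, and the label names below are hypothetical and chosen only for illustration; the dissertation's actual rules are not given in the abstract.

```python
import re

# Hypothetical sieves, ordered from most precise to least precise.
# Each returns a label for the line, or None to defer to a later pass.


def sieve_doi(line):
    """Lines carrying a DOI are treated as publication metadata."""
    return "metadata" if re.search(r"\bdoi:\s*10\.\d{4,}", line, re.I) else None


def sieve_heading(line):
    """Short capitalized lines, optionally numbered, are treated as headings."""
    pattern = r"(\d+(\.\d+)*\s+)?[A-Z][A-Za-z ]{2,40}"
    return "heading" if re.fullmatch(pattern, line.strip()) else None


def sieve_default(line):
    """Any remaining non-empty line falls back to body text."""
    return "body" if line.strip() else None


SIEVES = [sieve_doi, sieve_heading, sieve_default]


def classify_lines(lines, sieves=SIEVES):
    """Run each sieve over the whole input, labelling only unlabelled lines."""
    labels = [None] * len(lines)
    for sieve in sieves:
        for i, line in enumerate(lines):
            if labels[i] is None:
                labels[i] = sieve(line)
    return labels
```

Because later passes never overwrite earlier decisions, the ordering of `SIEVES` encodes which rules are trusted most.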

The University of Utah: J. Willard Marriott Digital Library

A semi-automatic approach for detecting dataset references in social science texts

Author: Afzal
Kaur
Nadeau
Powers
Salton
Zhang
Publication venue: 'IOS Press'

Crossref

CERMINE: automatic extraction of structured metadata from scientific literature

Author: A McCallum
C Chang
CH Lee
Dominika Tkaczyk
J Zou
L O’Gorman
LA Goodman
M Luong
Mateusz Fedoryszak
Paweł Szostek
Piotr Jan Dendek
R Kern
T Smith
X Zhang
Łukasz Bolikowski
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Automatic metadata extraction for the administration of the Institutional Repository of the Universidad Nacional del Altiplano Puno

Author: Herrera Urtiaga Alain Paul
Publication venue: 'Baishideng Publishing Group Inc.'
Publication date: 22/07/2022

Institutional repositories make it possible to organise and preserve an institution's scientific output. This research aims to optimise metadata extraction and the publication of research documents, two time-consuming processes that are fundamental to the administration of institutional repositories, by implementing the software “E-MeRI”; the study population consists of 1518 research documents. The system was developed using layered programming, and the hypothesis was tested with a paired-samples t-test. For automatic extraction, an algorithm was built using natural language processing techniques; its algorithmic complexity was determined to be linear, O(n), and it proved efficient in comparison with other extraction tools. The level of accuracy was also determined to be between 96% and 99% correct results, based on the Precision and Recall metrics. In terms of extraction time, the system achieves a reduction of 5 minutes and 21 seconds per document and was able to extract 4 documents per minute. It is concluded that automatic metadata extraction and the publication of research documents improve the administration of the Institutional Repository of the Universidad Nacional del Altiplano, significantly reducing extraction and publication time with a p-value (0.000) < α = 0.05; in addition, the evaluation of the software based on the ISO 25000 standard yielded an overall quality score of 8.93, achieving a level of "meets the requirements" and a grade of "very satisfactory".
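As an illustration of the Precision and Recall metrics the abstract refers to, the sketch below compares a set of extracted metadata fields against a hand-checked reference set. The function name `precision_recall` and the representation of metadata as (field, value) pairs are assumptions made for this example; the abstract does not describe how E-MeRI's results were scored in detail.

```python
def precision_recall(extracted, expected):
    """Compute precision and recall for extracted (field, value) pairs.

    precision = correct extracted pairs / all extracted pairs
    recall    = correct extracted pairs / all expected pairs
    """
    extracted, expected = set(extracted), set(expected)
    true_positives = len(extracted & expected)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall
```

For instance, if an extractor returns three pairs of which two match a four-pair reference, precision is 2/3 and recall is 2/4.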

Repositorio Institucional Digital de la Universidad Nacional del Altiplano Puno (UNAP)

Linking Textual Resources to Support Information Discovery

Author: Knoth Petr
Publication date: 14/05/2015

A vast amount of information is today stored in the form of textual documents, many of which are available online. These documents come from different sources and are of different types. They include newspaper articles, books, corporate reports, encyclopedia entries and research papers. At a semantic level, these documents contain knowledge, which was created by explicitly connecting information and expressing it in the form of a natural language. However, a significant amount of knowledge is not explicitly stated in a single document, yet can be derived or discovered by researching, i.e. accessing, comparing, contrasting and analysing, information from multiple documents. Carrying out this work using traditional search interfaces is tedious due to information overload and the difficulty of formulating queries that would help us to discover information we are not aware of. In order to support this exploratory process, we need to be able to effectively navigate between related pieces of information across documents. While information can be connected using manually curated cross-document links, this approach not only does not scale, but cannot systematically assist us in the discovery of sometimes non-obvious (hidden) relationships. Consequently, there is a need for automatic approaches to link discovery. This work studies how people link content, investigates the properties of different link types, presents new methods for automatic link discovery and designs a system in which link discovery is applied on a collection of millions of documents to improve access to public knowledge

Open Research Online (The Open University)