798 research outputs found
ChartCheck: An Evidence-Based Fact-Checking Dataset over Real-World Chart Images
Data visualizations are common in the real world. We often encounter them in
sources such as scientific documents, news articles, textbooks, and social
media, where they summarize key information in visual form. Charts can also
mislead their audience by communicating false information or biasing them
towards a specific agenda. Verifying claims against charts is not a
straightforward process. It requires analyzing both the text and visual
components of the chart, considering characteristics such as colors, positions,
and orientations. Moreover, determining whether a claim is supported by the
chart content often requires different types of reasoning. To address this
challenge, we introduce ChartCheck, a novel dataset for fact-checking against
chart images. ChartCheck is the first large-scale dataset of its kind, with
1.7k real-world charts and 10.5k human-written claims and explanations. We
evaluated state-of-the-art models on the dataset and achieved an accuracy of
73.9 in the fine-tuned setting. Additionally, we identified chart
characteristics and reasoning types that challenge the models.
A systematic literature review on Wikidata
To review the current status of research on Wikidata and, in particular, of articles that either describe applications of Wikidata or provide empirical evidence, in order to uncover the topics of interest, the fields that are benefiting from its applications, and the researchers and institutions that are leading the work.
A matter of words: NLP for quality evaluation of Wikipedia medical articles
Automatic quality evaluation of Web information is a task with many fields of
application and of great relevance, especially in critical domains like the
medical one. We start from the intuition that the quality of the content of
medical Web documents is affected by features related to the specific domain:
first, the usage of a specific vocabulary (Domain Informativeness); then, the
adoption of specific codes (like those used in the infoboxes of Wikipedia
articles) and the type of document (e.g., historical and technical ones). In
this paper, we propose to leverage specific domain features to improve the
evaluation of Wikipedia medical articles. In particular, we evaluate the
articles with an "actionable" model, whose features are related to the content
of the articles, so that the model can also directly suggest strategies for
improving a given article's quality. We rely on Natural Language Processing
(NLP) and dictionary-based techniques to extract the bio-medical concepts in a
text. We demonstrate the effectiveness of our approach by classifying the
medical articles of the Wikipedia Medicine Portal, which had previously been
manually labeled by the WikiProject team. The results of our experiments
confirm that, by considering domain-oriented features, it is possible to obtain
noticeable improvements over existing solutions, mainly for those articles that
other approaches classified less correctly. Besides being interesting in their
own right, the results call for further research in the area of domain-specific
features suitable for Web data quality assessment.
Biological Systems Workbook: Data modelling and simulations at molecular level
Nowadays, huge quantities of data surround the different fields of biology, derived from experiments and theoretical simulations; the results are often stored in biological databases that grow at a vertiginous rate every year. There is therefore increasing research interest in the application of mathematical and physical models able to produce reliable predictions and explanations to understand and rationalize that information. All these investigations are helping to answer biological questions, pushing forward the solution of problems faced by our society.
In this Biological Systems Workbook, we aim to introduce the basic pieces that allow life to take place, from the 3D structural point of view. We will start by learning how to look at the 3D structure of molecules, studying small organic molecules used as drugs. Along the way, we will learn some methods that help us generate models of these structures. Then we will move to more complex natural organic molecules, such as lipids or carbohydrates, learning how to estimate and reproduce their dynamics. Later, we will review the structure of more complex macromolecules, such as proteins or DNA. Throughout this process, we will refer to the different computational tools and databases that help us search for, analyze, and model the molecular systems studied in this course.
Discovery and publishing of primary biodiversity data associated with multimedia resources: The Audubon Core strategies and approaches
The Audubon Core Multimedia Resource Metadata Schema is a representation-free vocabulary for the description of biodiversity multimedia resources and collections, now in the final stages as a proposed Biodiversity Information Standards (TDWG) standard. By defining only six terms as mandatory, it seeks to lighten the burden of providing or using multimedia useful for biodiversity science. At the same time, it offers rich optional metadata terms that can help curators of multimedia collections provide authoritative media that document species occurrence, ecosystems, identification tools, ontologies, and many other kinds of biodiversity documents or data. About half of the vocabulary is re-used from other relevant controlled vocabularies that are often already in use for multimedia metadata, thereby reducing the mapping burden on existing repositories. A central design goal is to give consuming applications a high likelihood of discovering suitable resources, reducing the human examination effort that might be required to decide whether a resource is fit for the purpose of the application.
NFDI4Culture - Consortium for research data on material and immaterial cultural heritage
Digital data on tangible and intangible cultural assets is an essential part of daily life, communication and experience. It has a lasting influence on the perception of cultural identity as well as on the interactions between research, the cultural economy and society. Throughout the last three decades, many cultural heritage institutions have contributed a wealth of digital representations of cultural assets (2D digital reproductions of paintings, sheet music, 3D digital models of sculptures, monuments, rooms, buildings), audio-visual data (music, film, stage performances), and procedural research data such as encoding and annotation formats. The long-term preservation and FAIR availability of research data from the cultural heritage domain is fundamentally important, not only for future academic success in the humanities but also for the cultural identity of individuals and society as a whole. Up to now, no coordinated effort for professional research data management on a national level has existed in Germany. NFDI4Culture aims to fill this gap and create a user-centered, research-driven infrastructure that will cover a broad range of research domains, from musicology, art history and architecture to performance, theatre, film, and media studies.
The research landscape addressed by the consortium is characterized by strong institutional differentiation. Research units in the consortium's community of interest comprise university institutes, art colleges, academies, galleries, libraries, archives and museums. This diverse landscape is also characterized by an abundance of research objects, methodologies and a great potential for data-driven research. In a unique effort carried out by the applicant and co-applicants of this proposal and ten academic societies, this community is interconnected for the first time through a federated approach that is ideally suited to the needs of the participating researchers. Promoting collaboration within the NFDI, sharing knowledge and technology, and providing extensive support for its users have been the guiding principles of the consortium from the beginning and will be at the heart of all workflows and decision-making processes. Thanks to these principles, NFDI4Culture has gathered strong support ranging from individual researchers to high-level cultural heritage organizations such as UNESCO, the International Council of Museums, the Open Knowledge Foundation and Wikimedia. On this basis, NFDI4Culture will take innovative measures that promote a cultural change towards a more reflective and sustainable handling of research data and at the same time boost qualification and professionalization in data-driven research in the domain of cultural heritage. This will create a long-lasting impact on science, the cultural economy and society as a whole.
Bravo MaRDI: A Wikibase Powered Knowledge Graph on Mathematics
Mathematical world knowledge is a fundamental component of Wikidata. However,
to date, no expertly curated knowledge graph has focused specifically on
contemporary mathematics. Addressing this gap, the Mathematical Research Data
Initiative (MaRDI) has developed a comprehensive knowledge graph that links
multimodal research data in mathematics. This encompasses traditional research
data items like datasets, software, and publications, and includes semantically
advanced objects such as mathematical formulas and hypotheses. This paper
details the capabilities of the MaRDI knowledge graph, which is based on
Wikibase, leading up to its inaugural public release, codenamed Bravo,
available at https://portal.mardi4nfdi.de.
Comment: Accepted at Wikidata'23: Wikidata workshop at ISWC 202
Abstract Images Have Different Levels of Retrievability Per Reverse Image Search Engine
Much computer vision research has focused on natural images, but technical
documents typically contain abstract images, such as charts, drawings,
diagrams, and schematics. How well do general web search engines discover
abstract images? Recent advancements in computer vision and machine learning
have led to the rise of reverse image search engines. Where conventional search
engines accept a text query and return a set of document results, including
images, a reverse image search accepts an image as a query and returns a set of
images as results. This paper evaluates how well common reverse image search
engines discover abstract images. We conducted an experiment leveraging images
from Wikimedia Commons, a website known to be well indexed by Baidu, Bing,
Google, and Yandex. We measure how difficult an image is to find again
(retrievability), what percentage of images returned are relevant (precision),
and the average number of results a visitor must review before finding the
submitted image (mean reciprocal rank). When trying to discover the same image
again among similar images, Yandex performs best. When searching for pages
containing a specific image, Google and Yandex outperform the others when
discovering photographs, with precision scores of 0.8191 and 0.8297,
respectively. In both of these cases, Google and Yandex perform better with
natural images than with abstract ones, achieving a difference in
retrievability as high as 54% between images in these categories. These results
affect anyone applying common web search engines to search for technical
documents that use abstract images.
Comment: 20 pages; 7 figures; to be published in the proceedings of the
Drawings and Abstract Imagery: Representation and Analysis (DIRA) Workshop
from ECCV 202
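The retrieval metrics named in this abstract (precision and mean reciprocal rank) can be sketched in a few lines of Python. This is an illustrative implementation of the standard definitions, not the paper's actual evaluation code, and the image identifiers are invented for the example.

```python
# Illustrative sketch of the standard retrieval metrics: precision and
# mean reciprocal rank (MRR). Image IDs below are hypothetical examples.

def precision(results, relevant):
    """Fraction of returned results that are relevant."""
    if not results:
        return 0.0
    return sum(1 for r in results if r in relevant) / len(results)

def reciprocal_rank(results, target):
    """1/rank of the first occurrence of the submitted image; 0 if absent."""
    for rank, r in enumerate(results, start=1):
        if r == target:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(result_lists, targets):
    """Average reciprocal rank over a set of queries."""
    rrs = [reciprocal_rank(res, t) for res, t in zip(result_lists, targets)]
    return sum(rrs) / len(rrs)

# Two example queries: the submitted image appears at rank 1 and rank 4.
queries = [["img_a", "img_x", "img_y"],
           ["img_p", "img_q", "img_r", "img_b"]]
targets = ["img_a", "img_b"]
print(mean_reciprocal_rank(queries, targets))  # (1/1 + 1/4) / 2 = 0.625
```

A reciprocal rank near 1 means the submitted image surfaces at or near the top of the result list, which is why MRR serves as a proxy for "how many results a visitor must review."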
- …