4,855 research outputs found

    Do peers see more in a paper than its authors?

    Recent years have shown a gradual shift in the freely accessible content of biomedical publications, from titles and abstracts to full text. This has enabled new forms of automatic text analysis and has given rise to some interesting questions: How informative is the abstract compared to the full text? What important information in the full text is not present in the abstract? What should a good summary contain that is not already in the abstract? Do authors and peers see an article differently? We answer these questions by comparing the information content of the abstract with that of the article's citances, i.e., the sentences in other papers that cite it. We contrast the important points of an article as judged by its authors with those seen by its peers. Focusing on the area of molecular interactions, we perform manual and automatic analyses and find that the set of all citances to a target article not only covers most of the information (entities, functions, experimental methods, and other biological concepts) found in its abstract, but also contains 20% more concepts. We further present a detailed summary of the differences across information types, and we examine the effects that other citations and time have on the content of citances.
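The comparison this abstract describes reduces to set operations over extracted concepts. A minimal sketch, assuming concepts have already been extracted as string sets (the sets and function name below are illustrative, not data or code from the study):

```python
# Sketch of the abstract-vs-citances comparison: given concept sets
# extracted from an abstract and from all citances to the same paper,
# measure how much of the abstract is covered and how many extra
# concepts the citances add. Placeholder data, not from the study.

def coverage_stats(abstract_concepts: set, citance_concepts: set) -> dict:
    covered = abstract_concepts & citance_concepts
    extra = citance_concepts - abstract_concepts
    return {
        "coverage": len(covered) / len(abstract_concepts),
        "extra_ratio": len(extra) / len(abstract_concepts),
    }

abstract_c = {"kinase", "phosphorylation", "yeast two-hybrid"}
citance_c = {"kinase", "phosphorylation", "yeast two-hybrid", "binding site"}
stats = coverage_stats(abstract_c, citance_c)
# here the citances cover the whole abstract and add one-third more concepts
```

The paper's "20% more concepts" finding corresponds to an `extra_ratio` of roughly 0.2 under this kind of accounting.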

    Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework

    Multimodal reasoning is a critical component in the pursuit of artificial intelligence systems that exhibit human-like intelligence, especially when tackling complex tasks. While the chain-of-thought (CoT) technique has gained considerable attention, the existing ScienceQA dataset, which focuses on multimodal scientific questions and explanations from elementary and high school textbooks, lacks a comprehensive evaluation of diverse approaches. To address this gap, we present the COCO Multi-Modal Reasoning Dataset (COCO-MMRD), a novel dataset that encompasses an extensive collection of open-ended questions, rationales, and answers derived from the large-scale object dataset COCO. Unlike previous datasets that rely on multiple-choice questions, our dataset pioneers the use of open-ended questions in the context of multimodal CoT, introducing a more challenging problem that effectively assesses the reasoning capability of CoT models. Through comprehensive evaluations and detailed analyses, we provide valuable insights and propose innovative techniques, including multi-hop cross-modal attention and sentence-level contrastive learning, to enhance the image and text encoders. Extensive experiments demonstrate the efficacy of the proposed dataset and techniques, offering novel perspectives for advancing multimodal reasoning.
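To illustrate the "multi-hop cross-modal attention" idea named above, here is a toy single-head sketch: a text query vector attends over image-region vectors, and "multi-hop" means repeating the attention step with the refined query. All vectors, dimensions, and hop counts are invented for illustration; this is not the paper's architecture.

```python
import math

# Toy cross-modal attention: a text token vector attends over image
# region vectors; "multi-hop" re-applies the step, refining the query.

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attend(query, keys, values):
    # scaled dot-product attention for one query vector
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def multi_hop(query, regions, hops=2):
    q = query
    for _ in range(hops):
        # each hop re-queries the image regions with the refined vector
        q = cross_attend(q, regions, regions)
    return q

text_q = [1.0, 0.0]                       # toy text embedding
image_regions = [[1.0, 0.0], [0.0, 1.0]]  # toy region embeddings
out = multi_hop(text_q, image_regions, hops=2)
```

The output stays a convex combination of the region vectors, still weighted toward the region most similar to the original text query.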

    SWA-KMDLS: An Enhanced e-Learning Management System Using Semantic Web and Knowledge Management Technology

    In this era of the knowledge economy, in which knowledge has become the most precious resource, surveys have shown that e-Learning is on an increasing trend in various organizations, including education and the corporate sector. The use of e-Learning aims not only at acquiring knowledge but also at maintaining competitiveness and advantage for individuals and organizations. However, the early promise of e-Learning has yet to be fully realized: in many cases it has been no more than handouts published online, coupled with simple multiple-choice quizzes. The emergence of e-Learning 2.0, empowered by Web 2.0 technology, still struggles to overcome common problems such as information overload and poor content aggregation amid the rapidly growing number of learning objects in an e-Learning Management System (LMS) environment. The aim of this research is to exploit Semantic Web (SW) and Knowledge Management (KM) technology, two emerging and promising technologies, to enhance the existing LMS. The proposed system is named the Semantic Web Aware-Knowledge Management Driven e-Learning System (SWA-KMDLS). An ontology approach, the backbone of SW and KM, is introduced for managing knowledge, especially from learning objects, and for developing an automated question answering system (Aquas) with an expert locator in SWA-KMDLS. The METHONTOLOGY methodology is used to develop the ontology in this work. The findings identify the potential of SW and KM technology to benefit e-Learning developers in building e-Learning systems, especially with a social constructivist pedagogical approach, from the point of view of a KM framework and SW environment. The (semi-)automatic ontological knowledge base construction system (SAOKBCS) contributes to semi-automatic knowledge extraction from learning objects, while Aquas with the expert locator facilitates knowledge retrieval and encourages knowledge sharing in the e-Learning environment.
The experiments conducted show that SAOKBCS can extract concepts, the main components of an ontology, from textual learning objects with a precision of 86.67%, saving experts the time and effort of building ontologies manually. Additionally, experiments on Aquas show that more than 80% of users are satisfied with the answers provided by the system. The expert locator framework can further improve the performance of Aquas in future use. Keywords: semantic web aware – knowledge e-Learning Management System (SWA-KMDLS), semi-automatic ontological knowledge base construction system (SAOKBCS), automated question answering system (Aquas), ontology, expert locator

    AMMO-Prot: amine system project 3D-model finder

    BACKGROUND: Amines are biogenic amino acid derivatives that play pleiotropic, very important, yet complex roles in animal physiology. As for many other relevant biomolecules, biochemical and molecular data are being accumulated that need to be integrated in order to be effective in advancing biological knowledge in the field. For this purpose, a multidisciplinary group has started an ontology-based system named the Amine System Project (ASP), for which amine-related information is the validation bench. RESULTS: In this paper, we describe the Ontology-Based Mediator developed in the Amine System Project (http://asp.uma.es) using the infrastructure of Semantic Directories, and how this system has been used to solve a case related to amine metabolism-related protein structures. CONCLUSIONS: This infrastructure is used to publish and manage not only ontologies and their relationships, but also metadata relating to the resources associated with the ontologies. The system developed is available at http://asp.uma.es/WebMediator

    Hadooping the genome: The impact of big data tools on biology

    This essay examines the consequences of the so-called ‘big data’ technologies in biomedicine. Analyzing algorithms and data structures used by biologists can provide insight into how biologists perceive and understand their objects of study. As such, I examine some of the most widely used algorithms in genomics: those used for sequence comparison or sequence mapping. These algorithms are derived from the powerful tools for text searching and indexing that have been developed since the 1950s and now play an important role in online search. In biology, sequence comparison algorithms have been used to assemble genomes, process next-generation sequence data, and, most recently, for ‘precision medicine.’ I argue that the predominance of a specific set of text-matching and pattern-finding tools has influenced problem choice in genomics. It allowed genomics to continue to think of genomes as textual objects and increasingly locked genomics into ‘big data’-driven text-searching methods. Many ‘big data’ methods are designed for finding patterns in human-written texts. However, genomes and other ‘omic data are not human-written and are unlikely to be meaningful in the same way.
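The text-indexing lineage the essay describes is visible in even the simplest sequence-mapping tool: a k-mer hash index, which looks up read "seeds" in a reference exactly the way a text search engine looks up terms. A minimal sketch, purely illustrative (the reference string and function names are invented):

```python
from collections import defaultdict

# Build a hash table mapping every length-k substring of a reference
# sequence to its positions, then look up a read's first k-mer as a
# seed for candidate alignment positions -- text indexing applied to DNA.

def build_index(reference: str, k: int) -> dict:
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def seed_positions(read: str, index: dict, k: int) -> list:
    # candidate positions where the read's leading k-mer occurs
    return index.get(read[:k], [])

ref = "ACGTACGTGA"
idx = build_index(ref, 4)
print(seed_positions("ACGT", idx, 4))  # [0, 4]
```

Real mappers (BWA, Bowtie and kin) use compressed full-text indexes rather than plain hash tables, but the pattern, index the genome as text and query it like a corpus, is the same.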

    A P2P Integration Architecture for Protein Resources

    The availability of a direct pathway from a primary sequence (de novo or DNA-derived) to macromolecular structure to biological function using computer-based tools is the ultimate goal for a protein scientist. Today's state-of-the-art protein resources and ongoing research and experiments provide the raw data that can enable protein scientists to achieve at least some steps of this goal. Thus, protein scientists are looking to take their benchtop research from the specific to a much broader base that uses the large resources of available electronic information. Currently, however, the burden falls on the scientist to manually interface with each data resource, integrate the required information, and finally interpret the results. Their discoveries are impeded by the lack of tools that can not only bring together integrated information from several known data resources, but also weave in information as it is discovered and brought online by other research groups. We propose a novel peer-to-peer architecture that allows protein scientists to share resources, in the form of data and tools, within their community, facilitating ad hoc, decentralized sharing of data. In this paper, we present an overview of this integration architecture and briefly describe the tools that are essential to this framework.

    Normalized Web Distance and Word Similarity

    There is a great deal of work in cognitive psychology, linguistics, and computer science on using word (or phrase) frequencies in context in text corpora to develop measures of word similarity or word association, going back to at least the 1960s. The goal of this chapter is to introduce the normalized web distance (NWD) method to determine similarity between words and phrases. It is a general way to tap the amorphous low-grade knowledge available for free on the Internet, typed in by local users aiming at personal gratification of diverse objectives, and yet globally achieving what is effectively the largest semantic electronic database in the world. Moreover, this database is available to all by using any search engine that can return aggregate page-count estimates for a large range of search queries. In the paper introducing the NWD it was called the ‘normalized Google distance (NGD),’ but since Google no longer allows computer searches, we opt for the more neutral and descriptive NWD. (To appear in: Handbook of Natural Language Processing, Second Edition, Nitin Indurkhya and Fred J. Damerau, Eds., CRC Press, Taylor and Francis Group, Boca Raton, FL, 2010.)
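The NWD itself has a compact definition, known from the NGD/NWD literature: for terms x and y with page counts f(x), f(y), joint count f(x, y), and N the (estimated) number of pages indexed by the search engine, NWD(x, y) = (max(log f(x), log f(y)) − log f(x, y)) / (log N − min(log f(x), log f(y))). A sketch with made-up page counts:

```python
import math

# Normalized web distance from page counts. The counts and N below are
# invented illustrative numbers, not real search-engine figures.

def nwd(fx: float, fy: float, fxy: float, n: float) -> float:
    lx, ly, lxy, ln = math.log(fx), math.log(fy), math.log(fxy), math.log(n)
    return (max(lx, ly) - lxy) / (ln - min(lx, ly))

# closely related terms co-occur on many pages -> small distance
close = nwd(fx=1e6, fy=8e5, fxy=5e5, n=1e11)
# unrelated terms rarely co-occur -> larger distance
far = nwd(fx=1e6, fy=8e5, fxy=1e2, n=1e11)
```

Because only aggregate page counts enter the formula, any engine that reports them can serve as the backing database, which is exactly the generality the chapter's rename from NGD to NWD reflects.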

    Similarity Reasoning over Semantic Context-Graphs

    Similarity is a central cognitive mechanism for humans, one that enables a broad range of perceptual and abstraction processes, including recognizing and categorizing objects, drawing parallels, and predicting outcomes. It has been studied computationally through models designed to replicate human judgment. The work presented in this dissertation leverages general-purpose semantic networks to derive similarity measures in a problem-independent manner. We model both general and relational similarity using connectivity between concepts within semantic networks. Our first contribution is to model general similarity using concept connectivity, which we use to partition vocabularies into topics without the need for document corpora. We apply this model to derive topics from unstructured dialog, specifically enabling an early literacy primer application to support parents in having better conversations with their young children as they use the primer together. Second, we model relational similarity in proportional analogies. To do so, we derive relational parallelism by searching in semantic networks for similar path pairs that connect either side of the analogy statement. We then derive human-readable explanations from the resulting similar path pair. We show that our model can answer broad-vocabulary analogy questions designed for human test takers with high confidence. The third contribution is to enable symbolic plan repair in robot planning through object substitution. When a failure occurs due to unforeseen changes in the environment, such as missing objects, we enable the planning domain to be extended with a number of alternative objects such that the plan can be repaired and execution can continue. To evaluate this type of similarity, we use both general and relational similarity. We demonstrate that the task context is essential in establishing which objects are interchangeable.
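The "similar path pairs" idea for proportional analogies can be sketched concretely: represent the semantic network as labeled edges, find the relation-path between each pair of an analogy "a : b :: c : d", and call the analogy supported when both sides are linked by the same relation sequence. The toy network, labels, and function names below are invented for illustration and are not the dissertation's model:

```python
from collections import deque

# Toy labeled semantic network: (source, target) -> relation label
EDGES = {
    ("puppy", "dog"): "young_of",
    ("kitten", "cat"): "young_of",
    ("dog", "animal"): "is_a",
    ("cat", "animal"): "is_a",
}

def relation_path(graph, start, goal):
    # BFS over labeled edges; returns the sequence of edge labels, or None
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        node, path = frontier.popleft()
        if node == goal:
            return path
        for (u, v), label in graph.items():
            if u == node and v not in seen:
                seen.add(v)
                frontier.append((v, path + [label]))
    return None

def analogy_holds(graph, a, b, c, d):
    # "a : b :: c : d" is supported when both sides share a relation path
    p1, p2 = relation_path(graph, a, b), relation_path(graph, c, d)
    return p1 is not None and p1 == p2

print(analogy_holds(EDGES, "puppy", "dog", "kitten", "cat"))  # True
```

The matched label sequence doubles as the human-readable explanation the abstract mentions: here, both sides are connected by the single relation `young_of`.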

    Semantic linking of complex properties, monitoring processes and facilities in web-based representations of the environment

    Where a virtual representation of the Earth must contain data values observed within the physical Earth system, data models are required that allow the integration of data across the silos of the various Earth and environmental sciences domains. Creating a mapping between the well-defined terminologies of these silos is a stubborn problem. This paper presents a generalised ontology for use within Web 3.0 services, which builds on European Commission spatial data infrastructure models. The presented ontology acknowledges that there are many complexities to the description of environmental properties which can be observed within the physical Earth system. The ontology is shown to be flexible and robust enough to describe concepts drawn from a range of Earth science disciplines, including ecology, geochemistry, hydrology and oceanography. This paper also demonstrates the alignment and compatibility of the ontology with existing systems and shows applications in which the ontology may be deployed.