4,182 research outputs found

    Do peers see more in a paper than its authors?

    In recent years, the freely accessible content of biomedical publications has gradually shifted from titles and abstracts to full text. This has enabled new forms of automatic text analysis and has raised several questions: How informative is the abstract compared to the full text? What important information in the full text is not present in the abstract? What should a good summary contain that is not already in the abstract? Do authors and peers see an article differently? We answer these questions by comparing the information content of the abstract to that of citances, that is, sentences containing citations to that article. We contrast the important points of an article as judged by its authors with those seen by its peers. Focusing on the area of molecular interactions, we perform manual and automatic analysis, and we find that the set of all citances to a target article not only covers most of the information (entities, functions, experimental methods, and other biological concepts) found in its abstract, but also contains 20% more concepts. We further present a detailed summary of the differences across information types, and we examine the effects that other citations and time have on the content of citances.
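
The abstract-versus-citance comparison above can be illustrated with plain set arithmetic. This is a minimal sketch, not the authors' pipeline: the function `coverage_report`, the toy concept lists, and the choice to express extra concepts relative to the size of the abstract's concept set are all assumptions made for illustration.

```python
def coverage_report(abstract_concepts, citance_concepts):
    """Compare the concepts found in an abstract with those found in its citances."""
    abstract_set = set(abstract_concepts)
    citance_set = set(citance_concepts)
    shared = abstract_set & citance_set   # concepts both sources mention
    extra = citance_set - abstract_set    # concepts only the citances mention
    size = len(abstract_set)
    return {
        # Fraction of abstract concepts that also appear in the citances.
        "coverage": len(shared) / size if size else 0.0,
        # Citance-only concepts, relative to the abstract's concept count
        # (one possible reading of "more concepts"; an assumption here).
        "extra_ratio": len(extra) / size if size else 0.0,
        "missing": sorted(abstract_set - citance_set),
        "extra": sorted(extra),
    }


# Hypothetical toy data, not taken from the paper.
abstract = ["p53", "MDM2", "ubiquitination", "co-immunoprecipitation", "apoptosis"]
citances = ["p53", "MDM2", "ubiquitination", "apoptosis", "proteasome"]
report = coverage_report(abstract, citances)
print(report)
```

In practice the concept lists would come from a named-entity or concept tagger; the sketch only shows how coverage and surplus can be measured once such lists exist.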

    Hypotheses, evidence and relationships: The HypER approach for representing scientific knowledge claims

    Biological knowledge is increasingly represented as a collection of (entity-relationship-entity) triplets. These are queried, mined, appended to papers, and published. However, this representation ignores the argumentation contained within a paper and the relationships between the hypotheses, claims and evidence put forth in the article. In this paper, we propose an alternative view of the research article as a network of 'hypotheses and evidence'. Our knowledge representation focuses on scientific discourse as a rhetorical activity, which leads to a different direction in the development of tools and processes for modeling this discourse. We propose to extract knowledge from the article to allow the construction of a system in which a specific scientific claim is connected, through trails of meaningful relationships, to experimental evidence. We discuss some current efforts and future plans in this area.

    Extracting Scales of Measurement Automatically from Biomedical Text with Special Emphasis on Comparative and Superlative Scales

    This thesis focuses on extracting scales of measurement automatically from biomedical text, with special emphasis on comparative and superlative scales. Comparison sentences, considered as a critical part of scales of measurement, play a significant role in gathering information from a large number of biomedical research papers. A comparison sentence is defined as any sentence that contains two or more entities that are being compared. This thesis discusses several types of comparison sentences, such as gradable and non-gradable comparisons. The main goal is to extract comparison sentences automatically from the full text of biomedical articles. To that end, the thesis presents a Java program that analyzes biomedical text and identifies comparison sentences by matching the sentences in the text against 37 syntactic and semantic features. These features help to extract comparative sentences from any biomedical text. Two machine learning techniques are used with the 37 features to assess the curated dataset. The results of this study are compared with earlier studies.
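
As a rough illustration of matching sentences against surface patterns, the sketch below tags a sentence as a gradable or non-gradable comparison. It is written in Python rather than the thesis's Java, and the two regular expressions are hypothetical stand-ins; they are not the 37 features the thesis describes.

```python
import re

# Hypothetical patterns for illustration only.
# Gradable: a comparative word ("more", "less", "fewer", or a word ending
# in "-er") followed by "than" within the same sentence.
GRADABLE = re.compile(r"\b(?:more|less|fewer|\w+er)\b[^.]*?\bthan\b")
# Non-gradable: entities are compared without a degree expression.
NON_GRADABLE = re.compile(r"\b(?:similar to|differs? from|as \w+ as|compared (?:with|to))\b")


def comparison_type(sentence):
    """Return 'gradable', 'non-gradable', or None for a single sentence."""
    text = sentence.lower()
    if GRADABLE.search(text):
        return "gradable"
    if NON_GRADABLE.search(text):
        return "non-gradable"
    return None


sentences = [
    "Expression of A was higher than that of B.",
    "The response in mutants was similar to wild type.",
    "Cells were incubated overnight.",
]
labels = [comparison_type(s) for s in sentences]
print(labels)
```

Pattern matches like these would normally serve as inputs to a trained classifier rather than as the final decision, since surface cues alone produce false positives.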

    Automated PDF highlighting to support faster curation of literature for Parkinson's and Alzheimer's disease

    Neurodegenerative disorders such as Parkinson’s and Alzheimer’s disease are devastating and costly illnesses, and a source of major global burden. In order to provide successful interventions for patients and reduce costs, both causes and pathological processes need to be understood. The ApiNATOMY project aims to contribute to our understanding of neurodegenerative disorders by manually curating and abstracting data from the vast body of literature amassed on these illnesses. As curation is labour-intensive, we aimed to speed up the process by automatically highlighting those parts of the PDF document of primary importance to the curator. Using techniques similar to those of summarisation, we developed an algorithm that relies on linguistic, semantic and spatial features. Employing this algorithm on a test set manually corrected for tool imprecision, we achieved a macro F1-measure of 0.51, which is an increase of 132% compared to the best bag-of-words baseline model. A user-based evaluation was also conducted to assess the usefulness of the methodology on 40 unseen publications. It revealed that in 85% of cases all highlighted sentences are relevant to the curation task, and that in about 65% of cases the highlights are sufficient to support the knowledge curation task without needing to consult the full text. In conclusion, we believe that these are promising results for a step in automating the recognition of curation-relevant sentences. Refining our approach to pre-digest papers will lead to faster processing and cost reduction in the curation process.
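
To make the reported figures concrete, the sketch below shows how a macro-averaged F1 is computed from per-class counts and what baseline score a 132% relative increase to 0.51 would imply. The helper names are hypothetical, the abstract does not say which units the macro average runs over, and the baseline value is derived arithmetic rather than a number quoted from the paper.

```python
def f1(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_class_counts):
    """Unweighted mean of per-class F1 scores; each item is (tp, fp, fn)."""
    scores = [f1(tp, fp, fn) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)


def implied_baseline(score, relative_increase):
    """Baseline b such that score = b * (1 + relative_increase)."""
    return score / (1 + relative_increase)


# A 132% increase means the new score is 2.32 times the baseline.
baseline = implied_baseline(0.51, 1.32)
print(round(baseline, 2))
```

Under that reading, the best bag-of-words baseline would sit at roughly 0.22 macro F1.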

    Towards Automatic Extraction of Social Networks of Organizations in PubMed Abstracts

    Social Network Analysis (SNA) of organizations can attract great interest from government agencies and scientists for its ability to boost translational research and accelerate the process of converting research to care. For SNA of a particular disease area, we need to identify the key research groups in that area by mining the affiliation information from PubMed. This involves not only recognizing the organization names in the affiliation string, but also resolving ambiguities to identify the article with a unique organization. We present here a process of normalization that involves clustering based on local sequence alignment metrics and local learning based on finding connected components. We demonstrate the application of the method by analyzing organizations involved in angiogenesis treatment, and we demonstrate the utility of the results for researchers in the pharmaceutical and biotechnology industries or national funding agencies. Comment: This paper has been withdrawn; First International Workshop on Graph Techniques for Biomedical Networks in Conjunction with IEEE International Conference on Bioinformatics and Biomedicine, Washington D.C., USA, Nov. 1-4, 2009; http://www.public.asu.edu/~sjonnal3/home/papers/IEEE%20BIBM%202009.pd
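
The normalization idea, linking similar organization names and then taking connected components, can be sketched as follows. Note the substitution: the standard library's `difflib.SequenceMatcher` ratio stands in for the local sequence alignment metric named in the abstract, and the 0.8 threshold, the function names, and the sample strings are assumptions for illustration.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """Stand-in similarity test; NOT a local sequence alignment score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_names(names, threshold=0.8):
    """Group names into connected components of the 'similar' graph."""
    parent = list(range(len(names)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of names that passes the similarity test.
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similar(names[i], names[j], threshold):
                parent[find(i)] = find(j)

    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return list(clusters.values())


# Hypothetical affiliation fragments.
names = ["Arizona State University", "Arizona State Univ", "Mayo Clinic"]
clusters = cluster_names(names)
print(clusters)
```

Because components are transitive, two names that are not directly similar can still end up in one cluster through a chain of intermediate variants, which is useful for abbreviations but can also merge distinct organizations if the threshold is too low.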

    Citation Function and Polarity Classification in Biomedical Papers

    The traditional reference evaluation method treats all citations equally. However, a citation can serve various functions. It may reflect the citing author’s motivation as well as his or her true attitude towards the cited paper. Such information can be investigated through citation content analysis. This thesis develops an 8-category classification scheme for citation function and polarity to help understand what role a citation plays in a scientific paper. A biomedical citation corpus is annotated with this scheme and used in experiments with supervised machine learning methods. Several types of features that capture the characteristics of citation sentences are extracted with natural language processing techniques and serve as the inputs of automatic classifiers. The importance of cue phrases in citation classification is also addressed and discussed.
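
One simple way to turn cue phrases into classifier inputs is to count their occurrences per group in each citation sentence. The sketch below assumes a small, hypothetical cue lexicon with three groups; it does not reproduce the thesis's 8-category scheme or its actual feature set.

```python
import re

# Hypothetical cue lexicon for illustration only.
CUE_PHRASES = {
    "positive": ["consistent with", "in agreement with", "confirmed"],
    "negative": ["in contrast to", "failed to", "however"],
    "method": ["as described", "using the method", "according to"],
}


def cue_features(sentence):
    """Count whole-phrase cue matches per group in one citation sentence."""
    text = sentence.lower()
    features = {}
    for group, phrases in CUE_PHRASES.items():
        features[group] = sum(
            len(re.findall(r"\b" + re.escape(phrase) + r"\b", text))
            for phrase in phrases
        )
    return features


example = "Cells were lysed as described [12]; however, our results are in contrast to [7]."
feats = cue_features(example)
print(feats)
```

The resulting count dictionary could be fed, alongside other features, to any supervised classifier that accepts numeric feature vectors.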