4,121 research outputs found

    Full Text and Figure Display Improves Bioscience Literature Search

    When reading bioscience journal articles, many researchers focus their attention on the figures and their captions. This observation led to the development of the BioText literature search engine [1], a freely available Web-based application that allows biologists to search over the contents of Open Access journals and see figures from the articles displayed directly in the search results. This article presents a qualitative assessment of this system in the form of a usability study with 20 biologist participants using and commenting on the system. 19 out of 20 participants expressed a desire to use a bioscience literature search engine that displays articles' figures alongside the full-text search results. 15 out of 20 participants said they would use a caption search and figure display interface either frequently or sometimes, while 4 said rarely and 1 was undecided. 10 out of 20 participants said they would use a tool for searching the text of tables and their captions either frequently or sometimes, while 7 said they would use it rarely if at all, 2 said they would never use it, and 1 was undecided. This study found evidence, supporting the results of an earlier study, that bioscience literature search systems such as PubMed should show figures from articles alongside search results. It also found evidence that full text and captions should be searched along with the article title, metadata, and abstract. Finally, for a subset of users and information needs, explicit search within captions for figures and tables is a useful function, but it is not entirely clear how to cleanly integrate it within a more general literature search interface. Such a facility supports Open Access publishing efforts, as it requires access to the full text of documents and the lifting of restrictions in order to show figures in the search interface.
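
    To make the recommendation concrete, the sketch below shows one way a search engine could weight matches in the title, abstract, captions, and full text and return figure thumbnails alongside each hit. It is a minimal illustration, not the BioText implementation; the field weights, tokenizer, and toy corpus are assumptions.

    # Toy multi-field search: NOT the BioText system; the weights, fields and
    # corpus below are illustrative assumptions only.
    from collections import Counter

    FIELD_WEIGHTS = {"title": 3.0, "abstract": 2.0, "caption": 2.0, "full_text": 1.0}

    def tokenize(text):
        return [t.lower().strip(".,;:()") for t in text.split()]

    def score(article, query_terms):
        """Sum weighted term-frequency matches across the searchable fields."""
        total = 0.0
        for field, weight in FIELD_WEIGHTS.items():
            counts = Counter(tokenize(article.get(field, "")))
            total += weight * sum(counts[t] for t in query_terms)
        return total

    def search(articles, query):
        """Rank articles and return each with its figure thumbnails for display."""
        terms = tokenize(query)
        ranked = sorted(articles, key=lambda a: score(a, terms), reverse=True)
        return [(a["title"], a.get("figures", []), score(a, terms)) for a in ranked]

    if __name__ == "__main__":
        corpus = [
            {"title": "Gel electrophoresis of p53 mutants",
             "abstract": "We analyse p53 variants.",
             "caption": "Figure 1. Western blot of p53.",
             "full_text": "...", "figures": ["fig1.png"]},
            {"title": "Kinase signalling review",
             "abstract": "A survey of MAPK pathways.",
             "caption": "Figure 2. Pathway diagram.",
             "full_text": "...", "figures": ["fig2.png"]},
        ]
        for title, figs, s in search(corpus, "p53 western blot"):
            print(f"{s:5.1f}  {title}  figures={figs}")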

    Automatic figure classification in bioscience literature

    Millions of figures appear in biomedical articles, and it is important to develop an intelligent figure search engine that returns relevant figures based on user queries. In this study we report a figure classifier that automatically classifies biomedical figures into five predefined figure types: Gel-image, Image-of-thing, Graph, Model, and Mix. The classifier explored rich image features and integrated them with text features. We performed feature selection and explored different classification models, including a rule-based figure classifier, a supervised machine-learning classifier, and a multi-model classifier, the last of which integrated the first two. Our results show that feature selection improved figure classification and that the novel image features we explored performed best among the image features we examined. Our results also show that integrating text and image features achieved better performance than using either of them individually. The best system was a multi-model classifier combining the rule-based hierarchical classifier and a support vector machine (SVM) based classifier, achieving a 76.7% F1-score for five-type classification. We demonstrated our system at http://figureclassification.askhermes.org/
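
    The sketch below illustrates the general multi-model idea described above: a small rule-based pass over the caption, backed by an SVM trained on concatenated text and image features. The keyword rules, placeholder image features, and toy training examples are invented for illustration and are not the authors' actual feature set.

    # Sketch of a multi-model figure classifier: keyword rules first, an SVM over
    # concatenated text + image features as fallback. Rules, features and the toy
    # training data are placeholders, not the authors' feature set.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import SVC

    def rule_based(caption):
        """Simple caption keyword rules; return None when no rule fires."""
        c = caption.lower()
        if "western blot" in c or "gel" in c:
            return "Gel-image"
        if "plot" in c or "curve" in c:
            return "Graph"
        return None

    def combine(captions, image_features, vectorizer):
        """Concatenate TF-IDF caption vectors with dense image feature vectors."""
        text = vectorizer.transform(captions).toarray()
        return np.hstack([text, image_features])

    # Toy training data: captions, hypothetical image features, and labels.
    captions = ["Western blot of p53", "Survival curve over time", "Photograph of a mouse brain"]
    img_feats = np.random.rand(3, 8)          # stand-in for colour/edge histograms
    labels = ["Gel-image", "Graph", "Image-of-thing"]

    vectorizer = TfidfVectorizer().fit(captions)
    svm = SVC(kernel="linear").fit(combine(captions, img_feats, vectorizer), labels)

    def classify(caption, img_feat):
        """Multi-model decision: rules take precedence, the SVM is the fallback."""
        hit = rule_based(caption)
        return hit if hit is not None else svm.predict(
            combine([caption], img_feat.reshape(1, -1), vectorizer))[0]

    print(classify("Western blot of histone H3", np.random.rand(8)))       # rule fires
    print(classify("Photograph of the dissected brain", np.random.rand(8)))  # SVM fallback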

    Figure mining for biomedical research

    Motivation: Figures from biomedical articles contain valuable information difficult to reach without specialized tools. Currently, there is no search engine that can retrieve specific figure types.
    Results: This study describes a retrieval method that takes advantage of principles in image understanding, text mining and optical character recognition (OCR) to retrieve figure types defined conceptually. A search engine was developed to retrieve tables and figure types to aid computational and experimental research.
    Availability: http://iossifovlab.cshl.edu/figurome
    Contact: [email protected]
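
    As a rough illustration of how OCR output can drive conceptual figure typing, the snippet below runs Tesseract over a figure image and applies simple keyword and digit-count heuristics. It is not the figurome pipeline; the heuristics are invented, and it assumes the Tesseract binary plus the pytesseract and Pillow packages are installed.

    # Illustrative only (not the figurome pipeline): OCR a figure and apply crude
    # heuristics to guess its conceptual type.
    import re

    import pytesseract
    from PIL import Image

    AXIS_HINTS = re.compile(r"\b(time|days?|hours?|concentration|p\s*<)\b", re.I)

    def figure_kind(image_path):
        """Guess whether a figure is a table, a data plot, or something else."""
        text = pytesseract.image_to_string(Image.open(image_path))
        digits = sum(ch.isdigit() for ch in text)
        if "table" in text.lower():
            return "table"
        if AXIS_HINTS.search(text) or digits > 20:
            return "data-plot"   # axis labels / tick numbers suggest a chart
        return "other"

    if __name__ == "__main__":
        print(figure_kind("figure1.png"))   # the path is a placeholder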

    Figure Text Extraction in Biomedical Literature

    Background: Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine.

    Bag-of-Colors for Biomedical Document Image Classification

    The number of biomedical publications has increased noticeably in the last 30 years. Clinicians and medical researchers regularly have unmet information needs, but finding publications relevant to a clinical situation requires more search time than is usually available. The techniques described in this article are used to classify images from the biomedical open access literature into categories, which can potentially reduce search time. Only the visual information of the images is used, and classification is evaluated on a benchmark database from ImageCLEF 2011 created for the tasks of image classification and image retrieval. We evaluate in particular the importance of color in addition to the frequently used texture and grey-level features. Results show that bags-of-colors in combination with the Scale Invariant Feature Transform (SIFT) provide an image representation that improves classification quality. Accuracy improved from 69.75% for the best system in ImageCLEF 2011 using only visual information to 72.5% for the system described in this paper. The results highlight the importance of color for the classification of biomedical images.
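
    The snippet below sketches the bag-of-colors idea in its simplest form: learn a small k-means colour codebook from training images and describe each image by its histogram of codebook assignments. It is an assumption-laden illustration rather than the paper's exact descriptor or its fusion with SIFT; it requires Pillow, NumPy and scikit-learn, and the file paths are placeholders.

    # Minimal bag-of-colors sketch: a k-means colour codebook plus per-image
    # histograms of codebook assignments. Not the paper's exact descriptor.
    import numpy as np
    from PIL import Image
    from sklearn.cluster import KMeans

    def pixels(path, size=(64, 64)):
        """Load an image, downsample it, and return an (N, 3) array of RGB pixels."""
        img = Image.open(path).convert("RGB").resize(size)
        return np.asarray(img, dtype=np.float32).reshape(-1, 3)

    def fit_color_codebook(paths, n_colors=32):
        """Learn the colour vocabulary over all training pixels."""
        data = np.vstack([pixels(p) for p in paths])
        return KMeans(n_clusters=n_colors, n_init=4, random_state=0).fit(data)

    def bag_of_colors(path, codebook):
        """Normalised histogram of codebook assignments for one image."""
        words = codebook.predict(pixels(path))
        hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
        return hist / hist.sum()

    # Usage (placeholder paths): the resulting descriptors can be concatenated
    # with texture or SIFT bag-of-words vectors and fed to any standard classifier.
    # codebook = fit_color_codebook(["train1.png", "train2.png"])
    # x = bag_of_colors("query.png", codebook)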

    Automatic Figure Ranking and User Interfacing for Intelligent Figure Search

    Figures are important experimental results that are typically reported in full-text bioscience articles. Bioscience researchers need to access figures to validate research facts and to formulate or test novel research hypotheses. At the same time, the sheer volume of bioscience literature has made it difficult to access figures. Therefore, we are developing an intelligent figure search engine (http://figuresearch.askhermes.org). Existing research in figure search treats each figure equally, but we introduce the novel concept of "figure ranking": figures appearing in a full-text biomedical article can be ranked by their contribution to knowledge discovery. We empirically validated the hypothesis of figure ranking with over 100 bioscience researchers, and then developed unsupervised natural language processing (NLP) approaches to automatically rank figures. Evaluated on a collection of 202 full-text articles in which the authors had ranked the figures by importance, our best system achieved a weighted error rate of 0.2, which is significantly better than several other baseline systems we explored. We further built novel user interfaces (UIs) incorporating figure ranking, allowing bioscience researchers to efficiently access important figures. Our evaluation results show that 92% of the bioscience researchers chose, as their top two preferences, the user interfaces in which the most important figures are enlarged. With our automatic figure ranking NLP system, bioscience researchers preferred the UIs in which the most important figures were predicted by our NLP system over the UIs in which the most important figures were randomly assigned. In addition, our results show that there was no statistical difference in bioscience researchers' preference between the UIs generated by automatic figure ranking and those generated by human ranking annotation. These results indicate that automatic figure ranking and the user interfacing reported in this study can be fully implemented in online publishing. The novel user interface integrated with the automatic figure ranking system provides a more efficient and robust way to access scientific information in the biomedical domain, which will further enhance our existing figure search engine to better facilitate accessing figures of interest for bioscientists.
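
    One plausible unsupervised heuristic in this spirit (not necessarily the authors' NLP approach) is to score each figure by the lexical overlap between its caption and the article's title and abstract, then sort. The sketch below is a hypothetical illustration with a toy stop-word list.

    # Hedged sketch of unsupervised figure ranking via caption/abstract overlap.
    import re
    from collections import Counter

    STOP = {"the", "a", "an", "of", "in", "and", "to", "for", "with", "we"}

    def terms(text):
        return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP]

    def rank_figures(title, abstract, captions):
        """Return caption indices ordered from most to least central."""
        doc = Counter(terms(title) + terms(abstract))
        def score(caption):
            return sum(doc[w] for w in set(terms(caption)))
        return sorted(range(len(captions)), key=lambda i: score(captions[i]),
                      reverse=True)

    if __name__ == "__main__":
        caps = ["Figure 1. Study workflow.",
                "Figure 2. p53 binding affinity under drug treatment.",
                "Figure 3. Supplementary controls."]
        print(rank_figures("p53 response to drug treatment",
                           "We measure p53 binding affinity after treatment.",
                           caps))   # e.g. [1, 0, 2]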

    Image background assessment as a novel technique for insect microhabitat identification

    The effects of climate change, urbanisation and agriculture are changing the way insects occupy habitats. Some species may use anthropogenic microhabitat features for their existence, either because they prefer them to natural features or because they have no choice. Other species are dependent on natural microhabitats. Identifying and analysing these insects' use of natural and anthropogenic microhabitats is important for assessing their responses to a changing environment, improving pollination and managing invasive pests. Traditional studies of insect microhabitat use can now be supplemented by machine learning-based insect image analysis. Typically, research has focused on automatic insect classification, while valuable data in image backgrounds has been ignored. In this research, we analysed the image backgrounds available in the Atlas of Living Australia (ALA) database to determine insects' microhabitats. We analysed the microhabitats of three insect species common across Australia: drone flies, European honeybees and European wasps. Image backgrounds were classified as natural or anthropogenic microhabitats using computer vision and machine learning tools benchmarked against a manual classification algorithm. We found flies and honeybees in natural microhabitats, confirming their need for natural havens within cities. Wasps were commonly seen in anthropogenic microhabitats. These results show that wasps are well adapted to survive in cities, and management of this invasive pest requires a thoughtful reduction of their access to human-provided resources. The assessment of insect image backgrounds is instructive for documenting insects' use of microhabitats. The method offers insight that is increasingly vital for biodiversity management as urbanisation continues to encroach on natural ecosystems and we must consciously provide resources within built environments to maintain insect biodiversity and manage invasive pests.
    Comment: Submitted to the Ecological Informatics journal, first review completed, 19 pages, 10 figures
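
    As a hypothetical sketch of the background-assessment idea (not the authors' benchmarked pipeline), the code below masks out the insect using a supplied bounding box, summarises the remaining pixels with coarse colour statistics, and feeds those features to a linear classifier labelled natural versus anthropogenic. The bounding boxes, feature choice and training data are all assumptions.

    # Simplified background-assessment sketch; not the authors' pipeline.
    import numpy as np
    from PIL import Image
    from sklearn.linear_model import LogisticRegression

    def background_features(path, insect_box):
        """Mean/std of RGB over everything outside the insect bounding box."""
        img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32)
        mask = np.ones(img.shape[:2], dtype=bool)
        x0, y0, x1, y1 = insect_box
        mask[y0:y1, x0:x1] = False           # exclude the insect itself
        bg = img[mask]                       # (N, 3) background pixels
        return np.concatenate([bg.mean(axis=0), bg.std(axis=0)]) / 255.0

    # Training would pair such feature vectors with manual labels, e.g.:
    # X = np.stack([background_features(p, box) for p, box in annotated_images])
    # y = ["natural", "anthropogenic", ...]
    # clf = LogisticRegression(max_iter=1000).fit(X, y)
    # clf.predict([background_features("new_photo.jpg", (40, 60, 180, 200))])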