
    Special Libraries, December 1964

    Get PDF
    Volume 55, Issue 10


    Classification and Retrieval of Digital Pathology Scans: A New Dataset

    Full text link
    In this paper, we introduce a new dataset, Kimia Path24, for image classification and retrieval in digital pathology. We use the whole scan images of 24 different tissue textures to generate 1,325 test patches of size 1000×1000 pixels (0.5mm×0.5mm). Training data can be generated according to the preferences of the algorithm designer and can range from approximately 27,000 to over 50,000 patches if the preset parameters are adopted. We propose a compound patch-and-scan accuracy measurement that makes achieving high accuracies quite challenging. In addition, we set the benchmarking line by applying LBP, a dictionary approach, and convolutional neural nets (CNNs) and report their results. The highest accuracy was 41.80% for CNN.
    Comment: Accepted for presentation at the Workshop for Computer Vision for Microscopy Image Analysis (CVMI 2017) @ CVPR 2017, Honolulu, Hawaii
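The compound patch-and-scan accuracy mentioned in the abstract can be sketched as follows. The exact definitions here — a patch-level accuracy over all test patches, a scan-level accuracy averaged per scan, and their product as the combined score — are an assumption for illustration, not quoted from the paper.

```python
from collections import defaultdict

def compound_accuracy(patch_labels, patch_preds, patch_scans):
    """Sketch of a compound patch-and-scan accuracy (assumed definitions).

    patch_labels: true class per patch
    patch_preds:  predicted class per patch
    patch_scans:  id of the scan each patch was cut from
    """
    # Patch-level accuracy over all test patches.
    correct = sum(1 for y, p in zip(patch_labels, patch_preds) if y == p)
    eta_p = correct / len(patch_labels)

    # Scan-level accuracy: per-scan accuracy, averaged over scans.
    per_scan = defaultdict(lambda: [0, 0])  # scan id -> [correct, total]
    for y, p, s in zip(patch_labels, patch_preds, patch_scans):
        per_scan[s][1] += 1
        if y == p:
            per_scan[s][0] += 1
    eta_s = sum(c / t for c, t in per_scan.values()) / len(per_scan)

    # Compound score: the product penalizes a system that is strong on
    # some scans but weak on others, making high totals hard to reach.
    return eta_p, eta_s, eta_p * eta_s
```

For example, with two scans of two patches each where one scan is only half correct, the compound score drops below both component accuracies.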

    An exploratory study of user-centered indexing of published biomedical images

    Get PDF
    User-centered image indexing—often reported in research on collaborative tagging, social classification, folksonomy, or personal tagging—has received a considerable amount of attention [1-7]. The general themes in more recent studies on this topic include user-centered tagging behavior by types of images; pros and cons of user-created tags as compared to controlled index terms; assessment of the value added by user-generated tags; and comparison of automatic indexing versus human indexing in the context of web digital image collections such as Flickr. For instance, Golbeck's finding restates the importance of indexer experience, order, and type of images [8]. Rorissa has found a significant difference in the number of terms assigned when using Flickr tags or index terms on the same image collection, which might suggest a difference in level of indexing by professional indexers and Flickr taggers [9]. Studies focusing on users, their tagging experiences, and user-generated tags suggest ideas to be implemented as part of a personalized, customizable tagging system. Additionally, Stvilia and her colleagues have found that tagger age and image familiarity were negatively related to tagging quality, while indexing and tagging experience were positively associated with it [10]. A major question for biomedical image indexing is whether the results of the aforementioned studies, all of which dealt with general image collections, are applicable to images in the medical domain. In spite of the importance of visual material in medical education and the prevalence of digitized images in formal medical practice and education, medical students have few opportunities to annotate biomedical images. End-user training could improve the quality of image indexing and so improve retrieval.
In a pilot assessment of image indexing and retrieval quality by medical students, this study compared concept completion and retrieval effectiveness of indexing terms generated by medical students on thirty-nine histology images selected from the PubMed Central (PMC) database. Indexing instruction was given only to an intervention group to test its impact on the quality of end-user image indexing.

    Improving average ranking precision in user searches for biomedical research datasets

    Full text link
    Availability of research datasets is a keystone for health and life science study reproducibility and scientific progress. Due to the heterogeneity and complexity of these data, a main challenge to be overcome by research data management systems is to provide users with the best answers for their search queries. In the context of the 2016 bioCADDIE Dataset Retrieval Challenge, we investigate a novel ranking pipeline to improve the search of datasets used in biomedical experiments. Our system comprises a query expansion model based on word embeddings, a similarity measure algorithm that takes into consideration the relevance of the query terms, and a dataset categorisation method that boosts the rank of datasets matching query constraints. The system was evaluated using a corpus with 800k datasets and 21 annotated user queries. Our system provides competitive results when compared to the other challenge participants. In the official run, it achieved the highest infAP among the participants, being +22.3% higher than the median infAP of the participants' best submissions. Overall, it ranks in the top 2 if an aggregated metric using the best official measures per participant is considered. The query expansion method showed a positive impact on the system's performance, increasing our baseline by up to +5.0% and +3.4% for the infAP and infNDCG metrics, respectively. Our similarity measure algorithm seems to be robust, in particular compared to the Divergence From Randomness framework, having smaller performance variations under different training conditions. Finally, the result categorization did not have a significant impact on the system's performance. We believe that our solution could be used to enhance biomedical dataset management systems. In particular, the use of data-driven query expansion methods could be an alternative to the complexity of biomedical terminologies.
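The embedding-based query expansion the abstract describes can be sketched in miniature. The vectors and vocabulary below are purely illustrative toy values; the actual system would use embeddings trained on a biomedical corpus and its own neighbor-selection criteria.

```python
import math

# Toy word vectors for illustration only (assumed, not from the paper).
VECS = {
    "cancer": [0.90, 0.10, 0.00],
    "tumor":  [0.85, 0.15, 0.05],
    "gene":   [0.10, 0.90, 0.10],
    "genome": [0.12, 0.88, 0.15],
    "image":  [0.00, 0.10, 0.95],
}

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def expand(query_terms, k=1):
    """Append the k nearest vocabulary terms to each query term."""
    expanded = list(query_terms)
    for term in query_terms:
        if term not in VECS:
            continue  # out-of-vocabulary terms are left unexpanded
        neighbors = sorted(
            (w for w in VECS if w != term and w not in expanded),
            key=lambda w: cosine(VECS[term], VECS[w]),
            reverse=True,
        )
        expanded.extend(neighbors[:k])
    return expanded
```

A query for "cancer" would gain its nearest neighbor "tumor", widening recall over datasets that use either term.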

    Natural language processing

    Get PDF
    Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.

    Advancing Biomedical Image Retrieval: Development and Analysis of a Test Collection

    Get PDF
    Objective: Develop and analyze results from an image retrieval test collection. Methods: After participating research groups obtained and assessed results from their systems in the image retrieval task of the Cross-Language Evaluation Forum, we assessed the results for common themes and trends. In addition to overall performance, results were analyzed on the basis of topic categories (those most amenable to visual, textual, or mixed approaches) and run categories (those employing queries entered by automated or manual means as well as those using visual, textual, or mixed indexing and retrieval methods). We also assessed results on the different topics and compared the impact of duplicate relevance judgments. Results: A total of 13 research groups participated. Analysis was limited to the best run submitted by each group in each run category. The best results were obtained by systems that combined visual and textual methods. There was substantial variation in performance across topics. Systems employing textual methods were more resilient to visually oriented topics than those using visual methods were to textually oriented topics. The primary performance measure of mean average precision (MAP) was not necessarily associated with other measures, including those possibly more pertinent to real users, such as precision at 10 or 30 images. Conclusions: We developed a test collection amenable to assessing visual and textual methods for image retrieval. Future work must focus on how varying topic and run types affect retrieval performance. User studies also are necessary to determine the best measures for evaluating the efficacy of image retrieval systems.
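The abstract's observation that MAP need not track precision at fixed cutoffs can be made concrete with the standard definitions of the two measures (the helper names below are hypothetical):

```python
def average_precision(ranked, relevant):
    """Average precision: mean of precision at each relevant hit's rank."""
    hits, ap = 0, 0.0
    for i, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            ap += hits / i  # precision at this rank
    return ap / len(relevant) if relevant else 0.0

def precision_at(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k
```

Because average precision rewards placing relevant items early while precision at k only counts them within the cutoff, two systems can tie on P@10 yet differ substantially in MAP, which is the gap between system-oriented and user-oriented measures the abstract points at.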