2 research outputs found

    Overview of ImageCLEFcaption 2017 – Image Caption Prediction and Concept Detection for Biomedical Images

    This paper presents an overview of the ImageCLEF 2017 caption tasks on the analysis of images from the biomedical literature. Two subtasks were proposed to the participants: a concept detection task and a caption prediction task, both using only images as input. The two subtasks tackle the problem of providing image interpretation by extracting concepts and predicting a caption based on the visual information of an image alone. A dataset of 184,000 figure-caption pairs from the biomedical open access literature (PubMed Central) is provided as a testbed, with the majority used as training data and 10,000 each as validation and test data. Across the two tasks, 11 participating groups submitted 71 runs. While the domain remains challenging and the data highly heterogeneous, we note some surprisingly good results for this difficult task, with a quality that could benefit health applications by better exploiting the visual content of biomedical figures.

    Exposure Fusion Framework in Deep Learning-Based Radiology Report Generator

    Writing a radiology report is time-consuming and requires experienced radiologists. Hence, a technology that could generate reports automatically would be beneficial. The key problem in developing an automated report-generating system is producing coherent predicted text. To accomplish this, it is important to ensure that the image is of good quality so that the model can learn the relevant parts of the image when interpreting it, especially for medical images, which tend to be noise-prone in the acquisition process. This research uses the Exposure Fusion Framework (EFF) method to enhance the quality of medical images in order to improve the model's performance in producing coherent predicted text. The model is an encoder-decoder with visual feature extraction using a pre-trained CheXNet, Bidirectional Encoder Representations from Transformers (BERT) embeddings for text features, and a Long Short-Term Memory (LSTM) network as the decoder. With EFF enhancement, the model scored 7% better than without enhancement, as measured by the Bilingual Evaluation Understudy (BLEU) metric with 4-grams. It can be concluded that the enhancement method effectively increases the model's performance.
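For readers unfamiliar with the metric named in the abstract above, the following is a minimal sketch of a sentence-level BLEU score with n-grams up to 4. It is not the authors' evaluation code: the function names `ngrams` and `bleu4`, the whitespace tokenization, the single reference, and the absence of smoothing are all assumptions made for illustration, and the example sentences are invented.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Unsmoothed sentence-level BLEU-4 against a single reference.

    Hypothetical helper for illustration: whitespace tokenization and
    uniform weights over 1- to 4-grams are assumptions, not details
    taken from the paper.
    """
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            # Without smoothing, one zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the four modified precisions.
    return bp * math.exp(sum(log_precisions) / 4)


if __name__ == "__main__":
    reference = "the heart size is normal"
    print(bleu4("the heart size is normal", reference))
    print(bleu4("the heart size is normal today", reference))
```

Because the unsmoothed score drops to zero whenever no 4-gram matches, practical evaluations of short generated reports often average over a corpus or apply smoothing; which variant the paper used is not stated in the abstract.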