12,963 research outputs found

    Convolutional Sparse Kernel Network for Unsupervised Medical Image Analysis

    The availability of large-scale annotated image datasets and recent advances in supervised deep learning methods enable the end-to-end derivation of representative image features that can impact a variety of image analysis problems. Such supervised approaches, however, are difficult to implement in the medical domain, where large volumes of labelled data are difficult to obtain due to the complexity of manual annotation and inter- and intra-observer variability in label assignment. We propose a new convolutional sparse kernel network (CSKN), a hierarchical unsupervised feature learning framework that addresses the challenge of learning representative visual features in medical image analysis domains where there is a lack of annotated training data. Our framework has three contributions: (i) We extend kernel learning to identify and represent invariant features across image sub-patches in an unsupervised manner. (ii) We initialise our kernel learning with a layer-wise pre-training scheme that leverages the sparsity inherent in medical images to extract initial discriminative features. (iii) We adapt a multi-scale spatial pyramid pooling (SPP) framework to capture subtle geometric differences between learned visual features. We evaluated our framework in medical image retrieval and classification on three public datasets. Our results show that our CSKN had better accuracy than other conventional unsupervised methods and comparable accuracy to methods that used state-of-the-art supervised convolutional neural networks (CNNs). Our findings indicate that our unsupervised CSKN provides an opportunity to leverage unannotated big data in medical imaging repositories.
    Comment: Accepted by Medical Image Analysis (with a new title 'Convolutional Sparse Kernel Network for Unsupervised Medical Image Analysis'). The manuscript is available from the following link (https://doi.org/10.1016/j.media.2019.06.005)
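As a rough illustration of the multi-scale spatial pyramid pooling mentioned in contribution (iii), the sketch below max-pools a single 2-D feature map over grids of increasing resolution and concatenates the results. It is not the CSKN authors' implementation: the function name `spatial_pyramid_pool`, the default levels `(1, 2, 4)`, and the choice of max pooling are all assumptions made for the example.

```python
from typing import List, Sequence


def spatial_pyramid_pool(feature_map: Sequence[Sequence[float]],
                         levels: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Max-pool a 2-D feature map over progressively finer grids.

    For each level k the map is split into a k x k grid and the maximum of
    every cell is appended, so the output length is sum(k * k for k in levels).
    Assumes every k is no larger than the map's height and width.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled: List[float] = []
    for k in levels:
        for gy in range(k):
            # Integer cell bounds along the vertical axis.
            y0, y1 = gy * height // k, (gy + 1) * height // k
            for gx in range(k):
                # Integer cell bounds along the horizontal axis.
                x0, x1 = gx * width // k, (gx + 1) * width // k
                cell = [feature_map[y][x]
                        for y in range(y0, y1) for x in range(x0, x1)]
                pooled.append(max(cell))
    return pooled
```

The output length depends only on the chosen levels, not on the input size, which is why pooled descriptors from images of different sizes can be compared directly.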

    Radon-Gabor Barcodes for Medical Image Retrieval

    In recent years, with the explosion of digital images on the Web, content-based retrieval has emerged as a significant research area. Shapes, textures, edges and segments may play a key role in describing the content of an image. Radon and Gabor transforms are both powerful techniques that have been widely studied to extract shape-texture-based information. The combined Radon-Gabor features may be more robust against scale/rotation variations, presence of noise, and illumination changes. The objective of this paper is to harness the potential of both Gabor and Radon transforms in order to introduce expressive binary features, called barcodes, for image annotation/tagging tasks. We propose two different techniques: Gabor-of-Radon-Image Barcodes (GRIBCs) and Guided-Radon-of-Gabor Barcodes (GRGBCs). For validation, we employ the IRMA x-ray dataset with 193 classes, containing 12,677 training images and 1,733 test images. Total error scores as low as 322 and 330 were achieved for GRGBCs and GRIBCs, respectively. This corresponds to $\approx 81\%$ retrieval accuracy for the first hit.
    Comment: To appear in proceedings of the 23rd International Conference on Pattern Recognition (ICPR 2016), Cancun, Mexico, December 2016

    Autoencoding the Retrieval Relevance of Medical Images

    Content-based image retrieval (CBIR) of medical images is a crucial task that can contribute to a more reliable diagnosis if applied to big data. Recent advances in feature extraction and classification have enormously improved CBIR results for digital images. However, considering the increasing accessibility of big data in medical imaging, we still need to reduce both the memory requirements and the computational expense of image retrieval systems. This work proposes to exclude the features of image blocks that exhibit a low encoding error when learned by an $n/p/n$ autoencoder ($p<n$). We examine the histogram of autoencoding errors of image blocks for each image class to decide which image regions, or roughly what percentage of an image, should be declared relevant for the retrieval task. This reduces feature dimensionality and speeds up the retrieval process. To validate the proposed scheme, we employ local binary patterns (LBP) and support vector machines (SVM), both of which are well-established approaches in the CBIR research community. We also use the IRMA dataset with 14,410 x-ray images as test data. The results show that the dimensionality of annotated feature vectors can be reduced by up to 50%, resulting in speedups greater than 27% at the expense of a less than 1% decrease in retrieval accuracy when validating the precision and recall of the top 20 hits.
    Comment: To appear in proceedings of The 5th International Conference on Image Processing Theory, Tools and Applications (IPTA'15), Nov 10-13, 2015, Orleans, France
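The block-selection idea in this abstract can be sketched as follows, assuming the per-block autoencoder errors have already been computed. This is a simplification rather than the paper's procedure: the abstract describes choosing the cut-off from per-class error histograms, whereas the sketch uses a fixed, hypothetical `keep_fraction`; both function names are also invented for the example.

```python
from typing import List, Sequence


def relevant_block_indices(errors: Sequence[float],
                           keep_fraction: float = 0.5) -> List[int]:
    """Return indices of the blocks with the highest encoding error.

    Blocks that the autoencoder reconstructs with low error are treated as
    uninformative and dropped; `keep_fraction` is a hypothetical knob for
    the share of blocks to retain.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_keep = max(1, round(keep_fraction * len(errors)))
    # Rank block indices by error, largest first, then keep the top n_keep.
    ranked = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    return sorted(ranked[:n_keep])


def reduced_feature_vector(block_features: Sequence[Sequence[float]],
                           errors: Sequence[float],
                           keep_fraction: float = 0.5) -> List[float]:
    """Concatenate the features of the retained blocks only."""
    kept = relevant_block_indices(errors, keep_fraction)
    return [value for i in kept for value in block_features[i]]
```

With `keep_fraction=0.5` the concatenated feature vector is half as long as the full one, which mirrors the up-to-50% dimensionality reduction reported above.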

    Gabor Barcodes for Medical Image Retrieval

    In recent years, advances in medical imaging have led to the emergence of massive databases containing images from a diverse range of modalities. This has significantly heightened the need for automated annotation of the images on one side, and for fast and memory-efficient content-based image retrieval systems on the other. Binary descriptors have recently gained more attention as a potential vehicle to achieve these goals. One recently introduced family of binary descriptors for tagging medical images is Radon barcodes (RBCs), which are derived from the Radon transform via local thresholding. The Gabor transform is also a powerful tool for extracting texture-based information. Gabor features have exhibited robustness against rotation, scale, and photometric disturbances, such as illumination changes and image noise, in many applications. This paper introduces Gabor Barcodes (GBCs) as a novel framework for image annotation. To find the most discriminative GBC for a given query image, the effects of employing Gabor filters with different parameters, i.e., different sets of scales and orientations, are investigated, resulting in different barcode lengths and retrieval performances. The proposed method has been evaluated on the IRMA dataset with 193 classes, comprising 12,677 x-ray images for indexing and 1,733 x-ray images for testing. A total error score as low as 351 ($\approx 80\%$ accuracy for the first hit) was achieved.
    Comment: To appear in proceedings of The 2016 IEEE International Conference on Image Processing (ICIP 2016), Sep 25-28, 2016, Phoenix, Arizona, USA
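To make the barcode idea concrete, here is a heavily simplified sketch of projection-based binarization. It uses only two axis-aligned projections (row sums and column sums) in place of a full Radon transform, and thresholds each projection at its own median; comparing barcodes by Hamming distance is likewise an assumption of this example, not something stated in the abstract. All function names are hypothetical.

```python
from statistics import median
from typing import List, Sequence


def binarize(projection: Sequence[float]) -> List[int]:
    """Threshold one projection at its own median (a local threshold)."""
    threshold = median(projection)
    return [1 if value > threshold else 0 for value in projection]


def two_angle_barcode(image: Sequence[Sequence[float]]) -> List[int]:
    """Concatenate the binarized column-sum and row-sum projections."""
    row_sums = [sum(row) for row in image]        # one value per image row
    col_sums = [sum(col) for col in zip(*image)]  # one value per image column
    return binarize(col_sums) + binarize(row_sums)


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count positions where two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))
```

Because the descriptor is a short bit string, nearest-neighbour search reduces to counting differing bits, which is what makes binary descriptors attractive for fast, memory-efficient retrieval.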

    A Survey on Deep Learning in Medical Image Analysis

    Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks and provide concise overviews of studies per application area. Open challenges and directions for future research are discussed.
    Comment: Revised survey includes expanded discussion section and reworked introductory section on common deep architectures. Added missed papers from before Feb 1st 201

    ImageCLEF 2013: The vision, the data and the open challenges

    This paper presents an overview of the ImageCLEF 2013 lab. Since its first edition in 2003, ImageCLEF has become one of the key initiatives promoting the benchmark evaluation of algorithms for the cross-language annotation and retrieval of images in various domains, ranging from public and personal images to data acquired by mobile robot platforms and botanic collections. Over the years, by providing new data collections and challenging tasks to the community of interest, the ImageCLEF lab has achieved a unique position in the multilingual image annotation and retrieval research landscape. The 2013 edition consisted of three tasks: the photo annotation and retrieval task, the plant identification task and the robot vision task. Furthermore, the medical annotation task, which has traditionally been under the ImageCLEF umbrella and which this year celebrates its tenth anniversary, has been organized in conjunction with AMIA for the first time. The paper describes the tasks and the 2013 competition, giving a unifying perspective of the present activities of the lab while discussing future challenges and opportunities.
    This work has been partially supported by the Halser Foundation (B. C.), by the LiMoSINe FP7 project under grant # 288024 (B. T.), by the Khresmoi (grant # 257528) and PROMISE (grant # 258191) FP7 projects (H. M.), and by the tranScriptorium FP7 project under grant # 600707 (M. V., R. P.).
    Caputo, B.; Muller, H.; Thomee, B.; Villegas, M.; Paredes Palacios, R.; Zellhofer, D.; Goeau, H.... (2013). ImageCLEF 2013: The vision, the data and the open challenges. In Information Access Evaluation. Multilinguality, Multimodality, and Visualization. Springer Verlag (Germany). 8138:250-268. https://doi.org/10.1007/978-3-642-40802-1_26

    Adapting Visual Question Answering Models for Enhancing Multimodal Community Q&A Platforms

    Question categorization and expert retrieval methods have been crucial for information organization and accessibility in community question & answering (CQA) platforms. Research in this area, however, has dealt with only the text modality. With the increasing multimodal nature of web content, we focus on extending these methods for CQA questions accompanied by images. Specifically, we leverage the success of representation learning for text and images in the visual question answering (VQA) domain, and adapt the underlying concept and architecture for automated category classification and expert retrieval on image-based questions posted on Yahoo! Chiebukuro, the Japanese counterpart of Yahoo! Answers. To the best of our knowledge, this is the first work to tackle the multimodality challenge in CQA, and to adapt VQA models for tasks on a more ecologically valid source of visual questions. Our analysis of the differences between visual QA and community QA data drives our proposal of novel augmentations of an attention method tailored for CQA, and use of auxiliary tasks for learning better grounding features. Our final model markedly outperforms the text-only and VQA model baselines for both tasks of classification and expert retrieval on real-world multimodal CQA data.
    Comment: Submitted for review at CIKM 201