59 research outputs found

    Biomedical image classification with random subwindows and decision trees

    In this paper, we address a biomedical image classification problem involving the automatic classification of x-ray images into 57 predefined classes with large intra-class variability. To achieve this goal, we apply and slightly adapt a recent generic method for image classification based on ensembles of decision trees and random subwindows. We obtain classification results close to the state of the art on a publicly available database of 10,000 x-ray images. We also provide some clues for interpreting the classification of each image in terms of subwindow relevance.
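
    The subwindow scheme referred to above can be summarised in a short, hedged sketch (not the authors' code; the patch count, size range and 16x16 rescaling are illustrative assumptions): square patches are sampled at random positions and sizes, rescaled to a fixed resolution, and an ensemble of extremely randomized trees is trained on their raw pixel values, each patch inheriting the label of its source image.

    # Minimal sketch of the random-subwindow idea (not the authors' implementation).
    import numpy as np
    from skimage.transform import resize
    from sklearn.ensemble import ExtraTreesClassifier

    def random_subwindows(image, n_windows=100, out_size=16, rng=None):
        """Extract randomly sized, randomly placed square patches and
        rescale each one to out_size x out_size."""
        rng = rng or np.random.default_rng(0)
        h, w = image.shape
        patches = []
        for _ in range(n_windows):
            size = rng.integers(out_size, min(h, w))      # random side length
            y = rng.integers(0, h - size + 1)
            x = rng.integers(0, w - size + 1)
            patch = image[y:y + size, x:x + size]
            patches.append(resize(patch, (out_size, out_size)).ravel())
        return np.array(patches)

    def train(images, labels):
        # Each subwindow inherits the label of its source image.
        X = np.vstack([random_subwindows(img) for img in images])
        y = np.repeat(labels, 100)
        clf = ExtraTreesClassifier(n_estimators=100, random_state=0)
        return clf.fit(X, y)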

    Random subwindows and extremely randomized trees for image classification in cell biology

    Background: With the improvements in biosensors and high-throughput image acquisition technologies, life science laboratories are able to perform an increasing number of experiments that involve the generation of large amounts of images at different imaging modalities/scales. This stresses the need for computer vision methods that automate image classification tasks. Results: We illustrate the potential of our image classification method in cell biology by evaluating it on four datasets of images related to protein distributions or subcellular localizations, and red-blood cell shapes. Accuracy results are quite good without any specific pre-processing or incorporation of domain knowledge. The method is implemented in Java and available upon request for evaluation and research purposes. Conclusion: Our method is directly applicable to any image classification problem. We foresee the use of this automatic approach as a baseline method and first try on various biological image classification problems.
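
    At prediction time, the same scheme labels a whole image by aggregating over its subwindows. A minimal sketch, reusing the `random_subwindows` function and the trained `clf` from the sketch above, and assuming averaged class probabilities as the aggregation rule:

    # Hedged sketch of the prediction step for the subwindow scheme above.
    import numpy as np

    def predict_image(clf, image, n_windows=100):
        probs = clf.predict_proba(random_subwindows(image, n_windows))
        return clf.classes_[np.argmax(probs.mean(axis=0))]   # vote over subwindows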

    Segmentation of the breast region with pectoral muscle suppression and automatic breast density classification

    Final-year degree project carried out in collaboration with the Université catholique de Louvain (Ecole Polytechnique de Louvain). Breast cancer is one of the major causes of death among women. Nowadays screening mammography is the most widely adopted technique for early breast cancer detection, ahead of other procedures such as screen-film mammography (SFM) or ultrasound scanning. Computer-assisted diagnosis (CAD) of mammograms attempts to help radiologists by providing an automatic procedure to detect possible cancers in mammograms. Suspicious breast cancers appear as white spots in mammograms, indicating small clusters of micro-calcifications. Mammogram sensitivity decreases due to factors such as breast density and the presence of labels, artifacts or the pectoral muscle. Pre-processing of mammogram images is a very important step in breast cancer analysis and detection because it can reduce the number of false positives. In this thesis we propose a method to segment mammograms and classify them automatically according to their density. We perform several procedures, including pre-processing (image enhancement, noise reduction, orientation finding and border removal) and segmentation (separating the breast from the background, labels and pectoral muscle present in the mammograms), in order to increase the sensitivity of our CAD system. The final goal is classification for diagnosis, in other words finding the density class of an incoming mammogram in order to determine whether more tests are needed to find possible cancers in the image. This functionality will be included in a new clinical imaging annotation system for computer-aided breast cancer screening developed by the Communications and Remote Sensing Department at the Université catholique de Louvain. The source code for the pre-processing and segmentation steps has been written in C++ using the ITK image processing library and built with CMake. The method has been applied to medio-lateral oblique (MLO) as well as craniocaudal (CC) mammograms belonging to different databases. The classification step has been implemented in Matlab. We have tested our pre-processing method, obtaining a 100% success rate in removing labels and artifacts from mammograms of the mini-MIAS database. Pectoral muscle removal has been evaluated subjectively, obtaining a good-removal rate of 57.76%. Finally, for the classification step, the best recognition rate obtained was 76.25% using only pixel values, and 77.5% when adding texture features, classifying images from the mini-MIAS database into 3 density types. These results are comparable with the current state of the art in segmentation and classification of biomedical images.
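
    The thesis code itself is written in C++/ITK and Matlab; purely as an illustration of the final classification step, here is a hedged Python sketch of density classification from pixel-intensity and texture features (the feature choices, histogram bin count and k-NN classifier are assumptions, not the thesis implementation):

    # Illustrative sketch: grey-level histogram plus GLCM texture features
    # computed on an already-segmented breast region, fed to a simple classifier.
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops
    from sklearn.neighbors import KNeighborsClassifier

    def density_features(breast_region):
        """breast_region: 2-D uint8 array with background, labels and
        pectoral muscle already removed."""
        hist, _ = np.histogram(breast_region, bins=16, range=(0, 255), density=True)
        glcm = graycomatrix(breast_region, distances=[1], angles=[0],
                            levels=256, symmetric=True, normed=True)
        texture = [graycoprops(glcm, p)[0, 0]
                   for p in ("contrast", "homogeneity", "energy", "correlation")]
        return np.concatenate([hist, texture])

    # Fit on segmented training mammograms with known density classes, e.g.:
    # X_train = np.array([density_features(m) for m in train_regions])
    # clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, train_density_labels)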

    Forest cover estimation in Ireland using radar remote sensing: a comparative analysis of forest cover assessment methodologies

    Quantification of spatial and temporal changes in forest cover is an essential component of forest monitoring programs. Because it can image through cloud, Synthetic Aperture Radar (SAR) is an ideal source of information on forest dynamics in countries with near-constant cloud cover. However, few studies have investigated the use of SAR for forest cover estimation in landscapes with highly sparse and fragmented forest cover. In this study, the potential use of L-band SAR for forest cover estimation in two regions of Ireland (Longford and Sligo) is investigated and compared to forest cover estimates derived from three national (Forestry2010, Prime2, National Forest Inventory), one pan-European (Forest Map 2006) and one global (Global Forest Change) forest cover product. Two machine-learning approaches (Random Forests and Extremely Randomised Trees) are evaluated. Both Random Forests and Extremely Randomised Trees classification accuracies were high (98.1–98.5%), with differences between the two classifiers being minimal (<0.5%). Increasing levels of post-classification filtering led to a decrease in estimated forest area and an increase in overall accuracy of the SAR-derived forest cover maps. All forest cover products were evaluated using an independent validation dataset. For the Longford region, the highest overall accuracy was recorded with the Forestry2010 dataset (97.42%), whereas in Sligo the highest overall accuracy was obtained for the Prime2 dataset (97.43%), although the accuracies of the SAR-derived forest maps were comparable. Our findings indicate that spaceborne radar could aid inventories in regions with low levels of forest cover in fragmented landscapes. The reduced accuracies observed for the global and pan-continental forest cover maps in comparison to the national and SAR-derived forest maps indicate that caution should be exercised when applying these datasets for national reporting.
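
    Post-classification filtering of the kind described above is typically a moving-window majority (modal) filter applied to the classified raster; a minimal, hedged sketch (the window size is an assumption, and the abstract does not specify the filter used):

    # Majority filter over a classified forest / non-forest map: each pixel is
    # replaced by the most frequent class label in its size x size neighbourhood.
    import numpy as np
    from scipy.ndimage import generic_filter

    def majority_filter(class_map, size=5):
        modal = lambda values: np.bincount(values.astype(int)).argmax()
        return generic_filter(class_map, modal, size=size, mode="nearest")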

    Overview of the 2005 cross-language image retrieval track (ImageCLEF)

    The purpose of this paper is to outline efforts from the 2005 CLEF cross-language image retrieval campaign (ImageCLEF). The aim of this CLEF track is to explore the use of both text- and content-based retrieval methods for cross-language image retrieval. Four tasks were offered in the ImageCLEF track: ad-hoc retrieval from a historic photographic collection, ad-hoc retrieval from a medical collection, an automatic image annotation task, and a user-centred (interactive) evaluation task that is explained in the iCLEF summary. 24 research groups from a variety of backgrounds and nationalities (14 countries) participated in ImageCLEF. In this paper we describe the ImageCLEF tasks, the submissions from participating groups, and summarise the main findings.

    Three-stage ensemble of ImageNet pre-trained networks for pneumonia detection

    Focusing on the detection of pneumonia in chest X-ray images, this paper proposes a three-stage ensemble methodology utilizing multiple pre-trained Convolutional Neural Networks (CNNs). In the first-stage ensemble, k subsets of the training data are first randomly generated, each of which is then used to retrain a pre-trained CNN, producing k CNN models for the first-stage ensemble. In the second-stage ensemble, multiple ensemble CNN models based on multiple pre-trained CNNs are integrated to reduce variance and improve prediction performance. The third-stage ensemble is based on image augmentation: the original set of images is augmented to generate a few sets of additional images, each set of images is input to the ensemble models from the first two stages, and the outputs based on the multiple sets of images are then integrated. To integrate outputs in each stage, four ensemble techniques are introduced: averaging, feed-forward neural network-based, decision tree-based, and majority voting. Thorough experiments were conducted on chest X-ray images from a Kaggle challenge, and the results showed the effectiveness of the proposed three-stage ensemble method in detecting pneumonia in the images.
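
    As an illustration of the simplest combiner named above (averaging), here is a hedged sketch assuming PyTorch models; it is not the authors' implementation, and the input batch shape and use of softmax outputs are assumptions:

    # Averaging ensemble: softmax outputs of k retrained CNNs are averaged,
    # and the class with the highest mean probability is returned.
    import torch

    def ensemble_average(models, images):
        """models: list of fine-tuned CNNs; images: batch tensor (N, C, H, W)."""
        with torch.no_grad():
            probs = [torch.softmax(m(images), dim=1) for m in models]
        avg = torch.stack(probs).mean(dim=0)      # average class probabilities
        return avg.argmax(dim=1)                  # predicted class per image

    Majority voting, another combiner mentioned above, would instead take the mode of each model's argmax predictions rather than averaging probabilities.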

    Assessment of multi-temporal, multi-sensor radar and ancillary spatial data for grasslands monitoring in Ireland using machine learning approaches

    Accurate inventories of grasslands are important for studies of carbon dynamics, biodiversity conservation and agricultural management. For regions with persistent cloud cover, the use of multi-temporal synthetic aperture radar (SAR) data provides an attractive solution for generating up-to-date inventories of grasslands. This is even more appealing considering the data that will be available from upcoming missions such as Sentinel-1 and ALOS-2. In this study, the performance of three machine learning algorithms, Random Forests (RF), Support Vector Machines (SVM) and the relatively underused Extremely Randomised Trees (ERT), is evaluated for discriminating between grassland types over two large heterogeneous areas of Ireland using multi-temporal, multi-sensor radar and ancillary spatial datasets. A detailed accuracy assessment shows the efficacy of the three algorithms in classifying different types of grasslands. Overall accuracies ≥ 88.7% (with a kappa coefficient of 0.87) were achieved for the single-frequency classifications, and maximum accuracies of 97.9% (kappa coefficient of 0.98) for the combined-frequency classifications. For most datasets, the ERT classifier outperforms SVM and RF.
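
    A hedged sketch of how such a three-way comparison is commonly set up, using the scikit-learn versions of RF, SVM and ERT on a feature table of multi-temporal backscatter and ancillary variables (hyperparameters and the train/test split are assumptions, not the study's configuration):

    # Compare RF, SVM and ERT on the same features, reporting accuracy and kappa.
    from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
    from sklearn.svm import SVC
    from sklearn.metrics import accuracy_score, cohen_kappa_score

    classifiers = {
        "RF":  RandomForestClassifier(n_estimators=500, random_state=0),
        "SVM": SVC(kernel="rbf", C=10, gamma="scale"),
        "ERT": ExtraTreesClassifier(n_estimators=500, random_state=0),
    }

    def compare(X_train, y_train, X_test, y_test):
        for name, clf in classifiers.items():
            pred = clf.fit(X_train, y_train).predict(X_test)
            print(name,
                  "accuracy=%.3f" % accuracy_score(y_test, pred),
                  "kappa=%.3f" % cohen_kappa_score(y_test, pred))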

    A bag-of-words approach for Drosophila gene expression pattern annotation

    Background: Drosophila gene expression pattern images document the spatiotemporal dynamics of gene expression during embryogenesis. A comparative analysis of these images could provide a fundamentally important way of studying the regulatory networks governing development. To facilitate pattern comparison and searching, groups of images in the Berkeley Drosophila Genome Project (BDGP) high-throughput study were annotated manually with a variable number of anatomical terms using a controlled vocabulary. Considering that the number of available images is rapidly increasing, it is imperative to design computational methods to automate this task. Results: We present a computational method to annotate gene expression pattern images automatically. The proposed method uses the bag-of-words scheme to utilize the existing information on pattern annotation and annotates images using a model that exploits correlations among terms. The proposed method can annotate images individually or in groups (e.g., according to the developmental stage). In addition, the proposed method can integrate information from different two-dimensional views of embryos. Results on embryonic patterns from BDGP data demonstrate that our method significantly outperforms other methods. Conclusion: The proposed bag-of-words scheme is effective in representing a set of annotations assigned to a group of images, and the model employed to annotate images successfully captures the correlations among different controlled vocabulary terms. The integration of existing annotation information from multiple embryonic views improves annotation performance. The electronic version of this article is the complete one and can be found online at: http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-10-11
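
    A generic bag-of-words sketch, as a simplified stand-in for the pipeline described above (the descriptor choice, codebook size and pooling rule are assumptions, and the term-correlation model is omitted): local descriptors from a group of images are quantized against a k-means codebook and pooled into a single histogram, which a multi-label classifier can then map to controlled-vocabulary terms.

    # Build a visual-word codebook and pool a group of images into one histogram.
    import numpy as np
    from sklearn.cluster import KMeans

    def build_codebook(descriptor_list, k=200):
        """descriptor_list: list of (n_i, d) arrays of local image descriptors."""
        return KMeans(n_clusters=k, n_init=10, random_state=0).fit(np.vstack(descriptor_list))

    def bag_of_words(codebook, group_descriptors):
        """Pool all descriptors from a group of images into a normalized histogram."""
        words = codebook.predict(np.vstack(group_descriptors))
        hist = np.bincount(words, minlength=codebook.n_clusters)
        return hist / hist.sum()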