125 research outputs found

    Semantic Segmentation of Pathological Lung Tissue with Dilated Fully Convolutional Networks

    Early and accurate diagnosis of interstitial lung diseases (ILDs) is crucial for making treatment decisions, but can be challenging even for experienced radiologists. The diagnostic procedure is based on the detection and recognition of the different ILD pathologies in thoracic CT scans, yet their manifestations often appear similar. In this study, we propose the use of a deep, purely convolutional neural network for the semantic segmentation of ILD patterns, as the basic component of a computer-aided diagnosis (CAD) system for ILDs. The proposed CNN, which consists of convolutional layers with dilated filters, takes as input a lung CT image of arbitrary size and outputs the corresponding label map. We trained and tested the network on a dataset of 172 sparsely annotated CT scans, within a cross-validation scheme. The training was performed in an end-to-end and semi-supervised fashion, utilizing both labeled and non-labeled image regions. The experimental results show significant performance improvement with respect to the state of the art.
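
    As a rough illustration of what "dilated filters" means in the abstract above, the sketch below applies a 1D filter whose taps are spaced `dilation` samples apart and computes how the receptive field grows when such layers are stacked. The function names, the 1D setting, and the dilation schedule in the usage note are assumptions made for illustration only; they are not taken from the paper's architecture.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D cross-correlation with `dilation - 1` skipped samples between taps."""
    span = (len(kernel) - 1) * dilation + 1  # input samples covered by one output
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(k * signal[start + i * dilation] for i, k in enumerate(kernel)))
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 layers sharing one kernel size."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


if __name__ == "__main__":
    # A 3-tap filter with dilation 2 reads samples 0, 2 and 4.
    print(dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=2))
    # Three 3-tap layers with hypothetical dilations 1, 2, 4.
    print(receptive_field(3, [1, 2, 4]))
```

    With a hypothetical schedule of dilations 1, 2, 4, three 3-tap layers cover 15 input samples without any pooling, which is why dilation is attractive when the output label map must keep the input resolution.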

    Expected exponential loss for gaze-based video and volume ground truth annotation

    Many recent machine learning approaches used in medical imaging are highly reliant on large amounts of image and ground truth data. In the context of object segmentation, pixel-wise annotations are extremely expensive to collect, especially in video and 3D volumes. To reduce this annotation burden, we propose a novel framework that allows annotators to simply observe the object to segment while a $200 eye gaze tracker records where they looked. Our method then estimates pixel-wise probabilities for the presence of the object throughout the sequence, from which we train a classifier in a semi-supervised setting using a novel Expected Exponential loss function. We show that our framework provides superior performance on a wide range of medical image settings compared to existing strategies and that our method can be combined with current crowd-sourcing paradigms as well. Comment: 9 pages, 5 figures, MICCAI 2017 - LABELS Workshop

    Semantic Segmentation of Histopathological Slides for the Classification of Cutaneous Lymphoma and Eczema

    Mycosis fungoides (MF) is a rare, potentially life-threatening skin disease, which in its early stages strongly resembles, both clinically and histologically, Eczema, a very common and benign skin condition. In order to increase the survival rate, one needs to provide the appropriate treatment early on. To this end, one crucial step for specialists is the evaluation of histopathological slides (glass slides), or Whole Slide Images (WSI), of the patients' skin tissue. We introduce a deep-learning-aided diagnostic tool that brings a two-fold value to the decision process of pathologists. First, our algorithm accurately segments WSI into regions that are relevant for an accurate diagnosis, achieving a Mean-IoU of 69% and a Matthews Correlation score of 83% on a novel dataset. Additionally, we show that our model is competitive with the state of the art on a reference dataset. Second, using the segmentation map and the original image, we are able to predict whether a patient has MF or Eczema. We created two models that can be applied in different stages of the diagnostic pipeline, potentially eliminating life-threatening mistakes. The classification outcome is considerably more interpretable than using only the WSI as the input, since it is also based on the segmentation map. Our segmentation model, which we call EU-Net, extends a classical U-Net with an EfficientNet-B7 encoder that was pre-trained on the ImageNet dataset. Comment: Submitted to https://link.springer.com/chapter/10.1007/978-3-030-52791-4_
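
    The two segmentation scores quoted in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: labels are given as flat lists of per-pixel class indices, Mean-IoU averages over classes that occur in either the reference or the prediction, and the Matthews correlation is shown in its binary form only. The function names are hypothetical, and the paper's exact evaluation protocol may differ.

```python
import math


def mean_iou(y_true, y_pred, num_classes):
    """Average intersection-over-union over classes present in y_true or y_pred."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union > 0:  # skip classes absent from both label maps
            ious.append(inter / union)
    return sum(ious) / len(ious)


def matthews_corrcoef_binary(y_true, y_pred):
    """Binary Matthews correlation coefficient; class 1 is treated as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    reference = [0, 0, 1, 1]   # toy per-pixel labels, not real data
    predicted = [0, 1, 1, 1]
    print(mean_iou(reference, predicted, num_classes=2))
    print(matthews_corrcoef_binary(reference, predicted))
```

    Both scores penalize the single false positive in the toy example, but the Matthews correlation also accounts for true negatives, which makes it less sensitive to class imbalance than plain pixel accuracy.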

    Food category recognition using SURF and MSER local feature representation

    Food object recognition has gained popularity in recent years. This can perhaps be attributed to its potential applications in fields such as nutrition and fitness. Recognizing food images, however, is a challenging task since foods come in many shapes and sizes. Besides having unexpected deformities and textures, food images are also captured under differing lighting conditions and camera viewpoints. From a computer vision perspective, using global image features to train a supervised classifier might be unsuitable due to the complex nature of food images. Local features, on the other hand, seem the better alternative since they are able to capture fine details such as interest points. In this paper, two local features, namely SURF (Speeded-Up Robust Features) and MSER (Maximally Stable Extremal Regions), are investigated for food object recognition. Both features are computationally inexpensive and have been shown to be effective local descriptors for complex images. Each feature is first evaluated separately. This is followed by feature fusion to observe whether a combined representation could better represent food images. Experimental evaluations using a Support Vector Machine classifier show that feature fusion yields the better recognition accuracy, at 86.6%.

    Figure Text Extraction in Biomedical Literature

    Background: Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engin

    Text detection in images and videos

    The goal of a multimedia text extraction and recognition system is to fill the gap between the existing, mature technology of Optical Character Recognition and the new needs for textual information retrieval created by the spread of digital multimedia. A system for extracting text from multimedia usually consists of the following four stages: spatial text detection, temporal text detection and tracking (for videos), image binarization and segmentation, and character recognition. In this PhD thesis we dealt with all the stages of a multimedia text extraction system, focusing on the design and development of techniques for the spatial detection of text in images and videos, as well as on methods for evaluating the corresponding result. The contribution of the thesis to the research area of multimedia text extraction lies in proposing generic methods for the spatial detection of unconstrained text in images and videos regardless of their content, quality and resolution. In addition, two methods for evaluating the text detection result were proposed that deal successfully with problems found in the related literature. Each of them uses different criteria for evaluating the result, while both are based on intuitively correct observations. Finally, a very efficient method was developed for the temporal detection of text, which contributes to better spatial detection while also enhancing the quality of the text image.

    A Hybrid System for Text Detection in Video Frames

    This paper proposes a hybrid system for text detection in video frames. The system consists of two main stages. In the first stage, text regions are detected based on the edge map of the image, leading to a high recall rate with minimal computational requirements. Subsequently, a refinement stage uses an SVM classifier trained on features obtained by a new Local Binary Pattern based operator, which reduces false alarms. Experimental results show the overall performance of the system and demonstrate the discriminating ability of the proposed feature set.
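
    For readers unfamiliar with Local Binary Patterns, the sketch below computes the classic 8-neighbour LBP code of a 3x3 grayscale patch. It shows the standard operator only, not the new LBP-based operator that the abstract above proposes; the function name and the clockwise bit ordering are assumptions chosen for illustration.

```python
def lbp_code(patch):
    """Classic 8-neighbour LBP code for the centre pixel of a 3x3 patch.

    `patch` is a 3x3 list of lists of intensities. Each neighbour that is
    greater than or equal to the centre sets one bit of the 8-bit code.
    """
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (row, col) in enumerate(offsets):
        if patch[row][col] >= center:
            code |= 1 << bit
    return code


if __name__ == "__main__":
    flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]      # uniform region
    corner = [[9, 1, 1], [1, 5, 1], [1, 1, 1]]    # only top-left is brighter
    print(lbp_code(flat), lbp_code(corner))
```

    In a typical pipeline, a histogram of such codes over a candidate region would serve as the feature vector passed to a classifier such as an SVM.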