47 research outputs found

    Efficient High-Resolution Template Matching with Vector Quantized Nearest Neighbour Fields

    Template matching is a fundamental problem in computer vision and has applications in various fields, such as object detection, image registration, and object tracking. The current state-of-the-art methods rely on nearest-neighbour (NN) matching, in which the query feature space is converted to NN space by representing each query pixel with its NN among the template pixels. NN-based methods have been shown to perform better under occlusions, changes in appearance, illumination variations, and non-rigid transformations. However, NN matching scales poorly with high-resolution data and high feature dimensions. In this work, we present an NN-based template-matching method which efficiently reduces the NN computations and introduces filtering in the NN fields to account for deformations. A vector quantization step first represents the template with k features, then filtering compares the template and query distributions over the k features. We show that the method achieves state-of-the-art performance on low-resolution data and outperforms previous methods at higher resolution, showing the robustness and scalability of the approach.
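The two steps named in the abstract (quantize the template to k features, then compare template and query distributions over those features) can be illustrated with a small sketch. This is not the authors' implementation: the plain k-means quantizer, the histogram-intersection score, and the names `kmeans`, `assign`, and `vq_match_scores` are all assumptions chosen for illustration, and the filtering step of the paper is replaced here by a simple sliding-window histogram.

```python
import numpy as np

def assign(features, centers):
    # Label each feature vector with the index of its nearest codeword.
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(features, k, iters=10, seed=0):
    # Plain k-means (an assumed quantizer, not necessarily the paper's).
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(features, centers)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def vq_match_scores(template, query, k=4):
    # template: (th, tw, c) float array, query: (qh, qw, c) float array.
    th, tw, c = template.shape
    qh, qw, _ = query.shape
    t_feats = template.reshape(-1, c)
    centers = kmeans(t_feats, k)
    # Distribution of the template over the k codewords.
    t_hist = np.bincount(assign(t_feats, centers), minlength=k) / (th * tw)
    # Each query pixel needs only k distance computations, not th * tw.
    q_labels = assign(query.reshape(-1, c), centers).reshape(qh, qw)
    scores = np.zeros((qh - th + 1, qw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            window = q_labels[y:y + th, x:x + tw].ravel()
            w_hist = np.bincount(window, minlength=k) / (th * tw)
            # Histogram intersection: 1.0 means identical distributions.
            scores[y, x] = np.minimum(t_hist, w_hist).sum()
    return scores
```

The location of the maximum in the returned score map is the candidate match. Because the score depends only on label counts inside the window, it tolerates rearrangement of pixels within the window, which is one simple way to be lenient towards deformations.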

    Towards Better Guided Attention and Human Knowledge Insertion in Deep Convolutional Neural Networks

    Attention Branch Networks (ABNs) have been shown to simultaneously provide visual explanation and improve the performance of deep convolutional neural networks (CNNs). In this work, we introduce Multi-Scale Attention Branch Networks (MSABN), which enhance the resolution of the generated attention maps and improve performance. We evaluate MSABN on benchmark image recognition and fine-grained recognition datasets, where we observe that MSABN outperforms ABN and baseline models. We also introduce a new data augmentation strategy utilizing the attention maps to incorporate human knowledge in the form of bounding box annotations of the objects of interest. We show that even with a limited number of edited samples, a significant performance gain can be achieved with this strategy.

    Interpreting deep learning output for out-of-distribution detection

    Commonly used AI networks are very self-confident in their predictions, even when the evidence for a certain decision is dubious. The investigation of a deep learning model's output is pivotal for understanding its decision processes and assessing its capabilities and limitations. By analyzing the distributions of raw network output vectors, it can be observed that each class has its own decision boundary and, thus, the same raw output value has different support for different classes. Inspired by this fact, we have developed a new method for out-of-distribution (OOD) detection. The method offers an explanatory step beyond simple thresholding of the softmax output towards understanding and interpretation of the model learning process and its output. Instead of assigning the class label of the highest logit to each new sample presented to the network, it takes the distributions over all classes into consideration. A probability score interpreter (PSI) is created based on the joint logit values in relation to their respective correct vs wrong class distributions. The PSI suggests whether the sample is likely to belong to a specific class, whether the network is unsure, or whether the sample is likely an outlier or unknown type for the network. The simple PSI has the benefit of being applicable to already trained networks. The distributions for correct vs wrong class for each output node are established by simply running the training examples through the trained network. We demonstrate our OOD detection method on a challenging transmission electron microscopy virus image dataset. We simulate a real-world application in which images of virus types unknown to a trained virus classifier, yet acquired with the same procedures and instruments, constitute the OOD samples.
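The idea of comparing each logit against per-node "correct" and "wrong" distributions can be sketched as follows. The abstract does not give the PSI formula, so everything specific here is an assumption for illustration: the Gaussian model of each distribution, the likelihood-ratio style support value, the 0.9 threshold, and the names `fit_node_distributions` and `interpret`.

```python
import numpy as np

def fit_node_distributions(logits, labels):
    # For every output node c, summarise the logits it produced on training
    # samples of class c ("correct") and on all other samples ("wrong").
    # Gaussian summaries are an assumption of this sketch.
    stats = []
    for c in range(logits.shape[1]):
        correct = logits[labels == c, c]
        wrong = logits[labels != c, c]
        stats.append((correct.mean(), correct.std() + 1e-6,
                      wrong.mean(), wrong.std() + 1e-6))
    return np.array(stats)

def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

def interpret(logit_vec, stats, threshold=0.9):
    # Support of each node for "this sample belongs to my class".
    p_correct = gaussian_pdf(logit_vec, stats[:, 0], stats[:, 1])
    p_wrong = gaussian_pdf(logit_vec, stats[:, 2], stats[:, 3])
    support = p_correct / (p_correct + p_wrong + 1e-12)
    candidates = np.flatnonzero(support >= threshold)
    if len(candidates) == 1:
        return ("class", int(candidates[0]))
    if len(candidates) == 0 and np.all(support <= 1 - threshold):
        # Every node looks like its "wrong" distribution: likely unknown type.
        return ("outlier", None)
    return ("unsure", None)
```

As in the abstract, the statistics are collected by running training examples through an already trained network, so no retraining is needed. The three return values mirror the three outcomes the abstract describes: a specific class, an unsure network, or a likely outlier.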

    Mechanochemical Polarization of Contiguous Cell Walls Shapes Plant Pavement Cells

    The epidermis of aerial plant organs is thought to be limiting for growth, because it acts as a continuous load-bearing layer, resisting tension. Leaf epidermis contains jigsaw puzzle piece-shaped pavement cells whose shape has been proposed to be a result of subcellular variations in expansion rate that induce local buckling events. Paradoxically, such local compressive buckling should not occur given the tensile stresses across the epidermis. Using computational modeling, we show that the simplest scenario to explain pavement cell shapes within an epidermis under tension must involve mechanical wall heterogeneities across and along the anticlinal pavement cell walls between adjacent cells. Combining genetics, atomic force microscopy, and immunolabeling, we demonstrate that contiguous cell walls indeed exhibit hybrid mechanochemical properties. Such biochemical wall heterogeneities precede wall bending. Altogether, this provides a possible mechanism for the generation of complex plant cell shapes.

    Segmentation methods and shape descriptions in digital images

    Digital image analysis enables the creation of objective, fast, and reproducible methods for analyzing objects or situations that can be imaged. This thesis contains theoretical work regarding distance transforms for images digitized in elongated grids. Such images are the result of many, mainly 3D, imaging devices. Local weights appropriate for different elongation factors in 2D, as well as in 3D, are presented. Methods adapted to elongated grids save time and computer memory compared to increasing the image size by interpolating to a cubic grid. A number of segmentation methods for images in specific applications are also included in the thesis. Distance information is used to segment individual pores in paper volume images. This opens the possibility to investigate how the pore network affects the paper quality. Stable and reliable segmentation methods for cell nuclei are necessary to enable studies of tumor morphology, as well as amounts of fluorescence-marked substances in individual nuclei. Intensity, gradient magnitude, and shape information are combined in a method to segment cell nuclei in 2D fluorescence and 3D confocal microscopy images of tissue sections. Two match-based segmentation methods are also presented. Three types of viral capsids are identified and described based on their radial intensity distribution in transmission electron micrographs of infected cells. This can be used to measure how a potential drug affects the relative amounts of the three capsids, and possibly, the viral maturation pathway. Proteins of a specific kind in transmission electron volume images of a protein solution are identified using a shape-based match method. This method reduces the amount of visual inspection needed to identify proteins of interest in the images. Two representation schemes, developed in order to simplify the analysis of individual proteins in volume images of proteins in solution, are presented. One divides a protein into subparts based on the internal intensity distribution and shape. The other represents the protein by the maximum intensity curve connecting the centers of the subparts of the protein. These representations can serve as tools for collecting information about how flexible a protein in solution is and how it interacts with other proteins or substances. This information is valuable for the pharmaceutical industry when developing new drugs.
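A weighted (chamfer) distance transform on an elongated 2D grid can be sketched as below. The thesis presents local weights tuned for different elongation factors, and those values are not given in this abstract, so the sketch simply assumes the Euclidean step lengths of a grid whose pixels are `elongation` times taller than they are wide. The function name `chamfer_distance` is hypothetical.

```python
import numpy as np

def chamfer_distance(mask, elongation=2.0):
    # mask: bool array, True marks object pixels (distance 0).
    # Assumed local weights: plain Euclidean step lengths, not the
    # optimised weights derived in the thesis.
    a = 1.0                      # horizontal step
    b = float(elongation)        # vertical step
    c = float(np.hypot(a, b))    # diagonal step
    h, w = mask.shape
    dist = np.where(mask, 0.0, np.inf)
    # Forward pass: propagate from the top-left neighbours.
    for y in range(h):
        for x in range(w):
            d = dist[y, x]
            if x > 0:
                d = min(d, dist[y, x - 1] + a)
            if y > 0:
                d = min(d, dist[y - 1, x] + b)
                if x > 0:
                    d = min(d, dist[y - 1, x - 1] + c)
                if x < w - 1:
                    d = min(d, dist[y - 1, x + 1] + c)
            dist[y, x] = d
    # Backward pass: propagate from the bottom-right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            d = dist[y, x]
            if x < w - 1:
                d = min(d, dist[y, x + 1] + a)
            if y < h - 1:
                d = min(d, dist[y + 1, x] + b)
                if x < w - 1:
                    d = min(d, dist[y + 1, x + 1] + c)
                if x > 0:
                    d = min(d, dist[y + 1, x - 1] + c)
            dist[y, x] = d
    return dist
```

Working directly on the elongated grid keeps the image at its original size, which is the time and memory saving the abstract contrasts with interpolating to a cubic grid.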

    Evaluation of noise robustness for local binary pattern descriptors in texture classification

    Local binary pattern (LBP) operators have become commonly used texture descriptors in recent years. Several new LBP-based descriptors have been proposed, some of which aim at improving robustness to noise by modifying the thresholding and encoding schemes used in the descriptors. In this article, the robustness to noise of the following eight LBP-based descriptors is evaluated: improved LBP, median binary patterns (MBP), local ternary patterns (LTP), improved LTP (ILTP), local quinary patterns, robust LBP, and fuzzy LBP (FLBP). To put their performance into perspective, they are compared to three well-known reference descriptors: the classic LBP, Gabor filter banks (GF), and standard descriptors derived from gray-level co-occurrence matrices. In addition, a roughly five times faster implementation of the FLBP descriptor is presented, and a new descriptor, which we call shift LBP, is introduced as an even faster approximation to the FLBP. The texture descriptors are compared and evaluated on six texture datasets: Brodatz, KTH-TIPS2b, Kylberg, Mondial Marmi, UIUC, and a Virus texture dataset. After optimizing all parameters for each dataset, the descriptors are evaluated under increasing levels of additive Gaussian white noise. The discriminating power of the texture descriptors is assessed using tenfold cross-validation of a nearest neighbor classifier. The results show that several of the descriptors perform well at low levels of noise, while they all suffer, to different degrees, from higher levels of introduced noise. In our tests, ILTP and FLBP show an overall good performance on several datasets. The GF are often very noise robust compared to the LBP family under moderate to high levels of noise, but are not necessarily the best descriptor under low levels of added noise. In our tests, MBP is neither a good texture descriptor nor stable to noise.
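For orientation, the classic LBP reference descriptor and the additive Gaussian white noise used in the evaluation can be sketched in a few lines. This covers only the basic 8-neighbour, 3x3 variant. The noise-robust descriptors compared in the article modify the thresholding and encoding and are not reproduced here. The function names and the `sigma` parameter are assumptions for illustration.

```python
import numpy as np

def lbp_histogram(image):
    # Classic 8-neighbour LBP on a 3x3 neighbourhood: each neighbour that is
    # >= the centre pixel sets one bit of an 8-bit code.
    h, w = image.shape
    center = image[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    # The descriptor is the normalised histogram of the 256 possible codes.
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

def add_gaussian_noise(image, sigma, seed=0):
    # Additive Gaussian white noise with standard deviation sigma.
    rng = np.random.default_rng(seed)
    return image + rng.normal(0.0, sigma, size=image.shape)
```

Comparing `lbp_histogram(image)` with `lbp_histogram(add_gaussian_noise(image, sigma))` for growing `sigma` shows how quickly the hard threshold at the centre pixel value lets noise flip bits, which is the sensitivity the modified descriptors try to reduce.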