273 research outputs found

    Representations for Cognitive Vision : a Review of Appearance-Based, Spatio-Temporal, and Graph-Based Approaches

    Get PDF
    The emerging discipline of cognitive vision requires a proper representation of visual information including spatial and temporal relationships, scenes, events, semantics and context. This review article summarizes existing representational schemes in computer vision which might be useful for cognitive vision, a and discusses promising future research directions. The various approaches are categorized according to appearance-based, spatio-temporal, and graph-based representations for cognitive vision. While the representation of objects has been covered extensively in computer vision research, both from a reconstruction as well as from a recognition point of view, cognitive vision will also require new ideas how to represent scenes. We introduce new concepts for scene representations and discuss how these might be efficiently implemented in future cognitive vision systems

    Evaluation Kidney Layer Segmentation on Whole Slide Imaging using Convolutional Neural Networks and Transformers

    Full text link
    The segmentation of kidney layer structures, including cortex, outer stripe, inner stripe, and inner medulla within human kidney whole slide images (WSI) plays an essential role in automated image analysis in renal pathology. However, the current manual segmentation process proves labor-intensive and infeasible for handling the extensive digital pathology images encountered at a large scale. In response, the realm of digital renal pathology has seen the emergence of deep learning-based methodologies. However, very few, if any, deep learning based approaches have been applied to kidney layer structure segmentation. Addressing this gap, this paper assesses the feasibility of performing deep learning based approaches on kidney layer structure segmetnation. This study employs the representative convolutional neural network (CNN) and Transformer segmentation approaches, including Swin-Unet, Medical-Transformer, TransUNet, U-Net, PSPNet, and DeepLabv3+. We quantitatively evaluated six prevalent deep learning models on renal cortex layer segmentation using mice kidney WSIs. The empirical results stemming from our approach exhibit compelling advancements, as evidenced by a decent Mean Intersection over Union (mIoU) index. The results demonstrate that Transformer models generally outperform CNN-based models. By enabling a quantitative evaluation of renal cortical structures, deep learning approaches are promising to empower these medical professionals to make more informed kidney layer segmentation

    Object-Centric Open-Vocabulary Image-Retrieval with Aggregated Features

    Full text link
    The task of open-vocabulary object-centric image retrieval involves the retrieval of images containing a specified object of interest, delineated by an open-set text query. As working on large image datasets becomes standard, solving this task efficiently has gained significant practical importance. Applications include targeted performance analysis of retrieved images using ad-hoc queries and hard example mining during training. Recent advancements in contrastive-based open vocabulary systems have yielded remarkable breakthroughs, facilitating large-scale open vocabulary image retrieval. However, these approaches use a single global embedding per image, thereby constraining the system's ability to retrieve images containing relatively small object instances. Alternatively, incorporating local embeddings from detection pipelines faces scalability challenges, making it unsuitable for retrieval from large databases. In this work, we present a simple yet effective approach to object-centric open-vocabulary image retrieval. Our approach aggregates dense embeddings extracted from CLIP into a compact representation, essentially combining the scalability of image retrieval pipelines with the object identification capabilities of dense detection methods. We show the effectiveness of our scheme to the task by achieving significantly better results than global feature approaches on three datasets, increasing accuracy by up to 15 mAP points. We further integrate our scheme into a large scale retrieval framework and demonstrate our method's advantages in terms of scalability and interpretability.Comment: BMVC 202

    Automatic segmentation of the left ventricle cavity and myocardium in MRI data

    Get PDF
    A novel approach for the automatic segmentation has been developed to extract the epi-cardium and endo-cardium boundaries of the left ventricle (lv) of the heart. The developed segmentation scheme takes multi-slice and multi-phase magnetic resonance (MR) images of the heart, transversing the short-axis length from the base to the apex. Each image is taken at one instance in the heart's phase. The images are segmented using a diffusion-based filter followed by an unsupervised clustering technique and the resulting labels are checked to locate the (lv) cavity. From cardiac anatomy, the closest pool of blood to the lv cavity is the right ventricle cavity. The wall between these two blood-pools (interventricular septum) is measured to give an approximate thickness for the myocardium. This value is used when a radial search is performed on a gradient image to find appropriate robust segments of the epi-cardium boundary. The robust edge segments are then joined using a normal spline curve. Experimental results are presented with very encouraging qualitative and quantitative results and a comparison is made against the state-of-the art level-sets method

    Image Segmentation by Edge Partitioning over a Nonsubmodular Markov Random Field

    Get PDF
    Edge weight-based segmentation methods, such as normalized cut or minimum cut, require a partition number specification for their energy formulation. The number of partitions plays an important role in the segmentation overall quality. However, finding a suitable partition number is a nontrivial problem, and the numbers are ordinarily manually assigned. This is an aspect of the general partition problem, where finding the partition number is an important and difficult issue. In this paper, the edge weights instead of the pixels are partitioned to segment the images. By partitioning the edge weights into two disjoints sets, that is, cut and connect, an image can be partitioned into all possible disjointed segments. The proposed energy function is independent of the number of segments. The energy is minimized by iterating the QPBO-α-expansion algorithm over the pairwise Markov random field and the mean estimation of the cut and connected edges. Experiments using the Berkeley database show that the proposed segmentation method can obtain equivalently accurate segmentation results without designating the segmentation numbers
    corecore