609 research outputs found

    Visual analysis for drum sequence transcription

    A system is presented for analysing drum performance video sequences. A novel ellipse detection algorithm is introduced that automatically locates drum tops. This algorithm fits ellipses to edge clusters and ranks them according to various fitness criteria. A background/foreground segmentation method is then used to extract the silhouette of the drummer and drum sticks. Coupled with a motion intensity feature, this allows for the detection of ‘hits’ in each of the extracted regions. In order to obtain a transcription of the performance, each of these regions is automatically labeled with the corresponding instrument class. A partial audio transcription and color cues are used to measure the compatibility between a region and its label; the Kuhn-Munkres algorithm is then employed to find the optimal labeling. Experimental results demonstrate the ability of visual analysis to enhance the performance of an audio drum transcription system.
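
The labeling step described above is an assignment problem: each region gets exactly one instrument label so that the total compatibility is maximal. The following is a minimal sketch of that idea, not the authors' implementation. It uses exhaustive search as a stand-in for the Kuhn-Munkres algorithm (both reach the same optimum, but exhaustive search is only practical for a handful of regions). The region names, label names, and compatibility scores are hypothetical.

```python
from itertools import permutations


def best_labeling(compat, regions, labels):
    """Return (mapping, score) for the one-to-one labeling with maximal total compatibility.

    compat[r][c] is the compatibility between regions[r] and labels[c].
    Exhaustive search is used here in place of Kuhn-Munkres; it is a
    simplification that only scales to very small problems.
    """
    best_score, best_perm = float("-inf"), None
    # Each permutation assigns a distinct label index to every region index.
    for perm in permutations(range(len(labels)), len(regions)):
        score = sum(compat[r][c] for r, c in enumerate(perm))
        if score > best_score:
            best_score, best_perm = score, perm
    mapping = {regions[r]: labels[c] for r, c in enumerate(best_perm)}
    return mapping, best_score


# Hypothetical scores (rows: detected regions, columns: instrument labels).
regions = ["region_0", "region_1", "region_2"]
labels = ["snare", "hi-hat", "tom"]
compat = [
    [0.9, 0.8, 0.1],
    [0.9, 0.2, 0.3],
    [0.2, 0.6, 0.7],
]

mapping, total = best_labeling(compat, regions, labels)
print(mapping, round(total, 2))
```

In this toy case a greedy choice would give "snare" to region_0, its best individual match, yet the optimal joint labeling gives "snare" to region_1. That difference is why a global assignment method is used rather than per-region maxima.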

    TFDet: Target-aware Fusion for RGB-T Pedestrian Detection

    Pedestrian detection plays a critical role in computer vision as it contributes to ensuring traffic safety. Existing methods that rely solely on RGB images suffer from performance degradation under low-light conditions due to the lack of useful information. To address this issue, recent multispectral detection approaches have combined thermal images to provide complementary information and have obtained enhanced performance. Nevertheless, few approaches focus on the negative effects of false positives caused by noisy fused feature maps. Unlike these approaches, we comprehensively analyze the impact of false positives on detection performance and find that enhancing feature contrast can significantly reduce them. In this paper, we propose a novel target-aware fusion strategy for multispectral pedestrian detection, named TFDet. Our fusion strategy highlights the pedestrian-related features while suppressing unrelated ones, resulting in more discriminative fused features. TFDet achieves state-of-the-art performance on both KAIST and LLVIP benchmarks, with an efficiency comparable to the previous state-of-the-art counterpart. Importantly, TFDet performs remarkably well even under low-light conditions, which is a significant advancement for ensuring road safety. The code will be made publicly available at https://github.com/XueZ-phd/TFDet.git.

    A brief survey of visual saliency detection

    Recognition of Human Actions in Video

    Recognition and analysis of human actions is an important task in the area of computer vision. This research has many applications, including surveillance systems, patient monitoring systems, human performance analysis, content-based image/video retrieval/storage, virtual reality, and a variety of systems that involve interactions between persons or between persons and devices. The need for such systems is increasing day by day with the growing number of surveillance cameras deployed in public spaces. Automated systems are required that can detect, categorize and recognize human activities and request human attention only when necessary. In this paper, the important steps of such a system are described; it can robustly track humans in various environments and recognize their actions from image sequences acquired by a single fixed camera. The overall system consists of three major steps: blob extraction, feature extraction, and human action recognition. Given the sequence of images, a statistical method is demonstrated to extract the blobs and to remove the shadows and highlights in order to obtain a more accurate object silhouette. Shape context is used to extract features in the next step, and finally the human action is recognized using a neural network.

    WATUNet: A Deep Neural Network for Segmentation of Volumetric Sweep Imaging Ultrasound

    Objective. Limited access to breast cancer diagnosis globally leads to delayed treatment. Ultrasound, an effective yet underutilized method, requires specialized training for sonographers, which hinders its widespread use. Approach. Volume sweep imaging (VSI) is an innovative approach that enables untrained operators to capture high-quality ultrasound images. Combined with deep learning, such as convolutional neural networks (CNNs), it can potentially transform breast cancer diagnosis, enhancing accuracy, saving time and costs, and improving patient outcomes. The widely used UNet architecture, known for medical image segmentation, has limitations, such as vanishing gradients and a lack of multi-scale feature extraction and selective region attention. In this study, we present a novel segmentation model known as Wavelet_Attention_UNet (WATUNet). In this model, we incorporate wavelet gates (WGs) and attention gates (AGs) between the encoder and decoder instead of a simple connection to overcome the limitations mentioned, thereby improving model performance. Main results. Two datasets are utilized for the analysis: the public "Breast Ultrasound Images" (BUSI) dataset of 780 images and a VSI dataset of 3818 images. Both datasets contained segmented lesions categorized into three types: no mass, benign mass, and malignant mass. Our segmentation results show superior performance compared to other deep networks. The proposed algorithm attained a Dice coefficient of 0.94 and an F1 score of 0.94 on the VSI dataset, and 0.93 and 0.94, respectively, on the public dataset.
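
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a Dice coefficient and an F1 score can be computed for a single pair of binary masks. It is an illustration under stated assumptions, not the paper's evaluation code: masks are assumed to be flat lists of 0/1 pixel labels, and the example masks are hypothetical. The abstract does not say how either metric is computed or averaged over a dataset.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def f1_score(pred, truth):
    """F1 = 2 * precision * recall / (precision + recall), computed per pixel."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    if tp == 0:
        # No overlap: perfect only if both masks are empty.
        return 1.0 if fp == 0 and fn == 0 else 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical 8-pixel masks: 3 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 0, 0, 1, 1, 0]

dice = dice_coefficient(pred, truth)
f1 = f1_score(pred, truth)
print(dice, f1)
```

On a single pair of binary masks the two quantities are algebraically identical, since both reduce to 2TP / (2TP + FP + FN). They can differ once scores are averaged across images or classes in different ways.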

    Salient Object Detection via Integrity Learning

    Although current salient object detection (SOD) works have achieved remarkable progress, they fall short when it comes to the integrity of the predicted salient regions. We define the concept of integrity at both the micro and macro level. Specifically, at the micro level, the model should highlight all parts that belong to a certain salient object, while at the macro level, the model needs to discover all salient objects from the given image scene. To facilitate integrity learning for salient object detection, we design a novel Integrity Cognition Network (ICON), which explores three important components to learn strong integrity features. 1) Unlike the existing models that focus more on feature discriminability, we introduce a diverse feature aggregation (DFA) component to aggregate features with various receptive fields (i.e., kernel shape and context) and increase the feature diversity. Such diversity is the foundation for mining the integral salient objects. 2) Based on the DFA features, we introduce the integrity channel enhancement (ICE) component with the goal of enhancing feature channels that highlight the integral salient objects at the macro level, while suppressing the other distracting ones. 3) After extracting the enhanced features, the part-whole verification (PWV) method is employed to determine whether the part and whole object features have strong agreement. Such part-whole agreements can further improve the micro-level integrity for each salient object. To demonstrate the effectiveness of ICON, comprehensive experiments are conducted on seven challenging benchmarks, where promising results are achieved.

    Salient Object Detection Techniques in Computer Vision-A Survey.

    Detection and localization of image regions that attract immediate human visual attention is currently an intensive area of research in computer vision. The capability of automatic identification and segmentation of such salient image regions has immediate consequences for applications in the fields of computer vision, computer graphics, and multimedia. A large number of salient object detection (SOD) methods have been devised to effectively mimic the capability of the human visual system to detect the salient regions in images. These methods can be broadly divided into two categories based on their feature engineering mechanism: conventional or deep learning-based. In this survey, most of the influential advances in image-based SOD from both the conventional and the deep learning-based categories are reviewed in detail. Relevant saliency modeling trends, with key issues, core techniques, and the scope for future research work, are discussed in the context of difficulties often faced in salient object detection. Results are presented for various challenging cases on some large-scale public datasets. Different metrics considered for assessing the performance of state-of-the-art salient object detection models are also covered. Some future directions for SOD are presented towards the end.

    Direction Selective Contour Detection for Salient Objects

    The active contour model is a widely used technique for automatic object contour extraction. Existing methods based on this model can perform with high accuracy even in the case of complex contours, but challenging issues remain, such as the need for precise contour initialization for high-curvature boundary segments or the handling of cluttered backgrounds. To deal with such issues, this paper presents a salient object extraction method, the first step of which is the introduction of an improved edge map that incorporates edge direction as a feature. The direction information in the small neighborhoods of image feature points is extracted, and the images’ prominent orientations are defined for direction-selective edge extraction. Using such improved edge information, we provide a highly accurate shape contour representation, which we also combine with texture features. The principle of the paper is to interpret an object as the fusion of its components: its extracted contour and its inner texture. Our goal in fusing textural and structural information is twofold: it is applied for automatic contour initialization, and it is also used to establish an improved external force field. This fusion then produces highly accurate salient object extractions. We performed extensive evaluations which confirm that the presented object extraction method outperforms parametric active contour models and achieves higher efficiency than the majority of the evaluated automatic saliency methods.