126 research outputs found

    U-ASD Net: supervised crowd counting based on semantic segmentation and adaptive scenario discovery

    Get PDF
    Crowd counting considers one of the most significant and challenging issues in computer vision and deep learning communities, whose applications are being utilized for various tasks. While this issue is well studied, it remains an open challenge to manage perspective distortions and scale variations. How well these problems are resolved has a huge impact on predicting a high-quality crowd density map. In this study, a hybrid and modified deep neural network (U-ASD Net), based on U-Net and adaptive scenario discovery (ASD), is proposed to get precise and effective crowd counting. The U part is produced by replacing the nearest upsampling in the encoder of U-Net with max-unpooling. This modification provides a better crowd counting performance by capturing more spatial information. The max-unpooling layers upsample the feature maps based on the max locations held from the downsampling process. The ASD part is constructed with three light pathways, two of which have been learned to reflect various densities of the crowd and define the appropriate geometric configuration employing various sizes of the receptive field. The third pathway is an adaptation path, which implicitly discovers and models complex scenarios to recalibrate pathway-wise responses adaptively. ASD has no additional branches to avoid increasing the complexity. The designed model is end-to-end trainable. This integration provides an effective model to count crowds in both dense and sparse datasets. It also predicts an elevated quality density map with a high structural similarity index and a high peak signal-to-noise ratio. Several comprehensive experiments on four popular datasets for crowd counting have been carried out to demonstrate the proposed method's promising performance compared to other state-of-the-art approaches. Furthermore, a new dataset with its manual annotations, called Haramain with three different scenes and different densities, is introduced and used for evaluating the U-ASD Net

    Advances in Object and Activity Detection in Remote Sensing Imagery

    Get PDF
    The recent revolution in deep learning has enabled considerable development in the fields of object and activity detection. Visual object detection tries to find objects of target classes with precise localisation in an image and assign each object instance a corresponding class label. At the same time, activity recognition aims to determine the actions or activities of an agent or group of agents based on sensor or video observation data. It is a very important and challenging problem to detect, identify, track, and understand the behaviour of objects through images and videos taken by various cameras. Together, objects and their activity recognition in imaging data captured by remote sensing platforms is a highly dynamic and challenging research topic. During the last decade, there has been significant growth in the number of publications in the field of object and activity recognition. In particular, many researchers have proposed application domains to identify objects and their specific behaviours from air and spaceborne imagery. This Special Issue includes papers that explore novel and challenging topics for object and activity detection in remote sensing images and videos acquired by diverse platforms

    Object Detection in 20 Years: A Survey

    Full text link
    Object detection, as of one the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of cold weapon era. This paper extensively reviews 400+ papers of object detection in the light of its technical evolution, spanning over a quarter-century's time (from the 1990s to 2019). A number of topics have been covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed up techniques, and the recent state of the art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, text detection, etc, and makes an in-deep analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible publicatio

    A survey on generative adversarial networks for imbalance problems in computer vision tasks

    Get PDF
    Any computer vision application development starts off by acquiring images and data, then preprocessing and pattern recognition steps to perform a task. When the acquired images are highly imbalanced and not adequate, the desired task may not be achievable. Unfortunately, the occurrence of imbalance problems in acquired image datasets in certain complex real-world problems such as anomaly detection, emotion recognition, medical image analysis, fraud detection, metallic surface defect detection, disaster prediction, etc., are inevitable. The performance of computer vision algorithms can significantly deteriorate when the training dataset is imbalanced. In recent years, Generative Adversarial Neural Networks (GANs) have gained immense attention by researchers across a variety of application domains due to their capability to model complex real-world image data. It is particularly important that GANs can not only be used to generate synthetic images, but also its fascinating adversarial learning idea showed good potential in restoring balance in imbalanced datasets. In this paper, we examine the most recent developments of GANs based techniques for addressing imbalance problems in image data. The real-world challenges and implementations of synthetic image generation based on GANs are extensively covered in this survey. Our survey first introduces various imbalance problems in computer vision tasks and its existing solutions, and then examines key concepts such as deep generative image models and GANs. After that, we propose a taxonomy to summarize GANs based techniques for addressing imbalance problems in computer vision tasks into three major categories: 1. Image level imbalances in classification, 2. object level imbalances in object detection and 3. pixel level imbalances in segmentation tasks. We elaborate the imbalance problems of each group, and provide GANs based solutions in each group. Readers will understand how GANs based techniques can handle the problem of imbalances and boost performance of the computer vision algorithms

    Digital Interaction and Machine Intelligence

    Get PDF
    This book is open access, which means that you have free and unlimited access. This book presents the Proceedings of the 9th Machine Intelligence and Digital Interaction Conference. Significant progress in the development of artificial intelligence (AI) and its wider use in many interactive products are quickly transforming further areas of our life, which results in the emergence of various new social phenomena. Many countries have been making efforts to understand these phenomena and find answers on how to put the development of artificial intelligence on the right track to support the common good of people and societies. These attempts require interdisciplinary actions, covering not only science disciplines involved in the development of artificial intelligence and human-computer interaction but also close cooperation between researchers and practitioners. For this reason, the main goal of the MIDI conference held on 9-10.12.2021 as a virtual event is to integrate two, until recently, independent fields of research in computer science: broadly understood artificial intelligence and human-technology interaction

    Edge Computing for Internet of Things

    Get PDF
    The Internet-of-Things is becoming an established technology, with devices being deployed in homes, workplaces, and public areas at an increasingly rapid rate. IoT devices are the core technology of smart-homes, smart-cities, intelligent transport systems, and promise to optimise travel, reduce energy usage and improve quality of life. With the IoT prevalence, the problem of how to manage the vast volumes of data, wide variety and type of data generated, and erratic generation patterns is becoming increasingly clear and challenging. This Special Issue focuses on solving this problem through the use of edge computing. Edge computing offers a solution to managing IoT data through the processing of IoT data close to the location where the data is being generated. Edge computing allows computation to be performed locally, thus reducing the volume of data that needs to be transmitted to remote data centres and Cloud storage. It also allows decisions to be made locally without having to wait for Cloud servers to respond

    Advanced Biometrics with Deep Learning

    Get PDF
    Biometrics, such as fingerprint, iris, face, hand print, hand vein, speech and gait recognition, etc., as a means of identity management have become commonplace nowadays for various applications. Biometric systems follow a typical pipeline, that is composed of separate preprocessing, feature extraction and classification. Deep learning as a data-driven representation learning approach has been shown to be a promising alternative to conventional data-agnostic and handcrafted pre-processing and feature extraction for biometric systems. Furthermore, deep learning offers an end-to-end learning paradigm to unify preprocessing, feature extraction, and recognition, based solely on biometric data. This Special Issue has collected 12 high-quality, state-of-the-art research papers that deal with challenging issues in advanced biometric systems based on deep learning. The 12 papers can be divided into 4 categories according to biometric modality; namely, face biometrics, medical electronic signals (EEG and ECG), voice print, and others

    Entropy in Image Analysis II

    Get PDF
    Image analysis is a fundamental task for any application where extracting information from images is required. The analysis requires highly sophisticated numerical and analytical methods, particularly for those applications in medicine, security, and other fields where the results of the processing consist of data of vital importance. This fact is evident from all the articles composing the Special Issue "Entropy in Image Analysis II", in which the authors used widely tested methods to verify their results. In the process of reading the present volume, the reader will appreciate the richness of their methods and applications, in particular for medical imaging and image security, and a remarkable cross-fertilization among the proposed research areas

    Affective Computing for Emotion Detection using Vision and Wearable Sensors

    Get PDF
    The research explores the opportunities, challenges, limitations, and presents advancements in computing that relates to, arises from, or deliberately influences emotions (Picard, 1997). The field is referred to as Affective Computing (AC) and is expected to play a major role in the engineering and development of computationally and cognitively intelligent systems, processors and applications in the future. Today the field of AC is bolstered by the emergence of multiple sources of affective data and is fuelled on by developments under various Internet of Things (IoTs) projects and the fusion potential of multiple sensory affective data streams. The core focus of this thesis involves investigation into whether the sensitivity and specificity (predictive performance) of AC, based on the fusion of multi-sensor data streams, is fit for purpose? Can such AC powered technologies and techniques truly deliver increasingly accurate emotion predictions of subjects in the real world? The thesis begins by presenting a number of research justifications and AC research questions that are used to formulate the original thesis hypothesis and thesis objectives. As part of the research conducted, a detailed state of the art investigations explored many aspects of AC from both a scientific and technological perspective. The complexity of AC as a multi-sensor, multi-modality, data fusion problem unfolded during the state of the art research and this ultimately led to novel thinking and origination in the form of the creation of an AC conceptualised architecture that will act as a practical and theoretical foundation for the engineering of future AC platforms and solutions. The AC conceptual architecture developed as a result of this research, was applied to the engineering of a series of software artifacts that were combined to create a prototypical AC multi-sensor platform known as the Emotion Fusion Server (EFS) to be used in the thesis hypothesis AC experimentation phases of the research. The thesis research used the EFS platform to conduct a detailed series of AC experiments to investigate if the fusion of multiple sensory sources of affective data from sensory devices can significantly increase the accuracy of emotion prediction by computationally intelligent means. The research involved conducting numerous controlled experiments along with the statistical analysis of the performance of sensors for the purposes of AC, the findings of which serve to assess the feasibility of AC in various domains and points to future directions for the AC field. The AC experiments data investigations conducted in relation to the thesis hypothesis used applied statistical methods and techniques, and the results, analytics and evaluations are presented throughout the two thesis research volumes. The thesis concludes by providing a detailed set of formal findings, conclusions and decisions in relation to the overarching research hypothesis on the sensitivity and specificity of the fusion of vision and wearables sensor modalities and offers foresights and guidance into the many problems, challenges and projections for the AC field into the future
    corecore