11 research outputs found

    Improving Performance of Object Detection using the Mechanisms of Visual Recognition in Humans

    Object recognition systems are usually trained and evaluated on high-resolution images. In real-world applications, however, images often have low resolution or small size. In this study, we first track the performance of a state-of-the-art deep object recognition network, Faster R-CNN, as a function of image resolution. The results reveal the negative effect of low-resolution images on recognition performance. They also show that different spatial frequencies convey different information about the objects during recognition, which suggests that a multi-resolution recognition system can provide better insight into the optimal selection of features and thus better recognition of objects. This is similar to the human visual system, which is able to maintain a multi-scale representation of a visual scene simultaneously. We therefore propose a multi-resolution object recognition framework rather than a single-resolution network. The proposed framework is evaluated on the PASCAL VOC2007 database. The experimental results show that our adapted multi-resolution Faster R-CNN framework outperforms the single-resolution Faster R-CNN on input images of various resolutions, with an increase in mean Average Precision (mAP) of 9.14% across all resolutions and 1.2% on full-spectrum images. Furthermore, the proposed model remains robust over a wide range of spatial frequencies.
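
    The abstract describes measuring detection mAP as a function of input resolution. Below is a minimal sketch of that kind of experiment, not the authors' code: it assumes a pretrained torchvision Faster R-CNN as a stand-in for the paper's network, a torchmetrics mAP metric, and a dataset that yields (image, target) pairs in torchvision detection format.

```python
# Illustrative sketch (assumed setup, not the paper's implementation):
# track detection mAP while downscaling the input images.
import torch
import torchvision.transforms.functional as TF
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchmetrics.detection import MeanAveragePrecision

model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

def map_at_scale(dataset, scale):
    """Downscale every image by `scale` and score predictions against the original boxes."""
    metric = MeanAveragePrecision()
    with torch.no_grad():
        for image, target in dataset:          # image: CHW float tensor in [0, 1]
            h, w = image.shape[-2:]
            small = TF.resize(image, [int(h * scale), int(w * scale)])
            pred = model([small])[0]
            pred["boxes"] /= scale             # map boxes back to original coordinates
            metric.update([pred], [target])
    return metric.compute()["map"].item()

# e.g. compare full resolution with lower-resolution inputs:
# for s in (1.0, 0.5, 0.25): print(s, map_at_scale(voc_val, s))
```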

    Technology assessment of advanced automation for space missions

    Six general classes of technology requirements derived during the mission definition phase of the study were identified as having maximum importance and urgency: autonomous, world-model-based information systems; learning and hypothesis formation; natural language and other man-machine communication; space manufacturing; teleoperators and robot systems; and computer science and technology.

    Meaning maps and saliency models based on deep convolutional neural networks are insensitive to image meaning when predicting human fixations

    Eye movements are vital for human vision, and it is therefore important to understand how observers decide where to look. Meaning maps (MMs), a technique for capturing the distribution of semantic information across an image, have recently been proposed in support of the hypothesis that meaning, rather than image features, guides human gaze. MMs have the potential to be an important tool far beyond eye-movement research. Here, we examine central assumptions underlying MMs. First, we compared the performance of MMs in predicting fixations to that of saliency models, showing that DeepGaze II – a deep neural network trained to predict fixations based on high-level features rather than meaning – outperforms MMs. Second, we show that whereas human observers respond to changes in meaning induced by manipulating object-context relationships, MMs and DeepGaze II do not. Together, these findings challenge central assumptions underlying the use of MMs to measure the distribution of meaning in images.
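
    Comparing how well a meaning map or a saliency model predicts fixations requires a fixation-prediction score. The sketch below is an assumed illustration (not the paper's evaluation pipeline) using Normalized Scanpath Saliency, one standard metric for this purpose.

```python
# Minimal sketch: score a candidate map (meaning map or saliency map) against
# human fixation locations with Normalized Scanpath Saliency (NSS).
import numpy as np

def nss(candidate_map, fixations):
    """candidate_map: 2-D array; fixations: iterable of (row, col) pixel coordinates."""
    s = (candidate_map - candidate_map.mean()) / (candidate_map.std() + 1e-8)
    return float(np.mean([s[r, c] for r, c in fixations]))

# A higher NSS means the map assigns more (standardized) weight to fixated pixels,
# so two maps (e.g. an MM vs. a DeepGaze II map) can be compared on the same fixation data.
```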

    Fractal methods in image analysis and coding

    In this thesis we present an overview of image processing techniques which use fractal methods in some way. We show how these fields relate to each other, and examine various aspects of fractal methods in each area. The three principal fields of image processing and analysis that we examine are texture classification, image segmentation and image coding. In the area of texture classification, we examine fractal dimension estimators, comparing these methods to other methods in use, and to each other. We attempt to explain why differences arise between various estimators of the same quantity. We also examine texture generation methods which use fractal dimension to generate textures of varying complexity. We examine how fractal dimension can contribute to image segmentation methods. We also present an in-depth analysis of a novel segmentation scheme based on fractal coding. Finally, we present an overview of fractal and wavelet image coding, and the links between the two. We examine a possible scheme involving both fractal and wavelet methods.
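
    As a concrete illustration of the kind of fractal dimension estimator discussed above, here is a basic box-counting sketch for a binary image. It is an assumed, generic implementation (not taken from the thesis) and presumes the pattern occupies at least one box at every scale.

```python
# Illustrative sketch: box-counting estimate of fractal dimension for a binary image.
import numpy as np

def box_counting_dimension(img, sizes=(2, 4, 8, 16, 32)):
    """img: 2-D boolean array; returns the estimated dimension from log N(s) vs. log s."""
    counts = []
    for s in sizes:
        h, w = (img.shape[0] // s) * s, (img.shape[1] // s) * s
        blocks = img[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())   # boxes containing any set pixel
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope   # N(s) ~ s^(-D), so the dimension is the negated slope

# e.g. a filled square region yields an estimate near 2; a thin line, near 1.
```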

    Advanced Automation for Space Missions

    The feasibility of using machine intelligence, including automation and robotics, in future space missions was studied.

    Investigating the role of image meaning and prior knowledge in human eye movements control

    Humans sample visual information by making eye movements towards different parts of their surroundings. Understanding what guides this sampling process is an important goal of vision science, and the present thesis is a contribution to this endeavour. Chapter One provides an overview of factors influencing human eye movements, which are typically divided into bottom-up (stimulus-dependent) and top-down (observer-dependent) processes. One of the challenges in studying these factors stems from the fact that they are often difficult to operationalize in a precise, unambiguous way. This is particularly problematic for the semantic information contained in visual scenes (‘image meaning’), a top-down factor which is the backbone of a recently proposed framework for understanding human eye movements: the meaning maps approach. Chapter Two evaluates this approach and demonstrates that meaning maps – a crowd-sourced method designed to quantify the distribution of meaning in natural scenes – might be sensitive to complex visual features rather than meaning. Chapter Three builds on that finding and shows that contextualized meaning maps, the most recent variant of the original meaning maps, share the limitations of their predecessors. Chapter Four adopts a novel perspective on eye-movement control and focuses on the interactions between image features (a bottom-up factor) and the prior object-knowledge possessed by an observer (a top-down factor). Specifically, it shows that the same stimuli – black-and-white, Mooney-style two-tone images – are looked at differently depending on whether the observer possesses object-knowledge that enables them to bind the images into coherent percepts of objects. The final chapter summarizes the thesis and maps out future directions for studies on eye movements. Taken together, the findings reported here indicate that while top-down factors such as prior object-knowledge play a crucial role in guiding human gaze, the tools offered by the meaning maps approach to study them still need to be improved.