
    Negative Results in Computer Vision: A Perspective

    A negative result is when the outcome of an experiment or a model is not what is expected or when a hypothesis does not hold. Although often overlooked in the scientific community, negative results are results and they carry value. While this topic has been extensively discussed in other fields such as the social sciences and biosciences, less attention has been paid to it in the computer vision community. The unique characteristics of computer vision, particularly its experimental aspect, call for a special treatment of this matter. In this paper, I will address what makes negative results important, how they should be disseminated and incentivized, and what lessons can be learned from cognitive vision research in this regard. Further, I will discuss issues such as the interaction between computer vision and human vision, experimental design and statistical hypothesis testing, explanatory versus predictive modeling, performance evaluation, model comparison, as well as computer vision research culture.

    An Empirical Study Comparing Unobtrusive Physiological Sensors for Stress Detection in Computer Work.

    Several unobtrusive sensors have been tested in studies to capture physiological reactions to stress in workplace settings. Lab studies tend to focus on assessing sensors during a specific computer task, while in situ studies tend to offer a generalized view of sensors' efficacy for workplace stress monitoring, without discriminating between different tasks. Given the variation in workplace computer activities, this study investigates the efficacy of unobtrusive sensors for stress measurement across a variety of tasks. We present a comparison of five physiological measurements obtained in a lab experiment, where participants completed six different computer tasks, while we measured their stress levels using a chest-band (ECG, respiration), a wristband (PPG and EDA), and an emerging thermal imaging method (perinasal perspiration). We found that thermal imaging can detect increased stress for most participants across all tasks, while wrist and chest sensors were less generalizable across tasks and participants. We summarize the costs and benefits of each sensor stream, and show how some computer use scenarios present usability and reliability challenges for stress monitoring with certain physiological sensors. We provide recommendations for researchers and system builders for measuring stress with physiological sensors during workplace computer use.

    Does comorbid anxiety counteract emotion recognition deficits in conduct disorder?

    Background: Previous research has reported altered emotion recognition in both conduct disorder (CD) and anxiety disorders (ADs), but these effects appear to be of different kinds. Adolescents with CD often show a generalised pattern of deficits, while those with ADs show hypersensitivity to specific negative emotions. Although these conditions often co-occur, little is known regarding emotion recognition performance in comorbid CD+ADs. Here, we test the hypothesis that in the comorbid case, anxiety-related emotion hypersensitivity counteracts the emotion recognition deficits typically observed in CD. Method: We compared facial emotion recognition across four groups of adolescents aged 12-18 years: those with CD alone (n = 28), ADs alone (n = 23), co-occurring CD+ADs (n = 20) and typically developing controls (n = 28). The emotion recognition task we used systematically manipulated the emotional intensity of facial expressions as well as fixation location (eye, nose or mouth region). Results: Conduct disorder was associated with a generalised impairment in emotion recognition; however, this may have been modulated by group differences in IQ. AD was associated with increased sensitivity to low-intensity happiness, disgust and sadness. In general, the comorbid CD+ADs group performed similarly to typically developing controls. Conclusions: Although CD alone was associated with emotion recognition impairments, ADs and comorbid CD+ADs were associated with normal or enhanced emotion recognition performance. The presence of comorbid ADs appeared to counteract the effects of CD, suggesting a potentially protective role, although future research should examine the contribution of IQ and gender to these effects.

    What is Holding Back Convnets for Detection?

    Convolutional neural networks have recently shown excellent results in general object detection and many other tasks. Albeit very effective, they involve many user-defined design choices. In this paper we want to better understand these choices by inspecting two key aspects: "what did the network learn?" and "what can the network learn?". We exploit new annotations (Pascal3D+) to enable a new empirical analysis of the R-CNN detector. Contrary to common belief, our results indicate that existing state-of-the-art convnet architectures are not invariant to various appearance factors. In fact, all considered networks have similar weak points which cannot be mitigated by simply increasing the training data (architectural changes are needed). We show that overall performance can improve when using image renderings for data augmentation. We report the best known results on the Pascal3D+ detection and view-point estimation tasks.

    Infrared face recognition: a comprehensive review of methodologies and databases

    Automatic face recognition is an area with immense practical potential which includes a wide range of commercial and law enforcement applications. Hence it is unsurprising that it continues to be one of the most active research areas of computer vision. Even after over three decades of intense research, the state-of-the-art in face recognition continues to improve, benefitting from advances in a range of different research fields such as image processing, pattern recognition, computer graphics, and physiology. Systems based on visible spectrum images, the most researched face recognition modality, have reached a significant level of maturity with some practical success. However, they continue to face challenges in the presence of illumination, pose and expression changes, as well as facial disguises, all of which can significantly decrease recognition accuracy. Amongst various approaches which have been proposed in an attempt to overcome these limitations, the use of infrared (IR) imaging has emerged as a particularly promising research direction. This paper presents a comprehensive and timely review of the literature on this subject. Our key contributions are: (i) a summary of the inherent properties of infrared imaging which make this modality promising in the context of face recognition, (ii) a systematic review of the most influential approaches, with a focus on emerging common trends as well as key differences between alternative methodologies, (iii) a description of the main databases of infrared facial images available to the researcher, and lastly (iv) a discussion of the most promising avenues for future research. Comment: Pattern Recognition, 2014. arXiv admin note: substantial text overlap with arXiv:1306.160

    Scene Graph Generation via Conditional Random Fields

    Despite the great success object detection and segmentation models have achieved in recognizing individual objects in images, performance on cognitive tasks such as image captioning, semantic image retrieval, and visual QA is far from satisfactory. To achieve better performance on these cognitive tasks, merely recognizing individual object instances is insufficient. Instead, the interactions between object instances need to be captured in order to facilitate reasoning and understanding of the visual scenes in an image. A scene graph, a graph representation of an image that captures object instances and their relationships, offers a comprehensive understanding of an image. However, existing techniques for scene graph generation fail to distinguish subjects and objects in the visual scenes of images and thus do not perform well on real-world datasets where ambiguous object instances exist. In this work, we propose a novel scene graph generation model for predicting object instances and their corresponding relationships in an image. Our model, SG-CRF, efficiently learns the sequential order of subject and object in a relationship triplet, and the semantic compatibility of object instance nodes and relationship nodes in a scene graph. Experiments empirically show that SG-CRF outperforms the state-of-the-art methods on three different datasets, i.e., CLEVR, VRD, and Visual Genome, raising the Recall@100 from 24.99% to 49.95%, from 41.92% to 50.47%, and from 54.69% to 54.77%, respectively.
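
    The Recall@100 figures quoted above follow the usual Recall@K idea: of all ground-truth relationship triplets in an image, what fraction appears among the K highest-ranked predictions. The sketch below is only an illustration under simplifying assumptions, not the evaluation code used for SG-CRF: the function name `recall_at_k`, the label-only matching (published benchmarks typically also require bounding-box overlap), and the toy data are all hypothetical.

```python
from typing import List, Set, Tuple

# A relationship triplet: (subject, predicate, object). Order matters, so
# ("man", "riding", "horse") and ("horse", "riding", "man") are different.
Triplet = Tuple[str, str, str]


def recall_at_k(ranked_predictions: List[Triplet],
                ground_truth: Set[Triplet],
                k: int = 100) -> float:
    """Fraction of ground-truth triplets found among the top-k predictions.

    Assumption: predictions are already sorted by descending confidence and
    a match means exact equality of the three labels.
    """
    if not ground_truth:
        return 0.0
    top_k = set(ranked_predictions[:k])
    return len(top_k & ground_truth) / len(ground_truth)


# Toy data, invented purely for illustration.
predictions = [
    ("man", "riding", "horse"),
    ("horse", "riding", "man"),  # subject and object swapped: not a match
    ("man", "wearing", "hat"),
]
truth = {
    ("man", "riding", "horse"),
    ("man", "wearing", "hat"),
    ("horse", "on", "grass"),
}

print(recall_at_k(predictions, truth, k=2))
print(recall_at_k(predictions, truth, k=3))
```

    With k=2 only one of the three ground-truth triplets is recovered; the swapped triplet does not count, which is why distinguishing subject from object affects this metric.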

    Fixation prediction with a combined model of bottom-up saliency and vanishing point

    By predicting where humans look in natural scenes, we can understand how they perceive complex natural scenes and prioritize information for further high-level visual processing. Several models have been proposed for this purpose, yet there is a gap between the best existing saliency models and human performance. While many researchers have developed purely computational models for fixation prediction, fewer attempts have been made to discover cognitive factors that guide gaze. Here, we study the effect of a particular type of scene structural information, known as the vanishing point, and show that human gaze is attracted to the vanishing point regions. We record eye movements of 10 observers over 532 images, out of which 319 have vanishing points. We then construct a combined model of traditional saliency and a vanishing point channel and show that our model outperforms state-of-the-art saliency models using three scores on our dataset. Comment: arXiv admin note: text overlap with arXiv:1512.0172