    Diagnosing Rarity in Human-Object Interaction Detection

    Human-object interaction (HOI) detection is a core task in computer vision. The goal is to localize all human-object pairs and recognize their interactions. An interaction defined by a tuple leads to a long-tailed visual recognition challenge, since many combinations are rarely represented. The performance of proposed models is limited, especially for the tail categories, but little has been done to understand why. To that end, in this paper we propose to diagnose rarity in HOI detection. We propose a three-step strategy, namely Detection, Identification and Recognition, in which we carefully analyse the limiting factors by studying state-of-the-art models. Our findings indicate that the detection and identification steps are affected by interaction signals such as occlusion and relative location, which in turn limits recognition accuracy. Comment: Accepted at CVPR'20 Workshop on Learning from Limited Label

    Diagnosing Human-object Interaction Detectors

    Although we have witnessed significant progress in human-object interaction (HOI) detection with increasingly high mAP (mean Average Precision), a single mAP score is too concise to provide an informative summary of a model's performance or to explain why one approach is better than another. In this paper, we introduce a diagnosis toolbox for analyzing the error sources of existing HOI detection models. We first conduct holistic investigations of the HOI detection pipeline, which consists of human-object pair detection followed by interaction classification. We define a set of errors and the oracles to fix each of them. By measuring the mAP improvement obtained from fixing an error using its oracle, we can analyze in detail the significance of different errors. We then delve into human-object detection and interaction classification, respectively, and check the model's behavior. For the first detection task, we investigate both recall and precision, measuring the coverage of ground-truth human-object pairs as well as the noisiness level in the detections. For the second classification task, we compute mAP for interaction classification only, without considering the detection scores. We also measure the performance of the models in differentiating human-object pairs with and without actual interactions using the AP (Average Precision) score. Our toolbox is applicable to different methods across different datasets and is available at https://github.com/neu-vi/Diag-HOI
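
The oracle idea in this abstract (fix one error type, then measure how much AP improves) can be illustrated with a minimal sketch. This is not the Diag-HOI toolbox's code or API. The function names, the error-type labels, and the choice of non-interpolated AP are assumptions made for illustration, and the oracle here simply removes the detections that carry a given error label.

```python
from typing import List, Optional, Tuple

# (confidence score, is true positive, error label or None); labels are hypothetical
Detection = Tuple[float, bool, Optional[str]]


def average_precision(dets: List[Tuple[float, bool]], num_gt: int) -> float:
    """Non-interpolated AP: mean of precision at each true positive, over num_gt."""
    if num_gt == 0:
        return 0.0
    ranked = sorted(dets, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / num_gt


def ap_gain_from_oracle(dets: List[Detection], num_gt: int, error_type: str) -> float:
    """AP improvement when an oracle removes every detection with `error_type`."""
    before = average_precision([(s, tp) for s, tp, _ in dets], num_gt)
    fixed = [(s, tp) for s, tp, err in dets if err != error_type]
    after = average_precision(fixed, num_gt)
    return after - before


if __name__ == "__main__":
    # Toy detections for one interaction class (illustrative values, not real results).
    toy = [
        (0.9, False, "no_interaction"),
        (0.8, True, None),
        (0.7, False, "wrong_object"),
        (0.6, True, None),
    ]
    for err in ("no_interaction", "wrong_object"):
        print(err, round(ap_gain_from_oracle(toy, num_gt=2, error_type=err), 4))
```

Under these assumptions, a larger gain marks an error type that costs the model more AP. Averaging the per-class gains gives an mAP-level figure.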

    Human-Object Interaction Detection: A Quick Survey and Examination of Methods

    Human-object interaction detection is a relatively new task in the world of computer vision and visual semantic information extraction. With the goal of machines identifying interactions that humans perform on objects, there are many real-world use cases for the research in this field. To our knowledge, this is the first general survey of the state-of-the-art and milestone works in this field. We provide a basic survey of the developments in the field of human-object interaction detection. Many works in this field use multi-stream convolutional neural network architectures, which combine features from multiple sources in the input image. Most commonly these are the humans and objects in question, as well as the spatial quality of the two. As far as we are aware, there have not been in-depth studies performed that look into the performance of each component individually. In order to provide insight to future researchers, we perform an individualized study that examines the performance of each component of a multi-stream convolutional neural network architecture for human-object interaction detection. Specifically, we examine the HORCNN architecture as it is a foundational work in the field. In addition, we provide an in-depth look at the HICO-DET dataset, a popular benchmark in the field of human-object interaction detection. Code and papers can be found at https://github.com/SHI-Labs/Human-Object-Interaction-Detection. Comment: Published at The 1st International Workshop On Human-Centric Multimedia Analysis, at ACM Multimedia Conference 202
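
A per-component study of a multi-stream model can be sketched as follows. The abstract does not state how the streams are fused, so this example assumes late fusion by summing per-class logits followed by a sigmoid. The stream names (`human`, `object`, `spatial`) and the function `fuse_streams` are hypothetical and do not come from the paper's code.

```python
import math
from typing import Dict, List, Sequence


def fuse_streams(stream_logits: Dict[str, Sequence[float]],
                 use: Sequence[str]) -> List[float]:
    """Sum per-class logits over the selected streams, then apply a sigmoid.

    Late fusion by summation is an assumption made for this sketch.
    """
    if not use:
        raise ValueError("select at least one stream")
    num_classes = len(stream_logits[use[0]])
    summed = [sum(stream_logits[name][c] for name in use)
              for c in range(num_classes)]
    return [1.0 / (1.0 + math.exp(-z)) for z in summed]


if __name__ == "__main__":
    # Illustrative logits for two interaction classes (not real model outputs).
    logits = {
        "human": [0.0, 1.0],
        "object": [0.0, -1.0],
        "spatial": [0.0, 2.0],
    }
    # Score each stream alone, then all streams together, to see what each adds.
    for subset in (["human"], ["object"], ["spatial"],
                   ["human", "object", "spatial"]):
        print(subset, [round(p, 3) for p in fuse_streams(logits, subset)])
```

Evaluating the same metric on each subset's scores is one simple way to see what each stream contributes.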

    A mobile augmented reality application for supporting real-time skin lesion analysis based on deep learning

    Melanoma is considered the deadliest skin cancer and, once it reaches an advanced state, it is difficult to treat. Diagnoses are performed visually by dermatologists through naked-eye observation. This paper proposes an augmented reality smartphone application for supporting the dermatologist in the real-time analysis of a skin lesion. The app augments the camera view with information related to the lesion features generally measured by the dermatologist when formulating the diagnosis. The lesion is also classified by a deep learning approach for identifying melanoma. The real-time process adopted for generating the augmented content is described. The real-time performance is also evaluated and a user study is conducted. Results revealed that the real-time process may be executed entirely on the smartphone and that the support provided is judged favorably by the target users.

    Spatial Dichotomies: Research into the Development of 3D Representation

    Matthew McGuire, for the Master of Science degree in Professional Media & Media Management, presented on Friday, November 14th 2014, at Southern Illinois University Carbondale. Title: Spatial Dichotomies: Research into the Development of 3D Representation. Chair of Committee: Robert Spahr.
    This research paper covers how 3D technology influences different industries over time. Key points within the research find that over time humans consistently gravitate toward increasing their ability to recreate depth, texture and image. Industry leaders in education, film, medical, television, and the Web will find literature and discussion of how to adjust their modes of production and manufacturing in the future. Each of these industries will be reviewed throughout the report. The data in the report identifies 3D technology in two sections: object and screen-based technology. Each chapter contains social and financial elements that assist and react to new technology in the market. Additionally, to help show the effects on the different industries there are trends and time lines to show how individual industries develop. In each industry, there are multiple ways that 3D technology can be effective and ineffective. The research goes over transitions in early media to compare to transitions in modern mediums of art and science. After covering a short history on the transformation that society has had with the image, the paper covers public demand for new virtual settings in education and entertainment. Image-based research indicates that users will on average continuously desire more depth and interactivity in content.

    Preserved Consciousness in the Absence of a Cerebral Cortex, the Legal and Ethical Implications of Redefining Consciousness and Its Neural Correlates: A Case for a Subcortical System Generating Affective Consciousness

    Historically, the scientific and medical communities have taken a corticocentric view on consciousness, emphasizing the need for a cortex in producing the conscious experience. The preserved consciousness observed in hydranencephalic children and decorticated rats suggests that some form of consciousness may be produced by a subcortical network. The brainstem, a phylogenetically ancient and conserved brain structure, could serve as the major integrative machinery to produce this form of consciousness, which is called affective consciousness—the evolutionary antecedent to the reflective consciousness that allows humans to reflect on their experiences. The functional convergence of the brainstem with the amygdala, motor system, and other subcortical structures provides the necessary architecture to support an affective state of consciousness by which instinctual-emotional goal-directed behavior is produced. This subcortical system operates by what Merker (2007) calls the selection triangle—an interface between bodily actions (action selection), the world (target selection), and personal motivation—to produce action through integration. By this model, it is possible that consciousness may persist in the absence of a cortex, such as in the persistent vegetative state. Because of this, it is necessary to establish that multiple forms of consciousness exist and to distinguish between affective and reflective consciousness, because such a distinction would have tremendous ethical implications in the conventional medical treatment of those with disorders of consciousness

    Symptom profiles and rates of diagnosis in autistic and other atypically developing infants and toddlers

    Although there are many assessment scales that aid in the diagnosis of Autism Spectrum Disorders (ASD), very few instruments are designed specifically to identify the condition in infants and toddlers. The primary purpose of this study is to systematically examine the differences between scores on the Baby and Infant Screen for Children with aUtIsm Traits (BISCUIT)-Part 1 in a sample of at-risk, atypically developing children. Participants are children enrolled in Louisiana’s EarlySteps Program, which provides support services (e.g., speech therapy, occupational therapy, physical therapy, behavior psychology) to infants/toddlers and their families from birth to 36 months of age. All children enrolled in EarlySteps have a medical condition that is likely to result in a developmental delay/atypical development (e.g., premature birth, seizure disorders, Down Syndrome), or are currently diagnosed with developmental delays. Using one-way analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA), three groups (ASD alone, seizure disorder and ASD, and premature and ASD) were compared on BISCUIT-Part 1 total and subscale scores. Implications of these results and directions for future research are discussed.