1,602 research outputs found

Diagnosing Rarity in Human-Object Interaction Detection

Author: Kilickaya Mert
Smeulders Arnold
Publication date: 01/01/2020

Human-object interaction (HOI) detection is a core task in computer vision. The goal is to localize all human-object pairs and recognize their interactions. An interaction defined by a tuple leads to a long-tailed visual recognition challenge since many combinations are rarely represented. The performance of the proposed models is limited especially for the tail categories, but little has been done to understand the reason. To that end, in this paper, we propose to diagnose rarity in HOI detection. We propose a three-step strategy, namely Detection, Identification and Recognition where we carefully analyse the limiting factors by studying state-of-the-art models. Our findings indicate that detection and identification steps are altered by the interaction signals like occlusion and relative location, as a result limiting the recognition accuracy. Comment: Accepted at CVPR'20 Workshop on Learning from Limited Label

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

Diagnosing Human-object Interaction Detectors

Author: Jiang Huaizu
Xie Weidi
Xie Yiming
Zhu Fangrui
Publication date: 16/08/2023

Although we have witnessed significant progress in human-object interaction (HOI) detection with increasingly high mAP (mean Average Precision), a single mAP score is too concise to obtain an informative summary of a model's performance and to understand why one approach is better than another. In this paper, we introduce a diagnosis toolbox for analyzing the error sources of the existing HOI detection models. We first conduct holistic investigations in the pipeline of HOI detection, consisting of human-object pair detection and then interaction classification. We define a set of errors and the oracles to fix each of them. By measuring the mAP improvement obtained from fixing an error using its oracle, we can have a detailed analysis of the significance of different errors. We then delve into the human-object detection and interaction classification, respectively, and check the model's behavior. For the first detection task, we investigate both recall and precision, measuring the coverage of ground-truth human-object pairs as well as the noisiness level in the detections. For the second classification task, we compute mAP for interaction classification only, without considering the detection scores. We also measure the performance of the models in differentiating human-object pairs with and without actual interactions using the AP (Average Precision) score. Our toolbox is applicable for different methods across different datasets and available at https://github.com/neu-vi/Diag-HOI

arXiv.org e-Print Archive

Human-Object Interaction Detection: A Quick Survey and Examination of Methods

Author: Wei Yunchao
Yu Yu
Zheng Sipeng
Zhuang Bohan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/09/2020

Human-object interaction detection is a relatively new task in the world of computer vision and visual semantic information extraction. With the goal of machines identifying interactions that humans perform on objects, there are many real-world use cases for the research in this field. To our knowledge, this is the first general survey of the state-of-the-art and milestone works in this field. We provide a basic survey of the developments in the field of human-object interaction detection. Many works in this field use multi-stream convolutional neural network architectures, which combine features from multiple sources in the input image. Most commonly these are the humans and objects in question, as well as the spatial quality of the two. As far as we are aware, there have not been in-depth studies performed that look into the performance of each component individually. In order to provide insight to future researchers, we perform an individualized study that examines the performance of each component of a multi-stream convolutional neural network architecture for human-object interaction detection. Specifically, we examine the HORCNN architecture as it is a foundational work in the field. In addition, we provide an in-depth look at the HICO-DET dataset, a popular benchmark in the field of human-object interaction detection. Code and papers can be found at https://github.com/SHI-Labs/Human-Object-Interaction-Detection. Comment: Published at The 1st International Workshop On Human-Centric Multimedia Analysis, at ACM Multimedia Conference 202

arXiv.org e-Print Archive

Crossref

Cancer is an adaptation that selects in animals against energy dissipation

Author: Muller A.W.J.
Publication venue: 'Elsevier BV'
Publication date: 01/07/2017

International Migration, Integration and Social Cohesion online publications

Olfaction-enhanced multimedia: Perspectives and challenges

Author: Ademoye OA
Ghinea G
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 06/08/2010

This is the post-print version of the Article. The official published version can be accessed from the link below - Copyright @ 2011 Springer Verlag. Olfaction, or smell, is one of the last challenges which multimedia and multimodal applications have to conquer. Enhancing such applications with olfactory stimuli has the potential to create a more complex, and richer, user multimedia experience, by heightening the sense of reality and diversifying user interaction modalities. Nonetheless, olfaction-enhanced multimedia still remains a challenging research area. More recently, however, there have been initial signs of olfactory-enhanced applications in multimedia, with olfaction being used towards a variety of goals, including notification alerts, enhancing the sense of reality in immersive applications, and branding, to name but a few. However, as the goal of a multimedia application is to inform and/or entertain users, achieving quality olfaction-enhanced multimedia applications from the users’ perspective is vital to the success and continuity of these applications. Accordingly, in this paper we have focused on investigating the user perceived experience of olfaction-enhanced multimedia applications, with the aim of discovering the quality evaluation factors that are important from a user’s perspective of these applications, and consequently ensure the continued advancement and success of olfaction-enhanced multimedia applications

Brunel University Research Archive

A mobile augmented reality application for supporting real-time skin lesion analysis based on deep learning

Author: Genoveffa Tortora
Maria Frasca
Michele Risi
Rita Francese
Publication date: 03/05/2021

Melanoma is considered the deadliest skin cancer and when it is in an advanced state it is difficult to treat. Diagnoses are visually performed by dermatologists, by naked-eye observation. This paper proposes an augmented reality smartphone application for supporting the dermatologist in the real-time analysis of a skin lesion. The app augments the camera view with information related to the lesion features generally measured by the dermatologist for formulating the diagnosis. The lesion is also classified by a deep learning approach for identifying melanoma. The real-time process adopted for generating the augmented content is described. The real-time performances are also evaluated and a user study is also conducted. Results revealed that the real-time process may be entirely executed on the Smartphone and that the support provided is well judged by the target users

Open Access Repository

Spatial Dichotomies: Research into the Development of 3D Representation

Author: McGuire Matthew G
Publication venue: OpenSIUC
Publication date: 30/01/2015

Abstract of Matthew McGuire, for the Masters of Science degree in Professional Media & Media Management, presented on Friday, November 14th 2014, at Southern Illinois University Carbondale. Title: Spatial Dichotomies: Research into the Development of 3D Representation. Chair of Committee: Robert Spahr. This research paper covers how 3D technology influences different industries over time. Key points within the research find that over time humans consistently gravitate toward increasing their ability to recreate depth, texture and image. Industry leaders in education, film, medical, television, and the Web will find literature and discussion of how to adjust their modes of production and manufacturing in the future. Each of these industries will be reviewed throughout the report. The data in the report identifies 3D technology in two sections: object and screen-based technology. Each chapter contains social and financial elements that assist and react to new technology in the market. Additionally, to help show the effects on the different industries there are trends and time lines to show how individual industries develop. In each industry, there are multiple ways that 3D technology can be effective and ineffective. The research goes over transitions in early media to compare to transitions in modern mediums of art and science. After covering a short history on the transformation that society has had with the image, the paper covers public demand for new virtual settings in education and entertainment. Image-based research indicates that users will on average continuously desire more depth and interactivity in content

OpenSIUC

Preserved Consciousness in the Absence of a Cerebral Cortex, the Legal and Ethical Implications of Redefining Consciousness and Its Neural Correlates: A Case for a Subcortical System Generating Affective Consciousness

Author: Kawkabani Shadia
Publication venue: IdeaExchange@UAkron
Publication date: 01/01/2018

Historically, the scientific and medical communities have taken a corticocentric view on consciousness, emphasizing the need for a cortex in producing the conscious experience. The preserved consciousness observed in hydranencephalic children and decorticated rats suggests that some form of consciousness may be produced by a subcortical network. The brainstem, a phylogenetically ancient and conserved brain structure, could serve as the major integrative machinery to produce this form of consciousness, which is called affective consciousness—the evolutionary antecedent to the reflective consciousness that allows humans to reflect on their experiences. The functional convergence of the brainstem with the amygdala, motor system, and other subcortical structures provides the necessary architecture to support an affective state of consciousness by which instinctual-emotional goal-directed behavior is produced. This subcortical system operates by what Merker (2007) calls the selection triangle—an interface between bodily actions (action selection), the world (target selection), and personal motivation—to produce action through integration. By this model, it is possible that consciousness may persist in the absence of a cortex, such as in the persistent vegetative state. Because of this, it is necessary to establish that multiple forms of consciousness exist and to distinguish between affective and reflective consciousness, because such a distinction would have tremendous ethical implications in the conventional medical treatment of those with disorders of consciousness

The University of Akron

Symptom profiles and rates of diagnosis in autistic and other atypically developing infants and toddlers

Author: Dempsey Timothy Michael
Publication venue: LSU Digital Commons
Publication date: 01/01/2011

Although there are many assessment scales that aid in the diagnosis of Autism Spectrum Disorders (ASD), very few instruments are designed specifically to identify the condition in the population of infants and toddlers. The primary purpose of this study is to systematically examine the differences between scores on the Baby and Infant Screen for Children with aUtIsm Traits (BISCUIT)-Part 1 in a sample of at risk atypically developing children. Participants are children enrolled in Louisiana’s EarlySteps Program, which provides support services (e.g., speech therapy, occupational therapy, physical therapy, behavior psychology) to infants/toddlers and their families from birth to 36 months of age. All children enrolled in EarlySteps have a medical condition that is likely to result in a developmental delay/atypical development (e.g., premature birth, seizure disorders, Down Syndrome), or are currently diagnosed with developmental delays. Using one-way analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA), three groups (ASD alone, seizure disorder and ASD, and premature and ASD) were compared on BISCUIT-Part 1 total and subscale scores. Implications of these results and directions for future research are discussed

Louisiana State University