
    HUMAN GENDER CLASSIFICATION USING KINECT SENSOR: A REVIEW

    Human gender classification using the Kinect sensor aims to classify people’s gender based on their outward appearance. Application areas of Kinect sensor technology include security, marketing, healthcare, and gaming. However, because of changes in pose, attire, and illumination, gender determination with the Kinect sensor is not a trivial task. It draws on a variety of characteristics, including biological, social network, face, and body aspects. In recent years, gender classification using the Kinect sensor has become a popular and effective approach for accurate gender classification. A variety of methods, such as machine learning, convolutional neural networks, and support vector machines (SVM), have been applied to gender classification with a Kinect sensor. This paper presents the state of the art in gender classification, with a focus on the features, databases, procedures, and algorithms used. A review of recent studies on this subject using the Kinect sensor and other technologies is provided, together with information on the variables that affect classification accuracy. In addition, several publicly accessible databases and datasets used by researchers for gender classification are covered. Finally, this overview offers insight into potential future avenues for research on Kinect-based human gender classification.
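    To make the SVM approach the review mentions concrete, here is a minimal sketch of gender classification from Kinect-style skeletal features. The feature layout (20 joints × 3D coordinates) and the data itself are illustrative placeholders, not any dataset from the review.

```python
# Minimal sketch: SVM gender classification on Kinect-style skeletal features.
# The feature matrix and labels below are synthetic placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical features: 200 subjects x 60 values
# (e.g., 20 skeletal joints x 3D coordinates from a Kinect sensor).
X = rng.normal(size=(200, 60))
y = rng.integers(0, 2, size=200)  # 0 = female, 1 = male (placeholder labels)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Standardize features, then fit an RBF-kernel SVM, one of the
# classifier families the review discusses.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_train, y_train)
print(f"Test accuracy: {clf.score(X_test, y_test):.2f}")
```

    On real data, the synthetic matrix would be replaced by joint positions extracted from Kinect recordings, and accuracy would be evaluated with cross-validation rather than a single split.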

    AudioViewer: Learning to Visualize Sounds

    A long-standing goal in the field of sensory substitution is to enable sound perception for deaf and hard of hearing (DHH) people by visualizing audio content. Unlike existing models that translate into sign language, between speech and text, or between text and images, we target immediate, low-level audio-to-video translation that applies to generic environment sounds as well as human speech. Since such a substitution is artificial, with no labels available for supervised learning, our core contribution is to build a mapping from audio to video that learns from unpaired examples via high-level constraints. For speech, we additionally disentangle content from style, such as gender and dialect. Qualitative and quantitative results, including a human study, demonstrate that our unpaired translation approach maintains important audio features in the generated video, and that videos of faces and numbers are well suited for visualizing high-dimensional audio features that humans can parse to match and distinguish sounds and words. Code and models are available at https://chunjinsong.github.io/audioviewe
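    As a rough illustration of learning an audio-to-image mapping from unpaired examples, the sketch below trains two small autoencoders over a shared latent space, using per-domain reconstruction plus a latent-statistics matching term as stand-ins for the paper's high-level constraints. Architectures, dimensions, and the toy data are assumptions for illustration, not the authors' actual model.

```python
# Minimal sketch: unpaired audio-to-image translation via a shared latent
# space. Reconstruction losses in each domain keep the latent informative;
# no paired audio/video supervision is used. All shapes are toy choices.
import torch
import torch.nn as nn

LATENT = 32

audio_enc = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, LATENT))
audio_dec = nn.Sequential(nn.Linear(LATENT, 64), nn.ReLU(), nn.Linear(64, 128))
image_enc = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(),
                          nn.Linear(64, LATENT))
image_dec = nn.Sequential(nn.Linear(LATENT, 64), nn.ReLU(),
                          nn.Linear(64, 28 * 28), nn.Sigmoid())

params = [p for m in (audio_enc, audio_dec, image_enc, image_dec)
          for p in m.parameters()]
opt = torch.optim.Adam(params, lr=1e-3)
mse = nn.MSELoss()

# Unpaired toy batches: spectrogram frames and small grayscale images.
audio = torch.randn(16, 128)          # e.g., 128 mel bins per frame
images = torch.rand(16, 1, 28, 28)    # e.g., face-like crops

for step in range(100):
    opt.zero_grad()
    z_a = audio_enc(audio)
    z_i = image_enc(images)
    # Per-domain reconstruction constraints; matching latent statistics
    # loosely aligns the two domains without paired examples.
    loss = (mse(audio_dec(z_a), audio)
            + mse(image_dec(z_i), images.flatten(1))
            + mse(z_a.mean(0), z_i.mean(0)))
    loss.backward()
    opt.step()

# At inference, a sound is visualized by decoding its audio latent
# with the image decoder.
with torch.no_grad():
    frame = image_dec(audio_enc(audio[:1])).reshape(28, 28)
```

    In this toy setup the statistics-matching term is a crude proxy; the paper's constraints (and its content/style disentanglement for speech) would replace it in a faithful implementation.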