8 research outputs found

    Facial landmark localization in depth images using supervised ridge descent

    Berk Gökberk (MEF Author)
    Supervised Descent Method (SDM) has proven successful in many computer vision applications such as face alignment, tracking, and camera calibration. Recent studies that used SDM achieved state-of-the-art performance on facial landmark localization in depth images [4]. In this study, we propose to use ridge regression instead of least squares regression for learning the SDM, and to change feature sizes in each iteration, effectively turning the landmark search into a coarse-to-fine process. We apply the proposed method to facial landmark localization on the Bosphorus 3D Face Database, using frontal depth images with no occlusion. Experimental results confirm that both ridge regression and adaptive feature sizes improve the localization accuracy considerably.
    WOS:000380434700048; Scopus Affiliation ID: 60105072; Conference Proceedings Citation Index - Science; Proceedings Paper; December 2015; YÖK - 2015-1
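The difference between the two regressors named in this abstract can be illustrated with a deliberately reduced sketch. It assumes a single scalar feature and a single scalar landmark displacement, whereas the actual SDM learns a matrix mapping from feature vectors to landmark updates; the function name `fit_descent_step`, the data values, and the penalty `lam` are all hypothetical.

```python
def fit_descent_step(features, displacements, lam=0.0):
    """Closed-form 1-D regression without intercept.

    Minimises sum((d - r * f) ** 2) + lam * r ** 2 over the scalar r.
    lam = 0.0 gives ordinary least squares; lam > 0 gives ridge regression.
    """
    numerator = sum(f * d for f, d in zip(features, displacements))
    denominator = sum(f * f for f in features) + lam
    return numerator / denominator


# Toy training pairs (assumed values, not taken from the paper).
features = [0.5, -1.0, 2.0, 1.5]
displacements = [1.1, -1.9, 4.2, 2.8]

r_ls = fit_descent_step(features, displacements, lam=0.0)
r_ridge = fit_descent_step(features, displacements, lam=1.0)

# The ridge penalty shrinks the learned step toward zero.
print(r_ls, r_ridge)
```

In this toy setting the penalty only shrinks the step size; in the multi-dimensional case it also stabilises the solution when features are correlated or training samples are few.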

    Point Cloud Segmentation Using Transfer Learning with RandLA-Net: A Case Study on Urban Areas

    Urban environments are characterized by complex structures and diverse features, making accurate segmentation of point cloud data a challenging task. This paper presents a comprehensive study on the application of RandLA-Net, a state-of-the-art neural network architecture, for the 3D segmentation of large-scale point cloud data in urban areas. The study focuses on three major Chinese cities, namely Chengdu, Jiaoda, and Shenzhen, leveraging their unique characteristics to enhance segmentation performance. To address the limited availability of labeled data for these specific urban areas, we employed transfer learning techniques. We transferred the learned weights from the SensatUrban and Toronto-3D datasets to initialize our RandLA-Net model. Additionally, we performed class remapping to adapt the model to the target urban areas, ensuring accurate segmentation results. The experimental results demonstrate the effectiveness of the proposed approach, which achieves an F1 score of over 80% for each area in 3D point cloud segmentation. The transfer learning strategy proves to be crucial in overcoming data scarcity issues, providing a robust solution for urban point cloud analysis. The findings contribute to the advancement of point cloud segmentation methods, especially in the context of rapidly evolving Chinese urban areas.
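The class remapping step mentioned in this abstract is not specified in detail. A minimal sketch, assuming it amounts to a lookup from source-dataset classes to target classes, might look as follows; every class name in `REMAP` and the `unknown` fallback are hypothetical and not taken from the paper.

```python
# Hypothetical mapping from source-dataset classes to target classes.
REMAP = {
    "ground": "ground",
    "vegetation": "vegetation",
    "building": "building",
    "wall": "building",   # merged into a coarser target class
    "car": "vehicle",
    "truck": "vehicle",
}


def remap_labels(labels, mapping, unknown="unlabeled"):
    """Translate per-point labels; classes without a target become `unknown`."""
    return [mapping.get(label, unknown) for label in labels]


points = ["car", "wall", "bike", "ground"]
print(remap_labels(points, REMAP))
```

Points whose source class has no counterpart are marked rather than dropped, so they can be excluded from the loss during fine-tuning.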

    Semi-Supervised Domain Adaptation for Semantic Segmentation of Roads from Satellite Images

    This paper presents the preliminary findings of a semi-supervised segmentation method for extracting roads from satellite images. Artificial neural networks and image segmentation methods are among the most successful methods for extracting road data from satellite images. However, these models require large amounts of training data from different regions to achieve high accuracy. Where such data is insufficient in quantity or quality, it is standard practice to train deep neural networks by transferring knowledge from annotated data obtained from other sources. This study proposes a method that performs road segmentation with semi-supervised learning. A semi-supervised domain adaptation method based on pseudo-labeling and the Minimum Class Confusion method is proposed, and it is observed to improve performance on the target datasets.
    Comment: in Turkish language
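The pseudo-labeling component of this abstract can be sketched under a common assumption: an unlabeled pixel receives the model's predicted class only when the prediction is confident enough. The function name, the `threshold` of 0.9, and the `ignore_index` of -1 are hypothetical choices; the Minimum Class Confusion term is not shown.

```python
def pseudo_label(probabilities, threshold=0.9, ignore_index=-1):
    """Assign each pixel its most probable class if confidence >= threshold.

    `probabilities` is a list of per-pixel class-probability lists.
    Low-confidence pixels get `ignore_index` so a loss can skip them.
    """
    labels = []
    for probs in probabilities:
        best = max(range(len(probs)), key=probs.__getitem__)
        labels.append(best if probs[best] >= threshold else ignore_index)
    return labels


# Three pixels, two classes (background, road); values are illustrative.
predictions = [[0.95, 0.05], [0.60, 0.40], [0.08, 0.92]]
print(pseudo_label(predictions))
```

Raising the threshold yields fewer but cleaner pseudo-labels, which is the usual trade-off when adapting to a new region.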

    Building Segmentation on Satellite Images and Performance of Post-Processing Methods

    Researchers are working intensively on satellite images because of the information they contain, the development of computer vision algorithms, and the ease of access to satellite imagery. Building segmentation of satellite images can be used for many potential applications such as city, agricultural, and communication network planning. However, since no dataset exists for every region, a model trained on one region must generalize to others. In this study, we trained several models on imagery from China and applied post-processing to the best model selected among them. These models are evaluated on the Chicago region of the INRIA dataset. Although state-of-the-art results in this area have not been achieved, the results are promising. We aim to present our initial experimental results on building segmentation from satellite images in this study.
    Comment: in Turkish language

    Multi-domain and multi-task prediction of extraversion and leadership from meeting videos

    Automatic prediction of personalities from meeting videos is a classical machine learning problem. Psychologists define personality traits as uncorrelated long-term characteristics of human beings. However, human annotations of personality traits introduce cultural and cognitive bias. In this study, we present methods to automatically predict emergent leadership and personality traits in the group meeting videos of the Emergent LEAdership corpus. Prediction of extraversion has attracted the attention of psychologists as it is able to explain a wide range of behaviors, predict performance, and assess risk. Prediction of emergent leadership, on the other hand, is of great importance for the business community. Therefore, we focus on the prediction of extraversion and leadership since these traits are also strongly manifested in a meeting scenario through the extracted features. We use feature analysis and multi-task learning methods in conjunction with the non-verbal features and crowd-sourced annotations from the Video bLOG (VLOG) corpus to perform a multi-domain and multi-task prediction of personality traits. Our results indicate that multi-task learning methods using 10 personality annotations as tasks and with a transfer from two different datasets from different domains improve the overall recognition performance. Preventing negative transfer by using a forward task selection scheme yields the best recognition results with 74.5% accuracy in leadership and 81.3% accuracy in extraversion traits. These results demonstrate the presence of annotation bias as well as the benefit of transferring information from weakly similar domains.
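The abstract does not describe the forward task selection scheme in detail. One plausible reading is a greedy procedure that adds auxiliary tasks one at a time and stops once no addition improves validation performance. The sketch below follows that assumption; `forward_task_selection`, the `score` callback, and the toy gains are hypothetical.

```python
def forward_task_selection(candidates, score):
    """Greedily add the auxiliary task that most improves `score`.

    `score(tasks)` is assumed to return validation accuracy of the main
    task when trained jointly with the given auxiliary tasks.
    """
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        trials = [(score(selected + [task]), task) for task in remaining]
        top_score, top_task = max(trials, key=lambda pair: pair[0])
        if top_score <= best:
            break  # every remaining task would hurt or not help
        selected.append(top_task)
        remaining.remove(top_task)
        best = top_score
    return selected, best


# Toy scoring function with assumed per-task gains (task "c" transfers negatively).
GAINS = {"a": 0.05, "b": 0.02, "c": -0.03}


def toy_score(tasks):
    return 0.70 + sum(GAINS[t] for t in tasks)


chosen, accuracy = forward_task_selection(["a", "b", "c"], toy_score)
print(chosen, accuracy)
```

Stopping at the first non-improving step is what keeps the negatively transferring task out of the final set in this toy example.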

    Comparison of mobile device navigation information display alternatives from the cognitive load perspective

    In-vehicle information systems (IVIS) should minimize the cognitive load on drivers to reduce the risk of accidents. For that purpose, we built an experiment in which two alternatives for information display are compared. One alternative is the traditional information display method of showing a map with the target route highlighted in red. This is compared against a proposed alternative in which, prior to a junction, a ground-level photo is displayed with a large red arrow pointing at the correct route the driver must take. The photo-enhanced information display method required 39% more time spent gazing at the screen but provided a 10% reduction in the total number of head turns. Based on the participant comments, 80% of whom opted for the non-photo-enhanced method, we concluded that the cognitive load brought on by the photo enhancement is not worth the return.

    A System for Automatic Translation of Fingerspelling into Spoken Speech (Systém pro automatický překlad prstové znakové abecedy do mluvené řeči)

    This article describes a system that enables communication between hearing-impaired and visually impaired people by converting spoken speech into fingerspelling animation and vice versa. The article considers the use of several languages: English, Russian, Turkish, and Czech.
    The aim of this paper is to help the communication of two people, one hearing-impaired and one visually impaired, by converting speech to fingerspelling and fingerspelling to speech. Fingerspelling is a subset of sign language, and uses finger signs to spell letters of the spoken or written language. We aim to convert fingerspelled words to speech and vice versa. Different spoken languages and sign languages such as English, Russian, Turkish, and Czech are considered.