3 research outputs found

    Neural Networks and Learning Systems for Human Machine Interfacing

    With developments in sensor and computing technologies, human-machine interfaces (HMIs) are designed to meet users' increasing demands on machines and systems. Human factors are becoming the key issue in allowing advanced mechanical devices, such as robots and biometric systems, to perform complicated tasks intelligently in unknown environments. An effective HMI with learning ability can process, interpret, recognize, and simulate human intentions and behaviors, and then utilize intelligent algorithms to drive the machine devices. HMIs also enable us to bring humanistic intelligence and actions into robotic devices, biometric systems, and other machines through two-way interactions, for example using deep neural networks. In recent years, a growing number of researchers and studies focusing on this area have clearly demonstrated the importance of learning systems for HMIs.

    EnTri: Ensemble Learning with Tri-level Representations for Explainable Scene Recognition

    Scene recognition based on deep learning has made significant progress, but its performance is still limited by challenges posed by inter-class similarities and intra-class dissimilarities. Furthermore, prior research has primarily focused on improving classification accuracy and has given less attention to achieving interpretable, precise scene classification. Therefore, we are motivated to propose EnTri, an ensemble scene recognition framework that employs ensemble learning over a hierarchy of visual features. EnTri represents features at three distinct levels of detail: pixel level, semantic-segmentation level, and object class and frequency level. By incorporating distinct feature encoding schemes of differing complexity and leveraging ensemble strategies, our approach aims to improve classification accuracy while enhancing transparency and interpretability via visual and textual explanations. To achieve interpretability, we devised an extension algorithm that generates both visual and textual explanations highlighting the properties of a given scene that contribute to the final prediction of its category, including information about objects, statistics, spatial layout, and textural details. Through experiments on benchmark scene classification datasets, EnTri has demonstrated strong recognition accuracy, achieving competitive performance compared to state-of-the-art approaches, with accuracies of 87.69%, 75.56%, and 99.17% on the MIT67, SUN397, and UIUC8 datasets, respectively. Comment: Submitted to Pattern Recognition journal
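    The abstract does not specify how EnTri fuses its three branches, but a common ensemble strategy for combining per-branch classifiers is to average their softmax outputs and take the argmax. The sketch below illustrates that idea; the function name and the three-branch decomposition (pixel, segmentation, object-level) are assumptions drawn from the abstract, not the authors' actual implementation.

    ```python
    import numpy as np

    def ensemble_predict(prob_pixel, prob_semantic, prob_object):
        """Fuse three branch classifiers by averaging their class
        probabilities, then return the index of the winning class."""
        probs = np.stack([prob_pixel, prob_semantic, prob_object])
        avg = probs.mean(axis=0)  # soft voting across the three branches
        return int(np.argmax(avg))

    # Toy example with 3 scene classes: two branches favor class 1,
    # so soft voting selects it despite the pixel branch disagreeing.
    p_pixel    = np.array([0.7, 0.2, 0.1])
    p_semantic = np.array([0.1, 0.8, 0.1])
    p_object   = np.array([0.2, 0.6, 0.2])
    print(ensemble_predict(p_pixel, p_semantic, p_object))  # → 1
    ```

    Soft voting of this kind lets a branch with a confident, correct prediction outweigh branches that are uncertain, which is one reason ensembles over heterogeneous feature levels can beat any single branch.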

    Text-based indoor place recognition with deep neural network

    Indoor place recognition is a challenging problem because it is hard to represent complicated intra-class variations and inter-class similarities. This paper presents a new indoor place recognition scheme using a deep neural network. Traditional representations of indoor places mostly utilize image features to retain the spatial structure without considering objects' semantic characteristics. However, we argue that the attributes, states, and relationships of objects are much more helpful for indoor place recognition. In particular, we improve the recognition framework by utilizing Place Descriptors (PDs) in text form to connect different types of place information with their categories. Meanwhile, we analyse the ability of Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) models to classify natural language, using them to process the indoor place descriptions. In addition, we improve the robustness of the designed deep neural network by combining a number of effective strategies, i.e. L2-regularization, data normalization, and proper calibration of key parameters. Compared with the existing state of the art, the proposed approach achieves good performance, with accuracy, precision, and recall of 70.73%, 70.08%, and 70.16% respectively on the Visual Genome database. Meanwhile, the accuracy rises to 98.6% after adding a voting mechanism.
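    The abstract credits a large accuracy gain to a voting mechanism but does not describe it; a plausible reading is majority voting over several predictions for the same place (e.g. from multiple descriptors or models). The sketch below shows that generic pattern; the function name and the example labels are illustrative assumptions, not details from the paper.

    ```python
    from collections import Counter

    def majority_vote(predictions):
        """Return the label chosen by the most individual predictions.
        Ties are broken by first occurrence, as Counter preserves
        insertion order among equal counts."""
        return Counter(predictions).most_common(1)[0][0]

    # Three predictions for the same indoor place: two say "kitchen",
    # one says "bedroom", so the vote settles on "kitchen".
    votes = ["kitchen", "kitchen", "bedroom"]
    print(majority_vote(votes))  # → kitchen
    ```

    Aggregating several noisy per-description predictions this way can sharply reduce error when the individual classifiers are better than chance and make partly independent mistakes, which is consistent with the jump from ~70% to 98.6% reported above.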