193 research outputs found

    Eye Motion Matters for 3D Face Reconstruction

    Recent advances in single-image 3D face reconstruction have shown remarkable progress in various applications. Nevertheless, prevailing techniques tend to prioritize the global facial contour and expression, often neglecting the nuanced dynamics of the eye region. In response, we introduce an Eye Landmark Adjustment Module, complemented by a Local Dynamic Loss, designed to capture the dynamic features of the eye region. Our module allows for flexible adjustment of landmarks, resulting in accurate recreation of various eye states. In this paper, we present a comprehensive evaluation of our approach, conducting extensive experiments on two datasets. The results underscore the superior performance of our approach, highlighting its significant contributions in addressing this particular challenge.
    Comment: 6 pages, 5 figures

    Robust face recognition via accurate face alignment and sparse representation

    Due to its potential applications, face recognition has been receiving more and more research attention recently. In this paper, we present a robust real-time facial recognition system. The system comprises three functional components: face detection, eye alignment and face recognition. Within the context of computer vision, there are many candidate algorithms for accomplishing these tasks. Having compared the performance of a few state-of-the-art candidates, we implement robust and efficient algorithms. For face detection, we have proposed a new approach termed Boosted Greedy Sparse Linear Discriminant Analysis (BGSLDA) that performs better than most reported face detectors. Since face misalignment significantly deteriorates the recognition accuracy, we advocate a new cascade framework including two different methods for eye detection and face alignment. We have adopted a recent algorithm termed Sparse Representation-based Classification (SRC) for the face recognition component. Experiments demonstrate that the whole system is both efficient and accurate.
    Hanxi Li, Peng Wang and Chunhua Shen
    http://dicta2010.conference.nicta.com.au

    Intelligent Reflecting Surfaces and Next Generation Wireless Systems

    The intelligent reflecting surface (IRS) is a potential candidate for massive multiple-input multiple-output (MIMO) 2.0 technology due to its low cost, ease of deployment, energy efficiency and extended coverage. This chapter investigates slot-by-slot IRS reflection pattern design and two-timescale reflection pattern design schemes. For the slot-by-slot reflection optimization, we propose exploiting an IRS to improve the propagation channel rank in mmWave massive MIMO systems without the need to increase the transmit power budget. Then, we analyze the impact of the distributed IRS on the channel rank. To further reduce the heavy overhead of channel training, channel state information (CSI) estimation, and feedback in time-varying MIMO channels, we present a two-timescale reflection optimization scheme, where the IRS is configured relatively infrequently based on statistical CSI (S-CSI) and the active beamformers and power allocation are updated per slot based on quickly outdated instantaneous CSI (I-CSI). The achievable average sum-rate (AASR) of the system is maximized without excessive overhead of cascaded channel estimation. A recursive sampling particle swarm optimization (PSO) algorithm is developed to optimize the large-timescale IRS reflection pattern efficiently with fewer channel samples.
    Comment: To appear as a chapter of the book "Massive MIMO for Future Wireless Communication Systems: Technology and Applications", to be published by Wiley-IEEE Press. arXiv admin note: text overlap with arXiv:2206.0727

    Toward enhancement of deep learning techniques using fuzzy logic: a survey

    Deep learning has emerged recently as a type of artificial intelligence (AI) and machine learning (ML) that imitates the way humans gain particular types of knowledge. Deep learning is considered an essential element of data science, which comprises predictive modeling and statistics. Deep learning makes the processes of collecting, interpreting, and analyzing big data easier and faster. Deep neural networks are a kind of ML model in which non-linear processing units are layered for the purpose of extracting particular features from the inputs. However, training such networks is very expensive and depends on the optimization method used, so optimal results may not be obtained. Deep learning techniques are also vulnerable to data noise. For these reasons, fuzzy systems are used to improve the performance of deep learning algorithms, especially in combination with neural networks. Fuzzy systems are also used to improve the representation accuracy of deep learning models. This survey paper reviews some of the deep learning based fuzzy logic models and techniques proposed in previous studies, where fuzzy logic is used to improve deep learning performance. The approaches are divided into two categories based on how both of the samples are combined. Furthermore, the models' practicality in the actual world is revealed.

    Applications of Intelligent Vision in Low-Cost Mobile Robots

    With the development of intelligent information technology, we have entered an era of 5G and AI. Mobile robots embody both of these technologies, and as such play an important role in future developments. However, the development of perception vision in consumer-grade low-cost mobile robots is still in its infancy. With the popularity of edge computing technology in the future, high-performance vision perception algorithms are expected to be deployed on low-power edge computing chips. Within the context of low-cost mobile robotic solutions, a robot intelligent vision system is studied and developed in this thesis. The thesis proposes and designs the overall framework of the higher-level intelligent vision system. The core system includes automatic robot navigation and obstacle object detection. The core algorithm deployments are implemented through a low-power embedded platform. The thesis analyzes and investigates deep learning neural network algorithms for obstacle object detection in intelligent vision systems. By comparing a variety of open source object detection neural networks on high performance hardware platforms and taking the constraints of the hardware platform into account, a suitable neural network algorithm is selected. The thesis combines the characteristics and constraints of the low-power hardware platform to further optimize the selected neural network. It introduces the minimum mean square error (MMSE) and the moving average minmax algorithms in the quantization process to reduce the accuracy loss of the quantized model. The results show that the optimized neural network achieves a 20-fold improvement in inference performance on the RK3399PRO hardware platform compared to the original network. The thesis concludes with the application of the above modules and systems to a higher-level intelligent vision system for a low-cost disinfection robot, and further optimization is done for the hardware platform. The test results show that while achieving the basic service functions, the robot can accurately identify the obstacles ahead and locate and navigate in real time, which greatly enhances the perception function of the low-cost mobile robot.
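
    The moving average minmax step named in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the `momentum` value and the unsigned 8-bit affine scheme are all assumptions chosen for illustration, and the MMSE variant is not shown.

```python
def moving_average_minmax(batches, momentum=0.9):
    """Track a smoothed (min, max) range over calibration batches.

    Hypothetical helper: the first batch initialises the range, and later
    batches are blended in with an exponential moving average.
    """
    lo = hi = None
    for batch in batches:
        b_lo, b_hi = min(batch), max(batch)
        if lo is None:
            lo, hi = b_lo, b_hi
        else:
            lo = momentum * lo + (1 - momentum) * b_lo
            hi = momentum * hi + (1 - momentum) * b_hi
    return lo, hi


def affine_qparams(lo, hi, num_bits=8):
    """Derive scale and zero point for unsigned affine quantization."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The representable range must contain 0.0 so that zero maps exactly.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


def quantize(x, scale, zero_point, num_bits=8):
    """Map a float to an integer code, clamped to the valid range."""
    q = round(x / scale) + zero_point
    return max(0, min(2 ** num_bits - 1, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to an approximate float."""
    return (q - zero_point) * scale


if __name__ == "__main__":
    calibration = [[-1.0, 0.2, 1.0], [-3.0, 0.5, 3.0]]
    lo, hi = moving_average_minmax(calibration, momentum=0.5)
    scale, zp = affine_qparams(lo, hi)
    for x in (-1.5, 0.0, 0.3, 1.0):
        q = quantize(x, scale, zp)
        print(x, q, dequantize(q, scale, zp))
```

    Smoothing the range across batches, rather than taking the global minimum and maximum, keeps a single outlier batch from stretching the quantization range and wasting integer codes.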

    CNN Confidence Estimation for Rejection-Based Hand Gesture Classification in Myoelectric Control

    Convolutional neural networks (CNNs) have been widely utilized to identify hand gestures from surface electromyography (sEMG) signals. However, due to the nonstationary characteristics of sEMG, the classification accuracy usually degrades significantly in the daily living environment involving complex hand movements. To further improve the reliability of a classifier, unconfident classifications should be identified and rejected. In this study, we propose a novel approach to estimate the probability of correctness for each classification. Specifically, a confidence estimation model is established to generate confidence scores (ConfScore) based on posterior probabilities of the CNN, and an objective function is designed to train the parameters of this model. In addition, a comprehensive metric that combines the true acceptance rate (TAR) and the true rejection rate (TRR) is proposed to evaluate the rejection performance of ConfScore, so that the tradeoff between system security and control lag can be fully considered. The effectiveness of ConfScore is verified using data from public databases and our online platform. The experimental results illustrate that ConfScore can better reflect the correctness of CNN classifications than traditional confidence features, i.e., maximum posterior probability and entropy of the probability vector. Moreover, the rejection performance is observed to be less sensitive to variations in rejection thresholds.
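
    The two traditional confidence features mentioned in the abstract above, and threshold-based rejection scored by TAR and TRR, can be sketched as follows. This does not reproduce the paper's ConfScore model or its combined metric. The function names are hypothetical, and the definitions used here are assumptions: TAR is taken as the fraction of correct classifications that are accepted, and TRR as the fraction of incorrect classifications that are rejected.

```python
import math


def max_posterior(probs):
    """Confidence as the largest class posterior."""
    return max(probs)


def entropy_confidence(probs):
    """Confidence as 1 minus normalised Shannon entropy.

    Returns 1.0 for a one-hot vector and 0.0 for a uniform one.
    """
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - h / math.log(len(probs))


def rejection_rates(confidences, correct, threshold):
    """Return (TAR, TRR) when accepting decisions with confidence >= threshold."""
    accepted = [c >= threshold for c in confidences]
    n_correct = sum(correct)
    n_wrong = len(correct) - n_correct
    true_accepts = sum(a and ok for a, ok in zip(accepted, correct))
    true_rejects = sum((not a) and (not ok) for a, ok in zip(accepted, correct))
    tar = true_accepts / n_correct if n_correct else 0.0
    trr = true_rejects / n_wrong if n_wrong else 0.0
    return tar, trr


if __name__ == "__main__":
    # Illustrative posteriors and correctness flags, not data from the paper.
    posteriors = [
        [0.90, 0.05, 0.05],
        [0.40, 0.35, 0.25],
        [0.60, 0.30, 0.10],
        [0.34, 0.33, 0.33],
    ]
    correct = [True, False, True, False]
    conf = [max_posterior(p) for p in posteriors]
    for threshold in (0.5, 0.7):
        print(threshold, rejection_rates(conf, correct, threshold))
```

    Raising the threshold rejects more wrong decisions but also discards more correct ones, which is the tradeoff between system security and control lag that the abstract refers to.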

    Enhancing target detection accuracy through cross-modal spatial perception and dual-modality fusion

    The disparity between human and machine perception of spatial information presents a challenge for machines to accurately sense their surroundings and improve target detection performance. Cross-modal data fusion emerges as a potential solution to enhance the perceptual capabilities of systems. This article introduces a novel spatial perception method that integrates dual-modality feature fusion and coupled attention mechanisms to validate the improvement in detection performance through cross-modal information fusion. The proposed approach incorporates cross-modal feature extraction through a multi-scale feature extraction structure employing a dual-flow architecture. Additionally, a transformer is integrated for feature fusion, while the information perception of the detection system is optimized using a linear combination of loss functions. Experimental results demonstrate the superiority of our algorithm over single-modality target detection using visible images, exhibiting an average accuracy improvement of 30.4%. Furthermore, our algorithm outperforms single-modality infrared image detection by 3.0% and comparative multimodal target detection algorithms by 3.5%. These results validate the effectiveness of our proposed algorithm in fusing dual-band features, significantly enhancing target detection accuracy. The adaptability and robustness of our approach are showcased through these results.