
    A Study on Motion Preannouncement Using a Facial Interface for Mobile Robots

    Master's thesis in Informatics, University of Tsukuba; degree conferred March 25, 2019 (No. 41295)

    FEAFA: A Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation

    Facial expression analysis based on machine learning requires a large number of well-annotated data samples to reflect different changes in facial motion. Publicly available datasets help to accelerate research in this area by providing benchmark resources, but all of these datasets, to the best of our knowledge, are limited to rough annotations of action units, recording only their absence, presence, or a five-level intensity according to the Facial Action Coding System. To meet the need for videos labeled in greater detail, we present a well-annotated dataset named FEAFA for Facial Expression Analysis and 3D Facial Animation. One hundred and twenty-two participants, including children, young adults, and elderly people, were recorded in real-world conditions. In addition, 99,356 frames were manually labeled using an Expression Quantitative Tool we developed to quantify 9 symmetrical FACS action units, 10 asymmetrical (unilateral) FACS action units, 2 symmetrical FACS action descriptors, and 2 asymmetrical FACS action descriptors; each action unit or action descriptor is annotated with a floating-point number between 0 and 1. To provide a baseline for future research, a benchmark for the regression of action unit values based on convolutional neural networks is presented. We also demonstrate the potential of the FEAFA dataset for 3D facial animation. Almost all state-of-the-art facial animation algorithms rely on 3D face reconstruction, so we propose a novel method that drives virtual characters based only on action unit values regressed from the 2D video frames of source actors. (9 pages, 7 figures)
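    The continuous [0, 1] annotation scheme described above maps naturally onto a regression head: each raw network output can be squashed into the same intensity range and trained with a squared-error loss. A minimal sketch in plain Python follows; the function names are illustrative, not FEAFA's released code.

```python
import math

NUM_AUS = 23  # FEAFA: 9 + 10 action units plus 2 + 2 action descriptors

def sigmoid(z):
    # squash a raw network output into the [0, 1] intensity range
    return 1.0 / (1.0 + math.exp(-z))

def au_regression_loss(logits, targets):
    """Mean-squared error between predicted and annotated AU intensities."""
    preds = [sigmoid(z) for z in logits]
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)
```

Because the sigmoid is bounded, every predicted intensity stays inside the annotated [0, 1] range regardless of the raw logit values.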

    K-Space-Aware Cross-Modality Score for Synthesized Neuroimage Quality Assessment

    The problem of how to assess cross-modality medical image synthesis has been largely unexplored. The most widely used measures, such as PSNR and SSIM, analyze structural features but neglect the crucial lesion locations and the fundamental k-space characteristics of medical images. To overcome this problem, we propose a new metric, K-CROSS, to spur progress on this challenging problem. Specifically, K-CROSS uses a pre-trained multi-modality segmentation network to predict the lesion location, together with a tumor encoder that represents features such as texture details and brightness intensities. To further reflect the frequency-specific information arising from magnetic resonance imaging principles, both k-space features and vision features are obtained and employed in our comprehensive encoders with a frequency reconstruction penalty. The structure-shared encoders are designed and constrained with a similarity loss to capture the intrinsic common structural information of both modalities. As a consequence, the features learned from lesion regions, k-space, and anatomical structures are all captured and serve as our quality evaluators. We evaluate performance by constructing a large-scale cross-modality neuroimaging perceptual similarity (NIRPS) dataset with 6,000 radiologist judgments. Extensive experiments demonstrate that the proposed method outperforms other metrics, especially in comparison with the radiologists on NIRPS.
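    The k-space of an MR image is its frequency-domain representation, so a frequency reconstruction penalty amounts to comparing Fourier coefficients of the synthesized and real scans. The sketch below illustrates the idea in one dimension with a naive DFT and an L1 magnitude penalty; it is a simplified stand-in, not the K-CROSS loss itself.

```python
import cmath

def dft(signal):
    # naive discrete Fourier transform: the 1-D analogue of the k-space
    # representation an MR scanner acquires
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

def frequency_penalty(synth, real):
    """L1 distance between k-space magnitudes of a synthesized and a real line."""
    fs, fr = dft(synth), dft(real)
    return sum(abs(abs(a) - abs(b)) for a, b in zip(fs, fr)) / len(fs)
```

A perfect synthesis incurs zero penalty, while intensity errors show up directly as magnitude differences in every frequency bin.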

    IM-IAD: Industrial Image Anomaly Detection Benchmark in Manufacturing

    Image anomaly detection (IAD) is an emerging and vital computer vision task in industrial manufacturing (IM). Recently, many advanced algorithms have been published, but their performance deviates greatly. We believe that the lack of actual IM settings most probably hinders the development and usage of these methods in real-world applications. As far as we know, IAD methods have not been evaluated systematically, which makes it difficult for researchers to analyze them because they are designed for different or special cases. To solve this problem, we first propose a uniform IM setting to assess how well these algorithms perform, covering several aspects: various levels of supervision (unsupervised vs. semi-supervised), few-shot learning, continual learning, noisy labels, memory usage, and inference speed. We then build a comprehensive image anomaly detection benchmark (IM-IAD) that evaluates 16 algorithms on 7 mainstream datasets under uniform settings. Our extensive experiments (17,017 in total) provide in-depth insights for IAD algorithm redesign or selection under the IM setting, and the proposed benchmark highlights challenges as well as directions for future work. To foster reproducibility and accessibility, the source code of IM-IAD is available at https://github.com/M-3LAB/IM-IAD.
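    A uniform evaluation setting like the one described is essentially a grid over the listed axes (supervision level, few-shot budget, continual learning, label noise, and so on). The sketch below shows one way such a grid could be expressed; the field and function names are illustrative assumptions, not IM-IAD's actual API.

```python
from dataclasses import dataclass

@dataclass
class IMSetting:
    """One cell of a uniform industrial-manufacturing evaluation grid
    (names are illustrative, not IM-IAD's actual configuration schema)."""
    supervision: str = "unsupervised"   # or "semi-supervised"
    num_shots: int = 0                  # few-shot budget; 0 = full data
    continual: bool = False             # sequential-task evaluation
    label_noise: float = 0.0            # fraction of corrupted labels

def setting_grid():
    # enumerate a small slice of the benchmark's evaluation axes
    return [IMSetting(supervision=s, num_shots=k)
            for s in ("unsupervised", "semi-supervised")
            for k in (0, 1, 5)]
```

Running every algorithm over every cell of such a grid is what makes results comparable across methods designed for different cases.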

    Multimodal ultrasound imaging: a method to improve the accuracy of sentinel lymph node diagnosis in breast cancer

    Aim: This study assessed the utility of multimodal ultrasound in enhancing the accuracy of breast cancer sentinel lymph node (SLN) assessment and compared it with single-modality ultrasound.
    Methods: Preoperative examinations, including two-dimensional ultrasound (2D US), intradermal contrast-enhanced ultrasound (CEUS), intravenous CEUS, shear-wave elastography (SWE), and surface localization, were conducted on 86 SLNs from breast cancer patients. The diagnostic performance of single-modal and multimodal approaches for detecting metastatic SLNs was compared against postoperative pathological results.
    Results: Among the 86 SLNs, 29 were pathologically diagnosed as metastatic and 57 as non-metastatic. The single-modality methods had AUC values of 0.826 (intradermal CEUS), 0.705 (intravenous CEUS), 0.678 (2D US), and 0.677 (SWE). Intradermal CEUS significantly outperformed the other methods (p<0.05), while the remaining three methods showed no statistically significant differences (p>0.05). Multimodal ultrasound, combining intradermal CEUS, intravenous CEUS, 2D US, and SWE, achieved an AUC of 0.893, with 86.21% sensitivity and 84.21% specificity. The DeLong test confirmed that multimodal ultrasound was significantly better than the four single-modal methods (p<0.05). Decision curve analysis and clinical impact curves demonstrated the superior performance of multimodal ultrasound in identifying high-risk SLN patients.
    Conclusion: Multimodal ultrasound improves breast cancer SLN identification and diagnostic accuracy.
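    The AUC values compared above have a simple probabilistic reading: the AUC is the probability that a randomly chosen metastatic node receives a higher score than a randomly chosen non-metastatic one. A minimal sketch of that computation (the Mann-Whitney formulation, not the study's own software):

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic:
    the probability that a positive case is scored above a negative one
    (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Under this reading, the multimodal AUC of 0.893 means a metastatic SLN outscores a non-metastatic one roughly 89% of the time.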

    Applications of artificial intelligence in children and elderly care and short video industries - Cases from Cubo Ai and Tiktok

    Ever since the concept of Artificial Intelligence (AI) was first coined in 1955, the quest for sophistication and improvement of existing technologies has paved the way for the continuous development of AI. Nowadays, AI technologies are redefining and disrupting the way people work and live across many domains. This paper focuses on AI applications in two fields closely related to people's lives: children & elderly care and the short video industry. It first introduces several prevailing AI technologies applied in these fields, and then uses two case studies, Cubo Ai and Tiktok, to elaborate on the applications in the corresponding fields.

    A Coarse-to-Fine Facial Landmark Detection Method Based on Self-attention Mechanism

    Facial landmark detection in the wild remains a challenging problem in computer vision. Deep learning-based methods currently play a leading role in solving it. However, these approaches generally focus on local feature learning and ignore global relationships. Therefore, in this study, a self-attention mechanism is introduced into facial landmark detection. Specifically, a coarse-to-fine facial landmark detection method is proposed that uses two stacked hourglasses as the backbone, with a new landmark-guided self-attention (LGSA) block inserted between them. The LGSA block learns the global relationships between different positions on the feature map and allows feature learning to focus on the locations of landmarks with the help of a landmark-specific attention map, which is generated in the first-stage hourglass model. A novel attentional consistency loss is also proposed to ensure the generation of an accurate landmark-specific attention map. A new channel transformation block is used as the building block of the hourglass model to improve the model's capacity. The coarse-to-fine strategy is adopted within and between stages to reduce complexity. Extensive experimental results on public datasets demonstrate the superiority of our proposed method over state-of-the-art models.
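    The global relationships the LGSA block learns are the hallmark of scaled dot-product self-attention: every feature-map position is re-expressed as a weighted mixture of all positions. A bare-bones sketch of that core operation (without the learned projections or the landmark-specific attention map of the actual method):

```python
import math

def softmax(row):
    # numerically stable softmax over one row of attention scores
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention over feature-map positions.
    x: list of d-dimensional position features; queries = keys = values = x
    (no learned projections, for brevity)."""
    d = len(x[0])
    scores = [[sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
              for q in x]
    weights = [softmax(row) for row in scores]
    return [[sum(w * v[j] for w, v in zip(row, x)) for j in range(d)]
            for row in weights]
```

Because each output is a convex combination of the inputs, distant positions can influence one another in a single step, which is exactly what purely local convolutions miss.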

    The Knowledge Map of Marine Energy


    TPE: Lightweight Transformer Photo Enhancement Based on Curve Adjustment

    In recent years, learning-based methods have made great progress in the field of photo enhancement. However, these methods rely on complex network structures and consume excessive amounts of computing resources, which greatly increases the difficulty of deploying them on lightweight devices, and they have poor real-time performance when processing very high-resolution images. In contrast to previous works that design structurally diverse CNNs, we show that photo enhancement can be achieved through a lightweight self-attention mechanism for global-local tuning. In this paper, we design a lightweight Transformer-based photo enhancement tool, which we dub TPE. TPE captures long-range dependencies among image patches and can efficiently extract the structural relationships within an image. A multistage curve adjustment strategy overcomes the limited adjustment capability of a global adjustment function, allowing the method to combine global modifications with local fine-tuning. Experiments on various benchmarks demonstrate the qualitative and quantitative advantages of TPE over state-of-the-art methods in photo retouching and low-light image enhancement tasks.
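    Multistage curve adjustment composes several simple tone curves instead of fitting one complex global function. The sketch below uses a quadratic curve family of the kind common in curve-based enhancement work; the exact parameterization is an illustrative assumption, not TPE's published formulation.

```python
def adjust(pixel, alphas):
    """Apply a multistage quadratic tone curve to one normalized pixel value.
    Each stage nudges the value by a * x * (1 - x); this form keeps x in
    [0, 1] whenever each coefficient a is in [-1, 1] (an illustrative
    curve family, not TPE's exact parameterization)."""
    x = pixel
    for a in alphas:
        x = x + a * x * (1.0 - x)
    return x
```

Stacking stages lets a sequence of weak, monotone adjustments approximate a much more flexible overall curve while each stage stays trivially cheap to evaluate per pixel.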

    Tiny adversarial multi-objective one-shot neural architecture search

    Xie G, Wang J, Yu G, Lyu J, Zheng F, Jin Y. Tiny adversarial multi-objective one-shot neural architecture search. Complex & Intelligent Systems. 2023.
    The widely employed tiny neural networks (TNNs) in mobile devices are vulnerable to adversarial attacks, yet more advanced research on the robustness of TNNs is highly in demand. This work focuses on improving the robustness of TNNs without sacrificing the model's accuracy. To find the optimal trade-off networks in terms of adversarial accuracy, clean accuracy, and model size, we present TAM-NAS, a tiny adversarial multi-objective one-shot network architecture search method. First, we build a novel search space comprising new tiny blocks and channels to strike a balance between model size and adversarial performance. Then, we demonstrate how the supernet facilitates the acquisition of the optimal subnet under white-box adversarial attacks, given that the supernet significantly impacts the subnet's performance. Concretely, we investigate a new adversarial training paradigm by evaluating adversarial transferability, the width of the supernet, and the distinction between training subnets from scratch and fine-tuning. Finally, we undertake statistical analysis of the layer-wise combination of specific blocks and channels on the first non-dominated front, which can be utilized as a design guideline for TNNs.
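    The "first non-dominated front" analyzed above is the standard Pareto front of multi-objective optimization: the set of candidate networks that no other candidate beats on every objective at once. A minimal sketch of that filter (all objectives framed as maximized, e.g., clean accuracy, adversarial accuracy, and negated model size):

```python
def dominates(a, b):
    """a dominates b if it is no worse on every objective and strictly
    better on at least one (all objectives to be maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated_front(points):
    # keep only candidates not dominated by any other candidate
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]
```

Every network on this front represents a distinct, defensible trade-off among robustness, accuracy, and size, which is why its layer-wise statistics can serve as design guidelines.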