203 research outputs found

    Inventory management of slow moving spare parts in National Electricity Power Plant of China

    Exploring Visual Attention Mechanism for Scene Understanding in Image Captioning

    University of Technology Sydney. Faculty of Engineering and Information Technology. Scene understanding is a high-level computer vision research task that requires multiple fundamental vision tasks on different visual elements. Image captioning is a typical scene understanding task: it identifies the salient visual contents of a real-world scene in an image and then automatically describes them in a natural language sentence. This thesis concentrates on exploring the visual attention mechanism for scene understanding in image captioning, aiming to achieve a comprehensive, multimodal, and transparent scene understanding ability. Specifically, four problems are studied to extend the visual attention mechanism to a fine-grained spatial level, to a comprehensive semantic level, and to a high-level ability to attend to visual relationships for interaction words. Firstly, this thesis proposes a fine-grained and semantic-guided visual region attention model based on a novel Fully Convolutional Network (FCN)-Long Short Term Memory (LSTM) framework. It can attend to both object and “stuff” regions at a fine-grained grid-wise resolution and focuses only on the principal object information in each grid cell. In addition, grid-wise semantic labels are introduced to provide semantic guidance, ensuring that related visual regions in different grid cells are correlated with each other. Moreover, an additional semantic context can be summarized from these textual semantic labels. Secondly, this thesis proposes a novel high-resolution FCN encoder with a residual attention structure, which achieves high-resolution attention at four fine-grained resolutions (i.e., 27 x 27, 40 x 40, 60 x 60, 80 x 80). A size-invariant “attention correctness” metric is further proposed for evaluating attention accuracy under different resolutions. Based on the COCO-Stuff dataset, pixel-level evaluations are conducted on both object and “stuff” regions to analyse the performance of high-resolution fine-grained visual region attention. Thirdly, this thesis concentrates on visual relationship attention, which explores visual relationships between object regions under spatial constraints based on the parallel attention mechanism. The goal is to fully explore the relationships and interactions between visual regions in order to describe interaction words in the caption accurately. Moreover, it is trained implicitly through an unsupervised approach without using any explicit visual relationship annotations. Last but not least, this thesis takes a further step to achieve an adaptive attention module that can act as either visual region attention or visual relationship attention, depending on the needs of the language decoder. The dynamic linguistic context of the language decoder is fully leveraged for exploring and attending to related visual relationships for interaction words.

    ChatLaw: Open-Source Legal Large Language Model with Integrated External Knowledge Bases

    Large Language Models (LLMs) have shown the potential to revolutionize natural language processing tasks in various domains, sparking great interest in vertical-specific large models. However, unlike proprietary models such as BloombergGPT and FinGPT, which have leveraged their unique data accumulations to make strides in the finance domain, there have not been many similar large language models in the Chinese legal domain to facilitate its digital transformation. In this paper, we propose an open-source legal large language model named ChatLaw. Due to the importance of data quality, we carefully designed a legal domain fine-tuning dataset. Additionally, to overcome the problem of model hallucinations in legal data screening during reference data retrieval, we introduce a method that combines vector database retrieval with keyword retrieval to effectively reduce the inaccuracy of relying solely on vector database retrieval. Furthermore, we propose a self-attention method to enhance the ability of large models to overcome errors present in reference data, further reducing model hallucinations at the model level and improving the problem-solving capabilities of large models. We also open-sourced our model and part of the data at https://github.com/PKU-YuanGroup/ChatLaw
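
The idea of combining vector retrieval with keyword retrieval can be illustrated with a minimal sketch. This is not the ChatLaw implementation: the function names, the `alpha` weighting, and the bag-of-words vectors (standing in for a learned embedding and a vector database) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for a real tokenizer)."""
    return text.lower().split()


def bow_vector(text):
    """Bag-of-words counts; ASSUMPTION: a real system would use learned embeddings."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(keywords, doc):
    """Fraction of the query keywords that appear verbatim in the document."""
    if not keywords:
        return 0.0
    doc_tokens = set(tokenize(doc))
    hits = sum(1 for k in keywords if k.lower() in doc_tokens)
    return hits / len(keywords)


def hybrid_retrieve(query, keywords, docs, alpha=0.5, top_k=2):
    """Rank documents by a weighted sum of vector and keyword scores.

    alpha is a hypothetical mixing weight: 1.0 means vector-only,
    0.0 means keyword-only. Returns the indices of the top_k documents.
    """
    query_vec = bow_vector(query)
    scored = []
    for i, doc in enumerate(docs):
        score = (alpha * cosine(query_vec, bow_vector(doc))
                 + (1 - alpha) * keyword_score(keywords, doc))
        scored.append((score, i))
    # Highest score first; ties are broken by original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:top_k]]
```

The keyword term rewards documents that contain the exact terms of the query, which is one way to counteract a vector search that returns text that is only loosely related.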

    LibreFace: An Open-Source Toolkit for Deep Facial Expression Analysis

    Facial expression analysis is an important tool for human-computer interaction. In this paper, we introduce LibreFace, an open-source toolkit for facial expression analysis. This open-source toolbox offers real-time and offline analysis of facial behavior through deep learning models, including facial action unit (AU) detection, AU intensity estimation, and facial expression recognition. To accomplish this, we employ several techniques, including the utilization of a large-scale pre-trained network, feature-wise knowledge distillation, and task-specific fine-tuning. These approaches are designed to effectively and accurately analyze facial expressions by leveraging visual information, thereby facilitating the implementation of real-time interactive applications. In terms of Action Unit (AU) intensity estimation, we achieve a Pearson Correlation Coefficient (PCC) of 0.63 on DISFA, which is 7% higher than the performance of OpenFace 2.0, while maintaining highly efficient inference that runs two times faster than OpenFace 2.0. Despite being compact, our model also demonstrates performance competitive with state-of-the-art facial expression analysis methods on AffectNet, FFHQ, and RAFDB. Our code will be released at https://github.com/ihp-lab/LibreFace
    Comment: 10 pages, 5 figures. Accepted by WACV 2024 Round 1. (Application Track)
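
For readers unfamiliar with the metric, the Pearson Correlation Coefficient reported above measures the linear agreement between predicted and ground-truth intensities. The sketch below shows the standard formula only; the function name and the sample values are illustrative assumptions and do not come from LibreFace or DISFA.

```python
import math


def pearson_correlation(predicted, target):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(predicted) != len(target) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_t = sum(target) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, target))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_t)


# Hypothetical AU intensities on a 0-5 scale, for illustration only.
example_prediction = [0.0, 1.0, 2.0, 3.0]
example_target = [0.0, 2.0, 4.0, 6.0]
example_pcc = pearson_correlation(example_prediction, example_target)
```

A value near 1 means the predictions rise and fall with the labels; because the coefficient ignores scale and offset, it is usually reported alongside an error measure.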

    NUMERICAL ANALYSIS OF VALVE STRUCTURE OF HIGH POWER MARINE ENGINE

    The valve, as an important part of the gas distribution mechanism, is a crucial component of the engine. When the engine is running, the valve is subjected to harsh working conditions such as high temperature, high impact, frictional wear, and corrosion, so a reliable and durable valve has an important impact on the safety and reliability of the engine. In this paper, a four-stroke marine diesel engine valve is used as the research object, and models of the intake valve set and the exhaust valve set are established separately. Heat transfer simulation and failure analysis of inlet and exhaust valves of different structures and materials under different operating conditions were carried out using finite element analysis. The results show that different valve structures and manufacturing materials have different effects on the reliability of the valves; changing the valve structure and choosing different manufacturing materials have a considerable impact on heat transfer and deformation, thus affecting the overall reliability of the valves.