120 research outputs found

    Show, Recall, and Tell: Image Captioning with Recall Mechanism

    Generating natural and accurate descriptions in image captioning has always been a challenge. In this paper, we propose a novel recall mechanism to imitate the way humans conduct captioning. There are three parts in our recall mechanism: recall unit, semantic guide (SG) and recalled-word slot (RWS). The recall unit is a text-retrieval module designed to retrieve recalled words for images. SG and RWS are designed to make the best use of recalled words. The SG branch generates a recalled context, which guides the process of generating the caption. The RWS branch is responsible for copying recalled words to the caption. Inspired by the pointing mechanism in text summarization, we adopt a soft switch to balance the generated-word probabilities between SG and RWS. In the CIDEr optimization step, we also introduce an individual recalled-word reward (WR) to boost training. Our proposed methods (SG+RWS+WR) achieve BLEU-4 / CIDEr / SPICE scores of 36.6 / 116.9 / 21.3 with cross-entropy loss and 38.7 / 129.1 / 22.4 with CIDEr optimization on the MSCOCO Karpathy test split, which surpass the results of other state-of-the-art methods. Comment: Published in AAAI 202
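    The soft switch mentioned in this abstract can be pictured as a convex combination of two word distributions, in the style of pointer-generator models. The abstract does not give the exact formula, so the sketch below is an assumption: the function name, the toy vocabulary, and the fixed switch value are all hypothetical, and in a trained model the switch would be predicted at each decoding step.

```python
def mix_with_soft_switch(p_generate, p_copy, recalled_ids, switch):
    """Blend a vocabulary distribution with a copy distribution over recalled words.

    p_generate   : probabilities over the full vocabulary (sums to 1)
    p_copy       : probabilities over the recalled words (sums to 1)
    recalled_ids : vocabulary index of each recalled word
    switch       : float in [0, 1], weight given to the generation branch
    (All names are illustrative; they are not taken from the paper.)
    """
    if not 0.0 <= switch <= 1.0:
        raise ValueError("switch must lie in [0, 1]")
    # Scale the generation branch by the switch value.
    mixed = [switch * p for p in p_generate]
    # Add the copy branch, scaled by the remaining weight, at the recalled words.
    for word_id, p in zip(recalled_ids, p_copy):
        mixed[word_id] += (1.0 - switch) * p
    return mixed


# Toy example with a four-word vocabulary and two recalled words.
vocab = ["a", "dog", "frisbee", "catches"]
p_generate = [0.4, 0.3, 0.1, 0.2]
p_copy = [0.25, 0.75]      # distribution over the recalled words
recalled_ids = [1, 2]      # "dog" and "frisbee"
mixed = mix_with_soft_switch(p_generate, p_copy, recalled_ids, switch=0.6)
best_word = vocab[mixed.index(max(mixed))]
```

    Because both inputs are probability distributions and the weights sum to one, the mixed output is again a distribution; in this toy case the copy branch lifts "frisbee" above the words favoured by the generation branch alone.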

    Deep-learning electronic-structure calculation of magnetic superstructures

    Ab initio study of magnetic superstructures (e.g., magnetic skyrmions) is indispensable to the research of novel materials but is bottlenecked by its formidable computational cost. To address this bottleneck, we develop a deep equivariant neural network method (named xDeepH) to represent the density functional theory Hamiltonian $H_\text{DFT}$ as a function of atomic and magnetic structures and apply neural networks for efficient electronic structure calculation. Intelligence of neural networks is optimized by incorporating a priori knowledge about the important locality and symmetry properties into the method. Particularly, we design a neural-network architecture fully preserving all equivalent requirements on $H_\text{DFT}$ by the Euclidean and time-reversal symmetries ($E(3) \times \{I, T\}$), which is essential to improve method performance. High accuracy (sub-meV error) and good transferability of xDeepH are shown by systematic experiments on nanotube, spin-spiral, and moiré magnets, and the capability of studying magnetic skyrmions is also demonstrated. The method could find promising applications in magnetic materials research and inspire development of deep-learning ab initio methods.

    OnionNet-2: A Convolutional Neural Network Model for Predicting Protein-Ligand Binding Affinity based on Residue-Atom Contacting Shells

    One key task in virtual screening is to accurately predict the binding affinity ($\triangle G$) of protein-ligand complexes. Recently, deep learning (DL) has significantly increased the predictive accuracy of scoring functions due to the extraordinary ability of DL to extract useful features from raw data. Nevertheless, more effort is still needed in many respects to increase prediction accuracy and decrease computational cost. In this study, we propose a simple scoring function (called OnionNet-2) based on a convolutional neural network to predict $\triangle G$. The protein-ligand interactions are characterized by the number of contacts between protein residues and ligand atoms in multiple distance shells. Compared to published models, OnionNet-2 is demonstrated to perform best on two widely used benchmarks, CASF-2016 and CASF-2013. The OnionNet-2 model was further verified on non-experimental decoy structures from a docking program and on the CSAR NRC-HiQ data set (a high-quality data set provided by CSAR), with great success. Thus, our study provides a simple but efficient scoring function for predicting protein-ligand binding free energy. Comment: 7 pages, 4 figures, 1 tabl
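    The contact-counting idea in this abstract can be illustrated with a small sketch. The abstract does not specify how a residue-atom distance is measured or where the shell boundaries lie, so the sketch assumes the minimum distance from any residue atom to the ligand atom and uses hypothetical shell edges; the function name and toy coordinates are likewise illustrative. The published model presumably also separates counts by residue and atom type, which is omitted here.

```python
import math


def shell_contact_counts(residues, ligand_atoms, shell_edges):
    """Count residue/ligand-atom pairs that fall into each distance shell.

    residues     : list of residues, each a list of (x, y, z) atom coordinates
    ligand_atoms : list of (x, y, z) ligand atom coordinates
    shell_edges  : increasing radii; shell k covers [edges[k], edges[k + 1])
    Assumption: the residue-to-atom distance is the minimum over residue atoms.
    """
    counts = [0] * (len(shell_edges) - 1)
    for residue in residues:
        for ligand_atom in ligand_atoms:
            distance = min(math.dist(atom, ligand_atom) for atom in residue)
            # Find the shell containing this distance; pairs beyond the
            # outermost edge are ignored.
            for k in range(len(counts)):
                if shell_edges[k] <= distance < shell_edges[k + 1]:
                    counts[k] += 1
                    break
    return counts


# Toy complex: two residues and two ligand atoms (coordinates in angstroms).
residues = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(10.0, 0.0, 0.0)],
]
ligand_atoms = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
shell_edges = [0.0, 2.5, 5.0, 10.0]  # hypothetical shell boundaries
counts = shell_contact_counts(residues, ligand_atoms, shell_edges)
```

    The resulting per-shell counts form a fixed-length feature vector, which is the kind of input a convolutional network can consume regardless of the size of the complex.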

    Equivariant Neural Network Force Fields for Magnetic Materials

    Neural network force fields have significantly advanced ab initio atomistic simulations across diverse fields. However, their application in the realm of magnetic materials is still in its early stage due to challenges posed by the subtle magnetic energy landscape and the difficulty of obtaining training data. Here we introduce a data-efficient neural network architecture to represent density functional theory total energy, atomic forces, and magnetic forces as functions of atomic and magnetic structures. Our approach incorporates the principle of equivariance under the three-dimensional Euclidean group into the neural network model. Through systematic experiments on various systems, including monolayer magnets, curved nanotube magnets, and moiré-twisted bilayer magnets of $\text{CrI}_{3}$, we showcase the method's high efficiency and accuracy, as well as exceptional generalization ability. The work creates opportunities for exploring magnetic phenomena in large-scale materials systems. Comment: 10 pages, 4 figure

    Torque Improvement for Modified Double Stator Switched Reluctance Machines

    This study advances the design of double stator switched reluctance machines (DSSRMs) by focusing on mitigating torque ripple to improve efficiency and promote broader application. The research undertakes a comprehensive literature review, establishes a baseline design, and employs iterative enhancements alongside advanced 2D and 3D model simulations using SolidWorks and ANSYS Maxwell software. Significant findings include a torque ripple reduction of up to 9%, an increase in peak torque, and optimised magnetic flux distribution, achieved through adjustments in rotor segment geometry and electromagnetic force balancing methods. The outcomes highlight the critical role of magnetic force analysis, 3D modelling, and dynamic testing in enhancing DSSRM performance, establishing a foundation for future optimisations in design and materials for environmental and operational sustainability.

    Efficient hybrid density functional calculation by deep learning

    Hybrid density functional calculation is indispensable to an accurate description of electronic structure, but its formidable computational cost restricts its broad application. Here we develop a deep equivariant neural network method (named DeepH-hybrid) to learn the hybrid-functional Hamiltonian from self-consistent field calculations of small structures, and apply the trained neural networks for efficient electronic-structure calculation, bypassing the self-consistent iterations. The method is systematically checked to show high efficiency and accuracy, making the study of large-scale materials with hybrid-functional accuracy feasible. As an important application, the DeepH-hybrid method is applied to study large-supercell moiré twisted materials, offering the first case study on how the inclusion of exact exchange affects flat bands in magic-angle twisted bilayer graphene.

    ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation

    Graphical User Interface (GUI) automation holds significant promise for assisting users with complex tasks, thereby boosting human productivity. Existing works leveraging Large Language Models (LLMs) or LLM-based AI agents have shown capabilities in automating tasks on Android and Web platforms. However, these tasks are primarily aimed at simple device usage and entertainment operations. This paper presents a novel benchmark, AssistGUI, to evaluate whether models are capable of manipulating the mouse and keyboard on the Windows platform in response to user-requested tasks. We carefully collected a set of 100 tasks from nine widely used software applications, such as After Effects and MS Word, each accompanied by the necessary project files for better evaluation. Moreover, we propose an advanced Actor-Critic Embodied Agent framework, which incorporates a sophisticated GUI parser driven by an LLM agent and an enhanced reasoning mechanism adept at handling lengthy procedural tasks. Our experimental results reveal that our GUI Parser and Reasoning mechanism outperform existing methods. Nevertheless, substantial room for improvement remains, with the best model attaining only a 46% success rate on our benchmark. We conclude with a thorough analysis of the current methods' limitations, setting the stage for future breakthroughs in this domain. Comment: Project Page: https://showlab.github.io/assistgui