87 research outputs found

    Enhanced bias stress stability of a-InGaZnO thin film transistors by inserting an ultra-thin interfacial InGaZnO:N layer

    Amorphous indium-gallium-zinc oxide (a-IGZO) thin film transistors (TFTs) having an ultra-thin nitrogenated a-IGZO (a-IGZO:N) layer sandwiched at the channel/gate dielectric interface are fabricated. It is found that the device shows enhanced bias stress stability with significantly reduced threshold voltage drift under positive gate bias stress. Based on x-ray photoelectron spectroscopy measurement, the concentration of oxygen vacancies within the a-IGZO:N layer is suppressed due to the formation of N-Ga bonds. Meanwhile, low frequency noise analysis indicates that the average trap density near the channel/dielectric interface continuously drops as the nitrogen content within the a-IGZO:N layer increases. The improved interface quality upon nitrogen doping agrees with the enhanced bias stress stability of the a-IGZO TFTs. This work was supported in part by the State Key Program for Basic Research of China under Grant Nos. 2010CB327504, 2011CB922100, and 2011CB301900; in part by the National Natural Science Foundation of China under Grant Nos. 60936004 and 11104130; in part by the Natural Science Foundation of Jiangsu Province under Grant Nos. BK2011556 and BK2011050; and in part by the Priority Academic Program Development of Jiangsu Higher Education Institutions.

    Learning to Program with Natural Language

    Large Language Models (LLMs) have shown remarkable performance in various basic natural language tasks, which raises hope for achieving Artificial General Intelligence. To complete a complex task, we still need a program for the task first and then ask LLMs to follow the program to generate the specific solution. We propose using natural language as a new programming language to describe task procedures, making them easily understandable to both humans and LLMs. The LLM is capable of directly generating natural language programs, but these programs may still contain factual errors or incomplete steps. Therefore, we further propose the Learning to Program (LP) method, which asks LLMs themselves to first learn the natural language program from the training dataset of the complex task and then use the learned program to guide inference. Our experiments on reasoning tasks of five different reasoning types (8 datasets) demonstrate the effectiveness of our approach. Further, our analysis experiment shows that the learned program can be directly used to guide another LLM to improve its performance, which reveals a new transfer learning paradigm. Comment: Work in progress

    GameEval: Evaluating LLMs on Conversational Games

    The rapid advancements in large language models (LLMs) have presented challenges in evaluating those models. Existing evaluation methods are either reference-based or preference-based, which inevitably need human intervention or introduce test bias caused by evaluator models. In this paper, we propose GameEval, a novel approach to evaluating LLMs through goal-driven conversational games, overcoming the limitations of previous methods. GameEval treats LLMs as game players and assigns them distinct roles with specific goals achieved by launching conversations of various forms, including discussion, question answering, and voting. We design three unique games with cooperative or adversarial objectives, accompanied by corresponding evaluation metrics, to show how this new paradigm comprehensively evaluates model performance. Through extensive experiments, we show that GameEval can effectively differentiate the capabilities of various LLMs, providing a comprehensive assessment of their integrated abilities to solve complex problems. Our public anonymous code is available at https://github.com/GameEval/GameEval

    DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory

    Controllable video generation has gained significant attention in recent years. However, two main limitations persist. First, most existing works focus on only one of text, image, or trajectory-based control, leading to an inability to achieve fine-grained control in videos. Second, trajectory control research is still in its early stages, with most experiments being conducted on simple datasets like Human3.6M. This constraint limits the models' capability to process open-domain images and effectively handle complex curved trajectories. In this paper, we propose DragNUWA, an open-domain diffusion-based video generation model. To tackle the issue of insufficient control granularity in existing works, we simultaneously introduce text, image, and trajectory information to provide fine-grained control over video content from semantic, spatial, and temporal perspectives. To resolve the problem of limited open-domain trajectory control in current research, we propose trajectory modeling with three aspects: a Trajectory Sampler (TS) to enable open-domain control of arbitrary trajectories, a Multiscale Fusion (MF) to control trajectories in different granularities, and an Adaptive Training (AT) strategy to generate consistent videos following trajectories. Our experiments validate the effectiveness of DragNUWA, demonstrating its superior performance in fine-grained control in video generation. The homepage link is https://www.microsoft.com/en-us/research/project/dragnuwa/

    Electrical instability of amorphous indium-gallium-zinc oxide thin film transistors under monochromatic light illumination

    The electrical instability behaviors of a positive-gate-bias-stressed amorphous indium-gallium-zinc oxide (a-IGZO) thin film transistor (TFT) are studied under monochromatic light illumination. It is found that as the wavelength of incident light reduces from 750 nm to 450 nm, the threshold voltage of the illuminated TFT shows a continuous negative shift, which is caused by photo-excitation of trapped electrons at the channel/dielectric interface. Meanwhile, an increase of the sub-threshold swing (SS) is observed when the illumination wavelength is below 625 nm (∼2.0 eV). The SS degradation is accompanied by a simultaneous increase of the field effect mobility (μFE) of the TFT, which then decreases at even shorter wavelengths beyond 540 nm (∼2.3 eV). The variation of SS and μFE is explained by a physical model based on generation of singly ionized oxygen vacancies (Vo⁺) and doubly ionized oxygen vacancies (Vo²⁺) within the a-IGZO active layer by high energy photons, which would form trap states near the mid-gap and the conduction band edge, respectively. This work was supported by the State Key Program for Basic Research of China under Grant Nos. 2010CB327504, 2011CB922100, 2011CB301900; the National Natural Science Foundation of China under Grant Nos. 60825401, 60936004, 11104130, BK2011556, and BK2011050.
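
The wavelength and photon-energy pairs quoted in the abstract above (625 nm at roughly 2.0 eV, 540 nm at roughly 2.3 eV) are consistent with the Planck relation E = hc/λ. The following is a minimal Python sketch of that conversion, added here for illustration only; it is not taken from the paper, and the function name `photon_energy_ev` is a hypothetical helper.

```python
# Illustrative sketch (not from the paper): convert an illumination
# wavelength in nanometres to photon energy in electronvolts via E = h*c/lambda.

H_PLANCK = 6.62607015e-34   # Planck constant, J*s (exact SI value)
C_LIGHT = 299792458.0       # speed of light in vacuum, m/s (exact SI value)
E_CHARGE = 1.602176634e-19  # joules per electronvolt (exact SI value)


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the photon energy in eV for a wavelength given in nm."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return H_PLANCK * C_LIGHT / wavelength_m / E_CHARGE


if __name__ == "__main__":
    # Wavelengths mentioned in the abstract: the 750-450 nm sweep
    # and the 625 nm and 540 nm points where SS and mobility change.
    for wl in (750, 625, 540, 450):
        print(f"{wl} nm -> {photon_energy_ev(wl):.2f} eV")
```

Under these constants, 625 nm evaluates to about 1.98 eV and 540 nm to about 2.30 eV, matching the rounded values given in the abstract.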

    ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning

    Two-Tower Vision-Language (VL) models have shown promising improvements on various downstream VL tasks. Although the most advanced work improves performance by building bridges between encoders, it suffers from ineffective layer-by-layer utilization of uni-modal representations and cannot flexibly exploit different levels of uni-modal semantic knowledge. In this work, we propose ManagerTower, a novel VL model architecture that gathers and combines the insights of pre-trained uni-modal experts at different levels. The managers introduced in each cross-modal layer can adaptively aggregate uni-modal semantic knowledge to facilitate more comprehensive cross-modal alignment and fusion. ManagerTower outperforms previous strong baselines both with and without Vision-Language Pre-training (VLP). With only 4M VLP data, ManagerTower achieves superior performances on various downstream VL tasks, especially 79.15% accuracy on VQAv2 Test-Std, 86.56% IR@1 and 95.64% TR@1 on Flickr30K. Code and checkpoints are available at https://github.com/LooperXX/ManagerTower. Comment: Accepted by ACL 2023 Main Conference, Ora

    TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

    Artificial Intelligence (AI) has made incredible progress recently. On the one hand, advanced foundation models like ChatGPT can offer powerful conversation, in-context learning and code generation abilities on a broad range of open-domain tasks. They can also generate high-level solution outlines for domain-specific tasks based on the common sense knowledge they have acquired. However, they still face difficulties with some specialized tasks, either because they lacked enough domain-specific data during pre-training or because their neural network computations often make errors on tasks that need accurate execution. On the other hand, there are also many existing models and systems (symbolic-based or neural-based) that can do some domain-specific tasks very well. However, due to their different implementations or working mechanisms, they are not easily accessible to or compatible with foundation models. Therefore, there is a clear and pressing need for a mechanism that can leverage foundation models to propose task solution outlines and then automatically match some of the sub-tasks in the outlines to the off-the-shelf models and systems with special functionalities to complete them. Inspired by this, we introduce TaskMatrix.AI as a new AI ecosystem that connects foundation models with millions of APIs for task completion. Unlike most previous work that aimed to improve a single AI model, TaskMatrix.AI focuses more on using existing foundation models (as a brain-like central system) and APIs of other AI models and systems (as sub-task solvers) to achieve diversified tasks in both digital and physical domains. As a position paper, we will present our vision of how to build such an ecosystem, explain each key component, and use case studies to illustrate both the feasibility of this vision and the main challenges we need to address next.