537 research outputs found

    Ultrafast quantum state tomography with feed-forward neural networks

    Reconstructing the state of many-body quantum systems is of fundamental importance in quantum information tasks, but extremely challenging due to the curse of dimensionality. In this work, we present a quantum tomography approach based on neural networks to achieve the ultrafast reconstruction of multi-qubit states. In particular, we propose a simple 3-layer feed-forward network to process the experimental data generated from measuring each qubit with a positive operator-valued measure, which reduces the storage cost and computational complexity. Moreover, the techniques of state decomposition and P-order absolute projection are jointly introduced to ensure the positivity of state matrices learned in the maximum likelihood function and to improve the convergence speed and robustness of the above network. Finally, the method is tested on a large number of states with a wide range of purity to show that it can faithfully perform tomography on 11-qubit states on a laptop within 2 minutes under noise. Our numerical results also demonstrate that more state samples are required to achieve a given tomography fidelity for low-purity states, and that increased depolarizing noise induces a linear decrease in the tomography fidelity.
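
    A linear dependence of fidelity on depolarizing noise also appears in a simple textbook calculation, sketched here under assumptions that are not taken from the abstract: a pure target state |psi> and a global depolarizing channel rho' = (1 - p) |psi><psi| + p I/d with d = 2^n. In that case the fidelity <psi|rho'|psi> equals 1 - p (1 - 1/d), which is linear in p. The function name `depolarized_fidelity` is illustrative only, and this is not the paper's reconstruction pipeline.

```python
def depolarized_fidelity(p: float, n_qubits: int) -> float:
    """Fidelity <psi|rho'|psi> of a pure state after global depolarizing noise.

    Assumes rho' = (1 - p) * |psi><psi| + p * I / d with d = 2 ** n_qubits.
    This noise model is an illustrative assumption, not the paper's setup.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n_qubits < 1:
        raise ValueError("n_qubits must be at least 1")
    d = 2 ** n_qubits
    # <psi|rho'|psi> = (1 - p) * 1 + p * (1 / d)
    return 1.0 - p * (1.0 - 1.0 / d)


if __name__ == "__main__":
    # Equal steps in p give equal drops in fidelity (a straight line).
    for p in (0.0, 0.1, 0.2, 0.3):
        print(f"p={p:.1f}  fidelity={depolarized_fidelity(p, n_qubits=11):.4f}")
```

    For 11 qubits the slope is very close to -1, because 1/d = 1/2048 is small.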

    Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization

    Temporal action localization (TAL) requires long-form reasoning to predict actions of various lengths and complex content. Given limited GPU memory, training TAL end-to-end on such long-form videos (i.e., from videos to predictions) is a significant challenge. Most methods can only train on pre-extracted features without optimizing them for the localization problem, consequently limiting localization performance. In this work, to extend the potential in TAL networks, we propose a novel end-to-end method, Re2TAL, which rewires pretrained video backbones for reversible TAL. Re2TAL builds a backbone with reversible modules, where the input can be recovered from the output such that the bulky intermediate activations can be cleared from memory during training. Instead of designing one single type of reversible module, we propose a network rewiring mechanism to transform any module with a residual connection into a reversible module without changing any parameters. This provides two benefits: (1) a large variety of reversible networks are easily obtained from existing and even future model designs, and (2) the reversible models require much less training effort as they reuse the pre-trained parameters of their original non-reversible versions. Re2TAL reaches 37.01% average mAP, a new state-of-the-art record on ActivityNet-v1.3, and mAP 64.9% at tIoU=0.5 on THUMOS-14 without using optical flow.
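
    The idea that a reversible module lets the input be recovered from the output can be illustrated with the generic additive-coupling block used in reversible residual networks. This is a minimal sketch of that general technique, not the rewiring mechanism of Re2TAL itself; the branch functions `f` and `g` and all helper names are hypothetical stand-ins for residual sub-modules.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]
Branch = Callable[[Vector], Vector]


def add(a: Vector, b: Vector) -> Vector:
    return [x + y for x, y in zip(a, b)]


def sub(a: Vector, b: Vector) -> Vector:
    return [x - y for x, y in zip(a, b)]


def reversible_forward(x1: Vector, x2: Vector, f: Branch, g: Branch) -> Tuple[Vector, Vector]:
    """Additive coupling: y1 = x1 + f(x2), y2 = x2 + g(y1)."""
    y1 = add(x1, f(x2))
    y2 = add(x2, g(y1))
    return y1, y2


def reversible_inverse(y1: Vector, y2: Vector, f: Branch, g: Branch) -> Tuple[Vector, Vector]:
    """Recover the inputs from the outputs; f and g need not be invertible."""
    x2 = sub(y2, g(y1))
    x1 = sub(y1, f(x2))
    return x1, x2


def f(v: Vector) -> Vector:
    # Hypothetical residual branch (stands in for e.g. an attention block).
    return [math.tanh(t) for t in v]


def g(v: Vector) -> Vector:
    # Hypothetical residual branch (stands in for e.g. an MLP block).
    return [0.5 * t * t for t in v]


if __name__ == "__main__":
    x1, x2 = [0.1, -0.4, 2.0], [1.5, 0.0, -0.7]
    y1, y2 = reversible_forward(x1, x2, f, g)
    r1, r2 = reversible_inverse(y1, y2, f, g)
    print(r1, r2)  # matches x1, x2 up to floating-point rounding
```

    Because the inputs can be recomputed from the outputs during the backward pass, intermediate activations do not have to be kept in memory, at the cost of extra computation.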

    Integrating Spatial Data Linkage and Analysis Services in a Geoportal for China Urban Research

    Many geoportals are now evolving into online analytical environments, where large amounts of data and various analysis methods are integrated. These spatiotemporal data are often distributed in different databases and exist in heterogeneous forms, even when they refer to the same geospatial entities. Besides, existing open standards lack sufficient expression of the attribute semantics. Client applications or other services thus have to deal with unrelated preprocessing tasks, such as data transformation and attribute annotation, leading to potential inconsistencies. Furthermore, to build informative interfaces that guide users to quickly understand the analysis methods, an analysis service needs to explicitly model the method parameters, which are often interrelated and have rich auxiliary information. This work presents the design of the spatial data linkage and analysis services in a geoportal for China urban research. The spatial data linkage service aggregates multisource heterogeneous data into linked layers with flexible attribute mapping, providing client applications and services with unified access, as if querying one big table. The spatial analysis service incorporates parameter hierarchy and grouping by extending the standard WPS service, and data-dependent validation in computation components. This platform can help researchers efficiently explore and analyze spatiotemporal data online. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/110740/1/tgis12084.pd

    On the Pareto Front of Multilingual Neural Machine Translation

    In this work, we study how the performance of a given direction changes with its sampling ratio in Multilingual Neural Machine Translation (MNMT). By training over 200 multilingual models with various model sizes, data sizes, and language directions, we find that the performance of a given translation direction does not always improve as its weight in the multi-task optimization objective increases. Accordingly, the scalarization method leads to a multitask trade-off front that deviates from the traditional Pareto front when there is data imbalance in the training corpus, which poses a great challenge to improving the overall performance of all directions. Based on our observations, we propose the Double Power Law to predict the unique performance trade-off front in MNMT, which is robust across various languages, data adequacy, and the number of tasks. Finally, we formulate the sample ratio selection problem in MNMT as an optimization problem based on the Double Power Law. In our experiments, it achieves better performance than temperature searching and gradient manipulation methods with only 1/5 to 1/2 of the total training budget. We release the code at https://github.com/pkunlp-icler/ParetoMNMT for reproduction. Comment: NeurIPS 202
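
    For context, the temperature-based baseline mentioned above is commonly implemented by sampling each direction in proportion to its data size raised to the power 1/T. The sketch below shows that standard heuristic only; it is not the Double Power Law, whose functional form is not given in the abstract. The function name and the example corpus sizes are hypothetical.

```python
from typing import Dict


def temperature_sampling_ratios(sizes: Dict[str, int], temperature: float) -> Dict[str, float]:
    """Sampling ratio per direction: p_i proportional to n_i ** (1 / T).

    T = 1 reproduces the raw data proportions; larger T flattens the
    distribution toward uniform, up-weighting low-resource directions.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if any(n <= 0 for n in sizes.values()):
        raise ValueError("all corpus sizes must be positive")
    weights = {name: n ** (1.0 / temperature) for name, n in sizes.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical, imbalanced corpus sizes (sentence pairs per direction).
    corpus = {"en-fr": 900_000, "en-is": 100_000}
    for t in (1.0, 5.0, 100.0):
        print(t, temperature_sampling_ratios(corpus, t))
```

    Temperature search tunes the single scalar T, whereas the abstract describes choosing the sampling ratios by solving an optimization problem.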

    TeGit: Generating High-Quality Instruction-Tuning Data with Text-Grounded Task Design

    High-quality instruction-tuning data is critical to improving LLM capabilities. Existing data collection methods are limited by unrealistic manual labeling costs or by the hallucination of relying solely on LLM generation. To address these problems, this paper presents a scalable method to automatically collect high-quality instructional adaptation data by training language models to automatically design tasks based on human-written texts. Intuitively, human-written text helps the model attenuate hallucinations during the generation of tasks. Unlike instruction back-translation-based methods that directly take the given text as a response, we require the model to generate the instruction, input, and output simultaneously to filter the noise. The results of the automated and manual evaluation experiments demonstrate the quality of our dataset. Comment: Work in progress

    Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: A Preliminary Empirical Study

    Evaluating the quality of generated text is a challenging task in natural language processing. This difficulty arises from the inherent complexity and diversity of text. Recently, OpenAI's ChatGPT, a powerful large language model (LLM), has garnered significant attention due to its impressive performance in various tasks. Therefore, we present this report to investigate the effectiveness of LLMs, especially ChatGPT, and explore ways to optimize their use in assessing text quality. We compared three kinds of reference-free evaluation methods based on ChatGPT or similar LLMs. The experimental results show that ChatGPT is capable of evaluating text quality effectively from various perspectives without reference and outperforms most existing automatic metrics. In particular, the Explicit Score, which utilizes ChatGPT to generate a numeric score measuring text quality, is the most effective and reliable method among the three explored approaches. However, directly comparing the quality of two texts using ChatGPT may lead to suboptimal results. We hope this report will provide valuable insights into selecting appropriate methods for evaluating text quality with LLMs such as ChatGPT. Comment: Technical Report, 13 pages

    ETAD: Training Action Detection End to End on a Laptop

    Temporal action detection (TAD) with end-to-end training often demands huge computing resources due to long video duration. In this work, we propose an efficient temporal action detector (ETAD) that can train directly from video frames with extremely low GPU memory consumption. Our main idea is to minimize and balance the heavy computation among features and gradients in each training iteration. We propose to sequentially forward the snippet frames through the video encoder, and to backpropagate only a small necessary portion of gradients to update the encoder. To further alleviate the computational redundancy in training, we propose to dynamically sample only a small subset of proposals during training. Moreover, various sampling strategies and ratios are studied for both the encoder and detector. ETAD achieves state-of-the-art performance on TAD benchmarks with remarkable efficiency. On ActivityNet-1.3, training ETAD in 18 hours can reach 38.25% average mAP with only 1.3 GB memory consumption per video under end-to-end training. Our code will be publicly released.