
    Ultrafast quantum state tomography with feed-forward neural networks

    Reconstructing the state of many-body quantum systems is of fundamental importance in quantum information tasks but extremely challenging due to the curse of dimensionality. In this work, we present a quantum tomography approach based on neural networks to achieve ultrafast reconstruction of multi-qubit states. In particular, we propose a simple 3-layer feed-forward network to process the experimental data generated by measuring each qubit with a positive operator-valued measure (POVM), which reduces both the storage cost and the computational complexity. Moreover, the techniques of state decomposition and P-order absolute projection are jointly introduced to ensure the positivity of the state matrices learned via the maximum likelihood function and to improve the convergence speed and robustness of the network. Finally, the approach is tested on a large number of states with a wide range of purities, showing that 11-qubit states can be faithfully reconstructed on a laptop within 2 minutes under noise. Our numerical results also demonstrate that low-purity states require more state samples to reach a given tomography fidelity, and that increasing depolarizing noise induces a linear decrease in the tomography fidelity.
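
    The abstract does not specify the network's layer sizes, the POVM, or the exact form of the positivity projection, so the following is only a minimal sketch of the general idea: a 3-layer MLP (hypothetical widths, assuming a 4-outcome POVM per qubit) maps measured outcome frequencies to a complex factor T, and the density matrix is formed as T T† / tr(T T†), which is positive semidefinite and trace-one by construction (a Cholesky-style state decomposition rather than the paper's P-order absolute projection).

    ```python
    import torch
    import torch.nn as nn

    class TomographyMLP(nn.Module):
        """Hypothetical 3-layer feed-forward network mapping per-qubit POVM
        outcome frequencies to a factor T, with rho = T T^dag / tr(T T^dag),
        which is positive semidefinite and trace-one by construction."""
        def __init__(self, n_qubits: int, n_outcomes: int = 4, hidden: int = 256):
            super().__init__()
            self.d = 2 ** n_qubits                       # Hilbert-space dimension
            self.net = nn.Sequential(
                nn.Linear(n_qubits * n_outcomes, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 2 * self.d * self.d),  # real + imaginary parts of T
            )

        def forward(self, freqs: torch.Tensor) -> torch.Tensor:
            re, im = self.net(freqs).chunk(2, dim=-1)
            T = torch.complex(re, im).reshape(-1, self.d, self.d)
            rho = T @ T.mH                               # positive semidefinite by construction
            tr = torch.diagonal(rho, dim1=-2, dim2=-1).sum(-1).real
            return rho / tr.view(-1, 1, 1)               # trace-one density matrices

    net = TomographyMLP(n_qubits=3)
    freqs = torch.rand(8, 3 * 4)      # batch of 8 outcome-frequency vectors
    rho = net(freqs)                  # (8, 8, 8) valid density matrices
    ```

    Note that outputting all d^2 entries of T scales poorly with qubit number; the paper's per-qubit measurement processing and state decomposition are aimed precisely at reducing this storage and computational cost.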

    Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization

    Temporal action localization (TAL) requires long-form reasoning to predict actions of various lengths with complex content. Given limited GPU memory, training TAL end-to-end (i.e., from videos to predictions) on such long-form videos is a significant challenge. Most methods can only train on pre-extracted features without optimizing them for the localization problem, consequently limiting localization performance. In this work, to extend the potential of TAL networks, we propose a novel end-to-end method, Re2TAL, which rewires pretrained video backbones for reversible TAL. Re2TAL builds a backbone with reversible modules, where the input can be recovered from the output, so the bulky intermediate activations can be cleared from memory during training. Instead of designing a single type of reversible module, we propose a network rewiring mechanism that transforms any module with a residual connection into a reversible module without changing any parameters. This provides two benefits: (1) a large variety of reversible networks can easily be obtained from existing and even future model designs, and (2) the reversible models require much less training effort because they reuse the pre-trained parameters of their original non-reversible versions. Re2TAL reaches 37.01% average mAP, a new state-of-the-art record on ActivityNet-v1.3, and 64.9% mAP at tIoU=0.5 on THUMOS-14, without using optical flow.
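
    The abstract describes the rewiring only at a high level; the sketch below assumes the standard two-stream reversible-residual coupling (as in RevNets), in which the input is split into two halves that update each other through residual branches F and G, so inputs can be recomputed exactly from outputs and intermediate activations need not be stored. All names here are hypothetical.

    ```python
    import torch
    import torch.nn as nn

    class ReversibleBlock(nn.Module):
        """Two-stream reversible coupling built from two residual branches
        F and G (e.g., taken unchanged from a pretrained residual block)."""
        def __init__(self, F: nn.Module, G: nn.Module):
            super().__init__()
            self.F, self.G = F, G

        def forward(self, x1, x2):
            y1 = x1 + self.F(x2)
            y2 = x2 + self.G(y1)
            return y1, y2

        def inverse(self, y1, y2):
            # Recover the inputs exactly, so activations can be cleared
            # from memory and recomputed during the backward pass.
            x2 = y2 - self.G(y1)
            x1 = y1 - self.F(x2)
            return x1, x2

    # Usage: wrap two small residual branches and check invertibility.
    F = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
    G = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
    block = ReversibleBlock(F, G)
    x1, x2 = torch.randn(2, 64), torch.randn(2, 64)
    y1, y2 = block(x1, x2)
    r1, r2 = block.inverse(y1, y2)
    assert torch.allclose(r1, x1, atol=1e-6) and torch.allclose(r2, x2, atol=1e-6)
    ```

    Because the coupling uses F and G unchanged, a pretrained residual branch can serve directly as F or G, which is the spirit of reusing pretrained parameters described above.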

    Integrating Spatial Data Linkage and Analysis Services in a Geoportal for China Urban Research

    Many geoportals are now evolving into online analytical environments in which large amounts of data and various analysis methods are integrated. These spatiotemporal data are often distributed across different databases and exist in heterogeneous forms, even when they refer to the same geospatial entities. Besides, existing open standards lack sufficient expression of attribute semantics. Client applications or other services thus have to handle unrelated preprocessing tasks, such as data transformation and attribute annotation, leading to potential inconsistencies. Furthermore, to build informative interfaces that guide users to quickly understand the analysis methods, an analysis service needs to explicitly model the method parameters, which are often interrelated and carry rich auxiliary information. This work presents the design of the spatial data linkage and analysis services in a geoportal for China urban research. The spatial data linkage service aggregates multisource heterogeneous data into linked layers with flexible attribute mapping, providing client applications and services with unified access as if they were querying one big table. The spatial analysis service incorporates parameter hierarchy and grouping by extending the standard WPS service, along with data-dependent validation in its computation components. This platform can help researchers efficiently explore and analyze spatiotemporal data online.
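
    As a rough illustration of the linked-layer idea, assuming two hypothetical attribute sources keyed by the same entity identifiers, the sketch below exposes them through an attribute mapping so that clients can query a single virtual table; all field names and values are invented.

    ```python
    # Two heterogeneous sources describing the same geospatial entities,
    # with source-specific attribute names (all data here is invented).
    census = {"110101": {"rk": 1024000},        # population, local field name
              "110102": {"rk": 635000}}
    landuse = {"110101": {"green_pct": 0.31},
               "110102": {"green_pct": 0.24}}

    # Attribute mapping: unified attribute name -> (source, source field).
    mapping = {"population": (census, "rk"),
               "green_ratio": (landuse, "green_pct")}

    def query(entity_id: str, attrs: list[str]) -> dict:
        """Query the linked layer as if it were one big table."""
        return {a: mapping[a][0][entity_id][mapping[a][1]] for a in attrs}

    print(query("110101", ["population", "green_ratio"]))
    # {'population': 1024000, 'green_ratio': 0.31}
    ```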

    On the Pareto Front of Multilingual Neural Machine Translation

    In this work, we study how the performance of a given direction changes with its sampling ratio in Multilingual Neural Machine Translation (MNMT). By training over 200 multilingual models with various model sizes, data sizes, and language directions, we find, interestingly, that the performance of a given translation direction does not always improve as its weight in the multi-task optimization objective increases. Accordingly, the scalarization method leads to a multitask trade-off front that deviates from the traditional Pareto front when there is data imbalance in the training corpus, which poses a great challenge to improving the overall performance of all directions. Based on our observations, we propose the Double Power Law to predict the unique performance trade-off front in MNMT, which is robust across various languages, data adequacies, and numbers of tasks. Finally, we formulate the sampling ratio selection problem in MNMT as an optimization problem based on the Double Power Law. In our experiments, it achieves better performance than temperature searching and gradient manipulation methods with only 1/5 to 1/2 of the total training budget. We release the code at https://github.com/pkunlp-icler/ParetoMNMT for reproduction. Comment: NeurIPS 2023.
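
    The abstract does not state the functional form of the Double Power Law, so the sketch below assumes a generic two-term power law in the sampling ratio purely for illustration; the paper's actual law and fitting procedure may well differ, and the data points are toy numbers.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def double_power_law(p, c, a, alpha, b, beta):
        """Hypothetical two-term power law in the sampling ratio p of one
        translation direction; the paper's exact form may differ."""
        return c + a * p ** (-alpha) + b * (1.0 - p) ** (-beta)

    # Toy observations: validation loss of one direction at several ratios.
    p_obs = np.array([0.1, 0.2, 0.3, 0.5, 0.7, 0.9])
    loss_obs = np.array([3.10, 2.60, 2.40, 2.35, 2.45, 2.80])

    params, _ = curve_fit(double_power_law, p_obs, loss_obs,
                          p0=[2.0, 0.1, 0.5, 0.1, 0.5], maxfev=20000)

    # Pick the sampling ratio that minimizes the fitted trade-off curve,
    # replacing a grid/temperature search over training runs.
    grid = np.linspace(0.05, 0.95, 181)
    best_p = grid[np.argmin(double_power_law(grid, *params))]
    print(f"predicted best sampling ratio: {best_p:.2f}")
    ```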

    TeGit: Generating High-Quality Instruction-Tuning Data with Text-Grounded Task Design

    High-quality instruction-tuning data is critical to improving LLM capabilities. Existing data collection methods are limited either by unrealistically high manual labeling costs or by the hallucinations of relying solely on LLM generation. To address these problems, this paper presents a scalable method for automatically collecting high-quality instruction-tuning data by training language models to design tasks based on human-written texts. Intuitively, grounding task generation in human-written text helps the model attenuate hallucinations. Unlike instruction back-translation-based methods that directly take the given text as a response, we require the model to generate the instruction, input, and output simultaneously in order to filter out noise. The results of automated and manual evaluation experiments demonstrate the quality of our dataset. Comment: Work in progress.
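
    A minimal sketch of the text-grounded pipeline, assuming a generic llm() callable and a JSON output format (both hypothetical): the model is asked to produce the instruction, input, and output together from a human-written passage, and samples that fail to yield all three fields are filtered out as noise. The requirement that all fields be non-empty is an assumption for illustration.

    ```python
    import json

    def llm(prompt: str) -> str:
        """Stand-in for any instruction-following LLM call (hypothetical)."""
        raise NotImplementedError

    PROMPT = """Given the human-written text below, design one task grounded in it.
    Return JSON with keys "instruction", "input", and "output".

    Text:
    {text}
    """

    def design_task(text: str) -> dict | None:
        """Generate instruction/input/output together, then filter noise:
        discard samples that are malformed or have an empty field."""
        try:
            sample = json.loads(llm(PROMPT.format(text=text)))
        except json.JSONDecodeError:
            return None
        keys = ("instruction", "input", "output")
        if not all(str(sample.get(k, "")).strip() for k in keys):
            return None  # all three fields must be generated, otherwise discard
        return sample
    ```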

    ETAD: Training Action Detection End to End on a Laptop

    Temporal action detection (TAD) with end-to-end training often suffers from a huge demand for computing resources due to long video durations. In this work, we propose an efficient temporal action detector (ETAD) that can train directly from video frames with extremely low GPU memory consumption. Our main idea is to minimize and balance the heavy computation among features and gradients in each training iteration. We propose to sequentially forward the snippet frames through the video encoder and back-propagate only a small, necessary portion of the gradients to update the encoder. To further alleviate computational redundancy in training, we propose to dynamically sample only a small subset of proposals during training. Moreover, various sampling strategies and ratios are studied for both the encoder and the detector. ETAD achieves state-of-the-art performance on TAD benchmarks with remarkable efficiency. On ActivityNet-1.3, ETAD can be trained in 18 hours to reach 38.25% average mAP with only 1.3 GB of memory consumption per video under end-to-end training. Our code will be publicly released.
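
    The sketch below illustrates the sequential-forward idea under stated assumptions: snippets are encoded one at a time, and gradients are kept for only a randomly sampled subset, so activation memory stays roughly constant in the number of snippets. The encoder, tensor shapes, and sampling ratio are all hypothetical, not the paper's configuration.

    ```python
    import random
    import torch
    import torch.nn as nn

    def encode_snippets(encoder: nn.Module, snippets: list[torch.Tensor],
                        grad_ratio: float = 0.3) -> torch.Tensor:
        """Forward snippets one at a time; keep gradients for only a random
        subset so activation memory does not grow with video length."""
        k = max(1, int(grad_ratio * len(snippets)))
        with_grad = set(random.sample(range(len(snippets)), k))
        feats = []
        for i, clip in enumerate(snippets):
            if i in with_grad:
                feats.append(encoder(clip))      # activations kept for backward
            else:
                with torch.no_grad():
                    feats.append(encoder(clip))  # no activations stored
        return torch.stack(feats, dim=1)         # (batch, num_snippets, feat_dim)

    # Usage with a toy encoder and 100 snippets of shape (batch, in_dim).
    enc = nn.Sequential(nn.Linear(512, 256), nn.ReLU())
    video = [torch.randn(2, 512) for _ in range(100)]
    features = encode_snippets(enc, video)       # (2, 100, 256)
    features.sum().backward()                    # gradients flow through ~30 snippets
    ```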

    Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: A Preliminary Empirical Study

    Evaluating the quality of generated text is a challenging task in natural language processing, a difficulty that arises from the inherent complexity and diversity of text. Recently, OpenAI's ChatGPT, a powerful large language model (LLM), has garnered significant attention due to its impressive performance on various tasks. We therefore present this report to investigate the effectiveness of LLMs, especially ChatGPT, and to explore ways to optimize their use in assessing text quality. We compared three kinds of reference-free evaluation methods based on ChatGPT or similar LLMs. The experimental results show that ChatGPT is capable of evaluating text quality effectively from various perspectives without references and demonstrates superior performance to most existing automatic metrics. In particular, the Explicit Score, which uses ChatGPT to generate a numeric score measuring text quality, is the most effective and reliable of the three approaches explored. However, directly comparing the quality of two texts using ChatGPT may lead to suboptimal results. We hope this report will provide valuable insights into selecting appropriate methods for evaluating text quality with LLMs such as ChatGPT. Comment: Technical Report, 13 pages.
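
    A minimal sketch of the Explicit Score idea, assuming a generic llm() callable and an invented prompt with a 0-100 scale (the paper's exact prompt and scale are not given in the abstract): the model is asked for a numeric quality score and the first number in its reply is parsed out.

    ```python
    import re

    def llm(prompt: str) -> str:
        """Stand-in for a ChatGPT-style completion call (hypothetical)."""
        raise NotImplementedError

    def explicit_score(source: str, candidate: str) -> float | None:
        """Reference-free 'Explicit Score': ask the model for a numeric
        quality score and parse the first number from its reply."""
        prompt = (
            "Score the quality of the following generated text from 0 to 100, "
            "considering fluency, coherence, and faithfulness to the source. "
            "Reply with the number only.\n\n"
            f"Source:\n{source}\n\nGenerated text:\n{candidate}\n"
        )
        reply = llm(prompt)
        match = re.search(r"\d+(?:\.\d+)?", reply)
        return float(match.group()) if match else None
    ```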