Ultrafast quantum state tomography with feed-forward neural networks
Reconstructing the state of many-body quantum systems is of fundamental
importance in quantum information tasks, but extremely challenging due to the
curse of dimensionality. In this work, we present a quantum tomography approach
based on neural networks to achieve the ultrafast reconstruction of multi-qubit
states. In particular, we propose a simple 3-layer feed-forward network to
process the experimental data generated by measuring each qubit with a
positive operator-valued measure, which reduces both the storage cost and the
computational complexity. Moreover, the techniques of state decomposition and
-order absolute projection are jointly introduced to ensure the positivity
of the state matrices learned in the maximum likelihood function and to
improve the convergence speed and robustness of the network. Finally, the
method is tested on a large number of states spanning a wide range of
purities, showing that 11-qubit states can be faithfully reconstructed on a
laptop within 2 minutes in the presence of noise. Our numerical results also
demonstrate that more state samples are required to reach a given tomography
fidelity for low-purity states, and that increasing depolarizing noise
induces a linear decrease in the tomography fidelity.
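A minimal sketch of the kind of mapping such a network could implement, assuming a 4-outcome POVM per qubit and a Cholesky-style parametrization to enforce positivity; the layer sizes, dimensions, and parametrization here are illustrative, not the authors' exact architecture:

```python
import torch
import torch.nn as nn

n_qubits = 3                       # illustrative system size
n_outcomes = 4                     # e.g. a 4-outcome POVM per qubit
dim = 2 ** n_qubits                # Hilbert-space dimension
n_params = dim * dim               # real parameters of a lower-triangular T

# 3-layer feed-forward network: per-qubit outcome frequencies -> parameters of T
net = nn.Sequential(
    nn.Linear(n_qubits * n_outcomes, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, n_params),
)

def params_to_state(p: torch.Tensor) -> torch.Tensor:
    """Map network outputs to a positive, unit-trace density matrix rho = T T^dag / tr."""
    d = dim
    off = d * (d - 1) // 2
    diag, re, im = p[:d], p[d:d + off], p[d + off:d + 2 * off]
    T = torch.zeros(d, d, dtype=torch.cfloat)
    T[torch.arange(d), torch.arange(d)] = diag.to(torch.cfloat)
    rows, cols = torch.tril_indices(d, d, offset=-1)
    T[rows, cols] = torch.complex(re, im)
    rho = T @ T.conj().T
    return rho / rho.diagonal().sum().real

freqs = torch.rand(n_qubits * n_outcomes)    # placeholder measurement frequencies
rho = params_to_state(net(freqs))
print(rho.shape, rho.diagonal().sum().real)  # 8x8 matrix with unit trace
```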
Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization
Temporal action localization (TAL) requires long-form reasoning to predict
actions of various lengths and complex content. Given limited GPU memory,
training TAL end-to-end on such long-form videos (i.e., from videos to
predictions) is a significant challenge. Most methods can only train on
pre-extracted features without optimizing them for the localization problem,
consequently limiting localization performance. In this work, to unlock the
potential of TAL networks, we propose Re2TAL, a novel end-to-end method that
rewires pretrained video backbones for reversible TAL. Re2TAL builds a backbone
with reversible modules, where the input can be recovered from the output such
that the bulky intermediate activations can be cleared from memory during
training. Instead of designing one single type of reversible module, we propose
a network rewiring mechanism, to transform any module with a residual
connection to a reversible module without changing any parameters. This
provides two benefits: (1) a large variety of reversible networks are easily
obtained from existing and even future model designs, and (2) the reversible
models require much less training effort as they reuse the pre-trained
parameters of their original non-reversible versions. Re2TAL reaches 37.01%
average mAP, a new state-of-the-art record on ActivityNet-v1.3, and 64.9% mAP
at tIoU=0.5 on THUMOS-14 without using optical flow.
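A minimal sketch of the RevNet-style coupling behind this idea, assuming two residual branches f and g taken from an existing block; the exact rewiring used in Re2TAL may differ in detail:

```python
import torch
import torch.nn as nn

class ReversibleBlock(nn.Module):
    """y1 = x1 + f(x2); y2 = x2 + g(y1). Inputs can be recovered exactly from
    the outputs, so intermediate activations need not be stored for training."""
    def __init__(self, f: nn.Module, g: nn.Module):
        super().__init__()
        self.f, self.g = f, g

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2

# Example: wrap two small residual branches and check invertibility.
f = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
g = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
block = ReversibleBlock(f, g)
x1, x2 = torch.randn(2, 64), torch.randn(2, 64)
y1, y2 = block(x1, x2)
r1, r2 = block.inverse(y1, y2)
assert torch.allclose(r1, x1, atol=1e-5) and torch.allclose(r2, x2, atol=1e-5)
```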
Integrating Spatial Data Linkage and Analysis Services in a Geoportal for China Urban Research
Many geoportals are now evolving into online analytical environments, where large amounts of data and various analysis methods are integrated. These spatiotemporal data are often distributed in different databases and exist in heterogeneous forms, even when they refer to the same geospatial entities. Besides, existing open standards lack sufficient expression of the attribute semantics. Client applications or other services thus have to deal with unrelated preprocessing tasks, such as data transformation and attribute annotation, leading to potential inconsistencies. Furthermore, to build informative interfaces that guide users to quickly understand the analysis methods, an analysis service needs to explicitly model the method parameters, which are often interrelated and have rich auxiliary information. This work presents the design of the spatial data linkage and analysis services in a geoportal for China urban research. The spatial data linkage service aggregates multisource heterogeneous data into linked layers with flexible attribute mapping, providing client applications and services with unified access as if querying a big table. The spatial analysis service incorporates parameter hierarchy and grouping by extending the standard WPS service, and data-dependent validation in computation components. This platform can help researchers efficiently explore and analyze spatiotemporal data online.
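A toy illustration of the linked-layer idea, using hypothetical field names; the actual service is not pandas-based, so this only shows an explicit attribute mapping followed by a join that clients can query as one wide table:

```python
import pandas as pd

# Two sources describing the same districts with different column conventions.
census = pd.DataFrame({"dist_code": ["110101", "110102"], "pop_2020": [1_010_000, 812_000]})
housing = pd.DataFrame({"code": ["110101", "110102"], "avg_price": [95_000, 87_000]})

# Attribute mapping: source column -> canonical attribute of the linked layer.
mapping = {
    "census": {"dist_code": "district_id", "pop_2020": "population"},
    "housing": {"code": "district_id", "avg_price": "housing_price"},
}

def build_linked_layer(sources: dict) -> pd.DataFrame:
    """Rename each source to canonical attributes and join on the shared key."""
    renamed = [df.rename(columns=mapping[name]) for name, df in sources.items()]
    layer = renamed[0]
    for df in renamed[1:]:
        layer = layer.merge(df, on="district_id", how="outer")
    return layer

print(build_linked_layer({"census": census, "housing": housing}))
```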
On the Pareto Front of Multilingual Neural Machine Translation
In this work, we study how the performance of a given direction changes with
its sampling ratio in Multilingual Neural Machine Translation (MNMT). By
training over 200 multilingual models with various model sizes, data sizes, and
language directions, we find, interestingly, that the performance of a given
translation direction does not always improve as its weight in the multi-task
optimization objective increases. Accordingly, the scalarization method leads
to a multi-task trade-off front that deviates from the traditional Pareto
front when there is data imbalance in the training corpus, which poses a great
challenge to improving the overall performance of all directions. Based on
our observations, we propose the Double Power Law to predict the unique
performance trade-off front in MNMT, which is robust across languages, data
adequacy levels, and numbers of tasks. Finally, we formulate the sampling
ratio selection problem in MNMT as an optimization problem based on the
Double Power Law. In our experiments, it achieves better performance than
temperature searching and gradient manipulation methods with only 1/5 to 1/2
of the total training budget. We release the code at
https://github.com/pkunlp-icler/ParetoMNMT for reproduction.
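An illustrative sketch of casting sampling-ratio selection as an optimization problem over the simplex; the abstract does not give the functional form of the Double Power Law, so the per-direction loss model and coefficients below are hypothetical placeholders rather than the paper's law:

```python
import numpy as np
from scipy.optimize import minimize

def predicted_loss(p, a, b, c):
    """Hypothetical per-direction loss model in the sampling ratio p."""
    return a * p ** (-b) + c           # placeholder, not the paper's formula

# Hypothetical fitted coefficients for three translation directions.
coeffs = [(0.5, 0.3, 1.0), (0.8, 0.25, 1.2), (0.3, 0.4, 0.9)]

def objective(p):
    # Aggregate predicted loss across all directions.
    return sum(predicted_loss(pi, *c) for pi, c in zip(p, coeffs))

n = len(coeffs)
res = minimize(
    objective,
    x0=np.full(n, 1.0 / n),                              # start from uniform ratios
    bounds=[(1e-3, 1.0)] * n,
    constraints=[{"type": "eq", "fun": lambda p: p.sum() - 1.0}],  # ratios sum to 1
    method="SLSQP",
)
print("chosen sampling ratios:", np.round(res.x, 3))
```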
TeGit: Generating High-Quality Instruction-Tuning Data with Text-Grounded Task Design
High-quality instruction-tuning data is critical to improving LLM
capabilities. Existing data collection methods are limited either by
unrealistic manual labeling costs or by the hallucinations that arise from
relying solely on LLM generation. To address these problems, this paper
presents a scalable method to automatically collect high-quality
instruction-tuning data by training language models to design tasks based on
human-written texts. Intuitively, grounding in human-written text helps the
model reduce hallucinations while generating tasks. Unlike instruction
back-translation-based methods that directly take the given text as a
response, we require the model to generate the \textit{instruction},
\textit{input}, and \textit{output} simultaneously to filter out noise. The
results of automated and manual evaluation experiments demonstrate the
quality of our dataset.
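A minimal sketch of the text-grounded task-design and filtering step; `call_model` and the prompt wording are stand-ins for the trained task-design model (the stub returns a canned JSON string so the snippet runs):

```python
import json
from typing import Optional

PROMPT = (
    "Read the following human-written text and design one task grounded in it.\n"
    "Return JSON with keys: instruction, input, output.\n\nText:\n{text}"
)

def call_model(prompt: str) -> str:
    # Stand-in for the task-design model; a real system would query an LLM here.
    return json.dumps({
        "instruction": "Summarize the main finding of the passage.",
        "input": "The passage describes ...",
        "output": "The main finding is ...",
    })

def design_task(text: str) -> Optional[dict]:
    """Generate instruction/input/output jointly; drop incomplete or unparsable outputs."""
    try:
        record = json.loads(call_model(PROMPT.format(text=text)))
    except json.JSONDecodeError:
        return None
    if all(record.get(k) for k in ("instruction", "input", "output")):
        return record
    return None  # noisy generation, filtered out

print(design_task("Human-written source document goes here."))
```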
Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: A Preliminary Empirical Study
Evaluating the quality of generated text is a challenging task in natural
language processing. This difficulty arises from the inherent complexity and
diversity of text. Recently, OpenAI's ChatGPT, a powerful large language model
(LLM), has garnered significant attention due to its impressive performance in
various tasks. Therefore, we present this report to investigate the
effectiveness of LLMs, especially ChatGPT, and explore ways to optimize their
use in assessing text quality. We compared three kinds of reference-free
evaluation methods based on ChatGPT or similar LLMs. The experimental results
show that ChatGPT is capable of evaluating text quality effectively from
various perspectives without a reference and outperforms most existing
automatic metrics. In particular, the Explicit Score, which utilizes ChatGPT
to generate a numeric score measuring text quality, is the most effective and
reliable of the three explored methods. However, directly comparing the
quality of two texts using ChatGPT may lead to suboptimal results. We hope
this report will provide valuable insights into selecting appropriate methods
for evaluating text quality with LLMs such as ChatGPT.
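A minimal sketch of the Explicit Score idea, with `ask_llm` as a stand-in for a real ChatGPT call and an illustrative prompt and scale (the stub returns a fixed reply so the snippet runs):

```python
import re
from typing import Optional

PROMPT = (
    "Score the overall quality of the following text on a scale of 1 to 10.\n"
    "Reply with the number only.\n\nText:\n{text}"
)

def ask_llm(prompt: str) -> str:
    # Stand-in for a ChatGPT call; a real system would query the model here.
    return "8"

def explicit_score(text: str) -> Optional[float]:
    """Return the parsed numeric score, or None if no number is found in the reply."""
    reply = ask_llm(PROMPT.format(text=text))
    match = re.search(r"\d+(?:\.\d+)?", reply)
    return float(match.group()) if match else None

print(explicit_score("The generated summary covers all key points clearly."))
```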
ETAD: Training Action Detection End to End on a Laptop
Temporal action detection (TAD) with end-to-end training often suffers from a
huge demand for computing resources due to long video durations. In
this work, we propose an efficient temporal action detector (ETAD) that can
train directly from video frames with extremely low GPU memory consumption. Our
main idea is to minimize and balance the heavy computation among features and
gradients in each training iteration. We propose to forward snippet frames
sequentially through the video encoder and to back-propagate only a small,
necessary portion of the gradients to update the encoder. To further alleviate the
computational redundancy in training, we propose to dynamically sample only a
small subset of proposals during training. Moreover, various sampling
strategies and ratios are studied for both the encoder and detector. ETAD
achieves state-of-the-art performance on TAD benchmarks with remarkable
efficiency. On ActivityNet-1.3, ETAD reaches 38.25% average mAP after only 18
hours of end-to-end training, with just 1.3 GB of memory consumption per
video. Our code will be publicly released.
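A minimal sketch of the sequential-forward, partial-backward recipe described above, assuming a toy encoder, pre-computed snippet features, and an illustrative sampling ratio:

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128))
detector_head = nn.Linear(128, 1)      # stand-in for the detector

snippets = torch.randn(100, 512)       # 100 snippet features of a long video
backward_ratio = 0.1                   # back-propagate through ~10% of snippets
keep = torch.rand(len(snippets)) < backward_ratio

features = []
for x, needs_grad in zip(snippets, keep):
    if needs_grad:                     # keep the graph: gradients reach the encoder
        features.append(encoder(x))
    else:                              # forward only: activations are not retained
        with torch.no_grad():
            features.append(encoder(x))
features = torch.stack(features)

loss = detector_head(features).mean()  # toy detection loss
loss.backward()                        # encoder grads come from the sampled snippets only
```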