Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification
There are a number of studies about extraction of bottleneck (BN) features
from deep neural networks (DNNs) trained to discriminate speakers, pass-phrases
and triphone states for improving the performance of text-dependent speaker
verification (TD-SV). However, only moderate success has been achieved. A recent
study [1] presented a time contrastive learning (TCL) concept to explore the
non-stationarity of brain signals for classification of brain states. Speech
signals have a similar non-stationarity property, and TCL has the further
advantage of requiring no labeled data. We therefore present a TCL based
BN feature extraction method. The method uniformly partitions each speech
utterance in a training dataset into a predefined number of multi-frame
segments. Each segment in an utterance corresponds to one class, and class
labels are shared across utterances. DNNs are then trained to discriminate all
speech frames among the classes to exploit the temporal structure of speech. In
addition, we propose a segment-based unsupervised clustering algorithm to
re-assign class labels to the segments. TD-SV experiments were conducted on the
RedDots challenge database. The TCL-DNNs were trained using speech data of
fixed pass-phrases that were excluded from the TD-SV evaluation set, so the
learned features can be considered phrase-independent. We compare the
performance of the proposed TCL bottleneck (BN) feature with those of
short-time cepstral features and BN features extracted from DNNs discriminating
speakers, pass-phrases, speaker+pass-phrase, as well as monophones whose labels
and boundaries are generated by three different automatic speech recognition
(ASR) systems. Experimental results show that the proposed TCL-BN outperforms
cepstral features and speaker+pass-phrase discriminant BN features, and its
performance is on par with those of ASR derived BN features. Moreover,....
Comment: Copyright (c) 2019 IEEE. Personal use of this material is permitted.
Permission from IEEE must be obtained for all other uses, in any current or
future media, including reprinting/republishing this material for advertising
or promotional purposes, creating new collective works, for resale or
redistribution to servers or lists, or reuse of any copyrighted component of
this work in other works.
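The uniform partitioning step described in the abstract above can be illustrated with a small sketch. It assumes that segment boundaries are chosen by integer division of the frame index, which the abstract does not specify. The function name `tcl_frame_labels` is hypothetical, and the clustering-based label re-assignment is not shown.

```python
import numpy as np


def tcl_frame_labels(num_frames: int, num_segments: int) -> np.ndarray:
    """Return one class label per frame for a single utterance.

    The utterance is split uniformly into `num_segments` multi-frame
    segments; segment k is class k, so labels are shared across utterances.
    Assumption: boundaries follow floor(t * num_segments / num_frames).
    """
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    frames = np.arange(num_frames)
    return (frames * num_segments) // num_frames


if __name__ == "__main__":
    # Two utterances of different length share the same label set {0, 1, 2}.
    print(tcl_frame_labels(7, 3))
    print(tcl_frame_labels(12, 3))
```

Labels produced this way would serve as frame-level training targets for a DNN, with no speaker or phonetic annotation involved.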
Differential measurement of atmospheric refraction with a telescope with double fields of view
For a complete theoretical treatment of atmospheric refraction, refraction at
low elevation angles still deserves to be analyzed and explored. Some
engineering applications also require observing objects at large zenith
distances, so observational research on atmospheric refraction at low
elevation angles is of practical importance. Measuring atmospheric refraction
at low elevation angles has been considered difficult. A new idea for
determining atmospheric refraction by utilizing differential measurement with
double fields of view is proposed. Taking the observational principle of
HIPPARCOS satellite as a reference, a schematic prototype with double fields of
view was developed. In August 2013, experimental observations were carried
out, and the atmospheric refraction at low elevation angles was obtained
with the schematic prototype. The measured value of the atmospheric
refraction at a zenith distance of 78.8 degrees is , and the
feasibility of differential measurement of atmospheric refraction with double
fields of view was justified. The limitations of the schematic prototype, such
as inadequate light-gathering ability, lack of accurate meteorological data
recording, and a low level of automation in observation and data processing,
are also pointed out; these need to be improved in subsequent work.
Comment: 10 pages, 6 figures
Towards A Unified Policy Abstraction Theory and Representation Learning Approach in Markov Decision Processes
Lying at the heart of intelligent decision-making systems, how policy is
represented and optimized is a fundamental problem. The root challenge in this
problem is the large scale and the high complexity of policy space, which
exacerbates the difficulty of policy learning especially in real-world
scenarios. Towards a desirable surrogate policy space, recently policy
representation in a low-dimensional latent space has shown its potential in
improving both the evaluation and optimization of policy. The key question
involved in these studies is by what criterion we should abstract the policy
space for desired compression and generalization. However, both the theory on
policy abstraction and the methodology on policy representation learning are
less studied in the literature. In this work, we make a first effort to
fill this gap. First, we propose a unified policy abstraction theory,
containing three types of policy abstraction associated with policy features at
different levels. Then, we generalize them to three policy metrics that
quantify the distance (i.e., similarity) of policies, for more convenient use
in learning policy representation. Further, we propose a policy representation
learning approach based on deep metric learning. For the empirical study, we
investigate the efficacy of the proposed policy metrics and representations, in
characterizing policy difference and conveying policy generalization
respectively. Our experiments are conducted in both policy optimization and
evaluation problems, containing trust-region policy optimization (TRPO),
diversity-guided evolution strategy (DGES) and off-policy evaluation (OPE).
Somewhat naturally, the experimental results indicate that there is no
universally optimal abstraction for all downstream learning problems, while the
influence-irrelevance policy abstraction can be a generally preferred choice.
Comment: Preprint version
The Lifecycle and Cascade of WeChat Social Messaging Groups
Social instant messaging services are emerging as a transformative way for
people to connect and communicate with friends in their daily life: they
catalyze the formation of social groups, and they bring people a stronger sense
of community and connection. However, the research community still knows little
about the formation and evolution of groups in the context of social messaging
- their lifecycles, the change in their underlying structures over time, and
the diffusion processes by which they develop new members. In this paper, we
analyze the daily usage logs from the WeChat group messaging platform - the largest
standalone messaging communication service in China - with the goal of
understanding the processes by which social messaging groups come together,
grow new members, and evolve over time. Specifically, we discover a strong
dichotomy among groups in terms of their lifecycle, and develop a separability
model by taking into account a broad range of group-level features, showing
that long-term and short-term groups are inherently distinct. We also find
that the lifecycle of messaging groups is largely dependent on their social
roles and functions in users' daily social experiences and specific purposes.
Given the strong separability between the long-term and short-term groups, we
further address the problem concerning the early prediction of successful
communities. In addition to modeling the growth and evolution from a group-level
perspective, we investigate the individual-level attributes of group members
and study the diffusion process by which groups gain new members. By
considering members' historical engagement behavior as well as the local social
network structure in which they are embedded, we develop a membership cascade
model and demonstrate its effectiveness by achieving an AUC of 95.31% in
predicting the inviter and an AUC of 98.66% in predicting the invitee.
Comment: 10 pages, 8 figures, to appear in proceedings of the 25th
International World Wide Web Conference (WWW 2016)
RESA: Recurrent Feature-Shift Aggregator for Lane Detection
Lane detection is one of the most important tasks in self-driving. Due to
various complex scenarios (e.g., severe occlusion, ambiguous lanes, etc.) and
the sparse supervisory signals inherent in lane annotations, lane detection
remains challenging. Thus, it is difficult for an ordinary convolutional
neural network (CNN) trained on general scenes to capture subtle lane features
from the raw image. In this paper, we present a novel module named REcurrent
Feature-Shift Aggregator (RESA) to enrich lane features after preliminary
feature extraction with an ordinary CNN. RESA takes advantage of strong shape
priors of lanes and captures spatial relationships of pixels across rows and
columns. It shifts the sliced feature map recurrently in vertical and horizontal
directions and enables each pixel to gather global information. RESA can
infer lanes accurately in challenging scenarios with weak appearance clues
by aggregating the sliced feature map. Moreover, we propose a Bilateral Up-Sampling
Decoder that combines coarse-grained and fine-detailed features in the
up-sampling stage. It meticulously recovers a pixel-wise prediction from the
low-resolution feature map. Our method achieves state-of-the-art
results on two popular lane detection benchmarks (CULane and TuSimple). Code
has been made available at: https://github.com/ZJULearning/resa
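The recurrent shift-and-aggregate idea in the abstract above can be sketched in NumPy. This is a simplified illustration, not the authors' implementation: the learned convolution applied to each shifted slice is replaced here by a single scalar `weight`, and the circular shift and halving-style stride schedule are assumptions. The function name `shift_aggregate` is hypothetical.

```python
import numpy as np


def shift_aggregate(feat: np.ndarray, num_iters: int = 3, weight: float = 0.5) -> np.ndarray:
    """Aggregate a (C, H, W) feature map along rows and columns.

    For each axis and direction, the map is shifted circularly by a stride
    that grows with each iteration, passed through ReLU, and added back, so
    information from distant rows and columns reaches every pixel.
    Assumption: a scalar `weight` stands in for a learned convolution.
    """
    out = feat.astype(float)
    _, height, width = out.shape
    for axis, size in ((1, height), (2, width)):
        for sign in (1, -1):  # down/up, then right/left
            for k in range(num_iters):
                stride = max(size // 2 ** (num_iters - k), 1)
                shifted = np.roll(out, sign * stride, axis=axis)
                out = out + weight * np.maximum(shifted, 0.0)
    return out


if __name__ == "__main__":
    feat = np.zeros((1, 8, 8))
    feat[0, 3, 5] = 1.0  # a single active pixel
    out = shift_aggregate(feat)
    print((out > 0).all())  # every pixel has received some signal
```

With an 8x8 map and three iterations, the strides are 1, 2 and 4, which is enough for one active pixel to influence every position.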
Probabilistic activity driven model of temporal simplicial networks and its application on higher-order dynamics
Network modeling characterizes the underlying principles of structural
properties and is of vital significance for simulating dynamical processes in
the real world. However, bridging structure and dynamics is always challenging due
to the multiple complexities in real systems. Here, through introducing the
individual's activity rate and the possibility of group interaction, we propose
a probabilistic activity driven (PAD) model that could generate temporal
higher-order networks with both power-law and high-clustering characteristics,
which successfully links the two most critical structural features and a basic
dynamical pattern in extensive complex systems. Surprisingly, the power-law
exponents and the clustering coefficients of the aggregated PAD network could
be tuned in a wide range by altering a set of model parameters. We further
provide an approximation algorithm to select the proper parameters that can
generate networks with given structural properties, the effectiveness of which
is verified by fitting various real-world networks. Lastly, we explore the
co-evolution of the PAD model and higher-order contagion dynamics, and analytically
derive the critical conditions for phase transition and bistable phenomenon.
Our model provides a basic tool to reproduce complex structural properties and
to study the widespread higher-order dynamics, which has great potential for
applications across fields.
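As a loose illustration of the two ingredients named in the abstract above (an individual activity rate and a probability of group interaction), here is a generic activity-driven sketch. It is not the PAD model itself, whose rules the abstract does not give. The function name `simulate_activity_driven`, its parameters, and the uniform choice of partners are all assumptions.

```python
import random


def simulate_activity_driven(activities, steps, group_prob, group_size=3, seed=0):
    """Generate a list of snapshots, each a set of simplices (frozensets of nodes).

    Assumptions (not taken from the paper): at every step node i becomes
    active with probability activities[i]; an active node forms a group of
    `group_size` nodes with probability `group_prob`, otherwise a pairwise
    link; partners are chosen uniformly at random.
    """
    rng = random.Random(seed)
    n = len(activities)
    snapshots = []
    for _ in range(steps):
        simplices = set()
        for i, rate in enumerate(activities):
            if rng.random() >= rate:
                continue  # node i stays inactive in this step
            size = group_size if rng.random() < group_prob else 2
            partners = rng.sample([j for j in range(n) if j != i], size - 1)
            simplices.add(frozenset([i, *partners]))
        snapshots.append(simplices)
    return snapshots


if __name__ == "__main__":
    rates = [0.1, 0.2, 0.4, 0.8, 0.05, 0.3]
    for t, snap in enumerate(simulate_activity_driven(rates, steps=3, group_prob=0.5)):
        print(t, sorted(sorted(s) for s in snap))
```

Aggregating the snapshots over time gives a static network whose degree and clustering statistics depend on the activity distribution and `group_prob`.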