4,063 research outputs found

Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification

Authors: Glass James; Sarkar Achintya kr.; Shon Suwon; Tan Zheng-Hua; Tang Hao
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 11/05/2019

A number of studies have extracted bottleneck (BN) features from deep neural networks (DNNs) trained to discriminate speakers, pass-phrases, and triphone states in order to improve the performance of text-dependent speaker verification (TD-SV). However, only moderate success has been achieved. A recent study [1] presented a time-contrastive learning (TCL) concept that exploits the non-stationarity of brain signals to classify brain states. Speech signals have a similar non-stationarity property, and TCL has the further advantage of requiring no labeled data. We therefore present a TCL-based BN feature extraction method. The method uniformly partitions each speech utterance in a training dataset into a predefined number of multi-frame segments. Each segment in an utterance corresponds to one class, and class labels are shared across utterances. DNNs are then trained to discriminate all speech frames among the classes to exploit the temporal structure of speech. In addition, we propose a segment-based unsupervised clustering algorithm to re-assign class labels to the segments. TD-SV experiments were conducted on the RedDots challenge database. The TCL-DNNs were trained using speech data of fixed pass-phrases that were excluded from the TD-SV evaluation set, so the learned features can be considered phrase-independent. We compare the performance of the proposed TCL-BN feature with those of short-time cepstral features and BN features extracted from DNNs discriminating speakers, pass-phrases, speaker+pass-phrase, as well as monophones whose labels and boundaries are generated by three different automatic speech recognition (ASR) systems. Experimental results show that the proposed TCL-BN outperforms cepstral features and speaker+pass-phrase discriminant BN features, and its performance is on par with those of ASR-derived BN features. Moreover, ...

Comment: Copyright (c) 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
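
The label assignment described in the abstract above (uniform partitioning of each utterance into a fixed number of segments, with segment indices shared as class labels across utterances) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `tcl_frame_labels`, its parameters, and the rounding rule used when the frame count is not divisible by the segment count are assumptions.

```python
def tcl_frame_labels(num_frames, num_segments):
    """Return one class label per frame for a single utterance.

    Hypothetical helper: frame t is assigned to segment
    floor(t * num_segments / num_frames), which splits the utterance into
    num_segments contiguous chunks of nearly equal length.
    """
    if num_frames < num_segments:
        raise ValueError("utterance has fewer frames than segments")
    return [(t * num_segments) // num_frames for t in range(num_frames)]


# Two utterances of different lengths share the same label set {0, 1, 2, 3},
# so the k-th segment of every utterance belongs to class k.
labels_a = tcl_frame_labels(num_frames=10, num_segments=4)
labels_b = tcl_frame_labels(num_frames=8, num_segments=4)
print(labels_a)  # [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]
print(labels_b)  # [0, 0, 1, 1, 2, 2, 3, 3]
```

Labels of this kind need no transcription or speaker annotation, which is the property the abstract highlights. A frame-level classifier trained on them would supply the bottleneck layer, and the clustering-based re-assignment mentioned in the abstract is not covered by this sketch.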

arXiv.org e-Print Archive

VBN

Differential measurement of atmospheric refraction with a telescope with double fields of view

Authors: Cao Jian-Jun; Luo Hao; Tang Zheng-Hong; Yu Yong; Zhao Ming
Publication venue: IOP Publishing
Publication date: 16/04/2015

For a complete theoretical treatment of atmospheric refraction, refraction at low elevation angles still merits analysis and exploration. Some engineering applications require observing objects at large zenith distances, so observational research on atmospheric refraction at low elevation angles is of practical importance. Measuring atmospheric refraction at low elevation angles has been considered difficult. A new approach is proposed that determines atmospheric refraction by differential measurement with double fields of view. Taking the observational principle of the HIPPARCOS satellite as a reference, a schematic prototype with double fields of view was developed. Experimental observations carried out in August 2013 showed that the prototype can measure atmospheric refraction at low elevation angles. The measured value of the atmospheric refraction at a zenith distance of 78.8 degrees is $240.23'' \pm 0.27''$, which demonstrates the feasibility of differential measurement of atmospheric refraction with double fields of view. The limitations of the schematic prototype, such as inadequate light-gathering ability, the lack of accurate meteorological data recording, and the low level of automation in observation and data processing, are also pointed out and need to be addressed in subsequent work.

Comment: 10 pages, 6 figures

arXiv.org e-Print Archive

Shanghai Astronomical Observatory, Chinese Academy of Sciences

Towards A Unified Policy Abstraction Theory and Representation Learning Approach in Markov Decision Processes

Authors: Hao Jianye; Tang Hongyao; Zhang Min; Zheng Yan
Publication date: 15/09/2022

How a policy is represented and optimized is a fundamental problem at the heart of intelligent decision-making systems. The root challenge is the large scale and high complexity of the policy space, which makes policy learning harder, especially in real-world scenarios. In the search for a desirable surrogate policy space, policy representation in a low-dimensional latent space has recently shown potential for improving both the evaluation and the optimization of policies. The key question in these studies is by what criterion the policy space should be abstracted to achieve the desired compression and generalization. However, both the theory of policy abstraction and the methodology of policy representation learning remain little studied in the literature. In this work, we make a first effort to fill this gap. First, we propose a unified policy abstraction theory comprising three types of policy abstraction associated with policy features at different levels. Then, we generalize them to three policy metrics that quantify the distance (i.e., similarity) between policies, for more convenient use in learning policy representations. Further, we propose a policy representation learning approach based on deep metric learning. In the empirical study, we investigate the efficacy of the proposed policy metrics and representations in characterizing policy differences and conveying policy generalization, respectively. Our experiments cover both policy optimization and policy evaluation problems, including trust-region policy optimization (TRPO), diversity-guided evolution strategy (DGES), and off-policy evaluation (OPE). Somewhat naturally, the experimental results indicate that there is no universally optimal abstraction for all downstream learning problems, while the influence-irrelevance policy abstraction can be a generally preferred choice.

Comment: Preprint version

arXiv.org e-Print Archive

The Lifecycle and Cascade of WeChat Social Messaging Groups

Authors: Chen Bo; Hopcroft John; Li Yixuan; Lu Zheng; Qiu Jiezhong; Tang Jie; Yang Qiang; Ye Hao
Publication date: 20/02/2016

Social instant messaging services are emerging as a transformative way for people to connect and communicate with friends in their daily lives: they catalyze the formation of social groups and give people a stronger sense of community and connection. However, the research community still knows little about the formation and evolution of groups in the context of social messaging, including their lifecycles, the change in their underlying structures over time, and the diffusion processes by which they gain new members. In this paper, we analyze the daily usage logs from the WeChat group messaging platform, the largest standalone messaging communication service in China, with the goal of understanding the processes by which social messaging groups come together, gain new members, and evolve over time. Specifically, we discover a strong dichotomy among groups in terms of their lifecycle, and we develop a separability model that takes into account a broad range of group-level features, showing that long-term and short-term groups are inherently distinct. We also find that the lifecycle of messaging groups depends largely on their social roles and functions in users' daily social experiences and specific purposes. Given the strong separability between long-term and short-term groups, we further address the problem of early prediction of successful communities. In addition to modeling growth and evolution from a group-level perspective, we investigate the individual-level attributes of group members and study the diffusion process by which groups gain new members. By considering members' historical engagement behavior as well as the local social network structure in which they are embedded, we develop a membership cascade model and demonstrate its effectiveness by achieving an AUC of 95.31% in predicting the inviter and an AUC of 98.66% in predicting the invitee.

Comment: 10 pages, 8 figures, to appear in proceedings of the 25th International World Wide Web Conference (WWW 2016)

arXiv.org e-Print Archive

Crossref

RESA: Recurrent Feature-Shift Aggregator for Lane Detection

Authors: Cai Deng; Fang Hao; Liu Haifeng; Tang Wenjian; Yang Zheng; Zhang Yi; Zheng Tu
Publication date: 25/03/2021

Lane detection is one of the most important tasks in self-driving. Due to various complex scenarios (e.g., severe occlusion, ambiguous lanes, etc.) and the sparse supervisory signals inherent in lane annotations, the lane detection task remains challenging. It is therefore difficult to train an ordinary convolutional neural network (CNN) on general scenes so that it captures subtle lane features from the raw image. In this paper, we present a novel module named REcurrent Feature-Shift Aggregator (RESA) to enrich lane features after preliminary feature extraction with an ordinary CNN. RESA takes advantage of the strong shape priors of lanes and captures spatial relationships of pixels across rows and columns. It shifts the sliced feature map recurrently in the vertical and horizontal directions and enables each pixel to gather global information. By aggregating the sliced feature map, RESA can infer lanes accurately in challenging scenarios with weak appearance clues. Moreover, we propose a Bilateral Up-Sampling Decoder that combines coarse-grained and fine-detailed features in the up-sampling stage. It can meticulously recover the low-resolution feature map into a pixel-wise prediction. Our method achieves state-of-the-art results on two popular lane detection benchmarks (CULane and Tusimple). Code has been made available at: https://github.com/ZJULearning/resa
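
The claim that recurrent shifting lets each pixel gather global information can be illustrated with a toy example. This is only a sketch of the shift-and-add idea under strong assumptions, not the RESA module from the linked repository: the function name `shift_and_aggregate`, the cyclic shift, and the doubling stride schedule are all hypothetical, and the learned convolution and nonlinearity of a real network are omitted.

```python
def shift_and_aggregate(rows):
    """Toy vertical pass over an H x W feature map given as a list of rows.

    Assumption: at each step every row adds the row `stride` positions above
    it (cyclically), and the stride doubles each step. When H is a power of
    two, every row ends up holding the sum over all rows after log2(H) steps.
    """
    height = len(rows)
    stride = 1
    while stride < height:
        rows = [
            [a + b for a, b in zip(rows[i], rows[(i - stride) % height])]
            for i in range(height)
        ]
        stride *= 2
    return rows


feature_map = [[1, 0], [0, 2], [3, 0], [0, 4]]
aggregated = shift_and_aggregate(feature_map)
print(aggregated)  # [[4, 6], [4, 6], [4, 6], [4, 6]]
```

After two passes on this 4-row map, each row contains information from every other row, which is the sense in which a small number of recurrent shifts can give every position a global view. A horizontal pass would apply the same idea across columns.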

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Probabilistic activity driven model of temporal simplicial networks and its application on higher-order dynamics

Authors: Han Zhihao; Hao Yajing; Liu Longzhao; Tang Shaoting; Wang Xin; Zheng Hongwei; Zheng Zhiming
Publication date: 26/12/2022

Network modeling characterizes the underlying principles of structural properties and is of vital significance for simulating dynamical processes in the real world. However, bridging structure and dynamics is always challenging because of the multiple complexities of real systems. Here, by introducing an individual's activity rate and the possibility of group interaction, we propose a probabilistic activity driven (PAD) model that can generate temporal higher-order networks with both power-law and high-clustering characteristics, which successfully links the two most critical structural features with a basic dynamical pattern found in a wide range of complex systems. Surprisingly, the power-law exponents and the clustering coefficients of the aggregated PAD network can be tuned over a wide range by altering a set of model parameters. We further provide an approximation algorithm for selecting parameters that generate networks with given structural properties, and we verify its effectiveness by fitting various real-world networks. Lastly, we explore the co-evolution of the PAD model and higher-order contagion dynamics, and analytically derive the critical conditions for the phase transition and the bistable phenomenon. Our model provides a basic tool to reproduce complex structural properties and to study widespread higher-order dynamics, and it has great potential for applications across fields.

arXiv.org e-Print Archive