Minimizing Negative Transfer of Knowledge in Multivariate Gaussian Processes: A Scalable and Regularized Approach
Recently there has been an increasing interest in the multivariate Gaussian
process (MGP) which extends the Gaussian process (GP) to deal with multiple
outputs. One approach to construct the MGP and account for non-trivial
commonalities amongst outputs employs a convolution process (CP). The CP is
based on the idea of sharing latent functions across several convolutions.
Despite the elegance of the CP construction, it poses new challenges that
have yet to be tackled. First, even with a moderate number of outputs, model
building is prohibitive due to the huge increase in computational
demands and number of parameters to be estimated. Second, the negative transfer
of knowledge may occur when some outputs do not share commonalities. In this
paper we address these issues. We propose a regularized pairwise modeling
approach for the MGP established using CP. The key feature of our approach is
to distribute the estimation of the full multivariate model into a group of
bivariate GPs which are individually built. Interestingly, pairwise modeling
turns out to possess unique characteristics, which allow us to tackle the
challenge of negative transfer through penalizing the latent function that
facilitates information sharing in each bivariate model. Predictions are then
made through combining predictions from the bivariate models within a Bayesian
framework. The proposed method has excellent scalability when the number of
outputs is large and minimizes the negative transfer of knowledge between
uncorrelated outputs. Statistical guarantees for the proposed method are
studied and its advantageous features are demonstrated through numerical
studies
On Indexed Data Broadcast
We consider the problem of efficient information retrieval in asymmetric communication environments where multiple clients with limited resources retrieve information from a powerful server who periodically broadcasts its information repository over a communication medium. The cost of a retrieving client consists of two components: (a) access time, defined as the total amount of time spent by a client in retrieving the information of interest; and (b) tuning time, defined as the time spent by the client in actively listening to the communication medium, measuring a certain efficiency in resource usage. A probability distribution is associated with the data items in the broadcast representing the likelihood of a data item's being requested at any point of time. The problem of indexed data broadcast is to schedule the data items interleaved with certain indexing information in the broadcast so as to minimize simultaneously the mean access time and the mean tuning time.
Prior work on this problem thus far has focused only on some special cases. In this paper we study the indexed data broadcast problem in its full generality and design a broadcast scheme that achieves a mean access time of at most (1.5 + ε) times the optimal and a mean tuning time bounded by O(log n)
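The access-time cost defined above can be illustrated with a toy model. The sketch below is not the scheme from the paper: it assumes unit-length items, a cyclic schedule with no index, and a client that tunes in at the start of a uniformly random slot. The function name `mean_access_time` and the example schedules are hypothetical. Under these assumptions the client must listen continuously, so its tuning time equals its access time, which is the inefficiency that indexing is meant to reduce.

```python
from typing import Dict, Sequence


def mean_access_time(schedule: Sequence[str], probs: Dict[str, float]) -> float:
    """Mean access time (in slots) for a cyclic, index-free broadcast.

    Assumptions (illustrative only): every item occupies one slot, the
    schedule repeats forever, and a client arrives at the start of a
    uniformly random slot and waits until its item has been fully received.
    """
    n = len(schedule)
    total = 0.0
    for item, p in probs.items():
        positions = [i for i, s in enumerate(schedule) if s == item]
        if not positions:
            raise ValueError(f"item {item!r} never appears in the schedule")
        # Sum, over all arrival slots, the slots elapsed until the next
        # occurrence of the item has finished broadcasting.
        waited = 0
        for t in range(n):
            waited += min((pos - t) % n for pos in positions) + 1
        total += p * waited / n
    return total


# Flat schedule: each item once per cycle.
flat = mean_access_time(list("abcd"), {k: 0.25 for k in "abcd"})

# Skewed schedule: the popular item "a" is broadcast twice per cycle.
skewed = mean_access_time(list("abac"), {"a": 0.5, "b": 0.25, "c": 0.25})
```

Repeating a frequently requested item lowers the mean access time for the skewed request distribution, at the price of a longer wait for items that appear only once.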
DeepICP: An End-to-End Deep Neural Network for 3D Point Cloud Registration
We present DeepICP - a novel end-to-end learning-based 3D point cloud
registration framework that achieves comparable registration accuracy to prior
state-of-the-art geometric methods. Different from other keypoint based methods
where a RANSAC procedure is usually needed, we implement the use of various
deep neural network structures to establish an end-to-end trainable network.
Our keypoint detector is trained through this end-to-end structure and enables
the system to avoid the interference of dynamic objects, leverages the help of
sufficiently salient features on stationary objects, and as a result, achieves
high robustness. Rather than searching the corresponding points among existing
points, the key contribution is that we innovatively generate them based on
learned matching probabilities among a group of candidates, which can boost the
registration accuracy. Our loss function incorporates both the local similarity
and the global geometric constraints to ensure all above network designs can
converge towards the right direction. We comprehensively validate the
effectiveness of our approach using both the KITTI dataset and the
Apollo-SouthBay dataset. Results demonstrate that our method achieves
comparable or better performance than the state-of-the-art geometry-based
methods. Detailed ablation and visualization analysis are included to further
illustrate the behavior and insights of our network. The low registration error
and high robustness of our method make it attractive for substantial
applications relying on the point cloud registration task.
Comment: 10 pages, 6 figures, 3 tables, typos corrected, experimental results
updated, accepted by ICCV 201
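The idea of generating a corresponding point from matching probabilities, rather than picking an existing point, can be sketched as follows. This is a minimal illustration, not the network described in the abstract: it assumes the matching probabilities come from a softmax over per-candidate similarity scores and that the generated point is the probability-weighted average of the candidates. The names `softmax` and `generate_corresponding_point` and the scores used are hypothetical; in the paper the probabilities are learned.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def generate_corresponding_point(
    candidates: Sequence[Point], scores: Sequence[float]
) -> List[float]:
    """Return the probability-weighted average of 3D candidate points.

    Assumption: `scores` are similarity scores (higher means a better
    match); they stand in for the learned matching probabilities.
    """
    if len(candidates) != len(scores) or not candidates:
        raise ValueError("need one score per candidate and at least one candidate")
    probs = softmax(scores)
    return [sum(p * c[d] for p, c in zip(probs, candidates)) for d in range(3)]


# Two candidates with equal scores: the generated point is their midpoint.
midpoint = generate_corresponding_point([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [0.0, 0.0])

# One dominant score: the generated point collapses onto that candidate.
dominant = generate_corresponding_point([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [100.0, -100.0])
```

Because the output is a smooth function of the scores, a construction like this is differentiable, which is consistent with the end-to-end training the abstract describes.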
Long-Running Speech Recognizer: An End-to-End Multi-Task Learning Framework for Online ASR and VAD
When an end-to-end automatic speech recognition (E2E-ASR) system is used for
real-world applications, a voice activity detection (VAD) system is usually
needed to improve the performance and to reduce the computational cost by
discarding non-speech parts in the audio. This paper presents a novel
end-to-end (E2E), multi-task learning (MTL) framework that integrates ASR and
VAD into one model. The proposed system, which we refer to as Long-Running
Speech Recognizer (LR-SR), learns ASR and VAD jointly from two separate
task-specific datasets in the training stage. With the assistance of VAD, the
ASR performance improves as its connectionist temporal classification (CTC)
loss function can leverage the VAD alignment information. In the inference
stage, the LR-SR system removes non-speech parts at low computational cost and
recognizes speech parts with high robustness. Experimental results on segmented
speech data show that the proposed MTL framework outperforms the baseline
single-task learning (STL) framework on the ASR task. On unsegmented speech data,
we find that the LR-SR system outperforms the baseline ASR systems that build
an extra GMM-based or DNN-based voice activity detector.
Comment: 5 pages, 2 figure