Schwarz-Pick Estimates for Holomorphic Mappings with Values in Homogeneous Ball
Let B_X be the unit ball in a complex Banach space X, and assume B_X is homogeneous. A generalization of the Schwarz-Pick estimates for partial derivatives of arbitrary order is established for holomorphic mappings from the unit ball B_n to B_X associated with the Carathéodory metric, extending the corresponding results of Chen and Liu, and of Dai et al.
Kervolutional Neural Networks
Convolutional neural networks (CNNs) have enabled the state-of-the-art
performance in many computer vision tasks. However, little effort has been
devoted to establishing convolution in non-linear space. Existing works mainly
leverage on the activation layers, which can only provide point-wise
non-linearity. To solve this problem, a new operation, kervolution (kernel
convolution), is introduced to approximate complex behaviors of human
perception systems leveraging on the kernel trick. It generalizes convolution,
enhances the model capacity, and captures higher order interactions of
features, via patch-wise kernel functions, but without introducing additional
parameters. Extensive experiments show that kervolutional neural networks (KNN)
achieve higher accuracy and faster convergence than the baseline CNNs. Comment: oral paper in CVPR 201
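The core idea above can be made concrete: ordinary convolution computes an inner product between an image patch and a filter, and kervolution replaces that inner product with a patch-wise kernel function. Below is a minimal single-channel sketch (not the authors' implementation; the function name, hyperparameters `cp`, `dp`, and `gamma` are illustrative) showing how linear, polynomial, and Gaussian kernels generalize the same sliding-window operation without adding parameters.

```python
import numpy as np

def kervolution_2d(image, weight, kernel="polynomial", cp=1.0, dp=3, gamma=1.0):
    # Naive valid-mode kervolution on a single-channel image.
    # kernel="linear" recovers ordinary convolution (as cross-correlation).
    kh, kw = weight.shape
    H, W = image.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    w = weight.ravel()
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            x = image[i:i + kh, j:j + kw].ravel()
            if kernel == "linear":
                out[i, j] = x @ w                       # standard inner product
            elif kernel == "polynomial":
                out[i, j] = (x @ w + cp) ** dp          # higher-order feature interactions
            elif kernel == "gaussian":
                out[i, j] = np.exp(-gamma * np.sum((x - w) ** 2))
    return out
```

The polynomial kernel is where the "higher order interactions of features" come from: expanding (x·w + c)^d yields products of patch elements up to degree d, which a linear convolution cannot express.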
Understanding and Comparing Scalable Gaussian Process Regression for Big Data
As a non-parametric Bayesian model which produces informative predictive
distribution, Gaussian process (GP) has been widely used in various fields,
like regression, classification and optimization. The cubic complexity of
standard GP however leads to poor scalability, which poses challenges in the
era of big data. Hence, various scalable GPs have been developed in the
literature in order to improve the scalability while retaining desirable
prediction accuracy. This paper is devoted to investigating the methodological
characteristics and performance of representative global and local scalable GPs,
including sparse approximations and local aggregations from four main
perspectives: scalability, capability, controllability and robustness. The
numerical experiments on two toy examples and five real-world datasets with up
to 250K points offer the following findings. In terms of scalability, most of
the scalable GPs have a time complexity that is linear in the training size. In
terms of capability, the sparse approximations capture long-term spatial
correlations, while the local aggregations capture local patterns but suffer from
over-fitting in some scenarios. In terms of controllability, we can improve
the performance of sparse approximations simply by increasing the number of
inducing points, but this is not the case for local aggregations. In terms of robustness,
local aggregations are robust to various initializations of hyperparameters due
to the local attention mechanism. Finally, we highlight that the proper hybrid
of global and local scalable GPs may be a promising way to improve both the
model capability and scalability for big data. Comment: 25 pages, 15 figures, preprint submitted to KB
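To illustrate how sparse approximations achieve the linear-in-N complexity mentioned above, here is a minimal Subset-of-Regressors (SoR) predictive mean with m inducing points, costing O(N m^2) instead of the O(N^3) of standard GP regression. This is a generic textbook construction, not code from the paper; the kernel, lengthscale `ls`, and noise level are illustrative assumptions.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    # Squared-exponential kernel matrix between row-vector inputs A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def sor_predict(X, y, Z, Xs, ls=1.0, noise=1e-2):
    # Subset-of-Regressors sparse GP predictive mean.
    # X: (N, d) training inputs, y: (N,) targets, Z: (m, d) inducing inputs,
    # Xs: (S, d) test inputs. Only an m x m system is solved.
    Kzz = rbf(Z, Z, ls)
    Kzx = rbf(Z, X, ls)
    Ksz = rbf(Xs, Z, ls)
    A = noise * Kzz + Kzx @ Kzx.T      # m x m, built in O(N m^2)
    return Ksz @ np.linalg.solve(A, Kzx @ y)
```

Increasing the number of inducing points m tightens the approximation at O(N m^2) cost, which matches the paper's observation that sparse approximations are controllable through the inducing size.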
Facial Motion Prior Networks for Facial Expression Recognition
Deep learning based facial expression recognition (FER) has received a lot of
attention in the past few years. Most of the existing deep learning based FER
methods do not make good use of domain knowledge and thereby fail to extract
representative features. In this work, we propose a novel FER framework, named
Facial Motion Prior Networks (FMPN). In particular, we introduce an additional
branch to generate a facial mask so as to focus on facial muscle moving
regions. To guide the facial mask learning, we propose to incorporate prior
domain knowledge by using the average differences between neutral faces and the
corresponding expressive faces as the training guidance. Extensive experiments
on three facial expression benchmark datasets demonstrate the effectiveness of
the proposed method, compared with the state-of-the-art approaches. Comment: VCIP 2019, Oral. Code is available at
https://github.com/donydchen/FMPN-FE
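The prior described above, using average differences between neutral and expressive faces as mask guidance, can be sketched in a few lines. This is an illustrative reconstruction of the idea, not the authors' code (see their repository for the actual implementation); it assumes aligned grayscale face arrays and an expression-wise grouping done by the caller.

```python
import numpy as np

def expression_mask(neutral_faces, expressive_faces, eps=1e-8):
    # neutral_faces, expressive_faces: (N, H, W) aligned grayscale faces,
    # paired per subject for one expression category.
    # Average absolute difference highlights facial-muscle moving regions.
    diff = np.abs(expressive_faces - neutral_faces).mean(axis=0)
    # Normalize to [0, 1] so the mask can weight feature maps.
    return (diff - diff.min()) / (diff.max() - diff.min() + eps)
```

The resulting mask is high exactly where expressions move facial muscles, which is what the mask-generation branch is trained to reproduce.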
Unpaired Image Captioning via Scene Graph Alignments
Most current image captioning models rely heavily on paired image-caption
datasets. However, collecting large-scale paired image-caption data is
labor-intensive and time-consuming. In this paper, we present a scene
graph-based approach for unpaired image captioning. Our framework comprises an
image scene graph generator, a sentence scene graph generator, a scene graph
encoder, and a sentence decoder. Specifically, we first train the scene graph
encoder and the sentence decoder on the text modality. To align the scene
graphs between images and sentences, we propose an unsupervised feature
alignment method that maps the scene graph features from the image to the
sentence modality. Experimental results show that our proposed model can
generate promising results without using any image-caption training
pairs, outperforming existing methods by a wide margin. Comment: Accepted at ICCV 201
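The paper's unsupervised feature alignment maps image scene-graph features into the sentence modality without paired supervision. As a simple, clearly different stand-in for that idea, the sketch below uses CORAL-style second-order alignment: whiten the image-side features and re-color them with the sentence-side covariance, so the two distributions match in mean and covariance. This is an illustrative baseline, not the method from the paper.

```python
import numpy as np

def coral_align(Xs, Xt, eps=1e-6):
    # Align source features Xs (n_s, d) to target features Xt (n_t, d)
    # by matching first- and second-order statistics (CORAL).
    Xs = Xs - Xs.mean(0)
    Xt = Xt - Xt.mean(0)
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])

    def msqrt(C, inv=False):
        # Symmetric matrix (inverse) square root via eigendecomposition.
        w, V = np.linalg.eigh(C)
        w = np.clip(w, eps, None)
        p = -0.5 if inv else 0.5
        return (V * w ** p) @ V.T

    # Whiten with source covariance, re-color with target covariance.
    return Xs @ msqrt(Cs, inv=True) @ msqrt(Ct)
```

After alignment, the transformed features have (approximately) the target covariance, so a decoder trained on the sentence modality can consume them, which is the role the learned alignment plays in the paper's pipeline.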