2,631 research outputs found
Adaptive Normalized Risk-Averting Training For Deep Neural Networks
This paper proposes a set of new error criteria and learning approaches,
Adaptive Normalized Risk-Averting Training (ANRAT), to attack the non-convex
optimization problem in training deep neural networks (DNNs). Theoretically, we
demonstrate its effectiveness on global and local convexity lower-bounded by
the standard -norm error. By analyzing the gradient on the convexity index
, we explain the reason why to learn adaptively using
gradient descent works. In practice, we show how this method improves training
of deep neural networks to solve visual recognition tasks on the MNIST and
CIFAR-10 datasets. Without using pretraining or other tricks, we obtain results
comparable or superior to those reported in recent literature on the same tasks
using standard ConvNets + MSE/cross entropy. Performance on deep/shallow
multilayer perceptrons and Denoised Auto-encoders is also explored. ANRAT can
be combined with other quasi-Newton training methods, innovative network
variants, regularization techniques and other specific tricks in DNNs. Other
than unsupervised pretraining, it provides a new perspective to address the
non-convex optimization problem in DNNs.Comment: AAAI 2016, 0.39%~0.4% ER on MNIST with single 32-32-256-10 ConvNets,
code available at https://github.com/cauchyturing/ANRA
ResumeNet: A Learning-based Framework for Automatic Resume Quality Assessment
Recruitment of appropriate people for certain positions is critical for any
companies or organizations. Manually screening to select appropriate candidates
from large amounts of resumes can be exhausted and time-consuming. However,
there is no public tool that can be directly used for automatic resume quality
assessment (RQA). This motivates us to develop a method for automatic RQA.
Since there is also no public dataset for model training and evaluation, we
build a dataset for RQA by collecting around 10K resumes, which are provided by
a private resume management company. By investigating the dataset, we identify
some factors or features that could be useful to discriminate good resumes from
bad ones, e.g., the consistency between different parts of a resume. Then a
neural-network model is designed to predict the quality of each resume, where
some text processing techniques are incorporated. To deal with the label
deficiency issue in the dataset, we propose several variants of the model by
either utilizing the pair/triplet-based loss, or introducing some
semi-supervised learning technique to make use of the abundant unlabeled data.
Both the presented baseline model and its variants are general and easy to
implement. Various popular criteria including the receiver operating
characteristic (ROC) curve, F-measure and ranking-based average precision (AP)
are adopted for model evaluation. We compare the different variants with our
baseline model. Since there is no public algorithm for RQA, we further compare
our results with those obtained from a website that can score a resume.
Experimental results in terms of different criteria demonstrate the
effectiveness of the proposed method. We foresee that our approach would
transform the way of future human resources management.Comment: ICD
Higher-order Projected Power Iterations for Scalable Multi-Matching
The matching of multiple objects (e.g. shapes or images) is a fundamental
problem in vision and graphics. In order to robustly handle ambiguities, noise
and repetitive patterns in challenging real-world settings, it is essential to
take geometric consistency between points into account. Computationally, the
multi-matching problem is difficult. It can be phrased as simultaneously
solving multiple (NP-hard) quadratic assignment problems (QAPs) that are
coupled via cycle-consistency constraints. The main limitations of existing
multi-matching methods are that they either ignore geometric consistency and
thus have limited robustness, or they are restricted to small-scale problems
due to their (relatively) high computational cost. We address these
shortcomings by introducing a Higher-order Projected Power Iteration method,
which is (i) efficient and scales to tens of thousands of points, (ii)
straightforward to implement, (iii) able to incorporate geometric consistency,
(iv) guarantees cycle-consistent multi-matchings, and (iv) comes with
theoretical convergence guarantees. Experimentally we show that our approach is
superior to existing methods
- …