FReLU: Flexible Rectified Linear Units for Improving Convolutional Neural Networks
Rectified linear unit (ReLU) is a widely used activation function for deep
convolutional neural networks. However, because of the zero-hard rectification,
ReLU networks miss the benefits from negative values. In this paper, we propose
a novel activation function called flexible rectified linear unit
(FReLU) to further explore the effects of negative values. By redesigning the
rectified point of ReLU as a learnable parameter, FReLU expands the states of
the activation output. When the network is successfully trained, FReLU tends to
converge to a negative value, which improves the expressiveness and thus the
performance. Furthermore, FReLU is designed to be simple and effective without
exponential functions, which keeps its computational cost low. To be easily
usable in various network architectures, FReLU is self-adaptive and does not
rely on strict assumptions. We evaluate FReLU on three standard image
classification datasets, including CIFAR-10, CIFAR-100, and ImageNet.
Experimental results show that the proposed method achieves fast convergence
and higher performance on both plain and residual networks.
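The abstract above does not spell out the activation formula, so the following is only a minimal sketch under one reading of "a learnable rectified point": ReLU followed by a learnable bias b, i.e. frelu(x) = max(x, 0) + b. The names `frelu` and `frelu_grads` are hypothetical, and the scalar form is an illustration rather than the authors' implementation.

```python
def frelu(x, b):
    """Assumed form: ReLU followed by a learnable shift b (scalar input)."""
    return max(x, 0.0) + b


def frelu_grads(x, upstream=1.0):
    """Gradients of frelu with respect to x and b, scaled by `upstream`."""
    dx = upstream if x > 0 else 0.0  # same gate as plain ReLU
    db = upstream                    # b shifts every output, so its gradient is 1
    return dx, db


b = -0.5  # a negative shift lets the output take negative values
print([frelu(x, b) for x in (-2.0, 0.0, 1.5)])  # [-0.5, -0.5, 1.0]
```

Under this reading, a negative b moves the flat part of the function below zero without using any exponential function, which matches the low-cost claim in the abstract.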
Colorization as a Proxy Task for Visual Understanding
We investigate and improve self-supervision as a drop-in replacement for
ImageNet pretraining, focusing on automatic colorization as the proxy task.
Self-supervised training has been shown to be more promising for utilizing
unlabeled data than other, traditional unsupervised learning methods. We build
on this success and evaluate the ability of our self-supervised network in
several contexts. On VOC segmentation and classification tasks, we present
results that are state-of-the-art among methods not using ImageNet labels for
pretraining representations.
Moreover, we present the first in-depth analysis of self-supervision via
colorization, concluding that formulation of the loss, training details and
network architecture play important roles in its effectiveness. This
investigation is further expanded by revisiting the ImageNet pretraining
paradigm, asking questions such as: How much training data is needed? How many
labels are needed? How much do features change when fine-tuned? We relate these
questions back to self-supervision by showing that colorization provides a
supervisory signal comparable in strength to various flavors of ImageNet
pretraining.
Comment: CVPR 2017 (Project page:
http://people.cs.uchicago.edu/~larsson/color-proxy/)
Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm
Over the past five decades, k-means has become the clustering algorithm of
choice in many application domains primarily due to its simplicity, time/space
efficiency, and invariance to the ordering of the data points. Unfortunately,
the algorithm's sensitivity to the initial selection of the cluster centers
remains its most serious drawback. Numerous initialization methods have
been proposed to address this drawback. Many of these methods, however, have
time complexity superlinear in the number of data points, which makes them
impractical for large data sets. On the other hand, linear methods are often
random and/or sensitive to the ordering of the data points. These methods are
generally unreliable in that the quality of their results is unpredictable.
Therefore, it is common practice to perform multiple runs of such methods and
take the output of the run that produces the best results. Such a practice,
however, greatly increases the computational requirements of the otherwise
highly efficient k-means algorithm. In this chapter, we investigate the
empirical performance of six linear, deterministic (non-random), and
order-invariant k-means initialization methods on a large and diverse
collection of data sets from the UCI Machine Learning Repository. The results
demonstrate that two relatively unknown hierarchical initialization methods due
to Su and Dy outperform the remaining four methods with respect to two
objective effectiveness criteria. In addition, a recent method due to Erisoglu
et al. performs surprisingly poorly.
Comment: 21 pages, 2 figures, 5 tables, Partitional Clustering Algorithms
(Springer, 2014). arXiv admin note: substantial text overlap with
arXiv:1304.7465, arXiv:1209.196
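To illustrate what a deterministic, hierarchical initialization can look like, here is a minimal sketch of a variance-based divisive scheme: repeatedly split the cluster with the largest sum of squared errors at its mean, along the dimension that contributes most to that error. This is an assumption-laden illustration, not necessarily identical to any of the six methods evaluated in the chapter. The function names are hypothetical, and tie-breaking and floating-point summation order are not treated rigorously.

```python
def mean(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sse_per_dim(points, center):
    """Sum of squared deviations from `center`, separately per dimension."""
    return [sum((p[d] - center[d]) ** 2 for p in points)
            for d in range(len(center))]


def variance_split_init(points, k):
    """Return up to k initial centers by repeated mean-based splitting."""
    clusters = [list(points)]
    while len(clusters) < k:
        stats = []
        for c in clusters:
            m = mean(c)
            s = sse_per_dim(c, m)
            stats.append((sum(s), m, s))
        # Pick the cluster with the largest total squared error.
        idx = max(range(len(clusters)), key=lambda i: stats[i][0])
        total, m, s = stats[idx]
        if total == 0.0:
            break  # every remaining cluster is a single repeated point
        # Split along the dimension with the largest error, at the mean.
        d = max(range(len(s)), key=lambda j: s[j])
        left = [p for p in clusters[idx] if p[d] <= m[d]]
        right = [p for p in clusters[idx] if p[d] > m[d]]
        clusters[idx:idx + 1] = [left, right]
    return [mean(c) for c in clusters]


data = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(variance_split_init(data, 2))  # [[0.0, 0.5], [10.0, 0.5]]
```

The sketch uses no random numbers, and each split depends only on means and squared errors of the points, so reordering the input leaves the centers unchanged up to floating-point rounding and ties. Each split is a linear pass over the points of the chosen cluster.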
Training Process Reduction Based On Potential Weights Linear Analysis To Accelarate Back Propagation Network
Learning is the key property of a Back Propagation Network (BPN): training
seeks suitable weights and thresholds that shorten training time while
achieving high accuracy. Currently, data pre-processing, such as dimension
reduction of the input values, and pre-training are the main contributors to
efficient techniques that reduce training time while keeping accuracy high.
Weight initialization remains an important issue: it is random, creates a
paradox, and leads to low accuracy with long training time. Dimension
reduction is a good pre-processing technique for accelerating BPN
classification, but it suffers from the problem of missing data. In this
paper, we study current pre-training techniques and a new pre-processing
technique called Potential Weight Linear Analysis (PWLA), which combines
normalization, dimension reduction of the input values, and pre-training. In
PWLA, data pre-processing is performed first to generate normalized input
values, which are then passed to a pre-training technique to obtain the
potential weights. After these phases, the dimension of the input-value matrix
is reduced by using the real potential weights. The experiments evaluate the
XOR problem and three datasets: SPECT Heart, SPECTF Heart, and Liver Disorders
(BUPA). Our results show that PWLA turns the BPN into a new Supervised Multi
Layer Feed Forward Neural Network (SMFFNN) model that reaches high accuracy in
one epoch without a training cycle. PWLA also offers nonlinear supervised and
unsupervised dimension reduction, which can be applied to other supervised
multi-layer feed-forward neural network models in future work.
Comment: 11 pages IEEE format, International Journal of Computer Science and
Information Security, IJCSIS 2009, ISSN 1947 5500, Impact factor 0.42