
    On the Importance of Normalisation Layers in Deep Learning with Piecewise Linear Activation Units

    Deep feedforward neural networks with piecewise linear activations are currently producing the state-of-the-art results in several public datasets. The combination of deep learning models and piecewise linear activation functions allows for the estimation of exponentially complex functions with the use of a large number of subnetworks specialized in the classification of similar input examples. During the training process, these subnetworks avoid overfitting with an implicit regularization scheme based on the fact that they must share their parameters with other subnetworks. Using this framework, we have made an empirical observation that can further improve the performance of such models. We notice that these models assume a balanced initial distribution of data points with respect to the domain of the piecewise linear activation function. If that assumption is violated, then the piecewise linear activation units can degenerate into purely linear activation units, which can result in a significant reduction of their capacity to learn complex functions. Furthermore, as the number of model layers increases, this unbalanced initial distribution makes the model ill-conditioned. Therefore, we propose the introduction of batch normalisation units into deep feedforward neural networks with piecewise linear activations, which drives a more balanced use of these activation units, where each region of the activation function is trained with a relatively large proportion of training samples. This batch normalisation also promotes the pre-conditioning of very deep learning models. We show that introducing maxout and batch normalisation units into the network-in-network model produces classification results that are better than or comparable to the current state of the art on the CIFAR-10, CIFAR-100, MNIST, and SVHN datasets.
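    The combination described above can be sketched in a few lines: batch-normalise the pre-activations so that samples spread across the linear pieces of a maxout unit instead of collapsing into a single piece. This is a minimal NumPy sketch under assumed shapes and synthetic data, not the paper's actual network-in-network architecture.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalise each feature over the mini-batch (zero mean, unit variance),
    # so activation-function regions each see a sizeable share of samples.
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)

def maxout(x, W, b):
    # Maxout unit: take the maximum over k linear pieces per output.
    # W: (in_dim, k, out_dim), b: (k, out_dim).
    z = np.einsum('ni,ikj->nkj', x, W) + b
    return z.max(axis=1)

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8)) * 5.0 + 3.0   # deliberately unbalanced inputs
W = rng.normal(size=(8, 4, 16))            # k = 4 linear pieces (assumed)
b = rng.normal(size=(4, 16))

# Without normalisation, one linear piece can dominate for every sample,
# degenerating the unit into a purely linear one; normalising first
# drives a more balanced use of the pieces.
h = maxout(batch_norm(x), W, b)
print(h.shape)  # (32, 16)
```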

    Methods for Understanding and Improving Deep Learning Classification Models

    Recently proposed deep learning systems can achieve superior performance with respect to methods based on hand-crafted features on a broad range of tasks, not limited to object recognition/detection tasks, but also in medical image analysis and game control applications. These advances can be credited in part to the rapid development of computation hardware, and the availability of large-scale public datasets. The training process of deep learning models is a challenging task because of the large number of parameters involved, which requires large annotated training sets. A number of recent works have tried to explain the behaviour of deep learning models during training and testing, but the whole field still has limited understanding of the functionality of deep learning models. In this thesis, we aim to develop methods that allow for a better understanding of the behaviour of deep learning models. With such methods, we attempt to improve the performance of deep learning models in several applications and reveal promising directions to explore with empirical evidence. Our first method is a novel nonlinear hierarchical classifier that uses off-the-shelf convolutional neural network (CNN) features. This nonlinear classifier is a tree-structured classifier that uses linear classifiers as tree nodes. Experiments suggest that our proposed nonlinear hierarchical classifier achieves better results than the linear classifiers. In our second method, we use the Maxout activation function to replace the common rectified linear unit (ReLU) function to increase the model capacity of deep learning models. We found that this can lead to an ill-conditioned training problem, given that the input data is generally not properly normalised. We show how to mitigate this problem by incorporating Batch Normalisation. This method allows us to build a deep learning model that surpasses the performance of several state-of-the-art methods.
In the third method, we explore the possibility of introducing multiple-size features into deep learning models. Our design includes up to four different filter sizes to provide different spatial pattern candidates, and a max pooling function that selects the maximum response to represent the unit’s output. As an outcome of this work, we combine the multiple-size filters and the Batch-normalised Maxout activation unit from the second work to achieve automatic spatial pattern selection within the activation unit. The result of this research shows significant improvements over the state of the art on five publicly available computer vision datasets, including the ImageNet 2012 dataset. Finally, we propose two novel measurements derived from the eigenvalues of the approximate empirical Fisher matrix, which can be efficiently calculated within the stochastic gradient descent (SGD) iteration. These measurements can be obtained efficiently even for the recent state-of-the-art deep residual networks. We show how to use these measurements to help select training hyper-parameters such as mini-batch size, model structure, learning rate and stochastic depth rate. Using these tools, we discover a new way to schedule dynamic sampling and dynamic stochastic depth, which leads to performance improvements of deep learning models. We show that the proposed training approach reaches competitive classification results on the CIFAR-10 and CIFAR-100 datasets with models that have significantly lower capacity compared to the current state of the art in the field.
Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 201
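    The multiple-size filter idea in the third method can be sketched as follows: apply filters of several sizes to the same input and keep the per-pixel maximum response, so the unit selects whichever spatial pattern matches best. Filter sizes, the naive single-channel convolution, and all data here are illustrative assumptions, not the thesis's actual architecture.

```python
import numpy as np

def conv2d_same(img, kernel):
    # Naive 'same'-padded 2D cross-correlation for a single channel.
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, kh - 1 - ph), (pw, kw - 1 - pw)))
    out = np.empty_like(img, dtype=float)
    H, W = img.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def multi_size_unit(img, kernels):
    # Run filters of different sizes and keep the elementwise maximum,
    # letting the unit pick the best-matching spatial pattern per pixel.
    responses = np.stack([conv2d_same(img, k) for k in kernels])
    return responses.max(axis=0)

rng = np.random.default_rng(1)
img = rng.normal(size=(16, 16))
kernels = [rng.normal(size=(s, s)) for s in (1, 3, 5, 7)]  # four filter sizes
out = multi_size_unit(img, kernels)
print(out.shape)  # (16, 16)
```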

    CCD photometric study of the W UMa-type binary II CMa in the field of Berkeley 33

    The CCD photometric data of the EW-type binary II CMa, a contact star in the field of the middle-aged open cluster Berkeley 33, are presented. The complete R light curve was obtained. In the present paper, using five CCD epochs of light minimum (three calculated from the data of Mazur et al. (1993) and two from our new data), the orbital period P was revised to 0.22919704 days. The complete R light curve was analyzed using the 2003 version of the W-D (Wilson-Devinney) program. It is found that this is a contact system with a mass ratio q = 0.9 and a contact factor f = 4.1%. The high mass ratio (q = 0.9) and the low contact factor (f = 4.1%) indicate that the system has just evolved into the marginal contact stage.
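    Revising an orbital period from epochs of light minimum amounts to fitting a linear ephemeris T_min = T0 + P·E by least squares. The sketch below uses idealised, hypothetical minima and cycle counts built from the quoted period, not the paper's five CCD epochs.

```python
import numpy as np

P_true, T0_true = 0.22919704, 2450000.0  # days; T0 is a placeholder epoch

# Hypothetical cycle numbers E and idealised (noise-free) times of minimum.
E = np.array([0, 4363, 8727, 13090, 17454], dtype=float)
T = T0_true + P_true * E

# Fit T = T0 + P*E by linear least squares; times are referenced to the
# assumed initial epoch for numerical stability.
A = np.vstack([np.ones_like(E), E]).T
coef, *_ = np.linalg.lstsq(A, T - 2450000.0, rcond=None)
T0_fit, P_fit = coef[0] + 2450000.0, coef[1]
print(round(P_fit, 8))  # 0.22919704
```

    With real, noisy minima the same fit returns the revised period and its residuals (the O−C diagram) fall out of `A @ coef - (T - 2450000.0)`.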

    Quaternary ammonium cationic polymer as a superior bifunctional binder for lithium–sulfur batteries and effects of counter anion

    Bifunctional polymer binders featuring both strong binding and superior polysulfide-trapping properties are highly desired for the fabrication of sulfur cathodes with suppressed polysulfide shuttling in Li–S batteries. In this paper, we have explored the potential of a quaternary ammonium cationic polymer, polydiallyldimethylammonium (PDADMA-X; X = T, B, P, and Cl) with different counter anions (TFSI−, BF4−, PF6−, and Cl−, respectively), as the bifunctional binder. We have also revealed the dramatic effects of the counter anion on the performance of the cationic polymer binder. PDADMA-X's containing the former three weakly associating anions have been demonstrated to show polysulfide adsorption capability. In particular, PDADMA-T, having the largest, least interacting TFSI− anion, shows the optimum performance, with strong binding strength and the best polysulfide adsorption capability. Relative to commercial PVDF and PDADMA-X's with other counter anions, it offers sulfur cathodes with lowered polarization, higher discharge capacity, significantly better capacity retention, and improved cycling stability. With its convenient synthesis from commercially available PDADMA-Cl, cationic PDADMA-T with the TFSI− anion is a promising bifunctional binder for sulfur cathodes in practical Li–S batteries.

    Heat Pump-Based Novel Energy System for High-Power LED Lamp Cooling and Waste Heat Recovery

    Unlike an incandescent light bulb, which radiates heat into the surroundings by infrared rays, a light-emitting diode (LED) traps heat inside the lamp. This fact increases the difficulty of cooling LED lamps, while it facilitates the recovery of the generated heat. We propose a novel energy system that merges high-power LED lamp cooling with the use of a heat pump; the heat pump can cool the LED lamp and at the same time recover the waste heat. In this way, a high percentage of the energy consumed by the LED lamp can be utilized. In this work, we developed a prototype of this energy system and conducted a series of experimental studies to determine the effect of several parameters, such as cooling water flow rate and LED power, on the LED leadframe temperature, compressor power consumption, and system performance. The experimental results clearly indicate that the energy system can lead to substantial energy savings.
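    The energy balance behind the claimed savings can be illustrated with a back-of-the-envelope sketch: the heat pump removes the LED's waste heat and delivers that heat plus the compressor work to the cooling water. All numbers here (LED power, light-conversion efficiency, coefficient of performance) are assumed placeholders, not the paper's measured values.

```python
# Assumed, illustrative operating point for the LED/heat-pump system.
P_led = 100.0      # W, electrical power of the LED lamp (assumed)
eta_light = 0.30   # fraction of input converted to light (assumed)
COP = 4.0          # assumed heat-pump coefficient of performance

Q_led = P_led * (1 - eta_light)   # waste heat trapped inside the lamp (W)
W_comp = Q_led / COP              # compressor work needed to pump that heat (W)
Q_recovered = Q_led + W_comp      # heat delivered to the cooling water (W)

print(round(Q_recovered, 1))  # 87.5
```

    Under these assumptions, 87.5 W of useful heat is recovered for only 17.5 W of compressor input, which is why a high percentage of the lamp's consumed energy can be utilized.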