82 research outputs found

    Community Forestry in Palpa District, Nepal: Community Forestry in Local Society and Its Impact

    No full text
    PDF/A format. Access: via World Wide Web. Doctoral dissertation, Graduate School of Global Studies, Tokyo University of Foreign Studies (May 2017). Author's thesis (Ph.D.)--Tokyo University of Foreign Studies, 2017, dissertation no. 228 (博甲第228号). "A dissertation submitted to the Graduate School of Global Studies in partial fulfillment of the Requirements for the Degree of Doctor of Philosophy in Area and International Studies. Supervised by Prof. Okada Akito." Bibliography: p. 140-161. Summary in English and Japanese. Tokyo University of Foreign Studies (東京外国語大学), Doctor of Philosophy (Academic).

    Dimension Mixer: A Generalized Method for Structured Sparsity in Deep Neural Networks

    Full text link
    The recent success of multiple neural architectures like CNNs, Transformers, and MLP-Mixers motivated us to look for similarities and differences between them. We found that these architectures can be interpreted through the lens of a general concept of dimension mixing. Research on coupling flows and the butterfly transform shows that partial and hierarchical signal mixing schemes are sufficient for efficient and expressive function approximation. In this work, we study group-wise sparse, non-linear, multi-layered and learnable mixing schemes of inputs and find that they are complementary to many standard neural architectures. Following our observations and drawing inspiration from the Fast Fourier Transform, we generalize the Butterfly Structure to use a non-linear mixing function, allowing an MLP to serve as the mixer; we call this Butterfly MLP. We also apply the same mixing scheme along the sequence dimension of Transformer-based architectures, which we call Butterfly Attention. Experiments on the CIFAR and LRA datasets demonstrate that the proposed Non-Linear Butterfly Mixers are efficient and scale well when the host architectures are used as the mixing function. Additionally, we propose the Patch-Only MLP-Mixer for processing spatial 2D signals, demonstrating a different dimension mixing strategy. Comment: 11 pages, 4 figures, 7 tables
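
    To make the group-wise mixing idea concrete, the following is a minimal sketch of how a butterfly-style non-linear mixing layer could be written in PyTorch; the class name ButterflyMLPLayer, the stride schedule, and the residual connection are illustrative assumptions, not the authors' implementation.

        # Illustrative sketch of butterfly-style group-wise MLP mixing;
        # not the paper's code -- names and structure are assumptions.
        import torch
        import torch.nn as nn

        class ButterflyMLPLayer(nn.Module):
            """Mixes features within groups whose members are `stride` apart.
            Stacking layers with strides 1, g, g^2, ... mixes every pair of
            dimensions hierarchically, as in the FFT butterfly pattern."""

            def __init__(self, dim, group_size, stride):
                super().__init__()
                assert dim % (group_size * stride) == 0
                self.group_size, self.stride = group_size, stride
                # A small MLP shared across all groups acts as the mixing function.
                self.mixer = nn.Sequential(
                    nn.Linear(group_size, 4 * group_size),
                    nn.GELU(),
                    nn.Linear(4 * group_size, group_size),
                )

            def forward(self, x):                    # x: (batch, dim)
                b, d = x.shape
                g, s = self.group_size, self.stride
                # Reshape so the last axis holds one strided group of size g.
                x = x.view(b, d // (g * s), g, s).transpose(2, 3)
                x = x + self.mixer(x)                # residual group-wise mixing
                return x.transpose(2, 3).reshape(b, d)

        # Three layers with strides 1, 4, 16 jointly mix all 64 dimensions.
        dim, g = 64, 4
        net = nn.Sequential(*[ButterflyMLPLayer(dim, g, stride=g ** i) for i in range(3)])
        out = net(torch.randn(8, dim))

    Swapping the small MLP for an attention block applied along the sequence axis would correspond, in this sketch, to the Butterfly Attention variant described in the abstract.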

    Importance Estimation with Random Gradient for Neural Network Pruning

    Full text link
    Global Neuron Importance Estimation is used to prune neural networks for efficiency reasons. To determine the global importance of each neuron or convolutional kernel, most existing methods use activation information, gradient information, or both, which demands abundant labelled examples. In this work, we use heuristics to derive importance estimates similar to Taylor First Order (TaylorFO) approximation based methods. We name our methods TaylorFO-abs and TaylorFO-sq. We propose two additional techniques to improve these importance estimation methods. First, we propagate random gradients from the last layer of the network, thus avoiding the need for labelled examples. Second, we normalize the gradient magnitude of the last layer output before propagating, which allows all examples to contribute similarly to the importance score. With these additional techniques, our methods perform better than previous methods when tested on ResNet and VGG architectures on the CIFAR-100 and STL-10 datasets. Furthermore, our method also complements existing methods and improves their performance when combined with them. Comment: 7 pages, 2 figures, ICLR 2023 Workshop on Sparsity in Neural Networks. arXiv admin note: text overlap with arXiv:2306.1320
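
    A rough sketch, assuming PyTorch, of how the label-free random-gradient importance scoring described above could be computed; the function name taylorfo_abs_scores, the per-example normalization via F.normalize, and the restriction to Conv2d channels are assumptions for illustration, not the paper's exact procedure.

        # Hypothetical sketch of label-free, TaylorFO-style channel importance
        # scoring with random output gradients; names are illustrative.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        def taylorfo_abs_scores(model, data_loader, device="cpu"):
            """Accumulate |activation * gradient| per channel of every Conv2d layer,
            using a random, per-example-normalized gradient at the network output
            instead of a label-based loss gradient."""
            acts, scores = {}, {}

            def save_act(name):
                def hook(_module, _inputs, output):
                    output.retain_grad()    # keep the gradient of the activation itself
                    acts[name] = output
                return hook

            hooks = [m.register_forward_hook(save_act(n))
                     for n, m in model.named_modules() if isinstance(m, nn.Conv2d)]

            model.eval()
            for x, _ in data_loader:        # labels are never used
                x = x.to(device)
                out = model(x)                                   # (batch, num_classes)
                g = F.normalize(torch.randn_like(out), dim=1)    # random, unit-norm gradient
                model.zero_grad()
                out.backward(g)
                for name, a in acts.items():
                    s = (a * a.grad).abs().sum(dim=(0, 2, 3))    # per-channel TaylorFO-abs
                    scores[name] = scores.get(name, 0) + s.detach()

            for h in hooks:
                h.remove()
            return scores                   # lower score => better pruning candidate

    In this sketch, replacing .abs() with .pow(2) would give the TaylorFO-sq variant.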

    Input Invex Neural Network

    Full text link
    In this paper, we present a novel method to constrain invexity on Neural Networks (NN). Invex functions ensure that every stationary point is a global minimum, so gradient descent started from any point will reach the global minimum. Another advantage of invexity in NNs is the ability to divide the data space locally into two connected sets with a highly non-linear decision boundary simply by thresholding the output. To this end, we formulate a universal invex function approximator and employ it to enforce invexity in NNs. We call the result Input Invex Neural Networks (II-NN). We first fit the data with a known invex function, then modify it with a NN, compare the gradient directions, and penalize the NN's gradient wherever it contradicts the direction of the reference invex function. To penalize the gradient direction we apply Gradient Clipped Gradient Penalty (GC-GP). We applied our method to existing NNs for both image classification and regression tasks. Extensive empirical and qualitative experiments show that our method gives performance similar to an ordinary NN while retaining invexity. Our method outperforms a linear NN and the Input Convex Neural Network (ICNN) by a large margin. We publish our code and implementation details on GitHub. Comment: 20 pages
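
    As a rough illustration, here is how a gradient-direction penalty in the spirit of GC-GP might look in PyTorch; the reference invex function (a simple squared distance, which is convex and hence invex), the clipping value, and the loss weighting are assumptions, not the paper's exact formulation.

        # Hypothetical sketch of a gradient-direction penalty (GC-GP-style);
        # the reference function and hyperparameters are assumptions.
        import torch
        import torch.nn as nn

        def invex_direction_penalty(net, x, center, clip_val=1.0):
            """Penalize input-gradients of `net` that point against the gradient of
            a reference invex function f_ref(x) = ||x - center||^2. The penalty is
            clipped so well-aligned points contribute nothing beyond the bound."""
            x = x.clone().requires_grad_(True)
            y = net(x).sum()
            grad_net = torch.autograd.grad(y, x, create_graph=True)[0]   # d net / d x
            grad_ref = 2.0 * (x - center)                                 # d f_ref / d x
            cos = nn.functional.cosine_similarity(
                grad_net.flatten(1), grad_ref.flatten(1), dim=1)
            # Penalize only disagreeing directions (cosine < 0); clip the magnitude.
            return torch.clamp(-cos, min=0.0, max=clip_val).mean()

        # Usage: add the penalty to the task loss during training.
        net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
        x = torch.randn(16, 2)
        center = torch.zeros(2)
        loss = net(x).pow(2).mean() + 0.1 * invex_direction_penalty(net, x, center)
        loss.backward()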

    Challenges to the peace process in Nepal

    Get PDF

    NepBERTa: Nepali Language Model Trained in a Large Corpus

    Get PDF
    We would like to thank Google's TPU Research Cloud program for providing us with free and unlimited usage of TPU v3-128 for 90 days. It would not have been possible without the continuous support and responsiveness of the TRC team. Publisher PDF

    The Barriers to Community Forest Management: A Case Study of Community Forest User Groups in Palpa

    Get PDF