
    Universality of underlying mechanism for successful deep learning

    An underlying mechanism for successful deep learning (DL) with a limited deep architecture and dataset, namely VGG-16 on CIFAR-10, was recently presented based on a quantitative method to measure the quality of a single filter in each layer. In this method, each filter identifies small clusters of possible output labels, with additional noise consisting of labels selected outside the clusters. This feature is progressively sharpened with the layers, resulting in an enhanced signal-to-noise ratio (SNR) and higher accuracy. In this study, the suggested universal mechanism is verified for VGG-16 and EfficientNet-B0 trained on the CIFAR-100 and ImageNet datasets, with the following main results. First, the accuracy progressively increases with the layers, whereas the noise per filter typically progressively decreases. Second, for a given deep architecture, the maximal error rate increases approximately linearly with the number of output labels. Third, the average filter cluster size and the number of clusters per filter at the last convolutional layer adjacent to the output layer are almost independent of the number of dataset labels in the range [3, 1,000], while a high SNR is preserved. The presented DL mechanism suggests several techniques, such as applying filters' cluster connections (AFCC), to improve the computational complexity and accuracy of deep architectures, and furthermore points to simplifying pre-existing structures while maintaining their accuracies.
    Comment: 27 pages, 5 figures, 6 tables. arXiv admin note: text overlap with arXiv:2305.1807
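    As a rough illustration of the per-filter quality measure described above, the following Python sketch scores a single filter by treating its few strongest output labels as a cluster (signal) and the remaining labels as noise. The matrix `field`, the function `filter_cluster_snr`, and the cluster size are our own illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch of a per-filter quality score: the filter's few
# strongest output labels form its cluster (signal); the rest are noise.
# `field`, `filter_cluster_snr`, and `cluster_size` are our assumptions.
import numpy as np

def filter_cluster_snr(field: np.ndarray, cluster_size: int = 3) -> float:
    """field[i, j]: averaged output field for input label i, output label j."""
    signals, noises = [], []
    for row in field:
        order = np.argsort(row)[::-1]               # strongest outputs first
        signals.append(row[order[:cluster_size]].sum())
        noises.append(row[order[cluster_size:]].sum())
    return float(np.mean(signals) / max(np.mean(noises), 1e-12))

# Toy example: 10 input labels x 10 output labels for one filter.
rng = np.random.default_rng(0)
field = rng.random((10, 10))
field[:, :3] += 2.0            # this filter "prefers" a small label cluster
print(f"per-filter SNR ~ {filter_cluster_snr(field):.2f}")
```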

    Efficient shallow learning as an alternative to deep learning

    The realization of complex classification tasks requires training deep learning (DL) architectures consisting of tens or even hundreds of convolutional and fully connected hidden layers, which is far from the reality of the human brain. According to the DL rationale, the first convolutional layer reveals localized patterns in the input, and the following layers reveal progressively larger-scale patterns, until a class of inputs is reliably characterized. Here, we demonstrate that with a fixed ratio between the depths of the first and second convolutional layers, the error rates of the generalized shallow LeNet architecture, consisting of only five layers, decay as a power law with the number of filters in the first convolutional layer. The extrapolation of this power law indicates that the generalized LeNet can achieve the small error rates previously obtained for the CIFAR-10 database using DL architectures. A power law with a similar exponent also characterizes the generalized VGG-16 architecture; however, it entails a significantly larger number of operations to achieve a given error rate than LeNet. This power-law phenomenon governs various generalized LeNet and VGG-16 architectures, hinting at its universal behavior and suggesting a quantitative hierarchical time-space complexity among machine learning architectures. Additionally, a conservation law along the convolutional layers, keeping the square root of their size times their depth constant, is found to asymptotically minimize error rates. The efficient shallow learning demonstrated in this study calls for further quantitative examination using various databases and architectures, and for its accelerated implementation using future dedicated hardware developments.
    Comment: 26 pages, 4 figures (improved figure resolution)
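    To make the power-law extrapolation concrete, here is a minimal sketch of fitting error_rate ~ A * d^(-rho) in log-log space, where d is the number of filters in the first convolutional layer. The data points and names below are invented for illustration; they are not taken from the paper.

```python
# Minimal sketch: fit error_rate ~ A * d**(-rho), where d is the number of
# filters in the first convolutional layer, then extrapolate. The data
# points below are hypothetical, for illustration only.
import numpy as np

d = np.array([8, 16, 32, 64, 128])                # filters in first layer
err = np.array([0.30, 0.24, 0.19, 0.15, 0.12])    # hypothetical error rates

# A power law is linear in log-log space: log(err) = log(A) - rho * log(d).
slope, intercept = np.polyfit(np.log(d), np.log(err), 1)
rho, A = -slope, np.exp(intercept)
print(f"fitted exponent rho ~ {rho:.2f}")

# Extrapolate the fitted law to a much wider first layer.
d_new = 1024
print(f"predicted error rate at d={d_new}: {A * d_new**(-rho):.3f}")
```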

    Enhancing the success rates by performing pooling decisions adjacent to the output layer

    Learning classification tasks of (2^n × 2^n) inputs typically involves applying ≤ n (2 × 2) max-pooling (MP) operators along the entire feedforward deep architecture. Here we show, using the CIFAR-10 database, that pooling decisions adjacent to the last convolutional layer significantly enhance accuracy success rates (SRs). In particular, the average SRs of the advanced VGG with m layers (A-VGGm) architectures are 0.936, 0.940, 0.954, 0.955, and 0.955 for m = 6, 8, 14, 13, and 16, respectively. The results indicate that A-VGG8's SR is superior to VGG16's, and that the SRs of A-VGG13 and A-VGG16 are equal and comparable to that of Wide-ResNet16. In addition, replacing the three fully connected (FC) layers with a single FC layer (A-VGG6 and A-VGG14) or with several linear-activation FC layers yielded similar SRs. These significantly enhanced SRs stem from training the most influential input-output routes, in comparison with the inferior routes selected following multiple MP decisions along the deep architecture. In addition, the SRs are sensitive to the order of the non-commutative MP and average pooling operators adjacent to the output layer, which varies the number and location of the trained routes. The results call for a reexamination of previously proposed deep architectures and their SRs, utilizing the proposed pooling strategy adjacent to the output layer.
    Comment: 27 pages, 3 figures, 1 table and Supplementary Information
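    The architectural idea, deferring the non-commutative pooling decisions to just before the output layer and collapsing the FC stack to a single layer, can be sketched in PyTorch as follows. This is a minimal sketch under our own assumptions (layer widths, kernel sizes, pooling order), not the authors' A-VGG implementation.

```python
# Minimal PyTorch sketch (our assumptions, not the authors' A-VGG code):
# a pooling-free convolutional stack whose max/average pooling decisions
# are performed adjacent to the output layer, followed by a single FC layer.
import torch
import torch.nn as nn

class PoolAdjacentNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(      # no interleaved max-pooling
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(inplace=True),
        )
        # MP and average pooling do not commute, so their order here
        # changes which input-output routes are trained.
        self.pool = nn.Sequential(nn.MaxPool2d(4), nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Linear(256, num_classes)  # single FC layer

    def forward(self, x):
        x = self.pool(self.features(x)).flatten(1)
        return self.classifier(x)

model = PoolAdjacentNet()
print(model(torch.randn(2, 3, 32, 32)).shape)   # torch.Size([2, 10])
```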

    The mechanism underlying successful deep learning

    Deep architectures consist of tens or hundreds of convolutional layers (CLs) that terminate with a few fully connected (FC) layers and an output layer representing the possible labels of a complex classification task. According to the existing deep learning (DL) rationale, the first CL reveals localized features from the raw data, whereas the subsequent layers progressively extract higher-level features required for refined classification. This article presents an efficient three-phase procedure for quantifying the mechanism underlying successful DL. First, a deep architecture is trained to maximize the success rate (SR). Next, the weights of the first several CLs are fixed and only the concatenated new FC layer connected to the output is trained, resulting in SRs that progress with the layers. Finally, the trained FC weights are silenced, except for those emerging from a single filter, enabling the quantification of the functionality of this filter using a correlation matrix between input labels and averaged output fields; hence, a well-defined set of quantifiable features is obtained. Each filter essentially selects a single output label independent of the input label, which would seem to prevent high SRs; however, it counterintuitively identifies a small subset of possible output labels. This feature is an essential part of the underlying DL mechanism and is progressively sharpened with the layers, resulting in enhanced signal-to-noise ratios and SRs. Quantitatively, this mechanism is exemplified by VGG-16, VGG-6, and A-VGG16. The proposed mechanism underlying DL provides an accurate tool for identifying the quality of each filter and is expected to guide additional procedures for improving the SR, computational complexity, and latency of DL.
    Comment: 33 pages, 8 figures
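    The third phase, silencing all trained FC weights except those emerging from one filter and averaging the output fields per input label, might look schematically like the numpy sketch below. The shapes, names, and the per-filter weight layout are our assumptions, not the paper's code.

```python
# Schematic numpy sketch of the third phase (our paraphrase, not the
# authors' code): keep only the FC weights emerging from a single filter
# and average the resulting output fields per input label.
import numpy as np

rng = np.random.default_rng(1)
n_labels, n_filters, feat_per_filter = 10, 64, 4
W = rng.normal(size=(n_labels, n_filters * feat_per_filter))  # trained FC

def single_filter_fields(W, features, labels, filt):
    """Averaged output fields per input label, with only `filt` kept alive."""
    mask = np.zeros(W.shape[1])
    lo = filt * feat_per_filter
    mask[lo:lo + feat_per_filter] = 1.0       # silence all other filters
    outputs = features @ (W * mask).T         # forward pass of silenced head
    return np.stack([outputs[labels == lbl].mean(axis=0)
                     for lbl in range(n_labels)])

features = rng.normal(size=(500, n_filters * feat_per_filter))
labels = rng.integers(0, n_labels, size=500)
fields = single_filter_fields(W, features, labels, filt=3)
print(fields.shape)   # (10, 10): input label x averaged output field
```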

    Optical Cryptanalysis: Recovering Cryptographic Keys from Power LED Light Fluctuations

    Although power LEDs have been integrated in various devices that perform cryptographic operations for decades, the cryptanalytic risk they pose has not yet been investigated. In this paper, we present optical cryptanalysis, a new form of cryptanalytic side-channel attack, in which secret keys are extracted by using a photodiode to measure the light emitted by a device’s power LED and analyzing subtle fluctuations in the light intensity during cryptographic operations. We analyze the optical leakage of the power LEDs of various consumer devices and the factors that affect the optical SNR. We then demonstrate end-to-end optical cryptanalytic attacks against a range of consumer devices (smartphone, smartcard, and Raspberry Pi, along with their USB peripherals) and recover secret keys (RSA, ECDSA, SIKE) from prior and recent versions of popular cryptographic libraries (GnuPG, Libgcrypt, PQCrypto-SIDH) from a maximum distance of 25 meters.
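    While the measurement chain (photodiode, optics) is hardware, the downstream signal analysis can be caricatured in a few lines: recover a key-dependent bit pattern from small intensity steps in a noisy trace, in the spirit of simple power analysis. Everything below is synthetic and only gestures at the idea; it is not the paper's attack pipeline.

```python
# Toy illustration (synthetic data, not the paper's pipeline): recover a
# key-dependent bit pattern from small intensity fluctuations in a sampled
# LED trace, in the spirit of simple power/optical analysis.
import numpy as np

bits = np.array([1, 0, 1, 1, 0, 0, 1, 0])          # pretend key-dependent ops
samples_per_op = 100
trace = np.repeat(bits * 0.05, samples_per_op)     # tiny intensity steps
trace = trace + 1.0 + np.random.default_rng(2).normal(scale=0.01,
                                                      size=trace.size)

# Denoise by averaging within each operation window, then threshold at the
# midpoint between the lowest and highest window means.
means = trace.reshape(-1, samples_per_op).mean(axis=1)
thresh = (means.min() + means.max()) / 2
recovered = (means > thresh).astype(int)
print("recovered bits match:", np.array_equal(recovered, bits))
```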

    The Missing Linkage: Building Effective Governance for Joint and Network Evaluation

    Research on governance network evaluation has made great strides in determining what effectiveness is and effectiveness for whom, often introducing a set of indicators or objectives as a basis for network evaluation. On this basis, governance theorists have developed frameworks and models to increase network effectiveness in relation to specific determinants. The discussion of indicators and determinants of network effectiveness has been followed by innovative methods to analyse and evaluate them. Yet the linkage between indicators and determinants of effectiveness and methodologies of network evaluation is still unclear, as different evaluation methods can be managed and applied in various ways depending on the nature of the network and the programmes to be evaluated. Concrete applications of joint and network evaluations mirror this theoretical gap, where inappropriate evaluation approaches and management practices cause delays, increases in transaction costs, cumbersome governance structures, dissatisfaction of actors with current evaluation practices, and unresolved conflicts that hinder engagement between actors in future collaborative performance assessments. In this dissertation, we frame this problem as a problem of governance and aim to generate knowledge concerning ways to improve governance and increase governability, without giving up the ability to evaluate the processes and final results of highly complex programmes and networks. Differentiating between aspects of demand and supply in evaluation systems, and conceiving varied levels of dynamics, complexity and diversity as given conditions to which governance should adapt, the research question of this PhD thesis is: How can the governing system in joint and network evaluations best be adapted to the programmes to be evaluated?
    By integrating the parameters of evaluation, complexity and governance, this dissertation introduces a configurative model based on levels of dynamics, complexity and diversity that provides procedures for building effective governance for joint and network evaluations as the linkage between network effectiveness and the methods used to assess it. Following a retroductive research design, the model is exemplified in five case studies in order to enrich it and examine its validity and practical application. The field of development aid was chosen as an appropriate arena in which to develop and test the model because of the rich practice and experience generated in this field. However, based on governance network, complexity and evaluation theories, the model is designed to be applied to various fields, all types of networks and diverse evaluation types. As the model conceives of network evaluation as a "network within a network", it goes beyond the evaluation process itself and points to a new direction for exploring network effectiveness as a whole.
    In addition to its direct practical contribution, this dissertation adds to the theoretical discussion on network governance and effectiveness. It rejects the current assumption that networks characterized by coordination and cooperation should necessarily be assessed differently from those characterized by collaboration, since each network type can be classified as having either high or low levels of complexity. Furthermore, it rejects the traditional dichotomy between participatory, multi-perspective evaluation and conventional performance indicators, and the claim that goal-attainment methods based on ex-ante formulated objectives have no credibility in collaborative performance assessments. Instead, this dissertation shows that performance measurement and management practices can complement participatory process evaluation, and that ex-ante assumptions may also be important tools in complex network evaluation, depending on the levels of system dynamics. Other main theoretical contributions of this dissertation are a different but practical and innovative method to analyse complexity, a differentiation between the governing systems and the systems to be governed in networks, and a revision of various central concepts in organization and network theories. Finally, perhaps the most important contribution of this dissertation is its transdisciplinary approach to governance and evaluation, demonstrating that the two disciplines have much to contribute to and learn from each other.

    Nuclear spin effects in biological processes

    Traditionally, nuclear spin is not considered to affect biological processes. Recently, this has changed, as isotopic fractionation that deviates from classical mass dependence was reported both in vitro and in vivo; in these cases, the isotopic effect correlates with the nuclear magnetic spin. Here, we show nuclear spin effects using stable oxygen isotopes (16O, 17O, and 18O) in two separate setups: an artificial dioxygen production system and biological aquaporin channels in cells. We observe that oxygen dynamics in chiral environments (in particular its transport) depend on nuclear spin, suggesting future applications in controlled isotope separation, for instance for NMR. To explain the mechanism behind our findings, we formulate theoretical models based on a nuclear-spin-enhanced switch between electronic spin states. Accounting for the role of nuclear spin in biology can provide insights into the role of quantum effects in living systems and help inspire the development of future biotechnology solutions.

    IHH enhancer variant within neighboring NHEJ1 intron causes microphthalmia anophthalmia and coloboma

    Genomic sequences residing within the introns of a few genes have been shown to act as enhancers affecting the expression of neighboring genes. We studied an autosomal recessive phenotypic continuum of microphthalmia, anophthalmia and ocular coloboma with no apparent coding-region disease-causing mutation. Homozygosity mapping of several affected Jewish Iranian families, combined with whole-genome sequence analysis, identified a 0.5 Mb disease-associated chromosome 2q35 locus (maximal LOD score 6.8) harboring an intronic founder variant in NHEJ1 that is not predicted to affect NHEJ1 itself. The human NHEJ1 intronic variant lies within a previously known limb-development-specific enhancer of the neighboring gene Indian hedgehog (Ihh), which is known to be involved in eye development in mice and chickens. Through molecular development studies in mouse and chicken, we demonstrated that the variant lies within an Ihh enhancer that drives gene expression in the developing eye and that the identified variant alters this eye-specific enhancer activity. We thus delineate an Ihh enhancer active in mammalian eye development whose variant causes human microphthalmia, anophthalmia and ocular coloboma. The findings highlight disease causation by an intronic variant affecting the expression of a neighboring gene, delineating molecular pathways of eye development.