8 research outputs found

    Appearance of Random Matrix Theory in Deep Learning

    We investigate the local spectral statistics of the loss surface Hessians of artificial neural networks, where we discover excellent agreement with Gaussian Orthogonal Ensemble statistics across several network architectures and datasets. These results shed new light on the applicability of Random Matrix Theory to modelling neural networks and suggest a previously unrecognised role for it in the study of loss surfaces in deep learning. Inspired by these observations, we propose a novel model for the true loss surfaces of neural networks, consistent with our observations, which allows for Hessian spectral densities with rank degeneracy and outliers, extensively observed in practice, and predicts a growing independence of loss gradients as a function of distance in weight-space. We further investigate the importance of the true loss surface in neural networks and find, in contrast to previous work, that the exponential hardness of locating the global minimum has practical consequences for achieving state-of-the-art performance. Comment: 33 pages, 14 figures

    Iterative Averaging in the Quest for Best Test Error

    We analyse and explain the increased generalisation performance of iterate averaging using a Gaussian process perturbation model between the true and batch risk surface on the high-dimensional quadratic. We derive three phenomena from our theoretical results: (1) The importance of combining iterate averaging (IA) with large learning rates and regularisation for improved regularisation. (2) Justification for less frequent averaging. (3) That we expect adaptive gradient methods to work equally well, or better, with iterate averaging than their non-adaptive counterparts. Inspired by these results, together with empirical investigations of the importance of appropriate regularisation for the solution diversity of the iterates, we propose two adaptive algorithms with iterate averaging. These give significantly better results compared to stochastic gradient descent (SGD), require less tuning and do not require early stopping or validation set monitoring. We showcase the efficacy of our approach on the CIFAR-10/100, ImageNet and Penn Treebank datasets on a variety of modern and classical network architectures.

    Microprocessor mediates transcriptional termination of long noncoding RNA transcripts hosting microRNAs

    MicroRNAs (miRNAs) play a major role in the post-transcriptional regulation of gene expression. Mammalian miRNA biogenesis begins with co-transcriptional cleavage of RNA polymerase II (Pol II) transcripts by the Microprocessor complex. While most miRNAs are located within introns of protein-coding genes, a substantial minority of miRNAs originate from long noncoding (lnc) RNA, where transcript processing is largely uncharacterized. Here, by detailed characterization of liver-specific lnc-pri-miR-122 and genome-wide analysis, we show that most lnc-pri-miRNAs do not use the canonical cleavage and polyadenylation (CPA) pathway but instead use Microprocessor cleavage to terminate transcription. Microprocessor inactivation leads to extensive transcriptional readthrough of lnc-pri-miRNAs and transcriptional interference with downstream genes. Consequently, we define a novel RNase III-mediated, polyadenylation-independent mechanism of Pol II transcription termination in mammalian cells.

    Primary microRNA transcripts are processed co-transcriptionally.

    MicroRNAs (miRNAs) are generated from long primary (pri-) RNA polymerase II (Pol II)-derived transcripts by two RNase III processing reactions: Drosha cleavage of nuclear pri-miRNAs and Dicer cleavage of cytoplasmic pre-miRNAs. Here we show that Drosha cleavage occurs during transcription, acting on both independently transcribed and intron-encoded miRNAs. We also show that both 5'-3' and 3'-5' exonucleases associate with the sites where co-transcriptional Drosha cleavage occurs, promoting intron degradation before splicing. We finally demonstrate that miRNAs can also derive from 3' flanking transcripts of Pol II genes. Our results demonstrate that multiple miRNA-containing transcripts are co-transcriptionally cleaved during their synthesis and suggest that exonucleolytic degradation from Drosha cleavage sites in pre-mRNAs may influence the splicing and maturation of numerous mRNAs.