Appearance of Random Matrix Theory in Deep Learning
We investigate the local spectral statistics of the loss surface Hessians of
artificial neural networks, where we discover excellent agreement with Gaussian
Orthogonal Ensemble statistics across several network architectures and
datasets. These results shed new light on the applicability of Random Matrix
Theory to modelling neural networks and suggest a previously unrecognised role
for it in the study of loss surfaces in deep learning. Inspired by these
observations, we propose a novel model for the true loss surfaces of neural
networks, consistent with our observations, which allows for Hessian spectral
densities with rank degeneracy and outliers, extensively observed in practice,
and predicts a growing independence of loss gradients as a function of distance
in weight-space. We further investigate the importance of the true loss surface
in neural networks and find, in contrast to previous work, that the exponential
hardness of locating the global minimum has practical consequences for
achieving state-of-the-art performance. Comment: 33 pages, 14 figures
Iterative Averaging in the Quest for Best Test Error
We analyse and explain the increased generalisation performance of iterate
averaging using a Gaussian process perturbation model between the true and
batch risk surface on the high dimensional quadratic. We derive three phenomena
from our theoretical results: (1) The importance of combining
iterate averaging (IA) with large learning rates and regularisation for
improved regularisation. (2) Justification for less frequent averaging. (3)
That we expect adaptive gradient methods to work equally well, or better, with
iterate averaging than their non-adaptive counterparts. Inspired by these
results, together with empirical investigations of the importance
of appropriate regularisation for the solution diversity of the iterates, we
propose two adaptive algorithms with iterate averaging. These give
significantly better results compared to stochastic gradient descent (SGD),
require less tuning and do not require early stopping or validation set
monitoring. We showcase the efficacy of our approach on the CIFAR-10/100,
ImageNet and Penn Treebank datasets on a variety of modern and classical
network architectures.
Microprocessor mediates transcriptional termination of long noncoding RNA transcripts hosting microRNAs
MicroRNA (miRNA) play a major role in the post-transcriptional regulation of gene expression. Mammalian miRNA biogenesis begins with co-transcriptional cleavage of RNA polymerase II (Pol II) transcripts by the Microprocessor complex. While most miRNA are located within introns of protein coding genes, a substantial minority of miRNA originate from long noncoding (lnc) RNA where transcript processing is largely uncharacterized. Here, by detailed characterization of liver-specific lnc-pri-miR-122 and genome-wide analysis, we show that most lnc-pri-miRNA do not use the canonical cleavage and polyadenylation (CPA) pathway but instead use Microprocessor cleavage to terminate transcription. Microprocessor inactivation leads to extensive transcriptional readthrough of lnc-pri-miRNA and transcriptional interference with downstream genes. Consequently, we define a novel RNase III-mediated, polyadenylation-independent mechanism of Pol II transcription termination in mammalian cells.
Primary microRNA transcripts are processed co-transcriptionally.
microRNAs (miRNAs) are generated from long primary (pri-) RNA polymerase II (Pol II)-derived transcripts by two RNase III processing reactions: Drosha cleavage of nuclear pri-miRNAs and Dicer cleavage of cytoplasmic pre-miRNAs. Here we show that Drosha cleavage occurs during transcription, acting on both independently transcribed and intron-encoded miRNAs. We also show that both 5'-3' and 3'-5' exonucleases associate with the sites where co-transcriptional Drosha cleavage occurs, promoting intron degradation before splicing. We finally demonstrate that miRNAs can also derive from 3' flanking transcripts of Pol II genes. Our results demonstrate that multiple miRNA-containing transcripts are co-transcriptionally cleaved during their synthesis and suggest that exonucleolytic degradation from Drosha cleavage sites in pre-mRNAs may influence the splicing and maturation of numerous mRNAs.