Batch Normalization Orthogonalizes Representations in Deep Random Networks
This paper highlights a subtle property of batch normalization (BN):
successive batch normalizations interleaved with random linear transformations
make hidden representations increasingly orthogonal across the layers of a deep
neural network.
We establish a non-asymptotic characterization of the interplay between depth,
width, and the orthogonality of deep representations. More precisely, under a
mild assumption, we prove that the deviation of the representations from
orthogonality rapidly decays with depth up to a term inversely proportional to
the network width. This result has two main implications: 1) Theoretically, as
the depth grows, the distribution of the representation -- after the linear
layers -- contracts to a Wasserstein-2 ball around an isotropic Gaussian
distribution. Furthermore, the radius of this Wasserstein ball shrinks with the
width of the network. 2) In practice, the orthogonality of the representations
directly influences the performance of stochastic gradient descent (SGD). When
representations are initially aligned, we observe that SGD wastes many
iterations merely orthogonalizing them before it can make progress on the
classification task. Conversely, we experimentally show that starting
optimization from orthogonal representations is sufficient to accelerate SGD,
with no need for BN.
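As a back-of-the-envelope illustration of the claimed mechanism, the sketch below tracks how far the Gram matrix of a batch is from (a multiple of) the identity as the batch passes through random linear layers followed by batch normalization. The width, batch size, depth, and gap metric are illustrative choices, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_norm(h):
    # Normalize each feature to zero mean, unit variance across the batch.
    return (h - h.mean(axis=0)) / (h.std(axis=0) + 1e-8)

def orthogonality_gap(h):
    # Frobenius distance of the trace-normalized Gram matrix from the identity.
    g = h @ h.T
    g = g * (h.shape[0] / np.trace(g))
    return np.linalg.norm(g - np.eye(h.shape[0]))

width, batch, depth = 512, 8, 20
# A nearly aligned batch: every row is a small perturbation of one vector.
base = rng.standard_normal(width)
h = base + 0.01 * rng.standard_normal((batch, width))

gaps = [orthogonality_gap(h)]  # gap of the aligned input batch
for _ in range(depth):
    w = rng.standard_normal((width, width)) / np.sqrt(width)
    h = batch_norm(h @ w)
    gaps.append(orthogonality_gap(h))

print(f"gap of aligned input batch: {gaps[0]:.3f}")
print(f"gap after {depth} BN layers:   {gaps[-1]:.3f}")
```

In this toy run the gap is large for the aligned input and much smaller after the BN layers, consistent with the orthogonalization property the paper proves.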
On the impact of activation and normalization in obtaining isometric embeddings at initialization
In this paper, we explore the structure of the penultimate Gram matrix in
deep neural networks, which contains the pairwise inner products of outputs
corresponding to a batch of inputs. In several architectures it has been
observed that this Gram matrix becomes degenerate with depth at initialization,
which dramatically slows training. Normalization layers, such as batch or layer
normalization, play a pivotal role in preventing the rank collapse issue.
Despite promising advances, the existing theoretical results (i) do not extend
to layer normalization, which is widely used in transformers, and (ii) cannot
quantitatively characterize the bias of normalization at finite depth.
To bridge this gap, we provide a proof that layer normalization, in
conjunction with activation layers, biases the Gram matrix of a multilayer
perceptron towards isometry at an exponential rate with depth at
initialization. We quantify this rate using the Hermite expansion of the
activation function, highlighting the importance of higher-order Hermite
coefficients in the bias towards isometry.
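A small numerical sketch of this effect: the Gram matrix of a ReLU multilayer perceptron with layer normalization, fed a batch containing two nearly identical inputs, drifts toward the identity (isometry) with depth. The width, depth, and noise level below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def layer_norm(h):
    # Normalize each sample across its features (per row), as in transformers.
    return (h - h.mean(axis=1, keepdims=True)) / (h.std(axis=1, keepdims=True) + 1e-8)

def gram_deviation(h):
    # Distance of the width-scaled Gram matrix of the batch from the identity.
    g = (h @ h.T) / h.shape[1]
    return np.linalg.norm(g - np.eye(h.shape[0]))

width, batch, depth = 1024, 4, 40
x = rng.standard_normal((batch, width))
x[1] = x[0] + 0.1 * rng.standard_normal(width)  # two nearly identical inputs

h = layer_norm(x)
devs = [gram_deviation(h)]  # degenerate at the input: rows 0 and 1 align
for _ in range(depth):
    w = rng.standard_normal((width, width)) / np.sqrt(width)
    h = layer_norm(np.maximum(h @ w, 0.0))  # linear -> ReLU -> layer norm
    devs.append(gram_deviation(h))

print(f"Gram deviation at input:        {devs[0]:.3f}")
print(f"Gram deviation after {depth} layers: {devs[-1]:.3f}")
```

The per-row mean subtraction in layer normalization is what breaks the alignment between the two duplicated samples here; without it, a plain ReLU network would let their correlation persist.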
Learning Genomic Sequence Representations using Graph Neural Networks over De Bruijn Graphs
The rapid expansion of genomic sequence data calls for new methods to achieve
robust sequence representations. Existing techniques often neglect intricate
structural details, emphasizing mainly contextual information. To address this,
we developed k-mer embeddings that merge contextual and structural string
information by enhancing De Bruijn graphs with structural similarity
connections. Subsequently, we crafted a self-supervised method based on
Contrastive Learning that employs a heterogeneous Graph Convolutional Network
encoder and constructs positive pairs based on node similarities. Our
embeddings consistently outperform prior techniques for Edit Distance
Approximation and Closest String Retrieval tasks.
Comment: Poster at "NeurIPS 2023 New Frontiers in Graph Learning Workshop
(NeurIPS GLFrontiers 2023)"
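As background, the De Bruijn graph that the method enriches can be built in a few lines: nodes are the k-mers of the input sequences, and a directed edge links consecutive k-mers that overlap in k-1 characters. This is a generic sketch of the standard construction, not the authors' implementation, and it omits the structural-similarity edges the paper adds.

```python
from collections import defaultdict

def debruijn_graph(sequences, k):
    """Nodes are k-mers; an edge joins each k-mer to its successor,
    i.e. the k-mer starting one position later in the same sequence."""
    nodes = set()
    edges = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            nodes.add(kmer)
            if i > 0:
                prev = seq[i - 1:i - 1 + k]  # overlaps kmer in k-1 characters
                edges[prev].add(kmer)
    return nodes, edges

nodes, edges = debruijn_graph(["ACGTAC"], 3)
print(sorted(nodes))   # the four 3-mers of ACGTAC
print(dict(edges))     # chain ACG -> CGT -> GTA -> TAC
```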
Identification of medicinal plants effective on sinusitis native to Shiraz province in Iran
Sinusitis is one of the most common infectious diseases; it affects the cavities around the nose, namely the frontal, ethmoid, maxillary, and sphenoid sinuses. Symptoms usually include nasal congestion and obstruction, a feeling of pressure or fullness in the face, anterior or posterior nasal discharge, headaches, fever, swelling and erythema of the forehead or cheek, and cough. Signs may include edema and mucosal congestion, nasal drainage, posterior nasal discharge, nasal septum deviation, and polyps. The medicinal plants identified to treat sinusitis in Shiraz include, for instance, Amygdalus scoparia Spach, Echinophora platyloba DC., Haplophyllum perforatum L., Lavandula stoechas L., Borago officinalis, Matricaria recutita, and Descurainia sophia (L.) Schr. Many of these plants have antioxidant activity and contain bioactive compounds such as flavonoids, polyphenols, anthocyanins, tannins, and many other pharmaceutically active ingredients that have effects on sinusitis. This paper aims to review recently published papers on this topic.
EEG-Based Functional Brain Networks: Does the Network Size Matter?
Functional connectivity in the human brain can be represented as a network using electroencephalography (EEG) signals. These networks, whose node counts can vary from tens to hundreds, are characterized by neurobiologically meaningful graph-theory metrics. This study investigates the degree to which various graph metrics depend upon the network size. To this end, EEGs from 32 normal subjects were recorded and functional networks of three different sizes were extracted. A state-space-based method was used to calculate cross-correlation matrices between different brain regions. These correlation matrices were used to construct binary adjacency connectomes, which were assessed with regard to a number of graph metrics such as clustering coefficient, modularity, efficiency, economic efficiency, and assortativity. We showed that the estimates of these metrics differ significantly depending on the network size. Larger networks had higher efficiency, higher assortativity, and lower modularity than smaller networks of the same density. These findings indicate that network size should be considered in any comparison of networks across studies.
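The pipeline the abstract describes (pairwise correlations, thresholded into a binary adjacency matrix, then scored with graph metrics) can be sketched for one metric, global efficiency, at two network sizes. The synthetic signal model and the threshold are illustrative assumptions, and a plain Pearson correlation stands in here for the paper's state-space estimator.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(2)

def binary_adjacency(signals, threshold):
    # Absolute pairwise channel correlations, thresholded into an unweighted graph.
    c = np.abs(np.corrcoef(signals))
    np.fill_diagonal(c, 0.0)
    return (c > threshold).astype(int)

def global_efficiency(adj):
    # Mean inverse shortest-path length over all ordered node pairs (BFS per node).
    n = len(adj)
    total = 0.0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n):
                if adj[u][v] and dist[v] < 0:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(1.0 / d for d in dist if d > 0)
    return total / (n * (n - 1))

def efficiency_for(n_channels, samples=2000):
    # Synthetic "EEG": one shared source mixed into independent channel noise.
    source = rng.standard_normal(samples)
    signals = 0.5 * source + rng.standard_normal((n_channels, samples))
    return global_efficiency(binary_adjacency(signals, threshold=0.2))

e_small = efficiency_for(16)
e_large = efficiency_for(64)
print(f"efficiency, 16 nodes: {e_small:.3f}")
print(f"efficiency, 64 nodes: {e_large:.3f}")
```

Comparing the two values at matched density is the kind of size-dependence check the study performs on real recordings.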
On bridging the gap between mean field and finite width deep random multilayer perceptron with batch normalization
ISSN:2640-349