188 research outputs found
A Convergence Rate for Manifold Neural Networks
High-dimensional data arises in numerous applications, and the rapidly
developing field of geometric deep learning seeks to develop neural network
architectures to analyze such data in non-Euclidean domains, such as graphs and
manifolds. Recent work by Z. Wang, L. Ruiz, and A. Ribeiro has introduced a
method for constructing manifold neural networks using the spectral
decomposition of the Laplace Beltrami operator. Moreover, in this work, the
authors provide a numerical scheme for implementing such neural networks when
the manifold is unknown and one only has access to finitely many sample points.
The authors show that this scheme, which relies upon building a data-driven
graph, converges to the continuum limit as the number of sample points tends to
infinity. Here, we build upon this result by establishing a rate of convergence
that depends on the intrinsic dimension of the manifold but is independent of
the ambient dimension. We also discuss how the rate of convergence depends on
the depth of the network and the number of filters used in each layer.
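The data-driven graph construction described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the sample points lie on a unit circle, the graph uses Gaussian weights with a hypothetical bandwidth eps, the Laplacian is left unnormalized, and the filter is a polynomial in the Laplacian (a spectral filter with h(lambda) = sum_k c_k lambda^k, written without an explicit eigendecomposition).

```python
import math

def circle_points(n):
    """Sample n evenly spaced points on the unit circle (a 1-D manifold in R^2)."""
    return [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
            for i in range(n)]

def graph_laplacian(points, eps):
    """Unnormalized Laplacian L = D - W with Gaussian weights exp(-|xi - xj|^2 / eps)."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / eps)
    L = [[-W[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        L[i][i] = sum(W[i])  # degree on the diagonal
    return L

def matvec(M, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def polynomial_filter(L, x, coeffs):
    """Apply h(L) x = sum_k coeffs[k] * L^k x."""
    out = [0.0] * len(x)
    power = list(x)  # holds L^k x, starting at k = 0
    for c in coeffs:
        out = [o + c * p for o, p in zip(out, power)]
        power = matvec(L, power)
    return out

points = circle_points(20)
L = graph_laplacian(points, eps=0.5)       # eps is a hypothetical bandwidth
signal = [p[0] for p in points]            # the coordinate function cos(theta)
filtered = polynomial_filter(L, signal, coeffs=[1.0, -0.1])
```

Because every row of L sums to zero, a constant signal passes through the filter scaled only by the first coefficient, which gives a quick sanity check on the construction.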
Curvature corrected tangent space-based approximation of manifold-valued data
When generalizing schemes for real-valued data approximation or decomposition
to data living in Riemannian manifolds, tangent space-based schemes are very
attractive for the simple reason that these spaces are linear. An open
challenge is to do this in such a way that the generalized scheme is applicable
to general Riemannian manifolds, is global-geometry aware and is
computationally feasible. Existing schemes have been unable to account for all
three of these key factors at the same time.
In this work, we take a systematic approach to developing a framework that is
able to account for all three factors. First, we will restrict ourselves to the
-- still general -- class of symmetric Riemannian manifolds and show how
curvature affects general manifold-valued tensor approximation schemes. Next,
we show how the latter observations can be used in a general strategy for
developing approximation schemes that are also global-geometry aware. Finally,
having taken general applicability and global-geometry awareness into account,
we restrict ourselves once more in a case study on low-rank approximation. Here
we show how computational feasibility can be achieved and propose the
curvature-corrected truncated higher-order singular value decomposition
(CC-tHOSVD), whose performance is subsequently tested in numerical experiments
with both synthetic and real data living in symmetric Riemannian manifolds with
both positive and negative curvature.
Manifold Filter-Combine Networks
We introduce a large class of manifold neural networks (MNNs) which we call
Manifold Filter-Combine Networks. This class includes, as special cases, the
MNNs considered in previous work by Wang, Ruiz, and Ribeiro, the manifold
scattering transform (a wavelet-based model of neural networks), and other
interesting examples not previously considered in the literature such as the
manifold equivalent of Kipf and Welling's graph convolutional network. We then
consider a method, based on building a data-driven graph, for implementing such
networks when one does not have global knowledge of the manifold, but merely
has access to finitely many sample points. We provide sufficient conditions for
the network to provably converge to its continuum limit as the number of sample
points tends to infinity. Unlike previous work (which focused on specific MNN
architectures and graph constructions), our rate of convergence does not
explicitly depend on the number of filters used. Moreover, it exhibits linear
dependence on the depth of the network rather than the exponential dependence
obtained previously.
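As a point of reference for the special case mentioned above, here is a minimal sketch of one layer of Kipf and Welling's graph convolutional network, relu(A_hat X Theta) with A_hat = D^{-1/2} (A + I) D^{-1/2}. It is the graph version, not the manifold version proposed in the paper; the split into a "filter" step and a "combine" step is an illustrative reading of the name, and the helper names gcn_layer and matmul are hypothetical.

```python
import math

def matmul(P, Q):
    """Dense matrix product for list-of-rows matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def gcn_layer(A, X, Theta):
    """One GCN layer: relu(A_hat X Theta), A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    n = len(A)
    # Add self-loops, then normalize symmetrically by the new degrees.
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in A_tilde]
    A_hat = [[A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    filtered = matmul(A_hat, X)         # filter: smooth each channel over the graph
    combined = matmul(filtered, Theta)  # combine: mix channels with learned weights
    return [[max(0.0, v) for v in row] for row in combined]  # ReLU

# Two nodes joined by one edge, two input channels, two output channels.
A = [[0.0, 1.0], [1.0, 0.0]]
X = [[1.0, 0.0], [0.0, 1.0]]
Theta = [[1.0, -1.0], [1.0, -1.0]]   # illustrative weights, not learned
out = gcn_layer(A, X, Theta)
```

In this toy case A_hat has every entry equal to 0.5, so both nodes receive the same averaged features before the channels are mixed.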
birgHPC: creating instant computing clusters for bioinformatics and molecular dynamics
Summary: birgHPC, a bootable Linux Live CD, has been developed to create high-performance clusters for bioinformatics and molecular dynamics studies using any computers connected by a Local Area Network (LAN). birgHPC features automated hardware and slot detection and provides a simple job submission interface. The latest versions of GROMACS, NAMD, mpiBLAST and ClustalW-MPI can be run in parallel by simply booting the birgHPC CD or flash drive from the head node, which immediately positions the rest of the PCs on the network as computing nodes. Thus, a temporary, affordable, scalable and high-performance computing environment can be built by non-computing-based researchers using low-cost commodity hardware.
Directed Scattering for Knowledge Graph-based Cellular Signaling Analysis
Directed graphs are a natural model for many phenomena, in particular
scientific knowledge graphs such as molecular interaction or chemical reaction
networks that define cellular signaling relationships. In these situations,
source nodes typically have distinct biophysical properties from sinks. Due to
their ordered and unidirectional relationships, many such networks also have
hierarchical and multiscale structure. However, the majority of methods
performing node- and edge-level tasks in machine learning do not take these
properties into account, and thus have not been leveraged effectively for
scientific tasks such as cellular signaling network inference. We propose a new
framework called Directed Scattering Autoencoder (DSAE) which uses a directed
version of a geometric scattering transform, combined with the non-linear
dimensionality reduction properties of an autoencoder and the geometric
properties of the hyperbolic space to learn latent hierarchies. We show this
method outperforms numerous others on tasks such as embedding directed graphs
and learning cellular signaling networks. Comment: 5 pages, 3 figures
Multi-scale Hybridized Topic Modeling: A Pipeline for Analyzing Unstructured Text Datasets via Topic Modeling
We propose a multi-scale hybridized topic modeling method to find hidden topics from transcribed interviews more accurately and efficiently than traditional topic modeling methods. Our multi-scale hybridized topic modeling method (MSHTM) approaches data at different scales and performs topic modeling in a hierarchical way, first using a classical method, Nonnegative Matrix Factorization (NMF), and then a transformer-based method, BERTopic. It harnesses the strengths of both NMF and BERTopic. Our method can help researchers and the public better extract and interpret the interview information. Additionally, it provides insights for new indexing systems based on the topic level. We then deploy our method on real-world interview transcripts and find promising results.
The Manifold Scattering Transform for High-Dimensional Point Cloud Data
The manifold scattering transform is a deep feature extractor for data
defined on a Riemannian manifold. It is one of the first examples of extending
convolutional neural network-like operators to general manifolds. The initial
work on this model focused primarily on its theoretical stability and
invariance properties but did not provide methods for its numerical
implementation except in the case of two-dimensional surfaces with predefined
meshes. In this work, we present practical schemes, based on the theory of
diffusion maps, for implementing the manifold scattering transform on datasets
arising in naturalistic systems, such as single cell genetics, where the data
is a high-dimensional point cloud modeled as lying on a low-dimensional
manifold. We show that our methods are effective for signal classification and
manifold classification tasks. Comment: Accepted for publication in the TAG in DS Workshop at ICML. For
subsequent theoretical guarantees, please see Section 6 of arXiv:2208.0856
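A diffusion-based wavelet transform on a point cloud can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the bandwidth eps, the row-normalized Markov matrix P, the dyadic wavelets W_0 = I - P and W_j = P^(2^(j-1)) - P^(2^j), and the mean-of-absolute-values aggregation are all choices made here for the example.

```python
import math

def markov_matrix(points, eps):
    """Row-normalized Gaussian kernel P = D^{-1} K on a point cloud."""
    K = [[math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / eps)
          for q in points] for p in points]
    return [[k / sum(row) for k in row] for row in K]

def matvec(M, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def diffusion_wavelets(P, x, J):
    """Return ([W_0 x, ..., W_J x], P^(2^J) x) for the dyadic wavelets above."""
    prev = list(x)       # P^0 x
    cur = matvec(P, x)   # P^1 x
    coeffs = [[a - b for a, b in zip(prev, cur)]]   # W_0 x = x - P x
    steps = 1
    for _ in range(1, J + 1):
        prev = cur
        for _ in range(steps):   # advance from P^(2^(j-1)) x to P^(2^j) x
            cur = matvec(P, cur)
        coeffs.append([a - b for a, b in zip(prev, cur)])
        steps *= 2
    return coeffs, cur

# Hypothetical point cloud: 30 samples of a helix in R^3.
points = [(math.cos(0.3 * i), math.sin(0.3 * i), 0.03 * i) for i in range(30)]
P = markov_matrix(points, eps=0.5)
x = [p[0] for p in points]                 # signal: first coordinate
coeffs, lowpass = diffusion_wavelets(P, x, J=3)
# First-order features: average magnitude of each wavelet response.
features = [sum(abs(v) for v in w) / len(w) for w in coeffs]
```

The wavelet responses telescope, so adding all of them to the low-pass output recovers the original signal; this is a convenient check that the dyadic powers were computed correctly.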
Double Pomeron Jet Cross Sections
We treat hadron-hadron collisions where the final state is kinematically of
the kind associated with double-pomeron-exchange (DPE) and has large transverse
momentum jets. We show that in addition to the conventional factorized (FDPE)
contribution, there is a non-factorized (NDPE) contribution which has no
pomeron beam jet. Within a simple model we compute DPE-two-jet total and
differential cross sections at Tevatron energy scales, and show that the NDPE
contribution is dominant. Comment: 21 pages, 7 figures, figure 1 has been slightly changed
Chemokine-driven lymphocyte infiltration: an early intratumoural event determining long-term survival in resectable hepatocellular carcinoma
Objective: Hepatocellular carcinoma (HCC) is a heterogeneous disease with poor prognosis and limited methods for predicting patient survival. The nature of the immune cells that infiltrate tumours is known to impact clinical outcome. However, the molecular events that regulate this infiltration require further understanding. Here the ability of immune genes expressed in the tumour microenvironment to predict disease progression was investigated.
Methods: Using quantitative PCR, the expression of 14 immune genes in resected tumour tissues from 57 Singaporean patients was analysed. The nearest-template prediction method was used to derive and test a prognostic signature from this training cohort. The signature was then validated in an independent cohort of 98 patients from Hong Kong and Zurich. Intratumoural components expressing these critical immune genes were identified by in situ labelling. Regulation of these genes was analysed in vitro using the HCC cell line SNU-182.
Results: The identified 14 immune-gene signature predicts patient survival in both the training cohort (p=0.0004 and HR=5.2) and the validation cohort (p=0.0051 and HR=2.5) irrespective of patient ethnicity and disease aetiology. Importantly, it predicts the survival of patients with early disease (stages I and II), for whom classical clinical parameters provide limited information. The lack of predictive power in late disease stages III and IV emphasises that a protective immune microenvironment has to be established early in order to impact disease progression significantly. This signature includes the chemokine genes CXCL10, CCL5 and CCL2, whose expression correlates with markers of T helper 1 (Th1), CD8(+) T and natural killer (NK) cells. Inflammatory cytokines (tumour necrosis factor α, interferon γ) and Toll-like receptor 3 ligands stimulate intratumoural production of these chemokines which drive tumour infiltration by T and NK cells, leading to enhanced cancer cell death.
Conclusion: A 14 immune-gene signature, which identifies molecular cues driving tumour infiltration by lymphocytes, accurately predicts survival of patients with HCC, especially in early disease.
- …