1,091 research outputs found
Quantifying the Impact and Extent of Undocumented Biomedical Synonymy
Synonymous relationships among biomedical terms are extensively annotated within specialized terminologies, implying that synonymy is important for practical computational applications within this field. It remains unclear, however, whether text mining actually benefits from documented synonymy and whether existing biomedical thesauri provide adequate coverage of these linguistic relationships. In this study, we examine the impact and extent of undocumented synonymy within a very large compendium of biomedical thesauri. First, we demonstrate that missing synonymy has a significant negative impact on named entity normalization, an important problem within the field of biomedical text mining. To estimate the amount of synonymy currently missing from thesauri, we develop a probabilistic model for the construction of synonym terminologies that is capable of handling a wide range of potential biases, and we evaluate its performance using the broader domain of near-synonymy among general English words. Our model predicts that over 90% of these relationships are currently undocumented, a result that we support experimentally through “crowd-sourcing.” Finally, we apply our model to biomedical terminologies and predict that they are missing the vast majority (>90%) of the synonymous relationships they intend to document. Overall, our results expose the dramatic incompleteness of current biomedical thesauri and suggest the need for “next-generation,” high-coverage lexical terminologies.
2023-2024 Catalog
The 2023-2024 Governors State University Undergraduate and Graduate Catalog is a comprehensive listing of current information regarding: Degree Requirements; Course Offerings; Undergraduate and Graduate Rules and Regulations
Hypercubes and Hamilton cycles of display sets of rooted phylogenetic networks
In the context of reconstructing phylogenetic networks from a collection of
phylogenetic trees, several characterisations and subsequently algorithms have
been established to reconstruct a phylogenetic network that collectively embeds
all trees in the input in some minimum way. For many instances, however, the
resulting network also embeds additional phylogenetic trees that are not part
of the input. However, little is known about these inferred trees. In this
paper, we explore the relationships among all phylogenetic trees that are
embedded in a given phylogenetic network. First, we investigate some
combinatorial properties of the collection P of all rooted binary phylogenetic
trees that are embedded in a rooted binary phylogenetic network N. To this end,
we associate a particular graph G, which we call the rSPR graph, with the elements
in P and show that, if |P|=2^k, where k is the number of vertices with
in-degree two in N, then G has a Hamiltonian cycle. Second, by exploiting rSPR
graphs and properties of hypercubes, we turn to the well-studied class of
rooted binary level-1 networks and give necessary and sufficient conditions for
when a set of rooted binary phylogenetic trees can be embedded in a level-1
network without inferring any additional trees. Lastly, we show how these
conditions translate into a polynomial-time algorithm to reconstruct such a
network if it exists. Comment: final version of accepted manuscript
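The abstract above relates display sets of size 2^k to hypercubes and Hamiltonian cycles. As a rough illustration of the underlying hypercube fact only (not of the paper's rSPR graph construction, which the abstract does not spell out), the reflected Gray code visits every vertex of the k-dimensional hypercube once, changing one bit per step, and closes into a cycle for k ≥ 2. The function names below are hypothetical and chosen for this sketch.

```python
def gray_code_cycle(k: int) -> list[int]:
    """Return the 2^k hypercube vertices (as bit masks) in reflected Gray code order."""
    return [i ^ (i >> 1) for i in range(2 ** k)]


def is_hamiltonian_cycle(order: list[int], k: int) -> bool:
    """Check that `order` visits each of the 2^k vertices exactly once and that
    cyclically consecutive vertices differ in exactly one bit (a hypercube edge)."""
    n = 2 ** k
    if sorted(order) != list(range(n)):
        return False
    return all(
        bin(order[i] ^ order[(i + 1) % n]).count("1") == 1
        for i in range(n)
    )


cycle = gray_code_cycle(3)
print(cycle)                           # [0, 1, 3, 2, 6, 7, 5, 4]
print(is_hamiltonian_cycle(cycle, 3))  # True
```

Each bit position can be read as a binary choice at one of the k vertices with in-degree two; that reading is an assumption made for illustration, not a statement of the paper's proof.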
Geometric Data Analysis: Advancements of the Statistical Methodology and Applications
Data analysis has become fundamental to our society and comes in multiple facets and approaches. Nevertheless, in research and applications, the focus was primarily on data from Euclidean vector spaces. Consequently, the majority of methods that are applied today are not suited for more general data types. Driven by needs from fields like image processing, (medical) shape analysis, and network analysis, more and more attention has recently been given to data from non-Euclidean spaces–particularly (curved) manifolds. It has led to the field of geometric data analysis whose methods explicitly take the structure (for example, the topology and geometry) of the underlying space into account.
This thesis contributes to the methodology of geometric data analysis by generalizing several fundamental notions from multivariate statistics to manifolds. We thereby focus on two different viewpoints.
First, we use Riemannian structures to derive a novel regression scheme for general manifolds that relies on splines of generalized Bézier curves. It can accurately model non-geodesic relationships, for example, time-dependent trends with saturation effects or cyclic trends. Since Bézier curves can be evaluated with the constructive de Casteljau algorithm, working with data from manifolds of high dimensions (for example, a hundred thousand or more) is feasible. Relying on the regression, we further develop a hierarchical statistical model for an adequate analysis of longitudinal data in manifolds, and a method to control for confounding variables.
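The de Casteljau algorithm mentioned above can be sketched in the flat Euclidean case, where it evaluates a Bézier curve by repeated linear interpolation between neighbouring control points. This is only a minimal sketch under that assumption: the thesis works on manifolds, where the straight-line interpolation step would presumably be replaced by interpolation along geodesics. The function name and the sample control points are hypothetical.

```python
def de_casteljau(control_points, t):
    """Evaluate a Euclidean Bezier curve at parameter t in [0, 1].

    Each pass replaces the current points by the linear interpolations
    (1 - t) * p + t * q of neighbouring pairs, until one point remains.
    """
    points = [tuple(p) for p in control_points]
    while len(points) > 1:
        points = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


# Quadratic curve with three control points in the plane.
ctrl = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
print(de_casteljau(ctrl, 0.5))  # (1.0, 1.0)
```

The cost per evaluation grows with the number of control points and linearly with the dimension of each point, which is consistent with the abstract's remark that high-dimensional data remain feasible.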
We secondly focus on data that is not only manifold- but even Lie group-valued, which is frequently the case in applications. We can only achieve this by endowing the group with an affine connection structure that is generally not Riemannian. Utilizing it, we derive generalizations of several well-known dissimilarity measures between data distributions that can be used for various tasks, including hypothesis testing. Invariance under data translations is proven, and a connection to continuous distributions is given for one measure.
A further central contribution of this thesis is that it shows use cases for all notions in real-world applications, particularly in problems from shape analysis in medical imaging and archaeology. We can replicate or further quantify several known findings for shape changes of the femur and the right hippocampus under osteoarthritis and Alzheimer's, respectively. Furthermore, in an archaeological application, we obtain new insights into the construction principles of ancient sundials. Last but not least, we use the geometric structure underlying human brain connectomes to predict cognitive scores. Utilizing a sample selection procedure, we obtain state-of-the-art results
Northeastern Illinois University, Academic Catalog 2023-2024
https://neiudc.neiu.edu/catalogs/1064/thumbnail.jp
A Structural Approach to the Design of Domain Specific Neural Network Architectures
This is a master's thesis concerning the theoretical ideas of geometric deep
learning. Geometric deep learning aims to provide a structured characterization
of neural network architectures, specifically focused on the ideas of
invariance and equivariance of data with respect to given transformations.
This thesis aims to provide a theoretical evaluation of geometric deep
learning, compiling theoretical results that characterize the properties of
invariant neural networks with respect to learning performance. Comment: 94 pages and 16 figures. Upload of my Master's thesis. Not peer
reviewed and potentially contains errors
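The notions of equivariance and invariance named in the abstract above can be made concrete with a small example. The sketch below is not taken from the thesis; it assumes a 1D periodic signal and cyclic shifts as the transformation group. A circular convolution commutes with shifts (equivariance), and summing its output discards the shift entirely (invariance). All names are hypothetical.

```python
def circular_conv(signal, kernel):
    """Circular (periodic) convolution of a 1D signal with a kernel."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i - j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal, s):
    """Cyclically shift a 1D signal by s positions."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]


x = [1, 2, 3, 4, 5]
w = [1, -1, 2]

# Equivariance: shifting then convolving equals convolving then shifting.
print(circular_conv(shift(x, 2), w) == shift(circular_conv(x, w), 2))  # True

# Invariance: sum pooling after the convolution ignores the shift.
print(sum(circular_conv(shift(x, 2), w)) == sum(circular_conv(x, w)))  # True
```

Integer inputs are used so that the equality checks are exact; with floating-point data a tolerance-based comparison would be the safer choice.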
Robustness, scalability and interpretability of equivariant neural networks across different low-dimensional geometries
In this thesis we develop neural networks that exploit the symmetries of four different low-dimensional geometries, namely 1D grids, 2D grids, 3D continuous spaces and graphs, through the consideration of translational, rotational, cylindrical and permutation symmetries. We apply these models to applications across a range of scientific disciplines demonstrating the predictive ability, robustness, scalability, and interpretability.
We develop a neural network that exploits the translational symmetries on 1D grids to predict age and species of mosquitoes from high-dimensional mid-infrared spectra. We show that the model can learn to predict mosquito age and species with a higher accuracy than models that do not utilise any inductive bias. We also demonstrate that the model is sensitive to regions within the input spectra that are in agreement with regions identified by a domain expert. We present a transfer learning approach to overcome the challenge of working with small, real-world, wild collected data sets and demonstrate the benefit of the approach on a real-world application.
We demonstrate the benefit of rotation equivariant neural networks on the task of segmenting deforestation regions from satellite images through exploiting the rotational symmetry present on 2D grids. We develop a novel physics-informed architecture, exploiting the cylindrical symmetries of the group SO+(2, 1), which can invert the transmission effects of multi-mode optical fibres (MMFs). We develop a new connection between a physics understanding of MMFs and group equivariant neural networks. We show that this novel architecture requires fewer training samples to learn, better generalises to out-of-distribution data sets, scales to higher-resolution images, is more interpretable, and reduces the parameter count of the model. We demonstrate the capability of the model on real-world data and provide an adaptation to the model to handle real-world deviations from theory. We also show that the model can scale to higher resolution images than was previously possible.
We develop a novel architecture which provides a symmetry-preserving mapping between two different low-dimensional geometries and demonstrate its practical benefit for the application of 3D hand mesh generation from 2D images. This model exploits both the 2D rotational symmetries present in a 2D image and in a 3D hand mesh, and provides a mapping between the two data domains. We demonstrate that the model performs competitively on a range of benchmark data sets and justify the choice of inductive bias in the model.
We develop an architecture which is equivariant to a novel choice of automorphism group through the use of a sub-graph selection policy. We demonstrate the benefit of the architecture, theoretically through proving the improved expressivity and improved scalability, and experimentally on a range of widely studied benchmark graph classification tasks. We present a method of comparison between models that had not been previously considered in this area of research, demonstrating recent SOTA methods are statistically indistinguishable
LIPIcs, Volume 261, ICALP 2023, Complete Volume
LIPIcs, Volume 261, ICALP 2023, Complete Volume
Data analysis with merge trees
Today’s data are increasingly complex, and classical statistical techniques need increasingly refined mathematical tools to model and investigate them. Paradigmatic situations are represented by data which need to be considered up to some kind of transformation and by all those circumstances in which the analyst needs to define a general concept of shape. Topological Data Analysis (TDA) is a field which is fundamentally contributing to such challenges by extracting topological information from data with a plethora of interpretable and computationally accessible pipelines. We contribute to this field by developing a series of novel tools, techniques and applications to work with a particular topological summary called merge tree. To analyze sets of merge trees we introduce a novel metric structure along with an algorithm to compute it, define a framework to compare different functions defined on merge trees and investigate the metric space obtained with the aforementioned metric. Different geometric and topological properties of the space of merge trees are established, with the aim of obtaining a deeper understanding of such trees. To showcase the effectiveness of the proposed metric, we develop an application in the field of Functional Data Analysis, working with functions up to homeomorphic reparametrization, and in the field of radiomics, where each patient is represented via a clustering dendrogram.
Quantum-Inspired Machine Learning: a Survey
Quantum-inspired Machine Learning (QiML) is a burgeoning field, receiving
global attention from researchers for its potential to leverage principles of
quantum mechanics within classical computational frameworks. However, current
review literature often presents a superficial exploration of QiML, focusing
instead on the broader Quantum Machine Learning (QML) field. In response to
this gap, this survey provides an integrated and comprehensive examination of
QiML, exploring QiML's diverse research domains including tensor network
simulations, dequantized algorithms, and others, showcasing recent
advancements, practical applications, and illuminating potential future
research avenues. Further, a concrete definition of QiML is established by
analyzing various prior interpretations of the term and their inherent
ambiguities. As QiML continues to evolve, we anticipate a wealth of future
developments drawing from quantum mechanics, quantum computing, and classical
machine learning, enriching the field further. This survey serves as a guide
for researchers and practitioners alike, providing a holistic understanding of
QiML's current landscape and future directions. Comment: 56 pages, 13 figures, 8 tables