Comparison of ontology alignment systems across single matching task via the McNemar's test
Ontology alignment is widely used to find the correspondences between
different ontologies in diverse fields. After discovering the alignments, several
performance scores are available to evaluate them. The scores typically require
the identified alignment and a reference containing the underlying actual
correspondences of the given ontologies. The current trend in alignment
evaluation is to put forward a new score (e.g., precision, weighted precision,
etc.) and to compare various alignments by juxtaposing the obtained scores.
However, it is highly contentious to select one measure among others for
comparison. On top of that, a claim that one system performs better than
another cannot be substantiated solely by comparing two scalars. In this
paper, we propose statistical procedures which enable us to theoretically
favor one system over another. McNemar's test is the statistical means
by which two ontology alignment systems are compared over one matching
task. The test applies to a 2x2 contingency table which can be
constructed in two different ways based on the alignments, each of which has
its own merits and pitfalls. The ways of constructing the contingency table and
various apposite statistics from McNemar's test are elaborated in
detail. When more than two alignment systems are compared,
a family-wise error is expected to occur. Thus, ways of preventing
such an error are also discussed. A directed graph visualizes the outcome of
McNemar's test in the presence of multiple alignment systems. From this graph,
it is readily understood whether one system is better than another or whether their
differences are imperceptible. The proposed statistical methodologies are
applied to the systems that participated in the OAEI 2016 anatomy track, and are
also used to compare several well-known similarity metrics for the same matching problem.
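As a rough illustration of the comparison described in the abstract above, the sketch below runs McNemar's test on the discordant cells of a 2x2 table. It is a minimal sketch under stated assumptions, not the paper's procedure: the table is built here from hypothetical per-correspondence correctness flags (the abstract does not spell out its two construction methods), the function names are invented for illustration, and Bonferroni is shown only as one common family-wise correction, which may differ from what the paper discusses.

```python
from math import comb, erfc, sqrt


def discordant_counts(correct_a, correct_b):
    """Count items where exactly one of the two systems is correct.

    Assumption: each list holds one boolean per candidate correspondence,
    True when that system agrees with the reference alignment.
    """
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if not x and y)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact (binomial) McNemar p-value from the discordant counts."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def mcnemar_chi2(b, c):
    """Continuity-corrected chi-square statistic and p-value (1 d.o.f.).

    The correction term is clamped at zero so that b == c gives statistic 0.
    """
    n = b + c
    if n == 0:
        return 0.0, 1.0
    stat = max(abs(b - c) - 1, 0) ** 2 / n
    # Survival function of chi-square with one degree of freedom.
    return stat, erfc(sqrt(stat / 2))


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Hypothetical correctness flags for two systems on six correspondences.
    a = [True, True, True, False, True, True]
    b_flags = [True, False, False, False, True, False]
    b, c = discordant_counts(a, b_flags)
    print(b, c, mcnemar_exact(b, c), mcnemar_chi2(b, c))
```

The exact variant is usually preferred when the number of discordant pairs is small, since the chi-square approximation can be unreliable there.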
Differential analysis of biological networks
In cancer research, the comparison of gene expression or DNA methylation
networks inferred from healthy controls and patients can lead to the discovery
of biological pathways associated with the disease. As a cancer progresses, its
signalling and control networks are subject to some degree of localised
re-wiring. Being able to detect disrupted interaction patterns induced by the
presence or progression of the disease can lead to the discovery of novel
molecular diagnostic and prognostic signatures. Currently there is a lack of
scalable statistical procedures for two-network comparisons aimed at detecting
localised topological differences. We propose the dGHD algorithm, a methodology
for detecting differential interaction patterns in two-network comparisons. The
algorithm relies on a statistic, the Generalised Hamming Distance (GHD), for
assessing the degree of topological difference between networks and evaluating
its statistical significance. dGHD builds on a non-parametric permutation
testing framework but achieves computational efficiency through an asymptotic
normal approximation. We show that the GHD is able to detect more subtle
topological differences compared to a standard Hamming distance between
networks. This results in the dGHD algorithm achieving high performance in
simulation studies as measured by sensitivity and specificity. An application
to the problem of detecting differential DNA co-methylation subnetworks
associated with ovarian cancer demonstrates the potential benefits of the
proposed methodology for discovering network-derived biomarkers associated with
a trait of interest.
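The permutation-testing idea mentioned in the abstract above can be sketched as follows. This is an illustrative stand-in, not the dGHD algorithm: it uses the standard Hamming distance between adjacency matrices rather than the GHD (whose definition is not given in the abstract), it permutes node labels by brute force instead of using the asymptotic normal approximation, and all function names and the default of 999 permutations are assumptions made for this example.

```python
import random


def hamming_distance(adj_a, adj_b):
    """Fraction of node pairs whose edge status differs (undirected, no loops)."""
    n = len(adj_a)
    differing = sum(
        adj_a[i][j] != adj_b[i][j] for i in range(n) for j in range(i + 1, n)
    )
    return differing / (n * (n - 1) / 2)


def permute_nodes(adj, perm):
    """Relabel the nodes of an adjacency matrix according to perm."""
    n = len(adj)
    return [[adj[perm[i]][perm[j]] for j in range(n)] for i in range(n)]


def permutation_p_value(adj_a, adj_b, n_perm=999, seed=0):
    """Estimate how often random relabelling makes the networks at least as close.

    A small p-value suggests the two networks are more similar than expected
    when node labels carry no shared information.
    """
    rng = random.Random(seed)
    observed = hamming_distance(adj_a, adj_b)
    n = len(adj_a)
    hits = 0
    for _ in range(n_perm):
        perm = list(range(n))
        rng.shuffle(perm)
        if hamming_distance(adj_a, permute_nodes(adj_b, perm)) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Two hypothetical 4-node networks that differ in a single edge.
    net_a = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
    net_b = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
    print(permutation_p_value(net_a, net_b))
```

Brute-force permutation scales poorly with network size, which is the cost that an asymptotic approximation of the kind the abstract mentions is meant to avoid.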
An inferential framework for biological network hypothesis tests
Background
Networks are ubiquitous in modern cell biology and physiology. A large literature exists for inferring/proposing biological pathways/networks using statistical or machine learning algorithms. Despite these advances a formal testing procedure for analyzing network-level observations is in need of further development. Comparing the behaviour of a pharmacologically altered pathway to its canonical form is an example of a salient one-sample comparison. Locating which pathways differentiate disease from no-disease phenotype may be recast as a two-sample network inference problem.
Results
We outline an inferential method for performing one- and two-sample hypothesis tests where the sampling unit is a network and the hypotheses are stated via network model(s). We propose a dissimilarity measure that incorporates nearby neighbour information to contrast one or more networks in a statistical test. We demonstrate and explore the utility of our approach with both simulated and microarray data; random graphs and weighted (partial) correlation networks are used to form network models. Using both a well-known diabetes dataset and an ovarian cancer dataset, the methods outlined here could better elucidate co-regulation changes for one or more pathways between two clinically relevant phenotypes.
Conclusions
Formal hypothesis tests for gene- or protein-based networks are a logical progression from existing gene-based and gene-set tests for differential expression. Commensurate with the growing appreciation and development of systems biology, the dissimilarity-based testing methods presented here may allow us to improve our understanding of pathways and other complex regulatory systems. The benefit of our method was illustrated under select scenarios.
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state of the art during the first year of Chorus and establishing the existing landscape in
multimedia search engines, we identified and analyzed gaps within the European research effort during our second year.
In this period we focused on three directions: technological issues, user-centred issues and use-cases, and
socio-economic and legal aspects. These were assessed by two central studies: first, a concerted vision of the functional
breakdown of a generic multimedia search engine, and second, representative use-case descriptions with the related
discussion of requirements for technological challenges. Both studies were carried out in cooperation and consultation with the
community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our
Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as
national initiative coordinators. Based on the feedback obtained, we identified two types of gaps, namely core
technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research
challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal
challenges.