A unifying view for performance measures in multi-class prediction
In the last few years, many different performance measures have been
introduced to overcome the weakness of the most natural metric, the Accuracy.
Among them, Matthews Correlation Coefficient has recently gained popularity
among researchers not only in machine learning but also in several application
fields such as bioinformatics. Nonetheless, further novel functions are being
proposed in the literature. We show that Confusion Entropy, a recently introduced
classifier performance measure for multi-class problems, has a strong
(monotone) relation with the multi-class generalization of a classical metric,
the Matthews Correlation Coefficient. Computational evidence in support of the
claim is provided, together with an outline of the theoretical explanation.
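
To make the multi-class generalization of the Matthews Correlation Coefficient concrete, here is a minimal sketch that computes it from a confusion matrix. It assumes the standard covariance-based formulation of the multi-class MCC, with rows as true classes and columns as predicted classes. The function name `multiclass_mcc` and the choice to return 0.0 when the denominator vanishes are assumptions made for this illustration, not taken from the abstract. Confusion Entropy itself is not implemented here.

```python
import math


def multiclass_mcc(confusion):
    """Multi-class MCC from a square confusion matrix.

    Assumed layout: confusion[i][j] counts samples of true class i
    that were predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(k))
    # Row sums: how often each class truly occurs.
    true_counts = [sum(confusion[i]) for i in range(k)]
    # Column sums: how often each class is predicted.
    pred_counts = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    numerator = correct * total - sum(
        t * p for t, p in zip(true_counts, pred_counts)
    )
    denom_sq = (total ** 2 - sum(p * p for p in pred_counts)) * (
        total ** 2 - sum(t * t for t in true_counts)
    )
    if denom_sq == 0:
        # Assumed convention: the measure is undefined here, report 0.0.
        return 0.0
    return numerator / math.sqrt(denom_sq)


if __name__ == "__main__":
    perfect = [[5, 0, 0], [0, 3, 0], [0, 0, 2]]
    uninformative = [[1, 1], [1, 1]]
    print(multiclass_mcc(perfect))
    print(multiclass_mcc(uninformative))
```

Unlike Accuracy, this quantity uses both the row and column totals of the confusion matrix, so a classifier that always predicts the majority class does not score well.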
Stability Indicators in Network Reconstruction
The number of algorithms available to reconstruct a biological network from a
dataset of high-throughput measurements is nowadays overwhelming, but
evaluating their performance when the gold standard is unknown is a difficult
task. Here we propose to use a few reconstruction stability tools as a
quantitative solution to this problem. We introduce four indicators to
quantitatively assess the stability of a reconstructed network in terms of
variability with respect to data subsampling. In particular, we give a measure
of the mutual distances among the set of networks generated by a collection of
data subsets (and from the network generated on the whole dataset) and we rank
nodes and edges according to their decreasing variability within the same set
of networks. As a key ingredient, we employ a global/local network distance
combined with a bootstrap procedure. We demonstrate the use of the indicators
in a controlled situation on a toy dataset, and we show their application on a
miRNA microarray dataset with paired tumoral and non-tumoral tissues extracted
from a cohort of 241 hepatocellular carcinoma patients.
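
The two ideas described above, mutual distances among networks inferred from data subsets and a ranking of edges by decreasing variability, can be sketched as follows. This is only an illustration under stated assumptions. The networks are taken as already-inferred adjacency matrices, so the resampling and inference steps are outside the sketch. A normalized Hamming distance stands in for the global/local distance used by the authors. All function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, pvariance


def hamming_distance(a, b):
    """Normalized Hamming distance between two adjacency matrices
    on the same n nodes (stand-in for the authors' distance)."""
    n = len(a)
    diff = sum(
        abs(a[i][j] - b[i][j])
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return diff / (n * (n - 1))


def mean_mutual_distance(networks, distance=hamming_distance):
    """Average distance over all pairs of resampled networks."""
    return mean(distance(a, b) for a, b in combinations(networks, 2))


def rank_edges_by_variability(networks):
    """Rank undirected edges (i < j) by decreasing variance of their
    weight across the resampled networks."""
    n = len(networks[0])
    scores = {
        (i, j): pvariance([net[i][j] for net in networks])
        for i in range(n)
        for j in range(i + 1, n)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Three hypothetical networks, e.g. inferred from three data subsets.
    nets = [
        [[0, 1, 0], [1, 0, 1], [0, 1, 0]],
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, 1, 1], [1, 0, 1], [1, 1, 0]],
    ]
    print(mean_mutual_distance(nets))
    print(rank_edges_by_variability(nets))
```

A small mean mutual distance suggests that the reconstruction is stable under subsampling, and edges at the top of the ranking are the least reproducible ones.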
“Candido’s List”: the workers of Collotta Cis & Figli at Molina di Ledro in Trento Province, Italy. A tale of magnesia, asbestos and work
The study entitled “Candido’s List” (La Lista di Candido) is not the work of the three authors alone. A good part of the community is entitled to feel itself coauthor, each for his/her own part, of a research project that has succeeded in blending a variety of different ingredients: history, entrepreneurship, the industrialization of the Trento Province with all its high and low points, personal life stories, medicine, genius, work, women’s emancipation, the past but also the present and future. The research comprises an eloquent collection of memories and a variety of iconographic materials; it has now become a book and a travelling exhibition containing the accounts of the people who worked at the Collotta-Cis factory in Molina di Ledro. It starts with the brilliance of Pier Antonio Cassoni, who in 1816 filed the first patent in the world for the extraction of magnesium carbonate, and closes with the decontamination of the factory site in the late 1980s. A necessary section has been set aside for the painful facts relating to the processing of asbestos fibre; a final space, midway between an artistic reading and an interpretation for the future, has seen the involvement of the Circolo Fotoamatori di Ledro, with a photographic itinerary enabling the reader to “virtually” enter the remaining worksites and listen to these spaces “tell” their stories after years of silence. A story in black and white, where the two tones are also messages for reading a complex story, one that it is important to remember.
A possible juvenile hypochondroplasia case from the mass grave of Lazzaretto Nuovo Island (Venice)
Among the remains of individuals buried in the cemetery of the New Lazaretto (Venice) during the plague epidemic of 1576, a juvenile skeleton was recovered showing a discrepancy in the biological age at death obtained from the diaphyseal length. Other skeletal indicators from the humerus and the shoulder girdle show a craniocaudal reduction of bone length. In association with other morphological changes and signs, these findings lead to a diagnosis of hypochondroplasia, a specific form of dwarfism.
The HIM glocal metric and kernel for network comparison and classification
Due to the ever rising importance of the network paradigm across several
areas of science, comparing and classifying graphs represent essential steps in
the network analysis of complex systems. Both tasks have been recently tackled
via quite different strategies, even tailored ad-hoc for the investigated
problem. Here we deal with both operations by introducing the
Hamming-Ipsen-Mikhailov (HIM) distance, a novel metric to quantitatively
measure the difference between two graphs sharing the same vertices. The new
measure combines the local Hamming distance and the global spectral
Ipsen-Mikhailov distance so as to overcome the drawbacks affecting the two
components separately. By building the HIM kernel function from the
HIM distance, it is possible to move from network comparison to network
classification via the Support Vector Machine (SVM) algorithm. Applications of
the HIM distance and HIM kernel in computational biology and social network
science demonstrate the effectiveness of the proposed functions as a general
purpose solution. Comment: Frontiers of Network Analysis: Methods, Models, and Applications -
NIPS 2013 Workshop
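
As a rough illustration of how a local and a global component can be combined into one distance and then turned into a kernel, consider the sketch below. It rests on several assumptions that the abstract does not state. The combination is taken to be a weighted Euclidean mix with a weight `xi`, the kernel is taken to be an exponential of the negative distance with a parameter `gamma`, and the Ipsen-Mikhailov spectral distance is passed in as a precomputed number rather than computed from Laplacian spectra. All function and parameter names are hypothetical.

```python
import math


def hamming_distance(a, b):
    """Normalized Hamming distance (local component) between two
    adjacency matrices defined on the same n vertices."""
    n = len(a)
    diff = sum(
        abs(a[i][j] - b[i][j])
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return diff / (n * (n - 1))


def him_distance(hamming, ipsen_mikhailov, xi=1.0):
    """Assumed combination of the two components:
    sqrt((H^2 + xi * IM^2) / (1 + xi)), with both inputs in [0, 1].
    The IM value must be computed elsewhere."""
    return math.sqrt((hamming ** 2 + xi * ipsen_mikhailov ** 2) / (1.0 + xi))


def him_kernel(him, gamma=1.0):
    """Assumed exponential kernel built on a distance value."""
    return math.exp(-gamma * him)


if __name__ == "__main__":
    empty = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
    complete = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    h = hamming_distance(empty, complete)
    # 0.4 is a placeholder for a precomputed spectral distance.
    d = him_distance(h, 0.4)
    print(h, d, him_kernel(d))
```

With `xi` set to 0 the combined value reduces to the local component alone. A kernel matrix built from such values over a set of graphs could then be passed to any SVM implementation that accepts precomputed kernels.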
Minerva and minepy: a C engine for the MINE suite and its R, Python and MATLAB wrappers
We introduce a novel implementation in ANSI C of the MINE family of
algorithms for computing maximal information-based measures of dependence
between two variables in large datasets, with the aim of a low memory footprint
and ease of integration within bioinformatics pipelines. We provide the
libraries minerva (with the R interface) and minepy for Python, MATLAB, Octave
and C++. The C solution reduces the large memory requirement of the original
Java implementation, has good upscaling properties, and offers a native
parallelization for the R interface. Low memory requirements are demonstrated
on the MINE benchmarks as well as on large (n=1340) microarray and Illumina
GAII RNA-seq transcriptomics datasets.
Availability and Implementation: Source code and binaries are freely
available for download under GPL3 licence at http://minepy.sourceforge.net for
minepy and through the CRAN repository http://cran.r-project.org for the R
package minerva. All software is multiplatform (MS Windows, Linux and OSX). Comment: Bioinformatics 2012, in press