The role of classifiers and data complexity in learned Bloom filters: insights and recommendations
Bloom filters, since their introduction over 50 years ago, have become a pillar for handling membership queries in small space, with relevant applications in Big Data Mining and Stream Processing. Further improvements have recently been proposed through the use of Machine Learning techniques: learned Bloom filters. These make the proper parameter setting of this multi-criteria data structure considerably more complicated, in particular with regard to the choice of one of its key components (the classifier) and to accounting for the classification complexity of the input dataset. Given this State of the Art, our contributions are as follows. (1) A novel methodology, supported by software, for designing, analyzing and implementing learned Bloom filters that accounts for their multi-criteria nature, in particular concerning the choice of classifier type and data classification complexity. Extensive experiments show the validity of the proposed methodology and, since our software is public, we offer a valid tool to practitioners interested in using learned Bloom filters. (2) Further contributions to the advancement of the State of the Art that are of great practical relevance: (a) the classifier inference time should not be taken as a proxy for the filter reject time; (b) of the many classifiers we have considered, only two offer good performance; this result agrees with and further strengthens early findings in the literature; (c) the Sandwiched Bloom filter, already known as one of the reference structures in this area, is further shown here to have the remarkable property of robustness to data complexity and classifier performance variability.
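For readers unfamiliar with the structure named in (c), the following is a minimal Python sketch of a sandwiched learned Bloom filter: an initial Bloom filter, then a classifier, then a backup Bloom filter holding the keys the classifier misses. It is not the authors' software. The class names, the default false-positive rates, and the `toy_score` function standing in for a trained classifier are all assumptions made for illustration.

```python
import hashlib
import math


class BloomFilter:
    """Plain Bloom filter over strings, sized with the standard formulas."""

    def __init__(self, capacity, fpr):
        capacity = max(1, capacity)
        # m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hash functions
        self.m = max(8, math.ceil(-capacity * math.log(fpr) / math.log(2) ** 2))
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive k positions from two 64-bit halves of SHA-256.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all((self.bits[p // 8] >> (p % 8)) & 1 for p in self._positions(item))


class SandwichedLearnedBloomFilter:
    """Initial filter -> classifier -> backup filter (illustrative sketch)."""

    def __init__(self, keys, score, threshold, initial_fpr=0.2, backup_fpr=0.05):
        keys = list(keys)
        self.score = score          # hypothetical classifier: str -> float
        self.threshold = threshold
        self.initial = BloomFilter(len(keys), initial_fpr)
        for key in keys:
            self.initial.add(key)
        # Keys the classifier would reject go to the backup filter,
        # so the structure as a whole never returns a false negative.
        missed = [key for key in keys if score(key) < threshold]
        self.backup = BloomFilter(len(missed), backup_fpr)
        for key in missed:
            self.backup.add(key)

    def __contains__(self, item):
        if item not in self.initial:
            return False            # rejected before the classifier runs
        if self.score(item) >= self.threshold:
            return True
        return item in self.backup


def toy_score(item):
    # Stand-in for a trained classifier (assumption, for illustration only).
    return 1.0 if item.startswith("evil-") else 0.0


keys = [f"evil-{i}.example" for i in range(50)] + ["odd-key-1", "odd-key-2"]
slbf = SandwichedLearnedBloomFilter(keys, toy_score, threshold=0.5)
print(all(key in slbf for key in keys))
```

In this sketch a query rejected by the initial filter never reaches the classifier, so the time to reject a non-key depends on which stage rejects it, not only on the classifier's inference time.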
New improved Moser-Trudinger inequalities and singular Liouville equations on compact surfaces
We consider a singular Liouville equation on a compact surface, arising from
the study of Chern-Simons vortices in a self dual regime. Using new improved
versions of the Moser-Trudinger inequalities (whose main feature is to be
scaling invariant) and a variational scheme, we prove new existence results. Comment: to appear in GAF
An improved geometric inequality via vanishing moments, with applications to singular Liouville equations
We consider a class of singular Liouville equations on compact surfaces
motivated by the study of Electroweak and Self-Dual Chern-Simons theories, the
Gaussian curvature prescription with conical singularities and Onsager's
description of turbulence. We analyse the problem of existence variationally,
and show how the angular distribution of the conformal volume near the
singularities may lead to improvements in the Moser-Trudinger inequality, and
in turn to lower bounds on the Euler-Lagrange functional. We then discuss
existence and non-existence results. Comment: some references adde
Resource-Limited Automated Ki67 Index Estimation in Breast Cancer
The prediction of tumor progression and chemotherapy response has
recently been tackled by exploiting Tumor Infiltrating Lymphocytes (TILs) and
the nuclear protein Ki67 as prognostic factors. Recently, deep neural networks
(DNNs) have been shown to achieve top results in estimating Ki67 expression and
in simultaneously determining the intratumoral TILs score in breast cancer cells.
However, over the last ten years, the resource demand of deep models has grown
at least as fast as the extraordinary progress they have enabled. The exorbitant
computational costs required to query (and in some cases also to store) a deep
model represent a strong limitation in resource-limited contexts, such as
IoT-based applications to support healthcare personnel. To address this, we propose
a resource consumption-aware DNN for the effective estimation of the percentage
of Ki67-positive cells in breast cancer screenings. Our approach reduced memory
and disk space usage by up to 75% and 89%, respectively, and energy consumption
by up to 1.5x, while preserving or improving the overall accuracy of a
benchmark state-of-the-art solution. Encouraged by these positive results, we
developed and structured the adopted framework to allow its general-purpose
use, and provide a public software repository to support it.
Existence of solutions to a higher dimensional mean-field equation on manifolds
For we prove an existence result for the equation on a closed Riemannian
manifold of dimension for certain values of . Comment: 15 Page
Simultaneous Learning of Fuzzy Sets
We extend a procedure based on support vector clustering and devoted to inferring the membership function of a fuzzy set to the case of a universe of discourse over which several fuzzy sets are defined. The extended approach learns these sets simultaneously without requiring, as prior knowledge, either their number or labels approximating membership values. This data-driven approach is complemented by the incorporation of expert knowledge in the form of predefined shapes for the membership functions. The procedure is successfully tested on a benchmark.
Evaluating the impact of topological protein features on the negative examples selection
Supervised machine learning methods, when applied to the problem of automated protein-function prediction (AFP), require the availability of both positive examples (i.e., proteins which are known to possess a given protein function) and negative examples (corresponding to proteins not associated with that function). Unfortunately, publicly available proteome and genome data sources such as the Gene Ontology rarely store the functions not possessed by a protein. Thus negative selection, which consists of identifying informative negative examples, is currently a central and challenging problem in AFP. Several heuristics have been proposed over the years to solve this problem; nevertheless, despite their effectiveness, to the best of our knowledge no previous work has studied which protein features are most relevant to this task, that is, which protein features help most in discriminating between reliable and unreliable negatives.
Fostering Computational Thinking in Primary School through a LEGO®-based Music Notation
This paper presents a teaching methodology mixing elements from the domains of music and informatics as a key enabler for exposing primary school pupils to basic aspects of computational thinking. This methodology is organized in two phases, exploiting LEGO® bricks respectively as a physical tool and as a metaphor, in order to let participants discover a simple notation encoding several basic concepts of classical musical notation. The related activities, grounded in active learning theory, challenge groups of students to solve musical encoding problems of increasing difficulty.