
    The role of classifiers and data complexity in learned Bloom filters: insights and recommendations

    Since their introduction over 50 years ago, Bloom filters have become a pillar for handling membership queries in small space, with relevant applications in Big Data Mining and Stream Processing. Further improvements have recently been proposed through the use of Machine Learning techniques: learned Bloom filters. The latter make the proper parameter setting of this multi-criteria data structure considerably more complicated, in particular with regard to the choice of one of its key components (the classifier) and to accounting for the classification complexity of the input dataset. Given this State of the Art, our contributions are as follows. (1) A novel methodology, supported by software, for designing, analyzing and implementing learned Bloom filters that accounts for their multi-criteria nature, in particular concerning the choice of classifier type and data classification complexity. Extensive experiments show the validity of the proposed methodology and, since our software is public, we offer a valid tool to practitioners interested in using learned Bloom filters. (2) Further contributions to the advancement of the State of the Art that are of great practical relevance: (a) the classifier inference time should not be taken as a proxy for the filter reject time; (b) of the many classifiers we have considered, only two offer good performance; this result agrees with and further strengthens early findings in the literature; (c) the Sandwiched Bloom filter, already known as one of the references in this area, is further shown here to have the remarkable property of robustness to data complexity and classifier performance variability.
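
For readers unfamiliar with the data structure discussed in this abstract, the following is a minimal Python sketch of a basic learned Bloom filter: a classifier screens each query, and a small backup Bloom filter stores the keys the classifier would miss, so no key is ever rejected. It is not the authors' software and not the Sandwiched variant. The class names, the `score` callable, the threshold and the filter sizes are all hypothetical choices made for illustration.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: no false negatives, tunable false positive rate."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class LearnedBloomFilter:
    """Classifier first; a backup filter holds the keys the classifier misses."""

    def __init__(self, keys, score, threshold, num_bits, num_hashes):
        self.score = score          # hypothetical classifier: key -> score in [0, 1]
        self.threshold = threshold
        self.backup = BloomFilter(num_bits, num_hashes)
        for key in keys:
            if score(key) < threshold:   # classifier false negative
                self.backup.add(key)

    def __contains__(self, key):
        if self.score(key) >= self.threshold:
            return True                  # accepted by the classifier
        return key in self.backup        # otherwise defer to the backup filter


# Toy stand-in for a trained classifier (an assumption, not a real model).
def toy_score(key):
    return 1.0 if key.startswith("user_") else 0.0


keys = ["user_1", "user_2", "admin"]
lbf = LearnedBloomFilter(keys, toy_score, threshold=0.5, num_bits=64, num_hashes=3)
```

In this toy setup every stored key is reported as present, while a non-key such as `"user_99"` is a false positive produced by the classifier itself. A query is rejected only after both the classifier and the backup filter have been consulted, so the cost of a rejection includes more than the classifier's inference time.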

    An improved geometric inequality via vanishing moments, with applications to singular Liouville equations

    We consider a class of singular Liouville equations on compact surfaces motivated by the study of Electroweak and Self-Dual Chern-Simons theories, the Gaussian curvature prescription with conical singularities and Onsager's description of turbulence. We analyse the problem of existence variationally, and show how the angular distribution of the conformal volume near the singularities may lead to improvements in the Moser-Trudinger inequality, and in turn to lower bounds on the Euler-Lagrange functional. We then discuss existence and non-existence results. Comment: some references added.

    Resource-Limited Automated Ki67 Index Estimation in Breast Cancer

    The prediction of tumor progression and chemotherapy response has recently been tackled by exploiting Tumor Infiltrating Lymphocytes (TILs) and the nuclear protein Ki67 as prognostic factors. Deep neural networks (DNNs) have been shown to achieve top results in estimating Ki67 expression and simultaneously determining the intratumoral TILs score in breast cancer cells. However, over the last ten years the resource demand of deep models has grown at least as fast as the extraordinary progress they have brought. The exorbitant computational costs required to query (and in some cases also to store) a deep model represent a strong limitation in resource-limited contexts, such as IoT-based applications supporting healthcare personnel. To address this, we propose a resource consumption-aware DNN for effectively estimating the percentage of Ki67-positive cells in breast cancer screenings. Our approach reduced memory and disk space usage by up to 75% and 89%, respectively, and energy consumption by up to 1.5x, while preserving or improving the overall accuracy of a benchmark state-of-the-art solution. Encouraged by these positive results, we developed and structured the adopted framework to allow its general-purpose usage, and provide a public software repository to support it.

    Existence of solutions to a higher dimensional mean-field equation on manifolds

    For $m\geq 1$ we prove an existence result for the equation $(-\Delta_g)^m u+\lambda=\lambda\frac{e^{2mu}}{\int_M e^{2mu}d\mu_g}$ on a closed Riemannian manifold $(M,g)$ of dimension $2m$ for certain values of $\lambda$. Comment: 15 pages.

    Simultaneous Learning of Fuzzy Sets

    We extend a procedure, based on support vector clustering and devoted to inferring the membership function of a fuzzy set, to the case of a universe of discourse over which several fuzzy sets are defined. The extended approach learns these sets simultaneously without requiring prior knowledge of either their number or labels approximating membership values. This data-driven approach is complemented by the incorporation of expert knowledge in the form of predefined shapes for the membership functions. The procedure is successfully tested on a benchmark.

    Evaluating the impact of topological protein features on the negative examples selection

    Supervised machine learning methods, when applied to the problem of automated protein-function prediction (AFP), require the availability of both positive examples (i.e., proteins which are known to possess a given protein function) and negative examples (corresponding to proteins not associated with that function). Unfortunately, publicly available proteome and genome data sources such as the Gene Ontology rarely store the functions not possessed by a protein. Thus negative selection, which consists in identifying informative negative examples, is currently a central and challenging problem in AFP. Several heuristics have been proposed over the years to solve this problem; nevertheless, despite their effectiveness, to the best of our knowledge no previous work has studied which protein features are most relevant to this task, that is, which protein features help most in discriminating reliable from unreliable negatives.

    Fostering Computational Thinking in Primary School through a LEGO®-based Music Notation

    This paper presents a teaching methodology that mixes elements from the domains of music and informatics as a key enabler for exposing primary school pupils to basic aspects of computational thinking. The methodology is organized in two phases that exploit LEGO® bricks as a physical tool and as a metaphor, respectively, in order to let participants discover a simple notation encoding several basic concepts of classical musical notation. The related activities, grounded in active learning theory, challenge groups of students to solve musical encoding problems of increasing difficulty.