
    One-class classifiers based on entropic spanning graphs

    One-class classifiers offer valuable tools to assess the presence of outliers in data. In this paper, we propose a design methodology for one-class classifiers based on entropic spanning graphs. Our approach takes into account the possibility to process also non-numeric data by means of an embedding procedure. The spanning graph is learned on the embedded input data and the outcoming partition of vertices defines the classifier. The final partition is derived by exploiting a criterion based on mutual information minimization. Here, we compute the mutual information by using a convenient formulation provided in terms of the α-Jensen difference. Once training is completed, in order to associate a confidence level with the classifier decision, a graph-based fuzzy model is constructed. The fuzzification process is based only on topological information of the vertices of the entropic spanning graph. As such, the proposed one-class classifier is suitable also for data characterized by complex geometric structures. We provide experiments on well-known benchmarks containing both feature vectors and labeled graphs. In addition, we apply the method to the protein solubility recognition problem by considering several representations for the input samples. Experimental results demonstrate the effectiveness and versatility of the proposed method with respect to other state-of-the-art approaches.
    Comment: Extended and revised version of the paper "One-Class Classification Through Mutual Information Minimization" presented at the 2016 IEEE IJCNN, Vancouver, Canada.

    Learning and Designing Stochastic Processes from Logical Constraints

    Stochastic processes offer a flexible mathematical formalism to model and reason about systems. Most analysis tools, however, start from the premises that models are fully specified, so that any parameters controlling the system's dynamics must be known exactly. As this is seldom the case, many methods have been devised over the last decade to infer (learn) such parameters from observations of the state of the system. In this paper, we depart from this approach by assuming that our observations are qualitative properties encoded as satisfaction of linear temporal logic formulae, as opposed to quantitative observations of the state of the system. An important feature of this approach is that it unifies naturally the system identification and the system design problems, where the properties, instead of observations, represent requirements to be satisfied. We develop a principled statistical estimation procedure based on maximising the likelihood of the system's parameters, using recent ideas from statistical machine learning. We demonstrate the efficacy and broad applicability of our method on a range of simple but non-trivial examples, including rumour spreading in social networks and hybrid models of gene regulation.

    Iterative design of dynamic experiments in modeling for optimization of innovative bioprocesses

    Finding optimal operating conditions fast with a scarce budget of experimental runs is a key problem to speed up the development and scaling up of innovative bioprocesses. In this paper, a novel iterative methodology for the model-based design of dynamic experiments in modeling for optimization is developed and successfully applied to the optimization of a fed-batch bioreactor related to the production of r-interleukin-11 (rIL-11) whose DNA sequence has been cloned in an Escherichia coli strain. At each iteration, the proposed methodology resorts to a library of tendency models to increasingly bias bioreactor operating conditions towards an optimum. By selecting the ‘most informative’ tendency model in the sequel, the next dynamic experiment is defined by re-optimizing the input policy and calculating optimal sampling times. Model selection is based on minimizing an error measure which distinguishes between parametric and structural uncertainty to selectively bias data gathering towards improved operating conditions. The parametric uncertainty of tendency models is iteratively reduced using Global Sensitivity Analysis (GSA) to pinpoint which parameters are key for estimating the objective function. Results obtained after just a few iterations are very promising.
    Affiliation: Cristaldi, Mariano Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Desarrollo y Diseño. Universidad Tecnológica Nacional. Facultad Regional Santa Fe. Instituto de Desarrollo y Diseño; Argentina.
    Affiliation: Grau, Ricardo José Antonio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Desarrollo Tecnológico para la Industria Química. Universidad Nacional del Litoral. Instituto de Desarrollo Tecnológico para la Industria Química; Argentina.
    Affiliation: Martínez, Ernesto Carlos. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Desarrollo y Diseño. Universidad Tecnológica Nacional. Facultad Regional Santa Fe. Instituto de Desarrollo y Diseño; Argentina.

    A coarse-grained protein model in a water-like solvent

    Simulations employing an explicit atom description of proteins in solvent can be computationally expensive. On the other hand, coarse-grained protein models in implicit solvent miss essential features of the hydrophobic effect, especially its temperature dependence, and have limited ability to capture the kinetics of protein folding. We propose a free space two-letter protein (“H-P”) model in a simple, but qualitatively accurate description for water, the Jagla model, which coarse-grains water into an isotropically interacting sphere. Using Monte Carlo simulations, we design protein-like sequences that can undergo a collapse, exposing the “Jagla-philic” monomers to the solvent, while maintaining a “hydrophobic” core. This protein-like model manifests heat and cold denaturation in a manner that is reminiscent of proteins. While this protein-like model lacks the details that would introduce secondary structure formation, we believe that these ideas represent a first step in developing a useful, but computationally expedient, means of modeling proteins.
    We thank C. A. Angell, M. Marques, S. Sastry, and Z. Yan for helpful discussions. S. S. and S. K. K. acknowledge the DOE - Basic Engineering Sciences for funding this research. P. G. D. gratefully acknowledges the support of the National Science Foundation (Grant CHE-1213343). P.J.R. gratefully acknowledges the support of the National Science Foundation (Collaborative Research Grants CHE-0908265 and CHE-0910615). Additional support from the R.A. Welch Foundation (F-0019) to P.J.R. is also gratefully acknowledged. HES thanks the NSF Chemistry Division for support through grants CHE 0911389, CHE 0908218 and CHE-1213217. S. V. B. acknowledges the partial support of this research through the Dr Bernard W. Gamson Computational Science Center at Yeshiva College.
    (DOE - Basic Engineering Sciences; CHE-1213343 - National Science Foundation; CHE-0908265 - National Science Foundation; CHE-0910615 - National Science Foundation; F-0019 - R.A. Welch Foundation; CHE 0911389 - NSF Chemistry Division; CHE 0908218 - NSF Chemistry Division; CHE-1213217 - NSF Chemistry Division; Dr Bernard W. Gamson Computational Science Center at Yeshiva College)
    Published version