94 research outputs found

    Estimating Gene Interactions Using Information Theoretic Functionals

    With an abundance of data resulting from high-throughput technologies such as DNA microarrays, a race has been on in the last few years to determine the structures and functions of genes and their products, the proteins. Inference of gene interactions lies at the core of these efforts. In all this activity, three important research issues have emerged. First, in much of the current literature on gene regulatory networks, dependencies among variables (in our case, genes) are assumed to be linear in nature, when in real-life scenarios this is seldom the case. This disagreement leads to systematic deviation and biased evaluation. Secondly, although the problem of undersampling features in every piece of work as one of the major causes of poor results, in practice it is overlooked and rarely addressed explicitly. Finally, inference of network structures, although based on rigorous mathematical foundations and computational optimizations, often displays poor fitness values and biologically unrealistic link structures, due to a large extent to the discovery of pairwise-only interactions. In our search for robust, nonlinear measures of dependency, we advocate that mutual information and related information theoretic functionals (conditional mutual information, total correlation) are possibly the most suitable candidates to capture both linear and nonlinear interactions between variables and to resolve higher order dependencies. To address these issues, we researched and implemented, under a common framework, a selection of nonparametric estimators of mutual information for continuous variables. Their assessment focused on their robustness to limited sample sizes and their extensibility to higher dimensions, which is important for the detection of more complex interaction structures. 
Two different assessment scenarios were carried out: one with simulated data, and one that bootstrapped the estimators in state-of-the-art network inference algorithms and monitored their predictive power and sensitivity. The tests revealed that, in small sample size regimes, there is a significant difference in the performance of different estimators, and that naive methods such as uniform binning gave consistently poor results compared with more sophisticated methods. Finally, a custom, modular mechanism is proposed for the inference of gene interactions, targeting the identification of some of the most common substructures in genetic networks, which we believe will help improve accuracy and predictability scores.
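
For orientation, the naive uniform-binning estimator that the abstract above contrasts with more sophisticated methods can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the function name `binned_mutual_information`, the default of 8 bins, and the choice of natural logarithms are all assumptions made here.

```python
import math
from collections import Counter


def binned_mutual_information(xs, ys, bins=8):
    """Plug-in mutual information estimate (in nats) using uniform-width bins.

    Hypothetical helper for illustration; the bin count is an assumption.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")

    def bin_index(value, lo, hi):
        if hi == lo:  # constant variable: everything falls into one bin
            return 0
        idx = int((value - lo) / (hi - lo) * bins)
        return min(idx, bins - 1)  # the maximum value goes into the last bin

    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    bx = [bin_index(x, x_lo, x_hi) for x in xs]
    by = [bin_index(y, y_lo, y_hi) for y in ys]

    n = len(xs)
    joint = Counter(zip(bx, by))  # counts per (x-bin, y-bin) cell
    px = Counter(bx)              # marginal counts for x
    py = Counter(by)              # marginal counts for y

    mi = 0.0
    for (i, j), c in joint.items():
        # p_ij * log(p_ij / (p_i * p_j)), written in terms of raw counts
        mi += (c / n) * math.log(c * n / (px[i] * py[j]))
    return mi
```

Because the estimate is computed from empirical cell frequencies, it tends to be biased upward when few samples are spread over many bins, which is one way the undersampling problem mentioned in the abstract shows up in practice.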

    Weakly supervised learning via statistical sufficiency

    The thesis introduces a novel algorithmic framework for weakly supervised learning, that is, for any problem in between supervised and unsupervised learning from the standpoint of labels. Weak supervision is the reality in many applications of machine learning where training is performed with partially missing, aggregate-level and/or noisy labels. The approach is grounded in the concept of statistical sufficiency and its transposition to loss functions. Our solution is problem-agnostic yet constructive, as it boils down to a simple two-step procedure. First, estimate a sufficient statistic for the labels from weak supervision. Second, plug the estimate into a (newly defined) linear-odd loss function and learn the model with any gradient-based solver, with a simple adaptation. We apply the same approach to several challenging learning problems: (i) learning from label proportions, (ii) learning with noisy labels for both linear classifiers and deep neural networks, and (iii) learning from feature-wise distributed datasets where the entity matching function is unknown.

    Combining Prior Knowledge and Data: Beyond the Bayesian Framework

    For many tasks such as text categorization and control of robotic systems, state-of-the-art learning systems can produce results comparable in accuracy to those of human subjects. However, the amount of training data needed for such systems can be prohibitively large for many practical problems. A text categorization system, for example, may need to see many text postings manually tagged with their subjects before it learns to predict the subject of the next posting with high accuracy. A reinforcement learning (RL) system learning how to drive a car needs a lot of experimentation with the actual car before acquiring the optimal policy. An optimizing compiler targeting a certain platform has to construct, compile, and execute many versions of the same code with different optimization parameters to determine which optimizations work best. Such extensive sampling can be time-consuming, expensive (in terms of both the cost of the human expertise needed to label data and the wear and tear on the robotic equipment used for exploration in the case of RL), and sometimes dangerous (e.g., an RL agent driving the car off a cliff to see if it survives the crash). The goal of this work is to reduce the amount of training data an agent needs in order to learn how to perform a task successfully. This is done by providing the system with prior knowledge about its domain. The knowledge is used to bias the agent towards useful solutions and limit the amount of training needed. We explore this task in three contexts: classification (determining the subject of a newsgroup posting), control (learning to perform tasks such as driving a car up the mountain in simulation), and optimization (optimizing performance of linear algebra operations on different hardware platforms). For the text categorization problem, we introduce a novel algorithm which efficiently integrates prior knowledge into large margin classification. 
We show that prior knowledge simplifies the problem by reducing the size of the hypothesis space. We also provide formal convergence guarantees for our algorithm. For reinforcement learning, we introduce a novel framework for defining planning problems in terms of qualitative statements about the world (e.g., "the faster the car is going, the more likely it is to reach the top of the mountain"). We present an algorithm based on policy iteration for solving such qualitative problems and prove its convergence. We also present an alternative framework which allows the user to specify prior knowledge quantitatively in the form of a Markov Decision Process (MDP). This prior is used to focus exploration on those regions of the world in which the optimal policy is most sensitive to perturbations in transition probabilities and rewards. Finally, in the compiler optimization problem, the prior is based on an analytic model which determines good optimization parameters for a given platform. This model defines a Bayesian prior which, combined with empirical samples (obtained by measuring the performance of optimized code segments), determines the maximum-a-posteriori estimate of the optimization parameters.
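
The general idea of combining a model-derived prior with empirical samples into a maximum-a-posteriori (MAP) estimate can be illustrated with a deliberately simplified sketch. Everything below is an assumption made for illustration and is not taken from the thesis: a single scalar parameter, a Gaussian prior centred on the analytic model's suggestion, and samples treated as noisy Gaussian observations of the best parameter value with known variance. Under those assumptions the MAP estimate is a precision-weighted average of the prior mean and the samples.

```python
def gaussian_map_estimate(prior_mean, prior_var, samples, noise_var):
    """MAP estimate of a scalar under a Gaussian prior and Gaussian noise.

    Hypothetical helper: the Gaussian forms and the known noise variance
    are illustrative assumptions, not the model used in the thesis.
    """
    if prior_var <= 0 or noise_var <= 0:
        raise ValueError("variances must be positive")

    prior_precision = 1.0 / prior_var          # confidence in the analytic model
    data_precision = len(samples) / noise_var  # confidence gained from samples

    weighted_sum = prior_precision * prior_mean + sum(samples) / noise_var
    return weighted_sum / (prior_precision + data_precision)
```

With no samples the estimate equals the prior mean, and as more samples arrive it moves towards their average, which mirrors the role of the prior in limiting how much empirical sampling is needed.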

    Modal Transition Systems as the Basis for Interface Theories and Product Lines


    Fluid-structure interaction by a coupled lattice Boltzmann-finite element approach

    In this thesis, a strategy to model the behavior of fluids and their interaction with deformable bodies is proposed. The fluid domain is modeled using the lattice Boltzmann method, thus analyzing the fluid dynamics from a mesoscopic point of view. It has been proved that the solution provided by this method is equivalent to solving the Navier-Stokes equations for an incompressible flow with second-order accuracy. Slender elastic structures idealized through beam finite elements are used. Large displacements are accounted for by using the corotational formulation. Structural dynamics is computed using the Time Discontinuous Galerkin method. Two different solution procedures are therefore used, one for the fluid domain and the other for the structural part. These two solvers need to communicate and exchange several kinds of information, i.e. stresses, velocities, and displacements. In order to guarantee a continuous, effective, and mutual exchange of information, a coupling strategy consisting of three different algorithms has been developed and numerically tested. In particular, the effectiveness of the three algorithms is shown in terms of the interface energy artificially produced by the approximate fulfilment of compatibility and equilibrium conditions at the fluid-structure interface. The proposed coupled approach is used to solve different fluid-structure interaction problems, i.e. cantilever beams immersed in a viscous fluid, the impact of a ship hull on the marine free surface, blood flow in deformable vessels, and even flapping wings simulating the take-off of a butterfly. The good results achieved in each application highlight the effectiveness of the proposed methodology, and of the software developed in C++, in approaching several two-dimensional fluid-structure interaction problems.

    29th International Symposium on Algorithms and Computation: ISAAC 2018, December 16-19, 2018, Jiaoxi, Yilan, Taiwan


    Structure-Preserving Model Reduction of Physical Network Systems

    This paper considers physical network systems where the energy storage is naturally associated with the nodes of the graph, while the edges of the graph correspond to static couplings. The first sections deal with the linear case, covering examples such as mass-damper and hydraulic systems, which have a structure that is similar to symmetric consensus dynamics. The last section is concerned with a specific class of nonlinear physical network systems, namely detailed-balanced chemical reaction networks governed by mass action kinetics. In both cases, linear and nonlinear, the structure of the dynamics is similar, and is based on a weighted Laplacian matrix, together with an energy function capturing the energy storage at the nodes. We discuss two methods for structure-preserving model reduction. The first one is clustering: aggregating the nodes of the underlying graph to obtain a reduced graph. The second approach is based on neglecting the energy storage at some of the nodes, and subsequently eliminating those nodes (called Kron reduction).
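
The node-elimination step named in the abstract above can be sketched on a small weighted Laplacian. Kron reduction is the Schur complement of the Laplacian with respect to the eliminated nodes; the sketch below performs it one node at a time in plain Python. The function name `kron_reduce` and the list-of-lists matrix representation are choices made here for illustration, not the paper's notation.

```python
def kron_reduce(laplacian, eliminate):
    """Eliminate the given node indices from a weighted graph Laplacian.

    Each elimination is a Schur-complement step on one node; doing them
    in sequence gives the Schur complement with respect to the whole set.
    """
    n = len(laplacian)
    m = [[float(v) for v in row] for row in laplacian]  # work on a copy
    active = list(range(n))

    for k in eliminate:
        active.remove(k)
        pivot = m[k][k]
        if pivot == 0:
            raise ValueError(f"node {k} is isolated and cannot be eliminated")
        for i in active:
            factor = m[i][k] / pivot
            for j in active:
                # m_ij <- m_ij - m_ik * m_kj / m_kk
                m[i][j] -= factor * m[k][j]

    # Keep only rows and columns of the remaining nodes.
    return [[m[i][j] for j in active] for i in active]
```

For a path graph 0 - 1 - 2 with both edge weights equal to 2, eliminating the middle node leaves a single edge of weight 1 between nodes 0 and 2, and the reduced matrix is again a weighted Laplacian with zero row sums.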

    Fundamental Approaches to Software Engineering

    This open access book constitutes the proceedings of the 23rd International Conference on Fundamental Approaches to Software Engineering, FASE 2020, which took place in Dublin, Ireland, in April 2020, and was held as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The 23 full papers, 1 tool paper and 6 testing competition papers presented in this volume were carefully reviewed and selected from 81 submissions. The papers cover topics such as requirements engineering, software architectures, specification, software quality, validation, verification of functional and non-functional properties, model-driven development and model transformation, software processes, security and software evolution.

    Integration of analysis techniques in security and fault-tolerance

    This thesis focuses on the study of integration of formal methodologies in security protocol analysis and fault-tolerance analysis. The research is developed in two different directions: interdisciplinary and intra-disciplinary. In the former, we look for a beneficial interaction between strategies of analysis in security protocols and fault-tolerance; in the latter, we search for connections among different approaches of analysis within the security area. In the following we summarize the main results of the research