9,147 research outputs found

    Inductive queries for a drug designing robot scientist

    Get PDF
    It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it, and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments

    Fast computation: a steady-state simulation of railways ballasted track settlement

    Get PDF
    Geometryofballastedrailwaystrackisamajorconcerninrailroadssafetyand efficiency. Settlement of railways ballast has been studied to help railway infrastructure managers to keep infrastructures in shape and to prevent accidents. In this paper, we present an innovative numerical approach to study railways ballast settlement. Commonly used models representing a moving load need huge computation time. On the other hand, assuming static cyclic loading representation leads to discrepancies. Indeed, it does not conceder particularities of moving load. With this new model we want to avoid the drawbacks of previously developed methods. We developed a steady state algorithm to compute plastic strain in geomaterials and to study behaviour of ballasted railways track with an Eulerian approach. This way we improved model efficiency by drastically reducing computation time while considering mobile load specificities

    An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities

    Full text link
    We describe an extension of Earley's parser for stochastic context-free grammars that computes the following quantities given a stochastic context-free grammar and an input string: a) probabilities of successive prefixes being generated by the grammar; b) probabilities of substrings being generated by the nonterminals, including the entire string being generated by the grammar; c) most likely (Viterbi) parse of the string; d) posterior expected number of applications of each grammar production, as required for reestimating rule probabilities. (a) and (b) are computed incrementally in a single left-to-right pass over the input. Our algorithm compares favorably to standard bottom-up parsing methods for SCFGs in that it works efficiently on sparse grammars by making use of Earley's top-down control structure. It can process any context-free rule format without conversion to some normal form, and combines computations for (a) through (d) in a single algorithm. Finally, the algorithm has simple extensions for processing partially bracketed inputs, and for finding partial parses and their likelihoods on ungrammatical inputs.Comment: 45 pages. Slightly shortened version to appear in Computational Linguistics 2

    CICLAD: A Fast and Memory-efficient Closed Itemset Miner for Streams

    Full text link
    Mining association rules from data streams is a challenging task due to the (typically) limited resources available vs. the large size of the result. Frequent closed itemsets (FCI) enable an efficient first step, yet current FCI stream miners are not optimal on resource consumption, e.g. they store a large number of extra itemsets at an additional cost. In a search for a better storage-efficiency trade-off, we designed Ciclad,an intersection-based sliding-window FCI miner. Leveraging in-depth insights into FCI evolution, it combines minimal storage with quick access. Experimental results indicate Ciclad's memory imprint is much lower and its performances globally better than competitor methods.Comment: KDD2

    On the genericity properties in networked estimation: Topology design and sensor placement

    Full text link
    In this paper, we consider networked estimation of linear, discrete-time dynamical systems monitored by a network of agents. In order to minimize the power requirement at the (possibly, battery-operated) agents, we require that the agents can exchange information with their neighbors only \emph{once per dynamical system time-step}; in contrast to consensus-based estimation where the agents exchange information until they reach a consensus. It can be verified that with this restriction on information exchange, measurement fusion alone results in an unbounded estimation error at every such agent that does not have an observable set of measurements in its neighborhood. To over come this challenge, state-estimate fusion has been proposed to recover the system observability. However, we show that adding state-estimate fusion may not recover observability when the system matrix is structured-rank (SS-rank) deficient. In this context, we characterize the state-estimate fusion and measurement fusion under both full SS-rank and SS-rank deficient system matrices.Comment: submitted for IEEE journal publicatio

    Recursive SDN for Carrier Networks

    Full text link
    Control planes for global carrier networks should be programmable (so that new functionality can be easily introduced) and scalable (so they can handle the numerical scale and geographic scope of these networks). Neither traditional control planes nor new SDN-based control planes meet both of these goals. In this paper, we propose a framework for recursive routing computations that combines the best of SDN (programmability) and traditional networks (scalability through hierarchy) to achieve these two desired properties. Through simulation on graphs of up to 10,000 nodes, we evaluate our design's ability to support a variety of routing and traffic engineering solutions, while incorporating a fast failure recovery mechanism
    corecore