Inductive queries for a drug designing robot scientist
It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it, and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments.
Fast computation: a steady-state simulation of railways ballasted track settlement
The geometry of ballasted railway track is a major concern for railroad safety and efficiency. Settlement of railway ballast has been studied to help railway infrastructure managers keep infrastructure in shape and prevent accidents. In this paper, we present an innovative numerical approach to studying railway ballast settlement. Commonly used models that represent a moving load need huge computation time. On the other hand, assuming a static cyclic loading representation leads to discrepancies, because it does not consider the particularities of a moving load. With this new model we want to avoid the drawbacks of previously developed methods. We developed a steady-state algorithm to compute plastic strain in geomaterials and to study the behaviour of ballasted railway track with an Eulerian approach. In this way we improved model efficiency by drastically reducing computation time while accounting for the specificities of a mobile load.
An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities
We describe an extension of Earley's parser for stochastic context-free
grammars that computes the following quantities given a stochastic context-free
grammar and an input string: a) probabilities of successive prefixes being
generated by the grammar; b) probabilities of substrings being generated by the
nonterminals, including the entire string being generated by the grammar; c)
most likely (Viterbi) parse of the string; d) posterior expected number of
applications of each grammar production, as required for reestimating rule
probabilities. (a) and (b) are computed incrementally in a single left-to-right
pass over the input. Our algorithm compares favorably to standard bottom-up
parsing methods for SCFGs in that it works efficiently on sparse grammars by
making use of Earley's top-down control structure. It can process any
context-free rule format without conversion to some normal form, and combines
computations for (a) through (d) in a single algorithm. Finally, the algorithm
has simple extensions for processing partially bracketed inputs, and for
finding partial parses and their likelihoods on ungrammatical inputs.
Comment: 45 pages. Slightly shortened version to appear in Computational
Linguistics 2
CICLAD: A Fast and Memory-efficient Closed Itemset Miner for Streams
Mining association rules from data streams is a challenging task due to the
(typically) limited resources available vs. the large size of the result.
Frequent closed itemsets (FCI) enable an efficient first step, yet current FCI
stream miners are not optimal on resource consumption, e.g. they store a large
number of extra itemsets at an additional cost. In a search for a better
storage-efficiency trade-off, we designed Ciclad, an intersection-based
sliding-window FCI miner. Leveraging in-depth insights into FCI evolution, it
combines minimal storage with quick access. Experimental results indicate that
Ciclad's memory footprint is much lower, and its overall performance better,
than those of competitor methods.
Comment: KDD2
On the genericity properties in networked estimation: Topology design and sensor placement
In this paper, we consider networked estimation of linear, discrete-time
dynamical systems monitored by a network of agents. In order to minimize the
power requirement at the (possibly, battery-operated) agents, we require that
the agents can exchange information with their neighbors only once per
dynamical-system time-step, in contrast to consensus-based estimation, where
the agents exchange information until they reach a consensus. It can be
verified that with this restriction on information exchange, measurement fusion
alone results in an unbounded estimation error at every such agent that does
not have an observable set of measurements in its neighborhood. To overcome
this challenge, state-estimate fusion has been proposed to recover the system
observability. However, we show that adding state-estimate fusion may not
recover observability when the system matrix is structured-rank (S-rank)
deficient.
In this context, we characterize the state-estimate fusion and measurement
fusion under both full S-rank and S-rank-deficient system matrices.
Comment: submitted for IEEE journal publication
Recursive SDN for Carrier Networks
Control planes for global carrier networks should be programmable (so that
new functionality can be easily introduced) and scalable (so they can handle
the numerical scale and geographic scope of these networks). Neither
traditional control planes nor new SDN-based control planes meet both of these
goals. In this paper, we propose a framework for recursive routing computations
that combines the best of SDN (programmability) and traditional networks
(scalability through hierarchy) to achieve these two desired properties.
Through simulation on graphs of up to 10,000 nodes, we evaluate our design's
ability to support a variety of routing and traffic engineering solutions,
while incorporating a fast failure recovery mechanism.