3,050 research outputs found
Inductive queries for a drug designing robot scientist
It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it; and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments.
A Survey of Pipelined Workflow Scheduling: Models and Algorithms
A large class of applications needs to execute the same workflow on different data sets of identical size. Efficient execution of such applications necessitates intelligent distribution of the application components and tasks on a parallel machine, and the execution can be orchestrated by utilizing task-, data-, pipelined-, and/or replicated-parallelism. The scheduling problem that encompasses all of these techniques is called pipelined workflow scheduling, and it has been widely studied in the last decade. Multiple models and algorithms have flourished to tackle various programming paradigms, constraints, machine behaviors or optimization goals. This paper surveys the field by summing up and structuring known results and approaches.
RRR: Rank-Regret Representative
Selecting the best items in a dataset is a common task in data exploration.
However, the concept of "best" lies in the eyes of the beholder: different
users may consider different attributes more important, and hence arrive at
different rankings. Nevertheless, one can remove "dominated" items and create a
"representative" subset of the data set, comprising the "best items" in it. A
Pareto-optimal representative is guaranteed to contain the best item of each
possible ranking, but it can be almost as big as the full data. A smaller
representative can be found if we relax the requirement to include the best item for every
possible user, and instead just limit the users' "regret". Existing work
defines regret as the loss in score by limiting consideration to the
representative instead of the full data set, for any chosen ranking function.
However, the score is often not a meaningful number and users may not
understand its absolute value. Sometimes small ranges in score can include
large fractions of the data set. In contrast, users do understand the notion of
rank ordering. Therefore, alternatively, we consider the position of the items
in the ranked list for defining the regret and propose the rank-regret
representative as the minimal subset of the data containing at least one of
the top-k items of any possible ranking function. This problem is NP-complete. We
use the geometric interpretation of items to bound their ranks on ranges of
functions and to utilize combinatorial geometry notions for developing
effective and efficient approximation algorithms for the problem. Experiments
on real datasets demonstrate that we can efficiently find small subsets with
small rank-regrets.
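To make the rank-based notion of regret concrete, here is a minimal sketch in Python. It is not the paper's algorithm. It assumes ranking functions are linear with non-negative weights and estimates the rank-regret of a given subset by sampling such functions at random, so the value it returns is only a lower bound on the true rank-regret. The function names `rank_of_best` and `estimated_rank_regret` are hypothetical.

```python
import random


def rank_of_best(subset_idx, items, weights):
    """Best (smallest) 1-based rank reached by any subset item
    when all items are ordered by a linear score (higher is better)."""
    scores = [sum(w * x for w, x in zip(weights, item)) for item in items]
    order = sorted(range(len(items)), key=lambda i: -scores[i])
    position = {idx: r + 1 for r, idx in enumerate(order)}
    return min(position[i] for i in subset_idx)


def estimated_rank_regret(subset_idx, items, num_functions=1000, seed=0):
    """Worst rank_of_best over randomly sampled weight vectors.

    Assumption: ranking functions are linear with weights in [0, 1).
    Sampling can miss the worst-case function, so this is a lower
    bound on the true rank-regret, not an exact value.
    """
    rng = random.Random(seed)
    dim = len(items[0])
    worst = 1
    for _ in range(num_functions):
        weights = [rng.random() for _ in range(dim)]
        worst = max(worst, rank_of_best(subset_idx, items, weights))
    return worst
```

A subset whose estimated rank-regret is at most k contains, for every sampled function, at least one of that function's top-k items. The abstract's problem is to find the smallest such subset over all ranking functions, which this sketch does not attempt.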
Output-sensitive complexity of multiobjective combinatorial optimization
We study output-sensitive algorithms and complexity for multiobjective combinatorial optimization (MOCO) problems. In this computational complexity framework, an algorithm for a general enumeration problem is regarded as efficient if it is output-sensitive, i.e., its running time is bounded by a polynomial in the input and the output size. We provide both practical examples of MOCO problems for which such an efficient algorithm exists and problems for which no efficient algorithm exists under mild complexity-theoretic assumptions.
Multiobjective synchronization of coupled systems
Copyright © 2011 American Institute of Physics. Synchronization of coupled chaotic systems has been a subject of great interest and importance, both in theory and in various fields of application, such as secure communication and neuroscience. Recently, based on stability theory, synchronization of coupled chaotic systems by designing appropriate coupling has been widely investigated. However, almost all the available results have focused on ensuring the synchronization of coupled chaotic systems with as small a coupling strength as possible. In this contribution, we study multiobjective synchronization of coupled chaotic systems by considering two objectives in parallel, i.e., minimizing the coupling strength and optimizing the convergence speed. The coupling form and coupling strength are optimized by an improved multiobjective evolutionary approach. The constraints on the coupling form are also investigated by formulating the problem as a multiobjective constraint problem. We find that the proposed evolutionary method can outperform the conventional adaptive strategy in several respects. The results presented in this paper can be extended to nonlinear time-series analysis and synchronization of complex networks, and have various applications.