Cis-regulatory module detection using constraint programming

Author: Guns Tias
Marchal Kathleen
Nijssen Siegfried
Sun Hong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

We propose a method for finding cis-regulatory modules (CRMs) in a set of co-regulated genes. Each CRM consists of a set of binding sites of transcription factors. We wish to find CRMs involving the same transcription factors in multiple sequences. Finding such a combination of transcription factors is inherently a combinatorial problem. We solve this problem by combining the principles of itemset mining and constraint programming. The constraints involve the putative binding sites of transcription factors, the number of sequences in which they co-occur, and the proximity of the binding sites. Genomic background sequences are used to assess the significance of the modules. We experimentally validate our approach and compare it with state-of-the-art techniques.
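The constraints named in this abstract (co-occurrence across sequences and proximity of binding sites) can be illustrated with a minimal sketch. This is not the authors' method: it enumerates candidate sets exhaustively instead of using constraint programming, and it omits the background-based significance test. The function names, the data layout, and the toy positions are all assumptions made for illustration.

```python
from itertools import combinations

def hits_within_window(sites, motif_set, max_span):
    """True if some window of width max_span holds a site of every motif."""
    positions = sorted(p for m in motif_set for p in sites.get(m, ()))
    for start in positions:
        # Any valid window can be shifted to start at its leftmost site.
        if all(any(start <= p <= start + max_span for p in sites.get(m, ()))
               for m in motif_set):
            return True
    return False

def frequent_motif_sets(sequences, size, min_support, max_span):
    """Motif sets of the given size that co-occur closely in enough sequences."""
    motifs = sorted({m for sites in sequences.values() for m in sites})
    result = {}
    for motif_set in combinations(motifs, size):
        support = sum(hits_within_window(sites, motif_set, max_span)
                      for sites in sequences.values())
        if support >= min_support:
            result[motif_set] = support
    return result

# Hypothetical data: sequence id -> motif name -> putative site positions.
sequences = {
    "s1": {"A": [10, 200], "B": [30], "C": [500]},
    "s2": {"A": [5], "B": [40], "C": [45]},
    "s3": {"A": [100], "B": [400]},
}
result = frequent_motif_sets(sequences, size=2, min_support=2, max_span=50)
print(result)  # {('A', 'B'): 2}
```

Only the pair (A, B) has sites within 50 positions of each other in at least two sequences, so it is the only set reported.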

Crossref

Ghent University Academic Bibliography

A Revised Publication Model for ECML PKDD

Author: Blockeel Hendrik
Kersting Kristian
Nijssen Siegfried
Zelezny Filip
Publication date: 01/01/2012

ECML PKDD is the main European conference on machine learning and data mining. Since its foundation, it has followed the publication model common in computer science: there was one conference deadline; conference submissions were reviewed by a program committee; papers were accepted with a low acceptance rate. Proceedings were published in several Springer Lecture Notes in Artificial Intelligence (LNAI) volumes, while selected papers were invited to special issues of the Machine Learning and Data Mining and Knowledge Discovery journals. In recent years, however, this model has come under stress. Problems include: reviews are of highly variable quality; the purpose of bringing the community together is lost; reviewing workloads are high; the information content of conferences and journals decreases; there is confusion among scientists in interdisciplinary contexts. In this paper, we present a new publication model, which will be adopted for the ECML PKDD 2013 conference and aims to solve some of the problems of the traditional model. The key feature of this model is the creation of a journal track, which is open to submissions all year long and allows for revision cycles. Comment: 13 pages

arXiv.org e-Print Archive

CiteSeerX

DIAL UCLouvain

Mining Patterns in Networks using Homomorphism

Author: Dries Anton
Nijssen Siegfried
Publication date: 14/10/2011

In recent years, many algorithms have been developed for finding patterns in graphs and networks. A disadvantage of these algorithms is that they use subgraph isomorphism to determine the support of a graph pattern; subgraph isomorphism is a well-known NP-complete problem. In this paper, we propose an alternative approach which mines tree patterns in networks by using subgraph homomorphism. The advantage of homomorphism is that it can be computed in polynomial time, which allows us to develop an algorithm that mines tree patterns in arbitrary graphs in incremental polynomial time. Homomorphism, however, entails two problems not found when using isomorphism: (1) two patterns of different size can be equivalent; (2) patterns of unbounded size can be frequent. In this paper, we formalize these problems and study solutions that easily fit within our algorithm.
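To make the polynomial-time claim concrete, here is a minimal sketch of a homomorphism test for a rooted, labeled tree pattern in an undirected labeled graph. It is a standard bottom-up dynamic program, not the mining algorithm of the paper; the function name, the dictionary-based data layout, and the toy graph are assumptions made for illustration.

```python
from functools import lru_cache

def tree_homomorphism_exists(tree_children, tree_labels, root,
                             graph_adj, graph_labels):
    """True if the rooted tree pattern maps homomorphically into the graph."""

    @lru_cache(maxsize=None)
    def can_map(p, g):
        # Pattern node p may map to graph node g if the labels agree and
        # every child of p can map to some neighbor of g. Children may
        # share a neighbor, because a homomorphism need not be injective.
        if tree_labels[p] != graph_labels[g]:
            return False
        return all(any(can_map(c, h) for h in graph_adj[g])
                   for c in tree_children.get(p, ()))

    return any(can_map(root, g) for g in graph_adj)

# Toy graph: a single edge between an 'a' node and a 'b' node.
graph_adj = {0: [1], 1: [0]}
graph_labels = {0: "a", 1: "b"}

# Pattern 1: a 'b' root with two 'a' children (both map to graph node 0).
star = tree_homomorphism_exists({0: [1, 2]}, {0: "b", 1: "a", 2: "a"}, 0,
                                graph_adj, graph_labels)
# Pattern 2: a path a-b-a-b, longer than the graph itself.
path = tree_homomorphism_exists({0: [1], 1: [2], 2: [3]},
                                {0: "a", 1: "b", 2: "a", 3: "b"}, 0,
                                graph_adj, graph_labels)
# Pattern 3: an a-a edge, which the graph does not contain.
missing = tree_homomorphism_exists({0: [1]}, {0: "a", 1: "a"}, 0,
                                   graph_adj, graph_labels)
print(star, path, missing)  # True True False
```

Each (pattern node, graph node) pair is evaluated once, so the running time is polynomial in the sizes of the pattern and the graph. The first two patterns also show the two problems the abstract mentions: the star is matched although the graph has only one 'a' node, and paths of any length alternate between the two graph nodes.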

arXiv.org e-Print Archive

Lirias

CiteSeerX

Unveiling combinatorial regulation through the combination of ChIP information and in silico cis-regulatory module detection

Author: Fierro Ana Carolina
Guns Tias
Marchal Kathleen
Nijssen Siegfried
Sun Hong
Thorrez Lieven
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2012

Computationally retrieving biologically relevant cis-regulatory modules (CRMs) is not straightforward. Because of the large number of candidates and the imperfection of the screening methods, many spurious CRMs are detected that score as high as the biologically true ones. Using ChIP information not only reduces the regions in which the binding sites of the assayed transcription factor (TF) should be located, but also restricts the valid CRMs to those that contain the assayed TF (here referred to as applying CRM detection in a query-based mode). In this study, we show that exploiting ChIP information in a query-based way makes in silico CRM detection a much more feasible endeavor. To handle the large datasets, the query-based setting, and other specifics of CRM detection on ChIP-Seq based data, we developed a novel, powerful CRM detection method, 'CPModule'. By applying it to a well-studied ChIP-Seq data set involved in self-renewal of mouse embryonic stem cells, we demonstrate how our tool can recover combinatorial regulation of five known TFs that are key in the self-renewal of mouse embryonic stem cells. Additionally, we make a number of new predictions on combinatorial regulation of these five key TFs with other TFs documented in TRANSFAC.

Ghent University Academic Bibliography

PubMed Central

Using an interpretable Machine Learning approach to study the drivers of International Migration

Author: Docquier Frédéric
Houndji Vinasetan Ratheil
Kiossou Harold Silvère
Nijssen Siegfried
Schaus Pierre
Schenk Yannik
Publication date: 01/01/2020

Globally increasing migration pressures call for new modelling approaches in order to design effective policies. It is important to have not only efficient models to predict migration flows but also to understand how specific parameters influence these flows. In this paper, we propose an artificial neural network (ANN) to model international migration. Moreover, we use a technique for interpreting machine learning models, namely Partial Dependence Plots (PDP), to show that one can well study the effects of drivers behind international migration. We train and evaluate the model on a dataset containing annual international bilateral migration from 1960 to 2010 from 175 origin countries to 33 mainly OECD destinations, along with the main determinants as identified in the migration literature. The experiments carried out confirm that: 1) the ANN model is more efficient w.r.t. a traditional model, and 2) using PDP we are able to gain additional insights on the specific effects of the migration drivers. This approach provides much more information than only using the feature importance information used in previous works.
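The values behind a partial dependence plot can be computed with a short sketch like the following. It is a generic illustration, not the authors' code: the function name and the toy linear model are assumptions, and the model stands in for a trained predictor such as the ANN described above.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)       # copy so the data set is not changed
            modified[feature] = value  # overwrite the feature under study
            total += predict(modified)
        curve.append(total / len(rows))
    return curve

# Hypothetical stand-in model: prediction = 2 * x0 + x1.
def toy_model(x):
    return 2 * x[0] + x[1]

rows = [[0, 1], [0, 3]]
curve = partial_dependence(toy_model, rows, feature=0, grid=[0, 1, 2])
print(curve)  # [2.0, 4.0, 6.0]
```

Plotting the curve against the grid gives the partial dependence plot for that feature. Here it rises by 2 per unit of the first feature, which matches the coefficient of the toy model.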

arXiv.org e-Print Archive

DIAL UCLouvain

Mining local staircase patterns in noisy data

Author: De Raedt Luc
Fierro Ana Carolina
Guns Tias
Le Van Thanh
Marchal Kathleen
Nijssen Siegfried
van Leeuwen Matthijs
Publication venue: International workshop on Co-Clustering and Applications, 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

Most traditional biclustering algorithms identify biclusters with little or no overlap. In this paper, we introduce the problem of identifying staircases of biclusters. Such staircases may be indicative of causal relationships between columns and cannot easily be identified by existing biclustering algorithms. Our formalization relies on a scoring function based on the Minimum Description Length principle. Furthermore, we propose a first algorithm for identifying staircase biclusters, based on a combination of local search and constraint programming. Experiments show that the approach is promising.

Crossref

Ghent University Academic Bibliography

DIAL UCLouvain

Guest editor’s introduction: special issue of the ECML PKDD 2013 journal track

Author: Filip Železný
Hendrik Blockeel
Kristian Kersting
Siegfried Nijssen
Publication venue: 'Springer Science and Business Media LLC'

Crossref
