Bioinformatics and constraints
This article introduces the topic of bioinformatics to an audience of computer scientists. We discuss the definition of bioinformatics, give a classification of the problem areas which bioinformatics addresses, and illustrate these in detail with examples. We highlight those areas which we believe to be suitable for the application of constraint solving techniques, or where similar techniques are already used. Finally, we give some advice for computer scientists who are considering getting involved in bioinformatics, and provide a resource list and a reading list.
Decompositions of Grammar Constraints
A wide range of constraints can be compactly specified using automata or
formal languages. In a sequence of recent papers, we have shown that an
effective means to reason with such specifications is to decompose them into
primitive constraints. We can then, for instance, use state of the art SAT
solvers and profit from their advanced features like fast unit propagation,
clause learning, and conflict-based search heuristics. This approach holds
promise for solving combinatorial problems in scheduling, rostering, and
configuration, as well as problems in more diverse areas like bioinformatics,
software testing and natural language processing. In addition, decomposition
may be an effective method to propagate other global constraints.
Comment: Proceedings of the Twenty-Third AAAI Conference on Artificial
Intelligence
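As a rough illustration of the idea in this abstract, the sketch below unrolls a deterministic finite automaton over the positions of a sequence, with one state set per position linked by the transition relation, and prunes values that have no supporting path to an accepting state. The function name `filter_regular`, the dictionary encoding of the transition function, and the example automaton are assumptions made for illustration; this is not the authors' encoding, and it does not call a SAT solver.

```python
def filter_regular(domains, delta, start, accepting):
    """Prune values that lie on no accepting path of the unrolled automaton.

    domains   -- list of sets, the candidate symbols for each position
    delta     -- dict mapping (state, symbol) to the next state (hypothetical encoding)
    start     -- initial state
    accepting -- iterable of accepting states
    """
    n = len(domains)

    # Forward pass: states reachable after reading the first i symbols.
    forward = [{start}]
    for i in range(n):
        forward.append({
            delta[(q, a)]
            for q in forward[i]
            for a in domains[i]
            if (q, a) in delta
        })

    # Backward pass: keep only states that can still reach acceptance,
    # and record which symbols are supported at each position.
    backward = [set() for _ in range(n + 1)]
    backward[n] = forward[n] & set(accepting)
    pruned = [set() for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for q in forward[i]:
            for a in domains[i]:
                if delta.get((q, a)) in backward[i + 1]:
                    backward[i].add(q)
                    pruned[i].add(a)
    return pruned


# Example automaton (an assumption): binary strings with no two consecutive 1s.
# State 0 means the last symbol was not 1, state 1 means it was.
NO_11 = {(0, 0): 0, (0, 1): 1, (1, 0): 0}

result = filter_regular([{1}, {0, 1}, {1}], NO_11, start=0, accepting={0, 1})
```

In this example the middle position can only take the value 0, because choosing 1 would place two 1s side by side. Each layer of the unrolled automaton corresponds to one small transition constraint over a state, a symbol, and the next state.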
A polynomial delay algorithm for the enumeration of bubbles with length constraints in directed graphs and its application to the detection of alternative splicing in RNA-seq data
We present a new algorithm for enumerating bubbles with length constraints in
directed graphs. This problem arises in transcriptomics, where the question is
to identify all alternative splicing events present in a sample of mRNAs
sequenced by RNA-seq. This is the first polynomial-delay algorithm for this
problem and we show that in practice, it is faster than previous approaches.
This enables us to deal with larger instances and therefore to discover novel
alternative splicing events, especially long ones, that were previously
overlooked by existing methods.
Comment: Peer-reviewed and presented as part of the 13th Workshop on
Algorithms in Bioinformatics (WABI2013)
Using philosophy to improve the coherence and interoperability of applications ontologies: A field report on the collaboration of IFOMIS and L&C
The collaboration of Language and Computing nv (L&C) and the Institute for Formal Ontology and Medical Information Science (IFOMIS) is guided by the hypothesis that quality constraints on ontologies for software application purposes closely parallel the constraints salient to the design of sound philosophical theories. The extent of this parallel has been poorly appreciated in the informatics community, and it turns out that importing the benefits of philosophical insight and methodology into application domains yields a variety of improvements. L&C's LinKBase® is one of the world's largest medical domain ontologies. Its current primary use pertains to natural language processing applications, but it also supports intelligent navigation through a range of structured medical and bioinformatics information resources, such as SNOMED-CT, Swiss-Prot, and the Gene Ontology (GO). In this report we discuss how and why philosophical methods improve both the internal coherence of LinKBase®, and its capacity to serve as a translation hub, improving the interoperability of the ontologies through which it navigates.
Methodology for Constructing Problem Definitions in Bioinformatics
Motivation: A recurrent criticism is that certain bioinformatics tools do not account for crucial biology and therefore fail to answer the targeted biological question. We posit that the single most important reason for such shortcomings is an inaccurate formulation of the computational problem. Results: Our paper describes how to define a bioinformatics problem so that it captures both the underlying biology and the computational constraints for a particular problem. The proposed model delineates comprehensively the biological problem and conducts an item-by-item bioinformatics transformation resulting in a germane computational problem. This methodology not only facilitates interdisciplinary information flow but also accommodates emerging knowledge and technologies.
Flexible RNA design under structure and sequence constraints using formal languages
The problem of RNA secondary structure design (also called inverse folding)
is the following: given a target secondary structure, one aims to create a
sequence that folds into, or is compatible with, a given structure. In several
practical applications in biology, additional constraints must be taken into
account, such as the presence/absence of regulatory motifs, either at a
specific location or anywhere in the sequence. In this study, we investigate
the design of RNA sequences from their targeted secondary structure, given
these additional sequence constraints. To this purpose, we develop a general
framework based on concepts of language theory, namely context-free grammars
and finite automata. We efficiently combine a comprehensive set of constraints
into a unifying context-free grammar of moderate size. From there, we use
generic algorithms to perform a (weighted) random generation, or an
exhaustive enumeration, of candidate sequences. The resulting method, whose
complexity scales linearly with the length of the RNA, was implemented as a
standalone program. The resulting software was embedded into a publicly
available dedicated web server. The applicability of the method is demonstrated on
a concrete case study dedicated to Exon Splicing Enhancers, in which our
approach was successfully used in the design of in vitro experiments.
Comment: ACM BCB 2013 - ACM Conference on Bioinformatics, Computational
Biology and Biomedical Informatics (2013)
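To make the notion of a sequence "compatible with" a target structure concrete, here is a minimal sketch that samples a uniformly random RNA sequence for a dot-bracket structure and counts how many compatible sequences exist. It handles only the structure constraint, not the motif constraints or the grammar-based machinery described in the abstract. The function names, the dot-bracket input format, and the choice to allow G-U wobble pairs alongside Watson-Crick pairs are assumptions made for illustration.

```python
import random

# Allowed base pairs (assumption: Watson-Crick pairs plus G-U wobble pairs).
PAIRS = [("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")]


def pair_table(structure):
    """Map each paired position to its partner in a dot-bracket string."""
    stack, partner = [], {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            partner[i], partner[j] = j, i
    if stack:
        raise ValueError("unbalanced structure")
    return partner


def sample_compatible(structure, rng):
    """Draw one sequence uniformly among those compatible with the structure."""
    partner = pair_table(structure)
    sequence = [None] * len(structure)
    for i, symbol in enumerate(structure):
        if symbol == ".":
            sequence[i] = rng.choice("ACGU")
        elif symbol == "(":
            # Choose both ends of the pair together so they stay complementary.
            sequence[i], sequence[partner[i]] = rng.choice(PAIRS)
    return "".join(sequence)


def count_compatible(structure):
    """Number of compatible sequences: 4 per unpaired base, 6 per base pair."""
    pair_table(structure)  # validates that the brackets are balanced
    return 4 ** structure.count(".") * 6 ** structure.count("(")


candidate = sample_compatible("((..))", random.Random(0))
```

Both functions take a single pass over the structure, so the cost grows linearly with the sequence length. Adding required or forbidden motifs is where a combined grammar and automaton representation, as described in the abstract, becomes necessary.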
Gene Similarity-based Approaches for Determining Core-Genes of Chloroplasts
In computational biology and bioinformatics, understanding evolutionary
processes within related organisms has received a lot of attention
in recent decades. However, accurate methodologies are still needed to
discover gene content evolution. In a previous work, two novel approaches
based on sequence similarities and gene features have been proposed. More
precisely, we proposed to use gene names, sequence similarities, or both,
obtained either from NCBI or from DOGMA annotation tools. DOGMA has the
advantage of being an up-to-date, accurate automatic tool specifically designed for
chloroplasts, whereas NCBI possesses high-quality human-curated genes (together
with wrongly annotated ones). The key idea of the former proposal was to take
the best from these two tools. However, the first proposal was limited by name
variations and spelling errors on the NCBI side, leading to core trees of low
quality. In this paper, these flaws are fixed by improving the comparison of
NCBI and DOGMA results, and by relaxing constraints on gene names while adding
a stage of post-validation on gene sequences. The two stages of similarity
measures, on names and sequences, are thus proposed for sequence clustering.
This improves results that can be obtained using either NCBI or DOGMA alone.
Results obtained with this quality control test are further investigated and
compared with previously released ones, on both computational and biological
aspects, considering a set of 99 chloroplastic genomes.
Comment: 4 pages, IEEE International Conference on Bioinformatics and
Biomedicine (BIBM 2014)