
    GPU acceleration for statistical gene classification

    The use of bioinformatics tools in routine clinical diagnostics still faces a number of issues. The more complex and advanced bioinformatics tools become, the more performance they require from computing platforms. Unfortunately, the cost of parallel computing platforms is usually prohibitive for both public and small private medical practices. This paper presents a successful experience in using the parallel processing capabilities of Graphical Processing Units (GPUs) to speed up bioinformatics tasks such as statistical classification of gene expression profiles. The results show that open source CUDA programming libraries yield a significant increase in performance and therefore narrow the gap between advanced bioinformatics tools and real medical practice.

    Gene Expression vs. Network Attractors

    Microarrays, RNA-Seq, and Gene Regulatory Networks (GRNs) are common tools used to study the regulatory mechanisms mediating the expression of the genes involved in the biological processes of a cell. Whereas microarrays and RNA-Seq provide a snapshot of the average expression of a set of genes across a population of cells, GRNs are used to model the dynamics of the regulatory dependencies among a subset of genes believed to be the main actors in a biological process. In this paper we discuss the possibility of correlating the dynamics of a GRN with a gene expression profile extracted from one or more wet-lab expression experiments. This is more a position paper to promote discussion than a research paper with final results.

    GPU cards as a low cost solution for efficient and fast classification of high dimensional gene expression datasets

    The days when bioinformatics tools will be reliable enough to become a standard aid in routine clinical diagnostics are getting very close. However, it is important to remember that the more complex and advanced bioinformatics tools become, the more performance they require from computing platforms. Unfortunately, the cost of High Performance Computing (HPC) platforms is still prohibitive for both public and private medical practices. Therefore, to promote and facilitate the use of bioinformatics tools, it is important to identify low-cost parallel computing solutions. This paper presents a successful experience in using the parallel processing capabilities of Graphical Processing Units (GPUs) to speed up classification of gene expression profiles. Results show that open source CUDA programming libraries yield a significant increase in performance and therefore narrow the gap between advanced bioinformatics tools and real medical practice.

    An agent-based simulation framework for complex systems

    In this abstract we present a new approach to the simulation of complex systems such as biological interaction networks, chemical reactions, ecosystems, etc. It aims at overcoming previously proposed analytical approaches that, because of several computational challenges, could not handle systems of realistic complexity. The proposed model is based on a set of agents interacting through a shared environment. Each agent functions independently from the others, and its behavior is driven only by its current status and the "content" of the surrounding environment. The environment is the only "data repository" and does not store the value of variables, but only their presence and concentration. Each agent performs three main functions: (1) it samples the environment at random locations; (2) based on the distribution of the sampled data and a proper Transfer Function, it computes the rate at which the output values are generated; (3) it writes the output "products" at random locations. The environment is modeled as a Really Random Access Memory (R2AM). Data is written and sampled at random memory locations. Each memory location represents an atomic sample (a molecule, a chemical compound, a protein, an ion, ...). Presence and concentration of these samples are what constitutes the environment data set. The environment can be sensitive to external stimuli (e.g., pH, temperature, ...) and can include topological information to allow its partitioning (e.g., between nucleus and cytoplasm in a cell) and the modeling of sample "movements" within the environment. The proposed approach is easily scalable in both complexity and computational costs. Each module could implement a very simple object, such as a single chemical reaction, or a very complex process, such as the translation of a gene into a protein. At the same time, from the hardware point of view, the complexity of the objects implementing a single agent can range from a single software process to a dedicated computer or hardware platform.
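
    The three agent functions described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' framework: the environment is a plain list standing in for the R2AM, the transfer function is a hypothetical linear one, written products overwrite whatever the chosen cell held, and the names `make_environment`, `agent_step`, `n_samples`, and `gain` are invented for the example.

```python
import random
from collections import Counter

def make_environment(size, species_counts, rng):
    """Fill a fixed-size memory with species labels; None marks an empty cell."""
    cells = [label for label, n in species_counts.items() for _ in range(n)]
    cells += [None] * (size - len(cells))
    rng.shuffle(cells)
    return cells

def agent_step(env, rng, reactant="A", product="B", n_samples=20, gain=5.0):
    """One agent cycle: sample, apply a transfer function, write products."""
    # 1. Sample the environment at random locations.
    sampled = [env[rng.randrange(len(env))] for _ in range(n_samples)]
    fraction = Counter(sampled)[reactant] / n_samples
    # 2. Hypothetical linear transfer function: the number of outputs grows
    #    with the observed concentration of the reactant.
    n_out = int(round(gain * fraction))
    # 3. Write the products at random locations (assumption: overwrite).
    for _ in range(n_out):
        env[rng.randrange(len(env))] = product
    return n_out

rng = random.Random(0)
env = make_environment(1000, {"A": 300}, rng)
for _ in range(200):
    agent_step(env, rng)
counts = Counter(env)
```

    Note that the agent never reads a variable's value; it only sees how often a species turns up in its random sample, which mirrors the "presence and concentration" view of the environment.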

    Using Boolean Networks to Model Post-transcriptional Regulation in Gene Regulatory Networks

    In Gene Regulatory Networks research there is a considerable lack of techniques and tools to understand these networks from a Systems Biology point of view. The typical biological approach is to reconstruct a particular network from expression patterns, then try to validate its dynamics by simulation, use simulation to analyze its reactions to some perturbations, and finally go back "in vitro" to validate the simulation results. Nevertheless, when the goal is to understand the high-level, general mechanisms that allow these networks to work or to be stable under mild perturbations, this type of approach has shown very strong limitations. In this work we want to better understand the role of miRNA as a stabilizing mechanism in gene regulatory networks. Boolean networks have recently been used to better understand the structural and dynamical properties of regulatory networks. Attractors and ergodic sets have been easily correlated with many of the typical biological cell behaviors (cancer, differentiation, pluripotency, ...). The most widely used models are nevertheless very simple, and work under overly strict constraints. We are defining an enhanced model based on Boolean Networks that is also able to take into account post-transcriptional regulation and can possibly be extended to other regulatory mechanisms (e.g., ceRNA) that have already been proven crucial in vivo. The final goal is to understand whether the large number of miRNA targets constitutes a structural network-stability mechanism used to make the network immune to "regulatory" noise. To achieve this result we evolve the modified Boolean networks for high or low sensitivity to perturbations, and then analyze the resulting networks to understand whether specific structural patterns containing miRNA-like post-transcriptional regulatory elements can be correlated with network stability.
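
    To make the notion of an attractor concrete, the sketch below enumerates the attractors of a plain synchronous Boolean network. It does not implement the enhanced model with post-transcriptional regulation that the abstract proposes; the three update rules are a hypothetical toy network, and the function names are chosen for this example only.

```python
from itertools import product

def step(state, rules):
    """Synchronously update every node from the current state."""
    return tuple(rule(state) for rule in rules)

def find_attractor(state, rules):
    """Iterate until a state repeats; return the cycle as a list of states."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, rules)
    return trajectory[seen[state]:]

def attractors(rules):
    """Collect the distinct attractors reached from every initial state."""
    found = set()
    for start in product((0, 1), repeat=len(rules)):
        cycle = find_attractor(start, rules)
        # Rotate the cycle so its smallest state comes first (canonical form).
        k = cycle.index(min(cycle))
        found.add(tuple(cycle[k:] + cycle[:k]))
    return found

# Hypothetical toy network: nodes 0 and 1 activate each other,
# node 2 is on only when node 0 is on and node 1 is off.
rules = [
    lambda s: s[1],
    lambda s: s[0],
    lambda s: int(s[0] and not s[1]),
]
result = attractors(rules)
```

    For this toy network the search yields two fixed points and one cycle of length two. Exhaustive enumeration is only feasible for small networks, since the state space doubles with every added node.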

    Building Gene Expression Profile Classifiers with a Simple and Efficient Rejection Option in R

    Background: The collection of gene expression profiles from DNA microarrays and their analysis with pattern recognition algorithms is a powerful technology applied to several biological problems. Common pattern recognition systems classify samples by assigning them to a set of known classes. However, in a clinical diagnostics setup, novel and unknown classes (new pathologies) may appear, and one must be able to reject those samples that do not fit the trained model. The problem of implementing a rejection option in a multi-class classifier has not been widely addressed in the statistical literature. Gene expression profiles represent a critical case study since they suffer from the curse of dimensionality, which reduces the reliability of both traditional rejection models and more recent approaches such as one-class classifiers. Results: This paper presents a set of empirical decision rules that can be used to implement a rejection option in a set of multi-class classifiers widely used for the analysis of gene expression profiles. In particular, we focus on the classifiers implemented in the R Language and Environment for Statistical Computing (R for short in the remainder of this paper). The main contribution of the proposed rules is their simplicity, which enables easy integration with available data analysis environments. Since tuning the parameters involved in the definition of a rejection model is often a complex and delicate task, in this paper we exploit an evolutionary strategy to automate this process. This allows the final user to maximize the rejection accuracy with minimum manual intervention. Conclusions: This paper shows how simple decision rules can support the use of complex machine learning algorithms in real experimental setups. The proposed approach is almost completely automated and therefore a good candidate for integration into data analysis flows in labs where the machine learning expertise required to tune traditional classifiers might not be available.
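
    As a rough illustration of what a rejection option looks like, the sketch below applies one very simple rule: reject a sample when the classifier's highest class probability falls below a threshold. This is an assumption made for the example, not one of the empirical rules from the paper, and it is written in Python rather than R; the function name, class labels, and the threshold value 0.7 are all hypothetical.

```python
def classify_with_rejection(probabilities, threshold=0.7):
    """Return the most probable class, or "rejected" if confidence is too low.

    `probabilities` maps each known class label to the probability the
    underlying classifier assigned to it for one sample.
    """
    label, p = max(probabilities.items(), key=lambda kv: kv[1])
    return label if p >= threshold else "rejected"

# A confident prediction is accepted; an ambiguous one is rejected.
confident = classify_with_rejection({"tumor_A": 0.91, "tumor_B": 0.06, "healthy": 0.03})
ambiguous = classify_with_rejection({"tumor_A": 0.40, "tumor_B": 0.35, "healthy": 0.25})
```

    The threshold is exactly the kind of parameter that the abstract proposes to tune automatically with an evolutionary strategy instead of by hand.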

    Statistical Reliability Estimation of Microprocessor-Based Systems

    What is the probability that the execution state of a given microprocessor running a given application is correct, in a certain working environment with a given soft-error rate? Trying to answer this question using fault injection can be very expensive and time consuming. This paper proposes the baseline for a new methodology, based on microprocessor error probability profiling, that aims at estimating fault injection results without the need for a typical fault injection setup. The proposed methodology is based on two main ideas: a one-time fault-injection analysis of the microprocessor architecture to characterize the probability of successful execution of each of its instructions in the presence of a soft error, and a static and very fast analysis of the control and data flow of the target software application to compute its probability of success. The presented work goes beyond the dependability evaluation problem; it also has the potential to become the backbone for new tools able to help engineers choose the best hardware and software architecture to structurally maximize the probability of a correct execution of the target software.
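
    The second idea can be illustrated with a deliberately simplified calculation. The sketch assumes a straight-line instruction trace and independent per-instruction outcomes, so the program's probability of success is just the product of the per-instruction probabilities; the paper's control- and data-flow analysis is not reproduced here. The profile values and the names `success_probability`, `profile`, and `trace` are hypothetical.

```python
import math

def success_probability(trace, p_success):
    """Probability that every instruction in the trace executes correctly,
    assuming independent per-instruction success probabilities."""
    return math.prod(p_success[op] for op in trace)

# Hypothetical per-instruction profile, as a one-time characterization
# of the microprocessor might provide.
profile = {"load": 0.95, "add": 0.99}
trace = ["load", "add", "add"]
p = success_probability(trace, profile)  # 0.95 * 0.99 * 0.99
```

    Once the profile exists, estimating a new application only requires its instruction mix, which is why this kind of analysis can be much faster than a full fault injection campaign.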

    Common integration sites of published datasets identified using a graph-based framework

    With next-generation sequencing, the genomic data available for the characterization of integration sites (IS) has dramatically increased. At present, in a single experiment, several thousand viral integration genome targets can be investigated to define genomic hot spots. In a previous article, we reformulated formal CIS analysis, replacing a rigid fixed-window demarcation with a more flexible definition grounded on graphs. Here, we present a selection of supporting data related to the graph-based framework (GBF) from our previous article, in which a collection of common integration sites (CIS) was identified on six published datasets. In this work, we focus on two datasets, ISRTCGD and ISHIV, which have been previously discussed. Moreover, we show in more detail the workflow design that originates the datasets.

    A New miRNA Motif Protects Pathways' Expression in Gene Regulatory Networks

    The continuing discovery of new functions and classes of small non-coding RNAs suggests the presence of regulatory mechanisms far more complex than the ones identified so far. In our computational analysis of a large set of publicly available databases, we found statistical evidence of an inter-pathway regulatory motif, not previously described, that reveals a new protective role miRNAs may play in the successful activation of a pathway. This paper reports the main outcomes of this analysis.