50 research outputs found

    FiCoS: A fine-grained and coarse-grained GPU-powered deterministic simulator for biochemical networks.

    Get PDF
    Mathematical models of biochemical networks can greatly facilitate the comprehension of the mechanisms underlying cellular processes, as well as the formulation of hypotheses that can be tested by means of targeted laboratory experiments. However, two issues might hamper the achievement of fruitful outcomes. On the one hand, detailed mechanistic models can involve hundreds or thousands of molecular species and their intermediate complexes, as well as hundreds or thousands of chemical reactions, a situation that commonly arises in rule-based modeling. On the other hand, the computational analysis of a model typically requires a large number of simulations for its calibration or to test the effect of perturbations. As a consequence, the capabilities of modern Central Processing Units can easily be exceeded, potentially making the modeling of biochemical networks an ineffective effort. With the aim of overcoming the limitations of current state-of-the-art simulation approaches, we present FiCoS, a novel "black-box" deterministic simulator that realizes both fine-grained and coarse-grained parallelization on Graphics Processing Units. In particular, FiCoS exploits two different integration methods, namely Dormand-Prince and Radau IIA, to efficiently solve both non-stiff and stiff systems of coupled Ordinary Differential Equations. We tested the performance of FiCoS against different deterministic simulators, considering models of increasing size and running analyses with increasing computational demands. FiCoS dramatically sped up the computations, by up to 855×, proving to be a promising solution for the simulation and analysis of large-scale models of complex biological processes.
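    The stiff/non-stiff split that FiCoS handles with Dormand-Prince and Radau IIA can be illustrated on the CPU with SciPy, whose `RK45` method is a Dormand-Prince pair and whose `Radau` method is a Radau IIA implementation. A minimal sketch with a toy reaction system, not FiCoS itself, which runs these solvers on the GPU:

```python
from scipy.integrate import solve_ivp

# Toy reversible reaction A <-> B with forward rate k1 and backward rate k2.
# Illustrative only: FiCoS implements these integrators on the GPU.
def reaction(t, y, k1, k2):
    a, b = y
    return [-k1 * a + k2 * b, k1 * a - k2 * b]

y0 = [1.0, 0.0]
t_span = (0.0, 10.0)

# Non-stiff parameterization: Dormand-Prince (SciPy's "RK45") is efficient.
sol_nonstiff = solve_ivp(reaction, t_span, y0, method="RK45", args=(1.0, 0.5))

# Stiff parameterization (rates spanning orders of magnitude): an implicit
# method such as Radau is far more robust than an explicit one.
sol_stiff = solve_ivp(reaction, t_span, y0, method="Radau", args=(1e4, 1.0))

print(sol_nonstiff.y[:, -1], sol_stiff.y[:, -1])
```

    At equilibrium the concentration of A tends to k2/(k1+k2); the point of having both methods is that an explicit pair would need tiny steps on the stiff parameterization, while the implicit Radau method does not.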

    ACDC: Automated Cell Detection and Counting for Time-Lapse Fluorescence Microscopy.

    Get PDF
    Advances in microscopy imaging technologies have enabled the visualization of live-cell dynamic processes using time-lapse microscopy. However, modern methods exhibit several limitations related to their training phases and to time constraints, hindering their application in laboratory practice. In this work, we present a novel method, named Automated Cell Detection and Counting (ACDC), designed for the detection of fluorescently labeled cell nuclei in time-lapse microscopy. ACDC overcomes the limitations of existing methods by first applying bilateral filtering to smooth the input cell images while preserving edge sharpness, and then exploiting the watershed transform and morphological filtering. Moreover, ACDC represents a feasible solution for laboratory practice, as it can leverage multi-core architectures in computer clusters to efficiently handle large-scale imaging datasets. Indeed, our Parent-Workers implementation of ACDC achieves up to a 3.7× speed-up compared to its sequential counterpart. ACDC was tested on two distinct cell imaging datasets to assess its accuracy and effectiveness on images with different characteristics. We achieved accurate cell counts and nuclei segmentation without relying on large-scale annotated datasets, a result confirmed by average Dice Similarity Coefficients of 76.84 and 88.64 and Pearson coefficients of 0.99 and 0.96, calculated against manual cell counts on the two tested datasets.
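    The pipeline described above (bilateral filtering, thresholding with morphological clean-up, then watershed) can be sketched with scikit-image on a synthetic two-nuclei image. This is an illustration of the general pipeline, not the ACDC implementation, and the specific parameter values are assumptions:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.filters import threshold_otsu
from skimage.restoration import denoise_bilateral
from skimage.segmentation import watershed

# Synthetic fluorescence-like image: two bright "nuclei" plus noise.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
img = np.zeros((64, 64))
for cy, cx in [(20, 20), (45, 42)]:
    img += np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / 40.0)
img += 0.05 * rng.standard_normal(img.shape)

# 1) Edge-preserving smoothing with a bilateral filter.
smooth = denoise_bilateral(np.clip(img, 0.0, 1.0))

# 2) Threshold, then morphological filtering to remove small artifacts.
mask = smooth > threshold_otsu(smooth)
mask = ndi.binary_opening(mask, iterations=2)

# 3) Watershed on the distance transform to split touching nuclei.
distance = ndi.distance_transform_edt(mask)
peaks = peak_local_max(distance, min_distance=5, labels=mask)
markers = np.zeros_like(mask, dtype=int)
markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
labels = watershed(-distance, markers, mask=mask)

print("cell count:", labels.max())
```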

    GenHap: a novel computational method based on genetic algorithms for haplotype assembly.

    Get PDF
    BACKGROUND: In order to fully characterize the genome of an individual, the reconstruction of the two distinct copies of each chromosome, called haplotypes, is essential. The computational problem of inferring the full haplotype of a cell starting from read sequencing data is known as haplotype assembly, and consists of assigning all heterozygous Single Nucleotide Polymorphisms (SNPs) to exactly one of the two chromosomes. Indeed, knowledge of complete haplotypes is generally more informative than the analysis of single SNPs and plays a fundamental role in many medical applications. RESULTS: To reconstruct the two haplotypes, we addressed the weighted Minimum Error Correction (wMEC) problem, a successful formulation of haplotype assembly. This NP-hard problem consists of computing the two haplotypes that partition the sequencing reads into two disjoint sub-sets with the least number of corrections to the SNP values. To this aim, we propose GenHap, a novel computational method for haplotype assembly based on Genetic Algorithms, yielding optimal solutions by means of a global search process. In order to evaluate the effectiveness of our approach, we ran GenHap on two synthetic (yet realistic) datasets based on the Roche/454 and PacBio RS II sequencing technologies. We compared the performance of GenHap against HapCol, an efficient state-of-the-art algorithm for haplotype phasing. Our results show that GenHap always obtains highly accurate solutions (in terms of haplotype error rate), and is up to 4× faster than HapCol on the Roche/454 instances and up to 20× faster on the PacBio RS II dataset. Finally, we assessed the performance of GenHap on two different real datasets. CONCLUSIONS: Future-generation sequencing technologies, producing longer reads with higher coverage, can highly benefit from GenHap, thanks to its capability of efficiently solving large instances of the haplotype assembly problem. Moreover, the optimization approach proposed in GenHap can be extended to the study of allele-specific genomic features, such as expression, methylation and chromatin conformation, by exploiting multi-objective optimization techniques. The source code and the full documentation are available at the following GitHub repository: https://github.com/andrea-tango/GenHap
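    The wMEC objective described above can be made concrete with a small sketch: given a read matrix and a candidate bipartition of the reads, the cost is the total weight of allele corrections needed to make each group consistent with a consensus haplotype. This is an illustrative fitness function, not GenHap's actual code:

```python
import numpy as np

def wmec_cost(reads, weights, partition):
    """Weighted MEC cost of assigning each read to haplotype 0 or 1.

    reads:     (n_reads, n_snps) matrix with 0/1 alleles, -1 = not covered
    weights:   per-allele confidence weights, same shape as `reads`
    partition: 0/1 haplotype assignment for each read

    Illustrative fitness only: GenHap evolves `partition` with a genetic
    algorithm so as to minimize exactly this kind of cost.
    """
    cost = 0.0
    for h in (0, 1):
        sub = reads[partition == h]
        wsub = weights[partition == h]
        for j in range(reads.shape[1]):
            covered = sub[:, j] >= 0
            if not covered.any():
                continue
            w = wsub[covered, j]
            alleles = sub[covered, j]
            # The weighted majority allele becomes the consensus; reads
            # that disagree contribute their weight as corrections.
            cost += min(w[alleles == 0].sum(), w[alleles == 1].sum())
    return cost

reads = np.array([[0, 0, 1, -1],
                  [0, 0, 1,  1],
                  [1, 1, 0,  0],
                  [1, 1, -1, 0]])
weights = np.ones_like(reads, dtype=float)
grouped = np.array([0, 0, 1, 1])   # reads split by true haplotype
mixed = np.array([0, 1, 0, 1])     # reads from both haplotypes mixed
print(wmec_cost(reads, weights, grouped), wmec_cost(reads, weights, mixed))
```

    The correct bipartition costs nothing, while the mixed one pays one correction for every conflicting allele, which is the signal the genetic algorithm's global search exploits.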

    A swarm intelligence approach to avoid local optima in fuzzy c-means clustering

    No full text
    Clustering analysis is an important computational task with applications in many domains. One of the most popular algorithms for the clustering problem is fuzzy c-means, which exploits notions from fuzzy logic to provide a smooth partitioning of the data into classes, allowing the possibility of multiple membership for each data sample. The fuzzy c-means algorithm is based on the optimization of a partitioning function, which minimizes inter-cluster similarity. This optimization problem is known to be NP-hard, and it is generally tackled with a hill climbing method, a local optimizer that provides acceptable but sub-optimal solutions, since it is sensitive to initialization and tends to get stuck in local optima. In this work we propose an alternative approach based on the swarm intelligence global optimization method Fuzzy Self-Tuning Particle Swarm Optimization (FST-PSO). We solve the fuzzy clustering task by optimizing the fuzzy c-means partitioning function with FST-PSO. We show that this population-based metaheuristic is more effective than hill climbing, providing high-quality solutions at the cost of additional computational complexity. Notably, since this particle swarm optimization algorithm is self-tuning, the user does not have to specify additional hyperparameters for the optimization process.
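    The search space tackled by the swarm optimizer can be sketched as follows: for fixed cluster centers the optimal fuzzy memberships have a closed form, so the fuzzy c-means partitioning function becomes a function of the center coordinates alone, which any global optimizer can then minimize. A minimal NumPy sketch, not the FST-PSO library itself:

```python
import numpy as np

def fcm_objective(centers_flat, data, n_clusters, m=2.0):
    """Fuzzy c-means cost J_m as a function of the cluster centers only.

    For fixed centers the optimal memberships u_ij have a closed form, so
    minimizing over the center coordinates solves the clustering task.
    Minimal sketch of the fitness a swarm optimizer such as FST-PSO can
    minimize; it is not the FST-PSO library itself.
    """
    centers = centers_flat.reshape(n_clusters, -1)
    # Squared distance between every data point and every center.
    d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)                 # guard against division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    u = inv / inv.sum(axis=1, keepdims=True)   # closed-form memberships
    return ((u ** m) * d2).sum()

rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.3, (30, 2)),
                  rng.normal(3.0, 0.3, (30, 2))])

good = np.array([0.0, 0.0, 3.0, 3.0])   # one center on each cluster
bad = np.array([1.5, 1.5, 1.6, 1.6])    # both centers between the clusters
print(fcm_objective(good, data, 2), fcm_objective(bad, data, 2))
```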

    A novel multi-objective approach to fuzzy clustering

    No full text
    A clustering algorithm is an unsupervised method that aims to divide data points into two or more groups. These algorithms generally rely on the optimization of a single criterion to find optimal cluster structures. This choice might lead to cluster structures of poor quality, and does not reflect how humans generally rely on multiple (possibly conflicting) criteria when grouping similar elements together. In this paper, we apply a different approach based on multi-objective optimization to solve the problem of fuzzy clustering. Specifically, we combine the objective function of the popular fuzzy c-means algorithm with a second objective function, which aims at maximizing the number of data points having a high degree of membership to one of the clusters. The rationale is that data points close to a cluster center have a high membership value, while data points in between cluster centers share their membership between the different clusters: by optimizing the second criterion, we expect an improvement in the quality of the resulting clustering structure. We perform the multi-objective optimization by means of the Non-dominated Sorting Genetic Algorithm II (NSGA-II), a multi-objective evolutionary global optimization algorithm. Our results show that a multi-objective approach to fuzzy clustering can generate solutions of higher quality than classic fuzzy c-means, on both synthetic and real-world data sets.
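    The second criterion can be sketched as follows; the membership threshold used to decide when a point clearly belongs to a cluster (0.8 here) is an assumed value for illustration, since the abstract only states that points with a high maximum membership should be rewarded:

```python
import numpy as np

def memberships(data, centers, m=2.0):
    """Closed-form fuzzy c-means memberships for fixed cluster centers."""
    d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def crispness(u, threshold=0.8):
    """Second objective: fraction of points with a clearly dominant cluster.

    The 0.8 threshold is an assumed value; the abstract only states that
    points with a high maximum membership should be rewarded.
    """
    return (u.max(axis=1) >= threshold).mean()

rng = np.random.default_rng(2)
data = np.vstack([rng.normal(0.0, 0.3, (25, 2)),
                  rng.normal(4.0, 0.3, (25, 2))])

on_clusters = np.array([[0.0, 0.0], [4.0, 4.0]])  # well-placed centers
squeezed = np.array([[1.9, 1.9], [2.1, 2.1]])     # centers between clusters
u_good = memberships(data, on_clusters)
u_bad = memberships(data, squeezed)
print(crispness(u_good), crispness(u_bad))
```

    Well-placed centers give almost every point a dominant cluster, while centers squeezed between the groups leave every point with ambiguous membership, so maximizing this criterion alongside J_m rewards crisp cluster structures.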

    Surfing on fitness landscapes: A boost on optimization by Fourier surrogate modeling

    Get PDF
    Surfing in rough waters is not always as fun as riding "the big one". Similarly, in optimization problems, fitness landscapes with a huge number of local optima make the search for the global optimum a hard and often frustrating game. Computational Intelligence optimization metaheuristics use a set of individuals that "surf" across the fitness landscape, sharing and exploiting pieces of information about local fitness values in a joint effort to find the global optimum. In this context, we designed surF, a novel surrogate modeling technique that leverages the discrete Fourier transform to generate a smoother, and possibly easier to explore, fitness landscape. The rationale behind this idea is that filtering out the high frequencies of the fitness function and keeping only its partial information (i.e., the low frequencies) can actually be beneficial in the optimization process. We test this idea by combining surF with a settings-free variant of Particle Swarm Optimization (PSO) based on fuzzy logic, called Fuzzy Self-Tuning PSO. Specifically, we introduce a new algorithm, named F3ST-PSO, which performs a preliminary exploration on the surrogate model followed by a second optimization using the actual fitness function. We show that F3ST-PSO can lead to improved performance, notably while using the same budget of fitness evaluations.
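    The core idea, low-pass filtering a rugged fitness landscape via the discrete Fourier transform, can be sketched in one dimension with NumPy (a toy landscape and an assumed cut-off, not the surF implementation):

```python
import numpy as np

# Toy rugged 1-D fitness: a smooth bowl plus high-frequency ripples that
# create many local optima (not a landscape from the paper).
x = np.linspace(-5.0, 5.0, 512)
fitness = x ** 2 + 3.0 * np.sin(25.0 * x)

# Low-pass surrogate via the discrete Fourier transform: keep only the
# lowest-frequency components, zero out the rest, transform back.
spectrum = np.fft.rfft(fitness)
k_keep = 8                      # assumed cut-off
spectrum[k_keep:] = 0.0
surrogate = np.fft.irfft(spectrum, n=len(x))

def local_minima(f):
    """Count strict interior local minima of a sampled 1-D function."""
    return int(((f[1:-1] < f[:-2]) & (f[1:-1] < f[2:])).sum())

print(local_minima(fitness), local_minima(surrogate))
```

    The surrogate keeps the coarse bowl shape but loses nearly all of the ripples' local optima, which is what makes the preliminary exploration phase cheaper before the second run on the true fitness function.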

    The Impact of Variable Selection and Transformation on the Interpretability and Accuracy of Fuzzy Models

    No full text
    Data transformation is an important step in Machine Learning pipelines that can strongly improve their performance. For instance, min-max normalization is often used to make all variables lie in the same range, while log-transformation is used to map data scattered across several orders of magnitude into a logarithmic space. Such transformations can be beneficial when the machine learning approach measures distances in a metric space, as cluster-based approaches do. These two transformations can be combined to reveal hidden patterns in log-normally distributed data, which commonly occur in biological and medical datasets. In this work we introduce a novel evolutionary approach designed to automatically determine the optimal log-transformation and selection of variables. Our approach is built around an interpretable AI system (created by pyFUME), so that all transformations are followed by inverse transformations that map the values back into the original universe of discourse and preserve the interpretability of the results. We test our approach on two synthetic datasets designed to reproduce a condition in which some variables are normally distributed, some are log-normally distributed, and some are just noise. Our results show that our approach yields better-performing models than conventional methods, and that the resulting models are also characterised by better interpretability, making this approach particularly useful for studying biomedical datasets.
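    A minimal sketch of the combined transformation, using a simple skewness heuristic in place of the paper's evolutionary search (an assumption for illustration), shows how the log and min-max steps compose and how their inverses map values back into the original universe of discourse:

```python
import numpy as np

def fit_transform(x):
    """Log-transform a variable if it looks log-normal, then min-max scale.

    The paper selects transformations with an evolutionary search; the
    skewness heuristic below merely stands in for that choice. Returns the
    scaled values plus the parameters needed to invert the mapping.
    """
    skew = ((x - x.mean()) ** 3).mean() / x.std() ** 3
    use_log = bool(skew > 1.0 and (x > 0).all())
    t = np.log(x) if use_log else x
    lo, hi = t.min(), t.max()
    return (t - lo) / (hi - lo), (use_log, lo, hi)

def inverse_transform(z, params):
    """Map transformed values back into the original universe of discourse."""
    use_log, lo, hi = params
    t = z * (hi - lo) + lo
    return np.exp(t) if use_log else t

rng = np.random.default_rng(3)
x = rng.lognormal(mean=2.0, sigma=1.0, size=500)   # a log-normal variable
z, params = fit_transform(x)
x_back = inverse_transform(z, params)
print(params[0], np.allclose(x, x_back))
```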

    pyFUME: a Python Package for Fuzzy Model Estimation

    No full text
    Living in the era of "data deluge" demands an increase in the application and development of machine learning methods, in both basic and applied research. Among these methods, in the last decades fuzzy inference systems have carved out their own niche as (light) grey-box models, which are considered more interpretable and transparent than other commonly employed methods, such as artificial neural networks. Although commercially distributed alternatives are available, software able to assist practitioners and researchers in each step of the estimation of a fuzzy model from data is still limited in scope and applicability. This is especially true for software developed in Python, a programming language that quickly gained popularity among data scientists and is often considered their language of choice. To fill this gap, we introduce pyFUME, a Python library for automatically estimating fuzzy models from data. pyFUME contains a set of classes and methods to estimate the antecedent sets and the consequent parameters of a Takagi-Sugeno fuzzy model from data, and then creates an executable fuzzy model by exploiting the Simpful library. pyFUME can be beneficial to practitioners, thanks to its pre-implemented and user-friendly pipelines, but also to researchers who want to fine-tune each step of the estimation process.
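    The model class that pyFUME estimates, a Takagi-Sugeno fuzzy system, can be sketched by hand in NumPy. This shows the inference step only, with hand-picked sets and consequents, whereas pyFUME learns both from data (an illustrative sketch, not pyFUME's API):

```python
import numpy as np

def gauss(x, mu, sigma):
    """Gaussian membership function."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def ts_infer(x, rules):
    """First-order Takagi-Sugeno inference for a single input variable.

    Each rule is (mu, sigma, a, b): a Gaussian antecedent set and a linear
    consequent y = a*x + b. The output is the firing-strength-weighted
    average of the rule consequents. pyFUME estimates the sets and the
    consequent parameters from data; here they are picked by hand.
    """
    w = np.array([gauss(x, mu, s) for mu, s, _, _ in rules])
    y = np.array([a * x + b for _, _, a, b in rules])
    return (w * y).sum() / w.sum()

# Two rules: "IF x IS low THEN y = 0" and "IF x IS high THEN y = x".
rules = [(0.0, 1.0, 0.0, 0.0),
         (5.0, 1.0, 1.0, 0.0)]
print(ts_infer(0.0, rules), ts_infer(5.0, rules))
```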

    Reboot strategies in particle swarm optimization and their impact on parameter estimation of biochemical systems

    No full text
    Computational methods adopted in the field of Systems Biology require complete knowledge of the reaction kinetic constants to simulate the dynamics and understand the emergent behavior of biochemical systems. However, the kinetic parameters of biochemical reactions are often difficult or impossible to measure, so they are generally inferred from experimental data, in a process known as Parameter Estimation (PE). We consider here a PE methodology that exploits Particle Swarm Optimization (PSO) to estimate an appropriate kinetic parameterization, by comparing experimental time-series target data with in silico dynamics simulated using the parameterization encoded by each particle. In this work we present three different reboot strategies for PSO, whose aim is to reinitialize particle positions to prevent particles from getting trapped in local optima, and we compare the performance of PSO coupled with these reboot strategies against standard PSO on the PE of two biochemical systems. Since PE requires a huge number of simulations at each iteration, we exploit a GPU-powered deterministic simulator, cupSODA, which performs all simulations and fitness evaluations in parallel. Finally, we show that the performance of our implementation scales sublinearly with the swarm size, even on outdated GPUs.
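    A stagnation-based reboot is one plausible reading of such strategies; the trigger criterion and window size below are assumptions for illustration, not the paper's exact variants, and a simple analytic fitness stands in for the simulation-based PE objective:

```python
import numpy as np

rng = np.random.default_rng(4)

def sphere(x):
    """Toy fitness standing in for the simulation-based PE objective."""
    return (x ** 2).sum(axis=1)

dim, n, iters, window = 2, 20, 200, 10
lo, hi = -5.0, 5.0
pos = rng.uniform(lo, hi, (n, dim))
vel = np.zeros((n, dim))
pbest, pbest_f = pos.copy(), sphere(pos)
stall = np.zeros(n, dtype=int)       # iterations since last improvement
reboots = 0

for _ in range(iters):
    gbest = pbest[pbest_f.argmin()]
    r1, r2 = rng.random((n, dim)), rng.random((n, dim))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    f = sphere(pos)
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
    stall = np.where(improved, 0, stall + 1)
    # Reboot: particles stuck for `window` iterations are re-initialized
    # at random positions (their personal bests are kept).
    stuck = stall >= window
    if stuck.any():
        pos[stuck] = rng.uniform(lo, hi, (int(stuck.sum()), dim))
        vel[stuck] = 0.0
        stall[stuck] = 0
        reboots += int(stuck.sum())

print(pbest_f.min(), reboots)
```

    Keeping the personal bests while resetting positions lets rebooted particles explore fresh regions without losing the information already gathered by the swarm.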
