39 research outputs found

    A three-term recurrence relation for accurate evaluation of transition probabilities of the simple birth-and-death process

    The simple (linear) birth-and-death process is a widely used stochastic model for describing the dynamics of a population. When the process is observed discretely over time, despite the large amount of literature on the subject, little is known about formal estimator properties. Here we will show that its application to observed data is further complicated by the fact that numerical evaluation of the well-known transition probability is an ill-conditioned problem. To overcome this difficulty we will rewrite the transition probability in terms of a Gaussian hypergeometric function and subsequently obtain a three-term recurrence relation for its accurate evaluation. We will also study the properties of the hypergeometric function as a solution to the three-term recurrence relation. We will then provide formulas for the gradient and Hessian of the log-likelihood function and conclude the article by applying our methods for numerically computing maximum likelihood estimates on both simulated and real datasets. Peer reviewed.
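
    To make the "well-known transition probability" concrete, here is a minimal sketch of the classical closed-form sum for the simple birth-and-death process with birth rate lam and death rate mu (lam != mu). It is the textbook formula, not the three-term recurrence proposed in the article; the function names are hypothetical and chosen only for illustration. When 1 - alpha - beta is negative the sum alternates in sign, which is one way cancellation error can enter a direct evaluation like this one.

```python
import math


def bd_coefficients(lam, mu, t):
    """Return (alpha, beta) of the classical closed form; requires lam != mu."""
    if lam == mu:
        raise ValueError("this sketch only covers the case lam != mu")
    e = math.exp((lam - mu) * t)
    denom = lam * e - mu
    alpha = mu * (e - 1.0) / denom  # probability that one lineage is extinct by time t
    beta = lam * (e - 1.0) / denom
    return alpha, beta


def transition_probability(i, j, lam, mu, t):
    """P(N(t) = j | N(0) = i) by direct summation (illustrative, not robust)."""
    if i < 1 or j < 0:
        raise ValueError("need i >= 1 and j >= 0")
    alpha, beta = bd_coefficients(lam, mu, t)
    gamma = 1.0 - alpha - beta  # may be negative, giving an alternating sum
    total = 0.0
    for h in range(min(i, j) + 1):
        total += (
            math.comb(i, h)
            * math.comb(i + j - h - 1, i - 1)
            * alpha ** (i - h)
            * beta ** (j - h)
            * gamma ** h
        )
    return total
```

    For small populations the direct sum behaves well; for example, summing transition_probability(2, j, 0.5, 0.3, 1.0) over j = 0, ..., 199 gives a value numerically equal to 1. The article's concern is the regime where this direct evaluation loses accuracy.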

    Kpax3 : Bayesian bi-clustering of large sequence datasets

    Motivation: Estimation of the hidden population structure is an important step in many genetic studies. Often the aim is also to identify which sequence locations are the most discriminative between groups of samples for a given data partition. Automated discovery of interesting patterns that are present in the data can help to generate new biological hypotheses. Results: We introduce Kpax3, a Bayesian method for bi-clustering multiple sequence alignments. Influence of individual sites will be determined in a supervised manner by using informative prior distributions for the model parameters. Our inference method uses an implementation of both split-merge and Gibbs sampler type MCMC algorithms to traverse the joint posterior of partitions of samples and variables. We use a large Rotavirus sequence dataset to demonstrate the ability of Kpax3 to generate biologically important hypotheses about differential selective pressures across a virus protein. Peer reviewed.

    Bayesian cluster analysis with applications to pathogen population genomics

    Identifying similarity patterns in heterogeneous observations is a very common problem in many branches of science. When the similarities and dissimilarities are encoded by a group structure, the task of dividing the observed sample into an unknown number of homogeneous groups is known as cluster analysis. Among the many types of statistical data analyses, it is one of the most widely applied. In evolutionary biology, for example, the population structure plays an important role. Groups naturally arise as the result of evolutionary processes and, depending on the resolution of the study, clusters might represent similar molecules, organisms, or even species. With the huge amount of genetic data now freely available in online databases, cluster analysis is a valuable technique to better understand the evolution of organisms. In this dissertation we focus our attention on Bayesian approaches to model-based clustering. We review the mathematical formalization of the two most common methods, finite mixture models and product partition models, together with algorithms needed to draw inferences. We then introduce a novel Bayesian model which has been specifically designed to partition categorical data matrices. Finally, we show how cluster analysis is a very effective method for understanding the evolution of pathogens, and how this information is relevant to public health.

    Validation and Automation of a High-Throughput Multitargeted Method for Semiquantification of Endogenous Metabolites from Different Biological Matrices Using Tandem Mass Spectrometry

    The use of metabolomics profiling to understand the metabolism under different physiological states has increased in recent years, which created the need for robust analytical platforms. Here, we present a validated method for targeted and semiquantitative analysis of 102 polar metabolites that cover major metabolic pathways from 24 classes in a single 17.5-min assay. The method has been optimized for a wide range of biological matrices from various organisms, and involves automated sample preparation and data processing using an in-house developed R package. To ensure reliability, the method was validated for accuracy, precision, selectivity, specificity, linearity, recovery, and stability according to European Medicines Agency guidelines. We demonstrated an excellent repeatability of retention times (CV 0.980) in their respective wide dynamic concentration ranges (CV <3%), and concentrations (CV <25%) of quality control samples interspersed within 25 batches analyzed over a period of one year. The robustness was demonstrated through a high correlation between metabolite concentrations measured using our method and the NIST reference values (R² = 0.967), including cross-platform comparability against the BIOCRATES AbsoluteIDQp180 kit (R² = 0.975) and NMR analyses (R² = 0.884). We have shown that our method can be successfully applied in many biomedical research fields and clinical trials, including epidemiological studies for biomarker discovery. In summary, a thorough validation demonstrated that our method is reproducible, robust, reliable, and suitable for metabolomics studies. Peer reviewed.

    Combined gene essentiality scoring improves the prediction of cancer dependency maps

    Correction: Volume: 51; Article Number: UNSP 102594; DOI: 10.1016/j.ebiom.2019.12.003. Peer reviewed.

    SynergyFinder Plus: Toward Better Interpretation and Annotation of Drug Combination Screening Datasets

    Combinatorial therapies have been recently proposed to improve the efficacy of anticancer treatment. The SynergyFinder R package is a software tool used to analyze pre-clinical drug combination datasets. Here, we report the major updates to the SynergyFinder R package for improved interpretation and annotation of drug combination screening results. Unlike the existing implementations, the updated SynergyFinder R package includes five main innovations. 1) We extend the mathematical models to higher-order drug combination data analysis and implement dimension reduction techniques for visualizing the synergy landscape. 2) We provide a statistical analysis of drug combination synergy and sensitivity with confidence intervals and P values. 3) We incorporate a synergy barometer to harmonize multiple synergy scoring methods to provide a consensus metric for synergy. 4) We evaluate drug combination synergy and sensitivity to provide an unbiased interpretation of the clinical potential. 5) We enable fast annotation of drugs and cell lines, including their chemical and target information. These annotations will improve the interpretation of the mechanisms of action of drug combinations. To facilitate the use of the R package within the drug discovery community, we also provide a web server at www.synergyfinderplus.org as a user-friendly interface to enable a more flexible and versatile analysis of drug combination data. Peer reviewed.

    Convergent amino acid signatures in polyphyletic Campylobacter jejuni subpopulations suggest human niche tropism

    Human infection with the gastrointestinal pathogen Campylobacter jejuni is dependent upon the opportunity for zoonotic transmission and the ability of strains to colonize the human host. Certain lineages of this diverse organism are more common in human infection but the factors underlying this overrepresentation are not fully understood. We analyzed 601 isolate genomes from agricultural animals and human clinical cases, including isolates from the multihost (ecological generalist) ST-21 and ST-45 clonal complexes (CCs). Combined nucleotide and amino acid sequence analysis identified 12 human-only amino acid KPAX clusters among polyphyletic lineages within the common disease-causing CC21 group isolates, with no such clusters among CC45 isolates. Isolate sequence types within human-only CC21 group KPAX clusters have been sampled from other hosts, including poultry, so rather than representing unsampled reservoir hosts, the increase in relative frequency in human infection potentially reflects a genetic bottleneck at the point of human infection. Consistent with this, sequence enrichment analysis identified nucleotide variation in genes with putative functions related to human colonization and pathogenesis, in human-only clusters. Furthermore, the tight clustering and polyphyly of human-only lineage clusters within a single CC suggest the repeated evolution of human association through acquisition of genetic elements within this complex. Taken together, combined nucleotide and amino acid analysis of large isolate collections may provide clues about human niche tropism and the nature of the forces that promote the emergence of clinically important C. jejuni lineages. Peer reviewed.

    Drug combination sensitivity scoring facilitates the discovery of synergistic and efficacious drug combinations in cancer

    High-throughput drug screening has facilitated the discovery of drug combinations in cancer. Many existing studies adopted a full matrix design, aiming for the characterization of drug pair effects for cancer cells. However, the full matrix design may be suboptimal as it requires a drug pair to be combined at multiple concentrations in a full factorial manner. Furthermore, many of the computational tools assess only the synergy but not the sensitivity of drug combinations, which might lead to false positive discoveries. We proposed a novel cross design to enable a more cost-effective and simultaneous testing of drug combination sensitivity and synergy. We developed a drug combination sensitivity score (CSS) to determine the sensitivity of a drug pair, and showed that the CSS is highly reproducible between the replicates and thus supported its usage as a robust metric. We further showed that CSS can be predicted using machine learning approaches which determined the top pharmaco-features to cluster cancer cell lines based on their drug combination sensitivity profiles. To assess the degree of drug interactions using the cross design, we developed an S synergy score based on the difference between the drug combination and the single drug dose-response curves. We showed that the S score is able to detect true synergistic and antagonistic drug combinations at an accuracy level comparable to that using the full matrix design. Taken together, we showed that the cross design coupled with the CSS sensitivity and S synergy scoring methods may provide a robust and accurate characterization of both drug combination sensitivity and synergy levels, with minimal experimental materials required. Our experimental-computational approach could be utilized as an efficient pipeline for improving the discovery rate in high-throughput drug combination screening, particularly for primary patient samples which are difficult to obtain. Peer reviewed.