39 research outputs found

    Quantifying stability in gene list ranking across microarray derived clinical biomarkers

    Background: Identifying stable gene lists for diagnosis, prognosis prediction, and treatment guidance of tumors remains a major challenge in cancer research. Microarrays measuring differential gene expression are widely used and should be versatile predictors of disease and other phenotypic data. However, gene expression profile studies and predictive biomarkers are often of low power, requiring numerous samples for sound statistics, or vary between studies. Given the inconsistency of results across similar studies, methods that identify robust biomarkers from microarray data are needed to relay true biological information. Here we present a method to demonstrate that gene list stability and predictive power depend not only on the size of studies, but also on the clinical phenotype. Results: Our method projects genomic tumor expression data to a lower dimensional space representing the main variation in the data. Some information regarding the phenotype resides in this low dimensional space, while some information resides in the residuum. We then introduce an information ratio (IR) as a metric defined by the partition between projected and residual space. Upon grouping phenotypes such as tumor tissue, histological grades, relapse, or aging, we show that higher IR values correlated with phenotypes that yield less robust biomarkers whereas lower IR values showed higher transferability across studies. Our results indicate that the IR is correlated with predictive accuracy. When tested across different published datasets, the IR can identify information-rich data characterizing clinical phenotypes and stable biomarkers. Conclusions: The IR presents a quantitative metric to estimate the information content of gene expression data with respect to particular phenotypes.

    İşret, kumar, nisvan belası

    The first and last serialized installments of Paul de Kock's novel İşret, Kumar, Nisvan Belası, published in Sabah.

    Emodepside targets SLO-1 channels of Onchocerca ochengi and induces broad anthelmintic effects in a bovine model of onchocerciasis

    Onchocerciasis (river blindness), caused by the filarial worm Onchocerca volvulus, is a neglected tropical disease mostly affecting sub-Saharan Africa and is responsible for >1.3 million years lived with disability. Current control relies almost entirely on ivermectin, which suppresses symptoms caused by the first-stage larvae (microfilariae) but does not kill the long-lived adults. Here, we evaluated emodepside, a semi-synthetic cyclooctadepsipeptide registered for deworming applications in companion animals, for activity against adult filariae (i.e., as a macrofilaricide). We demonstrate the equivalence of emodepside activity on SLO-1 potassium channels in Onchocerca volvulus and Onchocerca ochengi, its sister species from cattle. Evaluation of emodepside in cattle as single or 7-day treatments at two doses (0.15 and 0.75 mg/kg) revealed rapid activity against microfilariae, prolonged suppression of female worm fecundity, and macrofilaricidal effects by 18 months post treatment. The drug was well tolerated, causing only transiently increased blood glucose. Female adult worms were mostly paralyzed; however, some retained metabolic activity even in the multiple high-dose group. These data support ongoing clinical development of emodepside to treat river blindness.

    Prediction Errors in Learning Drug Response from Gene Expression Data - Influence of Labeling, Sample Size, and Machine Learning Algorithm

    Model-based prediction depends on many choices, ranging from the sample collection and prediction endpoint to the choice of algorithm and its parameters. Here we studied the effects of such choices, exemplified by predicting the sensitivity (as IC50) of cancer cell lines to a variety of compounds. For this, we used three independent sample collections and applied several machine learning algorithms to predict a variety of drug-response endpoints. We compared all possible models for combinations of sample collection, algorithm, drug, and labeling to an identically generated null model. The predictability of treatment effects varies among compounds, i.e., response could be predicted for some but not for all. The choice of sample collection plays a major role in lowering the prediction error, as does sample size. However, we found that no algorithm consistently outperformed the others, and there was no significant difference between regression and two- or three-class predictors in this experimental setting. These results indicate that response-modeling projects should direct efforts mainly towards sample collection and data quality rather than method adjustment.
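The comparison of a model against an identically generated null model can be illustrated with a small sketch. The abstract does not say how the null model was built, so everything here is an assumption for illustration: the null is obtained by shuffling the response labels, the predictor is a toy 1-nearest-neighbour regressor, and the names `loo_error` and `null_errors` are hypothetical.

```python
import random

def loo_error(xs, ys):
    """Leave-one-out mean absolute error of a 1-nearest-neighbour predictor."""
    total = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        # Nearest other sample by feature distance (ties go to the lower index).
        j = min((k for k in range(len(xs)) if k != i),
                key=lambda k: abs(xs[k] - x))
        total += abs(ys[j] - y)
    return total / len(xs)

def null_errors(xs, ys, n_perm=200, seed=0):
    """Errors of the same procedure on shuffled labels (assumed null model)."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_perm):
        shuffled = ys[:]
        rng.shuffle(shuffled)  # break any feature-response relationship
        errors.append(loo_error(xs, shuffled))
    return errors

# Toy data: the response depends linearly on a single feature.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]

real = loo_error(xs, ys)
null = null_errors(xs, ys)

# Empirical p-value: fraction of null runs that do at least as well as the real model.
p_value = (1 + sum(e <= real for e in null)) / (1 + len(null))
print(real, min(null), p_value)
```

Running the real and null pipelines through the same function keeps the comparison fair: any gain over the null can then be attributed to the labels rather than to the procedure.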

    Comparison of drug response value distributions between panels.

    Overlaid density plots for all drugs that are in the NCI60 and BPH panels.

    Clustering Protein Sequences - Structure Prediction by transitive homology

    It is widely believed that for two proteins A and B, a sequence identity above some threshold implies structural similarity. It is not fully understood whether transitivity holds, that is, whether, when the sequence similarity between A and B is below this threshold, the existence of a third protein that is sufficiently similar in sequence to both A and B suffices for inferring structural similarity of A and B. We examined the protein sequences in the SwissProt database. Their similarity was determined using the Smith-Waterman algorithm. These data were transformed into a directed graph in which protein sequences constitute the vertices. A directed edge was drawn from vertex A to vertex B if the sequences A and B showed similarity above a fixed threshold. A length-dependent scaling of the alignment scores provides a criterion to avoid clustering errors due to multi-domain proteins. To deal with the resulting large graphs, we have developed a very efficient library. Methods include both a novel graph-based clustering algorithm capable of handling multi-domain proteins and cluster comparison algorithms. The parameters of these algorithms were fine-tuned using SCOP as a test set. We will present our algorithmic advances yielding a 24 percent improvement over pair-wise comparisons, statistics of the clusterings obtained, and general methodology relevant for testing our hypothesis.
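The graph construction described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' library: the score dictionary, the threshold value, and the function names `build_graph` and `transitive_pairs` are hypothetical, and the length-dependent scaling of alignment scores is left out.

```python
from collections import defaultdict

def build_graph(scores, threshold):
    """Directed graph: edge a -> b if the similarity score of (a, b) exceeds the threshold.

    `scores` maps ordered pairs (a, b) to a similarity score (assumed input format).
    """
    graph = defaultdict(set)
    for (a, b), score in scores.items():
        if score > threshold:
            graph[a].add(b)
    return graph

def transitive_pairs(graph):
    """Pairs (a, c) linked only through an intermediate b: a -> b -> c, no edge a -> c."""
    pairs = set()
    for a, neighbours in list(graph.items()):
        for b in neighbours:
            for c in graph.get(b, ()):
                if c != a and c not in neighbours:
                    pairs.add((a, c))
    return pairs

# Toy scores: A-B and B-C are similar, A-C falls below the threshold.
scores = {
    ("A", "B"): 120, ("B", "A"): 120,
    ("B", "C"): 110,
    ("A", "C"): 40, ("C", "A"): 40,
}
graph = build_graph(scores, threshold=100)
candidates = transitive_pairs(graph)
print(candidates)
```

Here the intermediate protein is called `b` and the candidate pair is `(a, c)`. Each such pair is only a candidate for inferred structural similarity; whether the inference holds is the open question the abstract raises.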

    Clustering protein sequences -- structure prediction by transitive homology

    Motivation: It is widely believed that for two proteins A and B a sequence identity above some threshold implies structural similarity due to a common evolutionary ancestor. Since this is only a sufficient, but not a necessary condition for structural similarity, the question remains what other criteria can be used to identify remote homologues. Transitivity refers to the concept of deducing a structural similarity between proteins A and C from the existence of a third protein B, such that A and B as well as B and C are homologues, as ascertained if the sequence identity between A and B as well as that between B and C is above the aforementioned threshold. It is not fully understood, if transitivity always holds and whethe