
    Ensembles of jittered association rule classifiers

    The ensembling of classifiers tends to improve predictive accuracy. To obtain an ensemble with N classifiers, one typically needs to run N learning processes. In this paper we introduce and explore Model Jittering Ensembling, where one single model is perturbed in order to obtain variants that can be used as an ensemble. We use as base classifiers sets of classification association rules. The two methods of jittering ensembling we propose are Iterative Reordering Ensembling (IRE) and Post Bagging (PB). Both methods start by learning one rule set over a single run, and then produce multiple rule sets without relearning. Empirical results on 36 data sets are positive and show that both strategies tend to reduce error with respect to the single model association rule classifier. A bias–variance analysis reveals that while both IRE and PB are able to reduce the variance component of the error, IRE is particularly effective in reducing the bias component. We show that Model Jittering Ensembling can represent a very good speed-up with respect to multiple model learning ensembling. We also compare Model Jittering with various state-of-the-art classifiers in terms of predictive accuracy and computational efficiency. This work was partially supported by FCT project Rank! (PTDC/EIA/81178/2006) and by AdI project Palco3.0 financed by QREN and Fundo Europeu de Desenvolvimento Regional (FEDER), and also supported by Fundacao Ciencia e Tecnologia, FEDER e Programa de Financiamento Plurianual de Unidades de I&D. Thanks are due to William Cohen for kindly providing the executable code for the SLIPPER implementation. Our gratitude goes also to our anonymous reviewers who have helped to significantly improve this paper by sharing their knowledge and their informed criticism with the authors.
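
    The idea of producing several rule sets from one learned rule set, without relearning, can be illustrated with a minimal sketch. This is not the authors' IRE or PB procedure: it assumes, purely for illustration, that each variant is obtained by re-scoring the fixed rules on a bootstrap sample and reordering them by confidence, and that the ensemble predicts by majority vote. The names `confidence`, `jitter_rule_sets`, `predict_one` and `predict_ensemble` are hypothetical.

```python
import random
from collections import Counter

# Assumption: a rule is (antecedent, label) and fires on an example
# when every item of the antecedent is present in the example.

def confidence(rule, sample):
    """Fraction of covered examples in `sample` that carry the rule's label."""
    antecedent, label = rule
    covered = [y for x, y in sample if antecedent <= x]
    if not covered:
        return 0.0
    return sum(1 for y in covered if y == label) / len(covered)

def jitter_rule_sets(rules, data, n_variants, seed=0):
    """Build variants of ONE learned rule set; no rules are mined again."""
    rng = random.Random(seed)
    variants = []
    for _ in range(n_variants):
        # Bootstrap sample: draw len(data) examples with replacement.
        sample = [rng.choice(data) for _ in data]
        # Perturb the model by reordering the same rules on this sample.
        ordered = sorted(rules, key=lambda r: confidence(r, sample), reverse=True)
        variants.append(ordered)
    return variants

def predict_one(ordered_rules, x, default):
    """Return the label of the first rule that fires, else the default class."""
    for antecedent, label in ordered_rules:
        if antecedent <= x:
            return label
    return default

def predict_ensemble(variants, x, default):
    """Majority vote over the jittered rule sets."""
    votes = Counter(predict_one(v, x, default) for v in variants)
    return votes.most_common(1)[0][0]
```

    Because every variant reuses the same rules, the cost of building the ensemble in this sketch is dominated by re-scoring rules rather than by repeated rule mining, which is the kind of saving the abstract attributes to model jittering.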

    Shape recognition through multi-level fusion of features and classifiers

    Shape recognition is a fundamental problem and a special type of image classification, where each shape is considered as a class. Current approaches to shape recognition mainly focus on designing low-level shape descriptors and classifying them using machine learning approaches. In order to achieve effective learning of shape features, it is essential to ensure that a comprehensive set of high-quality features can be extracted from the original shape data. Thus we have been motivated to develop methods of fusion of features and classifiers for advancing the classification performance. In this paper, we propose a multi-level framework for fusion of features and classifiers in the setting of granular computing. The proposed framework involves creation of diversity among classifiers, through adopting feature selection and fusion to create diverse feature sets and to train diverse classifiers using different learning algorithms. The experimental results show that the proposed multi-level framework can effectively create diversity among classifiers, leading to considerable advances in the classification performance.

    MAGE-A cancer/testis antigens inhibit MDM2 ubiquitylation function and promote increased levels of MDM4

    Melanoma antigen A (MAGE-A) proteins comprise a structurally and biochemically similar sub-family of Cancer/Testis antigens that are expressed in many cancer types and are thought to contribute actively to malignancy. MAGE-A proteins are established regulators of certain cancer-associated transcription factors, including p53, and are activators of several RING finger-dependent ubiquitin E3 ligases. Here, we show that MAGE-A2 associates with MDM2, a ubiquitin E3 ligase that mediates ubiquitylation of more than 20 substrates including mainly p53, MDM2 itself, and MDM4, a potent p53 inhibitor and MDM2 partner that is structurally related to MDM2. We find that MAGE-A2 interacts with MDM2 via the N-terminal p53-binding pocket and the RING finger domain of MDM2 that is required for homo/hetero-dimerization and for E2 ligase interaction. Consistent with these data, we show that MAGE-A2 is a potent inhibitor of the E3 ubiquitin ligase activity of MDM2, yet it does not have any significant effect on p53 turnover mediated by MDM2. Strikingly, however, increased MAGE-A2 expression leads to reduced ubiquitylation and increased levels of MDM4. Similarly, silencing of endogenous MAGE-A expression diminishes MDM4 levels in a manner that can be rescued by the proteasomal inhibitor bortezomib, and permits increased MDM2/MDM4 association. These data suggest that MAGE-A proteins can: (i) uncouple the ubiquitin ligase and degradation functions of MDM2; (ii) act as potent inhibitors of E3 ligase function; and (iii) regulate the turnover of MDM4. We also find an association between the presence of MAGE-A and increased MDM4 levels in primary breast cancer, suggesting that MAGE-A-dependent control of MDM4 levels is clinically relevant to cancer.

    Predicting a small molecule-kinase interaction map: A machine learning approach

    Background: We present a machine learning approach to the problem of protein ligand interaction prediction. We focus on a set of binding data obtained from 113 different protein kinases and 20 inhibitors. It was attained through ATP site-dependent binding competition assays and constitutes the first available dataset of this kind. We extract information about the investigated molecules from various data sources to obtain an informative set of features. Results: A Support Vector Machine (SVM) as well as a decision tree algorithm (C5/See5) is used to learn models based on the available features which in turn can be used for the classification of new kinase-inhibitor pair test instances. We evaluate our approach using different feature sets and parameter settings for the employed classifiers. Moreover, the paper introduces a new way of evaluating predictions in such a setting, where different amounts of information about the binding partners can be assumed to be available for training. Results on an external test set are also provided. Conclusions: In most of the cases, the presented approach clearly outperforms the baseline methods used for comparison. Experimental results indicate that the applied machine learning methods are able to detect a signal in the data and predict binding affinity to some extent. For SVMs, the binding prediction can be improved significantly by using features that describe the active site of a kinase. For C5, besides diversity in the feature set, alignment scores of conserved regions turned out to be very useful.

    Extensive Copy-Number Variation of Young Genes across Stickleback Populations

    MM received funding from the Max Planck innovation funds for this project. PGDF was supported by a Marie Curie European Reintegration Grant (proposal nr 270891). CE was supported by German Science Foundation grants (DFG, EI 841/4-1 and EI 841/6-1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript

    Genetic determinants of co-accessible chromatin regions in activated T cells across humans.

    Over 90% of genetic variants associated with complex human traits map to non-coding regions, but little is understood about how they modulate gene regulation in health and disease. One possible mechanism is that genetic variants affect the activity of one or more cis-regulatory elements leading to gene expression variation in specific cell types. To identify such cases, we analyzed ATAC-seq and RNA-seq profiles from stimulated primary CD4+ T cells in up to 105 healthy donors. We found that regions of accessible chromatin (ATAC-peaks) are co-accessible at kilobase and megabase resolution, consistent with the three-dimensional chromatin organization measured by in situ Hi-C in T cells. Fifteen percent of genetic variants located within ATAC-peaks affected the accessibility of the corresponding peak (local-ATAC-QTLs). Local-ATAC-QTLs have the largest effects on co-accessible peaks, are associated with gene expression and are enriched for autoimmune disease variants. Our results provide insights into how natural genetic variants modulate cis-regulatory elements, in isolation or in concert, to influence gene expression

    A transcriptomic snapshot of early molecular communication between Pasteuria penetrans and Meloidogyne incognita

    © The Author(s) 2018. Background: Southern root-knot nematode Meloidogyne incognita (Kofoid and White, 1919), Chitwood, 1949 is a key pest of agricultural crops. Pasteuria penetrans is a hyperparasitic bacterium capable of suppressing nematode reproduction, and the pair represents a typical coevolved pathogen-hyperparasite system. Attachment of Pasteuria endospores to the cuticle of second-stage nematode juveniles is the first and pivotal step in the bacterial infection. RNA-Seq was used to understand the early transcriptional response of the root-knot nematode at 8 h post Pasteuria endospore attachment. Results: A total of 52,485 transcripts were assembled from the high quality (HQ) reads, out of which 582 transcripts were found differentially expressed in the Pasteuria endospore encumbered J2s, of which 229 were up-regulated and 353 were down-regulated. Pasteuria infection caused a suppression of the protein synthesis machinery of the nematode. Several of the differentially expressed transcripts were putatively involved in nematode innate immunity, signaling, stress responses, endospore attachment process and post-attachment behavioral modification of the juveniles. The expression profiles of fifteen selected transcripts were validated by qRT-PCR. RNAi-based silencing of transcripts coding for fructose bisphosphate aldolase and glucosyl transferase caused a reduction in endospore attachment as compared to the controls, whereas silencing of aspartic protease and ubiquitin coding transcripts resulted in higher incidence of endospore attachment on the nematode cuticle. Conclusions: Here we provide evidence of an early transcriptional response by the nematode upon infection by Pasteuria prior to root invasion. We found that adhesion of Pasteuria endospores to the cuticle induced a down-regulated protein response in the nematode. In addition, we show that fructose bisphosphate aldolase, glucosyl transferase, aspartic protease and ubiquitin coding transcripts are involved in modulating the endospore attachment on the nematode cuticle. Our results add new and significant information to the existing knowledge on early molecular interaction between M. incognita and P. penetrans. Peer reviewed. Final published version.

    A Platform-Independent Method for Detecting Errors in Metagenomic Sequencing Data: DRISEE

    We provide a novel method, DRISEE (duplicate read inferred sequencing error estimation), to assess sequencing quality (alternatively referred to as “noise” or “error”) within and/or between sequencing samples. DRISEE provides positional error estimates that can be used to inform read trimming within a sample. It also provides global (whole sample) error estimates that can be used to identify samples with high or varying levels of sequencing error that may confound downstream analyses, particularly in the case of studies that utilize data from multiple sequencing samples. For shotgun metagenomic data, we believe that DRISEE provides estimates of sequencing error that are more accurate and less constrained by technical limitations than existing methods that rely on reference genomes or the use of scores (e.g. Phred). Here, DRISEE is applied to (non amplicon) data sets from both the 454 and Illumina platforms. The DRISEE error estimate is obtained by analyzing sets of artifactual duplicate reads (ADRs), a known by-product of both sequencing platforms. We present DRISEE as an open-source, platform-independent method to assess sequencing error in shotgun metagenomic data, and utilize it to discover previously uncharacterized error in de novo sequence data from the 454 and Illumina sequencing platforms
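
    A minimal sketch of estimating error from duplicate reads follows. It is not the DRISEE implementation: it assumes, for illustration only, that duplicate reads are grouped by an identical prefix, that the most common base at each position within a group is the consensus, and that any deviation from the consensus counts as an error. The function names and the `prefix_len` and `min_bin_size` parameters are hypothetical.

```python
from collections import Counter, defaultdict

def _count_mismatches(reads, prefix_len, min_bin_size):
    """Count per-position deviations from the consensus inside each bin."""
    # Assumption: reads sharing the same prefix are treated as duplicates.
    bins = defaultdict(list)
    for read in reads:
        if len(read) >= prefix_len:
            bins[read[:prefix_len]].append(read)

    mismatches = Counter()
    totals = Counter()
    for group in bins.values():
        if len(group) < min_bin_size:
            continue  # too few duplicates to form a meaningful consensus
        shared_len = min(len(read) for read in group)
        for pos in range(shared_len):
            column = [read[pos] for read in group]
            _, consensus_count = Counter(column).most_common(1)[0]
            mismatches[pos] += len(column) - consensus_count
            totals[pos] += len(column)
    return mismatches, totals

def positional_error(reads, prefix_len=4, min_bin_size=2):
    """Per-position error rate, usable to choose a trimming point."""
    mismatches, totals = _count_mismatches(reads, prefix_len, min_bin_size)
    return {pos: mismatches[pos] / totals[pos] for pos in sorted(totals)}

def global_error(reads, prefix_len=4, min_bin_size=2):
    """Whole-sample error rate: all mismatches over all compared bases."""
    mismatches, totals = _count_mismatches(reads, prefix_len, min_bin_size)
    compared = sum(totals.values())
    return sum(mismatches.values()) / compared if compared else 0.0
```

    The per-position output corresponds to the positional estimates the abstract describes for read trimming, and the single ratio corresponds to the global (whole sample) estimate used to compare samples. No reference genome or quality scores are needed in this sketch.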

    Information and feedback to improve occupational physicians’ reporting of occupational diseases: a randomised controlled trial

    To assess the effectiveness of supplying occupational physicians (OPs) with targeted and stage-matched information or with feedback on reporting occupational diseases to the national registry in the Netherlands. In a randomized controlled design, 1076 OPs were divided into three groups based on previous reporting behaviour: precontemplators not considering reporting, contemplators considering reporting and actioners reporting occupational diseases. Precontemplators and contemplators were randomly assigned to receive stage-matched, stage-mismatched or general information. Actioners were randomly assigned to receive personalized or standardized feedback upon notification. Outcome measures were the number of OPs reporting and the number of reported occupational diseases in a 180-day period before and after the intervention. Precontemplators were significantly more often male and self-employed compared to contemplators and actioners. There was no significant effect of stage-matched information versus stage-mismatched or general information on the percentage of reporting OPs and on the mean number of notifications in each group. Receiving any information affected reporting more in contemplators than in precontemplators. The mean number of notifications in actioners increased more after personalized feedback than after standardized feedback, but the difference was not significant. This study supports the concept that contemplators are more susceptible to receiving information but could not confirm an effect of stage-matching this information on reporting occupational diseases to the national registry.