105 research outputs found

    Automated data pre-processing via meta-learning

    The final publication is available at link.springer.com. A data mining algorithm may perform differently on datasets with different characteristics; for example, it might perform better on a dataset with continuous attributes than on one with categorical attributes, or the other way around. As a matter of fact, a dataset usually needs to be pre-processed. Taking into account all the possible pre-processing operators, there exists a staggeringly large number of alternatives, and inexperienced users become overwhelmed. We show that this problem can be addressed by an automated approach, leveraging ideas from meta-learning. Specifically, we consider a wide range of data pre-processing techniques and a set of data mining algorithms. For each data mining algorithm and selected dataset, we are able to predict the transformations that improve the result of the algorithm on the respective dataset. Our approach will help non-expert users to more effectively identify the transformations appropriate to their applications, and hence to achieve improved results. Peer Reviewed. Postprint (published version)

    Data mining workflow templates for intelligent discovery assistance in RapidMiner

    Knowledge Discovery in Databases (KDD) has evolved during the last years and reached a mature stage offering plenty of operators to solve complex tasks. User support for building workflows, in contrast, has not increased proportionally. The large number of operators available in current KDD systems makes it difficult for users to successfully analyze data. Moreover, workflows easily contain a large number of operators, and parts of the workflows are applied several times, so it is hard for users to build them manually. In addition, workflows are not checked for correctness before execution. Hence, it frequently happens that the execution of the workflow stops with an error after several hours of runtime. In this paper we address these issues by introducing a knowledge-based representation of KDD workflows as a basis for cooperative-interactive planning. Moreover, we discuss workflow templates that can mix executable operators and tasks to be refined later into sub-workflows. This new representation helps users to structure and handle workflows, as it constrains the number of operators that need to be considered. We show that workflows can be grouped in templates, enabling re-use and simplifying KDD workflow construction in RapidMiner.

    The quest for companions to post-common envelope binaries: I. Searching a sample of stars from the CSS and SDSS

    As part of an ongoing collaboration between student groups at high schools and professional astronomers, we have searched for the presence of circum-binary planets in a bona-fide unbiased sample of twelve post-common envelope binaries (PCEBs) from the Catalina Sky Survey (CSS) and the Sloan Digital Sky Survey (SDSS). Although the present ephemerides are significantly more accurate than previous ones, we find no clear evidence for orbital period variations between 2005 and 2011 or during the 2011 observing season. The sparse long-term coverage still permits O-C variations with a period of years and an amplitude of tens of seconds, as found in other systems. Our observations provide the basis for future inferences about the frequency with which planet-sized or brown-dwarf companions have either formed in these evolved systems or survived the common envelope (CE) phase. Comment: accepted by A&

    Semantic Web integration of Cheminformatics resources with the SADI framework

    Background: The diversity and the largely independent nature of chemical research efforts over the past half century are, most likely, the major contributors to the current poor state of chemical computational resource and database interoperability. While open software for chemical format interconversion and database entry cross-linking has partially addressed database interoperability, computational resource integration is hindered by the great diversity of software interfaces, languages, access methods, and platforms, among others. This has, in turn, translated into limited reproducibility of computational experiments and the need for application-specific computational workflow construction and semi-automated enactment by human experts, especially where emerging interdisciplinary fields, such as systems chemistry, are pursued. Fortunately, the advent of the Semantic Web, and the very recent introduction of RESTful Semantic Web Services (SWS), may present an opportunity to integrate all of the existing computational and database resources in chemistry into a machine-understandable, unified system that draws on the entirety of the Semantic Web. Results: We have created a prototype set of Semantic Automated Discovery and Integration (SADI) framework SWS that exposes the QSAR descriptor functionality of the Chemistry Development Kit. Since each of these services has formal ontology-defined input and output classes, and each service consumes and produces RDF graphs, clients can automatically reason about the services and available reference information necessary to complete a given overall computational task specified through a simple SPARQL query. We demonstrate this capability by carrying out QSAR analysis backed by a simple formal ontology to determine whether a given molecule is drug-like. Further, we discuss parameter-based control over the execution of SADI SWS. Finally, we demonstrate the value of computational resource envelopment as SADI services through service reuse and ease of integration of computational functionality into formal ontologies. Conclusions: The work we present here may trigger a major paradigm shift in the distribution of computational resources in chemistry. We conclude that envelopment of chemical computational resources as SADI SWS facilitates interdisciplinary research by enabling the definition of computational problems in terms of ontologies and formal logical statements instead of cumbersome and application-specific tasks and workflows.

    T Cell Receptor-Independent, CD31/IL-17A-Driven Inflammatory Axis Shapes Synovitis in Juvenile Idiopathic Arthritis

    T cells are considered autoimmune effectors in juvenile idiopathic arthritis (JIA), but the antigenic cause of arthritis remains elusive. Since T cells comprise a significant proportion of joint-infiltrating cells, we examined whether the environment in the joint could be shaped through the inflammatory activation by T cells that is independent of conventional TCR signaling. We focused on the analysis of synovial fluid (SF) collected from children with oligoarticular and rheumatoid factor-negative polyarticular JIA. Cytokine profiling of SF showed dominance of five molecules including IL-17A. Cytometric analysis of the same SF samples showed enrichment of αβT cells that lacked both CD4 and CD8 co-receptors [herein called double negative (DN) T cells] and also lacked the CD28 costimulatory receptor. However, these synovial αβT cells expressed high levels of CD31, an adhesion molecule that is normally employed by granulocytes when they transit to sites of injury. In receptor crosslinking assays, ligation of CD31 alone on synovial CD28nullCD31+ DN αβT cells effectively and sufficiently induced phosphorylation of signaling substrates and increased intracytoplasmic stores of cytokines including IL-17A. CD31 ligation was also sufficient to induce RORγT expression and trans-activation of the IL-17A promoter. In addition to T cells, SF contained fibrocyte-like cells (FLC) expressing IL-17 receptor A (IL-17RA) and CD38, a known ligand for CD31. Stimulation of FLC with IL-17A led to CD38 upregulation, and to production of cytokines and tissue-destructive molecules. Addition of an oxidoreductase analog to the bioassays suppressed the CD31-driven IL-17A production by T cells. It also suppressed the downstream IL-17A-mediated production of effectors by FLC. The levels of suppression of FLC effector activities by the oxidoreductase analog were comparable to those seen with corticosteroid and/or biologic inhibitors to IL-6 and TNFα. 
Collectively, our data suggest that activation of a CD31-driven, αβTCR-independent, IL-17A-mediated T cell-FLC inflammatory circuit drives and/or perpetuates synovitis. With the notable finding that the oxidoreductase mimic suppresses the effector activities of synovial CD31+CD28null αβT cells and IL-17RA+CD38+ FLC, this small molecule could be used to probe further the intricacies of this inflammatory circuit. Such bioactivities of this small molecule also provide a rationale for new translational avenue(s) to potentially modulate JIA synovitis.