622 research outputs found

    Automated data pre-processing via meta-learning

    The final publication is available at link.springer.com. A data mining algorithm may perform differently on datasets with different characteristics; for example, it might perform better on a dataset with continuous attributes than on one with categorical attributes, or the other way around. In practice, a dataset usually needs to be pre-processed. Taking into account all the possible pre-processing operators, there is a staggeringly large number of alternatives, and non-experienced users become overwhelmed. We show that this problem can be addressed by an automated approach that leverages ideas from meta-learning. Specifically, we consider a wide range of data pre-processing techniques and a set of data mining algorithms. For each data mining algorithm and selected dataset, we are able to predict the transformations that improve the result of the algorithm on the respective dataset. Our approach will help non-expert users to identify more effectively the transformations appropriate to their applications, and hence to achieve improved results. Peer reviewed. Postprint (published version).

    PRESISTANT : data pre-processing assistant

    A concrete classification algorithm may perform differently on datasets with different characteristics; for example, it might perform better on a dataset with continuous attributes than on one with categorical attributes, or the other way around. Typically, datasets need to be pre-processed in order to improve the results. Taking into account all the possible pre-processing operators, there is a staggeringly large number of alternatives, and non-experienced users become overwhelmed. Trial and error is not feasible with large amounts of data. We developed a method and tool, PRESISTANT, to answer the need for user assistance during data pre-processing. Leveraging ideas from meta-learning, PRESISTANT assists the user by recommending pre-processing operators that ultimately improve classification performance. The user selects a classification algorithm from the ones considered, and PRESISTANT then proposes candidate transformations to improve the result of the analysis. In the demonstration, participants will experience first-hand how PRESISTANT easily and effectively ranks the pre-processing operators. Peer reviewed. Postprint (author's final draft).

    Coupling streaming AI and HPC ensembles to achieve 100-1000x faster biomolecular simulations

    Machine learning (ML)-based steering can improve the performance of ensemble-based simulations by allowing for online selection of more scientifically meaningful computations. We present DeepDriveMD, a framework for ML-driven steering of scientific simulations that we have used to achieve orders-of-magnitude improvements in molecular dynamics (MD) performance via effective coupling of ML and HPC on large parallel computers. We discuss the design of DeepDriveMD and characterize its performance. We demonstrate that DeepDriveMD can achieve 100-1000x acceleration for protein folding simulations relative to other methods, as measured by the amount of simulated time performed, while covering the same conformational landscape as quantified by the states sampled during a simulation. Experiments are performed on leadership-class platforms on up to 1020 nodes. The results establish DeepDriveMD as a high-performance framework for ML-driven HPC simulation scenarios that supports diverse MD simulation and ML back-ends and enables new scientific insights by improving the length and time scales accessible with current computing capacity.

    cGMP-independent nitric oxide signaling and regulation of the cell cycle

    BACKGROUND: Regulatory functions of nitric oxide (NO•) that bypass the second messenger cGMP are incompletely understood. Here, cGMP-independent effects of NO• on gene expression were globally examined in U937 cells, a human monoblastoid line that constitutively lacks soluble guanylate cyclase. Differentiated U937 cells (>80% in G0/G1) were exposed to S-nitrosoglutathione, a NO• donor, or glutathione alone (control) for 6 h without or with dibutyryl-cAMP (Bt2cAMP), and then harvested to extract total RNA for microarray analysis. Bt2cAMP was used to block signaling attributable to NO•-induced decreases in cAMP. RESULTS: NO• regulated 110 transcripts that annotated disproportionately to the cell cycle and cell proliferation (47/110, 43%) and more frequently than expected contained AU-rich, post-transcriptional regulatory elements (ARE). Bt2cAMP regulated 106 genes; cell cycle gene enrichment did not reach significance. Like NO•, Bt2cAMP was associated with ARE-containing transcripts. A comparison of NO• and Bt2cAMP effects showed that NO• regulation of cell cycle genes was independent of its ability to interfere with cAMP signaling. Cell cycle genes induced by NO• annotated to G1/S (7/8) and included E2F1 and p21/Waf1/Cip1; 6 of these 7 were E2F target genes involved in G1/S transition. Repressed genes were G2/M associated (24/27); 8 of 27 were known targets of p21. E2F1 mRNA and protein were increased by NO•, as was E2F1 binding to E2F promoter elements. NO• activated p38 MAPK, stabilizing p21 mRNA (an ARE-containing transcript) and increasing p21 protein; this increased protein binding to CDE/CHR promoter sites of p21 target genes, repressing key G2/M phase genes, and increasing the proportion of cells in G2/M. CONCLUSION: NO• coordinates a highly integrated program of cell cycle arrest that regulates a large number of genes, but does not require signaling through cGMP. In humans, antiproliferative effects of NO• may rely substantially on cGMP-independent mechanisms. Stress kinase signaling and alterations in mRNA stability appear to be major pathways by which NO• regulates the transcriptome.

    Incidence and Risk Factors of Recurrence after Surgery for Pathology-proven Diverticular Disease

    Contains fulltext: 69776.pdf (publisher's version) (Closed access). BACKGROUND: Diverticular disease is a common problem in Western countries. The rationale for elective surgery is to prevent recurrent complicated diverticulitis and to reduce emergency procedures. Recurrent diverticulitis occurs in about 10% of patients after resection. The pathogenesis of recurrence is not completely understood. We studied the incidence of and risk factors for recurrence, and the overall morbidity and mortality of surgical therapy for diverticular disease. METHODS: Medical records of 183 consecutive patients with pathology-proven diverticulitis were eligible for evaluation. Mean duration of follow-up was 7.2 years. Number of preoperative episodes, emergency or elective surgeries, type of operation, level of anastomosis, postoperative complications, persistent postoperative pain, complications associated with colostomy reversal, and recurrent diverticulitis were noted. The Kaplan-Meier method was used to calculate the cumulative probability of recurrence. Cox regression was used to identify possible risk factors for recurrence. RESULTS: The incidence of recurrence was 8.7%, with an estimated risk of recurrence over a 15-year period of 16%. Risk factors associated with recurrence were (younger) age (p < 0.02) and the persistence of postoperative pain (p < 0.005). Persistent abdominal pain after surgery was present in 22% of patients. Eighty percent of patients who needed emergency surgery for acute diverticulitis had no manifestation of diverticular disease prior to surgery. In addition, recurrent diverticulitis was not associated with a higher percentage of emergency procedures. CONCLUSION: Estimated risk of recurrence is high and abdominal complaints after surgical therapy for diverticulitis are frequent. Younger age and persistence of postoperative symptoms predict recurrent diverticulitis after resection. The clinical implication of these findings needs further investigation. The results of this study support the careful selection of patients for surgery for diverticulitis.
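
    For readers unfamiliar with the Kaplan-Meier method named in the abstract above, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the study's analysis code: the function name `kaplan_meier` and the five-patient follow-up data are hypothetical and chosen only for illustration. The cumulative probability of recurrence at a given time is one minus the estimated recurrence-free probability.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate of the event-free probability.

    durations: follow-up time for each patient.
    events:    True if recurrence was observed at that time,
               False if the patient was censored (no recurrence seen).
    Returns a list of (time, event_free_probability) at each event time.
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up data (years), for illustration only.
durations = [1, 2, 2, 3, 4]
events = [True, False, True, True, False]
curve = kaplan_meier(durations, events)
cumulative_recurrence = [(t, 1 - s) for t, s in curve]
```

    Patients censored at the same time as an event are counted as still at risk at that time, which is the usual convention.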

    The counselling self-estimate inventory (COSE): Does it work in Chinese counsellors?

    Counselling self-efficacy is an important construct for research on and evaluation of counsellors' competencies and training effectiveness. Larson et al. developed the Counselling Self-Estimate Inventory (COSE) for counsellors in America and examined its factor structure using exploratory factor analysis. They recommended a five-factor model (microskills, counselling process, difficult client behaviour, cultural competence, and awareness of values) and the use of the COSE for future research. However, little research has investigated the validity of the COSE in the context of counselling Chinese students in schools. In the present study, the factor structure of responses to the Chinese version of the Counselling Self-Estimate Inventory (C-COSE) in a sample of 578 Hong Kong secondary school guidance teachers was examined using the EQS approach to confirmatory factor analysis. The results showed that while a five-factor model fit the data reasonably well, deleting the items related to the awareness of values factor yielded a better-fitting model. The discussion of potential uses and limitations of the C-COSE in the context of preparing and supervising school guidance personnel in student counselling is relevant to counselling psychologists and researchers in Hong Kong and other parts of the world. Postprint.

    Empirical analysis of the relationship between CC and SLOC in a large corpus of Java methods and C functions

    Measuring the internal quality of source code is one of the traditional goals of making software development into an engineering discipline. Cyclomatic Complexity (CC) is an often-used source code quality metric, next to Source Lines of Code (SLOC). However, the use of the CC metric is challenged by the repeated claim that CC is redundant with respect to SLOC due to strong linear correlation. We conducted an extensive literature study of the CC/SLOC correlation results. Next, we tested correlation on large Java (17.6 M methods) and C (6.3 M functions) corpora. Our results show that linear correlation between SLOC and CC is only moderate, owing to increasingly high variance. We further observe that aggregating CC and SLOC, as well as performing a power transform, improves the correlation. Our conclusion is that the observed linear correlation between CC and SLOC of Java methods or C functions is not strong enough to conclude that CC is redundant with SLOC. This conclusion contradicts earlier claims from the literature, but concurs with the widely accepted practice of measuring CC next to SLOC.
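
    The effect described in the abstract above, where variance that grows with size weakens linear correlation and a transform recovers it, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the data are synthetic rather than drawn from the paper's corpora, the helper names `pearson` and `synthetic_corpus` are hypothetical, and a log transform stands in for the power transform, whose exact form the abstract does not give.

```python
import math
import random


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def synthetic_corpus(n, seed=0):
    """Generate made-up (SLOC, CC) pairs whose spread grows with SLOC."""
    rng = random.Random(seed)
    sloc, cc = [], []
    for _ in range(n):
        # Method sizes spread log-uniformly between 1 and 1000 lines.
        size = math.exp(rng.uniform(0, math.log(1000)))
        # Multiplicative noise: larger methods vary more in complexity.
        complexity = 1 + 0.2 * size * math.exp(rng.gauss(0, 0.8))
        sloc.append(size)
        cc.append(complexity)
    return sloc, cc


sloc, cc = synthetic_corpus(5000)
r_raw = pearson(sloc, cc)
r_log = pearson([math.log(s) for s in sloc], [math.log(c) for c in cc])
print(f"raw correlation: {r_raw:.2f}, after log transform: {r_log:.2f}")
```

    On this synthetic data the transformed correlation is the higher of the two, but the numbers say nothing about real Java or C code; they only show why the raw coefficient can understate a relationship when variance increases with size.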

    A systematic review of high-fibre dietary therapy in diverticular disease

    The exact pathogenesis of diverticular disease of the sigmoid colon is not well established. However, the hypothesis that a low-fibre diet may result in diverticulosis and that a high-fibre diet will prevent symptoms or complications of diverticular disease is widely accepted. The aim of this review is to assess whether a high-fibre diet can improve symptoms and/or prevent complications of diverticular disease of the sigmoid colon and/or prevent recurrent diverticulitis after a primary episode. Clinical studies were eligible for inclusion if they assessed the treatment of diverticular disease or the prevention of recurrent diverticulitis with a high-fibre diet. Studies that did not compare the patient group with a control group were excluded. No studies concerning prevention of recurrent diverticulitis with a high-fibre diet met our inclusion criteria. Three randomised controlled trials (RCTs) and one case-control study were included in this systematic review. One RCT of moderate quality showed no difference in the primary endpoints. A second RCT of moderate quality and the case-control study found a significant difference in favour of a high-fibre diet in the treatment of symptomatic diverticular disease. The third RCT of moderate quality found a significant difference in favour of methylcellulose (a fibre supplement). This study also showed a placebo effect. High-quality evidence for a high-fibre diet in the treatment of diverticular disease is lacking, and most recommendations are based on inconsistent level 2 and mostly level 3 evidence. Nevertheless, a high-fibre diet is still recommended in several guidelines.