    GA-ANN Short-Term Electricity Load Forecasting

    This paper presents a methodology for short-term load forecasting based on genetic algorithm feature selection and artificial neural network modeling. A feed-forward artificial neural network is used to model the 24-h ahead load based on past consumption, weather and stock index data. A genetic algorithm is used to find the best subset of variables for modeling. Three data sets from different geographical locations, encompassing areas of different dimensions with distinct load profiles, are used to evaluate the methodology. The developed approach was found to generate models achieving a minimum mean average percentage error under 2%. The feature selection algorithm was able to significantly reduce the number of features used and increase the accuracy of the models.
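
    The error figure quoted above can be made concrete with a small calculation. The sketch below assumes that "mean average percentage error" means the usual mean absolute percentage error (MAPE); the load values are hypothetical and are not taken from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(forecast):
        raise ValueError("series must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical loads (MW) and 24-h ahead forecasts, for illustration only.
actual_load = [100.0, 200.0, 400.0]
forecast_load = [98.0, 204.0, 400.0]
error = mape(actual_load, forecast_load)
print(f"MAPE = {error:.2f}%")
```

    On these made-up numbers the error is about 1.33%, which would sit under the 2% level mentioned in the abstract.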

    On reliable discovery of molecular signatures

    Background: Molecular signatures are sets of genes, proteins, genetic variants or other variables that can be used as markers for a particular phenotype. Reliable signature discovery methods could yield valuable insight into cell biology and mechanisms of human disease. However, it is currently not clear how to control error rates such as the false discovery rate (FDR) in signature discovery. Moreover, signatures for cancer gene expression have been shown to be unstable, that is, difficult to replicate in independent studies, casting doubts on their reliability. Results: We demonstrate that with modern prediction methods, signatures that yield accurate predictions may still have a high FDR. Further, we show that even signatures with low FDR may fail to replicate in independent studies due to limited statistical power. Thus, neither stability nor predictive accuracy are relevant when FDR control is the primary goal. We therefore develop a general statistical hypothesis testing framework that for the first time provides FDR control for signature discovery. Our method is demonstrated to be correct in simulation studies. When applied to five cancer data sets, the method was able to discover molecular signatures with 5% FDR in three cases, while two data sets yielded no significant findings. Conclusion: Our approach enables reliable discovery of molecular signatures from genome-wide data with current sample sizes. The statistical framework developed herein is potentially applicable to a wide range of prediction problems in bioinformatics.
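
    To show what FDR control means operationally, here is a minimal sketch of the standard Benjamini-Hochberg step-up procedure. It is a generic textbook technique, not the hypothesis testing framework developed in the paper, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices of hypotheses rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    return sorted(order[:cutoff])


# Hypothetical per-gene p-values, for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
selected = benjamini_hochberg(p_values, alpha=0.05)
print(selected)
```

    With these inputs only the first two hypotheses pass. The third p-value (0.039) is below 0.05 but above its rank-adjusted threshold of 0.03, which is the kind of finding an uncorrected test would have admitted.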

    Thermodynamic analysis of inverted bifurcation

    We present a thermodynamic analysis of inverted bifurcation in binary mixtures heated from below. From this analysis it follows that an inverted bifurcation is caused by the competition between a stabilizing effect with a long relaxation time and a destabilizing effect with a short relaxation time. These conditions are precisely the same as those that give rise to overstability. This might explain why overstability and inverted bifurcation occur in the same systems.

    Analysis and Computational Dissection of Molecular Signature Multiplicity

    Molecular signatures are computational or mathematical models created to diagnose disease and other phenotypes and to predict clinical outcomes and response to treatment. It is widely recognized that molecular signatures constitute one of the most important translational and basic science developments enabled by recent high-throughput molecular assays. A perplexing phenomenon that characterizes high-throughput data analysis is the ubiquitous multiplicity of molecular signatures. Multiplicity is a special form of data analysis instability in which different analysis methods used on the same data, or different samples from the same population, lead to different but apparently maximally predictive signatures. This phenomenon has far-reaching implications for biological discovery and development of next generation patient diagnostics and personalized treatments. Currently the causes and interpretation of signature multiplicity are unknown, and several, often contradictory, conjectures have been made to explain it. We present a formal characterization of signature multiplicity and a new efficient algorithm that offers theoretical guarantees for extracting the set of maximally predictive and non-redundant signatures independent of distribution. The new algorithm identifies exactly the set of optimal signatures in controlled experiments and yields signatures with significantly better predictivity and reproducibility than previous algorithms in human microarray gene expression datasets. Our results shed light on the causes of signature multiplicity, provide computational tools for studying it empirically and introduce a framework for in silico bioequivalence of this important new class of diagnostic and personalized medicine modalities.

    Cooperation of Mtmr8 with PI3K Regulates Actin Filament Modeling and Muscle Development in Zebrafish

    It has been shown that mutations in at least four myotubularin family genes (MTM1, MTMR1, 2 and 13) are causative for human neuromuscular disorders. However, the pathway and regulative mechanism remain unknown. Here, we reported a new role for Mtmr8 in neuromuscular development of zebrafish. Firstly, we cloned and characterized zebrafish Mtmr8, and revealed the expression pattern predominantly in the eye field and somites during early somitogenesis. Using morpholino knockdown, then, we observed that loss-of-function of Mtmr8 led to defects in somitogenesis. Subsequently, the possible underlying mechanism and signal pathway were examined. We first checked the Akt phosphorylation, and observed an increase of Akt phosphorylation in the morphant embryos. Furthermore, we studied the PH/G domain function within Mtmr8. Although the PH/G domain deletion by itself did not result in embryonic defect, addition of PI3K inhibitor LY294002 did give a defective phenotype in the PH/G deletion morphants, indicating that the PH/G domain was essential for Mtmr8's function. Moreover, we investigated the cooperation of Mtmr8 with PI3K in actin filament modeling and muscle development, and found that both Mtmr8-MO1 and Mtmr8-MO2+LY294002 led to the disorganization of the actin cytoskeleton. In addition, we revealed a possible participation of Mtmr8 in the Hedgehog pathway, and cell transplantation experiments showed that Mtmr8 worked in a non-cell autonomous manner in actin modeling. The above data indicate that a conserved functional cooperation of Mtmr8 with PI3K regulates actin filament modeling and muscle development in zebrafish, and reveal a possible participation of Mtmr8 in the Hedgehog pathway. Therefore, this work provides a new clue to study the physiological function of MTM family members.

    Effect of Size and Heterogeneity of Samples on Biomarker Discovery: Synthetic and Real Data Assessment

    MOTIVATION: The identification of robust lists of molecular biomarkers related to a disease is a fundamental step for early diagnosis and treatment. However, methodologies for the discovery of biomarkers using microarray data often provide results with limited overlap. These differences are imputable to 1) dataset size (few subjects with respect to the number of features); 2) heterogeneity of the disease; 3) heterogeneity of experimental protocols and computational pipelines employed in the analysis. In this paper, we focus on the first two issues and assess, both on simulated (through an in silico regulation network model) and real clinical datasets, the consistency of candidate biomarkers provided by a number of different methods. METHODS: We extensively simulated the effect of heterogeneity characteristic of complex diseases on different sets of microarray data. Heterogeneity was reproduced by simulating both intrinsic variability of the population and the alteration of regulatory mechanisms. Population variability was simulated by modeling evolution of a pool of subjects; then, a subset of them underwent alterations in regulatory mechanisms so as to mimic the disease state. RESULTS: The simulated data allowed us to outline advantages and drawbacks of different methods across multiple studies and varying number of samples and to evaluate precision of feature selection on a benchmark with known biomarkers. Although comparable classification accuracy was reached by different methods, the use of external cross-validation loops is helpful in finding features with a higher degree of precision and stability. Application to real data confirmed these results.
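
    The idea of an external cross-validation loop can be sketched as follows: feature selection is rerun on each training split, and the fraction of splits in which a feature is chosen serves as a simple stability measure. The selection rule (difference of class means), the fold assignment, and the data are all assumptions made for illustration; the paper's own methods are not reproduced here.

```python
def select_features(rows, labels, k):
    """Rank features by absolute difference of class means; keep the top k."""
    n_features = len(rows[0])

    def class_mean(j, cls):
        vals = [r[j] for r, y in zip(rows, labels) if y == cls]
        return sum(vals) / len(vals)

    scores = [abs(class_mean(j, 1) - class_mean(j, 0)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: -scores[j])[:k]


def external_cv_selection(rows, labels, n_folds, k):
    """Run feature selection separately on the training part of each fold."""
    selections = []
    for fold in range(n_folds):
        # Samples with index % n_folds == fold are held out of selection.
        train = [i for i in range(len(rows)) if i % n_folds != fold]
        selections.append(
            select_features([rows[i] for i in train], [labels[i] for i in train], k)
        )
    return selections


def selection_frequency(selections, n_features):
    """Fraction of folds in which each feature was selected."""
    return [sum(j in s for s in selections) / len(selections) for j in range(n_features)]


# Hypothetical expression matrix: feature 0 separates the classes, 1 and 2 do not.
rows = [
    [0.1, 1.0, 0.3],
    [5.2, 1.1, 0.2],
    [0.0, 0.9, 0.4],
    [4.9, 1.0, 0.3],
    [0.2, 1.2, 0.2],
    [5.1, 0.8, 0.4],
]
labels = [0, 1, 0, 1, 0, 1]
selections = external_cv_selection(rows, labels, n_folds=3, k=1)
frequency = selection_frequency(selections, n_features=3)
print(selections, frequency)
```

    A feature that is selected in every fold, as feature 0 is here, is a more stable candidate than one that appears only in some splits.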

    Do serum biomarkers really measure breast cancer?

    Background: Because screening mammography for breast cancer is less effective for premenopausal women, we investigated the feasibility of a diagnostic blood test using serum proteins. Methods: This study used a set of 98 serum proteins and chose diagnostically relevant subsets via various feature-selection techniques. Because of significant noise in the data set, we applied iterated Bayesian model averaging to account for model selection uncertainty and to improve generalization performance. We assessed generalization performance using leave-one-out cross-validation (LOOCV) and receiver operating characteristic (ROC) curve analysis. Results: The classifiers were able to distinguish normal tissue from breast cancer with a classification performance of AUC = 0.82 ± 0.04 with the proteins MIF, MMP-9, and MPO. The classifiers distinguished normal tissue from benign lesions similarly at AUC = 0.80 ± 0.05. However, the serum proteins of benign and malignant lesions were indistinguishable (AUC = 0.55 ± 0.06). The classification tasks of normal vs. cancer and normal vs. benign selected the same top feature: MIF, which suggests that the biomarkers indicated inflammatory response rather than cancer. Conclusion: Overall, the selected serum proteins showed moderate ability for detecting lesions. However, they are probably more indicative of secondary effects such as inflammation rather than specific for malignancy. Funding: United States. Dept. of Defense. Breast Cancer Research Program (Grant No. W81XWH-05-1-0292); National Institutes of Health (U.S.) (R01 CA-112437-01); National Institutes of Health (U.S.) (NIH CA 84955)
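
    The AUC values reported above can be read as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. The sketch below computes AUC that way from held-out scores; the scores and labels are hypothetical, and the classifier and LOOCV procedure that would produce them are assumed rather than shown.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    case scores higher; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical held-out scores, e.g. one per sample from a leave-one-out run.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

    An AUC near 0.5, such as the 0.55 reported for benign vs. malignant lesions, means the scores order positive and negative cases barely better than chance.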

    Understanding hereditary diseases using the dog and human as companion model systems

    Animal models are requisite for genetic dissection of, and improved treatment regimens for, human hereditary diseases. While several animals have been used in academic and industrial research, the primary model for dissection of hereditary diseases has been the many strains of the laboratory mouse. However, given its greater (than the mouse) genetic similarity to the human, high number of naturally occurring hereditary diseases, unique population structure, and the availability of the complete genome sequence, the purebred dog has emerged as a powerful model for study of diseases. The major advantage the dog provides is that it is afflicted with approximately 450 hereditary diseases, about half of which have remarkable clinical similarities to corresponding diseases of the human. In addition, humankind has a strong desire to cure diseases of the dog, so these two facts make the dog an ideal clinical and genetic model. This review highlights several of these shared hereditary diseases. Specifically, the canine models discussed herein have played important roles in identification of causative genes and/or have been utilized in novel therapeutic approaches of interest to the dog and human.