215 research outputs found

    Learning and predicting time series by neural networks

    Full text link
    Artificial neural networks which are trained on a time series are supposed to achieve two abilities: firstly to predict the series many time steps ahead and secondly to learn the rule which has produced the series. It is shown that prediction and learning are not necessarily related to each other. Chaotic sequences can be learned but not predicted while quasiperiodic sequences can be well predicted but not learned.Comment: 5 page

    Does the biomarker search paradigm need re-booting?

    Get PDF
    The clinical problem of bladder cancer is its high recurrence and progression, and that the most sensitive and specific means of monitoring is cystoscopy, which is invasive and has poor patient compliance. Biomarkers for recurrence and progression could make a great contribution, but in spite of decades of research, no biomarkers are commercially available with the requisite sensitivity and specificity. In the post-genomic age, the means to search the entire genome for biomarkers has become available, but the conventional approaches to biomarker discovery are entirely inadequate to yield results with the new technology. Finding clinically useful biomarker panels with sensitivity and specificity equal to that of cystoscopy is a problem of systems biology

    Secure and linear cryptosystems using error-correcting codes

    Full text link
    A public-key cryptosystem, digital signature and authentication procedures based on a Gallager-type parity-check error-correcting code are presented. The complexity of the encryption and the decryption processes scale linearly with the size of the plaintext Alice sends to Bob. The public-key is pre-corrupted by Bob, whereas a private-noise added by Alice to a given fraction of the ciphertext of each encrypted plaintext serves to increase the secure channel and is the cornerstone for digital signatures and authentication. Various scenarios are discussed including the possible actions of the opponent Oscar as an eavesdropper or as a disruptor

    The dynamics of proving uncolourability of large random graphs I. Symmetric Colouring Heuristic

    Full text link
    We study the dynamics of a backtracking procedure capable of proving uncolourability of graphs, and calculate its average running time T for sparse random graphs, as a function of the average degree c and the number of vertices N. The analysis is carried out by mapping the history of the search process onto an out-of-equilibrium (multi-dimensional) surface growth problem. The growth exponent of the average running time is quantitatively predicted, in agreement with simulations.Comment: 5 figure

    Predictive gene lists for breast cancer prognosis: A topographic visualisation study

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>The controversy surrounding the non-uniqueness of predictive gene lists (PGL) of small selected subsets of genes from very large potential candidates as available in DNA microarray experiments is now widely acknowledged <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Many of these studies have focused on constructing discriminative semi-parametric models and as such are also subject to the issue of random correlations of sparse model selection in high dimensional spaces. In this work we outline a different approach based around an unsupervised patient-specific nonlinear topographic projection in predictive gene lists.</p> <p>Methods</p> <p>We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. The Neuroscale, the Stochastic Neighbor Embedding(SNE) and the Locally Linear Embedding(LLE) techniques have been used to construct two-dimensional projective visualisation plots of 70 dimensional PGLs per patient, classifiers are also constructed to identify the prognosis indicator of each patient using the resulting projections from those visualisation techniques and investigate whether <it>a-posteriori </it>two prognosis groups are separable on the evidence of the gene lists.</p> <p>A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of Neuroscale to visualise the follow-up study, but based on the projections derived from the original dataset.</p> <p>Results</p> <p>The results indicate that small subsets of patient-specific PGLs have insufficient prognostic dissimilarity to permit a distinction between two prognosis patients. Uncertainty and diversity across multiple gene expressions prevents unambiguous or even confident patient grouping. Comparative projections across different PGLs provide similar results.</p> <p>Conclusion</p> <p>The random correlation effect to an arbitrary outcome induced by small subset selection from very high dimensional interrelated gene expression profiles leads to an outcome with associated uncertainty. This continuum and uncertainty precludes any attempts at constructing discriminative classifiers.</p> <p>However a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses.</p> <p>We conclude that many of the patients involved in such medical studies are <it>intrinsically unclassifiable </it>on the basis of provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.</p

    The influence of feature selection methods on accuracy, stability and interpretability of molecular signatures

    Get PDF
    Motivation: Biomarker discovery from high-dimensional data is a crucial problem with enormous applications in biology and medicine. It is also extremely challenging from a statistical viewpoint, but surprisingly few studies have investigated the relative strengths and weaknesses of the plethora of existing feature selection methods. Methods: We compare 32 feature selection methods on 4 public gene expression datasets for breast cancer prognosis, in terms of predictive performance, stability and functional interpretability of the signatures they produce. Results: We observe that the feature selection method has a significant influence on the accuracy, stability and interpretability of signatures. Simple filter methods generally outperform more complex embedded or wrapper methods, and ensemble feature selection has generally no positive effect. Overall a simple Student's t-test seems to provide the best results. Availability: Code and data are publicly available at http://cbio.ensmp.fr/~ahaury/

    Multi-Player and Multi-Choice Quantum Game

    Full text link
    We investigate a multi-player and multi-choice quantum game. We start from two-player and two-choice game and the result is better than its classical version. Then we extend it to N-player and N-choice cases. In the quantum domain, we provide a strategy with which players can always avoid the worst outcome. Also, by changing the value of the parameter of the initial state, the probabilities for players to obtain the best payoff will be much higher that in its classical version.Comment: 4 pages, 1 figur

    On reliable discovery of molecular signatures

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Molecular signatures are sets of genes, proteins, genetic variants or other variables that can be used as markers for a particular phenotype. Reliable signature discovery methods could yield valuable insight into cell biology and mechanisms of human disease. However, it is currently not clear how to control error rates such as the false discovery rate (FDR) in signature discovery. Moreover, signatures for cancer gene expression have been shown to be unstable, that is, difficult to replicate in independent studies, casting doubts on their reliability.</p> <p>Results</p> <p>We demonstrate that with modern prediction methods, signatures that yield accurate predictions may still have a high FDR. Further, we show that even signatures with low FDR may fail to replicate in independent studies due to limited statistical power. Thus, neither stability nor predictive accuracy are relevant when FDR control is the primary goal. We therefore develop a general statistical hypothesis testing framework that for the first time provides FDR control for signature discovery. Our method is demonstrated to be correct in simulation studies. When applied to five cancer data sets, the method was able to discover molecular signatures with 5% FDR in three cases, while two data sets yielded no significant findings.</p> <p>Conclusion</p> <p>Our approach enables reliable discovery of molecular signatures from genome-wide data with current sample sizes. The statistical framework developed herein is potentially applicable to a wide range of prediction problems in bioinformatics.</p
    • …