
    The Search for the Laws of Automatic Random Testing

    Can one estimate the number of remaining faults in a software system? A credible estimation technique would be immensely useful to project managers as well as customers. It would also be of theoretical interest, as a general law of software engineering. We investigate possible answers in the context of automated random testing, a method that is increasingly accepted as an effective way to discover faults. Our experimental results, derived from best-fit analysis of a variety of mathematical functions over a large number of automated tests of library code equipped with automated oracles in the form of contracts, suggest a poly-logarithmic law. Although further confirmation remains necessary on different code bases and testing techniques, we argue that understanding the laws of testing may bring significant benefits for estimating the number of detectable faults and for comparing different projects and practices.
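
    The best-fit analysis described above can be illustrated with a short, hedged sketch: the fault counts and the exact poly-logarithmic form a * log(t + 1)^b + c below are hypothetical stand-ins, not the paper's data or fitted model.

        import numpy as np
        from scipy.optimize import curve_fit

        # Hypothetical data: cumulative distinct faults found after t random test sessions.
        t = np.array([1, 2, 5, 10, 20, 50, 100, 200, 500, 1000], dtype=float)
        faults = np.array([0, 1, 3, 5, 8, 12, 15, 19, 24, 28], dtype=float)

        def poly_log(t, a, b, c):
            """Poly-logarithmic candidate law: a * log(t + 1)**b + c."""
            return a * np.log(t + 1.0) ** b + c

        # Non-linear least squares gives the best-fit parameters for this candidate law;
        # competing functional forms would be compared by their residual error.
        params, _ = curve_fit(poly_log, t, faults, p0=(1.0, 1.0, 0.0), maxfev=10000)
        print("fitted a, b, c:", params)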

    Symbolic regression of generative network models

    Networks are a powerful abstraction with applicability to a variety of scientific fields. Models explaining their morphology and growth processes permit a wide range of phenomena to be more systematically analysed and understood. At the same time, creating such models is often challenging and requires insights that may be counter-intuitive. Yet there currently exists no general method to arrive at better models. We have developed an approach to automatically detect realistic decentralised network growth models from empirical data, employing a machine learning technique inspired by natural selection and defining a unified formalism to describe such models as computer programs. As the proposed method is completely general and does not assume any pre-existing models, it can be applied "out of the box" to any given network. To validate our approach empirically, we systematically rediscover pre-defined growth laws underlying several canonical network generation models and find credible laws for diverse real-world networks. We were able to find programs that are simple enough to lead to an actual understanding of the mechanisms proposed, namely for a simple brain and a social network.
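
    As a rough, hedged illustration of searching over program-like growth rules: the two candidate rules, the degree-distribution score, and the simulated "empirical" target below are assumptions for illustration, not the paper's formalism or its evolutionary search.

        import random
        import networkx as nx
        import numpy as np

        def grow(rule, n):
            """Grow an n-node graph, attaching each new node to a node chosen by `rule`."""
            G = nx.Graph()
            G.add_edge(0, 1)
            for v in range(2, n):
                G.add_edge(v, rule(G))
            return G

        # Two hypothetical candidate "programs" (growth rules).
        def uniform_attachment(G):
            return random.choice(list(G.nodes()))

        def preferential_attachment(G):
            nodes = list(G.nodes())
            degs = np.array([G.degree(u) for u in nodes], dtype=float)
            return int(np.random.choice(nodes, p=degs / degs.sum()))

        def degree_distance(G, H):
            """Crude dissimilarity: L1 distance between truncated, normalised degree histograms."""
            def hist(g):
                h = np.bincount([d for _, d in g.degree()], minlength=50)[:50]
                return h / h.sum()
            return float(np.abs(hist(G) - hist(H)).sum())

        # Stand-in for an empirical target network (simulated here for illustration).
        target = nx.barabasi_albert_graph(500, 1, seed=1)

        # Score each candidate rule; a full method would search program space instead.
        for rule in (uniform_attachment, preferential_attachment):
            score = np.mean([degree_distance(grow(rule, 500), target) for _ in range(3)])
            print(rule.__name__, round(score, 3))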

    The Role of the Superior Order GLCM in the Characterization and Recognition of the Liver Tumors from Ultrasound Images

    The hepatocellular carcinoma (HCC) is the most frequent malignant liver tumor. It often has a visual aspect similar to that of the cirrhotic parenchyma on which it evolves and to that of the benign liver tumors. The gold standard for HCC diagnosis is the needle biopsy, but this is an invasive, dangerous method. We aim to develop computerized, noninvasive techniques for the automatic diagnosis of HCC, based on information obtained from ultrasound images. The texture is an important property of the internal organ tissue, able to provide subtle information about the pathology. We previously defined the textural model of HCC, consisting of the exhaustive set of relevant textural features appropriate for HCC characterization and of the specific values of these features. In this work, we analyze the role that the superior order Grey Level Co-occurrence Matrices (GLCM) and the associated parameters have in the improvement of HCC characterization and automatic diagnosis. We also determine the spatial relations between pixels that lead to the highest performance for the third, fifth and seventh order GLCM. The following classes are considered: HCC, the cirrhotic liver parenchyma on which it evolves, and benign liver tumors.
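
    For the second-order case, such co-occurrence features can be computed with scikit-image; this is only a hedged sketch (the random region of interest and the chosen distances and angles are placeholders, and scikit-image does not provide the higher-order GLCM variants discussed above).

        import numpy as np
        from skimage.feature import graycomatrix, graycoprops

        # Hypothetical 8-bit ultrasound region of interest (replace with real pixel data).
        roi = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)

        # Second-order GLCM for several pixel offsets (distances) and directions (angles).
        glcm = graycomatrix(
            roi,
            distances=[1, 2, 4],
            angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
            levels=256,
            symmetric=True,
            normed=True,
        )

        # Classical Haralick-style texture parameters derived from the GLCM.
        for prop in ("contrast", "homogeneity", "energy", "correlation"):
            print(prop, graycoprops(glcm, prop).mean())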

    Testing statistical hypothesis on random trees and applications to the protein classification problem

    Efficient automatic protein classification is of central importance in genomic annotation. As an independent way to check the reliability of the classification, we propose a statistical approach to test if two sets of protein domain sequences coming from two families of the Pfam database are significantly different. We model protein sequences as realizations of Variable Length Markov Chains (VLMC) and we use the context trees as a signature of each protein family. Our approach is based on a Kolmogorov--Smirnov-type goodness-of-fit test proposed by Balding et al. [Limit theorems for sequences of random trees (2008), DOI: 10.1007/s11749-008-0092-z]. The test statistic is a supremum over the space of trees of a function of the two samples; its computation grows, in principle, exponentially fast with the maximal number of nodes of the potential trees. We show how to transform this problem into a max-flow problem over a related graph, which can be solved using a Ford--Fulkerson algorithm in time polynomial in that number. We apply the test to 10 randomly chosen protein domain families from the seed of the Pfam-A database (high quality, manually curated families). The test shows that the distributions of context trees coming from different families are significantly different. We emphasize that this is a novel mathematical approach to validate the automatic clustering of sequences in any context. We also study the performance of the test via simulations on Galton--Watson related processes. Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org), DOI: 10.1214/08-AOAS218.
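
    The reduction to a maximum-flow computation can be previewed with a generic example; the toy graph below is purely illustrative and does not encode the paper's context-tree construction.

        import networkx as nx
        from networkx.algorithms.flow import edmonds_karp

        # Toy directed graph with edge capacities (not the paper's construction).
        G = nx.DiGraph()
        G.add_edge("s", "a", capacity=3.0)
        G.add_edge("s", "b", capacity=2.0)
        G.add_edge("a", "b", capacity=1.0)
        G.add_edge("a", "t", capacity=2.0)
        G.add_edge("b", "t", capacity=3.0)

        # Maximum s-t flow via an augmenting-path (Ford--Fulkerson style) algorithm,
        # which runs in time polynomial in the size of the graph.
        flow_value, flow_dict = nx.maximum_flow(G, "s", "t", flow_func=edmonds_karp)
        print("max flow:", flow_value)  # 5.0 for this toy instance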

    Feedback control of quantum state reduction

    Feedback control of quantum mechanical systems must take into account the probabilistic nature of quantum measurement. We formulate quantum feedback control as a problem of stochastic nonlinear control by considering separately a quantum filtering problem and a state feedback control problem for the filter. We explore the use of stochastic Lyapunov techniques for the design of feedback controllers for quantum spin systems and demonstrate the possibility of stabilizing one outcome of a quantum measurement with unit probability.
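
    The flavour of stochastic Lyapunov design can be conveyed by a deliberately simplified, classical sketch; the scalar dynamics, the quadratic Lyapunov function and the gain below are assumptions for illustration and do not reproduce the quantum filtering equations or the spin-system controller of the paper.

        import numpy as np

        rng = np.random.default_rng(0)

        # Illustrative scalar SDE: dx = u(x) dt + sigma dW. The feedback u = -k*x makes
        # the quadratic Lyapunov function V(x) = x**2 decrease in expectation outside a
        # noise-dominated neighbourhood of the origin (Euler-Maruyama simulation).
        sigma, dt, steps = 0.2, 1e-3, 20000
        k = 2.0          # feedback gain (assumption)
        x = 1.0          # initial state

        for _ in range(steps):
            u = -k * x                        # state feedback pushing V(x) downward
            dW = rng.normal(0.0, np.sqrt(dt)) # Wiener increment
            x += u * dt + sigma * dW

        print("final state:", x, " V(x):", x ** 2)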

    Genetic programming for the automatic design of controllers for a surface ship

    In this paper, the implementation of genetic programming (GP) to design a controller structure is assessed. GP is used to evolve control strategies that, given the current and desired state of the propulsion and heading dynamics of a supply ship as inputs, generate the command forces required to maneuver the ship. The controllers created using GP are evaluated through computer simulations and real maneuverability tests in a laboratory water basin facility. The robustness of each controller is analyzed through the simulation of environmental disturbances. In addition, GP runs in the presence of disturbances are carried out so that the different controllers obtained can be compared. The particular vessel used in this paper is a scale model of a supply ship called CyberShip II. The results obtained illustrate the benefits of using GP for the automatic design of propulsion and navigation controllers for surface ships.
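
    A minimal, self-contained GP sketch can make the idea concrete; the expression-tree representation, the toy first-order heading model, and the mutation-only evolutionary loop below are illustrative assumptions and are far simpler than the controllers and the CyberShip II model used in the paper.

        import random

        random.seed(1)

        # Expression trees over the heading error e and its rate de; the evolved
        # tree outputs a (clipped) command force for a toy first-order ship model.
        OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
        TERMS = ["e", "de", "1.0", "0.5"]

        def random_tree(depth=3):
            if depth == 0 or random.random() < 0.3:
                return random.choice(TERMS)
            op = random.choice(list(OPS))
            return (op, random_tree(depth - 1), random_tree(depth - 1))

        def evaluate(tree, e, de):
            if isinstance(tree, str):
                if tree == "e":
                    return e
                if tree == "de":
                    return de
                return float(tree)
            op, left, right = tree
            return OPS[op](evaluate(left, e, de), evaluate(right, e, de))

        def fitness(tree):
            """Accumulated tracking error on a crude heading model (lower is better)."""
            heading, rate, dt, cost, target = 0.0, 0.0, 0.1, 0.0, 1.0
            for _ in range(200):
                e = target - heading
                u = max(-5.0, min(5.0, evaluate(tree, e, -rate)))  # clipped command force
                rate += (u - 0.5 * rate) * dt                      # crude inertia + damping
                heading += rate * dt
                cost += abs(e) * dt
            return cost

        def mutate(tree):
            if random.random() < 0.2 or isinstance(tree, str):
                return random_tree()
            op, left, right = tree
            return (op, mutate(left), mutate(right))

        # Mutation-only (mu + lambda)-style loop: keep the best trees, refill by mutation.
        population = [random_tree() for _ in range(30)]
        for _ in range(15):
            population.sort(key=fitness)
            population = population[:8] + [mutate(random.choice(population[:8])) for _ in range(22)]

        best = min(population, key=fitness)
        print("best controller tree:", best, " cost:", round(fitness(best), 3))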

    A Science of Reasoning

    This paper addresses the question of how we can understand reasoning in general and mathematical proofs in particular. It argues the need for a high-level understanding of proofs to complement the low-level understanding provided by Logic. It proposes a role for computation in providing this high-level understanding, namely by the association of proof plans with proofs. Proof plans are defined and examples are given for two families of proofs. Criteria are given for assessing the association of a proof plan with a proof.
    1 Motivation: the understanding of mathematical proofs. The understanding of reasoning has interested researchers since, at least, Aristotle. Logic has been proposed by Aristotle, Boole, Frege and others as a way of formalising arguments and understanding their structure. There have also been psychological studies of how people and animals actually do reason. The work on Logic has been especially influential in the automation of reasoning. For instance, resolution...

    An automatic part-of-speech tagger for Middle Low German

    Syntactically annotated corpora are highly important for enabling large-scale diachronic and diatopic language research. Such corpora have recently been developed for a variety of historical languages, or are still under development. One of those under development is the fully tagged and parsed Corpus of Historical Low German (CHLG), which is aimed at facilitating research into the highly under-researched diachronic syntax of Low German. The present paper reports on a crucial step in creating the corpus, viz. the creation of a part-of-speech tagger for Middle Low German (MLG). Having been transmitted in several non-standardised written varieties, MLG poses a challenge to standard POS taggers, which usually rely on normalised spelling. We outline the major issues faced in the creation of the tagger and present our solutions to them.
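
    A generic backoff-tagger sketch conveys the basic setup; the training sentences, tagset, and toy spelling normalisation below are invented for illustration and are not the CHLG data or the tagger actually built for the corpus.

        import nltk

        # Hypothetical tagged training data in MLG-like spelling variants
        # (invented examples; the real corpus, tagset, and normalisation rules differ).
        train_sents = [
            [("dat", "DET"), ("schip", "NOUN"), ("kumpt", "VERB")],
            [("de", "DET"), ("koopman", "NOUN"), ("seghet", "VERB")],
        ]

        def normalise(word):
            """Toy spelling normalisation to reduce orthographic variation before tagging."""
            return word.lower().replace("gh", "g").replace("ck", "k")

        normalised = [[(normalise(w), t) for w, t in sent] for sent in train_sents]

        # Backoff chain: bigram tagger falls back to unigram, then to a default tag.
        default = nltk.DefaultTagger("NOUN")
        unigram = nltk.UnigramTagger(normalised, backoff=default)
        bigram = nltk.BigramTagger(normalised, backoff=unigram)

        print(bigram.tag([normalise(w) for w in ["de", "schip", "kumpt"]]))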