
    JGraphT -- A Java library for graph data structures and algorithms

    Mathematical software and graph-theoretical algorithmic packages to efficiently model, analyze and query graphs are crucial in an era where large-scale spatial, societal and economic network data are abundantly available. One such package is JGraphT, a programming library which contains very efficient and generic graph data structures along with a large collection of state-of-the-art algorithms. The library is written in Java with stability, interoperability and performance in mind. A distinctive feature of this library is the ability to model vertices and edges as arbitrary objects, thereby permitting natural representations of many common networks, including transportation, social and biological networks. Besides classic graph algorithms such as shortest-path and spanning-tree algorithms, the library contains numerous advanced algorithms: graph and subgraph isomorphism; matching and flow problems; approximation algorithms for NP-hard problems such as independent set and TSP; and several more exotic algorithms such as Berge graph detection. Due to its versatility and generic design, JGraphT is currently used in large-scale commercial, non-commercial and academic research projects. In this work we describe in detail the design and underlying structure of the library, and discuss its most important features and algorithms. A computational study is conducted to evaluate the performance of JGraphT versus a number of similar libraries. Experiments on a large number of graphs over a variety of popular algorithms show that JGraphT is highly competitive with other established libraries such as NetworkX or the BGL. Comment: Major Revision
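
    As a flavour of the programming model the abstract describes, the minimal sketch below builds a small weighted graph with plain String vertices and runs a classic shortest-path query. Class and package names follow recent JGraphT 1.x releases and may differ slightly in other versions.

```java
import org.jgrapht.Graph;
import org.jgrapht.GraphPath;
import org.jgrapht.alg.shortestpath.DijkstraShortestPath;
import org.jgrapht.graph.DefaultWeightedEdge;
import org.jgrapht.graph.SimpleWeightedGraph;

public class JGraphTSketch {
    public static void main(String[] args) {
        // Vertices are plain Strings here; any object type may be used.
        Graph<String, DefaultWeightedEdge> g =
                new SimpleWeightedGraph<>(DefaultWeightedEdge.class);
        for (String city : new String[] {"Lisbon", "Madrid", "Paris"}) {
            g.addVertex(city);
        }
        g.setEdgeWeight(g.addEdge("Lisbon", "Madrid"), 625);
        g.setEdgeWeight(g.addEdge("Madrid", "Paris"), 1050);
        g.setEdgeWeight(g.addEdge("Lisbon", "Paris"), 1730);

        // A classic shortest-path query over arbitrary vertex objects.
        GraphPath<String, DefaultWeightedEdge> path =
                new DijkstraShortestPath<>(g).getPath("Lisbon", "Paris");
        System.out.println(path.getVertexList() + "  weight=" + path.getWeight());
    }
}
```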

    Applying MDL to Learning Best Model Granularity

    The Minimum Description Length (MDL) principle is solidly based on a provably ideal method of inference using Kolmogorov complexity. We test how the theory behaves in practice on a general problem in model selection: that of learning the best model granularity. The performance of a model depends critically on the granularity, for example the choice of precision of the parameters. Too high a precision generally involves modeling of accidental noise, and too low a precision may lead to confusion of models that should be distinguished. This precision is often determined ad hoc. In MDL the best model is the one that most compresses a two-part code of the data set: this embodies "Occam's Razor." In two quite different experimental settings the theoretical value determined using MDL coincides with the best value found experimentally. In the first experiment the task is to recognize isolated handwritten characters in one subject's handwriting, irrespective of size and orientation. Based on a new modification of elastic matching, using multiple prototypes per character, the optimal prediction rate is predicted for the learned parameter (length of sampling interval) considered most likely by MDL, which is shown to coincide with the best value found experimentally. In the second experiment the task is to model a robot arm with two degrees of freedom using a three-layer feed-forward neural network, where we need to determine the number of nodes in the hidden layer giving best modeling performance. The optimal model (the one that extrapolates best on unseen examples) is predicted for the number of nodes in the hidden layer considered most likely by MDL, which again is found to coincide with the best value found experimentally. Comment: LaTeX, 32 pages, 5 figures. Artificial Intelligence journal, to appear
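
    The two-part code is easy to exercise on a toy granularity problem. The sketch below is an illustration of the general idea only, not the paper's handwriting or robot-arm experiments: it picks a bin count k for a histogram over byte-valued samples by minimizing a crude model cost of roughly log2(n+1) bits per bin count plus the cost of the data encoded with that histogram. Too few bins inflate the data part; too many inflate the model part, exactly the trade-off the abstract describes.

```java
import java.util.Random;

public class MdlGranularity {
    // Two-part code length, in bits, for modelling byte-valued samples
    // with a histogram of k equal-width bins (k must divide 256).
    static double codeLength(int[] data, int k) {
        int n = data.length;
        int width = 256 / k;
        int[] counts = new int[k];
        for (int x : data) counts[x / width]++;

        // Part one: transmit the k bin counts, crudely ~log2(n+1) bits each.
        double modelBits = k * (Math.log(n + 1) / Math.log(2));

        // Part two: transmit the data given the histogram. A sample in bin b
        // costs -log2(count_b / n) bits to name its bin, plus log2(width)
        // bits to locate it uniformly inside the bin.
        double dataBits = 0;
        for (int c : counts) {
            if (c == 0) continue;
            dataBits += c * (-Math.log((double) c / n) / Math.log(2)
                    + Math.log(width) / Math.log(2));
        }
        return modelBits + dataBits;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        int[] data = new int[2000];
        for (int i = 0; i < data.length; i++) {
            // A bimodal source: too few bins blur the two modes, while
            // too many bins spend model bits encoding sampling noise.
            int mean = rnd.nextBoolean() ? 64 : 192;
            data[i] = Math.max(0, Math.min(255, mean + (int) (rnd.nextGaussian() * 20)));
        }
        for (int k = 2; k <= 256; k *= 2) {
            System.out.printf("k=%3d  total bits=%.0f%n", k, codeLength(data, k));
        }
    }
}
```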

    A Comparative Study of Efficient Initialization Methods for the K-Means Clustering Algorithm

    K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization methods have been proposed to address this problem. In this paper, we first present an overview of these methods with an emphasis on their computational efficiency. We then compare eight commonly used linear time complexity initialization methods on a large and diverse collection of data sets using various performance criteria. Finally, we analyze the experimental results using non-parametric statistical tests and provide recommendations for practitioners. We demonstrate that popular initialization methods often perform poorly and that there are in fact strong alternatives to these methods. Comment: 17 pages, 1 figure, 7 tables
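
    As a concrete example of the kind of linear-time seeding method such studies compare, here is a sketch (not taken from the paper itself) of k-means++ seeding (Arthur and Vassilvitskii, 2007): the first center is drawn uniformly at random, and each subsequent center is drawn with probability proportional to its squared distance from the nearest center chosen so far. Each round is O(nd) per existing center, i.e. linear in the number of points.

```java
import java.util.Arrays;
import java.util.Random;

public class KMeansPlusPlus {
    // k-means++ seeding: first center uniform at random, then each further
    // center sampled with probability proportional to its squared distance
    // to the nearest already-chosen center.
    static double[][] seed(double[][] points, int k, Random rnd) {
        int n = points.length;
        double[][] centers = new double[k][];
        centers[0] = points[rnd.nextInt(n)];
        double[] d2 = new double[n];
        for (int c = 1; c < k; c++) {
            double total = 0;
            for (int i = 0; i < n; i++) {
                d2[i] = Double.MAX_VALUE;
                for (int j = 0; j < c; j++) {
                    d2[i] = Math.min(d2[i], sqDist(points[i], centers[j]));
                }
                total += d2[i];
            }
            // Sample an index proportionally to d2 (roulette-wheel style).
            double r = rnd.nextDouble() * total;
            int pick = 0;
            for (double acc = d2[0]; acc < r && pick < n - 1; acc += d2[++pick]) { }
            centers[c] = points[pick];
        }
        return centers;
    }

    static double sqDist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double t = a[i] - b[i]; s += t * t; }
        return s;
    }

    public static void main(String[] args) {
        double[][] pts = { {0, 0}, {0, 1}, {10, 10}, {10, 11}, {20, 0} };
        for (double[] c : seed(pts, 3, new Random(7))) {
            System.out.println(Arrays.toString(c));
        }
    }
}
```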

    An overview of decision table literature 1982-1995.

    This report gives an overview of the literature on decision tables over the past 15 years. As much as possible, for each reference an author-supplied abstract, a number of keywords and a classification are provided. In some cases our own comments are added. The purpose of these comments is to show where, how and why decision tables are used. The literature is classified according to application area, theoretical versus practical character, year of publication, country of origin (not necessarily country of publication) and the language of the document. After a description of the scope of the review, classification results and the classification by topic are presented. The main body of the paper is the ordered list of publications with abstract, classification and comments.

    Automating iterative tasks with programming by demonstration

    Programming by demonstration is an end-user programming technique that allows people to create programs by showing the computer examples of what they want to do. Users do not need specialised programming skills. Instead, they instruct the computer by demonstrating examples, much as they might show another person how to do the task. Programming by demonstration empowers users to create programs that perform tedious and time-consuming computer chores. However, it is not in widespread use, and is instead confined to research applications that end users never see. This makes it difficult to evaluate programming by demonstration tools and techniques. This thesis claims that domain-independent programming by demonstration can be made available in existing applications and used to automate iterative tasks by end users. It is supported by Familiar, a domain-independent, AppleScript-based programming-by-demonstration tool embodying standard machine learning algorithms. Familiar is designed for end users, and so works in the existing applications that they regularly use. The assertion that programming by demonstration can be made available in existing applications is validated by identifying the relevant platform requirements and a range of platforms that meet them. A detailed scrutiny of AppleScript highlights problems with the architecture and with many implementations, and yields a set of guidelines for designing applications that support programming by demonstration. An evaluation shows that end users are capable of using programming by demonstration to automate iterative tasks. However, the subjects tended to prefer other tools, choosing Familiar only when the alternatives were unsuitable or unavailable. Familiar's inferencing is evaluated on an extensive set of examples, highlighting the tasks it can perform and the functionality it requires.
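
    The kind of inference such a tool performs can be sketched in miniature. The toy below is an illustration of our own, not Familiar's actual AppleScript-based engine: from two demonstrated outputs that differ only in an embedded integer, it infers the arithmetic step and predicts the next action in the iteration.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DemoInference {
    // Given two demonstrated outputs that differ only in an embedded
    // integer, infer the arithmetic step and predict the next output.
    static String predictNext(List<String> demos) {
        Pattern p = Pattern.compile("(\\D*)(\\d+)(\\D*)");
        Matcher m1 = p.matcher(demos.get(demos.size() - 2));
        Matcher m2 = p.matcher(demos.get(demos.size() - 1));
        if (!m1.matches() || !m2.matches()
                || !m1.group(1).equals(m2.group(1))
                || !m1.group(3).equals(m2.group(3))) {
            return null; // no common pattern found; ask for more demonstrations
        }
        int a = Integer.parseInt(m1.group(2));
        int b = Integer.parseInt(m2.group(2));
        return m2.group(1) + (b + (b - a)) + m2.group(3);
    }

    public static void main(String[] args) {
        // Two demonstrations are enough to fix this simple hypothesis.
        List<String> demos = List.of("photo-1.jpg", "photo-2.jpg");
        System.out.println(predictNext(demos)); // photo-3.jpg
    }
}
```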

    Uses and applications of artificial intelligence in manufacturing

    The purpose of this thesis is to provide engineers and other personnel with an overview of the concepts that underlie Artificial Intelligence and Expert Systems. Artificial Intelligence is concerned with the development of theories and techniques required to provide a computational engine with the ability to perceive, think and act in an intelligent manner in a complex environment. An expert system is a branch of Artificial Intelligence in which the methods of reasoning emulate those of human experts. Artificial Intelligence derives its power from its ability to represent complex forms of knowledge, some of it common-sense, heuristic and symbolic, and from its ability to apply that knowledge in searching for solutions. The thesis reviews: the components of an intelligent system; the basics of knowledge representation; search-based problem solving methods; expert system technologies; and uses and applications of AI in manufacturing areas such as design, process planning, production management, energy management, quality assurance, manufacturing simulation, robotics and machine vision. The prime objectives of the thesis are to explain the basic concepts underlying Artificial Intelligence and to identify where the technology may be applied in the field of Manufacturing Engineering.
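
    The expert-system style of reasoning surveyed here can be illustrated with a minimal forward-chaining rule engine. This is a generic sketch with made-up manufacturing facts and rules, not a system described in the thesis: rules whose conditions are all satisfied by the known facts fire and add their conclusion, until a fixed point is reached.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ForwardChaining {
    // An if-then rule: if all conditions are known facts, assert the conclusion.
    record Rule(Set<String> conditions, String conclusion) {}

    // Repeatedly fire any applicable rule until nothing new can be derived.
    static Set<String> infer(Set<String> facts, List<Rule> rules) {
        Set<String> known = new HashSet<>(facts);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule r : rules) {
                if (known.containsAll(r.conditions()) && known.add(r.conclusion())) {
                    changed = true;
                }
            }
        }
        return known;
    }

    public static void main(String[] args) {
        List<Rule> rules = new ArrayList<>(List.of(
                new Rule(Set.of("vibration-high", "temperature-high"), "bearing-wear"),
                new Rule(Set.of("bearing-wear"), "schedule-maintenance")));
        // Derives bearing-wear, then schedule-maintenance, from the two symptoms.
        System.out.println(infer(Set.of("vibration-high", "temperature-high"), rules));
    }
}
```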

    Integrating protein structural information

    Dissertation presented to obtain the degree of Doctor in Biochemistry (Structural Biochemistry) from the Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia. The central theme of this work is the application of constraint programming and other artificial intelligence techniques to protein structure problems, with the goal of better combining experimental data with structure prediction methods. Part one of the dissertation introduces the main subjects of protein structure and constraint programming, summarises the state of the art in the modelling of protein structures and complexes, sets the context for the techniques described later on, and outlines the main points of the thesis: the integration of experimental data in modelling. The first chapter, Protein Structure, introduces the reader to the basic notions of amino acid structure, protein chains, and protein folding and interaction. These are important concepts for understanding the work described in parts two and three. Chapter two, Protein Modelling, gives a brief overview of experimental and theoretical techniques to model protein structures. The information in this chapter provides the context of the investigations described in parts two and three, but is not essential to understanding the methods developed. Chapter three, Constraint Programming, outlines the main concepts of this programming technique. Understanding variable modelling, the notions of consistency and propagation, and search methods should greatly help the reader interested in the details of the algorithms, as described in part two of this book. The fourth chapter, Integrating Structural Information, is a summary of the thesis proposed here. This chapter is an overview of the objectives of this work, and gives an idea of how the algorithms developed here could help in modelling protein structures. The main goal is to provide a flexible and continuously evolving framework for the integration of structural information from a diversity of experimental techniques and theoretical predictions. Part two describes the algorithms developed, which make up the main original contribution of this work. This part is aimed especially at developers interested in the details of the algorithms, in replicating the results, in improving the method or in integrating them in other applications. Biochemical aspects are dealt with briefly and as necessary, and the emphasis is on the algorithms and the code.
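
    The consistency and propagation notions from chapter three can be shown with a tiny AC-3-style pruning loop. This is a generic sketch of arc consistency with a made-up distance restraint between two "positions" (a toy stand-in for a constraint between residues), not the dissertation's actual code: each value that has no support in a neighbouring domain is removed, and affected arcs are revisited until a fixed point is reached.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiPredicate;

public class Ac3Sketch {
    // A directed arc: prune domain of variable a against variable b.
    record Arc(int a, int b, BiPredicate<Integer, Integer> ok) {}

    // AC-3: drop every value lacking support in the neighbouring domain,
    // re-enqueueing arcs that point at any domain that shrank.
    static void ac3(List<Set<Integer>> domains, List<Arc> arcs) {
        Deque<Arc> queue = new ArrayDeque<>(arcs);
        while (!queue.isEmpty()) {
            Arc arc = queue.poll();
            boolean removed = domains.get(arc.a()).removeIf(x ->
                    domains.get(arc.b()).stream().noneMatch(y -> arc.ok().test(x, y)));
            if (removed) {
                for (Arc other : arcs) {
                    if (other.b() == arc.a()) queue.add(other);
                }
            }
        }
    }

    public static void main(String[] args) {
        // Two positions whose separation must be exactly 2 units.
        List<Set<Integer>> domains = List.of(
                new HashSet<>(List.of(1, 2, 3, 4)),
                new HashSet<>(List.of(1, 2)));
        BiPredicate<Integer, Integer> dist2 = (x, y) -> Math.abs(x - y) == 2;
        List<Arc> arcs = List.of(new Arc(0, 1, dist2), new Arc(1, 0, dist2));
        ac3(domains, arcs);
        System.out.println(domains); // domains pruned to {3, 4} and {1, 2}
    }
}
```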

