JGraphT -- A Java library for graph data structures and algorithms
Mathematical software and graph-theoretical algorithmic packages to
efficiently model, analyze and query graphs are crucial in an era where
large-scale spatial, societal and economic network data are abundantly
available. One such package is JGraphT, a programming library which contains
efficient and generic graph data structures along with a large collection
of state-of-the-art algorithms. The library is written in Java with stability,
interoperability and performance in mind. A distinctive feature of this library
is the ability to model vertices and edges as arbitrary objects, thereby
permitting natural representations of many common networks including
transportation, social and biological networks. Besides classic graph
algorithms such as shortest-paths and spanning-tree algorithms, the library
contains numerous advanced algorithms: graph and subgraph isomorphism; matching
and flow problems; approximation algorithms for NP-hard problems such as
independent set and TSP; and several more exotic algorithms such as Berge graph
detection. Due to its versatility and generic design, JGraphT is currently used
in large-scale commercial, non-commercial and academic research projects. In
this work we describe in detail the design and underlying structure of the
library, and discuss its most important features and algorithms. A
computational study is conducted to evaluate the performance of JGraphT versus
a number of similar libraries. Experiments on a large number of graphs over a
variety of popular algorithms show that JGraphT is highly competitive with
other established libraries such as NetworkX or the BGL.
Comment: Major Revision
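The distinctive feature described above, modelling vertices as arbitrary objects, can be illustrated with a small language-neutral sketch. The following pure-Python toy is not JGraphT's actual Java API; the `Station` class and the tiny network are invented for illustration. It shows object vertices flowing through a standard Dijkstra shortest-path computation, the kind of query JGraphT's `DijkstraShortestPath` class answers:

```python
import heapq
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class Station:
    """Any hashable object can serve as a vertex (hypothetical example type)."""
    name: str

def dijkstra(adj, source):
    """adj maps vertex -> list of (neighbor, weight); returns distances from source."""
    counter = itertools.count()          # tie-breaker so vertices need not be orderable
    dist = {source: 0.0}
    heap = [(0.0, next(counter), source)]
    done = set()
    while heap:
        d, _, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, next(counter), v))
    return dist

a, b, c = Station("A"), Station("B"), Station("C")
adj = {a: [(b, 1.0), (c, 5.0)], b: [(c, 2.0)]}
print(dijkstra(adj, a)[c])  # 3.0: the A->B->C path beats the direct A->C edge
```

Because the vertices are ordinary domain objects rather than integer IDs, the same code models transportation, social or biological networks without a translation layer.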
Applying MDL to Learning Best Model Granularity
The Minimum Description Length (MDL) principle is solidly based on a provably
ideal method of inference using Kolmogorov complexity. We test how the theory
behaves in practice on a general problem in model selection: that of learning
the best model granularity. The performance of a model depends critically on
the granularity, for example the choice of precision of the parameters. Too
high a precision generally leads to modeling accidental noise, while too low
a precision may lead to confusion of models that should be distinguished. This
precision is often determined ad hoc. In MDL the best model is the one that
most compresses a two-part code of the data set: this embodies ``Occam's
Razor.'' In two quite different experimental settings the theoretical value
determined using MDL coincides with the best value found experimentally. In the
first experiment the task is to recognize isolated handwritten characters in
one subject's handwriting, irrespective of size and orientation. Based on a new
modification of elastic matching, using multiple prototypes per character, the
optimal prediction rate is predicted for the learned parameter (length of
sampling interval) considered most likely by MDL, which is shown to coincide
with the best value found experimentally. In the second experiment the task is
to model a robot arm with two degrees of freedom using a three layer
feed-forward neural network where we need to determine the number of nodes in
the hidden layer giving best modeling performance. The optimal model (the one
that extrapolates best on unseen examples) is predicted for the number of nodes
in the hidden layer considered most likely by MDL, which again is found to
coincide with the best value found experimentally.
Comment: LaTeX, 32 pages, 5 figures. Artificial Intelligence journal, to appear
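The two-part-code idea behind this abstract can be sketched numerically. The toy below is not the paper's handwriting or neural-network experiment; it selects polynomial degree (a stand-in for model granularity) by minimizing model bits plus a Gaussian code length for the residuals, with the 16-bits-per-parameter cost chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
x = np.linspace(-1.0, 1.0, n)
y = 2.0 * x + 0.1 * rng.standard_normal(n)   # truly linear data plus noise

def two_part_code_length(deg, bits_per_param=16):
    """MDL score = bits to describe the model + bits to describe data given model."""
    coeffs = np.polyfit(x, y, deg)
    resid = y - np.polyval(coeffs, x)
    rss = float(resid @ resid)
    model_bits = (deg + 1) * bits_per_param               # first part of the code
    data_bits = 0.5 * n * np.log2(max(rss / n, 1e-12))    # Gaussian code for residuals
    return model_bits + data_bits

best = min(range(9), key=two_part_code_length)
print(best)  # degree 1: higher degrees only compress accidental noise
```

Too coarse a model (degree 0) pays heavily in the data part of the code; too fine a model (degree 8) pays in the model part, which is exactly the granularity trade-off the abstract describes.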
A Comparative Study of Efficient Initialization Methods for the K-Means Clustering Algorithm
K-means is undoubtedly the most widely used partitional clustering algorithm.
Unfortunately, due to its gradient descent nature, this algorithm is highly
sensitive to the initial placement of the cluster centers. Numerous
initialization methods have been proposed to address this problem. In this
paper, we first present an overview of these methods with an emphasis on their
computational efficiency. We then compare eight commonly used linear-time
initialization methods on a large and diverse collection of data
sets using various performance criteria. Finally, we analyze the experimental
results using non-parametric statistical tests and provide recommendations for
practitioners. We demonstrate that popular initialization methods often perform
poorly and that there are in fact strong alternatives to these methods.
Comment: 17 pages, 1 figure, 7 tables
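One family of linear-time seeding rules studied in this literature is farthest-first (maximin) initialization. The following minimal sketch, written for illustration rather than taken from the paper's code, starts from an arbitrary point and repeatedly adds the point farthest from the centers chosen so far:

```python
def maximin_seeds(points, k):
    """Farthest-first (maximin) seeding for k-means (illustrative sketch)."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    centers = [points[0]]                 # arbitrary starting point (here: the first)
    while len(centers) < k:
        # Next center: the point whose nearest chosen center is farthest away.
        centers.append(max(points, key=lambda p: min(d2(p, c) for c in centers)))
    return centers

pts = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 9)]
print(maximin_seeds(pts, 3))  # [(0, 0), (5, 9), (10, 0)]
```

Each iteration touches every point once, so the whole seeding is linear in the number of points per center, which is the efficiency class the comparison above focuses on.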
An overview of decision table literature 1982-1995.
This report gives an overview of the literature on decision tables over the past 15 years. As far as possible, each reference is accompanied by an author-supplied abstract, a number of keywords and a classification. In some cases our own comments are added; their purpose is to show where, how and why decision tables are used. The literature is classified according to application area, theoretical versus practical character, year of publication, country of origin (not necessarily country of publication) and the language of the document. After a description of the scope of the review, classification results and the classification by topic are presented. The main body of the paper is the ordered list of publications with abstract, classification and comments.
Automating iterative tasks with programming by demonstration
Programming by demonstration is an end-user programming technique that allows people to create programs by showing the computer examples of what they want to do. Users do not need specialised programming skills. Instead, they instruct the computer by demonstrating examples, much as they might show another person how to do the task. Programming by demonstration empowers users to create programs that perform tedious and time-consuming computer chores. However, it is not in widespread use, and is instead confined to research applications that end users never see. This makes it difficult to evaluate programming by demonstration tools and techniques.
This thesis claims that domain-independent programming by demonstration can be made available in existing applications and used to automate iterative tasks by end users. It is supported by Familiar, a domain-independent, AppleScript-based programming-by-demonstration tool embodying standard machine learning algorithms. Familiar is designed for end users, so works in the existing applications that they regularly use.
The assertion that programming by demonstration can be made available in existing applications is validated by identifying the relevant platform requirements and a range of platforms that meet them. A detailed scrutiny of AppleScript highlights problems with the architecture and with many implementations, and yields a set of guidelines for designing applications that support programming by demonstration.
An evaluation shows that end users are capable of using programming by demonstration to automate iterative tasks. However, the subjects tended to prefer other tools, choosing Familiar only when the alternatives were unsuitable or unavailable. Familiar's inferencing is evaluated on an extensive set of examples, highlighting the tasks it can perform and the functionality it requires.
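The kind of inference a programming-by-demonstration tool performs can be sketched in a few lines. The toy below is not Familiar's algorithm; it is an invented minimal example of generalizing from two demonstrated actions that differ only in an embedded number, then predicting the next iteration:

```python
import re

def infer_next(demos):
    """From demonstrations differing only in one embedded integer, infer the
    arithmetic step and predict the next action (toy PBD-style inference)."""
    nums = [int(re.search(r"\d+", d).group()) for d in demos]
    step = nums[1] - nums[0]
    template = re.sub(r"\d+", "{}", demos[-1], count=1)
    return template.format(nums[-1] + step)

# Two demonstrated actions are enough to suggest the iterative pattern.
print(infer_next(["open chapter-1.txt", "open chapter-2.txt"]))
```

Real systems like Familiar must of course entertain many competing hypotheses and rank them, which is where the standard machine learning algorithms mentioned above come in.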
Uses and applications of artificial intelligence in manufacturing
The purpose of this thesis is to provide engineers and other personnel with an overview of the concepts that underlie Artificial Intelligence and Expert Systems. Artificial Intelligence is concerned with the development of the theories and techniques required to provide a computational engine with the ability to perceive, think and act in an intelligent manner in a complex environment.
An expert system is a branch of Artificial Intelligence in which the methods of reasoning emulate those of human experts. Artificial Intelligence derives its power from its ability to represent complex forms of knowledge, some of it common-sense, heuristic and symbolic, and from the ability to apply that knowledge in searching for solutions.
The thesis reviews: the components of an intelligent system, the basics of knowledge representation, search-based problem-solving methods, expert system technologies, and the uses and applications of AI in manufacturing areas such as Design, Process Planning, Production Management, Energy Management, Quality Assurance, Manufacturing Simulation, Robotics and Machine Vision.
The prime objectives of the thesis are to convey the basic concepts underlying Artificial Intelligence and to identify where the technology may be applied in the field of Manufacturing Engineering.
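The expert-system style of reasoning mentioned above can be illustrated with a minimal forward-chaining sketch. The rules here are hypothetical manufacturing-diagnosis rules invented for the example, not taken from the thesis:

```python
def forward_chain(facts, rules):
    """Fire (premises -> conclusion) rules until no new facts appear:
    a minimal sketch of rule-based expert-system inference."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# Hypothetical diagnosis rules for a machining cell.
rules = [
    ({"spindle_vibration", "tool_worn"}, "replace_tool"),
    ({"replace_tool"}, "schedule_downtime"),
]
print(forward_chain({"spindle_vibration", "tool_worn"}, rules))
```

The knowledge is represented declaratively as rules, and the search for a solution is the repeated matching of rule premises against the growing fact base, the two abilities the abstract credits AI with.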
Integrating protein structural information
Dissertation submitted for the degree of Doctor in Biochemistry (Structural Biochemistry) at the Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia.
The central theme of this work is the application of constraint programming and other artificial intelligence techniques to protein structure problems, with the goal of better combining experimental data with structure prediction methods.
Part one of the dissertation introduces the main subjects of protein structure and constraint programming, summarises the state of the art in the modelling of protein structures and complexes, sets the context for the techniques described later on, and outlines the main points of the thesis: the integration of experimental data in modelling.
The first chapter, Protein Structure, introduces the reader to the basic notions of amino acid structure, protein chains, and protein folding and interaction. These are important concepts to understand the work described in parts two and three.
Chapter two, Protein Modelling, gives a brief overview of experimental and theoretical techniques to model protein structures. The information in this chapter provides the context for the investigations described in parts two and three, but is not essential to understanding the methods developed.
Chapter three, Constraint Programming, outlines the main concepts of this programming technique. Understanding variable modelling, the notions of consistency and propagation, and search methods should greatly help the reader interested in the details of the algorithms, as described in part two of this book.
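The core loop that chapter describes, variables with finite domains, propagation to prune them, and backtracking search, can be sketched compactly. The solver and the toy ordering problem below are invented for illustration and are far simpler than the protein-structure constraints handled in part two:

```python
def solve(domains, constraints):
    """Backtracking search with forward checking over finite domains.
    domains: var -> set of values; constraints: list of (x, y, predicate)."""
    if any(not d for d in domains.values()):
        return None                                   # a wiped-out domain: dead end
    if all(len(d) == 1 for d in domains.values()):
        sol = {v: next(iter(d)) for v, d in domains.items()}
        ok_all = all(ok(sol[x], sol[y]) for x, y, ok in constraints)
        return sol if ok_all else None
    var = min((v for v, d in domains.items() if len(d) > 1),
              key=lambda v: len(domains[v]))          # branch on the smallest domain
    for val in sorted(domains[var]):
        pruned = dict(domains)
        pruned[var] = {val}
        for x, y, ok in constraints:                  # propagation: filter neighbours
            if x == var:
                pruned[y] = {w for w in pruned[y] if ok(val, w)}
            elif y == var:
                pruned[x] = {w for w in pruned[x] if ok(w, val)}
        result = solve(pruned, constraints)
        if result is not None:
            return result
    return None

# Toy problem: place three items at positions 1..3 in increasing order.
domains = {"A": {1, 2, 3}, "B": {1, 2, 3}, "C": {1, 2, 3}}
constraints = [("A", "B", lambda a, b: a < b), ("B", "C", lambda a, b: a < b)]
print(solve(domains, constraints))
```

Each branching decision is followed by propagation that shrinks the neighbouring domains, so inconsistent choices are detected before the search descends into them; that interplay of consistency and search is what the chapter develops in detail.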
The fourth chapter, Integrating Structural Information, is a summary of the thesis proposed here.
This chapter is an overview of the objectives of this work, and gives an idea of how the algorithms developed here could help in modelling protein structures. The main goal is to provide a flexible and continuously evolving framework for the integration of structural information from a diversity of experimental techniques and theoretical predictions.
Part two describes the algorithms developed, which make up the main original contribution of this work. This part is aimed especially at developers interested in the details of the algorithms, in replicating the results, in improving the method or in integrating them in other applications.
Biochemical aspects are dealt with briefly and as necessary, and the emphasis is on the algorithms and the code.