6 research outputs found

Substructure Discovery Using Minimum Description Length and Background Knowledge

Author: Cook D. J.
Holder L. B.
Publication date: 01/01/1994

The ability to identify interesting and repetitive substructures is an essential component to discovering knowledge in structural data. We describe a new version of our SUBDUE substructure discovery system based on the minimum description length principle. The SUBDUE system discovers substructures that compress the original data and represent structural concepts in the data. By replacing previously-discovered substructures in the data, multiple passes of SUBDUE produce a hierarchical description of the structural regularities in the data. SUBDUE uses a computationally-bounded inexact graph match that identifies similar, but not identical, instances of a substructure and finds an approximate measure of closeness of two substructures when under computational constraints. In addition to the minimum description length principle, other background knowledge can be used by SUBDUE to guide the search towards more appropriate substructures. Experiments in a variety of domains demonstrate SUBDUE's ability to find substructures capable of compressing the original data and to discover structural concepts important to the domain. Description of Online Appendix: This is a compressed tar file containing the SUBDUE discovery system, written in C. The program accepts as input databases represented in graph form, and will output discovered substructures with their corresponding value. Comment: See http://www.jair.org/ for an online appendix and other files accompanying this article.
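
The compression idea in this abstract can be illustrated with a minimal sketch. It is not the SUBDUE implementation (which is written in C), and it rests on several assumptions that do not come from the abstract: description length is approximated as one unit per vertex and per edge rather than a bit-level encoding, the instances of the substructure are supplied by hand rather than discovered or inexactly matched, and a substructure is scored by the ratio DL(G) / (DL(S) + DL(G|S)). All names (`Graph`, `description_length`, `compress`, `compression_value`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Graph:
    vertices: dict  # vertex id -> label
    edges: set      # (source id, target id, label) triples


def description_length(g):
    # Assumption: a crude proxy, one unit per vertex and per edge.
    return len(g.vertices) + len(g.edges)


def compress(g, instances, sub_label="SUB"):
    """Replace each instance (a set of vertex ids) with a single vertex."""
    owner = {}
    for i, inst in enumerate(instances):
        for v in inst:
            owner[v] = f"{sub_label}_{i}"
    # Keep vertices outside every instance, then add one vertex per instance.
    vertices = {v: lab for v, lab in g.vertices.items() if v not in owner}
    for name in set(owner.values()):
        vertices[name] = sub_label
    edges = set()
    for s, t, lab in g.edges:
        s2, t2 = owner.get(s, s), owner.get(t, t)
        if s in owner and s2 == t2:
            continue  # edge lies inside one instance, so it is absorbed
        edges.add((s2, t2, lab))  # boundary edges are redirected
    return Graph(vertices, edges)


def compression_value(g, sub, instances):
    # Assumed score: values above 1 mean the substructure compresses the data.
    compressed = compress(g, instances)
    return description_length(g) / (
        description_length(sub) + description_length(compressed)
    )


# Toy data: two triangles, each resting on a shared "square" vertex.
g = Graph(
    vertices={1: "a", 2: "b", 3: "c", 4: "a", 5: "b", 6: "c", 7: "square"},
    edges={
        (1, 2, "e"), (2, 3, "e"), (3, 1, "e"),
        (4, 5, "e"), (5, 6, "e"), (6, 4, "e"),
        (3, 7, "on"), (6, 7, "on"),
    },
)
triangle = Graph(
    vertices={1: "a", 2: "b", 3: "c"},
    edges={(1, 2, "e"), (2, 3, "e"), (3, 1, "e")},
)
instances = [{1, 2, 3}, {4, 5, 6}]

compressed = compress(g, instances)
value = compression_value(g, triangle, instances)
print(description_length(g), description_length(compressed), round(value, 3))
```

Running `compress` again on its own output, with a newly chosen substructure, would mimic the multiple passes the abstract describes, with each pass adding one level to the hierarchy.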

arXiv.org e-Print Archive

CiteSeerX

Analysis of Three-Dimensional Protein Images

Author: Baxter K.
Fortier S.
Glasgow J.
Leherte L.
Steeg E.
Publication date: 01/01/1997

A fundamental goal of research in molecular biology is to understand protein structure. Protein crystallography is currently the most successful method for determining the three-dimensional (3D) conformation of a protein, yet it remains labor intensive and relies on an expert's ability to derive and evaluate a protein scene model. In this paper, the problem of protein structure determination is formulated as an exercise in scene analysis. A computational methodology is presented in which a 3D image of a protein is segmented into a graph of critical points. Bayesian and certainty factor approaches are described and used to analyze critical point graphs and identify meaningful substructures, such as alpha-helices and beta-sheets. Results of applying the methodologies to protein images at low and medium resolution are reported. The research is related to approaches to representation, segmentation and classification in vision, as well as to top-down approaches to protein structure prediction. Comment: See http://www.jair.org/ for any accompanying files.

arXiv.org e-Print Archive

CiteSeerX

Repository of the University of Namur

Machine discovery of protein motifs

Author: A. Sali
B. Falkenhainer
C. Chothia
C. S. Ring
D. Conklin
D. H. Fisher
D. Haussler
D. T. Jones
Darrell Conklin
F. C. Bernstein
F. E. Cohen
G. Salton
J. H. Gennari
J. Larkin
J. M. Thornton
J. W. Ponder
M. J. E. Sternberg
M. J. Rooman
M. Lebowitz
M. Levitt
N. Colloc'h
P. H. Winston
R. D. King
R. F. Smith
R. H. Lathrop
R. Nussinov
R. Unger
S. J. Prestrelski
S. Markovitch
T. A. Jones
T. L. Blundell
W. Kabsch
W. R. Taylor
X. Zhang
Y. Matsuo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/1995

Crossref

The learnability of description logics with equality constraints

Author: A. Blumer
A. Borgida
A. Frisch
D. Angluin
D. Bobrow
D. Conklin
D. Haussler
D. Helmbold
F. Pfenning
G. D. Plotkin
Haym Hirsh
J. R. Quinlan
K. Morik
L. G. Valiant
L. Pitt
M. Gold
M. Kearns
M. R. Quillian
M. Vilain
N. Littlestone
P. Devanbu
P. Idestam-Almquist
R. L. Rivest
S. Muggleton
T. G. Dietterich
W. Buntine
W. W. Cohen
William W. Cohen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/1994

Crossref

Explaining Data Patterns using Knowledge from the Web of Data

Author: Tiddi Ilaria
Publication date: 28/11/2016

Knowledge Discovery (KD) is a long-established field that aims to develop methodologies for detecting hidden patterns and regularities in large datasets, using techniques from a wide range of domains, such as statistics, machine learning, pattern recognition or data visualisation. In most real-world contexts, the interpretation and explanation of the discovered patterns is left to human experts, whose work is to use their background knowledge to analyse, refine and make the patterns understandable for the intended purpose. Explaining patterns is therefore an intensive and time-consuming process, where parts of the knowledge can remain unrevealed, especially when the experts lack some of the required background knowledge. In this thesis, we investigate the hypothesis that such an interpretation process can be facilitated by introducing background knowledge from the Web of (Linked) Data. In the last decade, many areas started publishing and sharing their domain-specific knowledge in the form of structured data, with the objective of encouraging information sharing, reuse and discovery. With a constantly increasing amount of shared and connected knowledge, we thus assume that the process of explaining patterns can become easier, faster, and more automated. To demonstrate this, we developed Dedalo, a framework that automatically provides explanations for patterns of data using the background knowledge extracted from the Web of Data. We studied the elements required for a piece of information to be considered an explanation, identified the best strategies to automatically find the right piece of information in the Web of Data, and designed a process able to produce explanations for a given pattern using the background knowledge autonomously collected from the Web of Data. The final evaluation of Dedalo involved users within an empirical study based on a real-world scenario.
We demonstrated that the explanation process is complex when one is not familiar with the domain of usage, but also that it can be considerably simplified when using the Web of Data as a source of background knowledge.

Open Research Online (The Open University)

Spatial analogy and subsumption

Author: Aha
Buntine
Eshera
Falkenhainer
Funt
Gennari
Gentner
Glasgow
Hall
Haralick
Hayes
Jones
Lebowitz
Nebel
Quinlan
Thompson
Publication venue: 'Elsevier BV'
Publication date: 01/01/1992

Crossref