Search CORE

52 research outputs found

Using Ontologies for Extracting Product Features from Web Pages

Author: C. Patel
D.W. Embley
H. Alani
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Abstract. In this paper, we show how to use ontologies to bootstrap a knowledge acquisition process that extracts product information from tabular data on Web pages. Furthermore, we use logical rules to reason about product-specific properties and to derive higher-order knowledge about product features. We will also explain the knowledge acquisition process, covering both ontological and procedural aspects. Finally, we will give a qualitative and quantitative evaluation of our results.

CiteSeerX

Crossref

Learning from text-based close call data

Author: Allen J.F.
Bliss J.P.
Church K.W.
Andriulo S.
Bird F.E.
Dale R.
Davies J.B.
Dillon R.L.
Easton J.M.
Embley D.W.
Gnoni M.G.
Heinrich H.W.
Hoinaru O.
Jones S.
Macrae C.
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2015

A key feature of big data is the variety of data sources that are available, which include not just numerical data but also image or video data, or even free text. The GB railways collect a large volume of free-text data daily from railway workers in the form of close call hazard reports: instances where an accident could have occurred but did not. These close call reports contain valuable safety information that could be useful in managing safety on the railway, but that can be lost in the very large volume of data, much larger than is viable for a human analyst to read. This paper describes the application of rudimentary natural language processing (NLP) techniques to uncover safety information from close calls. The analysis has proven that basic information extraction is possible using the rudimentary techniques, but has also identified some limitations that arise when using only basic techniques. Building on these findings, further research in this area intends to look at how the techniques that have been proven to date can be improved with the use of more advanced NLP techniques coupled with machine learning.

Crossref

University of Huddersfield Repository

Huddersfield Research Portal

A Classification of Stereotypes for Object-Oriented Modeling Languages

Author: B. Selic
D. de Champeaux
D. Firesmith
D.W. Embley
E. Gamma
G. Booch
H.-E. Eriksson
I. Jacobson
J. Rumbaugh
J.B. Wordsworth
P. Coad
R. Wirfs-Brock
S. Joos
S. Shlaer
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Principled Pragmatism: A Guide to the Adaptation of Ideas from Philosophical Disciplines to Conceptual Modeling

Author: B. Smith
D.W. Embley
L. Xu
M.A. Bunge
S.W. Liddle
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Crossref

Tabular Web Data: Schema Discovery and Integration

Author: D.W. Embley
D.W. Embley
D.W. Embley
J. Wang
M.J. Cafarella
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

Cross-Language Hybrid Keyword and Semantic Search

Author: D.W. Embley
D.W. Embley
D.W. Embley
L. Zhang
P. Buitelaar
P. Castells
R. Bhagdev
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Crossref

Automatic location and separation of records: A case study in the genealogical domain

Author: A.H.F. Laender
D.W. Embley
G. Salton
Publication date: 01/01/2004

Abstract. Locating specific chunks (records) of information within documents on the web is an interesting and nontrivial problem. If the problem of locating and separating records can be solved well, the longstanding problem of grouping extracted values into appropriate relationships in a record structure can be more easily resolved. Our solution is a hybrid of two well-established techniques: (1) ontology-based extraction [ECJ+99] and (2) vector space modeling [SM83]. To show that the technique has merit, we apply it to the particularly challenging task of locating and separating records for genealogical web documents, which tend to vary considerably in layout and format. Experiments we have conducted show that this technique yields an average of 92% recall and 93% precision for locating and separating genealogical records in web documents.

CiteSeerX

Crossref

A Superstructure for Models of Quality

Author: C.T. Meadow
D.W. Embley
T. Berners-Lee
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Crossref

Multilingual Ontologies for Cross-Language Information Extraction and Semantic Search

Author: B.J. Dorr
D. Lonsdale
D.W. Embley
J. Klavans
L. Xu
S. Sarawagi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Crossref

Toward Making Online Biological Data Machine Understandable

Author: D.W. Embley
Y. Ding
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Crossref