2 research outputs found
An investigation of computer based nominal data record linkage
The Internet now provides access to vast volumes of nominal data (data associated
with names, e.g. birth/death records, parish records, text articles, multimedia) collected
for a range of different purposes. This research focuses on parish registers containing
baptism, marriage, and burial records. Mining these data resources involves linkage:
investigating how two records are related with regard to attributes such as surname,
spatio-temporal location, legal association and inter-relationships. Furthermore, as
well as handling the implicit constraints of nominal data, a linkage system must also
automatically handle a range of temporal and spatial rules and constraints.
The research examines the linkage rules that apply and how such rules interact. The
investigation reports on current practices in several disciplines (e.g.
history, demography, genealogy, and epidemiology) and how these are implemented
in current computer and database systems. The practical aspects of this study, and the
workbench approach proposed, are centred on the extensive Lancashire & Cheshire
Parish Register archive held on the MIMAS database computer located at Manchester
University. The research also proposes how these findings can have wider
applications.
This thesis describes some initial research into this problem. It describes three
prototypes of a nominal data workbench that allow the specification and examination of
several linkage types, and discusses the merits of alternative name matching methods,
name grouping techniques and method comparisons. The conclusion is that, in the
cases examined so far, effective nominal data linkage is essentially a query
optimisation process. The process is more efficient if linkage-specific indexes
exist, and the work suggests that query re-organization based on these indexes, though a
complex process, is entirely feasible. To facilitate the use of indexes and to guide the
optimization process, the work suggests the use of formal ontologies.
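As an illustration of how a linkage-specific index can narrow the search for candidate links, the following is a minimal sketch and not the thesis's implementation. It assumes standard Soundex as the name grouping technique (the abstract does not say which methods were compared), and the record layout, the function names and the 110-year lifespan limit are all hypothetical. The sample records are invented for the example.

```python
from collections import defaultdict

# Standard Soundex digit groups (assumed here as one possible name grouping method).
_CODES = {}
for _digit, _letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                         "4": "L", "5": "MN", "6": "R"}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the four-character Soundex key of a surname ('' if no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters that share a code
        code = _CODES.get(c, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return "".join(out)[:4].ljust(4, "0")


def build_surname_index(records):
    """Group record positions by the Soundex key of the surname."""
    index = defaultdict(list)
    for rec_id, rec in enumerate(records):
        index[soundex(rec["surname"])].append(rec_id)
    return index


def candidate_links(records, index, max_lifespan=110):
    """Pair baptisms with burials that share a surname key and satisfy a
    simple temporal rule: burial not before baptism, within max_lifespan years."""
    pairs = []
    for ids in index.values():
        baptisms = [i for i in ids if records[i]["type"] == "baptism"]
        burials = [i for i in ids if records[i]["type"] == "burial"]
        for b in baptisms:
            for d in burials:
                gap = records[d]["year"] - records[b]["year"]
                if 0 <= gap <= max_lifespan:
                    pairs.append((b, d))
    return pairs


# Invented sample records, for illustration only.
records = [
    {"type": "baptism", "surname": "Ashcroft", "year": 1701},
    {"type": "burial", "surname": "Ashcraft", "year": 1765},
    {"type": "burial", "surname": "Ashcroft", "year": 1690},
    {"type": "baptism", "surname": "Rigby", "year": 1702},
]
index = build_surname_index(records)
links = candidate_links(records, index)
```

Only records that fall in the same index bucket are compared, so the temporal rule is evaluated on a small candidate set rather than on every pair of records. This is the sense in which the index turns linkage into a query that can be re-organized.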
Semantic Indexing for Complex Patient Grouping
In this paper we describe indexing techniques based on domain knowledge made available in the form of ontologies. In high level interfaces like those used in many data warehousing applications, it is advantageous to use terminology familiar to the end-user of the system. This terminology is very often different from the one incorporated in the underlying database. Much of the terminology (i.e. concepts) in an end-user ontology will map to complex collections of data warehouse attribute-value pairs. We create a mapping between the end-user terminology and attribute-value pairs in the data warehouse. We optimize performance by indexing the whole database using an ontology that represents the end-user's terminology.
INTRODUCTION
A joint effort between computer scientists at the University of Maryland and clinicians at Johns Hopkins Hospital focuses on using high performance computing technology in suppor