23 research outputs found

Experience and potential

Author: Hall P.A.V.
Mikolajuk Z.
Publication venue: IDRC, Ottawa, ON, CA
Publication date: 01/01/1999

"...authored by the participants of the IDRC and UNU/IIST workshop in Macau

International Development Research Centre: IDRC Digital Library

Fast phonetic similarity search over large repositories

Author: G.A. Miller
M. Paterson
P.A.V. Hall
R. Hamming
V.A. Mann
V.I. Levenshtein
W.H. Gomaa
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2014

Analysis of unstructured data may be inefficient in the presence of spelling errors. Existing approaches use string similarity methods to search for valid words within a text, with a supporting dictionary. However, they are not rich enough to encode phonetic information to assist the search. In this paper, we present a novel approach for efficiently performing phonetic similarity search over large data sources that uses a data structure called PhoneticMap to encode language-specific phonetic information. We validate our approach through an experiment over a data set using a Portuguese variant of a well-known repository, to automatically correct words with spelling errors.

Crossref

UCL Discovery

Global change in microcosms: Environmental and societal predictors of land cover change on the Atlantic Ocean Islands

Author: Borges P.A.V.
Cabezas F.J.
Castilla-Beltrán A.
Catarino L.
Ceríaco L.M.P.
de Lima R.F.
de Nascimento L.
Elias R.B.
Fernández-Palacios J.M.
Gabriel R.
Hall M.
Kissling W.D.
Lim J.Y.
Matos M.
Menezes de Sequeira M.
Nogué S.
Norder S.J.
Rijsdijk K.F.
Romeiras M.M.
van Loon E.E.
Publication venue: Elsevier BV
Publication date: 01/06/2020

International Migration, Integration and Social Cohesion online publications

Software internationalization architectures for decision support systems

Author: Hall P.A.V.
Publication venue: IDRC, Ottawa, ON, CA
Publication date: 01/01/1999

In IDL-2492

International Development Research Centre: IDRC Digital Library

Probabilistic data generation for deduplication and data linkage

Author: F. Damerau
J.J. Pollock
K. Kukich
P. Christen
P.A.V. Hall
Publication venue: Springer LNCS
Publication date: 01/01/2005

In many data mining projects the data to be analysed contains personal information, like names and addresses. Cleaning and preprocessing of such data likely involves deduplication or linkage with other data, which is often challenged by a lack of unique entity identifiers. In recent years there has been an increased research effort in data linkage and deduplication, mainly in the machine learning and database communities. Publicly available test data with known deduplication or linkage status is needed so that new linkage algorithms and techniques can be tested, evaluated and compared. However, publication of data containing personal information is normally impossible due to privacy and confidentiality issues. An alternative is to use artificially created data, which has the advantages that content and error rates can be controlled, and the deduplication or linkage status is known. Controlled experiments can be performed and replicated easily. In this paper we present a freely available data set generator capable of creating data sets containing names, addresses and other personal information.

CiteSeerX

Crossref

Fast Phonetic Similarity Search over Large Repositories

Author: G.A. Miller
M. Paterson
P.A.V. Hall
R. Hamming
V.A. Mann
V.I. Levenshtein
W.H. Gomaa
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Crossref

Phrase-Based Statistical Machine Translation Using Approximate Matching

Author: F. Och
J. Tomás
M. Bisani
P.A.V. Hall
P.F. Brown
R. Zens
V. Levenstein
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

Crossref

What if mass storage were free?

Author: Abrial J.R.
George Copeland
Hall P.A.V.
Kaneko A.
Smith J.M.
Thomas R.H.
Publication venue: Association for Computing Machinery (ACM)

Crossref

Optimizing Reaction and Processing Times in Automotive Industry’s Quality Management

Author: B. Smith
G.U. Yule
J. Buddhakulsomsiri
J. Read
J.A. Harding
M. Mittlböck
P.A.V. Hall
R. Agrawal
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Crossref

Similarity Function Recommender Service Using Incremental User Knowledge Acquisition

Author: A.K. Elmagarmid
M. Bilenko
M. Báez
M. Cochinwala
M.A. Hernández
P. Compton
P.A.V. Hall
Q. Li
S.B. Needleman
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

Crossref