34 research outputs found

Automated cleansing of POI databases

Author: A. Bronselaer
C. Baral
D. Dubois
G. Bordogna
G. Cooman De
G. Nachouki
G. Tré De
H. Foley
I. Bloch
I. Fellegi
J. Dujmović
J. Lin
L.A. Zadeh
M. Bright
M.A. Rodríguez
P. Carrara
R. Torres
R. Yager
R.W. Sinnott
S. Destercke
S. Konieczny
S. Rahimi
S. Sandri
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

Ghent University Academic Bibliography

Coreference detection of low quality objects

Author: A. Bronselaer
D. Gusfield
I. Fellegi
P. Lehti
P. Ravikumar
S. Tejada
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Record linkage is a widely studied problem that aims to identify coreferent (i.e. duplicate) data in a structured data source. As indicated by Winkler, a solution to the record linkage problem is only possible if the error rate is sufficiently low. In other words, in order to successfully deduplicate a database, the objects in the database must be of sufficient quality. However, this assumption does not always hold. This paper investigates how merging low quality objects into one high quality object can improve the process of record linkage. This general idea is illustrated in the context of string comparison, where strings of low quality (i.e. with a high typographical error rate) are merged into a string of high quality by using an n-dimensional Levenshtein distance matrix and computing the optimal alignment between the dirty strings. Results are presented and possible refinements are proposed.
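
As a minimal sketch of the building block named in the abstract above, the following Python computes the classic two-string Levenshtein distance by dynamic programming. It is not the paper's n-dimensional method, which aligns several dirty strings at once; the function name `levenshtein` and the sample strings are illustrative assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Two-string edit distance (insert, delete, substitute; unit costs)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Hypothetical dirty variants of one name, compared pairwise.
    print(levenshtein("Bronselaer", "Bronselear"))
```

Only two rows of the distance matrix are kept at a time, which is enough to obtain the distance; recovering the alignment itself would require the full matrix.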

Crossref

Ghent University Academic Bibliography

Representing uncertainty regarding satisfaction degrees using possibility distributions

Author: A Bronselaer
D Dubois
G Tré De
H Prade
J Kacprzyk
KT Atanassov
L Zadeh
LA Zadeh
P Bosc
T Takagi
V Tahani
V Torra
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Evaluating flexible criteria on data leads to degrees of satisfaction. If a datum is uncertain, it can be uncertain to which degree it satisfies the criterion. This uncertainty can be modelled using a possibility distribution over the domain of possible degrees of satisfaction. In this work, we discuss the meaningfulness thereof by looking at the semantics of such a representation of the uncertainty. More specifically, it is shown that defuzzification of such a representation, towards usability in (multi-criteria) decision support systems, corresponds to expressing a clear attitude towards uncertainty (optimistic, pessimistic, cautious, etc.).

Crossref

Ghent University Academic Bibliography

Quantification of ocean heat uptake from changes in atmospheric O2 and CO2 composition

Author: A Olsen
A. Oschlies
AP Ballantyne
B Bronselaer
BB Stephens
C MacFarling Meure
C Quéré Le
C Rödenbeck
CD Keeling
CJ Somes
CP Morice
D Wang
DG Desbruyères
DP Keller
GC Johnson
H Oeschger
HE Garcia
J Hansen
J. P. Dunne
JE Kay
JK Moore
JL Sarmiento
JP Abraham
JP Dunne
KE Taylor
KE Trenberth
KP Helm
L Bopp
L Cheng
L Resplandy
L. Bopp
L. Resplandy
LD Talley
M Battle
M Ishii
M. C. Long
M. K. Brooks
MC Long
MD Palmer
N Gruber
NG Loeb
O Aumont
PM Forster
R Rietbroek
R Séférian
R Wang
R. F. Keeling
R. Wang
RC Hamme
RF Keeling
RJ Andres
RP Allan
RS Vose
S Levitus
S Schmidtko
SC Riser
SP Ritz
T Boyer
T DeVries
T Ito
TD Jickells
W. Koeve
WCRP Global Sea Level Budget Group
Y. Eddebbar
YA Eddebbar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

The ocean is the main source of thermal inertia in the climate system. Ocean heat uptake during recent decades has been quantified using ocean temperature measurements. However, these estimates all use the same imperfect ocean dataset and share additional uncertainty due to sparse coverage, especially before 2007. Here, we provide an independent estimate by using measurements of atmospheric oxygen (O2) and carbon dioxide (CO2) – levels of which increase as the ocean warms and releases gases – as a whole-ocean thermometer. We show that the ocean gained 1.29 ± 0.79 × 10^22 joules of heat per year between 1991 and 2016, equivalent to a planetary energy imbalance of 0.80 ± 0.49 watts per square metre of Earth’s surface. We also find that the ocean-warming effect that led to the outgassing of O2 and CO2 can be isolated from the direct effects of anthropogenic emissions and CO2 sinks. Our result – which relies on high-precision O2 atmospheric measurements dating back to 1991 – leverages an integrative Earth system approach and provides much needed independent confirmation of heat uptake estimated from ocean data.
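
The unit conversion quoted in the abstract above can be checked with a few lines of Python. This is a rough sketch, reading the heat gain as 1.29 ± 0.79 × 10^22 joules per year and assuming a year of 365.25 days and an Earth surface area of about 5.10 × 10^14 square metres; neither constant is stated in the source, and the function name is hypothetical.

```python
# Assumed constants (not given in the abstract).
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # roughly 3.156e7 s
EARTH_SURFACE_M2 = 5.10e14              # approximate total surface area of Earth


def joules_per_year_to_w_per_m2(joules_per_year: float) -> float:
    """Convert an annual global heat gain into a mean flux per unit area."""
    return joules_per_year / (SECONDS_PER_YEAR * EARTH_SURFACE_M2)


imbalance = joules_per_year_to_w_per_m2(1.29e22)
uncertainty = joules_per_year_to_w_per_m2(0.79e22)

if __name__ == "__main__":
    print(f"{imbalance:.2f} +/- {uncertainty:.2f} W per square metre")
```

Under these assumptions, the result rounds to the 0.80 and 0.49 watts per square metre reported in the abstract.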

OceanRep

Princeton University Open Access Repository

Crossref

ArchiMer - Institutional Archive of Ifremer

HAL-INSU

HAL-Ecole des Ponts ParisTech

HAL-Polytechnique

Data quality improvement by constrained splitting

Author: A. Bronselaer
Brad Adelberg
E.F. Codd
I. Fellegi
S. Soderland
V. Borkar
V. Levenstein
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

In the setting of relational databases, the schema of the database provides a context in which the data should be interpreted. As a consequence, the quality of a relational database depends strongly on the assumption that the data fit this context description. In this paper, we investigate the case where the information provided by an attribute value exceeds the framework provided by the schema. It is shown that such an information overflow can have two orthogonal causes: (i) data about multiple attributes are jointly stored as one attribute and (ii) data about multiple tuples are jointly stored as one tuple. Such erroneous information storage deteriorates the quality of the database. We therefore investigate how data quality can be improved by a split operator. The major difficulty is to take into account the constraints that are present in a relational database. A generic algorithm is provided and tested on the well-known Cora dataset.

Crossref

Ghent University Academic Bibliography

Concept-relational text clustering

Author: Antoon
Bronselaer
Bronselaer A Hallez
David
Ellen
George
Gerald
Gert
Joe
John
Karen
Kathleen
Lotfi
Nicholas Andrews and Edward Fox
Ronald Yager
Scott Deerwester
Stuart Lloyd
Publication venue: 'Wiley'
Publication date

Crossref

Massive ocean carbon sink spotted burping CO2 on the sly

Author: A. R. Gray
B. Bronselaer
Jeff Tollefson
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref

Quantifying the impact of EER modeling on relational database success : an experimental investigation

Author: A Bronselaer
A Olivé
D Batra
DL Moody
I Davies
J Ullman
P Fettke
R Elmasri
R Lukyanenko
S Bagui
TJ Teorey
VM Markowitz
WH DeLone
Y Timmerman
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Despite the widespread idea in the literature that including EER modeling in the design process of a relational database benefits the success of that database, almost no quantitative cost-benefit analyses of EER modeling exist to support this claim. To fill this gap, an empirical study is performed in which the success of a relational database whose design process contains an EER modeling phase is compared to the success of a relational database into which only the minimally needed design effort was put. Database success is treated as originally proposed by the DeLone and McLean Information Systems Success Model, focusing specifically on the information quality and system quality of both databases. To this end, the study analyzes, respectively, the total amount of time an end user needs to complete a set of tasks using the database, and the total execution cost incurred by the database system before a correct solution to each task is submitted. Moreover, the work accounts for the possible moderating effect of an end user's technical competence on the relationship between EER modeling and the success of the eventual relational database. Preliminary results indicate that the inclusion of EER modeling in relational database design significantly increases the perceived information quality and system quality of that database. Moreover, there is statistical evidence that this result is independent of the competence profile of the user.

Crossref

Ghent University Academic Bibliography

Operational measurement of data quality

Author: A Bronselaer
A Even
B Heinrich
D Krantz
E Sigal
H Frank
I Fellegi
L Pipino
NE Fenton
R Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Crossref

Ghent University Academic Bibliography

Simple Global Ocean Biogeochemistry With Light, Iron, Nutrients and Gas Version 2 (BLINGv2): Model Description and Simulation Characteristics in GFDL's CM4.0

Author: Bociu I.
Bronselaer B.
Dunne J. P.
Guo H.
John J. G.
Krasting J. P.
Stock C. A.
Winton M.
Zadeh N.
Publication venue: 'American Geophysical Union (AGU)'
Publication date: 18/08/2020

Simulation of coupled carbon-climate requires representation of ocean carbon cycling, but the computational burden of simulating the dozens of prognostic tracers in state-of-the-art biogeochemistry ecosystem models can be prohibitive. We describe a six-tracer biogeochemistry module of steady-state phytoplankton and zooplankton dynamics in Biogeochemistry with Light, Iron, Nutrients and Gas (BLING version 2), with particular emphasis on enhancements relative to the previous version, and evaluate its implementation in Geophysical Fluid Dynamics Laboratory's (GFDL) fourth-generation climate model (CM4.0) with a 1/4-degree ocean. Major geographical and vertical patterns in chlorophyll, phosphorus, alkalinity, inorganic and organic carbon, and oxygen are well represented. Major biases in BLINGv2 include overly intensified production in high-productivity regions at the expense of productivity in the oligotrophic oceans, overly zonal structure in tropical phosphorus, and intensified hypoxia in the eastern ocean basins, as is typical in climate models. Overall, while BLINGv2 structural limitations prevent sophisticated application to plankton physiology, ecology, or biodiversity, its ability to represent major organic, inorganic, and solubility pumps makes it suitable for many coupled carbon-climate and biogeochemistry studies, including eddy interactions in the ocean interior. We further overview the biogeochemistry and circulation mechanisms that shape carbon uptake over the historical period. As an initial analysis of model historical and idealized response, we show that CM4.0 takes up slightly more anthropogenic carbon than previous models, in part due to enhanced ventilation in the absence of an eddy parameterization. The CM4.0 biogeochemistry response to CO2 doubling highlights a mix of large declines and moderate increases consistent with previous models.

Crossref

Directory of Open Access Journals

The University of Arizona