A Comparison of Blocking Methods for Record Linkage
Record linkage seeks to merge databases and to remove duplicates when unique
identifiers are not available. Most approaches use blocking techniques to
reduce the computational complexity associated with record linkage. We review
traditional blocking techniques, which typically partition the records
according to a set of field attributes, and consider two variants of a method
known as locality sensitive hashing, sometimes referred to as "private
blocking." We compare these approaches in terms of their recall, reduction
ratio, and computational complexity. We evaluate these methods using different
synthetic datafiles and conclude with a discussion of privacy-related issues.
Comment: 22 pages, 2 tables, 7 figures
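As a rough illustration of the traditional blocking idea described above, the following minimal sketch partitions records on a chosen field and then computes recall and reduction ratio for the resulting candidate pairs. The toy records, the function names, and the metric definitions (reduction ratio as one minus the fraction of all pairs retained, recall as the fraction of true matching pairs retained) are assumptions made for this example and are not taken from the paper, which may define or compute these quantities differently.

```python
from collections import defaultdict
from itertools import combinations


def block_by_key(records, key_fields):
    """Partition record ids into blocks that share the same values on key_fields."""
    blocks = defaultdict(list)
    for rec_id, rec in records.items():
        key = tuple(rec[f] for f in key_fields)
        blocks[key].append(rec_id)
    return blocks


def candidate_pairs(blocks):
    """Collect every within-block pair of record ids (smaller id first)."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def reduction_ratio(n_records, pairs):
    """Assumed definition: 1 - (candidate pairs) / (all possible pairs)."""
    total = n_records * (n_records - 1) // 2
    return 1 - len(pairs) / total


def recall(pairs, true_matches):
    """Assumed definition: share of true matching pairs kept by blocking."""
    return len(pairs & true_matches) / len(true_matches)


# Hypothetical toy data: records 1, 2, and 3 refer to the same person,
# but record 3 has a misspelled surname.
records = {
    1: {"surname": "smith", "zip": "15213"},
    2: {"surname": "smith", "zip": "15213"},
    3: {"surname": "smyth", "zip": "15213"},
    4: {"surname": "jones", "zip": "90210"},
}
true_matches = {(1, 2), (1, 3), (2, 3)}

blocks = block_by_key(records, ["surname"])
pairs = candidate_pairs(blocks)
rr = reduction_ratio(len(records), pairs)
rec = recall(pairs, true_matches)
```

In this toy case, blocking on surname keeps only one of the six possible pairs, so the reduction ratio is high, but the misspelled record falls into its own block and two of the three true matches are lost. That trade-off between fewer comparisons and missed matches is what a comparison of blocking methods has to weigh.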
Superconducting magnesium diboride films on Silicon with Tc0 about 24K grown via vacuum annealing from stoichiometric precursors
Superconducting magnesium diboride films with Tc0 ~ 24 K and a sharp transition
of ~ 1 K were successfully prepared on silicon substrates by pulsed laser
deposition from a stoichiometric MgB2 target. Contrary to previous reports,
anneals at 630 degrees and a background of 2x10^(-4) torr Ar/4%H2 were performed
without the requirement of Mg vapor or an Mg cap layer. This integration of
superconducting MgB2 films on silicon may thus prove enabling in
superconductor-semiconductor device applications. Images of surface morphology
and cross-section profiles by scanning electron microscopy (SEM) show that the
films have a uniform surface morphology and thickness. Energy dispersive
spectroscopy (EDS) reveals these films were contaminated with oxygen,
originating either from the growth environment or from sample exposure to air.
The oxygen contamination may account for the low Tc for those in-situ annealed
films, while the use of Si as the substrate does not result in a decrease in Tc
as compared to other substrates.
Comment: 5 pages, 4 figures, 15 references; due to file size limit, images
were blurred
ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution
Entity resolution (ER), an important and common data cleaning problem, is
the task of detecting duplicate data representations of the same external
entities and merging them into single representations. Relatively recently, declarative
rules called matching dependencies (MDs) have been proposed for specifying
similarity conditions under which attribute values in database records are
merged. In this work we show the process and the benefits of integrating three
components of ER: (a) classifiers for duplicate/non-duplicate record pairs
built using machine learning (ML) techniques; (b) MDs for supporting both the
blocking phase of ML and the merge itself; and (c) the use of the declarative
language LogiQL (an extended form of Datalog supported by the LogicBlox
platform) for data processing and for the specification and enforcement of MDs.
Comment: To appear in Proc. SUM, 201
Biography of James L. Van Etten
Green algae, in surface layers of almost every lake or stream, are some of the most common aquatic creatures. However, unbeknownst to researchers until recently, viruses that infect algae are almost as widespread. Entire ecosystems of algal hosts and their corresponding viruses lay hidden until the 1980s, when James L. Van Etten, a professor of plant pathology at the University of Nebraska (Lincoln), and his colleague Russ Meints discovered and began to characterize the first member of what is now a rapidly expanding family of algal viruses. Van Etten and his colleagues have continued to study these intriguing viruses, focusing on those that infect Chlorella and other similar green algae. The chlorella viruses have many unusual properties, ranging from their large genome sizes to unique modifications in their DNA.
Different methods of evaluation of Monilinia laxa on apricot flowers and branches
- Organic apricot production is currently not profitable.
- The main obstacle to sustainable profitability is brown rot caused by the fungus Monilinia laxa (Aderh. & Ruhl).
- No source of total resistance has been found in the current apricot germplasm, but some varieties express interesting levels of tolerance.
- A good evaluation of M. laxa symptoms is essential for diagnosing the infection precisely and for distinguishing tolerant from susceptible varieties and genotypes.
- …