Schema Independent Relational Learning
Learning novel concepts and relations from relational databases is an
important problem with many applications in database systems and machine
learning. Relational learning algorithms learn the definition of a new relation
in terms of existing relations in the database. Nevertheless, the same data set
may be represented under different schemas for various reasons, such as
efficiency, data quality, and usability. Unfortunately, the output of current
relational learning algorithms tends to vary quite substantially over the
choice of schema, both in terms of learning accuracy and efficiency. This
variation complicates their off-the-shelf application. In this paper, we
introduce and formalize the property of schema independence of relational
learning algorithms, and study both the theoretical and empirical dependence of
existing algorithms on the common class of (de)composition schema
transformations. We study both sample-based learning algorithms, which learn
from sets of labeled examples, and query-based algorithms, which learn by
asking queries to an oracle. We prove that current relational learning
algorithms are generally not schema independent. For query-based learning
algorithms, we show that the (de)composition transformations influence their
query complexity. We propose Castor, a sample-based relational learning
algorithm that achieves schema independence by leveraging data dependencies. We
support the theoretical results with an empirical study that demonstrates the
schema dependence/independence of several algorithms on existing benchmark and
real-world datasets under (de)compositions.
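To make the (de)composition transformations in the abstract above concrete, here is a minimal Python sketch. The relation and attribute names (`student`, `name`, `phase`) and the data are hypothetical, not taken from the paper. The sketch only shows that a composed relation and its decomposition carry the same information when the join is lossless, which is the setting in which a schema-independent learner would be expected to return equivalent definitions.

```python
# Hypothetical data: student(id, name, phase). Names are illustrative only.
composed = {
    ("s1", "Ann", "pre_quals"),
    ("s2", "Bob", "post_quals"),
}

def decompose(rows):
    """Split student(id, name, phase) into student_name(id, name)
    and student_phase(id, phase)."""
    names = {(sid, name) for sid, name, _ in rows}
    phases = {(sid, phase) for sid, _, phase in rows}
    return names, phases

def compose(names, phases):
    """Natural join on id; inverts decompose when id is a key."""
    return {
        (sid, name, phase)
        for sid, name in names
        for sid2, phase in phases
        if sid == sid2
    }

names, phases = decompose(composed)
roundtrip = compose(names, phases)
```

Because `id` is assumed to be a key here, the round trip loses nothing. A learner whose output differs between the `composed` form and the `(names, phases)` form would be schema dependent in the sense the abstract describes.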
A Cost-based Optimizer for Gradient Descent Optimization
As the use of machine learning (ML) permeates into diverse application
domains, there is an urgent need to support a declarative framework for ML.
Ideally, a user will specify an ML task in a high-level and easy-to-use
language and the framework will invoke the appropriate algorithms and system
configurations to execute it. An important observation towards designing such a
framework is that many ML tasks can be expressed as mathematical optimization
problems, which take a specific form. Furthermore, these optimization problems
can be efficiently solved using variations of the gradient descent (GD)
algorithm. Thus, to decouple a user specification of an ML task from its
execution, a key component is a GD optimizer. We propose a cost-based GD
optimizer that selects the best GD plan for a given ML task. To build our
optimizer, we introduce a set of abstract operators for expressing GD
algorithms and propose a novel approach to estimate the number of iterations a
GD algorithm requires to converge. Extensive experiments on real and synthetic
datasets show that our optimizer not only chooses the best GD plan but also
allows for optimizations that achieve orders-of-magnitude speed-ups.
Comment: Accepted at SIGMOD 201
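The idea of choosing a GD plan by estimated cost can be sketched as follows. This is a simplified illustration, not the optimizer from the paper. The `GDPlan` class, the cost model (estimated iterations times data points touched per iteration), and all numbers are assumptions made up for the example. In particular, the iteration estimates are hard-coded, whereas the paper proposes a dedicated method for estimating them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GDPlan:
    """Hypothetical description of one gradient descent variant."""
    name: str
    est_iterations: int        # assumed output of some iteration estimator
    points_per_iteration: int  # data points touched in each iteration

def plan_cost(plan, cost_per_point=1.0):
    """Toy cost model: total number of data points processed, scaled."""
    return plan.est_iterations * plan.points_per_iteration * cost_per_point

def choose_plan(plans, cost_per_point=1.0):
    """Return the plan with the lowest estimated cost."""
    return min(plans, key=lambda p: plan_cost(p, cost_per_point))

n = 1_000_000  # illustrative dataset size
plans = [
    GDPlan("batch", est_iterations=100, points_per_iteration=n),
    GDPlan("mini_batch", est_iterations=5_000, points_per_iteration=1_000),
    GDPlan("stochastic", est_iterations=2_000_000, points_per_iteration=1),
]
best = choose_plan(plans)
```

With these made-up estimates the stochastic plan has the lowest cost. Different iteration estimates or a per-iteration overhead term could easily change the ranking, which is why the quality of the iteration estimate matters.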
Daily Eastern News: November 10, 1975
https://thekeep.eiu.edu/den_1975_nov/1005/thumbnail.jp
Growth in brine, at low temperature and different organic acids, of yeasts from table olives
The evolution of the main yeast species related to table olives (Pichia anomala, Pichia membranaefaciens, Pichia minuta,
Saccharomyces cerevisiae, Candida diddensii, Candida famata, and
Debaryomyces hansenii) at low temperature (7ºC) and under different physico-chemical brine conditions was studied, using the log of the
relative growth as the response. In general, the NaCl concentration had a limited effect, which was slightly greater at pH 3.5, although it was never significant. The effects of pH and type of acid were
significant: the presence of acetic acid always diminished the yeast population with time, whereas the population was maintained, or even slightly increased, in the presence of lactic acid. These effects were stronger at pH 3.5 than at pH 4.0. The
behavior of the yeast species was diverse. Sacch. cerevisiae, P. membranaefaciens, C. famata, and Deb. hansenii diminished with
time in 8% NaCl. The yeast population markedly decreased at pH 3.5, mainly in the case of Sacch. cerevisiae and C. famata. The presence of acetic acid decreased the yeast population in most species and always
led to a progressive decline with time. No differences between species due to lactic acid were observed. These results can be of interest for the development of commercial presentations of table olives to be preserved at low temperature and with a reduced level of sodium.
The authors thank CICYT (AGL2000-1539-CO2-01) and the European Union (FAIR-97-9526) for partial funding of this research. Peer reviewe
Stability limits of n-nonane calculated from molecular dynamics interface simulations
Based on molecular dynamics simulation of the vapor-liquid interface, the classical thermodynamic spinodal for n-nonane is estimated using a previously developed method. n-Nonane was chosen to test whether a deviation from spherical molecular symmetry affects the prediction of stability limit data. We find that the estimated stability limit data for n-nonane are consistent with the experimental data available for the homologous series of n-alkanes. The slight alignment of the molecules parallel to the interface reported in the literature does not affect the method of transferring interface properties to the bulk phase stability limit.
LINVIEW: Incremental View Maintenance for Complex Analytical Queries
Many analytics tasks and machine learning problems can be naturally expressed
by iterative linear algebra programs. In this paper, we study the incremental
view maintenance problem for such complex analytical queries. We develop a
framework, called LINVIEW, for capturing deltas of linear algebra programs and
understanding their computational cost. Linear algebra operations tend to cause
an avalanche effect where even very local changes to the input matrices spread
out and infect all of the intermediate results and the final view, causing
incremental view maintenance to lose its performance benefit over
re-evaluation. We develop techniques based on matrix factorizations to contain
such epidemics of change. As a consequence, our techniques make incremental
view maintenance of linear algebra practical and usually substantially cheaper
than re-evaluation. We show, both analytically and experimentally, the
usefulness of these techniques when applied to standard analytics tasks. Our
evaluation demonstrates the efficiency of LINVIEW in generating parallel
incremental programs that outperform re-evaluation techniques by more than an
order of magnitude.
Comment: 14 pages, SIGMO
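The benefit of keeping a delta in factored form can be illustrated with a single matrix product. This is a minimal pure-Python sketch, not LINVIEW itself: the helper functions and the small integer matrices are made up for the example. It relies only on the identity (A + u vᵀ) B = A B + u (vᵀ B), so a rank-1 change to A can be pushed into the view A B with vector-matrix work instead of a full re-multiplication.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]

# Illustrative inputs and the materialized view A @ B.
A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[2, 0, 1], [1, 1, 0], [0, 3, 2]]
view = matmul(A, B)

# Rank-1 update A + u v^T; this particular u changes only row 1 of A.
u = [0, 1, 0]
v = [5, -1, 2]

# Incremental maintenance: compute v^T B once, then add u (v^T B) to the view.
vB = matmul([v], B)[0]
view_incremental = add(view, outer(u, vB))

# Reference result obtained by full re-evaluation.
view_recomputed = matmul(add(A, outer(u, v)), B)
```

For n × n matrices the incremental path costs O(n²) against O(n³) for re-evaluation. In iterative programs such as repeated powers of A the delta tends to grow denser at each step, which is the avalanche effect the abstract describes and the reason factored representations are needed.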
Estimation of the Thermodynamic Limit of Overheating for Bulk Water from Interfacial Properties
The limit of overheating or expanding is an important property of liquids, relevant to the design and safety assessment of processes involving pressurized liquids. In this work, the thermodynamic stability limit of water, the so-called spinodal, is calculated by molecular dynamics computer simulation, using the molecular potential model of Baranyai and Kiss. The spinodal pressure is obtained from the maximal tangential pressure within a liquid-vapor interface layer. The results are compared to the predictions of various equations of state. Based on these comparisons, a set of equations of state is identified that gives reliable results in the metastable (overheated or expanded) liquid region of water down to −55 MPa.