Infinite Probabilistic Databases
Probabilistic databases (PDBs) are used to model uncertainty in data in a quantitative way. In the standard formal framework, PDBs are finite probability spaces over relational database instances. It has been argued convincingly that this is not compatible with an open-world semantics (Ceylan et al., KR 2016) and with application scenarios that are modeled by continuous probability distributions (Dalvi et al., CACM 2009).
We recently introduced a model of PDBs as infinite probability spaces that addresses these issues (Grohe and Lindner, PODS 2019). While that work was mainly concerned with countably infinite probability spaces, our focus here is on uncountable spaces. Such an extension is necessary to model typical continuous probability distributions that appear in many applications. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerned with the measurability of events and queries and ultimately with the question whether queries have a well-defined semantics.
It turns out that so-called finite point processes are the appropriate model from probability theory for dealing with probabilistic databases. This model allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and Datalog queries.
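As a rough illustration of what a finite point process over database instances can look like, the sketch below draws a random number of facts and then gives each fact a continuously distributed attribute value. Everything concrete here is an assumption made for the example and does not come from the abstract: the relation name `Temperature`, the Poisson-distributed fact count, the Gaussian attribute values, and the helper names `sample_poisson` and `sample_instance`.

```python
import math
import random

def sample_poisson(rate, rng):
    """Knuth's method: count uniform draws until their running product
    drops to exp(-rate) or below."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_instance(rate, rng):
    """Draw one finite instance: a list of hypothetical
    Temperature(sensor, value) facts with real-valued readings."""
    n = sample_poisson(rate, rng)
    return [("Temperature", f"s{i}", rng.gauss(20.0, 2.0)) for i in range(n)]

rng = random.Random(0)
instances = [sample_instance(3.0, rng) for _ in range(1000)]

# Monte Carlo estimate of the probability of the event
# "some fact in the instance has a value above 24.0".
estimate = sum(any(v > 24.0 for _, _, v in inst) for inst in instances) / len(instances)
```

Every sampled instance is finite, but the space of possible instances is uncountable because the values range over the reals. Estimating the probability of an event by sampling presupposes that the event is measurable, which is the kind of question the abstract's main results address.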
Answering UCQs under Updates and in the Presence of Integrity Constraints
We investigate the query evaluation problem for fixed queries over fully dynamic databases where tuples can be inserted or deleted. The task is to design a dynamic data structure that can immediately report the new result of a fixed query after every database update. We consider unions of conjunctive queries (UCQs) and focus on the query evaluation tasks testing (decide whether an input tuple belongs to the query result), enumeration (enumerate, without repetition, all tuples in the query result), and counting (output the number of tuples in the query result).
We identify three increasingly restrictive classes of UCQs which we call t-hierarchical, q-hierarchical, and exhaustively q-hierarchical UCQs. Our main results provide the following dichotomies: If the query's homomorphic core is t-hierarchical (q-hierarchical, exhaustively q-hierarchical), then the testing (enumeration, counting) problem can be solved with constant update time and constant testing time (delay, counting time). Otherwise, it cannot be solved with sublinear update time and sublinear testing time (delay, counting time), unless the OV-conjecture and/or the OMv-conjecture fails.
We also study the complexity of query evaluation in the dynamic setting in the presence of integrity constraints, and we obtain similar dichotomy results for the special case of small domain constraints (i.e., constraints which state that all values in a particular column of a relation belong to a fixed domain of constant size).
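To make the three tasks concrete, here is a minimal sketch of a dynamic data structure for one very simple fixed query, Q(x) :- R(x), S(x). The query, the class name `IntersectionView`, and its method names are illustrative assumptions and are not taken from the paper. With hash sets, each operation runs in expected constant time.

```python
class IntersectionView:
    """Maintains the result of Q(x) :- R(x), S(x) under
    single-tuple inserts and deletes."""

    def __init__(self):
        self.relations = {"R": set(), "S": set()}
        self.result = set()

    def _refresh(self, x):
        # Only the membership of x in the result can change after an update on x.
        if x in self.relations["R"] and x in self.relations["S"]:
            self.result.add(x)
        else:
            self.result.discard(x)

    def insert(self, rel, x):
        self.relations[rel].add(x)
        self._refresh(x)

    def delete(self, rel, x):
        self.relations[rel].discard(x)
        self._refresh(x)

    def test(self, x):
        """Testing: does x belong to the query result?"""
        return x in self.result

    def count(self):
        """Counting: number of tuples in the query result."""
        return len(self.result)

    def enumerate_result(self):
        """Enumeration: each result tuple exactly once."""
        return iter(self.result)
```

A short usage example: after `insert("R", 1)`, `insert("S", 1)`, and `insert("R", 2)`, the result contains only 1; deleting 1 from `S` empties it again. Queries with joins across several variables need more elaborate structures, and according to the abstract constant-time maintenance is only possible for the hierarchical classes it identifies.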
Kinetic field dissipation and fate of endosulfan after application on Theobroma cacao farm in tropical Southwestern Nigeria
Endosulfan, 6,7,8,9,10,10-hexachloro-1,5,5a,6,9,9a-hexahydro-6,9-methano-2,4,3-benzodioxathiepin-3-oxide, is still a pesticide of choice for most cocoa farmers in Southwestern Nigeria, despite its persistence, bioaccumulative and toxicological properties, and restricted use. A single treatment of 1.4 kg ai/ha (0.5% ai) of technical grade endosulfan (Thiodan, 35EC) was applied to 0.0227 ha of a cultivated Theobroma cacao L. (cocoa) farm at the Cocoa Research Institute of Nigeria (CRIN). Levels of parent endosulfan (α-, β-endosulfan) and the major metabolite (endosulfan sulfate) were determined in vegetation and surrounding matrices at days 0, 7, 14, 21, 28, 42, and 60 using GC-MS. Their kinetic variables were determined. The order of ∑endosulfan distribution at day 0 was dry foliage > fresh foliage > bark > pods > soil (0–15 cm). No residual endosulfan was found in cocoa seeds and subsurface soil (15–30 cm). Low residual levels in pods on day 0 may be due to endogenous enzymatic breakdown, with the α-isomer more susceptible and an α/β-endosulfan ratio of 0.90. Fallen dry foliage acting as mulch was predominantly the receiving matrix for non-target sprayed endosulfan. Volatilization was key in endosulfan dissipation from foliage surfaces between days 0 and 7 (> 60% loss), while the dissipation trend was bi-phasic for vegetation and tri-phasic for soil. ∑endosulfan loss at the terminal day ranged between 40.60% (topsoil) and 99.47% (fresh foliage). Iteratively computed half-lives (DT′50) ranged from 6.48 to 30.13 days for ∑endosulfan in vegetation. Endosulfan was moderately persistent in pods, a potential source of cross contamination of seeds during harvest. Iteratively determined DT′50 and initial-final day DT50 are highly correlated (R = 0.9525; n = 28) and show no significant difference (P = 0.05) between the two methods.
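For readers unfamiliar with half-life estimates, the following sketch shows how a DT50 can be derived from the fraction lost between the first and last sampling day, assuming simple first-order decay, C(t) = C(0)·exp(−k·t). This is only one plausible reading of an "initial-final day" DT50; the abstract reports bi-phasic and tri-phasic dissipation, so a single rate constant is a simplification. The function name `first_order_dt50` and the use of day 60 as the terminal day are assumptions made for the example.

```python
import math

def first_order_dt50(fraction_lost, days):
    """Half-life in days under first-order decay, given the fraction of the
    initial residue lost after `days` days (0 < fraction_lost < 1)."""
    remaining = 1.0 - fraction_lost
    k = -math.log(remaining) / days   # rate constant, per day
    return math.log(2) / k            # DT50 = ln 2 / k

# Losses reported in the abstract, with day 60 assumed as the terminal day.
dt50_fresh_foliage = first_order_dt50(0.9947, 60)
dt50_topsoil = first_order_dt50(0.4060, 60)
```

Under these assumptions the fresh-foliage value comes out at roughly eight days, while the topsoil value is considerably longer than the 60-day study period, consistent with the slower loss reported for soil.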
Mitigating the piston effect in high-speed hyperloop transportation: A study on the use of aerofoils
The Hyperloop is a concept for the high-speed ground transportation of passengers traveling in pods at transonic speeds in a partially evacuated tube. It consists of a low-pressure tube with capsules traveling at both low and high speeds throughout the length of the tube. When a high-speed system travels through a low-pressure tube with a constrained diameter, as in the case of the Hyperloop, it becomes an aerodynamically challenging problem. Airflow tends to get choked at the constrained areas around the pod, creating a high-pressure region at the front of the pod, a phenomenon referred to as the “piston effect.” Papers exploring potential solutions for the piston effect are scarce. In this study, using the Reynolds-Averaged Navier–Stokes (RANS) technique for three-dimensional computational analysis, the aerodynamic performance of a Hyperloop pod inside a vacuum tube is studied. Further, aerofoil-shaped fins are added to the aeroshell as a potential way to mitigate the piston effect. The results show that the addition of fins helps in reducing the drag and eddy currents while providing a positive lift to the pod. Further, these fins are found to be effective in reducing the pressure build-up at the front of the pod.
m-tables: Representing Missing Data
Representation systems have been widely used to capture different forms of incomplete data in various settings. However, existing representation systems are not expressive enough to handle the more complex scenarios of missing data that can occur in practice: these range from missing attribute values to a known number of missing tuples, or even an unknown number of missing tuples. In this work, we propose a new representation system called m-tables that can represent many different types of missing data. We show that m-tables form a closed, complete and strong representation system under both set and bag semantics and are strictly more expressive than conditional tables under both the closed and open world assumptions. We further study the complexity of computing certain and possible answers in m-tables. Finally, we discuss how to "interpret" m-tables through a novel labeling scheme that marks a type of generalized tuples as certain or possible.