Believe It or Not: Adding Belief Annotations to Databases
We propose a database model that allows users to annotate data with belief
statements. Our motivation comes from scientific database applications where a
community of users is working together to assemble, revise, and curate a shared
data repository. As the community accumulates knowledge and the database
content evolves over time, it may contain conflicting information and members
can disagree on the information it should store. For example, Alice may believe
that a tuple should be in the database, whereas Bob disagrees. He may also
insert the reason why he thinks Alice believes the tuple should be in the
database, and explain what he thinks the correct tuple should be instead.
We propose a formal model for Belief Databases that interprets users'
annotations as belief statements. These annotations can refer both to the base
data and to other annotations. We give a formal semantics based on a fragment
of multi-agent epistemic logic and define a query language over belief
databases. We then prove a key technical result, stating that every belief
database can be encoded as a canonical Kripke structure. We use this structure
to describe a relational representation of belief databases, and give an
algorithm for translating queries over the belief database into standard
relational queries. Finally, we report early experimental results with our
prototype implementation on synthetic data. Comment: 17 pages, 10 figures
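As a rough illustration of the idea that annotations can refer both to base data and to other annotations, the following minimal Python sketch stores each belief statement as a path of users, a sign, and a tuple. The names `BeliefStatement`, `BeliefStore`, and `believes`, the tuple schema, and the sample data are all hypothetical. The sketch only looks up explicitly stated beliefs; it does not implement the paper's formal semantics, its canonical Kripke structure, or its query translation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class BeliefStatement:
    # Path of users, outermost first: ("Bob", "Alice") reads
    # "Bob believes that Alice believes ...".
    path: Tuple[str, ...]
    positive: bool          # True: tuple should be stored; False: it should not
    fact: Tuple[str, ...]   # the annotated tuple (hypothetical schema)


@dataclass
class BeliefStore:
    statements: List[BeliefStatement] = field(default_factory=list)

    def annotate(self, path, positive, fact):
        """Record one explicit belief statement."""
        self.statements.append(BeliefStatement(tuple(path), positive, tuple(fact)))

    def believes(self, path, positive=True):
        """Return facts explicitly annotated with exactly this path and sign."""
        path = tuple(path)
        return [s.fact for s in self.statements
                if s.path == path and s.positive == positive]


# Made-up sample data, for illustration only.
store = BeliefStore()
sighting = ("s1", "bald eagle", "Lake Forest")
correction = ("s1", "fish eagle", "Lake Forest")
store.annotate(["Alice"], True, sighting)          # Alice: tuple belongs in the database
store.annotate(["Bob"], False, sighting)           # Bob disagrees
store.annotate(["Bob", "Alice"], True, sighting)   # Bob's view of what Alice believes
store.annotate(["Bob"], True, correction)          # Bob's proposed replacement

print(store.believes(["Bob", "Alice"]))
print(store.believes(["Bob"], positive=False))
```

Keeping the user path as part of the key is one simple way to let annotations nest to any depth; a real system would also need rules for beliefs that are implied rather than stated.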
Where do we go from here? Recording and analysing Roman coins from archaeological excavations
The publication of English Heritage's guidelines for the analysis and publication of coins from excavations has not met with acceptance by the relevant specialists. This paper takes the opportunity to look back over what we have been doing, consider what the guidelines suggest, and make recommendations as to where we could be going. In particular it argues that we should be making more of existing database technologies and the internet, and that the analysis of coins should be integrated with other aspects of the archaeological record. The paper is not a new set of guidelines, but is intended to stimulate debate.
Providing a Realist Perspective on the eyeGENE Database System
One of the achievements of the eyeGENE Network is a repository of DNA samples of patients with inherited eye diseases and an associated database that tracks key elements of phenotype and genotype information for each patient. Although its database structure serves its direct research needs, eyeGENE has set a goal of enhancing this structure to become increasingly well integrated with medical information standards over time. This goal should be achieved by ensuring semantic interoperability with other information systems but without adopting the incoherencies and inconsistencies found in available biomedical standards. Therefore, eyeGENE’s current pragmatic perspective, which focuses on data and information rather than on what the information is about, should shift to a realism-based perspective that also includes the portion of reality described and the competing opinions that clinicians may hold about it. An analysis of eyeGENE’s database structure and user interfaces suggests that such a transition is indeed possible.
Representation Independent Analytics Over Structured Data
Database analytics algorithms leverage quantifiable structural properties of
the data to predict interesting concepts and relationships. The same
information, however, can be represented using many different structures and
the structural properties observed over particular representations do not
necessarily hold for alternative structures. Thus, there is no guarantee that
current database analytics algorithms will still provide the correct insights,
no matter what structures are chosen to organize the database. Because these
algorithms tend to be highly effective over some choices of structure, such as
that of the databases used to validate them, but not so effective with others,
database analytics has largely remained the province of experts who can find
the desired forms for these algorithms. We argue that in order to make database
analytics usable, we should use or develop algorithms that are effective over a
wide range of choices of structural organizations. We introduce the notion of
representation independence, study its fundamental properties for a wide range
of data analytics algorithms, and empirically analyze the amount of
representation independence of some popular database analytics algorithms. Our
results indicate that most algorithms are not generally representation
independent, and they identify the characteristics of heuristics that are more
representation independent under certain representational shifts.
uFLIP: Understanding Flash IO Patterns
Does the advent of flash devices constitute a radical change for secondary
storage? How should database systems adapt to this new form of secondary
storage? Before we can answer these questions, we need to fully understand the
performance characteristics of flash devices. More specifically, we want to
establish what kind of IOs should be favored (or avoided) when designing
algorithms and architectures for flash-based systems. In this paper, we focus
on flash IO patterns, which capture the relevant distribution of IOs in time and
space, and our goal is to quantify their performance. We define uFLIP, a
benchmark for measuring the response time of flash IO patterns. We also present
a benchmarking methodology which takes into account the particular
characteristics of flash devices. Finally, we present the results obtained by
measuring eleven flash devices, and derive a set of design hints that should
drive the development of flash-based systems on current devices. Comment: CIDR 200
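To make the notion of timing an IO pattern concrete, here is a minimal Python sketch that records per-IO response times for a sequential and a shuffled write pattern on a temporary file. It is not the uFLIP benchmark: the function name `time_write_pattern`, the IO size, and the IO count are assumptions chosen for illustration, and because the writes go through the file system, the numbers reflect OS and driver behaviour as much as the device itself.

```python
import os
import random
import tempfile
import time


def time_write_pattern(offsets, io_size=4096):
    """Write io_size bytes at each offset; return per-IO response times in seconds."""
    payload = b"\0" * io_size
    times = []
    fd, path = tempfile.mkstemp()
    try:
        for off in offsets:
            start = time.perf_counter()
            os.lseek(fd, off, os.SEEK_SET)
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to flush, so the storage device is involved
            times.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return times


io_size, io_count = 4096, 32
sequential = [i * io_size for i in range(io_count)]   # ascending offsets
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)                    # same offsets, random order

seq_times = time_write_pattern(sequential, io_size)
rnd_times = time_write_pattern(shuffled, io_size)
print(f"sequential mean: {sum(seq_times) / len(seq_times):.6f} s")
print(f"random mean:     {sum(rnd_times) / len(rnd_times):.6f} s")
```

Both patterns touch the same set of offsets, so any difference in the means comes from ordering alone, which is the kind of distribution "in time and space" the abstract refers to.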
IEAD: A Novel One-Line Interface to Query Astronomical Science Archives
In this article I present IEAD, a new interface for astronomical science
databases. It is based on a powerful, yet simple, syntax designed to completely
abstract the user from the structure of the underlying database. The
programming language chosen for its implementation, JavaScript, makes it
possible to interact directly with the user and to provide real-time
information on the parsing process, error messages, and the name resolution of
targets; additionally, the same parsing engine is used for context-sensitive
autocompletion. Ultimately, this product should significantly simplify the use
of astronomical archives, inspire more advanced uses of them, and allow the
user to focus on what scientific research to perform, instead of on how to
instruct the computer to do it. Comment: 13 pages, PASP in press
The EU-Directive on the Legal Protection of Databases and the Incentives to Update: An Economic Analysis
The database directive, initiated by the European Commission in 1992 and due to be finalised in the near future, establishes a two-tiered system of protection, amending copyright with a sui generis rule that grants protection against unfair extraction. The terms of protection are extended if the producer makes "substantial changes" to update the database. This paper analyses the incentive to update created by the database directive. In contrast to the usual findings of the literature on the incentive effects of intellectual property rights, we find that, although in most cases the incentives to update a database are insufficient from society's point of view, the possibility of extending the term of protection by making "substantial changes" in the database may create an incentive for excessive updating. This leads to conclusions about what should be considered a substantial change. German abstract, translated: The database directive, whose final version will be available shortly, guarantees database producers two-tiered protection: alongside copyright there is a sui generis right that protects against unfair extraction and whose term of protection is extended if the producer updates the database through substantial changes. This paper deals with the incentives to update. In contrast to the usual incentive effects of rights protecting intellectual property, an incentive for excessive investment in updating databases arises here. Producers carry out updates even when this is not desirable from society's point of view. This finding leads to conclusions about what should count as a substantial change. Keywords: copyright, databases, updating
MIRACLE at GeoCLEF Query Parsing 2007: Extraction and Classification of Geographical Information
This paper describes the participation of the MIRACLE research consortium in the Query Parsing task of GeoCLEF 2007. Our system is composed of three main modules. First, the Named Geo-entity Identifier, whose objective is to perform the geo-entity identification and tagging, i.e., to extract the “where” component of the geographical query, should there be any. This module is based on a gazetteer built up from the Geonames geographical database and carries out a sequential process in three steps that consist of geo-entity recognition, geo-entity selection and query tagging. Then, the Query Analyzer parses this tagged query to identify the “what” and “geo-relation” components by means of a rule-based grammar. Finally, a two-level multiclassifier first decides whether the query is indeed a geographical query and, should it be positive, then determines the query type according to the type of information that the user is supposed to be looking for: map, yellow page or information. According to a strict evaluation criterion where a match should have all fields correct, our system reaches a precision value of 42.8% and a recall of 56.6%, and our submission is ranked 1st out of 6 participants in the task. A detailed evaluation of the confusion matrices reveals that some extra effort must be invested in “user-oriented” disambiguation techniques to improve the first-level binary classifier for detecting geographical queries, as it is a key component to eliminate many false positives.
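The strict criterion mentioned above, where a match must have all fields correct, can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the official GeoCLEF scorer: the function name `strict_precision_recall`, the field layout (where, what, geo-relation, query type), and the sample queries are all made up.

```python
def strict_precision_recall(predicted, gold):
    """Compute precision and recall under an all-fields-must-match rule.

    predicted, gold: dicts mapping a query id to a tuple of fields.
    A prediction counts as correct only if every field equals the gold value.
    """
    correct = sum(1 for qid, fields in predicted.items()
                  if qid in gold and gold[qid] == fields)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Made-up data: (where, what, geo-relation, query type)
gold = {
    "q1": ("Madrid", "hotels", "in", "yellow page"),
    "q2": ("Lisbon", "street map", "of", "map"),
    "q3": ("Paris", "museums", "near", "information"),
    "q4": ("Rome", "pizza", "in", "yellow page"),
}
predicted = {
    "q1": ("Madrid", "hotels", "in", "yellow page"),    # all fields correct
    "q2": ("Lisbon", "street map", "of", "map"),        # all fields correct
    "q3": ("Paris", "museums", "in", "information"),    # wrong geo-relation
}

precision, recall = strict_precision_recall(predicted, gold)
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```

Here two of three predictions match on every field and two of four gold queries are recovered, so a single wrong field in `q3` costs the whole query, which is why strict scores tend to be lower than per-field ones.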