Relaxed Functional Dependencies - A Survey of Approaches
Recently, there has been renewed interest in functional dependencies due to the possibility of employing them in several advanced database operations, such as data cleaning, query relaxation, and record matching. In particular, the constraints defined for canonical functional dependencies have been relaxed to capture inconsistencies in real data, patterns of semantically related data, or semantic relationships in complex data types. In this paper, we survey 35 such functional dependencies, providing classification criteria, motivating examples, and a systematic analysis of them.
Characterization of order-like dependencies with formal concept analysis
Functional Dependencies (FDs) play a key role in many fields
of the relational database model, one of the most widely used database
systems. FDs have also been applied in data analysis, data quality,
knowledge discovery and the like, but in a very limited scope, because of
their fixed semantics. To overcome this limitation, many generalizations
have been defined to relax the crisp definition of FDs. FDs and a few of
their generalizations have been characterized with Formal Concept Analysis,
which reveals itself to be an interesting unified framework for
characterizing dependencies, that is, understanding and computing them in a
formal way. In this paper, we extend this work by taking into account
order-like dependencies. Such dependencies, well defined in the database
field, consider an ordering on the domain of each attribute, and not
simply an equality relation as with standard FDs.
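As a rough illustration of the difference between equality-based and order-based agreement, the sketch below checks one simple pointwise form of an order-like dependency on a toy table. The function name `holds_order_dependency`, the attribute names, and the data are hypothetical and not taken from the paper, which characterizes such dependencies with Formal Concept Analysis rather than by pairwise comparison.

```python
from itertools import combinations


def holds_order_dependency(rows, lhs, rhs):
    """Return True if, for every pair of rows, a[lhs] <= b[lhs] implies a[rhs] <= b[rhs].

    This is an assumed, simplified form of an order-like dependency:
    the usual equality test of a standard FD is replaced by <=.
    """
    for t, u in combinations(rows, 2):
        # Check the implication in both directions of the pair.
        for a, b in ((t, u), (u, t)):
            if a[lhs] <= b[lhs] and not a[rhs] <= b[rhs]:
                return False
    return True


# Hypothetical toy table: salary never decreases as years of service grow.
monotone = [
    {"years": 1, "salary": 30},
    {"years": 3, "salary": 40},
    {"years": 5, "salary": 55},
]
# Adding a row with more years but a lower salary breaks the dependency.
broken = monotone + [{"years": 6, "salary": 50}]

print(holds_order_dependency(monotone, "years", "salary"))
print(holds_order_dependency(broken, "years", "salary"))
```

Note that this pointwise form also forces rows with equal `lhs` values to have equal `rhs` values, so it is at least as strict as the corresponding standard FD.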
Survey on Combinatorial Register Allocation and Instruction Scheduling
Register allocation (mapping variables to processor registers or memory) and
instruction scheduling (reordering instructions to increase instruction-level
parallelism) are essential tasks for generating efficient assembly code in a
compiler. In the last three decades, combinatorial optimization has emerged as
an alternative to traditional, heuristic algorithms for these two tasks.
Combinatorial optimization approaches can deliver optimal solutions according
to a model, can precisely capture trade-offs between conflicting decisions, and
are more flexible at the expense of increased compilation time.
This paper provides an exhaustive literature review and a classification of
combinatorial optimization approaches to register allocation and instruction
scheduling, with a focus on the techniques that are most applied in this
context: integer programming, constraint programming, partitioned Boolean
quadratic programming, and enumeration. Researchers in compilers and
combinatorial optimization can benefit from identifying developments, trends,
and challenges in the area; compiler practitioners may discern opportunities
and grasp the potential benefit of applying combinatorial optimization.
Cosmological Systematics Beyond Nuisance Parameters: Form Filling Functions
In the absence of any compelling physical model, cosmological systematics are
often misrepresented as statistical effects and the approach of marginalising
over extra nuisance systematic parameters is used to gauge the effect of the
systematic. In this article we argue that such an approach is risky at best
since the key choice of function can have a large effect on the resultant
cosmological errors. As an alternative we present a functional form filling
technique in which an unknown, residual, systematic is treated as such. Since
the underlying function is unknown we evaluate the effect of every functional
form allowed by the information available (either a hard boundary or some
data). Using a simple toy model we introduce the formalism of functional form
filling. We show that parameter errors can be dramatically affected by the
choice of function in the case of marginalising over a systematic, but that in
contrast the functional form filling approach is independent of the choice of
basis set. We then apply the technique to cosmic shear shape measurement
systematics and show that a shear calibration bias of |m(z)| < 0.001(1+z)^0.7 is
required for a future all-sky photometric survey to yield unbiased cosmological
parameter constraints to percent accuracy. A module associated with the work in
this paper is available through the open source iCosmo code available at
http://www.icosmo.org.
Comment: 24 pages, 18 figures, accepted to MNRAS
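To make the quoted requirement concrete, the following minimal sketch evaluates the bound |m(z)| < 0.001(1+z)^0.7 at a few redshifts. The function names are hypothetical, and this is only arithmetic on the stated formula; it does not reproduce the form filling technique or the iCosmo module.

```python
def shear_bias_bound(z):
    """Upper limit on |m(z)| from the abstract: 0.001 * (1 + z) ** 0.7."""
    return 0.001 * (1.0 + z) ** 0.7


def within_requirement(m, z):
    """True if a calibration bias m at redshift z satisfies the stated bound."""
    return abs(m) < shear_bias_bound(z)


for z in (0.0, 0.5, 1.0, 2.0):
    print(f"z = {z:.1f}: |m| must stay below {shear_bias_bound(z):.6f}")
```

The tolerance loosens slowly with redshift: the same bias that fails the requirement at z = 0 may pass it at higher z.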
Characterizing approximate-matching dependencies in formal concept analysis with pattern structures
Functional dependencies (FDs) provide valuable knowledge on the relations between attributes of a data table. A functional dependency holds when the values of an attribute can be determined by another. It has been shown that FDs can be expressed in terms of partitions of tuples that are in agreement w.r.t. the values taken by some subsets of attributes. To extend the use of FDs, several generalizations have been proposed. In this work, we study approximate-matching dependencies that generalize FDs by relaxing the constraints on the attributes, i.e. agreement is based on a similarity relation rather than on equality. Such dependencies are attracting attention in the database field since they relax the crisp notion of FDs, extending its application to many different fields, such as data quality, data mining, behavior analysis, data cleaning or data partition, among others. We show that these dependencies can be formalized in the framework of Formal Concept Analysis (FCA) using a previous formalization introduced for standard FDs. Our new results state that, starting from the conceptual structure of a pattern structure, and generalizing the notion of relation between tuples, approximate-matching dependencies can be characterized as implications in a pattern concept lattice. We finally show how to use basic FCA algorithms to construct a pattern concept lattice that entails these dependencies after a slight and tractable binarization of the original data.
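The partition view of FDs and its similarity-based relaxation can be sketched in a few lines. The code below is a hedged illustration only: the helper names, the absolute-difference similarity with per-attribute tolerances, and the toy table are all assumptions, and the pairwise check stands in for (rather than implements) the pattern-structure construction described in the abstract.

```python
from collections import defaultdict
from itertools import combinations


def partition(rows, attrs):
    """Group row indices by the values the rows take on attrs."""
    groups = defaultdict(set)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].add(i)
    return list(groups.values())


def fd_holds(rows, lhs, rhs):
    """X -> Y holds when each block of the X-partition lies inside one block of the Y-partition."""
    rhs_blocks = partition(rows, rhs)
    return all(
        any(block <= other for other in rhs_blocks)
        for block in partition(rows, lhs)
    )


def similar(x, y, tol):
    """Assumed similarity relation: values agree if they differ by at most tol."""
    return abs(x - y) <= tol


def approx_dependency_holds(rows, lhs, rhs, tol):
    """Rows that are similar on every lhs attribute must be similar on every rhs attribute."""
    for t, u in combinations(rows, 2):
        if all(similar(t[a], u[a], tol[a]) for a in lhs):
            if not all(similar(t[b], u[b], tol[b]) for b in rhs):
                return False
    return True


# Hypothetical toy table: two readings for month 1 differ slightly.
rows = [
    {"month": 1, "temp": 10.0},
    {"month": 1, "temp": 10.4},
    {"month": 6, "temp": 25.0},
]
print(fd_holds(rows, ["month"], ["temp"]))
print(approx_dependency_holds(rows, ["month"], ["temp"], {"month": 0, "temp": 0.5}))
```

On this table the strict FD `month -> temp` fails because the two month-1 rows disagree on `temp`, while the relaxed dependency holds once values within 0.5 degrees are treated as matching.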
Geometric Multi-Model Fitting with a Convex Relaxation Algorithm
We propose a novel method to fit and segment multi-structural data via convex
relaxation. Unlike greedy methods, which maximise the number of inliers, this
approach efficiently searches for a soft assignment of points to models by
minimising the energy of the overall classification. Our approach is similar to
state-of-the-art energy minimisation techniques which use a global energy.
However, we deal with the scaling factor (as the number of models increases) of
the original combinatorial problem by relaxing the solution. This relaxation
brings two advantages: first, operating in the continuous domain lets us
parallelize the calculations; second, it allows the use of different
metrics, which results in a more general formulation.
We demonstrate the versatility of our technique on two different problems of
estimating structure from images: plane extraction from RGB-D data and
homography estimation from pairs of images. In both cases, we report accurate
results on publicly available datasets, in most cases outperforming the
state of the art.
On the Effect of Semantically Enriched Context Models on Software Modularization
Many of the existing approaches for program comprehension rely on the
linguistic information found in source code, such as identifier names and
comments. Semantic clustering is one such technique for modularization of the
system that relies on the informal semantics of the program, encoded in the
vocabulary used in the source code. Treating the source code as a collection of
tokens loses the semantic information embedded within the identifiers. We try
to overcome this problem by introducing context models for source code
identifiers to obtain a semantic kernel, which can be used for both deriving
the topics that run through the system as well as their clustering. In the
first model, we abstract an identifier to its type representation and build on
this notion of context to construct contextual vector representation of the
source code. The second notion of context is defined based on the flow of data
between identifiers to represent a module as a dependency graph where the nodes
correspond to identifiers and the edges represent the data dependencies between
pairs of identifiers. We have applied our approach to 10 medium-sized open
source Java projects, and show that by introducing contexts for identifiers,
the quality of the modularization of the software systems is improved. Both of
the context models give results that are superior to the plain vector
representation of documents. In some cases, the authoritativeness of
decompositions is improved by 67%. Furthermore, a more detailed evaluation of
our approach on JEdit, an open source editor, demonstrates that topics
inferred by performing topic analysis on the contextual representations are more
meaningful than those from the plain representation of the documents. The proposed
approach in introducing a context model for source code identifiers paves the
way for building tools that support developers in program comprehension tasks
such as application and domain concept location, software modularization and
topic analysis.