5,281 research outputs found
Estimating Sparse Signals Using Integrated Wideband Dictionaries
In this paper, we introduce a wideband dictionary framework for estimating
sparse signals. By formulating integrated dictionary elements spanning bands of
the considered parameter space, one may efficiently find and discard large
parts of the parameter space not active in the signal. After each iteration,
the zero-valued parts of the dictionary may be discarded to allow a refined
dictionary to be formed around the active elements, resulting in a zoomed
dictionary to be used in the following iterations. Implementing this scheme
allows for more accurate estimates, at a much lower computational cost, as
compared to directly forming a larger dictionary spanning the whole parameter
space or performing a zooming procedure using standard dictionary elements.
Different from traditional dictionaries, the wideband dictionary allows for the
use of dictionaries with fewer elements than the number of available samples
without loss of resolution. The technique may be used on both one- and
multi-dimensional signals, and may be exploited to refine several traditional
sparse estimators, here illustrated with the LASSO and the SPICE estimators.
Numerical examples illustrate the improved performance
Feasibility study of an Integrated Program for Aerospace vehicle Design (IPAD). Volume 1B: Concise review
Reports on the design process, support of the design process, IPAD System design catalog of IPAD technical program elements, IPAD System development and operation, and IPAD benefits and impact are concisely reviewed. The approach used to define the design is described. Major activities performed during the product development cycle are identified. The computer system requirements necessary to support the design process are given as computational requirements of the host system, technical program elements and system features. The IPAD computer system design is presented as concepts, a functional description and an organizational diagram of its major components. The cost and schedules and a three phase plan for IPAD implementation are presented. The benefits and impact of IPAD technology are discussed
RDF-TR: Exploiting structural redundancies to boost RDF compression
The number and volume of semantic data have grown impressively over the last decade, promoting compression as an essential tool for RDF preservation, sharing and management. In contrast to universal compressors, RDF compression techniques are able to detect and exploit specific forms of redundancy in RDF data. Thus, state-of-the-art RDF compressors excel at exploiting syntactic and semantic redundancies, i.e., repetitions in the serialization format and information that can be inferred implicitly. However, little attention has been paid to the existence of structural patterns within the RDF dataset; i.e. structural redundancy. In this paper, we analyze structural regularities in real-world datasets, and show three schema-based sources of redundancies that underpin the schema-relaxed nature of RDF. Then, we propose RDF-Tr (RDF Triples Reorganizer), a preprocessing technique that discovers and removes this kind of redundancy before the RDF dataset is effectively compressed. In particular, RDF-Tr groups subjects that are described by the same predicates, and locally re-codes the objects related to these predicates. Finally, we integrate
RDF-Tr with two RDF compressors, HDT and k2-triples. Our experiments show that using RDF-Tr with these compressors improves by up to 2.3 times their original effectiveness, outperforming the most prominent state-of-the-art techniques
Speech Recognition by Composition of Weighted Finite Automata
We present a general framework based on weighted finite automata and weighted
finite-state transducers for describing and implementing speech recognizers.
The framework allows us to represent uniformly the information sources and data
structures used in recognition, including context-dependent units,
pronunciation dictionaries, language models and lattices. Furthermore, general
but efficient algorithms can used for combining information sources in actual
recognizers and for optimizing their application. In particular, a single
composition algorithm is used both to combine in advance information sources
such as language models and dictionaries, and to combine acoustic observations
and information sources dynamically during recognition.Comment: 24 pages, uses psfig.st
Incremental View Maintenance For Collection Programming
In the context of incremental view maintenance (IVM), delta query derivation
is an essential technique for speeding up the processing of large, dynamic
datasets. The goal is to generate delta queries that, given a small change in
the input, can update the materialized view more efficiently than via
recomputation. In this work we propose the first solution for the efficient
incrementalization of positive nested relational calculus (NRC+) on bags (with
integer multiplicities). More precisely, we model the cost of NRC+ operators
and classify queries as efficiently incrementalizable if their delta has a
strictly lower cost than full re-evaluation. Then, we identify IncNRC+; a large
fragment of NRC+ that is efficiently incrementalizable and we provide a
semantics-preserving translation that takes any NRC+ query to a collection of
IncNRC+ queries. Furthermore, we prove that incremental maintenance for NRC+ is
within the complexity class NC0 and we showcase how recursive IVM, a technique
that has provided significant speedups over traditional IVM in the case of flat
queries [25], can also be applied to IncNRC+.Comment: 24 pages (12 pages plus appendix
Binary RDF for Scalable Publishing, Exchanging and Consumption in the Web of Data
El actual diluvio de datos está inundando la web con grandes volúmenes de datos representados en RDF, dando lugar a la denominada 'Web de Datos'. En esta tesis proponemos, en primer lugar, un estudio profundo de aquellos textos que nos permitan abordar un conocimiento global de la estructura real de los conjuntos de datos RDF, HDT, que afronta la representación eficiente de grandes volúmenes de datos RDF a través de estructuras optimizadas para su almacenamiento y transmisión en red. HDT representa efizcamente un conjunto de datos RDF a través de su división en tres componentes: la cabecera (Header), el diccionario (Dictionary) y la estructura de sentencias RDF (Triples). A continuación, nos centramos en proveer estructuras eficientes de dichos componentes, ocupando un espacio comprimido al tiempo que se permite el acceso directo a cualquier dat
XML Matchers: approaches and challenges
Schema Matching, i.e. the process of discovering semantic correspondences
between concepts adopted in different data source schemas, has been a key topic
in Database and Artificial Intelligence research areas for many years. In the
past, it was largely investigated especially for classical database models
(e.g., E/R schemas, relational databases, etc.). However, in the latest years,
the widespread adoption of XML in the most disparate application fields pushed
a growing number of researchers to design XML-specific Schema Matching
approaches, called XML Matchers, aiming at finding semantic matchings between
concepts defined in DTDs and XSDs. XML Matchers do not just take well-known
techniques originally designed for other data models and apply them on
DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical
structure of a DTD/XSD) to improve the performance of the Schema Matching
process. The design of XML Matchers is currently a well-established research
area. The main goal of this paper is to provide a detailed description and
classification of XML Matchers. We first describe to what extent the
specificities of DTDs/XSDs impact on the Schema Matching task. Then we
introduce a template, called XML Matcher Template, that describes the main
components of an XML Matcher, their role and behavior. We illustrate how each
of these components has been implemented in some popular XML Matchers. We
consider our XML Matcher Template as the baseline for objectively comparing
approaches that, at first glance, might appear as unrelated. The introduction
of this template can be useful in the design of future XML Matchers. Finally,
we analyze commercial tools implementing XML Matchers and introduce two
challenging issues strictly related to this topic, namely XML source clustering
and uncertainty management in XML Matchers.Comment: 34 pages, 8 tables, 7 figure
Improving Knowledge-Based Systems with statistical techniques, text mining, and neural networks for non-technical loss detection
Currently, power distribution companies have several problems that are related to energy losses. For
example, the energy used might not be billed due to illegal manipulation or a breakdown in the customer’s
measurement equipment. These types of losses are called non-technical losses (NTLs), and these
losses are usually greater than the losses that are due to the distribution infrastructure (technical losses).
Traditionally, a large number of studies have used data mining to detect NTLs, but to the best of our
knowledge, there are no studies that involve the use of a Knowledge-Based System (KBS) that is created
based on the knowledge and expertise of the inspectors. In the present study, a KBS was built that is
based on the knowledge and expertise of the inspectors and that uses text mining, neural networks,
and statistical techniques for the detection of NTLs. Text mining, neural networks, and statistical techniques
were used to extract information from samples, and this information was translated into rules,
which were joined to the rules that were generated by the knowledge of the inspectors. This system
was tested with real samples that were extracted from Endesa databases. Endesa is one of the most
important distribution companies in Spain, and it plays an important role in international markets in
both Europe and South America, having more than 73 million customers
- …