2,098 research outputs found
TopSig: Topology Preserving Document Signatures
Performance comparisons between File Signatures and Inverted Files for text
retrieval have previously shown several significant shortcomings of file
signatures relative to inverted files. The inverted file approach underpins
most state-of-the-art search engine algorithms, such as Language and
Probabilistic models. It has been widely accepted that traditional file
signatures are inferior alternatives to inverted files. This paper describes
TopSig, a new approach to the construction of file signatures. Many advances in
semantic hashing and dimensionality reduction have been made in recent times,
but these were not so far linked to general purpose, signature file based,
search engines. This paper introduces a different signature file approach that
builds upon and extends these recent advances. We are able to demonstrate
significant improvements in the performance of signature file based indexing
and retrieval, performance that is comparable to that of state of the art
inverted file based systems, including Language models and BM25. These findings
suggest that file signatures offer a viable alternative to inverted files in
suitable settings and from the theoretical perspective it positions the file
signatures model in the class of Vector Space retrieval models.Comment: 12 pages, 8 figures, CIKM 201
Geobase Information System Impacts on Space Image Formats
As Geobase Information Systems increase in number, size and complexity, the format compatability of satellite remote sensing data becomes increasingly more important. Because of the vast and continually increasing quantity of data available from remote sensing systems the utility of these data is increasingly dependent on the degree to which their formats facilitate, or hinder, their incorporation into Geobase Information Systems. To merge satellite data into a geobase system requires that they both have a compatible geographic referencing system. Greater acceptance of satellite data by the user community will be facilitated if the data are in a form which most readily corresponds to existing geobase data structures. The conference addressed a number of specific topics and made recommendations
Study and simulation of low rate video coding schemes
The semiannual report is included. Topics covered include communication, information science, data compression, remote sensing, color mapped images, robust coding scheme for packet video, recursively indexed differential pulse code modulation, image compression technique for use on token ring networks, and joint source/channel coder design
Digital encoding of black and white facsimile signals
As the costs of digital signal processing and memory hardware are
decreasing each year compared to those of transmission, it is
increasingly economical to apply sophisticated source encoding
techniques to reduce the transmission time for facsimile documents.
With this intent, information lossy encoding schemes have been
investigated in which the encoder is divided into two stages.
Firstly, preprocessing, which removes redundant information from
the original documents, and secondly, actual encoding of the preprocessed
documents. [Continues.
Metaheuristic design of feedforward neural networks: a review of two decades of research
Over the past two decades, the feedforward neural network (FNN) optimization has been a key interest among the researchers and practitioners of multiple disciplines. The FNN optimization is often viewed from the various perspectives: the optimization of weights, network architecture, activation nodes, learning parameters, learning environment, etc. Researchers adopted such different viewpoints mainly to improve the FNN's generalization ability. The gradient-descent algorithm such as backpropagation has been widely applied to optimize the FNNs. Its success is evident from the FNN's application to numerous real-world problems. However, due to the limitations of the gradient-based optimization methods, the metaheuristic algorithms including the evolutionary algorithms, swarm intelligence, etc., are still being widely explored by the researchers aiming to obtain generalized FNN for a given problem. This article attempts to summarize a broad spectrum of FNN optimization methodologies including conventional and metaheuristic approaches. This article also tries to connect various research directions emerged out of the FNN optimization practices, such as evolving neural network (NN), cooperative coevolution NN, complex-valued NN, deep learning, extreme learning machine, quantum NN, etc. Additionally, it provides interesting research challenges for future research to cope-up with the present information processing era
Prospects and limitations of full-text index structures in genome analysis
The combination of incessant advances in sequencing technology producing large amounts of data and innovative bioinformatics approaches, designed to cope with this data flood, has led to new interesting results in the life sciences. Given the magnitude of sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article brings a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, interrelationships, the trade-offs they impose, but also their practical limitations, are explained and compared
New Techniques for the Modeling, Processing and Visualization of Surfaces and Volumes
With the advent of powerful 3D acquisition technology, there is a growing demand
for the modeling, processing, and visualization of surfaces and volumes. The
proposed methods must be efficient and robust, and they must be able to extract the essential structure of the data and to easily and quickly convey the most significant information to a human observer. Independent of the specific nature of the data, the following fundamental problems can be identified: shape reconstruction from discrete samples, data analysis, and data compression.
This thesis presents several novel solutions to these problems for surfaces
(Part I) and volumes (Part II). For surfaces, we adopt the well-known triangle
mesh representation and develop new algorithms for discrete curvature estimation,detection of feature lines, and line-art rendering (Chapter 3), for connectivity encoding (Chapter 4), and for topology preserving compression of 2D vector fields (Chapter 5). For volumes, that are often given as discrete samples, we base our approach for reconstruction and visualization on the use of new trivariate spline spaces on a certain tetrahedral partition. We study the properties of the new spline spaces (Chapter 7) and present efficient algorithms for reconstruction and visualization by iso-surface rendering for both, regularly (Chapter 8) and irregularly (Chapter 9) distributed data samples
A Survey on Array Storage, Query Languages, and Systems
Since scientific investigation is one of the most important providers of
massive amounts of ordered data, there is a renewed interest in array data
processing in the context of Big Data. To the best of our knowledge, a unified
resource that summarizes and analyzes array processing research over its long
existence is currently missing. In this survey, we provide a guide for past,
present, and future research in array processing. The survey is organized along
three main topics. Array storage discusses all the aspects related to array
partitioning into chunks. The identification of a reduced set of array
operators to form the foundation for an array query language is analyzed across
multiple such proposals. Lastly, we survey real systems for array processing.
The result is a thorough survey on array data storage and processing that
should be consulted by anyone interested in this research topic, independent of
experience level. The survey is not complete though. We greatly appreciate
pointers towards any work we might have forgotten to mention.Comment: 44 page
- …