13,807 research outputs found
An application of machine learning to the organization of institutional software repositories
Software reuse has become a major goal in the development of space systems, as a recent NASA-wide workshop on the subject made clear. The Data Systems Technology Division of Goddard Space Flight Center has been working on tools and techniques for promoting reuse, in particular in the development of satellite ground support software. One of these tools is the Experiment in Libraries via Incremental Schemata and Cobweb (ElvisC). ElvisC applies machine learning to the problem of organizing a reusable software component library for efficient and reliable retrieval. In this paper we describe the background factors that have motivated this work, present the design of the system, and evaluate the results of its application
An Effective Private Data storage and Retrieval System using Secret sharing scheme based on Secure Multi-party Computation
Privacy of the outsourced data is one of the major challenge.Insecurity of
the network environment and untrustworthiness of the service providers are
obstacles of making the database as a service.Collection and storage of
personally identifiable information is a major privacy concern.On-line public
databases and resources pose a significant risk to user privacy, since a
malicious database owner may monitor user queries and infer useful information
about the customer.The challenge in data privacy is to share data with
third-party and at the same time securing the valuable information from
unauthorized access and use by third party.A Private Information Retrieval(PIR)
scheme allows a user to query database while hiding the identity of the data
retrieved.The naive solution for confidentiality is to encrypt data before
outsourcing.Query execution,key management and statistical inference are major
challenges in this case.The proposed system suggests a mechanism for secure
storage and retrieval of private data using the secret sharing technique.The
idea is to develop a mechanism to store private information with a highly
available storage provider which could be accessed from anywhere using queries
while hiding the actual data values from the storage provider.The private
information retrieval system is implemented using Secure Multi-party
Computation(SMC) technique which is based on secret sharing. Multi-party
Computation enable parties to compute some joint function over their private
inputs.The query results are obtained by performing a secure computation on the
shares owned by the different servers.Comment: Data Science & Engineering (ICDSE), 2014 International Conference,
CUSA
Histogram-Aware Sorting for Enhanced Word-Aligned Compression in Bitmap Indexes
Bitmap indexes must be compressed to reduce input/output costs and minimize
CPU usage. To accelerate logical operations (AND, OR, XOR) over bitmaps, we use
techniques based on run-length encoding (RLE), such as Word-Aligned Hybrid
(WAH) compression. These techniques are sensitive to the order of the rows: a
simple lexicographical sort can divide the index size by 9 and make indexes
several times faster. We investigate reordering heuristics based on computed
attribute-value histograms. Simply permuting the columns of the table based on
these histograms can increase the sorting efficiency by 40%.Comment: To appear in proceedings of DOLAP 200
Geodatabase Development to Support Hyperspectral Imagery Exploitation
Geodatabase development for coastal studies conducted by the Naval Research Laboratory (NRL) is essential to support the exploitation of hyperspectral imagery (HSI). NRL has found that the remote sensing and mapping science community benefits from coastal classifications that group coastal types based on similar features. Selected features in project geodatabases relate to significant biological and physical forces that shape the coast. The project geodatabases help researchers understand factors that are necessary for imagery post processing, especially those features having a high degree of temporal and spatial variability. NRL project geodatabases include a hierarchy of environmental factors that extend from shallow water bottom types and beach composition to inland soil and vegetation characteristics. These geodatabases developed by NRL allow researchers to compare features among coast types. The project geodatabases may also be used to enhance littoral data archives that are sparse. This paper highlights geodatabase development for recent remote sensing experiments in barrier island, coral, and mangrove coast types
Designing a resource-efficient data structure for mobile data systems
Designing data structures for use in mobile devices requires attention on optimising data volumes with associated benefits for data transmission, storage space and battery use. For semi-structured data, tree summarisation techniques can be used to reduce the volume of structured elements while dictionary compression can efficiently deal with value-based predicates. This project seeks to investigate and evaluate an integration of the two approaches. The key strength of this technique is that both structural and value predicates could be resolved within one graph while further allowing for compression of the resulting data structure. As the current trend is towards the requirement for working with larger semi-structured data sets this work would allow for the utilisation of much larger data sets whilst reducing requirements on bandwidth and minimising the memory necessary both for the storage and querying of the data
Simulated evaluation of faceted browsing based on feature selection
In this paper we explore the limitations of facet based browsing which uses sub-needs of an information need for querying and organising the search process in video retrieval. The underlying assumption of this approach is that the search effectiveness will be enhanced if such an approach is employed for interactive video retrieval using textual and visual features. We explore the performance bounds of a faceted system by carrying out a simulated user evaluation on TRECVid data sets, and also on the logs of a prior user experiment with the system. We first present a methodology to reduce the dimensionality of features by selecting the most important ones. Then, we discuss the simulated evaluation strategies employed in our evaluation and the effect on the use of both textual and visual features. Facets created by users are simulated by clustering video shots using textual and visual features. The experimental results of our study demonstrate that the faceted browser can potentially improve the search effectiveness
An integrated ranking algorithm for efficient information computing in social networks
Social networks have ensured the expanding disproportion between the face of
WWW stored traditionally in search engine repositories and the actual ever
changing face of Web. Exponential growth of web users and the ease with which
they can upload contents on web highlights the need of content controls on
material published on the web. As definition of search is changing,
socially-enhanced interactive search methodologies are the need of the hour.
Ranking is pivotal for efficient web search as the search performance mainly
depends upon the ranking results. In this paper new integrated ranking model
based on fused rank of web object based on popularity factor earned over only
valid interlinks from multiple social forums is proposed. This model identifies
relationships between web objects in separate social networks based on the
object inheritance graph. Experimental study indicates the effectiveness of
proposed Fusion based ranking algorithm in terms of better search results.Comment: 14 pages, International Journal on Web Service Computing (IJWSC),
Vol.3, No.1, March 201
A Survey on Array Storage, Query Languages, and Systems
Since scientific investigation is one of the most important providers of
massive amounts of ordered data, there is a renewed interest in array data
processing in the context of Big Data. To the best of our knowledge, a unified
resource that summarizes and analyzes array processing research over its long
existence is currently missing. In this survey, we provide a guide for past,
present, and future research in array processing. The survey is organized along
three main topics. Array storage discusses all the aspects related to array
partitioning into chunks. The identification of a reduced set of array
operators to form the foundation for an array query language is analyzed across
multiple such proposals. Lastly, we survey real systems for array processing.
The result is a thorough survey on array data storage and processing that
should be consulted by anyone interested in this research topic, independent of
experience level. The survey is not complete though. We greatly appreciate
pointers towards any work we might have forgotten to mention.Comment: 44 page
- …