13,807 research outputs found

    An application of machine learning to the organization of institutional software repositories

    Get PDF
    Software reuse has become a major goal in the development of space systems, as a recent NASA-wide workshop on the subject made clear. The Data Systems Technology Division of Goddard Space Flight Center has been working on tools and techniques for promoting reuse, in particular in the development of satellite ground support software. One of these tools is the Experiment in Libraries via Incremental Schemata and Cobweb (ElvisC). ElvisC applies machine learning to the problem of organizing a reusable software component library for efficient and reliable retrieval. In this paper we describe the background factors that have motivated this work, present the design of the system, and evaluate the results of its application.
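    As a rough illustration of the kind of organization such a tool performs (the paper's Cobweb-based approach is considerably more sophisticated), the sketch below incrementally groups components by keyword overlap; the component names, keywords, and similarity threshold are all invented for the example.

```python
# Illustrative sketch (not ElvisC's actual Cobweb algorithm): incrementally group
# reusable components by keyword overlap so a library can be browsed by cluster.
# Component names, keywords, and the threshold are invented for this example.

def jaccard(a, b):
    """Similarity between two keyword sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def add_component(clusters, name, keywords, threshold=0.3):
    """Place a component into the most similar cluster, or start a new one."""
    best, best_sim = None, 0.0
    for cluster in clusters:
        sim = jaccard(keywords, cluster["keywords"])
        if sim > best_sim:
            best, best_sim = cluster, sim
    if best is not None and best_sim >= threshold:
        best["members"].append(name)
        best["keywords"] |= keywords          # broaden the cluster's profile
    else:
        clusters.append({"members": [name], "keywords": set(keywords)})

clusters = []
add_component(clusters, "telemetry_decoder", {"telemetry", "packet", "decode"})
add_component(clusters, "frame_sync", {"telemetry", "frame", "sync"})
add_component(clusters, "orbit_propagator", {"orbit", "ephemeris", "propagation"})
for c in clusters:
    print(sorted(c["members"]))
```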

    An Effective Private Data storage and Retrieval System using Secret sharing scheme based on Secure Multi-party Computation

    Full text link
    Privacy of the outsourced data is one of the major challenges. Insecurity of the network environment and untrustworthiness of the service providers are obstacles to offering the database as a service. The collection and storage of personally identifiable information is a major privacy concern. On-line public databases and resources pose a significant risk to user privacy, since a malicious database owner may monitor user queries and infer useful information about the customer. The challenge in data privacy is to share data with a third party while at the same time securing the valuable information from unauthorized access and use by that third party. A Private Information Retrieval (PIR) scheme allows a user to query a database while hiding the identity of the data retrieved. The naive solution for confidentiality is to encrypt data before outsourcing; query execution, key management and statistical inference are major challenges in this case. The proposed system suggests a mechanism for secure storage and retrieval of private data using the secret sharing technique. The idea is to develop a mechanism to store private information with a highly available storage provider which could be accessed from anywhere using queries while hiding the actual data values from the storage provider. The private information retrieval system is implemented using a Secure Multi-party Computation (SMC) technique which is based on secret sharing. Multi-party computation enables parties to compute some joint function over their private inputs. The query results are obtained by performing a secure computation on the shares owned by the different servers. Comment: Data Science & Engineering (ICDSE), 2014 International Conference, CUSA
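    As an illustration of the secret-sharing idea underlying such SMC schemes, here is a minimal additive-sharing sketch; the prime modulus, the three-server setup, and the sample values are assumptions for illustration, not details taken from the paper.

```python
# Minimal additive secret-sharing sketch (one common basis for SMC-style private
# storage); the modulus, server count, and sample values are assumptions.
import secrets

PRIME = 2**61 - 1          # field modulus; any sufficiently large prime works

def share(value, n_servers=3):
    """Split a value into n additive shares; any n-1 shares reveal nothing."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_servers - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]

def reconstruct(shares):
    """Recombine the shares held by the servers to recover the stored value."""
    return sum(shares) % PRIME

def add_shared(shares_a, shares_b):
    """Servers can add two shared values locally, without seeing either value."""
    return [(a + b) % PRIME for a, b in zip(shares_a, shares_b)]

salary, bonus = 52000, 3000
s1, s2 = share(salary), share(bonus)
total_shares = add_shared(s1, s2)           # computed share-by-share on the servers
assert reconstruct(total_shares) == salary + bonus
```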

    Histogram-Aware Sorting for Enhanced Word-Aligned Compression in Bitmap Indexes

    Get PDF
    Bitmap indexes must be compressed to reduce input/output costs and minimize CPU usage. To accelerate logical operations (AND, OR, XOR) over bitmaps, we use techniques based on run-length encoding (RLE), such as Word-Aligned Hybrid (WAH) compression. These techniques are sensitive to the order of the rows: a simple lexicographical sort can divide the index size by 9 and make indexes several times faster. We investigate reordering heuristics based on computed attribute-value histograms. Simply permuting the columns of the table based on these histograms can increase the sorting efficiency by 40%. Comment: To appear in proceedings of DOLAP 200
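    The effect the abstract describes can be shown with a small sketch: permute columns so low-cardinality attributes come first (a crude proxy for the paper's histogram-based heuristics), sort rows lexicographically, and count bitmap runs as a stand-in for RLE/WAH cost. The data, column ordering rule, and run count metric below are assumptions for illustration.

```python
# Illustrative sketch: lexicographically sort rows after permuting columns so that
# low-cardinality columns come first (a simple proxy for histogram-based heuristics),
# and count bitmap runs as a stand-in for WAH/RLE compression cost.
import random

def runs(bits):
    """Number of runs in a bitmap; fewer runs means better RLE compression."""
    return 1 + sum(1 for i in range(1, len(bits)) if bits[i] != bits[i - 1])

def index_cost(table):
    """Total runs over all attribute-value bitmaps of the table."""
    total = 0
    for col in range(len(table[0])):
        for value in {row[col] for row in table}:
            total += runs([1 if row[col] == value else 0 for row in table])
    return total

random.seed(1)
table = [(random.choice("ABCDEFGH"), random.choice("AB")) for _ in range(1000)]

unsorted_cost = index_cost(table)
# permute columns: fewer distinct values first, then sort rows lexicographically
order = sorted(range(2), key=lambda c: len({row[c] for row in table}))
sorted_table = sorted(tuple(row[c] for c in order) for row in table)
print(unsorted_cost, index_cost(sorted_table))   # sorting sharply reduces run count
```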

    Geodatabase Development to Support Hyperspectral Imagery Exploitation

    Get PDF
    Geodatabase development for coastal studies conducted by the Naval Research Laboratory (NRL) is essential to support the exploitation of hyperspectral imagery (HSI). NRL has found that the remote sensing and mapping science community benefits from coastal classifications that group coastal types based on similar features. Selected features in project geodatabases relate to significant biological and physical forces that shape the coast. The project geodatabases help researchers understand factors that are necessary for imagery post processing, especially those features having a high degree of temporal and spatial variability. NRL project geodatabases include a hierarchy of environmental factors that extend from shallow water bottom types and beach composition to inland soil and vegetation characteristics. These geodatabases developed by NRL allow researchers to compare features among coast types. The project geodatabases may also be used to enhance littoral data archives that are sparse. This paper highlights geodatabase development for recent remote sensing experiments in barrier island, coral, and mangrove coast types.

    Designing a resource-efficient data structure for mobile data systems

    Get PDF
    Designing data structures for use in mobile devices requires attention to optimising data volumes, with associated benefits for data transmission, storage space and battery use. For semi-structured data, tree summarisation techniques can be used to reduce the volume of structured elements, while dictionary compression can efficiently deal with value-based predicates. This project seeks to investigate and evaluate an integration of the two approaches. The key strength of this technique is that both structural and value predicates could be resolved within one graph, while further allowing for compression of the resulting data structure. As the current trend is towards working with larger semi-structured data sets, this work would allow for the utilisation of much larger data sets whilst reducing requirements on bandwidth and minimising the memory necessary both for the storage and querying of the data.
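    A minimal sketch of the two ideas being integrated, assuming a simple (tag, children, text) encoding of a semi-structured document: a path summary captures the structure once, while a dictionary maps repeated values to small codes. The document shape and encoding are assumptions, not the project's actual data structure.

```python
# Illustrative sketch: a structural (path) summary of a semi-structured document
# plus dictionary compression of its values; the document and its (tag, children,
# text) encoding are invented for the example.

doc = ("library", [
    ("book", [("title", [], "Dune"), ("author", [], "Herbert")], None),
    ("book", [("title", [], "Foundation"), ("author", [], "Asimov")], None),
], None)

paths = set()            # tree summary: one entry per distinct root-to-node path
dictionary = {}          # value dictionary: string -> small integer code
encoded = []             # leaves stored as (path, value-code) pairs

def walk(node, prefix):
    tag, children, text = node
    path = prefix + "/" + tag
    paths.add(path)
    if text is not None:
        code = dictionary.setdefault(text, len(dictionary))
        encoded.append((path, code))
    for child in children:
        walk(child, path)

walk(doc, "")
print(sorted(paths))     # structure kept once, however many books there are
print(dictionary)        # repeated values would share one dictionary code
print(encoded)
```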

    Simulated evaluation of faceted browsing based on feature selection

    Get PDF
    In this paper we explore the limitations of facet-based browsing, which uses sub-needs of an information need for querying and organising the search process in video retrieval. The underlying assumption of this approach is that search effectiveness will be enhanced if such an approach is employed for interactive video retrieval using textual and visual features. We explore the performance bounds of a faceted system by carrying out a simulated user evaluation on TRECVid data sets, and also on the logs of a prior user experiment with the system. We first present a methodology to reduce the dimensionality of features by selecting the most important ones. Then, we discuss the simulated evaluation strategies employed in our evaluation and the effect on the use of both textual and visual features. Facets created by users are simulated by clustering video shots using textual and visual features. The experimental results of our study demonstrate that the faceted browser can potentially improve the search effectiveness.
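    A rough sketch of the pipeline the abstract outlines, with invented feature values: keep only the highest-variance feature dimensions, then group shots around a few seed shots as a stand-in for the clustering used to simulate user-created facets. The dimensionality, number of facets, and nearest-seed grouping are all assumptions.

```python
# Illustrative sketch: variance-based feature selection followed by a simple
# nearest-seed grouping of video shots to simulate facets; feature values and
# parameters are invented for the example.
import random

random.seed(0)
shots = [[random.random() for _ in range(10)] for _ in range(50)]   # 10-dim features

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# feature selection: keep the 3 dimensions with the highest variance
dims = sorted(range(10), key=lambda d: variance([s[d] for s in shots]), reverse=True)[:3]
reduced = [[s[d] for d in dims] for s in shots]

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# simulate facets: assign each shot to the nearest of k seed shots
k = 4
seeds = random.sample(reduced, k)
facets = {i: [] for i in range(k)}
for idx, shot in enumerate(reduced):
    best = min(range(k), key=lambda i: dist(shot, seeds[i]))
    facets[best].append(idx)
print({i: len(members) for i, members in facets.items()})
```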

    An integrated ranking algorithm for efficient information computing in social networks

    Full text link
    Social networks have widened the gap between the face of the WWW traditionally stored in search engine repositories and the actual, ever-changing face of the Web. The exponential growth of web users and the ease with which they can upload content to the web highlight the need for content controls on material published on the web. As the definition of search changes, socially-enhanced interactive search methodologies are the need of the hour. Ranking is pivotal for efficient web search, as search performance mainly depends upon the ranking results. In this paper, a new integrated ranking model is proposed, based on the fused rank of a web object computed from the popularity it earns over only valid interlinks from multiple social forums. This model identifies relationships between web objects in separate social networks based on the object inheritance graph. An experimental study indicates the effectiveness of the proposed fusion-based ranking algorithm in terms of better search results. Comment: 14 pages, International Journal on Web Service Computing (IJWSC), Vol.3, No.1, March 201
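    A minimal sketch of rank fusion in the spirit described, assuming per-forum popularity counts taken over valid interlinks, max-normalisation, and equal weights; the object names, forums, and weighting scheme are illustrative, not the paper's exact model.

```python
# Illustrative sketch of fusing per-network popularity into one rank; the forums,
# normalisation, and equal weights are assumptions, not the paper's exact model.

# popularity an object earned on each forum, counted only over valid interlinks
popularity = {
    "video_42": {"forum_a": 120, "forum_b": 15, "forum_c": 0},
    "photo_7":  {"forum_a": 30,  "forum_b": 90, "forum_c": 40},
    "post_99":  {"forum_a": 5,   "forum_b": 5,  "forum_c": 200},
}
weights = {"forum_a": 1.0, "forum_b": 1.0, "forum_c": 1.0}

def fused_rank(scores):
    """Normalise each forum's scores to [0, 1], then take a weighted sum."""
    maxima = {f: max(s[f] for s in scores.values()) or 1 for f in weights}
    return {
        obj: sum(weights[f] * s[f] / maxima[f] for f in weights)
        for obj, s in scores.items()
    }

for obj, score in sorted(fused_rank(popularity).items(), key=lambda x: -x[1]):
    print(f"{score:.2f}  {obj}")
```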

    A Survey on Array Storage, Query Languages, and Systems

    Full text link
    Since scientific investigation is one of the most important providers of massive amounts of ordered data, there is a renewed interest in array data processing in the context of Big Data. To the best of our knowledge, a unified resource that summarizes and analyzes array processing research over its long existence is currently missing. In this survey, we provide a guide for past, present, and future research in array processing. The survey is organized along three main topics. Array storage discusses all the aspects related to array partitioning into chunks. The identification of a reduced set of array operators to form the foundation for an array query language is analyzed across multiple such proposals. Lastly, we survey real systems for array processing. The result is a thorough survey on array data storage and processing that should be consulted by anyone interested in this research topic, independent of experience level. The survey is not complete though. We greatly appreciate pointers towards any work we might have forgotten to mention. Comment: 44 page
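    As a small illustration of the chunking that the storage part of such a survey covers, here is a sketch of regular (aligned) partitioning of a 2-D array into fixed-size chunks; the array and chunk shapes are arbitrary choices for the example.

```python
# Illustrative sketch of regular (aligned) chunking, the simplest array
# partitioning scheme; the array and chunk sizes below are arbitrary.

def chunk_ids(shape, chunk_shape):
    """Map each chunk id to the slice of the array it covers."""
    chunks = {}
    rows, cols = shape
    c_rows, c_cols = chunk_shape
    for i in range(0, rows, c_rows):
        for j in range(0, cols, c_cols):
            cid = (i // c_rows, j // c_cols)
            chunks[cid] = ((i, min(i + c_rows, rows)), (j, min(j + c_cols, cols)))
    return chunks

# a 10x10 array split into 4x4 chunks: edge chunks are smaller
for cid, ((r0, r1), (c0, c1)) in sorted(chunk_ids((10, 10), (4, 4)).items()):
    print(cid, f"rows {r0}:{r1}", f"cols {c0}:{c1}")
```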