3,390 research outputs found

    MIHash: Online Hashing with Mutual Information

    Full text link
    Learning-based hashing methods are widely used for nearest neighbor retrieval, and recently, online hashing methods have demonstrated good performance-complexity trade-offs by learning hash functions from streaming data. In this paper, we first address a key challenge for online hashing: the binary codes for indexed data must be recomputed to keep pace with updates to the hash functions. We propose an efficient quality measure for hash functions, based on an information-theoretic quantity, mutual information, and use it successfully as a criterion to eliminate unnecessary hash table updates. Next, we also show how to optimize the mutual information objective using stochastic gradient descent. We thus develop a novel hashing method, MIHash, that can be used in both online and batch settings. Experiments on image retrieval benchmarks (including a 2.5M image dataset) confirm the effectiveness of our formulation, both in reducing hash table recomputations and in learning high-quality hash functions.Comment: International Conference on Computer Vision (ICCV), 201

    Knowledge-infused and Consistent Complex Event Processing over Real-time and Persistent Streams

    Full text link
    Emerging applications in Internet of Things (IoT) and Cyber-Physical Systems (CPS) present novel challenges to Big Data platforms for performing online analytics. Ubiquitous sensors from IoT deployments are able to generate data streams at high velocity, that include information from a variety of domains, and accumulate to large volumes on disk. Complex Event Processing (CEP) is recognized as an important real-time computing paradigm for analyzing continuous data streams. However, existing work on CEP is largely limited to relational query processing, exposing two distinctive gaps for query specification and execution: (1) infusing the relational query model with higher level knowledge semantics, and (2) seamless query evaluation across temporal spaces that span past, present and future events. These allow accessible analytics over data streams having properties from different disciplines, and help span the velocity (real-time) and volume (persistent) dimensions. In this article, we introduce a Knowledge-infused CEP (X-CEP) framework that provides domain-aware knowledge query constructs along with temporal operators that allow end-to-end queries to span across real-time and persistent streams. We translate this query model to efficient query execution over online and offline data streams, proposing several optimizations to mitigate the overheads introduced by evaluating semantic predicates and in accessing high-volume historic data streams. The proposed X-CEP query model and execution approaches are implemented in our prototype semantic CEP engine, SCEPter. We validate our query model using domain-aware CEP queries from a real-world Smart Power Grid application, and experimentally analyze the benefits of our optimizations for executing these queries, using event streams from a campus-microgrid IoT deployment.Comment: 34 pages, 16 figures, accepted in Future Generation Computer Systems, October 27, 201

    Sharing resources for performance and energy optimization of concurrent streaming applications

    Get PDF
    We aim at finding optimal mappings for concurrent streaming applications. Each application consists of a linear chain with several stages, and processes successive data sets in pipeline mode. The objective is to minimize the energy consumption of the whole platform, while satisfying given performance-related bounds on the period and latency of each application. The problem is to decide which processors to enroll, at which speed (or mode) to use them, and which stages they should execute. Processors can be identical (with the same modes) or heterogeneous. We also distinguish two mapping categories, interval mappings, and general mappings. For interval mappings, a processor is assigned a set of consecutive stages of the same application, so there is no resource sharing across applications. On the contrary, the assignment is fully arbitrary for general mappings, hence a processor can be reused for several applications. On the theoretical side, we establish complexity results for this tri-criteria mapping problem (energy, period, latency), classifying polynomial versus NP-complete instances. Furthermore, we derive an integer linear program that provides the optimal solution in the most general case. On the experimental side, we design polynomial-time heuristics, and assess their absolute performance thanks to the linear program. One main goal is to assess the impact of processor sharing on the quality of the solution

    Worst-case temporal analysis of real-time dynamic streaming applications

    Get PDF

    Dealing with data and software interoperability issues in digital factories

    Get PDF
    The digital factory paradigm comprises a multi-layered integration of the information related to various activities along the factory and product lifecycle manufacturing related resources. A central aspect of a digital factory is that of enabling the product lifecycle stakeholders to collaborate through the use of software solutions. The digital factory thus expands outside the actual company boundaries and offers the opportunity for the business and its suppliers to collaborate on business processes that affect the whole supply chain. This paper discusses an interoperability architecture for digital factories. To this end, it delves into the issue by analysing the main challenges that must be addressed to support an integrated and scalable factory architecture characterized by access to services, aggregation of data, and orchestration of production processes. Then, it revises the state of the art in the light of these requirements and proposes a general architectural framework conjugating the most interesting features of serviceoriented architectures and data sharing architectures. The study is exemplified through a case study
    corecore