
    QoS-Aware Middleware for Web Services Composition

    The paradigmatic shift from a Web of manual interactions to a Web of programmatic interactions driven by Web services is creating unprecedented opportunities for the formation of online Business-to-Business (B2B) collaborations. In particular, the creation of value-added services by composition of existing ones is gaining significant momentum. Since many available Web services provide overlapping or identical functionality, albeit with different Quality of Service (QoS), a choice needs to be made to determine which services are to participate in a given composite service. This paper presents a middleware platform that addresses the issue of selecting Web services for the purpose of their composition in a way that maximizes user satisfaction, expressed as utility functions over QoS attributes, while satisfying the constraints set by the user and by the structure of the composite service. Two selection approaches are described and compared: one based on local (task-level) selection of services and the other based on global allocation of tasks to services using integer programming.
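    As a rough illustration of the trade-off the paper studies, the sketch below contrasts local (task-level) selection with global selection for a tiny composite service. The QoS attributes, weights, utility scaling, and budget constraint are all invented for the example, and exhaustive enumeration stands in for the paper's integer programming formulation.

```python
# Hypothetical sketch: local (per-task) vs. global service selection.
# The QoS model, weights, and budget are illustrative assumptions, not
# the paper's exact formulation (which uses integer programming).
from itertools import product

# Candidate services per task: (name, price, latency_ms, reliability)
candidates = {
    "flight": [("F1", 50, 200, 0.99), ("F2", 30, 900, 0.95)],
    "hotel":  [("H1", 40, 150, 0.98), ("H2", 20, 700, 0.90)],
}
weights = {"price": 0.4, "latency": 0.3, "reliability": 0.3}

def utility(svc):
    _, price, latency, rel = svc
    # Lower price/latency and higher reliability are better; scales are ad hoc.
    return (weights["price"] * (1 - price / 100)
            + weights["latency"] * (1 - latency / 1000)
            + weights["reliability"] * rel)

# Local selection: best utility per task, ignoring end-to-end constraints.
local = {t: max(svcs, key=utility) for t, svcs in candidates.items()}

# Global selection: maximize total utility subject to an overall budget,
# enumerated exhaustively here in place of an integer program.
BUDGET = 60
best, best_u = None, float("-inf")
for combo in product(*candidates.values()):
    if sum(s[1] for s in combo) <= BUDGET:
        u = sum(utility(s) for s in combo)
        if u > best_u:
            best, best_u = combo, u

print("local: ", [s[0] for s in local.values()])
print("global:", [s[0] for s in best])
```

    In this toy instance, local selection picks F1 and H1 and exceeds the budget, while the global search settles on the cheaper F2 and H2; the paper's global approach resolves exactly this kind of conflict at scale.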

    SVS-JOIN: Efficient spatial visual similarity join for geo-multimedia

    In the big data era, massive amounts of multimedia data with geo-tags have been generated and collected by smart devices equipped with mobile communication and positioning sensor modules. This trend places greater demands on large-scale geo-multimedia retrieval. Spatial similarity join is one of the significant problems in the area of spatial databases. Previous works focused on the spatial textual document search problem rather than on geo-multimedia retrieval. In this paper, we investigate a novel geo-multimedia retrieval paradigm named spatial visual similarity join (SVS-JOIN for short), which aims to search for similar geo-image pairs in terms of both geo-location and visual content. First, we propose the definition of SVS-JOIN and then present the geographical and visual similarity measures. Inspired by approaches for textual similarity join, we develop an algorithm named SVS-JOIN B by combining the PPJOIN algorithm with visual similarity. In addition, an extension named SVS-JOIN G is developed, which utilizes a spatial grid strategy to improve search efficiency. To further speed up the search, a novel approach called SVS-JOIN Q is carefully designed, in which a quadtree and a global inverted index are employed. Comprehensive experiments are conducted on two geo-image datasets, and the results demonstrate that our solution addresses the SVS-JOIN problem effectively and efficiently.
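    The following minimal sketch, with invented data and parameters (alpha, tau, the distance cutoff), shows the join predicate the paper builds on: a weighted combination of spatial proximity and visual similarity. It uses a naive nested loop; the paper's SVS-JOIN B, SVS-JOIN G, and SVS-JOIN Q variants accelerate the same predicate with PPJOIN-style filtering, a spatial grid, and a quadtree plus a global inverted index.

```python
# Illustrative sketch of the SVS-JOIN predicate: a pair of geo-images
# matches when a weighted combination of spatial proximity and visual
# similarity exceeds a threshold. Naive nested-loop baseline only.
import math

def spatial_sim(p, q, d_max=1000.0):
    # Distance-decay similarity in [0, 1]; d_max is an assumed cutoff.
    d = math.hypot(p[0] - q[0], p[1] - q[1])
    return max(0.0, 1.0 - d / d_max)

def visual_sim(a, b):
    # Jaccard similarity over visual-word sets (e.g., quantized features).
    return len(a & b) / len(a | b) if a | b else 0.0

def svs_join(images, alpha=0.5, tau=0.6):
    # images: list of (id, (x, y), visual_word_set)
    pairs = []
    for i in range(len(images)):
        for j in range(i + 1, len(images)):
            (ia, pa, va), (ib, pb, vb) = images[i], images[j]
            score = alpha * spatial_sim(pa, pb) + (1 - alpha) * visual_sim(va, vb)
            if score >= tau:
                pairs.append((ia, ib, round(score, 3)))
    return pairs

imgs = [(1, (0, 0), {"sky", "sea"}), (2, (50, 40), {"sky", "sea", "boat"}),
        (3, (900, 900), {"car"})]
print(svs_join(imgs))  # only the nearby, visually similar pair (1, 2) survives
```

    Pairs are reported when the combined score reaches the threshold tau; the indexing variants prune most candidate pairs before this score is ever computed.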

    Growth of relational model: Interdependence and complementary to big data

    A database management system is a long-standing application of computer science that provides a platform for the creation, movement, and use of voluminous data. The area has witnessed a series of developments and technological advancements, from the conventional structured database to the recent buzzword, big data. This paper aims to provide a complete model of the relational database, which is still widely used because of its well-known ACID properties, namely atomicity, consistency, isolation, and durability. Specifically, the objective of this paper is to highlight the adoption of relational model approaches by big data techniques. To address the reasons for this incorporation, this paper qualitatively studies the advancements made over time to the relational data model. First, the variations in data storage layout are illustrated based on the needs of the application. Second, quick data retrieval techniques such as indexing, query processing, and concurrency control methods are described. The paper provides vital insights for appraising the efficiency of the structured database in the unstructured environment, particularly when both consistency and scalability become an issue in the working of a hybrid transactional and analytical database management system.
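    As a concrete reminder of what the ACID guarantees buy, here is a minimal sketch of atomicity using Python's standard sqlite3 module: a two-step transfer either commits entirely or rolls back entirely when a constraint is violated. The schema and amounts are invented for illustration.

```python
# Minimal sketch of atomicity (the "A" in ACID): either both legs of a
# transfer commit, or neither does. Table and amounts are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, "
             "balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 10)])
conn.commit()

def transfer(src, dst, amount):
    try:
        with conn:  # transaction: commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))   # credit succeeds first...
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))   # ...debit violates the CHECK constraint
    except sqlite3.IntegrityError:
        print("transfer aborted, no partial update")

transfer("bob", "alice", 50)  # bob only has 10, so the whole change is undone
print(dict(conn.execute("SELECT name, balance FROM accounts")))
```

    Isolation and durability need concurrent transactions and crash recovery to demonstrate, which a sketch this small cannot do.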

    Filtering data streams for entity-based continuous queries

    The idea of allowing query users to relax their correctness requirements in order to improve the performance of a data stream management system (e.g., location-based services and sensor networks) has recently been studied. By exploiting the maximum error (or tolerance) allowed in query answers, algorithms for reducing the use of system resources have been developed. In most of these works, however, query tolerance is expressed as a numerical value, which may be difficult to specify. We observe that in many situations, users may not be concerned with the actual value of an answer, but rather with which object satisfies a query (e.g., "who is my nearest neighbor?"). In particular, an entity-based query returns only the names of objects that satisfy the query. For these queries, it is possible to specify a tolerance that is "nonvalue-based". In this paper, we study fraction-based tolerance, a type of nonvalue-based tolerance, where a user specifies the maximum fractions of a query answer that can be false positives and false negatives. We develop fraction-based tolerance for two major classes of entity-based queries: 1) nonrank-based queries (e.g., range queries) and 2) rank-based queries (e.g., k-nearest-neighbor queries). These definitions provide users with an alternative way to specify the maximum tolerance allowed in their answers. We further investigate how these definitions can be exploited in a distributed stream environment. We design adaptive filter algorithms that allow updates to be dropped conditionally at the data stream sources without affecting the overall query correctness. Extensive experimental results show that our protocols reduce the use of network and energy resources significantly. © 2006 IEEE.
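    A minimal sketch of the fraction-based tolerance check, under the simplifying assumption that a reported answer set can be compared against the exact one: the answer is acceptable if at most a fraction fp of it consists of false positives and at most a fraction fn of the true answer is missing. The function name and data are invented for illustration.

```python
# Sketch of fraction-based tolerance for an entity-based query: the
# reported answer may contain at most a fraction fp of false positives
# and miss at most a fraction fn of the true answer. The paper's exact
# definitions and adaptive filter protocols are more involved.
def within_tolerance(reported, exact, fp=0.1, fn=0.1):
    reported, exact = set(reported), set(exact)
    false_pos = len(reported - exact) / len(reported) if reported else 0.0
    false_neg = len(exact - reported) / len(exact) if exact else 0.0
    return false_pos <= fp and false_neg <= fn

# A range query's true answer vs. what stale (filtered) sources reported.
exact = {"o1", "o2", "o3", "o4", "o5", "o6", "o7", "o8", "o9", "o10"}
reported = {"o1", "o2", "o3", "o4", "o5", "o6", "o7", "o8", "o9", "o11"}
print(within_tolerance(reported, exact))  # one FP and one FN out of 10 -> True
```

    In the paper's distributed setting, filters at the stream sources decide which updates may be dropped so that bounds like these hold without the server seeing every update.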

    Semantic Web: Who is who in the field – A bibliometric analysis

    The Semantic Web (SW) is one of the main efforts aiming to enhance human and machine interaction by representing data in a way that is understandable for machines, so that they can mediate data and services. It is a fast-moving and multidisciplinary field. This study conducts a thorough bibliometric analysis of the field by collecting data from Web of Science (WOS) and Scopus for the period 1960-2009. It utilizes a total of 44,157 papers with 651,673 citations from Scopus, and 22,951 papers with 571,911 citations from WOS. Based on these papers and citations, it evaluates the research performance of the SW field by identifying the most productive players, major scholarly communication media, highly cited authors, influential papers, and emerging stars.
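    One core bibliometric step, ranking authors by accumulated citations, can be sketched in a few lines; the records below are invented, whereas the study works from Web of Science and Scopus exports.

```python
# Toy sketch of one step such a bibliometric study performs: ranking
# authors by total citations across their papers. Data is invented.
from collections import Counter

papers = [
    {"authors": ["A. Smith", "B. Lee"], "citations": 120},
    {"authors": ["B. Lee"], "citations": 45},
    {"authors": ["C. Garcia", "A. Smith"], "citations": 80},
]

totals = Counter()
for p in papers:
    for author in p["authors"]:
        totals[author] += p["citations"]

for author, cites in totals.most_common():
    print(f"{author}: {cites}")
```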

    A Novel Method to Prevent Misconfigurations of Industrial Automation and Control Systems

    Configuration errors are among the dominant causes of system faults in industrial automation and control systems (IACS). Such errors are difficult to detect and correct because IACS comprise many kinds of systems and devices with miscellaneous configuration specifications. In this paper, we first propose a streaming algorithm that keeps all configuration changes within limited memory space. Then, when a new configuration change is made, another novel streaming algorithm searches for and returns all similar historical changes, which can be used to validate the new one. To the best of our knowledge, we are the first to model the configuration changes of IACS as a data stream and apply streaming similarity search to correcting configuration errors while overcoming the inherent unbounded-memory bottleneck. Theoretical correctness and complexity analyses are presented. Experiments with real and synthetic datasets confirm the theoretical analyses and demonstrate the effectiveness of the proposed method in preventing misconfigurations of IACS.
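    The workflow can be sketched as follows, with a fixed-capacity window standing in for the paper's streaming summaries (which retain information about all changes in bounded memory) and Jaccard similarity over changed settings standing in for its similarity measure; device names and settings are invented.

```python
# Hedged sketch of the workflow: maintain a bounded-memory history of
# configuration changes and, before applying a new change, retrieve the
# most similar past changes to validate it.
from collections import deque

class ChangeHistory:
    def __init__(self, capacity=1000):
        self.window = deque(maxlen=capacity)  # bounded memory

    def record(self, change):
        # change: set of "key=value" settings modified together
        self.window.append(change)

    def similar(self, change, k=3):
        def jaccard(a, b):
            return len(a & b) / len(a | b) if a | b else 0.0
        ranked = sorted(self.window, key=lambda c: jaccard(c, change), reverse=True)
        return ranked[:k]

hist = ChangeHistory()
hist.record({"plc7/rate=100", "plc7/mode=auto"})
hist.record({"hmi2/lang=en"})
new_change = {"plc7/rate=250", "plc7/mode=auto"}
print(hist.similar(new_change, k=1))  # past changes touching the same device
```

    An operator applying new_change would then inspect the returned historical changes, and their outcomes, before committing it.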

    Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization

    Protecting vast quantities of data poses a daunting challenge for the growing number of organizations that collect, stockpile, and monetize it. The ability to distinguish data that is actually needed from data collected "just in case" would help these organizations to limit the latter's exposure to attack. A natural approach might be to monitor data use and retain only the working set of in-use data in accessible storage; unused data can be evicted to a highly protected store. However, many of today's big data applications rely on machine learning (ML) workloads that are periodically retrained by accessing, and thus exposing to attack, the entire data store. Training set minimization methods, such as count featurization, are often used to limit the data needed to train ML workloads in order to improve performance or scalability. We present Pyramid, a limited-exposure data management system that builds upon count featurization to enhance data protection. As such, Pyramid uniquely introduces both the idea and a proof of concept for leveraging training set minimization methods to instill rigor and selectivity into big data management. We integrated Pyramid into Spark Velox, a framework for ML-based targeting and personalization. We evaluate it on three applications and show that Pyramid approaches the performance of state-of-the-art models while training on less than 1% of the raw data.
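    For readers unfamiliar with the underlying technique, here is a minimal sketch of count featurization: each categorical value is summarized by label-conditioned counts, so that a model can train on the compact count table rather than on raw records. Pyramid additionally protects these counts (e.g., keeping them in compact, noise-infused structures), which this toy version omits; the feature names and data are invented.

```python
# Minimal sketch of count featurization, the training-set minimization
# technique Pyramid builds on: replace a categorical value with
# label-conditioned counts accumulated from past observations.
from collections import defaultdict

counts = defaultdict(lambda: [0, 0])  # value -> [negatives, positives]

def observe(value, label):
    counts[value][label] += 1

def featurize(value, smoothing=1.0):
    neg, pos = counts[value]
    # Smoothed positive rate plus raw volume as the derived features.
    rate = (pos + smoothing) / (pos + neg + 2 * smoothing)
    return [rate, pos + neg]

# Stream of (ad_id, clicked) observations -- invented example data.
for ad, clicked in [("ad1", 1), ("ad1", 0), ("ad1", 1), ("ad2", 0)]:
    observe(ad, clicked)

print(featurize("ad1"))  # high click rate, seen 3 times
print(featurize("ad2"))  # low click rate, seen once
```

    Because only the counts are needed at training time, the raw observations can be moved to cold, heavily protected storage, which is the selectivity Pyramid exploits.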