64 research outputs found
Trajectory-Based Spatiotemporal Entity Linking
Trajectory-based spatiotemporal entity linking matches the same moving
object across different datasets based on its movement traces. It is a
fundamental step to support spatiotemporal data integration and analysis. In
this paper, we study the problem of spatiotemporal entity linking using
effective and concise signatures extracted from their trajectories. This
linking problem is formalized as a k-nearest neighbor (k-NN) query on the
signatures. Four representation strategies (sequential, temporal, spatial, and
spatiotemporal) and two quantitative criteria (commonality and unicity) are
investigated for signature construction. A simple yet effective dimension
reduction strategy is developed together with a novel indexing structure called
the WR-tree to speed up the search. A number of optimization methods are
proposed to improve the accuracy and robustness of the linking. Our extensive
experiments on real-world datasets verify the superiority of our approach over
the state-of-the-art solutions in terms of both accuracy and efficiency.Comment: 15 pages, 3 figures, 15 table
Can Exclusive Clustering on Streaming Data be Achieved?
Clustering on streaming data aims at partitioning a list of data points into k groups of "similar" objects by scanning the data once. Most current one-scan clustering algorithms do not keep the original data in the resulting clusters. The output of these algorithms is therefore not the clustered data points but approximations of data properties under the predefined similarity function, such that k centers and radii reflect the up-to-date data grouping. In this paper, we raise a critical question: can partition-based clustering, or exclusive clustering, be achieved on streaming data by the currently available algorithms? After identifying the differences between traditional clustering and clustering on data streams, we discuss the basic requirements for clusters that can be discovered from streaming data. We evaluate recent work based on a subcluster maintenance approach. Using a few straightforward examples, we illustrate that the subcluster maintenance approach may fail to resolve exclusive clustering on data streams. Based on our observations, we also present challenges for any heuristic method that claims to solve the clustering problem on data streams in general.
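The abstract's objection can be made concrete with a small sketch. The code below is a hypothetical simplification of the summary-maintenance style it critiques: each cluster keeps only a count and a running mean, never the original points, so an early assignment can never be revised.

```python
# Minimal one-scan summary clustering (illustrative simplification):
# clusters are reduced to (mean, count); original points are discarded.
def stream_cluster(points, k=2):
    centers = []          # list of [mean, count]
    assignment = {}       # point index -> cluster chosen at arrival time
    for i, p in enumerate(points):
        if len(centers) < k:
            centers.append([float(p), 1])
            assignment[i] = len(centers) - 1
            continue
        j = min(range(k), key=lambda c: abs(p - centers[c][0]))
        m, n = centers[j]
        centers[j] = [(m * n + p) / (n + 1), n + 1]
        assignment[i] = j
    return centers, assignment

pts = [0, 7, 4, 10, 10, 10, 10]
centers, assign = stream_cluster(pts)
# Point 4 was irrevocably merged into cluster 1 while its center was at 7,
# but under the final centers (0 and 8.5) it lies nearer cluster 0, where a
# batch algorithm over the full data would also place it. The summary cannot
# produce that exclusive partition.
final = min(range(2), key=lambda c: abs(4 - centers[c][0]))
print(assign[2], final)  # → 1 0
```

This mismatch between arrival-time assignment and the final partition is exactly the gap between summary maintenance and exclusive clustering that the paper's examples exploit.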
A new polynomial time algorithm for BCNF relational database design
In this paper, we formalize the splittability of fact types in a NIAM conceptual schema by using functional dependencies. We also present a polynomial-time algorithm for the design of BCNF relational databases.
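For context, the textbook BCNF decomposition that such design algorithms refine can be sketched in a few lines. This is the standard construction, not the paper's NIAM-specific polynomial-time algorithm; it repeatedly splits a relation on a functional dependency X → Y whose left side is not a superkey.

```python
# Textbook BCNF decomposition sketch (not the paper's algorithm).
# FDs are given as (frozenset(lhs), frozenset(rhs)) pairs.
def closure(attrs, fds):
    """Attribute closure of `attrs` under the given functional dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_decompose(relation, fds):
    relation = frozenset(relation)
    for lhs, rhs in fds:
        if lhs <= relation and (rhs & relation) - lhs:
            if not relation <= closure(lhs, fds):          # lhs not a superkey
                r1 = lhs | (closure(lhs, fds) & relation)  # lhs + what it determines
                r2 = lhs | (relation - r1)
                return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)
    return [relation]

# Example: R(A, B, C) with B -> C; B is not a key, so R splits into (B, C), (A, B).
fds = [(frozenset("B"), frozenset("C"))]
for r in bcnf_decompose("ABC", fds):
    print(sorted(r))  # → ['B', 'C'] then ['A', 'B']
```

The naive version can take exponential time in the worst case because closures are recomputed at every split, which is why a guaranteed polynomial-time design algorithm is a meaningful contribution.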
A Highly Fault-Tolerant Quorum Consensus Method for Managing Replicated Data
Abstract. The main objective of data replication is to provide high availability of data for processing transactions. Quorum consensus (QC) methods are frequently applied to managing replicated data. In this paper, we present a new QC method. The proposed QC approach has a low message overhead: 1) in the best case, each transaction operation needs to communicate with only O(√n) remote sites up to polylogarithmic factors, where n is the number of sites storing the manipulated data item; 2) in the worst case, each transaction operation may be forced to communicate with O(√n) remote sites up to a larger polylogarithmic factor. Further, we show that the proposed QC method is highly fault-tolerant. The proposed approach is also fully distributed, that is, each site in a distributed system bears equal responsibility. Key words: concurrency control, distributed computing, fault tolerance, quorum consensus method.
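The baseline that such methods improve on is the simple majority-quorum scheme, sketched below. This is illustrative only: the paper's contribution is precisely that its quorums are much smaller than majorities (roughly √n sites) while remaining fault-tolerant; the class and function names here are assumptions.

```python
# Generic majority-quorum read/write sketch (illustrative baseline, not the
# paper's method). Correctness rests on any two majorities intersecting.
import random

class Replica:
    def __init__(self):
        self.version, self.value = 0, None

def write(replicas, value):
    """Write to a majority quorum with a fresh version number; a later read
    quorum must overlap it and so sees at least one up-to-date copy."""
    q = random.sample(replicas, len(replicas) // 2 + 1)
    new_version = max(r.version for r in q) + 1
    for r in q:
        r.version, r.value = new_version, value
    return new_version

def read(replicas):
    q = random.sample(replicas, len(replicas) // 2 + 1)
    return max(q, key=lambda r: r.version).value

sites = [Replica() for _ in range(5)]
write(sites, "a")
write(sites, "b")
print(read(sites))  # → b
```

Majority quorums contact O(n) sites per operation; structured quorum constructions shrink this toward O(√n), which is the regime the abstract's bounds describe.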
On capturing process requirements of workflow-based information systems
Abstract. Workflow technology manages the execution of business activities and coordinates the flow of information throughout the enterprise. It is emerging as one of the fastest growing disciplines in information technology. It is essential to correctly and effectively capture the workflow-specific requirements of business information systems before their deployment through workflow management systems. In this paper, we look at different issues in capturing such requirements and propose a systematic layered modeling approach. We split the workflow specification requirements into five basic dimensions: structure, data, execution, temporal, and transactional. The concepts introduced in this paper have been applied as a foundation to the development of a workflow modeling and verification tool, FlowMake.
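A layered specification split along those five dimensions might be modeled as below. This is a hypothetical sketch: the field names and the single structural check are illustrative and are not drawn from FlowMake.

```python
# Hypothetical data model for a workflow specification organized by the five
# dimensions the abstract names: structure, data, execution, temporal,
# transactional. Names are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class TaskSpec:
    name: str
    reads: list = field(default_factory=list)    # data dimension
    writes: list = field(default_factory=list)   # data dimension
    deadline_s: float = None                     # temporal dimension
    compensable: bool = False                    # transactional dimension

@dataclass
class WorkflowSpec:
    tasks: dict         # name -> TaskSpec
    edges: list         # structure dimension: (predecessor, successor) pairs
    assignments: dict   # execution dimension: task -> role

    def check_structure(self):
        """A minimal verification pass: every edge references known tasks."""
        return all(a in self.tasks and b in self.tasks for a, b in self.edges)

wf = WorkflowSpec(
    tasks={"approve": TaskSpec("approve", reads=["order"], deadline_s=3600),
           "ship": TaskSpec("ship", writes=["invoice"], compensable=True)},
    edges=[("approve", "ship")],
    assignments={"approve": "manager", "ship": "warehouse"},
)
print(wf.check_structure())  # → True
```

Separating the dimensions this way lets each verification pass (structural soundness, data flow, deadlines, compensation) operate on its own slice of the specification.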