Structured learning from heterogeneous behavior for social identity linkage
Singapore National Research Foundation under International Research Centre @ Singapore Funding Initiativ
SLIM: Scalable Linkage of Mobility Data
We present a scalable solution to link entities across mobility datasets using their spatio-temporal information. This is a fundamental problem in many applications, such as linking user identities for security, understanding the privacy limitations of location-based services, or producing a unified dataset from multiple sources for urban planning. Such integrated datasets are also essential for service providers to optimise their services and improve business intelligence. In this paper, we first propose a mobility-based representation and similarity computation for entities. An efficient matching process is then developed to identify the final linked pairs, with an automated mechanism to decide when to stop the linkage. We scale the process with a locality-sensitive hashing (LSH) based approach that significantly reduces the number of candidate pairs for matching. To realize the effectiveness and efficiency of our techniques in practice, we introduce an algorithm called SLIM. In the experimental evaluation, SLIM outperforms the two existing state-of-the-art approaches in terms of precision and recall. Moreover, the LSH-based approach brings two to four orders of magnitude speedup.
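The candidate-reduction idea can be illustrated with a minimal sketch. It is not the SLIM algorithm itself: the abstract does not specify the representation or the hash family, so the grid/time-window tokens, the MinHash signatures, the banding scheme, and every function and parameter name below are assumptions made for illustration only.

```python
import hashlib
import math
from collections import defaultdict


def tokens(records, cell=0.01, window=3600):
    """Turn (lat, lon, unix_time) records into coarse space-time tokens (assumed representation)."""
    return {
        (math.floor(lat / cell), math.floor(lon / cell), math.floor(t / window))
        for lat, lon, t in records
    }


def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash(token_set, num_hashes=32):
    """MinHash signature: the minimum hash value of the set under each seed."""
    return [min(_hash(seed, tok) for tok in token_set) for seed in range(num_hashes)]


def candidate_pairs(left, right, num_hashes=32, bands=8):
    """Return (left_id, right_id) pairs that share at least one LSH band bucket."""
    rows = num_hashes // bands
    buckets = defaultdict(lambda: (set(), set()))
    for side, dataset in enumerate((left, right)):
        for entity_id, records in dataset.items():
            toks = tokens(records)
            if not toks:  # entities without records cannot be hashed
                continue
            sig = minhash(toks, num_hashes)
            for b in range(bands):
                key = (b, tuple(sig[b * rows:(b + 1) * rows]))
                buckets[key][side].add(entity_id)
    pairs = set()
    for left_ids, right_ids in buckets.values():
        pairs.update((l, r) for l in left_ids for r in right_ids)
    return pairs


# Hypothetical toy data: "a" and "x" visit the same places at the same times.
left = {"a": [(1.30, 103.80, 0), (1.35, 103.90, 7200)], "b": [(40.70, -74.00, 0)]}
right = {"x": [(1.30, 103.80, 0), (1.35, 103.90, 7200)], "y": [(51.50, -0.10, 0)]}
pairs = candidate_pairs(left, right)
```

Only the pairs returned by `candidate_pairs` would then go through the full similarity computation, which is where a speedup over comparing every left entity with every right entity would come from.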
Joint Covariance Estimation with Mutual Linear Structure
We consider the problem of joint estimation of structured covariance
matrices. Assuming the structure is unknown, estimation is achieved using
heterogeneous training sets. Namely, given groups of measurements coming from
centered populations with different covariances, our aim is to determine the
mutual structure of these covariance matrices and estimate them. Supposing that
the covariances span a low dimensional affine subspace in the space of
symmetric matrices, we develop a new efficient algorithm discovering the
structure and using it to improve the estimation. Our technique is based on the
application of principal component analysis in the matrix space. We also derive
an upper performance bound of the proposed algorithm in the Gaussian scenario
and compare it with the Cramér-Rao lower bound. Numerical simulations are
presented to illustrate the performance benefits of the proposed method.
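One way to write down the setting described above is sketched here. The notation (K groups, n_k samples of dimension p, subspace dimension r, basis matrices A_j, and the projection step) is assumed for illustration and is not taken from the paper, whose exact estimator may differ.

```latex
% Assumed notation: K groups; group k has n_k centered samples x_{k,i} \in \mathbb{R}^p.
% Affine-subspace assumption on the true covariances:
\Sigma_k = A_0 + \sum_{j=1}^{r} \alpha_{k,j} A_j,
\qquad k = 1, \dots, K, \qquad r \ll \tfrac{p(p+1)}{2}.

% Per-group sample covariance (no mean subtraction, since the populations are centered):
S_k = \frac{1}{n_k} \sum_{i=1}^{n_k} x_{k,i} x_{k,i}^{\top}.

% PCA in the matrix space: vectorize, center, and keep the top r principal directions U_r.
s_k = \operatorname{vec}(S_k), \qquad
\bar{s} = \frac{1}{K} \sum_{k=1}^{K} s_k, \qquad
\hat{\Sigma}_k = \operatorname{unvec}\!\left( \bar{s} + U_r U_r^{\top} (s_k - \bar{s}) \right).
```

Under this reading, projecting each sample covariance onto the learned r-dimensional affine subspace is what lets the groups share information with one another.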
Identity Resolution across Different Social Networks using Similarity Analysis
Social networking sites have become very popular and are now used by most people, because they play different roles in different fields and meet the changing needs of their users. The most common reasons people join these sites are to connect with others and to share information. An individual may hold accounts on more than one social networking site, so identifying the same individual across different sites is a nontrivial task. To accomplish this task, the proposed system applies a similarity analysis method to the available information.
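A minimal sketch of what such a similarity analysis could look like follows. The abstract does not say which attributes or which similarity measure the system uses, so the field names (`username`, `name`, `location`), the use of `difflib.SequenceMatcher`, the plain averaging, and the 0.8 threshold are all assumptions.

```python
from difflib import SequenceMatcher


def field_similarity(a, b):
    """Similarity in [0, 1] between two text fields, or None if either is missing."""
    if not a or not b:
        return None
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def profile_similarity(p, q, fields=("username", "name", "location")):
    """Average the similarity over the fields that both profiles provide."""
    scores = []
    for field in fields:
        score = field_similarity(p.get(field), q.get(field))
        if score is not None:
            scores.append(score)
    return sum(scores) / len(scores) if scores else 0.0


def best_match(profile, candidates, threshold=0.8):
    """Return the most similar candidate profile, or None if none reaches the threshold."""
    best, best_score = None, -1.0
    for candidate in candidates:
        score = profile_similarity(profile, candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


# Hypothetical profiles from two different sites.
site_a = {"username": "jdoe", "name": "Jane Doe", "location": "Pune"}
site_b = [
    {"username": "xyz123", "name": "Bob Stone", "location": "Oslo"},
    {"username": "JDoe", "name": "jane doe", "location": "Pune "},
]
match = best_match(site_a, site_b)
```

Averaging only over fields present in both profiles keeps a missing attribute from being counted as a mismatch; a real system would likely weight attributes differently.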