19 research outputs found

    Querying and mining heterogeneous spatial, social, and temporal data

    An Agent Framework for Dynamic Health Data Aggregation for Research Purposes

    This paper presents a model of a multi-agent system (MAS) framework for dynamic aggregation of population health data for research purposes. The contribution of the paper is twofold. First, it describes a MAS architecture that allows one to build anonymized databases on the fly from distributed data sources. Second, it shows how to improve the utility of the data as the database grows.

    Similarity-aware query refinement for data exploration

    Database Systems for Advanced Applications: 19th International Conference, DASFAA 2014, Bali, Indonesia, April 21-24, 2014, Proceedings, Part I

    A Multiagent System for Dynamic Data Aggregation in Medical Research

    The collection of medical data for research purposes is a challenging and lengthy process. To accelerate and facilitate this process, we propose a new framework for dynamic aggregation of medical data from distributed sources. We use agent-based coordination between medical and research institutions. Our system employs principles of peer-to-peer network organization and coordination models to search over already constructed distributed databases and to identify potential contributors when a new database has to be built. Our framework takes into account both the requirements of a research study and current data availability. This leads to a better definition of database characteristics such as schema, content, and privacy parameters. We show that this approach enables a more efficient way to collect data for medical research.

    A hybrid method for community detection based on user interactions, topology and frequent pattern mining

    In recent years, community detection in social networks has become one of the most important research areas. One approach to community detection is to use interactions between users. Social networks contain different types of interactions which, if used together with network topology, improve the precision of community identification. In this paper, a new community detection method based on the combination of user interactions and network topology is proposed. In the community formation stage, the effective nodes are identified based on eigenvector centrality, and the primary communities around these nodes are formed using frequent pattern mining. In the community expansion stage, small communities are expanded using modularity and the degree of interaction among users. To calculate the degree of interaction between users, a new measure based on the local clustering coefficient and interactions between common neighbors is proposed, which improves the accuracy of the computed degree of interaction. Analysis of the Higgs Twitter and Flickr datasets using the internal density, NMI, and Omega metrics demonstrates that the proposed method outperforms five other community detection methods.
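    The two graph measures named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the function names, and the choice of power iteration are assumptions made for illustration, and the frequent pattern mining and modularity-based expansion steps are not shown.

```python
import math


def eigenvector_centrality(adj, iterations=100, tol=1e-9):
    """Approximate eigenvector centrality by power iteration.

    adj maps each node to the set of its neighbors (undirected graph).
    """
    nodes = sorted(adj)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Iterate with A + I (each node keeps its own score): same leading
        # eigenvector as A, but the iteration also settles on bipartite graphs.
        new = {n: score[n] + sum(score[m] for m in adj[n]) for n in nodes}
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score


def local_clustering(adj, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy graph: a triangle a-b-c with a tail a-d-e.
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}

centrality = eigenvector_centrality(toy_graph)
seed_node = max(centrality, key=centrality.get)  # candidate "effective node"
```

    In this toy graph the node with the highest centrality is the one that joins the triangle to the tail, which is the kind of node a seed-selection step would pick first.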

    Preventing Inferences through Data Dependencies on Sensitive Data

    Simply restricting computation to the non-sensitive part of the data may still allow inferences about sensitive data through data dependencies. Inference control for data dependencies has been studied in prior work. However, existing solutions either detect and deny queries that may lead to leakage, resulting in poor utility, or only protect against exact reconstruction of the sensitive data, resulting in poor security. In this paper, we present a novel security model called full deniability. Under this stronger security model, any information inferred about sensitive data from non-sensitive data is considered a leakage. We describe algorithms for efficiently implementing full deniability on a given database instance with a set of data dependencies and sensitive cells. Using experiments on two different datasets, we demonstrate that our approach protects against realistic adversaries while hiding only a minimal number of additional non-sensitive cells, and that it scales well with database size and the amount of sensitive data.
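    The leakage problem this abstract starts from can be shown with a small sketch. It only illustrates inference through a single functional dependency and is not the paper's full deniability algorithm; the table, the column names, and the helper `infer_hidden` are hypothetical.

```python
def infer_hidden(rows, lhs, rhs, hidden):
    """Return {row_index: value} for hidden rhs cells that an adversary can
    recover using the functional dependency lhs -> rhs.

    rows   : list of dicts, one per tuple
    hidden : set of (row_index, column) pairs masked from the adversary
    """
    # Collect lhs -> rhs pairs from rows where both cells are visible.
    known = {}
    for i, row in enumerate(rows):
        if (i, lhs) not in hidden and (i, rhs) not in hidden:
            known[row[lhs]] = row[rhs]

    # A hidden rhs cell leaks if its visible lhs value appears in `known`.
    leaked = {}
    for i, row in enumerate(rows):
        if (i, rhs) in hidden and (i, lhs) not in hidden and row[lhs] in known:
            leaked[i] = known[row[lhs]]
    return leaked


# Hypothetical table with the assumed dependency role -> band.
staff = [
    {"role": "nurse", "band": "B"},
    {"role": "nurse", "band": "B"},
    {"role": "surgeon", "band": "D"},
]

# Hiding only the sensitive cell still leaks it through row 1.
leak_naive = infer_hidden(staff, "role", "band", {(0, "band")})

# Hiding one additional non-sensitive cell removes that inference path.
leak_extended = infer_hidden(staff, "role", "band", {(0, "band"), (1, "band")})
```

    In the first call the masked cell is recovered from the other row with the same role; in the second, masking one more non-sensitive cell closes that path, which is the trade-off between security and the number of extra hidden cells that the abstract describes.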

    Recursive Parameter Estimation of Non-Gaussian Hidden Markov Models for Occupancy Estimation in Smart Buildings

    A significant volume of data is being produced in the current era. Accurately modeling these data for further analysis and extraction of meaningful patterns is therefore becoming a major concern in a wide variety of real-life applications. Smart buildings are one of the areas that urgently demand such data analysis. Managing the intelligent systems in smart homes will reduce energy consumption as well as enhance users' comfort. In this context, the Hidden Markov Model (HMM), a learnable finite stochastic model, has consistently been a powerful tool for data modeling. Given the importance of indoor occupancy estimation in automating environmental settings, we were motivated to propose HMM-based occupancy estimation frameworks for smart buildings. One of the key factors in modeling data with an HMM is the choice of the emission probability. In this thesis, we propose novel HMM extensions that use the Generalized Dirichlet (GD), Beta-Liouville (BL), Inverted Dirichlet (ID), Generalized Inverted Dirichlet (GID), and Inverted Beta-Liouville (IBL) distributions as emission probability distributions. These distributions were investigated because of their capability to model a variety of non-Gaussian data, overcoming the limited covariance structures of other distributions such as the Dirichlet distribution. The next step after determining the emission probability is estimating optimized parameters of the distribution. We therefore developed a recursive parameter estimation approach based on maximum likelihood estimation (MLE). Because the proposed recursive algorithm has linear complexity, the developed models can successfully model real-time data, which allows them to be used in an extensive range of practical applications.