An Agent Framework for Dynamic Health Data Aggregation for Research Purposes
This paper presents a model of a multi-agent system (MAS) framework for dynamic aggregation of population health data for research purposes. The contribution of the paper is twofold. First, it describes a MAS architecture that allows anonymized databases to be built on the fly from distributed data sources. Second, it shows how to improve the utility of the data as the database grows.
A Multiagent System for Dynamic Data Aggregation in Medical Research
The collection of medical data for research purposes is a challenging and long-lasting process. In an effort to accelerate and facilitate this process, we propose a new framework for dynamic aggregation of medical data from distributed sources. We use agent-based coordination between medical and research institutions. Our system employs principles of peer-to-peer network organization and coordination models to search over already constructed distributed databases and to identify potential contributors when a new database has to be built. Our framework takes into account both the requirements of a research study and current data availability. This leads to a better definition of database characteristics such as schema, content, and privacy parameters. We show that this approach enables a more efficient way to collect data for medical research.
A hybrid method for community detection based on user interactions, topology and frequent pattern mining
In recent years, community detection in social networks has become one of the most important research areas. One approach to community detection is to use interactions between users. There are different types of interactions in social networks, which, if used together with network topology, improve the precision of community identification. In this paper, a new community detection method based on the combination of user interactions and network topology is proposed. In the community formation stage, the effective nodes are identified based on eigenvector centrality, and the primary communities around these nodes are formed based on frequent pattern mining. In the community expansion stage, small communities are expanded using modularity and the degree of interaction among users. To calculate the degree of interaction between users, a new measure based on the local clustering coefficient and interactions between common neighbors is proposed, which improves the accuracy of the degree of user interaction. Analysis of the Higgs Twitter and Flickr datasets using the internal density metric, NMI, and Omega demonstrates that the proposed method outperforms five other community detection methods.
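The two graph measures named in this abstract, eigenvector centrality and the local clustering coefficient, can be sketched as follows. This is a minimal illustration on a toy graph, not the authors' method: the graph, the function names, and the use of power iteration on A + I (chosen here so the iteration also converges on bipartite graphs) are all assumptions made for the example.

```python
from itertools import combinations

def eigenvector_centrality(adj, iterations=200, tol=1e-12):
    """Power iteration on an undirected graph given as {node: set(neighbors)}.

    Iterates with A + I instead of A; both share the same eigenvectors,
    and the shift avoids oscillation on bipartite graphs.
    """
    scores = {v: 1.0 for v in adj}
    for _ in range(iterations):
        new = {v: scores[v] + sum(scores[u] for u in adj[v]) for v in adj}
        norm = max(new.values())
        new = {v: s / norm for v, s in new.items()}
        if max(abs(new[v] - scores[v]) for v in adj) < tol:
            return new
        scores = new
    return scores

def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))

# Toy graph (hypothetical): a triangle a-b-c with a pendant node d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

centrality = eigenvector_centrality(adj)
most_central = max(centrality, key=centrality.get)  # "c"
print(most_central, local_clustering(adj, "a"), local_clustering(adj, "c"))
```

In this toy graph, node c is the most central, node a has a clustering coefficient of 1.0 (its two neighbors are linked), and node c has 1/3 (only one of its three neighbor pairs is linked).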
Preventing Inferences through Data Dependencies on Sensitive Data
Simply restricting the computation to the non-sensitive part of the data may lead to inferences on sensitive data through data dependencies. Inference control from data dependencies has been studied in prior work. However, existing solutions either detect and deny queries that may lead to leakage, resulting in poor utility, or only protect against exact reconstruction of the sensitive data, resulting in poor security. In this paper, we present a novel security model called full deniability. Under this stronger security model, any information inferred about sensitive data from non-sensitive data is considered a leakage. We describe algorithms for efficiently implementing full deniability on a given database instance with a set of data dependencies and sensitive cells. Using experiments on two different datasets, we demonstrate that our approach protects against realistic adversaries while hiding only a minimal number of additional non-sensitive cells, and that it scales well with database size and the amount of sensitive data.
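The leakage problem this abstract describes can be illustrated with a small sketch. It assumes a single functional dependency (zip determines city) over a made-up table; the helper name `infer_via_fd` and all values are hypothetical, and the code only demonstrates the inference an adversary could make, not the paper's full deniability algorithms.

```python
HIDDEN = None  # marker for a cell withheld from the adversary

def infer_via_fd(rows, lhs, rhs):
    """Return {row_index: value} for hidden rhs cells recoverable via lhs -> rhs."""
    known = {}
    for row in rows:
        if row[lhs] is not HIDDEN and row[rhs] is not HIDDEN:
            known[row[lhs]] = row[rhs]
    return {
        i: known[row[lhs]]
        for i, row in enumerate(rows)
        if row[rhs] is HIDDEN and row[lhs] in known
    }

# Toy table: the city in row 1 is sensitive and hidden, but its zip is visible.
rows = [
    {"zip": "Z1", "city": "CityA"},
    {"zip": "Z1", "city": HIDDEN},
    {"zip": "Z2", "city": HIDDEN},
]
leaked = infer_via_fd(rows, "zip", "city")  # {1: "CityA"}

# Hiding one additional non-sensitive cell (row 1's zip) blocks the inference.
protected = [dict(r) for r in rows]
protected[1]["zip"] = HIDDEN
still_leaked = infer_via_fd(protected, "zip", "city")  # {}
print(leaked, still_leaked)
```

The second call shows the trade-off mentioned in the abstract: preventing the inference requires hiding extra non-sensitive cells, so a good solution hides as few of them as possible.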
Recursive Parameter Estimation of Non-Gaussian Hidden Markov Models for Occupancy Estimation in Smart Buildings
A significant volume of data is being produced in this era. Accurately modeling these data for further analysis and extraction of meaningful patterns is therefore becoming a major concern in a wide variety of real-life applications. Smart buildings are one of the areas urgently demanding such data analysis. Managing the intelligent systems in smart homes will reduce energy consumption as well as enhance users’ comfort. In this context, the Hidden Markov Model (HMM), as a learnable finite stochastic model, has consistently been a powerful tool for data modeling. Thus, we have been motivated to propose occupancy estimation frameworks for smart buildings through HMMs, due to the importance of indoor occupancy estimation in automating environmental settings. One of the key factors in modeling data with an HMM is the choice of the emission probability. In this thesis, we have proposed novel HMM extensions that use the Generalized Dirichlet (GD), Beta-Liouville (BL), Inverted Dirichlet (ID), Generalized Inverted Dirichlet (GID), and Inverted Beta-Liouville (IBL) distributions as emission probability distributions. These distributions have been investigated due to their capabilities in modeling a variety of non-Gaussian data, overcoming the limited covariance structures of other distributions such as the Dirichlet distribution. The next step after determining the emission probability is estimating optimized parameters of the distribution. Therefore, we have developed a recursive parameter estimation method based on the maximum likelihood estimation (MLE) approach. Because the proposed recursive algorithm has linear complexity, the developed models can successfully model real-time data, which allows them to be used in an extensive range of practical applications.
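The idea of a recursive maximum likelihood update with constant cost per observation can be sketched on a much simpler model than the ones in the thesis. The example below swaps in a plain exponential distribution (not an HMM and not any of the GD, BL, ID, GID, or IBL emissions); the class name and the sample values are hypothetical.

```python
class RecursiveExponentialMLE:
    """Online MLE of an exponential rate: rate = 1 / (running mean of samples)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        # Recursive running-mean update: O(1) work per new sample,
        # so processing n samples costs O(n) overall.
        self.n += 1
        self.mean += (x - self.mean) / self.n
        return self.rate

    @property
    def rate(self):
        return 1.0 / self.mean if self.mean > 0 else float("inf")

# Toy stream (made-up values); the batch MLE is n / sum(x) = 3 / 12 = 0.25.
estimator = RecursiveExponentialMLE()
for sample in [2.0, 4.0, 6.0]:
    estimator.update(sample)
print(estimator.rate)  # 0.25
```

Each new sample refines the estimate without revisiting earlier data, which is the property that makes recursive estimators suitable for streaming, real-time settings.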