3,504 research outputs found

    06431 Abstracts Collection -- Scalable Data Management in Evolving Networks

    Get PDF
    From 22.10.06 to 27.10.06, the Dagstuhl Seminar 06431 ``Scalable Data Management in Evolving Networks\u27\u27 was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available

    Mining Frequent Itemsets over Uncertain Databases

    Full text link
    In recent years, due to the wide applications of uncertain data, mining frequent itemsets over uncertain databases has attracted much attention. In uncertain databases, the support of an itemset is a random variable instead of a fixed occurrence counting of this itemset. Thus, unlike the corresponding problem in deterministic databases where the frequent itemset has a unique definition, the frequent itemset under uncertain environments has two different definitions so far. The first definition, referred as the expected support-based frequent itemset, employs the expectation of the support of an itemset to measure whether this itemset is frequent. The second definition, referred as the probabilistic frequent itemset, uses the probability of the support of an itemset to measure its frequency. Thus, existing work on mining frequent itemsets over uncertain databases is divided into two different groups and no study is conducted to comprehensively compare the two different definitions. In addition, since no uniform experimental platform exists, current solutions for the same definition even generate inconsistent results. In this paper, we firstly aim to clarify the relationship between the two different definitions. Through extensive experiments, we verify that the two definitions have a tight connection and can be unified together when the size of data is large enough. Secondly, we provide baseline implementations of eight existing representative algorithms and test their performances with uniform measures fairly. Finally, according to the fair tests over many different benchmark data sets, we clarify several existing inconsistent conclusions and discuss some new findings.Comment: VLDB201

    Capturing Data Uncertainty in High-Volume Stream Processing

    Get PDF
    We present the design and development of a data stream system that captures data uncertainty from data collection to query processing to final result generation. Our system focuses on data that is naturally modeled as continuous random variables. For such data, our system employs an approach grounded in probability and statistical theory to capture data uncertainty and integrates this approach into high-volume stream processing. The first component of our system captures uncertainty of raw data streams from sensing devices. Since such raw streams can be highly noisy and may not carry sufficient information for query processing, our system employs probabilistic models of the data generation process and stream-speed inference to transform raw data into a desired format with an uncertainty metric. The second component captures uncertainty as data propagates through query operators. To efficiently quantify result uncertainty of a query operator, we explore a variety of techniques based on probability and statistical theory to compute the result distribution at stream speed. We are currently working with a group of scientists to evaluate our system using traces collected from the domains of (and eventually in the real systems for) hazardous weather monitoring and object tracking and monitoring.Comment: CIDR 200

    Behavioural pattern identification and prediction in intelligent environments

    Get PDF
    In this paper, the application of soft computing techniques in prediction of an occupant's behaviour in an inhabited intelligent environment is addressed. In this research, daily activities of elderly people who live in their own homes suffering from dementia are studied. Occupancy sensors are used to extract the movement patterns of the occupant. The occupancy data is then converted into temporal sequences of activities which are eventually used to predict the occupant behaviour. To build the prediction model, different dynamic recurrent neural networks are investigated. Recurrent neural networks have shown a great ability in finding the temporal relationships of input patterns. The experimental results show that non-linear autoregressive network with exogenous inputs model correctly extracts the long term prediction patterns of the occupant and outperformed the Elman network. The results presented here are validated using data generated from a simulator and real environments
    corecore