
    WikiSensing: A collaborative sensor management system with trust assessment for big data

    Big Data for sensor networks and collaborative systems has become ever more important in the digital economy and is a focal point of technological interest, while posing many noteworthy challenges. This research addresses some of the challenges in the areas of online collaboration and Big Data for sensor networks. It demonstrates WikiSensing (www.wikisensing.org), a high-performance, heterogeneous, collaborative data cloud for managing and analysing real-time sensor data. The system is based on a Big Data architecture with comprehensive functionality for smart city sensor data integration and analysis. The system is fully functional and served as the main data management platform for the 2013 UPLondon Hackathon. It is unique in that it introduces a novel methodology that incorporates online collaboration with sensor data. While other platforms are available for sensor data management, WikiSensing is one of the first that enables online collaboration by providing services to store and query dynamic sensor information without any restriction on the type or format of sensor data. An emerging challenge of collaborative sensor systems is modelling and assessing the trustworthiness of sensors and their measurements. This is directly relevant to WikiSensing as an open, collaborative sensor data management system. Thus, if the trustworthiness of the sensor data can be accurately assessed, WikiSensing will be more than just a collaborative sensor data management system: it will also be a platform that informs users about the validity of its data. Hence this research presents a new generic framework for capturing and analysing sensor trustworthiness, considering the different forms of evidence available to the user. It uses an extensible set of metrics to represent such evidence and applies Bayesian analysis to develop a trust classification model. Several publications are based on this work, with others at the final stage of submission. Further improvements are also planned to make the platform a cloud service accessible to any online user, building up a community of collaborators for smart city research.
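    The abstract does not spell out the trust model, so the following is only a minimal sketch of evidence-based Bayesian trust scoring in the spirit described: a Beta-Bernoulli posterior over a sensor's reliability, driven by hypothetical agreement/disagreement counts against neighbouring sensors. The evidence metrics, prior, and threshold are illustrative assumptions, not the paper's actual framework.

```python
# Hypothetical sketch of Bayesian trust scoring for a sensor; the Beta-Bernoulli
# model and the agreement/disagreement metrics are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class SensorEvidence:
    agreements: int      # readings consistent with co-located sensors
    disagreements: int   # readings flagged as outliers against neighbours

def trust_posterior(evidence: SensorEvidence, alpha0: float = 1.0, beta0: float = 1.0):
    """Return the Beta posterior parameters and the mean trust score."""
    alpha = alpha0 + evidence.agreements
    beta = beta0 + evidence.disagreements
    return alpha, beta, alpha / (alpha + beta)

def classify_trust(mean_trust: float, threshold: float = 0.7) -> str:
    """Map the posterior mean onto a coarse trust label."""
    return "trustworthy" if mean_trust >= threshold else "suspect"

if __name__ == "__main__":
    _, _, score = trust_posterior(SensorEvidence(agreements=42, disagreements=5))
    print(round(score, 3), classify_trust(score))
```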

    Determination of Rule Patterns in Complex Event Processing Using Machine Learning Techniques

    Complex Event Processing (CEP) is a novel and promising methodology that enables the real-time analysis of stream event data. The main purpose of CEP is the detection of complex event patterns from atomic, semantically low-level events such as sensor, log, or RFID data. Determining the rule patterns that match these simple events based on temporal, semantic, or spatial correlations is the central task of CEP systems. In current CEP system designs, experts provide the event rule patterns. As Big Data systems and Internet of Things (IoT) technology reach maturity, they require advanced machine learning approaches for automation in the CEP domain. The goal of this research is to propose a machine learning model that replaces the manual identification of rule patterns. After a pre-processing stage (dealing with missing values, data outliers, etc.), various rule-based machine learning approaches were applied to detect complex events. Promising results with high precision were obtained. A comparative analysis of classifier performance is also discussed.
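    As a rough illustration of replacing manually written rule patterns with a learned, rule-based model, the sketch below imputes missing values and fits a shallow decision tree whose branches can be read as candidate CEP rules. The feature names, toy data, and choice of learner are assumptions; the paper compares several rule-based classifiers.

```python
# Minimal sketch of deriving CEP rule patterns from labelled atomic events with
# a rule-based learner; the features and pipeline are illustrative assumptions.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy atomic-event features: [temperature, humidity, rfid_signal_strength]
X = np.array([[21.0, 40.0, -60.0],
              [35.5, np.nan, -55.0],
              [36.2, 20.0, -80.0],
              [22.3, 45.0, np.nan]])
y = np.array([0, 1, 1, 0])  # 1 = complex event (e.g. "overheating") detected

X_clean = SimpleImputer(strategy="mean").fit_transform(X)  # pre-processing step
tree = DecisionTreeClassifier(max_depth=2).fit(X_clean, y)

# The learned tree can be read off as candidate CEP rule patterns.
print(export_text(tree, feature_names=["temperature", "humidity", "rfid_rssi"]))
```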

    Bayesian Augmentation of Deep Learning to Improve Video Classification

    Traditional automated video classification methods lack measures of uncertainty, meaning the network is unable to identify the cases in which its predictions are made with significant uncertainty. This leads to misclassification, as the traditional network classifies each observation with the same amount of certainty, regardless of the observation. Bayesian neural networks remedy this issue by leveraging Bayesian inference to construct uncertainty measures for each prediction. Because exact Bayesian inference is typically intractable due to the large number of parameters in a neural network, it is approximated by applying dropout in a convolutional neural network. This research compared a traditional video classification neural network to its Bayesian equivalent in terms of performance and capabilities. The Bayesian network achieves higher accuracy than a comparable non-Bayesian video network and further provides uncertainty measures for each classification.
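    A minimal sketch of the dropout-based approximation described above: dropout is kept active at test time and several stochastic forward passes are averaged, yielding a mean prediction plus a predictive-entropy uncertainty measure. The tiny network, input shape, and number of samples are illustrative assumptions, not the paper's video architecture.

```python
# Sketch of Monte Carlo dropout at inference time as an approximation to
# Bayesian inference; the toy network and sample count are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(16 * 16, 64), nn.ReLU(),
    nn.Dropout(p=0.5),          # kept active at test time
    nn.Linear(64, 5),           # 5 video classes, illustrative
)

def mc_dropout_predict(model, x, n_samples: int = 30):
    """Average softmax outputs over stochastic forward passes."""
    model.train()  # keeps dropout sampling active on every pass
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(x), dim=-1) for _ in range(n_samples)])
    mean = probs.mean(dim=0)
    entropy = -(mean * mean.clamp_min(1e-12).log()).sum(dim=-1)  # predictive uncertainty
    return mean, entropy

frame = torch.randn(1, 1, 16, 16)   # stand-in for a video frame / clip feature
mean_probs, uncertainty = mc_dropout_predict(model, frame)
print(mean_probs.argmax(dim=-1).item(), uncertainty.item())
```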

    Multimodal Affect Recognition: Current Approaches and Challenges

    Many factors render multimodal affect recognition approaches appealing. First, humans employ a multimodal approach in emotion recognition. It is only fitting that machines, which attempt to reproduce elements of human emotional intelligence, employ the same approach. Second, the combination of multiple affective signals not only provides a richer collection of data but also helps alleviate the effects of uncertainty in the raw signals. Lastly, multimodal approaches potentially afford the flexibility to classify emotions even when one or more source signals cannot be retrieved. However, the multimodal approach presents challenges pertaining to the fusion of individual signals, the dimensionality of the feature space, and the incompatibility of collected signals in terms of time resolution and format. In this chapter, we explore the aforementioned challenges while presenting the latest scholarship on the topic. Hence, we first discuss the various modalities used in affect classification. Second, we explore the fusion of modalities. Third, we present publicly accessible multimodal datasets designed to expedite work on the topic by eliminating the laborious task of dataset collection. Fourth, we analyze representative works on the topic. Finally, we summarize the current challenges in the field and provide ideas for future research directions.
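    As an illustration of one fusion strategy discussed in such work, the sketch below performs decision-level (late) fusion by taking a weighted average of per-modality class probabilities and renormalising the weights when a modality's signal cannot be retrieved. The modality names and weights are assumptions, not taken from any particular system.

```python
# Hedged sketch of decision-level (late) fusion across affect modalities;
# the modality weights and missing-signal handling are illustrative.
import numpy as np

def late_fusion(modality_probs: dict, weights: dict) -> np.ndarray:
    """Weighted average of per-modality class probabilities.

    Modalities whose signal could not be retrieved carry a value of None;
    the weights of the remaining modalities are renormalised.
    """
    available = {m: p for m, p in modality_probs.items() if p is not None}
    total_w = sum(weights[m] for m in available)
    return sum(weights[m] / total_w * np.asarray(p) for m, p in available.items())

probs = {
    "face": [0.7, 0.2, 0.1],        # e.g. happy / neutral / sad
    "speech": [0.5, 0.3, 0.2],
    "physiology": None,             # sensor dropped out for this sample
}
weights = {"face": 0.5, "speech": 0.3, "physiology": 0.2}
print(late_fusion(probs, weights))  # fused distribution over emotion classes
```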

    Recognising Situations Using Extended Dempster-Shafer Theory

    Weiser’s [111] vision of pervasive computing describes a world where technology seamlessly integrates into the environment, automatically responding to people’s needs. Underpinning this vision is the ability of systems to automatically track the situation of a person. The task of situation recognition is critical and complex: noisy and unreliable sensor data, dynamic situations, unpredictable human behaviour and changes in the environment all contribute to the complexity. No single recognition technique is suitable in all environments. Factors such as the availability of training data, the ability to deal with uncertain information and transparency to the user will determine which technique to use in any particular environment. In this thesis, we propose the use of Dempster-Shafer theory as a theoretically sound basis for situation recognition - an approach that can reason with uncertainty, but which does not rely on training data. We use existing operations from Dempster-Shafer theory and create new operations to establish an evidence decision network. The network is used to generate and assess situation beliefs based on processed sensor data for an environment. We also define two specific extensions to Dempster-Shafer theory to enhance the knowledge that can be used for reasoning: 1) temporal knowledge about situation time patterns, and 2) the quality of evidence sources (sensors), both incorporated into the reasoning process. To validate the feasibility of our approach, this thesis creates evidence decision networks for two real-world data sets: a smart home data set and an office-based data set. We analyse situation recognition accuracy for each of the data sets, using the evidence decision networks with temporal/quality extensions. We also compare the evidence decision networks against two learning techniques: Naïve Bayes and J48 Decision Tree.
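    The core operation that such evidence networks build on is Dempster's rule of combination. The sketch below combines two mass functions over a toy frame of discernment of smart-home situations; the sensors and numbers are invented, and the thesis's temporal and quality extensions are not shown.

```python
# Minimal sketch of Dempster's rule of combination; the frame of discernment
# and the two mass functions are toy examples, not the thesis data sets.
from itertools import product

def combine(m1: dict, m2: dict) -> dict:
    """Combine two mass functions whose focal elements are frozensets."""
    combined, conflict = {}, 0.0
    for (a, w1), (b, w2) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + w1 * w2
        else:
            conflict += w1 * w2
    if conflict >= 1.0:
        raise ValueError("Total conflict: sources cannot be combined")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

# Frame of discernment: possible situations in a smart home.
sleeping, cooking = frozenset({"sleeping"}), frozenset({"cooking"})
either = sleeping | cooking

m_motion = {cooking: 0.6, either: 0.4}   # evidence from a kitchen motion sensor
m_time = {sleeping: 0.3, either: 0.7}    # evidence from time of day
print(combine(m_motion, m_time))
```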

    Learning ontology aware classifiers

    Many applications of data-driven knowledge discovery processes call for the exploration of data from multiple points of view that reflect different ontological commitments on the part of the learner. Of particular interest in this context are algorithms for learning classifiers from ontologies and data. Against this background, my dissertation research is aimed at the design and analysis of algorithms for constructing robust, compact, accurate and ontology-aware classifiers. We have precisely formulated the problem of learning pattern classifiers from attribute value taxonomies (AVT) and partially specified data. We have designed and implemented efficient and theoretically well-founded AVT-based classifier learners. Based on a general strategy of hypothesis refinement to search a generalized hypothesis space, our AVT-guided learning algorithm adopts a general learning framework that takes into account the tradeoff between the complexity and the accuracy of the predictive models, which enables us to learn a classifier that is both compact and accurate. We have also extended our approach to learning compact and accurate classifiers from semantically heterogeneous data sources. We have presented a principled way to reduce the problem of learning from semantically heterogeneous data to the problem of learning from distributed, partially specified data by reconciling semantic heterogeneity using AVT mappings, and we have described a solution based on sufficient statistics.
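    As a rough illustration of the AVT idea, the sketch below abstracts attribute values up a small, invented taxonomy to a chosen cut before learning. This is how partially specified values (internal taxonomy nodes) remain usable, and how the level of the cut controls the tradeoff between attribute detail and model compactness. The taxonomy and cut are assumptions, not the dissertation's data.

```python
# Illustrative sketch of abstracting attribute values through an attribute value
# taxonomy (AVT) prior to learning; the toy taxonomy and fixed cut are assumed.
PARENT = {  # child -> parent links of a small AVT over an "occupation" attribute
    "professor": "education", "teacher": "education",
    "nurse": "healthcare", "surgeon": "healthcare",
    "education": "employed", "healthcare": "employed",
}

def abstract_value(value: str, cut: set) -> str:
    """Climb the taxonomy until the value falls inside the chosen cut."""
    while value not in cut and value in PARENT:
        value = PARENT[value]
    return value

# A coarser cut yields a more compact hypothesis space; partially specified
# values such as "healthcare" stay usable as-is.
cut = {"education", "healthcare"}
records = ["professor", "surgeon", "healthcare", "teacher"]
print([abstract_value(r, cut) for r in records])
# -> ['education', 'healthcare', 'healthcare', 'education']
```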

    A Prediction System of Dengue Fever Using Monte Carlo Method

    Dengue fever is an acute disease that can be fatal; because there has been no prediction system for estimating dengue fever cases, the number of cases has grown every year. The original data gathered in the Jember area used a partial data-collection technique, which resulted in missing values. To make this secondary data usable in the prediction stage, missing values are imputed using a Monte Carlo method with four different randomization methods, followed by a chi-square normality test and then a regression stage. We use the Mean Square Error (MSE) to measure prediction error; the regression with the smallest MSE is taken as the best regression model for prediction.
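    A hedged sketch of the pipeline outlined above: missing case counts are filled by Monte Carlo draws from the observed values, a regression is fitted to each imputed series, and the fit with the smallest MSE is kept. The synthetic data, the single randomisation scheme shown, and the linear model are assumptions; they stand in for the Jember data and the paper's four randomization methods.

```python
# Hedged sketch: Monte Carlo imputation of missing case counts, regression,
# and MSE-based selection; data and randomisation scheme are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
months = np.arange(24).reshape(-1, 1)
cases = 50 + 2.0 * months.ravel() + rng.normal(0, 5, 24)
cases[[3, 11, 17]] = np.nan                      # simulated missing reports

def monte_carlo_impute(y: np.ndarray, rng) -> np.ndarray:
    """Fill each missing value with a draw from the observed distribution."""
    observed = y[~np.isnan(y)]
    filled = y.copy()
    filled[np.isnan(y)] = rng.choice(observed, size=np.isnan(y).sum())
    return filled

# Try several imputation trials and keep the regression with the smallest MSE
# (the paper instead compares four different randomization methods).
best_mse, best_model = np.inf, None
for _ in range(4):
    y = monte_carlo_impute(cases, rng)
    model = LinearRegression().fit(months, y)
    mse = mean_squared_error(y, model.predict(months))
    if mse < best_mse:
        best_mse, best_model = mse, model
print(best_mse, best_model.coef_)
```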

    Multi-dimensional clustering in user profiling

    User profiling has attracted a large number of technological methods and applications. With the increasing number of products and services, user profiling creates opportunities to capture the user's attention as well as to achieve high user satisfaction. Providing users with what they want, when and how they want it, depends largely on understanding them. The user profile is the representation of the user and holds information about the user; these profiles are the outcome of user profiling. Personalization is the adaptation of services to meet the user's needs and expectations; therefore, knowledge about the user leads to a personalized user experience. In user profiling applications, the major challenge is to build and handle user profiles. In the literature there are two main user profiling methods: collaborative and content-based. Apart from these traditional profiling methods, a number of classification and clustering algorithms have been used to classify user-related information to create user profiles. However, the profiling achieved in these works lacks accuracy, because all information within the profile has the same influence during profiling even though some of it is irrelevant. A primary aim of this thesis is to provide insight into the concept of user profiling. For this purpose, a comprehensive background study of the literature was conducted and summarized in this thesis. Furthermore, existing user profiling methods as well as classification and clustering algorithms were investigated, and, as one of the objectives of this study, the use of these algorithms for user profiling was examined. A number of classification and clustering algorithms, such as Bayesian Networks (BN) and Decision Trees (DTs), were simulated using user profiles and their classification accuracy was evaluated. Additionally, a novel clustering algorithm for user profiling, namely Multi-Dimensional Clustering (MDC), is proposed. MDC is a modified version of the Instance Based Learner (IBL) algorithm. In IBL, every feature has an equal effect on the classification regardless of its relevance; MDC differs from IBL by assigning weights to feature values to distinguish the effect of the features on clustering. Existing feature weighting methods, for instance Cross Category Feature (CCF), have also been investigated. In this thesis, three feature value weighting methods are proposed for MDC. These methods are: MDC weight method by Cross Clustering (MDC-CC), MDC weight method by Balanced Clustering (MDC-BC), and MDC weight method by changing the Lower-limit to Zero (MDC-LZ). All of these weighted MDC algorithms have been tested and evaluated. Additional simulations were carried out with existing weighted and non-weighted IBL algorithms (i.e. K-Star and Locally Weighted Learning (LWL)) in order to demonstrate the performance of the proposed methods. Furthermore, a real-life scenario is implemented to show how MDC can be used for user profiling to improve personalized service provisioning in mobile environments. The experiments presented in this thesis were conducted using user profile datasets that reflect the user's personal information, preferences and interests. The simulations with existing classification and clustering algorithms (e.g. Bayesian Networks (BN), Naïve Bayes (NB), Lazy learning of Bayesian Rules (LBR), Iterative Dichotomiser 3 (ID3)) were performed on the WEKA (version 3.5.7) machine learning platform. WEKA serves as a workbench for a collection of popular learning schemes implemented in Java. In addition, MDC-CC, MDC-BC and MDC-LZ were implemented in NetBeans IDE 6.1 Beta as a Java application and in MATLAB. Finally, the real-life scenario is implemented as a Java Mobile Application (Java ME) in NetBeans IDE 7.1. All simulation results were evaluated based on error rate and accuracy.
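    To make the feature-weighting idea behind MDC concrete, the sketch below shows a weighted instance-based (nearest-neighbour) prediction where per-feature weights change which stored profile is considered closest. The toy profiles and hand-picked weights are illustrative only; the thesis derives its weights with the MDC-CC, MDC-BC and MDC-LZ schemes.

```python
# Illustrative sketch of feature-value weighting in an instance-based learner;
# the profiles and weights are invented, not the thesis's MDC weighting schemes.
import numpy as np

def weighted_nn_predict(x, X_train, y_train, weights):
    """1-nearest-neighbour prediction with per-feature weights in the distance."""
    diffs = X_train - x
    dists = np.sqrt((weights * diffs ** 2).sum(axis=1))
    return y_train[np.argmin(dists)]

# Toy user-profile features: [age, sports_interest, music_interest]
X_train = np.array([[25, 0.9, 0.1],
                    [52, 0.2, 0.8],
                    [31, 0.8, 0.3]])
y_train = np.array(["sports_fan", "music_fan", "sports_fan"])

uniform = np.array([1.0, 1.0, 1.0])
learned = np.array([0.001, 1.0, 1.0])  # strongly down-weight the age feature

query = np.array([50, 0.85, 0.2])
# With uniform weights the age difference dominates: nearest is the music fan.
print(weighted_nn_predict(query, X_train, y_train, uniform))
# With age down-weighted the interest features dominate: nearest is a sports fan.
print(weighted_nn_predict(query, X_train, y_train, learned))
```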