431 research outputs found

    Pattern Mining and Sense-Making Support for Enhancing the User Experience

    While data mining techniques such as frequent itemset and sequence mining are well established as powerful pattern discovery tools in domains ranging from science and medicine to business, a drawback is the lack of support for interactive exploration of the large numbers of patterns generated with diverse parameter settings, and of the relationships among the mined patterns. To enhance the user experience, real-time query turnaround times and improved support for interactive mining are desired. There is also increasing interest in applying data mining solutions to mobile data. Patterns mined over mobile data may enable context-aware applications ranging from automating frequently repeated tasks to providing personalized recommendations. Overall, this dissertation addresses three problems that limit the utility of data mining, namely, (a) the lack of interactive exploration tools for mined patterns, (b) insufficient support for mining localized patterns, and (c) high computational requirements that prohibit mining patterns on smaller compute units such as a smartphone. This dissertation develops interactive frameworks for the guided exploration of mined patterns and their relationships. Contributions include the PARAS preprocessing and indexing framework, which enables analysts to gain key insights into rule relationships in a parameter space view through compact storage of rules that allows query-time reconstruction of complete rulesets. Contributions also include the visual rule exploration framework FIRE, which presents an interactive dual view of the parameter space and the rule space that together enable enhanced sense-making of rule relationships. This dissertation also supports the online mining of localized association rules computed on data subsets by selectively deploying alternative execution strategies that leverage a multidimensional itemset-based data partitioning index. Finally, we designed OLAPH, an on-device context-aware service that learns phone usage patterns over mobile context data such as app usage, location, call and SMS logs to provide device intelligence. Concepts introduced for modeling mobile data as sequences include compressing context logs into intervaled context events, adding generalized time features, and identifying meaningful sequences via filter expressions.
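The idea of compressing context logs into intervaled context events can be illustrated with a minimal sketch. The abstract does not describe OLAPH's actual data model, so everything here is an assumption made for illustration: the `Interval` type, the `compress_log` function, the `max_gap` parameter, and the sample timestamps and labels are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """One intervaled context event (hypothetical structure, not OLAPH's)."""
    context: str  # e.g. an app name or a location label
    start: int    # timestamp of the first sample in the run
    end: int      # timestamp of the last sample in the run


def compress_log(samples: List[Tuple[int, str]], max_gap: int = 60) -> List[Interval]:
    """Merge consecutive samples that share a context value into intervals.

    A new interval starts whenever the context value changes or the gap
    between two samples exceeds max_gap (an assumed tuning parameter).
    """
    intervals: List[Interval] = []
    for ts, value in sorted(samples):
        last = intervals[-1] if intervals else None
        if last is not None and last.context == value and ts - last.end <= max_gap:
            # Same context and close enough in time: extend the current run.
            intervals[-1] = last._replace(end=ts)
        else:
            intervals.append(Interval(value, ts, ts))
    return intervals


# Hypothetical location log sampled roughly every 30 seconds.
log = [(0, "home"), (30, "home"), (60, "home"), (90, "work"), (120, "work"), (400, "work")]
events = compress_log(log)
```

Six raw samples collapse into three interval events here, which is the kind of reduction that makes sequence mining over long context logs more tractable on a phone.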

    Scalable Daily Human Behavioral Pattern Mining from Multivariate Temporal Data

    This work introduces a set of scalable algorithms to identify patterns of daily human behavior. These patterns are extracted from multivariate temporal data collected from smartphones. We exploit the sensors available on these devices and identify frequent behavioral patterns at a temporal granularity inspired by the way individuals segment time into events. These patterns are helpful both to end users and to third parties who provide services based on this information. We demonstrate our approach on two real-world datasets and show that our pattern identification algorithms are scalable. This scalability makes analysis feasible on small, resource-constrained devices such as smartwatches. Traditional data analysis systems usually run on a remote system outside the device, largely because of the lack of scalability arising from the software and hardware restrictions of mobile and wearable devices. When the data is analyzed on the device, the user keeps control over the data (i.e., privacy) and network costs are eliminated.

    Data mining by means of generalized patterns

    The thesis focuses mainly on the study and application of pattern discovery algorithms that aggregate database knowledge to discover and exploit valuable correlations, hidden in the analyzed data, at different abstraction levels. The aim of the research effort described in this work is two-fold: the discovery of associations, in the form of generalized patterns, from large data collections, and the inference of semantic models, i.e., taxonomies and ontologies, suitable for driving the mining process.

    A Study on Vehicle Trajectory Analysis

    The successful development of effective real-time traffic management and information systems demands high-quality real-time traffic information. In the era of intelligent transportation convergence, traffic monitoring requires traffic sensing technologies. The present analysis extracted data from the Mobile Century experiment. The data obtained in the experiment were pre-processed. Based on the pre-processed data, an experimental road map was generated. Individual vehicles were tracked using trajectory analysis. Finally, an attempt was made to extract association rules from the Mobile Century dataset using the Apriori algorithm.
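For readers unfamiliar with Apriori, the following is a minimal, unoptimized sketch of level-wise frequent itemset mining followed by rule generation. It is not the study's implementation, and the transactions are invented labels, not records from the Mobile Century dataset; the function names `apriori` and `rules` are likewise chosen here for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple


def apriori(transactions: List[Set[str]], min_support: float) -> Dict[FrozenSet[str], float]:
    """Return every itemset whose support is at least min_support."""
    n = len(transactions)
    items = {item for t in transactions for item in t}
    current = [frozenset([i]) for i in items]  # level-1 candidates
    frequent: Dict[FrozenSet[str], float] = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all (k-1)-subsets are frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


def rules(frequent: Dict[FrozenSet[str], float],
          min_confidence: float) -> List[Tuple[FrozenSet[str], FrozenSet[str], float]]:
    """Derive rules lhs -> rhs with confidence = support(itemset) / support(lhs)."""
    out = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for lhs_items in combinations(itemset, r):
                lhs = frozenset(lhs_items)
                conf = supp / frequent[lhs]  # lhs is frequent by downward closure
                if conf >= min_confidence:
                    out.append((lhs, itemset - lhs, conf))
    return out


# Invented transactions: one set of labels per observed trip segment.
trips = [{"peak", "slow"}, {"peak", "slow"}, {"peak", "fast"}, {"offpeak", "fast"}]
freq = apriori(trips, min_support=0.5)
found = rules(freq, min_confidence=0.8)
```

With these toy transactions, the only rule that clears the 0.8 confidence threshold is `slow -> peak`, because every slow segment in the sample occurs at peak time while only two of the three peak segments are slow.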

    LC an effective classification based association rule mining algorithm

    Classification using association rules is a research field in data mining that applies association rule discovery techniques to classification benchmarks. Many research studies in the literature have confirmed that classification using association tends to generate more predictive classification systems than traditional classification techniques such as probabilistic, statistical and decision tree approaches. In this thesis, we introduce a novel data mining algorithm based on classification using association, called “Looking at the Class” (LC), which can be used for mining a range of classification data sets. Known algorithms in the classification using association approach, such as the Classification Based on Association rules (CBA) system and the Classification based on Predictive Association Rules (CPAR) system, merge disjoint items in the rule learning step without considering class label similarity. In contrast, the proposed algorithm merges only items with identical class labels. This avoids many unnecessary item combinations during the rule learning step and consequently yields large savings in computational time and memory. Furthermore, the LC algorithm uses a novel prediction procedure that employs multiple rules, instead of a single rule, to make the prediction decision. The proposed algorithm has been evaluated thoroughly on real-world security data sets collected using an automated tool developed at Huddersfield University. The security application considered in this thesis is the categorization of websites as legitimate or fake based on their features, which is a typical binary classification problem. Experiments have also been conducted on a number of UCI data sets, with classification accuracy, memory usage and other measures used for evaluation. The results show that the LC algorithm outperformed traditional classification algorithms such as C4.5, PART and Naïve Bayes, as well as known classification based association algorithms like CBA, with respect to classification accuracy, memory usage and execution time on most of the data sets considered.
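The class-aware merging idea can be sketched as a candidate-generation filter. This is only an illustration of the restriction stated in the abstract (join items only when their class labels match), not the LC algorithm itself: the function name `class_aware_join`, the mapping from itemsets to labels, and the website feature names are all hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple


def class_aware_join(
    labelled: Dict[FrozenSet[str], str]
) -> List[Tuple[FrozenSet[str], str]]:
    """Join disjoint itemsets only when they are associated with the same class.

    `labelled` maps each frequent itemset to the class label it predicts
    (an assumed representation chosen for this sketch).
    """
    candidates = []
    for (a, class_a), (b, class_b) in combinations(labelled.items(), 2):
        if class_a == class_b and a.isdisjoint(b):
            candidates.append((a | b, class_a))
    return candidates


# Hypothetical single-item rules for website classification.
single_items = {
    frozenset({"url_has_ip"}): "fake",
    frozenset({"young_domain"}): "fake",
    frozenset({"valid_https"}): "legitimate",
}
merged = class_aware_join(single_items)
```

An unrestricted join over these three items would produce three candidate pairs; the class-aware join produces one, which illustrates where the claimed savings in time and memory would come from.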

    Human Mobility Mining Using Spatio-Temporal Data

    Geospatial technologies have become an integral part of our lives. With technological progress, the rapid growth of geospatial information and the spread of inexpensive positioning technologies, more space-related data is becoming available at any time. Data is collected from multiple sources such as GPS and mobile device logs, wireless communication devices, location-aware services and other positioning systems. This gives scientists the opportunity to create new, innovative platforms for spatio-temporal data analysis and to improve methods for mining and visualization for decision support. In order to provide good decision support systems, it is vital to understand people’s movement and mobility behaviour and to be able to discover hidden patterns and associations in their daily activities. The aim of this thesis is to analyze and discuss spatial data mining techniques by answering questions such as what kinds of patterns can be extracted from spatio-temporal data and which methods are best for predicting human mobility behavior. In this work, we verify existing methodologies and theories about spatio-temporal data mining and propose a sequence of algorithms to achieve good human mobility prediction. We evaluate and compare the results and propose a methodology for adaptive data mining of human mobility behavior.

    Mobile-based online data mining : outdoor activity recognition

    One of the unique features of mobile applications is context awareness. The mobility and computing power afforded by smartphones allow users to interact with the external world more directly and constantly than ever before. The emerging capabilities of smartphones are fueling a rise in the use of mobile phones as input devices for a wide range of application fields, one of which is activity recognition. In pervasive computing, activity recognition carries significant weight because it can be applied to many real-life, human-centric problems. It provides services to application domains ranging from real-time traffic monitoring to fitness monitoring, social networking, marketing and healthcare. However, one of the major problems that can undermine any mobile-based activity recognition model is limited battery life, which is a big hurdle for the quality and continuity of the service. Indeed, excessive power consumption may become a major obstacle to broader acceptance of context-aware mobile applications, no matter how useful the proposed service may be. In this thesis we present a novel unsupervised, battery-aware approach that recognizes users’ outdoor activities online without depleting the mobile device's resources. We succeed in associating the places visited by individuals during their movements with meaningful human activities. Our approach includes novel models that incrementally cluster users’ movements into different types of activities without any massive use of historical records. To optimize battery consumption, our approach adapts its behavior to users’ behaviors and to the remaining battery level. Moreover, we propose to learn users’ habits in order to reduce the activity recognition computation. Our battery-friendly method combines activity recognition and prediction in order to recognize users’ activities accurately without draining the battery of their phones. We show that our approach significantly reduces battery consumption while maintaining the same high accuracy.