7 research outputs found

    Mining Association Rules Events over Data Streams

    Data streams have gained considerable attention in the data analysis and data mining communities because of the emergence of new classes of applications, such as monitoring, supply chain execution, sensor networks, oilfield and pipeline operations, financial marketing and health data industries. Telecommunication advancements have provided easy access to stream data produced by various applications. Data in streams differ from static data stored in data warehouses or databases: data streams are continuous, arrive at high speed and change over time. Traditional data mining algorithms assume that data resides in conventional storage, where mining is performed centrally with the luxury of accessing the data multiple times, using powerful processors, and producing offline output with no time constraints. Such algorithms are not suitable for dynamic data streams. Stream data needs to be mined promptly, as it might not be feasible to store such a volume of data. In addition, a stream reflects the live status of the environment generating it, so prompt analysis may provide early detection of faults, delays, performance measurements, trend analysis and other diagnostics. This thesis focuses on developing a data stream association rule mining algorithm among co-occurring events. The proposed algorithm mines association rules over data streams incrementally in a centralized setting. We are interested in association rules that meet a provided minimum confidence threshold and have a lift value greater than 1. We refer to such association rules as strong rules. Experiments on several datasets demonstrate that the proposed algorithm is efficient and effective in extracting association rules from data streams, with faster processing time and better memory management.
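
The strong-rule criterion in the abstract above (confidence at or above a minimum threshold, and lift greater than 1) can be illustrated with a small sketch. This is not the thesis's algorithm: the class `PairRuleCounter`, its methods, and the sample events are hypothetical names chosen for illustration, and the sketch only tracks single events and pairs, updating counts one transaction at a time.

```python
from collections import Counter
from itertools import combinations


class PairRuleCounter:
    """Hypothetical helper: incremental counts for rules of the form a -> b."""

    def __init__(self):
        self.n = 0                    # transactions seen so far
        self.item_counts = Counter()  # occurrences of each single event
        self.pair_counts = Counter()  # co-occurrences of each sorted pair

    def update(self, events):
        """Fold one transaction (a set of co-occurring events) into the counts."""
        items = sorted(set(events))
        self.n += 1
        self.item_counts.update(items)
        self.pair_counts.update(combinations(items, 2))

    def metrics(self, a, b):
        """Return (confidence, lift) for a -> b, or None if either event is unseen."""
        count_a = self.item_counts[a]
        count_b = self.item_counts[b]
        if count_a == 0 or count_b == 0:
            return None
        count_ab = self.pair_counts[tuple(sorted((a, b)))]
        confidence = count_ab / count_a          # P(b | a)
        lift = confidence / (count_b / self.n)   # P(b | a) / P(b)
        return confidence, lift

    def is_strong(self, a, b, min_confidence):
        """Strong rule: confidence >= threshold and lift > 1."""
        result = self.metrics(a, b)
        if result is None:
            return False
        confidence, lift = result
        return confidence >= min_confidence and lift > 1


# Illustrative stream of four transactions (made-up event names).
stream = [{"alarm", "overheat"}, {"alarm", "overheat"}, {"alarm"}, {"idle"}]
counter = PairRuleCounter()
for events in stream:
    counter.update(events)

print(counter.metrics("overheat", "alarm"))
print(counter.is_strong("overheat", "alarm", min_confidence=0.8))
```

Because the counts are updated per transaction, the metrics can be queried at any point in the stream without revisiting earlier data.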

    Incremental Sparse-PCA Feature Extraction For Data Streams

    Intruders attempt to penetrate commercial systems daily and cause considerable financial losses for individuals and organizations. Intrusion detection systems monitor network events to detect computer security threats. An extensive amount of network data is devoted to detecting malicious activities. Storing, processing, and analyzing this massive volume of data is costly and indicates the need for efficient network data reduction methods that do not require the data to be captured and stored first. A better approach extracts useful variables from data streams in real time and in a single pass. Removing irrelevant attributes reduces the data fed to the intrusion detection system (IDS) and shortens the analysis time while improving classification accuracy. This dissertation introduces an online, real-time data processing method for knowledge extraction. This incremental feature extraction is based on two approaches. First, Chunk Incremental Principal Component Analysis (CIPCA) detects intrusion in data streams. Then, two novel incremental feature extraction methods, Incremental Structured Sparse PCA (ISSPCA) and Incremental Generalized Power Method Sparse PCA (IGSPCA), find malicious elements. Metrics helped compare the performance of all methods. IGSPCA was found to perform as well as or better than CIPCA overall in terms of dimensionality reduction, classification accuracy, and learning time. ISSPCA yielded better results for higher chunk values and greater accumulation ratio thresholds. CIPCA and IGSPCA reduced the IDS dataset to 10 principal components, as opposed to 14 eigenvectors for ISSPCA. ISSPCA is more expensive in terms of learning time than the other techniques. This dissertation presents new methods that perform feature extraction from continuous data streams to find the small number of features necessary to express most of the data variance.
Data subsets derived from a few important variables are easier to interpret. Another goal of this dissertation was to propose incremental sparse PCA algorithms capable of processing data with concept drift and concept shift. Experiments using the WaveForm and WaveFormNoise datasets confirmed this ability. Similar to CIPCA, ISSPCA and IGSPCA updated the eigen-axes as a function of the accumulation ratio value, forming an informative eigenspace with few eigenvectors.
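
The role of the accumulation ratio threshold can be sketched as follows, assuming the term carries its usual meaning in chunk incremental PCA: the fraction of total variance captured by the leading eigenvalues. The function name `components_for_ratio` and the eigenvalues below are hypothetical, and the sketch covers only the selection of how many eigen-axes to keep, not the incremental update itself.

```python
def components_for_ratio(eigenvalues, threshold):
    """Smallest k such that the top-k eigenvalues reach the given variance share.

    Assumption: accumulation ratio = sum of the k largest eigenvalues
    divided by the sum of all eigenvalues.
    """
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return k
    return len(ordered)  # guard against floating-point shortfall


# Made-up eigenvalues: a higher threshold keeps more eigen-axes.
example_eigenvalues = [4.0, 3.0, 2.0, 1.0]
print(components_for_ratio(example_eigenvalues, 0.5))
print(components_for_ratio(example_eigenvalues, 0.75))
```

Under this reading, raising the threshold trades a larger eigenspace for more retained variance, which is consistent with the abstract's report that results vary with the accumulation ratio threshold.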

    Analyzing frequent patterns in data streams using a dynamic compact stream pattern algorithm

    As a result of modern technology and advances in communication, large numbers of data streams are continually generated by various online applications, devices and sources. Mining frequent patterns from these streams is now an important research topic in the field of data mining and knowledge discovery. Traditional mining approaches may not be appropriate for data stream environments, where the data volume is large and unbounded. They are limited in their ability to adaptively extract recent changes in knowledge from the data stream. Many algorithms and models have been developed to address the challenging task of mining an infinite influx of data generated from various points over the internet. The objective of this thesis is to introduce the Dynamic Compact Pattern Stream tree (DCPS-tree) algorithm for mining recent data from a continuous data stream. Our DCPS-tree dynamically achieves a frequency-descending prefix tree structure with only a single pass over the data by applying tree restructuring techniques such as the Branch sort method (BSM). This causes low-frequency patterns to be maintained at the leaf-node level and high-frequency components at higher levels. As a result, there is a considerable reduction in mining time on the dataset.
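
What a frequency-descending prefix tree looks like can be shown with a simplified sketch. It is not the DCPS-tree: for brevity it counts item frequencies first and then inserts each transaction in sorted order, whereas the thesis describes a single pass with restructuring via the Branch sort method. `Node`, `build_frequency_descending_tree` and the sample transactions are hypothetical.

```python
from collections import Counter


class Node:
    """One prefix-tree node: an item, its support along this path, its children."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_frequency_descending_tree(transactions):
    """Insert each transaction with its items ordered by descending frequency.

    Simplification: frequencies are counted up front (two passes) instead of
    restructuring the tree on the fly as a single-pass method would.
    """
    frequency = Counter(item for t in transactions for item in set(t))
    root = Node()
    for t in transactions:
        # Ties are broken alphabetically so the ordering is deterministic.
        ordered = sorted(set(t), key=lambda item: (-frequency[item], item))
        node = root
        for item in ordered:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root, frequency


# Made-up transactions: "a" is most frequent, so it sits directly under the root.
sample = [["a", "b"], ["a", "c"], ["a", "b", "d"]]
tree, freq = build_frequency_descending_tree(sample)
print(list(tree.children))                # items at the top level
print(list(tree.children["a"].children))  # items one level below "a"
```

Placing frequent items near the root lets transactions share long prefixes, which is what keeps the tree compact and pushes rare items toward the leaves.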

    Classificação de pacientes para adaptação de cadeira de rodas inteligente (Patient classification for intelligent wheelchair adaptation)

    Doctorate in Informatics Engineering (Doutoramento em Engenharia Informática). The importance and concern given to the autonomy and independence of elderly people and patients suffering from some kind of disability have been growing significantly in the last few decades. Intelligent wheelchairs (IW) are technologies that can increase the autonomy and independence of this population and are currently a very active research area. However, adaptation to users' specific needs and experiments with real users are topics that lack deeper study.
The intelligent wheelchair, developed in the context of the IntellWheels project, is controlled at a high level through a flexible multimodal interface, using voice commands, facial expressions, head movements and a joystick as its main input modalities. This work aimed to develop a system that automatically adapts the previously developed intelligent wheelchair to the user's characteristics. A methodology was developed for creating a user model. The research was based on the development of a data gathering system that collects and stores data from voice commands, facial expressions, and head and body movements of several patients with distinct disabilities such as Cerebral Palsy. The wheelchair can be used in different situations in real and simulated environments, and a serious game was developed in which users may perform different tasks. Data was analysed using knowledge discovery methods in order to create an automatic patient classification system. Based on the classification system, a methodology was developed for selecting the best wheelchair interface and command language for each patient. Evaluation was performed in the context of Project FCT/RIPD/ADA/109636/2009 – “IntellWheels – Intelligent Wheelchair with Flexible Multimodal Interface”. Experiments were conducted with a large set of patients suffering from severe physical constraints, in close collaboration with Escola Superior de Tecnologia de Saúde do Porto and Associação do Porto de Paralisia Cerebral. The experiments using the intelligent wheelchair were followed by user questionnaires. The results were statistically analysed in order to prove the effectiveness and usability of adapting the Intelligent Wheelchair multimodal interface to the user's characteristics.
The results obtained in a simulated environment showed a score of 67 on the system usability scale, based on the opinion of a sample of cerebral palsy patients at the most severe levels (IV and V) of the Gross Motor Function Scale. It was also statistically demonstrated that the interface recommended by the data analysis system was rated higher than the one suggested by the occupational therapists, showing the usefulness of defining a command language adapted to each user. Experiments conducted with distinct control modes revealed the users' preference for shared control, with an aid level that takes into account the patient's level of constraint. In conclusion, this work demonstrates that it is possible to adapt an intelligent wheelchair to the user with clear usability and safety benefits.