    Lessons Learned from the ECML/PKDD Discovery Challenge on the Atherosclerosis Risk Factors Data

    It has become common practice to organize a data mining cup, competition or challenge at machine learning and data mining conferences. The main idea of the Discovery Challenge, organized at the European Conferences on Principles and Practice of Knowledge Discovery in Databases since 1999, was to encourage collaborative research rather than competition between data miners. Different data sets have been used for the Discovery Challenge workshops over those seven years. The paper summarizes our experience gained from organizing and evaluating the Discovery Challenge on the atherosclerosis risk factor data.

    Unsupervised Intrusion Detection with Cross-Domain Artificial Intelligence Methods

    Cybercrime is a major concern for corporations, business owners, governments and citizens, and it continues to grow in spite of increasing investments in security and fraud prevention. The main challenges in this research field are detecting unknown attacks and reducing the false positive rate. The aim of this research work was to target both problems by leveraging four artificial intelligence techniques. The first technique is a novel unsupervised learning method based on skip-gram modeling. It was designed, developed and tested against a public dataset with popular intrusion patterns. A high accuracy and a low false positive rate were achieved without prior knowledge of attack patterns. The second technique is a novel unsupervised learning method based on topic modeling. It was applied to three related domains (network attacks, payment fraud, IoT malware traffic). A high accuracy was achieved in the three scenarios, even though the malicious activity differs significantly from one domain to the other. The third technique is a novel unsupervised learning method based on deep autoencoders, with feature selection performed by a supervised method, random forest. The results obtained showed that this technique can outperform other similar techniques. The fourth technique is based on an MLP neural network and is applied to alert reduction in fraud prevention. This method automates manual reviews previously done by human experts, without significantly impacting accuracy.
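
The two evaluation measures named in the abstract above, accuracy and false positive rate, can be computed from confusion-matrix counts. The following is a minimal sketch using the standard definitions; the function names and the example counts are hypothetical and are not taken from the paper.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all events classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no events to evaluate")
    return (tp + tn) / total


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of benign events that were wrongly flagged as attacks."""
    benign = fp + tn
    if benign == 0:
        raise ValueError("no benign events to evaluate")
    return fp / benign


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not results from the paper).
    tp, tn, fp, fn = 90, 880, 20, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"false positive rate = {false_positive_rate(fp, tn):.3f}")
```

A detector can score a high accuracy while still producing many false alerts when benign traffic dominates, which is why the two measures are reported separately.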

    Unsupervised monitoring of an elderly person's activities of daily living using Kinect sensors and a power meter

    The need for greater independence amongst the growing population of elderly people has made the concept of “ageing in place” an important area of research. Remote home monitoring strategies help the elderly deal with challenges involved in ageing in place and performing the activities of daily living (ADLs) independently. These monitoring approaches typically involve the use of several sensors, attached to the environment or person, in order to acquire data about the ADLs of the occupant being monitored. Some key drawbacks associated with many of the ADL monitoring approaches proposed for the elderly living alone need to be addressed. These include the need to label a training dataset of activities, use wearable devices or equip the house with many sensors. These approaches are also unable to concurrently monitor physical ADLs to detect emergency situations, such as falls, and instrumental ADLs to detect deviations from the daily routine. These are all indicative of deteriorating health in the elderly. To address these drawbacks, this research aimed to investigate the feasibility of unsupervised monitoring of both physical and instrumental ADLs of elderly people living alone via inexpensive minimally intrusive sensors. A hybrid framework was presented which combined two approaches for monitoring an elderly occupant’s physical and instrumental ADLs. Both approaches were trained based on unlabelled sensor data from the occupant’s normal behaviours. The data related to physical ADLs were captured from Kinect sensors and those related to instrumental ADLs were obtained using a combination of Kinect sensors and a power meter. Kinect sensors were employed in functional areas of the monitored environment to capture the occupant’s locations and 3D structures of their physical activities. The power meter measured the power consumption of home electrical appliances (HEAs) from the electricity panel. 
A novel unsupervised fuzzy approach was presented to monitor physical ADLs based on depth maps obtained from Kinect sensors. Epochs of activities associated with each monitored location were automatically identified, and the occupant’s behaviour patterns during each epoch were represented through combinations of fuzzy attributes. A novel membership function generation technique was presented to elicit membership functions for attributes by analysing the data distribution of the attributes while excluding noise and outliers in the data. The occupant’s behaviour patterns during each epoch of activity were then classified into frequent and infrequent categories using a data mining technique. Fuzzy rules were learned to model frequent behaviour patterns. An alarm was raised when the occupant’s behaviour in new data was recognised as frequent with a longer than usual duration, or as infrequent with a duration exceeding a data-driven value. Another novel unsupervised fuzzy approach to monitor instrumental ADLs took unlabelled training data from Kinect sensors and a power meter to model the key features of instrumental ADLs. Instrumental ADLs in the training dataset were identified by associating the occupant’s locations with specific power signatures on the power line. A set of fuzzy rules was then developed to model the frequency and regularity of the instrumental activities tailored to the occupant. This set was subsequently used to monitor new data and to generate reports on deviations from normal behaviour patterns. As a proof of concept, the proposed monitoring approaches were evaluated using a dataset collected from a real-life setting. An evaluation of the results verified that the proposed technique identified the epochs of activities more accurately than alternative techniques. The approach adopted for monitoring physical ADLs was found to improve elderly monitoring. 
It generated fuzzy rules that could represent the person’s physical ADLs and exclude noise and outliers in the data more efficiently than alternative approaches. The performance of different membership function generation techniques was compared. The fuzzy rule set obtained from the output of the proposed technique could accurately classify more scenarios of normal and abnormal behaviours. The approach for monitoring instrumental ADLs was also found to reliably distinguish power signatures generated automatically by self-regulated devices from those generated as a result of an elderly person’s instrumental ADLs. The evaluations also showed the effectiveness of the approach in correctly identifying elderly people’s interactions with specific HEAs and tracking simulated upward and downward deviations from normal behaviours. The fuzzy inference system in this approach was found to be robust with regard to errors in identifying instrumental ADLs, as it could effectively classify normal and abnormal behaviour patterns despite errors in the list of HEAs used.
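
The alarm rule described in the abstract above for physical ADLs (raise an alarm when a frequent behaviour lasts longer than usual, or when an infrequent behaviour exceeds a data-driven duration) can be sketched as follows. This is only an illustrative sketch: the `Observation` type, the field names and the threshold values are hypothetical, and the abstract does not say how the thresholds are derived or how the fuzzy rules feed into them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One observed behaviour pattern (hypothetical representation)."""
    is_frequent: bool   # True if it matched a learned frequent pattern
    duration_s: float   # how long the behaviour lasted, in seconds


def should_raise_alarm(obs: Observation,
                       usual_max_duration_s: float,
                       infrequent_limit_s: float) -> bool:
    """Apply the two-branch alarm rule.

    usual_max_duration_s: assumed upper bound on the usual duration of a
        frequent behaviour.
    infrequent_limit_s: assumed data-driven limit for infrequent behaviours.
    """
    if obs.is_frequent:
        return obs.duration_s > usual_max_duration_s
    return obs.duration_s > infrequent_limit_s


if __name__ == "__main__":
    # Hypothetical thresholds and observations, for illustration only.
    usual_max, infrequent_limit = 1800.0, 300.0
    for obs in (Observation(True, 900.0),    # frequent, normal duration
                Observation(True, 3600.0),   # frequent, unusually long
                Observation(False, 120.0),   # infrequent, short
                Observation(False, 600.0)):  # infrequent, too long
        print(obs, should_raise_alarm(obs, usual_max, infrequent_limit))
```

In practice the usual duration would probably be specific to each behaviour pattern and location rather than a single global value; a single pair of thresholds is used here only to keep the sketch short.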

    New Fundamental Technologies in Data Mining

    The progress of data mining technology and its broad public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to helping readers understand each section deeply, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.

    Data Mining for Marketing

    This paper gives a brief overview of data mining, its process and the various techniques used for it in the field of marketing. Data mining is the process of extracting hidden, valuable information from given data sets. The paper explains the Cross-Industry Standard Process for Data Mining (CRISP-DM) along with the various techniques used for it. As the volume of data grows every day, so does the need for data mining in marketing. It is a powerful technology that helps companies focus on the most important information in their data warehouses. Data mining is the process of collecting data from different sources, interpreting it and finally converting it into useful information that helps increase revenue and curtail costs, thereby providing a competitive edge to the organisation.

    Theory and Applications for Advanced Text Mining

    Owing to the growth of computer and web technologies, we can easily collect and store large amounts of text data, and we can expect these data to include useful knowledge. Text mining techniques have been studied intensively since the late 1990s in order to extract that knowledge. Although many important techniques have been developed, the text mining research field continues to expand to meet needs arising from various application fields. This book is composed of 9 chapters introducing advanced text mining techniques, ranging from relation extraction to techniques for under-resourced or less-resourced languages. I believe that this book will bring new knowledge to the text mining field and help many readers open up new research fields.