1,688 research outputs found

    Mobile Device Background Sensors: Authentication vs Privacy

    Get PDF
    The increasing number of mobile devices in recent years has caused the collection of a large amount of personal information that needs to be protected. To this aim, behavioural biometrics has become very popular. But, what is the discriminative power of mobile behavioural biometrics in real scenarios? With the success of Deep Learning (DL), architectures based on Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), such as Long Short-Term Memory (LSTM), have shown improvements compared to traditional machine learning methods. However, these DL architectures still have limitations that need to be addressed. In response, new DL architectures like Transformers have emerged. The question is, can these new Transformers outperform previous biometric approaches? To answers to these questions, this thesis focuses on behavioural biometric authentication with data acquired from mobile background sensors (i.e., accelerometers and gyroscopes). In addition, to the best of our knowledge, this is the first thesis that explores and proposes novel behavioural biometric systems based on Transformers, achieving state-of-the-art results in gait, swipe, and keystroke biometrics. The adoption of biometrics requires a balance between security and privacy. Biometric modalities provide a unique and inherently personal approach for authentication. Nevertheless, biometrics also give rise to concerns regarding the invasion of personal privacy. According to the General Data Protection Regulation (GDPR) introduced by the European Union, personal data such as biometric data are sensitive and must be used and protected properly. This thesis analyses the impact of sensitive data in the performance of biometric systems and proposes a novel unsupervised privacy-preserving approach. The research conducted in this thesis makes significant contributions, including: i) a comprehensive review of the privacy vulnerabilities of mobile device sensors, covering metrics for quantifying privacy in relation to sensitive data, along with protection methods for safeguarding sensitive information; ii) an analysis of authentication systems for behavioural biometrics on mobile devices (i.e., gait, swipe, and keystroke), being the first thesis that explores the potential of Transformers for behavioural biometrics, introducing novel architectures that outperform the state of the art; and iii) a novel privacy-preserving approach for mobile biometric gait verification using unsupervised learning techniques, ensuring the protection of sensitive data during the verification process

    Towards Integration of Artificial Intelligence into Medical Devices as a Real-Time Recommender System for Personalised Healthcare:State-of-the-Art and Future Prospects

    Get PDF
    In the era of big data, artificial intelligence (AI) algorithms have the potential to revolutionize healthcare by improving patient outcomes and reducing healthcare costs. AI algorithms have frequently been used in health care for predictive modelling, image analysis and drug discovery. Moreover, as a recommender system, these algorithms have shown promising impacts on personalized healthcare provision. A recommender system learns the behaviour of the user and predicts their current preferences (recommends) based on their previous preferences. Implementing AI as a recommender system improves this prediction accuracy and solves cold start and data sparsity problems. However, most of the methods and algorithms are tested in a simulated setting which cannot recapitulate the influencing factors of the real world. This review article systematically reviews prevailing methodologies in recommender systems and discusses the AI algorithms as recommender systems specifically in the field of healthcare. It also provides discussion around the most cutting-edge academic and practical contributions present in the literature, identifies performance evaluation matrices, challenges in the implementation of AI as a recommender system, and acceptance of AI-based recommender systems by clinicians. The findings of this article direct researchers and professionals to comprehend currently developed recommender systems and the future of medical devices integrated with real-time recommender systems for personalized healthcare

    iBUST: An intelligent behavioural trust model for securing industrial cyber-physical systems

    Get PDF
    To meet the demand of the world's largest population, smart manufacturing has accelerated the adoption of smart factories—where autonomous and cooperative instruments across all levels of production and logistics networks are integrated through a Cyber-Physical Production System (CPPS). However, these networks are comprised of various heterogeneous devices with varying computational power and memory capabilities. As a result, many secure communication protocols – that demand considerably high computational power and memory – can not be verbatim employed on these networks, and thereby, leaving them more vulnerable to security threats and attacks over conventional networks. These threats can largely be tackled by employing a Trust Management Model (TMM) by exploiting the behavioural patterns of nodes to identify their trust class. In this context, ML-based models are best suited due to their ability to capture hidden patterns in data, learning and improving the pattern detection accuracy over time to counteract and tackle threats of a dynamic nature, which is absent in most of the conventional models. However, among the existing ML-based solutions in detecting attack patterns, many of them are computationally expensive, require a long training time, and a considerably large amount of training data—which are seldom available. An aid to this is the association rule learning (ARL) paradigm, whose models are computationally inexpensive and do not require a long training time. Therefore, this paper proposes an ARL-based intelligent Behavioural Trust Model (iBUST) for securing the CPPS. For this intelligent TMM, a variant of Frequency Pattern Growth (FP-Growth), called enhanced FP-Growth (EFP-Growth) algorithm is developed by altering the internal data structures for faster execution and by developing a modified exponential decay function (MEDF) to automatically calculate minimum supports for adapting trust evolution characteristics. In addition, a new optimisation model for finding optimum parameter values in the MEDF and an algorithm for transmuting a 1D quantitative feature into a respective categorical feature are developed to facilitate the model. Afterwards, the trust class of an object is identified employing the Naïve Bayes classifier. This proposed model is evaluated on a trust evolution-supported experimental environment along with other compared models taking a benchmark dataset into consideration, where it outperforms its counterparts

    A Floating-Point Secure Implementation of the Report Noisy Max with Gap Mechanism

    Full text link
    The Noisy Max mechanism and its variations are fundamental private selection algorithms that are used to select items from a set of candidates (such as the most common diseases in a population), while controlling the privacy leakage in the underlying data. A recently proposed extension, Noisy Top-k with Gap, provides numerical information about how much better the selected items are compared to the non-selected items (e.g., how much more common are the selected diseases). This extra information comes at no privacy cost but crucially relies on infinite precision for the privacy guarantees. In this paper, we provide a finite-precision secure implementation of this algorithm that takes advantage of integer arithmetic.Comment: 21 page

    Subgroup discovery for structured target concepts

    Get PDF
    The main object of study in this thesis is subgroup discovery, a theoretical framework for finding subgroups in data—i.e., named sub-populations— whose behaviour with respect to a specified target concept is exceptional when compared to the rest of the dataset. This is a powerful tool that conveys crucial information to a human audience, but despite past advances has been limited to simple target concepts. In this work we propose algorithms that bring this framework to novel application domains. We introduce the concept of representative subgroups, which we use not only to ensure the fairness of a sub-population with regard to a sensitive trait, such as race or gender, but also to go beyond known trends in the data. For entities with additional relational information that can be encoded as a graph, we introduce a novel measure of robust connectedness which improves on established alternative measures of density; we then provide a method that uses this measure to discover which named sub-populations are more well-connected. Our contributions within subgroup discovery crescent with the introduction of kernelised subgroup discovery: a novel framework that enables the discovery of subgroups on i.i.d. target concepts with virtually any kind of structure. Importantly, our framework additionally provides a concrete and efficient tool that works out-of-the-box without any modification, apart from specifying the Gramian of a positive definite kernel. To use within kernelised subgroup discovery, but also on any other kind of kernel method, we additionally introduce a novel random walk graph kernel. Our kernel allows the fine tuning of the alignment between the vertices of the two compared graphs, during the count of the random walks, while we also propose meaningful structure-aware vertex labels to utilise this new capability. With these contributions we thoroughly extend the applicability of subgroup discovery and ultimately re-define it as a kernel method.Der Hauptgegenstand dieser Arbeit ist die Subgruppenentdeckung (Subgroup Discovery), ein theoretischer Rahmen für das Auffinden von Subgruppen in Daten—d. h. benannte Teilpopulationen—deren Verhalten in Bezug auf ein bestimmtes Targetkonzept im Vergleich zum Rest des Datensatzes außergewöhnlich ist. Es handelt sich hierbei um ein leistungsfähiges Instrument, das einem menschlichen Publikum wichtige Informationen vermittelt. Allerdings ist es trotz bisherigen Fortschritte auf einfache Targetkonzepte beschränkt. In dieser Arbeit schlagen wir Algorithmen vor, die diesen Rahmen auf neuartige Anwendungsbereiche übertragen. Wir führen das Konzept der repräsentativen Untergruppen ein, mit dem wir nicht nur die Fairness einer Teilpopulation in Bezug auf ein sensibles Merkmal wie Rasse oder Geschlecht sicherstellen, sondern auch über bekannte Trends in den Daten hinausgehen können. Für Entitäten mit zusätzlicher relationalen Information, die als Graph kodiert werden kann, führen wir ein neuartiges Maß für robuste Verbundenheit ein, das die etablierten alternativen Dichtemaße verbessert; anschließend stellen wir eine Methode bereit, die dieses Maß verwendet, um herauszufinden, welche benannte Teilpopulationen besser verbunden sind. Unsere Beiträge in diesem Rahmen gipfeln in der Einführung der kernelisierten Subgruppenentdeckung: ein neuartiger Rahmen, der die Entdeckung von Subgruppen für u.i.v. Targetkonzepten mit praktisch jeder Art von Struktur ermöglicht. Wichtigerweise, unser Rahmen bereitstellt zusätzlich ein konkretes und effizientes Werkzeug, das ohne jegliche Modifikation funktioniert, abgesehen von der Angabe des Gramian eines positiv definitiven Kernels. Für den Einsatz innerhalb der kernelisierten Subgruppentdeckung, aber auch für jede andere Art von Kernel-Methode, führen wir zusätzlich einen neuartigen Random-Walk-Graph-Kernel ein. Unser Kernel ermöglicht die Feinabstimmung der Ausrichtung zwischen den Eckpunkten der beiden unter-Vergleich-gestelltenen Graphen während der Zählung der Random Walks, während wir auch sinnvolle strukturbewusste Vertex-Labels vorschlagen, um diese neue Fähigkeit zu nutzen. Mit diesen Beiträgen erweitern wir die Anwendbarkeit der Subgruppentdeckung gründlich und definieren wir sie im Endeffekt als Kernel-Methode neu

    Toward Unbiased Multiple-Target Fuzzing with Path Diversity

    Full text link
    In this paper, we propose a novel directed fuzzing solution named AFLRun, which features target path-diversity metric and unbiased energy assignment. Firstly, we develop a new coverage metric by maintaining extra virgin map for each covered target to track the coverage status of seeds that hit the target. This approach enables the storage of waypoints into the corpus that hit a target through interesting path, thus enriching the path diversity for each target. Additionally, we propose a corpus-level energy assignment strategy that guarantees fairness for each target. AFLRun starts with uniform target weight and propagates this weight to seeds to get a desired seed weight distribution. By assigning energy to each seed in the corpus according to such desired distribution, a precise and unbiased energy assignment can be achieved. We built a prototype system and assessed its performance using a standard benchmark and several extensively fuzzed real-world applications. The evaluation results demonstrate that AFLRun outperforms state-of-the-art fuzzers in terms of vulnerability detection, both in quantity and speed. Moreover, AFLRun uncovers 29 previously unidentified vulnerabilities, including 8 CVEs, across four distinct programs

    A Survey of Sequential Pattern Based E-Commerce Recommendation Systems

    Get PDF
    E-commerce recommendation systems usually deal with massive customer sequential databases, such as historical purchase or click stream sequences. Recommendation systems’ accuracy can be improved if complex sequential patterns of user purchase behavior are learned by integrating sequential patterns of customer clicks and/or purchases into the user–item rating matrix input of collaborative filtering. This review focuses on algorithms of existing E-commerce recommendation systems that are sequential pattern-based. It provides a comprehensive and comparative performance analysis of these systems, exposing their methodologies, achievements, limitations, and potential for solving more important problems in this domain. The review shows that integrating sequential pattern mining of historical purchase and/or click sequences into a user–item matrix for collaborative filtering can (i) improve recommendation accuracy, (ii) reduce user–item rating data sparsity, (iii) increase the novelty rate of recommendations, and (iv) improve the scalability of recommendation systems

    Knowledge-based recommender system for stocks using clustering and nearest neighbors

    Get PDF
    Recommendation systems and algorithms are part of many services we use today. Online marketplaces, social media sites, streaming services, and many others lean on the algorithms to provide content for a user that match one’s likings. A practical example of such system is Netflix which may recommend movies to a user based on one’s viewing history. “Since you watched X, you might also be interested in Y”. Even though these algorithms are used in multiple services, there are still applications where the power of recommendation systems hasn’t been fully utilized for a public consumer. One of these are publicly traded stocks. Investing into publicly listed stocks is a common way to generate wealth. There are thousands of companies listed in NYSE and NASDAQ stock markets in the USA only. For an investor this is a lot to choose from. Some may prefer growth stocks and others blue-chip stocks with high dividend yield. One can search higher risk-reward returns from stocks that are dropping heavily and other seek steady growth in their preferred stocks. This thesis aims to implement a knowledge-based recommendation system that considers not only stock’s financial data but also historical price development to give meaningful stock recommendations based on an input of a single stock in a casebased manner. The implementation considers two different approaches when combining these distinctly different data types. The experimental development relies on clustering techniques to categorize similar stocks into different recommendation lists and finally sorting the lists using nearest neighbors. The evaluation of the approaches is conducted using machine learning evaluation methods combined with evaluation metrics used in recommender systems. The final best performing implementation is built on top of K-means clustering technique and t-SNE dimensionality reduction method. Trendlines and financial data of the stocks are combined using separately computed distance matrices. Similarity between the trendlines is computed using customized cosine-distance function. Finally the thesis presents a Stock Recommender using Similarity-based Methods (StockRSM)

    Explainable temporal data mining techniques to support the prediction task in Medicine

    Get PDF
    In the last decades, the increasing amount of data available in all fields raises the necessity to discover new knowledge and explain the hidden information found. On one hand, the rapid increase of interest in, and use of, artificial intelligence (AI) in computer applications has raised a parallel concern about its ability (or lack thereof) to provide understandable, or explainable, results to users. In the biomedical informatics and computer science communities, there is considerable discussion about the `` un-explainable" nature of artificial intelligence, where often algorithms and systems leave users, and even developers, in the dark with respect to how results were obtained. Especially in the biomedical context, the necessity to explain an artificial intelligence system result is legitimate of the importance of patient safety. On the other hand, current database systems enable us to store huge quantities of data. Their analysis through data mining techniques provides the possibility to extract relevant knowledge and useful hidden information. Relationships and patterns within these data could provide new medical knowledge. The analysis of such healthcare/medical data collections could greatly help to observe the health conditions of the population and extract useful information that can be exploited in the assessment of healthcare/medical processes. Particularly, the prediction of medical events is essential for preventing disease, understanding disease mechanisms, and increasing patient quality of care. In this context, an important aspect is to verify whether the database content supports the capability of predicting future events. In this thesis, we start addressing the problem of explainability, discussing some of the most significant challenges need to be addressed with scientific and engineering rigor in a variety of biomedical domains. We analyze the ``temporal component" of explainability, focusing on detailing different perspectives such as: the use of temporal data, the temporal task, the temporal reasoning, and the dynamics of explainability in respect to the user perspective and to knowledge. Starting from this panorama, we focus our attention on two different temporal data mining techniques. The first one, based on trend abstractions, starting from the concept of Trend-Event Pattern and moving through the concept of prediction, we propose a new kind of predictive temporal patterns, namely Predictive Trend-Event Patterns (PTE-Ps). The framework aims to combine complex temporal features to extract a compact and non-redundant predictive set of patterns composed by such temporal features. The second one, based on functional dependencies, we propose a methodology for deriving a new kind of approximate temporal functional dependencies, called Approximate Predictive Functional Dependencies (APFDs), based on a three-window framework. We then discuss the concept of approximation, the data complexity of deriving an APFD, the introduction of two new error measures, and finally the quality of APFDs in terms of coverage and reliability. Exploiting these methodologies, we analyze intensive care unit data from the MIMIC dataset

    Patterns of Markup use in Wikipedia

    Get PDF
    Wikipedia is a knowledge building community that lets anyone create and edit articles. While editing articles, users employ visual structure elements (VSE) to format content. VSEs are part of the Wikipedia markup language. All creation and editing events are recorded in a revision history. An unsupervised learning approach was used to analyze a dataset with more than 2,000,000 revisions of 126,000 articles. Using K-Means clustering and association rules mining a general classification of revisions was derived. Relevant classes include vandalism revisions, correction revisions and common revisions. Each class was later studied, and patterns of usage of markups elements identified. Those results help to identify the user intention, and the knowledge of VSE use could contribute to improving the actual text editors provide by Wikipedia to improve the editor’s activity finally.Laboratorio de Investigación y Formación en Informática Avanzad
    • …
    corecore