975 research outputs found

    Mobile Device Background Sensors: Authentication vs Privacy

    The increasing number of mobile devices in recent years has led to the collection of a large amount of personal information that needs to be protected. To this end, behavioural biometrics has become very popular. But what is the discriminative power of mobile behavioural biometrics in real scenarios? With the success of Deep Learning (DL), architectures based on Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), such as Long Short-Term Memory (LSTM), have shown improvements compared to traditional machine learning methods. However, these DL architectures still have limitations that need to be addressed. In response, new DL architectures like Transformers have emerged. The question is, can these new Transformers outperform previous biometric approaches? To answer these questions, this thesis focuses on behavioural biometric authentication with data acquired from mobile background sensors (i.e., accelerometers and gyroscopes). In addition, to the best of our knowledge, this is the first thesis that explores and proposes novel behavioural biometric systems based on Transformers, achieving state-of-the-art results in gait, swipe, and keystroke biometrics. The adoption of biometrics requires a balance between security and privacy. Biometric modalities provide a unique and inherently personal approach to authentication. Nevertheless, biometrics also give rise to concerns regarding the invasion of personal privacy. According to the General Data Protection Regulation (GDPR) introduced by the European Union, personal data such as biometric data are sensitive and must be used and protected properly. This thesis analyses the impact of sensitive data on the performance of biometric systems and proposes a novel unsupervised privacy-preserving approach. 
The research conducted in this thesis makes significant contributions, including: i) a comprehensive review of the privacy vulnerabilities of mobile device sensors, covering metrics for quantifying privacy in relation to sensitive data, along with protection methods for safeguarding sensitive information; ii) an analysis of authentication systems for behavioural biometrics on mobile devices (i.e., gait, swipe, and keystroke), being the first thesis that explores the potential of Transformers for behavioural biometrics, introducing novel architectures that outperform the state of the art; and iii) a novel privacy-preserving approach for mobile biometric gait verification using unsupervised learning techniques, ensuring the protection of sensitive data during the verification process.

    Colossal Trajectory Mining: A unifying approach to mine behavioral mobility patterns

    Spatio-temporal mobility patterns are at the core of strategic applications such as urban planning and monitoring. Depending on the strength of spatio-temporal constraints, different mobility patterns can be defined. While existing approaches work well in the extraction of groups of objects sharing fine-grained paths, the huge volume of large-scale data calls for coarse-grained solutions. In this paper, we introduce Colossal Trajectory Mining (CTM) to efficiently extract heterogeneous mobility patterns out of a multidimensional space that, along with space and time dimensions, can consider additional trajectory features (e.g., means of transport or activity) to characterize behavioral mobility patterns. The algorithm is natively designed in a distributed fashion, and the experimental evaluation shows its scalability with respect to the involved features and the cardinality of the trajectory dataset.

    Towards Integration of Artificial Intelligence into Medical Devices as a Real-Time Recommender System for Personalised Healthcare: State-of-the-Art and Future Prospects

    In the era of big data, artificial intelligence (AI) algorithms have the potential to revolutionize healthcare by improving patient outcomes and reducing healthcare costs. AI algorithms have frequently been used in healthcare for predictive modelling, image analysis and drug discovery. Moreover, as recommender systems, these algorithms have shown promising impacts on personalized healthcare provision. A recommender system learns the behaviour of the user and predicts (recommends) their current preferences based on their previous preferences. Implementing AI as a recommender system improves this prediction accuracy and addresses the cold start and data sparsity problems. However, most of the methods and algorithms are tested in a simulated setting, which cannot recapitulate the influencing factors of the real world. This review article systematically reviews prevailing methodologies in recommender systems and discusses AI algorithms as recommender systems specifically in the field of healthcare. It also discusses the most cutting-edge academic and practical contributions present in the literature, and identifies performance evaluation metrics, challenges in the implementation of AI as a recommender system, and the acceptance of AI-based recommender systems by clinicians. The findings of this article help researchers and professionals to comprehend currently developed recommender systems and the future of medical devices integrated with real-time recommender systems for personalized healthcare.

    Subgroup discovery for structured target concepts

    The main object of study in this thesis is subgroup discovery, a theoretical framework for finding subgroups in data (i.e., named sub-populations) whose behaviour with respect to a specified target concept is exceptional when compared to the rest of the dataset. This is a powerful tool that conveys crucial information to a human audience, but despite past advances it has been limited to simple target concepts. In this work we propose algorithms that bring this framework to novel application domains. We introduce the concept of representative subgroups, which we use not only to ensure the fairness of a sub-population with regard to a sensitive trait, such as race or gender, but also to go beyond known trends in the data. For entities with additional relational information that can be encoded as a graph, we introduce a novel measure of robust connectedness which improves on established alternative measures of density; we then provide a method that uses this measure to discover which named sub-populations are better connected. Our contributions within subgroup discovery culminate in the introduction of kernelised subgroup discovery: a novel framework that enables the discovery of subgroups on i.i.d. target concepts with virtually any kind of structure. Importantly, our framework additionally provides a concrete and efficient tool that works out-of-the-box without any modification, apart from specifying the Gramian of a positive definite kernel. For use within kernelised subgroup discovery, but also with any other kind of kernel method, we additionally introduce a novel random walk graph kernel. Our kernel allows the fine-tuning of the alignment between the vertices of the two compared graphs during the count of the random walks, and we also propose meaningful structure-aware vertex labels to utilise this new capability. 
With these contributions we thoroughly extend the applicability of subgroup discovery and ultimately re-define it as a kernel method.
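
As a rough illustration of what "exceptional behaviour with respect to a target concept" can mean for the simple, binary-target case that this abstract sets out to go beyond, the sketch below scores a named sub-population with weighted relative accuracy (WRAcc), a common quality measure in the subgroup discovery literature. The abstract does not name this measure; the function `wracc`, the column names and the toy rows are all assumptions made for illustration, and none of this reflects the kernelised framework proposed in the thesis.

```python
def wracc(rows, predicate, target):
    """Weighted relative accuracy of the subgroup selected by `predicate`.

    WRAcc = coverage * (target rate inside subgroup - target rate overall).
    `rows` is a list of dicts; `target` names a boolean column (hypothetical).
    """
    n = len(rows)
    subgroup = [r for r in rows if predicate(r)]
    if n == 0 or not subgroup:
        return 0.0  # an empty subgroup carries no information
    p_all = sum(1 for r in rows if r[target]) / n
    p_sub = sum(1 for r in subgroup if r[target]) / len(subgroup)
    return (len(subgroup) / n) * (p_sub - p_all)


# Toy data, invented for this sketch.
rows = [
    {"age": "young", "responds": True},
    {"age": "young", "responds": True},
    {"age": "old", "responds": False},
    {"age": "old", "responds": True},
]

young_score = wracc(rows, lambda r: r["age"] == "young", "responds")
old_score = wracc(rows, lambda r: r["age"] == "old", "responds")
```

A positive score means the target is over-represented in the subgroup relative to the whole dataset, and a negative score means it is under-represented; structured targets such as graphs need a different notion of quality, which is the gap the abstract addresses.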

    A GPU-Accelerated Moving-Horizon Algorithm for Training Deep Classification Trees on Large Datasets

    Decision trees are essential yet NP-complete to train, prompting the widespread use of heuristic methods such as CART, which suffers from sub-optimal performance due to its greedy nature. Recently, breakthroughs in finding optimal decision trees have emerged; however, these methods still face significant computational costs and struggle with continuous features in large-scale datasets and deep trees. To address these limitations, we introduce a moving-horizon differential evolution algorithm for classification trees with continuous features (MH-DEOCT). Our approach consists of a discrete tree decoding method that eliminates duplicated searches between adjacent samples, a GPU-accelerated implementation that significantly reduces running time, and a moving-horizon strategy that iteratively trains shallow subtrees at each node to balance the vision and optimizer capability. Comprehensive studies on 68 UCI datasets demonstrate that our approach outperforms the heuristic method CART on training and testing accuracy by an average of 3.44% and 1.71%, respectively. Moreover, these numerical studies empirically demonstrate that MH-DEOCT achieves near-optimal performance (only 0.38% and 0.06% worse than the global optimal method on training and testing, respectively), while it offers remarkable scalability for deep trees (e.g., depth=8) and large-scale datasets (e.g., ten million samples). Comment: 36 pages (13 pages for the main body, 23 pages for the appendix), 7 figures.

    Knowledge-based recommender system for stocks using clustering and nearest neighbors

    Recommendation systems and algorithms are part of many services we use today. Online marketplaces, social media sites, streaming services, and many others lean on these algorithms to provide content that matches a user's preferences. A practical example of such a system is Netflix, which may recommend movies to a user based on their viewing history: “Since you watched X, you might also be interested in Y”. Even though these algorithms are used in multiple services, there are still applications where the power of recommendation systems hasn’t been fully utilized for the general consumer. One of these is publicly traded stocks. Investing in publicly listed stocks is a common way to generate wealth. There are thousands of companies listed on the NYSE and NASDAQ stock markets in the USA alone. For an investor this is a lot to choose from. Some may prefer growth stocks and others blue-chip stocks with a high dividend yield. One investor may look for higher risk-reward returns from stocks that are dropping heavily, while others seek steady growth in their preferred stocks. This thesis aims to implement a knowledge-based recommendation system that considers not only a stock’s financial data but also its historical price development to give meaningful stock recommendations based on an input of a single stock in a case-based manner. The implementation considers two different approaches to combining these distinctly different data types. The experimental development relies on clustering techniques to categorize similar stocks into different recommendation lists, and finally sorts the lists using nearest neighbors. The evaluation of the approaches is conducted using machine learning evaluation methods combined with evaluation metrics used in recommender systems. The best performing implementation is built on top of the K-means clustering technique and the t-SNE dimensionality reduction method. Trendlines and financial data of the stocks are combined using separately computed distance matrices. 
Similarity between the trendlines is computed using a customized cosine-distance function. Finally, the thesis presents a Stock Recommender using Similarity-based Methods (StockRSM).
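
The idea of combining separately computed trendline and financial distances and then ranking by nearest neighbors can be sketched as follows. This is a minimal illustration under stated assumptions, not the StockRSM implementation: it omits the K-means and t-SNE steps entirely, uses the plain cosine distance in place of the thesis's customized one, and the tickers, feature values, the `weight` parameter and the function names are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """Plain cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Euclidean distance, used here for the financial features."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recommend(query, trends, financials, k=2, weight=0.5):
    """Return the k tickers closest to `query`.

    The two distances are computed separately and mixed with `weight`,
    an assumed blending parameter that is not taken from the thesis.
    """
    scored = []
    for ticker in trends:
        if ticker == query:
            continue
        d_trend = cosine_distance(trends[query], trends[ticker])
        d_fin = euclidean_distance(financials[query], financials[ticker])
        scored.append((weight * d_trend + (1 - weight) * d_fin, ticker))
    scored.sort()
    return [ticker for _, ticker in scored[:k]]


# Invented example data: price trendlines and scaled financial features.
trends = {"AAA": [1, 2, 3], "BBB": [2, 4, 6], "CCC": [3, 2, 1]}
financials = {"AAA": [0.1, 0.2], "BBB": [0.1, 0.25], "CCC": [0.9, 0.8]}

top = recommend("AAA", trends, financials, k=1)
```

Here `BBB` ranks first for the query `AAA` because its trendline has the same shape (cosine distance close to zero) and its financial features are nearby. Keeping the two distance computations separate, as the abstract describes, lets each data type use a measure suited to it before they are combined.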

    Explainable temporal data mining techniques to support the prediction task in Medicine

    In recent decades, the increasing amount of data available in all fields has raised the need to discover new knowledge and explain the hidden information found. On the one hand, the rapid increase of interest in, and use of, artificial intelligence (AI) in computer applications has raised a parallel concern about its ability (or lack thereof) to provide understandable, or explainable, results to users. In the biomedical informatics and computer science communities, there is considerable discussion about the "un-explainable" nature of artificial intelligence, where algorithms and systems often leave users, and even developers, in the dark with respect to how results were obtained. Especially in the biomedical context, the need to explain the result of an artificial intelligence system is justified by the importance of patient safety. On the other hand, current database systems enable us to store huge quantities of data. Their analysis through data mining techniques provides the possibility to extract relevant knowledge and useful hidden information. Relationships and patterns within these data could provide new medical knowledge. The analysis of such healthcare/medical data collections could greatly help to observe the health conditions of the population and extract useful information that can be exploited in the assessment of healthcare/medical processes. In particular, the prediction of medical events is essential for preventing disease, understanding disease mechanisms, and increasing the quality of patient care. In this context, an important aspect is to verify whether the database content supports the capability of predicting future events. In this thesis, we start by addressing the problem of explainability, discussing some of the most significant challenges that need to be addressed with scientific and engineering rigor in a variety of biomedical domains. 
We analyze the "temporal component" of explainability, detailing different perspectives such as the use of temporal data, the temporal task, temporal reasoning, and the dynamics of explainability with respect to the user perspective and to knowledge. Starting from this panorama, we focus our attention on two different temporal data mining techniques. The first, based on trend abstractions, starts from the concept of the Trend-Event Pattern and moves through the concept of prediction to propose a new kind of predictive temporal pattern, namely Predictive Trend-Event Patterns (PTE-Ps). The framework aims to combine complex temporal features to extract a compact and non-redundant predictive set of patterns composed of such temporal features. The second, based on functional dependencies, is a methodology for deriving a new kind of approximate temporal functional dependency, called Approximate Predictive Functional Dependencies (APFDs), based on a three-window framework. We then discuss the concept of approximation, the data complexity of deriving an APFD, the introduction of two new error measures, and finally the quality of APFDs in terms of coverage and reliability. Exploiting these methodologies, we analyze intensive care unit data from the MIMIC dataset.

    Measuring the impact of COVID-19 on hospital care pathways

    Care pathways in hospitals around the world reported significant disruption during the recent COVID-19 pandemic, but measuring the actual impact is more problematic. Process mining can be useful for hospital management to measure the conformance of real-life care to what might be considered normal operations. In this study, we aim to demonstrate that process mining can be used to investigate process changes associated with complex disruptive events. We studied perturbations to accident and emergency (A&E) and maternity pathways in a UK public hospital during the COVID-19 pandemic. Coincidentally, the hospital had implemented a Command Centre approach for patient-flow management, affording an opportunity to study both the planned improvement and the disruption due to the pandemic. Our study proposes and demonstrates a method for measuring and investigating the impact of such planned and unplanned disruptions affecting hospital care pathways. We found that during the pandemic, both A&E and maternity pathways had measurable reductions in the mean length of stay and a measurable drop in the percentage of pathways conforming to normative models. There were no distinctive patterns of monthly mean values of length of stay or conformance throughout the phases of the installation of the hospital’s new Command Centre approach. Due to a deficit in the available A&E data, the findings for A&E pathways could not be interpreted.

    Design and Evaluation of Parallel and Scalable Machine Learning Research in Biomedical Modelling Applications

    The use of Machine Learning (ML) techniques in the medical field is not a new occurrence, and several papers describing research in that direction have been published. This research has helped in analysing medical images, creating responsive cardiovascular models, and predicting outcomes for medical conditions, among many other applications. This Ph.D. aims to apply such ML techniques to the analysis of Acute Respiratory Distress Syndrome (ARDS), which is a severe condition that affects around 1 in 10,000 patients worldwide every year with life-threatening consequences. We employ previously developed mechanistic modelling approaches such as the “Nottingham Physiological Simulator,” through which a better understanding of ARDS progression can be gleaned, and take advantage of the growing volume of medical datasets available for research (i.e., “big data”) and the advances in ML to develop, train, and optimise the modelling approaches. Additionally, the onset of the COVID-19 pandemic while this Ph.D. research was ongoing provided a similar application field to ARDS, and made further ML research in medical diagnosis applications possible. Finally, we leverage the available Modular Supercomputing Architecture (MSA) developed as part of the Dynamical Exascale Entry Platform - Extreme Scale Technologies (DEEP-EST) EU Project to scale up and speed up the modelling processes. This Ph.D. project is one element of the Smart Medical Information Technology for Healthcare (SMITH) project, wherein the thesis research can be validated by clinical and medical experts (e.g. Uniklinik RWTH Aachen).

    FootApp: An AI-powered system for football match annotation

    In recent years, scientific and industrial research has shown a growing interest in acquiring large annotated data sets to train artificial intelligence algorithms for tackling problems in different domains. In this context, we have observed that even the market for football data has grown substantially. The analysis of football matches relies on the annotation of both individual players’ and team actions, as well as the athletic performance of players. Consequently, annotating football events at a fine-grained level is a very expensive and error-prone task. Most existing semi-automatic tools for football match annotation rely on cameras and computer vision. However, those tools fall short in capturing team dynamics and in extracting data on players who are not visible in the camera frame. To address these issues, in this manuscript we present FootApp, an AI-based system for football match annotation. First, our system relies on an advanced, mixed user interface that exploits both vocal and touch interaction. Second, the motor performance of players is captured and processed by applying machine learning algorithms to data collected from inertial sensors worn by players. Artificial intelligence techniques are then used to check the consistency of generated labels, including those regarding the physical activity of players, to automatically recognize annotation errors. Notably, we implemented a full prototype of the proposed system and performed experiments to show its effectiveness in a real-world adoption scenario.