3,232 research outputs found

    Undergraduate Catalog of Studies, 2023-2024

    Get PDF

    Mobile Device Background Sensors: Authentication vs Privacy

    Get PDF
    The increasing number of mobile devices in recent years has caused the collection of a large amount of personal information that needs to be protected. To this aim, behavioural biometrics has become very popular. But, what is the discriminative power of mobile behavioural biometrics in real scenarios? With the success of Deep Learning (DL), architectures based on Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), such as Long Short-Term Memory (LSTM), have shown improvements compared to traditional machine learning methods. However, these DL architectures still have limitations that need to be addressed. In response, new DL architectures like Transformers have emerged. The question is, can these new Transformers outperform previous biometric approaches? To answers to these questions, this thesis focuses on behavioural biometric authentication with data acquired from mobile background sensors (i.e., accelerometers and gyroscopes). In addition, to the best of our knowledge, this is the first thesis that explores and proposes novel behavioural biometric systems based on Transformers, achieving state-of-the-art results in gait, swipe, and keystroke biometrics. The adoption of biometrics requires a balance between security and privacy. Biometric modalities provide a unique and inherently personal approach for authentication. Nevertheless, biometrics also give rise to concerns regarding the invasion of personal privacy. According to the General Data Protection Regulation (GDPR) introduced by the European Union, personal data such as biometric data are sensitive and must be used and protected properly. This thesis analyses the impact of sensitive data in the performance of biometric systems and proposes a novel unsupervised privacy-preserving approach. The research conducted in this thesis makes significant contributions, including: i) a comprehensive review of the privacy vulnerabilities of mobile device sensors, covering metrics for quantifying privacy in relation to sensitive data, along with protection methods for safeguarding sensitive information; ii) an analysis of authentication systems for behavioural biometrics on mobile devices (i.e., gait, swipe, and keystroke), being the first thesis that explores the potential of Transformers for behavioural biometrics, introducing novel architectures that outperform the state of the art; and iii) a novel privacy-preserving approach for mobile biometric gait verification using unsupervised learning techniques, ensuring the protection of sensitive data during the verification process

    Graduate Catalog of Studies, 2023-2024

    Get PDF

    Self-supervised learning for transferable representations

    Get PDF
    Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that solely leverage raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. Our focus thenceforth is on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self supervised models against many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalise to real-world transformations. This begins to explain the differing empirical performances achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation. Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks

    Graduate Catalog of Studies, 2023-2024

    Get PDF

    Protecting Privacy in Indian Schools: Regulating AI-based Technologies' Design, Development and Deployment

    Get PDF
    Education is one of the priority areas for the Indian government, where Artificial Intelligence (AI) technologies are touted to bring digital transformation. Several Indian states have also started deploying facial recognition-enabled CCTV cameras, emotion recognition technologies, fingerprint scanners, and Radio frequency identification tags in their schools to provide personalised recommendations, ensure student security, and predict the drop-out rate of students but also provide 360-degree information of a student. Further, Integrating Aadhaar (digital identity card that works on biometric data) across AI technologies and learning and management systems (LMS) renders schools a ‘panopticon’. Certain technologies or systems like Aadhaar, CCTV cameras, GPS Systems, RFID tags, and learning management systems are used primarily for continuous data collection, storage, and retention purposes. Though they cannot be termed AI technologies per se, they are fundamental for designing and developing AI systems like facial, fingerprint, and emotion recognition technologies. The large amount of student data collected speedily through the former technologies is used to create an algorithm for the latter-stated AI systems. Once algorithms are processed using machine learning (ML) techniques, they learn correlations between multiple datasets predicting each student’s identity, decisions, grades, learning growth, tendency to drop out, and other behavioural characteristics. Such autonomous and repetitive collection, processing, storage, and retention of student data without effective data protection legislation endangers student privacy. The algorithmic predictions by AI technologies are an avatar of the data fed into the system. An AI technology is as good as the person collecting the data, processing it for a relevant and valuable output, and regularly evaluating the inputs going inside an AI model. An AI model can produce inaccurate predictions if the person overlooks any relevant data. However, the state, school administrations and parents’ belief in AI technologies as a panacea to student security and educational development overlooks the context in which ‘data practices’ are conducted. A right to privacy in an AI age is inextricably connected to data practices where data gets ‘cooked’. Thus, data protection legislation operating without understanding and regulating such data practices will remain ineffective in safeguarding privacy. The thesis undergoes interdisciplinary research that enables a better understanding of the interplay of data practices of AI technologies with social practices of an Indian school, which the present Indian data protection legislation overlooks, endangering students’ privacy from designing and developing to deploying stages of an AI model. The thesis recommends the Indian legislature frame better legislation equipped for the AI/ML age and the Indian judiciary on evaluating the legality and reasonability of designing, developing, and deploying such technologies in schools

    Computational and experimental studies on the reaction mechanism of bio-oil components with additives for increased stability and fuel quality

    Get PDF
    As one of the world’s largest palm oil producers, Malaysia encountered a major disposal problem as vast amount of oil palm biomass wastes are produced. To overcome this problem, these biomass wastes can be liquefied into biofuel with fast pyrolysis technology. However, further upgradation of fast pyrolysis bio-oil via direct solvent addition was required to overcome it’s undesirable attributes. In addition, the high production cost of biofuels often hinders its commercialisation. Thus, the designed solvent-oil blend needs to achieve both fuel functionality and economic targets to be competitive with the conventional diesel fuel. In this thesis, a multi-stage computer-aided molecular design (CAMD) framework was employed for bio-oil solvent design. In the design problem, molecular signature descriptors were applied to accommodate different classes of property prediction models. However, the complexity of the CAMD problem increases as the height of signature increases due to the combinatorial nature of higher order signature. Thus, a consistency rule was developed reduce the size of the CAMD problem. The CAMD problem was then further extended to address the economic aspects via fuzzy multi-objective optimisation approach. Next, a rough-set based machine learning (RSML) model has been proposed to correlate the feedstock characterisation and pyrolysis condition with the pyrolysis bio-oil properties by generating decision rules. The generated decision rules were analysed from a scientific standpoint to identify the underlying patterns, while ensuring the rules were logical. The decision rules generated can be used to select optimal feedstock composition and pyrolysis condition to produce pyrolysis bio-oil of targeted fuel properties. Next, the results obtained from the computational approaches were verified through experimental study. The generated pyrolysis bio-oils were blended with the identified solvents at various mixing ratio. In addition, emulsification of the solvent-oil blend in diesel was also conducted with the help of surfactants. Lastly, potential extensions and prospective work for this study have been discuss in the later part of this thesis. To conclude, this thesis presented the combination of computational and experimental approaches in upgrading the fuel properties of pyrolysis bio-oil. As a result, high quality biofuel can be generated as a cleaner burning replacement for conventional diesel fuel

    Backpropagation Beyond the Gradient

    Get PDF
    Automatic differentiation is a key enabler of deep learning: previously, practitioners were limited to models for which they could manually compute derivatives. Now, they can create sophisticated models with almost no restrictions and train them using first-order, i. e. gradient, information. Popular libraries like PyTorch and TensorFlow compute this gradient efficiently, automatically, and conveniently with a single line of code. Under the hood, reverse-mode automatic differentiation, or gradient backpropagation, powers the gradient computation in these libraries. Their entire design centers around gradient backpropagation. These frameworks are specialized around one specific task—computing the average gradient in a mini-batch. This specialization often complicates the extraction of other information like higher-order statistical moments of the gradient, or higher-order derivatives like the Hessian. It limits practitioners and researchers to methods that rely on the gradient. Arguably, this hampers the field from exploring the potential of higher-order information and there is evidence that focusing solely on the gradient has not lead to significant recent advances in deep learning optimization. To advance algorithmic research and inspire novel ideas, information beyond the batch-averaged gradient must be made available at the same level of computational efficiency, automation, and convenience. This thesis presents approaches to simplify experimentation with rich information beyond the gradient by making it more readily accessible. We present an implementation of these ideas as an extension to the backpropagation procedure in PyTorch. Using this newly accessible information, we demonstrate possible use cases by (i) showing how it can inform our understanding of neural network training by building a diagnostic tool, and (ii) enabling novel methods to efficiently compute and approximate curvature information. First, we extend gradient backpropagation for sequential feedforward models to Hessian backpropagation which enables computing approximate per-layer curvature. This perspective unifies recently proposed block- diagonal curvature approximations. Like gradient backpropagation, the computation of these second-order derivatives is modular, and therefore simple to automate and extend to new operations. Based on the insight that rich information beyond the gradient can be computed efficiently and at the same time, we extend the backpropagation in PyTorch with the BackPACK library. It provides efficient and convenient access to statistical moments of the gradient and approximate curvature information, often at a small overhead compared to computing just the gradient. Next, we showcase the utility of such information to better understand neural network training. We build the Cockpit library that visualizes what is happening inside the model during training through various instruments that rely on BackPACK’s statistics. We show how Cockpit provides a meaningful statistical summary report to the deep learning engineer to identify bugs in their machine learning pipeline, guide hyperparameter tuning, and study deep learning phenomena. Finally, we use BackPACK’s extended automatic differentiation functionality to develop ViViT, an approach to efficiently compute curvature information, in particular curvature noise. It uses the low-rank structure of the generalized Gauss-Newton approximation to the Hessian and addresses shortcomings in existing curvature approximations. Through monitoring curvature noise, we demonstrate how ViViT’s information helps in understanding challenges to make second-order optimization methods work in practice. This work develops new tools to experiment more easily with higher-order information in complex deep learning models. These tools have impacted works on Bayesian applications with Laplace approximations, out-of-distribution generalization, differential privacy, and the design of automatic differentia- tion systems. They constitute one important step towards developing and establishing more efficient deep learning algorithms

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 251, ITCS 2023, Complete Volum
    • …
    corecore