    Model-Based Evaluation of Multilinguality

    Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy

    In recent years, deep learning has infiltrated every field it has touched, reducing the need for specialist knowledge and automating the process of knowledge discovery from data. This review argues that astronomy is no different, and that we are currently in the midst of a deep learning revolution that is transforming the way we do astronomy. We trace the history of astronomical connectionism from the early days of multilayer perceptrons, through the second wave of convolutional and recurrent neural networks, to the current third wave of self-supervised and unsupervised deep learning. We then predict that we will soon enter a fourth wave of astronomical connectionism, in which finetuned versions of an all-encompassing 'foundation' model will replace expertly crafted deep learning models. We argue that such a model can only be brought about through a symbiotic relationship between astronomy and connectionism, whereby astronomy provides high quality multimodal data to train the foundation model, and in turn the foundation model is used to advance astronomical research. Comment: 60 pages, 269 references, 29 figures. Review submitted to Royal Society Open Science. Comments and feedback welcome.

    Unsupervised Automatic Detection Of Transient Phenomena In InSAR Time-Series using Machine Learning

    The detection and measurement of transient episodes of crustal deformation from global InSAR datasets are crucial for a wide range of solid earth and natural hazard applications. But the large volumes of unlabelled data captured by satellites preclude manual systematic analysis, and the small signal-to-noise ratio makes the task difficult. In this thesis, I present a state-of-the-art, unsupervised and event-agnostic deep-learning-based approach for the automatic identification of transient deformation events in noisy time-series of unwrapped InSAR images. I adopt an anomaly detection framework that learns the ‘normal’ spatio-temporal pattern of noise in the data, and which therefore identifies any transient deformation phenomena that deviate from this pattern as ‘anomalies’. The deep-learning model is built around a bespoke autoencoder that includes convolutional and LSTM layers, as well as a neural network which acts as a bridge between the encoder and decoder. I train the model on real InSAR data from northern Turkey and find it has an overall accuracy and true positive rate of around 85% when trying to detect synthetic deformation signals of length-scale > 350 m and magnitude > 4 cm. Furthermore, I also show the method can detect (1) a real Mw 5.7 earthquake in InSAR data from an entirely different region, SW Turkey, (2) volcanic deformation in Domuyo, Argentina, (3) a synthetic slow-slip event and (4) interseismic deformation around the NAF in a descending frame in northern Turkey. Overall, I show that my method is suitable for automated analysis of large, global InSAR datasets, and for robust detection and separation of deformation signals from nuisance signals in InSAR data.

    Semi-supervised and Active Learning Models for Software Fault Prediction

    As software continues to insinuate itself into nearly every aspect of our lives, the quality of software has become an extremely important issue. Software Quality Assurance (SQA) is a process that ensures the development of high-quality software. It concerns the important problem of maintaining, monitoring, and developing quality software. Accurate detection of fault-prone components in software projects is one of the most commonly practiced techniques that offer a path to high-quality products without excessive assurance expenditures. This type of quality modeling requires the availability of software modules with known fault content developed in a similar environment. However, collection of fault data at module level, particularly in new projects, is expensive and time-consuming. Semi-supervised learning and active learning offer solutions to this problem by learning from limited labeled data while utilizing inexpensive unlabeled data. In this dissertation, we investigate semi-supervised learning and active learning approaches in the software fault prediction problem. The role of the base learner in semi-supervised learning is discussed using several state-of-the-art supervised learners. Our results showed that semi-supervised learning with an appropriate base learner leads to better performance in fault proneness prediction compared to supervised learning. In addition, incorporating a pre-processing technique prior to semi-supervised learning provides a promising direction for further improving the prediction performance. Active learning, which shares with semi-supervised learning the idea of utilizing unlabeled data, requires human effort for labeling fault proneness in its learning process. Empirical results showed that active learning supplemented by a dimensionality reduction technique performs better than supervised learning on release-based data sets.