    What's so special about BERT's layers? A closer look at the NLP pipeline in monolingual and multilingual models

    Experiments with transfer learning on pre-trained language models such as BERT have shown that the layers of these models resemble the classical NLP pipeline, with progressively more complex tasks being concentrated in later layers of the network. We investigate to what extent these results also hold for a language other than English. To this end, we probe a Dutch BERT-based model and the multilingual BERT model for Dutch NLP tasks. In addition, by considering the task of part-of-speech tagging in more detail, we show that, even within a given task, information is spread over different parts of the network and the pipeline might not be as neat as it seems. Each layer has different specialisations, and it is therefore useful to combine information from different layers for best results, instead of selecting a single layer based on the best overall performance.
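The abstract recommends combining information from several layers rather than picking one. A minimal sketch of one common way to do this is a softmax-weighted sum of per-layer vectors. This is an illustrative assumption, not necessarily the combination the authors used; `mix_layers` and `layer_scores` are hypothetical names, and in practice the scores would be learned together with the probing classifier.

```python
import math


def softmax(scores):
    """Turn arbitrary real scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_layers(layer_vectors, layer_scores):
    """Combine one token's per-layer vectors into a single vector.

    layer_vectors: one vector (list of floats) per layer, all the same length.
    layer_scores:  one score per layer (hypothetical; normally learned).
    """
    if len(layer_vectors) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    dim = len(layer_vectors[0])
    return [
        sum(w * vec[i] for w, vec in zip(weights, layer_vectors))
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two toy "layers" with equal scores: the result is their plain average.
    print(mix_layers([[1.0, 0.0], [3.0, 2.0]], [0.0, 0.0]))
```

With equal scores the mix reduces to a plain average of the layers; a very large score on one layer approaches selecting that single layer, so single-layer selection is a special case of this scheme.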

    AdaSampling for positive-unlabeled and label noise learning with bioinformatics applications

    © 2018 IEEE. Class labels are required for supervised learning but may be corrupted or missing in various applications. In binary classification, for example, when only a subset of positive instances is labeled whereas the remaining are unlabeled, positive-unlabeled (PU) learning is required to model from both positive and unlabeled data. Similarly, when class labels are corrupted by mislabeled instances, methods are needed for learning in the presence of class label noise (LN). Here we propose adaptive sampling (AdaSampling), a framework for both PU learning and learning with class LN. By iteratively estimating the class mislabeling probability with an adaptive sampling procedure, the proposed method progressively reduces the risk of selecting mislabeled instances for model training and subsequently constructs highly generalizable models even when a large proportion of mislabeled instances is present in the data. We demonstrate the utility of the proposed methods using simulation and benchmark data, and compare them to alternative approaches that are commonly used for PU learning and/or learning with LN. We then introduce two novel bioinformatics applications where AdaSampling is used to: 1) identify kinase-substrates from mass spectrometry-based phosphoproteomics data and 2) predict transcription factor target genes by integrating various next-generation sequencing data.
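The iterative idea in this abstract (estimate how likely each instance is mislabeled, then sample training instances accordingly) can be sketched for the PU case as follows. This is a simplified illustration under stated assumptions, not the published implementation: the one-dimensional nearest-centroid base classifier, the function names, and the iteration count are all hypothetical choices made to keep the example self-contained.

```python
import math
import random


def prob_positive(x, mu_pos, mu_neg):
    """Logistic score from squared distances to the two class centroids."""
    z = (x - mu_neg) ** 2 - (x - mu_pos) ** 2
    z = max(-50.0, min(50.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def ada_sampling_pu(positives, unlabeled, n_iter=5, seed=0):
    """Estimate, for each unlabeled point, the probability that it is positive.

    Unlabeled points start out treated as negatives. In each round, negatives
    are sampled with weight 1 - P(positive), so points that look like hidden
    positives are selected less and less often for training.
    """
    rng = random.Random(seed)
    p_pos = [0.0] * len(unlabeled)
    mu_pos = sum(positives) / len(positives)
    for _ in range(n_iter):
        # Small floor keeps the total weight strictly positive.
        weights = [max(1.0 - p, 1e-6) for p in p_pos]
        sampled_neg = rng.choices(unlabeled, weights=weights, k=len(unlabeled))
        mu_neg = sum(sampled_neg) / len(sampled_neg)
        p_pos = [prob_positive(x, mu_pos, mu_neg) for x in unlabeled]
    return p_pos


if __name__ == "__main__":
    labeled_pos = [1.8, 2.0, 2.2, 2.1]
    # First three unlabeled points are hidden positives, the rest negatives.
    unlabeled = [2.0, 1.9, 2.3, -2.0, -1.9, -2.2, -2.1, -1.8]
    for x, p in zip(unlabeled, ada_sampling_pu(labeled_pos, unlabeled)):
        print(f"x={x:+.1f}  P(positive)={p:.3f}")
```

The same loop would extend to label noise by keeping a mislabeling probability for both classes and sampling each class with its own weights; any real use would replace the toy centroid model with a proper classifier.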

    Fuzzy-rough Classifier Ensemble Selection


    Recent Advances in Machine Learning Applied to Ultrasound Imaging

    Machine learning (ML) methods are pervading an increasing number of fields of application because of their capacity to effectively solve a wide variety of challenging problems. The employment of ML techniques in ultrasound imaging applications started several years ago but the scientific interest in this issue has increased exponentially in the last few years. The present work reviews the most recent (2019 onwards) implementations of machine learning techniques for two of the most popular ultrasound imaging fields, medical diagnostics and non-destructive evaluation. The former, which covers the major part of the review, was analyzed by classifying studies according to the human organ investigated and the methodology (e.g., detection, segmentation, and/or classification) adopted, while for the latter, some solutions to the detection/classification of material defects or particular patterns are reported. Finally, the main merits of machine learning that emerged from the study analysis are summarized and discussed. © 2022 by the authors. Licensee MDPI, Basel, Switzerland

    Target classification with simple infrared sensors using artificial neural networks

    This study investigates the use of low-cost infrared (IR) sensors for the determination of geometry and surface properties of commonly encountered features or targets in indoor environments, such as planes, corners, edges, and cylinders using artificial neural networks (ANNs). The intensity measurements obtained from such sensors are highly dependent on the location, geometry, and surface properties of the reflecting target in a way which cannot be represented by a simple analytical relationship, therefore complicating the localization and classification process. We propose the use of angular intensity scans and feature vectors obtained by modeling of angular intensity scans and present two different neural network based approaches in order to classify the geometry and/or the surface type of the targets. In the first case, where planes, 90° corners, and 90° edges covered with aluminum, white cloth, and Styrofoam packaging material are differentiated, an average correct classification rate of 78% of both geometry and surface over all target types is achieved. In the second case, where planes, 90° edges, and cylinders covered with different surface materials are differentiated, an average correct classification rate of 99.5% is achieved. The method demonstrated shows that ANNs can be used to extract substantially more information than IR sensors are commonly employed for. © 2008 IEEE

    Statistical pattern recognition techniques for target differentiation using infrared sensor

    This study compares the performance of various statistical pattern recognition techniques for the differentiation of commonly encountered features in indoor environments, possibly with different surface properties, using simple infrared (IR) sensors. The intensity measurements obtained from such sensors are highly dependent on the location, geometry, and surface properties of the reflecting feature in a way that cannot be represented by a simple analytical relationship, therefore complicating the differentiation process. We construct feature vectors based on the parameters of angular IR intensity scans from different targets to determine their geometry type. A mixture-of-normals classifier with three components correctly differentiates three types of geometries with different surface properties, resulting in the best performance (100%) in geometry differentiation. The results indicate that the geometrical properties of the targets are more distinctive than their surface properties, and surface recognition is the limiting factor in differentiation. The results demonstrate that simple IR sensors, when coupled with appropriate processing and recognition techniques, can be used to extract substantially more information than such devices are commonly employed for. © 2006 IEEE

    Comparative analysis of different approaches to target classification and localization with sonar

    Different classification and fusion techniques were compared for target classification and localization with sonar. The target localization performance of artificial neural networks (ANN) was found to be better than that of the target differentiation algorithm (TDA) and fusion techniques. Among statistical pattern recognition techniques, the target classification performance of non-parametric approaches was better than that of the parameterized density estimator (PDE) using homoscedastic and heteroscedastic NM.