3,221 research outputs found
An empirical comparison of supervised machine learning techniques in bioinformatics
Research in bioinformatics is driven by the experimental data.
Current biological databases are populated by vast amounts of
experimental data. Machine learning has been widely applied to
bioinformatics and has gained a lot of success in this research
area. At present, with various learning algorithms available in the
literature, researchers are facing difficulties in choosing the best
method that can apply to their data. We performed an empirical
study on 7 individual learning systems and 9 different combined
methods on 4 different biological data sets, and provide some
suggested issues to be considered when answering the following
questions: (i) How does one choose which algorithm is best
suitable for their data set? (ii) Are combined methods better than
a single approach? (iii) How does one compare the effectiveness
of a particular algorithm to the others
A Primer on Bayesian Neural Networks: Review and Debates
Neural networks have achieved remarkable performance across various problem
domains, but their widespread applicability is hindered by inherent limitations
such as overconfidence in predictions, lack of interpretability, and
vulnerability to adversarial attacks. To address these challenges, Bayesian
neural networks (BNNs) have emerged as a compelling extension of conventional
neural networks, integrating uncertainty estimation into their predictive
capabilities.
This comprehensive primer presents a systematic introduction to the
fundamental concepts of neural networks and Bayesian inference, elucidating
their synergistic integration for the development of BNNs. The target audience
comprises statisticians with a potential background in Bayesian methods but
lacking deep learning expertise, as well as machine learners proficient in deep
neural networks but with limited exposure to Bayesian statistics. We provide an
overview of commonly employed priors, examining their impact on model behavior
and performance. Additionally, we delve into the practical considerations
associated with training and inference in BNNs.
Furthermore, we explore advanced topics within the realm of BNN research,
acknowledging the existence of ongoing debates and controversies. By offering
insights into cutting-edge developments, this primer not only equips
researchers and practitioners with a solid foundation in BNNs, but also
illuminates the potential applications of this dynamic field. As a valuable
resource, it fosters an understanding of BNNs and their promising prospects,
facilitating further advancements in the pursuit of knowledge and innovation.Comment: 65 page
Machine Learning in Automated Text Categorization
The automated categorization (or classification) of texts into predefined
categories has witnessed a booming interest in the last ten years, due to the
increased availability of documents in digital form and the ensuing need to
organize them. In the research community the dominant approach to this problem
is based on machine learning techniques: a general inductive process
automatically builds a classifier by learning, from a set of preclassified
documents, the characteristics of the categories. The advantages of this
approach over the knowledge engineering approach (consisting in the manual
definition of a classifier by domain experts) are a very good effectiveness,
considerable savings in terms of expert manpower, and straightforward
portability to different domains. This survey discusses the main approaches to
text categorization that fall within the machine learning paradigm. We will
discuss in detail issues pertaining to three different problems, namely
document representation, classifier construction, and classifier evaluation.Comment: Accepted for publication on ACM Computing Survey
Improving adaptation and interpretability of a short-term traffic forecasting system
Traffic management is being more important than ever, especially in overcrowded big cities with over-pollution problems and with new unprecedented mobility changes. In this scenario, road-traffic prediction plays a key role within Intelligent Transportation Systems, allowing traffic managers to be able to anticipate and take the proper decisions. This paper aims to analyse the situation in a commercial real-time prediction system with its current problems and limitations. The analysis unveils the trade-off between simple parsimonious models and more complex models. Finally, we propose an enriched machine learning framework, Adarules, for the traffic prediction in real-time facing the problem as continuously incoming data streams with all the commonly occurring problems in such volatile scenario, namely changes in the network infrastructure and demand, new detection stations or failure ones, among others. The framework is also able to infer automatically the most relevant features to our end-task, including the relationships within the road network. Although the intention with the proposed framework is to evolve and grow with new incoming big data, however there is no limitation in starting to use it without any prior knowledge as it can starts learning the structure and parameters automatically from data. We test this predictive system in different real-work scenarios, and evaluate its performance integrating a multi-task learning paradigm for the sake of the traffic prediction task.Peer ReviewedPostprint (published version
Adaptive Polynomial Harmonic Distortion Compensation in Current and Voltage Transformers Through Iteratively Updated QR Factorization
Measuring current and voltage harmonics has paramount importance for improving the power quality of distribution grids. However, the achieved accuracy strongly depends on the adopted instrument transformer (IT). This article proposes an adaptive technique that enables an effective compensation of both the filtering behavior and the harmonic distortion (HD) introduced by current and voltage transformers (VTs), namely the strongest nonlinear effect at low-order harmonics. The approach is based on a flexible, linear in the parameters polynomial modeling of HD in the frequency domain. Model complexity can be different from one harmonic to the other, and it is selected through an automatic iterative process to suit the nonlinear behavior at each specific harmonic order, while avoiding overfitting. In particular, the number of parameters is increased by progressively updating the QR factorization of the regressor matrix trough Householder reflections until a convergence condition is reached. Experimental tests performed on an inductive VT and current transformer (CT) highlight the effectiveness of the approach
- …