1,706 research outputs found
An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results
In Ordinal Classification tasks, items have to be assigned to classes that
have a relative ordering, such as positive, neutral, negative in sentiment
analysis. Remarkably, the most popular evaluation metrics for ordinal
classification tasks either ignore relevant information (for instance,
precision/recall on each of the classes ignores their relative ordering) or
assume additional information (for instance, Mean Average Error assumes
absolute distances between classes). In this paper we propose a new metric for
Ordinal Classification, the Closeness Evaluation Measure, that is rooted in
Measurement Theory and Information Theory. Our theoretical analysis and
experimental results over both synthetic data and data from NLP shared tasks
indicate that the proposed metric captures quality aspects from different
traditional tasks simultaneously. In addition, it generalizes some popular
classification (nominal scale) and error minimization (interval scale) metrics,
depending on the measurement scale in which it is instantiated.
Comment: To appear in Proceedings of ACL 202
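The point about error measures assuming absolute distances between classes can be illustrated with a minimal sketch. It is not the proposed metric; it only averages absolute differences between numeric label codes, and both encodings below are hypothetical. The two encodings keep the same class order, yet they disagree on whether the two made-up systems are equally good.

```python
def mean_absolute_error(gold, pred, encoding):
    """Average absolute difference after mapping each label to a number."""
    diffs = [abs(encoding[g] - encoding[p]) for g, p in zip(gold, pred)]
    return sum(diffs) / len(diffs)

# Toy data (made up for illustration).
gold = ["negative", "neutral", "positive", "positive"]
system_a = ["neutral", "neutral", "positive", "positive"]   # one negative -> neutral error
system_b = ["negative", "neutral", "neutral", "positive"]   # one positive -> neutral error

# Two hypothetical encodings that preserve the same ordering of classes.
equal_steps = {"negative": 0, "neutral": 1, "positive": 2}
skewed_steps = {"negative": 0, "neutral": 1, "positive": 5}

for name, enc in [("equal", equal_steps), ("skewed", skewed_steps)]:
    a = mean_absolute_error(gold, system_a, enc)
    b = mean_absolute_error(gold, system_b, enc)
    print(name, a, b)
```

With equal steps the two systems tie; with the skewed encoding system A looks better, although only the assumed distances changed.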
Improving average ranking precision in user searches for biomedical research datasets
Availability of research datasets is a keystone for health and life science
study reproducibility and scientific progress. Due to the heterogeneity and
complexity of these data, a main challenge to be overcome by research data
management systems is to provide users with the best answers for their search
queries. In the context of the 2016 bioCADDIE Dataset Retrieval Challenge, we
investigate a novel ranking pipeline to improve the search of datasets used in
biomedical experiments. Our system comprises a query expansion model based on
word embeddings, a similarity measure algorithm that takes into consideration
the relevance of the query terms, and a dataset categorisation method that
boosts the rank of datasets matching query constraints. The system was
evaluated using a corpus with 800k datasets and 21 annotated user queries. Our
system provides competitive results when compared to the other challenge
participants. In the official run, it achieved the highest infAP among the
participants, 22.3% higher than the median infAP of the participants'
best submissions. Overall, it ranks in the top 2 if an aggregated metric using
the best official measures per participant is considered. The query expansion
method had a positive impact on the system's performance, improving our
baseline by up to 5.0% and 3.4% for the infAP and infNDCG metrics, respectively.
Our similarity measure algorithm seems to be robust, in particular compared to
the Divergence From Randomness framework, showing smaller performance variations
under different training conditions. Finally, the result categorization did not
have a significant impact on the system's performance. We believe that our
solution could be used to enhance biomedical dataset management systems. In
particular, the use of data-driven query expansion methods could be an
alternative to the complexity of biomedical terminologies.
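As a rough illustration of query expansion based on word embeddings, the sketch below adds to each query term its nearest neighbours by cosine similarity. The vectors in TOY_EMBEDDINGS, the function names, and the k and min_sim parameters are all hypothetical; the abstract does not describe the actual model, so this should not be read as the authors' method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def expand_query(terms, embeddings, k=1, min_sim=0.8):
    """Append up to k nearest neighbours of each original term."""
    expanded = list(terms)
    for term in terms:
        if term not in embeddings:
            continue  # out-of-vocabulary terms are kept but not expanded
        scored = sorted(
            ((cosine(embeddings[term], vec), word)
             for word, vec in embeddings.items()
             if word not in expanded),
            reverse=True,
        )
        for sim, word in scored[:k]:
            if sim >= min_sim:
                expanded.append(word)
    return expanded

# Hypothetical 3-dimensional embeddings, made up for illustration.
TOY_EMBEDDINGS = {
    "cancer": [0.9, 0.1, 0.0],
    "tumor": [0.85, 0.15, 0.05],
    "carcinoma": [0.8, 0.2, 0.0],
    "mouse": [0.1, 0.9, 0.1],
    "murine": [0.15, 0.85, 0.1],
    "protein": [0.0, 0.1, 0.9],
}

print(expand_query(["cancer", "mouse"], TOY_EMBEDDINGS))
```

The min_sim cut-off is one simple way to keep unrelated neighbours out of the expanded query when a term has no close match.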
REINA at RepLab2013 Topic Detection Task: Community Detection
Social networks have become a large repository of comments from which many kinds of information can be extracted. Twitter is one of the largest and most widespread social networks and is therefore an important source for detecting states of opinion, events and happenings, even before the mainstream media does. Topic detection is important for discovering areas of interest that arise in the tweets. We built a similarity matrix using classical systems and then applied community detection techniques. The results have been good and allow us to study new possibilities.
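The two steps named in the abstract, a similarity matrix followed by community detection, can be sketched as follows. Everything here is an assumption: the bag-of-words cosine similarity, the threshold, and the use of connected components of the thresholded graph as a deliberately simple stand-in for a real community detection algorithm. The abstract does not say which techniques were actually used.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def similarity_matrix(texts):
    """Pairwise cosine similarity over whitespace-tokenised texts."""
    vecs = [Counter(t.lower().split()) for t in texts]
    return [[cosine(u, v) for v in vecs] for u in vecs]

def communities(sim, threshold=0.3):
    """Group items linked by similarity >= threshold (connected components)."""
    n = len(sim)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if not seen[j] and sim[i][j] >= threshold:
                    seen[j] = True
                    stack.append(j)
        groups.append(sorted(members))
    return groups

# Made-up tweets for illustration.
tweets = [
    "new phone battery lasts long",
    "phone battery drains fast",
    "bank raises loan interest rates",
    "loan interest rates too high",
]
print(communities(similarity_matrix(tweets)))
```

Each resulting group of tweet indices plays the role of one detected topic.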
Learning to classify software defects from crowds: a novel approach
In software engineering, associating each reported defect with a category allows, among many other things, for the appropriate allocation of resources. Although this classification task can be automated using standard machine learning techniques, the categorization of defects for model training requires expert knowledge, which is not always available. To circumvent this dependency, we propose to apply the learning from crowds paradigm, where training categories are obtained from multiple non-expert annotators (and so may be incomplete, noisy or erroneous) and classifiers are efficiently learnt from this subjective class information. To illustrate our proposal, we present two real applications of IBM's orthogonal defect classification to the issue tracking systems of two different real domains. Bayesian network classifiers learnt using two state-of-the-art methodologies from data labeled by a crowd of annotators are used to predict the category (impact) of reported software defects. The considered methodologies show enhanced performance compared with the straightforward solution (majority voting) according to different metrics. This shows the possibilities of using non-expert knowledge aggregation techniques when expert knowledge is unavailable.
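The majority voting baseline mentioned above can be sketched in a few lines. The label names and the alphabetical tie-break are assumptions made for illustration; the abstract does not specify how ties or unlabelled defects are handled.

```python
from collections import Counter

def majority_vote(annotations):
    """Return one label per defect: the most frequent crowd label.

    annotations: list of lists, one list of annotator labels per defect.
    Defects with no labels get None; ties are broken alphabetically
    (an arbitrary but deterministic choice).
    """
    result = []
    for labels in annotations:
        if not labels:
            result.append(None)
            continue
        counts = Counter(labels)
        top = max(counts.values())
        winners = sorted(label for label, c in counts.items() if c == top)
        result.append(winners[0])
    return result

# Made-up crowd labels for four defects.
crowd = [
    ["reliability", "reliability", "usability"],
    ["performance", "usability"],   # tie
    [],                             # nobody labelled this defect
    ["usability"],
]
print(majority_vote(crowd))
```

The aggregated labels could then be used as training categories for any standard classifier, which is the straightforward solution that the learning from crowds methods are compared against.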
Bagged ensemble of Fuzzy C-Means classifiers for nuclear transient identification
This paper presents an ensemble-based scheme for nuclear transient identification. The approach adopted to construct the ensemble of classifiers is bagging; the novelty consists in using supervised fuzzy C-means (FCM) classifiers as base classifiers of the ensemble. The performance of the proposed classification scheme has been verified by comparison with a single supervised, evolutionary-optimized FCM classifier with respect to the task of classifying artificial datasets. The results obtained indicate that in the cases of datasets of large or very small sizes and/or complex decision boundaries, the bagging ensembles can improve classification accuracy. Then, the approach has been applied to the identification of simulated transients in the feedwater system of a boiling water reactor (BWR).
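Bagging itself can be sketched as training each base classifier on a bootstrap resample and combining predictions by majority vote. In the sketch below a nearest-centroid classifier is swapped in for the supervised FCM base classifier, which is not reproduced here; the data, class names, and parameters are likewise made up for illustration.

```python
import random
from collections import Counter

def fit_centroids(points, labels):
    """Base learner (stand-in for supervised FCM): one mean vector per class."""
    sums, counts = {}, Counter()
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for d, v in enumerate(p):
            acc[d] += v
        counts[y] += 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict_centroid(centroids, p):
    """Assign p to the class with the closest centroid (squared Euclidean)."""
    return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], p)))

def fit_bagging(points, labels, n_models=15, seed=0):
    """Train n_models base learners, each on a bootstrap resample."""
    rng = random.Random(seed)
    n = len(points)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_centroids([points[i] for i in idx], [labels[i] for i in idx]))
    return models

def predict_bagging(models, p):
    """Majority vote over the base learners' predictions."""
    votes = Counter(predict_centroid(m, p) for m in models)
    return votes.most_common(1)[0][0]

# Two well-separated synthetic clusters with hypothetical class names.
points = [(0.1 * i, 0.1 * (i % 3)) for i in range(10)]
points += [(5 + 0.1 * i, 5 + 0.1 * (i % 3)) for i in range(10)]
labels = ["normal"] * 10 + ["transient"] * 10

models = fit_bagging(points, labels)
print(predict_bagging(models, (0.2, 0.1)), predict_bagging(models, (5.3, 5.1)))
```

On data this easy the ensemble offers no advantage over a single classifier; the sketch only shows the resample-and-vote mechanics.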