24,302 research outputs found
Mental distress detection and triage in forum posts: the LT3 CLPsych 2016 shared task system
This paper describes the contribution of LT3 for the CLPsych 2016 Shared Task on automatic triage of mental health forum posts. Our systems use multiclass Support Vector Machines (SVM), cascaded binary SVMs and ensembles with a rich feature set. The best systems obtain macro-averaged F-scores of 40% on the full task and 80% on the green versus alarming distinction. Multiclass SVMs with all features score best in terms of F-score, whereas feature filtering with bi-normal separation and classifier ensembling are found to improve recall of alarming posts
Vote-boosting ensembles
Vote-boosting is a sequential ensemble learning method in which the
individual classifiers are built on different weighted versions of the training
data. To build a new classifier, the weight of each training instance is
determined in terms of the degree of disagreement among the current ensemble
predictions for that instance. For low class-label noise levels, especially
when simple base learners are used, emphasis should be made on instances for
which the disagreement rate is high. When more flexible classifiers are used
and as the noise level increases, the emphasis on these uncertain instances
should be reduced. In fact, at sufficiently high levels of class-label noise,
the focus should be on instances on which the ensemble classifiers agree. The
optimal type of emphasis can be automatically determined using
cross-validation. An extensive empirical analysis using the beta distribution
as emphasis function illustrates that vote-boosting is an effective method to
generate ensembles that are both accurate and robust
- …