COTA: Improving the Speed and Accuracy of Customer Support through Ranking and Deep Networks
For a company looking to provide delightful user experiences, it is of
paramount importance to take care of customer issues promptly. This paper
proposes COTA, a system to improve the speed and reliability of customer
support for end users through automated ticket classification and answer
selection for support
representatives. Two machine learning and natural language processing
techniques are demonstrated: one relying on feature engineering (COTA v1) and
the other exploiting raw signals through deep learning architectures (COTA v2).
COTA v1 employs a new approach that converts the multi-class classification
task into
a ranking problem, demonstrating significantly better performance in the case
of thousands of classes. For COTA v2, we propose an Encoder-Combiner-Decoder, a
novel deep learning architecture that allows for heterogeneous input and output
feature types and injection of prior knowledge through network architecture
choices. This paper compares these models and their variants on the task of
ticket classification and answer selection, showing model COTA v2 outperforms
COTA v1, and analyzes their inner workings and shortcomings. Finally, an A/B
test conducted in a production setting validates the real-world impact of
COTA: it reduces issue resolution time by 10 percent without reducing customer
satisfaction.
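The ranking reformulation behind COTA v1 can be illustrated with a toy sketch. Everything below is invented for illustration (the class names, the keyword catalog, and the overlap scorer); the actual system learns a scorer over engineered features. The key idea survives: instead of predicting one of thousands of classes directly, score each (ticket, candidate-class) pair and rank the candidates.

```python
def pair_score(ticket_text: str, class_keywords: set) -> float:
    """Score a (ticket, class) pair by keyword overlap (toy stand-in
    for a learned pointwise ranking model)."""
    tokens = set(ticket_text.lower().split())
    if not class_keywords:
        return 0.0
    return len(tokens & class_keywords) / len(class_keywords)

def rank_classes(ticket_text, class_catalog):
    """Rank all candidate classes for a ticket; the top entry is the
    predicted class, and the full ranking can be shown to a support agent."""
    scored = [(name, pair_score(ticket_text, kws))
              for name, kws in class_catalog.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical ticket-type catalog with a few keywords per class.
catalog = {
    "payment_issue": {"charge", "card", "refund", "payment"},
    "driver_feedback": {"driver", "rude", "route"},
    "lost_item": {"lost", "phone", "left", "item"},
}
ranking = rank_classes("I left my phone in the car", catalog)
```

With thousands of classes, a pairwise or pointwise ranker like this scales by scoring only a shortlist of candidate classes per ticket rather than emitting one giant softmax.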
End-to-end Feature Selection Approach for Learning Skinny Trees
Joint feature selection and tree ensemble learning is a challenging task.
Popular tree ensemble toolkits, e.g., Gradient Boosted Trees and Random
Forests, support feature selection post-training based on feature importances,
which are known to be misleading and can significantly hurt performance. We
propose
Skinny Trees: a toolkit for feature selection in tree ensembles, in which
feature selection and tree ensemble learning occur simultaneously. It is based
on an end-to-end optimization approach that considers feature selection in
differentiable trees with group regularization. We optimize
with a first-order proximal method and present convergence guarantees for a
non-convex and non-smooth objective. Interestingly, dense-to-sparse
regularization scheduling can lead to more expressive and sparser tree
ensembles than the vanilla proximal method. On 15 synthetic and real-world
datasets, Skinny Trees achieve high feature compression rates, leading to
faster inference over dense trees without any loss in performance. Skinny
Trees also deliver better feature selection than many existing toolkits: in
terms of AUC at a fixed feature budget, Skinny Trees outperform both LightGBM
and Random Forests.
Comment: Preprint
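The proximal step for a group penalty can be sketched concretely. The block soft-thresholding operator below is the standard proximal operator of the group-lasso penalty; the paper's exact regularizer and its dense-to-sparse schedule are not specified here, so treat this as the generic mechanism by which whole feature groups are zeroed out, not as the authors' implementation.

```python
import math

def group_soft_threshold(w, lam):
    """Proximal operator of the group-lasso penalty lam * ||w||_2:
    shrink the whole group toward zero, and zero it out entirely when
    its norm falls below lam -- this is what removes a feature wholesale."""
    norm = math.sqrt(sum(x * x for x in w))
    if norm <= lam:
        return [0.0] * len(w)
    scale = 1.0 - lam / norm
    return [scale * x for x in w]

# One proximal step over per-feature weight groups (values are arbitrary):
# a gradient step on the smooth loss would precede this in full training.
groups = [[0.9, -0.4], [0.05, 0.02], [1.5, 0.1]]
lam = 0.2
sparsified = [group_soft_threshold(g, lam) for g in groups]
```

The second group's norm is below the threshold, so the entire group (and hence the corresponding feature) is eliminated, while the surviving groups are only shrunk.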
Feature Selection in the Contrastive Analysis Setting
Contrastive analysis (CA) refers to the exploration of variations uniquely
enriched in a target dataset as compared to a corresponding background dataset
generated from sources of variation that are irrelevant to a given task. For
example, a biomedical data analyst may wish to find a small set of genes to use
as a proxy for variations in genomic data only present among patients with a
given disease (target) as opposed to healthy control subjects (background).
However, the problem of feature selection in the CA setting has so far
received little attention from the machine learning community. In this work we
present contrastive feature selection (CFS), a method for performing feature
selection in the CA setting. We motivate our approach with a novel
information-theoretic analysis of representation learning in the CA setting,
and we empirically validate CFS on a semi-synthetic dataset and four real-world
biomedical datasets. We find that our method consistently outperforms
previously proposed state-of-the-art supervised and fully unsupervised feature
selection methods not designed for the CA setting. An open-source
implementation of our method is available at https://github.com/suinleelab/CFS.
Comment: NeurIPS 202
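The contrastive-analysis intuition can be sketched with a crude stand-in criterion: score each feature by how much it varies in the target data relative to the background data. This is only an illustration of the setting, not CFS's information-theoretic criterion; see the linked repository for the actual method.

```python
from statistics import pvariance

def contrastive_scores(target_rows, background_rows):
    """Score each feature by the ratio of its variance in the target
    dataset to its variance in the background dataset. Features enriched
    in the target (e.g., disease-specific genes) should vary there but
    stay flat in the background (healthy controls)."""
    n_features = len(target_rows[0])
    scores = []
    for j in range(n_features):
        t_var = pvariance([row[j] for row in target_rows])
        b_var = pvariance([row[j] for row in background_rows])
        scores.append(t_var / (b_var + 1e-9))  # epsilon avoids div by zero
    return scores

# Toy data: feature 0 varies only in the target; feature 1 varies in both.
target = [[0.0, 1.0], [5.0, 2.0], [10.0, 3.0]]
background = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
scores = contrastive_scores(target, background)
best = max(range(len(scores)), key=scores.__getitem__)
```

A variance ratio like this fails when target-specific variation is nonlinear or spread across correlated features, which is precisely where a learned, information-theoretic selector is needed.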
Machine learning from real data: A mental health registry case study
Imbalanced datasets can impair the learning performance of many Machine Learning techniques. Nevertheless, many real-world datasets, especially in the healthcare field, are inherently imbalanced. For instance, in the medical domain, the classes representing a specific disease are typically the minority of the total cases. This challenge justifies the substantial research effort spent in the past decades to tackle data imbalance at the data and algorithm levels. In this paper, we describe the strategies we used to deal with an imbalanced classification task on data extracted from a database generated from the Electronic Health Records of the Mental Health Service of the Ferrara Province, Italy. In particular, we applied balancing techniques to the original data, such as random undersampling and oversampling, and the Synthetic Minority Oversampling Technique for Nominal and Continuous (SMOTE-NC). In order to assess the effectiveness of the balancing techniques on the classification task at hand, we applied different Machine Learning algorithms. We employed cost-sensitive learning as well and compared its results with those of the balancing methods. Furthermore, a feature selection analysis was conducted to investigate the relevance of each feature. Results show that balancing can help find the best setting to accomplish classification tasks. Since real-world imbalanced datasets are increasingly becoming the core of scientific research, further studies are needed to improve already existing techniques.
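The simplest of the balancing techniques mentioned, random oversampling, can be sketched in a few lines. This is deliberately not SMOTE-NC, which synthesizes new samples by interpolating continuous features and taking the most frequent category among neighbors for nominal ones; duplicating minority samples is the minimal baseline that the more elaborate methods improve on.

```python
import random

def random_oversample(X, y, seed=0):
    """Randomly duplicate minority-class samples until every class
    matches the size of the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target_size = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        padded = rows + [rng.choice(rows) for _ in range(target_size - len(rows))]
        X_out.extend(padded)
        y_out.extend([label] * target_size)
    return X_out, y_out

# Toy imbalanced data: five majority samples, one minority sample.
X = [[0], [1], [2], [3], [4], [9]]
y = [0, 0, 0, 0, 0, 1]
Xb, yb = random_oversample(X, y)
```

Oversampling should be applied only to the training split, never before the train/test split, or the duplicated minority samples leak into evaluation and inflate the measured performance.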
Boosting decision stumps for dynamic feature selection on data streams
Feature selection targets the identification of which features of a dataset are relevant to the learning task. It is also widely used to improve computation times, reduce computation requirements, decrease the impact of the curse of dimensionality, and enhance the generalization rates of classifiers. In data streams, classifiers benefit from all of the above, but more importantly, from the fact that the relevant subset of features may drift over time. In this paper, we propose a novel dynamic feature selection method for data streams called Adaptive Boosting for Feature Selection (ABFS). ABFS chains decision stumps and drift detectors, and as a result, identifies which features are relevant to the learning task as the stream progresses with reasonable success. In addition to our proposed algorithm, we bring feature selection-specific metrics from batch learning to streaming scenarios. Next, we evaluate ABFS according to these metrics in both synthetic and real-world scenarios. As a result, ABFS improves the classification rates of different types of learners and makes better use of computational resources.
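The "decision stump plus drift detector" pairing can be sketched as a toy. The stump and the naive error-window drift check below are invented for illustration; ABFS itself boosts chained stumps and uses a proper drift detector, but the test-then-train loop and the "re-select features when drift is flagged" trigger are the same shape.

```python
class DriftAwareStump:
    """Toy sketch: a one-feature decision stump paired with a naive drift
    check. When the recent error rate rises well above the historical rate,
    drift is flagged and the feature/threshold would be re-selected."""

    def __init__(self, feature, threshold, window=50):
        self.feature = feature
        self.threshold = threshold
        self.window = window
        self.errors = []  # rolling 0/1 record of prediction errors

    def predict(self, x):
        return 1 if x[self.feature] > self.threshold else 0

    def update(self, x, y):
        """Test-then-train step: record the error, report suspected drift."""
        self.errors.append(int(self.predict(x) != y))
        recent = self.errors[-self.window:]
        total = self.errors
        return (len(total) >= 2 * self.window and
                sum(recent) / len(recent) > 2 * sum(total) / len(total))

# Simulate a stream: 100 samples of one concept, then the labels flip.
stump = DriftAwareStump(feature=0, threshold=0.5)
flags = []
for i in range(200):
    x = [i % 2]
    y = (1 if x[0] > 0.5 else 0) if i < 100 else (0 if x[0] > 0.5 else 1)
    flags.append(stump.update(x, y))
```

Before the concept change the stump is always correct and no drift is flagged; after the labels flip, the recent-window error rate quickly exceeds the historical rate and the flag fires.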
ReConTab: Regularized Contrastive Representation Learning for Tabular Data
Representation learning stands as one of the critical machine learning
techniques across various domains. Through the acquisition of high-quality
features, pre-trained embeddings significantly reduce input space redundancy,
benefiting downstream pattern recognition tasks such as classification,
regression, or detection. Nonetheless, in the domain of tabular data, feature
engineering and selection still heavily rely on manual intervention, leading to
time-consuming processes and necessitating domain expertise. In response to
this challenge, we introduce ReConTab, a deep automatic representation learning
framework with regularized contrastive learning. Agnostic to any type of
modeling task, ReConTab constructs an asymmetric autoencoder based on the same
raw features from model inputs, producing low-dimensional representative
embeddings. Specifically, regularization techniques are applied for raw feature
selection. Meanwhile, ReConTab leverages contrastive learning to distill the
most pertinent information for downstream tasks. Experiments conducted on
extensive real-world datasets substantiate the framework's capacity to yield
substantial and robust performance improvements. Furthermore, we empirically
demonstrate that pre-trained embeddings can seamlessly integrate as easily
adaptable features, enhancing the performance of various traditional methods
such as XGBoost and Random Forest.
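The "regularized autoencoder with built-in raw-feature selection" idea can be sketched in miniature. The linear autoencoder with an L1 penalty on the encoder below is only a toy under assumed sizes and rates; ReConTab itself uses an asymmetric deep autoencoder plus contrastive learning, neither of which is reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 8))           # 64 tabular rows, 8 raw features
W_enc = 0.1 * rng.standard_normal((8, 3))  # encoder: 8 features -> 3 dims
W_dec = 0.1 * rng.standard_normal((3, 8))  # decoder: 3 dims -> 8 features
lr, lam = 0.01, 1e-3

losses = []
for _ in range(300):
    Z = X @ W_enc                  # low-dimensional embedding
    R = Z @ W_dec - X              # reconstruction residual
    losses.append(float((R ** 2).mean()))
    # Scaled gradients of the reconstruction loss, plus an L1 subgradient
    # on the encoder so that unused raw features get rows driven to zero.
    grad_dec = 2 * Z.T @ R / len(X)
    grad_enc = 2 * X.T @ R @ W_dec.T / len(X) + lam * np.sign(W_enc)
    W_enc -= lr * grad_enc
    W_dec -= lr * grad_dec

# Row norms of the encoder indicate which raw features the embedding uses.
importance = np.linalg.norm(W_enc, axis=1)
```

After training, `Z` plays the role of the pre-trained embedding that downstream models such as XGBoost or Random Forest would consume in place of, or alongside, the raw columns.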