Generative and Discriminative Text Classification with Recurrent Neural Networks
We empirically characterize the performance of discriminative and generative
LSTM models for text classification. We find that although RNN-based generative
models are more powerful than their bag-of-words ancestors (e.g., they account
for conditional dependencies across words in a document), they have higher
asymptotic error rates than discriminatively trained RNN models. However, we
also find that generative models approach their asymptotic error rate more
rapidly than their discriminative counterparts---the same pattern that Ng &
Jordan (2001) proved holds for linear classification models that make more
naive conditional independence assumptions. Building on this finding, we
hypothesize that RNN-based generative classification models will be more robust
to shifts in the data distribution. This hypothesis is confirmed in a series of
experiments in zero-shot and continual learning settings that show that
generative models substantially outperform discriminative models.
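The generative decision rule discussed in this abstract can be illustrated with a minimal sketch. It assumes a class-conditional unigram model with add-one smoothing (a bag-of-words stand-in, not the LSTM models the paper studies), and the function names `train_generative`, `log_joint`, and `classify` are hypothetical. A generative classifier picks the label y that maximizes log p(y) + log p(x | y), whereas a discriminative model would estimate p(y | x) directly.

```python
import math
from collections import Counter


def train_generative(docs, labels):
    """Collect class counts and per-class word counts (assumed unigram model)."""
    vocab = {w for doc in docs for w in doc}
    class_counts = Counter(labels)
    word_counts = {y: Counter() for y in class_counts}
    for doc, y in zip(docs, labels):
        word_counts[y].update(doc)
    return vocab, class_counts, word_counts


def log_joint(doc, y, model):
    """Return log p(y) + log p(doc | y) with add-one smoothing."""
    vocab, class_counts, word_counts = model
    total_docs = sum(class_counts.values())
    total_words = sum(word_counts[y].values())
    score = math.log(class_counts[y] / total_docs)
    for w in doc:
        if w in vocab:  # words never seen in training are skipped
            score += math.log((word_counts[y][w] + 1) / (total_words + len(vocab)))
    return score


def classify(doc, model):
    """Generative rule: choose the class with the highest joint log-probability."""
    class_counts = model[1]
    return max(class_counts, key=lambda y: log_joint(doc, y, model))
```

Under these assumptions, adding a new class only requires fitting one more class-conditional model, which gives some intuition for why generative classifiers may suit the continual learning settings the abstract mentions.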
Hash Embeddings for Efficient Word Representations
We present hash embeddings, an efficient method for representing words in a
continuous vector form. A hash embedding may be seen as an interpolation
between a standard word embedding and a word embedding created using a random
hash function (the hashing trick). In hash embeddings each token is represented
by k d-dimensional embedding vectors and one k-dimensional weight
vector. The final d-dimensional representation of the token is the product of
the two. Rather than fitting the embedding vectors for each token these are
selected by the hashing trick from a shared pool of embedding vectors. Our
experiments show that hash embeddings can easily deal with huge vocabularies
consisting of millions of tokens. When using a hash embedding there is no need
to create a dictionary before training nor to perform any kind of vocabulary
pruning after training. We show that models trained using hash embeddings
exhibit at least the same level of performance as models trained using regular
embeddings across a wide range of tasks. Furthermore, the number of parameters
needed by such an embedding is only a fraction of what is required by a regular
embedding. Since standard embeddings and embeddings constructed using the
hashing trick are actually just special cases of a hash embedding, hash
embeddings can be considered an extension and improvement over the existing
regular embedding types.
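The lookup described in this abstract can be sketched as follows, assuming each token selects k component vectors of dimension d from a shared pool and combines them with a k-dimensional weight vector. The class name `HashEmbedding`, the salted CRC32 hash family, the hashed weight table, and all sizes are illustrative assumptions, not details taken from the paper, and no training is shown.

```python
import random
import zlib


class HashEmbedding:
    """Illustrative hash-embedding lookup (assumed layout, untrained)."""

    def __init__(self, num_buckets, num_hashes, dim, num_weight_rows, seed=0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        self.num_hashes = num_hashes          # k in the description above
        self.dim = dim                        # d in the description above
        self.num_weight_rows = num_weight_rows
        # Shared pool of component vectors, selected by hashing.
        self.pool = [[rng.gauss(0.0, 0.1) for _ in range(dim)]
                     for _ in range(num_buckets)]
        # One k-dimensional weight vector per weight row (assumed to be hashed too).
        self.weights = [[rng.gauss(0.0, 0.5) for _ in range(num_hashes)]
                        for _ in range(num_weight_rows)]

    @staticmethod
    def _hash(token, salt, modulus):
        # Salted CRC32 stands in for a family of independent hash functions.
        return zlib.crc32(f"{salt}:{token}".encode("utf-8")) % modulus

    def component_ids(self, token):
        """Indices of the k pool vectors selected for this token."""
        return [self._hash(token, i, self.num_buckets)
                for i in range(self.num_hashes)]

    def embed(self, token):
        """Weighted sum of the k selected vectors: a d-dimensional result."""
        p = self.weights[self._hash(token, "w", self.num_weight_rows)]
        vectors = [self.pool[i] for i in self.component_ids(token)]
        return [sum(p[i] * vectors[i][j] for i in range(self.num_hashes))
                for j in range(self.dim)]
```

Because tokens are mapped by hashing, this sketch needs no dictionary built in advance, and the parameter count depends on the pool and weight-table sizes rather than on the vocabulary size.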
Explicit Interaction Model towards Text Classification
Text classification is one of the fundamental tasks in natural language
processing. Recently, deep neural networks have achieved promising performance
in the text classification task compared to shallow models. Despite the
significance of deep models, they ignore fine-grained classification clues
(matching signals between words and classes), since their classifications
mainly rely on text-level representations. To address this problem, we
introduce the interaction mechanism to incorporate word-level matching signals
into the text classification task. In particular, we design a novel framework,
EXplicit interAction Model (dubbed as EXAM), equipped with the interaction
mechanism. We validated the proposed approach on several benchmark datasets
including both multi-label and multi-class text classification tasks. Extensive
experimental results demonstrate the superiority of the proposed method. As a
byproduct, we have released the code and parameter settings to facilitate
further research.
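One way to picture word-level matching signals between words and classes is a matrix of dot products between word vectors and class vectors. The sketch below is only an illustration under that assumption: the function names are hypothetical, and the mean aggregation is a placeholder rather than the aggregation used in EXAM.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def interaction_matrix(word_vectors, class_vectors):
    """Entry [i][j] is the matching score between word i and class j."""
    return [[dot(w, c) for c in class_vectors] for w in word_vectors]


def class_scores(word_vectors, class_vectors):
    """Placeholder aggregation: average each class column over all words."""
    matrix = interaction_matrix(word_vectors, class_vectors)
    num_words = len(matrix)
    return [sum(row[j] for row in matrix) / num_words
            for j in range(len(class_vectors))]
```

In contrast to scoring a single text-level vector against each class, every word contributes its own signal per class here before aggregation.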
The importance of data classification using machine learning methods in microarray data
The detection of genetic mutations has attracted global attention. Several methods have been proposed to detect diseases such as cancers and tumours. One of them is the microarray, a representation of gene expression that is helpful in diagnosis. To unleash the full potential of microarrays, machine-learning algorithms and gene selection methods can be implemented to facilitate processing of microarray data and to overcome other potential challenges. One of these challenges involves high-dimensional data that are redundant, irrelevant, and noisy. To alleviate this problem, the representation should be simplified. For example, feature selection can reduce the number of features used in clustering and classification: a subset of genes can be selected from a pool of gene expression data recorded on DNA microarrays. This paper reviews existing classification techniques and gene selection methods. The effectiveness of emerging techniques, such as swarm intelligence in feature selection and classification of microarray data, is reported as well. These emerging techniques can be used in detecting cancer. Swarm intelligence can be combined with other statistical methods to attain better results.
Board of Directors' Profile: A Case for Deep Learning as a Valid Methodology to Finance Research
This paper presents a Deep Learning (DL) model for natural language processing of unstructured CVs to generate a six-dimensional profile of the professional experience of the boards of directors of Spanish companies. We show the complete process: open data extraction and cleaning; the generation of a labeled dataset for supervised learning; the development, training and validation of a DL model capable of accurately analyzing the dataset; and, finally, a data analysis based on the automated generation of the professional profiles of more than 6,000 directors of Spanish listed companies between 2003 and 2020. An RNN-LSTM neural network has been trained in three phases starting from a random initial state: (1) learning of basic structures of the Spanish language, (2) fine-tuning for scientific texts in the field of economics and finance, and (3) regression modeling to generate a six-dimensional profile based on a generalization of sentiment classification systems. The complete training has been carried out with very low computational requirements, taking a total of 120 hours of processing on a low-end GPU. The results obtained in the validation of the DL model show great accuracy, with a value for the standard deviation of the mean error between 0.015 and 0.033. As a result, we have been able to outline with a high degree of reliability the profile of the boards of directors of listed Spanish companies. We found that the predominant profile is that of directors with experience in executive or consultancy positions, followed by the financial profile. The results achieved show the potential of DL in social science research, particularly in finance.