583 research outputs found
How is a data-driven approach better than random choice in label space division for multi-label classification?
We propose using five data-driven community detection approaches from social
networks to partition the label space for the task of multi-label
classification as an alternative to random partitioning into equal subsets as
performed by RAkELd: modularity-maximizing fastgreedy and leading eigenvector,
infomap, walktrap and label propagation algorithms. We construct a label
co-occurrence graph (both weighted and unweighted versions) based on training
data and perform community detection to partition the label set. We include
Binary Relevance and Label Powerset classification methods for comparison. We
use gini-index based Decision Trees as the base classifier. We compare educated
approaches to label space divisions against random baselines on 12 benchmark
data sets over five evaluation measures. We show that in almost all cases seven
educated guess approaches are more likely to outperform RAkELd than otherwise
in all measures except Hamming Loss. We show that fastgreedy and walktrap
community detection methods on weighted label co-occurrence graphs are 85-92%
more likely to yield better F1 scores than random partitioning. Infomap on the
unweighted label co-occurrence graphs is on average better than random
partitioning 90% of the time in terms of Subset Accuracy and 89% of the time in terms of Jaccard
similarity. Weighted fastgreedy is better on average than RAkELd when it comes
to Hamming Loss
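
As a minimal sketch of the graph-construction step described in this abstract (not the authors' implementation), the snippet below builds weighted and unweighted label co-occurrence graphs from training label sets. The function name `label_cooccurrence_graph` and the toy labels are hypothetical; the community detection step (fastgreedy, walktrap, infomap, and so on) would then be run on this graph with a graph library and is not shown here.

```python
from collections import Counter
from itertools import combinations


def label_cooccurrence_graph(label_sets):
    """Return {(a, b): count} for each label pair (a < b) that is
    assigned together to at least one training example."""
    weights = Counter()
    for labels in label_sets:
        # Sort so that each undirected edge has one canonical key.
        for a, b in combinations(sorted(set(labels)), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical toy training data: one set of labels per example.
train_labels = [
    {"sports", "news"},
    {"sports", "news", "video"},
    {"music", "video"},
    {"music"},
]

# Weighted version: edge weight = number of examples sharing both labels.
weighted = label_cooccurrence_graph(train_labels)
# Unweighted version: keep only the presence of an edge.
unweighted = set(weighted)

for edge, count in sorted(weighted.items()):
    print(edge, count)
```

Each community found in such a graph would become one label subset, replacing the random equal-sized subsets drawn by RAkELd.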
Message effectiveness in corporate career websites: analysis of the top 30 employers in Switzerland
This paper proposes a comprehensive tool for analyzing online recruitment messages, drawing upon dimensions of employers’ online communication effectiveness developed in the literature. A codebook is developed to analyze the content of corporate career websites, including the dimensions identified by Cober, Brown and Levy (2004), in order to evaluate employment websites. This is integrated with other dimensions that the literature acknowledges as predictive of organizational attractiveness. A content analysis of the career website of the top 30 employers in Switzerland is conducted. Results show that the selected companies provide a message that is credible, vivid, and employee-oriented. In addition, companies that express the uniqueness of their employer brand communicate more effectively with their potential candidates. The current analysis takes into consideration a limited number of companies. The study could be extended to a larger sample that includes poor-performing employers and that allows the identification of specificities by industries. The tool could be a useful grid for the creation and implementation of corporate career websites by practitioners. The paper provides a picture of best practices of online recruitment in Switzerland and, by putting together contributions of several research traditions, a comprehensive tool with which to assess career websites’ messages. Moreover, empirical results highlight a relationship between employer’s communication of a branding statement and excellence in conveying the effectiveness dimensions of the online recruitment message
Support matrix machine: A review
Support vector machine (SVM) is one of the most studied paradigms in the
realm of machine learning for classification and regression problems. It relies
on vectorized input data. However, a significant portion of the real-world data
exists in matrix format, which is given as input to SVM by reshaping the
matrices into vectors. The process of reshaping disrupts the spatial
correlations inherent in the matrix data. Also, converting matrices into
vectors results in input data with a high dimensionality, which introduces
significant computational complexity. To overcome these issues in classifying
matrix input data, support matrix machine (SMM) is proposed. It represents one
of the emerging methodologies tailored for handling matrix input data. The SMM
method preserves the structural information of the matrix data by using the
spectral elastic net property which is a combination of the nuclear norm and
Frobenius norm. This article provides the first in-depth analysis of the
development of the SMM model, which can be used as a thorough summary by both
novices and experts. We discuss numerous SMM variants, such as robust, sparse,
class imbalance, and multi-class classification models. We also analyze the
applications of the SMM model and conclude the article by outlining potential
future research avenues and possibilities that may motivate academics to
advance the SMM algorithm
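
For orientation, the spectral elastic net penalty mentioned in this abstract can be written out as follows. This is a sketch of the objective as it is commonly stated in the SMM literature, not a formula taken from the abstract itself; the symbols $W$ (regression matrix), $b$ (bias), $X_i$ and $y_i$ (matrix input and its label), and the trade-off constants $\tau$ and $C$ are assumptions introduced here.

```latex
\min_{W,\,b}\;
  \underbrace{\tfrac{1}{2}\operatorname{tr}\!\left(W^{\top}W\right)
  + \tau \lVert W \rVert_{*}}_{\text{spectral elastic net}}
  \;+\; C \sum_{i=1}^{n}
  \max\!\Bigl(0,\; 1 - y_i\bigl[\operatorname{tr}\!\left(W^{\top}X_i\right) + b\bigr]\Bigr)
```

Here $\operatorname{tr}(W^{\top}W) = \lVert W \rVert_F^2$ is the squared Frobenius norm and $\lVert W \rVert_{*}$ is the nuclear norm (the sum of the singular values of $W$), which encourages a low-rank solution and so retains row and column structure that is lost when matrices are flattened into vectors.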
LAM-Related Research Funded Under Spain’s National Research Agenda (2010 – 2020)
This study analysed and contextualised research on LAMs (acronym for libraries, archives and museums) funded by the Spanish Ministry of Science and Innovation under competitive calls for projects from 2010 to 2020. The ultimate intention was to verify the existence or otherwise of a national research agenda on these cultural institutions. The initial search and location of Spanish Ministry-funded projects in official sources was followed by data processing and grouping by subject category. A total of 145 projects were analysed. The results showed LAM projects to be scant in number, highly varied in terms of subject matter, poorly funded, widely scattered across a number of areas of knowledge although with a prevalence of the humanities, and highly concentrated in certain institutions and disciplines. The subject-based analysis characterised LAM institutions, from the research perspective, as tools supporting other types of research but not themselves objects of study. None of the nationwide research plans was observed to include LAMs as a line of research. This study has essentially two practical implications. 1. It underscores the need for greater transparency among research project funding agencies; and 2. it defends the inclusion of LAMs among the items on a country’s national research agenda deserving of funding to enhance awareness of their value, purpose and projects
Survey of deep representation learning for speech emotion recognition
Traditionally, speech emotion recognition (SER) research has relied on manually handcrafted acoustic features using feature engineering. However, the design of handcrafted features for complex SER tasks requires significant manual effort, which impedes generalisability and slows the pace of innovation. This has motivated the adoption of representation learning techniques that can automatically learn an intermediate representation of the input signal without any manual feature engineering. Representation learning has led to improved SER performance and enabled rapid innovation. Its effectiveness has further increased with advances in deep learning (DL), which has facilitated deep representation learning, where hierarchical representations are automatically learned in a data-driven manner. This paper presents the first comprehensive survey on the important topic of deep representation learning for SER. We highlight various techniques and related challenges, and identify important future areas of research. Our survey bridges the gap in the literature since existing surveys either focus on SER with hand-engineered features or representation learning in the general setting without focusing on SER
An efficiency curve for evaluating imbalanced classifiers considering intrinsic data characteristics: Experimental analysis
Balancing the accuracy rates of the majority and minority classes is challenging in imbalanced
classification. Furthermore, data characteristics have a significant impact on the performance
of imbalanced classifiers, which are generally neglected by existing evaluation
methods. The objective of this study is to introduce a new criterion to comprehensively
evaluate imbalanced classifiers. Specifically, we introduce an efficiency curve that is established
using data envelopment analysis without explicit inputs (DEA-WEI), to determine
the trade-off between the benefits of improved minority class accuracy and the cost of
reduced majority class accuracy. In sequence, we analyze the impact of the imbalanced
ratio and typical imbalanced data characteristics on the efficiency of the classifiers.
Empirical analyses using 68 imbalanced data sets reveal that traditional classifiers such as
C4.5 and the k-nearest neighbor are more effective on disjunct data, whereas ensemble
and undersampling techniques are more effective for overlapping and noisy data. The efficiency
of cost-sensitive classifiers decreases dramatically when the imbalanced ratio
increases. Finally, we investigate the reasons for the different efficiencies of classifiers on
imbalanced data and recommend steps to select appropriate classifiers for imbalanced data
based on data characteristics. National Natural Science Foundation of China (NSFC) 71874023, 71725001, 71771037, 7197104
A Wholistic View of Continual Learning with Deep Neural Networks: Forgotten Lessons and the Bridge to Active and Open World Learning
Current deep learning research is dominated by benchmark evaluation. A method
is regarded as favorable if it empirically performs well on the dedicated test
set. This mentality is seamlessly reflected in the resurfacing area of
continual learning, where consecutively arriving sets of benchmark data are
investigated. The core challenge is framed as protecting previously acquired
representations from being catastrophically forgotten due to the iterative
parameter updates. However, comparison of individual methods is
treated in isolation from real world application and typically judged by
monitoring accumulated test set performance. The closed world assumption
remains predominant. It is assumed that during deployment a model is guaranteed
to encounter data that stems from the same distribution as used for training.
This poses a massive challenge as neural networks are well known to provide
overconfident false predictions on unknown instances and break down in the face
of corrupted data. In this work we argue that notable lessons from open set
recognition, the identification of statistically deviating data outside of the
observed dataset, and the adjacent field of active learning, where data is
incrementally queried such that the expected performance gain is maximized, are
frequently overlooked in the deep learning era. Based on these forgotten
lessons, we propose a consolidated view to bridge continual learning, active
learning and open set recognition in deep neural networks. Our results show
that this not only benefits each individual paradigm, but highlights the
natural synergies in a common framework. We empirically demonstrate
improvements when alleviating catastrophic forgetting, querying data in active
learning, selecting task orders, while exhibiting robust open world application
where previously proposed methods fail. Comment: 32 pages
The Internet and Prospective Engineers: Results Analysis for Studies Conducted During the Pandemic
The relevance of the study stems from the transition to distance learning, which modified learning methods and principles during the pandemic to meet the remote training needs of students majoring in Land Management and Cadastres at Arctic State Agrotechnological University (ASAU) of the Republic of Sakha (Yakutia). The study objective was to substantiate the interaction between ASAU students and teachers in organizing remote training during the pandemic using network technologies. The study monitored the dynamics of the educational process during the pandemic. The results show that during the pandemic, particular priority in arranging the educational process at ASAU was given to enhancing the professional readiness of students and teachers and building up their competence in organizing their professional activities using the Internet in compliance with the latest requirements of the Federal State Educational Standards. The reference and experimental groups were sampled based on student interviews during the transition to a remote, Internet-based educational process. The practical implications of the study lie in identifying the distinctive features of teachers’ and students’ educational activities at ASAU during the pandemic. These results can be adapted and implemented in the system of prospective engineers’ training in other regional universities in the north-east of Russia