Separable Convex Optimization with Nested Lower and Upper Constraints
We study a convex resource allocation problem in which lower and upper bounds
are imposed on partial sums of allocations. This model is linked to a large
range of applications, including production planning, speed optimization,
stratified sampling, support vector machines, portfolio management, and
telecommunications. We propose an efficient gradient-free divide-and-conquer
algorithm, which uses monotonicity arguments to generate valid bounds from the
recursive calls, and eliminates linking constraints based on the information
from sub-problems. This algorithm does not need strict convexity or
differentiability. It produces an $\epsilon$-approximate solution for the
continuous problem in $O(n \log m \log \frac{nB}{\epsilon})$ time
and an integer solution in $O(n \log m \log B)$ time, where $n$ is
the number of decision variables, $m$ is the number of constraints, and $B$ is
the resource bound. A complexity of $O(n \log m)$ is also achieved
for the linear and quadratic cases. These are the best complexities known to
date for this important problem class. Our experimental analyses confirm the
good performance of the method, which produces optimal solutions for problems
with up to 1,000,000 variables in a few seconds. Promising applications to the
support vector ordinal regression problem are also investigated.
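As a reading aid, the problem class described in this abstract can be written roughly as follows. This is a sketch under assumed notation, not the paper's own statement: the nested index sets $s_1 < s_2 < \dots$, the partial-sum bounds $a_j, b_j$, and the variable bounds $c_i, d_i$ are hypothetical symbols introduced here, with $n$ decision variables, $m$ constraints, and resource bound $B$ as named in the abstract.

```latex
\begin{align*}
\min_{x \in \mathbb{R}^n} \quad & \sum_{i=1}^{n} f_i(x_i)
    && \text{(each $f_i$ convex, so the objective is separable)} \\
\text{s.t.} \quad & a_j \le \sum_{k=1}^{s_j} x_k \le b_j,
    && j = 1, \dots, m-1 \quad \text{(bounds on nested partial sums)} \\
& \sum_{k=1}^{n} x_k = B
    && \text{(total resource)} \\
& c_i \le x_i \le d_i,
    && i = 1, \dots, n.
\end{align*}
```

The partial sums are nested because each one extends the previous one, which is presumably what lets bounds obtained on sub-problems carry over to the problems that contain them.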
Active deep learning for nonlinear optics design of a vertical FFA accelerator
Vertical Fixed-Field Alternating Gradient (vFFA) accelerators exhibit particle orbits which move vertically during acceleration. This recently rediscovered circular accelerator type has several advantages over conventional ring accelerators, such as zero momentum compaction factor. At the same time, inherently non-planar orbits and a unique transverse coupling make controlling the beam dynamics a complex task. In general, betatron tune adjustment is crucial to avoid resonances, particularly when space charge effects are present. Due to highly nonlinear magnetic fields in the vFFA, it remains a challenging task to determine an optimal lattice design in terms of maximising the dynamic aperture.
This contribution describes a deep-learning-based algorithm which strongly improves on regular grid scans and random search to find an optimal lattice: a surrogate model is built iteratively from simulations with varying lattice parameters to predict the dynamic aperture. The training of the model follows an active learning paradigm, which considerably reduces the number of samples needed from the computationally expensive simulations.
The Group Loss++: A deeper look into group loss for deep metric learning
Deep metric learning has yielded impressive results in tasks such as
clustering and image retrieval by leveraging neural networks to obtain highly
discriminative feature embeddings, which can be used to group samples into
different classes. Much research has been devoted to the design of smart loss
functions or data mining strategies for training such networks. Most methods
consider only pairs or triplets of samples within a mini-batch to compute the
loss function, which is commonly based on the distance between embeddings. We
propose Group Loss, a loss function based on a differentiable label-propagation
method that enforces embedding similarity across all samples of a group while
promoting, at the same time, low-density regions amongst data points belonging
to different groups. Guided by the smoothness assumption that "similar objects
should belong to the same group", the proposed loss trains the neural network
for a classification task, enforcing a consistent labelling amongst samples
within a class. We design a set of inference strategies tailored towards our
algorithm, named Group Loss++, that further improve the results of our model. We
show state-of-the-art results on clustering and image retrieval on four
retrieval datasets, and present competitive results on two person
re-identification datasets, providing a unified framework for retrieval and
re-identification.
Comment: Accepted to IEEE Transactions on Pattern Analysis and Machine
Intelligence (tPAMI), 2022. Includes supplementary material.
The Group Loss for Deep Metric Learning
Deep metric learning has yielded impressive results in tasks such as
clustering and image retrieval by leveraging neural networks to obtain highly
discriminative feature embeddings, which can be used to group samples into
different classes. Much research has been devoted to the design of smart loss
functions or data mining strategies for training such networks. Most methods
consider only pairs or triplets of samples within a mini-batch to compute the
loss function, which is commonly based on the distance between embeddings. We
propose Group Loss, a loss function based on a differentiable label-propagation
method that enforces embedding similarity across all samples of a group while
promoting, at the same time, low-density regions amongst data points belonging
to different groups. Guided by the smoothness assumption that "similar objects
should belong to the same group", the proposed loss trains the neural network
for a classification task, enforcing a consistent labelling amongst samples
within a class. We show state-of-the-art results on clustering and image
retrieval on several datasets, and show the potential of our method when
combined with other techniques such as ensembles.
Comment: Accepted to European Conference on Computer Vision (ECCV) 2020;
includes non-archival supplementary material.
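To make "differentiable label propagation" concrete, here is a minimal NumPy sketch of one common propagation scheme, a replicator-dynamics style update on a similarity graph. It is an illustration under stated assumptions and not the authors' implementation: the function name `propagate_labels`, the toy similarity matrix, and the choice of priors are all hypothetical.

```python
import numpy as np

def propagate_labels(similarity, priors, n_iters=5, eps=1e-12):
    """Refine per-sample class probabilities using pairwise similarities.

    similarity: (n, n) non-negative matrix, similarity[i, j] between samples.
    priors:     (n, c) initial class probabilities, each row summing to 1.
    """
    w = np.asarray(similarity, dtype=float)
    x = np.asarray(priors, dtype=float).copy()
    for _ in range(n_iters):
        # Support each class receives from the other samples in the group.
        support = w @ x
        # Multiplicative update, then renormalise each row to a distribution.
        x = x * support
        x = x / (x.sum(axis=1, keepdims=True) + eps)
    return x

# Toy mini-batch: samples 0 and 1 are similar, samples 2 and 3 are similar.
similarity = np.array([
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.9],
    [0.1, 0.1, 0.9, 0.0],
])
# Samples 0 and 2 have known labels; samples 1 and 3 start undecided.
priors = np.array([
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
    [0.5, 0.5],
])
refined = propagate_labels(similarity, priors)
predicted = refined.argmax(axis=1).tolist()
```

Every step is a product, a matrix multiplication, or a normalisation, so the whole procedure is differentiable and a classification loss on `refined` could in principle be backpropagated into the embeddings that produced `similarity`.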
LambdaOpt: Learn to Regularize Recommender Models in Finer Levels
Recommendation models mainly deal with categorical variables, such as
user/item ID and attributes. Besides the high-cardinality issue, the
interactions among such categorical variables are usually long-tailed, with the
head made up of highly frequent values and a long tail of rare ones. This
phenomenon results in the data sparsity issue, making it essential to
regularize the models to ensure generalization. The common practice is to
employ grid search to manually tune regularization hyperparameters based on the
validation data. However, it requires non-trivial efforts and large computation
resources to search the whole candidate space; even so, it may not lead to the
optimal choice, for which different parameters should have different
regularization strengths. In this paper, we propose a hyperparameter
optimization method, LambdaOpt, which automatically and adaptively enforces
regularization during training. Specifically, it updates the regularization
coefficients based on performance on the validation data. With LambdaOpt, the
notorious tuning of regularization hyperparameters can be avoided; more
importantly, it allows fine-grained regularization (i.e. each parameter can
have an individualized regularization coefficient), leading to better
generalized models. We show how to employ LambdaOpt on matrix factorization, a
classical model that is representative of a large family of recommender models.
Extensive experiments on two public benchmarks demonstrate the superiority of
our method in boosting the performance of top-K recommendation.
Comment: Accepted by KDD 201
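The idea of fine-grained regularization can be illustrated with a small sketch of a matrix factorization loss in which every embedding entry has its own L2 coefficient. This shows only the penalty structure; it does not implement LambdaOpt's adaptive update of the coefficients, and the function name `regularized_mf_loss` and the toy data are hypothetical.

```python
import numpy as np

def regularized_mf_loss(ratings, user_emb, item_emb, user_lambda, item_lambda):
    """Squared error plus an L2 penalty with per-parameter coefficients.

    ratings:     iterable of (user_index, item_index, rating) triples.
    user_emb:    (n_users, k) user embeddings.
    item_emb:    (n_items, k) item embeddings.
    user_lambda: scalar or array broadcastable to user_emb's shape.
    item_lambda: scalar or array broadcastable to item_emb's shape.
    """
    error = sum((r - user_emb[u] @ item_emb[i]) ** 2 for u, i, r in ratings)
    # Each embedding entry is penalised by its own coefficient.
    penalty = np.sum(user_lambda * user_emb ** 2) + np.sum(item_lambda * item_emb ** 2)
    return float(error + penalty)

user_emb = np.array([[1.0, 0.0]])
item_emb = np.array([[2.0, 0.0]])
ratings = [(0, 0, 2.0)]  # the prediction 1.0 * 2.0 matches the rating exactly

# Coarse setting: one shared coefficient for every parameter.
coarse = regularized_mf_loss(ratings, user_emb, item_emb, 0.1, 0.1)
# Fine-grained setting: one coefficient per embedding entry.
fine = regularized_mf_loss(
    ratings, user_emb, item_emb,
    np.array([[0.1, 0.5]]), np.array([[0.05, 0.5]]),
)
```

Passing a scalar reproduces the usual single-hyperparameter setting, while passing an array of the same shape as the embeddings gives each parameter an individual coefficient.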
Variable grouping in multivariate time series via correlation
The decomposition of high-dimensional multivariate time series (MTS) into a number of low-dimensional MTS is a useful but challenging task because the number of possible dependencies between variables is likely to be huge. This paper presents a systematic study of the “variable groupings” problem in MTS. In particular, we investigate different methods of utilizing the information regarding correlations among MTS variables. This type of method does not appear to have been studied before. In all, 15 methods are suggested and applied to six datasets where there are identifiable mixed groupings of MTS variables. This paper describes the general methodology, reports extensive experimental results, and concludes with useful insights on the strengths and weaknesses of this type of grouping method.
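One simple way to group variables by correlation is sketched below. The abstract does not say which 15 methods the paper studies, so this is a hypothetical baseline and not one of them by name: threshold the absolute Pearson correlation matrix and take connected components of the resulting graph. The function name `group_by_correlation` and the threshold value are assumptions.

```python
import numpy as np

def group_by_correlation(series, threshold=0.8):
    """Group the columns of a (T, d) array by absolute Pearson correlation.

    Two variables fall into the same group when they are linked by a chain
    of pairwise correlations whose absolute value is at least `threshold`.
    """
    corr = np.abs(np.corrcoef(series, rowvar=False))
    unvisited = set(range(corr.shape[0]))
    groups = []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        stack, group = [start], [start]
        # Depth-first search over the thresholded correlation graph.
        while stack:
            i = stack.pop()
            neighbours = [j for j in unvisited if corr[i, j] >= threshold]
            for j in neighbours:
                unvisited.remove(j)
                group.append(j)
                stack.append(j)
        groups.append(sorted(group))
    return groups

# Toy MTS with two obvious groups: columns 0-1 follow sin(t), 2-3 follow sin(2t).
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
series = np.column_stack([
    np.sin(t),
    2.0 * np.sin(t) + 1.0,
    np.sin(2.0 * t),
    -np.sin(2.0 * t),
])
groups = group_by_correlation(series)
```

Using the absolute correlation means that strongly anti-correlated variables, such as columns 2 and 3 here, are also placed in the same group.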
Feature selection strategies for improving data-driven decision support in bank telemarketing
Data mining techniques for uncovering previously unknown knowledge have
been applied in recent years to a wide range of domains, including banking and marketing. Raw
data is the basic ingredient for successfully detecting interesting patterns. A key aspect of raw
data manipulation is feature engineering, which concerns the correct characterization or
selection of relevant features (or variables) that conceal relations with the target goal.
This study focuses on feature engineering, aiming to uncover the
features that best characterize the problem of selling long-term bank deposits through
telemarketing campaigns. For the experimental setup, a case study from a Portuguese bank,
covering the period 2008-2013 and encompassing the recent global financial crisis, was
addressed. To assess the relevance of this problem, a novel literature analysis using text
mining and the latent Dirichlet allocation algorithm was conducted, confirming the existence of a
research gap for bank telemarketing.
Starting from a dataset containing typical telemarketing contacts and client information,
the research followed three different and complementary strategies: first, enriching the dataset
with social and economic context features; then, including features related to customer lifetime
value; finally, applying a divide-and-conquer strategy to split the problem into smaller
parts, leading to optimized sub-problems. Each of the three approaches improved on previous
results in terms of model metrics related to prediction performance. The relevance of the
proposed features was evaluated, confirming the obtained models as credible and valuable for
telemarketing campaign managers.