The Potential of Restarts for ProbSAT
This work analyses the potential of restarts for probSAT, a quite successful
algorithm for k-SAT, by estimating its runtime distributions on random 3-SAT
instances that are close to the phase transition. We estimate an optimal
restart time from empirical data, reaching a potential speedup factor of 1.39.
Calculating restart times from fitted probability distributions reduces this
factor to a maximum of 1.30. A spin-off result is that the Weibull distribution
approximates the runtime distribution well for over 93% of the instances used.
A machine learning pipeline is presented to compute a restart time for a
fixed-cutoff strategy to exploit this potential. The main components of the
pipeline are a random forest for determining the distribution type and a neural
network for the distribution's parameters. ProbSAT performs statistically
significantly better than Luby's restart strategy and the policy without
restarts when using the presented approach. The structure is particularly
advantageous on hard problems. Comment: Eurocast 201
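As a rough illustration of how a restart time can be estimated from empirical data, the sketch below uses the standard identity for a fixed-cutoff strategy, E[T_t] = E[min(X, t)] / P(X <= t), evaluated on a list of observed runtimes. The function names and the sample runtimes are hypothetical and not taken from the paper, whose own estimation procedure may differ.

```python
def expected_runtime_with_restarts(runtimes, cutoff):
    """Estimate the expected total runtime when restarting after `cutoff`.

    Uses E[min(X, cutoff)] / P(X <= cutoff) on the empirical distribution,
    which simplifies to (sum of capped runtimes) / (number of runs that finish).
    """
    solved = sum(1 for r in runtimes if r <= cutoff)
    if solved == 0:
        return float("inf")  # no run ever finishes within this cutoff
    work = sum(min(r, cutoff) for r in runtimes)
    return work / solved


def best_cutoff(runtimes):
    """Return (cutoff, expected runtime) minimising the estimate.

    Only observed runtimes need to be tried as candidate cutoffs, because
    the estimate can only get worse between two consecutive observations.
    """
    candidates = sorted(set(runtimes))
    best = min(candidates,
               key=lambda t: expected_runtime_with_restarts(runtimes, t))
    return best, expected_runtime_with_restarts(runtimes, best)


def potential_speedup(runtimes):
    """Ratio of the mean runtime without restarts to the best restart estimate."""
    mean_without_restarts = sum(runtimes) / len(runtimes)
    _, with_restarts = best_cutoff(runtimes)
    return mean_without_restarts / with_restarts


if __name__ == "__main__":
    # Hypothetical runtimes (e.g. flip counts) with one heavy-tailed outlier.
    observed = [1, 1, 1, 100]
    print(best_cutoff(observed))
    print(potential_speedup(observed))
```

A speedup above 1 in this estimate indicates that the empirical distribution has a tail heavy enough for restarts to pay off; a value of exactly 1 means the best choice is never to restart.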
Effects of sampling skewness of the importance-weighted risk estimator on model selection
Importance-weighting is a popular and well-researched technique for dealing
with sample selection bias and covariate shift. It has desirable
characteristics such as unbiasedness, consistency and low computational
complexity. However, weighting can have a detrimental effect on an estimator as
well. In this work, we empirically show that the sampling distribution of an
importance-weighted estimator can be skewed. For sample selection bias
settings, and for small sample sizes, the importance-weighted risk estimator
produces overestimates for datasets in the body of the sampling distribution,
i.e. the majority of cases, and large underestimates for datasets in the tail
of the sampling distribution. These over- and underestimates of the risk lead
to suboptimal regularization parameters when used for importance-weighted
validation. Comment: Conference paper, 6 pages, 5 figure
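To make the estimator concrete, here is a minimal sketch of the importance-weighted risk, (1/n) Σ w(x_i) ℓ(x_i) with w(x) = p_target(x) / p_source(x), together with a small simulation of its sampling distribution. The Gaussian source and target densities, the squared loss of a fixed predictor, and all function names are illustrative assumptions, not the experimental setup of the paper.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weights(xs, source, target):
    """w(x) = p_target(x) / p_source(x); source/target are (mean, std) pairs."""
    return [gaussian_pdf(x, *target) / gaussian_pdf(x, *source) for x in xs]


def weighted_risk(losses, weights):
    """Importance-weighted risk estimate: (1/n) * sum_i w_i * loss_i."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


def sample_skewness(values):
    """Third standardised moment of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def simulate_estimates(n_repeats, n_samples, source, target, seed=0):
    """Draw small source samples repeatedly and collect the risk estimates.

    The loss is the squared error of a hypothetical predictor that always
    outputs 0 for a target equal to x, i.e. loss(x) = x^2 (an assumption).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_repeats):
        xs = [rng.gauss(*source) for _ in range(n_samples)]
        losses = [x * x for x in xs]
        estimates.append(weighted_risk(losses, importance_weights(xs, source, target)))
    return estimates


if __name__ == "__main__":
    estimates = simulate_estimates(200, 10, source=(0.0, 1.0), target=(1.0, 1.0))
    print(sample_skewness(estimates))
```

Computing `sample_skewness` over the repeated estimates is one simple way to inspect whether the sampling distribution of the estimator is symmetric for a given sample size.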
Generalized Batch Normalization: Towards Accelerating Deep Neural Networks
Utilizing recently introduced concepts from statistics and quantitative risk
management, we present a general variant of Batch Normalization (BN) that
offers accelerated convergence of Neural Network training compared to
conventional BN. In general, we show that mean and standard deviation are not
always the most appropriate choice for the centering and scaling procedure
within the BN transformation, particularly if ReLU follows the normalization
step. We present a Generalized Batch Normalization (GBN) transformation, which
can utilize a variety of alternative deviation measures for scaling and
statistics for centering, choices which naturally arise from the theory of
generalized deviation measures and risk theory in general. When used in
conjunction with the ReLU non-linearity, the underlying risk theory suggests
natural, arguably optimal choices for the deviation measure and statistic.
Utilizing the suggested deviation measure and statistic, we show experimentally
that training is accelerated more than with conventional BN, often with
improved error rate as well. Overall, we propose a more flexible BN
transformation supported by a complementary theoretical framework that can
potentially guide design choices. Comment: accepted at AAAI-1
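The idea of swapping the centering statistic and the deviation measure can be sketched as a normalization of the form y = (x - S(x)) / D(x), where conventional BN corresponds to S = mean and D = standard deviation. In the sketch below, the median and the mean absolute deviation from the median are used purely as an illustrative alternative pair; they are an assumption of this example and not necessarily the choices the paper derives from risk theory. Learnable scale and shift parameters are omitted.

```python
import statistics


def generalized_batch_norm(batch, center=statistics.fmean,
                           deviation=statistics.pstdev, eps=1e-5):
    """Normalise one feature over a batch as (x - center) / (deviation + eps).

    With the defaults this is close to conventional batch normalisation
    (which adds eps to the variance rather than to the deviation).
    """
    c = center(batch)
    d = deviation(batch)
    return [(x - c) / (d + eps) for x in batch]


def mean_abs_deviation_from_median(batch):
    """An alternative deviation measure (illustrative choice only)."""
    med = statistics.median(batch)
    return statistics.fmean([abs(x - med) for x in batch])


if __name__ == "__main__":
    activations = [1.0, 2.0, 3.0, 4.0, 100.0]  # hypothetical pre-activations
    conventional = generalized_batch_norm(activations)
    alternative = generalized_batch_norm(
        activations,
        center=statistics.median,
        deviation=mean_abs_deviation_from_median,
    )
    print(conventional)
    print(alternative)
```

Passing the statistic and the deviation measure as callables keeps the transformation itself unchanged, so different pairs can be compared without touching the normalization code.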
Hierarchical Attention Network for Visually-aware Food Recommendation
Food recommender systems play an important role in helping users identify
the food they want to eat. Deciding what to eat is a complex and
multi-faceted process, influenced by many factors such as the
ingredients, the appearance of the recipe, the user's personal food preferences,
and contextual factors such as what was eaten in past meals. In this work,
we formulate the food recommendation problem as predicting user preference for
recipes based on three key factors that determine a user's choice of food,
namely, 1) the user's (and other users') history; 2) the ingredients of a
recipe; and 3) the descriptive image of a recipe. To address this challenging
problem, we develop a dedicated neural-network-based solution, Hierarchical
Attention based Food Recommendation (HAFR), which is capable of: 1) capturing
the collaborative filtering effect, such as what similar users tend to eat; 2)
inferring a user's preference at the ingredient level; and 3) learning user
preference from the recipe's visual images. To evaluate our proposed method, we
construct a large-scale dataset consisting of millions of ratings from
AllRecipes.com. Extensive experiments show that our method outperforms several
competing recommender solutions, such as Factorization Machine and Visual Bayesian
Personalized Ranking, with an average improvement of 12%, offering promising
results in predicting user preference for food. Code and dataset will be
released upon acceptance.
Background Rejection in Atmospheric Cherenkov Telescopes using Recurrent Convolutional Neural Networks
In this work, we present a new, high-performance algorithm for background
rejection in imaging atmospheric Cherenkov telescopes. We build on the
machine-learning techniques already popular in gamma-ray astronomy by
applying more recent methods, namely recurrent and
convolutional neural networks, to the background rejection problem. Use of
these machine-learning techniques addresses some of the key challenges
encountered in the currently implemented algorithms and helps to significantly
increase the background rejection performance at all energies.
We apply these machine-learning techniques to the H.E.S.S. telescope array,
first testing their performance on simulated data and then applying the
analysis to two well-known gamma-ray sources. With real observational data we
find significantly improved performance over the current standard methods, with
a 20-25% reduction in the background rate when applying the recurrent neural
network analysis. Importantly, we also find that the convolutional neural
network results are strongly dependent on the sky brightness in the source
region, which has important implications for the future implementation of this
method in Cherenkov telescope analysis. Comment: 11 pages, 7 figures. To be submitted to The European Physical Journal