39,598 research outputs found
Automated test data generation using a scatter search approach
The techniques for the automatic generation of test cases try to efficiently find a small set of cases that allow a given adequacy criterion to be fulfilled, thus contributing to a reduction in the cost of software testing. In this paper we present and analyze two versions of an approach based on the Scatter Search metaheuristic technique for the automatic generation of software test cases using a branch coverage adequacy criterion. The first test case generator, called TCSS, uses a diversity property to extend the search of test cases to all branches of the program under test in order to generate test cases that cover these. The second, called TCSS-LS, is an extension of the previous test case generator which combines the diversity property with a local search method that allows the intensification of the search for test cases that cover the difficult branches. We present the results obtained by our generators and carry out a detailed comparison with many other generators, showing a good performance of our approac
Scatteract: Automated extraction of data from scatter plots
Charts are an excellent way to convey patterns and trends in data, but they
do not facilitate further modeling of the data or close inspection of
individual data points. We present a fully automated system for extracting the
numerical values of data points from images of scatter plots. We use deep
learning techniques to identify the key components of the chart, and optical
character recognition together with robust regression to map from pixels to the
coordinate system of the chart. We focus on scatter plots with linear scales,
which already have several interesting challenges. Previous work has done fully
automatic extraction for other types of charts, but to our knowledge this is
the first approach that is fully automatic for scatter plots. Our method
performs well, achieving successful data extraction on 89% of the plots in our
test set.Comment: Submitted to ECML PKDD 2017 proceedings, 16 page
Relevance of Unsupervised Metrics in Task-Oriented Dialogue for Evaluating Natural Language Generation
Automated metrics such as BLEU are widely used in the machine translation
literature. They have also been used recently in the dialogue community for
evaluating dialogue response generation. However, previous work in dialogue
response generation has shown that these metrics do not correlate strongly with
human judgment in the non task-oriented dialogue setting. Task-oriented
dialogue responses are expressed on narrower domains and exhibit lower
diversity. It is thus reasonable to think that these automated metrics would
correlate well with human judgment in the task-oriented setting where the
generation task consists of translating dialogue acts into a sentence. We
conduct an empirical study to confirm whether this is the case. Our findings
indicate that these automated metrics have stronger correlation with human
judgments in the task-oriented setting compared to what has been observed in
the non task-oriented setting. We also observe that these metrics correlate
even better for datasets which provide multiple ground truth reference
sentences. In addition, we show that some of the currently available corpora
for task-oriented language generation can be solved with simple models and
advocate for more challenging datasets
Visual Integration of Data and Model Space in Ensemble Learning
Ensembles of classifier models typically deliver superior performance and can
outperform single classifier models given a dataset and classification task at
hand. However, the gain in performance comes together with the lack in
comprehensibility, posing a challenge to understand how each model affects the
classification outputs and where the errors come from. We propose a tight
visual integration of the data and the model space for exploring and combining
classifier models. We introduce a workflow that builds upon the visual
integration and enables the effective exploration of classification outputs and
models. We then present a use case in which we start with an ensemble
automatically selected by a standard ensemble selection algorithm, and show how
we can manipulate models and alternative combinations.Comment: 8 pages, 7 picture
The Configurable SAT Solver Challenge (CSSC)
It is well known that different solution strategies work well for different
types of instances of hard combinatorial problems. As a consequence, most
solvers for the propositional satisfiability problem (SAT) expose parameters
that allow them to be customized to a particular family of instances. In the
international SAT competition series, these parameters are ignored: solvers are
run using a single default parameter setting (supplied by the authors) for all
benchmark instances in a given track. While this competition format rewards
solvers with robust default settings, it does not reflect the situation faced
by a practitioner who only cares about performance on one particular
application and can invest some time into tuning solver parameters for this
application. The new Configurable SAT Solver Competition (CSSC) compares
solvers in this latter setting, scoring each solver by the performance it
achieved after a fully automated configuration step. This article describes the
CSSC in more detail, and reports the results obtained in its two instantiations
so far, CSSC 2013 and 2014
Automated design of robust discriminant analysis classifier for foot pressure lesions using kinematic data
In the recent years, the use of motion tracking systems for acquisition of functional biomechanical gait data, has received increasing interest due to the richness and accuracy of the measured kinematic information. However, costs frequently restrict the number of subjects employed, and this makes the dimensionality of the collected data far higher than the available samples. This paper applies discriminant analysis algorithms to the classification of patients with different types of foot lesions, in order to establish an association between foot motion and lesion formation. With primary attention to small sample size situations, we compare different types of Bayesian classifiers and evaluate their performance with various dimensionality reduction techniques for feature extraction, as well as search methods for selection of raw kinematic variables. Finally, we propose a novel integrated method which fine-tunes the classifier parameters and selects the most relevant kinematic variables simultaneously. Performance comparisons are using robust resampling techniques such as Bootstrapand k-fold cross-validation. Results from experimentations with lesion subjects suffering from pathological plantar hyperkeratosis, show that the proposed method can lead tocorrect classification rates with less than 10% of the original features
Extensive light profile fitting of galaxy-scale strong lenses
We investigate the merits of a massive forward modeling of ground-based
optical imaging as a diagnostic for the strong lensing nature of Early-Type
Galaxies, in the light of which blurred and faint Einstein rings can hide. We
simulate several thousand mock strong lenses under ground- and space-based
conditions as arising from the deflection of an exponential disk by a
foreground de Vaucouleurs light profile whose lensing potential is described by
a Singular Isothermal Ellipsoid. We then fit for the lensed light distribution
with sl_fit after having subtracted the foreground light emission off (ideal
case) and also after having fitted the deflector's light with galfit. By
setting thresholds in the output parameter space, we can decide the
lens/not-a-lens status of each system. We finally apply our strategy to a
sample of 517 lens candidates present in the CFHTLS data to test the
consistency of our selection approach. The efficiency of the fast modeling
method at recovering the main lens parameters like Einstein radius, total
magnification or total lensed flux, is quite comparable under CFHT and HST
conditions when the deflector is perfectly subtracted off (only possible in
simulations), fostering a sharp distinction between the good and the bad
candidates. Conversely, for a more realistic subtraction, a substantial
fraction of the lensed light is absorbed into the deflector's model, which
biases the subsequent fitting of the rings and then disturbs the selection
process. We quantify completeness and purity of the lens finding method in both
cases. This suggests that the main limitation currently resides in the
subtraction of the foreground light. Provided further enhancement of the
latter, the direct forward modeling of large numbers of galaxy-galaxy strong
lenses thus appears tractable and could constitute a competitive lens finder in
the next generation of wide-field imaging surveys.Comment: A&A accepted version, minor changes (13 pages, 10 figures
- …