A reusable iterative optimization software library to solve combinatorial problems with approximate reasoning
Real world combinatorial optimization problems such as scheduling are
typically too complex to solve with exact methods. Additionally, the problems
often have to observe vaguely specified constraints of different importance,
the available data may be uncertain, and compromises between antagonistic
criteria may be necessary. We present a combination of approximate-reasoning-based constraints and iterative-optimization-based heuristics that helps to model and solve such problems in a framework of C++ software libraries called
StarFLIP++. While the library was initially developed to schedule continuous caster units in steel plants, we present in this paper results from reusing its components in a shift scheduling system for the workforce of an industrial production plant.
Comment: 33 pages, 9 figures; for a project overview see http://www.dbai.tuwien.ac.at/proj/StarFLIP
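The core idea, pairing fuzzy constraint satisfaction degrees with an iterative improvement loop, can be illustrated compactly. Below is a minimal Python sketch; StarFLIP++ itself is a C++ library, and the FuzzyConstraint class, the weighted-average aggregation, and the hill-climbing loop here are illustrative assumptions, not the library's actual API:

```python
import random

# A fuzzy constraint maps a candidate solution to a satisfaction
# degree in [0, 1]; 'weight' encodes its relative importance.
class FuzzyConstraint:
    def __init__(self, name, weight, degree_fn):
        self.name = name
        self.weight = weight
        self.degree_fn = degree_fn

    def degree(self, solution):
        return self.degree_fn(solution)

def aggregate(constraints, solution):
    # Weighted average of satisfaction degrees; min, OWA, or other
    # fuzzy aggregation operators would be equally plausible choices.
    total = sum(c.weight for c in constraints)
    return sum(c.weight * c.degree(solution) for c in constraints) / total

def local_search(initial, neighbors, constraints, iterations=1000):
    """Hill-climb on the aggregated fuzzy satisfaction degree."""
    best, best_score = initial, aggregate(constraints, initial)
    for _ in range(iterations):
        cand = random.choice(neighbors(best))
        score = aggregate(constraints, cand)
        if score > best_score:
            best, best_score = cand, score
    return best, best_score

# Toy usage: drive a single number towards 5 under one fuzzy constraint.
c = FuzzyConstraint("close-to-5", 1.0, lambda x: max(0.0, 1 - abs(x - 5) / 5))
print(local_search(0.0, lambda x: [x - 1, x + 1], [c]))  # almost surely (5.0, 1.0)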
Cut Size Statistics of Graph Bisection Heuristics
We investigate the statistical properties of cut sizes generated by heuristic algorithms which approximately solve the graph bisection problem. On an ensemble of sparse random graphs, we find empirically that the distribution of the cut sizes found by "local" algorithms becomes peaked as the number of vertices in the graphs becomes large. Evidence is given that this distribution tends towards a Gaussian whose mean and variance scale linearly with the number of vertices of the graphs. Given the distribution of cut sizes associated with each heuristic, we provide a ranking procedure which takes into account both the quality of the solutions and the speed of the algorithms. This procedure is demonstrated for a selection of local graph bisection heuristics.
Comment: 17 pages, 5 figures, submitted to SIAM Journal on Optimization; also available at http://ipnweb.in2p3.fr/~martin
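The empirical setup lends itself to a short simulation. The Python sketch below uses networkx's Kernighan-Lin bisection as a stand-in "local" heuristic; the paper's own heuristics and ensemble parameters are not reproduced here, so the average degree and trial counts are assumptions. It samples cut sizes on sparse random graphs and checks whether mean and variance grow linearly with n:

```python
import statistics
import networkx as nx
from networkx.algorithms.community import kernighan_lin_bisection

def cut_size_samples(n, avg_degree=4.0, trials=200, seed=0):
    """Sample cut sizes of a local bisection heuristic on sparse G(n, p)."""
    sizes = []
    for t in range(trials):
        g = nx.gnp_random_graph(n, avg_degree / n, seed=seed + t)
        a, b = kernighan_lin_bisection(g, seed=seed + t)
        sizes.append(nx.cut_size(g, a, b))
    return sizes

for n in (128, 256, 512):
    s = cut_size_samples(n)
    # If the Gaussian picture holds, mean/n and variance/n should stabilize.
    print(n, statistics.mean(s) / n, statistics.variance(s) / n)
```

Collecting the full histogram of sizes per heuristic, rather than only the best cut, is what enables a ranking that trades off solution quality against running time.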
Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation
Question Generation (QG) is fundamentally a simple syntactic transformation;
however, many aspects of semantics influence what questions are good to form.
We implement this observation by developing Syn-QG, a set of transparent
syntactic rules leveraging universal dependencies, shallow semantic parsing,
lexical resources, and custom rules which transform declarative sentences into
question-answer pairs. We utilize PropBank argument descriptions and VerbNet
state predicates to incorporate shallow semantic content, which helps generate
questions of a descriptive nature and produce inferential and semantically
richer questions than existing systems. In order to improve syntactic fluency
and eliminate grammatically incorrect questions, we employ back-translation
over the output of these syntactic rules. A set of crowd-sourced evaluations
shows that our system can generate a larger number of highly grammatical and
relevant questions than previous QG systems and that back-translation
drastically improves grammaticality at a slight cost of generating irrelevant
questions.
Comment: Some of the results in the paper were incorrect
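To make the flavor of such syntactic rules concrete, here is a minimal Python sketch of one transformation (subject-auxiliary inversion into a yes/no question) over a spaCy dependency parse. This is not Syn-QG's code: the rule, the model name en_core_web_sm, and the edge-case handling are illustrative assumptions, and the real system layers PropBank/VerbNet semantics and back-translation on top:

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this spaCy model is installed

def yes_no_question(sentence):
    """Toy rule: subject-auxiliary inversion over a dependency parse.
    'The cat is sleeping.' -> 'Is the cat sleeping?'
    Returns None when the rule does not apply (other rules would fire)."""
    sent = next(nlp(sentence).sents)
    root = sent.root
    # Use the root itself when it is a form of 'be', else a dependent auxiliary.
    aux = root if root.lemma_ == "be" else next(
        (c for c in root.children if c.dep_ in ("aux", "auxpass")), None)
    if aux is None:
        return None
    rest = [t for t in sent if t.i != aux.i and not t.is_punct]
    # Lowercase the fronted word unless it is a named entity.
    words = [rest[0].text if rest[0].ent_type_ else rest[0].text.lower()]
    words += [t.text for t in rest[1:]]
    return aux.text.capitalize() + " " + " ".join(words) + "?"

print(yes_no_question("The cat is sleeping on the mat."))
# -> Is the cat sleeping on the mat?
```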
Using a unified measure function for heuristics, discretization, and rule quality evaluation in Ant-Miner
Ant-Miner is a classification rule discovery algorithm based on the Ant Colony Optimization (ACO) meta-heuristic. cAnt-Miner is an extended version of the algorithm that handles continuous attributes on-the-fly during the rule construction process, while the multi-pheromone extension of Ant-Miner selects the rule class prior to rule construction and utilizes multiple pheromone types, one for each permitted rule class. In this paper, we combine these two algorithms to derive a new approach for learning classification rules with ACO. The proposed approach uses a single measure function 1) to compute the heuristics for rule term selection, 2) as the criterion for discretizing continuous attributes, and 3) to evaluate the quality of the constructed rule for pheromone update. We explore the effect of using different measure functions on the output model in terms of predictive accuracy and model size. Empirical evaluations found that the hypothesis that different measure functions produce different results is acceptable according to Friedman's statistical test.
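As an illustration of the "unified measure function" idea, the Python sketch below reuses one rule-quality measure, the m-estimate (chosen here only as a plausible example), for two of the three roles: term-selection heuristic and discretization criterion; the same value also serves as the rule-quality score for pheromone update. The data layout (rows as dicts with a 'class' key) and function names are assumptions, not the paper's implementation:

```python
def m_estimate(tp, fp, class_prior, m=10.0):
    """m-estimate quality of a rule: (TP + m*prior) / (TP + FP + m)."""
    return (tp + m * class_prior) / (tp + fp + m)

def class_prior(dataset, target_class):
    return sum(1 for r in dataset if r["class"] == target_class) / len(dataset)

def term_heuristic(dataset, attr, value, target_class, measure=m_estimate):
    """Heuristic value of adding the term (attr = value) to a rule."""
    covered = [r for r in dataset if r[attr] == value]
    tp = sum(1 for r in covered if r["class"] == target_class)
    return measure(tp, len(covered) - tp, class_prior(dataset, target_class))

def best_threshold(dataset, attr, target_class, measure=m_estimate):
    """Discretize a continuous attribute at the cut maximizing the same
    measure over the better of the two induced intervals."""
    values = sorted({r[attr] for r in dataset})
    prior = class_prior(dataset, target_class)
    best = None
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2
        for side in (lambda r: r[attr] <= cut, lambda r: r[attr] > cut):
            cov = [r for r in dataset if side(r)]
            tp = sum(1 for r in cov if r["class"] == target_class)
            q = measure(tp, len(cov) - tp, prior)
            if best is None or q > best[0]:
                best = (q, cut)
    return best  # (quality, threshold)
```

Because all three components score candidates on the same scale, swapping in a different measure function changes the whole algorithm's behavior consistently, which is exactly what the empirical comparison above exploits.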
A geo-temporal information extraction service for processing descriptive metadata in digital libraries
In the context of digital map libraries, resources are usually described by metadata records that define the relevant subject, location, time-span, format, and keywords. Regarding locations and time-spans, metadata records are often incomplete, or they provide information in a way that is not machine-understandable (e.g. textual descriptions). This paper presents techniques for extracting geo-temporal information from text, using relatively simple text mining methods that leverage a Web gazetteer service. The idea is to go from human-made geo-temporal references (i.e. place and period names in textual expressions) to geo-spatial coordinates and time-spans. A prototype system implementing the proposed methods is described in detail. Experimental results demonstrate the efficiency and accuracy of the proposed approaches.
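A toy version of the extraction pipeline clarifies the mapping from textual references to machine-readable values. In the Python sketch below, a small in-memory dictionary stands in for the paper's Web gazetteer service, and the period table, regex, and example values are all illustrative assumptions:

```python
import re

# Hypothetical stand-ins for the paper's Web gazetteer service:
# place names -> (lat, lon); period names -> (start_year, end_year).
GAZETTEER = {"lisbon": (38.72, -9.14), "porto": (41.15, -8.61)}
PERIODS = {"middle ages": (476, 1453), "19th century": (1801, 1900)}

def extract_geo_temporal(text):
    """Map textual place/period references to coordinates and time-spans."""
    low = text.lower()
    places = {name: coords for name, coords in GAZETTEER.items() if name in low}
    spans = {name: span for name, span in PERIODS.items() if name in low}
    # Also catch explicit year ranges such as "1850-1870".
    for a, b in re.findall(r"\b(1\d{3})\s*[-]\s*(1\d{3})\b", text):
        spans[f"{a}-{b}"] = (int(a), int(b))
    return places, spans

print(extract_geo_temporal("A map of Lisbon surveyed 1850-1870."))
# -> ({'lisbon': (38.72, -9.14)}, {'1850-1870': (1850, 1870)})
```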