62,827 research outputs found
Learning Features that Predict Cue Usage
Our goal is to identify the features that predict the occurrence and
placement of discourse cues in tutorial explanations in order to aid in the
automatic generation of explanations. Previous attempts to devise rules for
text generation were based on intuition or small numbers of constructed
examples. We apply a machine learning program, C4.5, to induce decision trees
for cue occurrence and placement from a corpus of data coded for a variety of
features previously thought to affect cue usage. Our experiments enable us to
identify the features with most predictive power, and show that machine
learning can be used to induce decision trees useful for text generation.

Comment: 10 pages, 2 Postscript figures, uses aclap.sty, psfig.te
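The setup described above (coded features in, a decision tree for cue occurrence out) can be sketched as follows. This is a minimal illustration, not the paper's experiment: the feature names and tiny dataset are invented, and scikit-learn's entropy-criterion tree stands in for C4.5.

```python
# Sketch: inducing a decision tree for cue occurrence from coded features.
# Feature columns (assumed for illustration):
#   [is_first_clause, segment_depth, has_contrast_relation]
from sklearn.tree import DecisionTreeClassifier

X = [
    [1, 0, 0],
    [0, 1, 1],
    [0, 2, 1],
    [1, 0, 1],
    [0, 1, 0],
    [0, 2, 0],
]
y = [0, 1, 1, 1, 0, 0]  # 1 = a discourse cue occurs here, 0 = no cue

# entropy criterion approximates C4.5's information-gain splitting
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X, y)
print(clf.predict([[0, 2, 1]]))  # -> [1]
```

On real data, inspecting the induced tree (e.g. which features appear near the root) is what identifies the features with the most predictive power.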
Temporary staffing services: a data mining perspective
Research on the temporary staffing industry discusses topics ranging from workplace safety to the internationalization of temporary labor. However, there is a lack of data mining studies on this topic. This paper fills that void, using a financial dataset as input for the estimated models. Bagged decision trees were used to cope with the high dimensionality. Two bagged decision trees were estimated: one using the whole dataset and one using only the top 12 predictors. Both had the same predictive performance, meaning computational complexity can be greatly reduced without losing accuracy.
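The two-model comparison described above can be sketched as below. This is an illustrative sketch only: the synthetic data, the importance-based feature ranking, and the cutoff of 12 are assumptions standing in for the paper's financial dataset and predictor selection.

```python
# Sketch: bagged trees on all features vs. on the top-12 predictors only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic high-dimensional stand-in for the financial dataset.
X, y = make_classification(n_samples=300, n_features=40, n_informative=5,
                           random_state=0)

# Rank features by importance (random-forest importances as an assumed proxy).
importances = RandomForestClassifier(random_state=0).fit(X, y).feature_importances_
top_12 = np.argsort(importances)[::-1][:12]

# BaggingClassifier's default base estimator is a decision tree.
full = cross_val_score(BaggingClassifier(random_state=0), X, y).mean()
reduced = cross_val_score(BaggingClassifier(random_state=0), X[:, top_12], y).mean()
print(round(full, 2), round(reduced, 2))
```

If the reduced model matches the full one, the remaining 28 inputs can be dropped with no loss of accuracy, which is the dimensionality-reduction point the abstract makes.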
Advances and applications of automata on words and trees: abstracts collection
From 12.12.2010 to 17.12.2010, the Dagstuhl Seminar 10501 "Advances and Applications of Automata on Words and Trees" was held in Schloss Dagstuhl - Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar, as well as abstracts of seminar results and ideas, are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.
Individual and Domain Adaptation in Sentence Planning for Dialogue
One of the biggest challenges in the development and deployment of spoken
dialogue systems is the design of the spoken language generation module. This
challenge arises from the need for the generator to adapt to many features of
the dialogue domain, user population, and dialogue context. A promising
approach is trainable generation, which uses general-purpose linguistic
knowledge that is automatically adapted to the features of interest, such as
the application domain, individual user, or user group. In this paper we
present and evaluate a trainable sentence planner for providing restaurant
information in the MATCH dialogue system. We show that trainable sentence
planning can produce complex information presentations whose quality is
comparable to the output of a template-based generator tuned to this domain. We
also show that our method easily supports adapting the sentence planner to
individuals, and that the individualized sentence planners generally perform
better than models trained and tested on a population of individuals. Previous
work has documented and utilized individual preferences for content selection,
but to our knowledge, these results provide the first demonstration of
individual preferences for sentence planning operations, affecting the content
order, discourse structure and sentence structure of system responses. Finally,
we evaluate the contribution of different feature sets, and show that, in our
application, n-gram features often do as well as features based on higher-level
linguistic representations.
Improving dependency label accuracy using statistical post-editing: A cross-framework study
We present a statistical post-editing method for modifying the dependency labels in a dependency analysis. We test the method using two English datasets, three parsing systems and three labelled dependency schemes. We demonstrate how it can be used both to improve dependency label accuracy in parser output and to highlight problems with, and differences between, constituency-to-dependency conversions.
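The core idea of statistical post-editing of dependency labels can be sketched as below: learn, from aligned parser output and gold annotations, which label a parser's prediction should be rewritten to in a given local context. All labels, features, and training pairs here are invented for illustration; the paper's actual parsers and dependency schemes are not reproduced.

```python
# Sketch: a frequency-based post-editor for dependency labels.
from collections import Counter

# (parser_label, head_POS, dependent_POS) -> gold label, from hypothetical
# parser-output/gold-standard pairs.
train = [
    (("nmod", "VERB", "NOUN"), "obl"),
    (("nmod", "VERB", "NOUN"), "obl"),
    (("nmod", "NOUN", "NOUN"), "nmod"),
    (("dobj", "VERB", "NOUN"), "obj"),
]

# Count which gold label each feature context maps to.
table = {}
for feats, gold in train:
    table.setdefault(feats, Counter())[gold] += 1

def post_edit(parser_label, head_pos, dep_pos):
    """Rewrite the parser's label to the most frequent gold label seen
    in this context; fall back to the parser's own label if unseen."""
    ctx = (parser_label, head_pos, dep_pos)
    if ctx in table:
        return table[ctx].most_common(1)[0][0]
    return parser_label

print(post_edit("nmod", "VERB", "NOUN"))  # -> "obl"
```

A real system would use a trained classifier over richer features rather than a lookup table, but the rewrite-the-label-given-context structure is the same.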