Evaluating topic-word review analysis for understanding student peer review performance
© 2013 International Educational Data Mining Society. All rights reserved. Topic modeling is widely used for content analysis of textual documents. While the mined topic terms are considered as a semantic abstraction of the original text, few people evaluate the accuracy of humans’ interpretation of them in the context of an application based on the topic terms. Previously, we proposed RevExplore, an interactive peer-review analytic tool that supports teachers in making sense of large volumes of student peer reviews. To better evaluate the functionality of RevExplore, in this paper we take a closer look at its Natural Language Processing component which automatically compares two groups of reviews at the topic-word level. We employ a user study to evaluate our topic extraction method, as well as the topic-word analysis approach in the context of educational peer-review analysis. Our results show that the proposed method is better than a baseline in terms of capturing student reviewing/writing performance. While users generally identify student writing/reviewing performance correctly, participants who have prior teaching or peer-review experience tend to have better performance on our review exploration tasks, as well as higher satisfaction towards the proposed review analysis approach
Estimating the reliability of MDP policies: A confidence interval approach
Past approaches for using reinforcement learning to derive dialog control policies have assumed that there was enough collected data to derive a reliable policy. In this paper we present a methodology for numerically constructing confidence intervals for the expected cumulative reward for a learned policy. These intervals are used to (1) better assess the reliability of the expected cumulative reward, and (2) perform a refined comparison between policies derived from different Markov Decision Process (MDP) models. We applied this methodology to a prior experiment where the goal was to select the best features to include in the MDP state space. Our results show that while some of the policies developed in the prior work exhibited very large confidence intervals, the policy developed from the best feature set had a much smaller confidence interval and thus showed very high reliability. © 2007 Association for Computational Linguistics
Impact of Annotation Difficulty on Automatically Detecting Problem Localization of Peer-Review Feedback
We believe that providing assessment on students’ reviewing performance will enable students to improve the quality of their peer reviews. We focus on assessing one particular aspect of the textual feedback contained in a peer review – the presence or absence of problem localization; feedback containing problem localization has been shown to be associated with increased understanding and implementation of the feedback. While in prior work we demonstrated the feasibility of learning to predict problem localization using linguistic features automatically extracted from textual feedback, we hypothesize that inter-annotator disagreement on labeling problem localization might impact both the accuracy and the content of the predictive models. To test this hypothesis, we compare the use of feedback examples where problem localization is labeled with differing levels of annotator agreement, for both training and testing our models. Our results show that when models are trained and tested using only feedback where annotators agree on problem localization, the models both perform with high accuracy and contain rules involving just two simple linguistic features. In contrast, when training and testing using feedback examples where annotators both agree and disagree, the model performance drops slightly, but the learned rules capture more subtle patterns of problem localization. Keywords: problem localization in text comments, data mining of peer reviews, inter-annotator agreement, natural langua
Intrapersonal curiosity: Inquisitiveness about the inner self
Intrapersonal Curiosity (InC) is the desire to learn more about one’s inner self. A pool of 39 experimental InC items was administered to 988 participants (498 women), along with other measures of curiosity and personality. Three InC factors with acceptable model fit were identified, from which three internally consistent (alphas > .89) 4-item subscales were developed: “Understanding Emotions and Motives”, “Reflecting on the Past”, and “Exploring Identity and Purpose”. The InC scales correlated positively with other curiosity measures, evidencing convergent validity; divergent validity was demonstrated on the basis of weak relations to other constructs. The InC scales were positively associated with less self-awareness, poorer self-regulation, and experiences of distress, suggesting that InC tends to be higher in individuals who lack, but seek, new intrapersonal knowledge to reduce uncertainty about the self.
Adult life stage and crisis as predictors of curiosity and authenticity: Testing inferences from Erikson’s lifespan theory
During periods of developmental crisis, individuals experience uncomfortable internal incongruence and are motivated to reduce this through forms of exploration of self, other and world. From this, we inferred that crisis would relate positively to curiosity and negatively to a felt sense of authenticity. A quasi-experimental design using self-report data from a nationally representative UK sample (N = 963) of adults in early life (20-39 yrs.), midlife (40-59 yrs.) and later life (60+) showed a pattern of findings supportive of the hypotheses. Three forms of curiosity (intrapersonal, perceptual and epistemic D-type) were significantly higher, while authenticity was lower, among those currently in crisis than among those of the same age group not in crisis. Crisis was also related to curiosity about particular book genres: early adult crisis to self-help and spirituality, midlife to self-help and biography, and later life to food and eating.
Incentivizing High Quality Crowdwork
We study the causal effects of financial incentives on the quality of
crowdwork. We focus on performance-based payments (PBPs), bonus payments
awarded to workers for producing high quality work. We design and run
randomized behavioral experiments on the popular crowdsourcing platform Amazon
Mechanical Turk with the goal of understanding when, where, and why PBPs help,
identifying properties of the payment, payment structure, and the task itself
that make them most effective. We provide examples of tasks for which PBPs do
improve quality. For such tasks, the effectiveness of PBPs is not too sensitive
to the threshold for quality required to receive the bonus, while the magnitude
of the bonus must be large enough to make the reward salient. We also present
examples of tasks for which PBPs do not improve quality. Our results suggest
that for PBPs to improve quality, the task must be effort-responsive: the task
must allow workers to produce higher quality work by exerting more effort. We
also give a simple method to determine if a task is effort-responsive a priori.
Furthermore, our experiments suggest that all payments on Mechanical Turk are,
to some degree, implicitly performance-based in that workers believe their work
may be rejected if their performance is sufficiently poor. Finally, we propose
a new model of worker behavior that extends the standard principal-agent model
from economics to include a worker's subjective beliefs about his likelihood of
being paid, and show that the predictions of this model are in line with our
experimental findings. This model may be useful as a foundation for theoretical
studies of incentives in crowdsourcing markets. Comment: This is a preprint of
an article accepted for publication in WWW. © 2015 International World Wide Web Conference Committee
Risk Factors for Urgency Incontinence in Women Undergoing Stress Urinary Incontinence Surgery
Objective. To determine baseline variables associated with urgency urinary incontinence (UUI) in women presenting for stress urinary incontinence (SUI) surgery. Methods. Baseline data from two randomized trials enrolling 1,252 women were analyzed: SISTEr (fascial sling versus Burch colposuspension) and TOMUS (retropubic versus transobturator midurethral sling). Demographic data, POP-Q measures, and validated measures of symptom severity and quality of life were collected. Charlson Comorbidity Index (CCI) and Patient Health Questionnaire-9 were measured in TOMUS. Multivariate models were constructed with UUI and symptom severity as outcomes. Results. Over two-thirds of subjects reported bothersome UUI at baseline. TOMUS patients with more comorbidities had higher UDI irritative scores (CCI score 0 = 39.4, CCI score 1 = 42.1, and CCI score 2+ = 51.0, P=0.0003), and higher depression scores were associated with more severe UUI. Smoking, parity, prior incontinence surgery/treatment, prolapse stage, and incontinence episode frequency were not independently associated with UUI. Conclusions. There were no modifiable risk factors identified for patient-reported UUI in women presenting for SUI surgery. However, the direct relationships between comorbidity level, depression, and worsening of UUI/urgency symptoms may represent targets for preoperative intervention. Further research is necessary to elucidate the pathophysiologic mechanisms that explain the associations between these medical conditions and bladder function
Assessment of energy efficiency and sustainability scenarios in the transport system
Background
Energy policy is one of the main drivers of transport policy. A number of strategies to reduce current energy consumption trends in the transport sector have been designed over recent decades. They include fuel taxes, more efficient technologies, and changing travel behavior through demand regulation. But the energy market has a high degree of uncertainty, and the effectiveness of those policy options should be assessed.
Methods
A scenario-based assessment methodology has been developed within the framework of the EU project STEPS. It provides an integrated view of the energy efficiency, environmental, social, and competitiveness impacts of the different strategies. It has been applied at the European level and to five specific regions.
Concluding remarks
The results are quite site-specific. However, they show that regulation measures appear to be more effective than new technology investments. Higher energy prices could in turn produce a deterioration of competitiveness and a threat to social goals.
Copyright Legislation and Technological Change
Throughout its history, copyright law has had difficulty accommodating technological change. Although the substance of copyright legislation in this century has evolved from meetings among industry representatives whose avowed purpose was to draft legislation that provided for the future, the resulting statutes have done so poorly. The language of copyright statutes has been phrased in fact-specific language that has grown obsolete as new modes and mediums of copyrightable expression have developed. Whatever copyright statute has been on the books has been routinely, and justifiably, criticized as outmoded. In this Article, I suggest that the nature of the legislative process we have relied on for copyright revision is largely to blame for those laws' deficiencies.
Copyright, Compromise and Legislative History
Copyright law gives authors a property right. But what kind of property right? Indeed, a property right in what? The answers to these questions should be apparent from a perusal of title seventeen of the United States Code, the statute that confers the property right. Courts, however, have apparently found title seventeen an unhelpful guide. For the most part, they look elsewhere for answers, relying primarily on prior courts' constructions of an earlier and very different statute on the same subject.
