
    Controlling Risk of Web Question Answering

    Web question answering (QA) has become an indispensable component of modern search systems, and can significantly improve users' search experience by providing a direct answer to their information needs. This is typically achieved by applying machine reading comprehension (MRC) models over the retrieved passages to extract answers with respect to the search query. With the development of deep learning techniques, state-of-the-art MRC performance has been achieved by recent deep methods. However, existing studies on MRC seldom address the predictive-uncertainty issue, i.e., how likely the prediction of an MRC model is to be wrong, leading to uncontrollable risks in real-world Web QA applications. In this work, we first conduct an in-depth investigation of the risk of Web QA. We then introduce a novel risk control framework, which consists of a qualification model for uncertainty estimation based on the probing idea, and a decision model for selective output. For evaluation, we introduce risk-related metrics, rather than the traditional EM and F1 used in MRC, to assess risk-aware Web QA. Empirical results on both a real-world Web QA dataset and an academic MRC benchmark collection demonstrate the effectiveness of our approach.
    Comment: 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
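    The core idea of selective output can be sketched in a few lines: answer only when the estimated risk stays within a budget, and evaluate with coverage/risk rather than plain EM or F1. This is a minimal illustration, not the paper's qualify model; the confidence scores and threshold below are hypothetical placeholders.

```python
# Sketch of selective answering: abstain when the estimated probability
# that the top answer is wrong exceeds a risk budget.

def selective_answer(candidates, risk_threshold=0.5):
    """candidates: list of (answer_text, confidence) pairs from an MRC model.
    Returns the best answer, or None (abstain) if estimated risk is too high."""
    if not candidates:
        return None
    answer, confidence = max(candidates, key=lambda c: c[1])
    estimated_risk = 1.0 - confidence  # placeholder uncertainty estimate
    return answer if estimated_risk <= risk_threshold else None

# Risk-related evaluation: coverage = fraction of questions answered,
# risk = error rate among the answered questions.
def coverage_and_risk(decisions, correctness):
    answered = [ok for d, ok in zip(decisions, correctness) if d is not None]
    coverage = len(answered) / len(decisions)
    risk = 1 - sum(answered) / len(answered) if answered else 0.0
    return coverage, risk
```

    A system tuned this way trades coverage for risk: lowering the threshold answers fewer questions but makes fewer of those answers wrong.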

    Determinants of quality, latency, and amount of Stack Overflow answers about recent Android APIs.

    Stack Overflow is a popular crowdsourced question-and-answer website for programming-related issues. It is an invaluable resource for software developers; on average, questions posted there get answered within minutes to an hour. Questions about well-established topics, e.g., the coercion operator in C++, or the difference between canonical and class names in Java, get asked often in one form or another, and answered very quickly. On the other hand, questions on previously unseen or niche topics take a while to get a good answer. This is particularly the case with questions about recent updates to, or the introduction of, new application programming interfaces (APIs). In a hyper-competitive online market, getting good answers to current programming questions sooner could increase the chances of an app getting released and used. So, can developers somehow hasten the arrival of good answers to questions about new APIs? Here, we empirically study Stack Overflow questions pertaining to new Android APIs and their associated answers. We contrast the interest in these questions, their answer quality, and the timeliness of their answers with questions about old APIs. We find that Stack Overflow answerers in general prioritize with respect to currentness: questions about new APIs do get more answers, but good-quality answers take longer. We also find that incentives in the form of question bounties, if used appropriately, can significantly shorten the time to a good answer and increase answer quality. Interestingly, no operationalization of bounty amount shows significance in our models. In practice, our findings confirm the value of bounties in enhancing expert participation. In addition, they show that the Stack Overflow style of crowdsourcing, for all its glory in providing answers about established programming knowledge, is less effective with new API questions.

    Technology in Practice (Section 2.31 of the Comprehensive Clinical Psychology: Vol. 2. Professional Issues)

    The contemporary practice of psychology requires a prudent balance of traditional and emerging communication methods. Interpersonal interactions in the context of human relationship (e.g., speech, emotional expressions, and nonverbal gestures) have been a vital part of emotional healing throughout many centuries, and research findings in the 1990s underscore the importance of relational factors in effective psychological interventions (Whiston & Sexton, 1993). In addition to the time-honored interpersonal communication methods of professional psychology, rapid technological advances have propelled psychologists into another sphere of communication. Today's professional psychologist is increasingly expected to attain mastery in both of these communication methods: the very old and the very new.

    Experimental Tests of Survey Responses to Expenditure Questions

    This paper tests for a number of survey effects in the elicitation of expenditure items. In particular, we examine the extent to which individuals use features of the expenditure question to construct their answers. We test whether respondents interpret question wording as researchers intend, and examine the extent to which prompts, clarifications, and seemingly arbitrary features of survey design influence expenditure reports. We find that over one quarter of respondents have difficulty distinguishing between “you” and “your household” when making expenditure reports; that respondents report higher pro-rata expenditure when asked to give responses on a weekly as opposed to a monthly or annual time scale; that respondents give higher estimates when using a scale with a higher mid-point; and that respondents report higher aggregated expenditure when categories are presented in a disaggregated form. In summary, expenditure reports are constructed using convenient rules of thumb and available information, which will depend on the characteristics of the respondent, the expenditure domain, and features of the survey question. It is crucial to further account for these features in ongoing surveys.
    Keywords: expenditure surveys, survey design, data experiments
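    The pro-rata effect above is purely arithmetic, so it can be made concrete with a small illustrative calculation. The amounts below are hypothetical and not drawn from the paper:

```python
# Annualize an expenditure report given its recall period.
WEEKS_PER_YEAR = 52
MONTHS_PER_YEAR = 12

def annualize(amount, period):
    """Scale a reported amount to a yearly figure (pro rata)."""
    factor = {"week": WEEKS_PER_YEAR, "month": MONTHS_PER_YEAR, "year": 1}[period]
    return amount * factor
```

    For example, a respondent reporting 50 per week on groceries implies 50 × 52 = 2,600 per year, while a report of 180 per month for the same item implies only 180 × 12 = 2,160; the weekly framing yields the higher pro-rata figure, which is the pattern the paper reports.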

    Effects of study design and allocation on participant behaviour-ESDA: study protocol for a randomized controlled trial

    Background: What study participants think about the nature of a study has been hypothesised to affect subsequent behaviour and to potentially bias study findings. In this trial, we examine the impact of awareness of study design and allocation on participant drinking behaviour. Methods/Design: A three-arm parallel-group randomised controlled trial design will be used. All recruitment, screening, randomisation, and follow-up will be conducted on-line among university students. Participants who indicate a hazardous level of alcohol consumption will be randomly assigned to one of three groups. Group A will be informed their drinking will be assessed at baseline and again in one month (as in a cohort study design). Group B will be told the study is an intervention trial and they are in the control group. Group C will be told the study is an intervention trial and they are in the intervention group. All will receive exactly the same brief educational material to read. After one month, alcohol intake for the past 4 weeks will be assessed. Discussion: The experimental manipulations address subtle and previously unexplored ways in which participant behaviour may be unwittingly influenced by standard practice in trials. Given the necessity of relying on self-reported outcomes, it will not be possible to distinguish true behaviour change from reporting artefact. This does not matter in the present study, as any effects of awareness of study design or allocation involve bias that is not well understood. There has been little research on awareness effects, and our outcomes will provide an indication of the possible value of further studies of this type and inform hypothesis generation.

    Life in children's homes: a report of children's experience by the Children's Rights Director for England


    Using Internet in Stated Preference Surveys: A Review and Comparison of Survey Modes

    Internet is quickly becoming the survey mode of choice for stated preference (SP) surveys in environmental economics. However, this choice is being made with relatively little consideration of its potential influence on survey results. This paper reviews the theory and emerging evidence of mode effects in the survey-methodology and SP literatures, summarizes the findings, and points out implications for Internet SP practice and research. The SP studies that compare Internet with other modes generally do not find substantial differences. The majority of welfare estimates are equal, or somewhat lower, for the Internet surveys. Further, there is no clear evidence of substantially lower quality or validity of Internet responses. However, the degree of experimental control is often low in comparative studies across survey modes, and they often confound measurement and sample-composition effects. Internet offers a huge potential for experimentation and innovation in SP research, but when it is used to derive reliable welfare estimates for policy assessment, issues like representation and nonresponse bias for different Internet panels should receive more attention.
    Keywords: Internet; survey mode; contingent valuation; stated preferences

    Can cheap panel-based internet surveys substitute costly in-person interviews in CV surveys?

    With the current growth in broadband penetration, Internet is likely to be the data collection mode of choice for stated preference research in the not so distant future. However, little is known about how this survey mode may influence data quality and welfare estimates. In the first controlled field experiment to date, conducted as part of a national contingent valuation (CV) survey estimating willingness to pay (WTP) for biodiversity protection plans, we assign two groups sampled from the same panel of respondents either to an Internet or an in-person (in-house) interview mode. Our design is better able than previous studies to isolate measurement effects from sample composition effects. We find little evidence of social desirability bias in the in-person interview setting or of satisficing (shortcutting the response process) in the Internet survey. The share of “don’t knows”, zeros, and protest responses to the WTP question with a payment card is very similar between modes. Equality of mean WTP between samples cannot be rejected. Considering equivalence, we can reject that mean WTP from the in-person sample is more than 30% higher. Results are quite encouraging for the use of Internet in CV, as stated preferences do not seem to be significantly different or biased compared to in-person interviews.
    Keywords: Internet; contingent valuation; interviews; survey mode; willingness to pay