    Distributional Measures of Semantic Distance: A Survey

    The ability to mimic human notions of semantic distance has widespread applications. Some measures rely only on raw text (distributional measures) and some rely on knowledge sources such as WordNet. Although extensive studies have been performed to compare WordNet-based measures with human judgment, the use of distributional measures as proxies to estimate semantic distance has received little attention. Even though they have traditionally performed poorly when compared to WordNet-based measures, they lay claim to certain uniquely attractive features, such as their applicability in resource-poor languages and their ability to mimic both semantic similarity and semantic relatedness. Therefore, this paper presents a detailed study of distributional measures. Particular attention is paid to fleshing out the strengths and limitations of both WordNet-based and distributional measures, and to how distributional measures of distance can be brought more in line with human notions of semantic distance. We conclude with a brief discussion of recent work on hybrid measures.
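
    As a rough illustration of what a distributional measure computes from raw text alone, the sketch below builds co-occurrence count vectors for two words and takes one minus their cosine similarity as a distance. This is only one common choice and is not claimed to be the specific measure the survey favors; the function names, the window size, and the toy sentence are assumptions made for illustration.

```python
from collections import Counter
import math


def cooccurrence_vector(tokens, target, window=2):
    """Count the words that occur within `window` positions of `target`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a word with no observed context shares nothing
    return dot / (norm_u * norm_v)


def distributional_distance(tokens, a, b, window=2):
    """Illustrative distance: 1 - cosine similarity of context vectors."""
    return 1.0 - cosine_similarity(
        cooccurrence_vector(tokens, a, window),
        cooccurrence_vector(tokens, b, window),
    )


# Hypothetical toy corpus, for illustration only.
tokens = "the cat drinks milk the dog drinks water".split()
cat_dog = distributional_distance(tokens, "cat", "dog")
cat_water = distributional_distance(tokens, "cat", "water")
```

    On this toy corpus "cat" and "dog" share more context words than "cat" and "water", so the first distance comes out smaller. Real systems typically use large corpora and weighted counts rather than raw frequencies.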

    Two knowledge-based methods for High-Performance Sense Distribution Learning

    Knowing the correct distribution of senses within a corpus can potentially boost the performance of Word Sense Disambiguation (WSD) systems by many points. We present two fully automatic and language-independent methods for computing the distribution of senses given a raw corpus of sentences. Intrinsic and extrinsic evaluations show that our methods outperform the current state of the art in sense distribution learning and the strongest baselines for the most frequent sense in multiple languages and on domain-specific test sets. Our sense distributions are available at http://trainomatic.org.

    Global disease monitoring and forecasting with Wikipedia

    Infectious disease is a leading threat to public health, economic stability, and other key social structures. Efforts to mitigate these impacts depend on accurate and timely monitoring to measure the risk and progress of disease. Traditional, biologically-focused monitoring techniques are accurate but costly and slow; in response, new techniques based on social internet data such as social media and search queries are emerging. These efforts are promising, but important challenges in the areas of scientific peer review, breadth of diseases and countries, and forecasting hamper their operational usefulness. We examine a freely available, open data source for this use: access logs from the online encyclopedia Wikipedia. Using linear models, language as a proxy for location, and a systematic yet simple article selection procedure, we tested 14 location-disease combinations and demonstrate that these data feasibly support an approach that overcomes these challenges. Specifically, our proof-of-concept yields models with $r^2$ up to 0.92, forecasting value up to the 28 days tested, and several pairs of models similar enough to suggest that transferring models from one location to another without re-training is feasible. Based on these preliminary results, we close with a research agenda designed to overcome these challenges and produce a disease monitoring and forecasting system that is significantly more effective, robust, and globally comprehensive than the current state of the art. Comment: 27 pages; 4 figures; 4 tables. Version 2: Cite McIver & Brownstein and adjust novelty claims accordingly; revise title; various revisions for clarit
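
    To make the idea of a linear model scored by $r^2$ and shifted in time for forecasting concrete, here is a minimal sketch. It fits a one-predictor least-squares line, which is a simplification and not the authors' actual model. The function names and the notion of pairing page views at time t with case counts at time t + lag are assumptions for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def lagged_pairs(views, cases, lag):
    """Pair views at time t with cases at time t + lag (lag >= 0).

    Fitting on these pairs asks how well today's views explain
    case counts `lag` steps in the future.
    """
    if lag == 0:
        return list(views), list(cases)
    return list(views[:-lag]), list(cases[lag:])
```

    A forecast horizon is then evaluated by calling `lagged_pairs` with the desired lag, fitting with `fit_line`, and scoring with `r_squared`.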

    Insights from Machine-Learned Diet Success Prediction

    To support people trying to lose weight and stay healthy, more and more fitness apps have sprung up, including the ability to track both calorie intake and expenditure. Users of such apps are part of a wider "quantified self" movement and many opt in to publicly share their logged data. In this paper, we use public food diaries of more than 4,000 long-term active MyFitnessPal users to study the characteristics of a (un-)successful diet. Concretely, we train a machine learning model to predict repeatedly being over or under self-set daily calorie goals and then look at which features contribute to the model's prediction. Our findings include expected results, such as the token "mcdonalds" or the category "dessert" being indicative of being over the calorie goal, but also less obvious ones, such as the difference between pork and poultry concerning dieting success, or the use of the "quick added calories" functionality being indicative of over-shooting calorie-wise. This study also hints at the feasibility of using such data for more in-depth data mining, e.g., looking at the interaction between consumed foods such as mixing protein- and carbohydrate-rich foods. To the best of our knowledge, this is the first systematic study of public food diaries. Comment: Preprint of an article appearing at the Pacific Symposium on Biocomputing (PSB) 2016 in the Social Media Mining for Public Health Monitoring and Surveillance trac

    Strategic I/O Psychology and the Role of Utility Analysis Models

    In the 1990s, the significance of human capital in organizations has been increasing, and measurement issues in human resource management have achieved significant prominence. Yet, I/O psychology research on utility analysis and measurement has actually declined. In this chapter we propose a decision-based framework to review developments in utility analysis research since 1991, and show that through the lens of this framework there are many fertile avenues for research. We then show that both I/O psychology and strategic HRM research and practice can be enhanced by greater collaboration and integration, particularly regarding the link between human capital and organizational success. We present an integrative framework as the basis for that integration, and illustrate its implications for future research.

    Development and evaluation of a web-based learning system based on learning object design and generative learning to improve higher-order thinking skills and learning

    This research aims to design, develop and evaluate the effectiveness of a Web-based learning system prototype called the Generative Object Oriented Design (GOOD) learning system. Results from the preliminary study showed that most of the students were at the level of lower-order thinking skills (LOTS) rather than higher-order thinking skills (HOTS), based on Bloom’s Taxonomy. Given this concern, the GOOD learning system was designed and developed based on learning object design and generative learning to improve HOTS and learning. A conceptual design model of the GOOD learning system, called the Generative Learning Object Organizer and Thinking Tasks (GLOOTT) model, has been proposed from the theoretical framework of this research. The topic selected for this research was Computer System (CS), which focused on the hardware concepts from the first-year Diploma of Computer Science subjects. The GOOD learning system acts as a mindtool to improve HOTS and learning in CS. A pre-experimental research design of one-group pretest and posttest was used in this research. The sample comprised 30 students and 12 lecturers. Data were collected from the pretest, posttest, portfolio, interview and Web-based learning system evaluation form. The paired-samples t-test was used to analyze achievement on the pretest and posttest, and the results showed a significant difference between the mean pretest and posttest scores at the significance level α = 0.05 (p = 0.000). In addition, the paired-samples t-test analysis of the cognitive operations from Bloom’s Taxonomy showed a significant difference for each of the students’ cognitive operations before and after using the GOOD learning system. Results from the study showed improvement of HOTS and learning among the students. Besides, analysis of the portfolios showed that the students engaged HOTS during the use of the system. Most of the students and lecturers gave positive comments about the effectiveness of the system in improving HOTS and learning in CS. From the findings in this research, the GOOD learning system has the potential to improve students’ HOTS and learning.
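
    For readers unfamiliar with the paired-samples t-test mentioned in this abstract, the following sketch computes the test statistic from matched pretest and posttest scores. It is a generic textbook formulation, not the study's analysis script; the function name is an assumption, and no data from the study are reproduced.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a paired-samples t-test.

    t = mean(d) / (stdev(d) / sqrt(n)), where d are the per-subject
    differences (after - before) and stdev is the sample standard deviation.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    if sd_d == 0:
        raise ValueError("t is undefined when all differences are equal")
    return mean_d / (sd_d / math.sqrt(n)), n - 1
```

    The resulting t value is compared against the t distribution with n - 1 degrees of freedom to obtain a p-value, which is then judged against the chosen significance level such as α = 0.05.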