
    EveTAR: Building a Large-Scale Multi-Task Test Collection over Arabic Tweets

    This article introduces a new language-independent approach for creating a large-scale, high-quality test collection of tweets that supports multiple information retrieval (IR) tasks without running a shared-task campaign. The adopted approach (demonstrated over Arabic tweets) designs the collection around significant (i.e., popular) events, which enables the development of topics that represent frequent information needs of Twitter users for which rich content exists. That inherently facilitates the support of multiple tasks that generally revolve around events, namely event detection, ad-hoc search, timeline generation, and real-time summarization. The key highlights of the approach include diversifying the judgment pool via interactive search and multiple manually crafted queries per topic, collecting high-quality annotations via crowd-workers for relevancy and in-house annotators for novelty, filtering out low-agreement topics and inaccessible tweets, and providing multiple subsets of the collection for better availability. Applying our methodology to Arabic tweets resulted in EveTAR, the first freely available tweet test collection for multiple IR tasks. EveTAR includes a crawl of 355M Arabic tweets and covers 50 significant events for which about 62K tweets were judged with substantial average inter-annotator agreement (Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating existing algorithms in the respective tasks. Results indicate that the new collection can support reliable ranking of IR systems that is comparable to similar TREC collections, while providing strong baseline results for future studies over Arabic tweets.

    Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture

    We describe a new deep learning architecture for learning to rank question answer pairs. Our approach extends the long short-term memory (LSTM) network with holographic composition to model the relationship between question and answer representations. As opposed to the neural tensor layer that has been adopted recently, holographic composition provides a scalable and rich representation learning approach without incurring huge parameter costs. Overall, we present Holographic Dual LSTM (HD-LSTM), a unified architecture for both deep sentence modeling and semantic matching. Essentially, our model is trained end-to-end, whereby the parameters of the LSTM are optimized in a way that best explains the correlation between question and answer representations. In addition, our proposed deep learning architecture requires no extensive feature engineering. Via extensive experiments, we show that HD-LSTM outperforms many other neural architectures on two popular benchmark QA datasets. Empirical studies confirm the effectiveness of holographic composition over the neural tensor layer. Comment: SIGIR 2017 Full Paper
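
    The abstract does not spell out the composition operator. As a minimal sketch, assuming "holographic composition" means circular correlation (the operator used in holographic embedding models), the snippet below composes two equal-length vectors. The function name and the toy vectors are hypothetical and stand in for the question and answer encodings.

```python
def circular_correlation(a, b):
    """Circular correlation: out[k] = sum_i a[i] * b[(i + k) % d].

    Assumed here as the holographic composition operator; the output has
    the same dimension d as the inputs and the operator has no parameters.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    d = len(a)
    return [sum(a[i] * b[(i + k) % d] for i in range(d)) for k in range(d)]


# Hypothetical 3-dimensional question and answer encodings.
question_vec = [1.0, 2.0, 3.0]
answer_vec = [4.0, 5.0, 6.0]
composed = circular_correlation(question_vec, answer_vec)
print(composed)
```

    Because the operator itself has no weights and keeps the dimension at d, it avoids the tensor parameters that a neural tensor layer would need to model the same pairwise interactions.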

    The Nature of Novelty Detection

    Sentence-level novelty detection aims at removing redundant sentences from a sentence list. In this task, sentences that appear later in the list and carry no new meaning are eliminated. Aiming at better accuracy in detecting redundancy, this paper reveals the nature of the novelty detection task that is currently overlooked by the Novelty community: Novelty as a combination of the partial overlap (PO, two sentences sharing common facts) and complete overlap (CO, the first sentence covers all the facts of the second sentence) relations. By formalizing novelty detection as a combination of these two relations between sentences, new viewpoints toward techniques dealing with Novelty are proposed. Among the methods discussed, the similarity, overlap, pool and language modeling approaches are commonly used. Furthermore, a novel approach, the selected pool method, is provided, which follows directly from the nature of the task. Experimental results obtained on all three currently available novelty datasets showed that selected pool is significantly better or no worse than the current methods. Knowledge about the nature of the task also affects the evaluation methodologies. We propose new evaluation measures for Novelty according to the nature of the task, as well as possible directions for future study. Comment: This paper pointed out the future direction for novelty detection research. 37 pages, double spaced version
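
    The two relations defined in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that each sentence has already been reduced to a set of fact labels. The function names and the filtering rule are hypothetical and are not the paper's selected pool method.

```python
def partial_overlap(first, second):
    """PO: the two sentences share at least one common fact."""
    return bool(first & second)


def complete_overlap(first, second):
    """CO: the first sentence covers all the facts of the second."""
    return second <= first


def filter_redundant(sentences):
    """Keep a sentence unless an earlier kept sentence completely covers it.

    Illustrative rule only: each sentence is a set of hypothetical fact labels.
    """
    kept = []
    for facts in sentences:
        if not any(complete_overlap(prev, facts) for prev in kept):
            kept.append(facts)
    return kept


# Hypothetical sentence list, in reading order.
sentences = [{"a", "b"}, {"a"}, {"b", "c"}]
novel = filter_redundant(sentences)
print(novel)
```

    In this toy list the second sentence is dropped because the first covers its only fact, while the third is kept: it partially overlaps the first but adds the new fact "c".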

    Teaching and learning in live online classrooms

    Online presence of information and services is pervasive. Teaching and learning are no exception. Courseware management systems play an important role in enhancing instructional delivery for either traditional day, full-time students or non-traditional evening, part-time adult learners enrolled in online programs. While online course management tools are without doubt practical, they limit live or synchronous communication to chat rooms, whose discourse has little in common with face-to-face class communication. A more recent trend in online teaching and learning is the adoption and integration of web conferencing tools to enable live online classrooms and recreate the ethos of traditional face-to-face sessions. In this paper we present the experience we have had with the adoption of the LearnLinc® web conferencing tool, an iLinc Communications, Inc. product. We have coupled LearnLinc with Blackboard® for the online and hybrid computer science courses we offered in the past academic year in the evening undergraduate and graduate computer science programs at Rivier College. Twelve courses, enrolling over 150 students, have used the synchronous online teaching capabilities of LearnLinc. Students who took courses in the online or hybrid format could experience a level of interaction, participation, and collaboration comparable to that of traditional classes. We solicited student feedback by administering a student survey to over 100 students. The 55% response rate produced the data for this paper's study. We report on the study's findings and show students' rankings of evaluation criteria applied to hybrid and online instructional formats, with or without a web conferencing tool. Our analysis shows that students ranked LearnLinc live sessions added to Blackboard-only online classes favorably. In addition, how they learned in live online classrooms was found to be the closest to the hybrid class experience with regard to teaching practices they perceived as most important to them, such as seeking the instructor's assistance, managing time on task, and exercising problem solving skills.