22,736 research outputs found

    Training Curricula for Open Domain Answer Re-Ranking

    In precision-oriented tasks like answer ranking, it is more important to rank many relevant answers highly than to retrieve all relevant answers. It follows that a good ranking strategy would be to learn how to identify the easiest correct answers first (i.e., assign a high ranking score to answers that have characteristics that usually indicate relevance, and a low ranking score to those with characteristics that do not), before incorporating more complex logic to handle difficult cases (e.g., semantic matching or reasoning). In this work, we apply this idea to the training of neural answer rankers using curriculum learning. We propose several heuristics to estimate the difficulty of a given training sample. We show that the proposed heuristics can be used to build a training curriculum that down-weights difficult samples early in the training process. As the training process progresses, our approach gradually shifts to weighting all samples equally, regardless of difficulty. We present a comprehensive evaluation of our proposed idea on three answer ranking datasets. Results show that our approach leads to superior performance of two leading neural ranking architectures, namely BERT and ConvKNRM, using both pointwise and pairwise losses. When applied to a BERT-based ranker, our method yields up to a 4% improvement in MRR and a 9% improvement in P@1 (compared to the model trained without a curriculum). This results in models that can achieve comparable performance to more expensive state-of-the-art techniques.
    Comment: Accepted at SIGIR 2020 (long
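
The weighting idea in the abstract above (down-weight difficult samples early, then move towards equal weights) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear schedule, the `easiness` score in [0, 1], and the names `curriculum_weight`, `weighted_loss` and `warmup_epochs` are all hypothetical.

```python
def curriculum_weight(easiness: float, epoch: int, warmup_epochs: int) -> float:
    """Per-sample weight under an assumed linear curriculum schedule.

    easiness: heuristic score in [0, 1]; 1.0 means the sample looks easy.
    epoch: current training epoch, starting at 0.
    warmup_epochs: number of epochs over which the curriculum is phased out.
    """
    if not 0.0 <= easiness <= 1.0:
        raise ValueError("easiness must be in [0, 1]")
    # After the warm-up period every sample counts equally.
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return 1.0
    progress = epoch / warmup_epochs
    # Interpolate from the easiness score (epoch 0) towards 1.0.
    return easiness + progress * (1.0 - easiness)


def weighted_loss(losses, easiness_scores, epoch, warmup_epochs):
    """Mean of per-sample losses, each scaled by its curriculum weight."""
    weights = [curriculum_weight(e, epoch, warmup_epochs) for e in easiness_scores]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)
```

With this schedule a sample scored as hard (easiness near 0) contributes almost nothing to the first epoch and reaches full weight once `epoch` equals `warmup_epochs`, while easy samples are close to full weight throughout.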

    Relation Discovery from Web Data for Competency Management

    This paper describes a technique for automatically discovering associations between people and expertise from an analysis of very large data sources (including web pages, blogs and emails), using a family of algorithms that perform accurate named-entity recognition, assign different weights to terms according to an analysis of document structure, and assess distances between terms in a document. My contribution is to add a social networking approach called BuddyFinder, which relies on associations within a large enterprise-wide "buddy list" to help delimit the search space and also to provide a form of "social triangulation" whereby the system can discover documents from your colleagues that contain pertinent information about you. This work has been influential in the information retrieval community generally, as it is the basis of a landmark system that achieved overall first place in every category in the Enterprise Search Track of TREC2006

    Multinational perspectives on information technology from academia and industry

    As the term 'information technology' has many meanings for various stakeholders and continues to evolve, this work presents a comprehensive approach for developing curriculum guidelines for rigorous, high quality, bachelor's degree programs in information technology (IT) to prepare successful graduates for a future global technological society. The aim is to address three research questions in the context of IT concerning (1) the educational frameworks relevant for academics and students of IT, (2) the pathways into IT programs, and (3) graduates' preparation for meeting future technologies. The analysis of current trends comes from survey data of IT faculty members and professional IT industry leaders. With these analyses, the IT Model Curricula of CC2005, IT2008, IT2017, extensive literature review, and the multinational insights of the authors into the status of IT, this paper presents a comprehensive overview and discussion of future directions of global IT education toward 2025

    GripRank: Bridging the Gap between Retrieval and Generation via the Generative Knowledge Improved Passage Ranking

    Retrieval-enhanced text generation, which aims to leverage passages retrieved from a large passage corpus for delivering a proper answer given the input query, has shown remarkable progress on knowledge-intensive language tasks such as open-domain question answering and knowledge-enhanced dialogue generation. However, the retrieved passages are not ideal for guiding answer generation because of the discrepancy between retrieval and generation, i.e., the candidate passages are all treated equally during the retrieval procedure without considering their potential to generate the proper answers. This discrepancy makes a passage retriever deliver a sub-optimal collection of candidate passages to generate answers. In this paper, we propose the GeneRative Knowledge Improved Passage Ranking (GripRank) approach, addressing the above challenge by distilling knowledge from a generative passage estimator (GPE) to a passage ranker, where the GPE is a generative language model used to measure how likely the candidate passages are to generate the proper answer. We realize the distillation procedure by teaching the passage ranker to rank the passages in the order given by the GPE. Furthermore, we improve the distillation quality by devising a curriculum knowledge distillation mechanism, which allows the knowledge provided by the GPE to be progressively distilled to the ranker through an easy-to-hard curriculum, enabling the passage ranker to correctly recognize the provenance of the answer from many plausible candidates. We conduct extensive experiments on four datasets across three knowledge-intensive language tasks. Experimental results show advantages over the state-of-the-art methods for both passage ranking and answer generation on the KILT benchmark.
    Comment: 11 pages, 4 figure
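
To make the distillation step in the abstract above more concrete, here is a rough sketch of a listwise distillation loss in which a ranker (student) is trained to match the ordering implied by teacher scores. It uses a ListNet-style cross-entropy between softmax distributions as a stand-in; the abstract does not state which loss GripRank uses, and the names `log_softmax` and `listwise_distillation_loss` are hypothetical.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    log_total = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_total for s in scores]


def listwise_distillation_loss(teacher_scores, student_scores):
    """Cross-entropy between teacher and student distributions over passages.

    teacher_scores: one score per candidate passage from the teacher
        (standing in for the generative passage estimator).
    student_scores: one score per candidate passage from the ranker.
    The loss is smallest when the student's distribution equals the teacher's.
    """
    if len(teacher_scores) != len(student_scores):
        raise ValueError("score lists must have the same length")
    teacher_probs = [math.exp(lp) for lp in log_softmax(teacher_scores)]
    student_log_probs = log_softmax(student_scores)
    return -sum(p * lq for p, lq in zip(teacher_probs, student_log_probs))
```

An easy-to-hard curriculum could then be layered on top, for example by controlling which candidate passages enter the list at each stage of training; how the paper does this is not specified in the abstract.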

    Learning requirements engineering within an engineering ethos

    An interest in educating software developers within an engineering ethos may not align well with the characteristics of the discipline, nor address the underlying concerns of software practitioners. Education for software development needs to focus on creativity, adaptability and the ability to transfer knowledge. A change in the way learning is undertaken in a core Software Engineering unit within a university's engineering program demonstrates one attempt to provide students with a solid foundation in subject matter while at the same time exposing them to these real-world characteristics. It provides students with a process to deal with problems within a metacognitive-rich framework that makes complexity apparent and lets students deal with it adaptively. The results indicate that, while the approach is appropriate, student-learning characteristics need to be investigated further, so that the two aspects of learning may be aligned more closely

    On the Internationalization of CAD Learning Through an English Glossary

    Paper presented at the XXIX Congreso International INGEGRAF 2019 "La transformación Digital en la Ingeniería Gráfica" (20-21 June 2019, Logroño, La Rioja).
    The internationalization of higher education is an essential factor in improving the quality and efficiency of Spanish universities, providing students with the main skills and knowledge to interact effectively as professionals in an international and multicultural work context. The internationalization of universities must be a transversal process, not exclusive to its territorial dimension, aimed at advancing towards a society and a knowledge economy that foster a solid and stable model of development and growth. To this end, professors in the area of Graphic Expression for Engineering at the Universitat Jaume I (UJI) have developed an online glossary of specific terms in English related to the 3D modelling CAD tools used in Graphic Engineering subjects. This new online tool seeks to help students increase their technical vocabulary in English and improve their learning and communication skills in preparation for possible collaborations in future European projects. The glossary is introduced to the students weekly during the course. The students are then surveyed to verify the effectiveness of the training. This work collects the results and conclusions of this analysis

    Career inclination towards entrepreneurship among Bumiputera students at Politeknik Sultan Haji Ahmad Shah, Kuantan, Pahang

    Entrepreneurs play an important role in a country's economic development. However, in the current scenario, many Bumiputera remain uninterested in entering entrepreneurship. This study therefore investigates the level of career inclination towards entrepreneurship among final-year Bumiputera students of the Diploma in Accountancy at Politeknik Sultan Haji Ahmad Shah, Kuantan (POLISAS). Specifically, it examines the extent to which Bumiputera students' ambition to become entrepreneurs is influenced by the students' personal characteristics, family factors, study factors, work experience and environmental factors. The respondents were 51 Bumiputera students of POLISAS. Data were collected using a questionnaire and analysed using T-test, Crosstabs and Pearson correlation procedures in SPSS (Statistical Package For Social Sciences). The findings show that only environmental factors (that is, the parties who most influence students to go into business) attract students to become involved in business. The other factors were found to contribute little to attracting students towards entrepreneurship. Several recommendations are therefore made to address this problem so that the Bumiputera community does not fall far behind other communities and, in turn, to fulfil the government's aspiration of realising the goals of the Dasar Ekonomi Baru (DEB, New Economic Policy), which have still not been fully achieved to this day