
    Synopsis articles in the planning of a trilingual dictionary: Yilumbu–French–English

    A distinction is often drawn between single articles and synopsis articles. A single article is the so-called default article. It does not deviate from the traditional microstructural approach of the dictionary because it presents the minimum data for each lemma sign treated, while a synopsis article gives additional data for each treated lemma. The classical conception of the synopsis article originates with Bergenholtz, Tarp and Wiegand (1999). In their view, synopsis articles, which were devised for language for special purposes (LSP) dictionaries, have to make provision for the treatment of lemma signs functioning as superordinate terms (general lemmata or topics) as well as their hyponyms. The aim of this article is to present a new lexicographic approach to the study of synopsis articles against the background of a planned trilingual dictionary. It will be shown that synopsis articles have a more general use than their restricted application in LSP dictionaries. In fact, the focus will not be on the treatment of technical or scientific vocabulary but rather on cultural terms. This article also raises, among other things, questions about the purpose of the planned dictionary, its target users, different types of microstructures, and the different data categories to be included in synopsis articles.

    Interim report on Media Analysis

    PACHELBEL WP4 “Stimulus Materials” uses findings from WP3 (Policy Assumptions) and from additional sources to prepare stimulus materials for the group-based process to be implemented in WP5. The output, informed by the present report, will be a set of materials to inform and stimulate the group-based process. These will take the form of real or simulated media coverage and/or documentary materials produced by various sources, scenarios, vignettes, and dramatised accounts. Another output, also informed by this report, will be an individual questionnaire for use in the group-based process. The present deliverable is centred on one of the data-gathering and analytic activities set up by WP4 to identify pertinent representational elements that should be included in the future stimulus materials, country by country. “Representational elements” have been defined in WP4 as typical images, anecdotes, examples, and references which are used by policy actors to explain and justify policy choices within the policy domains pertinent to PACHELBEL. Particular attention is given to references made to citizens, their perceptions and behaviours. In Task 4.2, PACHELBEL partners gathered representational elements in their respective contexts. To support this task, a “media analysis” template was developed by WPL SYMLOG for discussion at the second Consortium project meeting (Dorking, Mo. 6). Criteria were agreed for the analysis of a selection of actual publications in a range of media (print periodicals, public information materials disseminated by authorities, etc.). In Summer 2010, partners in each country used the template to analyze and report a sample of several dozen articles in selected policy areas. This interim report (D4.2) recalls methodology (Part 1), presents representational elements country by country (Part 2) and provides a summary overview of similarities and contrasts across country samples (Part 3). Conclusions and next steps are presented in Part 4. 
Also provided are a simplified media analysis template (Annex 1) and the compiled basic frequency analysis (Annex 2).

    Knowledge driven approaches to e-learning recommendation.

    Learners often have difficulty finding and retrieving relevant learning materials to support their learning goals because of two main challenges. The vocabulary learners use to describe their goals is different from that used by domain experts in teaching materials. This challenge causes a semantic gap. Learners lack sufficient knowledge about the domain they are trying to learn about, so are unable to assemble effective keywords that identify what they wish to learn. This problem presents an intent gap. The work presented in this thesis focuses on addressing the semantic and intent gaps that learners face during an e-Learning recommendation task. The semantic gap is addressed by introducing a method that automatically creates background knowledge in the form of a set of rich learning-focused concepts related to the selected learning domain. The knowledge of teaching experts contained in e-Books is used as a guide to identify important domain concepts. The concepts represent important topics that learners should be interested in. An approach is developed which leverages the concept vocabulary for representing learning materials and this influences retrieval during the recommendation of new learning materials. The effectiveness of our approach is evaluated on a dataset of Machine Learning and Data Mining papers, and our approach outperforms benchmark methods. The results confirm that incorporating background knowledge into the representation of learning materials provides a shared vocabulary for experts and learners, and this enables the recommendation of relevant materials. We address the intent gap by developing an approach which leverages the background knowledge to identify important learning concepts that are employed for refining learners' queries. This approach enables us to automatically identify concepts that are similar to queries, and take advantage of distinctive concept terms for refining learners' queries. 
Using the refined query allows the search to focus on documents that contain topics which are relevant to the learner. An e-Learning recommender system is developed to evaluate the success of our approach using a collection of learner queries and a dataset of Machine Learning and Data Mining learning materials. Users with different levels of expertise take part in the evaluation. Results from experts, competent users and beginners all showed that our method produced documents that were consistently more relevant to learners than those produced by the standard method. The results show the benefits of using our knowledge-driven approaches to help learners find relevant learning materials.
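The query-refinement idea described in this abstract can be illustrated with a minimal sketch. It is not the thesis's method: the bag-of-words cosine similarity, the `refine_query` function, the `top_k` parameter, and the toy concept vocabulary are all assumptions chosen only to show how terms from the concept most similar to a query might be appended to that query.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def refine_query(query: str, concepts: dict[str, str], top_k: int = 1) -> list[str]:
    """Append terms from the top_k most similar concepts to the query terms.

    `concepts` maps a hypothetical concept name to a space-separated string
    of terms describing it (a stand-in for background knowledge).
    """
    query_terms = query.lower().split()
    q_vec = Counter(query_terms)
    # Rank concepts by similarity to the query, most similar first.
    ranked = sorted(
        concepts,
        key=lambda name: cosine(q_vec, Counter(concepts[name].lower().split())),
        reverse=True,
    )
    extra: list[str] = []
    for name in ranked[:top_k]:
        for term in concepts[name].lower().split():
            if term not in q_vec and term not in extra:
                extra.append(term)
    return query_terms + extra


# Toy vocabulary, invented for illustration only.
toy_concepts = {
    "clustering": "clustering kmeans centroid unsupervised groups",
    "classification": "classification labels supervised decision tree",
}
refined = refine_query("how to put data into groups", toy_concepts)
print(refined)
```

In this toy case the learner's wording shares only the term "groups" with the clustering concept, yet the refined query gains the expert terms attached to that concept, which is the kind of vocabulary bridging the abstract describes.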

    Building Accurate Low Latency ASR for Streaming Voice Search

    Automatic Speech Recognition (ASR) plays a crucial role in voice-based applications. For applications requiring real-time feedback like Voice Search, streaming capability becomes vital. While LSTM/RNN and CTC based ASR systems are commonly employed for low-latency streaming applications, they often exhibit lower accuracy compared to state-of-the-art models due to a lack of future audio frames. In this work, we focus on developing accurate LSTM, attention, and CTC based streaming ASR models for large-scale Hinglish (a blend of Hindi and English) Voice Search. We investigate various modifications in vanilla LSTM training which enhance the system's accuracy while preserving its streaming capabilities. We also address the critical requirement of end-of-speech (EOS) detection in streaming applications. We present a simple training and inference strategy for end-to-end CTC models that enables joint ASR and EOS detection. The evaluation of our model on Flipkart's Voice Search, which handles substantial traffic of approximately 6 million queries per day, demonstrates significant performance gains over the vanilla LSTM-CTC model. Our model achieves a word error rate (WER) of 3.69% without EOS and 4.78% with EOS while also reducing the search latency by approximately 1300 ms (a 46.64% reduction) when compared to an independent voice activity detection (VAD) model.
    Comment: Accepted at ACL 2023 Industry Track
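For readers unfamiliar with the word error rate (WER) metric quoted in this abstract, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name and the example sentences are assumptions for illustration; the paper's own evaluation tooling is not described in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Invented example: one substitution ("songs" -> "song") and one deletion ("singh").
wer = word_error_rate("play songs by arijit singh", "play song by arijit")
print(f"WER = {wer:.2%}")
```

Because insertions count as errors while the denominator is the reference length, WER can exceed 100% for very noisy hypotheses.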

    Investigating people: a qualitative analysis of the search behaviours of open-source intelligence analysts

    The Internet and the World Wide Web have become integral parts of the lives of many modern individuals, enabling almost instantaneous communication, sharing and broadcasting of thoughts, feelings and opinions. Much of this information is publicly facing, and as such, it can be utilised in a multitude of online investigations, ranging from employee vetting and credit checking to counter-terrorism and fraud prevention/detection. However, the search needs and behaviours of these investigators are not well documented in the literature. In order to address this gap, an in-depth qualitative study was carried out in cooperation with a leading investigation company. The research contribution is an initial identification of Open-Source Intelligence investigator search behaviours, the procedures and practices that they undertake, along with an overview of the difficulties and challenges that they encounter as part of their domain. This lays the foundation for future research into the varied domain of Open-Source Intelligence gathering.