1,454 research outputs found

    Global disease monitoring and forecasting with Wikipedia

    Infectious disease is a leading threat to public health, economic stability, and other key social structures. Efforts to mitigate these impacts depend on accurate and timely monitoring to measure the risk and progress of disease. Traditional, biologically-focused monitoring techniques are accurate but costly and slow; in response, new techniques based on social internet data such as social media and search queries are emerging. These efforts are promising, but important challenges in the areas of scientific peer review, breadth of diseases and countries, and forecasting hamper their operational usefulness. We examine a freely available, open data source for this use: access logs from the online encyclopedia Wikipedia. Using linear models, language as a proxy for location, and a systematic yet simple article selection procedure, we tested 14 location-disease combinations and demonstrate that these data feasibly support an approach that overcomes these challenges. Specifically, our proof-of-concept yields models with r^2 up to 0.92, forecasting value up to the 28 days tested, and several pairs of models similar enough to suggest that transferring models from one location to another without re-training is feasible. Based on these preliminary results, we close with a research agenda designed to overcome these challenges and produce a disease monitoring and forecasting system that is significantly more effective, robust, and globally comprehensive than the current state of the art. Comment: 27 pages; 4 figures; 4 tables. Version 2: Cite McIver & Brownstein and adjust novelty claims accordingly; revise title; various revisions for clarity.
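
    The abstract above describes fitting linear models that relate Wikipedia access counts to disease incidence, scoring them by r^2, and testing forecasts at a time offset. A minimal sketch of that general idea follows. It is not the authors' code: the function names, the single-predictor model, and the synthetic series are all assumptions made for illustration.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def r_squared(x, y, a, b):
    """Coefficient of determination of the fitted line on (x, y)."""
    my = mean(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def lagged_pairs(views, cases, lag):
    """Pair views at time t with cases at time t + lag (lag >= 0)."""
    if lag == 0:
        return list(views), list(cases)
    return list(views[:-lag]), list(cases[lag:])

# Synthetic toy series (hypothetical, not real data): case counts
# follow article views by exactly two time steps.
views = [10, 12, 15, 20, 26, 30, 28, 22, 16, 12]
cases = [0, 0] + [3 * v + 5 for v in views[:-2]]

# Fit with no offset, then with a two-step forecasting offset.
x0, y0 = lagged_pairs(views, cases, 0)
a0, b0 = fit_line(x0, y0)
r2_lag0 = r_squared(x0, y0, a0, b0)

x2, y2 = lagged_pairs(views, cases, 2)
a2, b2 = fit_line(x2, y2)
r2_lag2 = r_squared(x2, y2, a2, b2)
```

    Comparing `r2_lag0` with `r2_lag2` shows how a fit can be scored at different forecasting offsets; on this toy series the two-step offset recovers the generating line.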

    Kuaipedia: a Large-scale Multi-modal Short-video Encyclopedia

    Online encyclopedias, such as Wikipedia, have been well developed and researched over the last two decades. One can find attributes and other information about a wiki item on a wiki page edited by a community of volunteers. However, traditional text, images, and tables can hardly express some aspects of a wiki item. For example, when we talk about "Shiba Inu", one may care more about "How to feed it" or "How to train it not to protect its food". Currently, short-video platforms have become a hallmark of the online world. Whether you're on TikTok, Instagram, Kuaishou, or YouTube Shorts, short-video apps have changed how we consume and create content today. Beyond short videos made for entertainment, more and more authors share insightful knowledge across all walks of life. These short videos, which we call knowledge videos, can easily express any aspect (e.g. hair or how-to-feed) consumers want to know about an item (e.g. Shiba Inu), and they can be systematically analyzed and organized like an online encyclopedia. In this paper, we propose Kuaipedia, a large-scale multi-modal encyclopedia consisting of items, aspects, and short videos linked to them, which was extracted from billions of videos on Kuaishou (Kwai), a well-known short-video platform in China. We first collected items from multiple sources and mined user-centered aspects from millions of users' queries to build an item-aspect tree. Then we propose a new task called "multi-modal item-aspect linking" as an expansion of "entity linking" to link short videos into item-aspect pairs and build the whole short-video encyclopedia. Intrinsic evaluations show that our encyclopedia is of large scale and highly accurate. We also conduct sufficient extrinsic experiments to show how Kuaipedia can help fundamental applications such as entity typing and entity linking.

    CBEAF-Adapting: Enhanced Continual Pretraining for Building Chinese Biomedical Language Model

    Continual pretraining is a standard way of building a domain-specific pretrained language model from a general-domain language model. However, sequential task training may cause catastrophic forgetting, which affects the model performance in downstream tasks. In this paper, we propose a continual pretraining method for BERT-based models, named CBEAF-Adapting (Chinese Biomedical Enhanced Attention-FFN Adapting). Its main idea is to introduce a small number of attention heads and hidden units inside each self-attention layer and feed-forward network. Using the Chinese biomedical domain as a running example, we trained a domain-specific language model named CBEAF-RoBERTa. We conduct experiments by applying models to downstream tasks. The results demonstrate that with only about 3% of model parameters trained, our method achieves average performance gains of about 0.5% and 2% over the best-performing baseline model and the domain-specific model PCL-MedBERT, respectively. We also examine the forgetting problem of different pretraining methods. Our method alleviates the problem by about 13% compared to fine-tuning.

    Hasil Plagiasi artikel Anchos


    STRATEGIE I NARZĘDZIA TŁUMACZENIA PRAWNICZEGO

    The article deals with translation strategies in their relation to translation tools. It reflects the theoretical requirements for professional legal translations in the light of legal-linguistic equivalence and skopos theory. The author stresses that developing translatorial strategies, as well as designing and using translation tools, are theory-dependent activities. What remains to be developed is an explicit model of the particular translatorial strategies, so far followed only implicitly, in relation to all types of translation tools. In the institutional setting, the relevant translatorial strategies are influenced by guidelines that regulate many issues otherwise left to the choices of individual translators. These guidelines often also determine the use of translation tools. At present, online translation tools considerably widen traditional lexicographical notions, and they contribute to work rationalization in that they offer the translator a survey of already existing translation alternatives. However, available translation tools, traditional and digital, tend towards solving problems of translatorial routine. Their multitude corresponds to the number of dynamic problems in legal translation that cannot be rigidly determined. Therefore, creative legal translation remains an essentially human activity. Meanwhile, the multitude of existing approaches might lead in future to the emergence of a legal-linguistic thesaurus that would display the totality of legal speech acts that constitute the legal discourse. The legal-linguistic thesaurus, which would constitute the main translation tool, does not preclude the development of other goal-oriented translation tools of limited scope. Therefore, notwithstanding the ongoing changes, a strategically responsible choice of translatorial strategies and the corresponding informed choice of translatorial tools are essential techniques for daily translation work.
    The article discusses problems arising in the relation between translation strategies and translation-support tools. Its starting point is the theoretical requirements for professional translations of legal texts that follow from the concept of legal-linguistic equivalence and from skopos theory. The author stresses that planning translation strategies and using translation-support tools are activities that depend on the choice of theory. In this context it seems necessary to develop an explicit model of the translation strategies associated with the choice of translation-support tools, strategies that have so far been only implicit in translation practice. Moreover, institutions in which translations are produced apply guidelines for translators that regulate the choice and use of translation-support tools. Digital translation-support tools have broadened traditional lexicographical notions and helped rationalize the workflow by offering the translator an overview of translation equivalents to choose from. However, traditional and digital support tools help mainly with routine translation problems. Their large number corresponds to the number of dynamic problems in legal translation that cannot be solved in a rigid way. For this reason, creative legal translation remains an activity performed by humans. Nevertheless, the existing multitude of approaches to identifying translation strategies could in future lead to the creation of a thesaurus of legal language documenting the totality of legal speech acts that make up legal discourse. Such a thesaurus, which could become the main support tool, does not preclude the development of other, smaller-scale translation-support tools. Therefore, despite ongoing changes, the responsible choice of translation strategies and translation-support tools remains one of a translator's basic professional skills in daily work.

    Intelligent Learning for Knowledge Graph towards Geological Data
