3,106 research outputs found

    Comparing Fifty Natural Languages and Twelve Genetic Languages Using Word Embedding Language Divergence (WELD) as a Quantitative Measure of Language Distance

    We introduce a new measure of distance between languages based on word embeddings, called word embedding language divergence (WELD). WELD is defined as the divergence between the unified similarity distributions of words across languages. Using this measure, we compare fifty natural languages and twelve genetic languages. Our natural language dataset is a collection of sentence-aligned parallel corpora from Bible translations in fifty languages spanning a variety of language families. Although we use parallel corpora, which guarantee the same content in all languages, languages within the same family interestingly cluster together in many cases. In addition to natural languages, we compare the coding regions in the genomes of 12 different organisms (4 plants, 6 animals, and two human subjects). Our results confirm a significant high-level difference between the genetic language model of humans and animals and that of plants. The proposed method is a step toward defining a quantitative measure of similarity between languages, with applications in language classification, genre identification, dialect identification, and evaluation of translations.

    Functional analysis and transcriptional output of the Göttingen minipig genome

    In the past decade the Göttingen minipig has gained increasing recognition as an animal model in pharmaceutical and safety research because it recapitulates many aspects of human physiology and metabolism. Genome-based comparison of drug targets, together with quantitative tissue expression analysis, allows rational prediction of the pharmacology and cross-reactivity of human drugs in animal models, thereby reducing drug attrition, which is an important challenge in drug development. Here we present a new chromosome-level version of the Göttingen minipig genome together with a comparative transcriptional analysis of tissues with pharmaceutical relevance as a basis for translational research. We relied on mapping and assembly of whole-genome shotgun (WGS) sequencing reads to the reference genome of the Duroc pig and predict 19,228 human orthologous protein-coding genes. Genome-based prediction of the sequence of human drug targets enables the prediction of drug cross-reactivity based on conservation of binding sites. We further support the finding that the genome of Sus scrofa contains about ten times fewer pseudogenized genes than other vertebrates. Among the functional human orthologs of these minipig pseudogenes we found HEPN1, a putative tumor suppressor gene. The genomes of Sus scrofa, the Tibetan boar, the African bushpig, and the warthog show sequence conservation of all inactivating HEPN1 mutations, suggesting disruption before the evolutionary split of these pig species. We identify 133 Sus scrofa-specific, conserved long non-coding RNAs (lncRNAs) in the minipig genome and show that these transcripts are highly conserved in the African pigs and the Tibetan boar, suggesting functional significance. Using a new minipig-specific microarray we show high conservation of gene expression signatures between humans and adult minipigs in 13 tissues with biomedical relevance. We underline this relationship for minipig and human liver, where we could demonstrate similar expression levels for most phase I drug-metabolizing enzymes. Higher expression levels and metabolic activities were found for FMO1, AKR/CRs and phase II drug-metabolizing enzymes in minipig than in human. Across equivalent human and minipig tissues, the variability of gene expression is considerably higher in minipig organs, which is important for study design when a human target belongs to this variable category in the minipig. The first analysis of gene expression in multiple tissues during development from young to adult shows that the majority of transcriptional programs are concluded four weeks after birth. This finding is in line with the advanced state of human postnatal organ development at comparable age categories and further supports the minipig as a model for pediatric drug safety studies. Genome-based assessment of sequence conservation combined with gene expression data in several tissues improves the translational value of the minipig for human drug development. The genome and gene expression data presented here are important resources for researchers using the minipig as a model for biomedical research or commercial breeding. The potential impact of our data on comparative genomics, translational research, and experimental medicine is discussed.

    Potency by Name? ‘Medicine Buddha Plant’ and Other Herbs in the Japanese Scroll of Equine Medicine (Ba’i sōshi emaki, 1267)

    Buddhist ritual healing and medical therapies included care for domestic animals, such as the horse. In pre-modern Japan, equine medicine (ba’i 馬医) was not restricted to the treatment of military horses; it was also practiced in a religious context. The Scroll of Equine Medicine (Ba’i sōshi emaki 馬医草紙絵巻, 1267) is an enigmatic picture scroll held by the Tokyo National Museum. It extends to more than six meters and contains images of ten divine figures related to the healing of horses, followed by seventeen pictures of plants, and a postscript emphasizing that the content of the scroll should be kept secret. Many of the plants listed in the scroll are associated either with the world of Buddhism, e.g. Yakushi-sō 薬師草, ‘Medicine Buddha plant,’ or with horses, e.g. metsu-sō 馬頭草, ‘horsehead plant.’ Previous analyses of the scroll largely focused on the botanical identification of the sketches of the plants. This article reviews current interpretations of the scroll and explores the question of whether the plant names were thought to empower the plants to be used as potent materia medica for veterinary purposes. Based on earlier analyses, I suggest a new interpretation of the scroll from a study-of-religions perspective, taking into consideration that some of the plant names in the scroll indicate both health-related and salvific potency. I also address the possible use of the scroll. The scarcity of textual information and the choice of textual detail and imagery in this ‘secret’ scroll suggest that it was used in the context of an oral transmission and empowerment ritual. The scroll itself seems to have been an object of ritual empowerment, rather than a compendium of materia medica for practical daily use when caring for horses.

    A Review of Accent-Based Automatic Speech Recognition Models for E-Learning Environment

    The adoption of electronic learning (e-learning) as a method of disseminating knowledge in the global educational system is growing rapidly. It has shifted knowledge acquisition from conventional classrooms and tutors to a distributed e-learning technique that gives more convenient and flexible access to various learning resources. However, notwithstanding the adaptive advantages of the learner-centric content of e-learning programmes, the distributed e-learning environment has unconsciously adopted a few international languages as the languages of communication among participants, despite the various accents (mother-language influence) among these participants. Adjusting to and accommodating these accents has led to the introduction of accent-based automatic speech recognition (ASR) into e-learning to resolve the effects of accent differences. This paper reviews over 50 research papers to determine the progress made between 2001 and 2021 in the design and implementation of accent-based automatic speech recognition models for e-learning. The analysis shows that 50% of the models reviewed adopted the English language, 46.50% adopted the major Chinese and Indian languages, and 3.50% adopted the Swedish language as the mode of communication. The review therefore finds that the majority of ASR models are centred on European, American and Asian accents, while unconsciously excluding the accent peculiarities associated with the less technologically resourced continents.

    Local adaptations of Mediterranean sheep and goats through an integrative approach

    Small ruminants are suited to a wide variety of habitats and thus represent promising study models for identifying genes underlying adaptations. Here, we considered local Mediterranean breeds of goats (n = 17) and sheep (n = 25) from Italy, France and Spain. Based on historical archives, we selected the breeds potentially most linked to a territory and defined their original cradle (i.e., the geographical area in which the breed emerged), including transhumant pastoral areas. We then used the programs PCAdapt and LFMM to identify signatures of artificial and environmental selection. Considering cradles instead of current GPS coordinates resulted in a greater number of signatures identified by the LFMM analysis. The results, combined with a systematic literature review, revealed a set of genes with potentially key adaptive roles in relation to the gradient of aridity and altitude. Some of these genes have previously been implicated in lipid metabolism (SUCLG2, BMP2), hypoxia stress/lung function (BMPR2), seasonal patterns (SOX2, DPH6) or neuronal function (TRPC4, TRPC6). Selection signatures involving the PCDH9 and KLH1 genes, as well as NBEA/NBEAL1, were identified in both species and thus could play an important adaptive role.

    Barley heads east: Genetic analyses reveal routes of spread through diverse Eurasian landscapes

    One of the world’s most important crops, barley, was domesticated in the Near East around 11,000 years ago. Barley is a highly resilient crop, able to grow in varied and marginal environments, such as regions of high altitude and latitude. Archaeobotanical evidence shows that barley had spread throughout Eurasia by 2000 BC. To further elucidate the routes by which barley cultivation spread through Eurasia, simple sequence repeat (SSR) analysis was used to determine genetic diversity and population structure in three extant barley taxa: domesticated barley (Hordeum vulgare L. subsp. vulgare), wild barley (H. vulgare subsp. spontaneum) and a six-rowed brittle rachis form (H. vulgare subsp. vulgare f. agriocrithon (Åberg) Bowd.). Analysis of the data using the Bayesian clustering algorithm InStruct suggests a model with three ancestral genepools, which captures a major split in the data, with substantial additional resolution provided under a model with eight genepools. Our results indicate that H. vulgare subsp. vulgare f. agriocrithon accessions and Tibetan Plateau H. vulgare subsp. spontaneum are closely related to the H. vulgare subsp. vulgare in their vicinity, and are therefore likely to be feral derivatives of H. vulgare subsp. vulgare. Under the eight-genepool model, cultivated barley is split into six ancestral genepools, each of which has a distinct distribution through Eurasia, along with distinct morphological features and flowering-time phenotypes. The distribution of these genepools and their phenotypic characteristics are discussed together with archaeological evidence for the spread of barley eastwards across Eurasia.