    European Language Grid

    This open access book provides an in-depth description of the EU project European Language Grid (ELG). Its motivation lies in the fact that Europe is a multilingual society with 24 official European Union Member State languages and dozens of additional languages including regional and minority languages. The only meaningful way to enable multilingualism and to benefit from this rich linguistic heritage is through Language Technologies (LT) including Natural Language Processing (NLP), Natural Language Understanding (NLU), Speech Technologies and language-centric Artificial Intelligence (AI) applications. The European Language Grid provides a single umbrella platform for the European LT community, including research and industry, effectively functioning as a virtual home, marketplace, showroom, and deployment centre for all services, tools, resources, products and organisations active in the field. Today the ELG cloud platform already offers access to more than 13,000 language processing tools and language resources. It enables all stakeholders to deposit, upload and deploy their technologies and datasets. The platform also supports the long-term objective of establishing digital language equality in Europe by 2030 – to create a situation in which all European languages enjoy equal technological support. This is the very first book dedicated to Language Technology and NLP platforms. Cloud technology has only recently matured enough to make the development of a platform like ELG feasible on a larger scale. The book comprehensively describes the results of the ELG project. Following an introduction, the content is divided into four main parts: (I) ELG Cloud Platform; (II) ELG Inventory of Technologies and Resources; (III) ELG Community and Initiative; and (IV) ELG Open Calls and Pilot Projects
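
    The platform also exposes deposited services programmatically. As an illustration, the minimal sketch below assumes the ELG Python SDK (the `elg` package) and its `Service.from_id` helper; the numeric service ID is purely illustrative, and authenticating with the platform is required before a service can be called.

```python
# Minimal sketch, assuming the ELG Python SDK is installed (pip install elg).
# The numeric service ID below is illustrative only; real IDs come from the
# ELG catalogue, and invoking a service requires platform authentication.
from elg import Service

service = Service.from_id(474)  # hypothetical ID of a named-entity recognition service
result = service("The European Language Grid is based in Berlin.")
print(result)
```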

    The Pharmacoepigenomics Informatics Pipeline and H-GREEN Hi-C Compiler: Discovering Pharmacogenomic Variants and Pathways with the Epigenome and Spatial Genome

    Over the last decade, biomedical science has been transformed by the epigenome and spatial genome, but the discipline of pharmacogenomics, the study of the genetic underpinnings of pharmacological phenotypes like drug response and adverse events, has not. Scientists have begun to use omics atlases of increasing depth, and inferences relating to the bidirectional causal relationship between the spatial epigenome and gene expression, as a foundational underpinning for genetics research. The epigenome and spatial genome are increasingly used to discover causative regulatory variants in the significance regions of genome-wide association studies, for the discovery of the biological mechanisms underlying these phenotypes and the design of genetic tests to predict them. Such variants often have more predictive power than coding variants, but in the area of pharmacogenomics, such advances have been radically underapplied. The majority of pharmacogenomics tests are designed manually on the basis of mechanistic work with coding variants in candidate genes, and where genome-wide approaches are used, they are typically not interpreted with the epigenome. This work describes a series of analyses of pharmacogenomics association studies with the tools and datasets of the epigenome and spatial genome, undertaken with the intent of discovering causative regulatory variants to enable new genetic tests. It describes the potent regulatory variants discovered thereby, which have a putative causative and predictive role in a number of medically important phenotypes, including analgesia and the treatment of depression, bipolar disorder, and traumatic brain injury with opiates, anxiolytics, antidepressants, lithium, and valproate, and in particular the tendency of such variants to cluster into spatially interacting, conceptually unified pathways which offer mechanistic insight into these phenotypes. It describes the Pharmacoepigenomics Informatics Pipeline (PIP), an integrative multiple omics variant discovery pipeline designed to make this kind of analysis easier and cheaper to perform, more reproducible, and amenable to the addition of advanced features. It describes the successes of the PIP in rediscovering manually discovered gene networks for lithium response, as well as discovering a previously unknown genetic basis for warfarin response in anticoagulation therapy. It describes the H-GREEN Hi-C compiler, which was designed to analyze spatial genome data and discover the distant target genes of such regulatory variants, and its success in discovering spatial contacts not detectable by preceding methods and using them to build spatial contact networks that unite disparate TADs with phenotypic relationships. It describes a potential feature set of a future pipeline, using the latest epigenome research and the lessons of the previous pipeline. It describes my thinking about how to use the output of a multiple omics variant pipeline to design genetic tests that also incorporate clinical data. And it concludes by describing a long-term vision for a comprehensive pharmacophenomic atlas, to be constructed by applying a variant pipeline and machine learning test design system, such as is described, to thousands of phenotypes in parallel. Scientists struggled to assay genotypes for the better part of a century, and in the last twenty years, succeeded. The struggle to predict phenotypes on the basis of the genotypes we assay remains ongoing. The use of multiple omics variant pipelines and machine learning models with omics atlases, genetic association, and medical records data will be an increasingly significant part of that struggle for the foreseeable future.
    PhD, Bioinformatics. University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/145835/1/ariallyn_1.pd
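
    To make the variant-prioritisation idea concrete, the following is a minimal, hypothetical sketch of one step such a pipeline might perform: flagging GWAS variants whose positions fall inside epigenomic regulatory intervals (for example, enhancer calls). The data structures and names are illustrative only and are not the actual PIP implementation.

```python
# Hypothetical illustration of epigenome-informed variant prioritisation:
# keep only GWAS variants that overlap an annotated regulatory interval.
# Intervals and variants are toy data, not results from the PIP.

# (start, end) regulatory intervals per chromosome, e.g. enhancer calls
enhancers = {
    "chr7": [(1_000_000, 1_002_000), (5_400_000, 5_401_500)],
}

# (chrom, pos, rsid, p_value) variants from a GWAS significance region
variants = [
    ("chr7", 1_001_250, "rs0000001", 3.2e-9),
    ("chr7", 2_500_000, "rs0000002", 4.8e-8),
]

def in_regulatory_region(chrom, pos):
    """True if the position overlaps any annotated interval on that chromosome."""
    return any(start <= pos <= end for start, end in enhancers.get(chrom, []))

# Candidate causative regulatory variants, to be followed up with Hi-C contacts
candidates = [v for v in variants if in_regulatory_region(v[0], v[1])]
print(candidates)
```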

    Low-Resource Unsupervised NMT: Diagnosing the Problem and Providing a Linguistically Motivated Solution

    Unsupervised Machine Translation has been advancing our ability to translate without parallel data, but state-of-the-art methods assume an abundance of monolingual data. This paper investigates the scenario where monolingual data is limited as well, finding that current unsupervised methods suffer in performance under this stricter setting. We find that the performance loss originates from the poor quality of the pretrained monolingual embeddings, and we propose using linguistic information in the embedding training scheme. To support this, we look at two linguistic features that may help improve alignment quality: dependency information and sub-word information. Using dependency-based embeddings results in a complementary word representation which offers a boost in performance of around 1.5 BLEU points compared to standard WORD2VEC when monolingual data is limited to 1 million sentences per language. We also find that the inclusion of sub-word information is crucial to improving the quality of the embeddings.
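
    For context, the standard WORD2VEC baseline the paper compares against can be trained in a few lines. The sketch below assumes gensim's Word2Vec and a tokenised monolingual file of roughly one million sentences; the file name and hyperparameters are illustrative, not the paper's exact settings. Dependency-based embeddings instead replace the linear context window with syntactic (head word, relation) contexts and therefore additionally require a parsed corpus.

```python
# Minimal sketch of the WORD2VEC baseline on limited monolingual data, using gensim.
# The corpus file name and hyperparameters are illustrative assumptions.
from gensim.models import Word2Vec

class Corpus:
    """Stream one tokenised sentence (whitespace-separated) per line."""
    def __init__(self, path):
        self.path = path
    def __iter__(self):
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                yield line.split()

model = Word2Vec(
    sentences=Corpus("mono.en.tok"),  # hypothetical ~1M-sentence monolingual file
    vector_size=300,
    window=5,
    sg=1,          # skip-gram
    min_count=5,
    workers=4,
)
model.wv.save_word2vec_format("mono.en.vec")  # embeddings used to initialise the UNMT model
```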

    Computational Histopathology Analysis based on Deep Learning

    Pathology has benefited from the rapid progress of digital scanning technology during the last decade. Nowadays, slide scanners are able to produce super-resolution whole slide images (WSI), also called digital slides, which can be explored through image viewers as an alternative to a conventional microscope. The use of WSI, together with other microscopic and molecular pathology images, has driven the development of digital pathology, which in turn enables digital diagnostics. Moreover, the availability of WSI makes it possible to apply image processing and recognition techniques to support digital diagnostics, opening new avenues for computational pathology. However, many challenging tasks remain in computational pathology, such as automated cancer categorisation, tumour area segmentation, and cell-level instance detection. In this study, we explore problems related to these tasks in histology images. Cancer categorisation can be addressed as a histopathological image classification problem. Multiple aspects, such as variations caused by magnification factors and class imbalance, make it a challenging task where conventional methods cannot obtain satisfactory performance in many cases. We propose to learn similarity-based embeddings for magnification-independent cancer categorisation. A pair loss and a triplet loss are proposed to learn embeddings that can measure similarity between images for classification. Furthermore, to eliminate the impact of class imbalance, instead of using hard-sample mining strategies that simply discard easy samples, we introduce a new loss function that simultaneously punishes hard misclassified samples and suppresses easy well-classified samples. Tumour area segmentation in whole-slide images is a fundamental step for viable tumour burden estimation, which is of great value for cancer assessment. Vague boundaries and small regions dissociated from viable tumour areas are the two main challenges in segmenting tumour areas accurately. We present a structure-aware scale-adaptive feature selection method for efficient and accurate tumour area segmentation. Specifically, based on a segmentation network with a popular encoder-decoder architecture, a scale-adaptive module is proposed to select more robust features to represent the vague, non-rigid boundaries. Furthermore, a structural similarity metric is proposed for better tissue structure awareness to deal with small-region segmentation. Detection of cell-level instances in histology images is essential to acquire morphological and numeric clues for cancer assessment. However, factors such as morphological variations of nuclei and cells make it a challenging task where conventional object detection methods cannot obtain satisfactory performance in many cases. We propose similarity-based region proposal networks for nuclei and cell detection in histology images. In particular, a customised convolution layer, termed the embedding layer, is designed for network building. The embedding layer is then added to modify the region proposal networks, enabling them to learn discriminative features based on similarity learning.
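
    The core of the similarity-based embedding idea can be sketched briefly. The snippet below shows a triplet loss over L2-normalised patch embeddings in PyTorch, with an off-the-shelf ResNet-18 encoder and a margin chosen purely for illustration rather than taken from the thesis.

```python
# Minimal sketch of similarity-based embedding learning for histology patches:
# a triplet loss pulls same-class patches together and pushes other classes apart.
# Backbone, embedding size, and margin are illustrative choices, not the thesis's exact model.
import torch
import torch.nn as nn
import torchvision.models as models

class EmbeddingNet(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        backbone = models.resnet18(weights=None)  # any CNN encoder would do
        backbone.fc = nn.Linear(backbone.fc.in_features, dim)
        self.backbone = backbone

    def forward(self, x):
        # L2-normalise so distances between embeddings reflect similarity
        return nn.functional.normalize(self.backbone(x), dim=1)

net = EmbeddingNet()
criterion = nn.TripletMarginLoss(margin=0.5)

# anchor and positive share a class label; negative is drawn from another class
anchor, positive, negative = (torch.randn(8, 3, 224, 224) for _ in range(3))
loss = criterion(net(anchor), net(positive), net(negative))
loss.backward()
```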