36 research outputs found

Infrastructure for synthetic health data

Author: Aguiló-Castillo Sergi
ALSHAIKHDEEB Basel M.M.
Barbero Marcos Casado
BOLZANI Luca
Castro Leyla Jael
Cirillo Davide
GHOSH Soumyabrata
Kalaš Matúš
Palmblad Magnus
Queralt-Rosinach Núria
SATAGOPAM Venkata
Sheriff Rahuman S. Malik
SHOAIB Muhammad
Tsueng Ginger
WELTER Danielle
Publication venue
Publication date: 22/07/2023
Field of study

Editorial reviewed. Machine learning (ML) methods are becoming ever more prevalent across all domains of life sciences. However, a key component of effective ML is the availability of large datasets that are diverse and representative. In the context of health systems, with significant heterogeneity of clinical phenotypes and diversity of healthcare systems, there exists a necessity to develop and refine unbiased and fair ML models. Synthetic data are increasingly being used to protect the patient’s right to privacy and overcome the paucity of annotated open-access medical data. Here, we present our proof of concept for the generation of synthetic health data and our proposed FAIR implementation of the generated synthetic datasets. The work was developed during and after the one-week-long BioHackathon Europe by a total of 20 participants (10 new to the project) from different countries (NL, ES, LU, UK, GR, FL, DE, ...).

Open Repository and Bibliography - Luxembourg

High-performance web services for querying gene and variant annotation

Author: A Kamburov
A Singh
Adam Mark
Ali Torkamani
Andrew I. Su
Benjamin J. Ainscough
C UniProt
C Wu
Christopher J. Mungall
Chunlei Wu
CJ Sigrist
Cyrus Afrasiabi
D Croft
D Maglott
D Smedley
D Welter
G Liu
Ginger Tsueng
GR Abecasis
GR Brown
Gregory S. Stupp
IA Adzhubei
Jiwen Xin
JT den Dunnen
K Wang
KD Pruitt
LJ Bean
M Cariaso
M Kircher
M Whirl-Carrillo
MJ Landrum
Moritz Juchler
Nikhil Gopal
Obi L. Griffith
P Cingolani
P Flicek
P Kumar
Patricia L. Whetzel
R Leslie
RC Gentleman
SA Forbes
SB Ng
Sean D. Mooney
ST Sherry
Timothy E. Putman
WJ Kent
X Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref

Mark2Cure: Learn, Work, Help

Author: Ginger Tsueng
Publication venue: 'Center for Open Science'
Publication date: 05/06/2017
Field of study

At 26 million articles and growing, knowledge extraction from biomedical literature is an important big data problem. Mark2Cure trains citizen scientists to help tackle this problem in order to facilitate research on a rare disease known as NGLY1-deficiency. Learn about biomedical terms, biological processes, fascinating diseases, genes, and drugs from the same sources that scientists use--all while helping organize information relevant to a rare disease that makes children unable to shed tears when they cry. Training is provided via an online tutorial, and there is NO cost to participate. If you can READ, you can HELP

OSF Preprints

Coxsackievirus Persistence in the Neonatal Central Nervous System : Investigating the Interplay between the Host Response and Viral Persistence in Neural Stem and Progenitor Cells

Author: Tsueng Ginger
Publication venue: eScholarship, University of California
Publication date: 01/01/2013
Field of study

Newborn infants are particularly vulnerable to neurotropic infections of coxsackievirus which can potentially cause serious central nervous system (CNS) diseases such as meningitis and encephalitis. Coxsackievirus is also capable of persisting in the host CNS for extensive periods of time; however, the mechanism by which the virus evades clearance by the host remains unclear. In vivo models of coxsackievirus infection have previously revealed that the virus isolated from persistent infection is not infectious, suggesting the evolution of the virus over the course of infection. In order to disaggregate the effects of the adaptive immune response and other complicating factors from the actual infection of the central nervous system, we therefore wish to develop and utilize an in vitro model of Coxsackievirus infection using Neural Progenitor and Stem Cells (NPSCs) and a recombinant Coxsackievirus B3 expressing enhanced GFP (eGFP). In developing and utilizing this model we hope to explore the interaction between the host innate immune response and the virus and the impact of these interactions on the evolution of the virus and the development of disorders in the infected host CN

Ezid

eScholarship - University of California

Mark2Curator annotation submissions for NCBI disease corpus

Author: Andrew Su (420054)
Benjamin Good (608291)
Ginger Tsueng (1377234)
Publication venue
Publication date
Field of study

An export of annotations submitted via Mark2Cure, a citizen science project aimed at empowering the public to help facilitate biomedical research. This data set contains annotations of the NCBI disease corpus submitted by citizen scientists, in BioC XML format.

FigShare

Citizen Science for Mining the Biomedical Literature

Author: Andrew I. Su
Benjamin M. Good
Ginger Tsueng
Jennifer Fouquier
Steven M. Nanis
Publication venue: 'Ubiquity Press, Ltd.'
Publication date: 01/12/2016
Field of study

Biomedical literature represents one of the largest and fastest growing collections of unstructured biomedical knowledge. Finding critical information buried in the literature can be challenging. To extract information from free-flowing text, researchers need to: 1. identify the entities in the text (named entity recognition), 2. apply a standardized vocabulary to these entities (normalization), and 3. identify how entities in the text are related to one another (relationship extraction). Researchers have primarily approached these information extraction tasks through manual expert curation and computational methods. We have previously demonstrated that named entity recognition (NER) tasks can be crowdsourced to a group of non-experts via the paid microtask platform, Amazon Mechanical Turk (AMT), and can dramatically reduce the cost and increase the throughput of biocuration efforts. However, given the size of the biomedical literature, even information extraction via paid microtask platforms is not scalable. With our web-based application Mark2Cure (http://mark2cure.org), we demonstrate that NER tasks also can be performed by volunteer citizen scientists with high accuracy. We apply metrics from the Zooniverse Matrices of Citizen Science Success and provide the results here to serve as a basis of comparison for other citizen science projects. Further, we discuss design considerations, issues, and the application of analytics for successfully moving a crowdsourcing workflow from a paid microtask platform to a citizen science platform. To our knowledge, this study is the first application of citizen science to a natural language processing task

Directory of Open Access Journals

BioGPS: building your own mash-up of gene annotations and expression profiles

Author: Andrew I. Su
Chunlei Wu
Cyrus Afrasiabi
Ginger Tsueng
Subramanian
Xuefeng Jin
Publication venue: 'Oxford University Press (OUP)'
Publication date
Field of study

Crossref

MyGene.info web frontend component

Author: Afrasiabi Cyrus
Mark Adam
Su Andrew I.
Tsueng Ginger
Wu Chunlei
Xin Jiwen
Publication venue
Publication date
Field of study

MyGene.info: Gene Annotation Query as a Service http://mygene.inf

ZENODO

MyGene.info data backend component

Author: Afrasiabi Cyrus
Mark Adam
Su Andrew I.
Tsueng Ginger
Wu Chunlei
Xin Jiwen
Publication venue
Publication date
Field of study

MyGene.info: Gene Annotation Query as a Service http://mygene.inf

ZENODO

Les baraquettes / paroles de Ed. Barneaud ; musique de J. A. Fruchier

Author: Andrew I. Su
Chunlei Wu
Cyrus Afrasiabi
Ginger Tsueng
Jiwen Xin
Julee Adesara
Sebastien Lelong
Publication venue: [s.n.]
Publication date: 01/01/1906
Field of study

Background: Application Programming Interfaces (APIs) are now widely used to distribute biological data, and many popular biological APIs developed by different research teams have adopted JavaScript Object Notation (JSON) as their primary data format. While usage of a common data format offers significant advantages, that alone is not sufficient for rich integrative queries across APIs. Results: Here, we have implemented JSON for Linking Data (JSON-LD) technology on the BioThings APIs that we have developed: MyGene.info, MyVariant.info and MyChem.info. JSON-LD provides a standard way to add semantic context to the existing JSON data structure, for the purpose of enhancing the interoperability between APIs. We demonstrated several use cases that were facilitated by semantic annotations using JSON-LD, including simpler and more precise query capabilities as well as API cross-linking. Conclusions: We believe that this pattern offers a generalizable solution for interoperability of APIs in the life sciences.
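
The idea of adding semantic context to an existing JSON record can be illustrated with a minimal sketch. It is not the BioThings implementation: the context below, its URIs, and the helper names `add_context` and `keys_by_uri` are assumptions made up for illustration, and the key rewriting is only a simplified stand-in for full JSON-LD expansion.

```python
import json

# Hypothetical JSON-LD context mapping plain JSON keys to URIs.
# These URIs are illustrative assumptions, not the contexts
# actually published by the BioThings APIs.
CONTEXT = {
    "entrezgene": "http://identifiers.org/ncbigene/",
    "symbol": "http://example.org/vocab/geneSymbol",
}


def add_context(doc, context):
    """Return a copy of a plain JSON document with an @context attached."""
    return {"@context": context, **doc}


def keys_by_uri(doc):
    """Rewrite top-level keys as the URIs declared in the document's @context.

    Keys without a mapping are kept unchanged. This is a simplified
    stand-in for JSON-LD expansion, not a conforming processor.
    """
    context = doc.get("@context", {})
    return {context.get(k, k): v for k, v in doc.items() if k != "@context"}


# A small gene record in plain JSON (Entrez Gene 1017 is CDK2).
gene = {"entrezgene": 1017, "symbol": "CDK2"}

annotated = add_context(gene, CONTEXT)
print(json.dumps(annotated, indent=2))
print(keys_by_uri(annotated))
```

Because the original keys and values are left untouched, clients that ignore `@context` keep working, while clients that read it can match fields across APIs by URI rather than by key name.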

Directory of Open Access Journals

FigShare