Search CORE

155 research outputs found

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting

Author: Adelani DI
Aji AF
Almubarak K
Bari MS
Baruwa A
Biderman S
Kasai J
Muennighoff N
Nikoulina V
Radev D
Raff E
Schoelkopf H
Sutawika L
Winata GI
Yong ZX
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2023

The BLOOM model is a large publicly available multilingual language model, but its pretraining was limited to 46 languages. To extend the benefits of BLOOM to other languages without incurring prohibitively large costs, it is desirable to adapt BLOOM to new languages not seen during pretraining. In this work, we apply existing language adaptation strategies to BLOOM and benchmark its zero-shot prompting performance on eight new languages in a resource-constrained setting. We find language adaptation to be effective at improving zero-shot performance in new languages. Surprisingly, we find that adapter-based finetuning is more effective than continued pretraining for large models. In addition, we discover that prompting performance is not significantly affected by language specifics, such as the writing system. It is primarily determined by the size of the language adaptation data. We also add new languages to BLOOMZ, which is a multitask finetuned version of BLOOM capable of following task instructions zero-shot. We find including a new language in the multitask fine-tuning mixture to be the most effective method to teach BLOOMZ a new language. We conclude that with sufficient training data language adaptation can generalize well to diverse languages. Our code is available at https://github.com/bigscience-workshop/multilingual-modeling

UCL Discovery

Risk scorecard to minimize impact of COVID-19 when reopening.

Author: Lang Jocelyn HS
Lee Vernon JM
Lim Shin B
Ma Stefan
Pung Rachael
Quah Elizabeth
Sun Yinxiaohe
Tan Kellie
Teh Shi-Hua
Yong Dominique ZX
Publication venue: 'Oxford University Press (OUP)'
Publication date: 23/07/2021

BACKGROUND: We present a novel approach for exiting coronavirus disease 2019 (COVID-19) lockdowns using a 'risk scorecard' to prioritize activities to resume whilst allowing safe reopening. METHODS: We modelled cases generated in the community/week, incorporating parameters for social distancing, contact tracing and imported cases. We set thresholds for cases and analysed the effect of varying parameters. An online tool to facilitate country-specific use including the modification of parameters (https://sshsphdemos.shinyapps.io/covid_riskbudget/) enables visualization of effects of parameter changes and trade-offs. Local outbreak investigation data from Singapore illustrate this. RESULTS: Setting a threshold of 0.9 mean number of secondary cases arising from a case to keep R < 1. CONCLUSIONS: Countries can utilize a 'risk scorecard' to balance relaxations for travel and domestic activity depending on factors that reduce disease impact, including hospital/ICU capacity, contact tracing, quarantine and vaccination. The tool enabled visualization of the combinations of imported cases and activity levels on the case numbers and the trade-offs required. For vaccination, a reduction factor should be applied both for likelihood of an infected case being present and a close contact getting infected.

LSHTM Research Online

ScholarBank@NUS

Stepwise classification of cancer samples using clinical and molecular data

Author: A Tan
AL Boulesteix
Askar Obulkasim
D Dunkler
D Krag
Gerrit A Meijer
JA Stephenson
JR Tibshirani
KA Cao
L Breiman
M Bovelstad
M Futschik
M Jelizarow
M van de Vijver
Mark A van de Wiel
RJ Nevins
SL Pomeroy
Y Qi
Z Yong
ZX Huang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Combining clinical and molecular data types may improve the prediction accuracy of a classifier. However, there is currently a shortage of effective and efficient statistical and bioinformatic tools for true integrative data analysis. Existing integrative classifiers have two main disadvantages. First, coarse combination may cause subtle contributions of one data type to be overshadowed by more obvious contributions of the other. Second, the need to measure both data types for all patients may be both impractical and cost-inefficient. Results: We introduce a novel classification method, a stepwise classifier, which takes advantage of the distinct classification power of clinical data and high-dimensional molecular data. We apply classification algorithms to the two data types independently, starting with the traditional clinical risk factors. We turn to the relatively expensive molecular data only when the uncertainty of the prediction from clinical data exceeds a predefined limit. Experimental results show that our approach is adaptive: the proportion of samples that needs to be re-classified using molecular data depends on how much we expect the predictive accuracy to increase when re-classifying those samples. Conclusions: Our method renders a more cost-efficient classifier that is at least as good as, and sometimes better than, one based on clinical or molecular data alone. Hence our approach is not just a classifier that minimizes a particular loss function. Instead, it aims to be cost-efficient by avoiding molecular tests for a potentially large subgroup of individuals; moreover, for these individuals a test result would be quickly available, which may reduce waiting times for diagnosis and hence lower the patients' distress. Stepwise classification is implemented in the R package stepwiseCM and is available at the Bioconductor website.
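
The stepwise idea in this abstract can be illustrated with a minimal Python sketch. It is not the stepwiseCM implementation or its API: the function name `stepwise_classify`, the uncertainty measure `min(p, 1 - p)`, the limit of 0.3 and the toy inputs are all assumptions made for illustration only.

```python
from typing import Callable, List, Sequence, Tuple


def stepwise_classify(
    clinical_probs: Sequence[float],
    molecular_predict: Callable[[int], int],
    max_uncertainty: float = 0.3,
) -> Tuple[List[int], List[int]]:
    """Classify with clinical data first; refer uncertain samples to molecular data.

    clinical_probs[i] is a (hypothetical) clinical-model probability of class 1.
    molecular_predict(i) is a (hypothetical) molecular-model label for sample i.
    Returns the final labels and the indices that needed the molecular test.
    """
    labels: List[int] = []
    referred: List[int] = []
    for i, p in enumerate(clinical_probs):
        # Assumed uncertainty measure: 0 for a certain prediction, 0.5 at p = 0.5.
        uncertainty = min(p, 1.0 - p)
        if uncertainty > max_uncertainty:
            # Clinical prediction is too uncertain: use the molecular classifier.
            labels.append(molecular_predict(i))
            referred.append(i)
        else:
            labels.append(1 if p >= 0.5 else 0)
    return labels, referred


# Toy data: samples 1 and 3 are too uncertain on clinical data alone.
probs = [0.05, 0.48, 0.90, 0.65]
molecular_labels = {1: 1, 3: 0}
labels, referred = stepwise_classify(probs, lambda i: molecular_labels[i])
print(labels, referred)
```

Raising `max_uncertainty` sends fewer samples to the molecular step, which is the cost trade-off the abstract describes.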

Crossref

VU Research Portal

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Impact of leg lengthening on viscoelastic properties of the deep fascia

Author: A Stecco
A Udén
AH Simpson
AM Khan
B Mike
C Heisel
C Hopf
CN Maganaris
D Ulrich
EC Bass
FR Noyes
GA Ilizarov
H Sun
Hai-Qiang Wang
HM Langevin
HQ Wang
HR Elden
LH Yahia
N Yasui
NA Bouffard
OL Thomas
PC Ivancic
RF Zernicke
RP Melcher
RR Pelker
SD Waldman
SH White
W Ando
Y Hayashi
YC Fung
Yi-Yong Wei
Zhuo-Jing Luo
Zi-Xiang Wu
ZX Wu
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Although the morphological alterations of the deep fascia subjected to leg lengthening have been investigated in cellular and extracellular aspects, the impact of leg lengthening on the viscoelastic properties of the deep fascia remains largely unknown. This study aimed to address the changes in viscoelastic properties of the deep fascia during leg lengthening using a uniaxial tensile test. Methods: An animal model of leg lengthening was established in New Zealand white rabbits. Distraction was initiated at a rate of 1 mm/day and 2 mm/day in two steps, and proceeded until increases of 10% and 20% in the initial length of the tibia had been achieved. Deep fascia specimens of 30 mm × 10 mm were clamped in the Instron 1122 tensile tester at room temperature with a constant tensile rate of 5 mm/min. After 5 load-download tensile tests had been performed, the specimens were elongated until rupture. The load-displacement curves were automatically generated. Results: The normal deep fascia showed the typical viscoelastic behaviour of collagenous tissues. Each experimental group of the deep fascia after leg lengthening kept these properties. The curves of the deep fascia at a rate of 1 mm/day with a 20% increase in tibia length were the closest to those of normal deep fascia. The mean ultimate tension strength and strain at rupture of normal deep fascia were 2.69 N (8.97 mN/mm²) and 14.11%, respectively. The increases in ultimate tension strength and strain at rupture of the deep fascia after leg lengthening were statistically significant. Conclusion: The deep fascia subjected to leg lengthening exhibits the viscoelastic properties of collagenous tissues without lengthening, other than increased strain and strength. Although different lengthening schemes result in varied changes in viscoelastic properties, the properties most comparable to normal are obtained under the scheme of a distraction rate of 1 mm/day and a 20% increase in tibia length.
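
The two figures reported for normal deep fascia (2.69 N and 8.97 mN per square millimetre) can be related with a short arithmetic check. The abstract does not say which area was used for normalization; the sketch below assumes the 30 mm × 10 mm specimen face, which happens to reproduce the reported value.

```python
# Values taken from the abstract.
force_n = 2.69          # ultimate tension strength, in newtons
length_mm = 30.0        # specimen length
width_mm = 10.0         # specimen width

# Assumption: the force is normalized by the 30 mm x 10 mm specimen area.
area_mm2 = length_mm * width_mm

# Convert newtons to millinewtons, then divide by the area.
tension_mn_per_mm2 = force_n * 1000.0 / area_mm2
print(round(tension_mn_per_mm2, 2))
```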

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The bovine alveolar macrophage DNA methylome is resilient to infection with Mycobacterium bovis

Author: A Akalin
AK Marr
AM Deaton
AM O’Doherty
AV Zimin
B Langmead
D Holoch
D Karolchik
DA Magee
F Krueger
F Olea-Popelka
G Elliott
G Sharma
I Yaseen
JR Peat
KD Hansen
L Andersson
L Navarro-Martin
L Zheng
LN Lyu
M Kathirvel
M Weber
MD Young
MJ Ziller
MM Esterhuyse
N Pervjakova
NC Nalpas
P Nestorov
P Vegh
PA Jones
R Doherty
R Edgar
R Jaenisch
R Kucharski
R Sitaraman
RA Waterland
RM Gonzalez
SH Sinclair
SJ Clark
SS Shell
T Garnier
WS Yong
Y Benjamini
ZD Smith
ZX Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

DNA methylation is pivotal in orchestrating gene expression patterns in various mammalian biological processes. Perturbation of the bovine alveolar macrophage (bAM) transcriptome, due to Mycobacterium bovis (M. bovis) infection, has been well documented; however, the impact of this intracellular pathogen on the bAM epigenome has not been determined. Here, whole genome bisulfite sequencing (WGBS) was used to assess the effect of M. bovis infection on the bAM DNA methylome. The methylomes of bAM infected with M. bovis were compared to those of non-infected bAM 24 hours post-infection (hpi). No differences in DNA methylation (CpG or non-CpG) were observed. Analysis of DNA methylation at proximal promoter regions uncovered >250 genes harbouring intermediately methylated (IM) promoters (average methylation of 33-66%). Gene ontology analysis, focusing on genes with low, intermediate or highly methylated promoters, revealed that genes with IM promoters were enriched for immune-related GO categories; this enrichment was not observed for genes in the high or low methylation groups. Targeted analysis of genes in the IM category confirmed the WGBS observation. This study is the first in cattle examining genome-wide DNA methylation at single nucleotide resolution in an important bovine cellular host-pathogen interaction model, providing evidence for IM promoter methylation in bAM
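
The grouping of promoters into low, intermediate and high methylation classes can be sketched as follows. Only the 33-66% range for intermediately methylated promoters comes from the abstract; treating both bounds as inclusive, the function name `methylation_class`, and the gene names and percentages are illustrative assumptions, not data from the study.

```python
from typing import Dict, List


def methylation_class(avg_methylation_pct: float) -> str:
    """Assign a promoter to a methylation class from its average methylation (%)."""
    # Assumption: 33% and 66% are both counted as intermediate.
    if avg_methylation_pct < 33.0:
        return "low"
    if avg_methylation_pct <= 66.0:
        return "intermediate"
    return "high"


# Hypothetical promoters with made-up average methylation percentages.
promoters = {"geneA": 12.0, "geneB": 45.5, "geneC": 66.0, "geneD": 91.3}

groups: Dict[str, List[str]] = {}
for gene, pct in promoters.items():
    groups.setdefault(methylation_class(pct), []).append(gene)
print(groups)
```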

Crossref

Directory of Open Access Journals

Publikationsserver der Universität Tübingen

Oxford University Research Archive

Ulster University's Research Portal

SUMOylation Represses Nanog Expression via Modulating Transcription Factors Oct4 and Sox2

Author: A Georges
B Cox
Bo Tang
CH Woo
D Guo
DJ Rodda
DL van den Berg
E Karantzali
ES Johnson
F Lavial
F Wei
G-H Liu
GR Martin
Haibo Wu
I Chambers
I Fukuda
J Ren
J Silva
J Sánchez
JS Seeler
Juan Du
JX Du
K Maderböck
K Mitsui
K Yang
KK Chan
L Hyslop
LB Liu
Liping Yang
Lixia Yang
MJ Evans
MW Pfaffl
O Kerscher
P Gupta
PJ Hamard
Q Cai
R Geiss-Friedlander
RT Hay
S La Salle
S Masui
S Nagai
S Stefanovic
S Tsuruzoe
S Yamaguchi
SY Chiu
T Kuroda
T Okuma
Tadayuki Akagi
V Chickarmane
Wenzhong Li
X Liu
Xiaohai Wang
Xiaoyan Shi
Y Hong
Yong Zhang
Yongyan Wu
Z Zhang
Zekun Guo
ZX Wang
Publication venue: Public Library of Science
Publication date: 01/01/2012

Nanog is a pivotal transcription factor in embryonic stem (ES) cells and is essential for maintaining the pluripotency and self-renewal of ES cells. SUMOylation has been shown to regulate the function of several stem cell markers, such as Oct4 and Sox2. Nanog is strictly regulated by the Oct4/Sox2 heterodimer. However, the direct effects of SUMOylation on Nanog expression remain unclear. In this study, we report that SUMOylation repressed Nanog expression. Depletion of Sumo1 or its conjugating enzyme Ubc9 increased the expression of Nanog, while high SUMOylation reduced its expression. Interestingly, we found that SUMOylation of Oct4 and Sox2 regulated Nanog in an opposing manner. SUMOylation of Oct4 enhanced Nanog expression, while SUMOylated Sox2 inhibited its expression. Moreover, SUMOylation of Oct4 by Pias2 or Sox2 by Pias3 impaired the interaction between Oct4 and Sox2. Taken together, these results indicate that SUMOylation has a negative effect on Nanog expression and provide new insights into the mechanism of SUMO modification involved in ES cell regulation.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Estimating PM 2.5 concentrations in Xi'an City using a generalized additive model with multi-source monitoring data

Author: A Notario
A Saiz-Lopez
APK Tai
AW Strawa
B Gao
BL MR Boys
BV Bhaskar
C Liousse
C Yang
C Zhang
CA Pope
CJ Paciorek
CQ LY Lin
DA Chu
DC Carslaw
DE Abbey
DK Deshmukh
DP Edwards
DYH Pui
F Costabile
F Yang
G Lin
G Wang
GX Li
Hong-Lei Yang
HS Bian
HS Kim
J Schwartz
J Tao
J Tian
J Wang
JG Watson
JH Seinfeld
JJ Cao
Jun-Huan Peng
JY Xin
K Zhang
KF Ho
L Glasser
L Tao
LF Li
LWA Chen
M Khodeir
M Schaap
M Sorek-Hamer
ML Bell
N Kumar
P Glantz
P Gupta
Qian Sun
Qinghua Sun
R Federal
RA Moyeed
RB Schlesinger
RBA Koelemeijer
RM Hoff
S Wood
SC Dogruparmak
SS Lim
W Song
X Chi
XF Hu
Y Huang
Y Liu
Y Wang
Yi-Rong Song
YL Sun
YM Guo
Yong-Ze Song
YP DJ Lu
YS Wang
Yuan Li
Z Hu
Z Li
Z Ma
Z Meng
Z Sun
Z Wang
ZW Yan
ZX Shen
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2015

© 2015 Song et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Particulate matter with an aerodynamic diameter <2.5 μm (PM2.5) represents a severe environmental problem and has a negative impact on human health. Xi'an City, with a population of 6.5 million, has among the highest concentrations of PM2.5 in China. In 2013, there were in total 191 days in Xi'an City on which PM2.5 concentrations were greater than 100 μg/m³. Recently, a few studies have explored the potential causes of high PM2.5 concentration using remote sensing data such as the MODIS aerosol optical thickness (AOT) product. Linear regression is a commonly used method to find statistical relationships among PM2.5 concentrations and other pollutants, including CO, NO2, SO2, and O3, which can be indicative of emission sources. The relationships of these variables, however, are usually complicated and non-linear. Therefore, a generalized additive model (GAM) is used to estimate the statistical relationships between potential variables and PM2.5 concentrations. This model contains linear functions of SO2 and CO, univariate smoothing non-linear functions of NO2, O3, AOT and temperature, and bivariate smoothing non-linear functions of location and wind variables. The model can explain 69.50% of PM2.5 concentrations, with R² = 0.691, which improves the result of a stepwise linear regression (R² = 0.582) by 18.73%. The two most significant variables, CO concentration and AOT, represent 20.65% and 19.54% of the deviance, respectively, while the three other gas-phase concentrations, SO2, NO2, and O3, account for 10.88% of the total deviance. These results show that in Xi'an City, traffic and other industrial emissions are the primary sources of PM2.5. Temperature, location, and wind variables are also non-linearly related to PM2.5.
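
The 18.73% improvement quoted in this abstract appears to be a relative gain in R-squared rather than a difference in percentage points. A minimal check, assuming that reading and using only the two R-squared values given above:

```python
# R-squared values quoted in the abstract.
r2_gam = 0.691        # generalized additive model
r2_stepwise = 0.582   # stepwise linear regression

# Assumed definition: relative improvement of the GAM over the stepwise model.
relative_improvement_pct = (r2_gam - r2_stepwise) / r2_stepwise * 100.0
print(round(relative_improvement_pct, 2))
```

The absolute difference between the two values is 0.109, so the quoted figure should not be read as an 18.73-point increase in explained variance.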

Crossref

Directory of Open Access Journals

PubMed Central

espace@Curtin

FigShare

Integrated Profiling of MicroRNAs and mRNAs: MicroRNAs Located on Xq27.3 Associate with Clear Cell Renal Cell Carcinoma

Author: A Jemal
A Lopez-Beltran
A Subramanian
A Verkman
AM Torres
AS Morrissy
B Rini
B Vogelstein
C Camps
C Coulouarn
Chad Creighton
Chaozhao Liang
D Betel
D Juan
D Parkin
DJ Manalo
F Gottardo
F Kosari
F Xiao
Feng Shen
G Bindea
G Seifert
GA Calin
GJ Hurteau
GL Papadopoulos
GL Semenza
GR Sutherland
H Cohen
H Gardner
H Kutay
H Liu
H Seitz
Huanming Yang
HW Tun
I Bernascone
J Kluiver
J Lu
Jiahao Chen
Jing Chen
Jiongxian Ye
JK Maranchie
JR Gnarra
Jun Wang
K Pulkkinen
K Ruan
K Smith
L Weng
L Zhang
Liang Sun
Liang Zhou
M Jung
M Kanehisa
M Korpal
M Metzler
Maoshan Chen
Min Shi
N Mizuno
N Yanaihara
NC Lau
P Shannon
P t Hoen
Qingna Zhai
R Beroukhim
R Garzon
R Li
Ruilin Yang
S Audic
S Volinia
SD Hsu
SM Park
TB Deb
TF Chow
V Mootha
VN Kim
W Tsai
Xianxin Li
Xiaohong Xu
Xiaokun Zhao
Xiuqing Zhang
Xueda Hu
Y Benjamini
Y Chen
Y Huang
Y Lee
Yaoting Gui
Yi Huang
Yong Wang
Z Hegedus
Zhichen Guan
Zhiming Cai
Zhiyu Peng
Zhizhong Li
Zhongfu Zhang
Zujing Han
ZX Shan
Publication venue: Public Library of Science
Publication date: 01/01/2010

Background: With the advent of second-generation sequencing, the expression of gene transcripts can be digitally measured with high accuracy. The purpose of this study was to systematically profile the expression of both mRNA and miRNA genes in clear cell renal cell carcinoma (ccRCC) using massively parallel sequencing technology. Methodology: The expression of mRNAs and miRNAs was analyzed in tumor tissues and matched normal adjacent tissues obtained from 10 ccRCC patients without distant metastases. In a prevalence screen, some of the most interesting results were validated in a large cohort of ccRCC patients. Principal Findings: A total of 404 miRNAs and 9,799 mRNAs were found to be differentially expressed in the 10 ccRCC patients. We also identified 56 novel miRNA candidates in at least two samples. In addition to confirming that canonical cancer genes and miRNAs (including VEGFA, DUSP9 and ERBB4; miR-210, miR-184 and miR-206) play pivotal roles in ccRCC development, promising novel candidates (such as PNCK and miR-122) without previous annotation in ccRCC carcinogenesis were also discovered in this study. Pathways controlling cell fates (e.g., cell cycle and apoptosis pathways) and cell communication (e.g., focal adhesion and ECM-receptor interaction) were found to be significantly more likely to be disrupted in ccRCC. Additionally, the results of the prevalence screen revealed that the expression of a miRNA gene cluster located on Xq27.3 was consistently downregulated in at least 76.7% of approximately 50 ccRCC patients. Conclusions: Our study provided a two-dimensional map of the mRNA and miRNA expression profiles of ccRCC using deep sequencing technology. Our results indicate that the phenotypic status of ccRCC is characterized by a loss of normal renal function, downregulation of metabolic genes, and upregulation of many signal transduction genes in key pathways. Furthermore, it can be concluded that downregulation of miRNA genes clustered on Xq27.3 is associated with ccRCC.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Hong Kong University of Science and Technology Institutional Repository

Trends in template/fragment-free protein structure prediction

Author: A BenNaim
A Cavalli
A Elofsson
A Grossfield
A Jagielska
A Liwo
A Pillardy
A Warshel
AE Roitberg
AF Voter
AP Lyubartsev
AR Ortiz
AR Panchenko
AV Morozov
B Fain
B Roux
B Xue
B Zagrovic
BR Brooks
C Alsenoy Van
C Bystroff
C Hardin
C Hoppe
C Simmerling
C Zhang
CL Brooks
CM Deane
CM Summa
D Chivian
D Eramian
D Gilis
D Hamelberg
D Jiao
D Katagiri
D Kihara
D Kim
DE Kim
DE Shaw
DS Wishart
DT Jones
E Faraggi
E Ferrada
E Haber
E Krieger
E Marinari
E Pettersson
Eshel Faraggi
F Wagner
F Zhao
FG Wang
G Chopra
G Cornilescu
G Pollastri
G Yona
GA Kaminski
GA Papoian
GM Torrie
GR Bowman
H Fan
H Kamberaj
H Kamisetty
H Lu
H Zhou
Hongxing Lei
HP Gong
HS Kang
HX Lei
HY Liu
HY Zhou
HZ Li
J Cheng
J DeBartolo
J Lundstrom
J Meiler
J Moult
J Pei
J Shi
J Skolnick
J Vreede
J Wang
J Xu
J Zhu
JA Hegler
JA McCammon
JA Vila
JE Stone
JF Gibrat
JL Gao
JL Knight
JM Bujnicki
JP Ma
JP Piquemal
JW Pitera
K Karplus
KT Simons
LA Kelley
LC Song
LJ Yang
LQ Zheng
M Ben-David
M Challacombe
M Christen
M Lu
M Masella
M Mirzaie
M Nanias
M Stork
M Vieth
MJ Rooman
MJ Sippl
MM Seibert
MR Betancourt
MR Lee
MS Friedrichs
MS Lin
MS Shell
MY Shen
N Todorova
N Yu
NV Buchete
O Dor
O Zimmermann
P Bradley
P Robustelli
P Sherwood
PA Bash
PD Renfrew
PD Thomas
PEM Lopes
PH Maccallum
PI Bakker de
R Kuang
R Paulini
R Samudrala
R Srinivasan
RW Montalvao
S Brown
S Chowdhury
S Kannan
S Liu
S Miyazawa
S Neal
S Oldziej
S Patel
S Piana
S Roy
S Tanaka
SB Ozkan
SF Altschul
SJ Weiner
T Hamelryck
T Kortemme
T Lazaridis
T Yoshidome
TC Terwilliger
TJ Brunette
U Ryde
UHE Hansmann
V Leone
V Tozzini
V Tsui
VA Eyrich
W Blokzijl
W Boomsma
W Xie
W Zhang
WS Xie
WW Chen
X Zhu
XF Li
XP Xu
Y Duan
Y Shan
Y Shen
Y Sugita
Y Zhang
Y Zhou
Yaoqi Zhou
YD Yang
YG Mu
YH Tan
YH Wu
Yong Duan
YQ Gao
Yuedong Yang
YX Liu
Z Wang
ZX Wang
Publication venue: Springer-Verlag
Publication date: 01/01/2010

Predicting the structure of a protein from its amino acid sequence is a long-standing unsolved problem in computational biology. Its solution would be of both fundamental and practical importance as the gap between the number of known sequences and the number of experimentally solved structures widens rapidly. Currently, the most successful approaches are based on fragment/template reassembly. The lack of progress in template-free structure prediction calls for novel ideas and approaches. This article reviews trends in the development of physical and specific knowledge-based energy functions as well as sampling techniques for fragment-free structure prediction. Recent physical- and knowledge-based studies demonstrated that it is possible to sample and predict highly accurate protein structures without borrowing native fragments from known protein structures. These emerging approaches with fully flexible sampling have the potential to move the field forward.

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Climatic implications on variations of Qehan Lake in the arid regions of Inner Mongolia during the recent five decades

Author: B Wunnemann
C Morrill
D Bian
DB Madsen
G Yu
GB Philip
HB Mann
HC Hartmann
HL Zhao
J Bai
JB Li
Jiyao Liu
K Kezer
KA Poianik
KA Thomas
Khkhuudei Ulambadrakh
L Fan
L Huang
LD Sun
LP Zhu
LQ Liang
M Ma
Mei Yong
NK Davi
PM Zhai
RH Ma
Riguge Su
SL Liu
SL Tao
SZ Qi
W Qian
Wenjun Liang
XC Liu
Xi Chun
XQ Feng
XW Wang
XY Li
YF Shi
YH Ding
YN Chen
YR Zheng
ZG Niu
ZG Shao
ZM Ma
ZM Wang
ZX Xu
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref