MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation
The recent popularity of text-to-image diffusion models (DM) can largely be
attributed to the intuitive interface they provide to users. The intended
generation can be expressed in natural language, with the model producing
faithful interpretations of text prompts. However, expressing complex or
nuanced ideas in text alone can be difficult. To ease image generation, we
propose MultiFusion that allows one to express complex and nuanced concepts
with arbitrarily interleaved inputs of multiple modalities and languages.
MultiFusion leverages pre-trained models and aligns them for integration into a
cohesive system, thereby avoiding the need for extensive training from scratch.
Our experimental results demonstrate the efficient transfer of capabilities
from individual modules to the downstream model. Specifically, the fusion of
all independent components allows the image generation module to utilize
multilingual, interleaved multimodal inputs despite being trained solely on
monomodal data in a single language.
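To make the idea of interleaved multimodal prompting more concrete, here is a minimal, hypothetical sketch, not MultiFusion's actual code: pre-trained encoders for each modality map the interleaved prompt parts to embeddings, which are concatenated in the order the user supplied them into a single conditioning sequence for the image generation module. The encoder interfaces, names, and tensor shapes below are assumptions for illustration only.

```python
import torch

def encode_interleaved(prompt_parts, text_encoder, image_encoder):
    """Illustrative sketch: fuse an interleaved multimodal prompt into one
    conditioning sequence.

    prompt_parts: list of ("text", str) or ("image", tensor) in user order.
    text_encoder / image_encoder: pre-trained encoders (assumed to be already
    aligned to a shared embedding dimension).
    """
    chunks = []
    for kind, content in prompt_parts:
        if kind == "text":
            chunks.append(text_encoder(content))   # (n_tokens, dim)
        else:
            chunks.append(image_encoder(content))  # (n_patches, dim)
    # One sequence of aligned embeddings conditions the image generator.
    return torch.cat(chunks, dim=0)

# Dummy stand-in encoders to show the call pattern; a real system would use
# pre-trained text and image encoders projected into a shared space.
dim = 16
dummy_text_enc = lambda s: torch.randn(len(s.split()), dim)
dummy_image_enc = lambda img: torch.randn(4, dim)

cond = encode_interleaved(
    [("text", "a cat in the style of"),
     ("image", torch.zeros(3, 64, 64)),
     ("text", "at night")],
    dummy_text_enc, dummy_image_enc)
print(cond.shape)  # (text tokens + image patches + text tokens, dim)
```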
Tokenizer Choice For LLM Training: Negligible or Crucial?
The recent success of LLMs has been predominantly driven by curating the
training dataset composition, scaling model architectures and dataset sizes,
and advancements in pretraining objectives, leaving tokenizer influence as a
blind spot. Shedding light on this underexplored area, we conduct a
comprehensive study on the influence of tokenizer choice on LLM downstream
performance by training 24 mono- and multilingual LLMs at a 2.6B parameter
scale, ablating different tokenizer algorithms and parameterizations. Our
studies highlight that the tokenizer choice can significantly impact the
model's downstream performance, training and inference costs. In particular, we
find that the common tokenizer evaluation metrics, fertility and parity, are not
always predictive of a model's downstream performance, rendering them a
questionable proxy. Furthermore, we show
that multilingual tokenizers trained on the five most frequent European
languages require a vocabulary size roughly three times larger than that of a
comparable English tokenizer. While English-only tokenizers have been applied
to the training of multilingual LLMs, we find that this approach results in
severe downstream performance degradation and additional training costs of up
to 68%, due to an inefficient tokenization vocabulary.
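For readers unfamiliar with the two metrics named in the abstract, the following sketch shows one common way fertility (average tokens produced per word) and parity (relative token counts on parallel sentences in two languages) can be computed. It assumes a Hugging Face tokenizer and is an illustrative example, not the paper's evaluation code.

```python
from transformers import AutoTokenizer

# Any tokenizer works here; "gpt2" is only a stand-in choice.
tok = AutoTokenizer.from_pretrained("gpt2")

def fertility(sentences):
    """Average number of tokens produced per whitespace-separated word."""
    n_tokens = sum(len(tok.tokenize(s)) for s in sentences)
    n_words = sum(len(s.split()) for s in sentences)
    return n_tokens / n_words

def parity(parallel_pairs):
    """Ratio of total token counts on parallel sentences (language A vs. B);
    values near 1.0 suggest the tokenizer treats both languages comparably."""
    a = sum(len(tok.tokenize(src)) for src, _ in parallel_pairs)
    b = sum(len(tok.tokenize(tgt)) for _, tgt in parallel_pairs)
    return a / b

print(fertility(["Tokenizer choice matters for LLM training."]))
print(parity([("The cat sat on the mat.", "Die Katze saß auf der Matte.")]))
```

As the abstract notes, low fertility and near-unity parity do not by themselves guarantee good downstream performance, which is why the study treats them as questionable proxies.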