Trends and characteristics of multiple births in Baoan Shenzhen: A retrospective study over a decade
Background: Shenzhen has the largest and youngest foreign population among all cities in China. The reproductive health of pregnant women from different backgrounds is a social issue that deserves attention. In the past decade, China has liberalized its population policies to stimulate population growth, and the proportion of multiple births has continued to increase. Method: This retrospective cohort included 526,654 newborns born in Baoan, Shenzhen, from January 1, 2009, to December 31, 2019, including 515,016 singletons and 11,638 twins or triplets. Univariate regression models were used to analyze the effects of maternal sociodemographic characteristics, physiological characteristics, medical history, antenatal care and other factors associated with single vs. multiple births and to elucidate the changing trends of different factors affecting multiple births in the past 11 years. Additionally, fetal development in multiple births was analyzed by generalized linear mixed models. Results: The rates of pregnancy complications, preterm birth, and advanced-age pregnancy were significantly higher in multiple birth mothers than in single birth mothers, and more multiple pregnancies were achieved through assisted reproductive technologies. The rates of adverse outcomes such as stillbirth, malformation, hypoxia, and ultralow body weight in multiple fetuses were significantly higher than those in singleton fetuses. The trend analysis from 2009 to 2019 showed that the socioeconomic status and health level of mothers with multiple births improved over time, and the risk during pregnancy generally decreased. Simultaneously, the development indicators of multiple fetuses have improved year by year, and the proportion of adverse outcomes has also decreased significantly. A low pre-natal care utilization rate was shown to be detrimental to the development of multiple fetuses. Independent risk factors for hypoxia and very low birth weight were also identified. The differences in secular trends between the two birth groups were further revealed by time series models. Conclusion: This study presented a comprehensive survey of multiple pregnancies in the area with the largest population inflow in China. This study identified the factors that affect the health of multiple birth mothers and their fetuses, particularly suggesting that preterm birth rates and the use of assisted reproduction remain high. The findings provide a basis for the formulation of individualized pre-natal care, assisted reproductive guidance and healthcare policies for multiple births.
Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer
Diffusion models have shown remarkable performance in image generation in
recent years. However, due to a quadratic increase in memory when generating
ultra-high-resolution images (e.g. 4096*4096), the resolution of generated
images is often limited to 1024*1024. In this work, we propose a unidirectional
block attention mechanism that can adaptively adjust the memory overhead during
the inference process and handle global dependencies. Building on this module,
we adopt the DiT structure for upsampling and develop an infinite
super-resolution model capable of upsampling images of various shapes and
resolutions. Comprehensive experiments show that our model achieves SOTA
performance in generating ultra-high-resolution images in both machine and
human evaluation. Compared to commonly used UNet structures, our model can save
more than 5x memory when generating 4096*4096 images. The project URL is
https://github.com/THUDM/Inf-DiT
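To make the memory claim above concrete, here is a back-of-the-envelope sketch, not the paper's own accounting. It assumes a hypothetical patch size of 16 pixels and counts only the entries of a single full self-attention score matrix, which grows with the square of the token count. The names `num_tokens` and `attention_entries` are illustrative and do not come from the Inf-DiT code base.

```python
def num_tokens(height: int, width: int, patch: int = 16) -> int:
    """Tokens for an image cut into non-overlapping patch x patch tiles.

    The patch size of 16 is an assumption made for illustration only.
    """
    return (height // patch) * (width // patch)


def attention_entries(height: int, width: int, patch: int = 16) -> int:
    """Entries in one full self-attention score matrix (tokens squared)."""
    n = num_tokens(height, width, patch)
    return n * n


low = attention_entries(1024, 1024)
high = attention_entries(4096, 4096)
ratio = high // low
print(ratio)  # 256
```

Under these assumptions, moving from 1024*1024 to 4096*4096 multiplies the pixel count by 16 and the score-matrix size by 256, which is the kind of growth a block-wise attention scheme is meant to avoid.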
BDTS: Blockchain-based Data Trading System
Fair exchange is hard to achieve when trading data through blockchain
platforms. The reasons are twofold. First, guaranteeing fairness between
sellers and consumers is a challenging task as the deception of any
participating parties is risk-free. This leads to the second issue where
judging the behavior of data executors (such as cloud service providers) among
distrustful parties is impractical in the context of traditional trading
protocols. To fill the gaps, in this paper, we present a blockchain-based
data trading system, named BDTS. BDTS implements a fair-exchange protocol in
which benign behaviors can get rewarded while dishonest behaviors will be
punished. Our scheme requires the seller to provide consumers with the correct
encryption keys for proper execution and encourages a rational data executor to
behave faithfully for maximum benefits from rewards. We analyze the strategies
of consumers, sellers, and dealers in the trading game and point out that
everyone should be honest about their interests so that the game will reach
Nash equilibrium. Evaluations prove efficiency and practicability. Comment: ICICS 2023 (Best Paper Award)
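The idea that rewards and punishments can make honesty an equilibrium can be illustrated with a toy game. The sketch below is not the BDTS protocol or its payoff model: it reduces the setting to two symmetric players, and the values of `reward`, `gain`, `penalty` and `loss` are hypothetical numbers chosen only to show the mechanism.

```python
from itertools import product

ACTIONS = ("honest", "cheat")


def payoff(own, other, reward=2, gain=3, penalty=5, loss=3):
    """Payoff to one player in a toy two-player trading game.

    All numbers are hypothetical. A cheater always pays `penalty`
    (for example a forfeited deposit) and collects `gain` only when
    the counterpart was honest. Being cheated costs `loss`.
    """
    value = 0
    if own == "honest" and other == "honest":
        value += reward
    if own == "cheat":
        value -= penalty
        if other == "honest":
            value += gain
    if other == "cheat":
        value -= loss
    return value


def is_nash(a, b, **kw):
    """True if neither player can do better by deviating alone."""
    best_a = all(payoff(a, b, **kw) >= payoff(alt, b, **kw) for alt in ACTIONS)
    best_b = all(payoff(b, a, **kw) >= payoff(alt, a, **kw) for alt in ACTIONS)
    return best_a and best_b


equilibria = [pair for pair in product(ACTIONS, repeat=2) if is_nash(*pair)]
print(equilibria)  # [('honest', 'honest')]
```

With the penalty set above the gain from cheating, mutual honesty is the only pure-strategy equilibrium in this toy game. Calling `is_nash("honest", "honest", penalty=0)` returns `False`, which shows that honesty stops being stable once the punishment is removed.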
Effect of triptolide on proliferation and apoptosis of angiotensin II-induced cardiac fibroblasts in vitro: a preliminary study
Background: The effect of triptolide (TPL) on cardiac fibroblasts (CFbs) and cardiac fibrosis remains unknown. This study was conducted to explore the effects of TPL on proliferation and apoptosis of angiotensin II (Ang II)-induced CFbs. Materials and Methods: Ang II was used to promote proliferation of CFbs. Two dosages of TPL (10 ng/ml and 100 ng/ml) were chosen. MTT assay was used to detect cell survival rate in vitro. Flow cytometry was performed to analyze apoptosis of CFbs. Hydroxyproline concentration was detected with a hydroxyproline assay kit. Quantitative real-time PCR was used to detect the expression of TGF-β1 and Smad3 mRNA. Results: Ang II promoted CFbs proliferation significantly. Compared to the Ang II group, TPL markedly reduced the viability of CFbs and their hydroxyproline concentration (P<0.05). In addition, TPL significantly promoted apoptosis of CFbs (P<0.05). Furthermore, TPL reduced the expression of TGF-β1 and Smad3 mRNA in Ang II-induced CFbs (P<0.05). Conclusion: TPL can inhibit the proliferation of CFbs in rats by down-regulating the TGF-β1/Smad3 signaling pathway. TPL might be a promising therapeutic drug for myocardial fibrosis. Keywords: Cardiac fibroblast; triptolide; proliferation; apoptosis; angiotensi
Connecting Speech Encoder and Large Language Model for ASR
The impressive capability and versatility of large language models (LLMs)
have aroused increasing attention in automatic speech recognition (ASR), with
several pioneering studies attempting to build integrated ASR models by
connecting a speech encoder with an LLM. This paper presents a comparative
study of three commonly used structures as connectors, including fully
connected layers, multi-head cross-attention, and Q-Former. Speech encoders
from the Whisper model series as well as LLMs from the Vicuna model series with
different model sizes were studied. Experiments were performed on the commonly
used LibriSpeech, Common Voice, and GigaSpeech datasets, where the LLMs with
Q-Formers demonstrated consistent and considerable word error rate (WER)
reductions over LLMs with other connector structures. Q-Former-based LLMs can
generalise well to out-of-domain datasets, where 12% relative WER reductions
over the Whisper baseline ASR model were achieved on the Eval2000 test set
without using any in-domain training data from Switchboard. Moreover, a novel
segment-level Q-Former is proposed to enable LLMs to recognise speech segments
with a duration exceeding the limitation of the encoders, which results in 17%
relative WER reductions over other connector structures on 90-second-long
speech data.
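The segment-level idea in the abstract above can be sketched roughly as follows. This is a simplified stand-in, not the paper's architecture: a real Q-Former stacks transformer layers with learned projections, whereas this sketch uses one round of single-head cross-attention with random query vectors. The function names, the feature size of 8, the 4 queries and the segment length of 150 frames are all assumptions chosen for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def query_pool(frames, queries):
    """Single-head cross-attention: each query attends over all frames.

    Returns one output vector per query, so the number of output
    tokens is fixed no matter how many frames come in.
    """
    d = len(queries[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * fi for qi, fi in zip(q, f)) / math.sqrt(d)
                  for f in frames]
        weights = softmax(scores)
        outputs.append([sum(w * f[k] for w, f in zip(weights, frames))
                        for k in range(d)])
    return outputs


def segment_level_pool(frames, queries, segment_len):
    """Apply the same queries to each segment and concatenate the tokens."""
    tokens = []
    for start in range(0, len(frames), segment_len):
        tokens.extend(query_pool(frames[start:start + segment_len], queries))
    return tokens


random.seed(0)
frames = [[random.gauss(0, 1) for _ in range(8)] for _ in range(450)]
queries = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]
tokens = segment_level_pool(frames, queries, segment_len=150)
print(len(tokens), len(tokens[0]))  # 12 8
```

Pooling each segment separately lets the number of tokens passed to the LLM grow with input duration (3 segments times 4 queries here) instead of squeezing an arbitrarily long recording into one fixed-size set.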
SALMONN: Towards Generic Hearing Abilities for Large Language Models
Hearing is arguably an essential ability of artificial intelligence (AI)
agents in the physical world, which refers to the perception and understanding
of general auditory information consisting of at least three types of sounds:
speech, audio events, and music. In this paper, we propose SALMONN, a speech
audio language music open neural network, built by integrating a pre-trained
text-based large language model (LLM) with speech and audio encoders into a
single multimodal model. SALMONN enables the LLM to directly process and
understand general audio inputs and achieve competitive performances on a
number of speech and audio tasks used in training, such as automatic speech
recognition and translation, auditory-information-based question answering,
emotion recognition, speaker verification, and music and audio captioning etc.
SALMONN also has a diverse set of emergent abilities unseen in the training,
which includes but is not limited to speech translation to untrained languages,
speech-based slot filling, spoken-query-based question answering, audio-based
storytelling, and speech audio co-reasoning etc. The presence of cross-modal
emergent abilities is studied, and a novel few-shot activation tuning approach
is proposed to activate such abilities. To our knowledge, SALMONN is the first
model of its type and can be regarded as a step towards AI with generic hearing
abilities. The source code, model checkpoints and data are available at
https://github.com/bytedance/SALMONN
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models
Audio-visual large language models (LLMs) have drawn significant attention,
yet the fine-grained combination of both input streams is rather
under-explored, which is challenging but necessary for LLMs to understand
general video inputs. To this end, a fine-grained audio-visual joint
representation (FAVOR) learning framework for multimodal LLMs is proposed in
this paper, which extends a text-based LLM to simultaneously perceive speech
and audio events in the audio input stream and images or videos in the visual
input stream, at the frame level. To fuse the audio and visual feature streams
into joint representations and to align the joint space with the LLM input
embedding space, we propose a causal Q-Former structure with a causal attention
module to enhance the capture of causal relations of the audio-visual frames
across time. An audio-visual evaluation benchmark (AVEB) is also proposed which
comprises six representative single-modal tasks with five cross-modal tasks
reflecting audio-visual co-reasoning abilities. While achieving competitive
single-modal performance on audio, speech and image tasks in AVEB, FAVOR
achieved over 20% accuracy improvements on the video question-answering task
when fine-grained information or temporal causal reasoning is required. FAVOR,
in addition, demonstrated remarkable video comprehension and reasoning
abilities on tasks that are unprecedented by other multimodal LLMs. An
interactive demo of FAVOR is available at
https://github.com/BriansIDP/AudioVisualLLM.git, and the training code and
model checkpoints will be released soon.