Who are Like-minded: Mining User Interest Similarity in Online Social Networks
In this paper, we mine and learn to predict how similar a pair of users'
interests towards videos are, based on demographic (age, gender and location)
and social (friendship, interaction and group membership) information of these
users. We use the video access patterns of active users as ground truth (a form
of benchmark). We adopt tag-based user profiling to establish this ground
truth, and justify why it is used instead of video-based methods, or many
latent topic models such as LDA and Collaborative Filtering approaches. We then
show the effectiveness of the different demographic and social features, and
their combinations and derivatives, in predicting user interest similarity,
based on different machine-learning methods for combining multiple features. We
propose a hybrid tree-encoded linear model for combining the features, and show
that it outperforms other linear and tree-based models. Our methods can be used
to predict user interest similarity when the ground truth is not available,
e.g. for new users, or inactive users whose interests may have changed from old
access data, and are useful for video recommendation. Our study is based on a
rich dataset from Tencent, a popular service provider of social networks, video
services, and various other services in China.
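As a rough illustration of the tag-based profiling idea described in this abstract, the sketch below builds a tag-count profile per user from accessed videos and compares two profiles. The abstract does not state which similarity measure the authors use; cosine similarity is an assumption here, and the names `tag_profile`, `cosine_similarity`, and `video_tags` are hypothetical.

```python
from collections import Counter
from math import sqrt


def tag_profile(watched_videos, video_tags):
    """Build a tag-count profile from the videos a user accessed.

    `video_tags` maps a video id to its list of tags (hypothetical data layout).
    """
    profile = Counter()
    for video in watched_videos:
        profile.update(video_tags.get(video, ()))
    return profile


def cosine_similarity(p, q):
    """Cosine similarity between two tag-count profiles (assumed measure)."""
    dot = sum(p[tag] * q[tag] for tag in p.keys() & q.keys())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # an empty profile is treated as sharing no interests
    return dot / (norm_p * norm_q)


# Toy data, invented purely for illustration.
video_tags = {"v1": ["sports", "news"], "v2": ["sports"], "v3": ["music"]}
alice = tag_profile(["v1", "v2"], video_tags)
bob = tag_profile(["v2"], video_tags)
carol = tag_profile(["v3"], video_tags)
```

Under this sketch, users who access videos with overlapping tags score close to 1, and users with disjoint tags score 0.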
Toward Edge-Efficient Dense Predictions with Synergistic Multi-Task Neural Architecture Search
In this work, we propose a novel and scalable solution to address the
challenges of developing efficient dense predictions on edge platforms. Our
first key insight is that Multi-Task Learning (MTL) and hardware-aware Neural
Architecture Search (NAS) can work in synergy to greatly benefit on-device
Dense Predictions (DP). Empirical results reveal that the joint learning of the
two paradigms is surprisingly effective at improving DP accuracy, achieving
superior performance over both the transfer learning of single-task NAS and
prior state-of-the-art approaches in MTL, all with just 1/10th of the
computation. To the best of our knowledge, our framework, named EDNAS, is the
first to successfully leverage the synergistic relationship of NAS and MTL for
DP. Our second key insight is that the standard depth training for multi-task
DP can introduce significant instability and noise into MTL evaluation. Instead, we
propose JAReD, an improved, easy-to-adopt Joint Absolute-Relative Depth loss,
that reduces up to 88% of the undesired noise while simultaneously boosting
accuracy. We conduct extensive evaluations on standard datasets, benchmark
against strong baselines and state-of-the-art approaches, as well as provide an
analysis of the discovered optimal architectures. Comment: WACV 2023. 14 pages, 5 figures
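The abstract names a Joint Absolute-Relative Depth loss but does not give its formula. The sketch below is therefore only one plausible reading: it assumes a weighted sum of the mean absolute depth error and the mean relative depth error. The function name `jared_loss`, the weight `rel_weight`, and the threshold `eps` are hypothetical and not taken from the paper.

```python
def jared_loss(pred, target, rel_weight=1.0, eps=1e-6):
    """Assumed form: mean |p - t| + rel_weight * mean(|p - t| / t).

    Pixels whose ground-truth depth is not above `eps` are skipped so the
    relative term never divides by zero.
    """
    valid = [(p, t) for p, t in zip(pred, target) if t > eps]
    if not valid:
        return 0.0
    abs_err = sum(abs(p - t) for p, t in valid) / len(valid)
    rel_err = sum(abs(p - t) / t for p, t in valid) / len(valid)
    return abs_err + rel_weight * rel_err
```

With predictions `[2.0, 4.0]` and targets `[1.0, 4.0]`, both the absolute and the relative term equal 0.5 under this assumed form, so the relative term weights errors on near objects more heavily than the absolute term alone would.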
Pyrrolidine Dithiocarbamate Attenuates Paraquat-Induced Lung Injury in Rats
The lung has been demonstrated to be the main target organ of paraquat (PQ) toxicity. This study aimed to investigate the potential protective effect of pyrrolidine dithiocarbamate (PDTC) on PQ-induced pulmonary damage. Fifty-four rats were divided into control, PQ-treated and PQ+PDTC-treated groups. Rats in the PQ group were administered 40 mg/kg PQ by gastric gavage, and those in the PQ+PDTC group received 40 mg/kg PQ followed by an intraperitoneal (IP) injection of 120 mg/kg PDTC. On days 3, 7, 14 and 21 after treatment, the activities of GSH-Px and SOD, the MDA level and the HYP content were measured. TGF-β1 mRNA and protein were assayed by RT-PCR and ELISA. The MDA level in plasma and BALF was increased and the activities of GSH-Px and SOD were significantly decreased in the PQ-treated groups (P < .05) compared with the control group. The activities of GSH-Px and SOD in the PQ+PDTC-treated groups were markedly higher than those in the PQ-treated groups (P < .05), whereas the MDA level was lower. TGF-β1 mRNA and protein were significantly lower in the PQ+PDTC-treated groups than in the PQ-treated groups (P < .05). The histopathological changes in the PQ+PDTC-treated groups were milder than those in the PQ groups. Our results suggest that PDTC treatment significantly attenuated paraquat-induced pulmonary damage.
The effects of yam gruel on lowering fasted blood glucose in T2DM rats
© 2020 Xinjun Lin et al., published by De Gruyter 2020. There is increasing evidence of a link between type 2 diabetes mellitus (T2DM) and gut microbiota. Based on our previous studies, we investigated the hypoglycemic mechanisms of yam gruel to provide a scientific basis for its popularization and application. Wistar rats were randomly divided into control and T2DM model groups. Rats in the model group were given a high-sugar/high-fat diet combined with an intraperitoneal injection of streptozotocin to induce T2DM. The T2DM rats were further subdivided randomly into three groups: (1) DM, (2) DM + yam gruel, and (3) DM + metformin. After 4 weeks of intervention, changes in gut microbiota, short-chain fatty acids (SCFAs) (acetic acid, propionic acid, and butyric acid), the expression of G protein-coupled receptor 43 (GPR43), glucagon-like peptide-1 (GLP-1), and peptide YY (PYY), and fasted blood glucose (FBG) levels were observed. Yam gruel intervention elevated the abundance of probiotic bacteria and increased the SCFA content and the expression of the GPR43 receptor, GLP-1, and PYY. It also reduced FBG levels. We conclude that yam gruel can lower FBG by promoting the growth of probiotic bacteria, increasing the content of SCFAs, and enhancing the expression of the GPR43 receptor to increase the content of GLP-1 and PYY in serum.
Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts
Zero-shot text-to-speech aims at synthesizing voices with unseen speech
prompts. Previous large-scale multispeaker TTS models have successfully
achieved this goal with an enrolled recording within 10 seconds. However, most
of them are designed to utilize only short speech prompts. The limited
information in short speech prompts significantly hinders the performance of
fine-grained identity imitation. In this paper, we introduce Mega-TTS 2, a
generic zero-shot multispeaker TTS model that is capable of synthesizing speech
for unseen speakers with arbitrary-length prompts. Specifically, we 1) design a
multi-reference timbre encoder to extract timbre information from multiple
reference speeches; and 2) train a prosody language model with arbitrary-length
speech prompts. With these designs, our model is suitable for prompts of
different lengths, which extends the upper bound of speech quality for
zero-shot text-to-speech. Besides arbitrary-length prompts, we introduce
arbitrary-source prompts, which leverage the probabilities derived from
multiple P-LLM outputs to produce expressive and controlled prosody.
Furthermore, we propose a phoneme-level auto-regressive duration model to
introduce in-context learning capabilities to duration modeling. Experiments
demonstrate that our method could not only synthesize identity-preserving
speech with a short prompt of an unseen speaker but also achieve improved
performance with longer speech prompts. Audio samples can be found at
https://mega-tts.github.io/mega2_demo/
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias
Scaling text-to-speech to a large and wild dataset has been proven to be
highly effective in achieving timbre and speech style generalization,
particularly in zero-shot TTS. However, previous works usually encode speech
into latents using an audio codec and use autoregressive language models or
diffusion models to generate it, which ignores the intrinsic nature of speech
and may lead to inferior or uncontrollable results. We argue that speech can be
decomposed into several attributes (e.g., content, timbre, prosody, and phase)
and each of them should be modeled using a module with appropriate inductive
biases. From this perspective, we carefully design a novel and large zero-shot
TTS system called Mega-TTS, which is trained with large-scale wild data and
models different attributes in different ways: 1) Instead of using latents
encoded by an audio codec as the intermediate feature, we still choose the spectrogram,
as it separates the phase and other attributes very well. Phase can be
appropriately constructed by the GAN-based vocoder and does not need to be
modeled by the language model. 2) We model the timbre using global vectors
since timbre is a global attribute that changes slowly over time. 3) We further
use a VQGAN-based acoustic model to generate the spectrogram and a latent code
language model to fit the distribution of prosody, since prosody changes
quickly over time in a sentence, and language models can capture both local and
long-range dependencies. We scale Mega-TTS to multi-domain datasets with 20K
hours of speech and evaluate its performance on unseen speakers. Experimental
results demonstrate that Mega-TTS surpasses state-of-the-art TTS systems on
zero-shot TTS, speech editing, and cross-lingual TTS tasks, with superior
naturalness, robustness, and speaker similarity due to the proper inductive
bias of each module. Audio samples are available at
https://mega-tts.github.io/demo-page
Monitoring of the central blood pressure waveform via a conformal ultrasonic device.
Continuous monitoring of the central-blood-pressure waveform from deeply embedded vessels, such as the carotid artery and jugular vein, has clinical value for the prediction of all-cause cardiovascular mortality. However, existing non-invasive approaches, including photoplethysmography and tonometry, only enable access to the superficial peripheral vasculature. Although current ultrasonic technologies allow non-invasive deep-tissue observation, unstable coupling with the tissue surface resulting from the bulkiness and rigidity of conventional ultrasound probes introduces usability constraints. Here, we describe the design and operation of an ultrasonic device that is conformal to the skin and capable of capturing blood-pressure waveforms at deeply embedded arterial and venous sites. The wearable device is ultrathin (240 μm) and stretchable (with strains up to 60%), and enables the non-invasive, continuous and accurate monitoring of cardiovascular events from multiple body locations, which should facilitate its use in a variety of clinical environments