Content-based Controls For Music Large Language Modeling
Recent years have witnessed a rapid growth of large-scale language models in
the domain of music audio. Such models enable end-to-end generation of
higher-quality music, and some allow conditioned generation using text
descriptions. However, the control power of text controls on music is
intrinsically limited, as they can only describe music indirectly through
meta-data (such as singers and instruments) or high-level representations (such
as genre and emotion). We aim to further equip the models with direct and
content-based controls on innate music languages such as pitch, chords and drum
track. To this end, we contribute Coco-Mulla, a content-based control method
for music large language modeling. It uses a parameter-efficient fine-tuning
(PEFT) method tailored for Transformer-based audio models. Experiments show
that our approach achieves high-quality music generation with low-resource
semi-supervised learning, tuning fewer than 4% of the original model's
parameters and training on a small dataset of fewer than 300 songs.
Moreover, our approach enables effective content-based controls, and we
illustrate the control power via chords and rhythms, two of the most salient
features of music audio. Furthermore, we show that by combining content-based
controls and text descriptions, our system achieves flexible music variation
generation and style transfer. Our source code and demos are available online.
Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing
Creating music is iterative, requiring varied methods at each stage. However,
existing AI music systems fall short in orchestrating multiple subsystems for
diverse needs. To address this gap, we introduce Loop Copilot, a novel system
that enables users to generate and iteratively refine music through an
interactive, multi-round dialogue interface. The system uses a large language
model to interpret user intentions and select appropriate AI models for task
execution. Each backend model is specialized for a specific task, and their
outputs are aggregated to meet the user's requirements. To ensure musical
coherence, essential attributes are maintained in a centralized table. We
evaluate the effectiveness of the proposed system through semi-structured
interviews and questionnaires, highlighting not only its utility in
facilitating music creation but also its potential for broader applications.
Comment: Source code and demo video are available at
https://sites.google.com/view/loop-copilot
Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
Pre-training on large-scale video data has become a common recipe for
learning transferable spatiotemporal representations in recent years. Despite
some progress, existing methods are mostly limited to highly curated datasets
(e.g., K400) and exhibit unsatisfactory out-of-the-box representations. We
argue that this is because they capture only pixel-level knowledge rather
than spatiotemporal commonsense, which falls far short of cognition-level
video understanding. Inspired by the great success of image-text pre-training
(e.g., CLIP), we take the first step to exploit language semantics to boost
transferable spatiotemporal representation learning. We introduce a new pretext
task, Turning to Video for Transcript Sorting (TVTS), which sorts shuffled ASR
scripts by attending to learned video representations. We do not rely on
descriptive captions and learn purely from video, i.e., leveraging the natural
transcribed speech knowledge to provide noisy but useful semantics over time.
Furthermore, rather than the simple concept learning in vision-caption
contrast, we encourage cognition-level temporal commonsense reasoning via
narrative reorganization. These advantages enable our model to contextualize
what is happening as humans do and to apply seamlessly to large-scale uncurated
video data in the real world. Note that our method differs from ones designed
for video-text alignment (e.g., Frozen) and multimodal representation learning
(e.g., Merlot). Our method demonstrates strong out-of-the-box spatiotemporal
representations on diverse video benchmarks, e.g., +13.6% gains over VideoMAE
on SSV2 via linear probing.
The Protective Role of Hyaluronic Acid in Cr(VI)-Induced Oxidative Damage in Corneal Epithelial Cells
Cr(VI) exposure can produce various intermediates and reactive oxygen species (ROS), both of which are related to DNA damage. Hyaluronan (HA) has impressive biological functions and has been reported to protect corneal epithelial cells against oxidative damage induced by ultraviolet B, benzalkonium chloride, and sodium lauryl sulfate. The aim of our study was therefore to investigate whether HA protects human corneal epithelial (HCE) cells against Cr(VI)-induced toxic effects. HCE cells were exposed to different concentrations of K2Cr2O7 (1.875, 3.75, 7.5, 15.0, and 30 μM) or to a combination of K2Cr2O7 and 0.2% HA and incubated for different times (15 min, 30 min, and 60 min). Our data showed that Cr(VI) exposure decreased cell viability and increased DNA damage and ROS generation in HCE cells. Incubation with HA, however, increased HCE cell survival rates and decreased the DNA damage and ROS generation induced by Cr(VI) in a dose- and time-dependent manner. We report for the first time that HA can protect HCE cells against the toxicity of Cr(VI), indicating that it will be a promising therapeutic agent for corneal injuries caused by Cr(VI).
DeePMD-kit v2: A software package for Deep Potential models
DeePMD-kit is a powerful open-source software package that facilitates
molecular dynamics simulations using machine learning potentials (MLP) known as
Deep Potential (DP) models. This package, which was released in 2017, has been
widely used in the fields of physics, chemistry, biology, and material science
for studying atomistic systems. The current version of DeePMD-kit offers
numerous advanced features such as DeepPot-SE, attention-based and hybrid
descriptors, the ability to fit tensile properties, type embedding, model
deviation, Deep Potential - Range Correction (DPRc), Deep Potential Long Range
(DPLR), GPU support for customized operators, model compression, non-von
Neumann molecular dynamics (NVNMD), and improved usability, including
documentation, compiled binary packages, graphical user interfaces (GUI), and
application programming interfaces (API). This article presents an overview of
the current major version of the DeePMD-kit package, highlighting its features
and technical details. Additionally, the article benchmarks the accuracy and
efficiency of different models and discusses ongoing developments.
Comment: 51 pages, 2 figures
Orai1 and Orai3 Mediate Store-Operated Calcium Entry Contributing to Neuronal Excitability in Dorsal Root Ganglion Neurons
Store-operated calcium channels (SOCs) are highly calcium-selective channels that mediate calcium entry in various cell types. We have previously reported that intraplantar injection of YM-58483 (a SOC inhibitor) attenuates chronic pain. A previous study has reported that the function of SOCs in dorsal root ganglia (DRG) is enhanced after nerve injury, suggesting that SOCs may play a peripheral role in chronic pain. However, the expression, functional distribution and significance of the SOC family in DRG neurons remain elusive, and the key components that mediate SOC entry (SOCE) are still controversial. Here, we demonstrated that the SOC family (STIM1, STIM2, Orai1, Orai2, and Orai3) was expressed in DRGs and STIM1 was mainly present in small- and medium-sized DRG neurons. Using confocal live cell imaging, Ca2+ imaging and electrophysiology techniques, we demonstrated that depletion of the endoplasmic reticulum Ca2+ stores induced STIM1 and STIM2 translocation, and that inhibition of STIM1 or blockage of Orai channels with pharmacological tools attenuated SOCE and SOC currents. Using the small inhibitory RNA knockdown approach, we identified STIM1, STIM2, Orai1, and Orai3 as the key components of SOCs mediating SOCE in DRG neurons. Importantly, activation of SOCs by thapsigargin induced plasma membrane depolarization and increased neuronal excitability, which were completely abolished by inhibition of SOCs or double knockdown of Orai1 and Orai3. Our findings suggest that SOCs exert an excitatory action in DRG neurons and provide a potential peripheral mechanism for modulation of pain hypersensitivity by SOC inhibition.
PvCT: A Publicly Verifiable Contact Tracing Algorithm in Cloud Computing
Contact tracing is a critical tool in containing epidemics such as COVID-19, and researchers have studied it extensively. However, almost all existing works assume that clients and authorities have large storage space and powerful computation capability, and that clients can run contact tracing on their own mobile devices such as mobile phones, tablet computers, and wearable computers. With widespread outbreaks of epidemics, these approaches are less robust to larger-scale datasets when clients are resource-constrained. To address this limitation, we propose a publicly verifiable contact tracing algorithm in cloud computing (PvCT), which utilizes cloud services to provide storage and computation capability for contact tracing. To guarantee the integrity and accuracy of contact tracing results, PvCT applies a novel set accumulator-based authentication data structure whose computation is outsourced, and the client can check whether the returned results are valid. Furthermore, we provide a rigorous security proof of our algorithm based on the q-Strong Bilinear Diffie–Hellman assumption. A detailed experimental evaluation is also conducted on three real-world datasets. The results show that our algorithm is feasible within milliseconds of client CPU time and can significantly reduce the storage overhead from the size of the datasets to a constant 128 bytes.
Alkali-metal ions (M+ = Li+, Na+, K+, Rb+, and Cs+) endohedral cyclo[18]carbon (C18): Exploring binding interactions and predicting optical properties
With growing interest in carbon-based materials for energy storage and active research in the field of advanced optoelectronic devices, we theoretically designed ten complexes in which alkali-metal ions (M+ = Li+, Na+, K+, Rb+, and Cs+) bind inside or outside cyclo[18]carbon (C18), referred to as M+@C18in and M+@C18out, respectively, and carefully analyzed the binding interaction between M+ and C18 as well as the optical properties of the stable endohedral complexes M+@C18in. The effects of the atomic number of the alkali metal on the electronic structures, binding interactions, electronic absorption spectra, and molecular (hyper)polarizabilities of M+@C18in were studied using accurate density functional theory (DFT) calculations. The results indicate that differences in the radius and properties of M+ lead to different binding modes and strengths with C18, but there is no difference in the electronic absorption spectra of the complexes; the polarizability and second hyperpolarizability of M+@C18in containing different alkali-metal ions are similar due to their analogous electronic structures, but their first hyperpolarizabilities differ greatly due to discrepancies in molecular symmetry. The similarities and differences in intramolecular interactions, electronic absorption spectra, and (hyper)polarizability of M+@C18in were explored using advanced wavefunction analysis methods.