
    Everybody Compose: Deep Beats To Music

    This project presents a deep learning approach to generating monophonic melodies from input beats, allowing even amateurs to create their own music compositions. Three effective methods are proposed for this novel task - LSTM with Full Attention, LSTM with Local Attention, and Transformer with Relative Position Representation - providing great variation, harmony, and structure in the generated music. The project lets anyone compose their own music by tapping a keyboard or "recoloring" beat sequences from existing works.
    Comment: Accepted MMSys '2
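
    As a rough illustration of the task setup (not the authors' code), the sketch below wires a beat sequence into an LSTM decoder that emits one pitch token per beat. The attention variants are omitted, and the beat feature layout, vocabulary size, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): an LSTM decoder mapping an input
# beat sequence to a monophonic melody, one pitch token per beat.
import torch
import torch.nn as nn

class BeatsToMelodyLSTM(nn.Module):
    def __init__(self, n_pitches=128, beat_dim=2, hidden=256):
        super().__init__()
        # Assumed layout: each beat is a (duration, rest-duration) pair;
        # pitches are MIDI-like integer tokens.
        self.beat_proj = nn.Linear(beat_dim, hidden)
        self.pitch_emb = nn.Embedding(n_pitches, hidden)
        self.lstm = nn.LSTM(2 * hidden, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_pitches)

    def forward(self, beats, prev_pitches):
        # beats: (B, T, beat_dim); prev_pitches: (B, T) previous pitch tokens
        x = torch.cat([self.beat_proj(beats), self.pitch_emb(prev_pitches)], dim=-1)
        out, _ = self.lstm(x)
        return self.head(out)  # (B, T, n_pitches) logits, one pitch per beat

# Toy usage: 16 tapped beats -> 16 pitch logit vectors.
model = BeatsToMelodyLSTM()
beats = torch.rand(1, 16, 2)
prev = torch.zeros(1, 16, dtype=torch.long)
print(model(beats, prev).shape)  # torch.Size([1, 16, 128])
```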

    Convergence of flow-based generative models via proximal gradient descent in Wasserstein space

    Flow-based generative models enjoy certain advantages in computing the data generation and the likelihood, and have recently shown competitive empirical performance. Compared to the accumulating theoretical studies on related score-based diffusion models, analysis of flow-based models, which are deterministic in both the forward (data-to-noise) and reverse (noise-to-data) directions, remains sparse. In this paper, we provide a theoretical guarantee of generating the data distribution by a progressive flow model, the so-called JKO flow model, which implements the Jordan-Kinderlehrer-Otto (JKO) scheme in a normalizing flow network. Leveraging the exponential convergence of proximal gradient descent (GD) in Wasserstein space, we prove that the Kullback-Leibler (KL) guarantee of data generation by a JKO flow model is $O(\varepsilon^2)$ when using $N \lesssim \log(1/\varepsilon)$ JKO steps ($N$ residual blocks in the flow), where $\varepsilon$ is the error in the per-step first-order condition. The assumption on the data density is merely a finite second moment, and the theory extends to data distributions without density and to inversion errors in the reverse process, where we obtain mixed KL-$W_2$ error guarantees. The non-asymptotic convergence rate of the JKO-type $W_2$-proximal GD is proved for a general class of convex objective functionals that includes the KL divergence as a special case, which can be of independent interest.
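
    To make the scheme concrete, here is a schematic LaTeX rendering of one JKO step as a $W_2$-proximal update on a KL objective; the step size $h$ and the exact form of the functional are notational assumptions, not taken verbatim from the paper.

```latex
% One JKO step: a W_2-proximal (backward Euler) update on the objective
% F(\rho) = KL(\rho \| \pi), where \pi is the target distribution and
% h > 0 is a step size (both the functional and h are assumed notation).
\[
  \rho_{k+1} \;=\; \operatorname*{arg\,min}_{\rho \in \mathcal{P}_2(\mathbb{R}^d)}
  \; \mathrm{KL}(\rho \,\|\, \pi) \;+\; \frac{1}{2h}\, W_2^2(\rho, \rho_k).
\]
% The abstract's guarantee, restated: if each residual block solves its
% step's first-order condition up to error \varepsilon, then after
% N \lesssim \log(1/\varepsilon) steps the generated distribution \rho_N
% satisfies KL = O(\varepsilon^2).
```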

    CFI2P: Coarse-to-Fine Cross-Modal Correspondence Learning for Image-to-Point Cloud Registration

    In image-to-point cloud registration, acquiring point-to-pixel correspondences is challenging because the similarity between individual points and pixels is ambiguous due to the visual differences between the data modalities. Nevertheless, the same object present in the two data formats can be readily identified from the local perspective of point sets and pixel patches. Motivated by this intuition, we propose a coarse-to-fine framework that first establishes correspondences between local point sets and pixel patches and then refines the results at both the point and pixel levels. On the coarse scale, we mimic the classic Vision Transformer to translate both the image and the point cloud into two sequences of local representations, namely point and pixel proxies, and employ attention to capture global and cross-modal context. To supervise the coarse matching, we propose a novel projected point proportion loss, which guides the network to match each point set with the pixel patch into which more of its points can be projected. On the finer scale, point-to-pixel correspondences are refined within a smaller search space (i.e., the coarsely matched sets and patches) via well-designed sampling, attentional learning, and fine matching, where sampling masks are embedded in the last two steps to mitigate the negative effect of sampling. With the resulting high-quality correspondences, the registration problem is solved by the EPnP algorithm within RANSAC. Experimental results on large-scale outdoor benchmarks demonstrate our superiority over existing methods.
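
    The final pose-recovery step named in the abstract (EPnP inside RANSAC) can be sketched with OpenCV's solvePnPRansac. This covers only the last stage, not the matching network; the correspondences and camera intrinsics below are random placeholders.

```python
# Sketch of the final registration step only: given point-to-pixel
# correspondences, recover the pose with EPnP inside RANSAC via OpenCV.
import cv2
import numpy as np

object_points = np.random.rand(100, 3).astype(np.float32)    # matched 3D points (placeholder)
image_points = np.random.rand(100, 2).astype(np.float32)     # matched pixels (placeholder)
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])  # assumed camera intrinsics

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    object_points, image_points, K, distCoeffs=None,
    flags=cv2.SOLVEPNP_EPNP,          # EPnP as the minimal solver
    reprojectionError=8.0, iterationsCount=1000)
R, _ = cv2.Rodrigues(rvec)            # rotation matrix of the recovered pose
print(ok, R.shape, tvec.ravel())
```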

    Making the Most Out of the Limited Context Length: Predictive Power Varies with Clinical Note Type and Note Section

    Recent advances in large language models have led to renewed interest in natural language processing in healthcare using the free text of clinical notes. One distinguishing characteristic of clinical notes is that they span long periods of time across multiple long documents. This unique structure creates a new design choice: when the context length of a language model predictor is limited, which part of the clinical notes should we choose as the input? Existing studies either choose the input with domain knowledge or simply truncate it. We propose a framework to identify the sections with high predictive power. Using MIMIC-III, we show that: 1) the distribution of predictive power differs between nursing notes and discharge notes, and 2) combining different types of notes can improve performance when the context length is large. Our findings suggest that a carefully selected sampling function could enable more efficient information extraction from clinical notes.
    Comment: Our code is publicly available on GitHub (https://github.com/nyuolab/EfficientTransformer).
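
    A hypothetical sketch of what such a sampling function could look like: rank note sections by an assumed predictive-power score and pack them greedily into a fixed token budget. The section names, scores, and whitespace token count are all invented for illustration and are not from the paper.

```python
# Hypothetical "sampling function" in the abstract's sense: pack the
# highest-scoring note sections into a fixed context-length budget.
def pack_sections(sections, max_tokens):
    """sections: list of (name, text, score); returns the concatenated input."""
    chosen, used = [], 0
    for name, text, score in sorted(sections, key=lambda s: s[2], reverse=True):
        n_tokens = len(text.split())          # crude whitespace token count
        if used + n_tokens <= max_tokens:
            chosen.append((name, text))
            used += n_tokens
    return "\n\n".join(f"[{name}]\n{text}" for name, text in chosen)

# Made-up sections with made-up predictive-power scores.
notes = [
    ("discharge:hospital_course", "patient admitted with ...", 0.92),
    ("nursing:assessment", "alert and oriented ...", 0.75),
    ("discharge:medications", "metoprolol 25mg ...", 0.40),
]
print(pack_sections(notes, max_tokens=50))
```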

    Radial Basis Function Neural Network with Particle Swarm Optimization Algorithms for Regional Logistics Demand Prediction

    Regional logistics prediction is a key step in regional logistics planning and in the rational allocation of logistics resources. Since the regional economy is the inherent, determinative factor behind regional logistics demand, it is feasible to forecast that demand by investigating economic indicators, which can in turn accelerate the harmonious development of the regional logistics industry and the regional economy. In this paper, the PSO-RBFNN model, a radial basis function neural network (RBFNN) combined with a particle swarm optimization (PSO) algorithm, is studied. The PSO-RBFNN model is trained on regional indicator data to predict regional logistics demand, and the corresponding results indicate the model's applicability and potential advantages.
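
    A minimal sketch of the PSO-RBFNN idea under stated assumptions: the RBF network's centers, widths, and linear weights are flattened into one particle vector and fit by a basic global-best PSO on toy one-dimensional data standing in for regional indicators. Network size and PSO coefficients are illustrative, not the paper's settings.

```python
# Minimal PSO-RBFNN sketch: PSO searches the full parameter vector of a
# Gaussian-RBF regression network by minimizing training MSE.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))              # stand-in "indicator" data
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)   # stand-in demand target

n_centers, dim = 8, X.shape[1]
n_params = n_centers * dim + n_centers + n_centers + 1  # centers, widths, weights, bias

def predict(theta, X):
    c = theta[:n_centers * dim].reshape(n_centers, dim)
    s = np.abs(theta[n_centers * dim:n_centers * dim + n_centers]) + 1e-3
    w, b = theta[-n_centers - 1:-1], theta[-1]
    d2 = ((X[:, None, :] - c[None]) ** 2).sum(-1)    # squared distances to centers
    return np.exp(-d2 / (2 * s**2)) @ w + b          # Gaussian RBF layer + linear output

def mse(theta):
    return np.mean((predict(theta, X) - y) ** 2)

# Global-best PSO with standard inertia/cognitive/social coefficients.
n_particles, iters = 30, 200
pos = rng.normal(size=(n_particles, n_params))
vel = np.zeros_like(pos)
pbest, pbest_f = pos.copy(), np.array([mse(p) for p in pos])
gbest = pbest[pbest_f.argmin()].copy()
for _ in range(iters):
    r1, r2 = rng.random((2, n_particles, n_params))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    f = np.array([mse(p) for p in pos])
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
    gbest = pbest[pbest_f.argmin()].copy()
print("final MSE:", mse(gbest))
```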