Bitewing Radiography Semantic Segmentation Base on Conditional Generative Adversarial Nets
Segmentation of bitewing radiography images is currently a very challenging
task. This study focuses on segmenting such images into caries, enamel, dentin,
pulp, crowns, restorations and root canal treatments. The main method for
semantic segmentation of bitewing radiography images at this stage is the
U-shaped deep convolutional neural network, but its accuracy is low. To
improve the accuracy, this paper proposes combining a conditional generative
adversarial network (cGAN) with the U-shaped network structure (U-Net). The
experimental results show that the accuracy of cGAN combined with U-Net is
69.7%, which is 13.3 percentage points higher than the 56.4% accuracy of the
U-shaped deep convolutional neural network.
Comment: 12 pages, in Chinese
Retinal Vessels Segmentation Based on Dilated Multi-Scale Convolutional Neural Network
Accurate segmentation of retinal vessels is a basic step in diabetic
retinopathy (DR) detection. Most methods based on deep convolutional neural
networks (DCNNs) have small receptive fields, so they cannot capture global
context information over larger regions and have difficulty identifying
lesions. As a result, the final segmented retinal vessels contain more noise
and the classification accuracy is low. In this paper, we therefore propose a
DCNN structure named D-Net. In the proposed D-Net, dilated convolution is
used in the backbone network to obtain a larger receptive field without losing
spatial resolution, which reduces the loss of feature information and makes
tiny, thin vessels easier to segment. The large receptive field can better
distinguish between the lesion area and the blood vessel area. In the proposed
Multi-Scale Information Fusion (MSIF) module, parallel convolution layers with
different dilation rates are used, so that the model can obtain denser feature
information and better capture retinal vessel information at different sizes.
In the decoding module, skip-layer connections propagate context information
to higher-resolution layers, so that low-level information does not have to
pass through the entire network structure. Finally, our method was verified on
the DRIVE, STARE and CHASE datasets. The experimental results show that our
network structure outperforms several state-of-the-art methods, such as
N4-fields, U-Net, and DRIU, in terms of accuracy, sensitivity, specificity,
and AUCROC. In particular, D-Net outperforms U-Net by 1.04%, 1.23% and 2.79%
on the DRIVE, STARE, and CHASE datasets, respectively.
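The claim that dilation enlarges the receptive field without reducing resolution can be checked with a small calculation. The sketch below assumes a stack of stride-1 convolutions with 3x3 kernels; the function name and the dilation rates are illustrative and are not taken from the D-Net paper.

```python
def receptive_field(dilations, kernel_size=3):
    """Receptive field (in pixels, along one axis) of stacked stride-1 convolutions."""
    rf = 1
    for d in dilations:
        # A k-tap kernel with dilation d spans (k - 1) * d + 1 input positions,
        # so each layer widens the field by (k - 1) * d.
        rf += (kernel_size - 1) * d
    return rf


if __name__ == "__main__":
    print(receptive_field([1, 1, 1]))  # 7: three ordinary 3x3 layers
    print(receptive_field([1, 2, 4]))  # 15: same depth and weights, dilated
```

With stride 1 throughout, the feature map keeps its spatial size, so the wider field comes at no cost in resolution or parameter count.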
Learning Disentangled Representations for Timber and Pitch in Music Audio
Timbre and pitch are the two main perceptual properties of musical sounds.
Depending on the target applications, we sometimes prefer to focus on one of
them, while reducing the effect of the other. Researchers have managed to
hand-craft such timbre-invariant or pitch-invariant features using domain
knowledge and signal processing techniques, but it remains difficult to
disentangle them in the resulting feature representations. Drawing upon
state-of-the-art techniques in representation learning, we propose in this
paper two deep convolutional neural network models for learning disentangled
representation of musical timbre and pitch. Both models use encoders/decoders
and adversarial training to learn music representations, but the second model
additionally uses skip connections to deal with the pitch information. As music
is an art of time, the two models are supervised by frame-level instrument and
pitch labels using a new dataset collected from MuseScore. We compare the
results of the two disentangling models with a new evaluation protocol called
"timbre crossover", which leads to interesting applications in audio-domain
music editing. Via various objective evaluations, we show that the second model
can better change the instrumentation of a multi-instrument music piece without
much affecting the pitch structure. By disentangling timbre and pitch, we
envision that the model can also contribute to generating more realistic music
audio.
Multitask learning for frame-level instrument recognition
For many music analysis problems, we need to know the presence of instruments
for each time frame in a multi-instrument musical piece. However, such a
frame-level instrument recognition task remains difficult, mainly due to the
lack of labeled datasets. To address this issue, we present in this paper a
large-scale dataset that contains synthetic polyphonic music with frame-level
pitch and instrument labels. Moreover, we propose a simple yet novel network
architecture to jointly predict the pitch and instrument for each frame. With
this multitask learning method, the pitch information can be leveraged to
predict the instruments, and vice versa. In addition, by using the
so-called pianoroll representation of music as the main target output of the
model, our model also predicts the instruments that play each individual note
event. We validate the effectiveness of the proposed method for frame-level
instrument recognition by comparing it with its single-task ablated versions and
three state-of-the-art methods. We also demonstrate the result of the proposed
method for multipitch streaming with real-world music. For reproducibility, we
will share the code to crawl the data and to implement the proposed model at:
https://github.com/biboamy/instrument-streaming.
Comment: This is a pre-print version of an ICASSP 2019 paper
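To make the pianoroll idea above concrete, the sketch below shows how one instrument-by-pitch-by-frame binary array yields both frame-level instrument labels and frame-level pitch labels. The nested-list layout `roll[instrument][pitch][frame]` and the function name are assumptions for illustration; the paper's actual tensor layout is not given in the abstract.

```python
def frame_level_labels(roll):
    """Collapse a binary pianoroll roll[instrument][pitch][frame].

    Returns (instruments, pitches), where instruments[t][i] is 1 if
    instrument i sounds in frame t and pitches[t][p] is 1 if pitch p
    sounds in frame t (from any instrument).
    """
    n_inst = len(roll)
    n_pitch = len(roll[0])
    n_frames = len(roll[0][0])
    instruments = [
        [int(any(roll[i][p][t] for p in range(n_pitch))) for i in range(n_inst)]
        for t in range(n_frames)
    ]
    pitches = [
        [int(any(roll[i][p][t] for i in range(n_inst))) for p in range(n_pitch)]
        for t in range(n_frames)
    ]
    return instruments, pitches


if __name__ == "__main__":
    # Two instruments, three pitches, two frames.
    roll = [
        [[1, 0], [0, 0], [0, 0]],  # instrument 0: pitch 0 in frame 0
        [[0, 0], [0, 1], [0, 1]],  # instrument 1: pitches 1 and 2 in frame 1
    ]
    instruments, pitches = frame_level_labels(roll)
    print(instruments)
    print(pitches)
```

Because each active cell names an instrument and a pitch at once, predicting the full roll also attributes every note to the instrument that plays it.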
Nuclear mass parabola and its applications
We propose a method to extract the properties of the isobaric mass parabola
based on the total double β-decay energies of isobaric nuclei. Two
important parameters of the mass parabola, the location of the most
β-stable nuclei and the curvature parameter, are obtained
for 251 A values based on the total double β-decay energies of nuclei
compiled in the AME2016 database. The advantage of this approach is that, in
the course of the procedure, we can remove the pairing energy term caused by
odd-even variation and the mass excess of the most stable nuclide for mass
number A, both of which are used in the mass parabolic fitting method. The
Coulomb energy coefficient (in MeV) is determined by the mass
difference relation of mirror nuclei, and the symmetry energy coefficient is
also studied via a corresponding relation.
Comment: 16 pages, 5 figures, To be published in Chinese Physics
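The cancellation described above can be sketched numerically. The example assumes a parabola of the form M(A,Z) = M0 + 0.5 * b_A * (Z - Z_A)^2 + P, with a pairing term P that depends only on the parity of Z, and defines the double β-decay energy as Q(Z) = M(A,Z) - M(A,Z+2). Under those assumptions Q(Z) = -2 * b_A * (Z - Z_A + 1), a straight line in which neither M0 nor P appears. The symbol names, the factor 0.5, and all numerical values are illustrative assumptions, not taken from the paper.

```python
def mass_excess(z, z_a, b_a, m0, delta):
    """Toy isobaric mass parabola for even A; pairing alternates with Z parity."""
    pairing = -delta if z % 2 == 0 else delta
    return m0 + 0.5 * b_a * (z - z_a) ** 2 + pairing


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def parabola_from_q2beta(zs, q_values):
    """Recover (Z_A, b_A) from double beta-decay energies Q(Z) = M(Z) - M(Z+2)."""
    slope, intercept = fit_line(zs, q_values)
    b_a = -slope / 2.0                 # slope of Q(Z) is -2 * b_A
    z_a = -intercept / slope + 1.0     # Q vanishes at Z = Z_A - 1
    return z_a, b_a


if __name__ == "__main__":
    # Synthetic parameters, chosen only for the demonstration.
    true_z_a, true_b_a, m0, delta = 44.3, 1.2, -80.0, 1.5
    zs = list(range(40, 48))
    qs = [mass_excess(z, true_z_a, true_b_a, m0, delta)
          - mass_excess(z + 2, true_z_a, true_b_a, m0, delta) for z in zs]
    print(parabola_from_q2beta(zs, qs))
```

Since Z and Z+2 share the same parity, the pairing term is identical in both masses and drops out of the difference, as does the constant M0.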
Retinal Vessel Segmentation Based on Conditional Deep Convolutional Generative Adversarial Networks
The segmentation of retinal vessels is important for doctors diagnosing
fundus diseases. However, existing methods have various problems, such as
insufficient segmentation of retinal vessels, weak robustness to noise, and
sensitivity to lesions. To address the shortcomings of existing methods, this
paper proposes the use of conditional deep convolutional generative adversarial
networks to segment the retinal vessels. We mainly improve the network
structure of the generator. Introducing a residual module at the
convolutional layer for residual learning makes the network structure sensitive
to changes in the output, so as to better adjust the weights of the generator.
To reduce the number of parameters and calculations, a small convolution
halves the number of channels in the input feature map before a large
convolution kernel is applied. Skip connections link the output of
the convolutional layer with the output of the deconvolution layer to avoid
low-level information sharing. Verified on the DRIVE and STARE datasets, the
method reaches a segmentation accuracy of 96.08% and 97.71%, a sensitivity of
82.74% and 85.34%, and an F-measure of 82.08% and 85.02%, respectively. The
sensitivity is 4.82% and 2.4% higher than that of R2U-Net.
Comment: in Chinese
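The parameter saving from halving the channels before a large kernel can be estimated with a short count. This is a minimal sketch under stated assumptions: the "small convolution" is taken to be 1x1, the large kernel 7x7, biases are ignored, and the channel counts are made up; none of these specifics come from the abstract.

```python
def conv_params(c_in, c_out, kernel):
    """Number of weights in a 2-D convolution layer (biases ignored)."""
    return kernel * kernel * c_in * c_out


def bottleneck_params(c_in, c_out, kernel):
    """A 1x1 convolution that halves the channels, then the large kernel."""
    c_mid = c_in // 2
    return conv_params(c_in, c_mid, 1) + conv_params(c_mid, c_out, kernel)


if __name__ == "__main__":
    direct = conv_params(64, 64, 7)
    reduced = bottleneck_params(64, 64, 7)
    print(direct)   # 200704
    print(reduced)  # 102400
```

In this toy setting the two-step version needs roughly half the weights of the direct large-kernel layer, because the expensive kernel sees only half as many input channels.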
High-energy gamma-ray afterglows from low-luminosity gamma-ray bursts
The observations of gamma-ray bursts (GRBs) such as 980425, 031203 and
060218, with luminosities much lower than those of other classic bursts, lead
to the definition of a new class of GRBs: low-luminosity GRBs. The nature of
the outflow responsible for them is not yet clear. Two scenarios have been
suggested: one is the conventional relativistic outflow with an initial Lorentz
factor of order $\Gamma_0 \gtrsim 10$, and the other is a trans-relativistic
outflow. Here we compare the high-energy gamma-ray
afterglow emission from these two different models, taking into account both
synchrotron self inverse-Compton scattering (SSC) and the external
inverse-Compton scattering due to photons from the cooling supernova or
hypernova envelope (SNIC). We find that the conventional relativistic outflow
model predicts a relatively high gamma-ray flux from SSC at early times (for
typical parameters) with a rapidly decaying light curve, while in
the trans-relativistic outflow model, one would expect a much flatter light
curve of high-energy gamma-ray emission at early times, which could be
dominated by both the SSC emission and SNIC emission, depending on the
properties of the underlying supernova and the shock parameters.
The Fermi Gamma-ray Space Telescope should be able to distinguish
between the two models in the future.
Comment: Published in ApJ, 29 pages (aastex style), 6 figures
Hit Song Prediction for Pop Music by Siamese CNN with Ranking Loss
A model for hit song prediction can be used in the pop music industry to
identify emerging trends and potential artists or songs before they are
marketed to the public. While most previous work formulates hit song prediction
as a regression or classification problem, we present in this paper a
convolutional neural network (CNN) model that treats it as a ranking problem.
Specifically, we use a commercial dataset with daily play-counts to train a
multi-objective Siamese CNN model with Euclidean loss and pairwise ranking loss
to learn from audio the relative ranking relations among songs. In addition, we
devise a number of pair sampling methods based on empirical
observations of the data. Our experiments show that the proposed model with a
sampling method called A/B sampling leads to much higher accuracy in hit song
prediction than the baseline regression model. Moreover, we can further improve
the accuracy by using a neural attention mechanism to extract the highlights of
songs and by using a separate CNN model to offer high-level features of songs.
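The combination of a Euclidean loss with a pairwise ranking loss mentioned above can be sketched for a single pair of songs. The abstract does not give the exact formulas, so the hinge form of the ranking term, the `margin` and `weight` parameters, and all function names below are assumptions. The scores stand in for the outputs of the two weight-sharing branches of a Siamese network.

```python
def euclidean_loss(pred, target):
    """Squared error between a predicted and a true popularity score."""
    return (pred - target) ** 2


def pairwise_ranking_loss(pred_a, pred_b, target_a, target_b, margin=0.1):
    """Hinge loss: zero once the predictions are ordered like the targets
    with a gap of at least `margin`."""
    if target_a == target_b:
        return 0.0  # no preferred order for tied songs
    sign = 1.0 if target_a > target_b else -1.0
    return max(0.0, margin - sign * (pred_a - pred_b))


def siamese_loss(pred_a, pred_b, target_a, target_b, weight=1.0, margin=0.1):
    """Multi-objective loss for one song pair: regression plus ranking."""
    regression = euclidean_loss(pred_a, target_a) + euclidean_loss(pred_b, target_b)
    ranking = pairwise_ranking_loss(pred_a, pred_b, target_a, target_b, margin)
    return regression + weight * ranking


if __name__ == "__main__":
    # Song A is truly more popular than song B (targets 1.0 vs 0.0).
    print(pairwise_ranking_loss(0.9, 0.2, 1.0, 0.0))  # correct order: no penalty
    print(pairwise_ranking_loss(0.2, 0.9, 1.0, 0.0))  # wrong order: penalised
```

The ranking term only cares about the order of the two predictions, which is what lets the model learn relative popularity even when absolute play-count estimates are noisy.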
Multitask learning for instrument activation aware music source separation
Music source separation is a core task in music information retrieval (MIR) that
has seen dramatic improvement in recent years. Nevertheless, most
existing systems focus exclusively on the problem of source separation itself
and ignore other, possibly related, MIR tasks that
could lead to additional quality gains. In this work, we propose a novel
multitask structure to investigate using instrument activation information to
improve source separation performance. Furthermore, we investigate our system
on six independent instruments, a more realistic scenario than the three
instruments included in the widely-used MUSDB dataset, by leveraging a
combination of the MedleyDB and Mixing Secrets datasets. The results show that
our proposed multitask model outperforms the baseline Open-Unmix model on the
mixture of the Mixing Secrets and MedleyDB datasets while maintaining comparable
performance on the MUSDB dataset.
Initial Sampling in Symmetrical Quasiclassical Dynamics Based on Li-Miller Mapping Hamiltonian
A symmetrical quasiclassical (SQC) dynamics approach based on the Li-Miller
(LM) mapping Hamiltonian (SQC-LM) was employed to describe nonadiabatic
dynamics. In principle, different initial sampling procedures may be
applied in SQC-LM dynamics, and the results may depend on the initial
sampling. We provided various initial sampling approaches and checked their
influence. We selected two groups of models, site-exciton models for
exciton dynamics and linear vibronic coupling models for conical intersections,
to test the performance of SQC-LM dynamics with the different initial sampling
methods. The results were compared with those of accurate
multilayer multiconfigurational time-dependent Hartree (ML-MCTDH) quantum
dynamics. For both groups of models, the SQC-LM method gives a more or less
reasonable description of the population dynamics, while the influence of the
initial sampling approach on the final results is noticeable. It seems that
the proper initial sampling method should be determined by the system under
study. This indicates that combining the SQC-LM method with a suitable
sampling approach may be a promising way to describe nonadiabatic
dynamics.