Dirac shell quark-core model for the study of non-strange baryonic spectroscopy
A Dirac shell model is developed for the study of baryon spectroscopy, taking
into account the most relevant results of the quark-diquark models. The lack of
translational invariance of the shell model is avoided, in the present work, by
introducing a scalar-isoscalar fictitious particle that represents the origin
of the quark shell interaction; in this way, the states of the system are
eigenstates of the total momentum of the baryon. Only one-particle excitations
are considered. A two-quark core takes the place of the diquark, while the
third quark is excited to reproduce the baryonic resonances. For the
N and Δ, which represent the ground states of the spectra, the three
quarks are considered identical particles and the wave functions are completely
antisymmetric. The model is used to calculate the spectra of the N and Δ
resonances and the nucleon magnetic moments. The results are compared
to the present experimental data. Due to the presence of the core and to the
one-particle excitations, the structure of the obtained spectra is analogous to
that given by the quark-diquark models.
Comment: To appear in Acta Physica Polonica
Learning from Videos with Deep Convolutional LSTM Networks
This paper explores the use of convolutional LSTMs to simultaneously learn
spatial and temporal information in videos. A deep network of convolutional
LSTMs allows the model to access the entire range of temporal information at
all spatial scales of the data. We describe our experiments involving
convolutional LSTMs for lipreading, which demonstrate that the model can
selectively choose which spatiotemporal scales are most relevant for a
particular dataset. The proposed deep architecture also holds promise in other
applications where spatiotemporal features play a vital role, without having to
tailor the network design to the particular spatiotemporal features present in
the problem. For the Lip Reading in the Wild (LRW)
dataset, our model slightly outperforms the previous state of the art (83.4%
vs. 83.0%) and sets the new state of the art at 85.2% when the model is
pretrained on the Lip Reading Sentences (LRS2) dataset.
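The core idea named above, LSTM gates computed with convolutions so that state is kept per spatial position, can be sketched in plain Python for a single-channel 1D row. This is an illustrative toy, not the paper's architecture; the `conv1d` and `convlstm_step` helpers and the kernel values are invented for the example.

```python
import math

def conv1d(x, w):
    """'Same' 1D convolution (zero padding) of sequence x with an odd-length kernel w."""
    r = len(w) // 2
    out = []
    for i in range(len(x)):
        s = 0.0
        for k, wk in enumerate(w):
            j = i + k - r
            if 0 <= j < len(x):
                s += wk * x[j]
        out.append(s)
    return out

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def convlstm_step(x, h_prev, c_prev, kernels):
    """One ConvLSTM step on a 1D spatial row: the four LSTM gates use
    convolutions over space instead of dense matrix products, so the cell
    state keeps one value per spatial position."""
    wxi, whi, wxf, whf, wxo, who, wxg, whg = kernels
    i_g = [sigmoid(a + b) for a, b in zip(conv1d(x, wxi), conv1d(h_prev, whi))]
    f_g = [sigmoid(a + b) for a, b in zip(conv1d(x, wxf), conv1d(h_prev, whf))]
    o_g = [sigmoid(a + b) for a, b in zip(conv1d(x, wxo), conv1d(h_prev, who))]
    g   = [math.tanh(a + b) for a, b in zip(conv1d(x, wxg), conv1d(h_prev, whg))]
    c = [f * cp + i * gg for f, cp, i, gg in zip(f_g, c_prev, i_g, g)]
    h = [o * math.tanh(cv) for o, cv in zip(o_g, c)]
    return h, c
```

Iterating `convlstm_step` over the frames of a video, row by row, is what gives the model simultaneous access to spatial and temporal structure.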
Ultra-high energy neutrino dispersion in plasma and radiative transition
A qualitative analysis of the additional energy of neutrinos and antineutrinos
in plasma is performed. A general expression for the neutrino self-energy operator
is obtained in the case of ultra-high energies when the local limit of the weak
interaction is not valid. The additional energy of the neutrino and antineutrino
in plasma is calculated using the dependence of the W- and Z-boson propagators
on the transferred momentum. The kinematical region for the neutrino radiative
transition (the so-called "neutrino spin light") is established for some
important astrophysical cases. For high-energy neutrinos and antineutrinos, the
dominant transition channels in plasma are indicated.
Comment: 12 pages, LaTeX, 3 EPS figures, submitted to Int. J. Mod. Phys. A;
version 2: typos corrected, presentation improved; the version to be
published
Evolving Indoor Navigational Strategies Using Gated Recurrent Units In NEAT
Simultaneous Localisation and Mapping (SLAM) algorithms are expensive to run
on smaller robotic platforms such as Micro-Aerial Vehicles. Bug algorithms are
an alternative that use relatively little processing power, and avoid high
memory consumption by not building an explicit map of the environment. Bug
algorithms achieve relatively good performance in simulated and robotic
maze-solving domains. However, because they are hand-designed, a natural question is
whether they are globally optimal control policies. In this work we explore the
performance of Neuroevolution - specifically NEAT - at evolving control
policies for simulated differential drive robots carrying out generalised maze
navigation. We extend NEAT to include Gated Recurrent Units (GRUs) to help deal
with long term dependencies. We show that both NEAT and our NEAT-GRU can
repeatably generate controllers that outperform I-Bug (an algorithm
particularly well-suited for use in real robots) on a test set of 209 indoor
maze-like environments. We show that NEAT-GRU is superior to NEAT in this task,
and moreover that of the two systems only NEAT-GRU can continuously evolve
successful controllers for a much harder task in which no bearing information
about the target is provided to the agent.
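For readers unfamiliar with the gating that lets GRUs bridge long time gaps, here is a minimal scalar GRU update in plain Python. The parameter names (`wz`, `uz`, etc.) are illustrative conventions, not identifiers from NEAT-GRU.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h, p):
    """Scalar GRU update: the update gate z interpolates between the old
    state h and the candidate h_tilde, letting the unit hold information
    over long horizons (the long-term dependencies NEAT-GRU targets)."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])   # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])   # reset gate
    h_tilde = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    return (1.0 - z) * h + z * h_tilde
```

When z saturates near 0 the state is carried forward almost unchanged, which is exactly the memory behaviour a fixed-structure recurrent unit in NEAT would otherwise struggle to evolve.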
Towards Finding Longer Proofs
We present a reinforcement learning (RL) based guidance system for automated
theorem proving geared towards Finding Longer Proofs (FLoP). FLoP focuses on
generalizing from short proofs to longer ones of similar structure. To achieve
that, FLoP uses state-of-the-art RL approaches that were previously not applied
in theorem proving. In particular, we show that curriculum learning
significantly outperforms previous learning-based proof guidance on a synthetic
dataset of increasingly difficult arithmetic problems.
Comment: 9 pages, 5 figures
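Curriculum learning as described, training on easier instances before advancing to harder ones, can be sketched as a toy loop. `try_solve` is a hypothetical stand-in for the learned prover, and the pass-rate gate is an assumption of this sketch, not FLoP's actual schedule.

```python
def run_curriculum(levels, try_solve, pass_rate=0.8, max_rounds=100):
    """Present problem sets easiest-first; only advance to the next
    difficulty level once the solver passes `pass_rate` of the current one.
    In a real system, training steps would run between attempts."""
    mastered = []
    for level, problems in enumerate(levels):
        for _ in range(max_rounds):
            if sum(map(try_solve, problems)) / len(problems) >= pass_rate:
                mastered.append(level)
                break
        else:
            break  # the solver never mastered this level; stop advancing
    return mastered
```

The point of such a schedule is that skills acquired on short instances transfer to structurally similar longer ones, which is the generalization FLoP measures.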
Preventing Posterior Collapse with Levenshtein Variational Autoencoder
Variational autoencoders (VAEs) are a standard framework for inducing latent
variable models that have been shown effective in learning text representations
as well as in text generation. The key challenge with using VAEs is the {\it
posterior collapse} problem: learning tends to converge to trivial solutions
where the generators ignore latent variables. In our Levenshtein VAE, we propose
to replace the evidence lower bound (ELBO) with a new objective which is simple
to optimize and prevents posterior collapse. Intuitively, it corresponds to
generating a sequence from the autoencoder and encouraging the model to predict
an optimal continuation according to the Levenshtein distance (LD) with the
reference sentence at each time step in the generated sequence. We motivate the
method from the probabilistic perspective by showing that it is closely related
to optimizing a bound on the intractable Kullback-Leibler divergence of an
LD-based kernel density estimator from the model distribution. With this
objective, any generator disregarding latent variables will incur large
penalties and hence posterior collapse does not happen. We relate our approach
to policy distillation \cite{RossGB11} and dynamic oracles \cite{GoldbergN12}.
By considering Yelp and SNLI benchmarks, we show that the Levenshtein VAE produces
more informative latent representations than alternative approaches to
preventing posterior collapse.
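The Levenshtein distance (LD) at the heart of the objective is the standard edit distance; a minimal dynamic-programming implementation, shown here only to make the metric concrete:

```python
def levenshtein(a, b):
    """Classic DP edit distance: the minimum number of insertions,
    deletions, and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]
```

In the objective above, a score derived from this distance to the reference sentence supervises the model's continuation at each generation step.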
AMNet: Deep Atrous Multiscale Stereo Disparity Estimation Networks
In this paper, a new deep learning architecture for stereo disparity
estimation is proposed. The proposed atrous multiscale network (AMNet) adopts
an efficient feature extractor with depthwise-separable convolutions and an
extended cost volume that deploys novel stereo matching costs on the deep
features. A stacked atrous multiscale network is proposed to aggregate rich
multiscale contextual information from the cost volume, which allows for
estimating the disparity with high accuracy at multiple scales. AMNet can be
further modified to be a foreground-background aware network, FBA-AMNet, which
is capable of discriminating between the foreground and the background objects
in the scene at multiple scales. An iterative multitask learning method is
proposed to train FBA-AMNet end-to-end. The proposed disparity estimation
networks, AMNet and FBA-AMNet, show accurate disparity estimates and advance
the state of the art on the challenging Middlebury, KITTI 2012, KITTI 2015, and
Sceneflow stereo disparity estimation benchmarks.
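The "atrous" (dilated) convolutions named above space kernel taps apart to enlarge the receptive field without adding weights. A minimal 1D sketch with valid padding and a single channel, purely illustrative rather than AMNet's actual layers:

```python
def atrous_conv1d(x, w, rate):
    """Dilated ('atrous') 1D convolution: kernel taps are spaced `rate`
    samples apart, so a kernel of len(w) weights covers a receptive field
    of (len(w) - 1) * rate + 1 input samples."""
    span = (len(w) - 1) * rate
    return [sum(wk * x[i + k * rate] for k, wk in enumerate(w))
            for i in range(len(x) - span)]
```

With `rate=1` this reduces to an ordinary (cross-correlation ordered) convolution; stacking several rates is what lets a network aggregate context at multiple scales at once.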
Towards Characterizing COVID-19 Awareness on Twitter
The coronavirus (COVID-19) pandemic has significantly altered our lifestyles,
as we resort to preventive measures such as social distancing and quarantine to
minimize its spread. An increasingly worrying aspect is the gap between
the exponential disease spread and the delay in adopting preventive measures.
This gap is attributed to the lack of awareness about the disease and its
preventive measures. Nowadays, social media platforms (e.g., Twitter) are
frequently used to create awareness about major events, including COVID-19. In
this paper, we use Twitter to characterize public awareness regarding COVID-19
by analyzing the information flow in the most affected countries. Towards that,
we collect more than 46K trends and 622 million tweets from the top twenty most
affected countries to examine 1) the temporal evolution of COVID-19 related
trends, 2) the volume of tweets and recurring topics in those trends, and 3)
the user sentiment towards preventive measures. Our results show that countries
with a lower pandemic spread generated a higher volume of trends and tweets to
expedite the information flow and contribute to public awareness. We also
observed that in those countries, the COVID-19 related trends were generated
before the sharp increase in the number of cases, indicating a preemptive
attempt to notify users about the potential threat. Finally, we noticed that in
countries with a lower spread, users had a positive sentiment towards COVID-19
preventive measures. Our measurements and analysis show that effective social
media usage can influence public behavior, which can be leveraged to better
combat future pandemics.
Comment: Figure 1 is incorrect; it will be updated in the revision
Scaleable input gradient regularization for adversarial robustness
In this work we revisit gradient regularization for adversarial robustness
with some new ingredients. First, we derive new per-image theoretical
robustness bounds based on local gradient information. These bounds strongly
motivate input gradient regularization. Second, we implement a scaleable
version of input gradient regularization which avoids double backpropagation:
adversarially robust ImageNet models are trained in 33 hours on four consumer
grade GPUs. Finally, we show experimentally and through theoretical
certification that input gradient regularization is competitive with
adversarial training. Moreover we demonstrate that gradient regularization does
not lead to gradient obfuscation or gradient masking.
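The regularizer described penalizes the norm of the loss gradient with respect to the input. The sketch below shows the generic idea on a toy loss, using finite differences for the input gradient so that no second backward pass through the gradient is needed; the function names and the exact penalty form are assumptions of this illustration, not the paper's implementation.

```python
def grad_fd(f, x, eps=1e-5):
    """Central finite-difference estimate of the gradient of f at point x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((f(xp) - f(xm)) / (2 * eps))
    return g

def regularized_loss(f, x, lam):
    """Loss plus lam times the squared norm of the input gradient.
    Estimating that norm by finite differences is one way to sidestep
    differentiating through the gradient itself (double backpropagation)."""
    g = grad_fd(f, x)
    return f(x) + lam * sum(v * v for v in g)
```

Penalizing this quantity flattens the loss surface around each input, which is the mechanism behind the per-image robustness bounds mentioned above.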
3D Shape Synthesis for Conceptual Design and Optimization Using Variational Autoencoders
We propose a data-driven 3D shape design method that can learn a generative
model from a corpus of existing designs, and use this model to produce a wide
range of new designs. The approach learns an encoding of the samples in the
training corpus using an unsupervised variational autoencoder-decoder
architecture, without the need for an explicit parametric representation of the
original designs. To facilitate the generation of smooth final surfaces, we
develop a 3D shape representation based on a distance transformation of the
original 3D data, rather than using the commonly utilized binary voxel
representation. Once established, the generator maps the latent space
representations to the high-dimensional distance transformation fields, which
are then automatically surfaced to produce 3D representations amenable to
physics simulations or other objective function evaluation modules. We
demonstrate our approach for the computational design of gliders that are
optimized to attain prescribed performance scores. Our results show that when
combined with genetic optimization, the proposed approach can generate a rich
set of candidate concept designs that achieve prescribed functional goals, even
when the original dataset has only a few or no solutions that achieve these
goals.
Comment: Preprint accepted by ASME IDETC/CIE 201
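The distance-transformation representation described, a field of distances to the shape surface rather than a binary voxel grid, can be illustrated by a brute-force 2D version. `distance_transform` is a hypothetical helper for this sketch, not the paper's pipeline, which would use a far faster algorithm in 3D.

```python
import math

def distance_transform(grid):
    """Brute-force Euclidean distance field of a binary grid: each cell
    gets the distance to the nearest occupied cell (value 1)."""
    occ = [(r, c) for r, row in enumerate(grid)
                  for c, v in enumerate(row) if v]
    return [[min(math.hypot(r - ro, c - co) for ro, co in occ)
             for c in range(len(grid[0]))]
            for r in range(len(grid))]
```

Because the field varies smoothly away from the surface, decoding it (rather than binary voxels) makes it easier to extract smooth final surfaces from the generator's output.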