Do Neural Nets Learn Statistical Laws behind Natural Language?
The performance of deep learning in natural language processing has been
spectacular, but the reasons for this success remain unclear because of the
inherent complexity of deep learning. This paper provides empirical evidence of
its effectiveness and of a limitation of neural networks for language
engineering. Precisely, we demonstrate that a neural language model based on
long short-term memory (LSTM) effectively reproduces Zipf's law and Heaps' law,
two representative statistical properties underlying natural language. We
discuss the quality of reproducibility and the emergence of Zipf's law and
Heaps' law as training progresses. We also point out that the neural language
model has a limitation in reproducing long-range correlation, another
statistical property of natural language. This understanding could provide a
direction for improving the architectures of neural networks.
Comment: 21 pages, 11 figures
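The two statistical laws discussed above are straightforward to measure on any token sequence. The sketch below computes a Zipf rank-frequency profile and a Heaps vocabulary-growth curve; the tiny corpus is a hypothetical stand-in for model-generated text, and the function names are our own:

```python
from collections import Counter

def zipf_ranks(tokens):
    """Return (rank, frequency) pairs sorted by descending frequency,
    the data behind a Zipf rank-frequency plot."""
    counts = Counter(tokens)
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))

def heaps_curve(tokens):
    """Return the vocabulary size V(n) after each of the first n tokens,
    the data behind a Heaps vocabulary-growth plot."""
    seen, curve = set(), []
    for tok in tokens:
        seen.add(tok)
        curve.append(len(seen))
    return curve

# Toy stand-in for text sampled from a trained language model
corpus = "the cat sat on the mat the dog sat on the log".split()
ranks = zipf_ranks(corpus)   # rank 1 is the most frequent type ("the", 4 times)
growth = heaps_curve(corpus) # non-decreasing; sub-linear for natural text
```

On real model samples, one would fit power laws to `ranks` and `growth` and compare the exponents against those of the training corpus, which is essentially the comparison the abstract describes.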
Recurrent Neural Networks Applied to GNSS Time Series for Denoising and Prediction
Global Navigation Satellite Systems (GNSS) are systems that continuously acquire data and provide position time series. Many monitoring applications are based on GNSS data, and their efficiency depends on the capability of the time series analysis to characterize the signal content and/or to predict incoming coordinates. In this work we propose a suitable network architecture, based on Long Short-Term Memory recurrent neural networks, to solve two main tasks in GNSS time series analysis: denoising and prediction. We carry out an analysis on a synthetic time series, then inspect two different real case studies and evaluate the results. We develop a non-deep network that removes almost 50% of the scatter from real GNSS time series and achieves coordinate prediction with a mean squared error of 1.1 millimeters.
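Before an LSTM can predict incoming coordinates, the position time series must be cast into supervised (past window, next value) pairs. A minimal sketch of that preprocessing step, on a made-up coordinate series (the windowing scheme is a common convention, not taken from the paper):

```python
import numpy as np

def make_windows(series, window):
    """Turn a 1-D coordinate time series into (input window, next value)
    pairs, the supervised form typically fed to an LSTM predictor."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])   # `window` past samples
        y.append(series[i + window])     # the sample to predict
    return np.array(X), np.array(y)

# Hypothetical daily coordinate residuals (millimetres): a slow signal
# plus noise, loosely mimicking a GNSS position component
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0.0, 6.0, 100)) + 0.1 * rng.normal(size=100)
X, y = make_windows(series, window=10)
# X has shape (90, 10): 90 training windows of 10 past samples each
```

Each row of `X` would be reshaped to `(window, 1)` for a sequence model; the same pairs serve both tasks, with denoising targeting a smoothed series instead of the raw next sample.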
A deep learning integrated Lee-Carter model
In the field of mortality, the Lee–Carter based approach can be considered the milestone
for forecasting mortality rates among stochastic models. We could define a "Lee–Carter model family"
that embraces all developments of this model, including its first formulation (1992), which remains the
benchmark for comparing the performance of newer models. In the Lee–Carter model, the kt parameter,
describing the mortality trend over time, plays an important role in determining future mortality
behavior. The traditional ARIMA process usually used to model kt shows evident limitations in
describing the future mortality shape. Concerning the forecasting phase, a more plausible approach is
needed to capture the nonlinear shape of the projected mortality rates. Therefore, we propose an
alternative to the ARIMA processes based on a deep learning technique. More precisely, in order to
capture the pattern of the kt series over time more accurately, we apply a Recurrent Neural Network
with a Long Short-Term Memory architecture, integrating it with the Lee–Carter model to improve its
predictive capacity. The proposed approach provides significant gains in predictive accuracy and also
allows avoiding the a priori selection of time chunks. Indeed, it is common practice among academics to
delete periods in which the noise is overwhelming or the data quality is insufficient. The strength of
the Long Short-Term Memory network lies in its ability to treat this noise and reproduce it adequately
in the forecasted trend, thanks to an architecture that takes significant long-term patterns into
account.
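The kt series that the deep-learning forecaster models is classically extracted from a matrix of log mortality rates via SVD. A minimal sketch of that Lee–Carter decomposition on synthetic rank-one data (this illustrates the standard method, not the authors' code; all names and numbers below are our own):

```python
import numpy as np

def lee_carter(log_m):
    """Estimate ax, bx, kt in the Lee-Carter model
    log m(x, t) = ax + bx * kt via SVD, with the usual
    normalisations sum(bx) = 1 and sum(kt) = 0."""
    ax = log_m.mean(axis=1)                  # average log-mortality by age
    centered = log_m - ax[:, None]           # each row now sums to zero
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    scale = U[:, 0].sum()
    bx = U[:, 0] / scale                     # normalise so sum(bx) = 1
    kt = s[0] * Vt[0] * scale                # absorb the scale into kt
    return ax, bx, kt

# Synthetic rank-one example: 5 ages, 8 years (illustrative numbers only)
true_ax = np.linspace(-5.0, -1.0, 5)
true_bx = np.array([0.10, 0.15, 0.20, 0.25, 0.30])  # sums to 1
true_kt = np.linspace(3.0, -3.0, 8)                 # sums to 0
log_m = true_ax[:, None] + np.outer(true_bx, true_kt)
ax, bx, kt = lee_carter(log_m)
```

The abstract's proposal then amounts to feeding the estimated `kt` series to an LSTM instead of an ARIMA process to project it forward.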
Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems
Natural language generation (NLG) is a critical component of spoken dialogue
systems, and it has a significant impact on both usability and perceived quality. Most
NLG systems in common use employ rules and heuristics and tend to generate
rigid and stylised responses without the natural variation of human language.
They are also not easily scaled to systems covering multiple domains and
languages. This paper presents a statistical language generator based on a
semantically controlled Long Short-term Memory (LSTM) structure. The LSTM
generator can learn from unaligned data by jointly optimising sentence planning
and surface realisation using a simple cross entropy training criterion, and
language variation can be easily achieved by sampling from output candidates.
With fewer heuristics, an objective evaluation in two differing test domains
showed the proposed method improved performance compared to previous methods.
Human judges scored the LSTM system higher on informativeness and naturalness
and overall preferred it to the other systems.
Comment: To appear in EMNLP 201
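The "sampling from output candidates" step can be illustrated with a toy decoder: draw several candidate realisations from per-step output distributions, then rerank them by total log-probability. The distributions below are invented; this sketches the general sample-and-rerank idea, not the SC-LSTM itself:

```python
import math
import random

def sample_candidates(step_probs, n=5, seed=0):
    """Sample n candidate token sequences from a list of per-step
    next-token distributions (dicts token -> probability) and return
    them sorted by descending total log-probability."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n):
        seq, logp = [], 0.0
        for probs in step_probs:
            tokens = list(probs)
            tok = rng.choices(tokens, weights=list(probs.values()))[0]
            seq.append(tok)
            logp += math.log(probs[tok])
        candidates.append((seq, logp))
    return sorted(candidates, key=lambda c: -c[1])

# Hypothetical per-step output distributions of a trained generator
steps = [{"the": 0.7, "a": 0.3},
         {"food": 0.6, "meal": 0.4},
         {"is": 0.9, "was": 0.1}]
ranked = sample_candidates(steps)
best_seq, best_logp = ranked[0]   # most probable sampled realisation
```

Sampling rather than greedy decoding is what yields the language variation the abstract mentions; reranking keeps the chosen output fluent.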
Data-Driven Forecasting of High-Dimensional Chaotic Systems with Long Short-Term Memory Networks
We introduce a data-driven forecasting method for high-dimensional chaotic
systems using long short-term memory (LSTM) recurrent neural networks. The
proposed LSTM neural networks perform inference of high-dimensional dynamical
systems in their reduced order space and are shown to be an effective set of
nonlinear approximators of their attractor. We demonstrate the forecasting
performance of the LSTM and compare it with Gaussian processes (GPs) in time
series obtained from the Lorenz 96 system, the Kuramoto-Sivashinsky equation
and a prototype climate model. The LSTM networks outperform the GPs in
short-term forecasting accuracy in all applications considered. A hybrid
architecture, extending the LSTM with a mean stochastic model (MSM-LSTM), is
proposed to ensure convergence to the invariant measure. This novel hybrid
method is fully data-driven and extends the forecasting capabilities of LSTM
networks.
Comment: 31 pages
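The Lorenz 96 benchmark used above is simple to simulate, which is how training trajectories for such a data-driven forecaster are typically produced. A sketch using a standard fourth-order Runge-Kutta integrator (the forcing F = 8 and 40 variables are common defaults, assumed here rather than taken from the paper):

```python
import numpy as np

def lorenz96_rhs(x, F=8.0):
    """Right-hand side of the Lorenz 96 system:
    dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F (cyclic indices)."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def integrate(x0, dt=0.01, steps=1000, F=8.0):
    """Generate a trajectory with classic RK4 time stepping,
    the kind of data a sequence model would be trained on."""
    traj = np.empty((steps + 1, x0.size))
    traj[0] = x0
    x = x0.astype(float)
    for n in range(steps):
        k1 = lorenz96_rhs(x, F)
        k2 = lorenz96_rhs(x + 0.5 * dt * k1, F)
        k3 = lorenz96_rhs(x + 0.5 * dt * k2, F)
        k4 = lorenz96_rhs(x + dt * k3, F)
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        traj[n + 1] = x
    return traj

rng = np.random.default_rng(0)
x0 = 8.0 + 0.01 * rng.normal(size=40)  # small perturbation of the fixed point
data = integrate(x0)                   # (1001, 40) chaotic trajectory
```

Because the system is chaotic, the small initial perturbation grows until the trajectory fills the attractor, giving rich training data for short-term forecasting experiments.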