14,990 research outputs found
Dispersive Fourier Transformation for Versatile Microwave Photonics Applications
Abstract: Dispersive Fourier transformation (DFT) maps the broadband spectrum of an ultrashort optical pulse into a time stretched waveform with its intensity profile mirroring the spectrum using chromatic dispersion. Owing to its capability of continuous pulse-by-pulse spectroscopic measurement and manipulation, DFT has become an emerging technique for ultrafast signal generation and processing, and high-throughput real-time measurements, where the speed of traditional optical instruments falls short. In this paper, the principle and implementation methods of DFT are first introduced and the recent development in employing DFT technique for widespread microwave photonics applications are presented, with emphasis on real-time spectroscopy, microwave arbitrary waveform generation, and microwave spectrum sensing. Finally, possible future research directions for DFT-based microwave photonics techniques are discussed as well
TasNet: time-domain audio separation network for real-time, single-channel speech separation
Robust speech processing in multi-talker environments requires effective
speech separation. Recent deep learning systems have made significant progress
toward solving this problem, yet it remains challenging particularly in
real-time, short latency applications. Most methods attempt to construct a mask
for each source in time-frequency representation of the mixture signal which is
not necessarily an optimal representation for speech separation. In addition,
time-frequency decomposition results in inherent problems such as
phase/magnitude decoupling and long time window which is required to achieve
sufficient frequency resolution. We propose Time-domain Audio Separation
Network (TasNet) to overcome these limitations. We directly model the signal in
the time-domain using an encoder-decoder framework and perform the source
separation on nonnegative encoder outputs. This method removes the frequency
decomposition step and reduces the separation problem to estimation of source
masks on encoder outputs which is then synthesized by the decoder. Our system
outperforms the current state-of-the-art causal and noncausal speech separation
algorithms, reduces the computational cost of speech separation, and
significantly reduces the minimum required latency of the output. This makes
TasNet suitable for applications where low-power, real-time implementation is
desirable such as in hearable and telecommunication devices.Comment: Camera ready version for ICASSP 2018, Calgary, Canad
Side-channel based intrusion detection for industrial control systems
Industrial Control Systems are under increased scrutiny. Their security is
historically sub-par, and although measures are being taken by the
manufacturers to remedy this, the large installed base of legacy systems cannot
easily be updated with state-of-the-art security measures. We propose a system
that uses electromagnetic side-channel measurements to detect behavioural
changes of the software running on industrial control systems. To demonstrate
the feasibility of this method, we show it is possible to profile and
distinguish between even small changes in programs on Siemens S7-317 PLCs,
using methods from cryptographic side-channel analysis.Comment: 12 pages, 7 figures. For associated code, see
https://polvanaubel.com/research/em-ics/code
Tacotron: Towards End-to-End Speech Synthesis
A text-to-speech synthesis system typically consists of multiple stages, such
as a text analysis frontend, an acoustic model and an audio synthesis module.
Building these components often requires extensive domain expertise and may
contain brittle design choices. In this paper, we present Tacotron, an
end-to-end generative text-to-speech model that synthesizes speech directly
from characters. Given pairs, the model can be trained completely
from scratch with random initialization. We present several key techniques to
make the sequence-to-sequence framework perform well for this challenging task.
Tacotron achieves a 3.82 subjective 5-scale mean opinion score on US English,
outperforming a production parametric system in terms of naturalness. In
addition, since Tacotron generates speech at the frame level, it's
substantially faster than sample-level autoregressive methods.Comment: Submitted to Interspeech 2017. v2 changed paper title to be
consistent with our conference submission (no content change other than typo
fixes
- …