Advances in deep learning methods for speech recognition and understanding
This work presents several studies in the areas of speech recognition and
understanding.
Semantic speech understanding is an important sub-domain of the
broader field of artificial intelligence.
Speech processing has long interested researchers
because language is one of the defining characteristics of a human being.
With the development of neural networks, the domain has seen rapid progress
both in terms of accuracy and human perception.
Another important milestone was achieved with the development of
end-to-end approaches.
Such approaches allow co-adaptation of all the parts of the model
thus increasing the performance, as well as simplifying the training
procedure.
End-to-end models became feasible with the increasing amount of available
data, computational resources, and most importantly with many novel
architectural developments.
Nevertheless, traditional, non-end-to-end approaches are still relevant
for speech processing due to challenging data in noisy environments,
accented speech, and the wide variety of dialects.
In the first work, we explore hybrid speech recognition in noisy
environments.
We propose to treat recognition under unseen noise conditions
as a domain adaptation task.
For this, we use the then-novel technique of adversarial
domain adaptation.
In a nutshell, this prior work proposed training features in such
a way that they are discriminative for the primary task
but non-discriminative for the secondary task.
This secondary task is constructed to be the domain recognition task.
Thus, the features trained are invariant towards the domain at hand.
In our work, we adopt this technique and modify it for the task of
noisy speech recognition.
In the second work, we develop a general method for regularizing
generative recurrent networks.
It is known that recurrent networks frequently have difficulty
staying on the same track when generating long outputs.
While bi-directional networks can be used for better
sequence aggregation in feature learning, they are not applicable
to the generative case.
We developed a way to improve the consistency of long sequences generated
with recurrent networks.
We propose a way to construct a model similar to a bi-directional network.
The key insight is to use a soft L2 loss between the forward and
the backward generative recurrent networks.
We provide experimental evaluation on a multitude of tasks and datasets,
including speech recognition, image captioning, and language modeling.
In the third paper, we investigate the possibility of developing
an end-to-end intent recognizer for spoken language understanding.
Semantic spoken language understanding is an important
step towards developing human-like artificial intelligence.
We have seen that end-to-end approaches show high
performance on tasks including machine translation and speech recognition.
We draw inspiration from prior work to develop
an end-to-end system for intent recognition.
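The L2 coupling between the forward and backward generative recurrent networks mentioned in the second work can be sketched as below. This is only an illustration under stated assumptions: the hidden states are taken as already computed and aligned to the same time steps, and the function name `twin_l2_penalty`, the plain-list representation and the toy values are hypothetical rather than the authors' implementation.

```python
def twin_l2_penalty(forward_states, backward_states):
    """Mean squared L2 distance between time-aligned hidden states.

    forward_states[t] and backward_states[t] are lists of floats holding the
    hidden state of the forward and backward network at time step t.
    """
    if len(forward_states) != len(backward_states):
        raise ValueError("state sequences must have the same length")
    total = 0.0
    for h_fwd, h_bwd in zip(forward_states, backward_states):
        # Squared Euclidean distance between the two states at this step.
        total += sum((a - b) ** 2 for a, b in zip(h_fwd, h_bwd))
    return total / len(forward_states)


# Toy hidden states: 3 time steps, 2 hidden units (made-up values).
fwd = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
bwd = [[0.0, 1.0], [0.5, 1.5], [1.0, 2.0]]
penalty = twin_l2_penalty(fwd, bwd)
```

In training, a term of this kind would be added to the generation loss with a weighting coefficient, so that the forward network is encouraged to carry information about the rest of the sequence.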
Image enhancement techniques applied to solar feature detection
This dissertation presents the development of automatic image enhancement techniques for solar feature detection. The new method allows for detection and tracking of the evolution of filaments in solar images. Series of H-alpha full-disk images are taken at regular time intervals to observe the changes of the solar disk features. In each picture, the solar chromosphere filaments are identified for further examination of their evolution. The initial preprocessing step involves local thresholding to convert grayscale images into black-and-white pictures with chromosphere granularity enhanced. An alternative preprocessing method, based on image normalization and global thresholding, is presented. The next step employs morphological closing operations with multi-directional linear structuring elements to extract elongated shapes in the image. After logical union of directional filtering results, the remaining noise is removed from the final outcome using morphological dilation and erosion with a circular structuring element. Experimental results show that the developed techniques can achieve excellent results in detecting large filaments and good detection rates for small filaments. The final chapter discusses proposed directions for future research and applications to other areas of solar image processing, in particular the detection of solar flares, plages and sunspots.
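The directional closing step described in the abstract can be illustrated with a small sketch. It assumes a binary image stored as a set of foreground `(row, column)` coordinates and uses three-pixel line elements in four directions; the element length, the function names and the toy input are assumptions for illustration, not the dissertation's actual parameters or code.

```python
def dilate(pixels, offsets):
    """Binary dilation of a set of (row, col) foreground pixels."""
    return {(r + dr, c + dc) for r, c in pixels for dr, dc in offsets}


def erode(pixels, offsets):
    """Binary erosion; every offset list here contains the origin (0, 0)."""
    return {(r, c) for r, c in pixels
            if all((r + dr, c + dc) in pixels for dr, dc in offsets)}


def close(pixels, offsets):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(pixels, offsets), offsets)


# Symmetric three-pixel line elements in four directions (assumed sizes).
LINES = {
    "horizontal": [(0, -1), (0, 0), (0, 1)],
    "vertical": [(-1, 0), (0, 0), (1, 0)],
    "diagonal": [(-1, -1), (0, 0), (1, 1)],
    "anti-diagonal": [(-1, 1), (0, 0), (1, -1)],
}


def directional_closing(pixels):
    """Logical union of the closings obtained with each line element."""
    result = set()
    for offsets in LINES.values():
        result |= close(pixels, offsets)
    return result


# A horizontal segment with a one-pixel gap at column 2.
segment = {(0, 0), (0, 1), (0, 3), (0, 4)}
closed = directional_closing(segment)
```

Only the horizontal element bridges the gap in this toy input; the other directions return the segment unchanged, which is why the union keeps elongated shapes of any orientation.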
Auroral Image Processing Techniques - Machine Learning Classification and Multi-Viewpoint Analysis
Every year, millions of scientific images are acquired in order to study the auroral phenomena. The accumulated data contain a vast amount of untapped information that can be used in auroral science. Yet, auroral research has traditionally been focused on case studies, where one or a few auroral events have been investigated and explained in detail. Consequently, theories have often been developed on the basis of limited data sets, which can possibly be biased in location, spatial resolution or temporal resolution.
Advances in technology and data processing now allow for acquisition and analysis of large image data sets. These tools have made it feasible to perform statistical studies based on auroral data from numerous events, varying geophysical conditions and multiple locations in the Arctic and Antarctic. Such studies require reliable auroral image processing techniques to organize, extract and represent the auroral information in a scientifically rigorous manner, preferably with a minimal amount of user interaction. This dissertation focuses on two such branches of image processing techniques: machine learning classification and multi-viewpoint analysis.
Machine learning classification: This thesis provides an in-depth description of the implementation of machine learning methods for auroral image classification, from raw images to labeled data. The main conclusion of this work is that convolutional neural networks stand out as a particularly suitable classifier for auroral image data, achieving up to 91 % average class-wise accuracy. A major challenge is that most auroral images have an ambiguous auroral form. These images cannot be readily labeled without establishing an auroral morphology, where each class is clearly defined.
Multi-viewpoint analysis: Three multi-viewpoint analysis techniques are evaluated and described in this work: triangulation, shell-projection and 3-D reconstruction. These techniques are used for estimating the volume distribution of artificially induced aurora and the height and horizontal distribution of a newly reported auroral feature: Lumikot aurora. The multi-viewpoint analysis techniques are compared and methods for obtaining uncertainty estimates are suggested.
Overall, this dissertation evaluates and describes auroral image processing techniques that require little or no user input. The presented methods may therefore facilitate statistical studies such as: probability studies of auroral classes, investigations of the evolution and formation of auroral structures, and studies of the height and distribution of auroral displays. Furthermore, automatic classification and cataloging of large image data sets will support auroral scientists in finding the data of interest, reducing the needed time for manual inspection of auroral images
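Average class-wise accuracy, the metric quoted above, is commonly computed as the mean of the per-class accuracies, so that rare classes count as much as frequent ones. The sketch below assumes that definition; the class names `arc` and `diffuse` and the labels are made up for illustration and are not taken from the dissertation.

```python
def average_classwise_accuracy(true_labels, predicted_labels):
    """Mean over classes of the fraction of each class labeled correctly."""
    correct, total = {}, {}
    for truth, pred in zip(true_labels, predicted_labels):
        total[truth] = total.get(truth, 0) + 1
        if truth == pred:
            correct[truth] = correct.get(truth, 0) + 1
    per_class = [correct.get(label, 0) / count for label, count in total.items()]
    return sum(per_class) / len(per_class)


# Hypothetical labels: four "arc" images and two "diffuse" images.
truth = ["arc", "arc", "arc", "arc", "diffuse", "diffuse"]
pred = ["arc", "arc", "arc", "diffuse", "diffuse", "arc"]
score = average_classwise_accuracy(truth, pred)
```

Here the per-class accuracies are 0.75 and 0.5, giving 0.625, whereas plain accuracy over all six images would be 4/6.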
Microspacecraft and Earth observation: Electrical field (ELF) measurement project
The Utah State University space system design project for 1989 to 1990 focuses on the design of a global electrical field sensing system to be deployed in a constellation of microspacecraft. The design includes the selection of the sensor and the design of the spacecraft, the sensor support subsystems, the launch vehicle interface structure, on-board data storage and communications subsystems, and associated ground receiving stations. Optimization of satellite orbits and spacecraft attitude is critical to the overall mapping of the electrical field and, thus, is also included in the project. The spacecraft design incorporates a deployable sensor array (5 m booms) into a spinning oblate platform. Data is taken every 0.1 seconds by the electrical field sensors and stored on board. An omni-directional antenna communicates with a ground station twice per day to downlink the stored data. Wrap-around solar cells cover the exterior of the spacecraft to generate power. Nine Pegasus launches may be used to deploy fifty such satellites to orbits with inclinations greater than 45 deg. Piggyback deployment from other launch vehicles such as the DELTA 2 is also examined.
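The sampling and downlink figures in the abstract imply a rough storage budget, worked through below. The sketch assumes the two daily ground contacts are evenly spaced and counts samples only, since the abstract does not give a sample size; the variable names are hypothetical.

```python
# Figures quoted in the abstract.
SAMPLES_PER_SECOND = 10      # one sample every 0.1 s
DOWNLINKS_PER_DAY = 2
SATELLITES = 50
LAUNCHES = 9
SECONDS_PER_DAY = 24 * 3600

# Assumption: the two daily contacts are evenly spaced (12 hours apart).
seconds_between_downlinks = SECONDS_PER_DAY // DOWNLINKS_PER_DAY
samples_per_downlink = SAMPLES_PER_SECOND * seconds_between_downlinks

# Samples gathered by the whole constellation in one day.
constellation_samples_per_day = SAMPLES_PER_SECOND * SECONDS_PER_DAY * SATELLITES

# Ceiling division: at least one launch must carry this many satellites.
max_satellites_per_launch = -(-SATELLITES // LAUNCHES)
```

Under these assumptions each spacecraft must buffer 432,000 samples between contacts, and at least one of the nine launches has to carry six satellites.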
Magnetosphere imager science definition team interim report
For three decades, magnetospheric field and plasma measurements have been made by diverse instruments flown on spacecraft in many different orbits, widely separated in space and time, and under various solar and magnetospheric conditions. Scientists have used this information to piece together an intricate, yet incomplete view of the magnetosphere. A simultaneous global view, using various light wavelengths and energetic neutral atoms, could reveal exciting new data and help explain complex magnetospheric processes, thus providing a clear picture of this region of space. This report documents the scientific rationale for such a magnetospheric imaging mission and provides a mission concept for its implementation.
Fiber optic networks: fairness, access controls and prototyping
Fiber optic technologies enabling high-speed, high-capacity digital information transport have only been around for about three decades but in their short life have completely revolutionized global communications. To keep pace with the growing demand for digital communications and entertainment, fiber optic networks and technologies continue to grow and mature. As new applications in telecommunications, computer networking and entertainment emerge, reliability, scalability, and high Quality of Service (QoS) requirements are increasing the complexity of optical transport networks.
This dissertation is devoted to providing a discussion of existing and emerging technologies in modern optical communications networks. To this end, we first outline traditional telecommunication and data networks that enable high speed, long distance information transport. We examine various network architectures including mesh, ring and bus topologies of modern Local, Metropolitan and Wide area networks. We present some of the most successful technologies used in today's communications networks, outline their shortcomings and introduce promising new technologies to meet the demands of future transport networks.
The capacity of a single wavelength optical signal is 10 Gbps today and is likely to increase to over 100 Gbps as demonstrated in laboratory settings. In addition, Wavelength Division Multiplexing (WDM) techniques, able to support over 160 wavelengths on a single optical fiber, have effectively increased the capacity of a single optical fiber to well over 1 Tbps. However, user requirements are often of a sub-wavelength order. This mismatch between individual user requirements and single wavelength offerings necessitates bandwidth sharing mechanisms, called traffic grooming, to efficiently multiplex multiple low rate streams onto high rate wavelength channels.
This dissertation examines traffic grooming in the context of circuit, packet, burst and trail switching paradigms.
Of primary interest are the Media Access Control (MAC) protocols used to provide QoS and fairness in optical networks. We present a comprehensive discussion of the most recognized fairness models and MACs for ring and bus networks which lay the groundwork for the development of the Robust, Dynamic and Fair Network (RDFN) protocol for ring networks. The RDFN protocol is a novel solution to fairly share ring bandwidth for bursty asynchronous data traffic while providing bandwidth and delay guarantees for synchronous voice traffic.
We explain the light-trail (LT) architecture and technology introduced in [37] as a solution to providing high network resource utilization, seamless scalability and network transparency for metropolitan area networks. The goal of light-trails is to eliminate Optical Electronic Optical (O-E-O) conversion, minimize active switching, maximize wavelength utilization, and offer protocol and bit-rate transparency to address the growing demands placed on WDM networks. Light-trail technology is a physical layer architecture that combines commercially available optical components to allow multiple nodes along a lightpath to participate in time multiplexed communication without the need for burst or packet level switch reconfiguration. We present three medium access control protocols for light-trails that provide collision protection but do not consider fair network access. As an improvement to these light-trail MAC protocols we introduce the Token LT and light-trail Fair Access (LT-FA) MAC protocols and evaluate their performance. We illustrate how fairness is achieved and access delay guarantees are made to satisfy the bandwidth budget fairness model.
The goal of light-trails and our access control solution is to combine commercially available components with emerging network technologies to provide a transparent, reliable and highly scalable communication network.
The second area of discussion in this dissertation deals with the rapid prototyping platform. We discuss how the reconfigurable rapid prototyping platform (RRPP) is being utilized to bridge the gap between academic research, education and industry. We provide details of the Real-time Radon transform and the Griffin parallel computing platform implemented using the RRPP. We discuss how the RRPP provides additional visibility to academic research initiatives and facilitates understanding of system level designs. As a proof of concept, we introduce the light-trail testbed developed at the High Speed Systems Engineering lab. We discuss how a light-trail testbed has been developed using the RRPP to provide additional insight on the real-world limitations of light-trail technology. We provide details on its operation and discuss the steps required and decisions made to realize testbed operation. Two applications are presented to illustrate the use of the LT-FA MAC in the testbed and demonstrate streaming media over light-trails.
As a whole, this dissertation aims to provide a comprehensive discussion of current and future technologies and trends for optical communication networks. In addition, we provide media access control solutions for ring and bus networks to address fair resource sharing and access delay guarantees. The light-trail testbed demonstrates proof of concept and outlines system level design challenges for future optical networks.
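The capacity figures and the idea of traffic grooming from the abstract can be made concrete with a short sketch. It multiplies the quoted per-wavelength rate by the quoted wavelength count, then packs sub-wavelength demands onto wavelengths with a simple first-fit rule. First-fit is a generic stand-in chosen for illustration, not the grooming scheme or any MAC protocol from the dissertation, and the demand values are made up.

```python
WAVELENGTH_CAPACITY_GBPS = 10   # single-wavelength rate quoted in the abstract
WAVELENGTHS_PER_FIBER = 160     # WDM channel count quoted in the abstract

# 160 wavelengths at 10 Gbps each give 1600 Gbps, i.e. 1.6 Tbps per fiber.
fiber_capacity_gbps = WAVELENGTH_CAPACITY_GBPS * WAVELENGTHS_PER_FIBER


def groom_first_fit(demands_gbps, capacity_gbps=WAVELENGTH_CAPACITY_GBPS):
    """Pack low-rate demands onto wavelengths using a first-fit rule."""
    wavelengths = []  # each entry lists the demands sharing one wavelength
    for demand in demands_gbps:
        if demand > capacity_gbps:
            raise ValueError("demand exceeds the capacity of one wavelength")
        for channel in wavelengths:
            if sum(channel) + demand <= capacity_gbps:
                channel.append(demand)
                break
        else:
            # No existing wavelength has room, so light a new one.
            wavelengths.append([demand])
    return wavelengths


# Hypothetical sub-wavelength user demands in Gbps.
demands = [2.5, 1, 2.5, 6, 1, 4]
groomed = groom_first_fit(demands)
```

Six demands totalling 17 Gbps fit on two 10 Gbps wavelengths here, rather than occupying six wavelengths as they would without grooming.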
An automated auroral detection system using deep learning: real-time operation in Tromsø, Norway
The activity of citizen scientists who capture images of aurora borealis using digital cameras has recently been contributing to research regarding space physics by professional scientists. Auroral images captured using digital cameras not only fascinate us, but may also provide information about the energy of precipitating auroral electrons from space; this ability makes the use of digital cameras more meaningful. To support the application of digital cameras, we have developed artificial intelligence that monitors the auroral appearance in Tromsø, Norway, instead of relying on the human eye, and implemented a web application, “Tromsø AI”, which notifies the scientists of the appearance of auroras in real-time. This “AI” has a double meaning: artificial intelligence and eyes (instead of human eyes). Utilizing the Tromsø AI, we also classified large-scale optical data to derive annual, monthly, and UT variations of the auroral occurrence rate for the first time. The derived occurrence characteristics are fairly consistent with the results obtained using the naked eye, and the evaluation using the validation data also showed a high F1 score of over 93%, indicating that the classifier has a performance comparable to that of the human eye classifying observed images
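The F1 score reported above is the harmonic mean of precision and recall. The sketch below computes it from raw counts; the counts are invented for illustration and are not the validation data of the study.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, computed from raw counts."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for an "aurora present" class.
f1 = f1_score(true_positives=90, false_positives=5, false_negatives=8)
```

With these made-up counts the score is 180/193, a little over 0.93, which shows the level of agreement an F1 score above 93% implies.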
25 Years of Self-Organized Criticality: Solar and Astrophysics
Shortly after the seminal paper "Self-Organized Criticality: An
explanation of 1/f noise" by Bak, Tang, and Wiesenfeld (1987), the idea was
applied to solar physics, in "Avalanches and the Distribution of
Solar Flares" by Lu and Hamilton (1991). In the following years, an inspiring
cross-fertilization from complexity theory to solar and astrophysics took
place, where the SOC concept was initially applied to solar flares, stellar
flares, and magnetospheric substorms, and later extended to the radiation belt,
the heliosphere, lunar craters, the asteroid belt, the Saturn ring, pulsar
glitches, soft X-ray repeaters, blazars, black-hole objects, cosmic rays, and
boson clouds. The application of SOC concepts has been performed by numerical
cellular automaton simulations, by analytical calculations of statistical
(powerlaw-like) distributions based on physical scaling laws, and by
observational tests of theoretically predicted size distributions and waiting
time distributions. Attempts have been undertaken to import physical models
into the numerical SOC toy models, such as the discretization of
magneto-hydrodynamics (MHD) processes. The novel applications also stimulated
vigorous debates about the discrimination between SOC models, SOC-like, and
non-SOC processes, such as phase transitions, turbulence, random-walk
diffusion, percolation, branching processes, network theory, chaos theory,
fractality, multi-scale, and other complexity phenomena. We review SOC studies
from the last 25 years and highlight new trends, open questions, and future
challenges, as discussed during two recent ISSI workshops on this theme.
Comment: 139 pages, 28 figures, Review based on ISSI workshops "Self-Organized
Criticality and Turbulence" (2012, 2013, Bern, Switzerland)
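The cellular automaton simulations mentioned in the abstract are typically variants of the sandpile model of Bak, Tang, and Wiesenfeld. A minimal sketch of one such automaton follows, assuming a square grid, a toppling threshold of 4 and open boundaries where grains fall off the edge; the function name `add_grain` and the grid size are illustrative choices, not taken from the review.

```python
def add_grain(grid, row, col, threshold=4):
    """Drop one grain, relax the pile, and return the avalanche size.

    The avalanche size is the number of toppling events. A toppling cell
    loses `threshold` grains and gives one to each of its four neighbours;
    grains pushed past the edge of the grid are lost.
    """
    grid[row][col] += 1
    size = 0
    unstable = [(row, col)]
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < threshold:
            continue  # already relaxed by an earlier toppling
        grid[r][c] -= threshold
        size += 1
        if grid[r][c] >= threshold:
            unstable.append((r, c))
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                grid[rr][cc] += 1
                if grid[rr][cc] >= threshold:
                    unstable.append((rr, cc))
    return size


# Drop four grains on the centre of an empty 3x3 grid.
pile = [[0] * 3 for _ in range(3)]
sizes = [add_grain(pile, 1, 1) for _ in range(4)]
```

Driving a larger grid with many random grain drops and collecting the returned sizes gives the kind of avalanche size distribution that SOC studies compare against observed event statistics.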
Regulation of Mitosis by Nuclear Speckle Proteins
Mitosis is an intricate process that is monitored by multiple cell cycle regulator proteins. Errors in mitotic regulation have been linked to many types of cancer. Recently, evidence suggests an indirect regulation of mitosis by nuclear speckle proteins. Nuclear speckles are one of the multiple compartments found in the mammalian cell nucleus. They serve as assembly compartments for pre-mRNA processing factors. Mass spectrometry analysis of purified nuclear speckles revealed 33 novel proteins including Son and TRAP150. The aim of my study was to determine how Son and TRAP150 can directly or indirectly impact mitosis. We and others recently reported that Son is required to maintain cell proliferation, as its depletion results in growth arrest in metaphase. We hypothesized that Son is required for the assembly of important mitotic structures such as the mitotic spindle and kinetochores. My results showed elongated and disorganized mitotic spindles in Son-depleted cells. In addition, my studies showed a novel localization pattern for Son in cytoplasmic foci following microtubule destabilization at metaphase. Son is important in the alternative splicing of transcripts that encode cell cycle regulators (Sharma et al., 2011). Therefore the mitotic defects seen are most likely explained by alternative splicing defects that occur after Son depletion. In addition, preliminary evidence from the Bubulya lab showed mitotic defects in TRAP150-depleted cells, suggesting a role for TRAP150 in mitosis. Given that TRAP150 did not colocalize with mitotic structures during mitosis, we hypothesized that TRAP150 depletion alters mitosis by altering the transcripts of mitotic regulators. My results indicated that TRAP150 is important in controlling abundance of transcripts that encode mitotic regulators.
Development of the DAQ Front-end for the DSSC Detector at the European XFEL
The European XFEL is an international photon science facility currently
under construction at DESY, Hamburg. Its unique characteristics will
open up new research opportunities for investigating tiny structures,
ultra-fast processes, and also matter under extreme conditions. The
research will allow invaluable insights for many scientific disciplines
like biology, medicine, and chemistry, but also for nano-technology,
astro-physics, and others. The DSSC detector is one of three 2D
megapixel detectors presently being developed for application at the
XFEL facility. A challenge is the acquisition of the huge amount of data
produced by the detector system. The total payload data rate is
estimated to be on the order of 67.2 Gb/s. This thesis presents the DAQ
front-end for the DSSC detector. A special focus is on the development
of the I/O Board, which represents the basic component of the lower DAQ
layer. The DSSC front-end DAQ system exploits the features of the latest
technology in microelectronics and high-speed data transmission.
Organized as a two-staged hierarchical system, it comprises 20 readout
nodes in total, based on FPGA technology. The 16 slave nodes of the
first DAQ layer receive data from the detector front-end at an aggregate
link bandwidth of 89.6 Gb/s via 256 electrical links. The accumulated
data are then concentrated into four 3.125 Gb/s high-speed links per
node for transmission towards the four master nodes of the second DAQ
layer, the Patch Panel Transceivers. Custom-built firmware on the slave
node FPGAs implements the readout logic and concentrator mechanism for
the acquired detector data. It additionally comprises several controller
modules, which are responsible for operating critical detector
electronics. The test results and measurements show that the I/O Board
is able both to manage data acquisition at the required bandwidth and
also to perform low-level control tasks as required for proper
detector operation.
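The bandwidth figures in the abstract can be cross-checked with a little arithmetic. The sketch below assumes that the electrical links and the payload are spread evenly over the 16 slave nodes, which the abstract does not state explicitly, and it treats 3.125 Gb/s as the nominal link rate. All rates are expressed in Mb/s so the calculation stays in integers; the variable names are hypothetical.

```python
# Figures quoted in the abstract, converted to Mb/s.
SLAVE_NODES = 16
MASTER_NODES = 4
ELECTRICAL_LINKS = 256
AGGREGATE_INPUT_MBPS = 89_600    # 89.6 Gb/s from the detector front-end
UPLINKS_PER_SLAVE = 4
UPLINK_RATE_MBPS = 3_125         # 3.125 Gb/s per high-speed link
PAYLOAD_MBPS = 67_200            # 67.2 Gb/s estimated total payload

total_nodes = SLAVE_NODES + MASTER_NODES

# Assumption: links and payload are distributed evenly over the slave nodes.
links_per_slave = ELECTRICAL_LINKS // SLAVE_NODES
input_per_link_mbps = AGGREGATE_INPUT_MBPS // ELECTRICAL_LINKS
input_per_slave_mbps = input_per_link_mbps * links_per_slave
payload_per_slave_mbps = PAYLOAD_MBPS // SLAVE_NODES

# Nominal outgoing capacity of one slave node toward the master nodes.
uplink_per_slave_mbps = UPLINKS_PER_SLAVE * UPLINK_RATE_MBPS
```

Under these assumptions each slave node receives 16 links at 350 Mb/s (5.6 Gb/s in total), carries about 4.2 Gb/s of payload, and has 12.5 Gb/s of nominal uplink capacity, consistent with the 20 readout nodes stated in the abstract.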