35,907 research outputs found
Neural Responding Machine for Short-Text Conversation
We propose Neural Responding Machine (NRM), a neural-network-based response generator for Short-Text Conversation. NRM adopts the general encoder-decoder framework: it formalizes response generation as a decoding process based on the latent representation of the input text, with both encoding and decoding realized by recurrent neural networks (RNNs). NRM is trained on a large amount of one-round conversation data collected from a microblogging service. Empirical study shows that NRM generates grammatically correct and content-wise appropriate responses to over 75% of the input text, outperforming state-of-the-art methods in the same setting, including retrieval-based and SMT-based models.
Comment: accepted as a full paper at ACL 2015
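The encoder-decoder scheme the abstract describes can be illustrated with a minimal, untrained numpy sketch: a vanilla RNN encodes the input tokens into a single latent vector, and a second RNN decodes a response token by token from that vector. All parameter names, sizes, and the greedy decoding loop here are illustrative assumptions, not the paper's actual architecture (which uses more elaborate RNN variants and attention-like context schemes).

```python
import numpy as np

rng = np.random.default_rng(0)
V, H = 12, 16  # hypothetical vocabulary size and hidden size

# Random, untrained parameters for a toy encoder/decoder
E = rng.normal(0, 0.1, (V, H))      # token embeddings
W_enc = rng.normal(0, 0.1, (H, H))  # encoder input weights
U_enc = rng.normal(0, 0.1, (H, H))  # encoder recurrent weights
W_dec = rng.normal(0, 0.1, (H, H))  # decoder input weights
U_dec = rng.normal(0, 0.1, (H, H))  # decoder recurrent weights
W_out = rng.normal(0, 0.1, (H, V))  # hidden-to-vocabulary projection

def encode(tokens):
    """Run a vanilla RNN over the input; the final state is the latent summary."""
    h = np.zeros(H)
    for t in tokens:
        h = np.tanh(E[t] @ W_enc + h @ U_enc)
    return h

def decode(h, max_len=5, bos=0):
    """Greedily decode a response conditioned on the latent representation."""
    out, tok = [], bos
    for _ in range(max_len):
        h = np.tanh(E[tok] @ W_dec + h @ U_dec)
        tok = int(np.argmax(h @ W_out))  # pick the most likely next token
        out.append(tok)
    return out

latent = encode([3, 5, 7])   # "input text" as arbitrary token ids
response = decode(latent)    # generated response token ids
```

With trained parameters the same two loops would produce the responses the paper evaluates; here the point is only the data flow from input sequence to latent vector to output sequence.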
Consistency Index-Based Sensor Fault Detection System for Nuclear Power Plant Emergency Situations Using an LSTM Network
A nuclear power plant (NPP) consists of an enormous number of components with complex interconnections. Various techniques to detect sensor errors have been developed to monitor the state of the sensors during normal NPP operation, but not for emergency situations. In an emergency situation with a reactor trip, all the plant parameters undergo drastic changes following the sudden decrease in core reactivity. In this paper, a machine learning model adopting a consistency index is suggested for sensor error detection during NPP emergency situations. The proposed consistency index quantifies the soundness of the sensors based on their measurement accuracy. Labeling with the consistency index makes it possible to detect a sensor error immediately and to pinpoint the particular sensor where the error occurred. From a compact nuclear simulator, selected plant parameters were extracted during typical emergency situations, and artificial sensor errors were injected into the raw data. The trained system successfully produced output distinguishing sensor-error states from error-free states.
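The labeling idea above can be sketched as follows. This is a hypothetical formulation of a "consistency index", not the paper's: each sensor is scored by its deviation from a reference (here, the median of redundant readings), scaled so that 1.0 means fully consistent and 0.0 means the deviation reaches a chosen tolerance; sensors whose index falls below a threshold are flagged as faulty. The tolerance and threshold values are illustrative assumptions.

```python
import numpy as np

def consistency_index(readings, tolerance):
    """Score each sensor's agreement with the redundant-sensor median.

    Returns values in [0, 1]: 1.0 = perfectly consistent,
    0.0 = deviation at or beyond the tolerance (hypothetical formulation).
    """
    readings = np.asarray(readings, dtype=float)
    ref = np.median(readings)            # robust reference value
    dev = np.abs(readings - ref)         # per-sensor deviation
    return np.clip(1.0 - dev / tolerance, 0.0, 1.0)

def label_errors(readings, tolerance, threshold=0.5):
    """Flag sensors whose consistency index drops below the threshold."""
    return consistency_index(readings, tolerance) < threshold

# Four redundant sensors measuring the same parameter; the last has drifted.
readings = [500.1, 499.8, 500.3, 612.0]
flags = label_errors(readings, tolerance=10.0)  # only the drifted sensor is flagged
```

In the paper's setting, labels like these would supervise an LSTM so that the error state can be inferred from the time series alone, without redundant sensors at inference time.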
Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs
Deep learning has significantly advanced the state of the art in artificial intelligence, gaining wide popularity in both industry and academia. Special interest surrounds Convolutional Neural Networks (CNNs), which take inspiration from the hierarchical structure of the visual cortex to form deep layers of convolutional operations along with fully connected classifiers. Hardware implementations of these deep CNN architectures are challenged by memory bottlenecks, since the many convolution and fully connected layers demand a large amount of communication for parallel computation. Multi-core CPU-based solutions have demonstrated their inadequacy for this problem due to the memory wall and low parallelism. Many-core GPU architectures show superior performance, but they consume high power and also have memory constraints due to inconsistencies between cache and main memory. FPGA design solutions are also actively being explored; they allow implementing the memory hierarchy using embedded BlockRAM, which boosts the parallel use of shared memory elements between multiple processing units while avoiding data replication and inconsistencies. This makes FPGAs potentially powerful solutions for real-time classification with CNNs. Both Altera and Xilinx have adopted the OpenCL co-design framework from GPUs as a pseudo-automatic development solution for FPGA designs. In this paper, a comprehensive evaluation and comparison of the Altera and Xilinx OpenCL frameworks for a 5-layer deep CNN is presented. Hardware resources, temporal performance and the OpenCL architecture for CNNs are discussed. Xilinx demonstrates faster synthesis, better FPGA resource utilization and more compact boards. Altera provides multi-platform tools, a mature design community and better execution times.
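The convolution layers that dominate such accelerators reduce to a simple sliding-window multiply-accumulate, sketched below in plain Python/numpy rather than OpenCL. This is a generic illustration of the operation being accelerated (computed as cross-correlation with valid padding, as CNN frameworks typically do), not code from either vendor's toolchain.

```python
import numpy as np

def conv2d(image, kernel):
    """Direct sliding-window convolution (valid padding, no kernel flip).

    Each output element is an independent multiply-accumulate over a
    kernel-sized window - exactly the work that FPGA and GPU CNN
    accelerators parallelize across processing units.
    """
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 feature map
k = np.ones((3, 3))                              # toy 3x3 kernel
result = conv2d(img, k)                          # 2x2 output map
```

On an FPGA, the two inner window loops would be unrolled into parallel multipliers fed from BlockRAM, which is where the memory-hierarchy trade-offs the abstract discusses come in.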
Comprehensive Evaluation of OpenCL-Based CNN Implementations for FPGAs
Deep learning has significantly advanced the state of the art in artificial intelligence, gaining wide popularity in both industry and academia. Special interest surrounds Convolutional Neural Networks (CNNs), which take inspiration from the hierarchical structure of the visual cortex to form deep layers of convolutional operations along with fully connected classifiers. Hardware implementations of these deep CNN architectures are challenged by memory bottlenecks, since the many convolution and fully connected layers demand a large amount of communication for parallel computation. Multi-core CPU-based solutions have demonstrated their inadequacy for this problem due to the memory wall and low parallelism. Many-core GPU architectures show superior performance, but they consume high power and also have memory constraints due to inconsistencies between cache and main memory. OpenCL is commonly used to describe these architectures for their execution on GPGPUs or FPGAs. FPGA design solutions are also actively being explored; they allow implementing the memory hierarchy using embedded parallel BlockRAMs, which boosts the parallel use of shared memory elements between multiple processing units while avoiding data replication and inconsistencies. This makes FPGAs potentially powerful solutions for real-time classification with CNNs. In this paper, the OpenCL co-design frameworks adopted by both Altera and Xilinx as pseudo-automatic development solutions are evaluated. A comprehensive evaluation and comparison for a 5-layer deep CNN is presented. Hardware resources, temporal performance and the OpenCL architecture for CNNs are discussed. Xilinx demonstrates faster synthesis, better FPGA resource utilization and more compact boards. Altera provides multi-platform tools, a mature design community and better execution times.
Ministerio de Economía y Competitividad TEC2016-77785-