16,178 research outputs found
The Signal Data Explorer: A high performance Grid based signal search tool for use in distributed diagnostic applications
We describe a high performance Grid based signal search tool for distributed diagnostic applications developed in conjunction with Rolls-Royce plc for civil aero engine condition monitoring applications. With the introduction of advanced monitoring technology into engineering systems, healthcare, etc., the associated diagnostic processes are increasingly required to handle and consider vast amounts of data. An exemplar of such a diagnosis process was developed during the DAME project, which built a proof of concept demonstrator to assist in the enhanced diagnosis and prognosis of aero-engine conditions. In particular it has shown the utility of an interactive viewing and high performance distributed search tool (the Signal Data Explorer) in the aero-engine diagnostic process. The viewing and search techniques are equally applicable to other domains. The Signal Data Explorer and search services have been demonstrated on the Worldwide Universities Network to search distributed databases of electrocardiograph data
Architecture and Design of Medical Processor Units for Medical Networks
This paper introduces analogical and deductive methodologies for the design
medical processor units (MPUs). From the study of evolution of numerous earlier
processors, we derive the basis for the architecture of MPUs. These specialized
processors perform unique medical functions encoded as medical operational
codes (mopcs). From a pragmatic perspective, MPUs function very close to CPUs.
Both processors have unique operation codes that command the hardware to
perform a distinct chain of subprocesses upon operands and generate a specific
result unique to the opcode and the operand(s). In medical environments, MPU
decodes the mopcs and executes a series of medical sub-processes and sends out
secondary commands to the medical machine. Whereas operands in a typical
computer system are numerical and logical entities, the operands in medical
machine are objects such as such as patients, blood samples, tissues, operating
rooms, medical staff, medical bills, patient payments, etc. We follow the
functional overlap between the two processes and evolve the design of medical
computer systems and networks.Comment: 17 page
Distributed learning of CNNs on heterogeneous CPU/GPU architectures
Convolutional Neural Networks (CNNs) have shown to be powerful classification
tools in tasks that range from check reading to medical diagnosis, reaching
close to human perception, and in some cases surpassing it. However, the
problems to solve are becoming larger and more complex, which translates to
larger CNNs, leading to longer training times that not even the adoption of
Graphics Processing Units (GPUs) could keep up to. This problem is partially
solved by using more processing units and distributed training methods that are
offered by several frameworks dedicated to neural network training. However,
these techniques do not take full advantage of the possible parallelization
offered by CNNs and the cooperative use of heterogeneous devices with different
processing capabilities, clock speeds, memory size, among others. This paper
presents a new method for the parallel training of CNNs that can be considered
as a particular instantiation of model parallelism, where only the
convolutional layer is distributed. In fact, the convolutions processed during
training (forward and backward propagation included) represent from -\%
of global processing time. The paper analyzes the influence of network size,
bandwidth, batch size, number of devices, including their processing
capabilities, and other parameters. Results show that this technique is capable
of diminishing the training time without affecting the classification
performance for both CPUs and GPUs. For the CIFAR-10 dataset, using a CNN with
two convolutional layers, and and kernels, respectively, best
speedups achieve using four CPUs and with three GPUs.
Modern imaging datasets, larger and more complex than CIFAR-10 will certainly
require more than -\% of processing time calculating convolutions, and
speedups will tend to increase accordingly
A Multi-GPU Programming Library for Real-Time Applications
We present MGPU, a C++ programming library targeted at single-node multi-GPU
systems. Such systems combine disproportionate floating point performance with
high data locality and are thus well suited to implement real-time algorithms.
We describe the library design, programming interface and implementation
details in light of this specific problem domain. The core concepts of this
work are a novel kind of container abstraction and MPI-like communication
methods for intra-system communication. We further demonstrate how MGPU is used
as a framework for porting existing GPU libraries to multi-device
architectures. Putting our library to the test, we accelerate an iterative
non-linear image reconstruction algorithm for real-time magnetic resonance
imaging using multiple GPUs. We achieve a speed-up of about 1.7 using 2 GPUs
and reach a final speed-up of 2.1 with 4 GPUs. These promising results lead us
to conclude that multi-GPU systems are a viable solution for real-time MRI
reconstruction as well as signal-processing applications in general.Comment: 15 pages, 10 figure
Abnormality Detection in Mammography using Deep Convolutional Neural Networks
Breast cancer is the most common cancer in women worldwide. The most common
screening technology is mammography. To reduce the cost and workload of
radiologists, we propose a computer aided detection approach for classifying
and localizing calcifications and masses in mammogram images. To improve on
conventional approaches, we apply deep convolutional neural networks (CNN) for
automatic feature learning and classifier building. In computer-aided
mammography, deep CNN classifiers cannot be trained directly on full mammogram
images because of the loss of image details from resizing at input layers.
Instead, our classifiers are trained on labelled image patches and then adapted
to work on full mammogram images for localizing the abnormalities.
State-of-the-art deep convolutional neural networks are compared on their
performance of classifying the abnormalities. Experimental results indicate
that VGGNet receives the best overall accuracy at 92.53\% in classifications.
For localizing abnormalities, ResNet is selected for computing class activation
maps because it is ready to be deployed without structural change or further
training. Our approach demonstrates that deep convolutional neural network
classifiers have remarkable localization capabilities despite no supervision on
the location of abnormalities is provided.Comment: 6 page
Fault tolerant architectures for integrated aircraft electronics systems, task 2
The architectural basis for an advanced fault tolerant on-board computer to succeed the current generation of fault tolerant computers is examined. The network error tolerant system architecture is studied with particular attention to intercluster configurations and communication protocols, and to refined reliability estimates. The diagnosis of faults, so that appropriate choices for reconfiguration can be made is discussed. The analysis relates particularly to the recognition of transient faults in a system with tasks at many levels of priority. The demand driven data-flow architecture, which appears to have possible application in fault tolerant systems is described and work investigating the feasibility of automatic generation of aircraft flight control programs from abstract specifications is reported
- …