Computer Architectures to Close the Loop in Real-time Optimization
© 2015 IEEE. Many modern control, automation, signal processing and machine learning applications rely on solving a sequence of optimization problems, which are updated with measurements of a real system that evolves in time. The solution of each of these optimization problems is then used to make decisions, which may be followed by changing some parameters of the physical system, thereby resulting in a feedback loop between the computing and the physical system. Real-time optimization is not the same as fast optimization, because the computation is affected by an uncertain system that evolves in time. The suitability of a design should therefore not be judged from the optimality of a single optimization problem, but based on the evolution of the entire cyber-physical system. The algorithms and hardware used for solving a single optimization problem in the office might therefore be far from ideal when solving a sequence of real-time optimization problems. Instead of there being a single, optimal design, one has to trade off a number of objectives, including performance, robustness, energy usage, size and cost. We therefore provide here a tutorial introduction to some of the questions and implementation issues that arise in real-time optimization applications. We will concentrate on some of the decisions that have to be made when designing the computing architecture and algorithm and argue that the choice of one informs the other.
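The feedback loop described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the scalar plant `x[k+1] = a*x[k] + b*u[k]`, the quadratic cost, the gradient-descent solver, and all parameter values. At each sample a new measurement defines a new problem, the solver is allowed only a fixed number of iterations (a stand-in for a compute budget), and it is warm-started from the previous input. The design is scored on the accumulated closed-loop cost, not on how well any single problem was solved.

```python
def closed_loop_cost(iters_per_sample, steps=50, a=1.2, b=1.0, r=0.1,
                     x0=1.0, alpha=0.1):
    """Simulate a hypothetical scalar plant driven by a truncated solver.

    Per sample, the controller approximately minimizes
        J(u) = (a*x + b*u)**2 + r*u**2
    with `iters_per_sample` gradient steps of size `alpha`, warm-started
    from the previous input. Returns the accumulated closed-loop cost.
    """
    x, u, total = x0, 0.0, 0.0
    for _ in range(steps):
        # The latest measurement x defines this sample's optimization problem.
        for _ in range(iters_per_sample):
            grad = 2.0 * b * (a * x + b * u) + 2.0 * r * u
            u -= alpha * grad
        # Apply the (possibly suboptimal) input to the plant.
        x = a * x + b * u
        total += x * x + r * u * u
    return total


if __name__ == "__main__":
    for budget in (1, 5, 20):
        print(budget, closed_loop_cost(budget))
```

In this toy setting a larger iteration budget lowers the closed-loop cost, but in a real system more iterations would also cost time and energy, which is the kind of trade-off the abstract refers to.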
TTC: A Tensor Transposition Compiler for Multiple Architectures
We consider the problem of transposing tensors of arbitrary dimension and
describe TTC, an open source domain-specific parallel compiler. TTC generates
optimized parallel C++/CUDA C code that achieves a significant fraction of the
system's peak memory bandwidth. TTC exhibits high performance across multiple
architectures, including modern AVX-based systems (e.g., Intel Haswell, AMD
Steamroller), Intel's Knights Corner as well as different CUDA-based GPUs such
as NVIDIA's Kepler and Maxwell architectures. We report speedups of TTC over a
meaningful baseline implementation generated by external C++ compilers; the
results suggest that a domain-specific compiler can significantly outperform its
general-purpose counterpart. For instance, comparing with Intel's latest
C++ compiler on the Haswell and Knights Corner architecture, TTC yields
speedups of up to and , respectively. We also showcase
TTC's support for multiple leading dimensions, making it a suitable candidate
for the generation of performance-critical packing functions that are at the
core of the ubiquitous BLAS 3 routines.
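For readers unfamiliar with the operation itself, the following is a minimal reference sketch of an out-of-place tensor transposition for an arbitrary permutation. It is not TTC's generated code and makes no attempt at performance; the function names, the row-major layout, and the convention that output axis `j` corresponds to input axis `perm[j]` are all assumptions chosen for illustration.

```python
from itertools import product


def row_major_strides(shape):
    """Element strides for a dense row-major tensor of the given shape."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def transpose(flat, shape, perm):
    """Permute the axes of a flat row-major tensor.

    Output axis j takes its extent and index from input axis perm[j].
    Returns the new flat data and the new shape.
    """
    out_shape = [shape[p] for p in perm]
    in_strides = row_major_strides(shape)
    out_strides = row_major_strides(out_shape)
    out = [None] * len(flat)
    for idx in product(*(range(n) for n in shape)):
        src = sum(i * s for i, s in zip(idx, in_strides))
        dst = sum(idx[p] * s for p, s in zip(perm, out_strides))
        out[dst] = flat[src]
    return out, out_shape


if __name__ == "__main__":
    data, new_shape = transpose(list(range(6)), [2, 3], [1, 0])
    print(new_shape, data)
```

The naive loop touches either the source or the destination with a large stride, which is why an optimized implementation has to work much harder to approach peak memory bandwidth.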
The OCareCloudS project: toward organizing care through trusted cloud services
The increasing elderly population and the shift from acute to chronic illness make it difficult to care for people in hospitals and rest homes. Moreover, elderly people, if given a choice, want to stay at home as long as possible. This article presents the methodologies used to develop a cloud-based semantic system that offers valuable information and knowledge-based services. The information and services are related to the different personal living hemispheres of the patient, namely the daily care-related needs, the social needs and the daily life assistance. Ontologies are used to facilitate the integration, analysis, aggregation and efficient use of all the available data in the cloud. By using an interdisciplinary research approach, in which user researchers, (ontology) engineers, researchers and domain stakeholders are at the forefront, a platform of great added value can be developed for patients who want to grow old in their own homes and for their caregivers.