
    Dead Reckoning Localization Technique for Mobile Wireless Sensor Networks

    Localization in wireless sensor networks not only provides a node with its geographical location but is also a basic requirement for other applications such as geographical routing. Although a rich literature is available for localization in static WSNs, relatively little work has been done for mobile WSNs, owing to the complexity introduced by node mobility. Most existing techniques for localization in mobile WSNs use Monte Carlo localization, which is not only time-consuming but also memory intensive, and they consider either the unknown nodes or the anchor nodes to be static. In this paper, we propose a technique called Dead Reckoning Localization for mobile WSNs (DRLMSN). In the proposed technique all nodes (unknown nodes as well as anchor nodes) are mobile. Localization in DRLMSN is done at discrete time intervals called checkpoints. Unknown nodes are localized for the first time using three anchor nodes; for their subsequent localizations, only two anchor nodes are used. The proposed technique estimates two candidate locations of a node using Bézout's theorem, and a dead reckoning approach is used to select one of the two estimated locations. We have evaluated DRLMSN through simulation using the Castalia simulator and compared it with a similar technique, RSS-MCL, proposed by Wang and Zhu. Comment: Journal Paper, IET Wireless Sensor Systems, 201
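
    The two-anchor step described above amounts to intersecting two range circles and choosing the intersection point closest to the dead-reckoned prediction. Below is a minimal sketch of that geometric idea; the helper names and the simple constant-velocity prediction are illustrative assumptions, not the authors' DRLMSN code.

```python
import math

def circle_intersections(c1, r1, c2, r2):
    """Return the (up to two) intersection points of two range circles.
    c1, c2: (x, y) anchor positions; r1, r2: measured distances."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []                              # degenerate or inconsistent ranges
    a = (r1**2 - r2**2 + d**2) / (2 * d)       # distance from c1 to the chord midpoint
    h = math.sqrt(max(r1**2 - a**2, 0.0))      # half-length of the chord
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ox, oy = h * (y2 - y1) / d, h * (x2 - x1) / d
    return [(mx + ox, my - oy), (mx - ox, my + oy)]

def select_by_dead_reckoning(candidates, last_pos, velocity, dt):
    """Pick the candidate closest to the dead-reckoned predicted position."""
    pred = (last_pos[0] + velocity[0] * dt, last_pos[1] + velocity[1] * dt)
    return min(candidates, key=lambda p: math.hypot(p[0] - pred[0], p[1] - pred[1]))

# Example: two anchors, two ranges, previous position and velocity estimate.
cands = circle_intersections((0.0, 0.0), 5.0, (8.0, 0.0), 5.0)
print(select_by_dead_reckoning(cands, last_pos=(3.5, 2.5), velocity=(0.5, 0.2), dt=1.0))
```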

    SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks

    Going deeper and wider in neural architectures improves accuracy, while the limited GPU DRAM places an undesired restriction on the network design domain. Deep Learning (DL) practitioners either need to change to less desirable network architectures, or non-trivially dissect a network across multiple GPUs. Both distract DL practitioners from concentrating on their original machine learning tasks. We present SuperNeurons: a dynamic GPU memory scheduling runtime that enables network training far beyond the GPU DRAM capacity. SuperNeurons features three memory optimizations, Liveness Analysis, Unified Tensor Pool, and Cost-Aware Recomputation; together they reduce the network-wide peak memory usage down to the maximal memory usage among layers. We also address the performance issues in these memory-saving techniques. Given the limited GPU DRAM, SuperNeurons not only provisions the necessary memory for training, but also dynamically allocates memory for convolution workspaces to achieve high performance. Evaluations against Caffe, Torch, MXNet and TensorFlow demonstrate that SuperNeurons trains networks at least 3.2432 times deeper than current ones with leading performance. In particular, SuperNeurons can train ResNet-2500, which has 10^4 basic network layers, on a 12 GB K40c. Comment: PPoPP 2018: 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
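
    To make the cost-aware recomputation idea concrete: when activations no longer fit in the memory budget, the ones that are cheap to recompute are dropped and regenerated during the backward pass. The toy planner below illustrates that trade-off; the data structures and cost numbers are invented for illustration and are not the SuperNeurons runtime.

```python
# Toy cost-aware recomputation planner: keep activations while they fit in the
# budget, and mark the cheapest-to-recompute ones for recomputation otherwise.
def plan_recomputation(layers, mem_budget):
    """layers: list of dicts with 'name', 'act_bytes', 'recompute_cost' (arbitrary units)."""
    kept, recompute, used = [], [], 0
    # Prefer to keep activations that are expensive to recompute (e.g. convolutions).
    for layer in sorted(layers, key=lambda l: l["recompute_cost"], reverse=True):
        if used + layer["act_bytes"] <= mem_budget:
            kept.append(layer["name"])
            used += layer["act_bytes"]
        else:
            recompute.append(layer["name"])
    return kept, recompute

layers = [
    {"name": "conv1", "act_bytes": 400, "recompute_cost": 9.0},
    {"name": "relu1", "act_bytes": 400, "recompute_cost": 0.1},
    {"name": "conv2", "act_bytes": 300, "recompute_cost": 7.0},
    {"name": "pool1", "act_bytes": 100, "recompute_cost": 0.2},
]
print(plan_recomputation(layers, mem_budget=800))  # keeps the expensive conv activations
```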

    Hard and Soft Error Resilience for One-sided Dense Linear Algebra Algorithms

    Dense matrix factorizations, such as LU, Cholesky and QR, are widely used by scientific applications that require solving systems of linear equations, eigenvalue problems and linear least squares problems. Such computations are normally carried out on supercomputers, whose ever-growing scale induces a fast decline of the Mean Time To Failure (MTTF). This dissertation develops fault tolerance algorithms for one-sided dense matrix factorizations, which handle both hard and soft errors. For hard errors, we propose methods based on diskless checkpointing and Algorithm-Based Fault Tolerance (ABFT) to provide full matrix protection, including the left and right factors that are normally produced in dense matrix factorizations. A horizontal parallel diskless checkpointing scheme is devised to maintain the checkpoint data with scalable performance and low space overhead, while the ABFT checksum that is generated before the factorization is continuously updated by the factorization operations to protect the right factor. In addition, in the absence of a fault-tolerant MPI environment, we have also integrated the Checkpoint-on-Failure (CoF) mechanism into one-sided dense linear algebra operations such as QR factorization to recover the running stack of the failed MPI process. Soft errors are more challenging because of silent data corruption, which leads to a large area of erroneous data due to error propagation. Full matrix protection is developed where the left factor is protected by column-wise local diskless checkpointing, and the right factor is protected by a combination of a floating-point weighted checksum scheme and a soft error modeling technique. To allow practical use on large-scale systems, we have also developed a complexity reduction scheme so that correct computing results can be recovered with low performance overhead. Experimental results on a large-scale cluster system and a multicore+GPGPU hybrid system confirm that our hard and soft error fault tolerance algorithms exhibit the expected error correcting capability, low space and performance overhead, and compatibility with double-precision floating-point operations.
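
    A core ABFT building block referred to above is a checksum column appended to the matrix: because factorization row operations are linear, the checksum stays consistent with the data and a silent corruption shows up as a broken invariant. The numpy sketch below illustrates that invariant for an unpivoted, Gaussian-elimination-style update; it is a simplified illustration, not the dissertation's scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))

# Append a checksum column: each row's entry is the sum of that row of A.
Ac = np.hstack([A, A @ np.ones((n, 1))])

# Row operations of an (unpivoted) LU elimination preserve the invariant
# Ac[:, -1] == Ac[:, :n] @ ones, because they act on whole rows.
for k in range(n - 1):
    for i in range(k + 1, n):
        m = Ac[i, k] / Ac[k, k]
        Ac[i, :] -= m * Ac[k, :]

residual = Ac[:, -1] - Ac[:, :n] @ np.ones(n)
print(np.max(np.abs(residual)))     # roundoff-level: checksum still consistent

# A soft error (silent bit flip) breaks the invariant and is detectable:
Ac[2, 3] += 1.0
residual = Ac[:, -1] - Ac[:, :n] @ np.ones(n)
print(np.argmax(np.abs(residual)))  # row 2 flags the corruption
```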

    FORM version 4.0

    We present version 4.0 of the symbolic manipulation system FORM. The most important new features are the manipulation of rational polynomials and the factorization of expressions. Many other new functions and commands have also been added; some of them are very general, while others are designed for building specific high-level packages, such as one for Groebner bases. Also new is the checkpoint facility, which allows periodic backups during long calculations. Lastly, FORM 4.0 has become available as open source under the GNU General Public License version 3. Comment: 26 pages. Uses axodraw
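
    For readers unfamiliar with what "manipulation of rational polynomials and factorization of expressions" means in practice, the snippet below shows the same two operations using sympy in Python. It is only an analogy for illustration; FORM programs are written in FORM's own language, not Python.

```python
from sympy import symbols, factor, cancel, together

x, y = symbols("x y")

# Factorization of a polynomial expression.
print(factor(x**4 - y**4))          # (x - y)*(x + y)*(x**2 + y**2)

# Rational-polynomial manipulation: combine terms and reduce to lowest terms.
expr = 1 / (x - y) + 1 / (x + y)
print(cancel(together(expr)))       # 2*x/(x**2 - y**2)
```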

    RADIC II: a fault tolerant architecture with flexible dynamic redundancy

    The demand for computational power has been driving the advancement of the High Performance Computing (HPC) area, generally represented by the use of distributed systems such as clusters of computers running parallel applications. In this area, fault tolerance plays an important role in providing high availability by isolating the application from the effects of faults. Performance and availability form an inseparable pair for some kinds of applications; therefore, fault-tolerant solutions must take both constraints into consideration from the moment they are designed. In this dissertation, we present some side effects that certain fault-tolerant solutions may exhibit when recovering a failed process. These effects may cause degradation of the system, affecting mainly the overall performance and availability. We introduce RADIC-II, a fault-tolerant architecture for message passing based on the RADIC (Redundant Array of Distributed Independent Fault Tolerance Controllers) architecture. RADIC-II preserves, as far as possible, the RADIC features of transparency, decentralization, flexibility and scalability, and incorporates a flexible dynamic redundancy feature that allows some recovery side effects to be mitigated or avoided.

    Rollback recovery with low overhead for fault tolerance in mobile ad hoc networks

    Mobile ad hoc networks (MANETs) have significantly enhanced wireless networks by eliminating the need for any fixed infrastructure. Hence, they are increasingly being used for expanding the computing capacity of existing networks or for implementing autonomous mobile computing Grids. However, the fragile nature of MANETs makes the constituent nodes susceptible to failures, and the computing potential of these networks can be utilized only if they are fault tolerant. The technique of checkpointing-based rollback recovery has been used effectively for fault tolerance in static and cellular mobile systems; yet, the implementation of existing protocols for MANETs is not straightforward. This paper presents a novel rollback recovery protocol for handling failures of mobile nodes in a MANET using checkpointing and sender-based message logging. The proposed protocol utilizes the routing protocol existing in the network to implement a low-overhead recovery mechanism. The recovery procedure at a node is completely domino-free and asynchronous. The protocol is resilient to the dynamic characteristics of the MANET, allowing a distributed application to be executed independently without access to any wired Grid or cellular network access points. We also present an algorithm to record a consistent global snapshot of the MANET.
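
    The key idea behind sender-based message logging mentioned above is that each sender keeps the messages it sent since the receiver's last checkpoint, so a failed node can roll back to its checkpoint and be replayed forward without a domino effect. The class below is a minimal sketch of that bookkeeping; the names and the channel interface are hypothetical, not the paper's protocol.

```python
class SenderLog:
    """Sender-based message log: the sender keeps copies of sent messages,
    tagged with a send-sequence number, until the receiver checkpoints."""

    def __init__(self):
        self.next_ssn = {}   # receiver id -> next send-sequence number
        self.log = {}        # receiver id -> list of (ssn, payload)

    def send(self, dest, payload, channel):
        ssn = self.next_ssn.get(dest, 0)
        self.next_ssn[dest] = ssn + 1
        self.log.setdefault(dest, []).append((ssn, payload))
        channel.deliver(dest, ssn, payload)   # channel is any object with deliver()

    def on_receiver_checkpoint(self, dest, stable_ssn):
        # Messages received before the receiver's checkpoint are no longer needed.
        self.log[dest] = [(s, p) for (s, p) in self.log.get(dest, []) if s > stable_ssn]

    def replay(self, dest, from_ssn, channel):
        # Re-deliver, in order, the messages the recovering node lost after rollback.
        for ssn, payload in sorted(self.log.get(dest, [])):
            if ssn >= from_ssn:
                channel.deliver(dest, ssn, payload)
```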

    Experimental evaluation of a UWB-based cooperative positioning system for pedestrians in GNSS-denied environment

    Cooperative positioning (CP) utilises information sharing among multiple nodes to enable positioning in Global Navigation Satellite System (GNSS)-denied environments. This paper reports the performance of a CP system for pedestrians using Ultra-Wide Band (UWB) technology in GNSS-denied environments. The data set was collected as part of a benchmarking measurement campaign carried out at the Ohio State University in October 2017. Pedestrians were equipped with a variety of sensors, including two different UWB systems, on a specially designed helmet serving as a mobile multi-sensor platform for CP. Different users walked in stop-and-go mode along trajectories with predefined checkpoints and under various challenging environments. In the developed CP network, both Peer-to-Infrastructure (P2I) and Peer-to-Peer (P2P) measurements are used for positioning the pedestrians. The proposed system can achieve decimetre-level accuracy (on average, around 20 cm) in the complete absence of GNSS signals, provided that measurements from infrastructure nodes are available and the network geometry is good. In the absence of these favourable conditions, the results show that the average accuracy degrades to the metre level. It is also experimentally demonstrated that the inclusion of P2P cooperative range observations further enhances the positioning accuracy and, in extreme cases when only one infrastructure measurement is available, P2P CP may reduce positioning errors by up to 95%. The complete test setup, the development methodology, and the data collection are discussed in this paper. In the next version of this system, additional observations such as Wi-Fi, camera imagery, and other signals of opportunity will be included.
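
    Positioning from P2I (and P2P) range measurements of the kind described above is commonly solved as a nonlinear least-squares problem over the range equations. The sketch below uses a simple Gauss-Newton iteration on synthetic anchors and ranges; it is a generic illustration of range-based positioning, not the campaign's processing chain.

```python
import numpy as np

def gauss_newton_position(anchors, ranges, x0, iters=10):
    """Estimate a 2-D position from ranges to known anchor (infrastructure) positions."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        diffs = x - anchors                      # (n, 2) vectors to each anchor
        dists = np.linalg.norm(diffs, axis=1)    # predicted ranges
        r = dists - ranges                       # range residuals
        J = diffs / dists[:, None]               # Jacobian of range w.r.t. position
        dx, *_ = np.linalg.lstsq(J, -r, rcond=None)
        x = x + dx
    return x

anchors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
truth = np.array([3.0, 4.0])
ranges = np.linalg.norm(anchors - truth, axis=1) + np.random.default_rng(1).normal(0, 0.05, 4)
print(gauss_newton_position(anchors, ranges, x0=[5.0, 5.0]))  # close to (3, 4)
```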

    Subcutaneous Vein Recognition System Using Deep Learning for Intravenous (IV) Access Procedure

    Intravenous (IV) access is an important daily clinical procedure that delivers fluids or medication into a patient's vein. However, IV insertion is very challenging: clinicians often struggle to locate the subcutaneous vein owing to patients' physiological factors, such as hairy forearms and thick dermal fat, as well as the fatigue level of medical staff. To resolve this issue, researchers have proposed autonomous machines for IV access, but such equipment still lacks the capability to detect veins accurately. Therefore, this project proposes an automatic vein detection algorithm using deep learning for IV access purposes. U-Net, a fully convolutional network (FCN) architecture, is employed in this project due to its capability in detecting subcutaneous veins in near-infrared (NIR) images. Data augmentation is applied to increase the dataset size and reduce bias from overfitting. The original U-Net architecture is optimized by replacing up-sampling with transposed convolution, adding batch normalization, and reducing the number of layers to diminish the risk of overfitting. After fine-tuning and retraining the hypermodel, an unlabelled dataset is used to evaluate it by selecting 10 checkpoints for each forearm image and comparing the checkpoints on the predicted outputs to determine true-positive vein pixels. The proposed lightweight U-Net achieves slightly lower accuracy (0.8871) than the original U-Net architecture. Even so, sensitivity, specificity, and precision are greatly improved, reaching 0.7806, 0.9935, and 0.9918, respectively. These results indicate that the proposed algorithm can be applied in a venipuncture machine to accurately locate the subcutaneous vein for intravenous (IV) procedures.
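
    As an illustration of the architectural change described above (transposed convolution in place of up-sampling, plus batch normalization), a U-Net style decoder block along these lines could look as follows in PyTorch. The channel sizes and block structure are placeholder assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """Decoder block: transposed convolution instead of bilinear up-sampling,
    with batch normalization after each convolution."""

    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.conv = nn.Sequential(
            nn.Conv2d(out_ch + skip_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.up(x)                      # double the spatial resolution
        x = torch.cat([x, skip], dim=1)     # concatenate encoder skip features
        return self.conv(x)

# Example: low-resolution features upsampled and fused with higher-resolution skip features.
block = UpBlock(in_ch=256, skip_ch=128, out_ch=128)
out = block(torch.randn(1, 256, 32, 32), torch.randn(1, 128, 64, 64))
print(out.shape)  # torch.Size([1, 128, 64, 64])
```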