
    Scalable Distributed DNN Training using TensorFlow and CUDA-Aware MPI: Characterization, Designs, and Performance Evaluation

    TensorFlow has been the most widely adopted Machine/Deep Learning framework. However, little exists in the literature that provides a thorough understanding of the capabilities that TensorFlow offers for the distributed training of large ML/DL models that need computation and communication at scale. The most commonly used distributed training approaches for TF can be categorized as follows: 1) Google Remote Procedure Call (gRPC), 2) gRPC+X, where X is InfiniBand Verbs, Message Passing Interface, or GPUDirect RDMA, and 3) No-gRPC: Baidu Allreduce with MPI, Horovod with MPI, and Horovod with NVIDIA NCCL. In this paper, we provide an in-depth performance characterization and analysis of these distributed training approaches on various GPU clusters including the Piz Daint system (#6 on Top500). We perform experiments to gain novel insights along the following vectors: 1) Application-level scalability of DNN training, 2) Effect of batch size on scaling efficiency, 3) Impact of the MPI library used for No-gRPC approaches, and 4) Type and size of DNN architectures. Based on these experiments, we present two key insights: 1) Overall, No-gRPC designs achieve better performance than gRPC-based approaches for most configurations, and 2) The performance of No-gRPC is heavily influenced by the gradient aggregation using Allreduce. Finally, we propose a truly CUDA-Aware MPI Allreduce design that exploits CUDA kernels and pointer caching to perform large reductions efficiently. Our proposed designs offer 5-17X better performance than NCCL2 for small and medium messages, and reduce latency by 29% for large messages. The proposed optimizations help Horovod-MPI to achieve approximately 90% scaling efficiency for ResNet-50 training on 64 GPUs. Further, Horovod-MPI achieves 1.8X and 3.2X higher throughput than the native gRPC method for ResNet-50 and MobileNet, respectively, on the Piz Daint cluster. Comment: 10 pages, 9 figures, submitted to IEEE IPDPS 2019 for peer-review
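The abstract identifies Allreduce-based gradient aggregation as the step that dominates No-gRPC performance. As a rough illustration of what that operation computes, the sketch below simulates a generic textbook ring allreduce (reduce-scatter followed by all-gather) in plain Python. It is not the CUDA-Aware MPI design proposed in the paper, it does no real communication, and the function name `ring_allreduce` is a hypothetical choice made for this example.

```python
def ring_allreduce(worker_grads):
    """Simulate a ring allreduce (sum) over equal-length gradient lists.

    worker_grads holds one list of floats per simulated worker.
    Returns the per-worker buffers after the operation; every buffer
    ends up holding the element-wise sum over all workers.
    """
    n = len(worker_grads)
    length = len(worker_grads[0])
    # Chunk c covers indices bounds[c] .. bounds[c + 1] - 1.
    bounds = [(c * length) // n for c in range(n + 1)]
    bufs = [list(g) for g in worker_grads]

    # Reduce-scatter: in step s, worker r sends chunk (r - s) % n to its
    # right neighbour, which adds it to its own copy of that chunk.
    for s in range(n - 1):
        messages = []
        for r in range(n):
            c = (r - s) % n
            lo, hi = bounds[c], bounds[c + 1]
            messages.append(((r + 1) % n, lo, bufs[r][lo:hi]))
        for dst, lo, data in messages:
            for k, value in enumerate(data):
                bufs[dst][lo + k] += value

    # Worker r now owns the fully reduced chunk (r + 1) % n.
    # All-gather: in step s, worker r forwards chunk (r + 1 - s) % n,
    # and the right neighbour overwrites its copy with the reduced data.
    for s in range(n - 1):
        messages = []
        for r in range(n):
            c = (r + 1 - s) % n
            lo, hi = bounds[c], bounds[c + 1]
            messages.append(((r + 1) % n, lo, bufs[r][lo:hi]))
        for dst, lo, data in messages:
            bufs[dst][lo:lo + len(data)] = data

    return bufs


# Three simulated workers, four gradient values each (made-up numbers).
grads = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
    [100.0, 200.0, 300.0, 400.0],
]
reduced = ring_allreduce(grads)
```

Each worker sends and receives roughly 2(n-1)/n times the buffer size in total, which is why the per-message cost of the reduction matters so much for large gradient tensors.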

    Performance enhancement of the photovoltaic cells system by using the pneumatic routers

    Solar photovoltaic modules are of immense benefit to ordinary people in terms of independent energy solutions and conventional fuel savings. However, because of their inherently low efficiency per unit area, adoption of these technologies is very slow and faces resistance from domestic consumers. Solar photovoltaic thermal (PV/T) hybrid technology was therefore suggested, producing electrical and thermal output from the same unit area. Unfortunately, the lower individual efficiencies of the PV/T collector compared to the separate technologies hinder the potential advantages of this hybrid technology. This is due to low solar energy absorption and high thermal resistance between the PV cell and the cooling medium. This study aims to develop a novel photovoltaic thermal collector and to evaluate PVT performance using three rib configurations with pneumatic guiding devices. The design reduces thermal resistance and uses different guide angles to increase system efficiency and reduce the thermal losses that result from increased temperature. The channel in the new model was designed and developed in three phases to study the improvement in heat transfer. The first phase tests, by simulation, the number of pneumatic routers in the ribs, while the second phase tests, by simulation, the number of ribs in the channel. The analysis was conducted as a 3D simulation in ANSYS Fluent to determine the optimum airflow channel configuration. The simulation results indicate that the PVT configuration with seven polygons and five vectors was the best design, with a combined PVT efficiency of 70.86% and an electrical PVT efficiency of 11.22% at a mass flow rate of 0.17 kg/s and a solar irradiance of 1000 W/m². In the third phase, three different angles for the pneumatic routers were tested experimentally to determine the best angle. 
All configurations were tested experimentally outdoors under Iraqi climatic conditions according to the ASHRAE standard at different air mass flow rates. In the experiments, the PVT collector with pneumatic ribs and angle guides achieved the highest daily performance and the highest electrical and thermal efficiency with the guides at 30°, compared to 45°, 15°, and an empty PVT collector tube, at air mass flow rates of 0.08 to 0.17 kg/s. The 3D simulation and the experimental results were in good agreement: the average difference between numerical and experimental results ranged from 6.18% to 6.47% for the outlet air temperature and from 5.25% to 6.37% for the electrical and thermal efficiency, respectively
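To make the reported efficiency figures easier to relate to each other, the following sketch shows the usual first-law definitions for an air-cooled PVT collector. It assumes the combined efficiency is the plain sum of the thermal and electrical efficiencies, which the abstract does not state. The function names and the example inputs in the last line (collector area, specific heat, temperatures) are hypothetical and are not taken from the study.

```python
def thermal_efficiency(m_dot, cp, t_in, t_out, irradiance, area):
    """Thermal efficiency of an air collector: useful heat gain over incident solar power.

    m_dot in kg/s, cp in J/(kg K), temperatures in K or degrees C,
    irradiance in W/m^2, area in m^2.
    """
    return m_dot * cp * (t_out - t_in) / (irradiance * area)


def implied_thermal_efficiency(combined, electrical):
    """Thermal share, ASSUMING combined = thermal + electrical (simple sum)."""
    return combined - electrical


# Figures quoted in the abstract: combined 70.86 %, electrical 11.22 %.
implied = implied_thermal_efficiency(0.7086, 0.1122)  # about 0.596 under the sum assumption

# Purely illustrative inputs (hypothetical 1 m^2 collector, 3 K air temperature rise).
example = thermal_efficiency(m_dot=0.17, cp=1005.0, t_in=300.0, t_out=303.0,
                             irradiance=1000.0, area=1.0)
```

If the study instead weights the electrical output by a conversion factor, as some PVT papers do, the implied thermal share would differ from the value computed here.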

    Adaptive transient solution of nonuniform multiconductor transmission lines using wavelets

    This paper presents a highly adaptive algorithm for the transient simulation of nonuniform interconnects loaded with arbitrary nonlinear and dynamic terminations. The discretization of the governing equations is obtained through a weak formulation using biorthogonal wavelet bases as trial and test functions. It is shown how the multiresolution properties of wavelets lead to very sparse approximations of the voltages and currents in typical transient analyses. A simple yet effective time-space adaptive algorithm capable of selecting the minimal number of unknowns at each time iteration is described. Numerical results show the high degree of adaptivity of the proposed scheme. Index Terms: Electromagnetic (EM) transient analysis, multiconductor transmission lines (TLs), wavelet transforms.
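The sparsity claim can be illustrated with a toy example. The sketch below uses the orthonormal Haar wavelet, chosen only for brevity as a stand-in for the biorthogonal bases used in the paper, and decomposes a step-like voltage profile along a line. The profile, its length, and the function names are assumptions made for this illustration.

```python
import math


def haar_forward(signal):
    """Full orthonormal Haar decomposition; len(signal) must be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        # Coarse averages go first, detail coefficients of this level after them.
        coeffs[:n] = approx + detail
        n = half
    return coeffs


def count_significant(values, tol=1e-12):
    """Number of entries whose magnitude exceeds tol."""
    return sum(1 for v in values if abs(v) > tol)


# Hypothetical wavefront: 0 V on the first 40 cells of the line, 1 V on the last 24.
profile = [0.0] * 40 + [1.0] * 24
coeffs = haar_forward(profile)

nodal_unknowns = count_significant(profile)    # nonzero samples in the nodal basis
wavelet_unknowns = count_significant(coeffs)   # nonzero wavelet coefficients
```

Only the wavelets whose support straddles the wavefront carry nonzero coefficients, so the number of active unknowns grows with the logarithm of the grid size instead of with the number of cells behind the front. An adaptive scheme of the kind the abstract describes would track which coefficients are significant at each time step.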

    Electricity from photovoltaic solar cells: Flat-Plate Solar Array Project final report. Volume VI: Engineering sciences and reliability

    The Flat-Plate Solar Array (FSA) Project, funded by the U.S. Government and managed by the Jet Propulsion Laboratory, was formed in 1975 to develop the module/array technology needed to attain widespread terrestrial use of photovoltaics by 1985. To accomplish this, the FSA Project established and managed an Industry, University, and Federal Government Team to perform the needed research and development. This volume of the series of final reports documenting the FSA Project deals with the Project's activities directed at developing the engineering technology base required to achieve modules that meet the functional, safety and reliability requirements of large-scale terrestrial photovoltaic systems applications. These activities included: (1) development of functional, safety, and reliability requirements for such applications; (2) development of the engineering analytical approaches, test techniques, and design solutions required to meet the requirements; (3) synthesis and procurement of candidate designs for test and evaluation; and (4) performance of extensive testing, evaluation, and failure analysis to define design shortfalls and, thus, areas requiring additional research and development. During the life of the FSA Project, these activities were known by and included a variety of evolving organizational titles: Design and Test, Large-Scale Procurements, Engineering, Engineering Sciences, Operations, Module Performance and Failure Analysis, and at the end of the Project, Reliability and Engineering Sciences. This volume provides both a summary of the approach and technical outcome of these activities and a complete Bibliography (Appendix A) of the published documentation covering the detailed accomplishments and technologies developed.

    Metal oxide semiconductor nanomembrane-based soft unnoticeable multifunctional electronics for wearable human-machine interfaces

    Wearable human-machine interfaces (HMIs) are an important class of devices that enable human and machine interaction and teaming. Recent advances in electronics, materials, and mechanical designs have offered avenues toward wearable HMI devices. However, existing wearable HMI devices are uncomfortable to use and restrict the human body's motion, show slow response times, or are challenging to realize with multiple functions. Here, we report sol-gel-on-polymer-processed indium zinc oxide semiconductor nanomembrane-based ultrathin stretchable electronics with advantages of multifunctionality, simple manufacturing, imperceptible wearing, and robust interfacing. Multifunctional wearable HMI devices range from resistive random-access memory for data storage to field-effect transistors for interfacing and switching circuits, to various sensors for health and body motion sensing, and to microheaters for temperature delivery. The HMI devices can be not only seamlessly worn by humans but also implemented as prosthetic skin for robotics, which offer intelligent feedback, resulting in a closed-loop HMI system

    Investigation of welded interconnection of large area wraparound contacted silicon solar cells

    An investigation was conducted to evaluate the welding and temperature cycle testing of large area 5.9 x 5.9 cm wraparound silicon solar cells utilizing printed circuit substrates with SSC-155 interconnect copper metals and the LMSC Infrared Controlled weld station. An initial group of 5 welded modules containing Phase 2 developmental 5.9 x 5.9 cm cells was subjected to cyclical temperatures of ±80 °C at a rate of 120 cycles per day. Anomalies were noted in the adhesion of the cell contact metallization; therefore, 5 additional modules were fabricated and tested using available Phase I cells with demonstrated contact integrity. Cycling of the latter module type through 12,000 cycles indicated the viability of this type of lightweight flexible array concept. This project demonstrated acceptable use of an alternate interconnect copper in combination with large area wraparound cells and emphasized the necessity to implement weld pull, as opposed to solder pull, procedures at the cell vendors for cells that will be interconnected by welding.

    Multifunctional vertical interconnections of multilayered flexible substrates for miniaturised POCT devices

    Point-of-care testing (POCT) is an emerging technology that can bring a disruptive change to the lifestyle and medication of the population compared with the traditional medical laboratory. Since living organisms are intrinsically flexible and malleable, a flexible substrate is a necessity for integrating electronics into biological systems without causing discomfort during prolonged use. Isotropic conductive adhesives (ICAs) are attractive for wearable POCT devices because they are environmentally friendly and allow a lower processing temperature than soldering, which protects heat-sensitive components. Vertical interconnections and optical interconnections are considered key technologies for realising miniaturised high-performance devices in future applications. This thesis focused on multifunctional integration to enable both electrical and optical vertical interconnections through one via hole that can be fabricated in flexible substrates. The functional properties of the via and its response to the external loadings likely to be encountered in POCT devices are the primary concerns of this PhD project. In this thesis, the effect of curing on via performance was first investigated by studying the relationship between curing conditions and material properties. Based on differential scanning calorimetry (DSC) analysis results, a two-parameter autocatalytic model (the Sestak-Berggren model) was established as the most suitable description of the curing process of a typical ICA composed of epoxy-based binders and Ag filler particles. A link between curing conditions and the mechanical properties of ICAs was established based on dynamic mechanical analysis (DMA) experiments. A series of test vehicles containing vias filled with ICAs was cured under varying conditions. The electrical resistance of the ICA-filled vias was measured before testing and in real time during thermal cycling tests, damp heat tests and bending tests. 
A simplified model was derived to represent rivet-shaped vias in flexible printed circuit boards (FPCBs), based on the assumption of homogeneous ICAs, and an equation was proposed to evaluate the resistance of the model. Vias with different cap sizes were also tested, and the equation was validated. The samples were divided into three groups for the thermal cycling test, the damp heat ageing test and the bending test. Finite element analysis (FEA) was used to aid understanding of the electrical conduction mechanisms. Based on the theoretical equation and the simulation model, a fistula-shaped ICA via was fabricated in a flexible PCB. Its hollow nature provides space for the integration of optical or fluidic circuits. Resistance measurements and reliability tests proved that carefully designed and manufactured small bores in vias did not compromise performance. Test vehicles with optoelectrical vias were made through two different approaches to prove the feasibility of multifunctional vertical interconnections in flexible substrates. A case study was carried out on the manufacturing of reflection photoplethysmography (rPPG) sensors, using a specially designed optoelectronic system. ICA-based low-temperature manufacturing processes were developed to enable the integration of these flexible but delicate substrates and components. In the manufacturing routes, a modified stencil printing setup, which merges two printing-curing steps (via forming and component bonding) into one step, was developed to save both time and energy. The assembled probes showed outstanding performance in functional and physiological tests. The results from this thesis are anticipated to facilitate the understanding of the ICA via conduction mechanism and provide an applicable tool to optimise the design and manufacturing of optoelectrical vias.
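The two-parameter autocatalytic (Sestak-Berggren) model mentioned in the abstract is commonly written as dα/dt = A exp(-Ea / (R T)) α^m (1 - α)^n, where α is the degree of cure. A minimal sketch of how such a model predicts cure at a fixed temperature follows. All kinetic parameters (`pre_exp`, `e_act`, `m`, `n`), the small initial conversion, and the explicit Euler integration are assumptions chosen for illustration. They are not the fitted values or the method from the thesis.

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol K)


def sb_rate(alpha, temp_k, pre_exp, e_act, m, n):
    """Sestak-Berggren cure rate d(alpha)/dt with an Arrhenius rate constant."""
    k = pre_exp * math.exp(-e_act / (R_GAS * temp_k))
    return k * alpha ** m * (1.0 - alpha) ** n


def cure_isothermal(temp_k, duration_s, dt=1.0, alpha0=1e-3,
                    pre_exp=1.0e7, e_act=7.0e4, m=0.5, n=1.5):
    """Degree of cure after holding at temp_k for duration_s (explicit Euler).

    alpha0 must be positive: an autocatalytic rate is zero at alpha = 0.
    The default kinetic parameters are hypothetical placeholders.
    """
    alpha = alpha0
    for _ in range(int(duration_s / dt)):
        alpha += dt * sb_rate(alpha, temp_k, pre_exp, e_act, m, n)
        alpha = min(alpha, 1.0)  # conversion cannot exceed full cure
    return alpha


# Compare two hypothetical 10-minute cure schedules.
alpha_120c = cure_isothermal(393.15, 600.0)
alpha_150c = cure_isothermal(423.15, 600.0)
```

With parameters fitted to DSC data, the same loop could be used to compare candidate curing schedules before relating the degree of cure to mechanical or electrical properties.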

    Mechanics of Non Planar Interfaces in Flip-Chip Interconnects

    With the continued proliferation of low cost, portable consumer electronic products with greater functionality, there is increasing demand for electronic packaging that is smaller, lighter and less expensive. Flip chip is an essential enabling technology for these products. The electrical connection between the chip I/O and substrate is achieved using conductive materials, such as solder, conductive epoxy, metallurgy bump (e.g., gold) and anisotropic conductive adhesives. The interconnect regions of flip-chip packages consist of highly dissimilar materials to meet their functional requirements. The mismatches in properties, contact morphology and crystal orientation at those material interfaces make them vulnerable to failure through delamination and crack growth under various loading patterns. This study encompasses contact between deformable bodies, bonding at the asperities and fracture properties at interfaces formed by the interconnects of flip-chip packages. This is achieved through experimentation and modeling at different length scales, to capture the detailed microstructural features and contact mechanics at interfaces typically found in electronic systems.