
    Thermal Characterization of Next-Generation Workloads on Heterogeneous MPSoCs

    Next-generation High-Performance Computing (HPC) applications need to tackle outstanding computational complexity while meeting latency and Quality-of-Service constraints. Heterogeneous Multi-Processor Systems-on-Chip (MPSoCs), equipped with a mix of general-purpose cores and reconfigurable fabric for custom acceleration of computational blocks, are key to providing the flexibility needed to meet the requirements of next-generation HPC. However, heterogeneity brings new challenges to efficient chip thermal management. In this context, accurate and fast thermal simulators are becoming crucial for understanding and exploiting the trade-offs brought by heterogeneous MPSoCs. In this paper, we first thermally characterize a next-generation HPC workload, the online video transcoding application, using a highly accurate infrared (IR) microscope. Second, we extend the 3D-ICE thermal simulation tool with a new generic heat spreader model capable of accurately reproducing package surface temperature, with an average error of 6.8% for the hot spots of the chip. Our model is used to characterize the thermal behaviour of the online transcoding application when running on a heterogeneous MPSoC. Moreover, by using our detailed thermal system characterization, we are able to explore different application mappings as well as the thermal limits of such heterogeneous platforms.

    Traffic Forecasting for Pavement Design

    The need for improved traffic estimation procedures has been emphasized by several studies demonstrating that previously available data were not adequate. Some data were not considered representative of actual traffic conditions because overloaded trucks avoided weighing scales and traffic sampling programs were insufficient. In addition, previous forecasting procedures did not reflect the increases in legal load limits, the significant increase in the number of heavy trucks, or the shift toward larger vehicle types that has occurred in recent years. Improved estimates of current traffic loadings, based on larger samples of much higher quality data, would allow the development of procedures for making improved estimates of historical traffic loadings and better forecasts of traffic loadings during the design period. The emergence of automatic vehicle classification equipment, permanent and portable weigh-in-motion (WIM) systems, and the application of microprocessors and microcomputers to these data acquisition functions now offer tools that may be used effectively in meeting these needs. Representatives from four States (Florida, Kentucky, Oregon, and Washington) met on several occasions to discuss the subject of traffic forecasting for pavement design. Information was compiled on all aspects of the traffic forecasting process, options were presented for each step of the process, and recommendations were developed to assist highway agencies in improving current practices and procedures.

    Towards visualization and searching: a dual-purpose video coding approach

    In modern video applications, the role of the decoded video is much more than filling a screen for visualization. To offer powerful video-enabled applications, it is increasingly critical not only to visualize the decoded video but also to provide efficient searching capabilities for similar content. Video surveillance and personal communication applications are critical examples of these dual visualization and searching requirements. However, current video coding solutions are strongly biased towards the visualization needs. In this context, the goal of this work is to propose a dual-purpose video coding solution targeting both visualization and searching needs by adopting a hybrid coding framework where the usual pixel-based coding approach is combined with a novel feature-based coding approach. In this novel dual-purpose video coding solution, some frames are coded using a set of keypoint matches, which not only allow decoding for visualization, but also provide the decoder with valuable feature-related information, extracted at the encoder from the original frames, that is instrumental for efficient searching. The proposed solution is based on a flexible joint Lagrangian optimization framework where pixel-based and feature-based processing are combined to find the most appropriate trade-off between the visualization and searching performances. Extensive experimental results for the assessment of the proposed dual-purpose video coding solution under meaningful test conditions are presented. The results show the flexibility of the proposed coding solution to achieve different optimization trade-offs, notably competitive performance relative to the state-of-the-art HEVC standard in terms of both visualization and searching performance.
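The joint Lagrangian trade-off mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' formulation: the cost function, the `search_loss` measure, and the weights `lam` and `mu` are all hypothetical, chosen only to show how one cost can balance visualization quality, searching performance, and bitrate when choosing between a pixel-based and a feature-based coding mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One coding option for a frame (all fields are illustrative)."""
    name: str
    distortion: float   # visualization distortion, e.g. MSE (lower is better)
    search_loss: float  # hypothetical searching-performance loss (lower is better)
    rate_bits: float    # bits needed to code the frame with this option


def lagrangian_cost(c: Candidate, lam: float, mu: float) -> float:
    # Assumed joint cost: J = D + mu * S + lam * R.
    # mu weights searching against visualization; lam weights rate.
    return c.distortion + mu * c.search_loss + lam * c.rate_bits


def choose_mode(candidates, lam: float, mu: float) -> Candidate:
    # Pick the candidate with the smallest joint cost.
    return min(candidates, key=lambda c: lagrangian_cost(c, lam, mu))


if __name__ == "__main__":
    options = [
        Candidate("pixel", distortion=10.0, search_loss=8.0, rate_bits=100.0),
        Candidate("feature", distortion=14.0, search_loss=2.0, rate_bits=90.0),
    ]
    for mu in (0.0, 1.0):
        best = choose_mode(options, lam=0.1, mu=mu)
        print(f"mu={mu}: chosen mode = {best.name}")
```

With these made-up numbers, ignoring searching (`mu = 0`) favours the pixel-based option, while weighting it (`mu = 1`) tips the choice to the feature-based option; sweeping `mu` is one way such a framework could expose different trade-offs.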

    Robust Magnetic Resonance Imaging of Short T2 Tissues

    Tissues with short transverse relaxation times are defined as "short T2 tissues", and they often appear dark on images generated by conventional magnetic resonance imaging techniques. Common short T2 tissues include tendons, meniscus, and cortical bone. Ultrashort Echo Time (UTE) pulse sequences can provide morphologic contrasts and quantitative maps for short T2 tissues by reducing the echo time to the system minimum (e.g., less than 100 µs). UTE sequences have therefore become a powerful imaging tool for visualizing and quantifying short T2 tissues in many applications. In this work, we developed a new Flexible Ultra Short time Echo (FUSE) pulse sequence employing a total of thirteen acquisition features with adjustable parameters, including optimized radiofrequency pulses, trajectories, a choice of two or three dimensions, and multiple long-T2 suppression techniques. Together with the FUSE sequence, an improved analytical density correction and an auto-deblurring algorithm were incorporated as part of a novel reconstruction pipeline for reducing imaging artifacts. First, we evaluated the FUSE sequence using a phantom containing short T2 components. The results demonstrated that varying the UTE acquisition method, improving the density correction functions, and improving the deblurring algorithm could reduce the various artifacts, improve the overall signal, and enhance short T2 contrast. Second, we applied the FUSE sequence to bovine stifle joints (similar to the human knee) for morphologic imaging and quantitative assessment. The results showed that it was feasible to use the FUSE sequence to create morphologic images that isolate signals from the various knee joint tissues and to carry out comprehensive quantitative assessments, using the meniscus as a model, including mapping of longitudinal relaxation (T1) times, quantitative magnetization transfer parameters, and effective transverse relaxation (T2*) times. Lastly, we used the FUSE sequence to image the human skull to evaluate its feasibility for synthetic computed tomography (CT) generation and radiation treatment planning. The results demonstrated that radiation treatment plans created using the FUSE-based synthetic CT and traditional CT data yielded comparable dose calculations, with a mean dose difference of less than one percent. In summary, this thesis clearly demonstrated the need for the FUSE sequence and its potential for robustly imaging short T2 tissues in various applications.

    Sequence-Level Reference Frames In Video Coding

    The proliferation of low-cost DRAM chipsets now begins to allow for the consideration of substantially increased decoded picture buffers in advanced video coding standards such as HEVC, VVC, and Google VP9. At the same time, the increasing demand for rapid scene changes and multiple scene repetitions in entertainment or broadcast content indicates that extending the frame referencing interval to tens of minutes or even the entire video sequence may offer coding gains, as long as one is able to identify frame similarity in a computationally and memory-efficient manner. Motivated by these observations, we propose a "stitching" method that defines a reference buffer and a reference frame selection algorithm. Our proposal extends the referencing interval of inter-frame video coding to the entire length of video sequences. Our reference frame selection algorithm uses well-established feature descriptor methods that describe frame structural elements in a compact and semantically rich manner. We propose to combine such compact descriptors with a similarity scoring mechanism in order to select the frames to be "stitched" to the reference picture buffers of advanced inter-frame encoders like HEVC, VVC, and VP9 without breaking standard compliance. Our evaluation on synthetic and real-world video sequences with the HEVC and VVC reference encoders shows that our method offers significant rate gains, with complexity and memory requirements that remain manageable for practical encoders and decoders.
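The idea of pairing compact frame descriptors with a similarity score to pick long-range reference frames can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the descriptor or the scoring rule, so the plain numeric vectors, the cosine similarity, and the parameters `k` and `min_score` are all hypothetical stand-ins.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_reference_frames(current, past, k=2, min_score=0.8):
    """Return up to k past frame indices most similar to the current frame.

    current   : descriptor of the frame being encoded (hypothetical vector)
    past      : dict mapping frame index -> descriptor of an earlier frame
    min_score : assumed threshold below which a frame is not worth stitching
    """
    scored = [(cosine_similarity(current, desc), idx) for idx, desc in past.items()]
    kept = [(score, idx) for score, idx in scored if score >= min_score]
    # Highest score first; ties broken by the earlier frame index.
    kept.sort(key=lambda t: (-t[0], t[1]))
    return [idx for _, idx in kept[:k]]


if __name__ == "__main__":
    history = {
        0: [1.0, 0.0, 0.0],    # opening scene
        50: [0.9, 0.1, 0.0],   # same scene, slightly changed
        120: [0.0, 1.0, 0.0],  # unrelated scene
    }
    # The current frame revisits the opening scene.
    print(select_reference_frames([1.0, 0.0, 0.05], history))
```

Because the descriptors are small, a store of this kind could cover an entire sequence cheaply; only the selected frames would then need to be placed in the encoder's reference picture buffer.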

    Virtual inertia for suppressing voltage oscillations and stability mechanisms in DC microgrids

    Renewable energy sources (RES) are gradually penetrating power systems through power electronic converters (PECs), which greatly change the structure and operating characteristics of traditional power systems. The maturation of PECs has also laid a technical foundation for the development of DC microgrids (DC-MGs). The advantages of DC-MGs over AC systems make them an important access target for RES. Due to the multi-timescale characteristics and fast response of power electronics, the dynamic coupling of PEC control systems and the transient interaction between the PEC and the passive network are inevitable, which threatens the stable operation of DC-MGs. Therefore, this dissertation focuses on stabilization control methods, the low-frequency oscillation (LFO) mechanism analysis of DC-MGs, and the state-of-charge (SoC) imbalance problem of multi-parallel energy storage systems (ESS). Firstly, a virtual inertia and damping control (VIDC) strategy is proposed to enable bidirectional DC converters (BiCs) to damp voltage oscillations by using the energy stored in the ESS to emulate inertia without modifications to system hardware. Both the inertia part and the damping part are modeled in the VIDC controller by analogy with DC machines. Simulation results verify that the proposed VIDC can improve the dynamic characteristics and stability of an islanded DC-MG. Then, inertia droop control (IDC) strategies are proposed for the BiC of the ESS based on a comparison between conventional droop control and VIDC. A feedback analytical method is presented to explain the stability mechanisms from multiple viewpoints and to observe the interaction between variables intuitively. A hardware-in-the-loop (HIL) experiment verifies that IDC can simplify the control structure of VIDC while ensuring similar control performance. Subsequently, a multi-timescale impedance model is established to clarify the control principle of VIDC and the LFO mechanisms of a VIDC-controlled DC-MG. Control loops of different timescales are visualized as independent loop virtual impedances (LVIs) to form an impedance circuit. The instability factors are revealed, and a dynamic stability enhancement method is proposed to compensate for the negative damping caused by VIDC and constant power loads (CPLs). Experimental results validate the LFO mechanism analysis and the stability enhancement method. Finally, an inertia-emulation-based cooperative control strategy for multi-parallel ESS is proposed to address the SoC imbalance and voltage deviation problems in steady-state operation as well as the voltage stability problem. The conflict between SoC balancing speed and system stability is resolved by a redefined SoC-based droop resistance function. HIL experiments show that the proposed control achieves better dynamic and static characteristics without modifying the hardware and can balance the SoC in both charge and discharge modes.

    Computer Vision Approaches to Liquid-Phase Transmission Electron Microscopy

    Electron microscopy (EM) is a technique that exploits the interaction between electrons and matter to produce high-resolution images down to the atomic level. To avoid undesired scattering in the electron path, EM samples are conventionally imaged in the solid state under vacuum conditions. Recently, this limit has been overcome by the realization of liquid-phase electron microscopy (LP EM), a technique that enables the analysis of samples in their native liquid state. LP EM paired with a high-frame-rate direct detection camera allows tracking the motion of particles in liquids, as well as their temporal dynamic processes. In this research work, LP EM is adopted to image the dynamics of particles undergoing Brownian motion, exploiting their natural rotation to access all the particle views in order to reconstruct their 3D structure via tomographic techniques. Specific computer vision-based tools were designed around the limitations of LP EM in order to process the results of the imaging process. In particular, different deblurring and denoising approaches were adopted to improve the quality of the images. The processed LP EM images were then used to reconstruct the 3D model of the imaged samples. This task was performed by developing two different methods: Brownian tomography (BT) and Brownian particle analysis (BPA). The former tracks a single particle in time, capturing its dynamic evolution. The latter is an extension in time of the single particle analysis (SPA) technique. SPA is conventionally paired with cryo-EM to reconstruct 3D density maps from thousands of EM images by capturing hundreds of particles of the same species frozen on a grid. In contrast, BPA can process image sequences that may not contain thousands of particles; instead, it monitors individual particle views across consecutive frames rather than within a single frame.

    High Performance Multiview Video Coding

    Following the standardization of the latest video coding standard, High Efficiency Video Coding (HEVC), in 2013, the multiview extension of HEVC (MV-HEVC) was published in 2014 and brought significantly better compression performance, of around 50%, for multiview and 3D videos compared to multiple independent single-view HEVC coding. However, the extremely high computational complexity of MV-HEVC demands significant optimization of the encoder. To tackle this problem, this work investigates the possibilities of using modern parallel computing platforms and tools such as single-instruction-multiple-data (SIMD) instructions, multi-core CPUs, massively parallel GPUs, and computer clusters to significantly enhance MVC encoder performance. These computing tools have very different computing characteristics, and misusing them may result in poor performance improvement and sometimes even a reduction. To achieve the best possible encoding performance from modern computing tools, different levels of parallelism inside a typical MVC encoder are identified and analyzed. Novel optimization techniques at various levels of abstraction are proposed: non-aggregation massively parallel motion estimation (ME) and disparity estimation (DE) in the prediction unit (PU), fractional and bi-directional ME/DE acceleration through SIMD, quantization parameter (QP)-based early termination for the coding tree unit (CTU), optimized resource-scheduled wave-front parallel processing for CTUs, and workload-balanced, cluster-based multiple-view parallelism. The results show that the proposed parallel optimization techniques significantly improve execution time with insignificant loss in coding efficiency. This, in turn, shows that modern parallel computing platforms, with appropriate platform-specific algorithm design, are valuable tools for improving the performance of computationally intensive applications.
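To make the wave-front parallel processing of CTUs mentioned above concrete, here is a minimal sketch of the standard HEVC-style wavefront dependency, not the optimized resource-scheduled variant the abstract proposes. It assumes each CTU can start once its left neighbour and its above-right neighbour are finished, which places the CTU at row `r`, column `c` in wave `c + 2*r`. The function name and the grid size are illustrative.

```python
from collections import defaultdict


def wavefront_schedule(rows, cols):
    """Group CTU coordinates (row, col) into waves that can run in parallel.

    Assumes the standard wavefront dependency: CTU (r, c) needs (r, c-1)
    and (r-1, c+1), so its earliest wave index is c + 2*r.
    """
    waves = defaultdict(list)
    for r in range(rows):
        for c in range(cols):
            waves[c + 2 * r].append((r, c))
    # Return the waves in execution order.
    return [waves[w] for w in sorted(waves)]


if __name__ == "__main__":
    schedule = wavefront_schedule(rows=3, cols=4)
    for step, ctus in enumerate(schedule):
        print(f"wave {step}: {ctus}")
    print("max CTUs in flight:", max(len(w) for w in schedule))
```

For a grid this small only two CTUs ever run at once, which hints at why scheduling and load balancing matter: the available parallelism ramps up and down with the wavefront rather than staying constant.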

    Low-power techniques for video decoding

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student submitted PDF version of thesis. Includes bibliographical references (p. 149-156). The H.264 video coding standard can deliver high compression efficiency at the cost of high complexity and power. The increasing popularity of video capture and playback on portable devices requires that the energy of the video processing be kept to a minimum. This work implements several architecture optimizations that reduce the system power of a high-definition video decoder. In order to decode high resolutions at low voltages and low frequencies, we employ techniques such as pipelining, unit parallelism, multiple cores, and multiple voltage/frequency domains. For example, a 3-core decoder can reduce the required clock frequency by 2.91×, which enables a power reduction of 61% relative to a full-voltage single-core decoder. To reduce the total memory system power, several caching techniques are demonstrated that can dramatically reduce the off-chip memory bandwidth and power at the cost of increased chip area. A 123 kB data-forwarding cache can reduce the read bandwidth from external memory by 53%, which leads to 44% power savings in the memory reads. To demonstrate these low-power ideas, an H.264/AVC Baseline Level 3.2 decoder ASIC was fabricated in 65 nm CMOS and verified. It operates down to 0.7 V and has a measured power down to 1.8 mW when decoding a high-definition 720p video at 30 frames per second, which is over an order of magnitude lower than previously published results. By Daniel Frederic Finchelstein. Ph.D.