
    Optimizing mining rates under financial uncertainty in global mining complexes

    This paper presents a distributed dynamic programming framework for production rate target tracking across multiple metal mines under financial uncertainty. A single mine's target tracking is stated as a stochastic optimization problem, and the solution is obtained by solving the dynamic program, which gives the optimal production rate schedule of each mine as a Markovian feedback control on the price process. The global solution is distributed over multiple mines by a policy iteration method, and this iterative method is shown to provide the unique equilibrium among Markovian strategies. Numerical results confirm the efficacy of the proposed global method when compared to individual optimization of mining rate target tracking.

    The Case for Asymmetric Systolic Array Floorplanning

    The widespread proliferation of deep learning applications has triggered the need to accelerate them directly in hardware. General Matrix Multiplication (GEMM) kernels are elemental deep-learning constructs and they inherently map onto Systolic Arrays (SAs). SAs are regular structures that are well-suited for accelerating matrix multiplications. Typical SAs use a pipelined array of Processing Elements (PEs), which communicate with local connections and pre-orchestrated data movements. In this work, we show that the physical layout of SAs should be asymmetric to minimize wirelength and improve energy efficiency. The floorplan of the SA adjusts better to the asymmetric widths of the horizontal and vertical data buses and their switching activity profiles. It is demonstrated that such physically asymmetric SAs reduce interconnect power by 9.1% when executing state-of-the-art Convolutional Neural Network (CNN) layers, as compared to SAs of the same size but with a square (i.e., symmetric) layout. The savings in interconnect power translate, in turn, to 2.1% overall power savings. Comment: CNNA 202

    Low-Power Data Streaming in Systolic Arrays with Bus-Invert Coding and Zero-Value Clock Gating

    Systolic Array (SA) architectures are well suited for accelerating matrix multiplications through the use of a pipelined array of Processing Elements (PEs) communicating with local connections and pre-orchestrated data movements. Even though most of the dynamic power consumption in SAs is due to multiplications and additions, pipelined data movement within the SA constitutes an additional important contributor. The goal of this work is to reduce the dynamic power consumption associated with the feeding of data to the SA, by synergistically applying bus-invert coding and zero-value clock gating. By exploiting salient attributes of state-of-the-art CNNs, such as the value distribution of the weights, the proposed SA applies appropriate encoding only to the data that exhibits high switching activity. Similarly, when one of the inputs is zero, unnecessary operations are entirely skipped. This selectively targeted, application-aware encoding approach is demonstrated to reduce the dynamic power consumption of data streaming in CNN applications using Bfloat16 arithmetic by 1%-19%. This translates to an overall dynamic power reduction of 6.2%-9.4%. Comment: International Conference on Modern Circuits and Systems Technologies (MOCAST
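
As a rough illustration of the bus-invert coding mentioned in the abstract above, the sketch below implements the classic textbook scheme: a word is sent inverted, with an extra invert line raised, whenever sending it as-is would toggle more than half of the data lines. This is not the paper's selective, application-aware variant; the function names, the 16-bit width, and the all-zero initial bus state are assumptions made for the example.

```python
def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def bus_invert_encode(words, width=16):
    """Classic bus-invert coding (illustrative, not the paper's variant).

    Returns a list of (sent_word, invert_bit) pairs. The bus is assumed
    to start in the all-zero state.
    """
    mask = (1 << width) - 1
    prev = 0  # value currently held on the data lines
    encoded = []
    for w in words:
        w &= mask
        # Invert if more than half of the data lines would toggle.
        if popcount(prev ^ w) > width // 2:
            sent, inv = (~w) & mask, 1
        else:
            sent, inv = w, 0
        encoded.append((sent, inv))
        prev = sent
    return encoded


def bus_invert_decode(encoded, width=16):
    """Recover the original words from (sent_word, invert_bit) pairs."""
    mask = (1 << width) - 1
    return [((~s) & mask) if inv else s for s, inv in encoded]


def count_toggles(states):
    """Total bit toggles over a sequence of bus states, starting from 0."""
    total, prev = 0, 0
    for s in states:
        total += popcount(prev ^ s)
        prev = s
    return total


words = [0x0000, 0xFFFF, 0x0001, 0xFFFE]
encoded = bus_invert_encode(words)
# Fold the invert line in as bit 16 so its toggles are counted too.
coded_states = [s | (inv << 16) for s, inv in encoded]
raw_toggles = count_toggles(words)
coded_toggles = count_toggles(coded_states)
```

For this deliberately adversarial input, the raw stream toggles 47 bit lines while the encoded stream toggles 4, including the invert line. Real weight and activation streams would show far smaller gains, which is consistent with the abstract's point that encoding pays off only for data with high switching activity.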

    IndexMAC: A Custom RISC-V Vector Instruction to Accelerate Structured-Sparse Matrix Multiplications

    Structured sparsity has been proposed as an efficient way to prune the complexity of modern Machine Learning (ML) applications and to simplify the handling of sparse data in hardware. The acceleration of ML models - for both training and inference - relies primarily on equivalent matrix multiplications that can be executed efficiently on vector processors or custom matrix engines. The goal of this work is to incorporate the simplicity of structured sparsity into vector execution, thereby accelerating the corresponding matrix multiplications. Toward this objective, a new vector index-multiply-accumulate instruction is proposed, which enables the implementation of low-cost indirect reads from the vector register file. This reduces unnecessary memory traffic and increases data locality. The proposed new instruction was integrated in a decoupled RISC-V vector processor with negligible hardware cost. Extensive evaluation demonstrates significant speedups of 1.80x-2.14x, as compared to state-of-the-art vectorized kernels, when executing layers of varying sparsity from state-of-the-art Convolutional Neural Networks (CNNs). Comment: DATE 202
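
The idea behind an index-multiply-accumulate over structured-sparse data can be sketched in plain Python. The sketch assumes an N:M pattern (at most `n` non-zeros in every block of `m` consecutive elements of a row of A) and stores each non-zero with its column index; the index then selects which row of B to read, loosely mimicking an indirect read. The function names and the 2:4 default are assumptions for illustration only; this is not the proposed RISC-V instruction or its kernel.

```python
def compress_row(row, n=2, m=4):
    """Compress one row into (value, column_index) pairs.

    Raises ValueError if any block of m elements holds more than n
    non-zeros, i.e. if the row violates the assumed N:M pattern.
    """
    pairs = []
    for start in range(0, len(row), m):
        block = [(v, start + j)
                 for j, v in enumerate(row[start:start + m]) if v != 0]
        if len(block) > n:
            raise ValueError(f"block at column {start} violates {n}:{m} sparsity")
        pairs.extend(block)
    return pairs


def sparse_matmul(a_rows, b, n=2, m=4):
    """Compute A @ B, touching only the stored non-zeros of A."""
    cols = len(b[0])
    out = []
    for row in a_rows:
        acc = [0] * cols
        for val, k in compress_row(row, n, m):
            b_row = b[k]  # indirect read: the stored index selects the row of B
            for j in range(cols):
                acc[j] += val * b_row[j]  # multiply-accumulate
        out.append(acc)
    return out


A = [[1, 0, 0, 2],
     [0, 3, 0, 0]]
B = [[1, 2],
     [3, 4],
     [5, 6],
     [7, 8]]
C = sparse_matmul(A, B)
```

Here `C` equals the dense product `[[15, 18], [9, 12]]`, but rows of B that meet a zero in A are never read, which is the source of the reduced memory traffic the abstract describes.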

    ArrayFlex: A Systolic Array Architecture with Configurable Transparent Pipelining

    Convolutional Neural Networks (CNNs) are the state-of-the-art solution for many deep learning applications. For maximum scalability, their computation should combine high performance and energy efficiency. In practice, the convolutions of each CNN layer are mapped to a matrix multiplication that includes all input features and kernels of each layer and is computed using a systolic array. In this work, we focus on the design of a systolic array with a configurable pipeline, with the goal of selecting an optimal pipeline configuration for each CNN layer. The proposed systolic array, called ArrayFlex, can operate in normal or in shallow pipeline mode, thus balancing the execution time in cycles and the operating clock frequency. By selecting the appropriate pipeline configuration per CNN layer, ArrayFlex reduces the inference latency of state-of-the-art CNNs by 11%, on average, as compared to a traditional fixed-pipeline systolic array. Most importantly, this result is achieved while using 13%-23% less power, for the same applications, thus offering a combined energy-delay-product efficiency between 1.4x and 1.8x. Comment: DATE 202

    Real-time ECG Monitoring using Compressive sensing on a Heterogeneous Multicore Edge-Device

    The file attached to this record is the author's final peer-reviewed version. The publisher's final version can be found by following the DOI link. In typical ambulatory health monitoring systems, wearable medical sensors are deployed on the human body to continuously collect and transmit physiological signals to a nearby gateway that forwards the measured data to a cloud-based healthcare platform. However, this model often fails to meet the strict requirements of healthcare systems. Wearable medical sensors are very limited in terms of battery lifetime; in addition, the system's reliance on the cloud makes it vulnerable to connectivity and latency issues. Compressive sensing (CS) theory has been widely deployed in electrocardiogram (ECG) monitoring applications to optimize the power consumption of wearable sensors. The solution proposed in this paper aims to tackle these limitations by empowering a gateway-centric connected health solution, where the most power-consuming tasks are performed locally on a multicore processor. This paper explores the efficiency of real-time CS-based recovery of ECG signals on an IoT gateway embedded with ARM's big.LITTLE™ multicore for different signal dimensions and allocated computational resources. Experimental results show that the gateway is able to reconstruct ECG signals in real time. Moreover, they demonstrate that using a high number of cores speeds up execution and further optimizes energy consumption. The paper identifies the configurations of resource allocation that provide the best performance. The paper concludes that multicore processors have the computational capacity and energy efficiency to promote gateway-centric solutions rather than cloud-centric platforms.

    Functional trait variation among and within species and plant functional types in mountainous Mediterranean forests

    Plant structural and biochemical traits are frequently used to characterise the life history of plants. Although some common patterns of trait covariation have been identified, recent studies suggest these patterns of covariation may differ with growing location and/or plant functional type (PFT). Mediterranean forest tree/shrub species are often divided into three PFTs based on their leaf habit and form, being classified as either needleleaf evergreen (Ne), broadleaf evergreen (Be), or broadleaf deciduous (Bd). Working across 61 mountainous Mediterranean forest sites of contrasting climate and soil type, we sampled and analysed 626 individuals in order to evaluate differences in key foliage trait covariation as modulated by growing conditions both within and between the Ne, Be, and Bd functional types. We found significant differences between PFTs for most traits. When considered across PFTs and by ignoring intraspecific variation, three independent functional dimensions supporting the Leaf-Height-Seed framework were identified. Some traits illustrated a common scaling relationship across and within PFTs, but others scaled differently when considered across PFTs or even within PFTs. For most traits much of the observed variation was attributable to PFT identity and not to growing location, although for some traits there was a strong environmental component and considerable intraspecific and residual variation. Nevertheless, environmental conditions as related to water availability during the dry season, and to a lesser extent to soil nutrient status and soil texture, clearly influenced trait values. When compared across species, about half of the trait-environment relationships were species-specific. Our study highlights the importance of the ecological scale within which trait covariation is considered and suggests that at regional to local scales, common trait-by-trait scaling relationships should be treated with caution. PFT definitions by themselves can potentially be an important predictor variable when inferring one trait from another. These findings have important implications for local scale dynamic vegetation models.

    Ambipolar charge injection and transport in a single pentacene monolayer island

    Electrons and holes are locally injected in a single pentacene monolayer island. The two-dimensional distribution and concentration of the injected carriers are measured by electrical force microscopy. In crystalline monolayer islands, both carriers are delocalized over the whole island. On disordered monolayers, carriers stay localized at their injection point. These results provide insight into the electronic properties, at the nanometer scale, of organic monolayers governing the performance of organic transistors and molecular devices. Comment: To be published in Nano Letter

    Deregulation of methylation of transcribed-ultra conserved regions in colorectal cancer and their value for detection of adenomas and adenocarcinomas

    Expression of Transcribed Ultraconserved Regions (T-UCRs) is often deregulated in cancer. The present study assesses the expression and methylation of three T-UCRs (Uc160, Uc283 and Uc346) in colorectal cancer (CRC) and explores the potential of T-UCR methylation in circulating DNA for the detection of adenomas and adenocarcinomas. Expression levels of Uc160, Uc283 and Uc346 were lower in neoplastic tissues from 64 CRC patients (statistically significant for Uc160, p<0.001), compared to non-malignant tissues, while methylation levels displayed the inverse pattern (p<0.001, p=0.001 and p=0.004 respectively). In colon cancer cell lines, overexpression of Uc160 and Uc346 led to increased proliferation and migration rates. Methylation levels of Uc160 in plasma of 50 CRC patients, 59 adenoma patients, 40 healthy subjects and 12 patients with colon inflammation or diverticulosis predicted the presence of CRC with 35% sensitivity and 89% specificity (p=0.016), while methylation levels of the combination of all three T-UCRs resulted in 45% sensitivity and 74.3% specificity (p=0.013). In conclusion, the expression and methylation status of the studied T-UCRs are deregulated in CRC, while Uc160 and Uc346 appear to have a complicated role in CRC progression. Moreover, their methylation status appears to be a promising non-invasive screening test for CRC, provided that the sensitivity of the assay is improved.