
    ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design

    Vision Transformers (ViTs) have achieved state-of-the-art performance on various vision tasks. However, ViTs' self-attention module is still arguably a major bottleneck, limiting their achievable hardware efficiency. Meanwhile, existing accelerators dedicated to NLP Transformers are not optimal for ViTs, because the two differ substantially: ViTs have a relatively fixed number of input tokens, whose attention maps can be pruned by up to 90% even with fixed sparse patterns, while NLP Transformers must handle input sequences of varying numbers of tokens and rely on on-the-fly predictions of dynamic sparse attention patterns for each input to achieve a decent sparsity (e.g., >=50%). To this end, we propose a dedicated algorithm and accelerator co-design framework dubbed ViTCoD for accelerating ViTs. Specifically, on the algorithm level, ViTCoD prunes and polarizes the attention maps to have either denser or sparser fixed patterns, regularizing two levels of workloads without hurting accuracy; this largely reduces the attention computations while leaving room for alleviating the remaining dominant data movements. On top of that, we further integrate a lightweight and learnable auto-encoder module to trade the dominant high-cost data movements for lower-cost computations. On the hardware level, we develop a dedicated accelerator that simultaneously coordinates the enforced denser/sparser workloads and the encoder/decoder engines for boosted hardware utilization. Extensive experiments and ablation studies validate that ViTCoD largely reduces the dominant data movement costs, achieving speedups of up to 235.3x, 142.9x, and 86.0x over general computing platforms (CPUs, EdgeGPUs, and GPUs, respectively), and of 10.1x and 6.8x over the prior-art Transformer accelerators SpAtten and Sanger, under an attention sparsity of 90%.
    Comment: Accepted to HPCA 2023
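    The attention-polarization idea can be illustrated with a minimal NumPy sketch: a few importance-ranked token columns form the fixed denser workload, and each row keeps only its strongest remaining entries as the sparser workload. This is an illustrative heuristic under assumed ratios, not the authors' actual pruning procedure or accelerator mapping.

```python
import numpy as np

def polarize_attention(attn, dense_ratio=0.05, sparsity=0.90):
    """Split an attention map into a fixed denser region and a sparser one.

    attn: (tokens, tokens) pre-softmax attention scores.
    Returns a boolean mask where a few globally important token columns are
    kept dense and every other row keeps only its strongest entries.
    """
    n = attn.shape[0]
    # Denser workload: keep whole columns for the most-attended tokens.
    col_importance = np.abs(attn).sum(axis=0)
    n_dense = max(1, int(n * dense_ratio))
    dense_cols = np.argsort(col_importance)[-n_dense:]
    mask = np.zeros_like(attn, dtype=bool)
    mask[:, dense_cols] = True
    # Sparser workload: per row, keep the top entries among remaining columns.
    keep_per_row = max(1, int(n * (1 - sparsity)))
    remaining = np.setdiff1d(np.arange(n), dense_cols)
    scores = np.abs(attn[:, remaining])
    top = np.argsort(scores, axis=1)[:, -keep_per_row:]
    rows = np.repeat(np.arange(n), keep_per_row)
    mask[rows, remaining[top.ravel()]] = True
    return mask

attn = np.random.randn(196, 196)          # e.g., 14x14 ViT patch tokens
mask = polarize_attention(attn)
print(f"kept {mask.mean():.1%} of attention entries")
```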

    i-FlatCam: A 253 FPS, 91.49 μJ/Frame Ultra-Compact Intelligent Lensless Camera for Real-Time and Efficient Eye Tracking in VR/AR

    We present a first-of-its-kind ultra-compact intelligent camera system, dubbed i-FlatCam, comprising a lensless camera and a computational (Comp.) chip. It highlights (1) a predict-then-focus eye tracking pipeline for boosted efficiency without compromising accuracy, (2) a unified compression scheme for single-chip processing and an improved frame rate (FPS), and (3) a dedicated intra-channel reuse design for depth-wise convolutional layers (DW-CONV) to increase utilization. i-FlatCam demonstrates the first eye tracking pipeline with a lensless camera and achieves 3.16-degree accuracy, 253 FPS, 91.49 μJ/frame, and a 6.7 mm x 8.9 mm x 1.2 mm camera form factor, paving the way for next-generation Augmented Reality (AR) and Virtual Reality (VR) devices.
    Comment: Accepted by VLSI 2022
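    For context on point (3): a depth-wise convolution filters each channel independently, so every output in a channel reuses that channel's single small kernel, which is the intra-channel reuse such a design can exploit. The naive NumPy sketch below shows only that loop structure; the layer shapes are assumptions, and this is not the chip's actual dataflow.

```python
import numpy as np

def depthwise_conv2d(x, kernels, stride=1):
    """Naive depth-wise convolution: no cross-channel accumulation, so all
    multiply-accumulates within a channel reuse the same kernel weights.

    x: (C, H, W) input; kernels: (C, K, K), one filter per channel.
    """
    C, H, W = x.shape
    _, K, _ = kernels.shape
    out_h = (H - K) // stride + 1
    out_w = (W - K) // stride + 1
    out = np.zeros((C, out_h, out_w))
    for c in range(C):                      # each channel handled alone
        for i in range(out_h):
            for j in range(out_w):
                patch = x[c, i*stride:i*stride+K, j*stride:j*stride+K]
                out[c, i, j] = np.sum(patch * kernels[c])
    return out

x = np.random.rand(8, 16, 16)
k = np.random.rand(8, 3, 3)
print(depthwise_conv2d(x, k).shape)   # (8, 14, 14)
```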

    The clinical outcomes of laparoscopic proximal gastrectomy with double-tract reconstruction versus tube-like stomach reconstruction in patients with adenocarcinoma of the esophagogastric junction based on propensity score-matching: a multicenter cohort study

    Purpose: Laparoscopic proximal gastrectomy with double-tract reconstruction (LPG-DTR) and laparoscopic proximal gastrectomy with tube-like stomach reconstruction (LPG-TLR) are both function-preserving procedures performed to treat adenocarcinoma of the esophagogastric junction (AEG). However, there is no clinical consensus on the choice of digestive tract reconstruction after proximal gastrectomy, and the best reconstruction method remains controversial. This study aimed to compare the clinical outcomes of LPG-DTR and LPG-TLR to inform the choice of surgical modality for AEG.
    Methods: This was a multicenter, retrospective cohort study. We collected clinicopathological and follow-up data on consecutive patients diagnosed with AEG from January 2016 to June 2021 at five medical centers. Patients who underwent LPG-DTR or LPG-TLR after tumor resection were included. Propensity score matching (PSM) was performed to balance baseline variables that might affect the study outcomes. Quality of life (QOL) was evaluated using the Visick grade.
    Results: A total of 124 eligible consecutive cases were included. After PSM, 55 patients from each group were included in the analysis. There was no statistically significant difference between the two groups in operation time, intraoperative blood loss, days of postoperative abdominal drainage tube placement, postoperative hospitalization days, total hospitalization cost, total number of lymph nodes cleared, or number of positive lymph nodes (P > 0.05). There was a statistically significant difference between the two groups in time to first flatus after surgery and postoperative soft-food recovery time (P < 0.05). Regarding nutritional status, weight at 1 year after surgery was better maintained in the LPG-DTR group than in the LPG-TLR group (P < 0.05). There was no significant difference in Visick grade between the two groups (P > 0.05).
    Conclusion: The anti-reflux effect and quality of life of LPG-DTR for AEG were comparable to those of LPG-TLR. Compared with LPG-TLR, LPG-DTR provided better nutritional status for patients with AEG, making it a superior reconstruction method after proximal gastrectomy.
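    To illustrate the matching step, here is a minimal Python sketch of 1:1 nearest-neighbor propensity score matching on synthetic data. The covariates, greedy matching without a caliper, and sample sizes are assumptions for illustration; the study's exact PSM specification is not stated in the abstract.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Hypothetical baseline covariates (e.g., age, BMI, stage) and a binary
# group label (1 = LPG-DTR, 0 = LPG-TLR); all values are synthetic.
X = rng.normal(size=(124, 3))
group = rng.integers(0, 2, size=124)

# Propensity score: estimated probability of receiving LPG-DTR given covariates.
ps = LogisticRegression().fit(X, group).predict_proba(X)[:, 1]

# Greedy 1:1 nearest-neighbor matching on the propensity score.
treated = np.flatnonzero(group == 1)
control = list(np.flatnonzero(group == 0))
pairs = []
for t in treated:
    if not control:
        break
    j = min(control, key=lambda c: abs(ps[t] - ps[c]))  # closest score
    pairs.append((t, j))
    control.remove(j)                                   # match without replacement
print(f"{len(pairs)} matched pairs")
```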

    Addressing Beacon Re-Identification Attacks: Quantification and Mitigation of Privacy Risks

    The Global Alliance for Genomics and Health (GA4GH) created the Beacon Project as a means of testing the willingness of data holders to share genetic data in the simplest technical context: a query for the presence of a specified nucleotide at a given position within a chromosome. Each participating site (or "beacon") is responsible for assuring that genomic data are exposed through the Beacon service only with the permission of the individual to whom the data pertain, and in accordance with GA4GH policy and standards. While recognizing the inference risks associated with large-scale data aggregation, and the fact that some beacons contain sensitive phenotypic associations that increase privacy risk, the GA4GH judged the risk of re-identification based on the binary yes/no allele-presence query responses to be acceptable. However, recent work demonstrated that, given a beacon with specific characteristics (including a relatively small sample size and an adversary who possesses an individual's whole-genome sequence), the individual's membership in a beacon can be inferred through repeated queries for variants present in the individual's genome. In this paper, we propose three practical strategies for reducing re-identification risks in beacons. The first two strategies manipulate the beacon so that the presence of rare alleles is obscured; the third strategy budgets the number of accesses per user for each individual genome. Using a beacon containing data from the 1000 Genomes Project, we demonstrate that the proposed strategies can effectively reduce re-identification risk in beacon-like datasets.
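    The third, access-budgeting strategy can be sketched as a toy beacon that stops letting a given user's queries draw on a given genome once a per-(user, genome) budget is exhausted. The class, API, and budget accounting below are assumptions for illustration, not the paper's exact mechanism.

```python
from collections import defaultdict

class BudgetedBeacon:
    """Toy beacon answering allele-presence queries, with a cap on how many
    answers each user can derive from any single individual's genome
    (a simplified version of the per-user access-budget strategy)."""

    def __init__(self, genomes, budget=100):
        # genomes: {individual_id: set of (chrom, pos, allele) variants}
        self.genomes = genomes
        self.budget = budget
        self.spent = defaultdict(int)   # (user, individual) -> queries charged

    def query(self, user, chrom, pos, allele):
        hit = False
        for ind, variants in self.genomes.items():
            if (chrom, pos, allele) in variants:
                key = (user, ind)
                if self.spent[key] >= self.budget:
                    continue            # this genome no longer contributes
                self.spent[key] += 1
                hit = True
        return hit

beacon = BudgetedBeacon({"A": {("1", 12345, "T")}}, budget=2)
print([beacon.query("u1", "1", 12345, "T") for _ in range(3)])
# [True, True, False]: answers stop leaking once the budget is spent
```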

    A Secure Alignment Algorithm for Mapping Short Reads to Human Genome

    Elastic and inexpensive computing resources such as clouds have been recognized as a useful solution for analyzing massive human genomic data (e.g., acquired using next-generation sequencers) in biomedical research. However, outsourcing human genome computation to public or commercial clouds has been hindered by privacy concerns: even a small number of human genome sequences contain enough information to identify the donor of the genomic data. This issue cannot be directly addressed by existing security and cryptographic techniques (such as homomorphic encryption), because they are too heavyweight to carry out practical genome computation tasks on massive data. In this article, we present a secure algorithm that accomplishes read mapping, one of the most basic tasks in human genomic data analysis, based on a hybrid cloud computing model. Compared with existing approaches, our algorithm delegates most computation to the public cloud, performing only encryption and decryption on the private cloud, and thus makes maximum use of the public cloud's computing resources. Furthermore, our algorithm reports results similar to those of non-secure read mapping algorithms, including the alignment between reads and the reference genome, which can be used directly in downstream analyses such as the inference of genomic variations. We implemented the algorithm in C++ and Python on a hybrid cloud system in which the public cloud runs an Apache Spark system.
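    A minimal sketch of the hybrid-cloud division of labor: the private cloud replaces k-mers with keyed hashes (a stand-in for the paper's encryption step), and the public cloud performs exact-match seed lookups over opaque tokens. Real read mapping must also handle mismatches and guard against k-mer frequency leakage, so this only illustrates the public/private split; the seed length and key are assumptions.

```python
import hashlib, hmac

SECRET = b"private-cloud-key"   # held only by the private cloud
K = 12                          # seed (k-mer) length; an assumed parameter

def keyed_seed(kmer: str) -> bytes:
    # Private cloud: replace raw sequence with a keyed hash so the
    # public cloud can compare seeds without seeing nucleotides.
    return hmac.new(SECRET, kmer.encode(), hashlib.sha256).digest()

def index_reference(ref: str) -> dict:
    index = {}
    for i in range(len(ref) - K + 1):
        index.setdefault(keyed_seed(ref[i:i+K]), []).append(i)
    return index

def map_read(read: str, index: dict) -> list:
    # Public cloud: pure hash-table lookups over opaque tokens.
    hits = []
    for i in range(len(read) - K + 1):
        for pos in index.get(keyed_seed(read[i:i+K]), []):
            hits.append(pos - i)   # candidate alignment start
    return sorted(set(hits))

ref = "ACGTACGTTTGACCAGTACGGATTACAGT"
print(map_read("GACCAGTACGGA", index_reference(ref)))   # [10]
```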

    A Watershed-Segmentation-Based Improved Algorithm for Extracting Cultivated Land Boundaries

    To accurately extract cultivated land boundaries from high-resolution remote sensing imagery, an improved watershed segmentation algorithm is proposed that combines pre- and post-segmentation improvements. Image contrast enhancement serves as the pre-improvement, while the color distance in the Commission Internationale de l'Éclairage (CIE) color spaces, including Lab and Luv, serves as the regional similarity measure for region merging as the post-improvement. The area relative error criterion (δA), the pixel quantity error criterion (δP), and the consistency criterion (Khat) were used to evaluate segmentation accuracy. Region merging in Red-Green-Blue (RGB) color space served as the baseline against which the proposed algorithm was compared for extracting cultivated land boundaries. Validation experiments were performed on a subset of a Chinese Gaofen-2 (GF-2) remote sensing image covering 0.12 km2. The results showed the following: (1) The contrast-enhanced image yielded an obvious gain in segmentation quality and time efficiency with the improved algorithm; time efficiency increased by 10.31%, 60.00%, and 40.28% in the RGB, Lab, and Luv color spaces, respectively. (2) The optimal segmentation and merging scale parameters in the RGB, Lab, and Luv color spaces were a minimum area C of 2000, 1900, and 2000, and a color difference D of 1000, 40, and 40, respectively. (3) The algorithm improved the time efficiency of cultivated land boundary extraction in the Lab and Luv color spaces by 35.16% and 29.58%, respectively, compared with the RGB color space. Relative to the RGB color space, the extraction accuracy measured by δA, δP, and Khat improved by 76.92%, 62.01%, and 16.83% in the Lab color space, and by 55.79%, 49.67%, and 13.42% in the Luv color space. (4) In terms of visual comparison, time efficiency, and segmentation accuracy, the overall extraction performance of the proposed algorithm was clearly better than that of the RGB color space-based algorithm, and the established accuracy evaluation indicators proved consistent with the visual evaluation. (5) The proposed method showed satisfactory transferability on a wider test area covering 1 km2. In summary, the proposed method applies image contrast enhancement and then performs region merging in CIE color space on the simulated-immersion watershed segmentation results. It is a useful extension of the watershed segmentation algorithm for extracting cultivated land boundaries and provides a reference for further enhancing the watershed algorithm.
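    A condensed scikit-image sketch of this pipeline (contrast enhancement, watershed over-segmentation, then merging adjacent regions by mean CIE-Lab color distance) follows. It assumes scikit-image >= 0.20, where the region adjacency graph helpers live in skimage.graph (earlier versions expose them under skimage.future.graph); the sample image, marker count, and merge threshold are illustrative, not the paper's tuned parameters.

```python
import numpy as np
from skimage import color, data, exposure, filters, graph, segmentation

# 1) Pre-improvement: contrast enhancement (adaptive histogram equalization).
rgb = data.astronaut()                      # stand-in for a GF-2 image tile
rgb = exposure.equalize_adapthist(rgb)

# 2) Simulated-immersion watershed on the gradient of the grayscale image,
#    deliberately over-segmented so that merging has something to do.
gray = color.rgb2gray(rgb)
gradient = filters.sobel(gray)
labels = segmentation.watershed(gradient, markers=800, compactness=0.001)

# 3) Post-improvement: merge adjacent regions whose mean CIE-Lab colors are
#    close (Euclidean distance in Lab approximates perceived color difference).
lab = color.rgb2lab(rgb)
rag = graph.rag_mean_color(lab, labels)
merged = graph.cut_threshold(labels, rag, thresh=10)   # threshold is assumed
print(labels.max(), "regions ->", len(np.unique(merged)))
```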

    Efficient Electrocatalytic Ammonia Synthesis via Theoretical Screening of Titanate Nanosheet-Supported Single-Atom Catalysts

    The electrocatalytic nitrogen reduction reaction (NRR) for synthesizing ammonia holds promise as an alternative to the traditional high-energy-consuming Haber-Bosch method. Rational and accurate catalyst design is needed to overcome the challenge of activating N2 and to suppress the competitive hydrogen evolution reaction (HER). Single-atom catalysts have garnered widespread attention due to their 100% atom utilization efficiency and unique catalytic performance. In this context, we constructed theoretical models of metal single-atom catalysts supported on titanate nanosheets (M-TiNS). Initially, density functional theory (DFT) was employed to screen 12 single-atom catalysts for NRR- and HER-related barriers, leading to the identification of the theoretically optimal NRR catalyst, Ru-TiNS. Subsequently, the Ru-TiNS single-atom catalyst was successfully synthesized and exhibited excellent NRR performance, with the highest NH3 yield rate reaching 15.19 μmol mg_cat⁻¹ h⁻¹ and a Faradaic efficiency (FE) of 15.3%. The combination of experimental results and theoretical calculations demonstrated the efficient catalytic ability of the Ru sites, validating the effectiveness of the constructed theoretical screening process and providing a theoretical foundation for the design of efficient NRR catalysts.
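    The screening logic amounts to filtering candidates on two DFT-computed quantities: a low barrier for the potential-limiting NRR step and weak hydrogen adsorption (so HER is suppressed). The sketch below uses placeholder energies and cutoffs, not the paper's DFT values or its actual selection criteria.

```python
# Hypothetical screening table: (NRR potential-limiting step barrier in eV,
# H adsorption free energy ΔG_H* in eV) for each M-TiNS model.
# All numbers are placeholders for illustration only.
candidates = {
    "Fe": (1.10, -0.30), "Co": (0.95, -0.25), "Ni": (1.30, -0.10),
    "Mo": (0.80, -0.45), "Ru": (0.55,  0.15), "Rh": (0.90,  0.05),
}

def screen(cands, nrr_max=0.7, her_min=0.0):
    """Keep metals whose NRR barrier is low AND whose H adsorption is weak
    (positive ΔG_H*), i.e., NRR is feasible while HER is suppressed."""
    return [m for m, (dg_nrr, dg_h) in cands.items()
            if dg_nrr <= nrr_max and dg_h >= her_min]

print(screen(candidates))   # ['Ru'] with these placeholder numbers
```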