
    Proteus: Simulating the Performance of Distributed DNN Training

    DNN models are becoming increasingly larger to achieve unprecedented accuracy, and the accompanying increased computation and memory requirements necessitate the employment of massive clusters and elaborate parallelization strategies to accelerate DNN training. In order to better optimize the performance and analyze the cost, it is indispensable to model the training throughput of distributed DNN training. However, complex parallelization strategies and the resulting complex runtime behaviors make it challenging to construct an accurate performance model. In this paper, we present Proteus, the first standalone simulator to model the performance of complex parallelization strategies through simulation execution. Proteus first models complex parallelization strategies with a unified representation named Strategy Tree. Then, it compiles the strategy tree into a distributed execution graph and simulates the complex runtime behaviors, comp-comm overlap and bandwidth sharing, with a Hierarchical Topo-Aware Executor (HTAE). We finally evaluate Proteus across a wide variety of DNNs on three hardware configurations. Experimental results show that Proteus achieves 3.0% average prediction error and preserves order for training throughput of various parallelization strategies. Compared to state-of-the-art approaches, Proteus reduces prediction error by up to 133.8%.

    Optimizing Video Object Detection via a Scale-Time Lattice

    High-performance object detection relies on expensive convolutional networks to compute features, often leading to significant challenges in applications, e.g. those that require detecting objects from video streams in real time. The key to this problem is to trade accuracy for efficiency in an effective way, i.e. reducing the computing cost while maintaining competitive performance. To seek a good balance, previous efforts usually focus on optimizing the model architectures. This paper explores an alternative approach, that is, to reallocate the computation over a scale-time space. The basic idea is to perform expensive detection sparsely and propagate the results across both scales and time with substantially cheaper networks, by exploiting the strong correlations among them. Specifically, we present a unified framework that integrates detection, temporal propagation, and across-scale refinement on a Scale-Time Lattice. On this framework, one can explore various strategies to balance performance and cost. Taking advantage of this flexibility, we further develop an adaptive scheme with the detector invoked on demand and thus obtain an improved tradeoff. On the ImageNet VID dataset, the proposed method can achieve a competitive mAP of 79.6% at 20 fps, or 79.0% at 62 fps as a performance/speed tradeoff. Comment: Accepted to CVPR 2018. Project page: http://mmlab.ie.cuhk.edu.hk/projects/ST-Lattice

    Quantum Image Processing and Its Application to Edge Detection: Theory and Experiment

    Processing of digital images is continuously gaining in volume and relevance, with concomitant demands on data storage, transmission and processing power. Encoding the image information in quantum-mechanical systems instead of classical ones and replacing classical with quantum information processing may alleviate some of these challenges. By encoding and processing the image information in quantum-mechanical systems, we here demonstrate the framework of quantum image processing, where a pure quantum state encodes the image information: we encode the pixel values in the probability amplitudes and the pixel positions in the computational basis states. Our quantum image representation reduces the required number of qubits compared to existing implementations, and we present image processing algorithms that provide exponential speed-up over their classical counterparts. For the commonly used task of detecting the edge of an image, we propose and implement a quantum algorithm that completes the task with only one single-qubit operation, independent of the size of the image. This demonstrates the potential of quantum image processing for highly efficient image and video processing in the big data era. Comment: 13 pages, including 9 figures and 5 appendixes
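The encoding described in this abstract (pixel values as probability amplitudes, pixel positions as basis states) can be illustrated with a small classical simulation. The sketch below is not the authors' implementation: it assumes the single-qubit operation is a Hadamard gate on the least-significant qubit, and the function names and the 8-pixel test row are hypothetical. Under that assumption, the odd-indexed output amplitudes are proportional to the differences between neighbouring pixels in each pair (2j, 2j+1), so a nonzero value marks an edge inside that pair.

```python
import math

def amplitude_encode(pixels):
    """Normalize pixel values into unit-norm amplitudes (hypothetical helper)."""
    norm = math.sqrt(sum(p * p for p in pixels))
    if norm == 0:
        raise ValueError("image must contain at least one nonzero pixel")
    return [p / norm for p in pixels]

def hadamard_on_last_qubit(amplitudes):
    """Apply a Hadamard gate to the least-significant qubit of a state vector.

    The vector length must be even (a power of two for a real qubit register).
    """
    s = 1 / math.sqrt(2)
    out = []
    for j in range(0, len(amplitudes), 2):
        a, b = amplitudes[j], amplitudes[j + 1]
        out.extend([s * (a + b), s * (a - b)])  # sum term, difference term
    return out

# Assumed example: one row of 8 pixels (3 qubits) with a step between index 2 and 3.
pixels = [1, 1, 1, 5, 5, 5, 5, 5]
state = hadamard_on_last_qubit(amplitude_encode(pixels))

# Odd-indexed amplitudes hold the scaled differences for pairs (0,1), (2,3), (4,5), (6,7).
differences = state[1::2]
```

This sketch only compares pixels within even-aligned pairs; a boundary falling between indices 2j+1 and 2j+2 would need an additional shifted pass, which is outside what the abstract describes.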

    Rapid assessment of T-cell receptor specificity of the immune repertoire

    Accurate assessment of T-cell-receptor (TCR)–antigen specificity across the whole immune repertoire lies at the heart of improved cancer immunotherapy, but predictive models capable of high-throughput assessment of TCR–peptide pairs are lacking. Recent advances in deep sequencing and crystallography have enriched the data available for studying TCR–peptide systems. Here, we introduce RACER, a pairwise energy model capable of rapid assessment of TCR–peptide affinity for entire immune repertoires. RACER applies supervised machine learning to efficiently and accurately resolve strong TCR–peptide binding pairs from weak ones. The trained parameters further enable a physical interpretation of interacting patterns encoded in each TCR–peptide system. When applied to simulate thymic selection of a major-histocompatibility-complex (MHC)-restricted T-cell repertoire, RACER accurately estimates recognition rates for tumor-associated neoantigens and foreign peptides, thus demonstrating its utility in helping address the computational challenge of reliably identifying properties of tumor antigen-specific T-cells at the level of an individual patient’s immune repertoire

    Active YAP promotes pancreatic cancer cell motility, invasion and tumorigenesis in a mitotic phosphorylation-dependent manner through LPAR3.

    The transcriptional co-activator Yes-associated protein, YAP, is a main effector in the Hippo tumor suppressor pathway. We recently defined a mechanism for positive regulation of YAP through CDK1-mediated mitotic phosphorylation. Here, we show that active YAP promotes pancreatic cancer cell migration, invasion and anchorage-independent growth in a mitotic phosphorylation-dependent manner. Mitotic phosphorylation is essential for YAP-driven tumorigenesis in animals. YAP reduction significantly impairs cell migration and invasion. Immunohistochemistry shows significant upregulation and nuclear localization of YAP in metastases when compared with primary tumors and normal tissue in humans. Mitotic phosphorylation of YAP controls a unique transcriptional program in pancreatic cells. Expression profiles reveal LPAR3 (lysophosphatidic acid receptor 3) as a mediator for mitotic phosphorylation-driven pancreatic cell motility and invasion. Together, this work identifies YAP as a novel regulator of pancreatic cancer cell motility, invasion and metastasis, and as a potential therapeutic target for invasive pancreatic cancer.

    Clinical Study Efficacy of Combined Laparoscopic and Hysteroscopic Repair of Post-Cesarean Section Uterine Diverticulum: A Retrospective Analysis

    Background. Diverticulum, one of the long-term sequelae of cesarean section, can cause abnormal uterine bleeding and increase the risk of uterine scar rupture. In this study, we aimed to evaluate the efficacy of combined laparoscopic and hysteroscopic repair, a newly emerging method for treating post-cesarean section uterine scar diverticulum. Methods. Data relating to 40 patients with post-cesarean section uterine diverticulum who underwent combined laparoscopic and hysteroscopic repair were retrospectively analyzed. Preoperative clinical manifestations, size of uterine defects, thickness of the lower uterine segment (LUS), and duration of menstruation were compared with follow-up findings at 1, 3, and 6 months after surgery. Results. The average preoperative length and width of uterine diverticula and thickness of the lower uterine segment were recorded and analyzed. The average durations of menstruation at 1, 3, and 6 months after surgery were each significantly shorter than the preoperative duration (P < 0.05). At 6 months after surgery, the overall improvement rate was 90% (36/40). Three patients (3/40 = 7.5%) showed partial improvement, and 1/40 (2.5%) was lost to follow-up. Conclusions. Our findings showed that combined treatment with laparoscopy and hysteroscopy was an effective method for the repair of post-cesarean section uterine diverticulum.