304 research outputs found

    Multigrain shared memory

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (p. 197-203). By Donald Yeung.

    Multigrain Affinity for Heterogeneous Work Stealing

    In a parallel computing context, peak performance is hard to reach with irregular applications such as sparse linear algebra operations. It requires dynamic adjustments to automatically balance the workload between several processors. The problem becomes even more complicated when an architecture contains processing units with radically different computing capabilities. We present a hierarchical scheduling scheme designed to harness several CPUs and a GPU. It is built on a two-level work stealing mechanism tightly coupled to a software-managed cache. We show that our approach is well suited to dynamically control heterogeneous architectures, while taking advantage of a reduction of data transfers.
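The two-level work stealing idea in this abstract can be illustrated with a minimal, single-threaded sketch. It is not the authors' scheduler: the `Worker` class, the group names, the victim-selection rule (longest queue) and the round-robin simulation loop are all assumptions made for illustration, and the software-managed cache mentioned in the abstract is not modelled.

```python
from collections import deque


class Worker:
    """Hypothetical worker with its own task deque and a group label."""

    def __init__(self, name, group):
        self.name = name
        self.group = group      # e.g. "cpu" or "gpu" (illustrative labels)
        self.tasks = deque()    # owner pops from the right end
        self.done = []          # tasks this worker has executed


def steal(thief, workers):
    """Level 1: steal inside the thief's group. Level 2: steal from other groups."""
    local = [w for w in workers
             if w is not thief and w.group == thief.group and w.tasks]
    remote = [w for w in workers if w.group != thief.group and w.tasks]
    for victims in (local, remote):
        if victims:
            # Assumed policy: pick the victim with the longest queue.
            victim = max(victims, key=lambda w: len(w.tasks))
            # Thieves take from the left end, opposite to the owner.
            return victim.tasks.popleft()
    return None


def run(workers):
    """Round-robin simulation: each worker runs one task per round."""
    progress = True
    while progress:
        progress = False
        for w in workers:
            task = w.tasks.pop() if w.tasks else steal(w, workers)
            if task is not None:
                w.done.append(task)
                progress = True


workers = [Worker("cpu0", "cpu"), Worker("cpu1", "cpu"), Worker("gpu0", "gpu")]
workers[0].tasks.extend(range(9))   # all work starts on one worker
run(workers)
for w in workers:
    print(w.name, w.done)
```

Starting with all nine tasks on one worker shows the balancing effect: the idle CPU worker steals within its group, while the GPU worker has no local victims and falls back to stealing across groups.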

    Dynamic Multigrain Parallelization on the Cell Broadband Engine

    This paper addresses the problem of orchestrating and scheduling parallelism at multiple levels of granularity on heterogeneous multicore processors. We present policies and mechanisms for adaptive exploitation and scheduling of multiple layers of parallelism on the Cell Broadband Engine. Our policies combine event-driven task scheduling with malleable loop-level parallelism, which is exposed from the runtime system whenever task-level parallelism leaves cores idle. We present a runtime system for scheduling applications with layered parallelism on Cell and investigate its potential with RAxML, a computational biology application which infers large phylogenetic trees using the Maximum Likelihood (ML) method. Our experiments show that the Cell benefits significantly from dynamic parallelization methods that selectively exploit the layers of parallelism in the system in response to workload characteristics. Our runtime environment outperforms naive parallelization and scheduling based on MPI and Linux by up to a factor of 2.6. We are able to execute RAxML on one Cell four times faster than on a dual-processor system with Hyperthreaded Xeon processors, and 5-10% faster than on a single-processor system with a dual-core, quad-thread IBM Power5 processor.
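The policy described here, exposing loop-level parallelism only when task-level parallelism leaves cores idle, can be sketched as a simple allocation rule. This is an assumed toy model, not the paper's runtime: the function name `loop_workers_per_task` and the even-split rule are hypothetical, and the core count of 8 is chosen only for illustration.

```python
def loop_workers_per_task(num_cores, num_tasks):
    """Return how many cores each running task may use for its loops.

    Assumed rule: if there are at least as many tasks as cores, every
    running task gets one core (pure task-level parallelism). Otherwise
    the idle cores are shared out so tasks can parallelize their loops.
    """
    if num_cores <= 0 or num_tasks <= 0:
        return []
    running = min(num_tasks, num_cores)   # tasks beyond this must wait
    base, extra = divmod(num_cores, running)
    return [base + (1 if i < extra else 0) for i in range(running)]


print(loop_workers_per_task(8, 12))  # [1, 1, 1, 1, 1, 1, 1, 1]
print(loop_workers_per_task(8, 3))   # [3, 3, 2]
print(loop_workers_per_task(8, 1))   # [8]
```

Re-evaluating the allocation whenever a task starts or finishes is what would make the loop-level parallelism "malleable" in this toy model.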

    MultiGrain: a unified image embedding for classes and instances

    MultiGrain is a network architecture producing compact vector representations that are suited both for image classification and particular object retrieval. It builds on a standard classification trunk. The top of the network produces an embedding containing coarse and fine-grained information, so that images can be recognized by object class, by particular object, or as distorted copies of one another. Our joint training is simple: we minimize a cross-entropy loss for classification and a ranking loss that determines if two images are identical up to data augmentation, with no need for additional labels. A key component of MultiGrain is a pooling layer that takes advantage of high-resolution images with a network trained at a lower resolution. When fed to a linear classifier, the learned embeddings provide state-of-the-art classification accuracy. For instance, we obtain 79.4% top-1 accuracy with a ResNet-50 learned on ImageNet, which is a +1.8% absolute improvement over the AutoAugment method. When compared with the cosine similarity, the same embeddings perform on par with the state-of-the-art for image retrieval at moderate resolutions.
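The joint objective described in this abstract (a cross-entropy term plus a ranking term on image pairs) can be sketched in plain Python. This is an illustration under stated assumptions, not the MultiGrain implementation: the margin-based pair loss, the `margin` value and the mixing `weight` are stand-ins, since the abstract does not specify the exact ranking loss or how the two terms are balanced.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def pair_margin_loss(emb_a, emb_b, same, margin=0.5):
    """Assumed ranking term: pull augmented copies together, push others apart."""
    d = math.dist(l2_normalize(emb_a), l2_normalize(emb_b))
    if same:                       # two augmentations of the same image
        return d ** 2
    return max(0.0, margin - d) ** 2


def joint_loss(logits, label, emb_a, emb_b, same, weight=0.5):
    """Weighted sum of both terms; `weight` is a hypothetical hyperparameter."""
    return (weight * cross_entropy(logits, label)
            + (1.0 - weight) * pair_margin_loss(emb_a, emb_b, same))


loss = joint_loss(logits=[2.0, 0.5, -1.0], label=0,
                  emb_a=[1.0, 0.0], emb_b=[0.9, 0.1], same=True)
print(loss)
```

The pair label `same` comes for free from data augmentation, which matches the abstract's point that no additional labels are needed.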

    Practical performance of image retrieval methods

    Image retrieval is an important category of machine vision which examines the distances and similarities between images. It has many use cases in archiving, object detection, localization and few-shot recognition. This thesis examines the problem of image retrieval in which a set of images is retrieved from a large-scale database based on similarity to a query image. The problem, its different aspects and its history are examined. The influence of recent developments in deep learning is also covered. We experiment with a few different types of image retrieval problems using some recent open-source methods and see how deep learning methods specialising in image retrieval perform better in cases where image contents are more important, while classical feature extraction works better on purely visual tasks. The best results on visual tasks reached at most two thirds accurate retrievals, while on the semantic task only one in two retrievals was accurate. This implies that there is still work to do for efficient image retrieval methods. (Finnish title: Kuvahaun menetelmien käytännön suorituskyky.)
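Retrieval by similarity to a query image, as described in this abstract, can be sketched as ranking database embeddings by cosine similarity. The embeddings, image names and the `retrieve` helper below are invented for illustration; the thesis's actual methods and feature extractors are not shown here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query, database, k=2):
    """Return the names of the k database items most similar to the query."""
    ranked = sorted(database.items(),
                    key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy embeddings (hypothetical values standing in for real image features).
database = {
    "cat_1": [0.9, 0.1, 0.0],
    "cat_2": [0.8, 0.2, 0.1],
    "car_1": [0.0, 0.1, 0.9],
}
top = retrieve([1.0, 0.0, 0.0], database, k=2)
print(top)  # ['cat_1', 'cat_2']
```

Retrieval accuracy of the kind quoted in the abstract would then be the fraction of queries whose returned items are judged relevant.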

    Exploiting Nested Parallelism on Heterogeneous Processors

    Heterogeneous computing systems have become common in modern processor architectures. These systems, such as those released by AMD, Intel, and Nvidia, include both CPU and GPU cores on a single die, with reduced communication overhead compared to their discrete predecessors. Currently, discrete CPU/GPU systems are limited, requiring larger, regular, highly-parallel workloads to overcome the communication costs of the system. Without the traditional communication delay assumed between GPUs and CPUs, we believe non-traditional workloads could be targeted for GPU execution. Specifically, this thesis focuses on the execution model of nested parallel workloads on heterogeneous systems. We have designed a simulation flow which utilizes widely used CPU and GPU simulators to model heterogeneous computing architectures. We then applied this simulator to non-traditional GPU workloads using different execution models. We have also proposed a new execution model for nested parallelism, allowing users to exploit these heterogeneous systems to reduce execution time.

    DETERMINING MILLENNIAL FOOD BUYING PREFERENCES: BASED ON PRODUCT MARKETING WITH “BUZZWORDS”

    This research focuses on the importance of the Millennial Generation and their perceptions of food buzzwords. Since the Millennial Generation is the largest group purchasing and preparing their own foods, the food industry is becoming dependent on their buying preferences. A survey recorded the participants’ demographics and their buying preferences based on a series of food buzzwords encountered when purchasing foods. Results show the Millennial Generation prefers “local” buzzwords. As the Millennial Generation continues to purchase foods for themselves and their families, it can be expected their choices will encourage others to do the same based on their family shopping factors, social interaction, and relationship building traits.