16 research outputs found

    Emotion analysis of social network data using cluster based probabilistic neural network with data parallelism

    Social media contains a huge amount of data that various organizations use to study people's emotions, thoughts, and opinions. Users often use emoticons and emojis in addition to words to express their opinions on a topic. Emotion identification from text is no exception, but research in this area is still in its infancy, and few emotion-annotated corpora are available today. The complexity of the annotation task and the resulting inconsistent human annotations make developing such corpora a challenge. Numerous studies have addressed these problems, but the proposed methods were unable to perform emotion classification in a simple and cost-effective manner. To solve these problems, an efficient cluster-based classification of emotions in text records is proposed. A dataset of social media posts is pre-processed to remove unwanted elements and then clustered. Semantic and emotional features are selected to improve classification efficiency. To reduce computation time and increase the efficiency of predicting emotion probabilities, data parallelism is introduced in the classifier. The proposed model is evaluated in MATLAB and achieves 92% accuracy on the annotated dataset and 94% accuracy on the WASSA-2017 dataset. A performance comparison with existing methods, such as the Parallel K-Nearest Neighbor and Parallel Naive Bayes models, shows that the proposed model predicts emotions more effectively than the existing models.
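    The pipeline described above (pre-process, cluster, select features, then predict emotion probabilities with data parallelism) can be sketched in code. The fragment below is not the authors' MATLAB implementation; it only illustrates the data-parallel scoring step under assumed inputs: hypothetical per-emotion cluster centroids and already-extracted feature vectors, scored with a Gaussian kernel in the spirit of a probabilistic neural network, with the posts divided among worker threads.

```cpp
// Minimal sketch of data-parallel emotion scoring (illustration only; the
// paper's model was built and evaluated in MATLAB). The feature vectors,
// centroids, and Gaussian smoothing factor below are hypothetical placeholders.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

using Features = std::vector<double>;

// Squared Euclidean distance between a post's feature vector and a centroid.
static double sqDist(const Features& a, const Features& b) {
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) d += (a[i] - b[i]) * (a[i] - b[i]);
    return d;
}

// PNN-style score: a Gaussian kernel around each emotion's cluster centroid.
static int mostLikelyEmotion(const Features& post,
                             const std::vector<Features>& centroids,
                             double sigma) {
    int best = 0;
    double bestScore = -1.0;
    for (std::size_t e = 0; e < centroids.size(); ++e) {
        double score = std::exp(-sqDist(post, centroids[e]) / (2.0 * sigma * sigma));
        if (score > bestScore) { bestScore = score; best = static_cast<int>(e); }
    }
    return best;
}

int main() {
    const std::vector<std::string> emotions = {"joy", "sadness", "anger", "fear"};
    // Hypothetical per-emotion centroids in a 3-dimensional feature space.
    const std::vector<Features> centroids = {
        {0.9, 0.1, 0.2}, {0.1, 0.8, 0.3}, {0.2, 0.2, 0.9}, {0.4, 0.7, 0.7}};
    // Hypothetical pre-processed posts, already mapped to feature vectors.
    const std::vector<Features> posts = {
        {0.85, 0.15, 0.25}, {0.15, 0.75, 0.35}, {0.3, 0.25, 0.8}, {0.45, 0.65, 0.6}};

    std::vector<int> predicted(posts.size());
    const unsigned nThreads = std::max(1u, std::thread::hardware_concurrency());

    // Data parallelism: each thread classifies its own share of the posts.
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nThreads; ++t) {
        pool.emplace_back([&, t] {
            for (std::size_t i = t; i < posts.size(); i += nThreads)
                predicted[i] = mostLikelyEmotion(posts[i], centroids, 0.5);
        });
    }
    for (auto& th : pool) th.join();

    for (std::size_t i = 0; i < posts.size(); ++i)
        std::cout << "post " << i << " -> " << emotions[predicted[i]] << '\n';
    return 0;
}
```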

    Scheduling strategies for optimistic parallel execution of irregular programs


    A Survey on Thread-Level Speculation Techniques

    Thread-Level Speculation (TLS) is a promising technique that allows the parallel execution of sequential code without relying on a prior compile-time dependence analysis. In this work, we introduce the technique, present a taxonomy of TLS solutions, and summarize and put into perspective the most relevant advances in this field. Funding: MICINN (Spain) and the ERDF program of the European Union: HomProg-HetSys project (TIN2014-58876-P), CAPAP-H5 network (TIN2014-53522-REDT), and COST Action IC1305: Network for Sustainable Ultrascale Computing (NESUS).
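    As a rough illustration of the technique the survey covers, the sketch below runs chunks of a sequential loop speculatively in parallel, buffers their writes, and commits them in order, squashing and re-executing any chunk whose reads were invalidated by an earlier commit. It is a simplified software emulation with hypothetical index arrays, not a description of any particular TLS system surveyed.

```cpp
// Minimal sketch of software thread-level speculation on a single loop
// (illustration only; real TLS systems track dependences at a much finer
// grain and overlap validation with execution). Indices and the update
// rule are hypothetical.
#include <cstddef>
#include <iostream>
#include <set>
#include <thread>
#include <unordered_map>
#include <vector>

int main() {
    // Sequential loop to be speculated on: data[dst[i]] += data[src[i]] + 1.
    // Whether iterations are independent depends on runtime index values.
    std::vector<int> data(8, 0);
    const std::vector<std::size_t> src = {0, 1, 2, 3, 0, 5, 6, 7};
    const std::vector<std::size_t> dst = {1, 2, 3, 4, 5, 6, 7, 0};

    const std::size_t nChunks = 4;
    const std::size_t chunkLen = src.size() / nChunks;

    struct Speculation {
        std::set<std::size_t> reads;                  // locations read
        std::unordered_map<std::size_t, int> writes;  // buffered writes
    };
    std::vector<Speculation> spec(nChunks);
    const std::vector<int> snapshot = data;           // state before the loop

    // 1) Speculative phase: each chunk runs in parallel against the snapshot,
    //    buffering its writes instead of touching shared state.
    std::vector<std::thread> pool;
    for (std::size_t c = 0; c < nChunks; ++c) {
        pool.emplace_back([&, c] {
            for (std::size_t i = c * chunkLen; i < (c + 1) * chunkLen; ++i) {
                spec[c].reads.insert(src[i]);
                spec[c].reads.insert(dst[i]);
                // Forward values already written by this same chunk.
                int in  = spec[c].writes.count(src[i]) ? spec[c].writes[src[i]]
                                                       : snapshot[src[i]];
                int out = spec[c].writes.count(dst[i]) ? spec[c].writes[dst[i]]
                                                       : snapshot[dst[i]];
                spec[c].writes[dst[i]] = out + in + 1;
            }
        });
    }
    for (auto& t : pool) t.join();

    // 2) In-order commit: a chunk whose reads overlap an earlier chunk's
    //    committed writes is squashed and re-executed non-speculatively.
    std::set<std::size_t> committedWrites;
    for (std::size_t c = 0; c < nChunks; ++c) {
        bool conflict = false;
        for (std::size_t loc : spec[c].reads)
            if (committedWrites.count(loc)) { conflict = true; break; }
        if (conflict) {
            for (std::size_t i = c * chunkLen; i < (c + 1) * chunkLen; ++i)
                data[dst[i]] += data[src[i]] + 1;     // safe re-execution
        } else {
            for (const auto& w : spec[c].writes) data[w.first] = w.second;
        }
        for (const auto& w : spec[c].writes) committedWrites.insert(w.first);
    }

    for (int v : data) std::cout << v << ' ';
    std::cout << '\n';
    return 0;
}
```

    The result matches the sequential execution: chunks whose inputs were untouched commit their buffered writes directly, while the others are re-executed on up-to-date data, which is the essential correctness contract of TLS.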

    Supporting speculative parallelization in the presence of dynamic data structures


    Matching non-uniformity for program optimizations on heterogeneous many-core systems

    As computing enters an era of heterogeneity and massive parallelism, it exhibits a distinct feature: the deepening non-uniform relations among the computing elements in both hardware and software. Beyond traditional non-uniform memory accesses, much deeper non-uniformity appears within a processor, a runtime, and an application, exemplified by asymmetric cache sharing, memory coalescing, and thread divergence on multicore and many-core processors. Being oblivious to this non-uniformity, current applications fail to tap into the full potential of modern computing devices. My research presents a systematic exploration of this emerging property. It examines the existence of such a property in modern computing, its influence on computing efficiency, and the challenges in establishing a non-uniformity-aware paradigm. I propose several techniques to translate the property into efficiency, including data reorganization to eliminate non-coalesced accesses, asynchronous data transformations for locality enhancement, and controllable scheduling for exploiting non-uniformity among thread blocks. The experiments show much promise for these techniques in maximizing computing throughput, especially for programs with complex data access patterns.
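    One of the techniques mentioned, data reorganization to eliminate non-coalesced accesses, commonly amounts to converting an array-of-structs layout into a struct-of-arrays layout. The sketch below is a generic CPU-side illustration of that idea with a hypothetical particle record; it is not the dissertation's tool.

```cpp
// Minimal CPU-side sketch of the data-reorganization idea (array-of-structs
// to struct-of-arrays). On a many-core device, the SoA layout lets
// consecutive threads touch consecutive addresses, i.e. coalesced accesses.
#include <cstddef>
#include <iostream>
#include <vector>

struct ParticleAoS { float x, y, z, mass; };   // interleaved fields

struct ParticlesSoA {                          // each field stored contiguously
    std::vector<float> x, y, z, mass;
};

// Reorganize an array of structs into a struct of arrays.
static ParticlesSoA toSoA(const std::vector<ParticleAoS>& aos) {
    ParticlesSoA soa;
    soa.x.reserve(aos.size()); soa.y.reserve(aos.size());
    soa.z.reserve(aos.size()); soa.mass.reserve(aos.size());
    for (const ParticleAoS& p : aos) {
        soa.x.push_back(p.x);
        soa.y.push_back(p.y);
        soa.z.push_back(p.z);
        soa.mass.push_back(p.mass);
    }
    return soa;
}

int main() {
    std::vector<ParticleAoS> aos = {{1, 2, 3, 10}, {4, 5, 6, 20}, {7, 8, 9, 30}};
    ParticlesSoA soa = toSoA(aos);

    // "Thread i reads x of element i": in the AoS layout those reads are
    // strided by sizeof(ParticleAoS); in the SoA layout they are contiguous.
    for (std::size_t i = 0; i < soa.x.size(); ++i)
        std::cout << "x[" << i << "] = " << soa.x[i] << '\n';
    return 0;
}
```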

    Automatic skeleton-driven performance optimizations for transactional memory

    The recent shift toward multi-core chips has pushed the burden of extracting performance onto the programmer. In fact, programmers now have to uncover more coarse-grain parallelism with every new generation of processors, or the performance of their applications will remain roughly the same or even degrade. Unfortunately, parallel programming is still hard and error prone. This has driven the development of many new parallel programming models that aim to make this process efficient.
    This thesis first combines the skeleton-based and transactional memory programming models in a new framework, called OpenSkel, in order to improve the performance and programmability of parallel applications. This framework provides a single skeleton that allows the implementation of transactional worklist applications. Skeleton or pattern-based programming allows parallel programs to be expressed as specialized instances of generic communication and computation patterns, leaving the programmer with only the implementation of the particular operations required to solve the problem at hand. Thus, this approach simplifies parallel programming by eliminating some of its major challenges, namely thread communication, scheduling, and orchestration. However, the application programmer still has to correctly synchronize threads on data races, which commonly requires the use of locks to guarantee atomic access to shared data. In particular, lock programming is vulnerable to deadlocks and also limits coarse-grain parallelism by blocking threads that could potentially be executed in parallel. Transactional Memory (TM) thus emerges as an attractive alternative model that simplifies parallel programming by removing the burden of handling data races explicitly. This model allows programmers to write parallel code as transactions, which the runtime system then guarantees to execute atomically and in isolation regardless of eventual data races. TM programming thus frees the application from deadlocks and enables the exploitation of coarse-grain parallelism when transactions do not conflict very often. Nevertheless, thread management and orchestration are left to the application programmer. Fortunately, this can be naturally handled by a skeleton framework, which makes the combination of skeleton-based and transactional programming a natural step to improve programmability, since these models complement each other. In fact, this combination releases the application programmer from dealing with thread management and data races, and also inherits the performance improvements of both models. In addition, a skeleton framework is amenable to skeleton-driven performance optimizations that exploit the application pattern and system information.
    This thesis thus also presents a set of pattern-oriented optimizations that are automatically selected and applied in a significant subset of transactional memory applications that share a common pattern called worklist. These optimizations exploit knowledge about the worklist pattern and the TM nature of the applications to avoid transaction conflicts, to prefetch data, and to reduce contention. Using a novel autotuning mechanism, OpenSkel dynamically selects the most suitable set of these pattern-oriented performance optimizations for each application and adjusts them accordingly. Experimental results on a subset of five applications from the STAMP benchmark suite show that the proposed autotuning mechanism achieves performance within 2%, on average, of a static oracle for a 16-core UMA (Uniform Memory Access) platform and surpasses it by 7% on average for a 32-core NUMA (Non-Uniform Memory Access) platform.
    Finally, this thesis also investigates skeleton-driven system-oriented performance optimizations such as thread mapping and memory page allocation. To do so, the OpenSkel system and the autotuning mechanism are extended to accommodate these optimizations. Experimental results on a subset of five applications from the STAMP benchmark suite show that the OpenSkel framework, with the extended autotuning mechanism driving both pattern- and system-oriented optimizations, achieves performance improvements of up to 88%, with an average of 46%, over a baseline version for a 16-core UMA platform, and of up to 162%, with an average of 91%, for a 32-core NUMA platform.
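    To make the worklist-skeleton interface concrete, the sketch below shows a minimal transactional-worklist skeleton of the kind described: the framework owns the threads and the worklist, and the user supplies only the per-item operation. The names and the coarse mutex that stands in for the TM runtime are assumptions for illustration, not OpenSkel's actual API.

```cpp
// Minimal sketch of a transactional-worklist skeleton in the spirit of the
// idea described above. A global mutex stands in for the transactional-memory
// runtime here (every "transaction" is serialized), which is enough to show
// the interface but none of TM's concurrency. Termination detection is also
// simplified: a worker exits as soon as it finds the worklist empty.
#include <deque>
#include <functional>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

template <typename Work>
class TransactionalWorklist {
public:
    explicit TransactionalWorklist(std::deque<Work> initial)
        : worklist_(std::move(initial)) {}

    // The user-supplied operation receives the item and a callback for
    // enqueueing newly generated work units.
    void run(const std::function<void(const Work&, std::function<void(Work)>)>& op,
             unsigned nThreads) {
        std::vector<std::thread> pool;
        for (unsigned t = 0; t < nThreads; ++t) {
            pool.emplace_back([&] {
                for (;;) {
                    Work item;
                    {
                        std::lock_guard<std::mutex> tx(tm_);  // "transaction": take work
                        if (worklist_.empty()) return;
                        item = worklist_.front();
                        worklist_.pop_front();
                    }                                         // commit
                    std::lock_guard<std::mutex> tx(tm_);      // "transaction": process item
                    op(item, [&](Work w) { worklist_.push_back(std::move(w)); });
                }
            });
        }
        for (auto& th : pool) th.join();
    }

private:
    std::mutex tm_;            // stand-in for the TM runtime
    std::deque<Work> worklist_;
};

int main() {
    // Toy worklist application: count down from each seed, spawning new items.
    TransactionalWorklist<int> wl({5, 3});
    int processed = 0;
    wl.run([&](const int& item, std::function<void(int)> push) {
        ++processed;                   // shared state, guarded by the "transaction"
        if (item > 1) push(item - 1);  // generate more work
    }, 4);
    std::cout << "processed " << processed << " work units\n";
    return 0;
}
```

    A real TM backend would let non-conflicting work items commit concurrently instead of serializing them, which is where the performance benefits discussed in the abstract come from.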

    On the design of architecture-aware algorithms for emerging applications

    This dissertation maps various kernels and applications to a spectrum of programming models and architectures and also presents architecture-aware algorithms for different systems. The kernels and applications discussed have widely varying computational characteristics; for example, we consider both dense numerical computations and sparse graph algorithms. The dissertation also covers emerging applications from image processing, complex network analysis, and computational biology. We map these problems to diverse multicore processors and manycore accelerators, and we use new programming models (such as Transactional Memory, MapReduce, and Intel TBB) to address the performance and productivity challenges in these problems. Our experiences highlight the importance of mapping applications to appropriate programming models and architectures. We also identify several limitations of current system software and architectures, and directions to improve them. The discussion focuses on system software and architectural support for nested irregular parallelism, Transactional Memory, and hybrid data transfer mechanisms. We believe that the complexity of parallel programming can be significantly reduced via collaborative efforts among researchers and practitioners from different domains; this dissertation contributes to these efforts by providing benchmarks and suggestions to improve system software and architectures. Ph.D. Committee Chair: Bader, David; Committee Member: Hong, Bo; Committee Member: Riley, George; Committee Member: Vuduc, Richard; Committee Member: Wills, Scot

    Compile-time support for thread-level speculation

    One of the main concerns of computer science is the study of the parallel capabilities of both programs and the processors that execute them. Several reasons make the development of automatic code parallelization techniques highly desirable, among them the immense number of sequential programs already written, the complexity of parallel programming languages, and the expertise required to parallelize a code. However, the automatic parallelization mechanisms currently implemented in commercial compilers are unable to parallelize most loops in a code [1], due to the data dependences that exist between them [2]. It is therefore necessary to look for new techniques, such as speculative parallelization [3-5], that exploit the potential parallel capabilities of current multiprocessor hardware and architectures. However, this and other techniques require the manual intervention of experienced programmers. Before offering alternative solutions, the parallelization capabilities of commercial compilers were evaluated, exposing the limitations of the automatic parallelization mechanisms they implement. The study reveals that these mechanisms only achieve a 19% speedup on average for the SPEC CPU2006 benchmarks [6], a result significantly lower than that obtained by speculative parallelization techniques [7]. Speculative parallelization, however, requires extensive manual code modification by programmers. This thesis addresses this problem by defining a new OpenMP clause [8], called "speculative", which allows the programmer to indicate which variables may lead to a dependence violation. In addition, this thesis proposes a compile-time system that, using the information about variable accesses provided by the OpenMP clauses, automatically adds all the code needed to manage the speculative execution of a program. This frees the programmer from modifying the code manually, avoiding possible errors and a tedious task. The code generated by our system links against the speculative parallel runtime library developed by Estebanez, Garcia-Yagüez, Llanos, and Gonzalez-Escribano [9,10]. Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos)
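    The sketch below shows the kind of loop that motivates this thesis: whether iterations are independent depends on index values known only at run time, so a commercial compiler must treat the loop conservatively. The commented annotation only hints at where the proposed "speculative" clause would mark the conflicting variable; its exact syntax, and the bookkeeping code the compile-time system would generate, are defined by the thesis and not reproduced here.

```cpp
// Minimal, purely sequential sketch of a loop that defeats compile-time
// dependence analysis. The indices and the update are hypothetical.
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    std::vector<double> a(16, 1.0);
    // Indices known only at run time: a[idx[i]] may alias a[idx[j]].
    const std::vector<int> idx = {3, 7, 3, 9, 12, 7, 0, 5};

    // Hypothetical annotation in the spirit of the proposed OpenMP clause:
    //   #pragma omp parallel for speculative(a)
    // The compile-time system described above would then emit the calls that
    // detect dependence violations on `a` and restart offending iterations,
    // linking against the speculative runtime library cited in [9,10].
    for (std::size_t i = 0; i < idx.size(); ++i)
        a[idx[i]] += 2.0 * a[idx[(i + 1) % idx.size()]];

    for (double v : a) std::cout << v << ' ';
    std::cout << '\n';
    return 0;
}
```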