
    In-depth Analysis On Parallel Processing Patterns for High-Performance Dataframes

    The Data Science domain has expanded monumentally in both research and industry communities during the past decade, predominantly owing to the Big Data revolution. Artificial Intelligence (AI) and Machine Learning (ML) are bringing more complexities to data engineering applications, which are now integrated into data processing pipelines to process terabytes of data. Typically, a significant amount of time is spent on data preprocessing in these pipelines, and hence improving its efficiency directly impacts the overall pipeline performance. The community has recently embraced the concept of Dataframes as the de-facto data structure for data representation and manipulation. However, the most widely used serial Dataframes today (R, pandas) experience performance limitations while working on even moderately large data sets. We believe that there is plenty of room for improvement by approaching this problem from a high-performance computing point of view. In a prior publication, we presented a set of parallel processing patterns for distributed dataframe operators and the reference runtime implementation, Cylon [1]. In this paper, we expand on the initial concept by introducing a cost model for evaluating the said patterns. Furthermore, we evaluate the performance of Cylon on the ORNL Summit supercomputer.
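
    As a minimal illustration of the kind of parallel pattern the abstract refers to, the sketch below shows a hash-shuffle join over simulated partitions in plain Python/pandas. It is not Cylon's actual API; the helper names (hash_partition, shuffle_join) and the worker simulation are invented for this example.

        # A minimal sketch of the hash-shuffle pattern behind a distributed
        # dataframe join: every "worker" hash-partitions its rows on the join
        # key, matching keys end up in the same bucket, and each bucket is
        # joined locally. Workers are simulated with dicts here; a real
        # runtime would exchange partitions over MPI or a similar transport.
        import pandas as pd

        N_WORKERS = 4

        def hash_partition(df: pd.DataFrame, key: str, n: int) -> dict[int, pd.DataFrame]:
            """Split df into n partitions by hashing the join key (toy modulo hash)."""
            buckets = df[key].astype("int64") % n
            return {w: df[buckets == w] for w in range(n)}

        def shuffle_join(left: pd.DataFrame, right: pd.DataFrame, key: str) -> pd.DataFrame:
            # 1. Each side is hash-partitioned on the join key (the "shuffle").
            left_parts = hash_partition(left, key, N_WORKERS)
            right_parts = hash_partition(right, key, N_WORKERS)
            # 2. Each worker joins only its own bucket; results are concatenated.
            joined = [left_parts[w].merge(right_parts[w], on=key) for w in range(N_WORKERS)]
            return pd.concat(joined, ignore_index=True)

        if __name__ == "__main__":
            left = pd.DataFrame({"id": range(8), "x": range(8)})
            right = pd.DataFrame({"id": [1, 3, 5, 7], "y": ["a", "b", "c", "d"]})
            print(shuffle_join(left, right, "id"))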

    Low-Latency ML Inference by Grouping Correlated Data Objects and Computation

    ML inference workflows often require low latency and high throughput, yet we lack good options for addressing this need. Techniques that reduce latency in other streaming settings (such as caching and optimization-driven scheduling) are of limited value because ML data dependencies are often very large and can change dramatically depending on the triggering event. In this work, we propose a novel correlation grouping mechanism that makes it easier for developers to express application-specific data access correlations, enabling coordinated management of data objects in server clusters hosting streaming inference tasks. Experiments based on a latency-sensitive ML-based application confirm the limitations of standard techniques while showing that our solution yields dramatically better performance. The proposed mechanism is able to maintain significantly lower and more consistent latency, achieves higher node utilization as workload and scale-out increase, and yet requires only minor changes to the code implementing the application.
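
    The paper's mechanism is only summarized above, so the following is a hypothetical Python sketch of what declaring data-access correlations and co-locating the correlated objects could look like. CorrelationGroup, place_groups, and the greedy placement policy are assumptions of this example, not the authors' API.

        # Hypothetical sketch: developers declare which data objects one
        # inference task tends to touch together, and a placement routine
        # keeps each group on a single node so a request hits one machine.
        from dataclasses import dataclass, field

        @dataclass
        class CorrelationGroup:
            """Objects the developer expects a single inference task to access together."""
            name: str
            object_ids: set[str] = field(default_factory=set)

        def place_groups(groups: list[CorrelationGroup], n_nodes: int) -> dict[str, int]:
            """Assign every object in a group to the same node, balancing sizes greedily."""
            load = [0] * n_nodes
            placement: dict[str, int] = {}
            for g in sorted(groups, key=lambda g: len(g.object_ids), reverse=True):
                node = min(range(n_nodes), key=lambda n: load[n])  # least-loaded node
                for obj in g.object_ids:
                    placement[obj] = node
                load[node] += len(g.object_ids)
            return placement

        # Example: a camera's frame, its embeddings, and its model shard are correlated.
        groups = [
            CorrelationGroup("camera-7", {"frame:7", "embed:7", "model-shard:7"}),
            CorrelationGroup("camera-9", {"frame:9", "embed:9", "model-shard:9"}),
        ]
        print(place_groups(groups, n_nodes=2))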

    Studying the effect of parallelization on the performance of Andromeda Search Engine: A search engine for peptides

    The human body is made of proteins. Analyzing the structure and functions of these proteins reveals important information about the human body. An important technique used for protein evaluation is mass spectrometry. The protein data generated by a mass spectrometer is analyzed to detect patterns in proteins. A wide variety of operations are performed on this data, namely visualization, spectral deconvolution, peak alignment, normalization, pattern recognition, and significance testing. A number of software packages analyze the huge volume of data generated by a mass spectrometer. One example is MaxQuant, which analyzes high-resolution mass spectrometric data. A search engine called Andromeda, used for peptide identification, is integrated into MaxQuant.
    One major drawback of the Andromeda Search Engine is its execution time. Identification of peptides involves a number of complex operations and intensive data processing. This research work therefore focuses on parallelization as a way to improve the performance of the Andromeda Search Engine. This is done by partitioning the data and distributing it across multiple cores and nodes, and by executing multiple tasks concurrently on those nodes and cores.
    A number of bioinformatics applications have been parallelized with significant improvements in execution time over their serial versions. In this research work, Task Parallel Library (TPL) and Common Language Runtime (CLR) constructs are used to parallelize the application. The aim is to parallelize the Andromeda Search Engine and improve its execution time by leveraging multi-core architectures.
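
    The thesis itself uses .NET's Task Parallel Library; since the abstract only describes the partition-and-distribute idea, the sketch below illustrates that idea by analogy in Python with multiprocessing. The scoring function and the toy data are hypothetical stand-ins for the real peptide-matching step.

        # Analogous sketch of the partition-and-distribute idea: spectra are
        # split into chunks and scored by a pool of worker processes.
        from multiprocessing import Pool

        def score_spectrum(spectrum: list[float]) -> float:
            """Stand-in for the expensive per-spectrum peptide-matching computation."""
            return sum(peak * 0.5 for peak in spectrum)

        def parallel_search(spectra: list[list[float]], workers: int = 4) -> list[float]:
            # Partition the spectra across worker processes; each chunk is scored independently.
            with Pool(processes=workers) as pool:
                return pool.map(score_spectrum, spectra,
                                chunksize=max(1, len(spectra) // workers))

        if __name__ == "__main__":
            spectra = [[float(i), float(i) + 1.0] for i in range(100)]
            print(parallel_search(spectra)[:5])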

    Analytical Queries: A Comprehensive Survey

    Modern hardware heterogeneity brings efficiency and performance opportunities for analytical query processing. In the presence of continuous data volume and complexity growth, bridging the gap between recent hardware advancements and the data processing tools ecosystem is paramount for improving the speed of ETL and model development. In this paper, we present a comprehensive overview of existing analytical query processing approaches as well as the use and design of systems that use heterogeneous hardware for the task. We then analyze state-of-the-art solutions and identify missing pieces. The last two chapters discuss the identified problems and present our view on how the ecosystem should evolve.

    A software approach to enhancing quality of service in internet commerce


    Model-Driven Methodology for Testing a Distributed Multi-Agent Manufacturing System

    Market pressures have pushed manufacturing companies to reduce costs while improving their products, specializing in the activities where they can add value and collaborating with specialists from other areas for the rest. These distributed manufacturing systems bring new challenges, since it is difficult to integrate the various information systems and organize them coherently. This has led researchers to propose a variety of abstractions, architectures, and specifications that attempt to tackle this complexity. Among them, holonic manufacturing systems have received special attention: they view companies as networks of holons, entities that are at once composed of and part of several other holons. Until now, holons have been implemented for manufacturing control as self-aware intelligent agents, but their learning curve and the difficulty of integrating them with traditional systems have hindered their adoption in industry. Moreover, their emergent behavior may not be desirable when tasks must meet certain guarantees, as happens in business-to-business or business-to-customer relationships and in high-level plant management operations. This thesis proposes a more flexible view of the holon concept, allowing it to sit on a broader spectrum of intelligence levels, and argues that business holons are better implemented as services: software components that can be reused through standard technologies from anywhere in the organization. These services are usually organized into coherent catalogs, known as Service Oriented Architectures (SOA). A successful SOA initiative can bring important benefits, but it is not a trivial undertaking. For this reason, many SOA methodologies have been proposed in the literature, but none of them explicitly covers the need to test the services. Considering that the goal of SOA is to increase software reuse across the organization, this is an important gap: having high-quality services is crucial for a successful SOA. The main objective of this thesis is therefore to define an extended methodology that helps users test the services implementing their business holons. After considering the available options, the model-driven SODM methodology was taken as a starting point and largely rewritten using the open-source Epsilon framework, allowing users to model their partial knowledge of the expected performance of the services. This partial knowledge is exploited by several new performance requirement inference algorithms, which derive the specific requirements of each service. While the requests-per-second inference algorithm is simple, the time-limit inference algorithm went through numerous revisions before reaching the desired level of functionality and performance. After a first formulation based on linear programming, it was replaced with a simple ad-hoc graph traversal algorithm and later with a much faster and more advanced incremental algorithm. The incremental algorithm produces equivalent results and takes much less time, even with large models.
    To get more value out of the models, this thesis also proposes a general approach for generating test artifacts for multiple technologies from the models annotated by the algorithms. To evaluate the feasibility of this approach, it was implemented for two possible uses: reusing unit tests written in Java as performance tests, and generating complete performance test projects with The Grinder framework for any Web Service described using the Web Services Description Language standard. The complete methodology is finally applied successfully to a case study based on a rectified ceramic tile manufacturing area of a Spanish group of companies. The case study starts from a high-level description of the business and ends with the implementation of part of one of the holons and the generation of performance tests for one of its Web Services. With its support for both designing and implementing performance tests of the services, it can be concluded that SODM+T helps users gain greater confidence in the implementations of the business holons observed in their companies.
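
    As a rough illustration of the throughput-inference idea described above (the thesis defines the actual SODM+T algorithms), the following Python sketch propagates a required requests-per-second figure from a composite service to the services it invokes; the service names and call multiplicities are invented for this example.

        # Toy sketch: the load required of a composite service flows to every
        # callee, weighted by how many times it is called per execution.
        from collections import defaultdict

        # service -> list of (callee, calls per execution); illustrative only
        calls = {
            "submitOrder": [("checkStock", 1), ("reserveSlab", 2)],
            "reserveSlab": [("checkStock", 1)],
        }

        def infer_throughput(root: str, root_rps: float) -> dict[str, float]:
            """Propagate required requests/second from the root service to its callees."""
            required = defaultdict(float)

            def visit(service: str, rps: float) -> None:
                required[service] += rps
                for callee, times in calls.get(service, []):
                    visit(callee, rps * times)

            visit(root, root_rps)
            return dict(required)

        # If clients must be able to call submitOrder 10 times per second,
        # reserveSlab must sustain 20 rps and checkStock 10 + 20 = 30 rps.
        print(infer_throughput("submitOrder", 10.0))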

    Just-in-time Analytics Over Heterogeneous Data and Hardware

    Industry and academia are continuously becoming more data-driven and data-intensive, relying on the analysis of a wide variety of datasets to gain insights. At the same time, data variety increases continuously across multiple axes. First, data comes in multiple formats, such as the binary tabular data of a DBMS, raw textual files, and domain-specific formats. Second, different datasets follow different data models, such as the relational and the hierarchical one. Data location also varies: Some datasets reside in a central "data lake", whereas others lie in remote data sources. In addition, users execute widely different analysis tasks over all these data types. Finally, the process of gathering and integrating diverse datasets introduces several inconsistencies and redundancies in the data, such as duplicate entries for the same real-world concept. In summary, heterogeneity significantly affects the way data analysis is performed. In this thesis, we aim for data virtualization: Abstracting data out of its original form and manipulating it regardless of the way it is stored or structured, without a performance penalty. To achieve data virtualization, we design and implement systems that i) mask heterogeneity through the use of heterogeneity-aware, high-level building blocks and ii) offer fast responses through on-demand adaptation techniques. Regarding the high-level building blocks, we use a query language and algebra to handle multiple collection types, such as relations and hierarchies, express transformations between these collection types, as well as express complex data cleaning tasks over them. In addition, we design a location-aware compiler and optimizer that masks away the complexity of accessing multiple remote data sources. Regarding on-demand adaptation, we present a design to produce a new system per query. The design uses customization mechanisms that trigger runtime code generation to mimic the system most appropriate to answer a query fast: Query operators are thus created based on the query workload and the underlying data models; the data access layer is created based on the underlying data formats. In addition, we exploit emerging hardware by customizing the system implementation based on the available heterogeneous processors: CPUs and GPGPUs. We thus pair each workload with its ideal processor type. The end result is a just-in-time database system that is specific to the query, data, workload, and hardware instance. This thesis redesigns the data management stack to natively cater for data heterogeneity and exploit hardware heterogeneity. Instead of centralizing all relevant datasets, converting them to a single representation, and loading them in a monolithic, static, suboptimal system, our design embraces heterogeneity. Overall, our design decouples the type of performed analysis from the original data layout; users can perform their analysis across data stores, data models, and data formats, but at the same time experience the performance offered by a custom system that has been built on demand to serve their specific use case.
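
    To make the on-demand customization idea more concrete, here is a toy Python sketch that generates a per-query scan function specialized to the data format and the columns a query touches; it only mimics the approach described above and is not the thesis's actual code-generation machinery.

        # Toy sketch: a specialized scan function is generated and compiled
        # (via exec) per query, based on the file format and requested columns.
        import io

        def generate_scanner(fmt: str, columns: list[str]):
            """Return a scan(file_obj) generator specialized to one format and column set."""
            cols = ", ".join(f"row[{c!r}]" for c in columns)
            if fmt == "csv":
                src = (
                    "def scan(f):\n"
                    "    import csv\n"
                    "    for row in csv.DictReader(f):\n"
                    f"        yield ({cols},)\n"
                )
            elif fmt == "json":
                src = (
                    "def scan(f):\n"
                    "    import json\n"
                    "    for line in f:\n"
                    "        row = json.loads(line)\n"
                    f"        yield ({cols},)\n"
                )
            else:
                raise ValueError(f"unsupported format: {fmt}")
            namespace = {}
            exec(src, namespace)  # compile the specialized scanner at query time
            return namespace["scan"]

        # The same logical query over a CSV source only touches the columns it asks for.
        scan_csv = generate_scanner("csv", ["name", "qty"])
        data = io.StringIO("name,qty,price\nbolt,4,0.10\nnut,9,0.05\n")
        print(list(scan_csv(data)))  # [('bolt', '4'), ('nut', '9')]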

    Protection against overflow attacks

    Buffer overflow happens when a running process loads more data into a buffer than its designed capacity. Poor programming style and lack of security awareness cause overflow vulnerabilities in almost all applications on all platforms.
    A buffer overflow attack can target any data in the stack or heap. Current solutions ignore overflow targets other than the return address. A function pointer, for example, is a possible target of an overflow attack. By overflowing a function pointer in the stack or heap, the attacker can redirect the program control flow when the function pointer is dereferenced to make a function call. To address this problem, we implemented protection against overflow attacks targeting function pointers. During the compilation phase, our patch collects the set of variables that might change the value of function pointers at runtime. During execution, values in this set are protected by encrypting them before they are saved in memory and decrypting them before they are used. This function pointer protection covers all overflow attacks targeting function pointers.
    To further extend the protection to cover all possible overflow targets, we implemented anomaly detection that checks the program's runtime behavior against control flow checking automata. The control flow checking automata are derived from the source code of the application. A trust value indicates how well the running program matches the automata. Attacks that modify program behavior within the source code can be detected.
    Both the function pointer protection and the control flow checking are compiler patches that require access to the source code. To cover buffer overflow attacks and enforce security policies regardless of source code availability, we implemented a runtime monitor based on stream automata. Stream automata extend the concepts of security automata and edit automata. The monitor operates on the interactions between two virtual entities: the system and the program. Security policies are expressed as stream automata that perform Truncation, Suppression, Insertion, Metamorphosis, Forcing, and Two-Way Forcing on the interactions. We implemented a program/operating-system monitor to detect overflow attacks and a local-network/Internet monitor to enforce honeywall policies.
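
    As a small illustration of the control-flow-checking idea with a trust value, the following Python sketch checks an observed call trace against a hand-written set of allowed transitions; in the dissertation the automata are derived from source code and the check happens at runtime, so the transition set, penalty factor, and threshold below are purely illustrative.

        # Toy control-flow-checking automaton: allowed call transitions are
        # listed explicitly, and the trust value drops on every transition
        # the automaton does not allow.
        ALLOWED = {
            ("entry", "parse_request"),
            ("parse_request", "handle_request"),
            ("handle_request", "send_reply"),
        }

        def check_trace(trace: list[str], threshold: float = 0.5) -> bool:
            """Return True if the observed trace stays trusted, False if flagged as anomalous."""
            trust = 1.0
            prev = "entry"
            for call in trace:
                if (prev, call) not in ALLOWED:
                    trust *= 0.5  # penalize transitions outside the automaton
                prev = call
            return trust >= threshold

        print(check_trace(["parse_request", "handle_request", "send_reply"]))  # True
        print(check_trace(["parse_request", "system", "send_reply"]))          # False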