6 research outputs found

    Big Data Optimization: Framework Algorítmico para el análisis de Datos guiado por Semántica

    In recent decades, the growth of information sources in different areas of society, from healthcare to social networks, has highlighted the need for new analysis techniques, a field that has come to be known as Big Data. Classical optimization problems are not immune to this paradigm shift. The travelling salesman problem (TSP), for example, can benefit from the data provided by the various sensors located in cities, which are accessible through Open Data portals. This thesis develops a new framework, jMetalSP, for optimizing problems in the Big Data setting, which allows external data sources to modify the problem data in real time. In addition, when performing analysis on Big Data, whether optimization or machine learning, one of the most common approaches is to use analysis workflows. These are made up of components that each perform one step of the analysis. The flow of information in workflows can be annotated and stored using Semantic Web tools, which makes it easier to reuse those components, or even the complete workflow, in future analyses and in turn improves the process of creating them. To this end, the BIGOWL ontology was created. It allows the value chain of workflow data to be traced through semantics, and it also helps the analyst create workflows by guiding their composition with the information held in the annotations of algorithms, data, components and workflows.

    Big Data Optimization : Algorithmic Framework for Data Analysis Guided by Semantics

    Thesis defense date: 9 November 2018. Over the past decade, the rapid growth of data creation in all domains of knowledge, such as traffic, medicine, social networks and industry, has highlighted the need to improve the analysis of large data volumes, both to manage them more easily and to discover new relationships hidden in them. Optimization problems, which are common in current industry, are not unrelated to this trend, so Multi-Objective Optimization Algorithms (MOAs) should take this new scenario into account. This means that MOAs have to deal with problems that have either various data sources (typically streaming) or huge amounts of data. These features are found in particular in Dynamic Multi-Objective Problems (DMOPs), which are related to Big Data optimization problems, mostly with regard to velocity and variability. When dealing with DMOPs, whenever changes in the environment affect the solutions of the problem (i.e., the Pareto set, the Pareto front, or both), and therefore the fitness landscape, the optimization algorithm must react and adapt the search to the new features of the problem. Big Data analytics are long and complex processes, so to simplify them they are carried out as a series of steps. A typical analysis is composed of data collection, data manipulation, data analysis and finally result visualization. When creating a Big Data workflow, the analyst should bear in mind the semantics of the problem domain knowledge and its data. An ontology is the standard way of describing the knowledge about a domain. The overall goal of this PhD thesis is to investigate the use of semantics in the process of Big Data analysis, focusing not only on machine learning analysis but also on optimization.

    Understanding and Optimizing Flash-based Key-value Systems in Data Centers

    Flash-based key-value systems are widely deployed in today’s data centers to provide high-speed data processing services. These systems deploy flash-friendly data structures, such as slabs and Log-Structured Merge (LSM) trees, on flash-based Solid State Drives (SSDs) and provide efficient solutions in caching and storage scenarios. The rapid evolution of data centers brings many challenges and opportunities for future optimizations. In this dissertation, we focus on understanding and optimizing flash-based key-value systems from the perspective of workloads, software, and hardware as data centers evolve. We first propose an on-line compression scheme, called SlimCache, that takes the unique characteristics of key-value workloads into account to virtually enlarge the cache space, increase the hit ratio, and improve cache performance. Furthermore, to appropriately configure increasingly complex modern key-value data systems, which can have more than 50 parameters along with additional hardware and system settings, we quantitatively study and compare five multi-objective optimization methods for auto-tuning the performance of an LSM-tree-based key-value store in terms of throughput, 99th percentile tail latency, convergence time, real-time system throughput, and the iteration process. Last but not least, we conduct an in-depth, comprehensive measurement study of flash-optimized key-value stores on recently emerging 3D XPoint SSDs. We reveal several unexpected bottlenecks in the current key-value store design and present three exemplary case studies to showcase the efficacy of removing these bottlenecks with simple methods on 3D XPoint SSDs. Our experimental results show that our proposed solutions significantly outperform traditional methods. Our study also provides system implications for auto-tuning key-value systems on flash-based SSDs and optimizing them on 3D XPoint-based SSDs.

    Evolutionary multiobjective optimization : review, algorithms, and applications

    Doctoral Programme in Industrial and Systems Engineering. Many mathematical problems arising from diverse fields of human activity can be formulated as optimization problems. The majority of real-world optimization problems involve several conflicting objectives. Such problems are called multiobjective optimization problems (MOPs). The presence of multiple conflicting objectives that have to be optimized simultaneously gives rise to a set of trade-off solutions, known as the Pareto optimal set. Since this set of solutions is crucial for effective decision-making, which generally aims to improve the human condition, the availability of efficient optimization methods becomes indispensable. Recently, evolutionary algorithms (EAs) have become popular and successful in approximating the Pareto set. Their population-based nature is the main feature that makes them especially attractive for dealing with MOPs. Because there are two search spaces, operators are required that can search efficiently in both the decision space and the objective space. Despite the wide variety of existing methods, many research issues in the design of multiobjective evolutionary algorithms (MOEAs) remain open. This thesis investigates the use of evolutionary algorithms for solving multiobjective optimization problems. Innovative algorithms are developed by studying new techniques for performing the search in either the decision space or the objective space. Concerning the search in the decision space, the focus is on combinations of traditional and evolutionary optimization methods. An issue related to the search in the objective space is studied in the context of many-objective optimization. The application of evolutionary algorithms is addressed by solving two different real-world problems, which are modeled using multiobjective approaches. The problems arise from the mathematical modelling of dengue disease transmission and of wastewater treatment plant design. The results obtained clearly show that multiobjective modelling is an effective approach. The success in solving these challenging optimization problems highlights the practical relevance and robustness of the developed algorithms.
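The notion of a Pareto optimal set of trade-off solutions used in this abstract can be illustrated with a minimal sketch. It assumes all objectives are minimized; the function names and the candidate points are hypothetical and are not taken from the thesis, and the brute-force filter shown here is not one of the evolutionary algorithms it develops.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective
    and strictly better in at least one (minimization assumed)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates (O(n^2) filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (objective 1, objective 2), both minimized.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(candidates)
print(front)
```

The points (3.0, 4.0) and (2.5, 3.5) are dropped because (2.0, 3.0) is better in both objectives; the three remaining points are mutually incomparable trade-offs.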

    Scalarized Preferences in Multi-objective Optimization

    Multi-criteria optimization problems have no solution that is optimal in every objective function. The difficulty of such problems lies in finding a compromise solution that satisfies the preferences of the decision maker who implements the compromise. Scalarization, the mapping of the vector of objective function values to a real number, identifies a single solution as the global preference optimum in order to solve these problems. However, scalarization methods generate no additional information about other compromise solutions that could change the decision maker's preferences regarding the global optimum. To address this problem, this dissertation provides a theoretical and algorithmic analysis of scalarized preferences. The theoretical analysis consists of developing an ordering framework that characterizes preferences as problem transformations defining preferred subsets of the Pareto front. Scalarization is represented as a transformation of the objective set within this framework. Furthermore, axioms are proposed that capture desirable properties of scalarization functions. It is shown under which conditions existing scalarization functions satisfy these axioms. The algorithmic analysis characterizes preferences by the result that an optimization algorithm generates. Two new paradigms are identified within this analysis. For both paradigms, algorithms are designed that use scalarized preference information: preference-biased Pareto front approximations distribute points over the entire Pareto front but concentrate more points in regions with better scalarization values; multimodal preference optima are points that represent local scalarization optima in the objective space. A three-stage algorithm is developed that approximates local scalarization optima, and different methods are evaluated for the individual stages. Two real-world problems are presented that illustrate the usefulness of the two algorithms. The first problem is to find schedules for a combined heat and power plant that maximize the electricity and heat generated and minimize fuel consumption. Preference-biased approximations generate more energy-efficient solutions, from which the decision maker can choose a preferred solution by weighing the conflicts between the three objectives. The second problem concerns creating schedules for appliances in a residential building so that energy costs, carbon dioxide emissions and thermal discomfort are minimized. It is shown that local scalarization optima represent schedules that offer a good balance between the three objectives. The analysis and experiments presented in this work enable decision makers to make better decisions by applying methods that generate more options consistent with their preferences.
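Scalarization as described in this abstract, mapping a vector of objective values to a single real number, can be sketched as follows. The weighted sum and the weighted Chebyshev distance are two widely used scalarizing functions chosen here for illustration; the dissertation may analyze others. The weights, the ideal point and the sample front are hypothetical, and all objectives are assumed to be minimized.

```python
from typing import Sequence, Tuple


def weighted_sum(f: Sequence[float], w: Sequence[float]) -> float:
    """Scalarize an objective vector as a weighted sum of its components."""
    return sum(wi * fi for wi, fi in zip(w, f))


def chebyshev(f: Sequence[float], w: Sequence[float], ref: Sequence[float]) -> float:
    """Scalarize as the largest weighted deviation from a reference point."""
    return max(wi * abs(fi - ri) for wi, fi, ri in zip(w, f, ref))


# Hypothetical Pareto front approximation; both objectives are minimized.
front: Tuple[Tuple[float, float], ...] = ((1.0, 5.0), (2.0, 3.0), (4.0, 1.0))
ideal = (1.0, 1.0)  # assumed ideal point: best value seen in each objective

# Different weight vectors encode different decision-maker preferences.
balanced = (0.6, 0.4)
favor_second = (0.2, 0.8)

pick_balanced = min(front, key=lambda f: weighted_sum(f, balanced))
pick_second = min(front, key=lambda f: weighted_sum(f, favor_second))
pick_cheb = min(front, key=lambda f: chebyshev(f, favor_second, ideal))
print(pick_balanced, pick_second, pick_cheb)
```

Each call returns exactly one point, which reflects the limitation noted in the abstract: a scalarization yields a single preferred solution and says nothing about the other compromises on the front.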