13 research outputs found

    Super-intelligence Challenges and Lossless Visual Representation of High-Dimensional Data

    Fundamental challenges and goals of cognitive algorithms are moving super-intelligent machines and super-intelligent humans from dreams to reality. This paper is devoted to a technical way to reach specific aspects of super-intelligence that are beyond current human cognitive abilities. Specifically, the proposed technique overcomes our inability to analyze large amounts of abstract numeric high-dimensional data and to find complex patterns in these data with the naked eye. Discovering patterns in multidimensional data by visual means is a long-standing problem in multiple fields and in Data Science and Modeling in general. The major challenge is that we cannot see n-D data with the naked eye and need visualization tools to represent n-D data in 2-D losslessly. The number of available lossless methods is quite limited. The objective of this paper is to expand the class of such lossless methods by proposing a new concept of Generalized Shifted Collocated Paired Coordinates. The paper shows the advantages of the proposed lossless technique by proving its mathematical properties and by demonstration on real data.
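The core idea behind collocated paired coordinates can be sketched as follows (an illustrative reading, not the paper's exact construction): an n-D point with even n is split into consecutive coordinate pairs, each pair becomes a 2-D vertex, and the vertices form a polyline in a single 2-D plane; per-pair shifts model the "shifted" generalization. Because the mapping is invertible, the 2-D graph is lossless:

```python
def to_paired_graph(point, shifts=None):
    """Map an n-D point (n even) to the vertices of a 2-D polyline.

    Each consecutive coordinate pair (x1,x2), (x3,x4), ... becomes one
    2-D vertex; optional per-pair shifts illustrate the 'shifted' variant.
    """
    if len(point) % 2 != 0:
        raise ValueError("dimension must be even")
    vertices = [(point[i], point[i + 1]) for i in range(0, len(point), 2)]
    if shifts is not None:
        vertices = [(x + sx, y + sy) for (x, y), (sx, sy) in zip(vertices, shifts)]
    return vertices

def from_paired_graph(vertices, shifts=None):
    """Invert the mapping: recover the original n-D point (losslessness)."""
    if shifts is not None:
        vertices = [(x - sx, y - sy) for (x, y), (sx, sy) in zip(vertices, shifts)]
    return [c for v in vertices for c in v]
```

A round trip such as `from_paired_graph(to_paired_graph(p, s), s)` returns the original point `p`, which is exactly the lossless property the paper relies on.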

    Integrating BIM and GIS in railway projects: A critical review

    The railway plays a significant role in human life by providing safe, reliable, cost-effective services that are environmentally friendly and drive economic growth. Significant decisions are taken at the early stages of rail projects, which require effective tools to avoid rework, save time and cost, and increase work efficiency. Indeed, continuous upgrading of this sector is needed to respond to technological advances, environmental change and increased customer demands. Integrating Building Information Modelling (BIM) and Geographic Information Systems (GIS) is promising since the scope of BIM usually does not extend beyond the footprint of the "building"; it does not provide geospatial data. Therefore, integrating BIM with GIS provides a complete picture of the project. However, this integration is challenging, especially in rail projects, as they are amongst the most complicated projects and numerous parties are involved in making important decisions. This paper systematically reviews the literature on integrating BIM with GIS, with the aim of analysing the need for this integration and its benefits. The paper highlights the lack of a clear guideline for collaboration in the railway project lifecycle and indicates the need for research to focus on this issue, as well as the possibility of applying integrated BIM with GIS as a potential solution to improve collaboration for better decisions among project participants.

    Studying the Effect of Delay on Group Performance in Collaborative Editing

    Real-time collaborative editing systems such as Google Drive are increasingly common. However, no prior work has questioned the maximum acceptable delay for real-time collaboration or the efficacy of compensatory strategies. In this study we examine the performance consequences of simulated network delay on an artificial collaborative document editing task with a time constant and metrics for process and outcome suitable for experimental study. Results suggest that strategy influences task outcome at least as much as delay in the distribution of work in progress. However, a paradoxical interaction between delay and strategy emerged, in which the more generally effective, but highly coupled, strategy was also more sensitive to delay.

    Efficient Renaming in Sequence CRDTs

    To achieve high availability, large-scale distributed systems have to replicate data and to minimise coordination between nodes. Literature and industry increasingly adopt Conflict-free Replicated Data Types (CRDTs) to design such systems. CRDTs are data types which behave as traditional ones, e.g. the Set or the Sequence. However, unlike traditional data types, they are designed to natively support concurrent modifications. To this end, they embed a conflict-resolution mechanism in their specification. To resolve conflicts in a deterministic manner, CRDTs usually attach identifiers to elements stored in the data structure. Identifiers have to comply with several constraints, such as uniqueness or belonging to a dense order. These constraints may prevent the identifiers' size from being bounded. As the system progresses, identifiers tend to grow. This inflation increases the overhead of the CRDT over time, leading to performance issues. To address this issue, we propose a new CRDT for Sequence which embeds a renaming mechanism. It enables nodes to reassign shorter identifiers to elements in an uncoordinated manner. Experimental results demonstrate that this mechanism decreases the overhead of the replicated data structure and eventually limits it.
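The identifier-growth problem and the effect of renaming can be illustrated with a deliberately simplified, single-site toy model. Here fractions stand in for the dense identifiers (real sequence CRDTs such as LSEQ or Logoot use lists of integers, and the paper's renaming works without coordination across replicas, which this sketch does not attempt to show):

```python
from fractions import Fraction

def id_between(lo, hi):
    """A dense order: there is always an identifier between any two others.
    Repeated insertions at the same position make identifiers ever larger
    to encode (here: growing denominators)."""
    return (lo + hi) / 2

def rename(sequence):
    """Reassign short, evenly spaced identifiers to all elements,
    illustrating the effect of the renaming mechanism on identifier size."""
    return [(Fraction(i + 1), elem) for i, (_, elem) in enumerate(sequence)]

# Grow identifiers by repeatedly inserting at the front of the sequence.
seq = [(Fraction(1), "a")]
lo = Fraction(0)
for ch in "bcd":
    seq.insert(0, (id_between(lo, seq[0][0]), ch))
# Identifiers now have denominators 2, 4, 8; renaming compacts them to 1..4.
seq = rename(seq)
```

After the three insertions the front identifier is 1/8; after `rename` every identifier is a small integer while the element order is preserved, which is the overhead reduction the abstract describes.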

    Constructing Interactive Visual Classification, Clustering and Dimension Reduction Models for n-D Data

    The exploration of multidimensional datasets of all possible sizes and dimensions is a long-standing challenge in knowledge discovery, machine learning, and visualization. While multiple efficient visualization methods for n-D data analysis exist, the loss of information, occlusion, and clutter continue to be a challenge. This paper proposes and explores a new interactive method for visual discovery of n-D relations for supervised learning. The method includes automatic, interactive, and combined algorithms for discovering linear relations, dimension reduction, and generalization for non-linear relations. This method is a special category of reversible General Line Coordinates (GLC). It produces graphs in 2-D that represent n-D points losslessly, i.e., allowing the restoration of n-D data from the graphs. The projections of graphs are used for classification. The method is illustrated by solving machine-learning classification and dimension-reduction tasks from the domains of image processing, computer-aided medical diagnostics, and finance. Experiments conducted on several datasets show that this visual interactive method can compete in accuracy with analytical machine learning algorithms
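One way to read the pipeline, as an illustrative sketch rather than the paper's exact algorithms: each n-D point is represented losslessly as the 2-D vertices of a paired-coordinates graph, the graph is projected to 2-D (here by its mean vertex, a hypothetical choice), and classification is done in the projected space with a simple rule such as nearest centroid:

```python
import math

def graph_vertices(point):
    # lossless pairing of consecutive coordinates into 2-D vertices (n even)
    return [(point[i], point[i + 1]) for i in range(0, len(point), 2)]

def project(point):
    # a simple 2-D projection of the graph: the mean of its vertices
    vs = graph_vertices(point)
    return (sum(x for x, _ in vs) / len(vs), sum(y for _, y in vs) / len(vs))

def nearest_centroid(train, x):
    # train: label -> list of n-D points; classify x by the closest
    # centroid of the projected training graphs
    def centroid(points):
        ps = [project(p) for p in points]
        return (sum(a for a, _ in ps) / len(ps), sum(b for _, b in ps) / len(ps))
    px = project(x)
    return min(train, key=lambda lbl: math.dist(px, centroid(train[lbl])))
```

The mean-vertex projection is only one of many possible projections; the point of the sketch is that classification happens in 2-D while the underlying representation remains reversible.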

    Hybrid human-machine information systems for data classification

    Over the last decade, we have seen an intense development of machine learning approaches for solving various tasks in diverse domains. Despite the remarkable advancements in this field, there are still task categories in which machine learning models fall short of the required accuracy. This is the case with tasks that require human cognitive skills, such as sentiment analysis and emotional or contextual understanding. On the other hand, human-based computation approaches, such as crowdsourcing, are popular for solving such tasks. Crowdsourcing enables access to a vast number of groups with different expertise and, if managed properly, generates high-quality results. However, crowdsourcing as a standalone approach is not scalable due to the latency and cost it introduces. Addressing the distinct challenges and limitations of human- and machine-based approaches requires bridging the two fields into a hybrid intelligence, seen as a promising approach to solving critical and complex real-world tasks. This thesis focuses on hybrid human-machine information systems, combining machine and human intelligence and leveraging their complementary strengths: the data processing efficiency of machine learning and the data quality generated by crowdsourcing. In this thesis, we present hybrid human-machine models that address challenges along three dimensions: accuracy, latency, and cost. Solving data classification tasks in different domains imposes different requirements on these criteria. Motivated by this fact, we introduce a master component that evaluates these criteria to find a suitable model as a trade-off solution. In hybrid human-machine information systems, incorporating human judgments is expected to improve the accuracy of the system.
Therefore, to ensure this, we focus on the human intelligence component, integrating profile-aware crowdsourcing for task assignment and data quality control mechanisms into the hybrid pipelines. The proposed conceptual hybrid human-machine models are materialized in the conducted experiments. Motivated by challenging scenarios and using real-world datasets, we implement the hybrid models in three experiments. Evaluations show that the implemented hybrid human-machine architectures for data classification tasks lead to better results than each of the two approaches individually, improving overall accuracy at an acceptable cost and latency.
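The selection logic of such a master component might be sketched as a weighted trade-off score over the three criteria. All names, numbers, and weights below are hypothetical; the thesis's actual component evaluates the criteria in its own way:

```python
def select_pipeline(candidates, weights):
    """Pick the pipeline with the best accuracy/latency/cost trade-off.

    candidates: name -> {"accuracy": 0..1, "latency": seconds, "cost": dollars}
    weights:    task-specific importance of each criterion.
    Higher accuracy is better; lower latency and cost are better.
    """
    def score(m):
        return (weights["accuracy"] * m["accuracy"]
                - weights["latency"] * m["latency"]
                - weights["cost"] * m["cost"])
    return max(candidates, key=lambda name: score(candidates[name]))

# Hypothetical candidate pipelines for one classification task.
candidates = {
    "machine-only": {"accuracy": 0.80, "latency": 0.1, "cost": 0.01},
    "crowd-only":   {"accuracy": 0.95, "latency": 60.0, "cost": 1.00},
    "hybrid":       {"accuracy": 0.92, "latency": 5.0, "cost": 0.10},
}
```

With accuracy-heavy but latency-aware weights the hybrid pipeline wins, while a latency-dominated task would fall back to the machine-only model, matching the thesis's point that different domains call for different trade-offs.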

    Eine Analyse der Literatur zur Referenzmodellierung im Geschäftsprozessmanagement unter Berücksichtigung quantitativer Methoden

    In business process management, reference modelling plays an important role in the design of business processes, since existing models can be reused. This saves time in process development and allows practitioners to benefit from already established knowledge. This Master's thesis analyses the literature on reference modelling in business process management with particular attention to quantitative methods. In particular, it identifies the research directions and topic areas, the developments, and the current state of the literature in this field. First, German- and English-language articles are selected according to defined criteria. A quantitatively oriented analysis of the literature follows, employing Latent Semantic Analysis to identify topic areas and to assign the individual contributions to them. In addition, the development of the number of articles per topic area over time is examined, and differences between the German- and English-language literature are discussed. In the subsequent qualitatively oriented analysis, the articles of the individual topic areas are analysed in terms of content and the current state of research is presented. Finally, the results of the qualitative analysis are related to the results of the quantitative analysis.
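How Latent Semantic Analysis assigns documents to topic areas can be illustrated in miniature with a toy term-document matrix and a truncated SVD. The corpus, preprocessing, and weighting used in the thesis are of course different; this only shows the mechanism:

```python
import numpy as np

# Toy term-document matrix (rows: terms, columns: documents);
# documents 0-1 share one vocabulary, documents 2-3 another.
X = np.array([
    [2, 2, 0, 0],
    [2, 2, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

# Truncated SVD: keep the k strongest latent dimensions ("topic areas").
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_coords = (np.diag(s[:k]) @ Vt[:k]).T        # documents in latent topic space
topics = np.argmax(np.abs(doc_coords), axis=1)  # assign each document to a topic
```

Documents with shared vocabulary end up close together in the latent space, so the argmax assignment recovers the two topic groups, which is the grouping step the thesis then analyses over time and across languages.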

    Responsive Architecture

    This book is a collection of articles that have been published in the Special Issue "Responsive Architecture" of the MDPI journal Buildings. The eleven articles within cover various areas of responsive architecture, including the design of packaging structures reacting to supporting components; structural efficiency of bent columns in indigenous houses; roof forms responsive to buildings depending on their resiliently transformed steel shell parts; creative design of building free shapes covered with transformed shells; artistic structural concepts of the architect and civil engineer; a digitally designed airport terminal using wind analysis; rationalized shaping of sensitive curvilinear steel construction; interactive stories of responsive architecture; transformed shell roof constructions as the main determinant in the creative shaping of buildings sensitive to man-made and natural environments; thermally sensitive performance of a special shielding envelope on balconies; quantification of generality and adaptability of building layout using the SAGA method; and the influence of initial conditions on the simulation of the transient temperature field inside a wall.

    Evaluation and optimization of Big Data Processing on High Performance Computing Systems

    Nowadays, Big Data technologies are used by many organizations to extract valuable information from large-scale datasets. As the size of these datasets increases, meeting the huge performance requirements of data processing applications becomes more challenging. This Thesis focuses on evaluating and optimizing these applications by proposing two new tools, namely BDEv and Flame-MR. On the one hand, BDEv enables a thorough assessment of the behavior of widespread Big Data processing frameworks such as Hadoop, Spark and Flink. It manages the configuration and deployment of the frameworks, generates the input datasets and launches the workloads specified by the user. During each workload, it automatically extracts several evaluation metrics that include performance, resource utilization, energy efficiency and microarchitectural behavior. On the other hand, Flame-MR optimizes the performance of existing Hadoop MapReduce applications. Its overall design is based on an event-driven architecture that improves the efficiency of the system resources by pipelining data movements and computation. Moreover, it avoids redundant memory copies present in Hadoop, while also using efficient sort and merge algorithms for data processing. Flame-MR replaces the underlying MapReduce data processing engine in a transparent way, so the source code of existing applications does not need to be modified. The performance benefits provided by Flame-MR have been thoroughly evaluated on cluster and cloud systems using both standard benchmarks and real-world applications, showing reductions in execution time that range from 40% to 90%. This Thesis provides Big Data users with powerful tools to analyze and understand the behavior of data processing frameworks and to reduce the execution time of applications without requiring expert knowledge.
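The key idea behind Flame-MR's event-driven design, overlapping data movement with computation, can be illustrated in spirit (Flame-MR itself is a MapReduce engine, not this sketch) by a producer-consumer pipeline in which a background thread fetches the next data chunks into a bounded queue while the main thread processes the ones already available:

```python
import threading
import queue

def pipelined(chunks, fetch, process):
    """Overlap 'communication' (fetch) with computation (process).

    A background thread fetches chunks into a bounded queue while the
    main thread processes earlier ones; the bound limits how much
    in-flight data is buffered at once.
    """
    q = queue.Queue(maxsize=2)

    def producer():
        for c in chunks:
            q.put(fetch(c))
        q.put(None)  # sentinel: no more data

    threading.Thread(target=producer, daemon=True).start()
    results = []
    while (item := q.get()) is not None:
        results.append(process(item))
    return results
```

With slow `fetch` and slow `process` callables, the total runtime approaches the maximum of the two phases rather than their sum, which is the point of pipelining.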